885 Commits

Author SHA1 Message Date
Omar Sandoval
ee51244dc1 libdrgn: add _cleanup_free_ scope guard, no_cleanup_ptr(), and return_ptr()
Kevin Svetlitski suggested making use of __attribute__((__cleanup__)) a
long time ago, and now that the kernel is doing it, I don't have a good
excuse not to. There are surprisingly only a handful of places that it
was straightforward to apply it to. A lot of potential uses are thwarted
by our policy that out parameters can be clobbered on failure, so that
may be something to revisit. Other cleanup guards will probably be more
useful, but this is just laying the groundwork for the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-08-02 12:26:50 -07:00
Omar Sandoval
3ce37c8002 libdrgn: python: fix creating compound value with 32-bit float member on big-endian
This is similar to commit 155ec92ef2 ("libdrgn: fix reading 32-bit
float object values on big-endian").

Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-08-02 10:39:34 -07:00
Omar Sandoval
0bc79c877a libdrgn: fix stray bits when reading bytes of bit field
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-08-01 16:31:17 -07:00
Omar Sandoval
55a3ebca6c libdrgn: dwarf_info: support DWO split DWARF
We've addressed all of the smaller differences with GNU Debug Fission
and split DWARF 5, so now all that remains is the DWARF index.

The general approach is: in drgn_dwarf_index_read_cus(), for each CU,
ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE
from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and
then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file
(cached in a hash table in the struct drgn_module), which the DWARF
index can work with from there.

Additionally, a couple of places (.debug_addr parsing and stack trace
local variable lookup) need to be updated to use the correct
drgn_elf_file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
c7f1d0d40c libdrgn: dwarf_info: read CU DIE with libdw in DWARF index
Split DWARF is challenging for the DWARF index for a couple of reasons:

1. We need libdw to look up the split files.
2. The file name table comes from the skeleton file, but everything else
   relevant to the index comes from the split file.

(1) requires the index to use libdw to get the CU DIE. Unfortunately,
due to the overhead of libdw, this makes the indexing step 5-10% slower.
On the plus side, getting the CU DIE upfront simplifies quite a bit: we
can read the file name table, compilation directory, and str_offsets
base before indexing, which makes supporting (2) possible.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
fc1ee46941 libdrgn: dwarf_info: parse units with dwarf_next_unit() in DWARF index
In the next change, we'll need more information about the unit, and
there's no benefit to doing it ourselves anymore.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
645950134b libdrgn: dwarf_info: move file name table parsing code
No changes, this just moves the code now so that later changes are more
obvious.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
0e6a0a5f94 libdrgn: dwarf_info: get rid of struct drgn_dwarf_index_pending_cu
Instead, reuse struct drgn_dwarf_index_cu for the pending CUs. This is
mainly so that we can save more information in the pending CU in a later
change. It also lets us merge our per-thread pending CU arrays with
memcpy() instead of element-by-element, but I didn't measure a
performance difference one way or the other.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
05c3b244bf libdrgn: dwarf_info: handle GNU Debug Fission location lists
GNU Debug Fission's location lists are a hybrid of the DWARF 5 and
non-split DWARF 4 versions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 10:10:08 -07:00
Omar Sandoval
9c307a4df4 libdrgn: dwarf_info: handle split DWARF .debug_addr
There are a couple of differences with non-split DWARF 5:

- DW_AT_addr_base/DW_AT_GNU_addr_base is in the skeleton DIE, so we need
  to use dwarf_attr_integrate().
- GNU Debug Fission for DWARF 4 doesn't have headers in .debug_addr.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 09:59:31 -07:00
Omar Sandoval
a78d30e13e libdrgn: dwarf_info: handle split DWARF in dwarf_module_find_dwarf_scopes()
dwarf_module_find_dwarf_scopes() and drgn_dwarf_die_iterator_next() just
need to go from skeleton units to split units. We need to use
dwarf_cu_info(), which was added in 0.171, which incidentally was when
elfutils gained split DWARF support anyways.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 09:59:31 -07:00
Omar Sandoval
4fa1dfc063 libdrgn: dwarf_info: handle missing DW_AT_loclists_base
It seems like GCC omits this for split units when using DWARF 5,
intending it to mean the first entry in .debug_loclists.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 09:59:31 -07:00
Omar Sandoval
28b3e016f9 libdrgn: dwarf_info: handle missing DW_AT_str_offsets_base
GNU Debug Fission doesn't have DW_AT_str_offsets_base but does have
.debug_str_offsets. GCC doesn't emit DW_AT_str_offsets_base for DWARF 5
split DWARF. In both cases, the default is the first entry in
.debug_str_offsets.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 09:59:31 -07:00
Omar Sandoval
c4ebbc29ca libdrgn: dwarf_info: fix CU header size computation for GNU Debug Fission
dwo_id was added in split DWARF 5; GNU Debug Fission doesn't have it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-19 09:59:31 -07:00
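To illustrate the header-size difference that commit c4ebbc29ca accounts for, here is a rough Python sketch. The function name and parameters are hypothetical and are not libdrgn's; the field sizes follow the DWARF 4 and DWARF 5 compile unit header layouts, and only compile-type units are covered.

```python
# Hypothetical helper, not part of libdrgn: size in bytes of a compile unit
# header, including the initial length field.
DW_UT_compile = 0x01
DW_UT_skeleton = 0x04
DW_UT_split_compile = 0x05


def cu_header_size(version, unit_type=None, is_64_bit=False):
    offset_size = 8 if is_64_bit else 4
    # unit_length: 4 bytes, or a 4-byte escape plus 8 bytes in 64-bit DWARF.
    size = 12 if is_64_bit else 4
    size += 2  # version
    if version >= 5:
        size += 1  # unit_type
        size += 1  # address_size
        size += offset_size  # debug_abbrev_offset
        # Only DWARF 5 skeleton and split compile units carry a dwo_id.
        if unit_type in (DW_UT_skeleton, DW_UT_split_compile):
            size += 8
    else:
        # DWARF 4 and older, which includes GNU Debug Fission:
        # no unit_type field and no dwo_id in the header.
        size += offset_size  # debug_abbrev_offset
        size += 1  # address_size
    return size
```

Under these assumptions a GNU Debug Fission (DWARF 4) unit header is 11 bytes in 32-bit DWARF, while a DWARF 5 split compile unit header is 20 bytes.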
Omar Sandoval
81c8672d4d libdrgn: python: log to the standard logging module
Rather than coming up with our own, separate logging API for the Python
bindings, let's integrate with the logging module. The straightforward
part is creating a logger from the C extension and adding a log callback
that calls its log() method. However, syncing the log level between the
logging module and libdrgn requires monkey patching.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:47:34 -07:00
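The "straightforward part" of commit 81c8672d4d, a callback that forwards each message to a logger's log() method, might look roughly like this in pure Python. This is only a sketch: the logger name, the callback signature, and the ListHandler class are assumptions for illustration, and the level syncing (the monkey-patching part) is not shown.

```python
import logging

logger = logging.getLogger("drgn")  # the logger name is an assumption


def log_callback(level, message):
    # Stand-in for the callback the C extension would register with libdrgn.
    # Here `level` is already a logging-module level; a real binding would
    # have to translate from the library's own levels.
    logger.log(level, "%s", message)


class ListHandler(logging.Handler):
    # Hypothetical handler that records messages so they can be inspected.
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelno, record.getMessage()))


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

log_callback(logging.WARNING, "could not find debug info")
log_callback(logging.DEBUG, "trying next path")
```

Because the messages go through the standard logging module, an application can attach its own handlers and filters without a drgn-specific API.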
Omar Sandoval
c1a2792e6a libdrgn: add simple logging framework
Exceptions aren't enough to debug complicated code paths like debug info
discovery or stack unwinding. We really need logs for that, so let's add
a small logging framework. By default, we log to stderr, but we also
provide a way to direct logs to a different file, or even an arbitrary
callback so that logs can be directed to the application's logging
library of choice.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:47:34 -07:00
Omar Sandoval
fa82071618 libdrgn: call blocking hooks around DWARF index
DWARF indexing can take a long time; Kevin Svetlitski notes that it can
take almost a minute on some large binaries. Let's use the new blocking
API around it so that the Python bindings drop the GIL.

Closes #247.

Suggested-by: Kevin Svetlitski <svetlitski@meta.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:47:34 -07:00
Omar Sandoval
0ad19dc37b libdrgn: python: set blocking callback to release GIL
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:47:34 -07:00
Omar Sandoval
06a825f315 libdrgn: add API for hooks around blocking operations
There are places in drgn where it'd be a good idea to drop the Python
GIL. However, some of these are deep inside of libdrgn, where some code
paths are fast and dropping the GIL would be extra overhead and others
are slow (e.g., type lookups, which may be cached or may require DWARF
namespace indexing). Instead of trying to do this from the Python
bindings, add hooks to libdrgn. These hooks can be used directly or with
a new scope guard macro, drgn_blocking_guard, that we can start
sprinkling around in appropriate places in libdrgn.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:47:34 -07:00
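The hook-plus-scope-guard design from commit 06a825f315 is written in C, but the idea can be sketched with a Python context manager. Everything below (set_blocking_hooks, blocking_guard, and passing a state value from the begin hook to the end hook) is a hypothetical analogy, not libdrgn's actual API.

```python
from contextlib import contextmanager

# Hypothetical hook storage; libdrgn's real API is in C and differs.
_begin_hook = None
_end_hook = None


def set_blocking_hooks(begin, end):
    global _begin_hook, _end_hook
    _begin_hook, _end_hook = begin, end


@contextmanager
def blocking_guard():
    # Call the begin hook now and the end hook when the scope exits,
    # even if the body raises.
    state = _begin_hook() if _begin_hook else None
    try:
        yield
    finally:
        if _end_hook:
            _end_hook(state)


events = []
set_blocking_hooks(
    lambda: events.append("begin") or "token",
    lambda state: events.append(("end", state)),
)

with blocking_guard():
    events.append("slow work")
```

In the Python bindings, the begin hook would be the natural place to release the GIL and the end hook the place to reacquire it; fast code paths simply never enter the guard.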
Omar Sandoval
5c1b6cf764 docs: document thread safety
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:33:35 -07:00
Omar Sandoval
471e32e906 libdrgn: debug_info: try harder to get debug file path
We're getting (null) file paths in error messages (e.g., #233) because
libdwfl doesn't always return the debug file path. Fall back to the
loaded file path, which is better than nothing until we get rid of
libdwfl.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-18 12:33:35 -07:00
Omar Sandoval
0bcef5b77f libdrgn: dwarf_info: get byte order from passed file in drgn_eval_cfi_dwarf_expression()
Commit 18b12a5c7b ("libdrgn: get .eh_frame from the correct file")
missed this, but it's unlikely to matter in practice.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-07 15:33:44 -07:00
Omar Sandoval
c76f25b852 libdrgn: dwarf_info: ignore DW_OP_{,GNU_}entry_value
These opcodes appear in practice, and we choke on them with an exception
like "unknown DWARF expression opcode 0xf3" or "unknown DWARF expression
opcode 0xa3". In some cases, it'd be possible to recover the entry value
by looking at call site information, but that's pretty involved. For
now, just treat these operations as optimized out so we stop failing
hard.

Closes #233.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-06 22:00:25 -07:00
Omar Sandoval
916a7217fb libdrgn: dwarf_info: don't call dwarf_dieoffset() redundantly
When we get the DIE from the offset with dwarf_offdie(), there's no need
to go back to the offset with dwarf_dieoffset().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-06 13:56:00 -07:00
Omar Sandoval
b5018aa913 libdrgn: dwarf_info: only iterate necessary DIE subtrees in drgn_module_find_dwarf_scopes()
Thierry found that as soon as drgn_module_find_dwarf_scopes() finds any
DIE containing the PC, it walks the entire subtree rooted at that DIE.
However, we only need to look at the immediate children of a DIE
containing the PC. I think this is what I originally intended, but I
failed to reset the children flag to false when the last DIE didn't
contain the PC. Thierry's suggested check of it.dies.size == subtree is
simpler.

This is a massive performance improvement: for a kernel core dump with
10k threads, getting the stack trace of every thread took ~90 seconds
without this fix and ~50 seconds with it.

Let's also add a comment to this very subtle code.

Fixes: d8d4157346 ("libdrgn: debug_info: add drgn_debug_info_module_find_dwarf_scopes()")
Co-authored-by: Thierry Treyer <ttreyer@fb.com>
Signed-off-by: Thierry Treyer <ttreyer@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-07-06 11:00:16 -07:00
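The traversal rule described in commit b5018aa913, looking only at the immediate children of a DIE that contains the PC, can be sketched as follows. The Die class and find_scopes() are hypothetical stand-ins; the sketch assumes each DIE has a single PC range and that at most one child contains the PC, which is simpler than real DWARF.

```python
class Die:
    # Hypothetical stand-in for a DWARF DIE with a single PC range.
    def __init__(self, name, low_pc, high_pc, children=()):
        self.name = name
        self.low_pc = low_pc
        self.high_pc = high_pc
        self.children = list(children)

    def contains(self, pc):
        return self.low_pc <= pc < self.high_pc


def find_scopes(root, pc, visited=None):
    # Return the chain of nested DIEs containing pc, outermost first.
    # `visited` optionally records which child DIEs were examined.
    scopes = []
    die = root
    while die is not None and die.contains(pc):
        scopes.append(die)
        next_die = None
        for child in die.children:
            if visited is not None:
                visited.append(child.name)
            if child.contains(pc):
                # Descend only into the child that contains the PC.
                next_die = child
                break
        die = next_die
    return scopes


cu = Die("cu", 0, 100, [
    Die("f1", 0, 50, [Die("b1", 0, 10), Die("b2", 10, 20, [Die("deep", 12, 14)])]),
    Die("f2", 50, 100, [Die("b3", 60, 70)]),
])
visited = []
scopes = [die.name for die in find_scopes(cu, 65, visited)]
```

Here the children of f1 are never examined because f1 does not contain the PC, which is the kind of work the fix avoids.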
Omar Sandoval
7cb3e99b23 libdrgn: program: find crashed task with cpu_curr() instead of find_task()
s390x populates the pid field in NT_PRSTATUS with the CPU number plus 1
[1] instead of the PID of the task that was running on that CPU. This
means that we get the wrong task_struct from drgn_program_find_thread()
in drgn_program_kernel_core_dump_cache_crashed_thread(), or don't find
the task_struct and crash because of a missing NULL check.

We can work around this and also gracefully handle the normal and idle
cases by instead getting the current task_struct from the CPU runqueue.
This is slightly racy: rq->curr is updated in __schedule() [2] before
the registers and stack are switched in context_switch() [3]. However,
it was already racy, since the pid field in NT_PRSTATUS is populated
from current, which is updated after the registers and stack are
switched (at least on x86-64) [4].

Closes #314.

1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/s390/kernel/crash_dump.c?h=v6.4#n309
2: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?h=v6.4#n6646
3: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?h=v6.4#n5343
4: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/process_64.c?h=v6.4#n621

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-29 16:16:27 -07:00
Omar Sandoval
cc0994a010 drgn.helpers.linux.sched: add cpu_curr() helper
This will be used internally, but it's also a nice shortcut for
per_cpu(prog["runqueues"], cpu).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-29 15:58:52 -07:00
Omar Sandoval
5057308c0f drgn 0.0.23
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-28 13:59:18 -07:00
Omar Sandoval
0d6438d994 libdrgn: orc_info: use .orc_header to detect version
My kernel patch was merged for Linux 6.4 and backported to 6.3.10, so
now we can use the .orc_header section to reliably detect the ORC format
version. Since the 6.4 release candidates and older versions of 6.3
don't have .orc_header, we'll keep the version check as a fallback.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-28 11:10:18 -07:00
Omar Sandoval
91ede0c6a4 libdrgn: orc_info: handle ORC changes in Linux 6.3 and 6.4
The ORC format changed twice recently:

- Linux kernel commit ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field
  to ORC metadata") (in v6.3).
- Linux kernel commit fb799447ae29 ("x86,objtool: Split
  UNWIND_HINT_EMPTY in two") (in v6.4).

The former went unnoticed because the change was subtle, and the latter
completely broke x86-64 kernel stack traces.

To handle this, let's "upgrade" the format to the latest version when we
load and sort the ORC information. This is more work upfront but avoids
needing to handle the version differences every time we use ORC to
unwind.

Unfortunately, ORC currently doesn't have any sort of versioning, so we
have to break the rule of not checking kernel versions. However, I have
a kernel patch pending merging that should fix this for the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
fc47ec1b78 libdrgn: add prog pointer to struct drgn_module
The next commit needs this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
3085259d82 libdrgn: orc_info: use unsigned int instead of size_t for num_entries
It's unrealistic for there to be more than 4 billion ORC entries. Switch
to an unsigned int. The main benefit is that the indices array that we
use to sort the parallel arrays of entries and pc_offsets becomes half
the size, which also makes parsing ORC about 10% faster (down from ~5 ms
to ~4.5 ms for the Fedora vmlinux on my laptop).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
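The indices array mentioned in commit 3085259d82 is a common way to sort parallel arrays. A minimal Python sketch, with made-up values and Python's array module standing in for the C arrays:

```python
from array import array

# Parallel arrays, as in the commit message; the values are made up.
pc_offsets = [0x30, 0x10, 0x20]
entries = ["entry C", "entry A", "entry B"]

# Sort an array of indices rather than the parallel arrays themselves.
# Type code "I" is an unsigned int; "Q" would be an unsigned long long.
indices = array(
    "I", sorted(range(len(pc_offsets)), key=pc_offsets.__getitem__)
)

# Apply the same permutation to both arrays.
sorted_pc_offsets = [pc_offsets[i] for i in indices]
sorted_entries = [entries[i] for i in indices]
```

On typical 64-bit Linux platforms array("I").itemsize is 4 and array("Q").itemsize is 8, which mirrors the halving of the indices array described above.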
Omar Sandoval
dd08658a6e libdrgn: don't cache ORC sections in struct drgn_elf_file
.orc_unwind_ip and .orc_unwind are only referenced while initially
parsing ORC data and then never touched again, so it's wasteful to cache
them in struct drgn_elf_file. Look them up if and when we parse the ORC
data instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
0bb503c6a0 libdrgn: orc_info: check ORC section alignment instead of copying
In practice, the .orc_unwind and .orc_unwind_ip sections will always be
suitably aligned. Check it, then assume the alignment later.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
8526b86644 libdrgn: linux_kernel: get slightly smaller code for kernel_module_iterator_next()
By using the same temporary objects in the Linux 6.4 branch as the
pre-6.4 branch, we get slightly better code generation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
2ee625fc74 libdrgn: handle DWARF sections exactly* like libdw
We only support .debug_* sections, but libdw also supports .zdebug_*,
.debug_*.dwo, and .gnu.debuglto_.debug_*. Mimic how libdw chooses debug
sections, with one exception: .debug_cu_index and .debug_tu_index (used
for DWP, which we don't support yet but will) should be considered DWO
sections (this needs to be fixed in libdw, too).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-20 13:45:04 -07:00
Omar Sandoval
cff9b6185c libdrgn: fix typo in ORC unwinder handling of ORC_REG_SP_INDIRECT
ORC_REG_SP_INDIRECT is supposed to be an indirect access via rsp, but we
have a typo and are using rbp instead. This is a partial fix for #304.

Fixes: 630d39e345 ("libdrgn: add ORC unwinder")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-15 13:03:10 -07:00
Omar Sandoval
e2e2ebc317 libdrgn: fix Linux kernel crashed_thread() on non-x86 architectures
We currently use crashing_cpu to determine the thread that caused a
kernel crash. However, crashing_cpu is x86-specific (it is defined in
arch/x86/kernel/reboot.c). Since Linux 4.5, the generic panic code
defines a very similar variable, panic_cpu. Use that instead so that we
support all architectures, but fall back to crashing_cpu to support
older kernels on x86 (even though we don't claim to support 4.4
anymore).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-15 07:56:19 -07:00
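The lookup order from commit e2e2ebc317 (prefer panic_cpu, fall back to crashing_cpu) amounts to a try/except. In this sketch crashed_cpu() is a hypothetical name and a plain dict stands in for a program object; in the kernel panic_cpu is an atomic_t, so real code would also need to read its counter, which the sketch ignores.

```python
def crashed_cpu(prog):
    # `prog` is anything indexable by variable name that raises KeyError for
    # unknown names; a plain dict stands in for a real program here.
    try:
        # Generic panic code, Linux 4.5 and later.
        return prog["panic_cpu"]
    except KeyError:
        # x86-specific variable, for older kernels.
        return prog["crashing_cpu"]
```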
Omar Sandoval
772492838f drgn.helpers.linux.mm: add arbitrary address translation helpers
follow_{page,pfn,phys}() translate the virtual address by walking the
page table for a given mm_struct (built on top of the existing page
table iterator interface). vmalloc_to_page() and vmalloc_to_pfn() are
special cases for vmalloc addresses.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-02 23:40:38 -07:00
Stephen Brennan
ce8b2938e6 libdrgn: linux_kernel: Fix compiler warning
With GCC 13.1.1 and the recommended build
setup (CONFIGURE_FLAGS="--enable-compiler-warnings=error"), I get the
following failure:

In function 'linux_kernel_get_vmemmap',
    inlined from 'linux_kernel_object_find' at ../../libdrgn/linux_kernel_object_find.inc.strswitch:34:12:
../../libdrgn/linux_kernel.c:370:23: error: 'address' may be used uninitialized [-Werror=maybe-uninitialized]
  370 |                 err = drgn_object_set_unsigned(&prog->vmemmap, qualified_type,
      |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  371 |                                                address, 0);
      |                                                ~~~~~~~~~~~
../../libdrgn/linux_kernel.c: In function 'linux_kernel_object_find':
../../libdrgn/linux_kernel.c:361:26: note: 'address' was declared here
  361 |                 uint64_t address;
      |                          ^~~~~~~
cc1: all warnings being treated as errors

While linux_kernel_get_vmemmap_address should always update address in a
non-error case, the compiler seems to disagree. It's easy enough to shut
up the compiler by initializing address to 0. What's more, if there is
an actual issue where linux_kernel_get_vmemmap_address does NOT
update the address variable, a 0 value will be easier to debug than
garbage from an uninitialized variable.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2023-06-01 14:49:34 -07:00
Stephen Brennan
49d6bfdb24 Fix test failure on Python 3.12 (fixes #298)
Running tests on Python 3.12, we get:

test_int (tests.test_language_c.TestLiteral.test_int) ... python3.12: /usr/include/python3.12/object.h:215: Py_SIZE: Assertion `ob->ob_type != &PyLong_Type' failed.
Aborted (core dumped)

We're relying on an implementation detail to check whether the object is
negative. Instead, catch an overflow error, negate and try again.
Genuine overflows will still overflow on the second time, but negative
numbers will succeed.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2023-06-01 14:23:51 -07:00
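The "catch an overflow error, negate and try again" approach from commit 49d6bfdb24 lives in C, but the same logic can be sketched in Python. The function name and the 64-bit width are assumptions for illustration.

```python
def to_sign_and_magnitude(value):
    # Return (is_negative, magnitude) for an int whose magnitude fits in
    # 64 bits, without inspecting the int's internal representation.
    try:
        raw = value.to_bytes(8, "little", signed=False)
        return False, int.from_bytes(raw, "little")
    except OverflowError:
        # Either the value is negative or it is genuinely too large.
        # Negating fixes the first case; the second overflows again.
        raw = (-value).to_bytes(8, "little", signed=False)
        return True, int.from_bytes(raw, "little")
```

A genuinely oversized value such as 2**64 raises OverflowError from the second attempt, so the caller still sees the error.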
Ido Schimmel
3f3a957562 libdrgn: linux_kernel: Fix module detection on kernel v6.4
Kernel commit ac3b43283923 ("module: replace module_layout with
module_memory") in v6.4 changed the layout of `struct module`, resulting
in the following drgn error [1].

Fix this by first trying to determine the base address and size of each
kernel module via the `struct module_memory mem[MOD_TEXT]` member,
before falling back to previous methods that work on older kernels.

Tested on v6.4-rc2 and on v6.3, which does not include the
above-mentioned commit.

Note that kernel commit b4aff7513df3 ("scripts/gdb: use mem instead of
core_layout to get the module address") performs a similar fix in Python
GDB scripts.

Closes #296.

[1]
```
# drgn
drgn 0.0.22 (using Python 3.11.3, elfutils 0.189, with libkdumpfile)
For help, type help(drgn).
>>> import drgn
>>> from drgn import NULL, Object, cast, container_of, execscript, offsetof, reinterpret, sizeof
>>> from drgn.helpers.common import *
>>> from drgn.helpers.linux import *
warning: could not get debugging information for:
kernel modules (could not find loaded kernel modules: 'struct module' has no member 'core_size')
```

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
2023-05-28 22:08:18 -07:00
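The "try the new member first, then fall back" strategy of commit 3f3a957562 could be sketched like this. Dicts stand in for struct module objects, module_base_and_size() is a hypothetical name, and the older member names (core_layout, module_core) are assumptions based on older kernel sources rather than taken from the commit itself.

```python
MOD_TEXT = 0  # assumed to be the first value of the kernel's enum mod_mem_type


def module_base_and_size(module):
    # `module` is a dict standing in for a struct module object.
    if "mem" in module:  # v6.4 and later
        text = module["mem"][MOD_TEXT]
        return text["base"], text["size"]
    if "core_layout" in module:  # assumed layout before v6.4
        layout = module["core_layout"]
        return layout["base"], layout["size"]
    # Assumed layout on even older kernels: members directly in struct module.
    return module["module_core"], module["core_size"]
```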
Omar Sandoval
fc3ea4184a libdrgn: use new include-what-you-use exported declarations and fix warnings
include-what-you-use/include-what-you-use#1164 fixed
include-what-you-use/include-what-you-use#971 so that we can export
forward declarations instead of hacking around it. I can't reproduce the
issue with BINARY_OP_SIGNED_2C anymore either, so we can remove that
hack, too. Also fix any other warnings.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-05-24 00:25:25 -07:00
Sven Schnelle
73e451d588 tests: enable MM tests on s390x
s390x now has full mm support, so enable the tests for it.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
2023-03-22 15:24:11 -07:00
Sven Schnelle
3483a69a56 libdrgn: add s390x pagetable walk support
Add support for walking s390x page tables. This supports
up to 5-level page table walking and huge/large pages. In order
to figure out the level of paging used, we read the first entry
of the pgd, which is always mapped for lowcore access, and use the
level bits of the next page table. This is because drgn passes mm::pgd,
which doesn't contain the ASCE bits, as the pgtable argument to the
walker function.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
2023-03-22 15:24:11 -07:00
Omar Sandoval
0d03be7d62 libdrgn: silence -Wmaybe-uninitialized false positive
This false positive appears to only trigger on 32-bit. I reproduced it
with GCC 10 and 12.

Fixes #242.

Reported-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-24 01:04:15 -08:00
Omar Sandoval
18a8f69ad8 libdrgn: linux_kernel: add object finder for jiffies
We have a lot of examples that use jiffies, but they stopped working
long ago on x86-64 (since Linux kernel commit d8ad6d39c35d ("x86_64: Fix
jiffies ODR violation") (in v5.8 and backported to stable releases)) and
never worked on other architectures. This is because jiffies is defined
in the Linux kernel's linker script. #277 proposed updating the examples
to use jiffies_64, but I would guess that most kernel developers are
familiar with jiffies and many have never seen jiffies_64. jiffies is
also a nicer name to type in live demos. Let's add a case to the Linux
kernel object finder to get the jiffies variable.

Reported-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-22 11:15:37 -08:00
Omar Sandoval
2f97cc0f5f libdrgn: platform: expand on page table iterator documentation
There are a lot of details about how the page table iterator functions
should be used/implemented that commit 174b797ae3 ("libdrgn: platform:
add documentation (especially for drgn_architecture_info)") didn't
cover. Add an example and expand/clarify the documentation for the
callbacks.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-21 17:42:22 -08:00
Jay Kamat
08cb38cc2f Expand DW_AT_upper_bound quirk on zero size arrays
GCC appears to use a data8 value of -1 when reporting zero-length arrays
when compiling C++ code. This patch adds support and a test for that
behavior.

dwarf_info.c: Remove check for sdata on quirk for array length == 0

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2023-02-21 16:44:20 -08:00
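As a rough illustration of the quirk in commit 08cb38cc2f: DW_AT_upper_bound is the index of the last element, so an upper bound of -1 means a length of zero, and an unsigned 64-bit form carrying all ones has to be treated the same way. The function below is a hypothetical sketch, not the code in dwarf_info.c.

```python
UINT64_MAX = (1 << 64) - 1


def array_length(upper_bound):
    # Treat an all-ones 64-bit upper bound (-1 in two's complement) as a
    # zero-length array, whether it was read as signed or unsigned.
    if upper_bound & UINT64_MAX == UINT64_MAX:
        return 0
    return upper_bound + 1
```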
Omar Sandoval
94443457aa libdrgn: handle GNU Debug Fission attributes, forms, and opcodes
These are all equivalent to their DWARF 5 counterparts, which we already
support:

* DW_FORM_GNU_addr_index <-> DW_FORM_addrx
* DW_FORM_GNU_str_index <-> DW_FORM_strx
* DW_AT_GNU_addr_base <-> DW_AT_addr_base
* DW_OP_GNU_addr_index <-> DW_OP_addrx
* DW_OP_GNU_const_index <-> DW_OP_constx

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-08 13:25:45 -08:00
Omar Sandoval
02e344a7dd libdrgn: use strswitch for ELF section names
Move the definitions of the section names to a Python script,
gen_elf_sections.py, and use that to generate the enum definitions and a
lookup function. This is preparation for checking for section names with
the .dwo suffix in the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-08 13:25:22 -08:00
Imran Khan
4d7c709621 helpers: idr: Enable idr helpers to work with older kernel.
Prior to kernel v4.11, idr did not use the radix tree as its backend,
so the current idr helper(s) only work on kernel v4.11+.
Enable the idr helper(s) to work with non-radix-tree-based idr so that
they can be used with older kernels as well.

Thanks to Omar for optimizing the idr_for_each helper.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
2023-01-23 17:32:17 -08:00
Kevin Svetlitski
7e6efe6649 Add support for looking up types in namespaces
Looking up objects in namespaces is already well-supported by `drgn`.
These changes bring the same functionality to type lookup, so that
`prog.type('struct A::B::C::MyType')` works in an analogous fashion to
`prog['A::B::C::MyVar']`.

Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
2023-01-19 10:19:36 -08:00
Kevin Svetlitski
c32f0811cb Fix memory leak in c_format_compound_object
Found via CodeChecker static analysis.

Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
2023-01-11 11:59:43 -08:00
Omar Sandoval
2181826570 drgn 0.0.22
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-01-05 20:38:32 -08:00
Omar Sandoval
4731de6acc libdrgn: x86_64: unwind with frame pointer more permissively
get_registers_from_frame_pointer() has a sanity check that the unwound
frame pointer must be greater than the current frame pointer. This is
generally true if the entire program is using frame pointers, but not
necessarily otherwise. In particular, if the program is a Linux kernel
configured with ORC, most of the time, rbp is a general purpose
register; it is only used as a frame pointer in special cases without
unwinder information like BPF programs. Those cases are exactly when we
want the frame pointer unwinder, but depending on what the caller was
using rbp for, the frame pointer unwinder might bail prematurely.

Let's remove the sanity check. In the worst case, this could lead us off
into the weeds chasing pointers, but the iteration limit in
drgn_get_stack_trace() prevents that from being dangerous.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-01-04 16:45:28 -08:00
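The trade-off in commit 4731de6acc, dropping the "frame pointer must increase" check and relying on an iteration limit instead, can be sketched with a toy memory model. The function, the dict-based memory, and the limit value are hypothetical; the sketch only assumes the usual x86-64 layout where the saved rbp is at [rbp] and the return address at [rbp + 8].

```python
def unwind_frame_pointers(memory, rbp, max_frames=1024):
    # `memory` maps addresses to 8-byte values; a missing address ends the walk.
    return_addresses = []
    for _ in range(max_frames):
        if rbp not in memory or rbp + 8 not in memory:
            break
        return_addresses.append(memory[rbp + 8])
        # No check that the saved rbp is greater than the current one.
        rbp = memory[rbp]
    return return_addresses


# The saved frame pointer (0x900) is below the current one (0x1000), which
# the removed sanity check would have rejected.
stack = {0x1000: 0x900, 0x1008: 0xAAA, 0x900: 0x0, 0x908: 0xBBB}
trace = unwind_frame_pointers(stack, 0x1000)

# A frame that points at itself is bounded by the iteration limit.
loop = {0x2000: 0x2000, 0x2008: 0xCCC}
looped = unwind_frame_pointers(loop, 0x2000, max_frames=5)
```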
Omar Sandoval
a6b6afaba2 libdrgn: return DRGN_ERROR_NOT_IMPLEMENTED_ERROR if virtual address translation is not implemented
This will allow us to distinguish it from other errors.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-01-04 15:09:56 -08:00
Omar Sandoval
c48cddbdb0 libdrgn: ppc64: fix stack unwinding since Linux v5.11 and before v4.20
linux_kernel_get_initial_registers_ppc64() depends on the size of struct
pt_regs, but this has changed multiple times, in:

- Linux kernel commit 4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than
  thread_struct") (in v4.20)
- Linux kernel commit 66f93c5a02d5 ("powerpc/64: Fix kernel stack
  16-byte alignment") (in v4.20)
- Linux kernel commit 8e560921b58c ("powerpc/book3s64/pkeys:
  Store/restore userspace AMR/IAMR correctly on entry and exit from
  kernel") (in v5.11)

It also depends on the overhead stored before struct pt_regs on the
stack, which changed in Linux kernel commit cd52414d5a6c ("powerpc/64:
ELFv2 use minimal stack frames in int and switch frame sizes") (in
v6.2).

We can handle all of these cases by reading the previous r1 from memory
instead of computing it from a hard-coded size and finding the struct
pt_regs based on that r1 and the actual size of struct pt_regs.

Reported in #232.

Reported-by: Sourabh Jain <jainsourabh679@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-01-04 13:42:28 -08:00
Sven Schnelle
1bbeff92bf libdrgn: add s390x unwinding support
Co-authored-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
2022-12-19 13:48:44 -08:00
Omar Sandoval
9ee1ccff98 libdrgn: add stub s390 and s390x architectures with relocation implementation
The only relocation type I saw in Debian's kernel module debug info was
R_390_32. R_390_8, R_390_16, R_390_64, R_390_PC16, R_390_PC32, and
R_390_PC64 are trivial to support, as well. The Linux kernel supports
many more, but hopefully they won't show up for debug info.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-19 13:48:44 -08:00
Omar Sandoval
aa5f121ac9 libdrgn: document implementation-defined behavior in add_to_possibly_null_pointer()
Konrad Borowski pointed out that add_to_possibly_null_pointer() relies
on GCC-specific behavior:
https://fosstodon.org/@xfix/109542070338182493. CONTRIBUTING.rst
mentions that we assume that casting between pointers and integers does
not change the bit representation, but we might as well document it
here, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-19 12:07:40 -08:00
Kevin Svetlitski
4213bea149 libdrgn: add limited support for looking up types with template arguments
Currently, looking up a type with template arguments results in an
"invalid character" syntax error on the "<" character. The DWARF index
includes template arguments in indexed names, so we need to do lookups
including the template arguments. Full support for this would require
parsing the template argument list syntax and normalizing it or looking
it up as an AST in some way. For now, it's at least an improvement to
pass the user's string verbatim. To do so, kludge it by adding a token
containing everything from "<" to the matching ">" to the C++ lexer and
appending that to the identifier.

Co-authored-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
2022-12-14 20:55:03 -08:00
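The lexer kludge described in commit 4213bea149, a single token spanning from "<" to its matching ">", can be sketched in Python as a depth counter. The function name is hypothetical, and the sketch does not handle a ">" inside an expression argument such as a comparison.

```python
def lex_template_arguments(s, start):
    # `s[start]` must be "<". Return the substring through the matching ">".
    if s[start] != "<":
        raise ValueError("expected '<'")
    depth = 0
    for i in range(start, len(s)):
        if s[i] == "<":
            depth += 1
        elif s[i] == ">":
            depth -= 1
            if depth == 0:
                return s[start:i + 1]
    raise ValueError("unterminated template argument list")


name = "std::vector<std::pair<int, long>>"
start = name.index("<")
token = lex_template_arguments(name, start)
# Appending the token to the identifier reproduces the user's string verbatim.
full_identifier = name[:start] + token
```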
Omar Sandoval
7ce84a3f1f drgn.helpers.linux: add proper XArray helpers
Commit 89eb868e95 ("helpers: make find_task() work on recent kernels")
made radix_tree_lookup() and radix_tree_for_each() work for basic
XArrays. However, it doesn't handle a couple of more advanced features:
multi-index entries (which old radix trees actually also supported) and
zero entries. It has also been really confusing to explain to people
unfamiliar with the radix tree -> XArray transition that they should use
helpers named radix_tree for a structure named xarray.

So, let's finally add xa_load(), xa_for_each(), and some additional
auxiliary helpers. The non-recursive xa_for_each() implementation is
based on Kevin Svetlitski's C implementation from commit 2b47583c73
("Rewrite linux helper iterators in C"). radix_tree_lookup() and
radix_tree_for_each() share the implementation with xa_load() and
xa_for_each(), respectively, so they are mostly interchangeable.

Fixes: #61

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-13 17:46:37 -08:00
Omar Sandoval
6486073148 libdrgn: python: fix Py_BuildValue() type in gen_constants.py
We're calling Py_BuildValue() with the "k" format for unsigned long but
passing the enum value itself, which is promoted to int. I don't know
whether there are any ABIs where this matters in practice, but let's use
"K" and cast to unsigned long long explicitly to be safe.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-07 16:46:33 -08:00
Omar Sandoval
94e1407a5f libdrgn: python: don't repeat class names in gen_constants.py
Instead, define the list of constant classes in one place so we can
generate all 3 places that need it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-07 15:41:49 -08:00
Omar Sandoval
af28419ee5 libdrgn: python: fix path_arg leaks in Program_find_{type,object}
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-06 13:25:55 -08:00
Omar Sandoval
d7204eaa00 libdrgn: python: simplify path_converter()
PyUnicode_FSConverter() already handles os.PathLike, so we only need to
handle None and save the string and length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-06 10:49:00 -08:00
Omar Sandoval
73fea86792 libdrgn: python: add PyLong_From* and PyLong_As* wrappers for stdint.h types
It feels icky to write code that, for example, passes a uint64_t to
PyLong_FromUnsignedLongLong(). In practice it's fine, but it's much
nicer to have conversion functions specifically for the stdint.h types.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-05 16:06:22 -08:00
Alastair Robertson
7180304c88 libdrgn: dwarf_info: Support DW_TAG_GNU_template_parameter_pack
This DWARF tag is used by C++ classes which take a variable number
of template parameters, such as std::variant and std::tuple.

Signed-off-by: Alastair Robertson <ajor@meta.com>
2022-12-05 15:33:46 -08:00
Omar Sandoval
174b797ae3 libdrgn: platform: add documentation (especially for drgn_architecture_info)
While reviewing #214, I realized that we have very little documentation
for drgn_architecture_info (and platform internals in general). Let's
document all of the important stuff, and in particular how to add
support for new architectures.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-02 13:55:38 -08:00
Omar Sandoval
1088ef4a1e libdrgn: platform: replace demangle_return_address() with demangle_cfi_registers()
While documenting struct drgn_architecture_info, I realized that
demangle_return_address() is difficult to explain. It's more
straightforward to define this functionality as demangling any registers
that are mangled when using CFI rather than just the return address
register.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-02 13:52:06 -08:00
Omar Sandoval
0fad8a591a libdrgn: fix finding types beginning in size_t or ptrdiff_t
c_parse_specifier_qualifier_list() checks whether an identifier starts
with "size_t" or "ptrdiff_t" to decide whether to return the size_t or
ptrdiff_t type. This incorrectly matches identifiers like "size_tea" and
"ptrdiff_tee". Fix this by making it an exact comparison.

Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 16:21:56 -08:00
Omar Sandoval
18b12a5c7b libdrgn: get .eh_frame from the correct file
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).

Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
270375f077 libdrgn: debug_info: get "loaded" ELF file
For upcoming changes, we will need loaded (SHF_ALLOC) sections for
modules. Some separate debug files (e.g., those created with objcopy
--only-keep-debug) don't have those sections. Let's get the loaded file
from libdwfl with dwfl_module_getelf() and save it in a struct
drgn_elf_file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
bcb53d712b libdrgn: bypass libdwfl with struct drgn_elf_file
Now that we track the debug file ourselves, we can avoid calling libdwfl
in a bunch of places. By tracking the bias ourselves, we can avoid a
bunch more.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
34f122144a libdrgn: debug_info: wrap ELF file information in new struct drgn_elf_file
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
b3bab1c5b0 libdrgn: make module vs. program platform difference more clear
It's confusing that we have a platform both for the program and for each
module. They usually match, but they're not required to. For example,
the user can manually add a file with a different platform just to read
its debug info. Our rule is that if we're parsing anything from the
module, we use the module platform; and otherwise, use the program
platform. There are a couple of places where the platforms must match:
when using call frame information (CFI) or registers. Let's make all of
this more clear in the code (by using the module's platform even when it
must match the program's platform) and in comments. No functional
change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 12:53:45 -08:00
Omar Sandoval
85f423dfb8 libdrgn: dwarf_info: get default pointer size from CU
If a DW_TAG_pointer_type DIE doesn't specify its size with
DW_AT_byte_size, we currently default to the program's address size.
However, the DWARF we're parsing could be for a platform with a
different address size. It's more correct to use the CU's address size.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 12:53:45 -08:00
Omar Sandoval
222680b47a Add StackFrame.sp
We have some generic helpers that we'd like to add (for example, #210)
that need to know the stack pointer of a frame. These shouldn't need to
hard-code register names for different architectures. Add a generic
shortcut, StackFrame.sp.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-22 18:47:16 -08:00
Boris Burkov
c8ff8728f7 Support systems without qsort_r
qsort_r is a non-standard glibc extension and turns out to be the only
thing that prevents drgn from working on a musl system. "Fix" the use of
qsort_r by switching it to qsort with a thread local variable for the
parameter.
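
The pattern looks roughly like the following sketch. The names and the
index-sorting use case are hypothetical and only illustrate passing the
extra argument through a thread-local variable; this is not the code
touched by this commit.

```c
#include <stdlib.h>

/* Hypothetical context: the array that the indices being sorted refer to. */
static _Thread_local const int *sort_values;

/* Plain qsort() comparator; reads its "argument" from the thread-local. */
static int compare_indices(const void *a, const void *b)
{
	int x = sort_values[*(const size_t *)a];
	int y = sort_values[*(const size_t *)b];
	return (x > y) - (x < y);
}

/* Sort indices so that values[indices[i]] is in ascending order. */
static void sort_indices_by_value(size_t *indices, size_t n, const int *values)
{
	sort_values = values;
	qsort(indices, n, sizeof(indices[0]), compare_indices);
	sort_values = NULL;
}
```

Unlike qsort_r, this is not reentrant within a single thread, which is fine
as long as the comparator never triggers another sort.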

Tested in a clean chroot install of musl voidlinux.

Signed-off-by: Boris Burkov <boris@bur.io>
2022-11-03 12:57:55 -04:00
Stephen Brennan
5f3a91f80d Add StackFrame.locals() method
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-11-02 22:40:33 -07:00
Omar Sandoval
b3a5051ff4 libdrgn: dwarf_info: handle DW_TAG_enumerator DIE with missing or invalid DW_AT_name
find_dwarf_enumerator() needs to check that the return value of
dwarf_diename() is not NULL before calling strcmp(). This is similar to
commit 330c71b5b5 ("libdrgn: dwarf_info: fix segfault on anonymous
DIEs during scope search"), although I haven't seen this one happen in
practice.
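
The check amounts to something like this sketch, where die_name stands in
for the return value of dwarf_diename() and the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: die_name is NULL when the DIE has a missing or
 * invalid DW_AT_name, so it must be checked before being passed to strcmp().
 */
static bool die_name_matches(const char *die_name, const char *name)
{
	return die_name != NULL && strcmp(die_name, name) == 0;
}
```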

Fixes: bc85767e5f ("libdrgn: support looking up parameters and variables in stack traces")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-02 22:19:44 -07:00
Omar Sandoval
4031093848 Add some missing copyright/license notices
I wanted to make REUSE pass, but I'm not sure what to do about trivial
files. REUSE suggests using CC0, but Fedora no longer allows CC0. I'll
punt that until later. For now, let's add notices to some code files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:14:02 -07:00
Omar Sandoval
87b7292aa5 Relicense drgn from GPLv3+ to LGPLv2.1+
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.

All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:05:16 -07:00
Omar Sandoval
d465071651 libdrgn: replace copies of elfutils headers with generated files
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 15:41:53 -07:00
Omar Sandoval
99dc927f38 libdrgn: dwarf_info: rename dw_tag_str constants
Rename DW_TAG_{UNKNOWN_FORMAT,BUF_LEN} to
DW_TAG_STR_{UNKNOWN_FORMAT,BUF_LEN} to make it more clear that they're
for dw_tag_str.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-31 14:22:45 -07:00
Omar Sandoval
a4ae67b6b5 libdrgn: replace BUILD_BUG_ON* with static_assert
Our container_of() and array_size() were copied from the Linux kernel
and use some really ugly BUILD_BUG_ON_ZERO() and BUILD_BUG_ON_MSG()
macros. C11 has _Static_assert, which is much nicer. We just have to
shoehorn it into an expression, which we do with clever use of _Generic
and sizeof a struct type definition. (We could accomplish the same idea
with a comma expression, but GCC warns when the left-hand operand of a
comma expression has no effect. We could also do it with a compound
statement, but it's cooler to do it with standard C11.)
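
For illustration, a simplified sketch of the sizeof-a-struct half of the
idea. It leaves out the _Generic part, and bit64() is a hypothetical user
of the macro, so this is not the exact definition added here.

```c
#include <stddef.h>

/*
 * Evaluates to (size_t)0 if cond is true and fails to compile otherwise.
 * The struct exists only to give _Static_assert a place to live inside an
 * expression; the dummy member keeps the struct non-empty.
 */
#define static_assert_expression(cond, msg) \
	(sizeof(struct { _Static_assert(cond, msg); char dummy; }) * (size_t)0)

/* Hypothetical use: a bit mask that rejects out-of-range constant shifts. */
#define bit64(n) \
	((1ULL << (n)) + \
	 static_assert_expression((n) >= 0 && (n) < 64, "bit out of range"))
```

With this, bit64(3) evaluates to 8, while bit64(64) is a compile-time error
instead of undefined behavior.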

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-28 13:38:35 -07:00
Omar Sandoval
40f2d4b2aa drgn 0.0.21
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-12 12:00:54 -07:00
Omar Sandoval
70af25849c libdrgn: rename drgn_debug_info_module to drgn_module
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.

Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:52:46 -07:00
Omar Sandoval
8bfc9f1e07 libdrgn: python: rename module.c to main.c
We're eventually going to add a drgn.Module class, which logically
should go in a file called module.c. But, we already have a module.c
with module-level definitions. Rename that file to main.c to free up
module.c.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:24:32 -07:00
Omar Sandoval
1fe01bb4b8 libdrgn: python: add call_tp_alloc()
There are a bunch of places where we call .tp_alloc() directly, which is
very verbose. Add a macro that removes the boilerplate.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:24:29 -07:00
Omar Sandoval
60bafe96db libdrgn: examples: use noreturn for usage()
-Wimplicit-fallthrough has a false positive because the compiler
apparently doesn't know that usage() never returns.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:12:38 -07:00
Omar Sandoval
03d5c2ebac libdrgn: string_builder: replace string_builder_finalize()
Instead of string_builder_finalize(), which leaves the string_builder in
an undefined state (according to the documentation, at least), define
string_builder_null_terminate(), which documents exactly what it does.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 15:55:04 -07:00
Omar Sandoval
cd41d9d576 libdrgn: string_builder: rework reserving
Make string_builder_reserve() allocate an exact capacity, and add a
string_builder_reserve_for_append() wrapper that does the
next_power_of_two(current length + number to append) that all of the
current callers want.
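
A minimal sketch of the split, using a hypothetical struct sb as a stand-in
for struct string_builder (the field and function names below are
assumptions, not the real definitions):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real struct string_builder. */
struct sb {
	char *str;
	size_t len;
	size_t capacity;
};

/* Smallest power of two >= n. Requires 0 < n <= SIZE_MAX / 2 + 1. */
static size_t next_power_of_two(size_t n)
{
	size_t p = 1;
	while (p < n)
		p <<= 1;
	return p;
}

/* Grow to exactly capacity bytes; never shrinks. */
static bool sb_reserve(struct sb *sb, size_t capacity)
{
	if (capacity <= sb->capacity)
		return true;
	char *tmp = realloc(sb->str, capacity);
	if (!tmp)
		return false;
	sb->str = tmp;
	sb->capacity = capacity;
	return true;
}

/* Make room for n more bytes, rounding up so repeated appends amortize. */
static bool sb_reserve_for_append(struct sb *sb, size_t n)
{
	if (n == 0)
		return true;
	/* Reject sizes that would overflow or that cannot be rounded up. */
	if (n > SIZE_MAX / 2 - sb->len)
		return false;
	return sb_reserve(sb, next_power_of_two(sb->len + n));
}
```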

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 15:55:02 -07:00
Omar Sandoval
d76a3a338f libdrgn: string_builder: add dedicated initializer
Rather than documenting how to initialize a struct string_builder,
provide an initializer, STRING_BUILDER_INIT.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 15:32:07 -07:00
Omar Sandoval
05a3695d5b libdrgn: enable -Wimplicit-fallthrough, take 2
This time, in order to work on both GCC and Clang, use
__attribute__((__fallthrough__)) instead of /* fallthrough */ comments.
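
Roughly, the usage looks like the sketch below. The fallthrough macro name,
the feature check, and the example function are assumptions for
illustration rather than the exact definitions added here.

```c
/* Feature detection: expand to a no-op on compilers without the attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(__fallthrough__)
#define fallthrough __attribute__((__fallthrough__))
#else
#define fallthrough do {} while (0)
#endif

/* Hypothetical example: each level also enables the flags of lower levels. */
static int flags_for_level(int level)
{
	int flags = 0;
	switch (level) {
	case 2:
		flags |= 4;
		fallthrough;
	case 1:
		flags |= 2;
		fallthrough;
	case 0:
		flags |= 1;
		break;
	default:
		break;
	}
	return flags;
}
```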

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-04 23:36:01 -07:00
Omar Sandoval
2b4d5fd237 Revert "libdrgn: enable -Wimplicit-fallthrough"
This reverts commit e05bfbddc2. Clang
doesn't support /* fallthrough */ comments, so we'll need to use
__attribute__((fallthrough)), which will need some additional feature
detection.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-04 18:12:03 -07:00
Omar Sandoval
e05bfbddc2 libdrgn: enable -Wimplicit-fallthrough
This only required one change in the code where GCC wanted the comment
placed differently.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-04 17:53:35 -07:00
Omar Sandoval
0b7ac5b046 Fix vmcore stack traces on Linux < 4.9 or >= 5.16 and add drgn.helpers.linux.task_cpu()
task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes
drgn_get_initial_registers() to think that the kernel is !SMP and use
CPU 0 instead, producing incorrect stack traces. This has also always
been wrong for Linux < 4.9 and on architectures that don't enable
CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be
((struct thread_info *)task->stack)->cpu.

Fix it by factoring out a new task_cpu() helper that handles all of the
above cases. Also add a test case for task_cpu() in case this changes
again.

Fixes: eea5422546 ("libdrgn: make Linux kernel stack unwinding more robust")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-03 16:21:12 -07:00
Omar Sandoval
330c71b5b5 libdrgn: dwarf_info: fix segfault on anonymous DIEs during scope search
Jakub Kicinski reported that
prog.crashed_thread().stack_trace()[1]['does not exist'] segfaulted on a
vmcore he encountered. The segfault was a NULL pointer dereference of
dwarf_diename() of a DW_TAG_subprogram DIE in
drgn_find_in_dwarf_scopes(). The fix is to ignore DIEs without a name.

I was curious what this anonymous DW_TAG_subprogram was. It turned out
to be some dubious DWARF generated by Clang when a local variable is
defined via a macro. One such example comes from the following code in
arch/x86/events/intel/uncore.h:

static inline bool uncore_mmio_is_valid_offset(struct intel_uncore_box *box,
					       unsigned long offset)
{
	if (offset < box->pmu->type->mmio_map_size)
		return true;

	pr_warn_once("perf uncore: Invalid offset 0x%lx exceeds mapped area of %s.\n",
		     offset, box->pmu->type->name);

	return false;
}

pr_warn_once() expands to:

#define pr_warn_once(fmt, ...)					\
	printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
#define printk_once(fmt, ...)					\
({								\
	static bool __section(".data.once") __print_once;	\
	bool __ret_print_once = !__print_once;			\
								\
	if (!__print_once) {					\
		__print_once = true;				\
		printk(fmt, ##__VA_ARGS__);			\
	}							\
	unlikely(__ret_print_once);				\
})

For some reason, Clang generates an anonymous, top-level
DW_TAG_subprogram DIE to contain the __print_once variable:

 <1><1cf86e>: Abbrev Number: 62 (DW_TAG_subprogram)
 <2><1cf86f>: Abbrev Number: 61 (DW_TAG_variable)
    <1cf870>   DW_AT_name        : (indirect string, offset: 0x34fb2e): __print_once
    <1cf874>   DW_AT_type        : <0x1c574c>
    <1cf878>   DW_AT_decl_file   : 1
    <1cf879>   DW_AT_decl_line   : 229
    <1cf87a>   DW_AT_location    : 16 byte block: 3 2c 84 66 83 ff ff ff ff 94 1 31 1e 30 22 9f         (DW_OP_addr: ffffffff8366842c; DW_OP_deref_size: 1; DW_OP_lit1; DW_OP_mul; DW_OP_lit0; DW_OP_plus; DW_OP_stack_value)

Whereas GCC puts it in a DW_TAG_lexical_block DIE inside of the
DW_TAG_subprogram DIE for uncore_mmio_is_valid_offset():

 <1><3110b2>: Abbrev Number: 45 (DW_TAG_subprogram)
    <3110b3>   DW_AT_name        : (indirect string, offset: 0x2e13e): uncore_mmio_is_valid_offset
    <3110b7>   DW_AT_decl_file   : 4
    <3110b8>   DW_AT_decl_line   : 223
    <3110b9>   DW_AT_decl_column : 20
    <3110ba>   DW_AT_prototyped  : 1
    <3110ba>   DW_AT_type        : <0x2f416b>
    <3110be>   DW_AT_inline      : 3    (declared as inline and inlined)
    <3110bf>   DW_AT_sibling     : <0x311142>
 <2><3110ef>: Abbrev Number: 66 (DW_TAG_lexical_block)
 <3><3110f0>: Abbrev Number: 120 (DW_TAG_variable)
    <3110f1>   DW_AT_name        : (indirect string, offset: 0x2da3f): __print_once
    <3110f5>   DW_AT_decl_file   : 4
    <3110f6>   DW_AT_decl_line   : 229
    <3110f7>   DW_AT_decl_column : 2
    <3110f8>   DW_AT_type        : <0x2f416b>
    <3110fc>   DW_AT_location    : 9 byte block: 3 2c 28 48 83 ff ff ff ff      (DW_OP_addr: ffffffff8348282c)

Regardless, we shouldn't crash on this input.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-21 14:12:16 -07:00
Omar Sandoval
30c9ad452d libdrgn: linux_kernel: fix global per-CPU variables in kernel modules
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.

Closes #185.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:27:28 -07:00
Omar Sandoval
a52016c4cb libdrgn: linux_kernel: always use module list from core
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).

This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:11:59 -07:00
Omar Sandoval
94036f6daf libdrgn: linux_kernel: optimize reading module list
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
which is still much faster to read than only reading what we need.

In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:33 -07:00
Omar Sandoval
a2db11ebae libdrgn: object: fix use after free in drgn_object_set_from_buffer_internal()
If drgn_object_set_from_buffer_internal() (used to implement
drgn_object_set_from_buffer(), drgn_object_slice(), and
drgn_object_reinterpret()) sets an object to a primitive type from a
buffer that comes from the same object, then drgn_object_reinit() will
free the value and then drgn_value_serialize() will access the freed
value, probably resulting in garbage. Handle this case the same way we
do if the result type is encoded as a buffer, by first copying to a
temporary value.

This doesn't affect usage through Python because objects are immutable
in the Python bindings.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:33 -07:00
Omar Sandoval
f8ba278bc1 libdrgn: fix include-what-you-use warnings
It's been a while since I've run this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
b8cdfff250 libdrgn: add read(2) and pread(2) wrappers that don't return short reads
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.

drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.
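
For reference, a wrapper of this kind can be sketched as follows. The name
read_all and its exact return convention are assumptions, not necessarily
what this commit adds.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: read until count bytes have been read, end of file,
 * or an error other than EINTR. Returns the number of bytes read (less than
 * count only at end of file) or -1 with errno set.
 */
static ssize_t read_all(int fd, void *buf, size_t count)
{
	size_t n = 0;
	while (n < count) {
		ssize_t r = read(fd, (char *)buf + n, count - n);
		if (r < 0) {
			if (errno == EINTR)
				continue; /* Interrupted; retry. */
			return -1;
		}
		if (r == 0)
			break; /* End of file. */
		n += r;
	}
	return n;
}
```

A pread(2) variant is the same loop with the file offset advanced by n on
each iteration.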

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
56fda2a0cf libdrgn: fix min() warning on 32-bit architectures
The call to min() in drgn_read_memory_file() results in the following
warning on 32-bit architectures that I missed on review:

In file included from ../../libdrgn/memory_reader.c:10:
../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file':
../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast
   36 |         (void)(&unique_x == &unique_y);                                         \
      |                          ^~
../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl'
   28 | #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <)
      |                   ^~~~~~~~~~~~~
../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min'
  284 |                 size_t readlen = min(file_end - file_offset, count);
      |                                  ^~~

We can fix it with a cast, and additionally do the call to min() earlier
and rework the logic a bit.

Fixes: 9684771d61 ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
04d2dee964 libdrgn: elaborate on core dump p_filesz < p_memsz ambiguity
There's a lot more context here that we should write down. It's also
worth noting that it appears that GDB always zero fills the range
between p_filesz and p_memsz, so if we end up having any other issues
because of this, we might have to concede and go back to the behavior
before commit 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz
in core dumps").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Shung-Hsi Yu
9335e227d6 libdrgn: python: add Jupyter pretty printing support
Add pretty printing support in Jupyter notebook for Object, Type,
StackFrame, and StackTrace; it will print out their representation in
programming language syntax with str(), similar to what's being done in
interactive mode.

Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
2022-08-25 13:52:11 -07:00
Glen McCready
9684771d61 libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure
was all zeros so the page was excluded and we were generating a
FaultError trying to access the structure.

This change reverts a portion of that behaviour such that when we are
debugging a kernel core we go back to the zero fill behaviour. To do this
we go back to creating segments based on memsz instead of filesz and
handling the filesz->memsz gap in drgn_read_memory_file.
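
The gap handling can be modeled with the following simplified sketch. It
uses an in-memory buffer instead of a file, and the function and parameter
names are hypothetical, so it only illustrates the zero fill, not
drgn_read_memory_file() itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified model of one segment: file_data holds the
 * file_size bytes present in the core dump, and the segment covers more
 * memory than that (filesz < memsz). Reads past file_size return zeros.
 * The caller must ensure offset + count does not exceed the segment's memsz.
 */
static void read_segment_zero_fill(const uint8_t *file_data, uint64_t file_size,
				   uint64_t offset, void *buf, size_t count)
{
	size_t from_file = 0;
	if (offset < file_size) {
		uint64_t avail = file_size - offset;
		from_file = avail < count ? (size_t)avail : count;
		memcpy(buf, file_data + offset, from_file);
	}
	/* Everything beyond the file-backed part is the excluded zero range. */
	memset((char *)buf + from_file, 0, count - from_file);
}
```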

Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
2022-08-25 11:59:39 -07:00
Omar Sandoval
ca373fe38a docs: use "programmable debugger" description consistently
Replace the old "Scriptable debugger library" and
"Debugger-as-a-library" taglines with the one we're using on GitHub,
"Programmable debugger". Make up for it by emphasizing that drgn can
also be used as a library a tiny bit more in the README.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-19 01:21:32 -07:00
Michel Alexandre Salim
c0ed1a3203 Fix spelling error
abbrevation => abbreviation; caught by Debian's lintian

Signed-off-by: Michel Alexandre Salim <michel@michel-slm.name>
2022-08-17 21:45:51 -07:00
Omar Sandoval
6c90315f6f python: fix FaultError reference leak
PyErr_SetObject() takes a reference on the exception value, so we need
to drop the reference we got when we created the value. Issue #196 ran
into this by reading tons of unmapped addresses.

Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-16 17:35:36 -07:00
Omar Sandoval
a19203a73e libdrgn: fix QEMU guest memory dump Kconfig suggestion
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.

Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-15 15:11:56 -07:00
Omar Sandoval
faaf01ad1b Add drgn.StackTrace.prog and drgn_stack_trace_program()
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-11 14:45:54 -07:00
Omar Sandoval
e3ba4d2f99 drgn 0.0.20
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-25 16:52:28 -07:00
Omar Sandoval
e9d16732d6 libdrgn: x86_64: fix page table iteration over non-canonical range
We're currently checking whether the iterator has entered the
non-canonical range when fetching the last level of the page table, but
the cutover actually happens while we're in the last level. Fix it by
doing the check unconditionally.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-24 00:03:45 -07:00
Jay Kamat
063850325f libdrgn: dwarf: look up complete types in namespaces
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.

To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2022-07-15 16:02:56 -07:00
Omar Sandoval
db3babd42e libdrgn: aarch64: implement page table iterator
Now that we made the other memory management helpers generic, the last
thing to implement for AArch64 is page table walking. This looks a lot
like the x86-64 equivalent but has to support the various page and
virtual address sizes that can be configured for AArch64.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:23:08 -07:00
Omar Sandoval
b28bd9f0a3 libdrgn: linux_kernel: get vmemmap generically
AArch64 has changed the location of vmemmap multiple times, and not all
of these can be easily distinguished. Rather than resorting to kernel
version checks, this replaces the vmemmap architecture callback with a
generic approach that gets the vmemmap address directly from the
mem_section table.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
a213573b23 libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic
On x86-64, the difference between virtual addresses in the direct map
and the corresponding physical addresses is called PAGE_OFFSET, so we
exposed that via an architecture callback and the Linux kernel object
finder. However, this doesn't translate to other architectures. Namely,
on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both
PAGE_OFFSET and PHYS_OFFSET have varied over time and between
configurations.

We can remove the architecture callback and avoid version-specific logic
by letting the page table tell us the offset. We just need an address in
the direct map, which is easy to find since this includes kmalloc and
memblock allocations.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
5fe38c7371 libdrgn: linux_kernel: fix read_vm() coalescing comparison
linux_helper_read_vm() has logic to merge adjacent physical address
ranges returned by the page table iterator. However, the check for
whether the ranges are adjacent is incorrect. Fix it.
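
This message doesn't spell out the incorrect comparison, so the following is
only a sketch of what a correct adjacency check looks like, with
hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a physically contiguous range. */
struct phys_range {
	uint64_t start;
	uint64_t size;
};

/* Merge next into prev only if next begins exactly where prev ends. */
static bool try_coalesce(struct phys_range *prev, const struct phys_range *next)
{
	if (prev->start + prev->size != next->start)
		return false;
	prev->size += next->size;
	return true;
}
```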

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
571949a743 libdrgn: x86_64: don't bother zeroing cached page table on initialization
pgtable_iterator_x86_64::table is only used if
pgtable_iterator_x86_64::index indicates that it has any cached entries,
so there's no point initializing table since we initialize index to
indicate that nothing is cached.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
36fecd1ded libdrgn: refactor page table iterators
AArch64 will need different sizes of page table iterators depending on
the page size and virtual address size. Rather than the static
pgtable_iterator_arch_size, allow architectures to define callbacks for
allocating and freeing a page table iterator. Also remove the generic
page table iterator wrapper and just pass that information to the
iterator function.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
b3a6d6a35f libdrgn: linux_kernel: cache PAGE_SHIFT derived from PAGE_SIZE
Rather than computing it every time we need it, compute it once when we
parse PAGE_SIZE from VMCOREINFO (and validate that PAGE_SIZE is a power
of two). This will be more important for AArch64 page table walking.
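
The validation and derivation can be sketched like this (the helper name
and signature are assumptions for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns false if page_size is not a power of two;
 * otherwise stores log2(page_size) in *page_shift_ret and returns true.
 */
static bool page_shift_from_size(uint64_t page_size, int *page_shift_ret)
{
	/* A power of two has exactly one bit set. */
	if (page_size == 0 || (page_size & (page_size - 1)) != 0)
		return false;
	int shift = 0;
	while ((page_size >> shift) != 1)
		shift++;
	*page_shift_ret = shift;
	return true;
}
```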

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:09 -07:00
Kevin Svetlitski
5aaf3db6fc libdrgn: support reference and absent objects with float types which aren't 32 or 64 bits
Very similar to a541e9b170, but adds
partial support for floats (as opposed to integers) which aren't 32 or
64 bits.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-07-06 15:47:18 -07:00
Kevin Svetlitski
661d6a186c Add support for UTF character base types
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-07-06 09:44:16 -07:00
Omar Sandoval
2bd861f719 libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO
Issue #182 reported that a core dump created by QEMU's dump-guest-memory
command confuses drgn: by default, it only has NT_PRSTATUS notes and
QEMU state notes for each CPU, so drgn thinks it's a userspace core
dump, and it doesn't have the necessary VMCOREINFO to use it as a Linux
kernel core dump.

It turns out that QEMU and Linux can cooperate to add a VMCOREINFO note
to the guest memory dump, which suffices for drgn. Let's detect a QEMU
guest memory dump without a VMCOREINFO note and include instructions on
how to capture a QEMU dump that makes drgn happy.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-28 00:41:05 -07:00
Omar Sandoval
63c0684b68 libdrgn: aarch64: mask away pointer authentication code in return addresses
Now that we track RA_SIGN_STATE and get the pointer authentication code
mask, we can remove the pointer authentication code from the return
address while unwinding. Add a new architecture callback,
->demangle_return_address(), for this purpose.
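
A minimal Python sketch of the idea, assuming a 64-bit address whose non-PAC upper bits are zero (the function name strip_pac, its parameters, and the mask value below are hypothetical; this is not the libdrgn callback itself):

```python
def strip_pac(return_address: int, pac_mask: int, ra_signed: bool) -> int:
    """Clear the PAC bits from a return address if it is marked as signed."""
    if not ra_signed:
        # RA_SIGN_STATE says the address carries no PAC; leave it alone.
        return return_address
    # Drop every bit covered by the mask and keep the result within 64 bits.
    return return_address & ~pac_mask & 0xFFFFFFFFFFFFFFFF
```

For example, with a hypothetical mask of 0x007f000000000000, the signed address 0x0012000000401234 is reduced to 0x401234. Addresses whose upper bits are expected to be ones are outside the scope of this sketch.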

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
61befc1606 libdrgn: parse AArch64 PAC mask from core dumps
In order to support removing the pointer authentication code (PAC) from
return addresses on AArch64, we need to know what bits are being used
for the PAC. We can get this from the NT_ARM_PAC_MASK note in userspace
core dumps and from the NUMBER(KERNELPACMASK) field in VMCOREINFO for
Linux kernel core dumps.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
3cba315293 libdrgn: linux_kernel: use memswitch for drgn_program_parse_vmcoreinfo()
We currently have 5 names that we match against, and there are more on
the way, so we might as well use a memswitch.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
9da9f6a871 libdrgn: fold struct vmcoreinfo into struct drgn_program
In an upcoming commit, we will parse the AArch64 pointer authentication
code mask either from the VMCOREINFO note or the NT_ARM_PAC_MASK note.
Since it doesn't always come from VMCOREINFO, it doesn't make sense to
put it in struct vmcoreinfo; struct drgn_program makes more sense. So,
make parse_vmcoreinfo() take struct drgn_program instead of struct
vmcoreinfo, rename it to drgn_program_parse_vmcoreinfo(), and replace
struct vmcoreinfo with an anonymous struct in struct drgn_program.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
4d1b608507 libdrgn: aarch64: add RA_SIGN_STATE pseudo-register and DW_CFA_AARCH64_negate_ra_state
The RA_SIGN_STATE pseudo-register indicates whether the return address
is signed with a pointer authentication code. Add it to the register
definitions. It can be set through a normal CFI register rule or the
vendor-specific DW_CFA_AARCH64_negate_ra_state rule.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
9c9a2136f1 libdrgn: cfi: add rule to set register to constant
This will be used to implement DW_CFA_AARCH64_negate_ra_state. Also fix
a typographical error in a nearby comment.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
6bc55036e2 libdrgn: aarch64: add stack unwinding support
Add the basic register definitions and stack unwinding support
functions. Pointer authentication support will be added in subsequent
commits.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
ad119cc1a6 libdrgn: ppc64: fix fallback unwinding
Reading the ABI specification, I realized that fallback_unwind_ppc64()
is completely wrong. Fix it.

Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
deabe2cb56 libdrgn: register_state: add and use drgn_register_state_get_u64()
This factors out some boilerplate for getting registers as a uint64_t.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
0a7849d791 libdrgn: rename drgn_register_state_set_from_integer() -> from_u64()
This is for consistency with drgn_register_state_get_u64() that we're
about to add.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
cbdf6094b7 libdrgn: ppc64: fix DWARF link register confusion
The usage of the link register in DWARF is a little confusing. On entry
to a function, the link register contains the address that should be
returned to. However, for DWARF, the link register is usually used as
the CFI return_address_register, which means that in an unwound frame,
it will contain the same thing as the program counter. I initially
thought that this was a mistake, believing that the link register should
contain the _next_ return address. However, after a return (with the blr
instruction), the link register will indeed contain the same address as
the program counter. This is consistent with our documentation of
register values for function call frames: "the register values are the
values when control returns to this frame".

So, rename our internal "ra" register to "lr", expose it to the API, and
add a little more documentation to the ppc64 initial register code.

Fixes: 221a218704 ("libdrgn: add powerpc stack trace support")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
49ae42ccfd libdrgn: x86-64: add a few more register definitions
In addition to the general-purpose registers, struct pt_regs also
provides the cs and ss segment registers and the rflags register.
elf_gregset_t provides the other segment registers as well. We should
expose all of those.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
33d14f7703 libdrgn: rework architecture definition files
Currently, register definitions are split across two files:
arch_foo.defs lists the names of registers, and arch_foo.c defines the
layout used to store registers in memory. The main rationale for this
was that the layout could be processed entirely by the C preprocessor,
but the register names needed an AWK script that we wanted to keep
minimal. But since commit af6f5a887d ("libdrgn: replace gen_arch.awk
with gen_arch_inc_strswitch.py"), arch_foo.defs is processed by a Python
script.

Let's define both the register names and the register layout in a new
file, arch_foo_defs.py, which is processed by gen_arch_inc_strswitch.py.
This has a few benefits:

* It puts all of the register definitions for an architecture in one
  place.
* It is easier to maintain than preprocessor magic. (It also makes it
  trivial to support registers that don't exist in DWARF, which would've
  been harder to do with our preprocessor code.)
* It gets rid of our DSL in favor of Python (which also lets us reduce
  repetition for the ppc64 definitions).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:26 -07:00
Omar Sandoval
5681b99c3a libdrgn: remove unused struct drgn_register::dwarf_number
This hasn't been used since commit eec67768aa ("libdrgn: replace
elfutils DWARF unwinder with our own").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-24 09:17:56 -07:00
Omar Sandoval
42e37e72c1 libdrgn: stack_trace: fix byte order for drgn_stack_frame_register()
drgn_stack_frame_register() gets the register value with copy_lsbytes()
and then byte swaps it if the program's byte order is different from the
host's. But, copy_lsbytes() already fixes the byte order, so this ends
up with the original (wrong) byte order. We also don't need to zero out
the integer that we copy into since copy_lsbytes() also does that.
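
The double conversion can be modeled with a small Python sketch (the function names are hypothetical, and int.from_bytes() stands in for copy_lsbytes() as a step that already produces the correct value):

```python
def read_register_buggy(raw: bytes, program_order: str, host_order: str) -> int:
    # First step: interpret the bytes in the program's byte order. This
    # already yields the correct integer value.
    value = int.from_bytes(raw, program_order)
    # Second step (the bug): byte swap again because the orders differ,
    # which undoes the first conversion.
    if program_order != host_order:
        value = int.from_bytes(value.to_bytes(len(raw), "big"), "little")
    return value


def read_register_fixed(raw: bytes, program_order: str) -> int:
    # One conversion is enough; no extra swap and no pre-zeroing needed.
    return int.from_bytes(raw, program_order)
```

With a big-endian program on a little-endian host, the buggy variant turns the register bytes 00 00 00 00 00 00 12 34 into 0x3412000000000000, while the fixed variant returns 0x1234.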

Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-24 09:17:56 -07:00
Omar Sandoval
ebfcabd5a8 libdrgn: linux_kernel: match explicitly-reported kernel modules by build ID
Currently, we identify explicitly-reported kernel modules by the module
name that we get from the .modinfo or the .gnu.linkonce.this_module
section. However, objcopy --only-keep-debug (used for some Linux distros'
separate debug files) does not keep these sections. This means that
passing a file processed by objcopy --only-keep-debug to, e.g., drgn -s,
fails with "could not find kernel module name".

Instead of using the module name as the identifier, let's use the
module's GNU build ID. We can get it on a live system from
/sys/module/<module>/notes/, and on a core dump from struct
module::notes_attrs (which is the implementation of that sysfs
directory).

This was split out of my larger debug info discovery rework, which will
make more use of the build ID.

Closes #178.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-01 14:21:12 -07:00
Omar Sandoval
c9265ef6d6 libdrgn: move alloc_or_reuse() to util.h
Preparation for using it elsewhere. Also make it take a size_t instead
of uint64_t.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-01 05:11:58 -07:00
Omar Sandoval
3595c81a8c libdrgn: binary_search_tree: move member and entry_to_key to DEFINE_BINARY_SEARCH_TREE_FUNCTIONS()
DEFINE_BINARY_SEARCH_TREE_TYPE() doesn't need these. This is preparation
for a potential new use of a BST. But, it's also a good cleanup on its
own and allows us to move some code out of memory_reader.h and into
memory_reader.c. (This is similar to commit 1339dc6a2f ("libdrgn:
hash_table: move entry_to_key to DEFINE_HASH_TABLE_FUNCTIONS()").)

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-24 15:26:39 -07:00
Omar Sandoval
931d9c999d libdrgn: linux_kernel: get module address range directly
Instead of getting the address range from the sections we find, get it
directly from /proc/modules or from the `struct module`. (We already had
partial code to get the address range, but I can't remember why I didn't
use it.)

The real motivation for this is the upcoming module rework: it'll allow
us to report the module and its address range before iterating through
its sections. But it also means that we don't need the heuristic to
ignore special sections that shouldn't be considered part of the address
range (e.g., .init and .data..percpu; we should be ignoring the latter,
but we get away with not doing so because it's excluded from sysfs).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-23 16:27:49 -07:00
Omar Sandoval
7bbe0bc1c6 libdrgn: add drgn_error_dwrite()
Like drgn_error_fwrite(), but writes to a file descriptor instead of a
stdio stream. This will be used for logging.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-23 11:09:10 -07:00
Omar Sandoval
09ee4daed3 libdrgn: fix drgn_error_fwrite() for DRGN_ERROR_FAULT
drgn_error_fwrite() only calls string_builder_append_error() to get
special formatting for DRGN_ERROR_OS, but DRGN_ERROR_FAULT also needs
special formatting. Rather than needing to keep drgn_error_fwrite() and
string_builder_append_error() in sync, define them both in terms of a
common macro.

Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-23 11:08:54 -07:00
Omar Sandoval
bc582b8a58 drgn 0.0.19
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-18 13:57:02 -07:00
Omar Sandoval
a3b72e33c8 Fix some more flake8 errors
Several have snuck in since the last time I did this in commit
5541fad063 ("Fix some flake8 errors"). Prepare for adding flake8 to
pre-commit by fixing them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-17 15:23:42 -07:00
Omar Sandoval
a541e9b170 libdrgn: support reference and absent objects with >64-bit integer types
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to 2<<24
bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).

Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.

However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.

Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-28 13:38:38 -07:00
Omar Sandoval
084e636341 libdrgn: add DRGN_ERROR_NOT_IMPLEMENTED
This will be used for partial 128-bit object support. There are other
places that should probably be converted to use it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-28 13:38:38 -07:00
Omar Sandoval
ad7e64d9d8 libdrgn: make memdup() take a const void *
Just like strdup(), memdup() doesn't modify the thing it's copying, so
mark it const.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-27 00:37:46 -07:00
Omar Sandoval
14642fb3b6 libdrgn: add stub RISC-V architecture with relocation implementation
The 32-bit and 64-bit variants have different register sizes, so they're
different architectures in drgn. For now, put them in the same file so
that they can share the relocation implementation. We'll need to figure
out how to handle registers later.

P.S. RISC-V has the weirdest relocations so far. /proc/kcore also
appears to be broken.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-19 11:51:23 -07:00
Omar Sandoval
d27204260e libdrgn: add stub Arm architecture with relocation implementation
The only relocation type I saw in Debian's kernel module debug info was
R_ARM_ABS32. R_ARM_REL32 is easy. The Linux kernel supports a bunch of
other ones that don't seem relevant to debug info.

Unfortunately, I wasn't able to test this because /proc/kcore doesn't
exist on Arm. This apparently goes all the way back to 2003:
https://lwn.net/Articles/45315/.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-19 00:25:05 -07:00
Omar Sandoval
3f246f7054 libdrgn: add stub AArch64 architecture with relocation implementation
The only relocation types I saw in Debian's kernel module debug info
were R_AARCH64_ABS64 and R_AARCH64_ABS32. R_AARCH64_ABS16,
R_AARCH64_PREL64, R_AARCH64_PREL32, and R_AARCH64_PREL16 are all easy.
The remaining types supported by the Linux kernel are for movw and
immediate instructions, which aren't relevant to debug info.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-19 00:23:56 -07:00
Omar Sandoval
7535838cd5 libdrgn: add stub i386 architecture with relocation implementation
The only relocation type I saw in Debian's kernel module debug info was
R_386_32. R_386_PC32 is easy. The Linux kernel also supports
R_386_PLT32, but that's the same story as R_X86_64_PLT32 in x86-64, so
we don't implement it for now.

I was torn between naming it i386, x86, or IA-32. x86 isn't immediately
clear whether x86-64 is included or not. No one other than Intel calls
it IA-32. i386 might incorrectly imply that it is strictly the original
i386 instruction set with no later extensions, but the more general
meaning is used frequently in the Linux world (e.g., Debian and QEMU
both call it i386), so I went with that in the end.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-19 00:21:59 -07:00
Omar Sandoval
03f9f339e5 libdrgn: ppc64: add relocation implementation
One of the biggest things we depend on libdwfl for is applying
relocations on architectures other than x86-64. I'm exploring the
possibility of removing the libdwfl dependency, so I'm going to add
relocation implementations for more architectures, starting with ppc64.

R_PPC64_ADDR32 and R_PPC64_ADDR64 were the only ones I saw in Debian's
kernel module debug info. R_PPC64_REL32 and R_PPC64_REL64 are
straightforward. The Linux kernel also implements R_PPC64_TOC*, which
don't seem relevant to debugging information, and R_PPC64_REL24 and
R_PPC64_REL16*, which I'd prefer to have a real example of.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-18 17:56:37 -07:00
Omar Sandoval
da16a12fad libdrgn: x86_64: implement more relocation types
Implement R_X86_64_32S and R_X86_64_PC64. I haven't seen these for debug
info in the wild, but they're supported by the Linux kernel and they're
easy to support. The only other type of relocation currently supported
by the kernel is R_X86_64_PLT32, which is trickier. For kernel modules,
it's equivalent to R_X86_64_PC32 (see Linux kernel commit b21ebf2fb4cd
("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32")), but that doesn't seem to
be true in general. It doesn't seem applicable to debug info sections,
so hopefully we don't need to worry about it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-18 17:56:37 -07:00
Omar Sandoval
b16dad8a36 libdrgn: support SHT_REL relocations
In preparation for supporting ELF relocations for more architectures,
generalize ELF relocations to handle SHT_REL sections/ElfN_Rel.
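
For context, the difference between the two ELF relocation formats is where the addend lives: SHT_RELA entries carry it explicitly, while SHT_REL entries leave it in the bytes being relocated. A Python sketch for a 32-bit absolute relocation (S + A) on little-endian data, with hypothetical function names:

```python
import struct


def apply_abs32_rela(section: bytearray, offset: int,
                     symbol_value: int, addend: int) -> None:
    """RELA style: the addend comes from the relocation entry."""
    value = (symbol_value + addend) & 0xFFFFFFFF
    struct.pack_into("<I", section, offset, value)


def apply_abs32_rel(section: bytearray, offset: int, symbol_value: int) -> None:
    """REL style: the addend is read from the location being relocated."""
    (addend,) = struct.unpack_from("<I", section, offset)
    apply_abs32_rela(section, offset, symbol_value, addend)
```

Expressing the REL case in terms of the RELA case is one way to share the per-type arithmetic; it is shown here only as an illustration of the format difference.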

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-18 17:56:37 -07:00
Omar Sandoval
558aa52d86 libdrgn: hash_table: sanity check integer sizes more
Check that size_t makes sense and make sure int_key_hash_pair() doesn't
get an integer type larger than it supports. I can't imagine either of
these failing in practice, but make our assumptions explicit.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-12 16:18:46 -07:00
Omar Sandoval
af01ee63c5 libdrgn: hash_table: support types larger than size_t for hash_combine()
We call hash_combine() with a uint64_t in
drgn_debug_info_module_key_hash_pair() and drgn_type_dedupe_hash_pair().
On 32-bit systems, this only uses the least-significant 32 bits. Use
hash_64_to_32() on 32-bit and hash_128_to_64() on 64-bit to ensure that
we use all bits if we're given a type larger than size_t, and sanity
check that we're not given anything larger than we support.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-12 16:18:13 -07:00
Omar Sandoval
f43af4b037 libdrgn: fix drgn_program_crashed_thread() on !SMP kernels
On !SMP kernels, crashing_cpu either doesn't exist or is always -1, so
drgn_program_crashed_thread() fails. Detect those cases and treat
crashing_cpu as 0.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-01 15:10:04 -07:00
Omar Sandoval
9803c4ac65 drgn 0.0.18
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-03 00:30:06 -08:00
Omar Sandoval
bf95af8c0d drgn 0.0.17
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-02 23:32:29 -08:00
Omar Sandoval
af6f5a887d libdrgn: replace gen_arch.awk with gen_arch_inc_strswitch.py
Now that we have gen_strswitch.py, there's no reason to keep this AWK
script around. Replace it with a Python script that outputs a strswitch
file. This also gets rid of our gawk dependency.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-02 16:10:43 -08:00
Omar Sandoval
95dff5b755 libdrgn: split some helpers out of gen_strswitch.py
These will be used by other code generation scripts.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-02 16:00:39 -08:00
Omar Sandoval
24609a3a2e libdrgn: add autoconf option to enable compiler warnings
This adds an --enable-compiler-warnings flag that:

* Defines a canonical list of warnings that we enforce. For now, this is
  -Wall -Wformat-overflow=2 -Wformat-truncation=2, but we can add to it
  going forward.
* Enables warnings by default.
* Allows erroring on warnings. We recommend that developers use this and
  use it for the CI.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-01 15:38:05 -08:00
Omar Sandoval
36277e22f3 libdrgn: add autoconf option to enable UBSan
Similar to --enable-asan for ASan, this is enabled with --enable-ubsan.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-01 15:11:16 -08:00
prozak
e8dada0ec1 Enable prog.type to work with classes
Also added test for CPP class type

This is a prerequisite to #83

Signed-off-by: mykolal <nickolay.lysenko@gmail.com>
2022-02-22 14:55:23 -08:00
Omar Sandoval
4f5249775d Fix various lints
Some functions that could be static found by -Wmissing-prototypes, some
include-what-you-use warnings, some missing SPDX identifiers. These
lints should be automated at some point.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-17 10:45:42 -08:00
Omar Sandoval
50e4ac8245 libdrgn: allow overriding program default language
Our cheap heuristic for the default language will not always be correct,
and although we can improve it as cases arise, we should also just have
a way for the user to explicitly set the default language. Add
drgn_program_set_language() to libdrgn and allow setting
drgn.Program.language in the Python bindings. This will also make unit
testing different languages easier.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-16 13:29:12 -08:00
Omar Sandoval
9397a11605 libdrgn: export drgn_language instances
libdrgn currently exports struct drgn_language pointers from
drgn_program_language(), drgn_type_language(), and
drgn_object_language(), but doesn't provide any way to do anything with
them. Export our drgn_language instances and add drgn_language_name() so
that they can at least be compared and printed.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-16 13:07:42 -08:00
Omar Sandoval
5d65ebb04b libdrgn: don't store language structures in one array
In the next change, we want to export languages to the public libdrgn
interface. I couldn't figure out any way to export array elements as
their own symbols. I'd also rather not export the drgn_languages array
indices as an enum because that would preclude ever having any sort of
language plugin support.

Instead, let's get rid of the drgn_languages array as it currently
exists and have separate drgn_language structures. This also allows us
to make a bunch of the C implementation functions static again. We keep
the language numbers so that we can store per-language data efficiently
(currently drgn_program::void_types and languages_py), as well as a
drgn_languages array to go from the language number to the struct
drgn_language. But, this is all internal and could be changed if we ever
support language plugins.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-16 12:47:12 -08:00
Omar Sandoval
12c2de2956 libdrgn: implement thread API for live processes
This implements the existing thread API methods for live processes other
than drgn_thread_stack_trace(). It also doesn't yet add support for
full-blown tracing, but it at least brings live processes to feature
parity. This is taken from the non-ptrace parts of Kevin Svetlitski's
PR #142, with some modifications.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 13:33:41 -08:00
Omar Sandoval
28c5a2016b libdrgn: split up some thread API functions
drgn_thread_iterator_create(), drgn_thread_iterator_next(), and
drgn_program_find_thread() have big, divergent code paths for different
targets, and this would get worse once we add live processes. Split them
up into multiple functions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:42:37 -08:00
Omar Sandoval
c71300d024 libdrgn: use exact buffer sizes when formatting decimal numbers
We have a few places where we format a decimal number with sprintf() or
snprintf() to a buffer with an arbitrary size. Instead of this arbitrary
size, let's add a macro to get the exact number of characters required
to format a decimal number, use it in all of these places, and make all
of these places use snprintf() just to be safe. This is more verbose but
self-documenting. The max_decimal_length() macro is inspired by
https://stackoverflow.com/a/13546502/1811295 with some improvements.
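
To illustrate what "exact number of characters" means, here is a Python sketch that computes the bound directly from the extreme values (it mirrors the purpose of max_decimal_length(), not its C implementation; the signature is an assumption):

```python
def max_decimal_length(bits: int, signed: bool) -> int:
    """Longest decimal string for an integer type, excluding any terminator."""
    if signed:
        # The most negative value is the longest because of the '-' sign.
        return len(str(-(1 << (bits - 1))))
    return len(str((1 << bits) - 1))
```

A C buffer sized this way would still need one extra byte for the null terminator.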

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:16:54 -08:00
Omar Sandoval
98577e5e23 libdrgn: fix drgn_program_find_thread() for Linux kernel when thread isn't found
If a TID does not exist, then linux_helper_find_task() succeeds but
returns a null pointer object. Check for that instead of returning a
bogus thread.

Fixes: 301cc767ba ("Implement a new API for representing threads")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:16:49 -08:00
Mykola Lysenko
7580fffbdf Add drgn.Program.main_thread()
Currently only supported for user-space crash dumps. E.g. no support for
live user-space application debugging or kernel debugging.

Closes #144.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
2022-02-10 15:53:50 -08:00
Omar Sandoval
55e2bc063a libdrgn: python: fix TypeTemplateParameter argument leak
LeakSanitizer reported a leak of a Python object when running
tests.test_type.TestTypeTemplateParameter.test_callable. This one is
caused by a missing Py_DECREF() in an error case.

Fixes: 352c31e1ac ("Add support for C++ template parameters")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-08 17:05:56 -08:00
Omar Sandoval
50ba745c24 libdrgn: python: fix drgn_thread_iterator leak
LeakSanitizer reported a leak of a drgn_thread_iterator when running the
unit tests. The root cause is that ThreadIterator_dealloc() isn't
freeing the underlying drgn_thread_iterator.

Fixes: 301cc767ba ("Implement a new API for representing threads")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-08 17:05:36 -08:00
Omar Sandoval
914ad8c53d libdrgn: use memswitch for linux_kernel_object_find
Replace the hand-written if-else ladder of memcmp() calls with a
memswitch.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-08 02:03:11 -08:00
Omar Sandoval
c49bba41b2 libdrgn: language_c: replace c_keywords with memswitch
GCC or binutils on Fedora Rawhide for ARM seems to have a bug where
c_keywords gets placed in the .data.rel.ro section (see
https://www.airs.com/blog/archives/189):

$ readelf -s .libs/libdrgnimpl_la-language_c.o | grep -w c_keywords
   475: 00000000    16 OBJECT  LOCAL  DEFAULT  175 c_keywords
$ readelf -S .libs/libdrgnimpl_la-language_c.o | grep -F '[175]'
  [175] .data.rel         PROGBITS        00000000 051f90 000010 00  WA  0   0  4
$ readelf -s .libs/_drgn.so | grep -w c_keywords
  9267: 0008e84c    16 OBJECT  LOCAL  DEFAULT   21 c_keywords.lto_priv.0
$ readelf -S .libs/_drgn.so | grep -F '[21]'
  [21] .data.rel.ro      PROGBITS        0008e018 07e018 000a10 00  WA  0   0  8

This results in a crash on startup when c_keywords_init() attempts to
populate c_keywords.

While this appears to be a compiler or linker bug, I've been meaning to
replace c_keywords with a static lookup function anyways. Now that we
have gen_strswitch.py, we can use it to generate the lookup function.
Add a script, gen_c_keywords_inc_strswitch.py, which generates an array
mapping token kind to spelling, and a memswitch mapping spelling to
token kind.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-04 20:26:35 -08:00
Omar Sandoval
da01da3a2d Add gen_strswitch.py
We have multiple places where we match an input string against several
cases:

* drgn_lexer_c() checks C identifiers against a runtime hash table of C
  keywords.
* linux_kernel_object_find() has an if-else ladder of checks for object
  names.
* drgn_debug_info_find_sections() loops over an array of ELF section
  names to look for sections we need.
* libdrgn/build-aux/gen_arch.awk generates a compile-time trie using
  nested switch statements to match register names.

This commit adds a script, gen_strswitch.py, that can hopefully be used
to replace all of these. gen_strswitch.py generalizes the compile-time
trie idea from gen_arch.awk in a few ways:

* It has syntax and semantics based on C switch statements.
* It supports both null-terminated strings and strings with an explicit
  length.
* It compresses unique substrings to calls to strcmp(), strncmp(), or
  memcmp() when appropriate.

In benchmarks, this approach is more performant than the above options
as well as a candidate based on gperf, while resulting in machine code
around the same size as the straightforward if-else ladder approach.
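
The trie idea can be sketched at runtime in Python (gen_strswitch.py itself emits C switch statements; the build_trie and lookup names and the nested-dict representation here are hypothetical):

```python
def build_trie(cases: dict):
    """Build a nested-dict trie; a unique remaining suffix becomes a leaf."""
    def build(items, depth):
        if len(items) == 1:
            key, value = items[0]
            # Only one candidate left: compare the rest in one step,
            # analogous to a single strcmp()/memcmp() call.
            return ("leaf", key[depth:], value)
        groups = {}
        for key, value in items:
            # "" marks a key that ends exactly at this depth.
            ch = key[depth] if depth < len(key) else ""
            groups.setdefault(ch, []).append((key, value))
        node = {}
        for ch, group in groups.items():
            if ch == "":
                node[""] = ("leaf", "", group[0][1])
            else:
                node[ch] = build(group, depth + 1)
        return node
    return build(sorted(cases.items()), 0)


def lookup(trie, s: str, default=None):
    """Walk one character per level, then compare the remaining suffix."""
    node, depth = trie, 0
    while isinstance(node, dict):
        ch = s[depth] if depth < len(s) else ""
        if ch not in node:
            return default
        node = node[ch]
        if ch == "":
            break
        depth += 1
    _, suffix, value = node
    return value if s[depth:] == suffix else default
```

Each dict level corresponds to one switch on a character, and each leaf corresponds to the point where the generated code would fall back to a single string comparison.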

Future commits will convert the use cases above to use this script.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-04 20:26:35 -08:00
Omar Sandoval
41de5d72a2 Require Python to build libdrgn
Currently, Python is only required to build the Python bindings. I
originally wanted to avoid having Python as a build dependency of
libdrgn, which is why gen_arch is an AWK script. However, I want to add
another code generation script which is harder to do in AWK.
Additionally, these days more people are familiar with Python than AWK,
so let's just bite the bullet and require Python to build. No one builds
libdrgn by itself anyways.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-04 20:26:35 -08:00
Kevin Svetlitski
0b9f03752a Add autoconf option to enable ASAN
ASAN is incredibly useful during development, especially when dealing
with non-deterministic behavior where re-running the code under a debugger
won't necessarily reproduce the bug each time. In order not to break any
existing workflows, building with ASAN is opt-in (via --enable-asan).

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-02-02 17:04:05 -08:00
Kevin Svetlitski
d51843017e Fix double-free of crashed_thread
Running the test suite with ASAN enabled revealed that when
the current target was a userspace core dump, the `crashed_thread`
member of `struct drgn_program` was being freed twice – once indirectly
via `drgn_thread_set_deinit`, and once explicitly in `drgn_prog_deinit`.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-02-02 16:51:21 -08:00
Omar Sandoval
e59c779652 libdrgn: link against libm
libdrgn uses rint() for formatting floating-point numbers. rint() is
provided by libm, so we need to link with -lm.

This missing library has been masked for a couple of reasons:

1. Python is linked against libm, so the drgn Python bindings implicitly
   have this dependency satisfied.
2. On x86-64, GCC has a builtin implementation of rint().

This can be reproduced on x86-64 by building examples/load_debug_info
with -fno-builtin.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-27 17:40:07 -08:00
Omar Sandoval
929b7de266 libdrgn: handle reading data from SHT_NOBITS sections
Peilin Ye reported a couple of related crashes in drgn caused by Linux
kernel modules which had been processed with objcopy --only-keep-debug
(although he notes that since binutils-gdb commit 8c803a2dd7d3
("elf_backend_section_flags and _bfd_elf_init_private_section_data") (in
binutils v2.35), objcopy --only-keep-debug doesn't seem to work for
kernel modules).

If given an SHT_NOBITS section, elf_getdata() returns an Elf_Data with
d_buf = NULL and d_size set to the size in the section header, which is
often non-zero. There are a few places where this can cause us to
dereference a NULL pointer:

* In relocate_elf_sections() for the relocated section data.
* In relocate_elf_sections() for the symbol table section data.
* In get_kernel_module_name_from_modinfo().
* In get_kernel_module_name_from_this_module().

Fix it by checking the section type or directly checking Elf_Data::d_buf
everywhere that could potentially get an SHT_NOBITS section. This is
based on a PR from Peilin Ye.

Closes #145.

Reported-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-27 12:23:09 -08:00
Omar Sandoval
8e8e3a4f57 libdrgn: debug_info: refactor relocate_elf_section()
relocate_elf_section() shouldn't need to deal with reading the sections.
Pull that logic out into relocate_elf_file() (which will be shared with
REL-style relocations when we support those) and rename
relocate_elf_section() to apply_elf_relas().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-27 12:22:57 -08:00
Omar Sandoval
26ff3667cb libdrgn: debug_info: use elf_rawdata() instead of elf_getdata()
Most of the places where we call elf_getdata() (via read_elf_section())
deal with SHT_PROGBITS sections. elf_getdata() always returns the
literal contents of the file for SHT_PROGBITS sections (usually straight
out of the mmap'd file).

The exceptions are the SHT_RELA and SHT_SYMTAB sections in
relocate_elf_section(). For these, if the byte order or alignment of the
file do not match the host, elf_getdata() allocates a new buffer and
converts the contents. relocate_elf_section() also handles unaligned
buffers and swaps the byte order, so it mistakenly ends up with the
original byte order of the file.

Rather than removing that from relocate_elf_section(), let's avoid the
extra allocation and use elf_rawdata(), which always returns the literal
contents of the file.
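
The double conversion described above can be modeled in a few lines of
Python. This is only a sketch of the failure mode, not libdrgn code: it
assumes a big-endian file on a little-endian host, and swap32() is a
hypothetical stand-in for both the conversion done by elf_getdata() and
the swap done by relocate_elf_section().

```python
import struct


def swap32(raw: bytes) -> bytes:
    """Reverse the byte order of a single 32-bit value."""
    return raw[::-1]


# Assumption: the file is big-endian and the host is little-endian.
file_bytes = struct.pack(">I", 0x11223344)  # literal contents of the file

# Stand-in for elf_getdata(): converts the contents to host byte order.
converted = swap32(file_bytes)
# Stand-in for the caller's own swap, applied on top of the conversion.
double_swapped = swap32(converted)

# Stand-in for elf_rawdata(): no conversion, so the caller swaps only once.
single_swapped = swap32(file_bytes)

# Interpret both results the way a little-endian host would.
wrong_value = struct.unpack("<I", double_swapped)[0]
host_value = struct.unpack("<I", single_swapped)[0]
```

Swapping twice hands back the file's original bytes, so the host reads
the wrong value; starting from the raw contents, the single swap yields
the intended one.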

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-27 12:20:10 -08:00
Omar Sandoval
0a643b6fab python: allow Program.type() to accept a Type
Some helpers can accept either a str or a Type. If they want to always
work with a Type internally, they need to do something like:

  if isinstance(type, str):
      type = prog.type(type)

Instead, let's let Program.type() accept a Type and return the exact
same type, so those helpers can unconditionally do:

  type = prog.type(type)
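
A minimal sketch of the pattern, using stand-in classes rather than the
real drgn bindings. The Type and Program classes and the type_name()
helper below are hypothetical and exist only to show the pass-through
behavior:

```python
from typing import Dict, Union


class Type:
    """Stand-in for a drgn Type; only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


class Program:
    """Stand-in for a drgn Program with a toy type cache."""

    def __init__(self) -> None:
        self._types: Dict[str, Type] = {}

    def type(self, type_or_name: Union[str, Type]) -> Type:
        # A Type is returned as-is; a str is looked up by name.
        if isinstance(type_or_name, Type):
            return type_or_name
        return self._types.setdefault(type_or_name, Type(type_or_name))


def type_name(prog: Program, type: Union[str, Type]) -> str:
    # Hypothetical helper: normalizes its argument unconditionally.
    type = prog.type(type)
    return type.name
```

With this, type_name(prog, "int") and type_name(prog, prog.type("int"))
take the same code path.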

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-21 16:52:36 -08:00
Stephen Brennan
7970a60818 Add methods to return multiple matching symbols
Currently we can look up symbols by name or address, but this will only
return one symbol, prioritizing the global symbols. However, symbols may
share the same name, and symbols may also overlap address ranges, so
it's possible for searches to return multiple results. Add functions
which can return a list of multiple matching symbols.
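
The difference between the two kinds of lookup can be sketched with a
toy symbol table. Everything below (the Symbol fields, SYMBOLS,
find_symbol(), and find_symbols()) is hypothetical and only models the
behavior described above; it is not the drgn API.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Symbol:
    name: str
    address: int
    size: int
    is_global: bool


# Toy table: two symbols share a name, and two overlap in address range.
SYMBOLS = [
    Symbol("init", 0x1000, 0x40, False),
    Symbol("init", 0x2000, 0x20, True),
    Symbol("alias", 0x2000, 0x20, False),
]


def find_symbols(key: Union[str, int]) -> List[Symbol]:
    """Return every symbol matching a name or containing an address."""
    if isinstance(key, str):
        return [s for s in SYMBOLS if s.name == key]
    return [s for s in SYMBOLS if s.address <= key < s.address + s.size]


def find_symbol(key: Union[str, int]) -> Symbol:
    """Return a single match, preferring global symbols."""
    matches = find_symbols(key)
    if not matches:
        raise LookupError(f"no symbol matches {key!r}")
    return max(matches, key=lambda s: s.is_global)
```

Here find_symbol("init") picks the global definition, while
find_symbols("init") reports both, and find_symbols(0x2010) reports the
two symbols that overlap at that address.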

Signed-off-by: Stephen Brennan <stephen@brennan.io>
2022-01-15 11:44:33 -08:00
Stephen Brennan
52b96aed88 Run pre-commit on all files
`pre-commit run --all-files` results in the following minor
updates, which appear to be caused by my own failure to run linters.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-01-14 13:31:16 -08:00
Kevin Svetlitski
301cc767ba Implement a new API for representing threads
Previously, drgn had no way to represent a thread – retrieving a stack
trace (the only extant thread-specific operation) was achieved by
requiring the user to directly provide a tid.

This commit introduces the scaffolding for the design outlined in
issue #92, and implements the corresponding methods for userspace core
dumps, the live Linux kernel, and Linux kernel core dumps. Future work
will build on top of this commit to support live userspace processes.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Kevin Svetlitski
78139b6ba3 libdrgn: add Linux kernel task iterator
The thread API needs a way to iterate over all task_structs in the
kernel. Previously, we translated the existing for_each_task helper,
which supports iterating through specific PID namespaces by walking
through the PID radix tree or PID hashtable. However, we don't need
specific namespaces for the thread API, so we can instead use the much
simpler linked lists of thread groups and threads.
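
The simpler traversal can be sketched as a nested walk: visit each
thread group leader, then the threads in its group. The model below
uses plain Python lists in place of the kernel's linked lists, and the
Task class and its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class Task:
    pid: int
    # For a thread group leader: the other threads in its group.
    threads: List["Task"] = field(default_factory=list)


def for_each_thread(group_leaders: Iterable[Task]) -> Iterator[Task]:
    """Yield every task: each group leader followed by its threads."""
    for leader in group_leaders:
        yield leader
        yield from leader.threads
```

No PID namespace lookup is involved; every task is reached by following
the two lists.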

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Omar Sandoval
95c4e2d748 Revert "Rewrite linux helper iterators in C"
This reverts commit 2b47583c73. After
Kevin had completed this, we realized that there is a simpler method for
iterating through tasks from libdrgn, which the next commit will
implement. Revert the translation, but keep the improved
tests.helpers.linux.test_pid.TestPid.test_for_each_task.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-11 17:28:17 -08:00
Omar Sandoval
69c069b09f libdrgn: allow NULL argument to drgn_stack_trace_destroy()
This is one place where I broke the convention that I just documented.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-06 18:23:27 -08:00
Omar Sandoval
2ff58a4d45 libdrgn: linux: make per_cpu_ptr() support !SMP kernels
Kernels built without multiprocessing support don't have
__per_cpu_offset; instead, per_cpu_ptr() is a no-op. Make the helper do
the same and update the test case to work on !SMP as well.
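
A rough model of the two cases, with pointers as plain integers. The
function signature is hypothetical: passing None for the offset table
stands in for a kernel that has no __per_cpu_offset.

```python
from typing import Optional, Sequence


def per_cpu_ptr(
    ptr: int, cpu: int, per_cpu_offset: Optional[Sequence[int]]
) -> int:
    """Return the address of a per-CPU variable for the given CPU."""
    if per_cpu_offset is None:
        # !SMP: there is only one copy, so the pointer is used as-is.
        return ptr
    # SMP: each CPU's copy lives at a per-CPU offset from the base.
    return ptr + per_cpu_offset[cpu]
```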

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 16:51:15 -08:00