Commit Graph

120 Commits

Author SHA1 Message Date
Glen McCready
9684771d61 libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure
was all zeros so the page was excluded and we were generating a
FaultError trying to access the structure.

This change reverts a portion of that behaviour such that when we are
debugging a kernel core we go back to the zero fill behaviour. To do this
we go back to creating segments based on memsz instead of filesz and
handling the filesz->memsz gap in drgn_read_memory_file.

Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
2022-08-25 11:59:39 -07:00
Omar Sandoval
a19203a73e libdrgn: fix QEMU guest memory dump Kconfig suggestion
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.

Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-15 15:11:56 -07:00
Omar Sandoval
a213573b23 libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic
On x86-64, the difference between virtual addresses in the direct map
and the corresponding physical addresses is called PAGE_OFFSET, so we
exposed that via an architecture callback and the Linux kernel object
finder. However, this doesn't translate to other architectures. Namely,
on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both
PAGE_OFFSET and PHYS_OFFSET have varied over time and between
configurations.

We can remove the architecture callback and avoid version-specific logic
by letting the page table tell us the offset. We just need an address in
the direct map, which is easy to find since this includes kmalloc and
memblock allocations.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
36fecd1ded libdrgn: refactor page table iterators
AArch64 will need different sizes of page table iterators depending on
the page size and virtual address size. Rather than the static
pgtable_iterator_arch_size, allow architectures to define callbacks for
allocating and freeing a page table iterator. Also remove the generic
page table iterator wrapper and just pass that information to the
iterator function.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
2bd861f719 libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO
Issue #182 reported that a core dump created by QEMU's dump-guest-memory
command confuses drgn: by default, it only has NT_PRSTATUS notes and
QEMU state notes for each CPU, so drgn thinks it's a userspace core
dump, and it doesn't have the necessary VMCOREINFO to use it as a Linux
kernel core dump.

It turns out that QEMU and Linux can cooperate to add a VMCOREINFO note
to the guest memory dump, which suffices for drgn. Let's detect a QEMU
guest memory dump without a VMCOREINFO note and include instructions on
how to capture a QEMU dump that makes drgn happy.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-28 00:41:05 -07:00
Omar Sandoval
61befc1606 libdrgn: parse AArch64 PAC mask from core dumps
In order to support removing the pointer authentication code (PAC) from
return addresses on AArch64, we need to know what bits are being used
for the PAC. We can get this from the NT_ARM_PAC_MASK note in userspace
core dumps and from the NUMBER(KERNELPACMASK) field in VMCOREINFO for
Linux kernel core dumps.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
9da9f6a871 libdrgn: fold struct vmcoreinfo into struct drgn_program
In an upcoming commit, we will parse the AArch64 pointer authentication
code mask either from the VMCOREINFO note or the NT_ARM_PAC_MASK note.
Since it doesn't always come from VMCOREINFO, it doesn't make sense to
put it in struct vmcoreinfo; struct drgn_program makes more sense. So,
make parse_vmcoreinfo() take struct drgn_program instead of struct
vmcoreinfo, rename it to drgn_program_parse_vmcoreinfo(), and replace
struct vmcoreinfo with an anonymous struct in struct drgn_program.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
f43af4b037 libdrgn: fix drgn_program_crashed_thread() on !SMP kernels
On !SMP kernels, crashing_cpu either doesn't exist or is always -1, so
drgn_program_crashed_thread() fails. Detect those cases and treat
crashing_cpu as 0.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-01 15:10:04 -07:00
Omar Sandoval
50e4ac8245 libdrgn: allow overriding program default language
Our cheap heuristic for the default language will not always be correct,
and although we can improve it as cases arise, we should also just have
a way for the user to explicitly set the default language. Add
drgn_program_set_language() to libdrgn and allow setting
drgn.Program.language in the Python bindings. This will also make unit
testing different languages easier.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-16 13:29:12 -08:00
Omar Sandoval
12c2de2956 libdrgn: implement thread API for live processes
This implements the existing thread API methods for live processes other
than drgn_thread_stack_trace(). It also doesn't yet add support for
full-blown tracing, but it at least brings live processes to feature
parity. This is taken from the non-ptrace parts of Kevin Svetlitski's
PR #142, with some modifications.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 13:33:41 -08:00
Omar Sandoval
28c5a2016b libdrgn: split up some thread API functions
drgn_thread_iterator_create(), drgn_thread_iterator_next(), and
drgn_program_find_thread() have big, divergent code paths for different
targets, and this would get worse once we add live processes. Split them
up into multiple functions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:42:37 -08:00
Omar Sandoval
c71300d024 libdrgn: use exact buffer sizes when formatting decimal numbers
We have a few places where we format a decimal number with sprintf() or
snprintf() to a buffer with an arbitrary size. Instead of this arbitrary
size, let's add a macro to get the exact number of characters required
to format a decimal number, use it in all of these places, and make all
of these places use snprintf() just to be safe. This is more verbose but
self-documenting. The max_decimal_length() macro is inspired by
https://stackoverflow.com/a/13546502/1811295 with some improvements.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:16:54 -08:00
Omar Sandoval
98577e5e23 libdrgn: fix drgn_program_find_thread() for Linux kernel when thread isn't found
If a TID does not exist, then linux_helper_find_task() succeeds but
returns a null pointer object. Check for that instead of returning a
bogus thread.

Fixes: 301cc767ba ("Implement a new API for representing threads")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-12 01:16:49 -08:00
Mykola Lysenko
7580fffbdf Add drgn.Program.main_thread()
Currently only supported for user-space crash dumps. E.g. no support for
live user-space application debugging or kernel debugging.

Closes #144.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
2022-02-10 15:53:50 -08:00
Kevin Svetlitski
d51843017e Fix double-free of crashed_thread
Running the test suite with ASAN enabled revealed that when
the current target was a userspace core dump, the `crashed_thread`
member of `struct drgn_program` was being freed twice – once indirectly
via `drgn_thread_set_deinit`, and once explicitly in `drgn_prog_deinit`.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-02-02 16:51:21 -08:00
Stephen Brennan
7970a60818 Add methods to return multiple matching symbols
Currently we can lookup symbols by name or address, but this will only
return one symbol, prioritizing the global symbols. However, symbols may
share the same name, and symbols may also overlap address ranges, so
it's possible for searches to return multiple results. Add functions
which can return a list of multiple matching symbols.

Signed-off-by: Stephen Brennan <stephen@brennan.io>
2022-01-15 11:44:33 -08:00
Kevin Svetlitski
301cc767ba Implement a new API for representing threads
Previously, drgn had no way to represent a thread – retrieving a stack
trace (the only extant thread-specific operation) was achieved by
requiring the user to directly provide a tid.

This commit introduces the scaffolding for the design outlined in
issue #92, and implements the corresponding methods for userspace core
dumps, the live Linux kernel, and Linux kernel core dumps. Future work
will build on top of this commit to support live userspace processes.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Omar Sandoval
e6abfeac03 libdrgn: debug_info: report userspace core dump debug info ourselves
There are a few reasons for this:

1. dwfl_core_file_report() crashes on elfutils 0.183-0.185. Those
   versions are still used by several distros.
2. In order to support --main-symbols and --symbols properly, we need to
   report things ourselves.
3. I'm considering moving away from libdwfl in the long term.

We provide an escape hatch for now: setting the environment variable
DRGN_USE_LIBDWFL_REPORT=1 opts out of drgn's reporting and uses
libdwfl's.

Fixes #130.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 12:11:10 -08:00
Omar Sandoval
02912ca7d0 libdrgn: fix handling of p_filesz < p_memsz in core dumps
I implemented the case of a segment in a core file with p_filesz <
p_memsz by treating the difference as zero bytes. This is correct for
ET_EXEC and ET_DYN, but for ET_CORE, it actually means that the memory
existed in the program but was not saved. For userspace core dumps, this
typically happens for read-only file mappings. For kernel core dumps,
makedumpfile does this to indicate memory that was excluded.

Instead, let's return a DRGN_FAULT_ERROR if an attempt is made to read
from these bytes. In the future, we need to read from the
executable/library files when we can.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 00:02:44 -08:00
Omar Sandoval
91f6d03ee8 libdrgn: fix note name matching
The current code matches the desired note name as a prefix, but we need
an exact match.

Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-03 12:03:19 -08:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
ff40f65f0d libdrgn: allow symbol name lookup to get local symbols
Global symbols are preferred over weak symbols, and weak symbols are
preferred over other symbols.

dwfl_module_addrinfo() seems to have the same preference, so document
address lookups as having the same behavior. (This is actually incorrect
in the case of STB_GNU_UNIQUE, as dwfl_module_addrinfo() treats anything
other than STB_GLOBAL, STB_WEAK, and STB_LOCAL as having the lowest
precedence, but STB_GNU_UNIQUE is so obscure that it probably doesn't
matter.)

Based on work from Stephen Brennan. Closes #121.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 14:30:57 -08:00
Omar Sandoval
5591d199b1 libdrgn: debug_info: split DWARF support into its own file
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:54 -08:00
Omar Sandoval
d1745755f1 Fix some include-what-you-use warnings
Also:

* Rename struct string to struct nstring and move it to its own header.
* Fix scripts/iwyu.py, which was broken by commit 5541fad063 ("Fix
  some flake8 errors").
* Add workarounds for a few outstanding include-what-you-use issues.

There is still a false positive for
include-what-you-use/include-what-you-use#970, but hopefully that is
fixed soon.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-10 15:09:29 -08:00
Omar Sandoval
1339dc6a2f libdrgn: hash_table: move entry_to_key to DEFINE_HASH_TABLE_FUNCTIONS()
DEFINE_HASH_TABLE_TYPE() doesn't actually need to know the key type.
Move that argument (and some of the derived constants) to
DEFINE_HASH_TABLE_FUNCTIONS(). This will allow recursive hash table
types. As a nice side effect, it also reduces the size of common header
files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-10-23 00:52:23 -07:00
Omar Sandoval
802d6cc9ff libdrgn: rename drgn_program::_dbinfo to dbinfo
The underscore was meant to discourage direct access in favor of using
drgn_program_get_dbinfo(), but it turns out that it's more normal to
access it directly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-10-23 00:52:23 -07:00
Omar Sandoval
c1e16ae3ec libdrgn: fold drgn_program_get_dbinfo() into only caller
The only time that we want to create the drgn_debug_info is when we're
loading debugging information. Everywhere else, we fail fast if there is
no debugging information.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-10-23 00:40:57 -07:00
Omar Sandoval
fba5947fec libdrgn: add array_for_each()
And use it in a few appropriate places. This should hopefully make it
harder to make iteration mistakes like the one fixed by commit
4755cfac7c ("libdrgn: dwarf_index: increment correct variable when
rolling back"). While we're doing this, move ARRAY_SIZE() into a new
header file with array_for_each() and make it lowercase.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-08-23 17:32:00 -07:00
Stephen Brennan
3d8db22c47 libdrgn: Add kind and binding fields to drgn_symbol
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2021-08-20 18:16:57 -07:00
Omar Sandoval
0e3054a0ba libdrgn: make addresses wrap around when reading memory
Define that addresses for memory reads wrap around after the maximum
address rather than the current unpredictable behavior. This is done by:

1. Reworking drgn_memory_reader to work with an inclusive address range
   so that a segment can contain UINT64_MAX. drgn_memory_reader remains
   agnostic to the maximum address and requires that address ranges do
   not overflow a uint64_t.
2. Adding the overflow/wrap-around logic to
   drgn_program_add_memory_segment() and drgn_program_read_memory().
3. Changing direct uses of drgn_memory_reader_reader() to
   drgn_program_read_memory() now that they are no longer equivalent.

(For some platforms, a fault might be more appropriate than wrapping
around, but this is a step in the right direction.)

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-06-03 17:49:29 -07:00
Omar Sandoval
e5ff1ea7ac libdrgn: program: use preset platform in drgn_program_set_core_dump()
If the program already had a platform set, we should its callbacks
instead of the ones from the ELF file's platform.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-06-03 17:04:28 -07:00
Omar Sandoval
a4b9d68a8c Use GPL-3.0-or-later license identifier instead of GPL-3.0+
Apparently the latter is deprecated and the former is preferred.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:10:35 -07:00
Omar Sandoval
630d39e345 libdrgn: add ORC unwinder
The Linux kernel has its own stack unwinding format for x86-64 called
ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is
essentially a simplified, less complete version of DWARF CFI. ORC is
generated by analyzing machine code, so it is present for all but a few
ignored functions. In contrast, DWARF CFI is generated by the compiler
and is therefore missing for functions written in assembly and inline
assembly (which is widespread in the kernel).

This implements an ORC stack unwinder: it applies ELF relocations to the
ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule
kind, parses and efficiently stores ORC data, and translates ORC to drgn
CFI rules. This will allow us to stack trace through assembly code,
interrupts, and system calls.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-29 10:01:52 -07:00
Serapheim Dimitropoulos
a68abd5de4 libdrgn: stretch minimum supported version of libelf to 0.170
Currently libdrgn requires libelf to be of version 0.175 or
later. This patch allows the library to be compiled with libelf
0.170 (the newest version supported by Ubuntu 18.04 LTS).

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
2021-03-21 14:28:29 -07:00
Omar Sandoval
a7962e9477 libdrgn: debug_info: pass around Dwfl_Module instead of bias
We're going to need the module start and end in
drgn_object_from_dwarf_variable(), so pass the Dwfl_Module around and
get the bias when we need it. This means we don't need the bias from
drgn_dwarf_index_get_die(), so get rid of that, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:12:29 -08:00
Omar Sandoval
798f0887a5 libdrgn: simplify language fall back handling
If the language for a DWARF type is not found or unrecognized, we should
fall back to the global default, not the program default (the program
default language is for language-specific operations on the program, so
DWARF parsing shouldn't depend on it). Add a fall_back parameter to
drgn_language_from_die() and use it in DWARF parsing, and replace
drgn_language_or_default() with a drgn_default_language variable.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 10:46:35 -08:00
Omar Sandoval
e72ecd0e2c libdrgn: replace drgn_program_member_info() with drgn_type_find_member()
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:40:54 -08:00
Omar Sandoval
22c1d87aec libdrgn: cache page_offset and vmemmap as objects instead of uint64_t
This is a little cleaner and saves on conversions back and forth between
C values and objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:40:07 -08:00
Omar Sandoval
3360170336 libdrgn: only install page table memory reader when supported
If virtual address translation isn't implemented for the target
architecture, then we shouldn't add the page table memory reader. If we
do, we get a DRGN_ERROR_INVALID_ARGUMENT error from
linux_helper_read_vm() instead of a DRGN_ERROR_FAULT error as expected.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-27 01:27:30 -08:00
Omar Sandoval
3c5d22637e libdrgn: clean up hash function APIs and improve documentation
Use *_hash_pair() for hash functions that do the full double hashing and
return a struct hash_pair and hash_*() for other hashing utility
functions. Also change some of the equality function names to be more
symmetric and improve the documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 16:20:08 -07:00
Omar Sandoval
286c09844e Clean up #includes with include-what-you-use
I recently hit a couple of CI failures caused by relying on transitive
includes that weren't always present. include-what-you-use is a
Clang-based tool that helps with this. It's a bit finicky and noisy, so
this adds scripts/iwyu.py to make running it more convenient (but not
reliable enough to automate it in Travis).

This cleans up all reasonable include-what-you-use warnings and
reorganizes a few header files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-23 16:29:42 -07:00
Omar Sandoval
f83bb7c71b libdrgn: move debugging information tracking into drgn_debug_info
Debugging information tracking is currently in two places: drgn_program
finds debugging information, and drgn_dwarf_index stores it. Both of
these responsibilities make more sense as part of drgn_debug_info, so
let's move them there. This prepares us to track extra debugging
information that isn't pertinent to indexing.

This also reworks a couple of details of loading debugging information:

- drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into
  a single structure, drgn_debug_info_module.
- The first pass of DWARF indexing now happens in parallel with reading
  compilation units (by using OpenMP tasks).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-22 10:58:24 -07:00
Omar Sandoval
3ac9ae357b libdrgn: rename drgn_dwarf_info_cache to drgn_debug_info
The current name is too verbose. Let's go with a shorter, more generic
name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-11 17:41:23 -07:00
Jay Kamat
d1beb0184a libdrgn: add support for objects in C++ namespaces
DWARF represents namespaces with DW_TAG_namespace DIEs. Add these to the
DWARF index, with each namespace being its own sub-index. We only index
the namespace itself when it is first accessed, which should help with
startup time and simplifies tracking.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2020-09-02 17:13:16 -07:00
Omar Sandoval
66ad5077c9 libdrgn: dwarf_index: return indexed DIE entry from drgn_dwarf_index_iterator_next()
For namespace support, we will want to access the struct
drgn_dwarf_index_die for namespaces instead of the Dwarf_Die. Split
drgn_dwarf_index_get_die() out of drgn_dwarf_index_iterator_next().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-02 17:13:16 -07:00
Omar Sandoval
7a85b4188e libdrgn: clean up read.h helpers and avoid undefined pointer behavior
There are a couple of related ways that we can cause undefined behavior
when parsing a malformed DWARF or depmod index file:

1. There are several places where we increment the cursor to skip past
   some data. It is undefined behavior if the result points out of
   bounds of the data, even if we don't attempt to dereference it.
2. read_in_bounds() checks that ptr <= end. This pointer comparison is
   only defined if ptr and end both point to elements of the same array
   object or one past the last element. If ptr has gone past end, then
   this comparison is likely undefined anyways.

Fix it by adding a helper to skip past data with bounds checking. Then,
all of the helpers can assume that ptr <= end and maintain that
invariant. while we're here and auditing all of the call sites, let's
clean up the API and rename it from read_foo() to the less generic
mread_foo().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-02 17:13:16 -07:00
Omar Sandoval
ff96c75da0 helpers: translate task_state_to_char() to Python
Commit 326107f054 ("libdrgn: add task_state_to_char() helper")
implemented task_state_to_char() in libdrgn so that it could be used in
commit 4780c7a266 ("libdrgn: stack_trace: prohibit unwinding stack of
running tasks"). As of commit eea5422546 ("libdrgn: make Linux kernel
stack unwinding more robust"), it is no longer used in libdrgn, so we
can translate it to Python. This removes a bunch of code and is more
useful as an example.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-27 13:54:39 -07:00
Omar Sandoval
e49a87a3d7 libdrgn: remove struct drgn_object::prog
We can get it via the type now.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-27 11:31:21 -07:00
Omar Sandoval
c31208f69c libdrgn: fold drgn_type_index into drgn_program
This is preparation for associating types with a program.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 17:36:35 -07:00
Omar Sandoval
d4e0771f87 libdrgn: return error from drgn_program_{is_little_endian,bswap,is_64_bit}()
Most places that call these check has_platform and return an error, and
those that don't can live with the extra check.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 16:56:28 -07:00