Commit Graph

57 Commits

Author SHA1 Message Date
Omar Sandoval
1088ef4a1e libdrgn: platform: replace demangle_return_address() with demangle_cfi_registers()
While documenting struct drgn_architecture_info, I realized that
demangle_return_address() is difficult to explain. It's more
straightforward to define this functionality as demangling any registers
that are mangled when using CFI rather than just the return address
register.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-12-02 13:52:06 -08:00
Omar Sandoval
18b12a5c7b libdrgn: get .eh_frame from the correct file
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).

Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
bcb53d712b libdrgn: bypass libdwfl with struct drgn_elf_file
Now that we track the debug file ourselves, we can avoid calling libdwfl
in a bunch of places. By tracking the bias ourselves, we can avoid a
bunch more.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
34f122144a libdrgn: debug_info: wrap ELF file information in new struct drgn_elf_file
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
b3bab1c5b0 libdrgn: make module vs. program platform difference more clear
It's confusing that we have a platform both for the program and for each
module. They usually match, but they're not required to. For example,
the user can manually add a file with a different platform just to read
its debug info. Our rule is that if we're parsing anything from the
module, we use the module platform; and otherwise, use the program
platform. There are a couple of places where the platforms must match:
when using call frame information (CFI) or registers. Let's make all of
this more clear in the code (by using the module's platform even when it
must match the program's platform) and in comments. No functional
change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 12:53:45 -08:00
Omar Sandoval
222680b47a Add StackFrame.sp
We have some generic helpers that we'd like to add (for example, #210)
that need to know the stack pointer of a frame. These shouldn't need to
hard-code register names for different architectures. Add a generic
shortcut, StackFrame.sp.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-22 18:47:16 -08:00
Stephen Brennan
5f3a91f80d Add StackFrame.locals() method
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-11-02 22:40:33 -07:00
Omar Sandoval
87b7292aa5 Relicense drgn from GPLv3+ to LGPLv2.1+
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.

All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:05:16 -07:00
Omar Sandoval
d465071651 libdrgn: replace copies of elfutils headers with generated files
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 15:41:53 -07:00
Omar Sandoval
70af25849c libdrgn: rename drgn_debug_info_module to drgn_module
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.

Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:52:46 -07:00
Omar Sandoval
03d5c2ebac libdrgn: string_builder: replace string_builder_finalize()
Instead of string_builder_finalize(), which leaves the string_builder in
an undefined state (according to the documentation, at least), define
string_builder_null_terminate(), which documents exactly what it does.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 15:55:04 -07:00
Omar Sandoval
d76a3a338f libdrgn: string_builder: add dedicated initializer
Rather than documenting how to initialize a struct string_builder,
provide an initializer, STRING_BUILDER_INIT.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 15:32:07 -07:00
Omar Sandoval
0b7ac5b046 Fix vmcore stack traces on Linux < 4.9 or >= 5.16 and add drgn.helpers.linux.task_cpu()
task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes
drgn_get_initial_registers() to think that the kernel is !SMP and use
CPU 0 instead, producing incorrect stack traces. This has also always
been wrong for Linux < 4.9 and on architectures that don't enable
CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be
((struct thread_info *)task->stack)->cpu.

Fix it by factoring out a new task_cpu() helper that handles all of the
above cases. Also add a test case for task_cpu() in case this changes
again.

Fixes: eea5422546 ("libdrgn: make Linux kernel stack unwinding more robust")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-03 16:21:12 -07:00
Omar Sandoval
f8ba278bc1 libdrgn: fix include-what-you-use warnings
It's been awhile since I've run this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
faaf01ad1b Add drgn.StackTrace.prog and drgn_stack_trace_program()
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-11 14:45:54 -07:00
Omar Sandoval
63c0684b68 libdrgn: aarch64: mask away pointer authentication code in return addresses
Now that we track RA_SIGN_STATE and get the pointer authentication code
mask, we can remove the pointer authentication code from the return
address while unwinding. Add a new architecture callback,
->demangle_return_address(), for this purpose.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
9c9a2136f1 libdrgn: cfi: add rule to set register to constant
This will be used to implement DW_CFA_AARCH64_negate_ra_state. Also fix
a typographical error in a nearby comment.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
42e37e72c1 libdrgn: stack_trace: fix byte order for drgn_stack_frame_register()
drgn_stack_frame_register() gets the register value with copy_lsbytes()
and then byte swaps it if the program's byte order is different from the
host's. But, copy_lsbytes() already fixes the byte order, so this ends
up with the original (wrong) byte order. We also don't need to zero out
the integer that we copy into since copy_lsbytes() also does that.

Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-24 09:17:56 -07:00
Kevin Svetlitski
301cc767ba Implement a new API for representing threads
Previously, drgn had no way to represent a thread – retrieving a stack
trace (the only extant thread-specific operation) was achieved by
requiring the user to directly provide a tid.

This commit introduces the scaffolding for the design outlined in
issue #92, and implements the corresponding methods for userspace core
dumps, the live Linux kernel, and Linux kernel core dumps. Future work
will build on top of this commit to support live userspace processes.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Omar Sandoval
69c069b09f libdrgn: allow NULL argument to drgn_stack_trace_destroy()
This is one place where I broke the convention that I just documented.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-06 18:23:27 -08:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
5591d199b1 libdrgn: debug_info: split DWARF support into its own file
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:54 -08:00
Omar Sandoval
d1745755f1 Fix some include-what-you-use warnings
Also:

* Rename struct string to struct nstring and move it to its own header.
* Fix scripts/iwyu.py, which was broken by commit 5541fad063 ("Fix
  some flake8 errors").
* Add workarounds for a few outstanding include-what-you-use issues.

There is still a false positive for
include-what-you-use/include-what-you-use#970, but hopefully that is
fixed soon.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-10 15:09:29 -08:00
Omar Sandoval
802d6cc9ff libdrgn: rename drgn_program::_dbinfo to dbinfo
The underscore was meant to discourage direct access in favor of using
drgn_program_get_dbinfo(), but it turns out that it's more normal to
access it directly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-10-23 00:52:23 -07:00
Omar Sandoval
add17a9a36 libdrgn: stack_trace: fix source info without .debug_aranges
dwfl_module_getsrc() relies on .debug_aranges to find the CU containing
the PC. If the module has a missing or incomplete .debug_aranges, it
fails. This lookup is actually redundant since we already found the CU
when we unwound the stack. Use the libdw helpers that take the CU DIE
instead to avoid this. We also need to save the CU for frames where we
found it but couldn't find the subprogram (typically assembly files).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-07-07 13:41:17 -07:00
Omar Sandoval
bc85767e5f libdrgn: support looking up parameters and variables in stack traces
After all of the preparatory work, the last two missing pieces are a way
to find a variable by name in the list of scopes that we saved while
unwinding, and a way to find the containing scopes of an inlined
function. With that, we can finally look up parameters and variables in
stack traces.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-06-05 16:18:51 -07:00
Omar Sandoval
38573cfdde libdrgn: stack_trace: pretty print frames and add frames for inline functions
If we want to access a parameter or local variable in an inlined
function, then we need a stack frame for that function. It's also much
more useful to see inlined functions in the stack trace in general. So,
when we've unwound the registers for a stack frame, walk the debugging
information to find all of the (possibly inlined) functions at the
program counter, and add a drgn stack frame for each of those.

Also add StackFrame.name and StackFrame.is_inline so that we can
distinguish inline frames. Also add StackFrame.source() to get the
filename and line and column numbers. Finally, add the source code
location to pretty-printed stack traces and add pretty-printing for
individual stack frames that includes extra information.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-06-05 16:18:51 -07:00
Omar Sandoval
a4b9d68a8c Use GPL-3.0-or-later license identifier instead of GPL-3.0+
Apparently the latter is deprecated and the former is preferred.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:10:35 -07:00
Omar Sandoval
630d39e345 libdrgn: add ORC unwinder
The Linux kernel has its own stack unwinding format for x86-64 called
ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is
essentially a simplified, less complete version of DWARF CFI. ORC is
generated by analyzing machine code, so it is present for all but a few
ignored functions. In contrast, DWARF CFI is generated by the compiler
and is therefore missing for functions written in assembly and inline
assembly (which is widespread in the kernel).

This implements an ORC stack unwinder: it applies ELF relocations to the
ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule
kind, parses and efficiently stores ORC data, and translates ORC to drgn
CFI rules. This will allow us to stack trace through assembly code,
interrupts, and system calls.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-29 10:01:52 -07:00
Omar Sandoval
eec67768aa libdrgn: replace elfutils DWARF unwinder with our own
The elfutils DWARF unwinder has a couple of limitations:

1. libdwfl doesn't have an interface for getting register values, so we
   have to bundle a patched version of elfutils with drgn.
2. Error handling is very awkward: dwfl_getthread_frames() can return an
   error even on success, so we have to squirrel away our own errors in
   the callback.

Furthermore, there are a couple of things that will be easier with our
own unwinder:

1. Integrating unwinding using ORC will be easier when we're handling
   unwinding ourselves.
2. Support for local variables isn't too far away now that we have DWARF
   expression evaluation.

Now that we have the register state, CFI, and DWARF expression pieces in
place, stitch them together with the new unwinder, and tweak the public
API a bit to reflect it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:43:12 -07:00
Omar Sandoval
b899a10836 Remove register numbers from API and add register aliases
enum drgn_register_number in the public libdrgn API and
drgn.Register.number in the Python bindings are basically exports of
DWARF register numbers. They only exist as a way to identify registers
that's lighter weight than string lookups. libdrgn already has struct
drgn_register, so we can use that to identify registers in the public
API and remove enum drgn_register_number. This has a couple of benefits:
we don't depend on DWARF numbering in our API, and we don't have to
generate drgn.h from the architecture files. The Python bindings can
just use string names for now. If it seems useful, StackFrame.register()
can take a Register in the future, we'll just need to be careful to not
allow Registers from the wrong platform.

While we're changing the API anyways, also change it so that registers
have a list of names instead of one name. This isn't needed for x86-64
at the moment, but will be for architectures that have multiple names
for the same register (like ARM).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-28 17:47:45 -08:00
Omar Sandoval
46343ae08d libdrgn: get rid of struct drgn_stack_frame
In preparation for adding a "real", internal-only struct
drgn_stack_frame, replace the existing struct drgn_stack_frame with
explicit trace/frame arguments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-27 11:22:34 -08:00
Omar Sandoval
5f17281926 libdrgn: make drgn_object::is_reference an enum
To prepare for a new kind of object, replace the is_reference bool with
an enum drgn_object_kind.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 13:37:58 -08:00
Omar Sandoval
edb1fe7f2f libdrgn: rename drgn_object_kind to drgn_object_encoding
I'd like to use the name drgn_object_kind to distinguish between values
and references. "Encoding" is more accurate than "kind", anyways.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 12:02:26 -08:00
Omar Sandoval
286c09844e Clean up #includes with include-what-you-use
I recently hit a couple of CI failures caused by relying on transitive
includes that weren't always present. include-what-you-use is a
Clang-based tool that helps with this. It's a bit finicky and noisy, so
this adds scripts/iwyu.py to make running it more convenient (but not
reliable enough to automate it in Travis).

This cleans up all reasonable include-what-you-use warnings and
reorganizes a few header files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-23 16:29:42 -07:00
Omar Sandoval
f83bb7c71b libdrgn: move debugging information tracking into drgn_debug_info
Debugging information tracking is currently in two places: drgn_program
finds debugging information, and drgn_dwarf_index stores it. Both of
these responsibilities make more sense as part of drgn_debug_info, so
let's move them there. This prepares us to track extra debugging
information that isn't pertinent to indexing.

This also reworks a couple of details of loading debugging information:

- drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into
  a single structure, drgn_debug_info_module.
- The first pass of DWARF indexing now happens in parallel with reading
  compilation units (by using OpenMP tasks).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-22 10:58:24 -07:00
Omar Sandoval
7a85b4188e libdrgn: clean up read.h helpers and avoid undefined pointer behavior
There are a couple of related ways that we can cause undefined behavior
when parsing a malformed DWARF or depmod index file:

1. There are several places where we increment the cursor to skip past
   some data. It is undefined behavior if the result points out of
   bounds of the data, even if we don't attempt to dereference it.
2. read_in_bounds() checks that ptr <= end. This pointer comparison is
   only defined if ptr and end both point to elements of the same array
   object or one past the last element. If ptr has gone past end, then
   this comparison is likely undefined anyways.

Fix it by adding a helper to skip past data with bounds checking. Then,
all of the helpers can assume that ptr <= end and maintain that
invariant. while we're here and auditing all of the call sites, let's
clean up the API and rename it from read_foo() to the less generic
mread_foo().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-02 17:13:16 -07:00
Omar Sandoval
e49a87a3d7 libdrgn: remove struct drgn_object::prog
We can get it via the type now.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-27 11:31:21 -07:00
Omar Sandoval
1b47b866b4 libdrgn: go back to trusting PRSTATUS PID
Commit eea5422546 ("libdrgn: make Linux kernel stack unwinding more
robust") overlooked that if the task is running in userspace, the stack
pointer in PRSTATUS obviously won't match the kernel stack pointer.
Let's bite the bullet and use the PID. If the race shows up in practice,
we can try to come up with another workaround.
2020-07-08 18:34:16 -07:00
Omar Sandoval
eea5422546 libdrgn: make Linux kernel stack unwinding more robust
drgn has a couple of issues unwinding stack traces for kernel core
dumps:

1. It can't unwind the stack for the idle task (PID 0), which commonly
   appears in core dumps.
2. It uses the PID in PRSTATUS, which is racy and can't actually be
   trusted.

The solution for both of these is to look up the PRSTATUS note by CPU
instead of PID.

For the live kernel, drgn refuses to unwind the stack of tasks in the
"R" state. However, the "R" state is running *or runnable*, so in the
latter case, we can still unwind the stack. The solution for this is to
look at on_cpu for the task instead of the state.
2020-05-20 12:03:00 -07:00
Omar Sandoval
146930aff8 libdrgn: replace arch frame_registers with callbacks
We currently unwind from pt_regs and NT_PRSTATUS using an array of
register definitions. It's more flexible and more efficient to do this
with an architecture-specific callback. For x86-64, this change also
makes us depend on the binary layout rather than member names of struct
pt_regs, but that shouldn't matter unless people are defining their own,
weird struct pt_regs.
2020-05-19 17:11:27 -07:00
Omar Sandoval
8b264f8823 Update copyright headers to Facebook and add missing headers
drgn was originally my side project, but for awhile now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
2020-05-15 15:13:02 -07:00
Omar Sandoval
c339113f9c libdrgn: adjust program counter when looking up frame symbol
For functions that call a noreturn function, the compiler may omit code
after the call instruction. This means that the return address may not
lie in the caller's symbol. dwfl_frame_pc() returns whether a frame is
an "activation", i.e., its program counter is guaranteed to lie within
the caller. This is only the case for the initial frame, frames
interrupted by a signal, and the signal trampoline frame. For everything
else, we need to decrement the program counter before doing any lookups.
2020-05-13 17:11:54 -07:00
Omar Sandoval
0a100064c1 libdrgn: improve and rename DRGN_UNREACHABLE()
DRGN_UNREACHABLE() currently expands to abort(), but assert() provides
more information. If NDEBUG is defined, we can use
__builtin_unreachable() instead.

DRGN_UNREACHABLE() isn't drgn-specific, so this renames it to
UNREACHABLE(). It's also not really related to errors, so this moves it
to internal.h.
2020-05-07 15:16:22 -07:00
Omar Sandoval
10e58777c3 Add Program.read_{u8,u16,u32,u64,word}()
I've found that I do this manually a lot (e.g., when digging through a
task's stack). Add shortcuts for reading unsigned integers and a note
for how to manually read other formats.
2020-04-27 17:27:10 -07:00
Serapheim Dimitropoulos
08193a97aa Support stack traces for running threads on kdumps 2020-03-27 16:12:03 -07:00
Omar Sandoval
9246094cdc libdrgn: use dwfl_frame_register() instead of dwfl_frame_eval_expr()
I thought I'd be able to avoid adding a separate API for register values
and reuse dwfl_frame_eval_expr(), but this doesn't work if the frame is
missing debug information but has known register values (e.g., if the
program crashed with an invalid instruction pointer).
2020-02-20 14:13:08 -08:00
Jay Kamat
054cb54a01 libdrgn: Rename find_symbol to find_symbol_by_address 2020-02-12 14:06:49 -08:00
Omar Sandoval
0a707b0c9d libdrgn: rework drgn_find_symbol_internal()
Instead of having two internal variants (drgn_find_symbol_internal() and
drgn_program_find_symbol_in_module()), combine them into the former and
add a separate drgn_error_symbol_not_found() for translating the static
error to the user-facing one. This makes things more flexible for the
next change.
2019-12-19 11:43:54 -08:00
Omar Sandoval
3b22bd3022 libdrgn: rename pretty_print -> format
In preparation for making drgn_pretty_print_object() more flexible
(i.e., not always "pretty"), rename it to drgn_format_object(). For
consistency, let's rename drgn_pretty_print_type_name(),
drgn_pretty_print_type(), and drgn_pretty_print_stack_trace(), too.
2019-12-16 11:21:12 -08:00