There are a couple of related ways that we can cause undefined behavior
when parsing a malformed DWARF or depmod index file:
1. There are several places where we increment the cursor to skip past
some data. It is undefined behavior if the result points out of
bounds of the data, even if we don't attempt to dereference it.
2. read_in_bounds() checks that ptr <= end. This pointer comparison is
only defined if ptr and end both point to elements of the same array
object or one past the last element. If ptr has gone past end, then
this comparison is likely undefined anyways.
Fix it by adding a helper to skip past data with bounds checking. Then,
all of the helpers can assume that ptr <= end and maintain that
invariant. while we're here and auditing all of the call sites, let's
clean up the API and rename it from read_foo() to the less generic
mread_foo().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Commit eea5422546 ("libdrgn: make Linux kernel stack unwinding more
robust") overlooked that if the task is running in userspace, the stack
pointer in PRSTATUS obviously won't match the kernel stack pointer.
Let's bite the bullet and use the PID. If the race shows up in practice,
we can try to come up with another workaround.
drgn has a couple of issues unwinding stack traces for kernel core
dumps:
1. It can't unwind the stack for the idle task (PID 0), which commonly
appears in core dumps.
2. It uses the PID in PRSTATUS, which is racy and can't actually be
trusted.
The solution for both of these is to look up the PRSTATUS note by CPU
instead of PID.
For the live kernel, drgn refuses to unwind the stack of tasks in the
"R" state. However, the "R" state is running *or runnable*, so in the
latter case, we can still unwind the stack. The solution for this is to
look at on_cpu for the task instead of the state.
We currently unwind from pt_regs and NT_PRSTATUS using an array of
register definitions. It's more flexible and more efficient to do this
with an architecture-specific callback. For x86-64, this change also
makes us depend on the binary layout rather than member names of struct
pt_regs, but that shouldn't matter unless people are defining their own,
weird struct pt_regs.
drgn was originally my side project, but for awhile now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
For functions that call a noreturn function, the compiler may omit code
after the call instruction. This means that the return address may not
lie in the caller's symbol. dwfl_frame_pc() returns whether a frame is
an "activation", i.e., its program counter is guaranteed to lie within
the caller. This is only the case for the initial frame, frames
interrupted by a signal, and the signal trampoline frame. For everything
else, we need to decrement the program counter before doing any lookups.
DRGN_UNREACHABLE() currently expands to abort(), but assert() provides
more information. If NDEBUG is defined, we can use
__builtin_unreachable() instead.
DRGN_UNREACHABLE() isn't drgn-specific, so this renames it to
UNREACHABLE(). It's also not really related to errors, so this moves it
to internal.h.
I've found that I do this manually a lot (e.g., when digging through a
task's stack). Add shortcuts for reading unsigned integers and a note
for how to manually read other formats.
I thought I'd be able to avoid adding a separate API for register values
and reuse dwfl_frame_eval_expr(), but this doesn't work if the frame is
missing debug information but has known register values (e.g., if the
program crashed with an invalid instruction pointer).
Instead of having two internal variants (drgn_find_symbol_internal() and
drgn_program_find_symbol_in_module()), combine them into the former and
add a separate drgn_error_symbol_not_found() for translating the static
error to the user-facing one. This makes things more flexible for the
next change.
In preparation for making drgn_pretty_print_object() more flexible
(i.e., not always "pretty"), rename it to drgn_format_object(). For
consistency, let's rename drgn_pretty_print_type_name(),
drgn_pretty_print_type(), and drgn_pretty_print_stack_trace(), too.
We currently don't check that the task we're unwinding is actually
blocked, which means that linux_kernel_set_initial_registers_x86_64()
will get garbage from the stack and we'll return a nonsense stack trace.
Let's avoid this by checking that the task isn't running if we didn't
find a NT_PRSTATUS note.
When debugging the Linux kernel, it's inconvenient to have to get the
task_struct of a thread in order to get its stack trace. This adds
support for looking it up solely by PID. In that case, we do the
find_task() inside of libdrgn. This also gives us stack trace support
for userspace core dumps almost for free since we already added support
for NT_PRSTATUS.
vmcores include a NT_PRSTATUS note for each CPU containing the PID of
the task running on that CPU at the time of the crash and its registers.
We can use that to unwind the stack of the crashed tasks.
Currently, the only information available from a stack frame is the
program counter. Eventually, we'd like to add support for getting
arguments and local variables, but that will require more work. In the
mean time, we can at least get the values of other registers. A
determined user can read the assembly for the code they're debugging and
derive the values of variables from the registers.
For now, we only support stack traces for the Linux kernel (at least
v4.9) on x86-64, and we only support getting the program counter and
corresponding function symbol from each stack frame.