Commit Graph

35 Commits

Author SHA1 Message Date
Stephen Brennan
5b39bfb547 libdrgn: x86_64: avoid recursive address translation for swapper_pg_dir
Most core dumps contain some virtual address mappings: usually at a
minimum, the kernel's direct map is represented in ELF vmcores via a
segment. So normally, drgn can rely on the vmcore to read the virtual
address of swapper_pg_dir. However, some vmcores only contain physical
address information, so when drgn reads memory at swapper_pg_dir, it
needs to first translate that address, thus causing a recursive
translation error like below:

>>> prog["slab_caches"]
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/stepbren/repos/drgn/drgn/cli.py", line 141, in _displayhook
    text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns)
_drgn.FaultError: recursive address translation; page table may be missing from core dump: 0xffffffff9662aff8

Debuggers like crash, as well as libkdumpfile, contain fallback code
which can translate swapper_pg_dir in order to bootstrap this address
translation. In fact, the above error does not occur in drgn when using
libkdumpfile. So, let's add this fallback case to drgn as well. Other
architectures will need to have equivalent support added.

Co-authored-by: Illia Ostapyshyn <ostapyshyn@sra.uni-hannover.de>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2024-05-30 11:56:23 -07:00
Omar Sandoval
c635adb7e6 Revert "libdrgn: add last virtual address hint to page table iterator"
This reverts commit 747e02857d (except for
the test improvements). Peter Collingbourne noticed that the change I
used to test the performance of reading a single PTE at a time [1]
didn't cache higher level entries. Keeping that caching makes the
regression I was worried about negligible. So, there's no reason to add
the extra complexity of the hint.

1: https://github.com/osandov/drgn/pull/312#issuecomment-1754082129

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-11-01 16:31:39 -07:00
Omar Sandoval
747e02857d libdrgn: add last virtual address hint to page table iterator
Peter Collingbourne reported that the over-reading we do in the AArch64
page table iterator uses too much bandwidth for remote targets. His
original proposal in #312 was to change the page table iterator to only
read one entry per level. However, this would regress large reads that
do end up using the additional entries (in particular when the target is
/proc/kcore, which has a high latency per read but also high enough
bandwidth that the over-read is essentially free).

We can get the best of both worlds by informing the page table iterator
how much we expect to need (at the cost of some additional complexity in
this admittedly already pretty complex code). Requiring an accurate end
would limit the flexibility of the page table iterator and be more
error-prone, so let's make it a non-binding hint.

Add the hint and use it in the x86-64 page table iterator to only read
as many entries as necessary. Also extend the test case for large page
table reads to test this better.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-10-11 11:29:35 -07:00
Omar Sandoval
b7af85df52 libdrgn: x86_64: update code style
The arch_x86_64.c is often used as a reference when implementing support
for other architectures, so make sure it uses our latest best practices.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-10-11 10:57:23 -07:00
Omar Sandoval
085b0bc9db libdrgn: add DRGN_OBJECT scope guard
Temporary local drgn_objects are fairly common. They require defining a
variable, initializing it with drgn_object_init, and cleaning it up with
drgn_object_deinit. Add a scope guard, DRGN_OBJECT, to do this all
automatically, and convert existing uses to it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-09-28 22:31:11 -07:00
inwardvessel
412ce956b0 libdrgn: x86_64: unwind call when pc is 0
While using drgn for debugging a linux vmcore, a stack trace was found
to contain a single frame of "0: 0x0". It seemed peculiar since the
corresponding panic dump in kmsg showed a stack trace of multiple frames
with resolved symbols. Digging into the drgn stack unwinding functions,
it was found that there was no way of handling a program counter (RIP on
x86) of zero. The main unwinding function would return an error leading
to an attempt to call a fallback unwind function. Even though the
fallback routine was attempted, it did not help since the fallback tries
to make use of frame pointers. In the case where the kernel is compiled
without frame pointers, the fallback function would also error out due
to an attempt to read from an unmapped register.

A program counter of zero can be assumed to be a call to a null function
pointer. Using this as a basis, when in the absence of frame pointers,
the call to a null address can be unwound in a special way. Registers
can be slightly adjusted so that unwinding is possible. First, the
program counter needs to point somewhere it can be unwound. At the time
of the null call, RSP will contain the return address so if we take that
RSP and assign it to RIP, we effectively put the program counter after
the null call. Next, the top of the stack frame needs to change to the
top of the calling frame. This is done by incrementing RSP. At this
point, unwinding can continue and any frames below the null call will be
included in the trace. This special unwinding operation is done in the
new function "unwind_call()". To solve the problem of being unable to
handle a zero pc when frame pointers are not present, the fallback
unwind function is updated to check for a zero pc and call the new
function if needed.

Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
2023-08-24 16:05:23 -07:00
Omar Sandoval
91ede0c6a4 libdrgn: orc_info: handle ORC changes in Linux 6.3 and 6.4
The ORC format changed twice recently:

- Linux kernel commit ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field
  to ORC metadata") (in v6.3).
- Linux kernel commit fb799447ae29 ("x86,objtool: Split
  UNWIND_HINT_EMPTY in two") (in v6.4).

The former went unnoticed because the change was subtle, and the latter
completely broke x86-64 kernel stack traces.

To handle this, let's "upgrade" the format to the latest version when we
load and sort the ORC information. This is more work upfront but avoids
needing to handle the version differences every time we use ORC to
unwind.

Unfortunately, ORC currently doesn't have any sort of versioning, so we
have to break the rule of not checking kernel versions. However, I have
a kernel patch pending merging that should fix this for the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-22 15:27:39 -07:00
Omar Sandoval
cff9b6185c libdrgn: fix typo in ORC unwinder handling of ORC_REG_SP_INDIRECT
ORC_REG_SP_INDIRECT is supposed to be an indirect access via rsp, but we
have a typo and are using rbp instead. This is a partial fix for #304.

Fixes: 630d39e345 ("libdrgn: add ORC unwinder")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-06-15 13:03:10 -07:00
Omar Sandoval
4731de6acc libdrgn: x86_64: unwind with frame pointer more permissively
get_registers_from_frame_pointer() has a sanity check that the unwound
frame pointer must be greater than the current frame pointer. This is
generally true if the entire program is using frame pointers, but not
necessarily otherwise. In particular, if the program is a Linux kernel
configured with ORC, most of the time, rbp is a general purpose
register; it is only used as a frame pointer in special cases without
unwinder information like BPF programs. Those cases are exactly when we
want the frame pointer unwinder, but depending on what the caller was
using rbp for, the frame pointer unwinder might bail prematurely.

Let's remove the sanity check. In the worst case, this could lead us off
into the weeds chasing pointers, but the iteration limit in
drgn_get_stack_trace() prevents that from being dangerous.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-01-04 16:45:28 -08:00
Omar Sandoval
87b7292aa5 Relicense drgn from GPLv3+ to LGPLv2.1+
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.

All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:05:16 -07:00
Omar Sandoval
f8ba278bc1 libdrgn: fix include-what-you-use warnings
It's been awhile since I've run this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
e9d16732d6 libdrgn: x86_64: fix page table iteration over non-canonical range
We're currently checking whether the iterator has entered the
non-canonical range when fetching the last level of the page table, but
the cutover actually happens while we're in the last level. Fix it by
doing the check unconditionally.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-24 00:03:45 -07:00
Omar Sandoval
b28bd9f0a3 libdrgn: linux_kernel: get vmemmap generically
AArch64 has changed the location of vmemmap multiple times, and not all
of these can be easily distinguished. Rather than restorting to kernel
version checks, this replaces the vmemmap architecture callback with a
generic approach that gets the vmemmap address directly from the
mem_section table.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
a213573b23 libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic
On x86-64, the difference between virtual addresses in the direct map
and the corresponding physical addresses is called PAGE_OFFSET, so we
exposed that via an architecture callback and the Linux kernel object
finder. However, this doesn't translate to other architectures. Namely,
on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both
PAGE_OFFSET and PHYS_OFFSET have varied over time and between
configurations.

We can remove the architecture callback and avoid version-specific logic
by letting the page table tell us the offset. We just need an address in
the direct map, which is easy to find since this includes kmalloc and
memblock allocations.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
571949a743 libdrgn: x86_64: don't bother zeroing cached page table on initialization
pgtable_iterator_x86_64::table is only used if
pgtable_iterator_x86_64::index indicates that it has any cached entries,
so there's no point initializing table since we initialize index to
indicate that nothing is cached.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
36fecd1ded libdrgn: refactor page table iterators
AArch64 will need different sizes of page table iterators depending on
the page size and virtual address size. Rather than the static
pgtable_iterator_arch_size, allow architectures to define callbacks for
allocating and freeing a page table iterator. Also remove the generic
page table iterator wrapper and just pass that information to the
iterator function.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
deabe2cb56 libdrgn: register_state: add and use drgn_register_state_get_u64()
This factors out some boilerplate for getting registers as a uint64_t.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
0a7849d791 libdrgn: rename drgn_register_state_set_from_integer() -> from_u64()
This is for consistency with drgn_register_state_get_u64() that we're
about to add.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
49ae42ccfd libdrgn: x86-64: add a few more register definitions
In additional to the general-purpose registers, struct pt_regs also
provides the cs and ss segment registers and the rflags register.
elf_gregset_t provides the other segment registers as well. We should
expose all of those.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:30 -07:00
Omar Sandoval
33d14f7703 libdrgn: rework architecture definition files
Currently, register definitions are split across two files:
arch_foo.defs lists the names of registers, and arch_foo.c defines the
layout used to store registers in memory. The main rationale for this
was that the layout could be processed entirely by the C preprocessor,
but the register names needed an AWK script that we wanted to keep
minimal. But since commit af6f5a887d ("libdrgn: replace gen_arch.awk
with gen_arch_inc_strswitch.py"), arch_foo.defs is processed by a Python
script.

Let's define both the register names and the register layout in a new
file, arch_foo_defs.py, which is processed by gen_arch_inc_strswitch.py
This has a few benefits:

* It puts all of the register definitions for an architecture in one
  place.
* It is easier to maintain than preprocessor magic. (It also makes it
  trivial to support registers that don't exist in DWARF, which would've
  been harder to do with our preprocessor code.)
* It gets rid of our DSL in favor of Python (which also lets us reduce
  repetition for the ppc64 definitions).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-25 22:39:26 -07:00
Omar Sandoval
da16a12fad libdrgn: x86_64: implement more relocation types
Implement R_X86_64_32S and R_X86_64_PC64. I haven't seen these for debug
info in the wild, but they're supported by the Linux kernel and they're
easy to support. The only other type of relocation currently supported
by the kernel is R_X86_64_PLT32, which is trickier. For kernel modules,
it's equivalent to R_X86_64_PC32 (see Linux kernel commit b21ebf2fb4cd
("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32"), but that doesn't seem to
be true in general. It doesn't seem applicable to debug info sections,
so hopefully we don't need to worry about it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-18 17:56:37 -07:00
Omar Sandoval
b16dad8a36 libdrgn: support SHT_REL relocations
In preparation for supporting ELF relocations for more architectures,
generalize ELF relocations to handle SHT_REL sections/ElfN_Rel.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-18 17:56:37 -07:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
fba5947fec libdrgn: add array_for_each()
And use it in a few appropriate places. This should hopefully make it
harder to make iteration mistakes like the one fixed by commit
4755cfac7c ("libdrgn: dwarf_index: increment correct variable when
rolling back"). While we're doing this, move ARRAY_SIZE() into a new
header file with array_for_each() and make it lowercase.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-08-23 17:32:00 -07:00
Omar Sandoval
a4b9d68a8c Use GPL-3.0-or-later license identifier instead of GPL-3.0+
Apparently the latter is deprecated and the former is preferred.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:10:35 -07:00
Omar Sandoval
b772432a86 libdrgn: cfi: don't rely on member containing a flexible array
Clang enables -Wgnu-variable-sized-type-not-at-end by default, which
warns for DRGN_CFI_ROW():

  arch_x86_64.c:735:27: warning: field 'row' with variable sized type 'struct drgn_cfi_row' not at the end of a struct or class is a GNU extension
        [-Wgnu-variable-sized-type-not-at-end]
          .default_dwarf_cfi_row = DRGN_CFI_ROW(

DRGN_CFI_ROW() is gnarly anyways, so instead of having it expand to a
pointer expression relying on this GCC extension, make it expand to an
initializer. Then, we can initialize default_dwarf_cfi_row as a separate
variable rather than directly in the initializer for struct
drgn_architecture_info.

This still relies on a GCC extension for static initialization of
flexible array members, but apparently Clang is okay with that one by
default (-Wgnu-flexible-array-initializer must be enabled explictly or
by -Wgnu or -Wpedantic).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 16:19:21 -07:00
Omar Sandoval
630d39e345 libdrgn: add ORC unwinder
The Linux kernel has its own stack unwinding format for x86-64 called
ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is
essentially a simplified, less complete version of DWARF CFI. ORC is
generated by analyzing machine code, so it is present for all but a few
ignored functions. In contrast, DWARF CFI is generated by the compiler
and is therefore missing for functions written in assembly and inline
assembly (which is widespread in the kernel).

This implements an ORC stack unwinder: it applies ELF relocations to the
ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule
kind, parses and efficiently stores ORC data, and translates ORC to drgn
CFI rules. This will allow us to stack trace through assembly code,
interrupts, and system calls.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-29 10:01:52 -07:00
Omar Sandoval
090064f20d libdrgn: x86-64: support R_X86_64_PC32 relocation type
This is used for .orc_unwind_ip for kernel modules.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:16:36 -07:00
Omar Sandoval
e0aaaf203d libdrgn: generalize applying ELF relocations
To support unwinding with ORC, we need to apply relocations to
.orc_unwind_ip, which libdwfl doesn't do. That means that we always need
to apply relocations on x86-64, not just as a fast path when the file's
byte order matches the host's. So, generalize handling of 64- vs 32-bit
and little- vs big-endian relocations, and move the handling of
relocation types to an arch-specific callback.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:16:35 -07:00
Omar Sandoval
eec67768aa libdrgn: replace elfutils DWARF unwinder with our own
The elfutils DWARF unwinder has a couple of limitations:

1. libdwfl doesn't have an interface for getting register values, so we
   have to bundle a patched version of elfutils with drgn.
2. Error handling is very awkward: dwfl_getthread_frames() can return an
   error even on success, so we have to squirrel away our own errors in
   the callback.

Furthermore, there are a couple of things that will be easier with our
own unwinder:

1. Integrating unwinding using ORC will be easier when we're handling
   unwinding ourselves.
2. Support for local variables isn't too far away now that we have DWARF
   expression evaluation.

Now that we have the register state, CFI, and DWARF expression pieces in
place, stitch them together with the new unwinder, and tweak the public
API a bit to reflect it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:43:12 -07:00
Omar Sandoval
fdaf7790a9 libdrgn: add DWARF call frame information parsing
In preparation for adding our own unwinder, add support for parsing and
finding DWARF/EH call frame information. Use a generic representation of
call frame information so that we can support other formats like ORC in
the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:36:38 -07:00
Omar Sandoval
0a6aaaae5d libdrgn: define structure for storing processor register values
libdwfl stores registers in an array of uint64_t indexed by the DWARF
register number. This is suboptimal for a couple of reasons:

1. Although the DWARF specification states that registers should be
   numbered for "optimal density", in practice this isn't the case. ABIs
   include unused ranges of numbers and don't order registers based on
   how likely they are to be known (e.g., caller-saved registers usually
   aren't recovered while unwinding the stack, but they are often
   numbered before callee-saved registers).
2. This precludes support for registers larger than 64 bits, like SSE
   registers.

For our own unwinder, we want to store registers in an
architecture-specific format to solve both of these problems.

So, have each architecture define its layout with registers arranged for
space efficiency and convenience when parsing saved registers from core
dumps. Instead of generating an arch_foo.c file from arch_foo.c.in,
separately define the logical register order in an arch_foo.defs file,
and use it to generate an arch_foo.inc file that is included from
arch_foo.c. The layout is defined as a macro in arch_foo.c. While we're
here, drop some register definitions that aren't useful at the moment.

Then, define struct drgn_register_state to efficiently store registers
in the defined format.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:36:38 -07:00
Omar Sandoval
d60c6a1d68 libdrgn: add register information to platform
In order to retrieve registers from stack traces, we need to know what
registers are defined for a platform. This adds a small DSL for defining
registers for an architecture. The DSL is parsed by an awk script that
generates the necessary tables, lookup functions, and enum definitions.
2019-10-18 14:33:02 -07:00
Omar Sandoval
10142f922f Add basic stack trace support
For now, we only support stack traces for the Linux kernel (at least
v4.9) on x86-64, and we only support getting the program counter and
corresponding function symbol from each stack frame.
2019-08-02 00:26:28 -07:00
Omar Sandoval
690b5fd650 libdrgn: generalize architecture to platform
For stack trace support, we'll need to have some architecture-specific
functionality. drgn's current notion of an architecture doesn't actually
include the instruction set architecture. This change expands it to a
"platform", which includes the ISA as well as the existing flags.
2019-08-02 00:11:56 -07:00