JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-23 09:43:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	b28bd9f0a3	libdrgn: linux_kernel: get vmemmap generically AArch64 has changed the location of vmemmap multiple times, and not all of these can be easily distinguished. Rather than resorting to kernel version checks, this replaces the vmemmap architecture callback with a generic approach that gets the vmemmap address directly from the mem_section table. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-07-14 12:05:11 -07:00
Omar Sandoval	a213573b23	libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic On x86-64, the difference between virtual addresses in the direct map and the corresponding physical addresses is called PAGE_OFFSET, so we exposed that via an architecture callback and the Linux kernel object finder. However, this doesn't translate to other architectures. Namely, on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both PAGE_OFFSET and PHYS_OFFSET have varied over time and between configurations. We can remove the architecture callback and avoid version-specific logic by letting the page table tell us the offset. We just need an address in the direct map, which is easy to find since this includes kmalloc and memblock allocations. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-07-14 12:05:11 -07:00
Omar Sandoval	571949a743	libdrgn: x86_64: don't bother zeroing cached page table on initialization pgtable_iterator_x86_64::table is only used if pgtable_iterator_x86_64::index indicates that it has any cached entries, so there's no point initializing table since we initialize index to indicate that nothing is cached. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-07-14 12:05:11 -07:00
Omar Sandoval	36fecd1ded	libdrgn: refactor page table iterators AArch64 will need different sizes of page table iterators depending on the page size and virtual address size. Rather than the static pgtable_iterator_arch_size, allow architectures to define callbacks for allocating and freeing a page table iterator. Also remove the generic page table iterator wrapper and just pass that information to the iterator function. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-07-14 12:05:11 -07:00
Omar Sandoval	deabe2cb56	libdrgn: register_state: add and use drgn_register_state_get_u64() This factors out some boilerplate for getting registers as a uint64_t. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-06-25 22:39:30 -07:00
Omar Sandoval	0a7849d791	libdrgn: rename drgn_register_state_set_from_integer() -> from_u64() This is for consistency with drgn_register_state_get_u64() that we're about to add. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-06-25 22:39:30 -07:00
Omar Sandoval	49ae42ccfd	libdrgn: x86-64: add a few more register definitions In addition to the general-purpose registers, struct pt_regs also provides the cs and ss segment registers and the rflags register. elf_gregset_t provides the other segment registers as well. We should expose all of those. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-06-25 22:39:30 -07:00
Omar Sandoval	33d14f7703	libdrgn: rework architecture definition files Currently, register definitions are split across two files: arch_foo.defs lists the names of registers, and arch_foo.c defines the layout used to store registers in memory. The main rationale for this was that the layout could be processed entirely by the C preprocessor, but the register names needed an AWK script that we wanted to keep minimal. But since commit `af6f5a887d` ("libdrgn: replace gen_arch.awk with gen_arch_inc_strswitch.py"), arch_foo.defs is processed by a Python script. Let's define both the register names and the register layout in a new file, arch_foo_defs.py, which is processed by gen_arch_inc_strswitch.py. This has a few benefits: * It puts all of the register definitions for an architecture in one place. * It is easier to maintain than preprocessor magic. (It also makes it trivial to support registers that don't exist in DWARF, which would've been harder to do with our preprocessor code.) * It gets rid of our DSL in favor of Python (which also lets us reduce repetition for the ppc64 definitions). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-06-25 22:39:26 -07:00
Omar Sandoval	da16a12fad	libdrgn: x86_64: implement more relocation types Implement R_X86_64_32S and R_X86_64_PC64. I haven't seen these for debug info in the wild, but they're supported by the Linux kernel and they're easy to support. The only other type of relocation currently supported by the kernel is R_X86_64_PLT32, which is trickier. For kernel modules, it's equivalent to R_X86_64_PC32 (see Linux kernel commit b21ebf2fb4cd ("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32")), but that doesn't seem to be true in general. It doesn't seem applicable to debug info sections, so hopefully we don't need to worry about it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-18 17:56:37 -07:00
Omar Sandoval	b16dad8a36	libdrgn: support SHT_REL relocations In preparation for supporting ELF relocations for more architectures, generalize ELF relocations to handle SHT_REL sections/ElfN_Rel. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-18 17:56:37 -07:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	fba5947fec	libdrgn: add array_for_each() And use it in a few appropriate places. This should hopefully make it harder to make iteration mistakes like the one fixed by commit `4755cfac7c` ("libdrgn: dwarf_index: increment correct variable when rolling back"). While we're doing this, move ARRAY_SIZE() into a new header file with array_for_each() and make it lowercase. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-23 17:32:00 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	b772432a86	libdrgn: cfi: don't rely on member containing a flexible array Clang enables -Wgnu-variable-sized-type-not-at-end by default, which warns for DRGN_CFI_ROW(): arch_x86_64.c:735:27: warning: field 'row' with variable sized type 'struct drgn_cfi_row' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end] .default_dwarf_cfi_row = DRGN_CFI_ROW( DRGN_CFI_ROW() is gnarly anyways, so instead of having it expand to a pointer expression relying on this GCC extension, make it expand to an initializer. Then, we can initialize default_dwarf_cfi_row as a separate variable rather than directly in the initializer for struct drgn_architecture_info. This still relies on a GCC extension for static initialization of flexible array members, but apparently Clang is okay with that one by default (-Wgnu-flexible-array-initializer must be enabled explicitly or by -Wgnu or -Wpedantic). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 16:19:21 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	090064f20d	libdrgn: x86-64: support R_X86_64_PC32 relocation type This is used for .orc_unwind_ip for kernel modules. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:36 -07:00
Omar Sandoval	e0aaaf203d	libdrgn: generalize applying ELF relocations To support unwinding with ORC, we need to apply relocations to .orc_unwind_ip, which libdwfl doesn't do. That means that we always need to apply relocations on x86-64, not just as a fast path when the file's byte order matches the host's. So, generalize handling of 64- vs 32-bit and little- vs big-endian relocations, and move the handling of relocation types to an arch-specific callback. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:35 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	fdaf7790a9	libdrgn: add DWARF call frame information parsing In preparation for adding our own unwinder, add support for parsing and finding DWARF/EH call frame information. Use a generic representation of call frame information so that we can support other formats like ORC in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	0a6aaaae5d	libdrgn: define structure for storing processor register values libdwfl stores registers in an array of uint64_t indexed by the DWARF register number. This is suboptimal for a couple of reasons: 1. Although the DWARF specification states that registers should be numbered for "optimal density", in practice this isn't the case. ABIs include unused ranges of numbers and don't order registers based on how likely they are to be known (e.g., caller-saved registers usually aren't recovered while unwinding the stack, but they are often numbered before callee-saved registers). 2. This precludes support for registers larger than 64 bits, like SSE registers. For our own unwinder, we want to store registers in an architecture-specific format to solve both of these problems. So, have each architecture define its layout with registers arranged for space efficiency and convenience when parsing saved registers from core dumps. Instead of generating an arch_foo.c file from arch_foo.c.in, separately define the logical register order in an arch_foo.defs file, and use it to generate an arch_foo.inc file that is included from arch_foo.c. The layout is defined as a macro in arch_foo.c. While we're here, drop some register definitions that aren't useful at the moment. Then, define struct drgn_register_state to efficiently store registers in the defined format. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	d60c6a1d68	libdrgn: add register information to platform In order to retrieve registers from stack traces, we need to know what registers are defined for a platform. This adds a small DSL for defining registers for an architecture. The DSL is parsed by an awk script that generates the necessary tables, lookup functions, and enum definitions.	2019-10-18 14:33:02 -07:00
Omar Sandoval	10142f922f	Add basic stack trace support For now, we only support stack traces for the Linux kernel (at least v4.9) on x86-64, and we only support getting the program counter and corresponding function symbol from each stack frame.	2019-08-02 00:26:28 -07:00
Omar Sandoval	690b5fd650	libdrgn: generalize architecture to platform For stack trace support, we'll need to have some architecture-specific functionality. drgn's current notion of an architecture doesn't actually include the instruction set architecture. This change expands it to a "platform", which includes the ISA as well as the existing flags.	2019-08-02 00:11:56 -07:00

23 Commits
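
Commit fba5947fec adds an array_for_each() helper and a lowercase array_size() to make array iteration harder to get wrong. The log does not show the definitions, so the following is only a minimal sketch of what such macros might look like, not libdrgn's actual code. The macro bodies, `sketch_table`, and `sketch_sum()` are assumptions made up for illustration, and the sketch relies on the GNU `__typeof__` extension available in GCC and Clang.

```c
#include <stddef.h>

/* Number of elements in a true array (does not work on a pointer). */
#define array_size(arr) (sizeof(arr) / sizeof((arr)[0]))

/*
 * Visit every element of an array through a pointer named `var`.
 * Illustrative guess at the shape of such a macro, not libdrgn's
 * definition. Uses the GNU __typeof__ extension.
 */
#define array_for_each(var, arr)				\
	for (__typeof__((arr)[0]) *var = (arr);			\
	     var < (arr) + array_size(arr); var++)

/* Hypothetical data and caller, present only to exercise the macro. */
static const int sketch_table[] = { 1, 2, 3, 4 };

static int sketch_sum(void)
{
	int sum = 0;

	/* The bound comes from the array itself, so it cannot drift. */
	array_for_each(p, sketch_table)
		sum += *p;
	return sum;
}
```

Because the loop variable and the bound are both derived from the array expression, a caller cannot increment or compare the wrong variable, which is the class of mistake the commit message refers to.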