JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-22 17:23:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	2d8aeacb30	Allow naming and configuring order of object finders This one doesn't need any changes to the callback signature, just the new interface. We also keep add_object_finder() for compatibility. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2024-06-05 13:40:26 -07:00
Stephen Brennan	ff322c7070	libdrgn: introduce Symbol Finder API Symbol lookup is not yet modular, like type or object lookup. However, making it modular would enable easier development and prototyping of alternative Symbol providers, such as Linux kernel module symbol tables, vmlinux kallsyms tables, and BPF function symbols. To begin with, create a modular Symbol API within libdrgn, and refactor the ELF symbol search to use it. For now, we leave drgn_program_find_symbol_by_address_internal() alone. Its conversion will require some surgery, since the new API can return errors, whereas this function cannot. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2024-03-11 16:43:43 -07:00
Omar Sandoval	c85dd74f3e	libdrgn: embed drgn_debug_info in drgn_program This will simplify the implementation of the module API (#332). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-10-02 11:27:36 -07:00
Omar Sandoval	f24805486f	libdrgn: embed type and object finders in drgn_debug_info This is preparation for embedding drgn_debug_info in drgn_program and initializing it earlier. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-09-29 12:00:31 -07:00
Omar Sandoval	30ecdd901e	libdrgn: allow passing multiple type kinds to type finder function For the next change, we want to look up a name which may have one of multiple type kinds. Make drgn_type_kind_fn in libdrgn take a bitmask of kinds instead of a single kind. We could change the Python bindings to take the same bitmask, or a tuple of drgn.TypeKind, but either would be a breaking API change. For now, let's call the type finder function for each kind in the bitmask instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-21 15:18:26 -07:00
Omar Sandoval	c8406e1ea0	libdrgn: require semicolon after DEFINE_{HASH,VECTOR,BINARY_SEARCH_TREE}* The lack of a semicolon after these macros has always confused tooling like cscope. We could add semicolons everywhere now, but let's enforce it for the future, too. Let's add a dummy struct forward declaration at the end of each macro that enforces this requirement and also provides a useful error message. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:54:59 -07:00
Omar Sandoval	55a3ebca6c	libdrgn: dwarf_info: support DWO split DWARF We've addressed all of the smaller differences with GNU Debug Fission and split DWARF 5, so now all that remains is the DWARF index. The general approach is: in drgn_dwarf_index_read_cus(), for each CU, ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file (cached in a hash table in the struct drgn_module), which the DWARF index can work with from there. Additionally, a couple of places (.debug_addr parsing and stack trace local variable lookup) need to be updated to use the correct drgn_elf_file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	fc47ec1b78	libdrgn: add prog pointer to struct drgn_module The next commit needs this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	fc3ea4184a	libdrgn: use new include-what-you-use exported declarations and fix warnings include-what-you-use/include-what-you-use#1164 fixed include-what-you-use/include-what-you-use#971 so that we can export forward declarations instead of hacking around it. I can't reproduce the issue with BINARY_OP_SIGNED_2C anymore either, so we can remove that hack, too. Also fix any other warnings. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-05-24 00:25:25 -07:00
Omar Sandoval	18b12a5c7b	libdrgn: get .eh_frame from the correct file We're currently getting .eh_frame from the debug file. However, since .eh_frame is an SHF_ALLOC section, it is actually in the loaded file, and may not be in the debug file. This causes us to fail to unwind in modules whose debug file was created with objcopy --only-keep-debug (which is typical for Linux distro debug files). Fix it by getting .eh_frame from the loaded file. To make this easier, we split .eh_frame and .debug_frame data into two separate tables. We also don't bother deduplicating them anymore, since GCC and Clang only seem to generate one or the other in practice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	270375f077	libdrgn: debug_info: get "loaded" ELF file For upcoming changes, we will need loaded (SHF_ALLOC) sections for modules. Some separate debug files (e.g., those created with objcopy --only-keep-debug) don't have those sections. Let's get the loaded file from libdwfl with dwfl_module_getelf() and save it in a struct drgn_elf_file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	bcb53d712b	libdrgn: bypass libdwfl with struct drgn_elf_file Now that we track the debug file ourselves, we can avoid calling libdwfl in a bunch of places. By tracking the bias ourselves, we can avoid a bunch more. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	34f122144a	libdrgn: debug_info: wrap ELF file information in new struct drgn_elf_file struct drgn_module contains a bunch of information about the debug info file. Let's pull it out into its own structure, struct drgn_elf_file. This will be reused for the "main"/"loaded" file in an upcoming change. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	b3bab1c5b0	libdrgn: make module vs. program platform difference more clear It's confusing that we have a platform both for the program and for each module. They usually match, but they're not required to. For example, the user can manually add a file with a different platform just to read its debug info. Our rule is that if we're parsing anything from the module, we use the module platform; and otherwise, use the program platform. There are a couple of places where the platforms must match: when using call frame information (CFI) or registers. Let's make all of this more clear in the code (by using the module's platform even when it must match the program's platform) and in comments. No functional change. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 12:53:45 -08:00
Omar Sandoval	87b7292aa5	Relicense drgn from GPLv3+ to LGPLv2.1+ drgn is currently licensed as GPLv3+. Part of the long term vision for drgn is that other projects can use it as a library providing programmatic interfaces for debugger functionality. A more permissive license is better suited to this goal. We decided on LGPLv2.1+ as a good balance between software freedom and permissiveness. All contributors not employed by Meta were contacted via email and consented to the license change. The only exception was the author of commit `c4fbf7e589` ("libdrgn: fix for compilation error"), who did not respond. That commit reverted a single line of code to one originally written by me in commit `640b1c011d` ("libdrgn: embed DWARF index in DWARF info cache"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:05:16 -07:00
Omar Sandoval	70af25849c	libdrgn: rename drgn_debug_info_module to drgn_module Eventually, modules will be exposed as part of the public libdrgn API, so they should have a clean name. Additionally, the module API I'm currently working on will allow modules for which we don't have the debug info file, so "debug info module" would be a misnomer. Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and drgn_orc_module_info to drgn_module_orc_info to fit the new naming scheme better. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 16:52:46 -07:00
Omar Sandoval	929b7de266	libdrgn: handle reading data from SHT_NOBITS sections Peilin Ye reported a couple of related crashes in drgn caused by Linux kernel modules which had been processed with objcopy --only-keep-debug (although he notes that since binutils-gdb commit 8c803a2dd7d3 ("elf_backend_section_flags and _bfd_elf_init_private_section_data") (in binutils v2.35), objcopy --only-keep-debug doesn't seem to work for kernel modules). If given an SHT_NOBITS section, elf_getdata() returns an Elf_Data with d_buf = NULL and d_size set to the size in the section header, which is often non-zero. There are a few places where this can cause us to dereference a NULL pointer: * In relocate_elf_sections() for the relocated section data. * In relocate_elf_sections() for the symbol table section data. * In get_kernel_module_name_from_modinfo(). * In get_kernel_module_name_from_this_module(). Fix it by checking the section type or directly checking Elf_Data::d_buf everywhere that could potentially get an SHT_NOBITS section. This is based on a PR from Peilin Ye. Closes #145. Reported-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-27 12:23:09 -08:00
Omar Sandoval	e6abfeac03	libdrgn: debug_info: report userspace core dump debug info ourselves There are a few reasons for this: 1. dwfl_core_file_report() crashes on elfutils 0.183-0.185. Those versions are still used by several distros. 2. In order to support --main-symbols and --symbols properly, we need to report things ourselves. 3. I'm considering moving away from libdwfl in the long term. We provide an escape hatch for now: setting the environment variable DRGN_USE_LIBDWFL_REPORT=1 opts out of drgn's reporting and uses libdwfl's. Fixes #130. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-08 12:11:10 -08:00
Omar Sandoval	844d82848c	libdrgn: add partial support for .gnu_debugaltlink Issue #130 reported an "unknown attribute form 0x1f20" from drgn. 0x1f20 is DW_FORM_GNU_ref_alt, which is a reference to a DIE in an alternate file. Similarly, DW_FORM_GNU_strp_alt is a string in an alternate file. The alternate file is specified by the .gnu_debugaltlink section. This is generated by dwz, which is used by at least Fedora and Debian. libdwfl already finds the alternate debug info file, so we can save its .debug_info and .debug_str and use those to support DW_FORM_GNU_ref_alt and DW_FORM_GNU_strp_alt in the DWARF index. Imported units are going to be more work to support in the DWARF index, but this at least lets drgn start up. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-07 13:49:09 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	5591d199b1	libdrgn: debug_info: split DWARF support into its own file Continuing the refactoring from the previous commit, move the DWARF code from debug_info.c to its own file, leaving only the generic ELF file management in debug_info.c. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-18 15:08:54 -08:00
Omar Sandoval	c6b2bc4181	libdrgn: debug_info: split ORC support into its own file debug_info.c currently contains code for managing ELF files with debugging information, for parsing DWARF, and for parsing ORC. Let's split it up, starting by moving ORC support to its own file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-18 15:08:04 -08:00
Omar Sandoval	1339dc6a2f	libdrgn: hash_table: move entry_to_key to DEFINE_HASH_TABLE_FUNCTIONS() DEFINE_HASH_TABLE_TYPE() doesn't actually need to know the key type. Move that argument (and some of the derived constants) to DEFINE_HASH_TABLE_FUNCTIONS(). This will allow recursive hash table types. As a nice side effect, it also reduces the size of common header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-10-23 00:52:23 -07:00
Omar Sandoval	26001733f6	libdrgn: debug_info: support DWARF 5 location lists The DWARF 5 format is a little more complicated than DWARF 2-4 but functionally very similar. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-09 01:52:08 -07:00
Omar Sandoval	215f7d79d7	libdrgn: debug_info: implement DW_OP_{addr,const}x These were added in DWARF 5. They need to know the CU that they're being evaluated in, but the parameters for drgn_eval_dwarf_expression() were already getting unwieldy. Wrap the evaluation context in a new struct drgn_dwarf_expression_context, add the additional CU information, and implement the operations. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-09 01:52:08 -07:00
Omar Sandoval	81053a1c57	libdrgn: dwarf_index: support DWARF 5 The main changes are: 1. Skipping the new attribute forms. 2. Handling DW_FORM_strx*, DW_FORM_line_strp, and DW_FORM_implicit_const for the attributes that we care about. 3. Parsing the new unit header format. 4. Parsing the new line number program header format. Note that Clang currently produces an incorrect DWARF 5 line number program header for the Linux kernel (https://reviews.llvm.org/D105662), so some types are not properly deduplicated in that case. Closes #104. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-09 01:51:59 -07:00
Omar Sandoval	bc85767e5f	libdrgn: support looking up parameters and variables in stack traces After all of the preparatory work, the last two missing pieces are a way to find a variable by name in the list of scopes that we saved while unwinding, and a way to find the containing scopes of an inlined function. With that, we can finally look up parameters and variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	0e113ecc8d	libdrgn: debug_info: add drgn_find_die_ancestors() This will be used for finding the ancestors of the abstract instance root corresponding to a concrete inlined instance root for variable lookups in inlined functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	d8d4157346	libdrgn: debug_info: add drgn_debug_info_module_find_dwarf_scopes() This will be used for finding functions, inlined functions, and blocks containing a PC for stack unwinding and variable lookups. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	d5b68455b8	libdrgn: debug_info: save .debug_loc .debug_loc will be used for variable resolution. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Jay Kamat	9dabec1264	libdrgn: add support for parsing type units Adds support for parsing of type units as enabled by -fdebug-types-section. If a module has both a debug info section and type unit section, both are read. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	fdaf7790a9	libdrgn: add DWARF call frame information parsing In preparation for adding our own unwinder, add support for parsing and finding DWARF/EH call frame information. Use a generic representation of call frame information so that we can support other formats like ORC in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	cc1a5606d0	libdrgn: debug_info: save platform per module Stack unwinding depends on some platform-specific information. If for some reason a program has debugging information with different platforms, then we need to make sure that while we're unwinding the stack, we don't end up in a frame with a different platform, because the registers won't make sense. Additionally, we should parse debugging information using the module's platform rather than the program's platform, which may not match. So, cache the platform derived from each module's ELF file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	6065fc87af	libdrgn: debug_info: save .debug_frame, .eh_frame, .text, and .got These sections are needed for stack unwinding. However, .debug_frame and .eh_frame don't need to be read right away, and .text and .got don't need to be read at all, so partition them accordingly. Also, check that the sections are specifically SHT_PROGBITS rather than not SHT_NOBITS. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	7eab40aaeb	libdrgn: rename drgn_error_debug_info() to drgn_error_debug_info_scn() An upcoming change will introduce a similar function for when the section isn't known. Rename the original so that the new one can take its name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-10 02:07:16 -08:00
Omar Sandoval	aaa98ccae3	libdrgn: consistently use __ for __attribute__ names In some places, we add __ preceding and following an attribute name, and in some places, we don't. Let's make it consistent. We might as well opt for the __ to make clashes with macros less likely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 03:16:23 -08:00
Omar Sandoval	5975d19580	libdrgn: report better errors when parsing DWARF/kmod index If the DWARF index encounters any error while parsing, it returns an error saying only "debug information is truncated", which makes it hard to track down parsing errors. The kmod index parser silently swallows errors. For both, replace the mread functions with a higher-level binary_buffer interface that can include more information including the location of the error. For example: /tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-13 17:00:07 -08:00
Omar Sandoval	756e5d27ad	libdrgn: debug_info: put sections in an array (again) Back in commit `9ce9094ee0` ("libdrgn: dwarf_index: don't copy sections into each CU"), I changed the sections to be individual members. The next change will be easier if they're in an array. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 16:22:04 -08:00
Omar Sandoval	de6a4e07ae	libdrgn: fix Doxygen The Doxygen documentation for libdrgn has bit-rotted over time. Bring back the Internal module, clean up a few renamed members and parameters, and fix broken parsing caused by the generic definition macros. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-30 01:32:33 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	f83bb7c71b	libdrgn: move debugging information tracking into drgn_debug_info Debugging information tracking is currently in two places: drgn_program finds debugging information, and drgn_dwarf_index stores it. Both of these responsibilities make more sense as part of drgn_debug_info, so let's move them there. This prepares us to track extra debugging information that isn't pertinent to indexing. This also reworks a couple of details of loading debugging information: - drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into a single structure, drgn_debug_info_module. - The first pass of DWARF indexing now happens in parallel with reading compilation units (by using OpenMP tasks). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	3ac9ae357b	libdrgn: rename drgn_dwarf_info_cache to drgn_debug_info The current name is too verbose. Let's go with a shorter, more generic name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-11 17:41:23 -07:00
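
Commit 5975d19580 in the log above replaces low-level read helpers with a binary_buffer interface whose errors name the file, section, and offset. The following is a minimal Python sketch of that idea only, not libdrgn's C implementation: the class and method names (`BinaryBuffer`, `BinaryBufferError`, `read_u32`, `read_bytes`, `describe_truncation`) are hypothetical, and the message layout is copied from the example in the commit message.

```python
import struct


class BinaryBufferError(Exception):
    """Parse error that carries the file, section, and offset where it occurred."""


class BinaryBuffer:
    """Hypothetical bounds-checked reader over one section's bytes."""

    def __init__(self, data: bytes, filename: str, section: str) -> None:
        self.data = data
        self.filename = filename
        self.section = section
        self.pos = 0

    def _need(self, size: int) -> None:
        # Check the remaining length before every read so the error can
        # report exactly where parsing stopped.
        have = len(self.data) - self.pos
        if have < size:
            raise BinaryBufferError(
                f"{self.filename}: {self.section}+{self.pos:#x}: "
                f"expected at least {size} bytes, have {have}"
            )

    def read_u32(self) -> int:
        self._need(4)
        (value,) = struct.unpack_from("<I", self.data, self.pos)
        self.pos += 4
        return value

    def read_bytes(self, size: int) -> bytes:
        self._need(size)
        chunk = self.data[self.pos : self.pos + size]
        self.pos += size
        return chunk


def describe_truncation() -> str:
    # A 4-byte little-endian length field claiming 56 bytes, followed by
    # only 55 bytes of payload.
    data = struct.pack("<I", 56) + bytes(55)
    buf = BinaryBuffer(data, "/tmp/mybinary", ".debug_info")
    length = buf.read_u32()
    try:
        buf.read_bytes(length)
    except BinaryBufferError as e:
        return str(e)
    return "ok"
```

Calling `describe_truncation()` returns `/tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55`, the same shape as the message quoted in the commit. Keeping the position inside the buffer object is what lets every read report a location without each caller tracking offsets itself.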

45 Commits
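
Commit 30ecdd901e in the log above describes a compatibility approach: the libdrgn type finder takes a bitmask of kinds, while an existing single-kind finder is called once for each kind in the mask. A minimal sketch of that dispatch, assuming a made-up `Kind` enum and made-up helper names (`kinds_mask`, `find_by_kinds`, `legacy_finder`); none of these are drgn's real API, and stopping at the first match is an assumption of this sketch:

```python
from enum import IntEnum
from typing import Callable, List, Optional


class Kind(IntEnum):
    # Hypothetical kind numbers; bit (1 << kind) stands for each kind in a mask.
    STRUCT = 0
    UNION = 1
    ENUM = 2
    TYPEDEF = 3


def kinds_mask(*kinds: Kind) -> int:
    """Combine several kinds into one bitmask."""
    mask = 0
    for kind in kinds:
        mask |= 1 << kind
    return mask


def find_by_kinds(
    finder: Callable[[Kind, str], Optional[str]], mask: int, name: str
) -> Optional[str]:
    """Call a single-kind finder once per kind set in mask; the first hit wins."""
    for kind in Kind:
        if mask & (1 << kind):
            result = finder(kind, name)
            if result is not None:
                return result
    return None


calls: List[Kind] = []


def legacy_finder(kind: Kind, name: str) -> Optional[str]:
    # Stand-in for an older callback that only understands one kind per call.
    calls.append(kind)
    table = {(Kind.UNION, "foo"): "union foo"}
    return table.get((kind, name))


found = find_by_kinds(
    legacy_finder, kinds_mask(Kind.STRUCT, Kind.UNION, Kind.ENUM), "foo"
)
```

Here `found` is `"union foo"` and `calls` records `STRUCT` then `UNION`: the wrapper absorbs the bitmask so the single-kind callback signature does not have to change, which is the trade-off the commit message gives for avoiding a breaking API change.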