JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-26 18:45:37 +00:00

Author	SHA1	Message	Date
Omar Sandoval	5fc879ef3e	libdrgn: debug_info: limit number of DWARF expression operations executed A malformed DWARF expression can easily get us into an infinite loop. Avoid this by capping the number of operations that we'll execute. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	43b90ffb1b	libdrgn: debug_info: add missing stack size check for DW_OP_deref Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-23 13:14:28 -07:00
Omar Sandoval	92fd967a3a	libdrgn: print uint8_t as hex with PRIx8 format, not x In practice, they're probably always the same, but PRIx8 is more correct. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-07 15:34:30 -07:00
Omar Sandoval	e0921c5bdb	libdrgn: don't use OpenMP tasking libomp (at least in LLVM 9 and 10) seems to have buggy OpenMP tasking support. See commit `1cc3868955` ("CI: temporarily disable Clang") for one example. OpenMP tasks aren't buying us much; they simplify DWARF index updates in some places but complicate it in others. Let's ditch tasks and go back to building an array of CUs to index similar to what we did before commit `f83bb7c71b` ("libdrgn: move debugging information tracking into drgn_debug_info"). There is no significant performance difference. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-06 16:56:02 -07:00
Jay Kamat	6be21f674a	libdrgn: follow DW_AT_signature when parsing DWARF types When using type units, skeleton declarations are made instead of concrete ones. However, these declarations have signature tags attached that point to the type unit with the definition, so we can simply follow the signature to get the concrete type. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Jay Kamat	9dabec1264	libdrgn: add support for parsing type units Adds support for parsing of type units as enabled by -fdebug-types-section. If a module has both a debug info section and type unit section, both are read. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	33300d426e	libdrgn: debug_info: don't overwrite Dwarf_Die passed to drgn_type_from_dwarf_internal() If the DIE passed to drgn_type_from_dwarf_internal() is a declaration, then we overwrite it with dwarf_offdie(). As far as I can tell, this doesn't break anything at the moment, but it's sketchy to overwrite an input parameter and may cause issues in the future. Use a temporary DIE on the stack in this case instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	b772432a86	libdrgn: cfi: don't rely on member containing a flexible array Clang enables -Wgnu-variable-sized-type-not-at-end by default, which warns for DRGN_CFI_ROW(): arch_x86_64.c:735:27: warning: field 'row' with variable sized type 'struct drgn_cfi_row' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end] .default_dwarf_cfi_row = DRGN_CFI_ROW( DRGN_CFI_ROW() is gnarly anyways, so instead of having it expand to a pointer expression relying on this GCC extension, make it expand to an initializer. Then, we can initialize default_dwarf_cfi_row as a separate variable rather than directly in the initializer for struct drgn_architecture_info. This still relies on a GCC extension for static initialization of flexible array members, but apparently Clang is okay with that one by default (-Wgnu-flexible-array-initializer must be enabled explictly or by -Wgnu or -Wpedantic). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 16:19:21 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	090064f20d	libdrgn: x86-64: support R_X86_64_PC32 relocation type This is used for .orc_unwind_ip for kernel modules. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:36 -07:00
Omar Sandoval	e0aaaf203d	libdrgn: generalize applying ELF relocations To support unwinding with ORC, we need to apply relocations to .orc_unwind_ip, which libdwfl doesn't do. That means that we always need to apply relocations on x86-64, not just as a fast path when the file's byte order matches the host's. So, generalize handling of 64- vs 32-bit and little- vs big-endian relocations, and move the handling of relocation types to an arch-specific callback. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:35 -07:00
Omar Sandoval	da180b7274	libdrgn: handle errors from elf_strptr() For some reason, we consistently ignore errors from elf_strptr(), but we shouldn't. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 14:28:16 -07:00
Omar Sandoval	e5bc41f16c	libdrgn: add latest elf.h and dwarf.h to support elfutils 0.165 The oldest LTS version of Ubuntu, 16.04, has elfutils 0.165. This version is missing some ELF and DWARF definitions used by drgn. Add copies of elf.h from glibc 2.33 and dwarf.h and elfutils/known-dwarf.h from elfutils 0.183 to get the latest definitions and drop the minimum required version of elfutils further to 0.165. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-21 23:18:39 -07:00
Serapheim Dimitropoulos	a68abd5de4	libdrgn: stretch minimum supported version of libelf to 0.170 Currently libdrgn requires libelf to be of version 0.175 or later. This patch allows the library to be compiled with libelf 0.170 (the newest version supported by Ubuntu 18.04 LTS). Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>	2021-03-21 14:28:29 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	35a1af7ad6	libdrgn: add DWARF expression evaluation For DW_CFA_def_cfa_expression, DW_CFA_expression, and DW_CFA_val_expression, we need to be evaluate a DWARF expression. Add an interface for this. It doesn't yet support operations that aren't applicable to CFI or some more exotic operations. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	fdaf7790a9	libdrgn: add DWARF call frame information parsing In preparation for adding our own unwinder, add support for parsing and finding DWARF/EH call frame information. Use a generic representation of call frame information so that we can support other formats like ORC in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	cc1a5606d0	libdrgn: debug_info: save platform per module Stack unwinding depends on some platform-specific information. If for some reason a program has debugging information with different platforms, then we need to make sure that while we're unwinding the stack, we don't end up in a frame with a different platform, because the registers won't make sense. Additionally, we should parse debugging information using the module's platform rather than the program's platform, which may not match. So, cache the platform derived from each module's ELF file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	6065fc87af	libdrgn: debug_info: save .debug_frame, .eh_frame, .text, and .got These sections are needed for stack unwinding. However, .debug_frame and .eh_frame don't need to be read right away, and .text and .got don't need to be read at all, so partition them accordingly. Also, check that the sections are specifically SHT_PROGBITS rather than not SHT_NOBITS. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	7eab40aaeb	libdrgn: rename drgn_error_debug_info() to drgn_error_debug_info_scn() An upcoming change will introduce a similar function for when the section isn't known. Rename the original so that the new one can take its name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-10 02:07:16 -08:00
Jay Kamat	4552d78f4a	libdrgn: debug_info: try to find DIE specification when parsing type Currently, we look up incomplete types by name, which can fail if the name is ambiguous or the type is unnamed. Try finding the complete type via the DW_AT_specification map in the DWARF index first. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-03-08 15:24:24 -08:00
Omar Sandoval	a24c0f5b33	libdrgn: clean up usage of drgn_stop Use drgn_not_found where it's more appropriate, and check explicitly against drgn_stop instead of err->code == DRGN_ERROR_STOP. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-05 12:46:06 -08:00
Omar Sandoval	25eb2abb1a	libdrgn: add drgn_platform getters Add low-level getters equivalent to the drgn_program platform-related helpers and use them in places where we have checked or can assume that the platform is known. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	e04eda9880	libdrgn: define HOST_LITTLE_ENDIAN As a minor cleanup, instead of writing __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ everywhere, define and use HOST_LITTLE_ENDIAN. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Jay Kamat	c22e501295	libdrgn: debug_info: fix parsing specifications of declarations drgn_compound_type_from_dwarf() and drgn_enum_type_from_dwarf() check the DW_AT_declaration flag to decide whether the type is a declaration of an incomplete type or a definition of a complete type. However, they check DW_AT_declaration with dwarf_attr_integrate(), which follows the DW_AT_specification reference if it is present. The DIE referenced by DW_AT_specification typically is a declaration, so this erroneously identifies definitions as declarations. Additionally, if drgn_debug_info_find_complete() finds the same definition, we can end up recursing until we hit the DWARF parsing depth limit. Fix it by not using dwarf_attr_integrate() for DW_AT_declaration. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-02-25 10:46:34 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to use to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	72b4aa9669	libdrgn: clean up object initialization Rename struct drgn_object_type to struct drgn_operand_type, add a new struct drgn_object_type which contains all of the type-related fields from struct drgn_object, and use it to implement drgn_object_type() and drgn_object_type_operand(), which are replacements for drgn_object_set_common() and drgn_object_type_encoding_and_size(). This cleans up a lot of the boilerplate around initializing objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 17:43:14 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Omar Sandoval	71c6ac6927	libdrgn: use drgn_debug_info_module instead of Dwfl_Module in more places It's easier to go from drgn_debug_info_module to Dwfl_Module than the other direction, and I'd rather use the "higher-level" drgn_debug_info_module wherever possible. So, store drgn_debug_info_module in the DWARF index (which also saves a dereference while building the index), and pass around drgn_debug_info_module when parsing types/objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-27 11:17:41 -08:00
Omar Sandoval	bbefc573d8	libdrgn: debug_info: make sure DW_TAG_template_value_parameter has value Otherwise, an invalid DW_TAG_template_value_parameter can be confused for a type parameter. Fixes: `352c31e1ac` ("Add support for C++ template parameters") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 12:07:46 -08:00
Omar Sandoval	2c612ea97f	libdrgn: fix address of global per-CPU variables with KASLR The address of a per-CPU variable is really an offset into the per-CPU area, but we're applying the load bias (i.e., KASLR offset) to it as if it were an address, resulting in an invalid pointer when it's eventually passed to per_cpu_ptr(). Fix this by applying the bias only if it the address is in the module's address range. This heuristic avoids any Linux kernel-specific logic; hopefully it doesn't have any undesired side effects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 10:14:50 -08:00
Omar Sandoval	a7962e9477	libdrgn: debug_info: pass around Dwfl_Module instead of bias We're going to need the module start and end in drgn_object_from_dwarf_variable(), so pass the Dwfl_Module around and get the bias when we need it. This means we don't need the bias from drgn_dwarf_index_get_die(), so get rid of that, too. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 10:12:29 -08:00
Omar Sandoval	352c31e1ac	Add support for C++ template parameters Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_TAG_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	b6958f920c	libdrgn: debug_info: move object parsing code in debug_info.c In preparation for calling the object parsing code from the type parsing code, move it up in the file (and update the coding style in drgn_object_from_dwarf_enumerator() while we're at it). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	be1bb279aa	libdrgn: debug_info: pass DIE bias when parsing types This will be needed for types containing reference objects. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	934dd36302	libdrgn: remove unused name parameter from drgn_object_from_dwarf_{subprogram,variable}() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 12:16:29 -08:00
Omar Sandoval	a57c26ed32	libdrgn: fix zero-length array GCC < 9.0 workaround for qualified types We're not applying the zero-length array workaround when the array type is qualified. Make sure we pass through can_be_incomplete_array when parsing DW_TAG_{const,restrict,volatile,atomic}_type. Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 11:21:57 -08:00
Omar Sandoval	ca7682650d	libdrgn: rename drgn_type_from_dwarf_child() to drgn_type_from_dwarf_attr() The type comes from the DW_AT_type attribute of the DIE, not a child DIE, so this is a better name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 11:02:27 -08:00
Omar Sandoval	798f0887a5	libdrgn: simplify language fall back handling If the language for a DWARF type is not found or unrecognized, we should fall back to the global default, not the program default (the program default language is for language-specific operations on the program, so DWARF parsing shouldn't depend on it). Add a fall_back parameter to drgn_language_from_die() and use it in DWARF parsing, and replace drgn_language_or_default() with a drgn_default_language variable. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 10:46:35 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	97fbedec1f	libdrgn: return unavailable objects for DWARF objects without value or address Now that we have the concept of unavailable objects, use it for DWARF where appropriate. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 14:15:09 -08:00
Omar Sandoval	edb1fe7f2f	libdrgn: rename drgn_object_kind to drgn_object_encoding I'd like to use the name drgn_object_kind to distinguish between values and references. "Encoding" is more accurate than "kind", anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:26 -08:00
Omar Sandoval	5975d19580	libdrgn: report better errors when parsing DWARF/kmod index If the DWARF index encounters any error while parsing, it returns an error saying only "debug information is truncated", which makes it hard to track down parsing errors. The kmod index parser silently swallows errors. For both, replace the mread functions with a higher-level binary_buffer interface that can include more information including the location of the error. For example: /tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-13 17:00:07 -08:00
Omar Sandoval	756e5d27ad	libdrgn: debug_info: put sections in an array (again) Back in commit `9ce9094ee0` ("libdrgn: dwarf_index: don't copy sections into each CU"), I changed the sections to be individual members. The next change will be easier if they're in an array. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 16:22:04 -08:00
Omar Sandoval	f0a3629c26	libdrgn: debug_info: add dwarf_tag_str() and use it for error messages There are several places where we manually pass around the string name of a tag so it can be used for error messages. Do it programatically instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 16:22:04 -08:00
Omar Sandoval	3c5d22637e	libdrgn: clean up hash function APIs and improve documentation Use _hash_pair() for hash functions that do the full double hashing and return a struct hash_pair and hash_() for other hashing utility functions. Also change some of the equality function names to be more symmetric and improve the documentation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-12 16:20:08 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	f83bb7c71b	libdrgn: move debugging information tracking into drgn_debug_info Debugging information tracking is currently in two places: drgn_program finds debugging information, and drgn_dwarf_index stores it. Both of these responsibilities make more sense as part of drgn_debug_info, so let's move them there. This prepares us to track extra debugging information that isn't pertinent to indexing. This also reworks a couple of details of loading debugging information: - drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into a single structure, drgn_debug_info_module. - The first pass of DWARF indexing now happens in parallel with reading compilation units (by using OpenMP tasks). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	3ac9ae357b	libdrgn: rename drgn_dwarf_info_cache to drgn_debug_info The current name is too verbose. Let's go with a shorter, more generic name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-11 17:41:23 -07:00

1 2 3

102 Commits