JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-22 17:23:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	81053a1c57	libdrgn: dwarf_index: support DWARF 5 The main changes are: 1. Skipping the new attribute forms. 2. Handling DW_FORM_strx*, DW_FORM_line_strp, and DW_FORM_implicit_const for the attributes that we care about. 3. Parsing the new unit header format. 4. Parsing the new line number program header format. Note that Clang currently produces an incorrect DWARF 5 line number program header for the Linux kernel (https://reviews.llvm.org/D105662), so some types are not properly deduplicated in that case. Closes #104. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-09 01:51:59 -07:00
Omar Sandoval	347c578aa0	libdrgn: debug_info: don't open-code drgn_platform_address_size() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-07 15:03:03 -07:00
Omar Sandoval	add17a9a36	libdrgn: stack_trace: fix source info without .debug_aranges dwfl_module_getsrc() relies on .debug_aranges to find the CU containing the PC. If the module has a missing or incomplete .debug_aranges, it fails. This lookup is actually redundant since we already found the CU when we unwound the stack. Use the libdw helpers that take the CU DIE instead to avoid this. We also need to save the CU for frames where we found it but couldn't find the subprogram (typically assembly files). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-07 13:41:17 -07:00
Omar Sandoval	fbe102f37e	libdrgn: debug_info: handle incomplete .debug_aranges Clang does not generate .debug_aranges by default, but the GNU toolchain does. This means that a Linux kernel binary compiled with Clang and GNU binutils will have ranges in .debug_aranges for assembly files and nothing else. This breaks our assumption that a non-empty .debug_aranges has ranges for every compilation unit. Fix it by always falling back to checking every CU if a range was not found in .debug_aranges. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-07 11:25:13 -07:00
Omar Sandoval	2e04e6b73c	libdrgn: binary_buffer: handle non-canonical LEB128 numbers LEB128 allows for redundant zero/sign bits, but we currently always treat extra bytes as overflow. Let's check those bytes correctly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 21:39:31 -07:00
Omar Sandoval	d12d4368b8	libdrgn: support passing debug info files to load_debug_info example program And don't set the target by default; -k must be given explicitly now. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 16:58:47 -07:00
Omar Sandoval	73d5a207c8	libdrgn: dwarf_index: fix skipping DW_FORM_ref_addr in DWARF 2 In DWARF 2, DW_FORM_ref_addr has the size of an address, not a size depending on the format. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 11:53:34 -07:00
Omar Sandoval	86e966fbf8	libdrgn: dwarf_index: handle DW_FORM_block Somehow I missed this form, and I've never seen it used. It's the same as DW_FORM_exprloc for our purposes, so it's an easy fix. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 01:34:52 -07:00
Omar Sandoval	018b00cede	libdrgn: binary_buffer: check bounds with 64-bit size There are a few places in the DWARF indexing code that we skip a 64-bit size. On 32-bit systems, this can wrap if the count is greater than SIZE_MAX. Rather than requiring vigilance against this, change the size to uint64_t. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 01:27:47 -07:00
Omar Sandoval	52df3cb5ff	libdrgn: dwarf_index: properly return error when hashing path fails Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 01:21:47 -07:00
Omar Sandoval	157f8ed7dc	libdrgn: dwarf_index: #define FNV constants Older versions of GCC and Clang don't accept const variables for initializers ("error: initializer element is not constant"), so #define the FNV constants instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-28 11:08:30 -07:00
Omar Sandoval	87809f7692	libdrgn: dwarf_index: improve file path hashing for deduplication We currently don't include the compilation directory when hashing file names for deduplication. This can cause us to incorrectly deduplicate a definition if, for example, two libraries have a definition with the same name in files with the same name. Fix this by hashing the full file path including the compilation directory. This also requires reworking our strategy for path normalization to better handle ".." components, since directories may end up outside of the compilation directory. The new strategy keeps a linked list of hashes (now FNV-1a instead of SipHash) for each parent directory. This is actually more efficient than the previous approach, offsetting the cost of the extra hash computations for the compilation directory. It also correctly handles file names in the line number program header which consist of multiple components. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-25 18:03:01 -07:00
Omar Sandoval	82824c0e5f	libdrgn: path: simplify logic in path_iterator_next() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-25 18:03:01 -07:00
Omar Sandoval	57cc0deb98	libdrgn: replace struct path_iterator_component with struct string The former is the same as the latter with less generic naming. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-25 18:03:01 -07:00
Omar Sandoval	420d2bb1dc	libdrgn: dwarf_index: fix DW_AT_strp bounds check The string must be null terminated, so there must be at least one byte left in .debug_str. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-25 17:50:08 -07:00
Omar Sandoval	c4b174af74	libdrgn: fix kdump format support I missed the drgn_program_set_kdump() code path when making sure that we set the platform before adding memory segments. Fixes: `0e3054a0ba` ("libdrgn: make addresses wrap around when reading memory") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-09 15:30:48 -07:00
Omar Sandoval	6b2dda3f95	libdrgn: bring back dwfl_core_file_report() bug workaround This workaround was originally present in commit `e5874ad18a` ("libdrgn: use libdwfl"). We dropped it in commit `6a13d74c0c` ("libdrgn: build with bundled elfutils") because the bundled version of elfutils had the fix. We forgot to bring it back in commit `4c5c5f3842` ("Remove bundled version of elfutils") even though we support versions without the fix. Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-09 15:16:12 -07:00
Omar Sandoval	82ca5634b5	libdrgn: fix copying value to big-endian from little-endian copy_lsbytes() doesn't copy enough bytes when copying from a smaller little-endian value to a larger big-endian value. This was caught by the test cases for DW_OP_deref{,_size}, but it can affect other places when debugging a little-endian target from a big-endian host or vice-versa. Closes #105. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-08 12:24:20 -07:00
Omar Sandoval	5a03d6b13f	drgn 0.0.13 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 16:17:11 -07:00
Omar Sandoval	faad25d7b2	libdrgn: debug_info: fix address of objects with size zero The stack trace variable work introduced a regression that causes objects with size zero to always be marked absent even if they have an address. This matters because GCC sometimes seems to omit the complete array type for arrays declared without a length, so an array variable can end up with an incomplete array type. I saw this with the "swapper_spaces" variable in mm/swap_state.c from the Linux kernel. Make sure to use the address of an empty piece if the variable is also empty. Fixes: `ffcb9ccb19` ("libdrgn: debug_info: implement creating objects from DWARF location descriptions") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 15:46:22 -07:00
Omar Sandoval	f7fe93e573	cli: show elfutils version in use drgn depends heavily on libelf and libdw, so it's useful to know what version we're using. Add drgn._elfutils_version and use that in the CLI and in the test cases where we currently check the libdw version. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 11:10:50 -07:00
Omar Sandoval	6357cea46b	drgn 0.0.12 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 01:13:28 -07:00
Omar Sandoval	bc85767e5f	libdrgn: support looking up parameters and variables in stack traces After all of the preparatory work, the last two missing pieces are a way to find a variable by name in the list of scopes that we saved while unwinding, and a way to find the containing scopes of an inlined function. With that, we can finally look up parameters and variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	38573cfdde	libdrgn: stack_trace: pretty print frames and add frames for inline functions If we want to access a parameter or local variable in an inlined function, then we need a stack frame for that function. It's also much more useful to see inlined functions in the stack trace in general. So, when we've unwound the registers for a stack frame, walk the debugging information to find all of the (possibly inlined) functions at the program counter, and add a drgn stack frame for each of those. Also add StackFrame.name and StackFrame.is_inline so that we can distinguish inline frames. Also add StackFrame.source() to get the filename and line and column numbers. Finally, add the source code location to pretty-printed stack traces and add pretty-printing for individual stack frames that includes extra information. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	0e113ecc8d	libdrgn: debug_info: add drgn_find_die_ancestors() This will be used for finding the ancestors of the abstract instance root corresponding to a concrete inlined instance root for variable lookups in inlined functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	d8d4157346	libdrgn: debug_info: add drgn_debug_info_module_find_dwarf_scopes() This will be used for finding functions, inlined functions, and blocks containing a PC for stack unwinding and variable lookups. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	b6d810b344	libdrgn: debug_info: add DWARF DIE iterator We have a couple of upcoming use cases for iterating through all of the DIEs in a module: searching for scopes and searching for a DIE's ancestors. Add a DIE iterator interface to abstract away the details of walking DIEs and allow us to efficiently track ancestors. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	ffcb9ccb19	libdrgn: debug_info: implement creating objects from DWARF location descriptions Add support for evaluating a DWARF location description and translating it into a drgn object. In this commit, this is just used for global variables, but an upcoming commit will wire this up to stack traces for parameters and local variables. There are a few locations that drgn's object model can't represent yet. DW_OP_piece/DW_OP_bit_piece can describe objects that are only partially known or partially in memory; we approximate these where we can. We don't have a good way to support DW_OP_implicit_pointer at all yet. This also adds test cases for DWARF expressions, which we couldn't easily test before. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	8335450ecb	libdrgn: debug_info: implement DW_OP_fbreg Implement looking up location descriptions and evaluating DW_OP_fbreg. This isn't actually used yet since CFI expressions don't have a current function DIE, but it will be used for parameters/local variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	d5b68455b8	libdrgn: debug_info: save .debug_loc .debug_loc will be used for variable resolution. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	e105be6c18	libdrgn: debug_info: add helper to cache module section Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	dcda688c9a	libdrgn: debug_info: parenthesize PUSH() macro argument It doesn't make a difference anywhere it's currently used, but let's do it just in case that changes in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	5fc879ef3e	libdrgn: debug_info: limit number of DWARF expression operations executed A malformed DWARF expression can easily get us into an infinite loop. Avoid this by capping the number of operations that we'll execute. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	0e3054a0ba	libdrgn: make addresses wrap around when reading memory Define that addresses for memory reads wrap around after the maximum address rather than the current unpredictable behavior. This is done by: 1. Reworking drgn_memory_reader to work with an inclusive address range so that a segment can contain UINT64_MAX. drgn_memory_reader remains agnostic to the maximum address and requires that address ranges do not overflow a uint64_t. 2. Adding the overflow/wrap-around logic to drgn_program_add_memory_segment() and drgn_program_read_memory(). 3. Changing direct uses of drgn_memory_reader_reader() to drgn_program_read_memory() now that they are no longer equivalent. (For some platforms, a fault might be more appropriate than wrapping around, but this is a step in the right direction.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-03 17:49:29 -07:00
Omar Sandoval	e5ff1ea7ac	libdrgn: program: use preset platform in drgn_program_set_core_dump() If the program already had a platform set, we should use its callbacks instead of the ones from the ELF file's platform. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-03 17:04:28 -07:00
Omar Sandoval	43b90ffb1b	libdrgn: debug_info: add missing stack size check for DW_OP_deref Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-23 13:14:28 -07:00
Omar Sandoval	ad37c79cba	libdrgn: python: add documentation and type annotation for Program.__contains__() drgn.Program has supported the "in" operator since commit `25e7a9d3b8` ("libdrgn/python: implement Program.__contains__"), but it's undocumented and unannotated. Add a type annotation with a docstring along with a METH_COEXIST method. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-12 16:26:56 -07:00
Omar Sandoval	92fd967a3a	libdrgn: print uint8_t as hex with PRIx8 format, not x In practice, they're probably always the same, but PRIx8 is more correct. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-07 15:34:30 -07:00
Omar Sandoval	e0921c5bdb	libdrgn: don't use OpenMP tasking libomp (at least in LLVM 9 and 10) seems to have buggy OpenMP tasking support. See commit `1cc3868955` ("CI: temporarily disable Clang") for one example. OpenMP tasks aren't buying us much; they simplify DWARF index updates in some places but complicate it in others. Let's ditch tasks and go back to building an array of CUs to index similar to what we did before commit `f83bb7c71b` ("libdrgn: move debugging information tracking into drgn_debug_info"). There is no significant performance difference. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-06 16:56:02 -07:00
Jay Kamat	95646b47c9	libdrgn: dwarf_index: add support for DW_FORM_indirect First, add instructions for DW_FORM_indirect. Then, we can call the function to convert a form to an instruction whenever we see an indirect instruction. Note that without elfutils commit d63b26b8d21f ("libdw: handle DW_FORM_indirect when reading attributes") (queued for elfutils 0.184), DW_FORM_indirect will cause errors later when parsing with libdw. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-05-04 16:56:54 -07:00
Omar Sandoval	609a1cafc6	libdrgn: dwarf_index: check for attribute forms more strictly Rather than silently ignoring attributes whose form we don't recognize, return an error. This way, we won't mysteriously skip indexing DIEs. While we're doing this, split the form -> instruction mapping to its own functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-04 16:56:54 -07:00
Omar Sandoval	2ad52cb5f4	libdrgn: add option to time load_debug_info example program I often use examples/load_debug_info to benchmark loading/DWARF indexing, so add a -T option that prints the time it takes to load debug info. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-23 09:29:44 -07:00
Omar Sandoval	2d40d6e146	libdrgn: add configure~ to .gitignore Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-23 09:18:16 -07:00
Jay Kamat	6be21f674a	libdrgn: follow DW_AT_signature when parsing DWARF types When using type units, skeleton declarations are made instead of concrete ones. However, these declarations have signature tags attached that point to the type unit with the definition, so we can simply follow the signature to get the concrete type. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Jay Kamat	9dabec1264	libdrgn: add support for parsing type units Adds support for parsing of type units as enabled by -fdebug-types-section. If a module has both a debug info section and type unit section, both are read. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	33300d426e	libdrgn: debug_info: don't overwrite Dwarf_Die passed to drgn_type_from_dwarf_internal() If the DIE passed to drgn_type_from_dwarf_internal() is a declaration, then we overwrite it with dwarf_offdie(). As far as I can tell, this doesn't break anything at the moment, but it's sketchy to overwrite an input parameter and may cause issues in the future. Use a temporary DIE on the stack in this case instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	155ec92ef2	libdrgn: fix reading 32-bit float object values on big-endian Closes #99. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-22 09:45:41 -07:00
Omar Sandoval	0e2703dd4e	libdrgn: python: use _PyDict_GetItemIdWithError() CPython commit fb5db7ec5862 ("bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648)") (in v3.10) removed _PyDict_GetItemId() because it suppresses errors. Use _PyDict_GetItemIdWithError() instead (which we should've been using anyways). Closes #101. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-22 01:17:22 -07:00
Omar Sandoval	c768e97394	libdrgn: python: use _Thread_local instead of PyThreadState for drgn_in_python Using a Python dictionary for this is much more heavyweight than just using a thread-local variable (with no benefit as far as I can tell). This also gets rid of a call to _PyDict_GetItem(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-22 01:17:12 -07:00
Omar Sandoval	08498967f7	libdrgn: configure with large file support /proc/pid/mem is indexed by address. On 32-bit systems, addresses may be out of the range of a 32-bit signed off_t. This results in pread() returning EINVAL in drgn_read_memory_file(). Use AC_SYS_LARGEFILE in configure.ac so that we use 64-bit off_t by default. Closes #98. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-21 13:34:31 -07:00
Omar Sandoval	78b4188dd9	drgn 0.0.11 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:50:09 -07:00
Omar Sandoval	e7367a4a94	libdrgn: Makefile: remove generated source files from CLEANFILES We don't actually want make clean to remove the generated files that are included in a distribution tarball, because then the user will need to regenerate them, and they might not have the dependencies installed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:31:14 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	76d3348a6d	libdrgn: hash_table: mark table##_delete_iterator() as unused GCC doesn't warn about table##_delete_iterator() being unused because it is inline, but Clang does, so add the unused attribute. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 16:28:46 -07:00
Omar Sandoval	acf722d315	libdrgn: hash_table: remove unused table##_chunk_set_capacity_scale() The folly implementation calls this elsewhere, but we only need it in table##_chunk_mark_eof(), so it was folded in there. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 16:27:27 -07:00
Omar Sandoval	b772432a86	libdrgn: cfi: don't rely on member containing a flexible array Clang enables -Wgnu-variable-sized-type-not-at-end by default, which warns for DRGN_CFI_ROW(): arch_x86_64.c:735:27: warning: field 'row' with variable sized type 'struct drgn_cfi_row' not at the end of a struct or class is a GNU extension [-Wgnu-variable-sized-type-not-at-end] .default_dwarf_cfi_row = DRGN_CFI_ROW( DRGN_CFI_ROW() is gnarly anyways, so instead of having it expand to a pointer expression relying on this GCC extension, make it expand to an initializer. Then, we can initialize default_dwarf_cfi_row as a separate variable rather than directly in the initializer for struct drgn_architecture_info. This still relies on a GCC extension for static initialization of flexible array members, but apparently Clang is okay with that one by default (-Wgnu-flexible-array-initializer must be enabled explicitly or by -Wgnu or -Wpedantic). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 16:19:21 -07:00
Omar Sandoval	5c86e30b6e	libdrgn: work around Clang __muloti4 for the third time See commit `0cb77b303c` ("libdrgn: work around Clang __muloti4 again") and commit `2dd14ad522` ("libdrgn: work around "undefined reference to '__muloti4'" when using Clang"). These keep sneaking in because I don't have an old enough version of Clang lying around. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 15:30:07 -07:00
Omar Sandoval	301cc3f139	libdrgn: fix UBSan "applying zero offset to null pointer" errors There are a couple of places where we compute `NULL + 0`, which is undefined behavior. Add a helper to do this safely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 13:38:29 -07:00
Omar Sandoval	9c31f11e35	libdrgn: object: fix UBSan error for uninitialized boolean drgn_object_reinit() and drgn_object_copy() can both load from an uninitialized little_endian field, causing UBSan errors like: libdrgn/object.h:105:27: runtime error: load of value 68, which is not a valid value for type '_Bool' This only happens when little_endian isn't valid for the type and won't be used anyways, but it's easy enough to work around. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 13:38:13 -07:00
Omar Sandoval	c9dc7fd574	libdrgn: type: fix memcpy() undefined behavior It's undefined behavior to pass NULL to memcpy() even if the length is zero. See also commit `a17215e984` ("libdrgn: dwarf_index: fix memcpy() undefined behavior"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 13:38:13 -07:00
Omar Sandoval	beb0c9d640	drgn 0.0.10	2021-03-31 13:32:05 -07:00
Omar Sandoval	f285764f8a	Include full libdrgn distribution in drgn sdist Building drgn from an sdist currently requires autotools and gawk because libdrgn in the sdist is more or less a git checkout. It's more user-friendly to include the autotools output and generated code. Do this by extending the sdist command to include a full libdrgn distribution with `make distdir`. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-30 23:19:38 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	090064f20d	libdrgn: x86-64: support R_X86_64_PC32 relocation type This is used for .orc_unwind_ip for kernel modules. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:36 -07:00
Omar Sandoval	e0aaaf203d	libdrgn: generalize applying ELF relocations To support unwinding with ORC, we need to apply relocations to .orc_unwind_ip, which libdwfl doesn't do. That means that we always need to apply relocations on x86-64, not just as a fast path when the file's byte order matches the host's. So, generalize handling of 64- vs 32-bit and little- vs big-endian relocations, and move the handling of relocation types to an arch-specific callback. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:16:35 -07:00
Omar Sandoval	63672be809	libdrgn: linux_kernel: save module .init section addresses Linux kernel modules usually contain ELF relocations in DWARF and ORC sections for symbols in .init sections. Since we ignore .init sections entirely in cache_kernel_module_sections(), these relocations end up being based on an address of 0 (so, e.g., a function from .init.text could be reported as having an address of 0x0). It makes a little more sense to use the address where the .init section was before it was freed. So, let's update the sections' sh_addr but continue ignoring them for determining the module's address range. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 15:13:47 -07:00
Omar Sandoval	da180b7274	libdrgn: handle errors from elf_strptr() For some reason, we consistently ignore errors from elf_strptr(), but we shouldn't. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 14:28:16 -07:00
Omar Sandoval	e5bc41f16c	libdrgn: add latest elf.h and dwarf.h to support elfutils 0.165 The oldest LTS version of Ubuntu, 16.04, has elfutils 0.165. This version is missing some ELF and DWARF definitions used by drgn. Add copies of elf.h from glibc 2.33 and dwarf.h and elfutils/known-dwarf.h from elfutils 0.183 to get the latest definitions and drop the minimum required version of elfutils further to 0.165. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-21 23:18:39 -07:00
Serapheim Dimitropoulos	a68abd5de4	libdrgn: stretch minimum supported version of libelf to 0.170 Currently libdrgn requires libelf to be of version 0.175 or later. This patch allows the library to be compiled with libelf 0.170 (the newest version supported by Ubuntu 18.04 LTS). Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>	2021-03-21 14:28:29 -07:00
Omar Sandoval	da0280016c	libdrgn: python: identify bit fields in TypeMember.__repr__ If a member is a bit field, then we should format it with the underlying Object so that it shows the bit field size. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-17 12:02:53 -07:00
Omar Sandoval	55354b3038	libdrgn: use flexible array for pgtable_iterator::arch There's no reason to use GCC's zero-length array extension for this. Use a standard flexible array instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 16:18:49 -07:00
Omar Sandoval	38d4330fec	libdrgn: clean up stale comment references and Doxygen warnings Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 16:15:43 -07:00
Omar Sandoval	671947d185	libdrgn: remove unused drgn_program::attached_dwfl_state I missed this when I removed the code that used it. Fixes: `eec67768aa` ("libdrgn: replace elfutils DWARF unwinder with our own") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 15:41:07 -07:00
Omar Sandoval	4c5c5f3842	Remove bundled version of elfutils We currently bundle a version of elfutils with patches to export additional stack tracing functionality. This has a few drawbacks: - Most of drgn's build time is actually building elfutils. - Distributions don't like packages that bundle versions of other packages. - elfutils, and thus drgn, can't be built with clang. Now that we've replaced the elfutils DWARF unwinder with our own, we don't need the patches, so we can drop the bundled elfutils and fix these issues. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 00:52:09 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	35a1af7ad6	libdrgn: add DWARF expression evaluation For DW_CFA_def_cfa_expression, DW_CFA_expression, and DW_CFA_val_expression, we need to be able to evaluate a DWARF expression. Add an interface for this. It doesn't yet support operations that aren't applicable to CFI or some more exotic operations. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	fdaf7790a9	libdrgn: add DWARF call frame information parsing In preparation for adding our own unwinder, add support for parsing and finding DWARF/EH call frame information. Use a generic representation of call frame information so that we can support other formats like ORC in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	0a6aaaae5d	libdrgn: define structure for storing processor register values libdwfl stores registers in an array of uint64_t indexed by the DWARF register number. This is suboptimal for a couple of reasons: 1. Although the DWARF specification states that registers should be numbered for "optimal density", in practice this isn't the case. ABIs include unused ranges of numbers and don't order registers based on how likely they are to be known (e.g., caller-saved registers usually aren't recovered while unwinding the stack, but they are often numbered before callee-saved registers). 2. This precludes support for registers larger than 64 bits, like SSE registers. For our own unwinder, we want to store registers in an architecture-specific format to solve both of these problems. So, have each architecture define its layout with registers arranged for space efficiency and convenience when parsing saved registers from core dumps. Instead of generating an arch_foo.c file from arch_foo.c.in, separately define the logical register order in an arch_foo.defs file, and use it to generate an arch_foo.inc file that is included from arch_foo.c. The layout is defined as a macro in arch_foo.c. While we're here, drop some register definitions that aren't useful at the moment. Then, define struct drgn_register_state to efficiently store registers in the defined format. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:36:38 -07:00
Omar Sandoval	cc1a5606d0	libdrgn: debug_info: save platform per module Stack unwinding depends on some platform-specific information. If for some reason a program has debugging information with different platforms, then we need to make sure that while we're unwinding the stack, we don't end up in a frame with a different platform, because the registers won't make sense. Additionally, we should parse debugging information using the module's platform rather than the program's platform, which may not match. So, cache the platform derived from each module's ELF file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	6065fc87af	libdrgn: debug_info: save .debug_frame, .eh_frame, .text, and .got These sections are needed for stack unwinding. However, .debug_frame and .eh_frame don't need to be read right away, and .text and .got don't need to be read at all, so partition them accordingly. Also, check that the sections are specifically SHT_PROGBITS rather than not SHT_NOBITS. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	744cc414d3	libdrgn: add copy_lsbytes() It will be used to copy register values. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:48 -07:00
Omar Sandoval	b0a6d12501	libdrgn: binary_buffer: add binary_buffer_next_[us]int() These will be used for parsing .debug_frame, .eh_frame, and DWARF expressions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 12:13:45 -07:00
Omar Sandoval	b55a5f7f4b	libdrgn: binary_buffer: add binary_buffer_next_sN() Along with _into_s64 and _into_u64 variants. These will be used for parsing .eh_frame and DWARF expressions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-10 02:07:30 -08:00
Omar Sandoval	e5219b13e3	libdrgn: binary_buffer: add binary_buffer_next_sleb128() Revive it from all the way back in commit `90fbec02fc` ("dwarfindex: delete unused read_sleb128() and read_strlen()") and add an _into_u64 variant. These will be used for parsing .debug_frame, .eh_frame, and DWARF expressions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-10 02:07:30 -08:00
Omar Sandoval	7eab40aaeb	libdrgn: rename drgn_error_debug_info() to drgn_error_debug_info_scn() An upcoming change will introduce a similar function for when the section isn't known. Rename the original so that the new one can take its name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-10 02:07:16 -08:00
Jay Kamat	4552d78f4a	libdrgn: debug_info: try to find DIE specification when parsing type Currently, we look up incomplete types by name, which can fail if the name is ambiguous or the type is unnamed. Try finding the complete type via the DW_AT_specification map in the DWARF index first. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-03-08 15:24:24 -08:00
Jay Kamat	3823b21e17	libdrgn: dwarf_index: uses DIE address instead of section offset To support indexing DWARF 4 type units, we need to be able to differentiate between DIEs in .debug_info and .debug_types. We can't do that with just a section offset, so instead store the address of the DIE in the index and specification map. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-03-08 15:24:24 -08:00
Omar Sandoval	ca1a2598fd	libdrgn: python: add missing function name to Object.format_() exceptions The ":function name" is missing from the PyArg_ParseTupleAndKeywords() call in DrgnObject_format(), so errors say, for example, "'foo' is an invalid keyword argument for this function" instead of "for format_()". Fixes: `cf3a07bdfb` ("libdrgn: python: replace Object.__format__ with Object.format_") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-08 14:27:33 -08:00
Omar Sandoval	a24c0f5b33	libdrgn: clean up usage of drgn_stop Use drgn_not_found where it's more appropriate, and check explicitly against drgn_stop instead of err->code == DRGN_ERROR_STOP. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-05 12:46:06 -08:00
Omar Sandoval	4680b93103	libdrgn: improve truncate_signed() and truncate_unsigned() truncate_signed() requires 5 operations (compute a mask for the lower bits, and it, compute the sign extension mask, xor it, subtract it) and a branch. We can do it in 3 operations and no branches if we assume that the compiler does an arithmetic shift for signed integers, which we already depend on. Then, we can remove sign_extend(), which is the same as truncate_signed() except it assumes that the upper bits are zero to save on a couple of operations. Similarly, for truncate_unsigned() we can remove the branch. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	b5ed892481	Fix some include-what-you-use warnings and update for Bear 3 Bear 3 changed the CLI arguments, so update scripts/iwyu.py for it and clean up some new warnings. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	98e1947d26	libdrgn: require non-NULL drgn_architecture_info::register_by_name Instead of checking whether it's NULL, define a stub for arch_info_unknown. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	25eb2abb1a	libdrgn: add drgn_platform getters Add low-level getters equivalent to the drgn_program platform-related helpers and use them in places where we have checked or can assume that the platform is known. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	e04eda9880	libdrgn: define HOST_LITTLE_ENDIAN As a minor cleanup, instead of writing __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ everywhere, define and use HOST_LITTLE_ENDIAN. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Jay Kamat	c22e501295	libdrgn: debug_info: fix parsing specifications of declarations drgn_compound_type_from_dwarf() and drgn_enum_type_from_dwarf() check the DW_AT_declaration flag to decide whether the type is a declaration of an incomplete type or a definition of a complete type. However, they check DW_AT_declaration with dwarf_attr_integrate(), which follows the DW_AT_specification reference if it is present. The DIE referenced by DW_AT_specification typically is a declaration, so this erroneously identifies definitions as declarations. Additionally, if drgn_debug_info_find_complete() finds the same definition, we can end up recursing until we hit the DWARF parsing depth limit. Fix it by not using dwarf_attr_integrate() for DW_AT_declaration. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-02-25 10:46:34 -08:00
Omar Sandoval	aaa98ccae3	libdrgn: consistently use __ for __attribute__ names In some places, we add __ preceding and following an attribute name, and in some places, we don't. Let's make it consistent. We might as well opt for the __ to make clashes with macros less likely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 03:16:23 -08:00
Omar Sandoval	c54ef80412	libdrgn: add missing LIBDRGN_PUBLIC exports drgn_object_dereference_offset() and drgn_object_member_dereference() are both in drgn.h.in but aren't exported. They should be. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:42:12 -08:00
Omar Sandoval	3ecb31de9f	libdrgn: update stale references in drgn_object_slice() comment drgn_program_member_info() was replaced by drgn_type_find_member() in commit `e72ecd0e2c` ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()"). drgn_object_pointer_offset() never existed; it's supposed to be drgn_object_dereference_offset(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:41:20 -08:00
Omar Sandoval	da1e72f0d5	libdrgn: remove drgn_{,qualified_}type_eq() from drgn.h.in The definitions were removed but these public declarations weren't. Fixes: `7d7aa7bf7b` ("libdrgn/python: remove Type == operator") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:37:36 -08:00
Omar Sandoval	55e3a58e06	libdrgn: python: use correct member offset when creating object from value We need to use the offset of the member in the outermost object type, not the offset in the immediate containing type in the case of nested anonymous structs. Fixes: `e72ecd0e2c` ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:29:59 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	72b4aa9669	libdrgn: clean up object initialization Rename struct drgn_object_type to struct drgn_operand_type, add a new struct drgn_object_type which contains all of the type-related fields from struct drgn_object, and use it to implement drgn_object_type() and drgn_object_type_operand(), which are replacements for drgn_object_set_common() and drgn_object_type_encoding_and_size(). This cleans up a lot of the boilerplate around initializing objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 17:43:14 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Omar Sandoval	f09ab62b73	drgn 0.0.9 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 02:19:09 -08:00
Omar Sandoval	36df5fc076	libdrgn: ppc64: fix fetching cr fields from pt_regs The condition register fields are numbered from most significant to least significant. Also, the CFI for unwinding the condition register fields restores them in their position in the condition register, so do the same when initially populating them. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 00:45:14 -08:00
Kamalesh Babulal	221a218704	libdrgn: add powerpc stack trace support Add powerpc-specific register information required to retrieve the stack traces of tasks both on a live system and from a core dump. It uses the existing DSL format to define platform registers and helper functions to initialize them. It also adds architecture-specific information to enable powerpc. Current support is for little-endian powerpc only. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>	2021-01-29 11:31:59 -08:00
Omar Sandoval	b899a10836	Remove register numbers from API and add register aliases enum drgn_register_number in the public libdrgn API and drgn.Register.number in the Python bindings are basically exports of DWARF register numbers. They only exist as a way to identify registers that's lighter weight than string lookups. libdrgn already has struct drgn_register, so we can use that to identify registers in the public API and remove enum drgn_register_number. This has a couple of benefits: we don't depend on DWARF numbering in our API, and we don't have to generate drgn.h from the architecture files. The Python bindings can just use string names for now. If it seems useful, StackFrame.register() can take a Register in the future, we'll just need to be careful to not allow Registers from the wrong platform. While we're changing the API anyways, also change it so that registers have a list of names instead of one name. This isn't needed for x86-64 at the moment, but will be for architectures that have multiple names for the same register (like ARM). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 17:47:45 -08:00
Omar Sandoval	10e6464769	libdrgn: python: clean up module creation Add a helper based on PyModule_AddType() from Python 3.9 and use it to simplify PyInit__drgn(). Also handle errors in PyInit__drgn(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 12:41:13 -08:00
Omar Sandoval	0d35dec8ee	libdrgn: python: define Py_RETURN_BOOL And use it instead of an if statement with Py_RETURN_TRUE/Py_RETURN_FALSE or PyBool_FromLong(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 11:35:09 -08:00
Omar Sandoval	46343ae08d	libdrgn: get rid of struct drgn_stack_frame In preparation for adding a "real", internal-only struct drgn_stack_frame, replace the existing struct drgn_stack_frame with explicit trace/frame arguments. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-27 11:22:34 -08:00
Omar Sandoval	71c6ac6927	libdrgn: use drgn_debug_info_module instead of Dwfl_Module in more places It's easier to go from drgn_debug_info_module to Dwfl_Module than the other direction, and I'd rather use the "higher-level" drgn_debug_info_module wherever possible. So, store drgn_debug_info_module in the DWARF index (which also saves a dereference while building the index), and pass around drgn_debug_info_module when parsing types/objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-27 11:17:41 -08:00
Omar Sandoval	bbefc573d8	libdrgn: debug_info: make sure DW_TAG_template_value_parameter has value Otherwise, an invalid DW_TAG_template_value_parameter can be confused for a type parameter. Fixes: `352c31e1ac` ("Add support for C++ template parameters") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 12:07:46 -08:00
Omar Sandoval	2c612ea97f	libdrgn: fix address of global per-CPU variables with KASLR The address of a per-CPU variable is really an offset into the per-CPU area, but we're applying the load bias (i.e., KASLR offset) to it as if it were an address, resulting in an invalid pointer when it's eventually passed to per_cpu_ptr(). Fix this by applying the bias only if the address is in the module's address range. This heuristic avoids any Linux kernel-specific logic; hopefully it doesn't have any undesired side effects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 10:14:50 -08:00
Omar Sandoval	a7962e9477	libdrgn: debug_info: pass around Dwfl_Module instead of bias We're going to need the module start and end in drgn_object_from_dwarf_variable(), so pass the Dwfl_Module around and get the bias when we need it. This means we don't need the bias from drgn_dwarf_index_get_die(), so get rid of that, too. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 10:12:29 -08:00
Omar Sandoval	048952f9a6	libdrgn: x86-64: fix rsp of initial stack frame We're using task->thread.sp for rsp in the initial frame for both the struct inactive_task_frame path and frame pointer path. This is not correct for either. For kernels with struct inactive_task_frame, task->thread.sp points to the struct inactive_task_frame. The stack pointer in the initial frame is the address immediately after the struct inactive_task_frame. For kernels without struct inactive_task_frame, task->thread.sp points to the saved rbp. We follow that rbp to the rbp and return address for the initial frame; its stack pointer is the address immediately after those. Fixes: `10142f922f` ("Add basic stack trace support") Fixes: `51596f4d6c` ("libdrgn: x86-64: remove garbage initial stack frame on old kernels") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-15 10:57:08 -08:00
Omar Sandoval	352c31e1ac	Add support for C++ template parameters Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_AT_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	b6958f920c	libdrgn: debug_info: move object parsing code in debug_info.c In preparation for calling the object parsing code from the type parsing code, move it up in the file (and update the coding style in drgn_object_from_dwarf_enumerator() while we're at it). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	be1bb279aa	libdrgn: debug_info: pass DIE bias when parsing types This will be needed for types containing reference objects. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	190062f470	libdrgn: get drgn_type_member.bit_field_size through drgn_member_type() Getting the bit field size of a member will soon require evaluating the lazy type, so return it from drgn_member_type() instead of accessing it directly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	359177295d	libdrgn: move type definitions in drgn.h In preparation for struct drgn_type referencing struct drgn_object, move the former after the latter. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	934dd36302	libdrgn: remove unused name parameter from drgn_object_from_dwarf_{subprogram,variable}() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 12:16:29 -08:00
Omar Sandoval	a57c26ed32	libdrgn: fix zero-length array GCC < 9.0 workaround for qualified types We're not applying the zero-length array workaround when the array type is qualified. Make sure we pass through can_be_incomplete_array when parsing DW_TAG_{const,restrict,volatile,atomic}_type. Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 11:21:57 -08:00
Omar Sandoval	ca7682650d	libdrgn: rename drgn_type_from_dwarf_child() to drgn_type_from_dwarf_attr() The type comes from the DW_AT_type attribute of the DIE, not a child DIE, so this is a better name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 11:02:27 -08:00
Omar Sandoval	798f0887a5	libdrgn: simplify language fall back handling If the language for a DWARF type is not found or unrecognized, we should fall back to the global default, not the program default (the program default language is for language-specific operations on the program, so DWARF parsing shouldn't depend on it). Add a fall_back parameter to drgn_language_from_die() and use it in DWARF parsing, and replace drgn_language_or_default() with a drgn_default_language variable. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 10:46:35 -08:00
Omar Sandoval	a8be40ca60	libdrgn: python: fix Program_hold_object() reference leak We should only increment a held object's reference count when it is initially inserted into the set; subsequent holds are no-ops. Fixes: `a8d632b4c1` ("libdrgn/python: use F14 instead of PyDict for Program::objects") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-06 17:48:29 -08:00
Omar Sandoval	b87070f98c	libdrgn: fix vector_shrink_to_fit() with size 0 realloc(ptr, 0) is equivalent to free(ptr). It may return NULL, in which case vector_do_shrink_to_fit() won't update the vector's data and capacity. A subsequent append will then try to reuse the previous allocation, causing a use-after-free. free() empty vectors explicitly instead. Fixes: `8d52536271` ("libdrgn: add common vector implementation") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-06 17:47:39 -08:00
Omar Sandoval	d6a840ec30	libdrgn: deinitialize empty members/parameters/enumerators when deduplicating Right now, an empty builder vector will not have anything to free, but if we start pre-reserving these later, it will be a leak. Fixes: `c7af566c6e` ("libdrgn: deduplicate all types with no members/parameters/enumerators") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-06 17:47:25 -08:00
Omar Sandoval	c7af566c6e	libdrgn: deduplicate all types with no members/parameters/enumerators Even if a compound, function, or enumerated type is complete, we can still deduplicate it as long as it doesn't have members, parameters, or enumerators. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-06 01:59:48 -08:00
Omar Sandoval	988e9e7190	libdrgn/python: add Object.absent_ Without this, the only way to check whether an object is absent in Python is to try to use the object and catch the ObjectAbsentError. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 15:06:40 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	c2eec00ae0	libdrgn/python: use None instead of 0 for TypeMember.bit_field_size Make TypeMember.bit_field_size consistent with Object.bit_field_size_ by using None to represent a non-bit field instead of 0. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-25 01:53:23 -08:00
Omar Sandoval	7d7aa7bf7b	libdrgn/python: remove Type == operator The == operator on drgn.Type is only intended for testing. It's expensive and slow and not what people usually want. It's going to get even more awkward to define once types can refer to objects (for template parameters and static members and such). Let's replace == with a new identical() function only available in unit tests. Then, remove the operator from the Python bindings as well as the underlying libdrgn drgn_type_eq() and drgn_qualified_type_eq() functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 03:11:38 -08:00
Omar Sandoval	523fd26959	libdrgn: don't allow casting to non-scalar types at all Currently, we try to emulate the GNU C extension of casting a struct type to itself. This does a deep type comparison, which is expensive. We could take a shortcut like only comparing the kind and type name, but seeing as standard C only allows casting to a scalar type, let's drop support for casting to a struct (or other non-scalar) type entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 02:46:05 -08:00
Omar Sandoval	40004e5c8f	libdrgn/python: add offsetof() offsetof() can almost be implemented with Type.member(name).offset, but that doesn't parse member designators. Add an offsetof() function that does (and add drgn_type_offsetof() in libdrgn). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:46:41 -08:00
Omar Sandoval	a595e52d22	libdrgn/python: add Type.has_member() Add drgn_type_has_member() to libdrgn and Type.has_member() to the Python bindings. This can simplify some version checks, like the one in _for_each_block_device() since commit `9a10a927b0` ("helpers: fix for_each_{disk,partition}() on kernels >= v5.1"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:38:48 -08:00
Omar Sandoval	fd04463596	libdrgn/python: add Type.member() In Python, looking up a member in a drgn Type by name currently looks something like: member = [member for member in type.members if member.name == "foo"][0] Add a Type.member(name) method, which is both easier and more efficient. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:10:23 -08:00
Omar Sandoval	e72ecd0e2c	libdrgn: replace drgn_program_member_info() with drgn_type_find_member() Now that types are associated with their program, we don't need to pass the program separately to drgn_program_member_info() and can replace it with a more natural drgn_type_find_member() API that takes only the type and member name. While we're at it, get rid of drgn_member_info and return the drgn_type_member and bit_offset directly. This also fixes a bug that drgn_error_member_not_found() ignores the member name length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:40:54 -08:00
Omar Sandoval	cf9a068820	libdrgn/python: fix reference counting on Type.members and Type.parameters The TypeMember and TypeParameter instances referring to a libdrgn drgn_lazy_type are only valid as long as the Type containing them is still alive. Hold a reference on the containing Type from LazyType. We can do this without growing LazyType by getting rid of the enum state and using sentinel values for LazyType::lazy_type as the state. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:09:12 -08:00
Omar Sandoval	738ae2c75f	libdrgn: pack struct drgn_object better We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64) by moving the bit_offset and little_endian members out of the value and reference structs. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	22c1d87aec	libdrgn: cache page_offset and vmemmap as objects instead of uint64_t This is a little cleaner and saves on conversions back and forth between C values and objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:40:07 -08:00
Omar Sandoval	bce9ef5f8d	libdrgn: linux kernel: remove THREAD_SIZE object finder THREAD_SIZE is still broken and I haven't looked into the root cause (see commit `95be142d17` ("tests: disable THREAD_SIZE test")). We don't need it anymore anyways, so let's remove it entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:08:13 -08:00
Omar Sandoval	51596f4d6c	libdrgn: x86-64: remove garbage initial stack frame on old kernels On old kernels, we set the initial frame as containing only rbp and let libdwfl unwind it assuming frame pointers from there. This means that the initial frame has a garbage rip. Follow the frame pointer and set the previous rbp and return address ourselves instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:02:54 -08:00
Omar Sandoval	6e189027be	libdrgn: x86-64: pass frame object as const Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 01:55:36 -08:00
Omar Sandoval	3187453689	libdrgn: x86-64: remove unused read Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 01:38:11 -08:00
Omar Sandoval	ffa2e0acf1	libdrgn: add missing break in drgn_object_copy() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-09 10:20:37 -08:00
Omar Sandoval	97fbedec1f	libdrgn: return unavailable objects for DWARF objects without value or address Now that we have the concept of unavailable objects, use it for DWARF where appropriate. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 14:15:09 -08:00
Omar Sandoval	6bd0c2b4d2	libdrgn: add concept of "unavailable" objects There are some situations where we can find an object but can't determine its value, like local variables that have been optimized out, inlined functions without a concrete instance, and pure virtual methods. It's still useful to get some information from these objects, namely their types. Let's add the concept of an "unavailable" object, which is an object with a known type but unknown value/address. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:58:19 -08:00
Omar Sandoval	5f17281926	libdrgn: make drgn_object::is_reference an enum To prepare for a new kind of object, replace the is_reference bool with an enum drgn_object_kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:37:58 -08:00
Omar Sandoval	edb1fe7f2f	libdrgn: rename drgn_object_kind to drgn_object_encoding I'd like to use the name drgn_object_kind to distinguish between values and references. "Encoding" is more accurate than "kind", anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:26 -08:00
Omar Sandoval	2710b4d2aa	libdrgn: add macros for strict enum switch statements There are several places where we'd like to enforce that every enumeration is handled in a switch. Add SWITCH_ENUM() and SWITCH_ENUM_DEFAULT() macros for that and use them. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:23 -08:00
Omar Sandoval	a4dbd7bf95	libdrgn: remove unused DRGN_NUM_ARCH Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:23 -08:00
Omar Sandoval	3360170336	libdrgn: only install page table memory reader when supported If virtual address translation isn't implemented for the target architecture, then we shouldn't add the page table memory reader. If we do, we get a DRGN_ERROR_INVALID_ARGUMENT error from linux_helper_read_vm() instead of a DRGN_ERROR_FAULT error as expected. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-27 01:27:30 -08:00
Omar Sandoval	5975d19580	libdrgn: report better errors when parsing DWARF/kmod index If the DWARF index encounters any error while parsing, it returns an error saying only "debug information is truncated", which makes it hard to track down parsing errors. The kmod index parser silently swallows errors. For both, replace the mread functions with a higher-level binary_buffer interface that can include more information including the location of the error. For example: /tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-13 17:00:07 -08:00
Omar Sandoval	756e5d27ad	libdrgn: debug_info: put sections in an array (again) Back in commit `9ce9094ee0` ("libdrgn: dwarf_index: don't copy sections into each CU"), I changed the sections to be individual members. The next change will be easier if they're in an array. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 16:22:04 -08:00
Omar Sandoval	f0a3629c26	libdrgn: debug_info: add dwarf_tag_str() and use it for error messages There are several places where we manually pass around the string name of a tag so it can be used for error messages. Do it programmatically instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 16:22:04 -08:00
Omar Sandoval	3885697696	drgn 0.0.8 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-11-11 13:32:04 -08:00
Omar Sandoval	fa081e32b9	libdrgn: update module section iterator for Linux v5.8 Linux v5.8 changed the module section structure, so we need to get the section name differently. Closes #73. Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-13 13:07:36 -07:00
Omar Sandoval	1c6465f0b0	libdrgn: fix infinite loop on error caching kernel module sections If cache_kernel_module_sections() in report_loaded_kernel_module() fails, we continue to the next iteration without advancing to the next kernel module. Then, we fail on that same kernel module and repeat. Make sure that we go to the next kernel module. Fixes: `423d2cd500` ("libdrgn: dwarf_index: rework file reporting") Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-13 12:25:22 -07:00
Omar Sandoval	431b91ddb5	libdrgn: fix use-after-free in kernel module reporting error case We're freeing path and then using it to report an error. This has some weird knock-on effects. Since we freed the path, the error message contains garbage. So, PyErr_SetString() can't decode it as a UTF-8 string. The end result is a MissingDebugInfoError with no message. Fix it by creating the error before freeing the path. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-13 11:55:06 -07:00
Omar Sandoval	2b325b9262	libdrgn: add an environment variable to disable use of /proc/modules and /sys/module We use /proc/modules and /sys/module to find loaded kernel modules for the running kernel instead of walking the module list in the core dump as an optimization. To make it easier to test the core dump path, add an environment variable to disable the optimization. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-13 11:24:39 -07:00
Omar Sandoval	661a5c56c3	libdrgn: refactor kernel module iterator The next commit will allow using the offline path for the live kernel, so the offline naming won't make much sense. Fold the offline path into the top-level functions, and make the live path an escape hatch. Also add some comments and improve naming for the file and directory handles and update the coding style. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-12 23:22:01 -07:00
Omar Sandoval	ce8540e39c	libdrgn: get rid of kernel_module_iterator::notes* These were added in commit `e5874ad18a` ("libdrgn: use libdwfl"), but they have never been used. Remove them. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-12 17:23:31 -07:00
Omar Sandoval	3c5d22637e	libdrgn: clean up hash function APIs and improve documentation Use _hash_pair() for hash functions that do the full double hashing and return a struct hash_pair and hash_() for other hashing utility functions. Also change some of the equality function names to be more symmetric and improve the documentation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-12 16:20:08 -07:00
Omar Sandoval	761da83ddd	libdrgn: add {min,max}_iconst() and rewrite min() and max() min() and max() from the Linux kernel go through the trouble of resulting in a constant expression if the arguments are constant expressions, but they can't be used outside of a function due to their use of ({ }). This means that they can't be used for, e.g., enumerators or global arrays. Let's simplify min() and max() and instead add explicit min_iconst() and max_iconst() macros that can be used everywhere that an integer constant expression is required. We can then use it in hash_table.h. While we're here, let's split these into their own header file and document them better. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-10 23:48:03 -07:00
Omar Sandoval	fa44171ba1	libdrgn: split bit operations into their own header And improve their documentation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-09 17:44:15 -07:00
Omar Sandoval	cae79d2676	libdrgn: add preprocessor utility macros These will be used in upcoming changes. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-09 16:36:59 -07:00
Omar Sandoval	4cbb9b552a	libdrgn: fix comparison of types with anonymous members drgn_type_members_eq() skips comparing the types of anonymous members. Fix that and add a test for it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-08 17:32:46 -07:00
Omar Sandoval	de6a4e07ae	libdrgn: fix Doxygen The Doxygen documentation for libdrgn has bit-rotted over time. Bring back the Internal module, clean up a few renamed members and parameters, and fix broken parsing caused by the generic definition macros. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-30 01:32:33 -07:00
Omar Sandoval	2704fd17c3	libdrgn: update Doxyfile doxygen warns about a few obsolete Doxyfile options. Update it with doxygen -u. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-29 23:04:08 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	fdbe336386	libdrgn: use -isystem for elfutils headers The elfutils header files should be treated as if they were in the standard location, so use -isystem instead of -I. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 15:45:10 -07:00
Omar Sandoval	89b5da2abb	libdrgn: dwarf_index: free namespaces when rolling back Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	e69d0c0064	libdrgn: dwarf_index: fix use after free of pending CU If we create a pending CU for a namespace, then add more CUs to the index, the CU might get reallocated, resulting in a use after free. Fix it by storing the index of the CU instead of the pointer. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	f83bb7c71b	libdrgn: move debugging information tracking into drgn_debug_info Debugging information tracking is currently in two places: drgn_program finds debugging information, and drgn_dwarf_index stores it. Both of these responsibilities make more sense as part of drgn_debug_info, so let's move them there. This prepares us to track extra debugging information that isn't pertinent to indexing. This also reworks a couple of details of loading debugging information: - drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into a single structure, drgn_debug_info_module. - The first pass of DWARF indexing now happens in parallel with reading compilation units (by using OpenMP tasks). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	3ac9ae357b	libdrgn: rename drgn_dwarf_info_cache to drgn_debug_info The current name is too verbose. Let's go with a shorter, more generic name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-11 17:41:23 -07:00
Jay Kamat	d1beb0184a	libdrgn: add support for objects in C++ namespaces DWARF represents namespaces with DW_TAG_namespace DIEs. Add these to the DWARF index, with each namespace being its own sub-index. We only index the namespace itself when it is first accessed, which should help with startup time and simplifies tracking. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2020-09-02 17:13:16 -07:00
Jay Kamat	a51abfcd70	libdrgn: dwarf_index: keep CUs after indexing In order to index namespaces lazily, we need the CU structures. Rename struct compilation_unit to the less generic struct drgn_dwarf_index_cu and keep the CUs in a vector in the dindex. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	66ad5077c9	libdrgn: dwarf_index: return indexed DIE entry from drgn_dwarf_index_iterator_next() For namespace support, we will want to access the struct drgn_dwarf_index_die for namespaces instead of the Dwarf_Die. Split drgn_dwarf_index_get_die() out of drgn_dwarf_index_iterator_next(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	d512964c1e	libdrgn: add drgn_error_copy() This is needed for a future change where we'll want to save an error and return it multiple times. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	7a85b4188e	libdrgn: clean up read.h helpers and avoid undefined pointer behavior There are a couple of related ways that we can cause undefined behavior when parsing a malformed DWARF or depmod index file: 1. There are several places where we increment the cursor to skip past some data. It is undefined behavior if the result points out of bounds of the data, even if we don't attempt to dereference it. 2. read_in_bounds() checks that ptr <= end. This pointer comparison is only defined if ptr and end both point to elements of the same array object or one past the last element. If ptr has gone past end, then this comparison is likely undefined anyways. Fix it by adding a helper to skip past data with bounds checking. Then, all of the helpers can assume that ptr <= end and maintain that invariant. While we're here and auditing all of the call sites, let's clean up the API and rename it from read_foo() to the less generic mread_foo(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	c053c2b212	libdrgn: dwarf_index: handle DW_AT_specification with DW_FORM_ref_addr Now that we can handle a DW_AT_specification that references another compilation unit, add support for DW_FORM_ref_addr. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	26291647eb	libdrgn: dwarf_index: handle DW_AT_specification DIEs with two passes We currently handle DIEs with a DW_AT_specification attribute by parsing the corresponding declaration to get the name and inserting the DIE as usual. This has a couple of problems: 1. It only works if DW_AT_specification refers to the same compilation unit, which is true for DW_FORM_ref{1,2,4,8,_udata}, but not DW_FORM_ref_addr. As a result, drgn doesn't support the latter. 2. It assumes that the DIE with DW_AT_specification is in the correct "scope". Unfortunately, this is not true for g++: for a variable definition in a C++ namespace, it generates a DIE with DW_AT_declaration as a child of the DW_TAG_namespace DIE and a DIE which refers to the declaration with DW_AT_specification _outside_ of the DW_TAG_namespace as a child of the DW_TAG_compilation_unit DIE. Supporting both of these cases requires reworking how we handle DW_AT_specification. This commit takes an approach of parsing the DWARF data in two passes: the first pass reads the abbreviation and file name tables and builds a map of instances of DW_AT_specification; the second pass indexes DIEs as before, but ignores DIEs with DW_AT_specification and handles DIEs with DW_AT_declaration by looking them up in the map built by the first pass. This approach is a 10-20% regression in indexing time in the benchmarks I ran. Thankfully, it is not 100% slower for a couple of reasons. The first is that the two passes are simpler than the original combined pass. The second is that a decent part of the indexing time is spent faulting in the mapped debugging information, which only needs to happen once (even if the file is cached, minor page faults add non-negligible overhead). This doesn't handle DW_AT_specification "chains" yet, but neither did the original code. If it is necessary, it shouldn't be too difficult to add. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	507977664c	libdrgn: dwarf_index: store abbreviation and file name tables in CU This is preparation for the next change where we'll need to do two passes over the CUs. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	0b4ab1772b	libdrgn: dwarf_index: store DIE indices as uint32_t It's very unlikely that we'll ever index more than 4 billion DIEs in a single shard, so we can shrink the index a bit by using uint32_t indices (and uint8_t tag). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	9ce9094ee0	libdrgn: dwarf_index: don't copy sections into each CU I originally copied the sections into each compilation unit to avoid a pointer indirection, but performance-wise it's a wash, so we might as well save the memory. This will be more important when we keep the CUs after indexing. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	94e7b1f92c	libdrgn: dwarf_index: avoid copying CUs for one thread In read_cus(), the master thread can use the final CUs vector directly and the rest of the threads can merge their private vectors in. This consistently shaves a few milliseconds off of startup. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	53ba7262cd	libdrgn: dwarf_index: handle DW_AT_declaration with DW_FORM_flag We currently assume that if DW_AT_declaration is present, it is true. This seems to be true in practice, and I see no reason to ever use DW_FORM_flag with a value of zero. There's no performance hit to handle it, though, so we might as well. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	ea9f3f3114	libdrgn: dwarf_index: don't worry about tag of CU DIE As a small simplification, we can take commit `9bb2ccecb7` ("Enable DWARF indexing to work with partial units") further and not look at the tag of the top-level DIE at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	c8f84c57fb	libdrgn: dwarf_index: use size_t instead of uint64_t where appropriate The CU unit length and DIE offset are both limited by the size of the mapped debugging information, i.e., size_t. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	2252bef1a7	libdrgn: dwarf_index: rename TAG_FLAG_* and TAG_MASK to DIE_FLAG_* This is more clear: although these flags happen to be encoded with the DWARF tag, they are flags regarding the DIE. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	85c4b36820	libdrgn: dwarf_index: fix leak when parsing bad line number program header If we fail to read an include directory in read_file_name_table(), we need to free the directory hashes. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	ff96c75da0	helpers: translate task_state_to_char() to Python Commit `326107f054` ("libdrgn: add task_state_to_char() helper") implemented task_state_to_char() in libdrgn so that it could be used in commit `4780c7a266` ("libdrgn: stack_trace: prohibit unwinding stack of running tasks"). As of commit `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust"), it is no longer used in libdrgn, so we can translate it to Python. This removes a bunch of code and is more useful as an example. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 13:54:39 -07:00
Omar Sandoval	e96d9fd3fd	libdrgn/python: don't allow None for Program.object() flags Similar to the previous commit, this was to work around pydoc issues that we don't have anymore. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:29 -07:00
Omar Sandoval	2fc514f2a4	libdrgn/python: add Qualifiers.NONE and stop using Optional[Qualifiers] I originally did it this way because pydoc doesn't handle non-trivial defaults in signature very well (see commit `67a16a09b8` ("tests: test that Python documentation renders")). drgndoc doesn't generate signature for pydoc anymore, though, so we don't need to worry about it and can clean up the typing. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:29 -07:00
Omar Sandoval	e49a87a3d7	libdrgn: remove struct drgn_object::prog We can get it via the type now. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:21 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	c31208f69c	libdrgn: fold drgn_type_index into drgn_program This is preparation for associating types with a program. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:36:35 -07:00
Omar Sandoval	1c8181e22d	libdrgn: rearrange struct drgn_program members struct drgn_program has a bunch of state scattered around. Group it together more logically, even if it means sacrificing some padding here and there. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:34:44 -07:00
Omar Sandoval	d4e0771f87	libdrgn: return error from drgn_program_{is_little_endian,bswap,is_64_bit}() Most places that call these check has_platform and return an error, and those that don't can live with the extra check. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 16:56:28 -07:00
Omar Sandoval	a8d632b4c1	libdrgn/python: use F14 instead of PyDict for Program::objects Program::objects is used to store references to objects that must stay alive while the Program is alive. It is currently a PyDict where the keys are the object addresses as PyLong and the values are the objects themselves. This has two problems: 1. Allocating the key as a full object is obviously wasteful. 2. PyDict doesn't have an API for reserving capacity ahead of time, which we want for an upcoming change. Both of these are easily fixed by using our own hash table. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 16:56:28 -07:00
arsarwade	6f6c5f272f	libdrgn: export function drgn_object_init() (#70) drgn_object_init() is available in the drgn.h file and seems to be a required call before calling drgn_program_find_object(). Without this, trying to call drgn_object_init() from an external C application results in an undefined reference. Signed-off-by: Aditya Sarwade <asarwade@fb.com>	2020-08-21 10:24:52 -07:00
Omar Sandoval	0cf3320a89	Add type annotations to helpers Now that drgndoc can handle overloads and we have the IntegerLike and Path aliases, we can add type annotations to all helpers. There are also a couple of functional changes that snuck in here to make annotating easier. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-20 16:28:02 -07:00
Omar Sandoval	2d49ef657b	Add Path type alias Rather than duplicating Union[str, bytes, os.PathLike] everywhere, add an alias. Also make it explicitly os.PathLike[str] or os.PathLike[bytes] to get rid of some mypy --strict errors. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-20 11:20:29 -07:00
Omar Sandoval	66c5cc83a6	Add IntegerLike type annotation Lots of interfaces in drgn transparently turn an integer Object into an int by using __index__(), so add an IntegerLike protocol for this and use it everywhere applicable. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-20 11:16:50 -07:00
Omar Sandoval	20bcde1f1d	drgn 0.0.7 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-27 23:32:32 -07:00
Omar Sandoval	025989871b	drgn 0.0.6 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-27 17:25:54 -07:00
Omar Sandoval	e3309765f9	helpers: add kaslr_offset() and move pgtable_l5_enabled() Make the KASLR offset available to Python in a new drgn.helpers.linux.boot module, and move pgtable_l5_enabled() there, too. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-27 17:00:16 -07:00
Omar Sandoval	e7f353c118	libdrgn: hash_table: clean up coding style Clean up the coding style of the remaining few places that the last couple of changes didn't rewrite. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-18 11:53:05 -07:00
Omar Sandoval	f94b0262c6	libdrgn: hash_table: implement vector storage policy The folly F14 implementation provides 3 storage policies: value, node, and vector. The default F14FastMap/F14FastSet chooses between the value and vector policies based on the value size. We currently only implement the value policy, as the node policy is easy to emulate and the vector policy would've added more complexity. This adds support for the vector policy (adding even more C abuse :) and automatically chooses the policy the same way as folly. It'd be easy to add a way to choose the policy if needed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-18 11:53:00 -07:00
Omar Sandoval	9ea11a7c26	libdrgn: hash_table: port reserve optimization The only major change to the folly F14 implementation since I originally ported it is commit 3d169f4365cf ("memory savings for F14 tables with explicit reserve()"). That is a small improvement for small tables and a large improvement for vector tables, which are about to be added. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-18 01:42:37 -07:00
Omar Sandoval	2eab47ce9e	libdrgn: hash_table: use posix_memalign() instead of aligned_alloc() posix_memalign() doesn't have the restriction that the size must be a multiple of the alignment like aligned_alloc() does in C11. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-18 01:42:37 -07:00
Omar Sandoval	2409868409	libdrgn: hash_table: define chunk alignment constant Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-18 01:42:37 -07:00
Omar Sandoval	6d4af7e17e	libdrgn: dwarf_info_cache: handle variables DW_AT_const_value Compile-time constants have DW_AT_const_value instead of DW_AT_location. We can translate those to a value object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-13 15:23:51 -07:00
Omar Sandoval	213c148ce6	libdrgn: dwarf_info_cache: handle DW_AT_endianity Variables can have a non-default endianity. Handle it and clean up variable endian handling. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-13 14:26:58 -07:00
Omar Sandoval	c840072d05	libdrgn: make drgn_object_set_buffer() take a void * It's awkward to make callers cast to char *. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-13 10:25:03 -07:00
Omar Sandoval	f1eaf5b14c	libdrgn: add load_debug_info example program Really it's more of a test program than an example program. It's useful for benchmarking, testing with valgrind, etc. It's not built by default, but it can be built manually with: $ make -C build/temp.* examples/load_debug_info And run with: $ ./build/temp.*/examples/load_debug_info Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-10 16:18:58 -07:00
Omar Sandoval	3028da4d1d	libdrgn: compare language in drgn_type_eq() Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-08 22:07:49 -07:00
Omar Sandoval	1b47b866b4	libdrgn: go back to trusting PRSTATUS PID Commit `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust") overlooked that if the task is running in userspace, the stack pointer in PRSTATUS obviously won't match the kernel stack pointer. Let's bite the bullet and use the PID. If the race shows up in practice, we can try to come up with another workaround.	2020-07-08 18:34:16 -07:00
Omar Sandoval	293418294a	libdrgn: assume compiler uses sane integer implementation I once tried to implement a generic arithmetic right shift macro without relying on any implementation-defined behavior, but this turned out to be really hard. drgn is fairly tied to GCC and GCC-compatible compilers (like Clang), so let's just assume GCC's model [1]: modular conversion to signed types, two's complement signed bitwise operators, and sign extension for signed right shift. 1: https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html	2020-07-07 17:18:17 -07:00
Omar Sandoval	948cda2941	libdrgn: add vector/hash table initializers and update coding style Declaring a local vector or hash table and separately initializing it with vector_init()/hash_table_init() is annoying. Add macros that can be used as initializers. This exposes several places where the C89 style of placing all declarations at the beginning of a block is awkward. I adopted this style from the Linux kernel, which uses C89 and thus requires this style. I'm now convinced that it's usually nicer to declare variables where they're used. So let's officially adopt the style of mixing declarations and code (and ditch the blank line after declarations) and update the functions touched by this change.	2020-07-01 12:48:24 -07:00
Omar Sandoval	e4c52c5422	libdrgn: linux_kernel: use names for kmod index constants This makes it much easier to follow along with the code and understand the format.	2020-06-30 15:14:21 -07:00
Omar Sandoval	03d8cb0e32	libdrgn: fix hash_pair_from_non_avalanching_hash() on 64-bit without SSE 4.2 We were forgetting to mask away the extra bits. There are two places that we use the tag without converting it to a uint8_t: hash_table_probe_delta(), which is mostly benign since we mask it by the chunk mask anyways; and table_chunk_match() without SSE 2, which completely breaks. While we're here, let's align the comments better.	2020-06-24 13:33:08 -07:00
Omar Sandoval	8e7c1f1009	drgn 0.0.5	2020-05-26 10:04:06 -07:00
Omar Sandoval	a227d0d50e	Update elfutils and revert activation frame patch After thinking about it some more, I realized that "libdwfl: simplify activation frame logic" breaks the case where during unwinding someone queries isactivation for reasons other than knowing whether to decrement program counter. Revert the patch and refactor "libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame" to handle it differently. Based on: c95081596 size: Also obey radix printing for bsd format. With the following patches: configure: Add --disable-programs configure: Add --disable-shared libdwfl: add interface for attaching to/detaching from threads libdwfl: export __libdwfl_frame_reg_get as dwfl_frame_register libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: add interface for evaluating DWARF expressions in a frame	2020-05-20 13:38:49 -07:00
Omar Sandoval	eea5422546	libdrgn: make Linux kernel stack unwinding more robust drgn has a couple of issues unwinding stack traces for kernel core dumps: 1. It can't unwind the stack for the idle task (PID 0), which commonly appears in core dumps. 2. It uses the PID in PRSTATUS, which is racy and can't actually be trusted. The solution for both of these is to look up the PRSTATUS note by CPU instead of PID. For the live kernel, drgn refuses to unwind the stack of tasks in the "R" state. However, the "R" state is running or runnable, so in the latter case, we can still unwind the stack. The solution for this is to look at on_cpu for the task instead of the state.	2020-05-20 12:03:00 -07:00
Omar Sandoval	146930aff8	libdrgn: replace arch frame_registers with callbacks We currently unwind from pt_regs and NT_PRSTATUS using an array of register definitions. It's more flexible and more efficient to do this with an architecture-specific callback. For x86-64, this change also makes us depend on the binary layout rather than member names of struct pt_regs, but that shouldn't matter unless people are defining their own, weird struct pt_regs.	2020-05-19 17:11:27 -07:00
Omar Sandoval	4d8597f0f8	libdrgn: add THREAD_SIZE to Linux kernel object finder Despite the naming, this is the kernel stack size.	2020-05-19 17:10:54 -07:00
Omar Sandoval	971a2d3687	libdrgn/python: make Objects fully immutable The model has always been that drgn Objects are immutable, but for some reason I went through the trouble of allowing __init__() to reinitialize an already initialized Object. Instead, let's fully initialize the Object in __new__() and get rid of __init__().	2020-05-18 00:07:49 -07:00
Omar Sandoval	ab876f3dbd	libdrgn/python: allow specifying Object value positionally It's annoying to have to do value= when creating objects, especially in interactive mode. Let's allow passing in the value positionally so that `Object(prog, "int", value=0)` becomes `Object(prog, "int", 0)`. It's clear enough that this is creating an int with value 0.	2020-05-18 00:07:49 -07:00
Omar Sandoval	8b264f8823	Update copyright headers to Facebook and add missing headers drgn was originally my side project, but for a while now it's also been my work project. Update the copyright headers to reflect this, and add a copyright header to various files that were missing it.	2020-05-15 15:13:02 -07:00
Omar Sandoval	c339113f9c	libdrgn: adjust program counter when looking up frame symbol For functions that call a noreturn function, the compiler may omit code after the call instruction. This means that the return address may not lie in the caller's symbol. dwfl_frame_pc() returns whether a frame is an "activation", i.e., its program counter is guaranteed to lie within the caller. This is only the case for the initial frame, frames interrupted by a signal, and the signal trampoline frame. For everything else, we need to decrement the program counter before doing any lookups.	2020-05-13 17:11:54 -07:00
Omar Sandoval	175f83fc23	Update elfutils with noreturn unwinding fix Rebase on master and fix dwfl_frame_module/dwfl_frame_dwarf_frame to decrement the program counter when necessary. Based on: a8493c12a libdw: Skip imported compiler_units in libdw_visit_scopes walking DIE tree With the following patches: configure: Add --disable-programs configure: Add --disable-shared libdwfl: simplify activation frame logic libdwfl: add interface for attaching to/detaching from threads libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: export __libdwfl_frame_reg_get as dwfl_frame_register libdwfl: add interface for evaluating DWARF expressions in a frame	2020-05-13 16:41:52 -07:00
Omar Sandoval	bf545105c6	libdrgn: build in silent mode by default The automake/libtool compilation output is obnoxiously verbose. Switch on automake's silent mode, and make the custom rules honor it.	2020-05-10 00:12:50 -07:00
Omar Sandoval	2d1481f5ab	libdrgn: add page table walker kernel memory reader Now that we can walk page tables, we can use it in a memory reader that reads kernel memory via the kernel page table. This means that we don't need libkdumpfile for ELF vmcores anymore (although I'll keep the functionality around until this code has been validated more).	2020-05-08 17:37:56 -07:00
Omar Sandoval	e697be707c	libdrgn: use swapper_pg_dir in vmcoreinfo for fallback PAGE_OFFSET I originally wanted to avoid depending on another vmcoreinfo field, but the next change is going to depend on swapper_pg_dir in vmcoreinfo anyways, and it ends up being simpler to use it.	2020-05-08 17:37:56 -07:00
Omar Sandoval	8a276838ac	helpers: add access_process_vm() and access_remote_vm() Now that we can walk page tables, we can finally read memory from userspace tasks. Closes #53.	2020-05-08 17:37:01 -07:00
Omar Sandoval	d0a1718451	libdrgn: implement virtual address translation/page table walking There are a few big use cases for this in drgn: * Helpers for accessing memory in the virtual address space of userspace tasks. * Removing the libkdumpfile dependency for vmcores. * Handling gaps in the virtual address space of /proc/kcore (cf. #27). I dragged my feet on implementing this because I thought it would be more complicated, but the page table layout on x86-64 isn't too bad. This commit implements page table walking using a page table iterator abstraction. The first thing we'll add on top of this will be a helper for reading memory from a virtual address space, but in the future it'd also be possible to export the page table iterator directly.	2020-05-08 17:36:19 -07:00
Omar Sandoval	63299e0701	libdrgn: actually use uint64_t for two's complement unary ops UNARY_OP_SIGNED_2C() uses a union of int64_t and uint64_t to avoid signed integer overflow... except that there's a typo and the uint64_t is actually an int64_t. Fix it and add a test that would catch it with -fsanitize=undefined.	2020-05-08 13:50:24 -07:00
Omar Sandoval	8f81ea255f	libdrgn: don't use unaligned loads to parse DWARF -fsanitize=undefined reports that the read_u* helpers rely on unaligned loads. Use memcpy() instead.	2020-05-08 13:50:24 -07:00
Omar Sandoval	3d59e042f4	libdrgn: don't open-code fls() c_integer_literal() has an open-coded equivalent of fls() that assumes that unsigned long long is 64 bits. Use fls() instead.	2020-05-08 00:20:42 -07:00
Omar Sandoval	340e00dfb5	libdrgn: improve and document bit operations fls() can be implemented with __bitop(), and we can get rid of clz() since it's only used by fls().	2020-05-08 00:14:25 -07:00
Omar Sandoval	f49d68d8f9	libdrgn: split generic utility functions out of internal.h internal.h includes both drgn-specific helpers and generic utility functions. Split the latter into their own util.h header and use it instead of internal.h in the generic data structure code. This makes it easier to copy the data structures into other projects/test programs.	2020-05-07 16:03:43 -07:00
Omar Sandoval	a95e42ef2e	libdrgn/python: use vector for Program_load_debug_info() Program_load_debug_info() is the last user of the resize_array()/realloc_array() utility functions. We can clean it up by using a vector and finally get rid of those functions. This also happens to fix three bugs in Program_load_debug_info(): we weren't setting a Python exception if we couldn't allocate the path_args array, we weren't zeroing path_args after resizing the array, and we weren't freeing the path_args array. Shame on whoever wrote this.	2020-05-07 15:47:57 -07:00
Omar Sandoval	0a100064c1	libdrgn: improve and rename DRGN_UNREACHABLE() DRGN_UNREACHABLE() currently expands to abort(), but assert() provides more information. If NDEBUG is defined, we can use __builtin_unreachable() instead. DRGN_UNREACHABLE() isn't drgn-specific, so this renames it to UNREACHABLE(). It's also not really related to errors, so this moves it to internal.h.	2020-05-07 15:16:22 -07:00
Omar Sandoval	d759c7ed20	libdrgn: get rid of OFF_MAX This hasn't been used since commit `417a6f0d76` ("libdrgn: make memory reader pluggable with callbacks").	2020-05-07 14:41:05 -07:00
Omar Sandoval	23574e59d5	libdrgn: add /proc/kcore physical segments on old kernels Before Linux v4.11, /proc/kcore didn't have valid physical addresses, so it's currently not possible to read from physical memory on old kernels. However, if we can figure out the address of the direct mapping, then we can determine the corresponding physical addresses for the segments and add them.	2020-05-04 13:20:27 -07:00
Omar Sandoval	f8c33518eb	libdrgn: handle kernel core dumps with all zero p_paddr We treat core dumps with all zero p_paddrs as not having valid physical addresses. However, it is theoretically possible for a kernel core dump to only have one segment which legitimately has a p_paddr of 0 (e.g., if it only has a segment for the direct mapping, although note that this isn't currently possible on x86, as Linux on x86 reserves PFN 0 for the BIOS [1]). If the core dump has a VMCOREINFO note, then it is either a vmcore, which has valid physical addresses, or it is /proc/kcore with Linux kernel commit 23c85094fe18 ("proc/kcore: add vmcoreinfo note to /proc/kcore") (in v4.19), so it must also have Linux kernel commit 464920104bf7 ("/proc/kcore: update physical address for kcore ram and text") (in v4.11) (ignoring the possibility of a franken-kernel which backported the former but not the latter). Therefore, treat core dumps with a VMCOREINFO note as having valid physical addresses. 1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/setup.c?h=v5.6#n678	2020-05-04 13:20:27 -07:00
Omar Sandoval	5505628235	libdrgn: get rid of struct drgn_program.num_file_segments This isn't used anymore. Remove it and simplify the loop adding file segments.	2020-05-04 13:20:27 -07:00
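
Commit 8f81ea255f above replaces unaligned loads in the DWARF-parsing read_u* helpers with memcpy(). The following is a minimal sketch of that idea, not drgn's actual code: the name read_u32_sketch() and its signature are hypothetical, and it reads in host byte order only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration, not drgn's read_u*
 * implementation): read a 32-bit value in host byte order from a position
 * in a buffer that may not be 4-byte aligned.
 */
static inline uint32_t read_u32_sketch(const char *p)
{
	uint32_t value;

	/* The caller must pass a valid pointer to at least 4 readable bytes. */
	assert(p != NULL);
	/*
	 * memcpy() has no alignment requirement, whereas dereferencing
	 * (const uint32_t *)p on a misaligned address is undefined behavior
	 * and is what -fsanitize=undefined reports.
	 */
	memcpy(&value, p, sizeof(value));
	return value;
}
```

Compilers typically lower a fixed-size memcpy() like this to a single load on architectures that permit unaligned access, so the change should not cost much in practice.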

669 Commits