JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-22 17:23:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	9da9f6a871	libdrgn: fold struct vmcoreinfo into struct drgn_program In an upcoming commit, we will parse the AArch64 pointer authentication code mask either from the VMCOREINFO note or the NT_ARM_PAC_MASK note. Since it doesn't always come from VMCOREINFO, it doesn't make sense to put it in struct vmcoreinfo; struct drgn_program makes more sense. So, make parse_vmcoreinfo() take struct drgn_program instead of struct vmcoreinfo, rename it to drgn_program_parse_vmcoreinfo(), and replace struct vmcoreinfo with an anonymous struct in struct drgn_program. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-06-26 09:18:07 -07:00
Omar Sandoval	4f5249775d	Fix various lints Some functions that could be static found by -Wmissing-prototypes, some include-what-you-use warnings, some missing SPDX identifiers. These lints should be automated at some point. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-17 10:45:42 -08:00
Mykola Lysenko	7580fffbdf	Add drgn.Program.main_thread() Currently only supported for user-space crash dumps. E.g. no support for live user-space application debugging or kernel debugging. Closes #144. Signed-off-by: Mykola Lysenko <mykolal@fb.com>	2022-02-10 15:53:50 -08:00
Kevin Svetlitski	301cc767ba	Implement a new API for representing threads Previously, drgn had no way to represent a thread – retrieving a stack trace (the only extant thread-specific operation) was achieved by requiring the user to directly provide a tid. This commit introduces the scaffolding for the design outlined in issue #92, and implements the corresponding methods for userspace core dumps, the live Linux kernel, and Linux kernel core dumps. Future work will build on top of this commit to support live userspace processes. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-11 17:28:17 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	d1745755f1	Fix some include-what-you-use warnings Also: * Rename struct string to struct nstring and move it to its own header. * Fix scripts/iwyu.py, which was broken by commit `5541fad063` ("Fix some flake8 errors"). * Add workarounds for a few outstanding include-what-you-use issues. There is still a false positive for include-what-you-use/include-what-you-use#970, but hopefully that is fixed soon. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-10 15:09:29 -08:00
Omar Sandoval	802d6cc9ff	libdrgn: rename drgn_program::_dbinfo to dbinfo The underscore was meant to discourage direct access in favor of using drgn_program_get_dbinfo(), but it turns out that it's more normal to access it directly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-10-23 00:52:23 -07:00
Omar Sandoval	c1e16ae3ec	libdrgn: fold drgn_program_get_dbinfo() into only caller The only time that we want to create the drgn_debug_info is when we're loading debugging information. Everywhere else, we fail fast if there is no debugging information. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-10-23 00:40:57 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	38d4330fec	libdrgn: clean up stale comment references and Doxygen warnings Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 16:15:43 -07:00
Omar Sandoval	671947d185	libdrgn: remove unused drgn_program::attached_dwfl_state I missed this when I removed the code that used it. Fixes: `eec67768aa` ("libdrgn: replace elfutils DWARF unwinder with our own") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 15:41:07 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	25eb2abb1a	libdrgn: add drgn_platform getters Add low-level getters equivalent to the drgn_program platform-related helpers and use them in places where we have checked or can assume that the platform is known. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	e04eda9880	libdrgn: define HOST_LITTLE_ENDIAN As a minor cleanup, instead of writing __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ everywhere, define and use HOST_LITTLE_ENDIAN. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	352c31e1ac	Add support for C++ template parameters Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_AT_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	c7af566c6e	libdrgn: deduplicate all types with no members/parameters/enumerators Even if a compound, function, or enumerated type is complete, we can still deduplicate it as long as it doesn't have members, parameters, or enumerators. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-06 01:59:48 -08:00
Omar Sandoval	22c1d87aec	libdrgn: cache page_offset and vmemmap as objects instead of uint64_t This is a little cleaner and saves on conversions back and forth between C values and objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:40:07 -08:00
Omar Sandoval	bce9ef5f8d	libdrgn: linux kernel: remove THREAD_SIZE object finder THREAD_SIZE is still broken and I haven't looked into the root cause (see commit `95be142d17` ("tests: disable THREAD_SIZE test")). We don't need it anymore anyways, so let's remove it entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:08:13 -08:00
Omar Sandoval	de6a4e07ae	libdrgn: fix Doxygen The Doxygen documentation for libdrgn has bit-rotted over time. Bring back the Internal module, clean up a few renamed members and parameters, and fix broken parsing caused by the generic definition macros. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-30 01:32:33 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	f83bb7c71b	libdrgn: move debugging information tracking into drgn_debug_info Debugging information tracking is currently in two places: drgn_program finds debugging information, and drgn_dwarf_index stores it. Both of these responsibilities make more sense as part of drgn_debug_info, so let's move them there. This prepares us to track extra debugging information that isn't pertinent to indexing. This also reworks a couple of details of loading debugging information: - drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into a single structure, drgn_debug_info_module. - The first pass of DWARF indexing now happens in parallel with reading compilation units (by using OpenMP tasks). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-22 10:58:24 -07:00
Omar Sandoval	3ac9ae357b	libdrgn: rename drgn_dwarf_info_cache to drgn_debug_info The current name is too verbose. Let's go with a shorter, more generic name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-11 17:41:23 -07:00
Omar Sandoval	ff96c75da0	helpers: translate task_state_to_char() to Python Commit `326107f054` ("libdrgn: add task_state_to_char() helper") implemented task_state_to_char() in libdrgn so that it could be used in commit `4780c7a266` ("libdrgn: stack_trace: prohibit unwinding stack of running tasks"). As of commit `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust"), it is no longer used in libdrgn, so we can translate it to Python. This removes a bunch of code and is more useful as an example. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 13:54:39 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	c31208f69c	libdrgn: fold drgn_type_index into drgn_program This is preparation for associating types with a program. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:36:35 -07:00
Omar Sandoval	1c8181e22d	libdrgn: rearrange struct drgn_program members struct drgn_program has a bunch of state scattered around. Group it together more logically, even if it means sacrificing some padding here and there. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:34:44 -07:00
Omar Sandoval	d4e0771f87	libdrgn: return error from drgn_program_{is_little_endian,bswap,is_64_bit}() Most places that call these check has_platform and return an error, and those that don't can live with the extra check. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 16:56:28 -07:00
Omar Sandoval	1b47b866b4	libdrgn: go back to trusting PRSTATUS PID Commit `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust") overlooked that if the task is running in userspace, the stack pointer in PRSTATUS obviously won't match the kernel stack pointer. Let's bite the bullet and use the PID. If the race shows up in practice, we can try to come up with another workaround.	2020-07-08 18:34:16 -07:00
Omar Sandoval	eea5422546	libdrgn: make Linux kernel stack unwinding more robust drgn has a couple of issues unwinding stack traces for kernel core dumps: 1. It can't unwind the stack for the idle task (PID 0), which commonly appears in core dumps. 2. It uses the PID in PRSTATUS, which is racy and can't actually be trusted. The solution for both of these is to look up the PRSTATUS note by CPU instead of PID. For the live kernel, drgn refuses to unwind the stack of tasks in the "R" state. However, the "R" state is running or runnable, so in the latter case, we can still unwind the stack. The solution for this is to look at on_cpu for the task instead of the state.	2020-05-20 12:03:00 -07:00
Omar Sandoval	4d8597f0f8	libdrgn: add THREAD_SIZE to Linux kernel object finder Despite the naming, this is the kernel stack size.	2020-05-19 17:10:54 -07:00
Omar Sandoval	8b264f8823	Update copyright headers to Facebook and add missing headers drgn was originally my side project, but for a while now it's also been my work project. Update the copyright headers to reflect this, and add a copyright header to various files that were missing it.	2020-05-15 15:13:02 -07:00
Omar Sandoval	2d1481f5ab	libdrgn: add page table walker kernel memory reader Now that we can walk page tables, we can use it in a memory reader that reads kernel memory via the kernel page table. This means that we don't need libkdumpfile for ELF vmcores anymore (although I'll keep the functionality around until this code has been validated more).	2020-05-08 17:37:56 -07:00
Omar Sandoval	e697be707c	libdrgn: use swapper_pg_dir in vmcoreinfo for fallback PAGE_OFFSET I originally wanted to avoid depending on another vmcoreinfo field, but the next change is going to depend on swapper_pg_dir in vmcoreinfo anyways, and it ends up being simpler to use it.	2020-05-08 17:37:56 -07:00
Omar Sandoval	d0a1718451	libdrgn: implement virtual address translation/page table walking There are a few big use cases for this in drgn: * Helpers for accessing memory in the virtual address space of userspace tasks. * Removing the libkdumpfile dependency for vmcores. * Handling gaps in the virtual address space of /proc/kcore (cf. #27). I dragged my feet on implementing this because I thought it would be more complicated, but the page table layout on x86-64 isn't too bad. This commit implements page table walking using a page table iterator abstraction. The first thing we'll add on top of this will be a helper for reading memory from a virtual address space, but in the future it'd also be possible to export the page table iterator directly.	2020-05-08 17:36:19 -07:00
Omar Sandoval	5505628235	libdrgn: get rid of struct drgn_program.num_file_segments This isn't used anymore. Remove it and simplify the loop adding file segments.	2020-05-04 13:20:27 -07:00
Omar Sandoval	b1315fcaa1	libdrgn: add drgn_program_bswap() This is clearer than open-coding the endianness check.	2020-04-27 17:08:02 -07:00
Omar Sandoval	7a9fad0fd2	libdrgn: move _vmemmap() to object finder Similarly to PAGE_OFFSET, vmemmap makes more sense as part of the Linux kernel object finder than an internal helper. While we're here, let's fix the definition for 5-level page tables. This only matters for kernels with commit 77ef56e4f0fb ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") but without eedb92abb9bb ("x86/mm: Make virtual memory layout dynamic for CONFIG_X86_5LEVEL=y") (namely, v4.14, v4.15, and v4.16); since v4.17, 5-level page table support enables KASLR.	2020-04-10 15:33:29 -07:00
Omar Sandoval	5ac95e491a	libdrgn: fix _page_offset() helper and move to object finder The internal _page_offset() helper gets the value of PAGE_OFFSET, but the fallback when KASLR is disabled has been out of date since Linux v4.20 and never handled 5-level page tables. Additionally, it makes more sense as part of the Linux kernel (formerly vmcoreinfo) object finder so that it's cleanly accessible outside of drgn internals.	2020-04-10 15:33:27 -07:00
Omar Sandoval	1dbc718840	helpers: add pgtable_l5_enabled()	2020-04-10 15:18:46 -07:00
Serapheim Dimitropoulos	08193a97aa	Support stack traces for running threads on kdumps	2020-03-27 16:12:03 -07:00
Jay Kamat	3f870603fa	libdrgn: add default language to drgn_program For operations where we don't have a type available, we currently fall back to C. Instead, we should guess the language of the program and use that as the default. The heuristic implemented here gets the language of the CU containing "main" (except for the Linux kernel, which is always C). In the future, we should allow manually overriding the automatically determined language.	2020-02-26 19:55:42 -08:00
Omar Sandoval	a5cd92f24e	libdrgn: make vmcoreinfo accessible before loading debug info UTS_RELEASE is currently only accessible once debug info is loaded with prog.load_debug_info(main=True). This makes it difficult to get the release, find the appropriate vmlinux, then load the found vmlinux. We can add vmcoreinfo_object_find as part of set_core_dump(), which makes it possible to do the following: prog = drgn.Program() prog.set_core_dump(core_dump_path) release = prog['UTS_RELEASE'].string_() vmlinux_path = find_vmlinux(release) prog.load_debug_info([vmlinux_path]) The only downside is that this ends up using the default definition of char rather than what we would get from the debug info, but that shouldn't be a big problem.	2020-02-19 12:11:45 -08:00
Jay Kamat	054cb54a01	libdrgn: Rename find_symbol to find_symbol_by_address	2020-02-12 14:06:49 -08:00
Omar Sandoval	0a707b0c9d	libdrgn: rework drgn_find_symbol_internal() Instead of having two internal variants (drgn_find_symbol_internal() and drgn_program_find_symbol_in_module()), combine them into the former and add a separate drgn_error_symbol_not_found() for translating the static error to the user-facing one. This makes things more flexible for the next change.	2019-12-19 11:43:54 -08:00
Omar Sandoval	326107f054	libdrgn: add task_state_to_char() helper Add a helper to get the state of a task (e.g., 'R', 'S', 'D'). This will be used to make sure that a task is not running when getting a stack trace, so implement it in libdrgn.	2019-10-28 13:37:57 -07:00
Omar Sandoval	91f5c8e2e7	libdrgn: stack_trace: support unwinding stack from thread ID When debugging the Linux kernel, it's inconvenient to have to get the task_struct of a thread in order to get its stack trace. This adds support for looking it up solely by PID. In that case, we do the find_task() inside of libdrgn. This also gives us stack trace support for userspace core dumps almost for free since we already added support for NT_PRSTATUS.	2019-10-28 13:37:53 -07:00
Omar Sandoval	0f7ad0ed26	libdrgn: stack_trace: support unwinding stack from core dump vmcores include a NT_PRSTATUS note for each CPU containing the PID of the task running on that CPU at the time of the crash and its registers. We can use that to unwind the stack of the crashed tasks.	2019-10-28 13:36:02 -07:00
Omar Sandoval	75c74022ff	libdrgn: save Elf handle for core dump Currently, we close the Elf handle in drgn_set_core_dump() after we're done with it. However, we need the Elf handle in userspace_report_debug_info(), so we reopen it temporarily. We will also need it to support getting stack traces from core dumps, so we might as well keep it open. Note that we keep it even if we're using libkdumpfile because libkdumpfile doesn't seem to have an API to access ELF notes.	2019-10-28 13:09:25 -07:00
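
Several commits above (9da9f6a871, a5cd92f24e, e697be707c) read fields from the VMCOREINFO note, which the kernel exports as plain text with one KEY=VALUE entry per line. The following is a minimal Python sketch of that parsing step, not drgn's implementation (libdrgn does this in C in drgn_program_parse_vmcoreinfo()). The function name parse_vmcoreinfo_text and all sample values are assumptions made up for illustration.

```python
def parse_vmcoreinfo_text(text):
    """Split VMCOREINFO-style text into a key -> value mapping.

    Hypothetical helper for illustration only; values stay as strings
    and the caller decides how to interpret each field.
    """
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        info[key] = value
    return info


# Sample note contents (made-up values, not from a real kernel).
sample = (
    "OSRELEASE=5.4.0-example\n"
    "PAGESIZE=4096\n"
    "SYMBOL(swapper_pg_dir)=ffffffff82a0a000\n"
)

info = parse_vmcoreinfo_text(sample)
release = info["OSRELEASE"]
page_size = int(info["PAGESIZE"])  # decimal
swapper_pg_dir = int(info["SYMBOL(swapper_pg_dir)"], 16)  # hex, no 0x prefix
```

Keeping the values as strings until they are needed mirrors the note's format, where some fields are decimal (PAGESIZE) and symbol addresses are bare hexadecimal.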


70 Commits