JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-24 10:03:05 +00:00

Author	SHA1	Message	Date
Omar Sandoval	772492838f	drgn.helpers.linux.mm: add arbitrary address translation helpers follow_{page,pfn,phys}() translate the virtual address by walking the page table for a given mm_struct (built on top of the existing page table iterator interface). vmalloc_to_page() and vmalloc_to_pfn() are special cases for vmalloc addresses. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-02 23:40:38 -07:00
Imran Khan	4d7c709621	helpers: idr: Enable idr helpers to work with older kernel. Prior to kernel v4.11, idr was not using radix tree as its backend. So current idr helper(s) only work for kernel v4.11+. Enable idr helper(s) to work with non-radix tree based idr, so that the helpers can be used with older kernels as well. Thanks to Omar for optimizing the idr_for_each helper. Signed-off-by: Imran Khan <imran.f.khan@oracle.com>	2023-01-23 17:32:17 -08:00
Kevin Svetlitski	e5754e47ba	Mark immutable attributes as `Final` in type stubs Object attributes which are not changed after the object's creation have been marked with the `Final` type attribute. The primary utility of this change is that some Python type checkers, [such as Pyre](https://pyre-check.org/docs/errors/#optional-attributes), rely on the presence of the `Final` attribute to determine whether `Optional` values have been validated as being present before their contents are accessed. Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>	2023-01-18 16:01:52 -08:00
Stephen Brennan	15f151c1e4	Document the linux kernel object finder drgn is able to lookup some special metadata for the Linux kernel, sometimes even without debuginfo. For users who may expect that Program.object() will only return objects corresponding to variables, these metadata are unexpected and can be quite useful. Regardless, they're currently undocumented. Add documentation under Advanced Usage for these, and reference it in the Program.object() docstring. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2023-01-17 16:10:49 -08:00
Omar Sandoval	9ee1ccff98	libdrgn: add stub s390 and s390x architectures with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_390_32. R_390_8, R_390_16, R_390_64, R_390_PC16, R_390_PC32, and R_390_PC64 are trivial to support, as well. The Linux kernel supports many more, but hopefully they won't show up for debug info. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-19 13:48:44 -08:00
Omar Sandoval	7ce84a3f1f	drgn.helpers.linux: add proper XArray helpers Commit `89eb868e95` ("helpers: make find_task() work on recent kernels") made radix_tree_lookup() and radix_tree_for_each() work for basic XArrays. However, it doesn't handle a couple of more advanced features: multi-index entries (which old radix trees actually also supported) and zero entries. It has also been really confusing to explain to people unfamiliar with the radix tree -> XArray transition that they should use helpers named radix_tree for a structure named xarray. So, let's finally add xa_load(), xa_for_each(), and some additional auxiliary helpers. The non-recursive xa_for_each() implementation is based on Kevin Svetlitski's C implementation from commit `2b47583c73` ("Rewrite linux helper iterators in C"). radix_tree_lookup() and radix_tree_for_each() share the implementation with xa_load() and xa_for_each(), respectively, so they are mostly interchangeable. Fixes: #61 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-13 17:46:37 -08:00
Nhat Pham	2a27cfb918	Add missing type annotations for StackTrace dunder methods StackTrace supports __len__() and __iter__(), but they aren't annotated. Add them. Fixes: `80c9fb35ff` ("Add type hint stubs and generate documentation from them") Signed-off-by: Nhat Pham <nphamcs@gmail.com>	2022-11-29 17:08:30 -08:00
Omar Sandoval	222680b47a	Add StackFrame.sp We have some generic helpers that we'd like to add (for example, #210) that need to know the stack pointer of a frame. These shouldn't need to hard-code register names for different architectures. Add a generic shortcut, StackFrame.sp. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-22 18:47:16 -08:00
Stephen Brennan	5f3a91f80d	Add StackFrame.locals() method The StackFrame's __getitem__() method allows looking up names in the scope of a stack frame, which is an incredibly useful tool for debugging. However, the names are not discoverable -- you must already be looking at the source code or some other source to know what names can be queried. To fix this, add a locals() method to StackFrame, which lists names that can be queried in the scope. Since this method is named locals(), it stops at the function scope and doesn't include globals or class members. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2022-11-02 22:40:33 -07:00
Omar Sandoval	87b7292aa5	Relicense drgn from GPLv3+ to LGPLv2.1+ drgn is currently licensed as GPLv3+. Part of the long term vision for drgn is that other projects can use it as a library providing programmatic interfaces for debugger functionality. A more permissive license is better suited to this goal. We decided on LGPLv2.1+ as a good balance between software freedom and permissiveness. All contributors not employed by Meta were contacted via email and consented to the license change. The only exception was the author of commit `c4fbf7e589` ("libdrgn: fix for compilation error"), who did not respond. That commit reverted a single line of code to one originally written by me in commit `640b1c011d` ("libdrgn: embed DWARF index in DWARF info cache"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:05:16 -07:00
Omar Sandoval	0b7ac5b046	Fix vmcore stack traces on Linux < 4.9 or >= 5.16 and add drgn.helpers.linux.task_cpu() task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes drgn_get_initial_registers() to think that the kernel is !SMP and use CPU 0 instead, producing incorrect stack traces. This has also always been wrong for Linux < 4.9 and on architectures that don't enable CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be ((struct thread_info *)task->stack)->cpu. Fix it by factoring out a new task_cpu() helper that handles all of the above cases. Also add a test case for task_cpu() in case this changes again. Fixes: `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-03 16:21:12 -07:00
Omar Sandoval	eb38d88f15	docs: link to man pages with :manpage: consistently The only exception is the link to ps(1) in task_state_to_char() because that needs to link to a specific section. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-26 12:43:20 -07:00
Shung-Hsi Yu	9335e227d6	libdrgn: python: add Jupyter pretty printing support Add pretty printing support in Jupyter notebook for Object, Type, StackFrame, and StackTrace; it will print out their representation in programming language syntax with str(), similar to what's being done in interactive mode. Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>	2022-08-25 13:52:11 -07:00
Omar Sandoval	faaf01ad1b	Add drgn.StackTrace.prog and drgn_stack_trace_program() If we only have the stack trace available, it's useful to get the program it came from. This'll be used eventually for helpers that take a stack trace. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-11 14:45:54 -07:00
Omar Sandoval	a213573b23	libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic On x86-64, the difference between virtual addresses in the direct map and the corresponding physical addresses is called PAGE_OFFSET, so we exposed that via an architecture callback and the Linux kernel object finder. However, this doesn't translate to other architectures. Namely, on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both PAGE_OFFSET and PHYS_OFFSET have varied over time and between configurations. We can remove the architecture callback and avoid version-specific logic by letting the page table tell us the offset. We just need an address in the direct map, which is easy to find since this includes kmalloc and memblock allocations. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-07-14 12:05:11 -07:00
Omar Sandoval	f3f51942e2	docs: document that StackFrame.name requires debugging information And how to get a reasonable name for, e.g., functions implemented in assembly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-05-05 08:22:34 -07:00
Omar Sandoval	14642fb3b6	libdrgn: add stub RISC-V architecture with relocation implementation The 32-bit and 64-bit variants have different register sizes, so they're different architectures in drgn. For now, put them in the same file so that they can share the relocation implementation. We'll need to figure out how to handle registers later. P.S. RISC-V has the weirdest relocations so far. /proc/kcore also appears to be broken. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 11:51:23 -07:00
Omar Sandoval	d27204260e	libdrgn: add stub Arm architecture with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_ARM_ABS32. R_ARM_REL32 is easy. The Linux kernel supports a bunch of other ones that don't seem relevant to debug info. Unfortunately, I wasn't able to test this because /proc/kcore doesn't exist on Arm. This apparently goes all the way back to 2003: https://lwn.net/Articles/45315/. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:25:05 -07:00
Omar Sandoval	3f246f7054	libdrgn: add stub AArch64 architecture with relocation implementation The only relocation types I saw in Debian's kernel module debug info were R_AARCH64_ABS64 and R_AARCH64_ABS32. R_AARCH64_ABS16, R_AARCH64_PREL64, R_AARCH64_PREL32, and R_AARCH64_PREL16 are all easy. The remaining types supported by the Linux kernel are for movw and immediate instructions, which aren't relevant to debug info. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:23:56 -07:00
Omar Sandoval	7535838cd5	libdrgn: add stub i386 architecture with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_386_32. R_386_PC32 is easy. The Linux kernel also supports R_386_PLT32, but that's the same story as R_X86_64_PLT32 in x86-64, so we don't implement it for now. I was torn between naming it i386, x86, or IA-32. x86 isn't immediately clear whether x86-64 is included or not. No one other than Intel calls it IA-32. i386 might incorrectly imply that it is strictly the original i386 instruction set with no later extensions, but the more general meaning is used frequently in the Linux world (e.g., Debian and QEMU both call it i386), so I went with that in the end. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:21:59 -07:00
Omar Sandoval	50e4ac8245	libdrgn: allow overriding program default language Our cheap heuristic for the default language will not always be correct, and although we can improve it as cases arise, we should also just have a way for the user to explicitly set the default language. Add drgn_program_set_language() to libdrgn and allow setting drgn.Program.language in the Python bindings. This will also make unit testing different languages easier. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 13:29:12 -08:00
Omar Sandoval	16bb5d92e0	Add missing drgn.Language.CPP type annotation We forgot to add this to _drgn.pyi back when we added the language definition. Fixes: `d8fadf10ee` ("libdrgn: Add cpp language and tests") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 12:38:38 -08:00
Mykola Lysenko	7580fffbdf	Add drgn.Program.main_thread() Currently only supported for user-space crash dumps. E.g. no support for live user-space application debugging or kernel debugging. Closes #144. Signed-off-by: Mykola Lysenko <mykolal@fb.com>	2022-02-10 15:53:50 -08:00
Omar Sandoval	0a643b6fab	python: allow Program.type() to accept a Type Some helpers can accept either a str or a Type. If they want to always work with a Type internally, they need to do something like: if isinstance(type, str): type = prog.type(type) Instead, let's let Program.type() accept a Type and return the exact same type, so those helpers can unconditionally do: type = prog.type(type) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-21 16:52:36 -08:00
Stephen Brennan	7970a60818	Add methods to return multiple matching symbols Currently we can lookup symbols by name or address, but this will only return one symbol, prioritizing the global symbols. However, symbols may share the same name, and symbols may also overlap address ranges, so it's possible for searches to return multiple results. Add functions which can return a list of multiple matching symbols. Signed-off-by: Stephen Brennan <stephen@brennan.io>	2022-01-15 11:44:33 -08:00
Kevin Svetlitski	301cc767ba	Implement a new API for representing threads Previously, drgn had no way to represent a thread – retrieving a stack trace (the only extant thread-specific operation) was achieved by requiring the user to directly provide a tid. This commit introduces the scaffolding for the design outlined in issue #92, and implements the corresponding methods for userspace core dumps, the live Linux kernel, and Linux kernel core dumps. Future work will build on top of this commit to support live userspace processes. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-11 17:28:17 -08:00
Omar Sandoval	95c4e2d748	Revert "Rewrite linux helper iterators in C" This reverts commit `2b47583c73`. After Kevin had completed this, we realized that there is a simpler method for iterating through tasks from libdrgn, which the next commit will implement. Revert the translation, but keep the improved tests.helpers.linux.test_pid.TestPid.test_for_each_task. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-11 17:28:17 -08:00
Omar Sandoval	d72a9043b0	libdrgn: linux: replace idle_thread() with idle_task() I missed that the kernel has an idle_task() function which uses cpu_rq()->idle instead of idle_threads; the latter is technically architecture-specific. So, replace idle_thread() with idle_task(), which is architecture-independent and more consistent with the kernel. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 16:03:25 -08:00
Omar Sandoval	adfb04579b	libdrgn: linux: add idle_thread() helper PR #129 will need to get the idle thread for a CPU when the idle thread crashed. Add a helper for this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 14:40:57 -08:00
Omar Sandoval	b916e6905b	libdrgn: linux: translate per_cpu_ptr() helper to C The next change will add a C helper that needs per_cpu_ptr(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 14:39:50 -08:00
Kevin Svetlitski	2b47583c73	Rewrite linux helper iterators in C In preparation for introducing an API to represent threads, the linux helper iterators, radix_tree_for_each, idr_for_each, for_each_pid, and for_each_task have been rewritten in C. This will allow them to be accessed from libdrgn, which will be necessary for the threads API. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2021-12-17 16:24:54 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	ff40f65f0d	libdrgn: allow symbol name lookup to get local symbols Global symbols are preferred over weak symbols, and weak symbols are preferred over other symbols. dwfl_module_addrinfo() seems to have the same preference, so document address lookups as having the same behavior. (This is actually incorrect in the case of STB_GNU_UNIQUE, as dwfl_module_addrinfo() treats anything other than STB_GLOBAL, STB_WEAK, and STB_LOCAL as having the lowest precedence, but STB_GNU_UNIQUE is so obscure that it probably doesn't matter.) Based on work from Stephen Brennan. Closes #121. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 14:30:57 -08:00
Stephen Brennan	1744d8d93c	libdrgn: python: Add binding, kind to drgn.Symbol Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2021-08-20 18:16:57 -07:00
Omar Sandoval	39b76e8486	docs: update repr(drgn.Type) and type constructors in documentation Commit `a97f6c4fa2` ("Associate types with program") changed repr() for drgn.Type to include a "prog." prefix, but it didn't update the documentation to reflect that. It also forgot to update a global type constructor to the new Program methods. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-02 15:59:43 -07:00
Omar Sandoval	7335df114c	libdrgn: python: add Object.to_bytes_() And the libdrgn implementation, drgn_object_read_bytes(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:12:34 -07:00
Omar Sandoval	9c00552007	libdrgn: python: add Object.from_bytes_() Add a way to create an object from raw bytes. One example where I've wanted this is creating a struct pt_regs from a PRSTATUS note or other source. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:06:58 -07:00
Omar Sandoval	f7fe93e573	cli: show elfutils version in use drgn depends heavily on libelf and libdw, so it's useful to know what version we're using. Add drgn._elfutils_version and use that in the CLI and in the test cases where we currently check the libdw version. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 11:10:50 -07:00
Omar Sandoval	bc85767e5f	libdrgn: support looking up parameters and variables in stack traces After all of the preparatory work, the last two missing pieces are a way to find a variable by name in the list of scopes that we saved while unwinding, and a way to find the containing scopes of an inlined function. With that, we can finally look up parameters and variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	38573cfdde	libdrgn: stack_trace: pretty print frames and add frames for inline functions If we want to access a parameter or local variable in an inlined function, then we need a stack frame for that function. It's also much more useful to see inlined functions in the stack trace in general. So, when we've unwound the registers for a stack frame, walk the debugging information to find all of the (possibly inlined) functions at the program counter, and add a drgn stack frame for each of those. Also add StackFrame.name and StackFrame.is_inline so that we can distinguish inline frames. Also add StackFrame.source() to get the filename and line and column numbers. Finally, add the source code location to pretty-printed stack traces and add pretty-printing for individual stack frames that includes extra information. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	ad37c79cba	libdrgn: python: add documentation and type annotation for Program.__contains__() drgn.Program has supported the "in" operator since commit `25e7a9d3b8` ("libdrgn/python: implement Program.__contains__"), but it's undocumented and unannotated. Add a type annotation with a docstring along with a METH_COEXIST method. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-12 16:26:56 -07:00
Omar Sandoval	85c367bf79	Reformat empty docstrings Black 21.4b2 now replaces empty docstrings with a docstring containing a single space. Apply that formatting. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-30 17:06:47 -07:00
Omar Sandoval	037a510ff2	Fix drgn.FaultError type annotations FaultError() also takes an error message. Fixes: `80c9fb35ff` ("Add type hint stubs and generate documentation from them") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-30 17:03:16 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Omar Sandoval	9a066b409f	docs: mention that default arguments are not yet parsed from DWARF TypeParameter.default_argument is currently basically a placeholder because we don't parse it from DWARF and compilers don't emit it, so document that. See #82. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 02:16:23 -08:00
Kamalesh Babulal	221a218704	libdrgn: add powerpc stack trace support Add powerpc specific register information required to retrieve the stack traces of the tasks on both live system and from the core dump. It uses the existing DSL format to define platform registers and helper functions to initialize them. It also adds architecture specific information to enable powerpc. Current support is for little-endian powerpc only. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>	2021-01-29 11:31:59 -08:00
Omar Sandoval	b899a10836	Remove register numbers from API and add register aliases enum drgn_register_number in the public libdrgn API and drgn.Register.number in the Python bindings are basically exports of DWARF register numbers. They only exist as a way to identify registers that's lighter weight than string lookups. libdrgn already has struct drgn_register, so we can use that to identify registers in the public API and remove enum drgn_register_number. This has a couple of benefits: we don't depend on DWARF numbering in our API, and we don't have to generate drgn.h from the architecture files. The Python bindings can just use string names for now. If it seems useful, StackFrame.register() can take a Register in the future, we'll just need to be careful to not allow Registers from the wrong platform. While we're changing the API anyways, also change it so that registers have a list of names instead of one name. This isn't needed for x86-64 at the moment, but will be for architectures that have multiple names for the same register (like ARM). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 17:47:45 -08:00
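
Commit `0b7ac5b046` in the list above describes three places where a task's CPU number can live, depending on kernel version and configuration. The sketch below illustrates that fallback order only. It is not drgn's implementation: `task_cpu_sketch` is a hypothetical name, and plain `SimpleNamespace` objects stand in for `struct task_struct` so the example runs without drgn or a kernel.

```python
from types import SimpleNamespace


def task_cpu_sketch(task):
    """Return the CPU number for a stand-in task object (illustration only)."""
    # Linux >= 5.16: the field lives at task->thread_info.cpu.
    thread_info = getattr(task, "thread_info", None)
    if thread_info is not None and hasattr(thread_info, "cpu"):
        return thread_info.cpu
    # Linux 4.9 through 5.15 with CONFIG_THREAD_INFO_IN_TASK: task->cpu.
    if hasattr(task, "cpu"):
        return task.cpu
    # Linux < 4.9, or no CONFIG_THREAD_INFO_IN_TASK:
    # ((struct thread_info *)task->stack)->cpu.
    return task.stack.cpu


# Hypothetical stand-ins for the three layouts named in the commit message.
new_layout = SimpleNamespace(thread_info=SimpleNamespace(cpu=3))
mid_layout = SimpleNamespace(thread_info=SimpleNamespace(), cpu=2)
old_layout = SimpleNamespace(stack=SimpleNamespace(cpu=1))

for layout in (new_layout, mid_layout, old_layout):
    print(task_cpu_sketch(layout))
```

Probing for the newest layout first and falling back keeps the caller free of kernel-version checks, which matches the commit's stated goal of handling all three cases in one helper.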

89 Commits