JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-23 01:33:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	084e636341	libdrgn: add DRGN_ERROR_NOT_IMPLEMENTED This will be used for partial 128-bit object support. There are other places that should probably be converted to use it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-28 13:38:38 -07:00
Omar Sandoval	14642fb3b6	libdrgn: add stub RISC-V architecture with relocation implementation The 32-bit and 64-bit variants have different register sizes, so they're different architectures in drgn. For now, put them in the same file so that they can share the relocation implementation. We'll need to figure out how to handle registers later. P.S. RISC-V has the weirdest relocations so far. /proc/kcore also appears to be broken. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 11:51:23 -07:00
Omar Sandoval	d27204260e	libdrgn: add stub Arm architecture with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_ARM_ABS32. R_ARM_REL32 is easy. The Linux kernel supports a bunch of other ones that don't seem relevant to debug info. Unfortunately, I wasn't able to test this because /proc/kcore doesn't exist on Arm. This apparently goes all the way back to 2003: https://lwn.net/Articles/45315/. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:25:05 -07:00
Omar Sandoval	3f246f7054	libdrgn: add stub AArch64 architecture with relocation implementation The only relocation types I saw in Debian's kernel module debug info were R_AARCH64_ABS64 and R_AARCH64_ABS32. R_AARCH64_ABS16, R_AARCH64_PREL64, R_AARCH64_PREL32, and R_AARCH64_PREL16 are all easy. The remaining types supported by the Linux kernel are for movw and immediate instructions, which aren't relevant to debug info. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:23:56 -07:00
Omar Sandoval	7535838cd5	libdrgn: add stub i386 architecture with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_386_32. R_386_PC32 is easy. The Linux kernel also supports R_386_PLT32, but that's the same story as R_X86_64_PLT32 in x86-64, so we don't implement it for now. I was torn between naming it i386, x86, or IA-32. x86 isn't immediately clear whether x86-64 is included or not. No one other than Intel calls it IA-32. i386 might incorrectly imply that it is strictly the original i386 instruction set with no later extensions, but the more general meaning is used frequently in the Linux world (e.g., Debian and QEMU both call it i386), so I went with that in the end. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-19 00:21:59 -07:00
Omar Sandoval	50e4ac8245	libdrgn: allow overriding program default language Our cheap heuristic for the default language will not always be correct, and although we can improve it as cases arise, we should also just have a way for the user to explicitly set the default language. Add drgn_program_set_language() to libdrgn and allow setting drgn.Program.language in the Python bindings. This will also make unit testing different languages easier. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 13:29:12 -08:00
Omar Sandoval	9397a11605	libdrgn: export drgn_language instances libdrgn currently exports struct drgn_language pointers from drgn_program_language(), drgn_type_language(), and drgn_object_language(), but doesn't provide any way to do anything with them. Export our drgn_language instances and add drgn_language_name() so that they can at least be compared and printed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 13:07:42 -08:00
Omar Sandoval	98577e5e23	libdrgn: fix drgn_program_find_thread() for Linux kernel when thread isn't found If a TID does not exist, then linux_helper_find_task() succeeds but returns a null pointer object. Check for that instead of returning a bogus thread. Fixes: `301cc767ba` ("Implement a new API for representing threads") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-12 01:16:49 -08:00
Mykola Lysenko	7580fffbdf	Add drgn.Program.main_thread() Currently only supported for user-space crash dumps. E.g. no support for live user-space application debugging or kernel debugging. Closes #144. Signed-off-by: Mykola Lysenko <mykolal@fb.com>	2022-02-10 15:53:50 -08:00
Stephen Brennan	7970a60818	Add methods to return multiple matching symbols Currently we can look up symbols by name or address, but this will only return one symbol, prioritizing the global symbols. However, symbols may share the same name, and symbols may also overlap address ranges, so it's possible for searches to return multiple results. Add functions which can return a list of multiple matching symbols. Signed-off-by: Stephen Brennan <stephen@brennan.io>	2022-01-15 11:44:33 -08:00
Kevin Svetlitski	301cc767ba	Implement a new API for representing threads Previously, drgn had no way to represent a thread – retrieving a stack trace (the only extant thread-specific operation) was achieved by requiring the user to directly provide a tid. This commit introduces the scaffolding for the design outlined in issue #92, and implements the corresponding methods for userspace core dumps, the live Linux kernel, and Linux kernel core dumps. Future work will build on top of this commit to support live userspace processes. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-11 17:28:17 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Stephen Brennan	3d8db22c47	libdrgn: Add kind and binding fields to drgn_symbol Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2021-08-20 18:16:57 -07:00
Omar Sandoval	7335df114c	libdrgn: python: add Object.to_bytes_() And the libdrgn implementation, drgn_object_read_bytes(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:12:34 -07:00
Omar Sandoval	bc85767e5f	libdrgn: support looking up parameters and variables in stack traces After all of the preparatory work, the last two missing pieces are a way to find a variable by name in the list of scopes that we saved while unwinding, and a way to find the containing scopes of an inlined function. With that, we can finally look up parameters and variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	38573cfdde	libdrgn: stack_trace: pretty print frames and add frames for inline functions If we want to access a parameter or local variable in an inlined function, then we need a stack frame for that function. It's also much more useful to see inlined functions in the stack trace in general. So, when we've unwound the registers for a stack frame, walk the debugging information to find all of the (possibly inlined) functions at the program counter, and add a drgn stack frame for each of those. Also add StackFrame.name and StackFrame.is_inline so that we can distinguish inline frames. Also add StackFrame.source() to get the filename and line and column numbers. Finally, add the source code location to pretty-printed stack traces and add pretty-printing for individual stack frames that includes extra information. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	0e3054a0ba	libdrgn: make addresses wrap around when reading memory Define that addresses for memory reads wrap around after the maximum address rather than the current unpredictable behavior. This is done by: 1. Reworking drgn_memory_reader to work with an inclusive address range so that a segment can contain UINT64_MAX. drgn_memory_reader remains agnostic to the maximum address and requires that address ranges do not overflow a uint64_t. 2. Adding the overflow/wrap-around logic to drgn_program_add_memory_segment() and drgn_program_read_memory(). 3. Changing direct uses of drgn_memory_reader_reader() to drgn_program_read_memory() now that they are no longer equivalent. (For some platforms, a fault might be more appropriate than wrapping around, but this is a step in the right direction.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-03 17:49:29 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	38d4330fec	libdrgn: clean up stale comment references and Doxygen warnings Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-16 16:15:43 -07:00
Omar Sandoval	eec67768aa	libdrgn: replace elfutils DWARF unwinder with our own The elfutils DWARF unwinder has a couple of limitations: 1. libdwfl doesn't have an interface for getting register values, so we have to bundle a patched version of elfutils with drgn. 2. Error handling is very awkward: dwfl_getthread_frames() can return an error even on success, so we have to squirrel away our own errors in the callback. Furthermore, there are a couple of things that will be easier with our own unwinder: 1. Integrating unwinding using ORC will be easier when we're handling unwinding ourselves. 2. Support for local variables isn't too far away now that we have DWARF expression evaluation. Now that we have the register state, CFI, and DWARF expression pieces in place, stitch them together with the new unwinder, and tweak the public API a bit to reflect it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-15 16:43:12 -07:00
Omar Sandoval	a24c0f5b33	libdrgn: clean up usage of drgn_stop Use drgn_not_found where it's more appropriate, and check explicitly against drgn_stop instead of err->code == DRGN_ERROR_STOP. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-05 12:46:06 -08:00
Omar Sandoval	aaa98ccae3	libdrgn: consistently use __ for __attribute__ names In some places, we add __ preceding and following an attribute name, and in some places, we don't. Let's make it consistent. We might as well opt for the __ to make clashes with macros less likely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 03:16:23 -08:00
Omar Sandoval	3ecb31de9f	libdrgn: update stale references in drgn_object_slice() comment drgn_program_member_info() was replaced by drgn_type_find_member() in commit `e72ecd0e2c` ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()"). drgn_object_pointer_offset() never existed; it's supposed to be drgn_object_dereference_offset(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:41:20 -08:00
Omar Sandoval	da1e72f0d5	libdrgn: remove drgn_{,qualified_}type_eq() from drgn.h.in The definitions were removed but these public declarations weren't. Fixes: `7d7aa7bf7b` ("libdrgn/python: remove Type == operator") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:37:36 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Kamalesh Babulal	221a218704	libdrgn: add powerpc stack trace support Add powerpc specific register information required to retrieve the stack traces of the tasks on both live system and from the core dump. It uses the existing DSL format to define platform registers and helper functions to initialize them. It also adds architecture specific information to enable powerpc. Current support is for little-endian powerpc only. Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>	2021-01-29 11:31:59 -08:00
Omar Sandoval	b899a10836	Remove register numbers from API and add register aliases enum drgn_register_number in the public libdrgn API and drgn.Register.number in the Python bindings are basically exports of DWARF register numbers. They only exist as a way to identify registers that's lighter weight than string lookups. libdrgn already has struct drgn_register, so we can use that to identify registers in the public API and remove enum drgn_register_number. This has a couple of benefits: we don't depend on DWARF numbering in our API, and we don't have to generate drgn.h from the architecture files. The Python bindings can just use string names for now. If it seems useful, StackFrame.register() can take a Register in the future, we'll just need to be careful to not allow Registers from the wrong platform. While we're changing the API anyways, also change it so that registers have a list of names instead of one name. This isn't needed for x86-64 at the moment, but will be for architectures that have multiple names for the same register (like ARM). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 17:47:45 -08:00
Omar Sandoval	46343ae08d	libdrgn: get rid of struct drgn_stack_frame In preparation for adding a "real", internal-only struct drgn_stack_frame, replace the existing struct drgn_stack_frame with explicit trace/frame arguments. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-27 11:22:34 -08:00
Omar Sandoval	352c31e1ac	Add support for C++ template parameters Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_AT_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	190062f470	libdrgn: get drgn_type_member.bit_field_size through drgn_member_type() Getting the bit field size of a member will soon require evaluating the lazy type, so return it from drgn_member_type() instead of accessing it directly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	359177295d	libdrgn: move type definitions in drgn.h In preparation for struct drgn_type referencing struct drgn_object, move the former after the latter. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	7d7aa7bf7b	libdrgn/python: remove Type == operator The == operator on drgn.Type is only intended for testing. It's expensive and slow and not what people usually want. It's going to get even more awkward to define once types can refer to objects (for template parameters and static members and such). Let's replace == with a new identical() function only available in unit tests. Then, remove the operator from the Python bindings as well as the underlying libdrgn drgn_type_eq() and drgn_qualified_type_eq() functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 03:11:38 -08:00
Omar Sandoval	40004e5c8f	libdrgn/python: add offsetof() offsetof() can almost be implemented with Type.member(name).offset, but that doesn't parse member designators. Add an offsetof() function that does (and add drgn_type_offsetof() in libdrgn). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:46:41 -08:00
Omar Sandoval	a595e52d22	libdrgn/python: add Type.has_member() Add drgn_type_has_member() to libdrgn and Type.has_member() to the Python bindings. This can simplify some version checks, like the one in _for_each_block_device() since commit `9a10a927b0` ("helpers: fix for_each_{disk,partition}() on kernels >= v5.1"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:38:48 -08:00
Omar Sandoval	e72ecd0e2c	libdrgn: replace drgn_program_member_info() with drgn_type_find_member() Now that types are associated with their program, we don't need to pass the program separately to drgn_program_member_info() and can replace it with a more natural drgn_type_find_member() API that takes only the type and member name. While we're at it, get rid of drgn_member_info and return the drgn_type_member and bit_offset directly. This also fixes a bug that drgn_error_member_not_found() ignores the member name length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:40:54 -08:00
Omar Sandoval	738ae2c75f	libdrgn: pack struct drgn_object better We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64) by moving the bit_offset and little_endian members out of the value and reference structs. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	6bd0c2b4d2	libdrgn: add concept of "unavailable" objects There are some situations where we can find an object but can't determine its value, like local variables that have been optimized out, inlined functions without a concrete instance, and pure virtual methods. It's still useful to get some information from these objects, namely their types. Let's add the concept of an "unavailable" object, which is an object with a known type but unknown value/address. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:58:19 -08:00
Omar Sandoval	5f17281926	libdrgn: make drgn_object::is_reference an enum To prepare for a new kind of object, replace the is_reference bool with an enum drgn_object_kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:37:58 -08:00
Omar Sandoval	edb1fe7f2f	libdrgn: rename drgn_object_kind to drgn_object_encoding I'd like to use the name drgn_object_kind to distinguish between values and references. "Encoding" is more accurate than "kind", anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:26 -08:00
Omar Sandoval	a4dbd7bf95	libdrgn: remove unused DRGN_NUM_ARCH Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:23 -08:00
Omar Sandoval	de6a4e07ae	libdrgn: fix Doxygen The Doxygen documentation for libdrgn has bit-rotted over time. Bring back the Internal module, clean up a few renamed members and parameters, and fix broken parsing caused by the generic definition macros. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-30 01:32:33 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	d512964c1e	libdrgn: add drgn_error_copy() This is needed for a future change where we'll want to save an error and return it multiple times. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	e49a87a3d7	libdrgn: remove struct drgn_object::prog We can get it via the type now. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:21 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	c840072d05	libdrgn: make drgn_object_set_buffer() take a void * It's awkward to make callers cast to char *. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-07-13 10:25:03 -07:00
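
The wrap-around rule described in commit 0e3054a0ba above (inclusive address ranges, with reads continuing at address 0 after the maximum address) can be illustrated with a small sketch. This is not libdrgn's implementation: the function name `split_wrapping_read` and its `max_address` parameter are hypothetical, chosen only to show how one read might be split into inclusive ranges that never overflow a 64-bit value.

```python
def split_wrapping_read(address, size, max_address=2**64 - 1):
    """Split a read into inclusive (start, end) address ranges.

    Hypothetical helper, not part of drgn. A read that runs past
    max_address continues at address 0, so it yields two ranges.
    """
    if not 0 <= address <= max_address:
        raise ValueError("address is outside the address space")
    if size < 0 or size > max_address + 1:
        raise ValueError("size does not fit in the address space")
    if size == 0:
        return []

    # Number of bytes available from `address` up to and including
    # max_address. Inclusive ends let a range contain max_address itself.
    first_len = min(size, max_address - address + 1)
    ranges = [(address, address + first_len - 1)]

    # Whatever is left wraps around and starts again at address 0.
    remaining = size - first_len
    if remaining:
        ranges.append((0, remaining - 1))
    return ranges
```

For example, a 4-byte read at 0xfffffffffffffffe splits into the two last bytes of the address space followed by bytes 0 and 1, while a read that stays below the maximum address produces a single range.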

81 Commits