JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-23 09:43:06 +00:00

Author	SHA1	Message	Date
Jay Kamat	d8fadf10ee	libdrgn: Add cpp language and tests	2020-04-03 16:35:38 -07:00
Omar Sandoval	fa61977f60	libdrgn: fix default language detection Jay reported that the default language detection was happening too early and not finding "main". We need to make sure to do it after the DWARF index is actually populated. The problem with that is that it makes error reporting much harder, as we don't want to return a fatal error from drgn_program_set_language_from_main() if we actually succeeded in loading debug info. That means we probably need to ignore errors in drgn_program_set_language_from_main(). To reduce the surface area where we'd be failing, let's get the language directly from the DWARF index. This also allows us to avoid setting the language if it's actually unknown (information which is lost by the time we convert it to a drgn_object in the current code).	2020-04-03 16:35:38 -07:00
Omar Sandoval	bd902f299c	libdrgn: return unknown language from drgn_language_from_die() Currently, drgn_language_from_die() returns the default language when it encounters an unknown DW_LANG because the dwarf_info_cache always wants a language. The next change will want to detect the unknown language case, so make drgn_language_from_die() return NULL if the language is unknown, move it to language.c, and fold drgn_language_from_dw_lang() into it.	2020-04-03 16:35:38 -07:00
Serapheim Dimitropoulos	08193a97aa	Support stack traces for running threads on kdumps	2020-03-27 16:12:03 -07:00
Omar Sandoval	79f973007b	libdrgn/python: fix reference counting on Object.type_ We need to keep the Program alive for its types to stay valid, not just the objects the Program has pinned. (I have no idea why I changed this in commit `565e0343ef` ("libdrgn: make symbol index pluggable with callbacks").)	2020-03-13 16:05:43 -07:00
Omar Sandoval	cae7336750	libdrgn: fix error when expecting identifier after tag in type name We should be looking at the kind of the previous token, not the kind of the unexpected token. Closes #52.	2020-03-13 11:07:46 -07:00
Jay Kamat	3f870603fa	libdrgn: add default language to drgn_program For operations where we don't have a type available, we currently fall back to C. Instead, we should guess the language of the program and use that as the default. The heuristic implemented here gets the language of the CU containing "main" (except for the Linux kernel, which is always C). In the future, we should allow manually overriding the automatically determined language.	2020-02-26 19:55:42 -08:00
Jay Kamat	6c264b0eae	libdrgn: add language to struct drgn_type For types obtained from DWARF, we determine it from the language of the CU. For other types, it can be specified manually or fall back to the default (C). Then, we can use the language for operations where the type is available.	2020-02-26 19:55:42 -08:00
Omar Sandoval	3d3c32f849	libdrgn/python: add Language to Python bindings	2020-02-26 19:55:42 -08:00
Omar Sandoval	9e2df9f217	libdrgn: put language definitions in one array This way, languages can be identified by an index, which will be useful for adding Python bindings for drgn_language and for adding a language field to drgn_type.	2020-02-26 19:55:42 -08:00
Omar Sandoval	c1e1724c8e	libdrgn: unify drgn_type_from_dwarf_child{,_internal} The plain variant is a trivial wrapper around the internal variant, so get rid of the wrapper and use the internal variant directly everywhere.	2020-02-26 19:55:42 -08:00
Omar Sandoval	efccc93e65	libdrgn: remove can_be_void from drgn_lazy_type_from_dwarf We only lazily evaluate compound type members and function type parameters, which are never void.	2020-02-26 19:55:42 -08:00
Omar Sandoval	376979d25a	Remove stray reference to gen_docstrings.py	2020-02-25 13:58:10 -08:00
Omar Sandoval	80c9fb35ff	Add type hint stubs and generate documentation from them I've been wanting to add type hints for the _drgn C extension for a while. The main blocker was that there is a large overlap between the documentation (in docs/api_reference.rst) and the stub file, and I really didn't want to duplicate the information. Therefore, it was a requirement that the documentation could be generated from the stub file, or vice versa. Unfortunately, none of the existing tools that I could find supported this very well. So, I bit the bullet and wrote my own Sphinx extension that uses the stub file as the source of truth (and subsumes my old autopackage extension and gen_docstrings script). The stub file is probably incomplete/inaccurate in places, but this should be a good starting point to improve on. Closes #22.	2020-02-25 13:39:06 -08:00
Omar Sandoval	52e9b2f8d8	drgn 0.0.3	2020-02-21 10:37:02 -08:00
Serapheim Dimitropoulos	e3789512ab	Fix leak in kdump code prog->kdump_ctx is never really initialized, and the kdump_ctx struct allocated in drgn_program_set_kdump() is leaked.	2020-02-20 15:49:44 -08:00
Omar Sandoval	9246094cdc	libdrgn: use dwfl_frame_register() instead of dwfl_frame_eval_expr() I thought I'd be able to avoid adding a separate API for register values and reuse dwfl_frame_eval_expr(), but this doesn't work if the frame is missing debug information but has known register values (e.g., if the program crashed with an invalid instruction pointer).	2020-02-20 14:13:08 -08:00
Omar Sandoval	016189f477	Update elfutils with improved stack frame interface Rebase on master, add the improved dwfl_frame_module/dwfl_frame_dwarf_frame patch, and add the dwfl_frame_register patch. Based on: 889edd912 PR25365: debuginfod-client: restrict cleanup to client-pattern files With the following patches: configure: Add --disable-programs configure: Add --disable-shared libdwfl: add interface for attaching to/detaching from threads libdwfl: add interface for getting Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: export __libdwfl_frame_reg_get as dwfl_frame_register libdwfl: add interface for evaluating DWARF expressions in a frame	2020-02-20 13:49:10 -08:00
Omar Sandoval	a5cd92f24e	libdrgn: make vmcoreinfo accessible before loading debug info UTS_RELEASE is currently only accessible once debug info is loaded with prog.load_debug_info(main=True). This makes it difficult to get the release, find the appropriate vmlinux, then load the found vmlinux. We can add vmcoreinfo_object_find as part of set_core_dump(), which makes it possible to do the following: prog = drgn.Program() prog.set_core_dump(core_dump_path) release = prog['UTS_RELEASE'].string_() vmlinux_path = find_vmlinux(release) prog.load_debug_info([vmlinux_path]) The only downside is that this ends up using the default definition of char rather than what we would get from the debug info, but that shouldn't be a big problem.	2020-02-19 12:11:45 -08:00
Omar Sandoval	cc18d9e502	libdrgn: add UTS_RELEASE to vmcoreinfo_object_find The osrelease is accessible via init_uts_ns.name.release, but we can also get it straight out of vmcoreinfo, which will be useful for the next change. UTS_RELEASE is the name of the macro defined in the kernel.	2020-02-19 12:11:20 -08:00
Omar Sandoval	26ef465007	libdrgn/python: add proper type for members and parameters This continues the conversion from the last commit. Members and parameters are basically the same, so we can do them together. Unlike enumerators, these don't make sense to unpack or access as sequences.	2020-02-12 15:40:19 -08:00
Omar Sandoval	7c70a1a384	libdrgn/python: add proper type for enumerators Currently, type members, enumerators, and parameters are all represented by tuples in the Python bindings. This is awkward to document and implement. Instead, let's replace these tuples with proper types, starting with the easiest one, TypeEnumerator. This one still makes sense to treat as a sequence so that it can be unpacked as (name, value).	2020-02-12 15:37:41 -08:00
Jay Kamat	31d544949f	libdrgn: Add find_symbol_by_name to look up ELF symbols	2020-02-12 14:06:49 -08:00
Jay Kamat	054cb54a01	libdrgn: Rename find_symbol to find_symbol_by_address	2020-02-12 14:06:49 -08:00
Omar Sandoval	9de2cc8410	libdrgn/python: make Object.__index__() TypeError message clearer Currently, we print: >>> prog.symbol(prog['init_task']) Traceback (most recent call last): File "<console>", line 1, in <module> TypeError: cannot convert 'struct task_struct' to index It's not obvious what it means to convert to an index. Instead, let's use the error message raised by operator.index(): TypeError: 'struct task_struct' object cannot be interpreted as an integer	2020-02-11 09:19:53 -08:00
Serapheim Dimitropoulos	80fef04c70	Add address attribute to FaultError exception	2020-02-04 14:59:31 -08:00
Serapheim Dimitropoulos	ad82e9623a	Introduce OutOfBoundsError Decouple some of the responsibilities of FaultError to OutOfBoundsError so consumers can differentiate between invalid memory accesses and running out of bounds in drgn Objects, which may be based on a valid memory address.	2020-02-04 14:59:31 -08:00
Omar Sandoval	653e923657	drgn 0.0.2	2020-01-31 13:25:33 -08:00
Omar Sandoval	d4cc7945af	Support building with alternative OpenMP runtime libraries At Facebook, we link OpenMP code with libomp instead of libgomp. We have an internal patch to drgn to do this, as it can't be done by setting CFLAGS/LDFLAGS. Let's add a way to specify the OpenMP library at configure time so that we can drop the internal patch.	2020-01-24 10:22:38 -08:00
Omar Sandoval	0088618578	Include git revision in version When investigating a reported bug, it's important to know which exact version of drgn is being used. Let's include the git revision in the version printed by the CLI in PEP440 format. This commit will also serve as the 0.0.1 release.	2020-01-23 12:29:43 -08:00
Omar Sandoval	660276a0b8	Format Python code with Black I'm not a fan of 100% of the Black coding style, but I've spent too much time manually formatting Python code, so let's just pull the trigger.	2020-01-14 11:51:58 -08:00
Omar Sandoval	e8d1ef82fa	Make drgn.h depend on configure.ac The previous commit forgot to add this dependency so that when the version number is updated drgn.h actually gets regenerated.	2020-01-11 22:34:03 -08:00
Omar Sandoval	09a64f5cba	Define version in libdrgn/configure.ac Currently the drgn version number is defined in drgn.h.in, and configure and setup.py both parse it out of there. However, now that we're generating drgn.h anyways, it's easier to make configure.ac the source of truth.	2020-01-11 10:05:57 -08:00
Omar Sandoval	1443d17fb4	libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_ELEMENTS	2019-12-19 11:43:54 -08:00
Omar Sandoval	db66952b2e	libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_MEMBERS	2019-12-19 11:43:54 -08:00
Omar Sandoval	c8434e9a9e	libdrgn: add DRGN_FORMAT_OBJECT_ELEMENT_INDICES	2019-12-19 11:43:54 -08:00
Omar Sandoval	cfceb491db	libdrgn: add DRGN_FORMAT_OBJECT_MEMBER_NAMES	2019-12-19 11:43:54 -08:00
Omar Sandoval	4fad941ec1	libdrgn: add DRGN_FORMAT_OBJECT_{MEMBERS,ELEMENTS}_SAME_LINE	2019-12-19 11:43:54 -08:00
Omar Sandoval	c3f69ba559	libdrgn: use c_format_initializer for struct/union/class	2019-12-19 11:43:54 -08:00
Omar Sandoval	e2d2df4024	libdrgn: factor c_format_initializer out of c_format_array This will also be used for compound types, and we're going to add a few more options that we should handle in one place.	2019-12-19 11:43:54 -08:00
Omar Sandoval	6bb8da04a0	libdrgn: omit trailing comma when formatting one-line array This is somewhat arbitrary, but I think it looks more natural to only use the trailing comma for multi-line initializers.	2019-12-19 11:43:54 -08:00
Omar Sandoval	1411ba36a8	libdrgn: remove dead code in c_format_array_object When we're checking whether the element that we formatted on one line would fit on the previous line, we check whether the previous line is empty with remaining_columns == start_columns. This is never true, as remaining_columns is always set to start_columns - 1 at most, and it only decreases from there until we start a new line.	2019-12-19 11:43:54 -08:00
Omar Sandoval	7a3bf73df0	libdrgn: replace drgn_object_truthiness() with drgn_object_is_zero() drgn_object_truthiness() is a misnomer, as truthiness is a language-specific concept. Instead, invert the return value and rename it to drgn_object_is_zero(), which more accurately conveys the meaning.	2019-12-19 11:43:54 -08:00
Omar Sandoval	d77b7bd7e3	libdrgn: add DRGN_FORMAT_OBJECT_{TYPE_NAME,MEMBER_TYPE_NAMES,ELEMENT_TYPE_NAMES}	2019-12-19 11:43:54 -08:00
Omar Sandoval	89307c532a	libdrgn: add DRGN_FORMAT_OBJECT_CHAR	2019-12-19 11:43:54 -08:00
Omar Sandoval	7cee597fff	libdrgn: add DRGN_FORMAT_OBJECT_STRING	2019-12-19 11:43:54 -08:00
Omar Sandoval	5865fa4d16	libdrgn: add DRGN_FORMAT_OBJECT_SYMBOLIZE	2019-12-19 11:43:54 -08:00
Omar Sandoval	0a707b0c9d	libdrgn: rework drgn_find_symbol_internal() Instead of having two internal variants (drgn_find_symbol_internal() and drgn_program_find_symbol_in_module()), combine them into the former and add a separate drgn_error_symbol_not_found() for translating the static error to the user-facing one. This makes things more flexible for the next change.	2019-12-19 11:43:54 -08:00
Omar Sandoval	f58bc4bf3a	libdrgn: add DRGN_FORMAT_OBJECT_DEREFERENCE	2019-12-19 11:43:54 -08:00
Omar Sandoval	5fb02f03fd	libdrgn: add flags to drgn_format_object()	2019-12-19 11:43:54 -08:00
Omar Sandoval	cf3a07bdfb	libdrgn: python: replace Object.__format__ with Object.format_ We'd like to have more control over how objects are formatted. I considered defining a custom string format specification syntax, but that's not easily discoverable. Instead, let's get rid of the current format specification support and replace it with a normal method.	2019-12-19 11:43:52 -08:00
Omar Sandoval	3b22bd3022	libdrgn: rename pretty_print -> format In preparation for making drgn_pretty_print_object() more flexible (i.e., not always "pretty"), rename it to drgn_format_object(). For consistency, let's rename drgn_pretty_print_type_name(), drgn_pretty_print_type(), and drgn_pretty_print_stack_trace(), too.	2019-12-16 11:21:12 -08:00
Serapheim Dimitropoulos	501d36c18e	libdrgn: fix regression in kernel module loading Commit `f327552229` ("libdrgn: add strstartswith()") flipped the test for a name entry in modinfo. This introduced a regression resulting in kernel modules not loading at the right offset. This patch fixes the regression.	2019-12-13 19:19:31 -05:00
Omar Sandoval	54e3e4a6d6	Rebase elfutils and remove dwfl_addrmodule patches The previous commit was the real fix for the failed symbol lookups. On the bright side, the build fixes were merged, so we can rebase on master and drop those. Based on: 277c2c54f libcpu: Compile i386_lex.c with -Wno-implicit-fallthrough With the following patches: configure: Add --disable-programs configure: Add --disable-shared libdwfl: add interface for attaching to/detaching from threads libdwfl: cache Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: add interface for evaluating DWARF expressions in a frame	2019-12-12 21:14:51 -08:00
Omar Sandoval	b0c4f894d4	libdrgn: really fix failed kernel module symbol lookups It turns out this wasn't a problem with dwfl_addrmodule() at all; the real problem is that .init sections are freed once the module is loaded but we're still considering them for the address range we pass to dwfl_report_module(). Ignore those sections entirely (by omitting them from the section name to section index map). While we're here, let's not bother inserting non-SHF_ALLOC sections in the map.	2019-12-12 21:14:02 -08:00
Omar Sandoval	f327552229	libdrgn: add strstartswith() Instead of open coding this check all over the place, add a helper function.	2019-12-12 13:26:50 -08:00
Omar Sandoval	ad5c925aff	Update elfutils with dwfl_addrmodule fix This fixes the issue that Program.symbol() sometimes fails for kernel module symbols. Based on: 2c7c4037 elfutils.spec.in: Sync with fedora spec, remove rhel/fedora specifics. With the following patches: configure: Add --disable-programs configure: Add --disable-shared configure: Fix -D_FORTIFY_SOURCE=2 check when CFLAGS contains -Wno-error libcpu: compile i386_lex.c with -Wno-implicit-fallthrough libdwfl: add interface for attaching to/detaching from threads libdwfl: cache Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: add interface for evaluating DWARF expressions in a frame libdwfl: return error from __libdwfl_relocate_value for unloaded sections libdwfl: remove broken coalescing logic in dwfl_report_segment libdwfl: store module lookup table separately from segments libdwfl: use sections of relocatable files for dwfl_addrmodule	2019-12-11 22:34:05 -08:00
Omar Sandoval	4a8152175b	libdrgn: translate EIO from /proc/$pid/mem to DRGN_ERROR_FAULT For live userspace processes, we add a single [0, UINT64_MAX) memory file segment for /proc/$pid/mem. Of course, not every address in that range is valid; reading from an invalid address returns EIO. We should translate this to a DRGN_ERROR_FAULT instead of DRGN_ERROR_OS, but only for /proc/$pid/mem.	2019-12-10 13:30:34 -08:00
Omar Sandoval	248cec7f7c	libdrgn: python: fix uninitialized index_args In commit `55a9700435` ("libdrgn: python: accept integer-like arguments in more places"), I converted Program_symbol to use index_converter but forgot to initialize the struct index_arg. Then, in commit `c243daed59` ("Translate find_task() helper (and dependencies) to C"), I added a bunch more cases of uninitialized struct index_arg. If index_arg.allow_none gets a non-zero garbage value, then this can end up allowing None through when it shouldn't. Furthermore, since commit `2561226918` ("libdrgn: python: add signed integer support to index_converter"), if index_arg.is_signed gets a non-zero garbage value, then this will try to get a signed integer when we're expecting an unsigned integer, which can blow up for values >= 2**63 (like kernel symbols). Fix it by initializing struct index_arg everywhere. Fixes #30.	2019-12-05 14:35:54 -08:00
Omar Sandoval	d3afc63ac9	Update to elfutils 0.178 Rebase on 0.178. The only additional change needed is to pass --disable-debuginfod to configure. Based on: 2c7c4037 elfutils.spec.in: Sync with fedora spec, remove rhel/fedora specifics. With the following patches: configure: Add --disable-programs configure: Add --disable-shared configure: Fix -D_FORTIFY_SOURCE=2 check when CFLAGS contains -Wno-error libcpu: compile i386_lex.c with -Wno-implicit-fallthrough libdwfl: add interface for attaching to/detaching from threads libdwfl: cache Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: add interface for evaluating DWARF expressions in a frame	2019-12-03 12:39:11 -08:00
Omar Sandoval	7b518fc2fd	libdrgn: support negative array subscripts This was an oversight, as negative indices are completely valid (and occasionally useful, like when looking at a stack).	2019-11-29 21:06:37 -08:00
Omar Sandoval	2561226918	libdrgn: python: add signed integer support to index_converter This is preparation for the next change.	2019-11-29 20:40:40 -08:00
Omar Sandoval	dd59e5431c	libdrgn: fix extremely slow type comparison Matt Ahrens reported that comparing two types would sometimes end up in a seemingly infinite loop, which he discovered was because we repeat comparisons of types as long as they're not in a cycle. Fix it by caching all comparisons during a call.	2019-11-24 09:46:00 -08:00
Omar Sandoval	b8b93ae3e6	libdrgn: python: fix deprecation warning in unit tests Some tests (e.g., tests.test_object.TestSpecialMethods.test_round) are printing: DeprecationWarning: an integer is required (got type float). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python. See https://bugs.python.org/issue36048. This is coming from calls like: Object(prog, 'int', value=1.5) We actually want the truncating behavior, so explicitly call PyNumber_Long().	2019-11-22 17:18:55 -08:00
Omar Sandoval	6af6159cfc	libdrgn: support loading only main debug info If we only want debugging information for vmlinux and not kernel modules, it'd be nice to only load the former. This adds a load_main parameter to drgn_program_load_debug_info() which specifies just that. For now, it's only implemented for the Linux kernel. While we're here, let's make the paths parameter optional for the Python bindings.	2019-11-22 16:38:52 -08:00
Omar Sandoval	09108d22fa	libdrgn: x86_64: support unwinding stack on Linux < 4.9	2019-11-22 16:38:49 -08:00
Amlan Nayak	0df2152307	Add basic class type support This implements the first step toward supporting C++: class types. In particular, this adds a new drgn_type_kind, DRGN_TYPE_CLASS, and support for parsing DW_TAG_class_type from DWARF. Although classes are not valid in C, this adds support for pretty printing them, for completeness.	2019-11-18 10:36:40 -08:00
Omar Sandoval	b49f773fe6	libdrgn: python: fix build on Python 3.8 Python 3.8 replaced the unused void *tp_print field with Py_ssize_t tp_vectorcall_offset, so with -Werror we get "error: initialization of ‘long int’ from ‘void *’ makes integer from pointer without a cast". Let's just use designated initializers.	2019-11-15 10:41:58 -08:00
Omar Sandoval	1c8eced0c6	libdrgn: stack_trace: support unwinding stack from struct pt_regs Linux kernel IRQ handlers store the registers from before the interrupt as struct pt_regs, so add a way to unwind the stack given only that structure.	2019-10-28 13:56:54 -07:00
Omar Sandoval	4780c7a266	libdrgn: stack_trace: prohibit unwinding stack of running tasks We currently don't check that the task we're unwinding is actually blocked, which means that linux_kernel_set_initial_registers_x86_64() will get garbage from the stack and we'll return a nonsense stack trace. Let's avoid this by checking that the task isn't running if we didn't find a NT_PRSTATUS note.	2019-10-28 13:37:57 -07:00
Omar Sandoval	326107f054	libdrgn: add task_state_to_char() helper Add a helper to get the state of a task (e.g., 'R', 'S', 'D'). This will be used to make sure that a task is not running when getting a stack trace, so implement it in libdrgn.	2019-10-28 13:37:57 -07:00
Omar Sandoval	91f5c8e2e7	libdrgn: stack_trace: support unwinding stack from thread ID When debugging the Linux kernel, it's inconvenient to have to get the task_struct of a thread in order to get its stack trace. This adds support for looking it up solely by PID. In that case, we do the find_task() inside of libdrgn. This also gives us stack trace support for userspace core dumps almost for free since we already added support for NT_PRSTATUS.	2019-10-28 13:37:53 -07:00
Omar Sandoval	0f7ad0ed26	libdrgn: stack_trace: support unwinding stack from core dump vmcores include a NT_PRSTATUS note for each CPU containing the PID of the task running on that CPU at the time of the crash and its registers. We can use that to unwind the stack of the crashed tasks.	2019-10-28 13:36:02 -07:00
Omar Sandoval	75c74022ff	libdrgn: save Elf handle for core dump Currently, we close the Elf handle in drgn_set_core_dump() after we're done with it. However, we need the Elf handle in userspace_report_debug_info(), so we reopen it temporarily. We will also need it to support getting stack traces from core dumps, so we might as well keep it open. Note that we keep it even if we're using libkdumpfile because libkdumpfile doesn't seem to have an API to access ELF notes.	2019-10-28 13:09:25 -07:00
Omar Sandoval	c243daed59	Translate find_task() helper (and dependencies) to C We'd like to be able to look up tasks by PID from libdrgn, but those helpers are written in Python. Translate them to C and add some thin bindings so we can use the same implementation from Python.	2019-10-28 13:08:57 -07:00
Omar Sandoval	b5735de8dc	libdrgn: add drgn_object_read_integer() There are some cases where we want to read an integer regardless of its signedness, so drgn_object_read_signed() and drgn_object_read_unsigned() are cumbersome to use, and drgn_object_read_value() is too permissive.	2019-10-28 13:06:38 -07:00
Omar Sandoval	97b5967c37	libdrgn: add a couple of helpers for working with buffer and reference objects Expose drgn_object_buffer() and add drgn_buffer_object_size() and drgn_reference_object_size().	2019-10-28 11:34:08 -07:00
Omar Sandoval	0da60a41cd	libdrgn: support getting register values from stack frames Currently, the only information available from a stack frame is the program counter. Eventually, we'd like to add support for getting arguments and local variables, but that will require more work. In the meantime, we can at least get the values of other registers. A determined user can read the assembly for the code they're debugging and derive the values of variables from the registers.	2019-10-19 13:53:06 -07:00
Omar Sandoval	4fb0e2e110	libdrgn: use new libdwfl stack trace API	2019-10-18 14:34:11 -07:00
Omar Sandoval	6f43fff627	Update elfutils with new stack frame interface Rebase the existing patches and add the patches which extend the libdwfl stack frame interface. Based on: 47780c9e elflint, readelf: enhance error diagnostics With the following patches: configure: Add --disable-programs configure: Add --disable-shared configure: Fix -D_FORTIFY_SOURCE=2 check when CFLAGS contains -Wno-error libcpu: compile i386_lex.c with -Wno-implicit-fallthrough libdwfl: don't bother freeing frames outside of dwfl_thread_getframes libdwfl: only use thread->unwound for initial frame libdwfl: add interface for attaching to/detaching from threads libdwfl: cache Dwfl_Module and Dwarf_Frame for Dwfl_Frame libdwfl: add interface for evaluating DWARF expressions in a frame	2019-10-18 14:34:11 -07:00
Omar Sandoval	d60c6a1d68	libdrgn: add register information to platform In order to retrieve registers from stack traces, we need to know what registers are defined for a platform. This adds a small DSL for defining registers for an architecture. The DSL is parsed by an awk script that generates the necessary tables, lookup functions, and enum definitions.	2019-10-18 14:33:02 -07:00
Omar Sandoval	b8c657d760	libdrgn: python: add sizeof() It's annoying to do obj.type_.size, and that doesn't even work for every type. Add sizeof() that does the right thing whether it's given a Type or Object.	2019-10-18 11:47:32 -07:00
Omar Sandoval	12b0214b4d	libdrgn: work around DW_AT_upper_bound of -1 for empty arrays For the following source code: int arr[] = {}; GCC emits the following DWARF: DWARF section [ 4] '.debug_info' at offset 0x40: [Offset] Compilation unit at offset 0: Version: 4, Abbreviation section offset: 0, Address size: 8, Offset size: 4 [ b] compile_unit abbrev: 1 producer (strp) "GNU C17 9.2.0 -mtune=generic -march=x86-64 -g" language (data1) C99 (12) name (strp) "test.c" comp_dir (strp) "/home/osandov" stmt_list (sec_offset) 0 [ 1d] array_type abbrev: 2 type (ref4) [ 34] sibling (ref4) [ 2d] [ 26] subrange_type abbrev: 3 type (ref4) [ 2d] upper_bound (sdata) -1 [ 2d] base_type abbrev: 4 byte_size (data1) 8 encoding (data1) signed (5) name (strp) "ssizetype" [ 34] base_type abbrev: 5 byte_size (data1) 4 encoding (data1) signed (5) name (string) "int" [ 3b] variable abbrev: 6 name (string) "arr" decl_file (data1) test.c (1) decl_line (data1) 1 decl_column (data1) 5 type (ref4) [ 1d] external (flag_present) yes location (exprloc) [ 0] addr .bss+0 <arr> Note the DW_AT_upper_bound of -1. We end up parsing this as UINT64_MAX and returning a "DW_AT_upper_bound is too large" error. It appears that GCC is simply emitting the array length minus one, so let's treat these as having a length of zero. Fixes #19.	2019-10-18 03:18:21 -07:00
Omar Sandoval	430732093d	libdrgn: python: add converter for byteorder Rather than open-coding the conversion where we need it, make it a proper converter function.	2019-10-15 21:21:21 -07:00
Omar Sandoval	55a9700435	libdrgn: python: accept integer-like arguments in more places There are a few places (e.g., Program.symbol(), Program.read()) where it makes sense to accept, e.g., a drgn.Object with integer type. Replace index_arg() with a converter function and use it everywhere that we use the "K" format for PyArg_Parse*.	2019-10-15 21:10:11 -07:00
Omar Sandoval	77253dbdd8	libdrgn: dwarf_info_cache: fix wrong DW_AT_upper_bound error message I got the error messages for DW_AT_upper_bound and DW_AT_count backwards; fix it. Also fix the condition for word + 1 overflowing dimension->length to be word >= UINT64_MAX. (Dwarf_Word is uint64_t so this is kind of silly, but at least it documents the intent).	2019-10-15 17:07:27 -07:00
Omar Sandoval	4e330bbb6e	cli: indicate if drgn was compiled with libkdumpfile	2019-10-03 16:22:10 -07:00
Omar Sandoval	78192cd61e	libdrgn: add environment variable to see more missing debug info errors Sometimes, I'd like to see all of the missing debug info errors rather than just the first 5. Allow setting this through the DRGN_MAX_DEBUG_INFO_ERRORS environment variable.	2019-10-02 17:22:12 -07:00
Omar Sandoval	7848c17097	libdrgn: dwarf_index: tweak missing debug section error message Make the error message more concise, and reorder the sections so that we check the most obviously-named section (.debug_info) first and least important section (.debug_line) last.	2019-10-02 17:22:12 -07:00
Omar Sandoval	423d2cd500	libdrgn: dwarf_index: rework file reporting Currently, the interface between the DWARF index, libdwfl, and the code which finds and reports vmlinux/kernel modules is spaghetti. The DWARF index tracks Dwfl_Modules via their userdata. However, despite conceptually being owned by the DWARF index, the reporting code reports the Dwfl_Modules and sets up the userdata. These Dwfl_Modules and drgn_dwfl_module_userdatas are messy to track and pass between the layers. This reworks the architecture so that the DWARF index owns the Dwfl instance and files are reported to the DWARF index; the DWARF index takes care of reporting to libdwfl internally. In addition to making the interface for the reporter much cleaner, this improves a few things as a side-effect: - We now deduplicate on build ID in addition to path. - We now skip searching for vmlinux and/or kernel modules if they were already indexed. - We now support compressed ELF files via libdwelf. - We can now load default debug info at the same time as additional debug info.	2019-10-02 17:22:11 -07:00
Omar Sandoval	91265c37a0	libdrgn: hash_table: fix memcmp() undefined behavior It's undefined behavior to pass NULL to memcmp() even if the length is zero. See also commit `a17215e984` ("libdrgn: dwarf_index: fix memcpy() undefined behavior").	2019-10-02 17:16:43 -07:00
Omar Sandoval	b05cc0eb75	libdrgn: use libkdumpfile for ELF vmcores when available vmcores don't include program headers for special memory regions like vmalloc and percpu. Instead, we need to walk the kernel page table to map those addresses. Luckily, libkdumpfile already does that. So, if drgn was built with libkdumpfile support, use it for ELF vmcores. Also add an environment variable to override this behavior. Closes #15.	2019-10-02 17:15:36 -07:00
Omar Sandoval	191c5ae253	libelf: clean up SHF_COMPRESSED handling We don't need the ifdef anymore since we're using the elf.h from our local elfutils. We can also fold a leftover nested if.	2019-09-24 17:16:17 -07:00
Omar Sandoval	ca9cdc1991	libdrgn: autogenerate docstrings.h I didn't want to use BUILT_SOURCES before because that would break make $TARGET. But, now that doesn't work anyways because we're using SUBDIRS, so we might as well use BUILT_SOURCES.	2019-09-19 11:08:04 -07:00
Omar Sandoval	aa4bfd646f	libdrgn: simplify gen_constants.py header search Instead of passing in a directory for header files, add -iquote for that directory.	2019-09-19 11:08:04 -07:00
Omar Sandoval	6a13d74c0c	libdrgn: build with bundled elfutils Now that we have the bundled version of elfutils, build it from libdrgn and link to it. We can also get rid of the elfutils version checks from the libdrgn code.	2019-09-19 11:07:12 -07:00
Omar Sandoval	1cedca8ff4	Import elfutils Based on: c950e8a9 config: Fix spec file, add manpages and new GFDL license. With the following patches: configure: Add --disable-programs configure: Add --disable-shared configure: Fix -D_FORTIFY_SOURCE=2 check when CFLAGS contains -Wno-error libcpu: compile i386_lex.c with -Wno-implicit-fallthrough The plan is to stop relying on the distribution's version of elfutils and instead ship our own. This gives us freedom to assume that we're using the latest version and even ship our own patches (starting with a few build system improvements). More details are in scripts/update-elfutils.sh, which was used to generate this commit.	2019-09-05 01:04:33 -07:00
Omar Sandoval	f11a8766bf	setup.py: get list of source files from git Currently, we have a special Makefile target to output the files for a libdrgn source tarball, and we use that for setuptools. However, the next change is going to import elfutils, and it'd be a pain to add the same thing for the elfutils sources. Instead, let's just use git ls-files for everything. The only difference is that source distributions won't have the autoconf/automake output.	2019-09-03 17:19:02 -07:00
Omar Sandoval	a3f4fe0518	libdrgn: handle get_debug_sections() errors per-module There's no reason to fail indexing just because one file is missing debug information.	2019-08-29 12:26:40 -07:00
Omar Sandoval	62d98b3016	libdrgn: fold ELF relocation code into dwarf_index I started with drgn_elf_relocator as a separate interface to parallelize by relocation. However, the final result is parallelized by file, which means that it can be done as part of the main read_cus() loop. Get rid of the elf_relocator interface and do it in dwarf_index.c instead. This means that if/when libdwfl gets faster at ELF relocations, we can rip out the relocation code without any other changes.	2019-08-29 12:26:22 -07:00
Omar Sandoval	698991b27b	Get rid of DRGN_ERROR_{ELF,DWARF}_ERROR and FileFormatError We're too inconsistent with how we use these for them to be useful (and it's impossible to distinguish between a format error and some other error from libelf/libdw/libdwfl), so let's just get rid of them and make it all DRGN_ERROR_OTHER/Exception.	2019-08-15 15:03:42 -07:00
Omar Sandoval	10142f922f	Add basic stack trace support For now, we only support stack traces for the Linux kernel (at least v4.9) on x86-64, and we only support getting the program counter and corresponding function symbol from each stack frame.	2019-08-02 00:26:28 -07:00
Serapheim Dimitropoulos	93d7ea9f01	Add support for kdump-compressed core dumps with libkdumpfile	2019-08-02 00:20:16 -07:00
Omar Sandoval	690b5fd650	libdrgn: generalize architecture to platform For stack trace support, we'll need to have some architecture-specific functionality. drgn's current notion of an architecture doesn't actually include the instruction set architecture. This change expands it to a "platform", which includes the ISA as well as the existing flags.	2019-08-02 00:11:56 -07:00
Omar Sandoval	71e6744210	libdrgn: add symbol table interface Now that we're not overloading the name "symbol", we can define struct drgn_symbol as a symbol table entry. For now, this is very minimal: it's just a name, address, and size. We can then add a way to find the symbol for a given address, drgn_program_find_symbol(). For now, this is only supported through the actual ELF symbol tables. However, in the future, we can probably support adding "symbol finders".	2019-07-30 09:25:34 -07:00
Omar Sandoval	0c5df56fba	libdrgn: replace symbol index with object index struct drgn_symbol doesn't really represent a symbol; it's just an object which hasn't been fully initialized (see `c2be52dff0` ("libdrgn: rename object index to symbol index"), it used to be called a "partial object"). For stack traces, we're going to have a notion of a symbol that more closely represents an ELF symbol, so let's get rid of the temporary struct drgn_symbol representation and just return an object directly.	2019-07-29 17:04:47 -07:00
Omar Sandoval	74bd59e38a	libdrgn: python: get rid of Program._symbol() We can test with Program.object() just as easily, so get rid of this undocumented method.	2019-07-29 17:04:47 -07:00
Omar Sandoval	62ff4e1dba	libdrgn: indicate finder lookup failure with special error Currently, finders indicate a non-fatal lookup error by setting the type member to NULL. This won't work when we replace the symbol finder with an object finder (which shouldn't modify the object on failure). Instead, use a static error for this purpose.	2019-07-29 17:04:47 -07:00
Omar Sandoval	0cb77b303c	libdrgn: work around Clang __muloti4 again See `2dd14ad522` ("libdrgn: work around "undefined reference to '__muloti4'" when using Clang").	2019-07-29 17:03:45 -07:00
Omar Sandoval	b01d1a943f	libdrgn: python: make set_drgn_error() return void * It still always returns NULL, but now we can directly return from functions returning some PyObject subtype.	2019-07-28 00:58:36 -07:00
Omar Sandoval	0a74a610bc	libdrgn: python: only repr() one level of type members Currently, repr() of structure and union types goes arbitrarily deep (except for cycles). However, for lots of real-world types, this is easily deeper than Python's recursion limit, so we can't get a useful repr() at all: >>> repr(prog.type('struct task_struct')) Traceback (most recent call last): File "<console>", line 1, in <module> RecursionError: maximum recursion depth exceeded while getting the repr of an object Instead, only print one level of structure and union types.	2019-07-27 15:04:31 -07:00
Omar Sandoval	d63125f133	libdrgn: python: make Program.object() flags optional Default to FindObjectFlags.ANY.	2019-07-24 11:02:34 -07:00
Omar Sandoval	06cce1baa1	libdrgn: fix typo in drgn_enomem documentation	2019-07-22 17:23:27 -07:00
Omar Sandoval	3e95e88028	libdrgn: vector: protect against overflow when doubling capacity It seems extremely unlikely that we'd actually overflow before we run out of memory, but let's just be safe.	2019-07-19 09:27:20 -07:00
Omar Sandoval	27a27940bc	libdrgn: split up drgn_program_get_dwarf() We don't need to get the DWARF index at the time we get the Dwfl handle, so get rid of drgn_program_get_dwarf(), add drgn_program_get_dwfl(), and create the DWARF index right before we update in a new function, drgn_program_update_dwarf_index().	2019-07-19 09:26:30 -07:00
Omar Sandoval	a17215e984	libdrgn: dwarf_index: fix memcpy() undefined behavior Apparently, it's undefined behavior to pass NULL as the source to memcpy(), even if the length is zero. It's an easy fix, so let's appease UBSan.	2019-07-15 12:27:48 -07:00
Omar Sandoval	1d4854a5bc	libdrgn: implement optimized x86-64 ELF relocations After the libdwfl conversion, we apply ELF relocations with libdwfl instead of our homegrown implementation. However, libdwfl is much slower at it than the previous implementation. We can work around this by (again) applying ELF relocations ourselves for architectures that we care about (x86-64, to start). For other architectures, we can fall back to libdwfl. This new implementation of ELF relocation reworks the parallelization to be per-file rather than per-relocation. The latter was done originally because before commit `6f16ab09d6` ("libdrgn: only apply ELF relocations to relocatable files"), we applied relocations to vmlinux, which is much larger than most kernel modules. Now that we don't do that, it seems to be slightly faster to parallelize by file.	2019-07-15 12:27:48 -07:00
Omar Sandoval	e5874ad18a	libdrgn: use libdwfl libdwfl is the elfutils "DWARF frontend library". It has high-level functionality for looking up symbols, walking stack traces, etc. In order to use this functionality, we need to report our debugging information through libdwfl. For userspace programs, libdwfl has a much better implementation than drgn for automatically finding debug information from a core dump or PID. However, for the kernel, libdwfl has a few issues: - It only supports finding debug information for the running kernel, not vmcores. - It determines the vmlinux address range by reading /proc/kallsyms, which is slow (~70ms on my machine). - If separate debug information isn't available for a kernel module, it finds it by walking /lib/modules/$(uname -r)/kernel; this is repeated for every module. - It doesn't find kernel modules with names containing both dashes and underscores (e.g., aes-x86_64). Luckily, drgn already solved all of these problems, and with some effort, we can keep doing it ourselves and report it to libdwfl. The conversion replaces a bunch of code for dealing with userspace core dump notes, /proc/$pid/maps, and relocations.	2019-07-15 12:27:48 -07:00
Omar Sandoval	a9a2cb7cac	libdrgn: dwarf_index: move bswap from file to compilation unit Remove an indirection.	2019-07-15 12:27:38 -07:00
Omar Sandoval	1c9ab2e7d1	libdrgn: dwarf_index: fix leak of DWARF index entries on failure We're forgetting to unchain new entries which are chained on old entries.	2019-07-15 12:27:36 -07:00
Omar Sandoval	996d3094ef	libdrgn: dwarf_index: fold unindex_files() into index_cus()	2019-07-15 12:27:33 -07:00
Omar Sandoval	b7e1b6ede6	libdrgn: dwarf_index: rename drgn_dwarf_index_iterator_next() output parameter	2019-07-15 12:27:24 -07:00
Omar Sandoval	d423361d8a	libdrgn: dwarf_index: move .debug_str null-termination check Check it right after we read the section instead of when updating the index.	2019-07-15 12:27:18 -07:00
Omar Sandoval	9f9bec4762	libdrgn: use common vector where applicable This converts several open-coded dynamic arrays to the new common vector implementation: - drgn_lexer stack - Array dimension array for DWARF parsing - drgn_program_read_c_string() - DWARF index directory name hashes - DWARF index file name hashes - DWARF index abbreviation table - DWARF index shard entries	2019-07-15 12:27:16 -07:00
Omar Sandoval	8d52536271	libdrgn: add common vector implementation drgn has enough open-coded dynamic arrays at this point to warrant a common implementation. Add one inspired by hash_table.h. The API is pretty minimal. I'll add more to it as the need arises.	2019-07-15 12:27:15 -07:00
Omar Sandoval	e2a27e7f59	libdrgn: add drgn_error_format_os() There are some cases where we format a path (e.g., with asprintf()) and keep it around only in case of errors. Add drgn_error_format_os() so we can just reformat it if we hit the error, which simplifies cleanup.	2019-07-11 16:19:18 -07:00
Omar Sandoval	74c0aa8612	libdrgn: reorder drgn_error_create_os() arguments To make it more consistent with the upcoming drgn_error_format_os().	2019-07-11 16:12:56 -07:00
Omar Sandoval	ce808440f7	libdrgn: move string_builder_line_break() to string_builder.c	2019-07-11 15:33:10 -07:00
Omar Sandoval	0ebcfc8178	libdrgn: use drgn_stop error in DWARF index The (struct drgn_error *)-1 hack predates DRGN_ERROR_STOP. Get rid of the hack.	2019-07-11 10:53:13 -07:00
Omar Sandoval	5b2da5e682	libdrgn: get rid of unnecessary gelf_getshdr() in read_elf_section()	2019-07-11 09:53:56 -07:00
Omar Sandoval	e73346b488	libdrgn: generalize IS_RUNNING_KERNEL flag to IS_LIVE I.e., also flag running processes as live.	2019-07-08 16:55:54 -07:00
Omar Sandoval	129f1493b8	libdrgn: split kernel-specific stuff out of program.c Almost half of program.c is stuff specific to the Linux kernel, so let's separate that out (and combine it with the existing kernel module code).	2019-07-08 16:53:58 -07:00
Omar Sandoval	a0fc02efd3	libdrgn: don't store version in struct compilation_unit We don't use it after checking it.	2019-07-08 16:23:38 -07:00
Omar Sandoval	8a59a7e819	libdrgn: don't preallocate DWARF index memory This doesn't make things any faster in my benchmarks, and it complicates DWARF index initialization.	2019-07-08 16:23:38 -07:00
Omar Sandoval	ec33f9bf73	libdrgn: get rid of DWARF index flags We always index everything, so simplify the code a bit.	2019-07-08 16:23:38 -07:00
Omar Sandoval	426ee1e57c	libdrgn/python: add missing name in Symbol argument parsing	2019-06-29 01:38:36 -07:00
Omar Sandoval	97ebc2a57c	libdrgn/python: add Program.cache For caching metadata between invocations of helpers (e.g., detected kernel version differences or config options).	2019-06-28 16:15:07 -07:00
Omar Sandoval	25e7a9d3b8	libdrgn/python: implement Program.__contains__	2019-06-28 16:02:52 -07:00
Omar Sandoval	f55158c74c	libdrgn: add PAGE_{SHIFT,SIZE,MASK} symbols from vmcoreinfo Since we currently don't parse DWARF macro information, there's no easy way to get the value PAGE_SIZE and friends in drgn. However, vmcoreinfo contains the value of PAGE_SIZE, so let's add a special symbol finder that returns that.	2019-05-29 00:02:48 -07:00
Omar Sandoval	1614b1e6f6	libdrgn: add better vmcoreinfo fallback Currently, if we don't get vmcoreinfo from /proc/kcore, and we can't get it from /sys/kernel/vmcoreinfo, then we manually determine the kernel release and KASLR offset. This has a couple of issues: 1. We look for vmlinux to determine the KASLR offset, which may not be in a standard location. 2. We might want to start using other information from vmcoreinfo which can't be determined as easily. Instead, we can get the virtual address of vmcoreinfo from /proc/kallsyms and read it directly from there.	2019-05-28 15:54:49 -07:00
Omar Sandoval	8e45a305fb	libdrgn: fix file/memory leak in proc_kallsyms_symbol_addr()	2019-05-28 15:01:35 -07:00
Serapheim Dimitropoulos	2396cdca47	libdrgn: add /usr/lib/debug/boot in the vmlinux_paths Ubuntu-based distros tend to put vmlinux with debug info under /usr/lib/debug/boot/vmlinux-<version>.	2019-05-27 17:43:10 -07:00
Omar Sandoval	eeac241c65	libdrgn: make all kernel module iterator errors non-fatal when loading default symbols kernel_module_iterator_next() can also fail in open_loaded_kernel_modules(), so handle it in the same way that we currently handle kernel_module_iterator_init().	2019-05-26 14:58:02 -07:00
Omar Sandoval	68f7b87d6a	libdrgn: ignore physical core dump segments with address -1 /proc/kcore contains segments which don't have a valid physical address, which it indicates with a p_paddr of -1. Skip those segments; otherwise, we get an overflow error from the memory reader.	2019-05-26 14:49:48 -07:00
Omar Sandoval	c0bc72b0ea	libdrgn: use splay tree for memory reader The current array-based memory reader has a bug in the following scenario: prog.add_memory_segment(0xffff0000, 128, ...) # This should replace a subset of the first segment. prog.add_memory_segment(0xffff0020, 32, ...) # This moves the first segment back to the front of the array. prog.read(0xffff0000, 32) # This finds the first segment instead of the second segment. prog.read(0xffff0032, 32) Fix it by using the newly-added splay tree. This also splits up the virtual and physical memory segments into separate trees.	2019-05-24 17:48:08 -07:00
Omar Sandoval	10fb398338	libdrgn: add splay tree implementation This will be used to track memory segments instead of the array we currently use. The API is based on the hash table API; it can support alternative implementations in the future, like red-black trees.	2019-05-24 17:48:08 -07:00
Omar Sandoval	dcddaa2cc1	libdrgn: revamp hash table API This makes several improvements to the hash table API. The first two changes make things more general in order to be consistent with the upcoming binary search tree API: - Items are renamed to entries. - Positions are renamed to iterators. - hash_table_empty() is added. One change makes the definition API more convenient: - It is no longer necessary to pass the types into DEFINE_HASH_{MAP,SET}_FUNCTIONS(). A few changes take some good ideas from the C++ STL: - hash_table_insert() now fails on duplicates instead of overwriting. - hash_table_delete_iterator() returns the next iterator. - hash_table_next() returns an iterator instead of modifying it. One change reduces memory usage: - The lower-level DEFINE_HASH_TABLE() is cleaned up and exposed as an alternative to DEFINE_HASH_MAP() and DEFINE_HASH_SET(). This allows us to get rid of the duplicated key where a hash map value already embeds the key (the DWARF index file table) and gets rid of the need to make a dummy hash set entry to do a search (the pointer and array type caches).	2019-05-24 17:48:05 -07:00
Omar Sandoval	0026eeae66	libdrgn: find kernel module name in kernels < v4.13 If we can't find the module name in .modinfo, fall back to .gnu.linkonce.this_module.	2019-05-14 15:39:16 -07:00
Omar Sandoval	e4b8af7807	libdrgn/python: fix uninitialized variable in Program.load_debug_info() path_converter() reads arg->allow_none, so make sure we zero the array of path arguments.	2019-05-14 14:28:03 -07:00
Omar Sandoval	39876ccbac	libdrgn: find kernel module debuginfo on Debian Debian's linux-image*-dbg packages name the ELF files without the extra .debug suffix that Fedora includes.	2019-05-14 12:42:56 -07:00
Omar Sandoval	ac27f2c1ec	libdrgn: only load debug information from loaded kernel modules Currently, we load debug information for every kernel module that we find under /lib/modules/$(uname -r)/kernel. This has a few issues: 1. Distribution kernels have lots of modules (~3000 for Fedora and Debian). a) This can exceed the default soft limit on the number of open file descriptors. b) The mmap'd debug information can trip the overcommit heuristics and cause OOM kills. c) It can take a long time to parse all of the debug information. 2. Not all modules are under the "kernel" directory; some distros also have an "extra" directory. 3. The user is not made aware of loaded kernel modules that don't have debug information available. So, instead of walking /lib/modules, walk the list of loaded kernel modules and look up their debugging information.	2019-05-14 11:55:39 -07:00
Omar Sandoval	efaec41ca2	libdrgn: add string_builder_append_error()	2019-05-14 11:49:52 -07:00
Omar Sandoval	e21ed988fb	libdrgn: add drgn_error_from_string_builder() And use that instead of exposing drgn_error_create_nodup().	2019-05-14 10:16:01 -07:00
Omar Sandoval	b0f10d3b58	libdrgn: pass enum drgn_error_code to error constructors	2019-05-14 10:13:05 -07:00
Omar Sandoval	f08d4c9a08	libdrgn: make string_builder API return bool It can only fail with no memory, so simplify it.	2019-05-14 10:07:50 -07:00
Omar Sandoval	0135dbd0cc	libdrgn: get loaded module names from /proc/modules when possible Similar to the last optimization, for the running kernel, we can just read /proc/modules instead of walking the kernel data structures.	2019-05-13 18:05:17 -07:00
Omar Sandoval	ed6a6f0b3e	libdrgn: get module section address from sysfs when possible In the running kernel, we don't have to walk the list of modules and module sections, since we can just look it up directly in sysfs.	2019-05-13 18:05:09 -07:00
Omar Sandoval	f11e030aaa	libdrgn: factor out kernel module iteration and section lookup	2019-05-13 16:39:30 -07:00
Omar Sandoval	9b563170f8	libdrgn: make load_debug_info() API saner Rather than exposing the underlying open and load steps of DWARF index, simplify it down to a single load step.	2019-05-13 15:04:27 -07:00
Omar Sandoval	60c9e26ff5	libdrgn: make C string hash table helpers work with char * const char * const * is not compatible with char * const *, so make c_string_hash() and c_string_eq() macros so they can work with both const char * and char * keys.	2019-05-13 11:52:55 -07:00
Omar Sandoval	1206730730	libdrgn: fix hash_set_insert_searched_pos() documentation	2019-05-13 11:52:55 -07:00
Omar Sandoval	ab58a5bff0	libdrgn: determine default size_t and ptrdiff_t more intelligently Currently, size_t and ptrdiff_t default to typedefs of the default unsigned long and long, respectively, regardless of what the program actually defines unsigned long or long as. Instead, make them refer to whatever integer type (long, long long, or int) is the same size as the word size.	2019-05-10 15:14:03 -07:00
Omar Sandoval	baba1ff3f0	libdrgn: make program components pluggable Currently, programs can be created for three main use-cases: core dumps, the running kernel, and a running process. However, internally, the program memory, types, and symbols are pluggable. Expose that as a callback API, which makes it possible to use drgn in much more creative ways.	2019-05-10 12:41:07 -07:00
Omar Sandoval	ac946ba8a7	libdrgn: fix zero-filling reads from core dump segments	2019-05-09 16:35:48 -07:00
Omar Sandoval	15bc0286b9	libdrgn: add drgn_error_create_nodup()	2019-05-09 16:35:14 -07:00
Omar Sandoval	fb10623903	libdrgn: use next_power_of_two() for string_builder	2019-05-09 16:01:07 -07:00
Omar Sandoval	6d7b0631b9	libdrgn: simplify string_builder API Instead of maintaining a null-terminated string, null-terminate just before returning the string as a "finalize" step.	2019-05-09 15:25:58 -07:00
Omar Sandoval	5200a6652c	libdrgn: embed memory reader, type index, and symbol index in program	2019-05-06 14:55:34 -07:00
Omar Sandoval	bb2357bc09	libdrgn: don't require word size for type index initialization	2019-05-06 14:55:34 -07:00
Omar Sandoval	640b1c011d	libdrgn: embed DWARF index in DWARF info cache	2019-05-06 14:55:34 -07:00
Omar Sandoval	2ed8e3148c	libdrgn: get architecture info from core file instead of DWARF index	2019-05-06 14:55:34 -07:00
Omar Sandoval	ba162ac001	libdrgn: remove endianness from type index The type index doesn't need to know or care about endianness. Move it to the program.	2019-05-06 14:55:34 -07:00
Omar Sandoval	565e0343ef	libdrgn: make symbol index pluggable with callbacks The last piece of making the major program components pluggable.	2019-05-06 14:55:34 -07:00
Omar Sandoval	47151c4554	libdrgn/python: add Program.object()	2019-05-06 14:55:34 -07:00
Omar Sandoval	0ea2825817	libdrgn: refactor gen_constants.py All of the constants are generated with basically the same code, so refactor it.	2019-05-06 14:55:34 -07:00
Omar Sandoval	9c6575e783	libdrgn: move relocation hook to drgn_info_cache	2019-05-06 14:55:34 -07:00
Omar Sandoval	52a8681a8d	libdrgn: rename drgn_dwarf_type_cache to drgn_dwarf_info_cache This is preparation for sharing this with the symbol index.	2019-05-06 14:55:34 -07:00
Omar Sandoval	a98445c277	libdrgn: make type index pluggable with callbacks Similar to "libdrgn: make memory reader pluggable with callbacks", we want to support custom type indexes (imagine, e.g., using drgn to parse a binary format). For now, this disables the dwarf index tests; we'll have a better way to test them later, so let's not bother adding more test scaffolding.	2019-05-06 14:55:34 -07:00
Omar Sandoval	77443cecd1	libdrgn: use NULL-terminated arrays for mock {type,symbol} index This will simplify the next couple of changes.	2019-05-06 14:55:34 -07:00
Omar Sandoval	c2be52dff0	libdrgn: rename object index to symbol index An "object index" doesn't actually index objects, but really "partial objects" -- i.e., a type + address. "Symbol" is a better name for this.	2019-05-06 14:55:34 -07:00
Omar Sandoval	417a6f0d76	libdrgn: make memory reader pluggable with callbacks I've been planning to make memory readers pluggable (in order to support use cases like, e.g., reading a core file over the network), but the C-style "inheritance" drgn uses internally is awkward as a library interface; it's much easier to just register a callback. This change effectively makes drgn_memory_reader a mapping from a memory range to an arbitrary callback. As a bonus, this means that read callbacks can be mixed and matched; a part of memory can be in a core file, another part can be in the executable file, and another part could be filled from an arbitrary buffer.	2019-05-06 14:55:34 -07:00
Omar Sandoval	e78e6c9226	libdrgn/python: improve Python exception handling Make {set,clear}_drgn_in_python() handle nested calls, and return a valid error to libdrgn rather than -1.	2019-05-06 14:55:34 -07:00
Omar Sandoval	d0633d2f1a	libdrgn: allow multiple cleanup callbacks For now, this makes things slightly awkward, but it will be necessary for the upcoming changes making drgn_program more pluggable.	2019-05-06 14:55:34 -07:00
Omar Sandoval	043cddf6d8	libdrgn: move member cache to type index It makes more sense here than in struct drgn_program.	2019-05-06 14:55:34 -07:00
Omar Sandoval	3645ce78ea	libdrgn: assume all pointers have same size in type index	2019-05-06 14:55:34 -07:00
Omar Sandoval	06960f591c	libdrgn: look up primitive types on demand Instead of caching all primitive types ahead of time, look them up on demand. This is preparation for making the type index API more flexible.	2019-05-06 14:55:34 -07:00
Omar Sandoval	932b7857b5	libdrgn: expose primitive type concept to public interface Previously known as c_type.	2019-05-06 14:55:34 -07:00
Omar Sandoval	bad1b9b33c	libdrgn/python: use generated docstrings for generated types I missed ProgramFlags, Qualifiers, and TypeKind.	2019-05-06 14:55:34 -07:00
Omar Sandoval	839252564a	libdrgn: deduplicate files in DWARF index Currently, we deduplicate files for userspace mappings manually. However, to prepare for adding symbol files at runtime, move the deduplication to DWARF index. In the future, we probably want to deduplicate based on build ID, as well.	2019-05-06 14:55:34 -07:00
Omar Sandoval	6f16ab09d6	libdrgn: only apply ELF relocations to relocatable files Relocations are only supposed to be applied to ET_REL files, not ET_EXEC files like vmlinux. This hasn't been an issue with the kernel builds that I've tested on because the relocations match the contents of the section. However, on Fedora, the relocation sections don't match, probably because they post-process the binary in some way. This leads to completely bogus debug information being parsed by drgn_dwarf_index. Fix it by only relocating ET_REL files.	2019-05-06 11:09:46 -07:00
Omar Sandoval	4bb36fc150	libdrgn: check for Python development headers at configure time	2019-05-03 14:10:48 -07:00
Omar Sandoval	7282c40a75	libdrgn: fix crash in drgn_object_slice() We need to set the value after we've reinitialized the object, otherwise drgn_object_deinit() may try to free a buffer that we've already overwritten. This also adds a test which triggers the crash.	2019-04-24 17:57:36 -07:00
Omar Sandoval	97f5cf70c6	libdrgn: fix C array and function casting Casting an array or function should first convert the array or function into a pointer.	2019-04-12 16:40:12 -07:00
Omar Sandoval	1db8d11f84	libdrgn: allow void pointer arithmetic This is a GCC extension, but it's used pretty often in practice.	2019-04-12 16:06:24 -07:00
Omar Sandoval	7719d76820	libdrgn: allow comparison of function pointers and incomplete arrays	2019-04-12 15:49:15 -07:00
Omar Sandoval	309dc82789	libdrgn: allow comparing any pointer types in C There's a bug that we don't allow comparisons between void * and other pointer types, so let's fix it by allowing all pointer comparisons regardless of the referenced type. Although this isn't valid by the C standard, GCC and Clang both allow it by default (with a warning).	2019-04-12 15:44:08 -07:00
Omar Sandoval	73090f6128	libdrgn: fix error message for cast to incomplete type The errant type is the one we're trying to cast to, not the object's type. This fixes an abort in drgn_error_incomplete_type().	2019-04-12 13:24:51 -07:00
Omar Sandoval	1b4aee1a55	libdrgn/python: add Program.pointer_type()	2019-04-11 23:35:10 -07:00
Omar Sandoval	435640faf6	Fix some linter errors	2019-04-11 15:51:20 -07:00
Omar Sandoval	393a1f3149	Document with Sphinx drgn has pretty thorough in-program documentation, but it doesn't have a nice overview or introduction to the basic concepts. This commit adds that using Sphinx. In order to avoid documenting everything in two places, the libdrgn bindings have their docstrings generated from the API documentation. The alternative would be to use Sphinx's autodoc extension, but that's not as flexible and would also require building the extension to build the docs. The documentation for the helpers is generated using autodoc and a small custom extension.	2019-04-11 12:48:15 -07:00
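
The overlapping-segment scenario described in commit c0bc72b0ea (a later memory segment replacing part of an earlier one) can be illustrated with a small, self-contained sketch. This is not drgn's implementation: drgn uses a splay tree in C, whereas the hypothetical `SegmentMap` class below keeps a sorted list of non-overlapping ranges and trims older segments when a newer one is added. The class name, its methods, and the string labels are assumptions made purely for illustration.

```python
import bisect


class SegmentMap:
    """Hypothetical address-range map; newer segments override older ones."""

    def __init__(self):
        # Non-overlapping (start, end, label) tuples sorted by start.
        # The end address is exclusive.
        self._segments = []

    def add(self, start, size, label):
        end = start + size
        kept = []
        for seg_start, seg_end, seg_label in self._segments:
            if seg_end <= start or seg_start >= end:
                # No overlap with the new segment: keep unchanged.
                kept.append((seg_start, seg_end, seg_label))
                continue
            # Keep whatever part of the old segment sticks out on either side.
            if seg_start < start:
                kept.append((seg_start, start, seg_label))
            if seg_end > end:
                kept.append((end, seg_end, seg_label))
        kept.append((start, end, label))
        kept.sort()
        self._segments = kept

    def find(self, address):
        # Locate the last segment starting at or before the address.
        starts = [seg_start for seg_start, _, _ in self._segments]
        i = bisect.bisect_right(starts, address) - 1
        if i >= 0:
            _, seg_end, label = self._segments[i]
            if address < seg_end:
                return label
        return None


segments = SegmentMap()
segments.add(0xFFFF0000, 128, "first")
# This replaces a subset of the first segment.
segments.add(0xFFFF0020, 32, "second")
```

With this layout, a lookup inside the replaced range resolves to the newer segment regardless of which segment was accessed most recently, which is the behavior the commit message says the array-based reader failed to provide.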


356 Commits