JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-23 17:53:07 +00:00

Author	SHA1	Message	Date
Omar Sandoval	919e95e2d5	libdrgn: dwarf_info: make drgn_dwarf_index_state::max_threads an int It doesn't really matter, but the return type of omp_get_max_threads() is int. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	b450a7b02b	libdrgn: vector: support using a smaller type for size/capacity For many use cases of vectors, a full size_t isn't necessary, and might even be unnecessary memory overhead. Allow using any unsigned integer type no larger than size_t, but continue to default to size_t. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:14:59 -07:00
Omar Sandoval	74865b2ba8	libdrgn: vector: support storing entries in vector structure Add an inline_size parameter to DEFINE_VECTOR()/DEFINE_VECTOR_TYPE() which specifies how many entries should be stored inline in the struct vector. This is similar to the std::string small string optimization (SSO) [1] and LLVM's SmallVector [2]. It allows avoiding malloc() and a cache miss for small vectors at the cost of an extra couple of branches. vector_steal() is also undefined for vector types with inline entries. 1: https://stackoverflow.com/questions/10315041/meaning-of-acronym-sso-in-the-context-of-stdstring/10319672#10319672 2: https://llvm.org/doxygen/classllvm_1_1SmallVector.html Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:54 -07:00
Omar Sandoval	8d4b607435	libdrgn: add macros for defining types conditionally Add type_if() and typedef_if() to a new header, generics.h. These will be used for the upcoming vector variants. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:54 -07:00
Omar Sandoval	d1a6350bdd	libdrgn: revamp generic vector API The current generic vector API is pretty minimal and exposes its internal members as part of the public interface. This has worked well but prevents us from changing the vector implementation. In particular, I'd like to have "small vector" variants that can store some entries directly in the vector structure, use a smaller integer type for the size and capacity, or both. So, let's make the generated vector type "private" and add accessor functions. This is very verbose in some cases, but it'll grant us much more flexibility. While we're changing every user anyways, let's also make use of _cleanup_(vector_deinit) where possible. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:38 -07:00
Omar Sandoval	edf8845fcc	libdrgn: get rid of compatible type requirement for {min,max}_iconst() This has gotten in the way more than it has helped. I'll probably do the same to min() and max() the next time they annoy me. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:12:11 -07:00
Omar Sandoval	ca810eec66	libdrgn: define auto to __auto_type via autoconf Let's pretend we live in the C23 future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Omar Sandoval	ef4321557d	libdrgn: dwarf_info: fix segfault if .debug_str_offsets is too short If a CU doesn't have a DW_AT_str_offsets_base attribute and the .debug_str_offsets section is too short, then we'll try to dereference a NULL Dwarf_Attribute pointer when reporting the error. Report that case explicitly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Omar Sandoval	245a383501	libdrgn: fix segfault if looking up main language fails If the call to drgn_debug_info_main_language() from drgn_program_set_language_from_main() fails, then the latter needs to bail, not write garbage from the stack into prog->lang, which will crash later. Fixes: `5591d199b1` ("libdrgn: debug_info: split DWARF support into its own file") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Stephen Brennan	16164dbe6e	libdrgn: detect flattened vmcores and raise error The makedumpfile flattened format is occasionally seen by users, but is not read by libkdumpfile and thus unsupported by Drgn. A simple 'reassembly' process is all that is necessary to allow Drgn to open the vmcore, but this fact isn't easily discoverable, resulting in issues like #344. To help users, detect this when we're testing for kdump signatures, and raise an error with reassembly instructions. For further details on the flattened format, consult makedumpfile(8), particularly the sections documenting options -F and -R. Signed-off-by: Stephen Brennan <stephen@brennan.io>	2023-08-16 09:41:26 -07:00
Omar Sandoval	579e68885a	libdrgn: examples: load_debug_info: pass struct drgn_program address to --{pre,post}-exec This is useful for debugging the state of the program after loading debugging information (e.g., debugging drgn with drgn!). For example: load_debug_info --post-exec 'echo drgn -p $1; echo "prog_obj = Object(prog, \"struct drgn_program *\", $2)"; sleep +inf' Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-04 12:49:48 -07:00
Omar Sandoval	56d6631c22	libdrgn: examples: load_debug_info: fix handling of --time vs --post-exec Fix a missing error goto and print the time after the post-exec command. Fixes: `a21355eb69` ("libdrgn: examples: add --pre-exec and --post-exec options to load_debug_info") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-03 10:43:52 -07:00
Omar Sandoval	a21355eb69	libdrgn: examples: add --pre-exec and --post-exec options to load_debug_info These can be used for benchmarking. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-03 01:26:22 -07:00
Omar Sandoval	c8406e1ea0	libdrgn: require semicolon after DEFINE_{HASH,VECTOR,BINARY_SEARCH_TREE}* The lack of a semicolon after these macros has always confused tooling like cscope. We could add semicolons everywhere now, but let's enforce it for the future, too. Let's add a dummy struct forward declaration at the end of each macro that enforces this requirement and also provides a useful error message. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:54:59 -07:00
Omar Sandoval	968abeda56	libdrgn: dwarf_info: fix memcpy() undefined behavior (again) Once again, UBSan has reported the stupid undefined behavior of memcpy() from a NULL source (even with a zero size). In fact, I fixed it in a previous incarnation of this code in commit `a17215e984` ("libdrgn: dwarf_index: fix memcpy() undefined behavior"). Fixes: `0e6a0a5f94` ("libdrgn: dwarf_info: get rid of struct drgn_dwarf_index_pending_cu") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:26:09 -07:00
Omar Sandoval	243f6fb7d5	libdrgn: support value objects with >64-bit integer types The Linux kernel's struct task_struct on AArch64 contains an array of __uint128_t: >>> task = find_task(prog, 1) >>> task.type_ struct task_struct * >>> task.thread.type_ struct thread_struct { struct cpu_context cpu_context; struct { unsigned long tp_value; unsigned long tp2_value; struct user_fpsimd_state fpsimd_state; } uw; enum fp_type fp_type; unsigned int fpsimd_cpu; void *sve_state; void *sme_state; unsigned int vl[2]; unsigned int vl_onexec[2]; unsigned long fault_address; unsigned long fault_code; struct debug_info debug; struct ptrauth_keys_user keys_user; struct ptrauth_keys_kernel keys_kernel; u64 mte_ctrl; u64 sctlr_user; u64 svcr; u64 tpidr2_el0; } >>> task.thread.uw.fpsimd_state.type_ struct user_fpsimd_state { __int128 unsigned vregs[32]; __u32 fpsr; __u32 fpcr; __u32 __reserved[2]; } As a result, printing a task_struct fails: >>> task Traceback (most recent call last): File "<console>", line 1, in <module> File "/host/home/osandov/repos/drgn3/drgn/cli.py", line 140, in _displayhook text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns) NotImplementedError: integer values larger than 64 bits are not yet supported PR #311 suggested treating >64-bit integers as byte arrays for now; I tried an alternate hack of handling >64-bit integers only in the pretty-printing code. Both of these had issues, though. Instead, let's push >64-bit integer support a little further and allow storing "big integer" value objects. We still don't support any operations on them, so this still doesn't complete #170. We store the raw bytes of the value for now, but we'll probably change this if we add support for operations (e.g., to store the value as an mp_limb_t array for GMP). We also print >64-bit integer types in hexadecimal for simplicity. This is inconsistent with the existing behavior of printing in decimal, but more readable. In the future, we might want to add heuristics to decide when to print in decimal vs hexadecimal for all sizes. Closes #311. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:21:46 -07:00
Omar Sandoval	91b26e2338	libdrgn: python: add _cleanup_pydecref_ scope guard We have tons of cleanup code just for calling Py_DECREF(); this is a perfect use case for a scope guard. Add it and use it everywhere that it is straightforward to. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 12:28:42 -07:00
Omar Sandoval	ee51244dc1	libdrgn: add _cleanup_free_ scope guard, no_cleanup_ptr(), and return_ptr() Kevin Svetlitski suggested making use of __attribute__((__cleanup__)) a long time ago, and now that the kernel is doing it, I don't have a good excuse not to. There are surprisingly only a handful of places that it was straightforward to apply it to. A lot of potential uses are thwarted by our policy that out parameters can be clobbered on failure, so that may be something to revisit. Other cleanup guards will probably be more useful, but this is just laying the groundwork for the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 12:26:50 -07:00
Omar Sandoval	3ce37c8002	libdrgn: python: fix creating compound value with 32-bit float member on big-endian This is similar to commit `155ec92ef2` ("libdrgn: fix reading 32-bit float object values on big-endian"). Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 10:39:34 -07:00
Omar Sandoval	0bc79c877a	libdrgn: fix stray bits when reading bytes of bit field Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-01 16:31:17 -07:00
Oleksandr Natalenko	54b1630370	helpers: support `dmesg` with RHEL 7 RHEL 7 kernel still uses `struct log` for a structured kernel log instead of `struct printk_log`, so let's try to support it. Tested on: * `3.10.0-229` * `3.10.0-1160.80.1` * `3.10.0-1160.83.1` This should have no impact on existing supported cases. Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>	2023-07-25 08:52:13 -07:00
Omar Sandoval	470b2e02de	drgn.helpers.linux.net: add skb_shinfo() Closes #335. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-20 10:23:14 -07:00
Omar Sandoval	d572ecbe4f	drgn.helpers.linux.net: add netdev_priv() Closes #334. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 23:15:08 -07:00
Omar Sandoval	50258ad0b4	pre-commit: update Black to 23.7.0 This added one minor style fix. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 14:51:36 -07:00
Omar Sandoval	f27485670a	CONTRIBUTING: add Linux kernel helper guidelines Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 14:48:22 -07:00
Omar Sandoval	9022a01667	docs: update required Sphinx version to 5.3.0 This isn't the latest version, but it's the version that I use locally on Fedora 38. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 13:50:09 -07:00
Omar Sandoval	ece86bd260	docs: document supported architectures and kernel versions Closes #287, closes #288. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 13:45:28 -07:00
Omar Sandoval	55a3ebca6c	libdrgn: dwarf_info: support DWO split DWARF We've addressed all of the smaller differences with GNU Debug Fission and split DWARF 5, so now all that remains is the DWARF index. The general approach is: in drgn_dwarf_index_read_cus(), for each CU, ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file (cached in a hash table in the struct drgn_module), which the DWARF index can work with from there. Additionally, a couple of places (.debug_addr parsing and stack trace local variable lookup) need to be updated to use the correct drgn_elf_file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	c7f1d0d40c	libdrgn: dwarf_info: read CU DIE with libdw in DWARF index Split DWARF is challenging for the DWARF index for a couple of reasons: 1. We need libdw to look up the split files. 2. The file name table comes from the skeleton file, but everything else relevant to the index comes from the split file. (1) requires the index to use libdw to get the CU DIE. Unfortunately, due to the overhead of libdw, this makes the indexing step 5-10% slower. On the plus side, getting the CU DIE upfront simplifies quite a bit: we can read the file name table, compilation directory, and str_offsets base before indexing, which makes supporting (2) possible. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	fc1ee46941	libdrgn: dwarf_info: parse units with dwarf_next_unit() in DWARF index In the next change, we'll need more information about the unit, and there's no benefit to doing it ourselves anymore. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	645950134b	libdrgn: dwarf_info: move file name table parsing code No changes, this just moves the code now so that later changes are more obvious. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	0e6a0a5f94	libdrgn: dwarf_info: get rid of struct drgn_dwarf_index_pending_cu Instead, reuse struct drgn_dwarf_index_cu for the pending CUs. This is mainly so that we can save more information in the pending CU in a later change. It also lets us merge our per-thread pending CU arrays with memcpy() instead of element-by-element, but I didn't measure a performance difference one way or the other. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	05c3b244bf	libdrgn: dwarf_info: handle GNU Debug Fission location lists GNU Debug Fission's location lists are a hybrid of the DWARF 5 and non-split DWARF 4 versions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	9c307a4df4	libdrgn: dwarf_info: handle split DWARF .debug_addr There are a couple of differences with non-split DWARF 5: - DW_AT_addr_base/DW_AT_GNU_addr_base is in the skeleton DIE, so we need to use dwarf_attr_integrate(). - GNU Debug Fission for DWARF 4 doesn't have headers in .debug_addr. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	a78d30e13e	libdrgn: dwarf_info: handle split DWARF in dwarf_module_find_dwarf_scopes() dwarf_module_find_dwarf_scopes() and drgn_dwarf_die_iterator_next() just need to go from skeleton units to split units. We need to use dwarf_cu_info(), which was added in 0.171, which incidentally was when elfutils gained split DWARF support anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	4fa1dfc063	libdrgn: dwarf_info: handle missing DW_AT_loclists_base It seems like GCC omits this for split units when using DWARF 5, intending it to mean the first entry in .debug_loclists. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	28b3e016f9	libdrgn: dwarf_info: handle missing DW_AT_str_offsets_base GNU Debug Fission doesn't have DW_AT_str_offsets_base but does have .debug_str_offsets. GCC doesn't emit DW_AT_str_offsets_base for DWARF 5 split DWARF. In both cases, the default is the first entry in .debug_str_offsets. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	c4ebbc29ca	libdrgn: dwarf_info: fix CU header size computation for GNU Debug Fission dwo_id was added in split DWARF 5; GNU Debug Fission doesn't have it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	69a99dde0d	tests: test DWARF 5 Pick a few tests where the version difference matters rather than running every test twice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	4eb0a8fa85	cli: configure logger Now that drgn is hooked up to log to the logging module, let's configure the logging module to print logs nicely and add a --log-level command line option. This makes the quiet parameter to run_interactive() redundant, so we ignore it now and will remove it in a future release. I'm not sure whether we should expose the log formatter, or maybe run_interactive() should also set up the logger. I may also want to break download progress out into a separate option from --quiet and then make --quiet equivalent to --log-level=none --progress=never. All of that can happen later. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	81c8672d4d	libdrgn: python: log to the standard logging module Rather than coming up with our own, separate logging API for the Python bindings, let's integrate with the logging module. The straightforward part is creating a logger from the C extension and adding a log callback that calls its log() method. However, syncing the log level between the logging module and libdrgn requires monkey patching. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	c1a2792e6a	libdrgn: add simple logging framework Exceptions aren't enough to debug complicated code paths like debug info discovery or stack unwinding. We really need logs for that, so let's add a small logging framework. By default, we log to stderr, but we also provide a way to direct logs to a different file, or even an arbitrary callback so that logs can be directed to the application's logging library of choice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	fa82071618	libdrgn: call blocking hooks around DWARF index DWARF indexing can take a long time; Kevin Svetlitski notes that it can take almost a minute on some large binaries. Let's use the new blocking API around it so that the Python bindings drop the GIL. Closes #247. Suggested-by: Kevin Svetlitski <svetlitski@meta.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	0ad19dc37b	libdrgn: python: set blocking callback to release GIL Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	06a825f315	libdrgn: add API for hooks around blocking operations There are places in drgn where it'd be a good idea to drop the Python GIL. However, some of these are deep inside of libdrgn, where some code paths are fast and dropping the GIL would be extra overhead and others are slow (e.g., type lookups, which may be cached or may require DWARF namespace indexing). Instead of trying to do this from the Python bindings, add hooks to libdrgn. These hooks can be used directly or with a new scope guard macro, drgn_blocking_guard, that we can start sprinkling around in appropriate places in libdrgn. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	5c1b6cf764	docs: document thread safety Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:33:35 -07:00
Omar Sandoval	471e32e906	libdrgn: debug_info: try harder to get debug file path We're getting (null) file paths in error messages (e.g., #233) because libdwfl doesn't always return the debug file path. Fall back to the loaded file path, which is better than nothing until we get rid of libdwfl. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:33:35 -07:00
Omar Sandoval	859b3c5053	vmtest.config: add highmem=off to Arm QEMU config My Arm VM fails to boot on QEMU 7.2.1 after the following sequence of events on the kernel console: pci-host-generic 4010000000.pcie: can't claim ECAM area [mem 0x10000000-0x1fffffff]: address conflict with pcie@10000000 [mem 0x10000000-0x3efeffff] pci-host-generic: probe of 4010000000.pcie failed with error -16 ... 9pnet_virtio: no channels available for device /dev/root VFS: Cannot open root device "" or unknown-block(0,0): error -2 Please append a correct "root=" boot option; here are the available partitions: Can't find any bdev filesystem to be used for mount! Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) Turning off highmem fixes the conflict. (I think this previously worked without highmem=off on Arch Linux, so maybe there's something different in Fedora's QEMU.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-10 10:59:40 -07:00
Omar Sandoval	745217031f	vmtest.config: add 6.5 to supported kernels After a streak of changes required from 6.2-6.4, 6.5 thankfully doesn't need any updates. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-10 10:42:35 -07:00
Omar Sandoval	e901de6137	vmtest.config: work around ppc64 build failure on v6.5-rc1 I sent a patch titled "powerpc/crypto: fix missing skcipher dependency for aes-gcm-p10" [1] to fix the build failure, but we also have no need for this module anyways. 1: https://lists.ozlabs.org/pipermail/linuxppc-dev/2023-July/260369.html Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-10 09:51:51 -07:00
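
The 243f6fb7d5 entry above describes keeping the raw bytes of a >64-bit integer value and printing such values in hexadecimal, while narrower integers keep printing in decimal. The following is only a rough Python sketch of that formatting rule, not drgn's implementation: the function name `format_integer` and its parameters are hypothetical, and the real logic lives in C inside libdrgn.

```python
def format_integer(raw: bytes, byteorder: str = "little", signed: bool = False) -> str:
    """Format an integer value held as raw bytes (hypothetical helper).

    Assumption for illustration: values wider than 64 bits are shown in
    hexadecimal and everything else in decimal, mirroring the behavior
    described in the commit message.
    """
    # Python integers are arbitrary precision, so the raw bytes can be
    # decoded directly regardless of their width.
    value = int.from_bytes(raw, byteorder, signed=signed)
    if len(raw) * 8 > 64:
        return hex(value)
    return str(value)
```

For example, a 16-byte value such as one element of `vregs` would go through the hexadecimal branch, while an 8-byte `u64` would still be printed in decimal.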


1664 Commits
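
The 81c8672d4d entry describes the Python bindings creating a logger and registering a log callback that calls its log() method. A minimal sketch of that forwarding idea, using only the standard `logging` module, might look like the following. The logger name `example.libdrgn` and the function `log_callback` are hypothetical stand-ins; the actual callback is registered from the C extension, and the log-level syncing that the commit says requires monkey patching is not shown here.

```python
import logging

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("example.libdrgn")


def log_callback(level: int, message: str) -> None:
    """Stand-in for a native log callback: forward one message to `logging`."""
    # Skip the call early when the level is disabled for this logger.
    if logger.isEnabledFor(level):
        logger.log(level, "%s", message)
```

With this shape, the application controls output through the usual `logging` configuration (handlers, formatters, levels) rather than through a separate logging API.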