JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-22 17:23:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	d7b14b4575	libdrgn: add cleanup.h to Makefile.am sources Fixes: `ee51244dc1` ("libdrgn: add _cleanup_free_ scope guard, no_cleanup_ptr(), and return_ptr()") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-24 10:33:19 -07:00
Omar Sandoval	32c0ce9f0d	libdrgn: orc_info: don't assume ORC section alignment ORC sections do seem to be aligned as we expected in vmlinux, but not in kernel modules (the kernel's module loader takes care of aligning it properly regardless of the ELF section alignment). This causes stack tracing to fail when the stack trace contains a frame in a kernel module without DWARF, e.g., vmx_do_nmi_irqoff in arch/x86/kvm/vmx/vmenter.S. Fix it by going back to copying instead of assuming alignment. Fixes: `0bb503c6a0` ("libdrgn: orc_info: check ORC section alignment instead of copying") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-21 15:18:26 -07:00
Omar Sandoval	2b1b445315	libdrgn: don't require struct/union/class/enum keyword for C++ type lookups Requiring the elaborated type specifier has been a common source of confusion for people debugging C++ applications with drgn, since this makes it look like the type doesn't exist or debug info was missing. Now that drgn_program_find_type_impl() can look up multiple type kinds in one shot, make c_family_find_type() look up struct, union, class, and enum types in addition to typedefs when it gets a plain identifier and the program is C++. Closes #348. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-21 15:18:26 -07:00
Omar Sandoval	30ecdd901e	libdrgn: allow passing multiple type kinds to type finder function For the next change, we want to look up a name which may have one of multiple type kinds. Make drgn_type_kind_fn in libdrgn take a bitmask of kinds instead of a single kind. We could change the Python bindings to take the same bitmask, or a tuple of drgn.TypeKind, but either would be a breaking API change. For now, let's call the type finder function for each kind in the bitmask instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-21 15:18:26 -07:00
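The compatibility approach described in the commit above (accept a bitmask of kinds, but call a single-kind finder once per set bit) can be sketched roughly as follows. This is only an illustration: `KindMask`, `find_type_any_kind()`, and `toy_finder()` are hypothetical names, not drgn's API, and the bit values are made up.

```python
from enum import IntFlag


class KindMask(IntFlag):
    """Hypothetical kind bitmask; these are not drgn's real constants."""
    STRUCT = 1
    UNION = 2
    CLASS = 4
    ENUM = 8
    TYPEDEF = 16


def find_type_any_kind(finder, kinds, name):
    """Call a single-kind finder once per bit set in `kinds`.

    Returns the first non-None result, or None if no kind matched.
    """
    for kind in KindMask:
        if kinds & kind:
            result = finder(kind, name)
            if result is not None:
                return result
    return None


# Toy single-kind finder backed by a dictionary keyed on (kind, name).
_TYPES = {
    (KindMask.CLASS, "Foo"): "class Foo",
    (KindMask.TYPEDEF, "u64"): "typedef u64",
}


def toy_finder(kind, name):
    return _TYPES.get((kind, name))
```

With this shape, existing single-kind finders keep working unchanged, at the cost of one call per requested kind.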
Omar Sandoval	2c82693d3b	libdrgn: python: add scope guard for PyGILState This was blocking the conversion to _cleanup_pydecref_ in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-21 15:18:26 -07:00
Stephen Brennan	eb83d51175	Add VMCOREINFO to special Linux Kernel objects For Python-based object, type, and symbol finders, the vmcoreinfo is a critical source of information. It can contain addresses necessary for loading certain information (such as kallsyms). Expose this information as a special object. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2023-08-18 22:21:13 -07:00
Omar Sandoval	a657c841d0	libdrgn: dwarf_info: fix crash after DW_CFA_restore_state I botched DW_CFA_restore_state when converting to the new vector API. Fixes: `d1a6350bdd` ("libdrgn: revamp generic vector API") Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-18 14:19:48 -07:00
Omar Sandoval	f4af9b5b1d	libdrgn: set default number of OpenMP threads without hyperthreads In commit `c4a122ead6` ("libdrgn: dwarf_info: scalably index all DIEs per name"), I noted that indexing the Linux kernel was slower with 80 threads than with 8. After experimenting on multiple systems, I determined that the slowdown was because of hyperthreading; limiting the number of threads to the number of cores (not hyperthreads) was usually faster, and never slower. Unfortunately, OpenMP doesn't have a convenient way to express this, so we have to parse sysfs and explicitly specify the number of threads ourselves. Compared to commit `c4a122ead6` ("libdrgn: dwarf_info: scalably index all DIEs per name"), this makes indexing the large C++ application half a second faster, and the Linux kernel 70 ms faster, all while using less resources. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-18 08:33:56 -07:00
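The commit above does not say which sysfs files it parses. As a minimal sketch, assuming the per-CPU `topology/thread_siblings_list` files are the source, the number of physical cores can be estimated by counting distinct sibling sets. The function names here are hypothetical and this is not libdrgn's implementation.

```python
import glob


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0-3,8" into a frozenset of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return frozenset(cpus)


def count_physical_cores(sibling_lists):
    """Count distinct hyperthread sibling sets; each set is one core."""
    cores = {parse_cpu_list(text) for text in sibling_lists}
    cores.discard(frozenset())  # ignore empty or unreadable entries
    return len(cores)


def read_sibling_lists():
    """Read every CPU's sibling list from sysfs (Linux only)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                lists.append(f.read())
        except OSError:
            continue
    return lists
```

On a machine with 4 cores and 2 hyperthreads each, every core's sibling list appears twice (once per hyperthread), so deduplicating the sets yields 4.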
Omar Sandoval	fd6796556d	libdrgn: add _cleanup_fclose_ scope guard Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-17 15:42:13 -07:00
Omar Sandoval	907c5c0d03	libdrgn: util.h: include <limits.h> It's needed for CHAR_BIT in max_decimal_length(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-17 15:40:43 -07:00
Omar Sandoval	bc8514512f	libdrgn: dwarf_info: fix resolving incomplete type in wrong scope find_namespace_containing_die() only looks for DW_TAG_namespace DIEs containing the target DIE, but it also needs to look for nested classes/structs/unions. Consider the following program: namespace ns { class Bar { ... }; class Foo { class Bar { ... }; ... }; }; If we encounter a declaration DIE for ns::Foo::Bar, we'll end up looking for the definition directly in ns and finding ns::Bar instead, which is a completely different type. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:25:06 -07:00
Omar Sandoval	6840f10b03	libdrgn: dwarf_info: index nested classes/structs/unions C++ supports defining classes inside of other classes (as well as structs and unions). In C++, these are accessed with the same scope resolution operator (::) as namespaces. We can now support this by extending drgn_namespace_dwarf_index to index the children of DW_TAG_class_type, DW_TAG_structure_type, and DW_TAG_union_type DIEs the same way it currently does DW_TAG_namespace DIEs. The only complication is that declaration class, struct, and union DIEs can also have children for nested definitions. Rather than mixing declarations in with type definitions, we pretend they're DW_TAG_namespace DIEs. Closes #262. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:24:31 -07:00
Omar Sandoval	c4a122ead6	libdrgn: dwarf_info: scalably index all DIEs per name
We currently deduplicate entries in the DWARF index by (name, tag, file name). We want to add support for looking up nested classes, so this is a problem: not every DIE defining a class also defines all of its nested types, so the one DIE we index may not allow us to find every nested class. Instead, we need to index every DIE with a given name. This sounds horribly expensive, both in terms of CPU and memory, but we can mitigate this in several ways:
- We no longer need to parse the file name table, cache file name hashes, parse DW_AT_decl_file, or store the file name hash for indexed DIEs.
- Instead of storing the tag for each indexed DIE, we can split the DIE map into a map per tag.
- We can store the DIEs matching a name in a vector instead of a linked list.
- We can use the new inline entry and small size variants of vectors.
- We can move struct drgn_namespace_dwarf_index * to a tree separate from the indexed DIEs. After all of these changes, we only need a single uintptr_t per indexed DIE.
- We can get rid of the struct drgn_dwarf_index_pending_die list for a namespace and use the indexed DIEs instead, which are half the size.
- DW_TAG_base_type maps can be assumed to be globally unique, so they can be stored in their own map of one DIE indexed only by name.
- Each thread can independently build the DIE maps without any synchronization to be merged at the end.
Here are some performance results comparing the New version (this commit) to the Old version (commit `16164dbe6e` ("libdrgn: detect flattened vmcores and raise error")). Application is either a large, statically-linked C++ application or the live Linux kernel. Threads is the OMP_NUM_THREADS setting used (the machine used for testing has 80 CPUs). Time is the amount of time it took to load and index debugging information. Anon is the amount of anonymous (e.g., heap) memory used. File is the amount of file memory used.
Application | Threads | Version | Time   | Anon   | File
------------+---------+---------+--------+--------+-------
Large C++   | 80      | New     | 5 s    | 3.5 GB | 1.4 GB
            |         | Old     | 15 s   | 5.2 GB | 1.7 GB
            | 8       | New     | 6.5 s  | 3.4 GB | 1.4 GB
            |         | Old     | 10 s   | 5.2 GB | 1.7 GB
            | 1       | New     | 30 s   | 3.4 GB | 1.4 GB
            |         | Old     | 51 s   | 5.2 GB | 1.7 GB
Linux       | 80      | New     | 270 ms | 128 MB | 300 MB
            |         | Old     | 380 ms | 73 MB  | 326 MB
            | 8       | New     | 240 ms | 115 MB | 300 MB
            |         | Old     | 240 ms | 73 MB  | 326 MB
            | 1       | New     | 700 ms | 87 MB  | 300 MB
            |         | Old     | 800 ms | 73 MB  | 326 MB
The results show that the new approach is almost always faster. For the large C++ application, it is much better for both time and memory usage. For the Linux kernel, it is slightly faster and uses more anonymous memory, although that is partially offset by less file memory. (For the Linux kernel, there is a dip in performance for both approaches from 8 threads to 80 which is worth looking into later.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	4d85f4a7cc	libdrgn: dwarf_info: index specifications per thread In commit `26291647eb` ("libdrgn: dwarf_index: handle DW_AT_specification DIEs with two passes"), I claimed that the specification map didn't need to be sharded "because there typically aren't enough of these in a program to cause contention". This is true for the Linux kernel, but not for large C++ applications. Instead of sharding, though, we can avoid synchronization entirely by having each indexing thread build its own specification map and then merging them at the end. This reduces the time to index one large, statically-linked C++ application from 15 seconds to 8.5 seconds! As expected, it has no significant performance difference for the Linux kernel. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
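The "build one map per thread, merge at the end" pattern from the commit above can be sketched as follows. This only illustrates the structure: libdrgn does this in C with OpenMP, whereas the sketch uses Python threads (which will not show the same speedup), and `index_chunk()` and `build_index()` are hypothetical names.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def index_chunk(chunk):
    """Build a private name -> [offset] map for one chunk of entries.

    Each worker only touches its own dictionary, so no locking is needed.
    """
    local = defaultdict(list)
    for name, offset in chunk:
        local[name].append(offset)
    return local


def build_index(chunks, max_workers=4):
    """Index every chunk in a worker thread, then merge the partial maps."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input chunks.
        partial_maps = list(pool.map(index_chunk, chunks))

    merged = defaultdict(list)
    for partial in partial_maps:  # the merge runs on a single thread
        for name, offsets in partial.items():
            merged[name].extend(offsets)
    return dict(merged)
```

The trade-off is a sequential merge step in exchange for no contention while indexing.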
Omar Sandoval	2ad157a795	libdrgn: dwarf_info: don't store file in DWARF index entries The upcoming rework of the DWARF index needs entries in the DWARF index to be as small as possible. The first thing we can get rid of is the struct drgn_elf_file * in struct drgn_dwarf_index_die and struct drgn_dwarf_specification. Instead, we can sort the struct drgn_dwarf_index_cu_vector index_cus by start address, then do a binary search on the DIE address to find the CU and file containing it. As a result of this change, struct drgn_dwarf_index_die no longer contains enough information for drgn_dwarf_index_get_die() to convert it into a libdw Dwarf_Die. But, after the last two commits, drgn_dwarf_index_get_die() is now always called immediately after drgn_dwarf_index_iterator_next(). So, let's get rid of drgn_dwarf_index_get_die() and make drgn_dwarf_index_iterator_next() return the Dwarf_Die and struct drgn_elf_file *. We offset the cost of the binary search in index_cus by storing the libdw Dwarf_CU in struct drgn_dwarf_index_cu. This allows us to avoid calling dwarf_offdie{,_types}(), which does a (slower) binary tree search to find the Dwarf_CU * anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
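The lookup described in the commit above (sort the CUs by start address, then binary search on the DIE address) might look roughly like this. The tuple layout and the `find_cu()` name are assumptions made for illustration; the real code works on struct drgn_dwarf_index_cu in C.

```python
import bisect


def find_cu(cus, die_addr):
    """Find the CU containing die_addr.

    `cus` is a list of (start, end, file) tuples sorted by start address,
    with `end` exclusive. Returns the matching tuple or None.
    """
    # In real code the list of start addresses would be built once, not per call.
    starts = [cu[0] for cu in cus]
    # Index of the last CU whose start address is <= die_addr.
    i = bisect.bisect_right(starts, die_addr) - 1
    if i >= 0 and die_addr < cus[i][1]:
        return cus[i]
    return None
```

Because the containing CU (and therefore the file) can be recovered from the address alone, each index entry no longer needs its own file pointer.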
Omar Sandoval	35fc19792e	libdrgn: dwarf_info: add drgn_namespace_find_child() DWARF index iterators are used both for DIE lookups and namespace lookups. Split the latter out into its own interface so that we can simplify the former and support an upcoming rework of the DWARF index. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	e537999173	libdrgn: dwarf_info: remove ambiguous incomplete type check When we encounter an incomplete struct, union, class, or enum type, we try to find the complete definition by name. We also try to detect whether the name is ambiguous, i.e., whether there are multiple distinct types with that name. This is based on the DWARF index's deduplication by filename: if the index contains more than one DIE matching the (name, tag), then the type name was defined in more than one file, and therefore it is ambiguous. However, this breaks if the exact same definition came from different paths. For example, a Linux kernel module built out-of-tree may use different paths than the original kernel build. Other scenarios involving the compilation directory could also affect this. Furthermore, this check won't be feasible with an upcoming rework of the DWARF index. Let's drop the check and return the first match regardless of other matches. Hopefully it doesn't matter too much in practice. If the wrong type is returned, it can be worked around by casting to the correct type looked up by filename. Closes #186. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	147178b01a	libdrgn: dwarf_info: fail hard instead of rolling back on error during DWARF indexing The upcoming DWARF index rework will make it too difficult to roll back in the middle of DWARF indexing. It also doesn't make sense for the planned module API. Let's chuck that code and instead save the error to return forever like we do for index_namespace(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	919e95e2d5	libdrgn: dwarf_info: make drgn_dwarf_index_state::max_threads an int It doesn't really matter, but the return type of omp_get_max_threads() is int. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:15:03 -07:00
Omar Sandoval	b450a7b02b	libdrgn: vector: support using a smaller type for size/capacity For many use cases of vectors, a full size_t isn't necessary, and might even be unnecessary memory overhead. Allow using any unsigned integer type no larger than size_t, but continue to default to size_t. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:14:59 -07:00
Omar Sandoval	74865b2ba8	libdrgn: vector: support storing entries in vector structure Add an inline_size parameter to DEFINE_VECTOR()/DEFINE_VECTOR_TYPE() which specifies how many entries should be stored inline in the struct vector. This is similar to the std::string small string optimization (SSO) [1] and LLVM's SmallVector [2]. It allows avoiding malloc() and a cache miss for small vectors at the cost of an extra couple of branches. vector_steal() is also undefined for vector types with inline entries. 1: https://stackoverflow.com/questions/10315041/meaning-of-acronym-sso-in-the-context-of-stdstring/10319672#10319672 2: https://llvm.org/doxygen/classllvm_1_1SmallVector.html Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:54 -07:00
Omar Sandoval	8d4b607435	libdrgn: add macros for defining types conditionally Add type_if() and typedef_if() to a new header, generics.h. These will be used for the upcoming vector variants. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:54 -07:00
Omar Sandoval	d1a6350bdd	libdrgn: revamp generic vector API The current generic vector API is pretty minimal and exposes its internal members as part of the public interface. This has worked well but prevents us from changing the vector implementation. In particular, I'd like to have "small vector" variants that can store some entries directly in the vector structure, use a smaller integer type for the size and capacity, or both. So, let's make the generated vector type "private" and add accessor functions. This is very verbose in some cases, but it'll grant us much more flexibility. While we're changing every user anyways, let's also make use of _cleanup_(vector_deinit) where possible. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:13:38 -07:00
Omar Sandoval	edf8845fcc	libdrgn: get rid of compatible type requirement for {min,max}_iconst() This has gotten in the way more than it has helped. I'll probably do the same to min() and max() the next time they annoy me. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 14:12:11 -07:00
Omar Sandoval	ca810eec66	libdrgn: define auto to __auto_type via autoconf Let's pretend we live in the C23 future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Omar Sandoval	ef4321557d	libdrgn: dwarf_info: fix segfault if .debug_str_offsets is too short If a CU doesn't have a DW_AT_str_offsets_base attribute and the .debug_str_offsets section is too short, then we'll try to dereference a NULL Dwarf_Attribute pointer when reporting the error. Report that case explicitly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Omar Sandoval	245a383501	libdrgn: fix segfault if looking up main language fails If the call to drgn_debug_info_main_language() from drgn_program_set_language_from_main() fails, then the latter needs to bail, not write garbage from the stack into prog->lang, which will crash later. Fixes: `5591d199b1` ("libdrgn: debug_info: split DWARF support into its own file") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-16 10:32:39 -07:00
Stephen Brennan	16164dbe6e	libdrgn: detect flattened vmcores and raise error The makedumpfile flattened format is occasionally seen by users, but is not read by libkdumpfile and thus unsupported by Drgn. A simple 'reassembly' process is all that is necessary to allow Drgn to open the vmcore, but this fact isn't easily discoverable, resulting in issues like #344. To help users, detect this when we're testing for kdump signatures, and raise an error with reassembly instructions. For further details on the flattened format, consult makedumpfile(8), particularly the sections documenting options -F and -R. Signed-off-by: Stephen Brennan <stephen@brennan.io>	2023-08-16 09:41:26 -07:00
Omar Sandoval	579e68885a	libdrgn: examples: load_debug_info: pass struct drgn_program address to --{pre,post}-exec This is useful for debugging the state of the program after loading debugging information (e.g., debugging drgn with drgn!). For example: load_debug_info --post-exec 'echo drgn -p $1; echo "prog_obj = Object(prog, \"struct drgn_program *\", $2)"; sleep +inf' Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-04 12:49:48 -07:00
Omar Sandoval	56d6631c22	libdrgn: examples: load_debug_info: fix handling of --time vs --post-exec Fix a missing error goto and print the time after the post-exec command. Fixes: `a21355eb69` ("libdrgn: examples: add --pre-exec and --post-exec options to load_debug_info") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-03 10:43:52 -07:00
Omar Sandoval	a21355eb69	libdrgn: examples: add --pre-exec and --post-exec options to load_debug_info These can be used for benchmarking. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-03 01:26:22 -07:00
Omar Sandoval	c8406e1ea0	libdrgn: require semicolon after DEFINE_{HASH,VECTOR,BINARY_SEARCH_TREE}* The lack of a semicolon after these macros has always confused tooling like cscope. We could add semicolons everywhere now, but let's enforce it for the future, too. Let's add a dummy struct forward declaration at the end of each macro that enforces this requirement and also provides a useful error message. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:54:59 -07:00
Omar Sandoval	968abeda56	libdrgn: dwarf_info: fix memcpy() undefined behavior (again) Once again, UBSan has reported the stupid undefined behavior of memcpy() from a NULL source (even with a zero size). In fact, I fixed it in a previous incarnation of this code in commit `a17215e984` ("libdrgn: dwarf_index: fix memcpy() undefined behavior"). Fixes: `0e6a0a5f94` ("libdrgn: dwarf_info: get rid of struct drgn_dwarf_index_pending_cu") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:26:09 -07:00
Omar Sandoval	243f6fb7d5	libdrgn: support value objects with >64-bit integer types
The Linux kernel's struct task_struct on AArch64 contains an array of __uint128_t:
  >>> task = find_task(prog, 1)
  >>> task.type_
  struct task_struct *
  >>> task.thread.type_
  struct thread_struct {
          struct cpu_context cpu_context;
          struct {
                  unsigned long tp_value;
                  unsigned long tp2_value;
                  struct user_fpsimd_state fpsimd_state;
          } uw;
          enum fp_type fp_type;
          unsigned int fpsimd_cpu;
          void *sve_state;
          void *sme_state;
          unsigned int vl[2];
          unsigned int vl_onexec[2];
          unsigned long fault_address;
          unsigned long fault_code;
          struct debug_info debug;
          struct ptrauth_keys_user keys_user;
          struct ptrauth_keys_kernel keys_kernel;
          u64 mte_ctrl;
          u64 sctlr_user;
          u64 svcr;
          u64 tpidr2_el0;
  }
  >>> task.thread.uw.fpsimd_state.type_
  struct user_fpsimd_state {
          __int128 unsigned vregs[32];
          __u32 fpsr;
          __u32 fpcr;
          __u32 __reserved[2];
  }
As a result, printing a task_struct fails:
  >>> task
  Traceback (most recent call last):
    File "<console>", line 1, in <module>
    File "/host/home/osandov/repos/drgn3/drgn/cli.py", line 140, in _displayhook
      text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns)
  NotImplementedError: integer values larger than 64 bits are not yet supported
PR #311 suggested treating >64-bit integers as byte arrays for now; I tried an alternate hack of handling >64-bit integers only in the pretty-printing code. Both of these had issues, though. Instead, let's push >64-bit integer support a little further and allow storing "big integer" value objects. We still don't support any operations on them, so this still doesn't complete #170. We store the raw bytes of the value for now, but we'll probably change this if we add support for operations (e.g., to store the value as an mp_limb_t array for GMP). We also print >64-bit integer types in hexadecimal for simplicity. This is inconsistent with the existing behavior of printing in decimal, but more readable.
In the future, we might want to add heuristics to decide when to print in decimal vs hexadecimal for all sizes. Closes #311. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:21:46 -07:00
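The formatting rule from the commit above (keep the raw bytes, print integer types wider than 64 bits in hexadecimal and everything else in decimal) can be illustrated with a small sketch. `format_integer()` is a hypothetical helper, and details such as padding or how drgn renders negative values are not specified in the commit message and are not modeled here.

```python
def format_integer(raw, byteorder="little", signed=False):
    """Format an integer stored as raw bytes.

    Types wider than 64 bits are printed in hexadecimal; narrower types
    keep the decimal representation.
    """
    value = int.from_bytes(raw, byteorder, signed=signed)
    if len(raw) * 8 > 64:
        return hex(value)
    return str(value)


# A 128-bit value such as one element of vregs[32] is shown in hex,
# while a 64-bit value is still shown in decimal.
wide = (0xDEADBEEF << 64).to_bytes(16, "little")
narrow = (255).to_bytes(8, "little")
```

The decision depends on the width of the type (the length of the raw buffer), not on the magnitude of the value, so a small number stored in a 128-bit type is still printed in hex.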
Omar Sandoval	91b26e2338	libdrgn: python: add _cleanup_pydecref_ scope guard We have tons of cleanup code just for calling Py_DECREF(); this is a perfect use case for a scope guard. Add it and use it everywhere that it is straightforward to. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 12:28:42 -07:00
Omar Sandoval	ee51244dc1	libdrgn: add _cleanup_free_ scope guard, no_cleanup_ptr(), and return_ptr() Kevin Svetlitski suggested making use of __attribute__((__cleanup__)) a long time ago, and now that the kernel is doing it, I don't have a good excuse not to. There are surprisingly only a handful of places that it was straightforward to apply it to. A lot of potential uses are thwarted by our policy that out parameters can be clobbered on failure, so that may be something to revisit. Other cleanup guards will probably be more useful, but this is just laying the groundwork for the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 12:26:50 -07:00
Omar Sandoval	3ce37c8002	libdrgn: python: fix creating compound value with 32-bit float member on big-endian This is similar to commit `155ec92ef2` ("libdrgn: fix reading 32-bit float object values on big-endian"). Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 10:39:34 -07:00
Omar Sandoval	0bc79c877a	libdrgn: fix stray bits when reading bytes of bit field Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-01 16:31:17 -07:00
Omar Sandoval	55a3ebca6c	libdrgn: dwarf_info: support DWO split DWARF We've addressed all of the smaller differences with GNU Debug Fission and split DWARF 5, so now all that remains is the DWARF index. The general approach is: in drgn_dwarf_index_read_cus(), for each CU, ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file (cached in a hash table in the struct drgn_module), which the DWARF index can work with from there. Additionally, a couple of places (.debug_addr parsing and stack trace local variable lookup) need to be updated to use the correct drgn_elf_file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	c7f1d0d40c	libdrgn: dwarf_info: read CU DIE with libdw in DWARF index Split DWARF is challenging for the DWARF index for a couple of reasons: 1. We need libdw to look up the split files. 2. The file name table comes from the skeleton file, but everything else relevant to the index comes from the split file. (1) requires the index to use libdw to get the CU DIE. Unfortunately, due to the overhead of libdw, this makes the indexing step 5-10% slower. On the plus side, getting the CU DIE upfront simplifies quite a bit: we can read the file name table, compilation directory, and str_offsets base before indexing, which makes supporting (2) possible. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	fc1ee46941	libdrgn: dwarf_info: parse units with dwarf_next_unit() in DWARF index In the next change, we'll need more information about the unit, and there's no benefit to doing it ourselves anymore. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	645950134b	libdrgn: dwarf_info: move file name table parsing code No changes, this just moves the code now so that later changes are more obvious. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	0e6a0a5f94	libdrgn: dwarf_info: get rid of struct drgn_dwarf_index_pending_cu Instead, reuse struct drgn_dwarf_index_cu for the pending CUs. This is mainly so that we can save more information in the pending CU in a later change. It also lets us merge our per-thread pending CU arrays with memcpy() instead of element-by-element, but I didn't measure a performance difference one way or the other. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	05c3b244bf	libdrgn: dwarf_info: handle GNU Debug Fission location lists GNU Debug Fission's location lists are a hybrid of the DWARF 5 and non-split DWARF 4 versions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 10:10:08 -07:00
Omar Sandoval	9c307a4df4	libdrgn: dwarf_info: handle split DWARF .debug_addr There are a couple of differences with non-split DWARF 5: - DW_AT_addr_base/DW_AT_GNU_addr_base is in the skeleton DIE, so we need to use dwarf_attr_integrate(). - GNU Debug Fission for DWARF 4 doesn't have headers in .debug_addr. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	a78d30e13e	libdrgn: dwarf_info: handle split DWARF in dwarf_module_find_dwarf_scopes() dwarf_module_find_dwarf_scopes() and drgn_dwarf_die_iterator_next() just need to go from skeleton units to split units. We need to use dwarf_cu_info(), which was added in 0.171, which incidentally was when elfutils gained split DWARF support anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	4fa1dfc063	libdrgn: dwarf_info: handle missing DW_AT_loclists_base It seems like GCC omits this for split units when using DWARF 5, intending it to mean the first entry in .debug_loclists. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	28b3e016f9	libdrgn: dwarf_info: handle missing DW_AT_str_offsets_base GNU Debug Fission doesn't have DW_AT_str_offsets_base but does have .debug_str_offsets. GCC doesn't emit DW_AT_str_offsets_base for DWARF 5 split DWARF. In both cases, the default is the first entry in .debug_str_offsets. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	c4ebbc29ca	libdrgn: dwarf_info: fix CU header size computation for GNU Debug Fission dwo_id was added in split DWARF 5; GNU Debug Fission doesn't have it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-19 09:59:31 -07:00
Omar Sandoval	81c8672d4d	libdrgn: python: log to the standard logging module Rather than coming up with our own, separate logging API for the Python bindings, let's integrate with the logging module. The straightforward part is creating a logger from the C extension and adding a log callback that calls its log() method. However, syncing the log level between the logging module and libdrgn requires monkey patching. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	c1a2792e6a	libdrgn: add simple logging framework Exceptions aren't enough to debug complicated code paths like debug info discovery or stack unwinding. We really need logs for that, so let's add a small logging framework. By default, we log to stderr, but we also provide a way to direct logs to a different file, or even an arbitrary callback so that logs can be directed to the application's logging library of choice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	fa82071618	libdrgn: call blocking hooks around DWARF index DWARF indexing can take a long time; Kevin Svetlitski notes that it can take almost a minute on some large binaries. Let's use the new blocking API around it so that the Python bindings drop the GIL. Closes #247. Suggested-by: Kevin Svetlitski <svetlitski@meta.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	0ad19dc37b	libdrgn: python: set blocking callback to release GIL Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
Omar Sandoval	06a825f315	libdrgn: add API for hooks around blocking operations There are places in drgn where it'd be a good idea to drop the Python GIL. However, some of these are deep inside of libdrgn, where some code paths are fast and dropping the GIL would be extra overhead and others are slow (e.g., type lookups, which may be cached or may require DWARF namespace indexing). Instead of trying to do this from the Python bindings, add hooks to libdrgn. These hooks can be used directly or with a new scope guard macro, drgn_blocking_guard, that we can start sprinkling around in appropriate places in libdrgn. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:47:34 -07:00
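The hook-plus-scope-guard idea from the commit above can be sketched in Python with a context manager. This is only an analogy: `set_blocking_hooks()` and `blocking_guard()` are hypothetical names, not libdrgn's C API or drgn's Python API.

```python
from contextlib import contextmanager

# Hypothetical global hooks; a binding layer would install functions here,
# for example ones that release and reacquire an interpreter lock.
_begin_hook = None
_end_hook = None


def set_blocking_hooks(begin, end):
    """Install callbacks to run before and after a blocking operation."""
    global _begin_hook, _end_hook
    _begin_hook = begin
    _end_hook = end


@contextmanager
def blocking_guard():
    """Run the begin hook now and the end hook when the scope exits.

    Whatever the begin hook returns is passed back to the end hook, and
    the end hook runs even if the guarded code raises.
    """
    state = _begin_hook() if _begin_hook is not None else None
    try:
        yield
    finally:
        if _end_hook is not None:
            _end_hook(state)
```

Library code then wraps only the slow paths in `with blocking_guard():`, so fast paths pay no extra cost and the caller does not need to know which operations may block.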
Omar Sandoval	5c1b6cf764	docs: document thread safety Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:33:35 -07:00
Omar Sandoval	471e32e906	libdrgn: debug_info: try harder to get debug file path We're getting (null) file paths in error messages (e.g., #233) because libdwfl doesn't always return the debug file path. Fall back to the loaded file path, which is better than nothing until we get rid of libdwfl. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-18 12:33:35 -07:00
Omar Sandoval	0bcef5b77f	libdrgn: dwarf_info: get byte order from passed file in drgn_eval_cfi_dwarf_expression() Commit `18b12a5c7b` ("libdrgn: get .eh_frame from the correct file") missed this, but it's unlikely to matter in practice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-07 15:33:44 -07:00
Omar Sandoval	c76f25b852	libdrgn: dwarf_info: ignore DW_OP_{,GNU_}entry_value These opcodes appear in practice, and we choke on them with an exception like "unknown DWARF expression opcode 0xf3" or "unknown DWARF expression opcode 0xa3". In some cases, it'd be possible to recover the entry value by looking at call site information, but that's pretty involved. For now, just treat these operations as optimized out so we stop failing hard. Closes #233. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-06 22:00:25 -07:00
Omar Sandoval	916a7217fb	libdrgn: dwarf_info: don't call dwarf_dieoffset() redundantly When we get the DIE from the offset with dwarf_offdie(), there's no need to go back to the offset with dwarf_dieoffset(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-06 13:56:00 -07:00
Omar Sandoval	b5018aa913	libdrgn: dwarf_info: only iterate necessary DIE subtrees in drgn_module_find_dwarf_scopes() Thierry found that as soon as drgn_module_find_dwarf_scopes() finds any DIE containing the PC, it walks the entire subtree rooted at that DIE. However, we only need to look at the immediate children of a DIE containing the PC. I think this is what I originally intended, but I failed to reset the children flag to false when the last DIE didn't contain the PC. Thierry's suggested check of it.dies.size == subtree is simpler. This is a massive performance improvement: for a kernel core dump with 10k threads, getting the stack trace of every thread took ~90 seconds without this fix and ~50 seconds with it. Let's also add a comment to this very subtle code. Fixes: `d8d4157346` ("libdrgn: debug_info: add drgn_debug_info_module_find_dwarf_scopes()") Co-authored-by: Thierry Treyer <ttreyer@fb.com> Signed-off-by: Thierry Treyer <ttreyer@fb.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-07-06 11:00:16 -07:00
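The fix described above (descend only into the children of a DIE that contains the PC, rather than walking its whole subtree) can be illustrated on a toy tree. This is a simplified sketch: the `Die` class and `find_scopes()` are hypothetical, every node is assumed to have a single contiguous address range, and the real code iterates libdw DIEs in C.

```python
from dataclasses import dataclass, field


@dataclass
class Die:
    """Toy DIE with one [low, high) address range and child DIEs."""
    name: str
    low: int
    high: int
    children: list = field(default_factory=list)

    def contains(self, pc):
        return self.low <= pc < self.high


def find_scopes(root, pc):
    """Return the names of the nested scopes containing pc, outermost first.

    At each level only the immediate children are examined; the search
    descends only into a child that contains pc.
    """
    scopes = []
    node = root if root.contains(pc) else None
    while node is not None:
        scopes.append(node.name)
        node = next((c for c in node.children if c.contains(pc)), None)
    return scopes


example = Die("cu", 0, 100, [
    Die("f", 0, 50, [Die("block", 10, 20)]),
    Die("g", 50, 100, [Die("inner", 60, 70)]),
])
```

For a PC inside `g`, the children of `f` are never visited, which is where the savings come from on large trees.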
Omar Sandoval	7cb3e99b23	libdrgn: program: find crashed task with cpu_curr() instead of find_task() s390x populates the pid field in NT_PRSTATUS with the CPU number plus 1 [1] instead of the PID of the task that was running on that CPU. This means that we get the wrong task_struct from drgn_program_find_thread() in drgn_program_kernel_core_dump_cache_crashed_thread(), or don't find the task_struct and crash because of a missing NULL check. We can work around this and also gracefully handle the normal and idle cases by instead getting the current task_struct from the CPU runqueue. This is slightly racy: rq->curr is updated in __schedule() [2] before the registers and stack are switched in context_switch() [3]. However, it was already racy, since the pid field in NT_PRSTATUS is populated from current, which is updated after the registers and stack are switched (at least on x86-64) [4]. Closes #314. 1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/s390/kernel/crash_dump.c?h=v6.4#n309 2: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?h=v6.4#n6646 3: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?h=v6.4#n5343 4: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/process_64.c?h=v6.4#n621 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-29 16:16:27 -07:00
Omar Sandoval	cc0994a010	drgn.helpers.linux.sched: add cpu_curr() helper This will be used internally, but it's also a nice shortcut for per_cpu(prog["runqueues"], cpu). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-29 15:58:52 -07:00
Omar Sandoval	5057308c0f	drgn 0.0.23 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-28 13:59:18 -07:00
Omar Sandoval	0d6438d994	libdrgn: orc_info: use .orc_header to detect version My kernel patch was merged for Linux 6.4 and backported to 6.3.10, so now we can use the .orc_header section to reliably detect the ORC format version. Since the 6.4 release candidates and older versions of 6.3 don't have .orc_header, we'll keep the version check as a fallback. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-28 11:10:18 -07:00
Omar Sandoval	91ede0c6a4	libdrgn: orc_info: handle ORC changes in Linux 6.3 and 6.4 The ORC format changed twice recently: - Linux kernel commit ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC metadata") (in v6.3). - Linux kernel commit fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two") (in v6.4). The former went unnoticed because the change was subtle, and the latter completely broke x86-64 kernel stack traces. To handle this, let's "upgrade" the format to the latest version when we load and sort the ORC information. This is more work upfront but avoids needing to handle the version differences every time we use ORC to unwind. Unfortunately, ORC currently doesn't have any sort of versioning, so we have to break the rule of not checking kernel versions. However, I have a kernel patch pending merging that should fix this for the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	fc47ec1b78	libdrgn: add prog pointer to struct drgn_module The next commit needs this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	3085259d82	libdrgn: orc_info: use unsigned int instead of size_t for num_entries It's unrealistic for there to be more than 4 billion ORC entries. Switch to an unsigned int. The main benefit is that the indices array that we use to sort the parallel arrays of entries and pc_offsets becomes half the size, which also makes parsing ORC about 10% faster (down from ~5 ms to ~4.5 ms for the Fedora vmlinux on my laptop). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
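Sorting two parallel arrays through an array of indices, as the commit above describes, can be sketched as follows. This is an illustrative model only: `sort_parallel()` is a hypothetical name, and `array("I")` stands in for an `unsigned int` index array, which on common platforms uses less memory per index than a 64-bit one.

```python
from array import array


def sort_parallel(pc_offsets, entries):
    """Sort two parallel sequences by pc_offsets using an index array."""
    # One unsigned int index per entry instead of a pointer-sized index.
    indices = array("I", range(len(pc_offsets)))
    order = sorted(indices, key=pc_offsets.__getitem__)
    # Apply the same permutation to both sequences so they stay parallel.
    return [pc_offsets[i] for i in order], [entries[i] for i in order]


sorted_offsets, sorted_entries = sort_parallel([30, 10, 20], ["c", "a", "b"])
```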
Omar Sandoval	dd08658a6e	libdrgn: don't cache ORC sections in struct drgn_elf_file .orc_unwind_ip and .orc_unwind are only referenced while initially parsing ORC data and then never touched again, so it's wasteful to cache them in struct drgn_elf_file. Look them up if and when we parse the ORC data instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	0bb503c6a0	libdrgn: orc_info: check ORC section alignment instead of copying In practice, the .orc_unwind and .orc_unwind_ip sections will always be suitably aligned. Check it, then assume the alignment later. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	8526b86644	libdrgn: linux_kernel: get slightly smaller code for kernel_module_iterator_next() By using the same temporary objects in the Linux 6.4 branch as the pre-6.4 branch, we get slightly better code generation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-22 15:27:39 -07:00
Omar Sandoval	2ee625fc74	libdrgn: handle DWARF sections exactly* like libdw We only support .debug_* sections, but libdw also supports .zdebug_, .debug_.dwo, and .gnu.debuglto_.debug_*. Mimic how libdw chooses debug sections, with one exception: .debug_cu_index and .debug_tu_index (used for DWP, which we don't support yet but will) should be considered DWO sections (this needs to be fixed in libdw, too). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-20 13:45:04 -07:00
Omar Sandoval	cff9b6185c	libdrgn: fix typo in ORC unwinder handling of ORC_REG_SP_INDIRECT ORC_REG_SP_INDIRECT is supposed to be an indirect access via rsp, but we have a typo and are using rbp instead. This is a partial fix for #304. Fixes: `630d39e345` ("libdrgn: add ORC unwinder") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-15 13:03:10 -07:00
Omar Sandoval	e2e2ebc317	libdrgn: fix Linux kernel crashed_thread() on non-x86 architectures We currently use crashing_cpu to determine the thread that caused a kernel crash. However, crashing_cpu is x86-specific (it is defined in arch/x86/kernel/reboot.c). Since Linux 4.5, the generic panic code defines a very similar variable, panic_cpu. Use that instead so that we support all architectures, but fall back to crashing_cpu to support older kernels on x86 (even though we don't claim to support 4.4 anymore). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-15 07:56:19 -07:00
Omar Sandoval	772492838f	drgn.helpers.linux.mm: add arbitrary address translation helpers follow_{page,pfn,phys}() translate the virtual address by walking the page table for a given mm_struct (built on top of the existing page table iterator interface). vmalloc_to_page() and vmalloc_to_pfn() are special cases for vmalloc addresses. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-06-02 23:40:38 -07:00
Stephen Brennan	ce8b2938e6	libdrgn: linux_kernel: Fix compiler warning With GCC 13.1.1 and the recommended build setup (CONFIGURE_FLAGS="--enable-compiler-warnings=error"), I get the following failure: In function 'linux_kernel_get_vmemmap', inlined from 'linux_kernel_object_find' at ../../libdrgn/linux_kernel_object_find.inc.strswitch:34:12: ../../libdrgn/linux_kernel.c:370:23: error: 'address' may be used uninitialized [-Werror=maybe-uninitialized] 370 \| err = drgn_object_set_unsigned(&prog->vmemmap, qualified_type, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 371 \| address, 0); \| ~~~~~~~~~~~ ../../libdrgn/linux_kernel.c: In function 'linux_kernel_object_find': ../../libdrgn/linux_kernel.c:361:26: note: 'address' was declared here 361 \| uint64_t address; \| ^~~~~~~ cc1: all warnings being treated as errors While linux_kernel_get_vmemmap_address should always update address in a non-error case, the compiler seems to disagree. It's easy enough to shut up the compiler by initializing address to 0. What's more, if there is an actual issue where the linux_kernel_get_vmemmap_address does NOT update the address variable, a 0 value will be easier to debug than garbage from an uninitialized variable. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2023-06-01 14:49:34 -07:00
Stephen Brennan	49d6bfdb24	Fix test failure on Python 3.12 (fixes #298) Running tests on Python 3.12, we get: test_int (tests.test_language_c.TestLiteral.test_int) ... python3.12: /usr/include/python3.12/object.h:215: Py_SIZE: Assertion `ob->ob_type != &PyLong_Type' failed. Aborted (core dumped) We're relying on an implementation detail to check whether the object is negative. Instead, catch an overflow error, negate and try again. Genuine overflows will still overflow on the second time, but negative numbers will succeed. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2023-06-01 14:23:51 -07:00
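The "catch an overflow error, negate and try again" approach can be modeled in pure Python. The actual fix is in drgn's C code; the sketch below only illustrates the control flow, using `int.to_bytes()` with `signed=False` as a stand-in for an unsigned conversion that raises `OverflowError`. The function name `sign_and_magnitude()` and the 8-byte default are assumptions made for the example.

```python
from __future__ import annotations


def sign_and_magnitude(value: int, size: int = 8) -> tuple[bool, int]:
    """Split value into (is_negative, magnitude) without inspecting int internals."""
    try:
        raw = value.to_bytes(size, "little", signed=False)
        negative = False
    except OverflowError:
        # Either value is negative or it is too large for `size` bytes.
        # Negate and retry: a negative value now fits, while a genuine
        # overflow raises OverflowError again and propagates to the caller.
        raw = (-value).to_bytes(size, "little", signed=False)
        negative = True
    return negative, int.from_bytes(raw, "little")
```

The second conversion is deliberately left unguarded so that values outside the representable range still fail.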
Ido Schimmel	3f3a957562	libdrgn: linux_kernel: Fix module detection on kernel v6.4 Kernel commit ac3b43283923 ("module: replace module_layout with module_memory") in v6.4 changed the layout of `struct module`, resulting in the following drgn error [1]. Fix this by first trying to determine the base address and size of each kernel module via the `struct module_memory mem[MOD_TEXT]` member, before falling back to previous methods that work on older kernels. Tested on v6.4-rc2 and v6.3 which does not include the above mentioned commit. Note that kernel commit b4aff7513df3 ("scripts/gdb: use mem instead of core_layout to get the module address") performs a similar fix in Python GDB scripts. Closes #296. [1]
```
# drgn
drgn 0.0.22 (using Python 3.11.3, elfutils 0.189, with libkdumpfile)
For help, type help(drgn).
>>> import drgn
>>> from drgn import NULL, Object, cast, container_of, execscript, offsetof, reinterpret, sizeof
>>> from drgn.helpers.common import *
>>> from drgn.helpers.linux import *
warning: could not get debugging information for:
kernel modules (could not find loaded kernel modules: 'struct module' has no member 'core_size')
```
Signed-off-by: Ido Schimmel <idosch@nvidia.com>	2023-05-28 22:08:18 -07:00
Omar Sandoval	fc3ea4184a	libdrgn: use new include-what-you-use exported declarations and fix warnings include-what-you-use/include-what-you-use#1164 fixed include-what-you-use/include-what-you-use#971 so that we can export forward declarations instead of hacking around it. I can't reproduce the issue with BINARY_OP_SIGNED_2C anymore either, so we can remove that hack, too. Also fix any other warnings. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-05-24 00:25:25 -07:00
Sven Schnelle	73e451d588	tests: enable MM tests on s390x s390x now has full mm support, so enable the tests for it. Signed-off-by: Sven Schnelle <svens@linux.ibm.com>	2023-03-22 15:24:11 -07:00
Sven Schnelle	3483a69a56	libdrgn: add s390x pagetable walk support Add support for walking s390x page tables. This supports up to 5 level page table walking and huge/large pages. In order to figure out the level of paging used, we read the first entry of the pgd, which is always mapped for lowcore access and use the level bits of the next page table. This is because drgn passes mm::pgd as pgtable argument to the walker function which doesn't contain the ASCE bits. Signed-off-by: Sven Schnelle <svens@linux.ibm.com>	2023-03-22 15:24:11 -07:00
Omar Sandoval	0d03be7d62	libdrgn: silence -Wmaybe-uninitialized false positive This false positive appears to only trigger on 32-bit. I reproduced it with GCC 10 and 12. Fixes #242. Reported-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-02-24 01:04:15 -08:00
Omar Sandoval	18a8f69ad8	libdrgn: linux_kernel: add object finder for jiffies We have a lot of examples that use jiffies, but they stopped working long ago on x86-64 (since Linux kernel commit d8ad6d39c35d ("x86_64: Fix jiffies ODR violation") (in v5.8 and backported to stable releases)) and never worked on other architectures. This is because jiffies is defined in the Linux kernel's linker script. #277 proposed updating the examples to use jiffies_64, but I would guess that most kernel developers are familiar with jiffies and many have never seen jiffies_64. jiffies is also a nicer name to type in live demos. Let's add a case to the Linux kernel object finder to get the jiffies variable. Reported-by: Martin Liska <mliska@suse.cz> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-02-22 11:15:37 -08:00
Omar Sandoval	2f97cc0f5f	libdrgn: platform: expand on page table iterator documentation There are a lot of details about how the page table iterator functions should be used/implemented that commit `174b797ae3` ("libdrgn: platform: add documentation (especially for drgn_architecture_info)") didn't cover. Add an example and expand/clarify the documentation for the callbacks. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-02-21 17:42:22 -08:00
Jay Kamat	08cb38cc2f	Expand DW_AT_upper_bound quirk on zero size arrays GCC appears to use a data8 value of -1 when reporting zero-length arrays when compiling C++ code; this patch adds support and a test for that behavior. dwarf_info.c: Remove check for sdata on quirk for array length == 0 Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2023-02-21 16:44:20 -08:00
Omar Sandoval	94443457aa	libdrgn: handle GNU Debug Fission attributes, forms, and opcodes These are all equivalent to their DWARF 5 counterparts, which we already support: * DW_FORM_GNU_addr_index <-> DW_FORM_addrx * DW_FORM_GNU_str_index <-> DW_FORM_strx * DW_AT_GNU_addr_base <-> DW_AT_addr_base * DW_OP_GNU_addr_index <-> DW_OP_addrx * DW_OP_GNU_const_index <-> DW_OP_constx Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-02-08 13:25:45 -08:00
Omar Sandoval	02e344a7dd	libdrgn: use strswitch for ELF section names Move the definitions of the section names to a Python script, gen_elf_sections.py, and use that to generate the enum definitions and a lookup function. This is preparation for checking for section names with the .dwo suffix in the future. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-02-08 13:25:22 -08:00
Imran Khan	4d7c709621	helpers: idr: Enable idr helpers to work with older kernel. Prior to kernel v4.11, idr was not using radix tree as its backend. So current idr helper(s) only work for kernel v4.11+. Enable idr helpers(s) to work with non-radix tree based idr, so that the helpers can be used with older kernels as well. Thanks to Omar for optimizing the idr_for_each helper. Signed-off-by: Imran Khan <imran.f.khan@oracle.com>	2023-01-23 17:32:17 -08:00
Kevin Svetlitski	7e6efe6649	Add support for looking up types in namespaces Looking up objects in namespaces is already well-supported by `drgn`. These changes bring the same functionality to type lookup, so that `prog.type('struct A::B::C::MyType')` works in an analogous fashion to `prog['A::B::C::MyVar']`. Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>	2023-01-19 10:19:36 -08:00
Kevin Svetlitski	c32f0811cb	Fix memory leak in `c_format_compound_object` Found via CodeChecker static analysis. Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>	2023-01-11 11:59:43 -08:00
Omar Sandoval	2181826570	drgn 0.0.22 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-01-05 20:38:32 -08:00
Omar Sandoval	4731de6acc	libdrgn: x86_64: unwind with frame pointer more permissively get_registers_from_frame_pointer() has a sanity check that the unwound frame pointer must be greater than the current frame pointer. This is generally true if the entire program is using frame pointers, but not necessarily otherwise. In particular, if the program is a Linux kernel configured with ORC, most of the time, rbp is a general purpose register; it is only used as a frame pointer in special cases without unwinder information like BPF programs. Those cases are exactly when we want the frame pointer unwinder, but depending on what the caller was using rbp for, the frame pointer unwinder might bail prematurely. Let's remove the sanity check. In the worst case, this could lead us off into the weeds chasing pointers, but the iteration limit in drgn_get_stack_trace() prevents that from being dangerous. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-01-04 16:45:28 -08:00
Omar Sandoval	a6b6afaba2	libdrgn: return DRGN_ERROR_NOT_IMPLEMENTED_ERROR if virtual address translation is not implemented This will allow us to distinguish it from other errors. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-01-04 15:09:56 -08:00
Omar Sandoval	c48cddbdb0	libdrgn: ppc64: fix stack unwinding since Linux v5.11 and before v4.20 linux_kernel_get_initial_registers_ppc64() depends on the size of struct pt_regs, but this has changed multiple times, in: - Linux kernel commit 4c2de74cc869 ("powerpc/64: Interrupts save PPR on stack rather than thread_struct") (in v4.20) - Linux kernel commit 66f93c5a02d5 ("powerpc/64: Fix kernel stack 16-byte alignment") (in v4.20) - Linux kernel commit 8e560921b58c ("powerpc/book3s64/pkeys: Store/restore userspace AMR/IAMR correctly on entry and exit from kernel") (in v5.11) It also depends on the overhead stored before struct pt_regs on the stack, which changed in Linux kernel commit cd52414d5a6c ("powerpc/64: ELFv2 use minimal stack frames in int and switch frame sizes") (in v6.2). We can handle all of these cases by reading the previous r1 from memory instead of computing it from a hard-coded size and finding the struct pt_regs based on that r1 and the actual size of struct pt_regs. Reported in #232. Reported-by: Sourabh Jain <jainsourabh679@gmail.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-01-04 13:42:28 -08:00
Sven Schnelle	1bbeff92bf	libdrgn: add s390x unwinding support Co-authored-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>	2022-12-19 13:48:44 -08:00
Omar Sandoval	9ee1ccff98	libdrgn: add stub s390 and s390x architectures with relocation implementation The only relocation type I saw in Debian's kernel module debug info was R_390_32. R_390_8, R_390_16, R_390_64, R_390_PC16, R_390_PC32, and R_390_PC64 are trivial to support, as well. The Linux kernel supports many more, but hopefully they won't show up for debug info. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-19 13:48:44 -08:00
Omar Sandoval	aa5f121ac9	libdrgn: document implementation-defined behavior in add_to_possibly_null_pointer() Konrad Borowski pointed out that add_to_possibly_null_pointer() relies on GCC-specific behavior: https://fosstodon.org/@xfix/109542070338182493. CONTRIBUTING.rst mentions that we assume that casting between pointers and integers does not change the bit representation, but we might as well document it here, too. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-19 12:07:40 -08:00
Kevin Svetlitski	4213bea149	libdrgn: add limited support for looking up types with template arguments Currently, looking up a type with template arguments results in an "invalid character" syntax error on the "<" character. The DWARF index includes template arguments in indexed names, so we need to do lookups including the template arguments. Full support for this would require parsing the template argument list syntax and normalizing it or looking it up as an AST in some way. For now, it's at least an improvement to pass the user's string verbatim. To do so, kludge it by adding a token containing everything from "<" to the matching ">" to the C++ lexer and appending that to the identifier. Co-authored-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>	2022-12-14 20:55:03 -08:00
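A token spanning everything from "<" to the matching ">" can be found by counting nesting depth. The sketch below is a simplified illustration, not the lexer from this commit: `template_args_end()` is a hypothetical helper, and it assumes that every "<" and ">" in the input is a template bracket (it does not handle comparison or shift operators inside the argument list).

```python
def template_args_end(s: str, start: int) -> int:
    """Return the index one past the '>' that matches the '<' at s[start]."""
    if s[start] != "<":
        raise ValueError("expected '<'")
    depth = 0
    for i in range(start, len(s)):
        if s[i] == "<":
            depth += 1
        elif s[i] == ">":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unterminated template argument list")


name = "std::vector<std::pair<int, long>>::iterator"
start = name.index("<")
end = template_args_end(name, start)
# The identifier and the verbatim template argument token, appended together.
identifier = name[:start] + name[start:end]
```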
Omar Sandoval	7ce84a3f1f	drgn.helpers.linux: add proper XArray helpers Commit `89eb868e95` ("helpers: make find_task() work on recent kernels") made radix_tree_lookup() and radix_tree_for_each() work for basic XArrays. However, it doesn't handle a couple of more advanced features: multi-index entries (which old radix trees actually also supported) and zero entries. It has also been really confusing to explain to people unfamiliar with the radix tree -> XArray transition that they should use helpers named radix_tree for a structure named xarray. So, let's finally add xa_load(), xa_for_each(), and some additional auxiliary helpers. The non-recursive xa_for_each() implementation is based on Kevin Svetlitski's C implementation from commit `2b47583c73` ("Rewrite linux helper iterators in C"). radix_tree_lookup() and radix_tree_for_each() share the implementation with xa_load() and xa_for_each(), respectively, so they are mostly interchangeable. Fixes: #61 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-13 17:46:37 -08:00
Omar Sandoval	6486073148	libdrgn: python: fix Py_BuildValue() type in gen_constants.py We're calling Py_BuildValue() with the "k" format for unsigned long but passing the enum value itself, which is promoted to int. I don't know whether there are any ABIs where this matters in practice, but let's use "K" and cast to unsigned long long explicitly to be safe. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-07 16:46:33 -08:00
Omar Sandoval	94e1407a5f	libdrgn: python: don't repeat class names in gen_constants.py Instead, define the list of constant classes in one place so we can generate all 3 places that need it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-07 15:41:49 -08:00
Omar Sandoval	af28419ee5	libdrgn: python: fix path_arg leaks in Program_find_{type,object} Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-06 13:25:55 -08:00
Omar Sandoval	d7204eaa00	libdrgn: python: simplify path_converter() PyUnicode_FSConverter() already handles os.PathLike, so we only need to handle None and save the string and length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-06 10:49:00 -08:00
Omar Sandoval	73fea86792	libdrgn: python: add PyLong_From* and PyLong_As* wrappers for stdint.h types It feels icky to write code that, for example, passes a uint64_t to PyLong_FromUnsignedLongLong(). In practice it's fine, but it's much nicer to have conversion functions specifically for the stdint.h types. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-05 16:06:22 -08:00
Alastair Robertson	7180304c88	libdrgn: dwarf_info: Support DW_TAG_GNU_template_parameter_pack This DWARF tag is used by C++ classes which take a variable number of template parameters, such as std::variant and std::tuple. Signed-off-by: Alastair Robertson <ajor@meta.com>	2022-12-05 15:33:46 -08:00
Omar Sandoval	174b797ae3	libdrgn: platform: add documentation (especially for drgn_architecture_info) While reviewing #214, I realized that we have very little documentation for drgn_architecture_info (and platform internals in general). Let's document all of the important stuff, and in particular how to add support for new architectures. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-02 13:55:38 -08:00
Omar Sandoval	1088ef4a1e	libdrgn: platform: replace demangle_return_address() with demangle_cfi_registers() While documenting struct drgn_architecture_info, I realized that demangle_return_address() is difficult to explain. It's more straightforward to define this functionality as demangling any registers that are mangled when using CFI rather than just the return address register. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-12-02 13:52:06 -08:00
Omar Sandoval	0fad8a591a	libdrgn: fix finding types beginning in size_t or ptrdiff_t c_parse_specifier_qualifier_list() checks whether an identifier starts with "size_t" or "ptrdiff_t" to decide whether to return the size_t or ptrdiff_t type. This incorrectly matches stuff like "size_tea" and "ptrdiff_tee". Fix this by making it an exact comparison. Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 16:21:56 -08:00
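The difference between the prefix check and the exact comparison is easy to demonstrate. The functions below are hypothetical Python stand-ins for the C logic, written only to show why the prefix match misfires.

```python
from typing import Optional

SPECIAL_TYPES = ("size_t", "ptrdiff_t")


def match_prefix(identifier: str) -> Optional[str]:
    """Buggy behavior: any identifier that merely starts with a special name matches."""
    for name in SPECIAL_TYPES:
        if identifier.startswith(name):
            return name
    return None


def match_exact(identifier: str) -> Optional[str]:
    """Fixed behavior: only the exact names match."""
    return identifier if identifier in SPECIAL_TYPES else None
```

With these definitions, `match_prefix("size_tea")` wrongly reports `size_t`, while `match_exact("size_tea")` reports no match.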
Omar Sandoval	18b12a5c7b	libdrgn: get .eh_frame from the correct file We're currently getting .eh_frame from the debug file. However, since .eh_frame is an SHF_ALLOC section, it is actually in the loaded file, and may not be in the debug file. This causes us to fail to unwind in modules whose debug file was created with objcopy --only-keep-debug (which is typical for Linux distro debug files). Fix it by getting .eh_frame from the loaded file. To make this easier, we split .eh_frame and .debug_frame data into two separate tables. We also don't bother deduplicating them anymore, since GCC and Clang only seem to generate one or the other in practice. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	270375f077	libdrgn: debug_info: get "loaded" ELF file For upcoming changes, we will need loaded (SHF_ALLOC) sections for modules. Some separate debug files (e.g., those created with objcopy --only-keep-debug) don't have those sections. Let's get the loaded file from libdwfl with dwfl_module_getelf() and save it in a struct drgn_elf_file. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	bcb53d712b	libdrgn: bypass libdwfl with struct drgn_elf_file Now that we track the debug file ourselves, we can avoid calling libdwfl in a bunch of places. By tracking the bias ourselves, we can avoid a bunch more. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	34f122144a	libdrgn: debug_info: wrap ELF file information in new struct drgn_elf_file struct drgn_module contains a bunch of information about the debug info file. Let's pull it out into its own structure, struct drgn_elf_file. This will be reused for the "main"/"loaded" file in an upcoming change. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 13:37:29 -08:00
Omar Sandoval	b3bab1c5b0	libdrgn: make module vs. program platform difference more clear It's confusing that we have a platform both for the program and for each module. They usually match, but they're not required to. For example, the user can manually add a file with a different platform just to read its debug info. Our rule is that if we're parsing anything from the module, we use the module platform; and otherwise, use the program platform. There are a couple of places where the platforms must match: when using call frame information (CFI) or registers. Let's make all of this more clear in the code (by using the module's platform even when it must match the program's platform) and in comments. No functional change. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 12:53:45 -08:00
Omar Sandoval	85f423dfb8	libdrgn: dwarf_info: get default pointer size from CU If a DW_TAG_pointer_type DIE doesn't specify its size with DW_AT_byte_size, we currently default to the program's address size. However, the DWARF we're parsing could be for a platform with a different address size. It's more correct to use the CU's address size. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 12:53:45 -08:00
Omar Sandoval	222680b47a	Add StackFrame.sp We have some generic helpers that we'd like to add (for example, #210) that need to know the stack pointer of a frame. These shouldn't need to hard-code register names for different architectures. Add a generic shortcut, StackFrame.sp. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-22 18:47:16 -08:00
Boris Burkov	c8ff8728f7	Support systems without qsort_r qsort_r is a non-standard glibc extension and turns out to be the only thing that prevents drgn from working on a musl system. "Fix" the use of qsort_r by switching it to qsort with a thread local variable for the parameter. Tested in a clean chroot install of musl voidlinux. Signed-off-by: Boris Burkov <boris@bur.io>	2022-11-03 12:57:55 -04:00
Stephen Brennan	5f3a91f80d	Add StackFrame.locals() method The StackFrame's __getitem__() method allows looking up names in the scope of a stack frame, which is an incredibly useful tool for debugging. However, the names are not discoverable -- you must already be looking at the source code or some other source to know what names can be queried. To fix this, add a locals() method to StackFrame, which lists names that can be queried in the scope. Since this method is named locals(), it stops at the function scope and doesn't include globals or class members. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2022-11-02 22:40:33 -07:00
Omar Sandoval	b3a5051ff4	libdrgn: dwarf_info: handle DW_TAG_enumerator DIE with missing or invalid DW_AT_name find_dwarf_enumerator() needs to check that the return value of dwarf_diename() is not NULL before calling strcmp(). This is similar to commit `330c71b5b5` ("libdrgn: dwarf_info: fix segfault on anonymous DIEs during scope search"), although I haven't seen this one happen in practice. Fixes: `bc85767e5f` ("libdrgn: support looking up parameters and variables in stack traces") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-02 22:19:44 -07:00
Omar Sandoval	4031093848	Add some missing copyright/license notices I wanted to make REUSE pass, but I'm not sure what to do about trivial files. REUSE suggests using CC0, but Fedora no longer allows CC0. I'll punt that until later. For now, let's add notices to some code files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:14:02 -07:00
Omar Sandoval	87b7292aa5	Relicense drgn from GPLv3+ to LGPLv2.1+ drgn is currently licensed as GPLv3+. Part of the long term vision for drgn is that other projects can use it as a library providing programmatic interfaces for debugger functionality. A more permissive license is better suited to this goal. We decided on LGPLv2.1+ as a good balance between software freedom and permissiveness. All contributors not employed by Meta were contacted via email and consented to the license change. The only exception was the author of commit `c4fbf7e589` ("libdrgn: fix for compilation error"), who did not respond. That commit reverted a single line of code to one originally written by me in commit `640b1c011d` ("libdrgn: embed DWARF index in DWARF info cache"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:05:16 -07:00
Omar Sandoval	d465071651	libdrgn: replace copies of elfutils headers with generated files Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 15:41:53 -07:00
Omar Sandoval	99dc927f38	libdrgn: dwarf_info: rename dw_tag_str constants Rename DW_TAG_{UNKNOWN_FORMAT,BUF_LEN} to DW_TAG_STR_{UNKNOWN_FORMAT,BUF_LEN} to make it more clear that they're for dw_tag_str. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-31 14:22:45 -07:00
Omar Sandoval	a4ae67b6b5	libdrgn: replace BUILD_BUG_ON* with static_assert Our container_of() and array_size() were copied from the Linux kernel and use some really ugly BUILD_BUG_ON_ZERO() and BUILD_BUG_ON_MSG() macros. C11 has _Static_assert, which is much nicer. We just have to shoehorn it into an expression, which we do with clever use of _Generic and sizeof a struct type definition. (We could accomplish the same idea with a comma expression, but GCC warns when the left-hand operand of a comma expression has no effect. We could also do it with a compound statement, but it's cooler to do it with standard C11.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-28 13:38:35 -07:00
Omar Sandoval	40f2d4b2aa	drgn 0.0.21 Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-12 12:00:54 -07:00
Omar Sandoval	70af25849c	libdrgn: rename drgn_debug_info_module to drgn_module Eventually, modules will be exposed as part of the public libdrgn API, so they should have a clean name. Additionally, the module API I'm currently working on will allow modules for which we don't have the debug info file, so "debug info module" would be a misnomer. Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and drgn_orc_module_info to drgn_module_orc_info to fit the new naming scheme better. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 16:52:46 -07:00
Omar Sandoval	8bfc9f1e07	libdrgn: python: rename module.c to main.c We're eventually going to add a drgn.Module class, which logically should go in a file called module.c. But, we already have a module.c with module-level definitions. Rename that file to main.c to free up module.c Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 16:24:32 -07:00
Omar Sandoval	1fe01bb4b8	libdrgn: python: add call_tp_alloc() There are a bunch of places where we call .tp_alloc() directly, which is very verbose. Add a macro that removes the boilerplate. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 16:24:29 -07:00
Omar Sandoval	60bafe96db	libdrgn: examples: use noreturn for usage() -Wimplicit-fallthrough has a false positive because the compiler apparently doesn't know that usage() never returns. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 16:12:38 -07:00
Omar Sandoval	03d5c2ebac	libdrgn: string_builder: replace string_builder_finalize() Instead of string_builder_finalize(), which leaves the string_builder in an undefined state (according to the documentation, at least), define string_builder_null_terminate(), which documents exactly what it does. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 15:55:04 -07:00
Omar Sandoval	cd41d9d576	libdrgn: string_builder: rework reserving Make string_builder_reserve() allocate an exact capacity, and add a string_builder_reserve_for_append() wrapper that does the next_power_of_two(current length + number to append) that all of the current callers want. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 15:55:02 -07:00
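The split between an exact-capacity reserve and a power-of-two wrapper for appends, together with the static initializer from the commit just below, can be sketched like this. All names here (struct sb, SB_INIT, sb_reserve, sb_reserve_for_append, sb_append, next_pow2) are hypothetical stand-ins rather than libdrgn's API, and the sketch assumes that reserving never shrinks the buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; zero-initialized means "empty". */
struct sb {
	char *str;
	size_t len;
	size_t capacity;
};
#define SB_INIT { 0 }

/* Smallest power of two >= n, or n itself if that would overflow. */
static size_t next_pow2(size_t n)
{
	size_t p = 1;
	while (p < n) {
		if (p > SIZE_MAX / 2)
			return n;
		p <<= 1;
	}
	return p;
}

/* Grow to exactly `capacity` bytes (assumption: never shrinks). */
bool sb_reserve(struct sb *sb, size_t capacity)
{
	if (capacity <= sb->capacity)
		return true;
	char *tmp = realloc(sb->str, capacity);
	if (!tmp)
		return false;
	sb->str = tmp;
	sb->capacity = capacity;
	return true;
}

/* Make room for n more bytes, rounding the total up to a power of two. */
bool sb_reserve_for_append(struct sb *sb, size_t n)
{
	if (n > SIZE_MAX - sb->len)
		return false;
	return sb_reserve(sb, next_pow2(sb->len + n));
}

bool sb_append(struct sb *sb, const char *s, size_t n)
{
	if (!sb_reserve_for_append(sb, n))
		return false;
	memcpy(sb->str + sb->len, s, n);
	sb->len += n;
	return true;
}
```

Keeping the rounding policy in the wrapper means a caller that knows the final size can still ask for exactly that many bytes, while append-style callers get amortized growth.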
Omar Sandoval	d76a3a338f	libdrgn: string_builder: add dedicated initializer Rather than documenting how to initialize a struct string_builder, provide an initializer, STRING_BUILDER_INIT. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 15:32:07 -07:00
Omar Sandoval	05a3695d5b	libdrgn: enable -Wimplicit-fallthrough, take 2 This time, in order to work on both GCC and Clang, use __attribute__((__fallthrough__)) instead of /* fallthrough */ comments. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-04 23:36:01 -07:00
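A fallthrough annotation that works on both GCC and Clang, with the feature detection mentioned in the revert just below, might look like the following sketch. The FALLTHROUGH macro and remaining_stages() are hypothetical names for illustration; the commit does not show how libdrgn wraps the attribute.

```c
/*
 * Hypothetical wrapper: use the attribute when the compiler reports
 * support for it, otherwise expand to an empty statement.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define FALLTHROUGH __attribute__((__fallthrough__))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH do {} while (0)
#endif

/* Count how many stages still run when starting at `stage`. */
int remaining_stages(int stage)
{
	int n = 0;

	switch (stage) {
	case 0:
		n++;
		FALLTHROUGH;
	case 1:
		n++;
		FALLTHROUGH;
	case 2:
		n++;
		break;
	default:
		break;
	}
	return n;
}
```

With -Wimplicit-fallthrough enabled, removing either FALLTHROUGH line would produce a warning on compilers that support the attribute.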
Omar Sandoval	2b4d5fd237	Revert "libdrgn: enable -Wimplicit-fallthrough" This reverts commit `e05bfbddc2`. Clang doesn't support /* fallthrough */ comments, so we'll need to use __attribute__((fallthrough)), which will need some additional feature detection. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-04 18:12:03 -07:00
Omar Sandoval	e05bfbddc2	libdrgn: enable -Wimplicit-fallthrough This only required one change in the code where GCC wanted the comment placed differently. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-04 17:53:35 -07:00
Omar Sandoval	0b7ac5b046	Fix vmcore stack traces on Linux < 4.9 or >= 5.16 and add drgn.helpers.linux.task_cpu() task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes drgn_get_initial_registers() to think that the kernel is !SMP and use CPU 0 instead, producing incorrect stack traces. This has also always been wrong for Linux < 4.9 and on architectures that don't enable CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be ((struct thread_info *)task->stack)->cpu. Fix it by factoring out a new task_cpu() helper that handles all of the above cases. Also add a test case for task_cpu() in case this changes again. Fixes: `eea5422546` ("libdrgn: make Linux kernel stack unwinding more robust") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-03 16:21:12 -07:00
Omar Sandoval	330c71b5b5	libdrgn: dwarf_info: fix segfault on anonymous DIEs during scope search Jakub Kicinski reported that prog.crashed_thread().stack_trace()[1]['does not exist'] segfaulted on a vmcore he encountered. The segfault was a NULL pointer dereference of dwarf_diename() of a DW_TAG_subprogram DIE in drgn_find_in_dwarf_scopes(). The fix is to ignore DIEs without a name. I was curious what this anonymous DW_TAG_subprogram was. It turned out to be some dubious DWARF generated by Clang when a local variable is defined via a macro. One such example comes from the following code in arch/x86/events/intel/uncore.h: static inline bool uncore_mmio_is_valid_offset(struct intel_uncore_box *box, unsigned long offset) { if (offset < box->pmu->type->mmio_map_size) return true; pr_warn_once("perf uncore: Invalid offset 0x%lx exceeds mapped area of %s.\n", offset, box->pmu->type->name); return false; } pr_warn_once() expands to: #define pr_warn_once(fmt, ...) \ printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__) #define printk_once(fmt, ...) 
\ ({ \ static bool __section(".data.once") __print_once; \ bool __ret_print_once = !__print_once; \ \ if (!__print_once) { \ __print_once = true; \ printk(fmt, ##__VA_ARGS__); \ } \ unlikely(__ret_print_once); \ }) For some reason, Clang generates an anonymous, top-level DW_TAG_subprogram DIE to contain the __print_once variable: <1><1cf86e>: Abbrev Number: 62 (DW_TAG_subprogram) <2><1cf86f>: Abbrev Number: 61 (DW_TAG_variable) <1cf870> DW_AT_name : (indirect string, offset: 0x34fb2e): __print_once <1cf874> DW_AT_type : <0x1c574c> <1cf878> DW_AT_decl_file : 1 <1cf879> DW_AT_decl_line : 229 <1cf87a> DW_AT_location : 16 byte block: 3 2c 84 66 83 ff ff ff ff 94 1 31 1e 30 22 9f (DW_OP_addr: ffffffff8366842c; DW_OP_deref_size: 1; DW_OP_lit1; DW_OP_mul; DW_OP_lit0; DW_OP_plus; DW_OP_stack_value) Whereas GCC puts it in a DW_TAG_lexical_block DIE inside of the DW_TAG_subprogram DIE for uncore_mmio_is_valid_offset(): <1><3110b2>: Abbrev Number: 45 (DW_TAG_subprogram) <3110b3> DW_AT_name : (indirect string, offset: 0x2e13e): uncore_mmio_is_valid_offset <3110b7> DW_AT_decl_file : 4 <3110b8> DW_AT_decl_line : 223 <3110b9> DW_AT_decl_column : 20 <3110ba> DW_AT_prototyped : 1 <3110ba> DW_AT_type : <0x2f416b> <3110be> DW_AT_inline : 3 (declared as inline and inlined) <3110bf> DW_AT_sibling : <0x311142> <2><3110ef>: Abbrev Number: 66 (DW_TAG_lexical_block) <3><3110f0>: Abbrev Number: 120 (DW_TAG_variable) <3110f1> DW_AT_name : (indirect string, offset: 0x2da3f): __print_once <3110f5> DW_AT_decl_file : 4 <3110f6> DW_AT_decl_line : 229 <3110f7> DW_AT_decl_column : 2 <3110f8> DW_AT_type : <0x2f416b> <3110fc> DW_AT_location : 9 byte block: 3 2c 28 48 83 ff ff ff ff (DW_OP_addr: ffffffff8348282c) Regardless, we shouldn't crash on this input. Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-09-21 14:12:16 -07:00
Omar Sandoval	30c9ad452d	libdrgn: linux_kernel: fix global per-CPU variables in kernel modules The .data..percpu section is excluded from /sys/module and struct module::sect_attrs, which means that we default its address to 0. This results in global per-CPU variables in kernel modules being relocated starting from 0 rather than the offset of the per-CPU allocation made for the module, which in turn causes those variables to appear to contain the wrong data. Fix it by manually getting the per-CPU address from struct module. Closes #185. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-09-12 16:27:28 -07:00
Omar Sandoval	a52016c4cb	libdrgn: linux_kernel: always use module list from core For the next fix, we need the address of the .data..percpu section, which is only available directly from the struct module and not from anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and update the name of the testing environment variable from DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE). This has some small overhead (~20ms longer startup time in my benchmarks) and means that we no longer determine the loaded modules if vmlinux is missing, but fixing the per-CPU issue is more important. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-09-12 16:11:59 -07:00
Omar Sandoval	94036f6daf	libdrgn: linux_kernel: optimize reading module list An upcoming fix requires us to always use the module list from the core dump rather than /proc/modules. However, with the existing code, this would cause a major startup time regression for the live kernel, mainly because reading from /proc/kcore is stupidly slow. We currently do 3 + strlen(module->name) reads for every module. We can reduce this to 1 read per module by reading the entire struct module at once. The size of struct module is ~700-900 bytes depending on the kernel configuration, which is still much faster to read than only reading what we need. In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this reduced the time spent in the kernel module iterator from ~2.5ms per module to ~0.4ms per module. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-09-12 16:08:33 -07:00
Omar Sandoval	a2db11ebae	libdrgn: object: fix use after free in drgn_object_set_from_buffer_internal() If drgn_object_set_from_buffer_internal() (used to implement drgn_object_set_from_buffer(), drgn_object_slice(), and drgn_object_reinterpret()) sets an object to a primitive type from a buffer that comes from the same object, then drgn_object_reinit() will free the value and then drgn_value_serialize() will access the freed value, probably resulting in garbage. Handle this case the same way we do if the result type is encoded as a buffer, by first copying to a temporary value. This doesn't affect usage through Python because objects are immutable in the Python bindings. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-09-12 16:08:33 -07:00
Omar Sandoval	f8ba278bc1	libdrgn: fix include-what-you-use warnings It's been a while since I've run this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-26 12:43:20 -07:00
Omar Sandoval	b8cdfff250	libdrgn: add read(2) and pread(2) wrappers that don't return short reads We have a couple of loops that deal with short reads/EINTR from read(2) and pread(2), and upcoming changes would need to add more. Add some wrappers to abstract this away. drgn_read_memory_file() still needs the loop so it can fault on the exact offset that returns EIO. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-26 12:43:20 -07:00
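A read(2) wrapper of the kind this commit describes, one that loops over short reads and retries on EINTR, could be sketched as below. The name read_full() is hypothetical and the return convention (bytes read, fewer than requested only at end of file, or -1 with errno set) is an assumption for this example, not necessarily what libdrgn chose.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to `count` bytes from `fd`, retrying after
 * EINTR and continuing after short reads. Returns the number of bytes
 * read (less than `count` only if end of file was reached) or -1 on error.
 * Assumes `count` fits in ssize_t.
 */
ssize_t read_full(int fd, void *buf, size_t count)
{
	size_t done = 0;

	while (done < count) {
		ssize_t r = read(fd, (char *)buf + done, count - done);
		if (r < 0) {
			if (errno == EINTR)
				continue; /* interrupted before any data: retry */
			return -1;
		}
		if (r == 0)
			break; /* end of file */
		done += (size_t)r;
	}
	return (ssize_t)done;
}
```

A pread(2) variant would follow the same shape, advancing the file offset by the number of bytes read on each iteration. A caller that needs to know the exact offset at which an error occurred, as the commit notes for drgn_read_memory_file(), would still need its own loop.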
Omar Sandoval	56fda2a0cf	libdrgn: fix min() warning on 32-bit architectures The call to min() in drgn_read_memory_file() results in the following warning on 32-bit architectures that I missed on review: In file included from ../../libdrgn/memory_reader.c:10: ../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file': ../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast 36 \| (void)(&unique_x == &unique_y); \ \| ^~ ../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl' 28 \| #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <) \| ^~~~~~~~~~~~~ ../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min' 284 \| size_t readlen = min(file_end - file_offset, count); \| ^~~ We can fix it with a cast, and additionally do the call to min() earlier and rework the logic a bit. Fixes: `9684771d61` ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-26 12:43:20 -07:00
Omar Sandoval	04d2dee964	libdrgn: elaborate on core dump p_filesz < p_memsz ambiguity There's a lot more context here that we should write down. It's also worth noting that it appears that GDB always zero fills the range between p_filesz and p_memsz, so if we end up having any other issues because of this, we might have to concede and go back to the behavior before commit `02912ca7d0` ("libdrgn: fix handling of p_filesz < p_memsz in core dumps"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-26 12:43:20 -07:00
Shung-Hsi Yu	9335e227d6	libdrgn: python: add Jupyter pretty printing support Add pretty printing support in Jupyter notebook for Object, Type, StackFrame, and StackTrace; it will print out their representation in programming language syntax with str(), similar to what's being done in interactive mode. Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>	2022-08-25 13:52:11 -07:00
Glen McCready	9684771d61	libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError makedumpfile will exclude zero pages. We found a core file where a structure straddled a page boundary and the end of the structure was all zeros so the page was excluded and we were generating a FaultError trying to access the structure. This change reverts a portion of that behaviour such that when we are debugging a kernel core we go back to the zero fill behaviour. To do this we go back to creating segments based on memsz instead of filesz and handling the filesz->memsz gap in drgn_read_memory_file. Fixes: `02912ca7d0` ("libdrgn: fix handling of p_filesz < p_memsz in core dumps") Signed-off-by: Glen McCready <gkm@mysteryinc.ca>	2022-08-25 11:59:39 -07:00
Omar Sandoval	ca373fe38a	docs: use "programmable debugger" description consistently Replace the old "Scriptable debugger library" and "Debugger-as-a-library" taglines with the one we're using on GitHub, "Programmable debugger". Make up for it by emphasizing that drgn can also be used as a library a tiny bit more in the README. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-19 01:21:32 -07:00
Michel Alexandre Salim	c0ed1a3203	Fix spelling error abbrevation => abbreviation; caught by Debian's lintian Signed-off-by: Michel Alexandre Salim <michel@michel-slm.name>	2022-08-17 21:45:51 -07:00
Omar Sandoval	6c90315f6f	python: fix FaultError reference leak PyErr_SetObject() takes a reference on the exception value, so we need to drop the reference we got when we created the value. Issue #196 ran into this by reading tons of unmapped addresses. Fixes: `80fef04c70` ("Add address attribute to FaultError exception") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-16 17:35:36 -07:00
Omar Sandoval	a19203a73e	libdrgn: fix QEMU guest memory dump Kconfig suggestion The config option is and always has been CONFIG_FW_CFG_SYSFS, not CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the internal CONFIG_CRASH_CORE. Fixes: `2bd861f719` ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-15 15:11:56 -07:00
Omar Sandoval	faaf01ad1b	Add drgn.StackTrace.prog and drgn_stack_trace_program() If we only have the stack trace available, it's useful to get the program it came from. This'll be used eventually for helpers that take a stack trace. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-08-11 14:45:54 -07:00


870 Commits