JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-24 10:03:05 +00:00

Author	SHA1	Message	Date
Omar Sandoval	085e6f3078	tests: remove no-op setUp() method Commit `7d7aa7bf7b` ("libdrgn/python: remove Type == operator") removed the substantial part of tests.TestCase.setUp() but didn't remove the method. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-22 11:12:52 -08:00
Omar Sandoval	3f65552d95	tests: add tests for Linux kernel linked list helpers We indirectly test the linked list helpers via other helpers that use them, but let's add some dedicated tests. We test against two lists: * "modules", which is never empty in vmtest because we load the loop module. * "vmcore_list", which is always empty except in the kdump kernel. Like the red-black tree tests, these tests are written generically so we can use a different list if necessary. There aren't any great candidates for hlists, so we'll have to make do with indirectly testing them for now. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-22 10:49:25 -08:00
Omar Sandoval	5f76848b98	tests: add tests for Linux kernel red-black tree helpers The test cases use the VMA tree and cross-reference it with /proc/$pid/maps, but they're written so it could easily be swapped out for another tree if necessary. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-21 17:15:58 -08:00
Omar Sandoval	0a643b6fab	python: allow Program.type() to accept a Type Some helpers can accept either a str or a Type. If they want to always work with a Type internally, they need to do something like: if isinstance(type, str): type = prog.type(type) Instead, let's let Program.type() accept a Type and return the exact same type, so those helpers can unconditionally do: type = prog.type(type) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-21 16:52:36 -08:00
Omar Sandoval	c40543b15c	tests: add test cases for generic flag decode helpers Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-15 11:45:09 -08:00
Stephen Brennan	7970a60818	Add methods to return multiple matching symbols Currently we can lookup symbols by name or address, but this will only return one symbol, prioritizing the global symbols. However, symbols may share the same name, and symbols may also overlap address ranges, so it's possible for searches to return multiple results. Add functions which can return a list of multiple matching symbols. Signed-off-by: Stephen Brennan <stephen@brennan.io>	2022-01-15 11:44:33 -08:00
Omar Sandoval	e2fc4ce2ac	helpers: add a helper for decoding page flags As well as a couple of generic helpers backing it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-01-12 16:57:10 -08:00
Kevin Svetlitski	301cc767ba	Implement a new API for representing threads Previously, drgn had no way to represent a thread – retrieving a stack trace (the only extant thread-specific operation) was achieved by requiring the user to directly provide a tid. This commit introduces the scaffolding for the design outlined in issue #92, and implements the corresponding methods for userspace core dumps, the live Linux kernel, and Linux kernel core dumps. Future work will build on top of this commit to support live userspace processes. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-11 17:28:17 -08:00
Kevin Svetlitski	d3c9e24115	tests: make all tests inherit from drgn's TestCase class The majority of test cases already inherited from drgn's TestCase class. The few outliers that inherited directly from unittest.TestCase have been brought in line with the other tests. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-11 17:28:17 -08:00
Kevin Svetlitski	ac2cadabcd	Add framework for testing in kdump Now that the vmtest kernel supports kdump, add a script that can be used to crash and enter the kdump environment on demand. Use that to crash after running the normal test suite so that we can run tests against /proc/vmcore. vmcore tests live in their own directory; presently the only test is a simple sanity check that ensures we can attach to /proc/vmcore. Signed-off-by: Omar Sandoval <osandov@osandov.com> Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-01-07 14:03:00 -08:00
Omar Sandoval	2ff58a4d45	libdrgn: linux: make per_cpu_ptr() support !SMP kernels Kernels built without multiprocessing support don't have __per_cpu_offset; instead, per_cpu_ptr() is a no-op. Make the helper do the same and update the test case to work on !SMP as well. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 16:51:15 -08:00
Omar Sandoval	b341c212f4	tests: fix black error Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 16:06:23 -08:00
Omar Sandoval	d72a9043b0	libdrgn: linux: replace idle_thread() with idle_task() I missed that the kernel has an idle_task() function which uses cpu_rq()->idle instead of idle_threads; the latter is technically architecture-specific. So, replace idle_thread() with idle_task(), which is architecture-independent and more consistent with the kernel. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 16:03:25 -08:00
Peilin Ye	ed7f864532	helpers: Add SOCKET_I() and SOCK_INODE() Add helpers to convert between sockets and inodes. As an example: >>> file = fget(task, fd) >>> sock = SOCKET_I(file.f_inode) >>> sock.type.value_() 2 >>> import socket >>> int(socket.SOCK_DGRAM) 2 >>> inode = SOCK_INODE(sock) Also add tests for the new helpers to tests/helpers/linux/test_net.py. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-12-21 14:55:25 -08:00
Peilin Ye	bc95749975	tests: Rename "sock" to "skt" in test_sk_fullsock() Reserve "sock" for "struct socket *" objects, according to our kernel naming convention. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-12-21 14:55:25 -08:00
Omar Sandoval	adfb04579b	libdrgn: linux: add idle_thread() helper PR #129 will need to get the idle thread for a CPU when the idle thread crashed. Add a helper for this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-21 14:40:57 -08:00
Omar Sandoval	6732148a11	tests: use NOBITS section for ELF symbols Currently, we create a section filled with zeroes to contain the symbols in our ELF symbol tests. We can just use a NOBITS section with no file data instead. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-17 16:46:12 -08:00
Kevin Svetlitski	2b47583c73	Rewrite linux helper iterators in C In preparation for introducing an API to represent threads, the linux helper iterators, radix_tree_for_each, idr_for_each, for_each_pid, and for_each_task have been rewritten in C. This will allow them to be accessed from libdrgn, which will be necessary for the threads API. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2021-12-17 16:24:54 -08:00
Omar Sandoval	02912ca7d0	libdrgn: fix handling of p_filesz < p_memsz in core dumps I implemented the case of a segment in a core file with p_filesz < p_memsz by treating the difference as zero bytes. This is correct for ET_EXEC and ET_DYN, but for ET_CORE, it actually means that the memory existed in the program but was not saved. For userspace core dumps, this typically happens for read-only file mappings. For kernel core dumps, makedumpfile does this to indicate memory that was excluded. Instead, let's return a DRGN_FAULT_ERROR if an attempt is made to read from these bytes. In the future, we need to read from the executable/library files when we can. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-08 00:02:44 -08:00
Omar Sandoval	0315ade709	tests: handle CONFIG_KALLSYMS=n and CONFIG_KALLSYMS_ALL=n If CONFIG_KALLSYMS_ALL=n, then /proc/kallsyms won't include lo_fops, which is a data symbol. Use a function symbol, lo_open, instead. Also check whether /proc/kallsyms exists in the first place. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-02 03:46:06 -08:00
Omar Sandoval	3914bb8e29	libdrgn: fix type names referring to anonymous types A pointer, array, or function referring to an anonymous type currently includes the full type definition in its type name. This creates very badly formatted objects for, e.g., drgn's own hash table types. Instead, use "struct <anonymous>" in the type name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-23 00:57:42 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	cdee38af7a	tests: use different symbol for kernel module debug info test Linux kernel commit 47e9624616c8 ("block: remove support for cryptoloop and the xor transfer") removed the loop_register_transfer function. We only used that symbol because it and loop_unregister_transfer were the only global symbols in the loop module. Now that we can get local symbols by name, we can use the "lo_fops" symbol, which is unlikely to be removed or renamed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 14:40:06 -08:00
Omar Sandoval	ff40f65f0d	libdrgn: allow symbol name lookup to get local symbols Global symbols are preferred over weak symbols, and weak symbols are preferred over other symbols. dwfl_module_addrinfo() seems to have the same preference, so document address lookups as having the same behavior. (This is actually incorrect in the case of STB_GNU_UNIQUE, as dwfl_module_addrinfo() treats anything other than STB_GLOBAL, STB_WEAK, and STB_LOCAL as having the lowest precedence, but STB_GNU_UNIQUE is so obscure that it probably doesn't matter.) Based on work from Stephen Brennan. Closes #121. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 14:30:57 -08:00
Omar Sandoval	07d00b7b11	tests: add tests for ELF symbols Add some scaffolding to generate ELF files with symbol tables and use it to test symbol lookups and Elf_Sym -> drgn.Symbol translation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-19 17:04:20 -08:00
Omar Sandoval	c84d7e8c15	tests: generate ELF constants from elf.h Generalize generate_dwarf_constants.py for ELF and replace tests/elf.py with the generated version. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-19 17:02:32 -08:00
Omar Sandoval	cb8bf339c8	tests: elfwriter: don't add sections if there aren't any Only add SHT_NULL and .shstrtab sections if there are other sections to be added. This allows us to create core dumps with no sections, like core dumps on Linux. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-19 15:12:31 -08:00
Omar Sandoval	681d8453ce	tests: elfwriter: set e_phoff to zero if there are no segments readelf warns that a non-zero e_phoff with a zero e_phnum is invalid: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-19 15:11:54 -08:00
Jay Kamat	3700bb75b8	libdrgn: Follow typedefs in enum backing type lookup In C++ enums can be a typedef to an int, not just an int itself. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-11-18 13:48:31 -08:00
Omar Sandoval	a5845e63d4	tests: fix race condition in stack trace tests Stephen Brennan reported a flaky test while working on #121: ====================================================================== ERROR: test_by_task_struct (tests.helpers.linux.test_stack_trace.TestStackTrace) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/runner/work/drgn/drgn/tests/helpers/linux/test_stack_trace.py", line 22, in test_by_task_struct self.assertIn("pause", str(self.prog.stack_trace(find_task(self.prog, pid)))) ValueError: cannot unwind stack of running task The problem is that the stack trace tests wait for the thread state to change to "S". However, the state is updated while the thread is still technically running. For example, the pause() system call is implemented as: SYSCALL_DEFINE0(pause) { while (!signal_pending(current)) { __set_current_state(TASK_INTERRUPTIBLE); schedule(); } return -ERESTARTNOHAND; } If Program.stack_trace() accesses the thread after the state is changed but before the thread has actually been scheduled out (namely, before task_struct::on_cpu is set to 0), it will fail. Instead, let's check /proc/$pid/syscall, which contains "running" until the thread is completely scheduled out. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-04 14:13:55 -07:00
Omar Sandoval	3c52b18baa	tests: skip PID memory read test if /proc/$pid/mem doesn't work This works around a QEMU bug (https://gitlab.com/qemu-project/qemu/-/issues/698) which causes Packit build failures on 32-bit ARM. This should unblock #126. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-10-28 14:41:44 -07:00
Omar Sandoval	801f9d645c	tests: improve cgroup helper tests These haven't been running in vmtest since they were added. Enable cgroup2 in vmtest and rework the cgroup tests to create cgroups that we can test with. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-09-02 16:05:46 -07:00
Peilin Ye	f82273749d	helpers: Add qdisc_lookup() Add a helper, qdisc_lookup(), to get a Qdisc (struct Qdisc *) from a network device and a major handle number. As an example: >>> eth0 = netdev_get_by_name(prog, "eth0") >>> tbf = qdisc_lookup(eth0, 0x20) >>> tbf.ops.id.string_().decode() tbf >>> ingress = qdisc_lookup(eth0, 0xffff) >>> ingress.ops.id.string_().decode() ingress Testing depends on pyroute2. `TestTc` is skipped if pyroute2 is not found; test_qdisc_lookup() is skipped if the kernel is not built with the following options: CONFIG_DUMMY CONFIG_NET_SCH_PRIO CONFIG_NET_SCH_SFQ CONFIG_NET_SCH_TBF CONFIG_NET_SCH_INGRESS Suggested-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-09-02 11:03:00 -07:00
Peilin Ye	a01131483d	helpers: Add netdev_for_each_tx_queue() Add a helper, netdev_for_each_tx_queue(), to iterate over all TX queues of a network device. As an example: >>> eth0 = netdev_get_by_name(prog, "eth0") >>> for txq in netdev_for_each_tx_queue(eth0): ... print(txq.qdisc.ops.id.string_().decode()) ... sfq tbf prio pfifo_fast Set up `net` in setUpClass(), since now several tests use it. Also use it in test_netdev_get_by_{index,name}(), instead of assuming `init_net`. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-09-02 11:03:00 -07:00
Omar Sandoval	77b9d3ad98	tests: change LinuxHelperTestCase.setUp to setUpClass This already caches class variables, and it's shared across all Linux helper test cases, so it makes more sense as setUpClass. This will also allow subclasses to use cls.prog in their own setUpClass. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-31 17:43:18 -07:00
Stephen Brennan	207ca0e16b	tests: Add Symbol test Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>	2021-08-20 18:16:57 -07:00
Peilin Ye	242d1484f9	helpers: Add get_net_ns_by_{inode,fd}() Add a helper, get_net_ns_by_inode(), to get the network namespace ("netns") descriptor (struct net *) given a netns NSFS pseudo-file inode (struct inode *), e.g. "/proc/$PID/ns/net" or "/run/netns/$NAME". As an example: >>> inode = path_lookup(prog, "/run/netns/foo").dentry.d_inode >>> net = get_net_ns_by_inode(inode) >>> netdev = netdev_get_by_name(net, "eth3") >>> netdev.ifindex.value_() 5 Conventionally ip netns files can be found under "/var/run/netns/", while Docker netns files can be found under "/var/run/docker/netns". However, as pointed out by Omar, path_lookup() doesn't know how to deal with symlinks; resolve the path using something like "pwd -P" before passing it to path_lookup(). Also add a get_net_ns_by_fd() wrapper around it as suggested by Omar. Example: >>> import os >>> pid = os.getpid() >>> task = find_task(prog, pid) >>> file = open(f"/proc/{pid}/ns/net") >>> net = get_net_ns_by_fd(task, file.fileno()) Add a test for get_net_ns_by_inode(). Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-08-20 11:39:56 -07:00
Peilin Ye	cf06be1813	helpers: Add for_each_net() Add a helper to iterate over all network namespaces in the system. As an example: >>> for net in for_each_net(prog): ... if netdev_get_by_name(net, "enp0s3"): ... print(net.ipv4.sysctl_ip_early_demux.value_()) ... 1 Also add a test for this new helper to tests/helpers/linux/test_net.py. Suggested-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-08-16 16:48:25 -07:00
Peilin Ye	557b8152cc	helpers: Add netdev_get_by_name() Add a helper to get the network device ("struct net_device *") given an interface name. As an example: >>> netdev = netdev_get_by_name(prog["init_net"], "lo") >>> netdev.ifindex.value_() 1 Or pass a "Program" as the first argument, and let the helper look in the initial network namespace (i.e. "init_net"): >>> netdev = netdev_get_by_name(prog, "dummy0") >>> netdev.ifindex.value_() 2 Also add a test for this new helper to tests/helpers/linux/test_net.py. This helper simply does a linear search over the name hash table of the network namespace, since implementing hashing in drgn is non-trivial. It is obviously slower than net/core/dev.c:netdev_name_node_lookup() in the kernel, but still useful. Linux kernel commit ff92741270bf ("net: introduce name_node struct to be used in hashlist") introduced struct netdev_name_node for name lookups. Start by assuming that the kernel has this commit, and fall back to the old path if that fails. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-08-11 20:23:39 -07:00
Omar Sandoval	5541fad063	Fix some flake8 errors Mainly unused imports, unused variables, unnecessary f-strings, and regex literals missing the r prefix. I'm not adding it to the CI linter because it's too noisy, though. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-11 14:52:44 -07:00
Peilin Ye	e9915886f6	helpers: Add netdev_get_by_index() Add a helper to find the corresponding "struct net_device *" object given an interface index number. As an example: >>> netdev = netdev_get_by_index(prog["init_net"], 1) >>> netdev.name.string_().decode() 'lo' Or pass a "Program" as the first argument, and let the helper look in its initial network namespace (i.e. "init_net"): >>> netdev = netdev_get_by_index(prog, 3) >>> netdev.name.string_().decode() 'enp0s3' Also add a test for this new helper to tests/helpers/linux/test_net.py. For now, a user may combine this new helper with socket.if_nametoindex() to look up by interface name: >>> netdev = netdev_get_by_index(prog, socket.if_nametoindex("dummy0")) >>> netdev.name.string_().decode() 'dummy0' However, as mentioned by Cong, one should keep in mind that socket.if_nametoindex() is based on the system's current name-to-index mapping, which may be different from that of e.g. a kdump. Thus, as suggested by Omar, a better way to do name lookups would be simply linear-searching the name hash table, which is slower, but less error-prone. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>	2021-08-11 13:36:32 -07:00
Omar Sandoval	51f63bb53b	helpers: add node_state() to nodemask helpers Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-03 17:06:36 -07:00
Omar Sandoval	1213eb8f49	helpers: add bit operation helpers Extract for_each_set_bit() that was added internally for the cpumask and nodemask helpers, and add for_each_clear_bit() and test_bit() to go with it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-03 15:43:04 -07:00
Qi Zheng	2f97cb4f75	helpers: add kernel nodemask helpers Sometimes we want to traverse numa nodes in the system, so add kernel nodemask helpers to support this. Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>	2021-07-29 19:28:59 -07:00
Omar Sandoval	7335df114c	libdrgn: python: add Object.to_bytes_() And the libdrgn implementation, drgn_object_read_bytes(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:12:34 -07:00
Omar Sandoval	9c00552007	libdrgn: python: add Object.from_bytes_() Add a way to create an object from raw bytes. One example where I've wanted this is creating a struct pt_regs from a PRSTATUS note or other source. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:06:58 -07:00
Omar Sandoval	2e04e6b73c	libdrgn: binary_buffer: handle non-canonical LEB128 numbers LEB128 allows for redundant zero/sign bits, but we currently always treat extra bytes as overflow. Let's check those bytes correctly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 21:39:31 -07:00
Omar Sandoval	86e966fbf8	libdrgn: dwarf_index: handle DW_FORM_block Somehow I missed this form, and I've never seen it used. It's the same as DW_FORM_exprloc for our purposes, so it's an easy fix. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-30 01:34:52 -07:00
Omar Sandoval	faad25d7b2	libdrgn: debug_info: fix address of objects with size zero The stack trace variable work introduced a regression that causes objects with size zero to always be marked absent even if they have an address. This matters because GCC sometimes seems to omit the complete array type for arrays declared without a length, so an array variable can end up with an incomplete array type. I saw this with the "swapper_spaces" variable in mm/swap_state.c from the Linux kernel. Make sure to use the address of an empty piece if the variable is also empty. Fixes: `ffcb9ccb19` ("libdrgn: debug_info: implement creating objects from DWARF location descriptions") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 15:46:22 -07:00
Omar Sandoval	f7fe93e573	cli: show elfutils version in use drgn depends heavily on libelf and libdw, so it's useful to know what version we're using. Add drgn._elfutils_version and use that in the CLI and in the test cases where we currently check the libdw version. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-07 11:10:50 -07:00
Omar Sandoval	bc85767e5f	libdrgn: support looking up parameters and variables in stack traces After all of the preparatory work, the last two missing pieces are a way to find a variable by name in the list of scopes that we saved while unwinding, and a way to find the containing scopes of an inlined function. With that, we can finally look up parameters and variables in stack traces. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	ffcb9ccb19	libdrgn: debug_info: implement creating objects from DWARF location descriptions Add support for evaluating a DWARF location description and translating it into a drgn object. In this commit, this is just used for global variables, but an upcoming commit will wire this up to stack traces for parameters and local variables. There are a few locations that drgn's object model can't represent yet. DW_OP_piece/DW_OP_bit_piece can describe objects that are only partially known or partially in memory; we approximate these where we can. We don't have a good way to support DW_OP_implicit_pointer at all yet. This also adds test cases for DWARF expressions, which we couldn't easily test before. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-05 16:18:51 -07:00
Omar Sandoval	0e3054a0ba	libdrgn: make addresses wrap around when reading memory Define that addresses for memory reads wrap around after the maximum address rather than the current unpredictable behavior. This is done by: 1. Reworking drgn_memory_reader to work with an inclusive address range so that a segment can contain UINT64_MAX. drgn_memory_reader remains agnostic to the maximum address and requires that address ranges do not overflow a uint64_t. 2. Adding the overflow/wrap-around logic to drgn_program_add_memory_segment() and drgn_program_read_memory(). 3. Changing direct uses of drgn_memory_reader_read() to drgn_program_read_memory() now that they are no longer equivalent. (For some platforms, a fault might be more appropriate than wrapping around, but this is a step in the right direction.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-03 17:49:29 -07:00
Omar Sandoval	cf371594f3	tests: run a few test cases with DW_FORM_indirect Pick a few DWARF parsing test cases that exercise the interesting cases for DW_FORM_indirect and run them with and without DW_FORM_indirect. We only test DW_FORM_indirect if libdw is new enough to support it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-04 16:56:54 -07:00
Omar Sandoval	609a1cafc6	libdrgn: dwarf_index: check for attribute forms more strictly Rather than silently ignoring attributes whose form we don't recognize, return an error. This way, we won't mysteriously skip indexing DIEs. While we're doing this, split the form -> instruction mapping to its own functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-05-04 16:56:54 -07:00
Jay Kamat	c108f9a24c	tests: add basic tests for type units Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-04-23 02:37:31 -07:00
Omar Sandoval	6b79b21ab5	tests: fix test depending on repr(enum.Flag) format CPython commit b775106d940e ("bpo-40066: Enum: modify `repr()` and `str()` (GH-22392)") changed repr(enum.Flag) from, e.g., <Qualifiers.VOLATILE|CONST: 3> to Qualifiers.CONST|Qualifiers.VOLATILE. Fix tests.test_type.TestType.test_qualifiers to not assume the format. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-22 01:17:22 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Davide Cavalca	7ca157316f	tests: properly escape regexp strings Signed-off-by: Davide Cavalca <dcavalca@fb.com>	2021-04-02 10:37:33 -07:00
Davide Cavalca	081d7773e1	tests: rename test_type_dies for pytest compatibility Signed-off-by: Davide Cavalca <dcavalca@fb.com>	2021-04-02 10:37:14 -07:00
Omar Sandoval	630d39e345	libdrgn: add ORC unwinder The Linux kernel has its own stack unwinding format for x86-64 called ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is essentially a simplified, less complete version of DWARF CFI. ORC is generated by analyzing machine code, so it is present for all but a few ignored functions. In contrast, DWARF CFI is generated by the compiler and is therefore missing for functions written in assembly and inline assembly (which is widespread in the kernel). This implements an ORC stack unwinder: it applies ELF relocations to the ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule kind, parses and efficiently stores ORC data, and translates ORC to drgn CFI rules. This will allow us to stack trace through assembly code, interrupts, and system calls. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-29 10:01:52 -07:00
Omar Sandoval	12723a0c08	tests: clean up tests.helpers.linux.test_debug_info Split the two modes into separate tests and move the environment variable fiddling into a separate helper function. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-26 12:49:06 -07:00
Omar Sandoval	da0280016c	libdrgn: python: identify bit fields in TypeMember.__repr__ If a member is a bit field, then we should format it with the underlying Object so that it shows the bit field size. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-17 12:02:53 -07:00
Jay Kamat	c22e501295	libdrgn: debug_info: fix parsing specifications of declarations drgn_compound_type_from_dwarf() and drgn_enum_type_from_dwarf() check the DW_AT_declaration flag to decide whether the type is a declaration of an incomplete type or a definition of a complete type. However, they check DW_AT_declaration with dwarf_attr_integrate(), which follows the DW_AT_specification reference if it is present. The DIE referenced by DW_AT_specification typically is a declaration, so this erroneously identifies definitions as declarations. Additionally, if drgn_debug_info_find_complete() finds the same definition, we can end up recursing until we hit the DWARF parsing depth limit. Fix it by not using dwarf_attr_integrate() for DW_AT_declaration. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2021-02-25 10:46:34 -08:00
Omar Sandoval	85dec2b8f6	tests: move C-specific tests from test_object to test_language_c TestCLiteral, TestCIntegerPromotion, TestCCommonRealType, TestCOperators, and TestCPretty in test_object all test various operations on objects, but since they're testing language-specific behavior, they belong in test_language_c. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 16:11:19 -08:00
Omar Sandoval	55e3a58e06	libdrgn: python: use correct member offset when creating object from value We need to use the offset of the member in the outermost object type, not the offset in the immediate containing type in the case of nested anonymous structs. Fixes: `e72ecd0e2c` ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:29:59 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	72b4aa9669	libdrgn: clean up object initialization Rename struct drgn_object_type to struct drgn_operand_type, add a new struct drgn_object_type which contains all of the type-related fields from struct drgn_object, and use it to implement drgn_object_type() and drgn_object_type_operand(), which are replacements for drgn_object_set_common() and drgn_object_type_encoding_and_size(). This cleans up a lot of the boilerplate around initializing objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 17:43:14 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Omar Sandoval	b899a10836	Remove register numbers from API and add register aliases enum drgn_register_number in the public libdrgn API and drgn.Register.number in the Python bindings are basically exports of DWARF register numbers. They only exist as a way to identify registers that's lighter weight than string lookups. libdrgn already has struct drgn_register, so we can use that to identify registers in the public API and remove enum drgn_register_number. This has a couple of benefits: we don't depend on DWARF numbering in our API, and we don't have to generate drgn.h from the architecture files. The Python bindings can just use string names for now. If it seems useful, StackFrame.register() can take a Register in the future, we'll just need to be careful to not allow Registers from the wrong platform. While we're changing the API anyways, also change it so that registers have a list of names instead of one name. This isn't needed for x86-64 at the moment, but will be for architectures that have multiple names for the same register (like ARM). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-28 17:47:45 -08:00
Omar Sandoval	bbefc573d8	libdrgn: debug_info: make sure DW_TAG_template_value_parameter has value Otherwise, an invalid DW_TAG_template_value_parameter can be confused for a type parameter. Fixes: `352c31e1ac` ("Add support for C++ template parameters") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 12:07:46 -08:00
Omar Sandoval	5f170ea3f3	helpers: add per_cpu() The correct way to access global per-CPU variables (per_cpu_ptr(prog[name].address_of_(), cpu)) has been a common source of confusion (see #77). Add an analogue to the per_cpu() macro in the kernel as a shortcut and document it as the easiest method for getting a global per-CPU variable: per_cpu(prog[name], cpu). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 11:40:05 -08:00
Omar Sandoval	81a203c48f	helpers: fix for_each_{possible,online,present}_cpu() on v4.4 Also reorder the definitions to alphabetical order and add tests. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-21 10:08:48 -08:00
Omar Sandoval	352c31e1ac	Add support for C++ template parameters Add struct drgn_type_template_parameter to libdrgn, the corresponding TypeTemplateParameter to the Python bindings, and support for parsing them from DWARF. With this, support for templates is almost, but not quite, complete. The main wart is that DW_AT_name of compound types includes the template parameters, so the type tag includes it as well. We should remove that from the tag and instead have the type formatting code add it only when getting the full type name. Based on a patch from Jay Kamat. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	a57c26ed32	libdrgn: fix zero-length array GCC < 9.0 workaround for qualified types We're not applying the zero-length array workaround when the array type is qualified. Make sure we pass through can_be_incomplete_array when parsing DW_TAG_{const,restrict,volatile,atomic}_type. Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 11:21:57 -08:00
Omar Sandoval	988e9e7190	libdrgn/python: add Object.absent_ Without this, the only way to check whether an object is absent in Python is to try to use the object and catch the ObjectAbsentError. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 15:06:40 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	c2eec00ae0	libdrgn/python: use None instead of 0 for TypeMember.bit_field_size Make TypeMember.bit_field_size consistent with Object.bit_field_size_ by using None to represent a non-bit field instead of 0. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-25 01:53:23 -08:00
Omar Sandoval	7d7aa7bf7b	libdrgn/python: remove Type == operator The == operator on drgn.Type is only intended for testing. It's expensive and slow and not what people usually want. It's going to get even more awkward to define once types can refer to objects (for template parameters and static members and such). Let's replace == with a new identical() function only available in unit tests. Then, remove the operator from the Python bindings as well as the underlying libdrgn drgn_type_eq() and drgn_qualified_type_eq() functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 03:11:38 -08:00
Omar Sandoval	523fd26959	libdrgn: don't allow casting to non-scalar types at all Currently, we try to emulate the GNU C extension of casting a struct type to itself. This does a deep type comparison, which is expensive. We could take a shortcut like only comparing the kind and type name, but seeing as standard C only allows casting to a scalar type, let's drop support for casting to a struct (or other non-scalar) type entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 02:46:05 -08:00
Omar Sandoval	40004e5c8f	libdrgn/python: add offsetof() offsetof() can almost be implemented with Type.member(name).offset, but that doesn't parse member designators. Add an offsetof() function that does (and add drgn_type_offsetof() in libdrgn). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:46:41 -08:00
Omar Sandoval	fd04463596	libdrgn/python: add Type.member() In Python, looking up a member in a drgn Type by name currently looks something like: member = [member for member in type.members if member.name == "foo"][0] Add a Type.member(name) method, which is both easier and more efficient. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:10:23 -08:00
Omar Sandoval	e72ecd0e2c	libdrgn: replace drgn_program_member_info() with drgn_type_find_member() Now that types are associated with their program, we don't need to pass the program separately to drgn_program_member_info() and can replace it with a more natural drgn_type_find_member() API that takes only the type and member name. While we're at it, get rid of drgn_member_info and return the drgn_type_member and bit_offset directly. This also fixes a bug that drgn_error_member_not_found() ignores the member name length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:40:54 -08:00
Omar Sandoval	cf9a068820	libdrgn/python: fix reference counting on Type.members and Type.parameters The TypeMember and TypeParameter instances referring to a libdrgn drgn_lazy_type are only valid as long as the Type containing them is still alive. Hold a reference on the containing Type from LazyType. We can do this without growing LazyType by getting rid of the enum state and using sentinel values for LazyType::lazy_type as the state. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:09:12 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	bce9ef5f8d	libdrgn: linux kernel: remove THREAD_SIZE object finder THREAD_SIZE is still broken and I haven't looked into the root cause (see commit `95be142d17` ("tests: disable THREAD_SIZE test")). We don't need it anymore anyways, so let's remove it entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-10 02:08:13 -08:00
Omar Sandoval	97fbedec1f	libdrgn: return unavailable objects for DWARF objects without value or address Now that we have the concept of unavailable objects, use it for DWARF where appropriate. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 14:15:09 -08:00
Omar Sandoval	6bd0c2b4d2	libdrgn: add concept of "unavailable" objects There are some situations where we can find an object but can't determine its value, like local variables that have been optimized out, inlined functions without a concrete instance, and pure virtual methods. It's still useful to get some information from these objects, namely their types. Let's add the concept of an "unavailable" object, which is an object with a known type but unknown value/address. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:58:19 -08:00
Omar Sandoval	5f17281926	libdrgn: make drgn_object::is_reference an enum To prepare for a new kind of object, replace the is_reference bool with an enum drgn_object_kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:37:58 -08:00
Omar Sandoval	e7caa24176	tests: test kernel module debug info loading Now that vmtest supports kernel modules, test that we load them correctly. Closes #74. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-18 01:19:27 -07:00
Omar Sandoval	4431b4f918	vmtest: enable kernel modules We currently build with CONFIG_MODULES=n for simplicity. However, this means that we don't test kernel module support at all. Let's enable module support. This requires changing how we distribute kernels. Now, the /lib/modules/$(uname -r) directory (including the vmlinux and vmlinuz) is bundled up as a tarball. We extract it, then mount it with VirtFS, and do some extra setup for device nodes. (We lose the ability to run kernel builds directly, but I've never actually used that functionality.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-18 01:13:01 -07:00
Omar Sandoval	4cbb9b552a	libdrgn: fix comparison of types with anonymous members drgn_type_members_eq() skips comparing the types of anonymous members. Fix that and add a test for it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-08 17:32:46 -07:00
Jay Kamat	d1beb0184a	libdrgn: add support for objects in C++ namespaces DWARF represents namespaces with DW_TAG_namespace DIEs. Add these to the DWARF index, with each namespace being its own sub-index. We only index the namespace itself when it is first accessed, which should help with startup time and simplifies tracking. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	26291647eb	libdrgn: dwarf_index: handle DW_AT_specification DIEs with two passes We currently handle DIEs with a DW_AT_specification attribute by parsing the corresponding declaration to get the name and inserting the DIE as usual. This has a couple of problems: 1. It only works if DW_AT_specification refers to the same compilation unit, which is true for DW_FORM_ref{1,2,4,8,_udata}, but not DW_FORM_ref_addr. As a result, drgn doesn't support the latter. 2. It assumes that the DIE with DW_AT_specification is in the correct "scope". Unfortunately, this is not true for g++: for a variable definition in a C++ namespace, it generates a DIE with DW_AT_declaration as a child of the DW_TAG_namespace DIE and a DIE which refers to the declaration with DW_AT_specification _outside_ of the DW_TAG_namespace as a child of the DW_TAG_compilation_unit DIE. Supporting both of these cases requires reworking how we handle DW_AT_specification. This commit takes an approach of parsing the DWARF data in two passes: the first pass reads the abbreviation and file name tables and builds a map of instances of DW_AT_specification; the second pass indexes DIEs as before, but ignores DIEs with DW_AT_specification and handles DIEs with DW_AT_declaration by looking them up in the map built by the first pass. This approach is a 10-20% regression in indexing time in the benchmarks I ran. Thankfully, it is not 100% slower for a couple of reasons. The first is that the two passes are simpler than the original combined pass. The second is that a decent part of the indexing time is spent faulting in the mapped debugging information, which only needs to happen once (even if the file is cached, minor page faults add non-negligible overhead). This doesn't handle DW_AT_specification "chains" yet, but neither did the original code. If it is necessary, it shouldn't be too difficult to add. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-02 17:13:16 -07:00
Omar Sandoval	36068a0ea8	Fix trailing commas for Black v20.8b1 Black was recently changed to treat a trailing comma as an indicator to put each item/argument on its own line. We have a bunch of places where something previously had to be split into multiple lines, then was edited to fit on one line, but Black kept the trailing comma. Now this update wants to unnecessarily split it back up. For now, let's get rid of these commas. Hopefully in the future Black has a way to opt out of this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:29 -07:00
Omar Sandoval	2fc514f2a4	libdrgn/python: add Qualifiers.NONE and stop using Optional[Qualifiers] I originally did it this way because pydoc doesn't handle non-trivial defaults in signature very well (see commit `67a16a09b8` ("tests: test that Python documentation renders")). drgndoc doesn't generate signature for pydoc anymore, though, so we don't need to worry about it and can clean up the typing. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:29 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	c31208f69c	libdrgn: fold drgn_type_index into drgn_program This is preparation for associating types with a program. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:36:35 -07:00
Omar Sandoval	4e770fb18a	Format imports with isort Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-20 16:55:07 -07:00
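
The two-pass handling of DW_AT_specification described in commit 26291647eb can be illustrated with a small pure-Python sketch. This is not drgn's implementation (which is in C and works on raw DWARF data): the `Die` class, the `walk()`, `build_specification_map()`, and `index_dies()` helpers, and the offsets used are all hypothetical and exist only to show the idea of building a declaration-to-definition map first and consulting it while indexing.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Die:
    # Hypothetical, heavily simplified stand-in for a DWARF DIE.
    offset: int
    tag: str
    name: Optional[str] = None
    declaration: bool = False            # models DW_AT_declaration
    specification: Optional[int] = None  # models DW_AT_specification (offset of the declaration)
    children: List["Die"] = field(default_factory=list)


def walk(die: Die, scope: Tuple[str, ...] = ()) -> Iterator[Tuple[Die, Tuple[str, ...]]]:
    """Yield every DIE together with the namespace scope it appears in."""
    yield die, scope
    child_scope = scope + (die.name,) if die.tag == "namespace" else scope
    for child in die.children:
        yield from walk(child, child_scope)


def build_specification_map(cu: Die) -> Dict[int, Die]:
    """Pass 1: map each declaration's offset to the DIE that refers to it."""
    return {
        die.specification: die
        for die, _ in walk(cu)
        if die.specification is not None
    }


def index_dies(cu: Die, spec_map: Dict[int, Die]) -> Dict[Tuple[str, ...], Die]:
    """Pass 2: index named DIEs by scoped name, resolving declarations via the map."""
    index: Dict[Tuple[str, ...], Die] = {}
    for die, scope in walk(cu):
        # DIEs with a specification are ignored here; they are reached
        # through their declaration instead. Containers are not indexed
        # in this sketch.
        if die.specification is not None or die.tag in ("compile_unit", "namespace"):
            continue
        if die.declaration:
            definition = spec_map.get(die.offset)
            if definition is not None:
                index[scope + (die.name,)] = definition
        elif die.name is not None:
            index[scope + (die.name,)] = die
    return index


# Layout like the g++ case in the commit message: the declaration lives inside
# the namespace, while the DIE that refers to it is a child of the unit.
cu = Die(0x0, "compile_unit", children=[
    Die(0x10, "namespace", name="ns", children=[
        Die(0x20, "variable", name="x", declaration=True),
    ]),
    Die(0x30, "variable", specification=0x20),
])
index = index_dies(cu, build_specification_map(cu))
```

In this sketch the definition at offset 0x30 ends up indexed under the scoped name `("ns", "x")` even though it sits outside the namespace DIE, because the name and scope come from the declaration rather than from the DIE carrying the specification.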


337 Commits