There's no reason to go through the trouble of checking the task_struct
if we were given a PRSTATUS note; it must be a thread that was running
at the time of the core dump. Refactor drgn_get_initial_registers() so
that we can use PRSTATUS earlier.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Maple trees have been around and used for VMAs for almost a year now
(since Linux 6.1). Finally add helpers and tests for them.
Closes #261.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This makes the cpumask tests a little more thorough, as now the online
mask will be different from the possible and present masks. It also
makes the cpulist discontiguous in most cases (since you usually can't
offline CPU 0).
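
To illustrate why offlining a CPU makes the cpulist discontiguous, here is a minimal sketch of cpulist-style formatting. The function name format_cpulist() and the four-CPU machine are hypothetical and are not taken from the tests.

```python
def format_cpulist(cpus):
    """Format CPU numbers as a cpulist string such as "0,2-3"."""
    ranges = []
    for cpu in sorted(set(cpus)):
        if ranges and cpu == ranges[-1][1] + 1:
            # Extend the current contiguous range.
            ranges[-1][1] = cpu
        else:
            # Start a new range.
            ranges.append([cpu, cpu])
    return ",".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )


possible = range(4)  # assumed: a machine with CPUs 0-3
online = [cpu for cpu in possible if cpu != 1]  # CPU 1 offlined
print(format_cpulist(possible))  # 0-3
print(format_cpulist(online))  # 0,2-3
```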
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Upcoming tests will need to combine flags.
Fixes: 104a14781d ("tests: test compressed debug sections")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Peter Collingbourne reported that the over-reading we do in the AArch64
page table iterator uses too much bandwidth for remote targets. His
original proposal in #312 was to change the page table iterator to only
read one entry per level. However, this would regress large reads that
do end up using the additional entries (in particular when the target is
/proc/kcore, which has a high latency per read but also high enough
bandwidth that the over-read is essentially free).
We can get the best of both worlds by informing the page table iterator
how much we expect to need (at the cost of some additional complexity in
this admittedly already pretty complex code). Requiring an accurate end
would limit the flexibility of the page table iterator and be more
error-prone, so let's make it a non-binding hint.
Add the hint and use it in the AArch64 page table iterator to only read
as many entries as necessary. Also extend the test case for large page
table reads to test this better.
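
The effect of a non-binding hint can be sketched roughly as follows. This is not the libdrgn implementation: entries_to_read(), the 4 KiB page size, and the 512-entry table are assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages
ENTRIES_PER_TABLE = 512  # assumed 512 entries per last-level table


def entries_to_read(virt_addr, size_hint):
    """Return how many last-level entries to read starting at virt_addr.

    size_hint is advisory: at least one entry is always read, and the read
    never extends past the end of the current table.
    """
    index = (virt_addr // PAGE_SIZE) % ENTRIES_PER_TABLE
    remaining_in_table = ENTRIES_PER_TABLE - index
    # Number of pages touched by [virt_addr, virt_addr + size_hint).
    offset = virt_addr % PAGE_SIZE
    wanted = max(1, -(-(offset + size_hint) // PAGE_SIZE))
    return min(wanted, remaining_in_table)


print(entries_to_read(0x1000, 8))  # 1: a small read needs one entry
print(entries_to_read(0x0, 4 * 1024 * 1024))  # 512: capped at the table size
```

Because the hint is only a hint, a caller that underestimates still gets correct results; it just pays for another read later.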
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that the ppc64 architecture has complete MM (Memory Management)
support, enable testing for it.
Add the ppc64 architecture, along with other architectures, to the
HAVE_FULL_MM_SUPPORT list on separate lines to prevent pre-commit
black failure, as recommended by pre-commit.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
libdrgn/include/elf.h no longer has the definitions that the script
needs. Update it to use the system elf.h and update tests/elf.py with one
new definition. Also make scripts/gen_elf_compat.py read elf.h in the
same way.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test is mistakenly using the default program, not the one it set up
to prefer ORC. This made us miss the previous bug.
Fixes: a45603a884 ("tests: use test kmod for more stack trace tests")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Requiring the elaborated type specifier has been a common source of
confusion for people debugging C++ applications with drgn, since this
makes it look like the type doesn't exist or debug info was missing.
Now that drgn_program_find_type_impl() can look up multiple type kinds
in one shot, make c_family_find_type() look up struct, union, class, and
enum types in addition to typedefs when it gets a plain identifier and
the program is C++.
Closes #348.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
find_namespace_containing_die() only looks for DW_TAG_namespace DIEs
containing the target DIE, but it also needs to look for nested
classes/structs/unions. Consider the following program:
  namespace ns {
    class Bar { ... };
    class Foo {
      class Bar { ... };
      ...
    };
  };
If we encounter a declaration DIE for ns::Foo::Bar, we'll end up looking
for the definition directly in ns and finding ns::Bar instead, which is
a completely different type.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
C++ supports defining classes inside of other classes (as well as
structs and unions). In C++, these are accessed with the same scope
resolution operator (::) as namespaces. We can now support this by
extending drgn_namespace_dwarf_index to index the children of
DW_TAG_class_type, DW_TAG_structure_type, and DW_TAG_union_type DIEs the
same way it currently does DW_TAG_namespace DIEs. The only
complication is that declaration class, struct, and union DIEs can also
have children for nested definitions. Rather than mixing declarations in
with type definitions, we pretend they're DW_TAG_namespace DIEs.
Closes #262.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When we encounter an incomplete struct, union, class, or enum type, we
try to find the complete definition by name. We also try to detect
whether the name is ambiguous, i.e., whether there are multiple distinct
types with that name. This is based on the DWARF index's deduplication
by filename: if the index contains more than one DIE matching the (name,
tag), then the type name was defined in more than one file, and
therefore it is ambiguous.
However, this breaks if the exact same definition came from different
paths. For example, a Linux kernel module built out-of-tree may use
different paths than the original kernel build. Other scenarios
involving the compilation directory could also affect this.
Furthermore, this check won't be feasible with an upcoming rework of the
DWARF index.
Let's drop the check and return the first match regardless of other
matches. Hopefully it doesn't matter too much in practice. If the wrong
type is returned, it can be worked around by casting to the correct type
looked up by filename.
Closes #186.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The Linux kernel's struct task_struct on AArch64 contains an array of
__uint128_t:
  >>> task = find_task(prog, 1)
  >>> task.type_
  struct task_struct *
  >>> task.thread.type_
  struct thread_struct {
          struct cpu_context cpu_context;
          struct {
                  unsigned long tp_value;
                  unsigned long tp2_value;
                  struct user_fpsimd_state fpsimd_state;
          } uw;
          enum fp_type fp_type;
          unsigned int fpsimd_cpu;
          void *sve_state;
          void *sme_state;
          unsigned int vl[2];
          unsigned int vl_onexec[2];
          unsigned long fault_address;
          unsigned long fault_code;
          struct debug_info debug;
          struct ptrauth_keys_user keys_user;
          struct ptrauth_keys_kernel keys_kernel;
          u64 mte_ctrl;
          u64 sctlr_user;
          u64 svcr;
          u64 tpidr2_el0;
  }
  >>> task.thread.uw.fpsimd_state.type_
  struct user_fpsimd_state {
          __int128 unsigned vregs[32];
          __u32 fpsr;
          __u32 fpcr;
          __u32 __reserved[2];
  }
As a result, printing a task_struct fails:
  >>> task
  Traceback (most recent call last):
    File "<console>", line 1, in <module>
    File "/host/home/osandov/repos/drgn3/drgn/cli.py", line 140, in _displayhook
      text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns)
  NotImplementedError: integer values larger than 64 bits are not yet supported
PR #311 suggested treating >64-bit integers as byte arrays for now; I
tried an alternate hack of handling >64-bit integers only in the
pretty-printing code. Both of these had issues, though.
Instead, let's push >64-bit integer support a little further and allow
storing "big integer" value objects. We still don't support any
operations on them, so this still doesn't complete #170. We store the
raw bytes of the value for now, but we'll probably change this if we add
support for operations (e.g., to store the value as an mp_limb_t array
for GMP). We also print >64-bit integer types in hexadecimal for
simplicity. This is inconsistent with the existing behavior of printing
in decimal, but more readable. In the future, we might want to add
heuristics to decide when to print in decimal vs hexadecimal for all
sizes.
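
As a rough illustration of storing the raw bytes and printing in hexadecimal, here is a minimal sketch in plain Python. format_big_int() is a hypothetical name and does not reflect how libdrgn implements this in C.

```python
def format_big_int(raw, little_endian=True, signed=False):
    """Format the raw bytes of a wide integer as a hexadecimal string."""
    byteorder = "little" if little_endian else "big"
    value = int.from_bytes(raw, byteorder, signed=signed)
    return hex(value)


# A 128-bit value with every bit set, as it might sit in memory.
raw = bytes([0xFF] * 16)
print(format_big_int(raw))  # 0xffffffffffffffffffffffffffffffff
print(format_big_int(raw, signed=True))  # -0x1
```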
Closes #311.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This is similar to commit 155ec92ef2 ("libdrgn: fix reading 32-bit
float object values on big-endian").
Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We've addressed all of the smaller differences with GNU Debug Fission
and split DWARF 5, so now all that remains is the DWARF index.
The general approach is: in drgn_dwarf_index_read_cus(), for each CU,
ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE
from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and
then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file
(cached in a hash table in the struct drgn_module), which the DWARF
index can work with from there.
Additionally, a couple of places (.debug_addr parsing and stack trace
local variable lookup) need to be updated to use the correct
drgn_elf_file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than coming up with our own, separate logging API for the Python
bindings, let's integrate with the logging module. The straightforward
part is creating a logger from the C extension and adding a log callback
that calls its log() method. However, syncing the log level between the
logging module and libdrgn requires monkey patching.
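
The straightforward half can be sketched in pure Python as below. The logger name "drgn_sketch" and both function names are hypothetical stand-ins, and the monkey patching needed to keep the levels in sync is deliberately not shown.

```python
import logging

# Hypothetical logger name for this sketch.
logger = logging.getLogger("drgn_sketch")


def log_callback(level, message):
    """Stand-in for the callback the C extension registers with the library."""
    # Logger.log() drops the message if the level is not enabled.
    logger.log(level, "%s", message)


def current_level():
    """Effective level that would have to be mirrored into the C library."""
    return logger.getEffectiveLevel()
```

The hard part is that the logging module has no hook that fires when a level changes, so the library side cannot learn about a new level without intercepting it.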
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I've needed this many times, but there wasn't a corresponding function
in the kernel so I could never decide what to name it. Linux kernel
commit 4d70c74659d9 ("i915: Move list_count() to list.h as
list_count_nodes() for broader use") (in v6.3-rc1) fixed that problem
for me.
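
For readers unfamiliar with the kernel's circular lists, the counting logic amounts to the following sketch. The ListHead class and list_add_tail() here are simplified pure-Python stand-ins, not the drgn helper itself.

```python
class ListHead:
    """Simplified stand-in for the kernel's struct list_head."""

    def __init__(self):
        # An empty list points at itself.
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    """Insert new just before head, i.e., at the end of the list."""
    prev = head.prev
    new.prev = prev
    new.next = head
    prev.next = new
    head.prev = new


def list_count_nodes(head):
    """Count the nodes in the list, not including the head itself."""
    count = 0
    pos = head.next
    while pos is not head:
        count += 1
        pos = pos.next
    return count


head = ListHead()
for _ in range(3):
    list_add_tail(ListHead(), head)
print(list_count_nodes(head))  # 3
```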
Signed-off-by: Omar Sandoval <osandov@osandov.com>
__schedule tends to end up aliased with other special symbols because
it's placed in the .sched.text section. We're handling
__sched_text_start, but on ppc64, I also saw __cpuidle_text_end. Use
drgn_test_function from the test kernel module instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test kernel module has several -Woverflow warnings when compiled for
Arm:
/home/osandov/repos/drgn/build/vmtest/arm/tmp9g8977mn/drgn_test.c:785:9: warning: unsigned conversion from 'long long int' to 'long unsigned int' changes value from '6221254864074593878' to '1448498774' [-Woverflow]
785 | 0x5656565656565656,
| ^~~~~~~~~~~~~~~~~~
/home/osandov/repos/drgn/build/vmtest/arm/tmp9g8977mn/drgn_test.c:786:9: warning: unsigned conversion from 'long long int' to 'long unsigned int' changes value from '1311768465173141112' to '305419896' [-Woverflow]
786 | 0x1234567812345678,
| ^~~~~~~~~~~~~~~~~~
/home/osandov/repos/drgn/build/vmtest/arm/tmp9g8977mn/drgn_test.c:787:9: warning: unsigned conversion from 'long long int' to 'long unsigned int' changes value from '1311768467294899695' to '2427178479' [-Woverflow]
787 | 0x1234567890abcdef,
| ^~~~~~~~~~~~~~~~~~
drgn_test_idr_dense and drgn_test_idr_ptrs aren't actually used by the
tests, so remove them.
Fixes: 4f2c8f0735 ("tests: idr: add test cases for idr.")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of simply checking whether the thread ID is non-zero, write a
specific task name to /proc/self/comm before crashing and check that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
follow_{page,pfn,phys}() translate the virtual address by walking the
page table for a given mm_struct (built on top of the existing page
table iterator interface). vmalloc_to_page() and vmalloc_to_pfn() are
special cases for vmalloc addresses.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The phys <-> pfn helpers only need PAGE_SHIFT, which is always
available; the page <-> pfn helpers only need vmemmap or mem_map, which
is independent of virtual address translation; and the page <-> phys
helpers are built on top of those.
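
The layering can be sketched with plain integer arithmetic. All names and constants below are assumptions for illustration (4 KiB pages, a 64-byte struct page, and an arbitrary vmemmap base); the real helpers read these from the program being debugged.

```python
PAGE_SHIFT = 12  # assumed 4 KiB pages
STRUCT_PAGE_SIZE = 64  # assumed sizeof(struct page)
VMEMMAP_BASE = 0xFFFFEA0000000000  # assumed base of the struct page array


def phys_to_pfn(phys):
    """phys -> pfn only needs PAGE_SHIFT."""
    return phys >> PAGE_SHIFT


def pfn_to_phys(pfn):
    """pfn -> phys only needs PAGE_SHIFT."""
    return pfn << PAGE_SHIFT


def pfn_to_page_addr(pfn):
    """pfn -> address of its struct page only needs the vmemmap base."""
    return VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE


def page_addr_to_pfn(page_addr):
    """Address of a struct page -> pfn only needs the vmemmap base."""
    return (page_addr - VMEMMAP_BASE) // STRUCT_PAGE_SIZE


def page_addr_to_phys(page_addr):
    """page -> phys is built on top of the two conversions above."""
    return pfn_to_phys(page_addr_to_pfn(page_addr))


print(hex(pfn_to_phys(phys_to_pfn(0x12345678))))  # 0x12345000
```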
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The value can change between two reads. This caused test failures in the
previous commit 18a8f69ad8 ("libdrgn: linux_kernel: add object finder
for jiffies"). The important thing is that the addresses are the same.
Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have a lot of examples that use jiffies, but they stopped working
long ago on x86-64 (since Linux kernel commit d8ad6d39c35d ("x86_64: Fix
jiffies ODR violation") (in v5.8 and backported to stable releases)) and
never worked on other architectures. This is because jiffies is defined
in the Linux kernel's linker script. #277 proposed updating the examples
to use jiffies_64, but I would guess that most kernel developers are
familiar with jiffies and many have never seen jiffies_64. jiffies is
also a nicer name to type in live demos. Let's add a case to the Linux
kernel object finder to get the jiffies variable.
Reported-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
tests.linux_kernel.helpers.test_mm.TestMm.test_read_physical() fails on
Arm when the user page is mapped from high memory. Replace it with a
test case that uses the test physical address from the test kernel
module and asserts the expected PRNG data. (We will probably run into
more tests that fail with high memory once #244 is merged.)
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This replaces the multiples of a prime sequence used for the per-CPU
helper tests with a slightly more self-documenting PRNG interface in
both the test kernel module and Python test scaffolding. It will be used
more in upcoming tests.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GCC appears to use a data8 value of -1 when reporting zero-length arrays
when compiling C++ code. This patch adds support and a test for that behavior.
dwarf_info.c: Remove check for sdata on quirk for array length == 0
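
The quirk can be pictured with the following sketch, which assumes the array bound has already been read as an unsigned 64-bit value and that the array is zero-indexed. array_length() is a hypothetical name, not the code in dwarf_info.c.

```python
U64_MAX = (1 << 64) - 1  # what -1 looks like as an unsigned 64-bit value


def array_length(upper_bound):
    """Return the element count for a zero-indexed array.

    upper_bound is the bound read as an unsigned 64-bit value.
    """
    if upper_bound == U64_MAX:
        # Quirk: an upper bound of -1 means the array has no elements.
        return 0
    return upper_bound + 1


print(array_length(U64_MAX))  # 0
print(array_length(9))  # 10
```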
Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
When testing #245, I encountered this error:
======================================================================
ERROR: test_identify_unrecognized (tests.linux_kernel.helpers.test_common.TestIdentifyAddress)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/host/home/osandov/repos/drgn/tests/linux_kernel/helpers/test_common.py", line 57, in test_identify_unrecognized
    self.assertIsNone(identify_address(self.prog, start_addr - 1))
  File "/host/home/osandov/repos/drgn/drgn/helpers/common/memory.py", line 99, in identify_address
    symbol = prog.symbol(addr)
OverflowError: can't convert negative int to unsigned
This is because the lowest kernel address in s390x is 0, so we're
passing -1 to identify_address(). Work around that in the test.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The helper function returns a tuple of the load averages over the last
1, 5, and 15 minutes.
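
The kernel keeps these values in fixed point, so the conversion looks roughly like the sketch below. It assumes the kernel's FSHIFT of 11; loadavg_from_avenrun() is a hypothetical name and the input values are made up.

```python
FSHIFT = 11  # assumed: number of fractional bits in the kernel's load values
FIXED_1 = 1 << FSHIFT  # 1.0 in fixed point


def loadavg_from_avenrun(avenrun):
    """Convert three fixed-point load values to floats."""
    return tuple(value / FIXED_1 for value in avenrun)


# Made-up raw values for the 1, 5, and 15 minute averages.
print(loadavg_from_avenrun([2048, 1024, 512]))  # (1.0, 0.5, 0.25)
```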
Signed-off-by: Martin Liska <mliska@suse.cz>
Co-authored-by: Omar Sandoval <osandov@osandov.com>
The test case checks `drgn_test_slob` from the test kmod.
Fixes: 975255f209 ("tests: handle cases without slab support in print_annotated_stack() test")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
So far, idr support was available only for radix-tree based idrs,
so the radix-tree tests implicitly covered idrs as well.
Now that we also support non-radix-tree based idrs,
add explicit test cases for idrs.
The tests are applicable to both the new (i.e., radix-tree based) and
old implementations of idrs.
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Looking up objects in namespaces is already well-supported by `drgn`.
These changes bring the same functionality to type lookup, so that
`prog.type('struct A::B::C::MyType')` works in an analogous fashion to
`prog['A::B::C::MyVar']`.
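
As a rough picture of what a namespace-qualified lookup has to do first, the sketch below splits such a name into its parts. split_qualified_type() is a hypothetical helper written for this illustration; the real parsing lives in libdrgn and handles far more of the type grammar.

```python
KEYWORDS = ("struct", "union", "class", "enum")


def split_qualified_type(name):
    """Split a type name into (keyword, namespaces, identifier)."""
    parts = name.split(None, 1)
    if len(parts) == 2 and parts[0] in KEYWORDS:
        keyword, rest = parts
    else:
        keyword, rest = None, name
    # Everything before the last "::" is the enclosing scope.
    *namespaces, identifier = rest.strip().split("::")
    return keyword, namespaces, identifier


print(split_qualified_type("struct A::B::C::MyType"))
# ('struct', ['A', 'B', 'C'], 'MyType')
```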
Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>