If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
A path is the most convenient way to find a cgroup if we don't already
have a pointer to it from another structure.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I originally thought this would be too difficult, but it's fairly
straightforward to parse /proc/mounts and allows us to avoid some setup
and cleanup.
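As a rough illustration of how little parsing is involved, here is a minimal sketch in Python. The function name find_mount_points and the sample text are hypothetical, not the actual test code. It assumes the usual /proc/mounts layout of whitespace-separated fields, with special characters in paths written as octal escapes (e.g., \040 for a space).

```python
import re


def find_mount_points(mounts_text, fstype):
    """Return the mount points of every filesystem of type fstype.

    mounts_text is the content of /proc/mounts: one mount per line, with
    whitespace-separated fields (device, mount point, type, options, ...).
    """
    mount_points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[2] != fstype:
            continue
        # Decode 3-digit octal escapes such as \040 (space). This sketch
        # only handles ASCII; it does not reassemble multi-byte characters.
        mount_point = re.sub(
            r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), fields[1]
        )
        mount_points.append(mount_point)
    return mount_points


# Hypothetical sample input.
sample = (
    "proc /proc proc rw,nosuid 0 0\n"
    "cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid 0 0\n"
    "cgroup2 /mnt/my\\040cgroup cgroup2 rw 0 0\n"
)
print(find_mount_points(sample, "cgroup2"))
```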
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This looks up a kernfs node from a path. It will be used to look up
cgroups by path. This is based on kernfs_walk_ns() from the Linux
kernel, but it doesn't handle namespaced kernfs nodes yet.
kernfs_walk_ns() in the kernel is actually built on another function,
kernfs_find_ns(), but I don't think the latter is very useful as a
helper.
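The walk itself can be sketched in a few lines of Python. This is a toy model, not the helper: Node and walk are hypothetical names, a plain dict stands in for however the kernel stores a node's children, and namespaces are ignored.

```python
class Node:
    """Toy stand-in for a kernfs node: a name, a parent, and named children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def walk(root, path):
    """Resolve path relative to root, one component at a time.

    Returns None if any component is missing. Empty components (from
    leading, trailing, or repeated slashes) are skipped.
    """
    node = root
    for component in path.split("/"):
        if not component:
            continue
        node = node.children.get(component)
        if node is None:
            return None
    return node


# Hypothetical hierarchy for illustration.
root = Node("")
slice_node = Node("system.slice", root)
service_node = Node("foo.service", slice_node)
print(walk(root, "/system.slice/foo.service").name)
```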
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.
To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.
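The idea can be sketched in Python with a toy DIE tree. The DIE class, the tag strings, and qualified_name are hypothetical and are not libdrgn's data structures. The sketch only considers namespace ancestors, and the boolean parameter plays the role of the new flag.

```python
class DIE:
    """Toy stand-in for a DWARF DIE: a tag, an optional name, and a parent."""

    def __init__(self, tag, name=None, parent=None):
        self.tag = tag
        self.name = name
        self.parent = parent


def qualified_name(die, has_namespaces):
    """Build the name to look up for the complete type of die.

    If the language has no namespaces (e.g., C), skip the ancestor walk and
    use the bare name. Otherwise, prefix the names of enclosing namespaces.
    """
    if not has_namespaces:
        return die.name
    parts = [die.name]
    ancestor = die.parent
    while ancestor is not None:
        if ancestor.tag == "namespace" and ancestor.name is not None:
            parts.append(ancestor.name)
        ancestor = ancestor.parent
    return "::".join(reversed(parts))


# Hypothetical tree: compile unit > namespace outer > namespace inner > Foo.
cu = DIE("compile_unit")
outer = DIE("namespace", "outer", cu)
inner = DIE("namespace", "inner", outer)
foo = DIE("structure_type", "Foo", inner)
print(qualified_name(foo, has_namespaces=True))
```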
Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
Currently, DIE references are specified as an index into the list of the
unit DIE's children. This has a few issues:
* It's hard to figure out what references what at a glance.
* Changes to tests sometimes need to renumber these indices.
* DIEs at lower levels in the tree cannot be referenced.
Replace it with explicit "labels" which are referred to by name.
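To illustrate why labels address all three issues, here is a minimal Python sketch. The dict-based DIE representation and the index_labels function are hypothetical, not the test suite's actual format. Each label is mapped to the path of its DIE, so a reference can name a DIE at any depth and is unaffected when siblings are added or removed.

```python
def index_labels(dies, labels=None, path=()):
    """Map each label to the path (tuple of child indices) of its DIE."""
    if labels is None:
        labels = {}
    for i, die in enumerate(dies):
        die_path = path + (i,)
        label = die.get("label")
        if label is not None:
            if label in labels:
                raise ValueError(f"duplicate label {label!r}")
            labels[label] = die_path
        index_labels(die.get("children", ()), labels, die_path)
    return labels


# Hypothetical test input: the pointer type refers to "int_die" by name,
# and "nested_die" sits below the top level.
dies = [
    {"tag": "base_type", "label": "int_die"},
    {"tag": "subprogram", "children": [{"tag": "typedef", "label": "nested_die"}]},
    {"tag": "pointer_type", "type": "int_die"},
]
labels = index_labels(dies)
print(labels[dies[2]["type"]])
```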
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add helpers for converting physical addresses to and from virtual
addresses, PFNs, and struct pages.
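The PFN conversions are simple shifts. The following Python sketch shows the arithmetic only. The function names are hypothetical, and PAGE_SHIFT = 12 (4 KiB pages) is an assumption, since the real value depends on the kernel configuration.

```python
PAGE_SHIFT = 12  # Assumption: 4 KiB pages.
PAGE_SIZE = 1 << PAGE_SHIFT


def phys_to_pfn(phys_addr):
    """Return the page frame number containing a physical address."""
    return phys_addr >> PAGE_SHIFT


def pfn_to_phys(pfn):
    """Return the physical address of the start of a page frame."""
    return pfn << PAGE_SHIFT


def phys_page_offset(phys_addr):
    """Return the offset of a physical address within its page."""
    return phys_addr & (PAGE_SIZE - 1)


print(hex(phys_to_pfn(0x12345678)))
```

Converting to and from virtual addresses and struct pages additionally depends on the architecture and memory model, which this sketch does not attempt to cover.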
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We currently test the functions to convert between virtual addresses,
PFNs, and struct pages with an mmap'd region and /proc/self/pagemap. Use
the test kernel module to test them more directly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently only testing whether we can translate user addresses.
Test a kernel address with the kernel page table, too.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test kmod build has the following warning that I somehow didn't
notice before:
  WARNING: modpost: /home/osandov/repos/drgn-main/tests/linux_kernel/kmod/drgn_test.o(.init.text+0x3ac): Section mismatch in reference from the function init_module() to the function .exit.text:drgn_test_exit()
  The function __init init_module() references
  a function __exit drgn_test_exit().
  This is often seen when error handling in the init function
  uses functionality in the exit path.
  The fix is often to remove the __exit annotation of
  drgn_test_exit() so it may be used outside an exit section.
Remove the __exit annotation as suggested.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Very similar to a541e9b170, but adds
partial support for floats (as opposed to integers) which aren't 32 or
64 bits.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Our vmtest kernels have CONFIG_KALLSYMS_ALL, but distro kernels may not,
in which case variable symbols are not added to /proc/kallsyms. Then,
the Linux kernel debug info tests can't find our test symbol and fail.
Define a global function symbol and use it for the test debug info
tests instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several of the mm tests currently fail on architectures for which we
haven't implemented virtual address translation and related functionality
(i.e., anything other than x86-64). Only run those tests on x86-64 for now.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some architectures, including AArch64, don't have the pause() syscall.
glibc implements pause(3) with ppoll() on those architectures. Our stack
trace tests check for "pause" in the stack trace, so they fail on
AArch64. Update the tests to accept either "pause" or "poll".
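A minimal sketch of the relaxed check, in Python. The function name and the frame names in the example are illustrative assumptions, not the actual test code.

```python
def trace_has_pause_or_poll(function_names):
    """Return True if any frame's function name mentions pause or poll."""
    return any("pause" in name or "poll" in name for name in function_names)


# Illustrative frame names for a task blocked in pause(3).
print(trace_has_pause_or_poll(["schedule", "__arm64_sys_ppoll"]))
```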
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Linux kernel commit a0e286b6a5b6 ("loop: remove lo_refcount and avoid
lo_mutex in ->open / ->release") (in v5.19-rc1) removed the lo_open
symbol that we use for
tests.linux_kernel.test_debug_info.TestModuleDebugInfo. Replace it with
a symbol from the test kernel module so we don't need to worry about it
going away.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several flake8 errors have snuck in since the last time I did this in
commit 5541fad063 ("Fix some flake8 errors"). Prepare for adding flake8
to pre-commit by fixing them.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Splitting the tests between tests/linux_kernel and tests/helpers/linux
means that we have to set up the unit tests twice, including loading
debug info. Python 3.7 and newer have a way to get around this, but
we're still sort of supporting Python 3.6. Move them under one path to
speed up test runs.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This validator is considerably more complicated than the linked list
one, but it should catch many kinds of red-black tree mistakes.
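To give a sense of the kinds of mistakes such a check can catch, here is a toy Python sketch operating on plain Python objects rather than kernel memory. RBNode and validate_rbtree are hypothetical names. It checks three standard red-black tree properties (key ordering, no red node with a red child, equal black heights) and is not a port of the actual helper.

```python
class RBNode:
    def __init__(self, key, red, left=None, right=None):
        self.key = key
        self.red = red
        self.left = left
        self.right = right


def validate_rbtree(node, lo=None, hi=None, parent_red=False):
    """Return the black height of the subtree rooted at node.

    Raises ValueError on the first violation found. Missing children count
    as black leaves.
    """
    if node is None:
        return 1
    if node.red and parent_red:
        raise ValueError(f"red node {node.key} has a red parent")
    if (lo is not None and node.key <= lo) or (hi is not None and node.key >= hi):
        raise ValueError(f"node {node.key} is out of order")
    left_height = validate_rbtree(node.left, lo, node.key, node.red)
    right_height = validate_rbtree(node.right, node.key, hi, node.red)
    if left_height != right_height:
        raise ValueError(f"black heights differ below node {node.key}")
    return left_height + (0 if node.red else 1)


good = RBNode(2, red=False, left=RBNode(1, red=True), right=RBNode(3, red=True))
print(validate_rbtree(good))
```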
Signed-off-by: Omar Sandoval <osandov@osandov.com>
At LSF/MM+BPF 2022, Ted Ts'o pitched me the idea of using drgn to
validate the consistency of kernel data structures. I really liked this
idea, especially for big, complicated data structures. But first, let's
start small: document the concept of a "validator", which is just a
special kind of helper, and add some basic validator versions of linked
list helpers.
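The concept can be sketched with a toy circular doubly linked list in Python. ListNode, list_add_tail, and this validate_list are hypothetical stand-ins that walk Python objects, not kernel memory. The point is that a validator iterates like a normal helper but raises an error as soon as it sees an inconsistency.

```python
class ValidationError(Exception):
    """Raised when a data structure is found to be inconsistent."""


class ListNode:
    """Toy equivalent of struct list_head; a new node points to itself."""

    def __init__(self):
        self.next = self
        self.prev = self


def list_add_tail(node, head):
    prev = head.prev
    node.prev = prev
    node.next = head
    prev.next = node
    head.prev = node


def validate_list(head):
    """Walk the list from head, checking that every prev pointer points back.

    Returns the number of entries. This sketch does not guard against a
    corrupted next pointer that creates a cycle excluding head.
    """
    count = 0
    prev = head
    node = head.next
    while True:
        if node.prev is not prev:
            raise ValidationError(f"position {count}: prev does not point back")
        if node is head:
            return count
        count += 1
        prev = node
        node = node.next


head, a, b = ListNode(), ListNode(), ListNode()
list_add_tail(a, head)
list_add_tail(b, head)
print(validate_list(head))
```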
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current test case for slab_cache_for_each_allocated_object() is
barely more than a smoke test. It missed the bug fixed by the previous
commit. Now that we have the test kernel module, we can do a lot better
by comparing against the exact list of objects that are allocated.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Just like we did for lists, use a test data structure instead of the vma
tree for a process. This also allows us to test RB_EMPTY_NODE, which we
couldn't test before.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than picking a couple of lists which we hope won't change, use
the newly added test kernel module to define a few lists and test
against those. This also gives us proper tests for list_is_singular().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that commit d999703f94 ("vmtest: add kernel module build
dependencies to kernel packages") added the files necessary to build a
test kernel module, add the module (currently a stub) and the
scaffolding necessary to build and load it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Slab cache merging can cause major confusion when debugging using slab
cache helpers. Add a helper to detect whether a slab cache is merged and
an explanation of the implications.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
These are ported from https://github.com/josefbacik/debug-scripts.
Signed-off-by: alexlzhu <alexlzhu@fb.com>
[Omar: make test case check for exception on SLOB]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to roughly
2^24 bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).
Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.
However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.
Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
SLOB doesn't have /proc/slabinfo (and it can be disabled for SLUB and
SLAB on some kernel versions, too). Fall back to some known slab caches
if /proc/slabinfo doesn't exist.
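A minimal Python sketch of the fallback. The function names and the particular fallback caches are assumptions for illustration, not the actual test code.

```python
# Hypothetical choice of caches assumed to exist on most kernels.
FALLBACK_SLAB_CACHES = ["dentry", "inode_cache"]


def parse_slab_cache_names(slabinfo_text):
    """Extract the cache names (first column) from /proc/slabinfo content."""
    names = []
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Skip blank lines, the version line, and the column header.
        if not fields or fields[0] in ("slabinfo", "#"):
            continue
        names.append(fields[0])
    return names


def slab_cache_names(path="/proc/slabinfo"):
    """Return slab cache names, falling back if the file doesn't exist."""
    try:
        with open(path) as f:
            return parse_slab_cache_names(f.read())
    except FileNotFoundError:
        return list(FALLBACK_SLAB_CACHES)


print(slab_cache_names("/nonexistent/slabinfo"))
```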
Signed-off-by: Omar Sandoval <osandov@osandov.com>
tests/helpers/linux contains test cases for Linux kernel helpers and
test cases for core Linux kernel support. The latter don't make sense
there; move them to tests/linux_kernel instead, along with the
scaffolding.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
First, find the sample file relative to the test module so that tests
can be run from a different directory. Second, pass --force to zstd so
that it doesn't ignore symlinks, which is required for environments like
Buck that copy the test files as symlinks.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If /proc/kcore is missing (e.g., because CONFIG_PROC_KCORE is not
enabled) or invalid (e.g., Docker mounts /dev/null over /proc/kcore),
then skip the tests.
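One way to detect both cases is sketched below in Python. The function require_kcore is hypothetical, and checking for the ELF magic number is an assumed way of deciding that the file is invalid; the actual check may differ.

```python
import unittest

ELF_MAGIC = b"\x7fELF"


def require_kcore(path="/proc/kcore"):
    """Raise unittest.SkipTest unless path exists and looks like an ELF file."""
    try:
        with open(path, "rb") as f:
            magic = f.read(len(ELF_MAGIC))
    except FileNotFoundError:
        raise unittest.SkipTest(f"{path} not found")
    # Reading from /dev/null returns no data, so this also covers the case
    # where /dev/null is mounted over the file.
    if magic != ELF_MAGIC:
        raise unittest.SkipTest(f"{path} is not a valid ELF file")
```

Calling this at the start of a test case's setup makes unittest report the tests as skipped rather than failed.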
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Knowing the thread or CPU that logged a specific message can be very
useful when investigating multithreaded bugs. Additionally, on kernels
since 5.10, the caller ID is always saved but only exposed to userspace
if CONFIG_PRINTK_CALLER is enabled, which distros don't currently seem
to do.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Our cheap heuristic for the default language will not always be correct,
and although we can improve it as cases arise, we should also just have
a way for the user to explicitly set the default language. Add
drgn_program_set_language() to libdrgn and allow setting
drgn.Program.language in the Python bindings. This will also make unit
testing different languages easier.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Black 22.1.0 has some style changes: string prefixes are normalized and
spaces around the power operator are removed.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This implements the existing thread API methods for live processes other
than drgn_thread_stack_trace(). It also doesn't yet add support for
full-blown tracing, but it at least brings live processes to feature
parity. This is taken from the non-ptrace parts of Kevin Svetlitski's
PR #142, with some modifications.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If a TID does not exist, then linux_helper_find_task() succeeds but
returns a null pointer object. Check for that instead of returning a
bogus thread.
Fixes: 301cc767ba ("Implement a new API for representing threads")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This is currently only supported for user-space crash dumps; i.e., there
is no support for live user-space application debugging or kernel
debugging.
Closes #144.
Signed-off-by: Mykola Lysenko <mykolal@fb.com>
When running under fakeroot (e.g., for an RPM mock build), geteuid()
returns 0, so LinuxHelperTestCase continues and fails with a
PermissionError either when attaching to /proc/kcore or opening
/dev/loop-control. Catch the PermissionError instead of checking the
EUID.
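The pattern looks roughly like the following Python sketch. open_or_skip and its opener parameter are hypothetical names used for illustration; the real test case is structured differently.

```python
import unittest


def open_or_skip(path, opener=open):
    """Open path, turning PermissionError into a skipped test.

    Attempting the operation is more reliable than checking whether
    os.geteuid() returns 0, which is also the case under fakeroot even
    though the process has no real privileges.
    """
    try:
        return opener(path, "rb")
    except PermissionError as e:
        raise unittest.SkipTest(f"cannot open {path}: {e}")
```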
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GCC or binutils on Fedora Rawhide for ARM seems to have a bug where
c_keywords gets placed in the .data.rel.ro section (see
https://www.airs.com/blog/archives/189):
  $ readelf -s .libs/libdrgnimpl_la-language_c.o | grep -w c_keywords
  475: 00000000 16 OBJECT LOCAL DEFAULT 175 c_keywords
  $ readelf -S .libs/libdrgnimpl_la-language_c.o | grep -F '[175]'
  [175] .data.rel PROGBITS 00000000 051f90 000010 00 WA 0 0 4
  $ readelf -s .libs/_drgn.so | grep -w c_keywords
  9267: 0008e84c 16 OBJECT LOCAL DEFAULT 21 c_keywords.lto_priv.0
  $ readelf -S .libs/_drgn.so | grep -F '[21]'
  [21] .data.rel.ro PROGBITS 0008e018 07e018 000a10 00 WA 0 0 8
This results in a crash on startup when c_keywords_init() attempts to
populate c_keywords.
While this appears to be a compiler or linker bug, I've been meaning to
replace c_keywords with a static lookup function anyways. Now that we
have gen_strswitch.py, we can use it to generate the lookup function.
Add a script, gen_c_keywords_inc_strswitch.py, which generates an array
mapping token kind to spelling, and a memswitch mapping spelling to
token kind.
Signed-off-by: Omar Sandoval <osandov@osandov.com>