The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.
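
As a rough illustration of the failure mode: a symbol's value in a
relocatable module is an offset into its section, so the final address
depends on the section base we supply. The function name and all of the
numbers below are made up for the example and are not taken from drgn.

```python
def relocate(section_base: int, symbol_offset: int) -> int:
    """Return a symbol's address given the base address of its section."""
    return section_base + symbol_offset


# Hypothetical values for one module.
percpu_base = 0x2A000  # per-CPU allocation recorded in struct module
symbol_offset = 0x40   # offset of the variable within .data..percpu

wrong = relocate(0, symbol_offset)            # section address defaulted to 0
right = relocate(percpu_base, symbol_offset)  # base taken from struct module
```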
Closes #185.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).
This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.
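
A toy model of the lookup, assuming a much simplified stand-in for
struct page (the Page class and its fields are hypothetical, not the
kernel layout or the drgn helper itself):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Simplified stand-in for struct page."""

    slab_cache: Optional[str] = None  # set only on a head page marked as slab
    head: Optional["Page"] = None     # set only on tail pages


def compound_head(page: Page) -> Page:
    """Return the head page for a tail page, or the page itself."""
    return page.head if page.head is not None else page


def containing_slab_cache(page: Page) -> Optional[str]:
    # Always resolve to the head page first; tail pages carry no slab marking.
    return compound_head(page).slab_cache


head = Page(slab_cache="task_struct")
tail = Page(head=head)
```

Checking tail.slab_cache directly gives None, which is the bug described
above; going through compound_head() finds the cache.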
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:
  Traceback (most recent call last):
    File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
      find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
  KeyError: 'drgn_test_va'
Use the @skip_unless_have_test_kmod tag. The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.
Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Since _repr_pretty_() uses output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.
More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.
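
For illustration, a simple bit-test getter can be produced by a small
factory like the one below. This is only a sketch: the bit numbers are
illustrative constants, the factory name is made up, and the getters
take a plain integer rather than a struct page object.

```python
from typing import Callable

# Illustrative bit numbers; a real tool would look these up for the
# kernel being debugged rather than hard-code them.
PG_locked = 0
PG_uptodate = 2


def make_page_flag_getter(bit: int) -> Callable[[int], bool]:
    """Return a function that tests one bit of a page's flags word."""

    def getter(flags: int) -> bool:
        return bool(flags & (1 << bit))

    return getter


PageLocked = make_page_flag_getter(PG_locked)
PageUptodate = make_page_flag_getter(PG_uptodate)
```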
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Modify how the test page is allocated to ensure we have a directly
mapped address which is not slab allocated for testing the negative case
of find_containing_slab_cache.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
A path is the most convenient way to find a cgroup if we don't already
have a pointer to it from another structure.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I originally thought this would be too difficult, but it's fairly
straightforward to parse /proc/mounts and allows us to avoid some setup
and cleanup.
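
Parsing /proc/mounts can look roughly like this. The names here are
illustrative and this is not necessarily how the test code does it. Each
line has whitespace-separated fields, and special characters inside a
field (such as spaces) are written as octal escapes like \040.

```python
import re
from typing import Iterator, NamedTuple


class Mount(NamedTuple):
    source: str
    target: str
    fstype: str
    options: str


def _unescape(field: str) -> str:
    # Turn octal escapes such as \040 back into the original character.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_mounts(text: str) -> Iterator[Mount]:
    """Yield the first four fields of each well-formed mount line."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        yield Mount(*(_unescape(f) for f in fields[:4]))


sample = (
    "cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid 0 0\n"
    "tmpfs /mnt/my\\040dir tmpfs rw 0 0\n"
)
mounts = list(parse_mounts(sample))
```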
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This looks up a kernfs node from a path. It will be used to look up
cgroups by path. This is based on kernfs_walk_ns() from the Linux
kernel, but it doesn't handle namespaced kernfs nodes yet.
kernfs_walk_ns() in the kernel is actually built on another function,
kernfs_find_ns(), but I don't think the latter is very useful as a
helper.
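
A simplified sketch of the walk, using a toy in-memory tree rather than
drgn objects. Node and walk() are illustrative names. The kernel keeps
children in a red-black tree rather than a dict, and, as noted above,
namespaces are ignored here.

```python
from typing import Dict, Optional


class Node:
    """Toy stand-in for a kernfs node."""

    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: Dict[str, "Node"] = {}
        if parent is not None:
            parent.children[name] = self


def walk(root: Node, path: str) -> Optional[Node]:
    """Follow path one component at a time; return None if any is missing."""
    node = root
    for component in path.split("/"):
        if not component:  # skip empty components from "//" or a leading "/"
            continue
        child = node.children.get(component)
        if child is None:
            return None
        node = child
    return node


root = Node("")
system = Node("system.slice", root)
service = Node("foo.service", system)
```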
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.
To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.
Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
Currently, DIE references are specified as an index into the list of the
unit DIE's children. This has a few issues:
* It's hard to figure out what references what at a glance.
* Changes to tests sometimes need to renumber these indices.
* DIEs at lower levels in the tree cannot be referenced.
Replace it with explicit "labels" which are referred to by name.
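
The idea, sketched with hypothetical classes (these are not the names
used by the actual test helpers): each DIE may carry a label, references
name that label, and resolution walks the whole tree, so DIEs at any
depth can be targets.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Die:
    tag: str
    label: Optional[str] = None     # name other DIEs can refer to
    type_ref: Optional[str] = None  # label of the DIE this one references
    children: List["Die"] = field(default_factory=list)


def index_labels(die: Die, labels: Optional[Dict[str, Die]] = None) -> Dict[str, Die]:
    """Map every label in the tree rooted at die to its DIE."""
    if labels is None:
        labels = {}
    if die.label is not None:
        if die.label in labels:
            raise ValueError(f"duplicate label {die.label!r}")
        labels[die.label] = die
    for child in die.children:
        index_labels(child, labels)
    return labels


# The referenced DIE is nested inside a namespace, which a "child index
# of the unit DIE" scheme could not express.
int_die = Die("base_type", label="int_die")
var_die = Die("variable", type_ref="int_die")
unit = Die("compile_unit", children=[Die("namespace", children=[int_die]), var_die])

labels = index_labels(unit)
resolved = labels[var_die.type_ref]
```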
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add helpers for converting physical addresses to and from virtual
addresses, PFNs, and struct pages.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We currently test the functions to convert between virtual addresses,
PFNs, and struct pages with an mmap'd region and /proc/self/pagemap. Use
the test kernel module to test them more directly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently only testing whether we can translate user addresses.
Test a kernel address with the kernel page table, too.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test kmod build has the following warning that I somehow didn't
notice before:
  WARNING: modpost: /home/osandov/repos/drgn-main/tests/linux_kernel/kmod/drgn_test.o(.init.text+0x3ac): Section mismatch in reference from the function init_module() to the function .exit.text:drgn_test_exit()
  The function __init init_module() references
  a function __exit drgn_test_exit().
  This is often seen when error handling in the init function
  uses functionality in the exit path.
  The fix is often to remove the __exit annotation of
  drgn_test_exit() so it may be used outside an exit section.
Remove the __exit annotation as suggested.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Very similar to a541e9b170, but adds partial support for floats (as
opposed to integers) whose size is not 32 or 64 bits.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Our vmtest kernels have CONFIG_KALLSYMS_ALL, but distro kernels may not,
in which case variable symbols are not added to /proc/kallsyms. Then,
the Linux kernel debug info tests can't find our test symbol and fail.
Define a global function symbol and use it for the test debug info
tests instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several of the mm tests currently fail on architectures for which we
haven't implemented virtual address translation and related support
(i.e., anything other than x86-64). Only run those tests on x86-64 for
now.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some architectures, including AArch64, don't have the pause() syscall.
glibc implements pause(3) with ppoll() on those architectures. Our stack
trace tests check for "pause" in the stack trace, so they fail on
AArch64. Update the tests to check for both "pause" and "poll".
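
The relaxed check amounts to something like the following sketch. The
helper name and the frame names in the examples are hypothetical, chosen
only to show the two cases.

```python
from typing import Iterable


def trace_shows_paused_task(frame_names: Iterable[str]) -> bool:
    """Return True if any frame looks like pause() or a poll-based fallback."""
    return any("pause" in name or "poll" in name for name in frame_names)


# Hypothetical frame names for the two kinds of architecture.
with_pause_syscall = ["schedule", "sys_pause_example"]
with_ppoll_fallback = ["schedule", "do_sys_poll_example"]
unrelated = ["schedule", "do_nanosleep_example"]
```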
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Linux kernel commit a0e286b6a5b6 ("loop: remove lo_refcount and avoid
lo_mutex in ->open / ->release") (in v5.19-rc1) removed the lo_open
symbol that we use for
tests.linux_kernel.test_debug_info.TestModuleDebugInfo. Replace it with
a symbol from the test kernel module so we don't need to worry about it
going away.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several flake8 errors have snuck in since the last time I fixed them in
commit 5541fad063 ("Fix some flake8 errors"). Prepare for adding flake8
to pre-commit by fixing them.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Splitting the tests between tests/linux_kernel and tests/helpers/linux
means that we have to set up the unit tests twice, including loading
debug info. Python 3.7 and newer have a way to get around this, but
we're still sort of supporting Python 3.6. Move them under one path to
speed up test runs.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This validator is considerably more complicated than the linked list
one, but it should catch many kinds of red-black tree mistakes.
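
To give a feel for the kinds of checks involved, here is a pure-Python
sketch over a toy node class. It is not the drgn implementation; the
class, function, and exception names are invented for the example. It
checks parent pointers, key ordering, the "no red node has a red child"
rule, and equal black height on every path.

```python
from typing import Optional


class RBNode:
    def __init__(self, key, red, left=None, right=None):
        self.key, self.red, self.left, self.right = key, red, left, right
        self.parent: Optional["RBNode"] = None
        for child in (left, right):
            if child is not None:
                child.parent = self


class TreeValidationError(Exception):
    pass


def _black_height(node, lo, hi) -> int:
    if node is None:
        return 1  # NULL leaves count as black
    if (lo is not None and node.key < lo) or (hi is not None and node.key > hi):
        raise TreeValidationError(f"key {node.key!r} is out of order")
    for child in (node.left, node.right):
        if child is None:
            continue
        if child.parent is not node:
            raise TreeValidationError(f"bad parent pointer below {node.key!r}")
        if node.red and child.red:
            raise TreeValidationError(f"red node {node.key!r} has a red child")
    left = _black_height(node.left, lo, node.key)
    right = _black_height(node.right, node.key, hi)
    if left != right:
        raise TreeValidationError(f"unequal black heights below {node.key!r}")
    return left + (0 if node.red else 1)


def validate_rbtree(root: Optional[RBNode]) -> None:
    """Raise TreeValidationError if the tree breaks a red-black invariant."""
    if root is not None and root.red:
        raise TreeValidationError("root is red")
    _black_height(root, None, None)


good = RBNode(2, False, RBNode(1, True), RBNode(3, True))
red_red = RBNode(2, False, RBNode(1, True, RBNode(0, True)))
```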
Signed-off-by: Omar Sandoval <osandov@osandov.com>
At LSF/MM+BPF 2022, Ted Ts'o pitched me the idea of using drgn to
validate the consistency of kernel data structures. I really liked this
idea, especially for big, complicated data structures. But first, let's
start small: document the concept of a "validator", which is just a
special kind of helper, and add some basic validator versions of linked
list helpers.
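
A minimal sketch of what a list validator checks, written against a toy
doubly linked list instead of drgn objects (all names here are
hypothetical): while walking forward, every node's successor must point
back at it.

```python
class ListHead:
    """Toy circular doubly linked list node, like struct list_head."""

    def __init__(self) -> None:
        self.next = self
        self.prev = self


def list_add_tail(new: ListHead, head: ListHead) -> None:
    new.prev, new.next = head.prev, head
    head.prev.next = new
    head.prev = new


class ListValidationError(Exception):
    pass


def validate_list(head: ListHead) -> int:
    """Walk the list, checking next/prev consistency; return the entry count."""
    count = 0
    node = head
    while True:
        nxt = node.next
        if nxt is None:
            raise ListValidationError("next pointer is NULL")
        if nxt.prev is not node:
            raise ListValidationError("next->prev does not point back")
        if nxt is head:
            return count
        count += 1
        node = nxt


head = ListHead()
entries = [ListHead() for _ in range(3)]
for entry in entries:
    list_add_tail(entry, head)
```

Because each step requires next->prev to point back at the current node,
a corrupted list cannot send this loop around a cycle that skips the
head without first tripping the check.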
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current test case for slab_cache_for_each_allocated_object() is
barely more than a smoke test. It missed the bug fixed by the previous
commit. Now that we have the test kernel module, we can do a lot better
by comparing against the exact list of objects that are allocated.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Just like we did for lists, use a test data structure instead of the vma
tree for a process. This also allows us to test RB_EMPTY_NODE, which we
couldn't test before.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than picking a couple of lists which we hope won't change, use
the newly added test kernel module to define a few lists and test
against those. This also gives us proper tests for list_is_singular().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that commit d999703f94 ("vmtest: add kernel module build
dependencies to kernel packages") added the files necessary to build a
test kernel module, add the module (currently a stub) and the
scaffolding necessary to build and load it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Slab cache merging can cause major confusion when debugging using slab
cache helpers. Add a helper to detect whether a slab cache is merged and
an explanation of the implications.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
These are ported from https://github.com/josefbacik/debug-scripts.
Signed-off-by: alexlzhu <alexlzhu@fb.com>
[Omar: make test case check for exception on SLOB]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to 2<<24
bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).
Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.
However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.
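
The two rules can be modeled like this. This is a toy class for
illustration only: its name, fields, and exception types are not
libdrgn's.

```python
from dataclasses import dataclass
from typing import Optional

MAX_VALUE_BITS = 64  # widest integer we can represent as a value


@dataclass
class IntObject:
    kind: str                    # "value", "reference", or "absent"
    bit_size: int
    value: Optional[int] = None  # stored value, or memory contents for a reference

    def __post_init__(self) -> None:
        # Rule 1: never create a value object with a >64-bit integer type.
        if self.kind == "value" and self.bit_size > MAX_VALUE_BITS:
            raise NotImplementedError("integer values larger than 64 bits")

    def read(self) -> int:
        if self.kind == "absent":
            raise LookupError("object is absent")
        # Rule 2: never read the value of a reference with a >64-bit type.
        if self.bit_size > MAX_VALUE_BITS:
            raise NotImplementedError("integer values larger than 64 bits")
        assert self.value is not None
        return self.value


small_ref = IntObject("reference", 32, value=7)
big_ref = IntObject("reference", 128, value=1 << 100)  # allowed to exist
big_absent = IntObject("absent", 128)                  # allowed to exist
```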
Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
SLOB doesn't have /proc/slabinfo (and it can be disabled for SLUB and
SLAB on some kernel versions, too). Fall back to some known slab caches
if /proc/slabinfo doesn't exist.
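
A sketch of the fallback logic. The function names and the particular
fallback caches are assumptions made for the example, not necessarily
the ones the tests use.

```python
from typing import List, Sequence

# Hypothetical choice of caches assumed to exist on most kernels.
FALLBACK_CACHES = ("task_struct", "mm_struct")


def parse_slabinfo(text: str) -> List[str]:
    """Return the cache names (first column) from slabinfo-style text."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the version line, and the column header.
        if not fields or fields[0] in ("slabinfo", "#"):
            continue
        names.append(fields[0])
    return names


def slab_cache_names(
    path: str = "/proc/slabinfo", fallback: Sequence[str] = FALLBACK_CACHES
) -> List[str]:
    try:
        with open(path) as f:
            return parse_slabinfo(f.read())
    except FileNotFoundError:
        return list(fallback)


sample = (
    "slabinfo - version: 2.1\n"
    "# name <active_objs> <num_objs> <objsize>\n"
    "task_struct 100 120 4096\n"
    "kmalloc-64 500 512 64\n"
)
```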
Signed-off-by: Omar Sandoval <osandov@osandov.com>
tests/helpers/linux contains test cases for Linux kernel helpers and
test cases for core Linux kernel support. The latter don't make sense
there; move them to tests/linux_kernel instead, along with the
scaffolding.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
First, find the sample file relative to the test module so that tests
can be run from a different directory. Second, pass --force to zstd so
that it doesn't ignore symlinks, which is required for environments like
Buck that copy the test files as symlinks.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If /proc/kcore is missing (e.g., because CONFIG_PROC_KCORE is not
enabled) or invalid (e.g., Docker mounts /dev/null over /proc/kcore),
then skip the tests.
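
One way to express such a check, as a sketch only: the helper and
decorator names are made up, and testing for the ELF magic is an
assumption about what "valid" means here, not necessarily what the tests
do.

```python
import unittest


def kcore_usable(path: str = "/proc/kcore") -> bool:
    """Return True if path can be opened and starts with the ELF magic."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == b"\x7fELF"
    except OSError:  # missing file, permission denied, etc.
        return False


skip_unless_kcore = unittest.skipUnless(
    kcore_usable(), "/proc/kcore is missing or invalid"
)


@skip_unless_kcore
class ExampleKcoreTest(unittest.TestCase):
    def test_placeholder(self) -> None:
        self.assertTrue(kcore_usable())
```

With /dev/null mounted over /proc/kcore, the read returns no bytes, so
the magic check fails and the tests are skipped.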
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Knowing the thread or CPU that logged a specific message can be very
useful when investigating multithreaded bugs. Additionally, on kernels
since 5.10, the caller ID is always saved but only exposed to userspace
if CONFIG_PRINTK_CALLER is enabled, which distros don't currently seem
to do.
Signed-off-by: Omar Sandoval <osandov@osandov.com>