Commit Graph

346 Commits

Author SHA1 Message Date
Omar Sandoval
30c9ad452d libdrgn: linux_kernel: fix global per-CPU variables in kernel modules
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.

Closes #185.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:27:28 -07:00
Omar Sandoval
a52016c4cb libdrgn: linux_kernel: always use module list from core
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).

This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:11:59 -07:00
Omar Sandoval
e5c7acb4fb drgn.helpers.linux.slab: handle compound pages in find_containing_slab_cache()
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-09 16:35:28 -07:00
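The fix in e5c7acb4fb can be illustrated with a toy model. This is only a sketch, not drgn's code: `ToyPage`, `toy_compound_head()`, and `toy_page_is_slab()` are hypothetical names, and the model assumes the kernel convention that a tail page stores its head page's address in `compound_head` with bit 0 set.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyPage:
    """Toy stand-in for struct page (hypothetical, not a drgn object)."""

    address: int
    slab_flag: bool = False
    # On a tail page: address of the head page with bit 0 set. Otherwise 0.
    compound_head: int = 0


def toy_compound_head(pages: Dict[int, ToyPage], page: ToyPage) -> ToyPage:
    """Return the head page for a tail page, or the page itself."""
    if page.compound_head & 1:
        return pages[page.compound_head - 1]
    return page


def toy_page_is_slab(pages: Dict[int, ToyPage], page: ToyPage) -> bool:
    """Check the slab flag on the head page rather than on the given page."""
    return toy_compound_head(pages, page).slab_flag


head = ToyPage(address=0x1000, slab_flag=True)
tail = ToyPage(address=0x1040, compound_head=head.address | 1)
PAGES = {p.address: p for p in (head, tail)}
```

Checking `tail.slab_flag` directly would report that the page is not a slab page, which is the bug described in the commit message; going through the head page first gives the expected answer.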
Omar Sandoval
42e7d474d1 drgn.helpers.linux.mm: add compound page helpers
I had these helpers lying around from a couple of bugs related to
compound pages that I debugged.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-09 15:54:43 -07:00
Peilin Ye
517d4bea18 tests: Add missing tags for test_find_containing_slab_cache_invalid()
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:

  Traceback (most recent call last):
    File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
      find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
  KeyError: 'drgn_test_va'

Use the @skip_unless_have_test_kmod tag.  The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.

Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
2022-08-29 15:01:18 -07:00
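A rough sketch of how a skip tag like the one in 517d4bea18 might behave, using only the standard `unittest` module. The names `HAVE_TEST_KMOD`, `skip_unless_have_test_kmod_sketch`, and `ToySlabTest` are hypothetical; how drgn's test suite actually detects the test kernel module is not shown in the commit message.

```python
import unittest

# Hypothetical flag; a real test suite would detect the module at runtime.
HAVE_TEST_KMOD = False

# Decorator that skips the test unless the flag above is true.
skip_unless_have_test_kmod_sketch = unittest.skipUnless(
    HAVE_TEST_KMOD, "test kernel module is not loaded"
)


class ToySlabTest(unittest.TestCase):
    @skip_unless_have_test_kmod_sketch
    def test_needs_kmod(self):
        # Without the skip, this would surface as an error, much like the
        # KeyError in the traceback quoted in the commit message.
        raise KeyError("drgn_test_va")


def run_toy_tests() -> unittest.TestResult:
    """Run the toy test case and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToySlabTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the flag false, the test is reported as skipped instead of failing with an error.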
Shung-Hsi Yu
e8d0c85811 test: add test for _repr_pretty_() method
Since _repr_pretty_() uses output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
2022-08-25 13:52:28 -07:00
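The testing approach in e8d0c85811 can be sketched with `unittest.mock`. This assumes the IPython pretty-printer protocol, where `_repr_pretty_(p, cycle)` receives a printer object with a `text()` method; `ToyObject` and `check_repr_pretty_calls_text()` are hypothetical and not part of drgn.

```python
from unittest import mock


class ToyObject:
    """Toy class whose pretty representation reuses str()."""

    def __str__(self) -> str:
        return "(int)42"

    def _repr_pretty_(self, p, cycle: bool) -> None:
        # Delegate to str(), which is assumed to be tested elsewhere.
        p.text(str(self))


def check_repr_pretty_calls_text() -> mock.Mock:
    """Verify that _repr_pretty_() passes str(obj) to p.text()."""
    printer = mock.Mock()
    ToyObject()._repr_pretty_(printer, False)
    printer.text.assert_called_once_with("(int)42")
    return printer
```

Because the mock records the call, the test only needs to confirm that `text()` received the `str()` output once, rather than repeating every formatting case.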
Omar Sandoval
d14f751475 drgn.helpers.linux.mm: add simple PageFlag() getters
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.

More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-18 15:50:15 -07:00
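A "simple bit test" of the kind described in d14f751475 might look like the sketch below. The bit numbers in `TOY_PAGEFLAGS` are hypothetical placeholders (the real values depend on the kernel's `enum pageflags`), and `make_toy_flag_getter()` is an illustrative name, not the generator used for the drgn helpers.

```python
# Hypothetical bit numbers for illustration only.
TOY_PAGEFLAGS = {"PG_locked": 0, "PG_uptodate": 2}


def make_toy_flag_getter(flag_name: str):
    """Build a getter that tests one bit of a page's flags word."""
    bit = TOY_PAGEFLAGS[flag_name]

    def getter(flags: int) -> bool:
        return bool(flags & (1 << bit))

    return getter


ToyPageLocked = make_toy_flag_getter("PG_locked")
ToyPageUptodate = make_toy_flag_getter("PG_uptodate")
```

Getters such as PageCompound() do not fit this pattern because they inspect more than a single flag bit, which is why the commit message leaves them to be written by hand.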
Nhat Pham
79ea6589c2 drgn.helpers.linux.slab: add find_containing_slab_cache helper
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.

Signed-off-by: Nhat Pham <nphamcs@gmail.com>
2022-08-16 15:52:21 -07:00
Nhat Pham
93f8d07bcf tests: directly allocate the test page in test kernel module
Modify how the test page is allocated to ensure we have a directly
mapped address which is not slab allocated for testing the negative case
of find_containing_slab_cache.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
2022-08-16 15:52:21 -07:00
Omar Sandoval
faaf01ad1b Add drgn.StackTrace.prog and drgn_stack_trace_program()
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-11 14:45:54 -07:00
Imran Khan
4296653090 tests: add test cases for Linux llist helpers.
Use the test kernel module to set up tests and add test_llist.py to
carry out testing.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
2022-08-08 08:22:32 -07:00
Omar Sandoval
43f045ae1a tests: add BPF helper tests
These require a fair bit of scaffolding, but it's worth it to fill one
of our major testing gaps.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 23:17:04 -07:00
Omar Sandoval
3b2a4d7b20 tests: factor out temporary cgroup creation function
Some BPF tests want a temporary cgroup to test with.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 17:35:24 -07:00
Omar Sandoval
901c1fb190 tests: factor out function for raising OSError from ctypes call
We duplicate this in a few places, and for the BPF tests we will want it
again.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 17:34:53 -07:00
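A helper like the one factored out in 901c1fb190 could be sketched as follows. The function name `check_syscall_result_sketch()` is hypothetical; the sketch assumes the wrapped library was loaded with `use_errno=True`, since `ctypes.get_errno()` only reflects the C `errno` in that case.

```python
import ctypes
import os


def check_syscall_result_sketch(result: int) -> int:
    """Raise OSError from the ctypes copy of errno if a call returned -1."""
    if result == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result
```

Each ctypes call site can then wrap its return value in this one function instead of repeating the errno handling.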
Omar Sandoval
82f631b28a drgn.helpers.linux.cgroup: add cgroup_get_from_path()
A path is the most convenient way to find a cgroup if we don't already
have a pointer to it from another structure.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-19 23:47:32 -07:00
Omar Sandoval
1e79bbb195 tests: find cgroup2 mount instead of mounting it
I originally thought this would be too difficult, but it's fairly
straightforward to parse /proc/mounts and allows us to avoid some setup
and cleanup.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-19 23:47:32 -07:00
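Parsing /proc/mounts as described in 1e79bbb195 might look like the sketch below. `find_cgroup2_mount_sketch()` is a hypothetical name, and it takes the file contents as a string so that it can be tried on sample text. It assumes the usual whitespace-separated layout (device, mount point, filesystem type, options, ...) and does not decode octal escapes such as `\040` in mount points.

```python
from typing import Optional


def find_cgroup2_mount_sketch(mounts_text: str) -> Optional[str]:
    """Return the mount point of the first cgroup2 filesystem, if any."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # fields[1] is the mount point, fields[2] is the filesystem type.
        if len(fields) >= 3 and fields[2] == "cgroup2":
            return fields[1]
    return None


SAMPLE_MOUNTS = (
    "proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0\n"
    "cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime 0 0\n"
)
```

On a live system the input would come from reading /proc/mounts; a result of `None` means no cgroup2 mount was found.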
Omar Sandoval
dd58f3b1ad drgn.helpers.linux.kernfs: add kernfs_walk()
This looks up a kernfs node from a path. It will be used to look up
cgroups by path. This is based on kernfs_walk_ns() from the Linux
kernel, but it doesn't handle namespaced kernfs nodes yet.
kernfs_walk_ns() in the kernel is actually built on another function,
kernfs_find_ns(), but I don't think the latter is very useful as a
helper.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-19 23:47:24 -07:00
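The path lookup idea behind dd58f3b1ad can be shown on a toy tree. This is not the drgn helper: `ToyKernfsNode` and `toy_kernfs_walk()` are hypothetical, namespaces are ignored, and returning `None` for a missing component is a choice made for this sketch only.

```python
from typing import Optional


class ToyKernfsNode:
    """Toy directory node with children indexed by name."""

    def __init__(self, name: str, children=()):
        self.name = name
        self.children = {child.name: child for child in children}


def toy_kernfs_walk(root: ToyKernfsNode, path: str) -> Optional[ToyKernfsNode]:
    """Follow each path component from root; return None if one is missing."""
    node = root
    for name in path.split("/"):
        if not name:
            continue  # Skip empty components from leading or doubled slashes.
        node = node.children.get(name)
        if node is None:
            return None
    return node


TOY_ROOT = ToyKernfsNode(
    "", [ToyKernfsNode("system.slice", [ToyKernfsNode("foo.service")])]
)
```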
Jay Kamat
063850325f libdrgn: dwarf: look up complete types in namespaces
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.

To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2022-07-15 16:02:56 -07:00
Omar Sandoval
4ebe8f26c5 tests: reference DIEs with labels instead of indices
Currently, DIE references are specified as an index into the list of the
unit DIE's children. This has a few issues:

* It's hard to figure out what references what at a glance.
* Changes to tests sometimes need to renumber these indices.
* DIEs at lower levels in the tree cannot be referenced.

Replace it with explicit "labels" which are referred to by name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-15 16:02:56 -07:00
Omar Sandoval
b6f025fbfc tests: make wrap_test_type_dies() take varargs instead of list of dies
This saves a level of indentation that just adds noise.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-15 16:02:56 -07:00
Omar Sandoval
0f9b123254 drgn.helpers.linux.mm: add physical address conversion helpers
Add helpers for converting physical addresses to and from virtual
addresses, PFNs, and struct pages.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
4ea0476caf tests: linux_kernel: use test module for address translation tests
We currently test the functions to convert between virtual addresses,
PFNs, and struct pages with an mmap'd region and /proc/self/pagemap. Use
the test kernel module to test them more directly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
95053639d4 tests: linux_kernel: test kernel address translation
We're currently only testing whether we can translate user addresses.
Test a kernel address with the kernel page table, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
f56b2f117a tests: linux_kernel: fix section mismatch warning in test module
The test kmod build has the following warning that I somehow didn't
notice before:

  WARNING: modpost: /home/osandov/repos/drgn-main/tests/linux_kernel/kmod/drgn_test.o(.init.text+0x3ac): Section mismatch in reference from the function init_module() to the function .exit.text:drgn_test_exit()
  The function __init init_module() references
  a function __exit drgn_test_exit().
  This is often seen when error handling in the init function
  uses functionality in the exit path.
  The fix is often to remove the __exit annotation of
  drgn_test_exit() so it may be used outside an exit section.

Remove the __exit annotation as suggested.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-06 18:06:18 -07:00
Kevin Svetlitski
5aaf3db6fc libdrgn: support reference and absent objects with float types which aren't 32 or 64 bits
Very similar to a541e9b170, but adds
partial support for floats (as opposed to integers) which aren't 32 or
64 bits.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-07-06 15:47:18 -07:00
Kevin Svetlitski
661d6a186c Add support for UTF character base types
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-07-06 09:44:16 -07:00
Omar Sandoval
e0b24903d7 tests: linux_kernel: use function symbol for debug info tests
Our vmtest kernels have CONFIG_KALLSYMS_ALL, but distro kernels may not,
in which case variable symbols are not added to /proc/kallsyms. Then,
the Linux kernel debug info tests can't find our test symbol and fail.
Define a global function symbol and use it for the test debug info
tests instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 23:52:17 -07:00
Omar Sandoval
f2ef75d5e6 tests: linux_kernel: don't run mm tests on architectures without mm support
Several of the mm tests currently fail on architectures that we haven't
implemented virtual address translation and such for (i.e., anything
other than x86-64). Only run those tests on x86-64 for now.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 23:52:15 -07:00
Omar Sandoval
c2c2bd90cc tests: linux_kernel: handle architectures without pause() syscall
Some architectures, including AArch64, don't have the pause() syscall.
glibc implements pause(3) with ppoll() on those architectures. Our stack
trace tests check for "pause" in the stack trace, so it fails on
AArch64. Update the tests to check for both "pause" and "poll".

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 22:11:50 -07:00
Omar Sandoval
0eccc61b30 tests: fix kernel module debug info test on v5.19
Linux kernel commit a0e286b6a5b6 ("loop: remove lo_refcount and avoid
lo_mutex in ->open / ->release") (in v5.19-rc1) removed the lo_open
symbol that we use for
tests.linux_kernel.test_debug_info.TestModuleDebugInfo. Replace it with
a symbol from the test kernel module so we don't need to worry about it
going away.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-09 14:27:19 -07:00
Omar Sandoval
a3b72e33c8 Fix some more flake8 errors
Several have snuck in since the last time I did this in commit
5541fad063 ("Fix some flake8 errors"). Prepare for adding flake8 to
pre-commit by fixing them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-17 15:23:42 -07:00
Omar Sandoval
c4220f21cd tests: move Linux kernel helper tests under tests/linux_kernel
Splitting the tests between tests/linux_kernel and tests/helpers/linux
means that we have to set up the unit tests twice, including loading
debug info. Python 3.7 and newer have a way to get around this, but
we're still sort of supporting Python 3.6. Move them under one path to
speed up test runs.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-17 14:20:34 -07:00
Omar Sandoval
243abf59e5 helpers: add red-black tree validators
This one is considerably more complicated than the linked list one, but
it should catch lots of kinds of red-black tree mistakes.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-17 14:20:34 -07:00
Omar Sandoval
9ae36fd12e helpers: add validators, starting with linked lists
At LSF/MM+BPF 2022, Ted Ts'o pitched me the idea of using drgn to
validate the consistency of kernel data structures. I really liked this
idea, especially for big, complicated data structures. But first, let's
start small: document the concept of a "validator", which is just a
special kind of helper, and add some basic validator versions of linked
list helpers.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-17 14:20:16 -07:00
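The "validator" concept from 9ae36fd12e can be sketched on a toy circular doubly linked list. All names here (`ToyNode`, `ToyValidationError`, `toy_validate_list()`) are hypothetical and do not reflect drgn's API. The idea shown is that a validator iterates like a normal helper but checks consistency as it goes, in this case that each node's `next.prev` points back at the node.

```python
class ToyValidationError(Exception):
    """Raised when the toy list is inconsistent."""


class ToyNode:
    def __init__(self, name: str):
        self.name = name
        self.next = self
        self.prev = self


def toy_list_add_tail(head: ToyNode, node: ToyNode) -> None:
    """Insert node just before head, as in a circular list with a head."""
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def toy_validate_list(head: ToyNode) -> int:
    """Walk the list, checking back-pointers; return the number of entries."""
    count = 0
    node = head
    while True:
        nxt = node.next
        if nxt.prev is not node:
            raise ToyValidationError(
                f"{nxt.name}.prev does not point back to {node.name}"
            )
        node = nxt
        if node is head:
            return count
        count += 1
```

A plain iterator would silently follow a corrupted `next` pointer, while the validating version reports the first inconsistent link it meets.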
Omar Sandoval
053b9da325 tests: use test kernel module for slab_cache_for_each_allocated_object() test
The current test case for slab_cache_for_each_allocated_object() is
barely more than a smoke test. It missed the bug fixed by the previous
commit. Now that we have the test kernel module, we can do a lot better
by comparing against the exact list of objects that are allocated.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 16:50:38 -07:00
Omar Sandoval
755f79012e helpers: add RB_EMPTY_ROOT
We couldn't test this before, but now that we can we might as well add
it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 16:47:35 -07:00
Omar Sandoval
3b9fb8bfb0 tests: use test kernel module for Linux rbtree helper tests
Just like we did for lists, use a test data structure instead of the vma
tree for a process. This also allows us to test RB_EMPTY_NODE, which we
couldn't test before.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 16:47:35 -07:00
Omar Sandoval
a9815bb287 tests: add test cases for Linux hlist helpers
We didn't have a good target for these tests before, but with the test
module, it's easy now.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 16:47:35 -07:00
Omar Sandoval
042c98be4a tests: use test kernel module for Linux list helper tests
Rather than picking a couple of lists which we hope won't change, use
the newly added test kernel module to define a few lists and test
against those. This also gives us proper tests for list_is_singular().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 16:47:31 -07:00
Omar Sandoval
6a3dcad19b tests: add framework for test kernel module
Now that commit d999703f94 ("vmtest: add kernel module build
dependencies to kernel packages") added the files necessary to build a
test kernel module, add the module (currently a stub) and the
scaffolding necessary to build and load it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-16 14:27:35 -07:00
Omar Sandoval
cb3bb6cc2b helpers: add slab cache merging helper and documentation
Slab cache merging can cause major confusion when debugging using slab
cache helpers. Add a helper to detect whether a slab cache is merged and
an explanation of the implications.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-09 16:27:24 -07:00
alexlzhu
5c66aab23f helpers: add helpers for iterating over allocated slab objects
These are ported from https://github.com/josefbacik/debug-scripts.

Signed-off-by: alexlzhu <alexlzhu@fb.com>
[Omar: make test case check for exception on SLOB]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-09 16:05:16 -07:00
Omar Sandoval
a541e9b170 libdrgn: support reference and absent objects with >64-bit integer types
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to 2<<24
bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).

Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.

However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.

Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-28 13:38:38 -07:00
Omar Sandoval
eb75ecadd6 tests: fix slab tests for SLOB allocator
SLOB doesn't have /proc/slabinfo (and it can be disabled for SLUB and
SLAB on some kernel versions, too). Fall back to some known slab caches
if /proc/slabinfo doesn't exist.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-01 16:48:29 -07:00
alexlzhu
88a6ecfe90 helpers: add helpers for listing and finding slab caches
These are ported from https://github.com/josefbacik/debug-scripts.

Signed-off-by: alexlzhu <alexlzhu@fb.com>
2022-03-16 16:43:40 -07:00
Omar Sandoval
322e2b1c69 tests: move non-helper Linux kernel tests
tests/helpers/linux contains test cases for Linux kernel helpers and
test cases for core Linux kernel support. The latter don't make sense
there; move them to tests/linux_kernel instead, along with the
scaffolding.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-04 15:43:33 -08:00
Omar Sandoval
45eb3eb858 tests: find tests/sample.coredump.zst more robustly
First, find the sample file relative to the test module so that tests
can be run from a different directory. Second, pass --force to zstd so
that it doesn't ignore symlinks, which is required for environments like
Buck that copy the test files as symlinks.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-03 11:53:21 -08:00
Omar Sandoval
b6e0ad2af1 tests: check for missing or invalid /proc/kcore for Linux kernel tests
If /proc/kcore is missing (e.g., because CONFIG_PROC_KCORE is not
enabled) or invalid (e.g., Docker mounts /dev/null over /proc/kcore),
then skip the tests.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-03 01:17:51 -08:00
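One possible way to detect the cases named in b6e0ad2af1 is to check for the ELF magic bytes, since /proc/kcore is presented as an ELF core file. This is an assumption-laden sketch: `kcore_looks_usable_sketch()` is a hypothetical name, and the commit message does not say how the test suite actually performs the check.

```python
def kcore_looks_usable_sketch(path: str = "/proc/kcore") -> bool:
    """Return True if path can be opened and starts with the ELF magic."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == b"\x7fELF"
    except OSError:
        # Missing file, permission error, and similar cases.
        return False
```

A missing file fails at `open()`, and a /dev/null bind mount yields an empty read, so both cases return `False`.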
Omar Sandoval
c198afeba5 helpers: add caller ID to printk records
Knowing the thread or CPU that logged a specific message can be very
useful when investigating multithreaded bugs. Additionally, on kernels
since 5.10, the caller ID is always saved but only exposed to userspace
if CONFIG_PRINTK_CALLER is enabled, which distros don't currently seem
to do.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-01 02:32:45 -08:00
Omar Sandoval
0884b303ea helpers: rename dmesg module to printk
This is more in line with the naming in the kernel. Also slightly reword
some comments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-03-01 02:18:03 -08:00