The helper function returns a tuple of the load averages over the
last 1, 5, and 15 minutes.
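
A minimal sketch of the conversion such a helper needs, assuming the
kernel stores its load averages as fixed-point integers with 11
fractional bits (FSHIFT in the kernel headers). The name
loadavg_from_avenrun is hypothetical and is not the helper added here;
plain integers stand in for values read from the kernel.

```python
FSHIFT = 11  # assumed number of fractional bits in the kernel's fixed-point values
FIXED_1 = 1 << FSHIFT  # fixed-point representation of 1.0


def loadavg_from_avenrun(avenrun):
    """Convert three raw fixed-point load values into a tuple of floats."""
    return tuple(value / FIXED_1 for value in avenrun)


# Raw values standing in for the 1, 5, and 15 minute averages.
print(loadavg_from_avenrun([2048, 1024, 512]))  # (1.0, 0.5, 0.25)
```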
Signed-off-by: Martin Liska <mliska@suse.cz>
Co-authored-by: Omar Sandoval <osandov@osandov.com>
The test case checks `drgn_test_slob` from the test kmod.
Fixes: 975255f209 ("tests: handle cases without slab support in print_annotated_stack() test")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
So far, idr support was available only for radix-tree based idrs.
Thus, the radix-tree tests were implicitly covering idrs as well.
Now that we also support non-radix-tree based idrs, add explicit
test cases for idrs.
The tests are applicable to both the new (i.e., radix-tree based) and
old implementations of idrs.
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Looking up objects in namespaces is already well-supported by `drgn`.
These changes bring the same functionality to type lookup, so that
`prog.type('struct A::B::C::MyType')` works in an analogous fashion to
`prog['A::B::C::MyVar']`.
Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
ppc64's crash_setup_regs() calls ppc_save_regs(), which isn't exported.
So, we need to provide our own implementation. At this point, we might
as well just copy the implementations of crash_setup_regs() for x86-64
and AArch64 and stop trying to use crash_setup_regs().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When the architecture is missing virtual address translation or the
kernel is configured with SLOB, print_annotated_stack() will not
identify slab objects.
Fixes: 05041423c7 ("drgn.helpers.common.stack: add print_annotated_stack helper function")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The Linux kernel apparently enables -Wundef, so although #if is correct,
it results in a warning. Use #ifdef instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current test case only checks that getting a stack trace succeeds.
By having the test kernel module create a struct pt_regs, we can
actually test that we get a reasonable stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Currently, looking up a type with template arguments results in an
"invalid character" syntax error on the "<" character. The DWARF index
includes template arguments in indexed names, so we need to do lookups
including the template arguments. Full support for this would require
parsing the template argument list syntax and normalizing it or looking
it up as an AST in some way. For now, it's at least an improvement to
pass the user's string verbatim. To do so, kludge it by adding a token
containing everything from "<" to the matching ">" to the C++ lexer and
appending that to the identifier.
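
A rough sketch of the "everything from < to the matching >" scan,
written in Python rather than in the C lexer. The function names are
hypothetical, and the sketch only counts angle-bracket nesting, so it
would be confused by a ">" operator inside a template argument.

```python
def scan_template_arguments(text, start):
    """Return the index just past the ">" matching the "<" at text[start]."""
    if text[start] != "<":
        raise ValueError("expected '<'")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "<":
            depth += 1
        elif text[i] == ">":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unterminated template argument list")


def lex_identifier(text):
    """Split a leading identifier, plus any template arguments, from text."""
    end = 0
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    # Append the template argument list verbatim to the identifier.
    if end < len(text) and text[end] == "<":
        end = scan_template_arguments(text, end)
    return text[:end], text[end:]


print(lex_identifier("vector<pair<int, long>>::iterator"))
```

The identifier comes back with its argument list untouched, which is
what allows it to be compared against a name that already includes the
template arguments.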
Co-authored-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
Commit 89eb868e95 ("helpers: make find_task() work on recent kernels")
made radix_tree_lookup() and radix_tree_for_each() work for basic
XArrays. However, it doesn't handle a couple of more advanced features:
multi-index entries (which old radix trees actually also supported) and
zero entries. It has also been really confusing to explain to people
unfamiliar with the radix tree -> XArray transition that they should use
helpers named radix_tree for a structure named xarray.
So, let's finally add xa_load(), xa_for_each(), and some additional
auxiliary helpers. The non-recursive xa_for_each() implementation is
based on Kevin Svetlitski's C implementation from commit 2b47583c73
("Rewrite linux helper iterators in C"). radix_tree_lookup() and
radix_tree_for_each() share the implementation with xa_load() and
xa_for_each(), respectively, so they are mostly interchangeable.
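
For readers unfamiliar with XArray entries, here is a small model of
how an entry word can be classified. It assumes the tagging scheme from
the kernel's include/linux/xarray.h (bit 0 set for value entries, low
two bits equal to 2 for internal entries, zero entry encoded as
internal entry 257) and works on plain integers. classify() is a
hypothetical function for illustration, not one of the helpers added
here.

```python
def xa_mk_internal(v):
    """Encode an internal entry (assumed kernel encoding)."""
    return (v << 2) | 2


XA_ZERO_ENTRY = xa_mk_internal(257)  # assumed value of the reserved zero entry


def xa_is_value(entry):
    return bool(entry & 1)


def xa_to_value(entry):
    return entry >> 1


def xa_is_internal(entry):
    return (entry & 3) == 2


def xa_is_zero(entry):
    return entry == XA_ZERO_ENTRY


def classify(entry):
    """Describe an entry word; hypothetical helper for illustration."""
    if entry == 0:
        return "empty"
    if xa_is_value(entry):
        return f"value {xa_to_value(entry)}"
    if xa_is_zero(entry):
        return "zero entry"
    if xa_is_internal(entry):
        return "internal"
    return "pointer"


for word in (0, (5 << 1) | 1, XA_ZERO_ENTRY, 0xFFFF888000001000):
    print(hex(word), classify(word))
```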
Fixes: #61
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This DWARF tag is used by C++ classes which take a variable number
of template parameters, such as std::variant and std::tuple.
Signed-off-by: Alastair Robertson <ajor@meta.com>
Thanks to commit c08f6be52a ("vmtest: kbuild: add
CONFIG_MODULE_UNLOAD=y"), I was finally able to try unloading the test
kernel module and found a trivial copy-and-paste error. Fix it.
Fixes: 42e7d474d1 ("drgn.helpers.linux.mm: add compound page helpers")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function shows the content of the stack trace, optionally
annotating each entry with the appropriate tag. Currently, it only
annotates symbols and slab cache objects.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
c_parse_specifier_qualifier_list() checks whether an identifier starts
with "size_t" or "ptrdiff_t" to decide whether to return the size_t or
ptrdiff_t type. This incorrectly matches stuff like "size_tea" and
"ptrdiff_tee". Fix this by making it an exact comparison.
Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
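
The prefix-matching bug in c_parse_specifier_qualifier_list() described
above can be illustrated in a few lines of Python. This is only an
analogy for the C code; both function names are hypothetical.

```python
SPECIAL_TYPEDEFS = ("size_t", "ptrdiff_t")


def special_type_prefix(identifier):
    """Buggy variant: accepts any identifier that merely starts with a name."""
    for name in SPECIAL_TYPEDEFS:
        if identifier.startswith(name):
            return name
    return None


def special_type_exact(identifier):
    """Fixed variant: accepts only an exact match."""
    return identifier if identifier in SPECIAL_TYPEDEFS else None


print(special_type_prefix("size_tea"))  # size_t (wrong)
print(special_type_exact("size_tea"))  # None
```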
We have some generic helpers that we'd like to add (for example, #210)
that need to know the stack pointer of a frame. These shouldn't need to
hard-code register names for different architectures. Add a generic
shortcut, StackFrame.sp.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we have slab_object_info(), use it to identify the offset of an
address in a slab object and whether the object is allocated.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
find_containing_slab_cache() is very useful, but there are two
additional pieces of information I've found myself wanting: the offset
of the pointer in the slab object and whether the slab object is
allocated or free. Add a new helper, slab_object_info(), which provides
that information.
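
As a simplified model of the "offset of the pointer in the slab object"
part, assume a slab is a contiguous run of equally sized objects
starting at a known base address. Real slab allocators add details such
as red zones and freelist handling that this ignores, and all names
here are hypothetical.

```python
from typing import NamedTuple, Optional


class ObjectLocation(NamedTuple):
    index: int  # which object in the slab
    object_address: int  # start address of that object
    offset: int  # offset of the queried address within the object


def locate_in_slab(slab_base, object_size, num_objects, address) -> Optional[ObjectLocation]:
    """Find the object containing address, or None if it is outside the slab."""
    if object_size <= 0 or address < slab_base:
        return None
    index, offset = divmod(address - slab_base, object_size)
    if index >= num_objects:
        return None
    return ObjectLocation(index, slab_base + index * object_size, offset)


print(locate_in_slab(0x1000, 64, 4, 0x1000 + 2 * 64 + 8))
```

Whether the object is allocated or free is a separate question that
depends on the allocator's free list, which this sketch does not model.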
The implementation was a challenge because we want to share as much code
as we can between slab_object_info() and
slab_cache_for_each_allocated_object(), but there are a multitude of
slab allocator configuration options and version differences requiring
unique handling. Stephen Brennan provided some code that I stole several
ideas from.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than the fuzzy guesses we do with a task in sigwait, take
Stephen's idea from commit 5f3a91f80d ("Add StackFrame.locals()
method") further and use the test kmod to set up some precise stack
frames that we can test. Also use kthread_park() to make sure we can't
race with the thread not being scheduled out (unlikely as that may be).
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.
All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When SLUB is in use and CONFIG_SYSFS is enabled (a very common
situation), we are able to identify which slab caches have been merged.
Provide a helper to expose this information so that users can look up the
correct cache name, or identify all other caches which have been merged
with a given cache.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Only x86-64 has ORC, so it's a waste of time to run
tests.linux_kernel.test_stack_trace.TestStackTrace.test_by_pid_orc() on
other architectures.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The syscall table will be useful outside of the test cases themselves.
Additionally, defining it in terms of the "normalized" machine name is
a little easier. The normalized machine name will be useful elsewhere,
too.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
tests.linux_kernel.helpers.test_common.TestIdentifyAddress.test_identify_unrecognized()
is missing @skip_unless_have_full_mm_support and
@skip_unless_have_test_kmod.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes
drgn_get_initial_registers() to think that the kernel is !SMP and use
CPU 0 instead, producing incorrect stack traces. This has also always
been wrong for Linux < 4.9 and on architectures that don't enable
CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be
((struct thread_info *)task->stack)->cpu.
Fix it by factoring out a new task_cpu() helper that handles all of the
above cases. Also add a test case for task_cpu() in case this changes
again.
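
The fallback order described above can be modelled roughly as follows.
This is a sketch using plain dictionaries in place of kernel
structures; it is not the task_cpu() implementation itself, and the
stack_thread_info parameter is a hypothetical stand-in for
(struct thread_info *)task->stack.

```python
def task_cpu(task, stack_thread_info=None):
    """Return the CPU of a task modelled as a dict of its members."""
    thread_info = task.get("thread_info")
    # Linux >= 5.16: task->thread_info.cpu
    if thread_info is not None and "cpu" in thread_info:
        return thread_info["cpu"]
    # Older kernels with CONFIG_THREAD_INFO_IN_TASK: task->cpu
    if "cpu" in task:
        return task["cpu"]
    # Linux < 4.9 or no CONFIG_THREAD_INFO_IN_TASK: thread_info lives on the stack
    if stack_thread_info is not None and "cpu" in stack_thread_info:
        return stack_thread_info["cpu"]
    # No cpu member anywhere: treat as !SMP
    return 0


print(task_cpu({"thread_info": {"cpu": 3}}))
print(task_cpu({"thread_info": {}, "cpu": 2}))
print(task_cpu({}, stack_thread_info={"cpu": 1}))
```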
Fixes: eea5422546 ("libdrgn: make Linux kernel stack unwinding more robust")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Every once in a while, a tests.linux_kernel.test_stack_trace test fails
with a "cannot unwind stack of running task" error or without being able
to find pause or ppoll in the trace. I previously attempted to fix this
in commit a5845e63d4 ("tests: fix race condition in stack trace
tests"). However, now I'm seeing the forked process try to open
/etc/ld.so.cache before pausing for some reason. Since this can block,
proc_blocked() might return true before the process is actually in
pause(). If we get a stack trace while the process is in the wrong
syscall or in between calling another syscall and pause(), we will fail
as mentioned above.
Fix it in a few parts:
1. Use sigwait() instead of pause(), which I doubt is used anywhere else
while forking and won't get woken up by stray signals.
2. Wait until /proc/$pid/syscall shows that we're blocked in
rt_sigtimedwait or rt_sigtimedwait_time64 specifically.
3. Replace the manual fork_and_pause(), wait_until(proc_blocked, pid),
os.kill(pid, signal.SIGKILL), os.waitpid(pid, 0) sequence with a
context manager that takes care of all of that, fork_and_sigwait().
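
The three parts above could be sketched like this. It is an
illustration under assumptions, not the code from the test suite: the
caller passes in the architecture-specific syscall numbers, SIGUSR1 is
an arbitrary choice of signal to wait on, and the timeout parameter is
an addition to keep the sketch from spinning forever.

```python
import contextlib
import os
import signal
import time


def parse_syscall_number(text):
    """Return the syscall number from /proc/<pid>/syscall text, or None."""
    fields = text.split()
    if not fields:
        return None
    try:
        return int(fields[0])
    except ValueError:
        return None  # for example, "running"


def blocked_in(pid, syscall_numbers):
    """Check whether pid is currently blocked in one of the given syscalls."""
    with open(f"/proc/{pid}/syscall") as f:
        return parse_syscall_number(f.read()) in syscall_numbers


@contextlib.contextmanager
def fork_and_sigwait(syscall_numbers, timeout=5.0):
    """Fork a child that blocks in sigwait(); kill and reap it on exit."""
    pid = os.fork()
    if pid == 0:
        try:
            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
            while True:
                signal.sigwait({signal.SIGUSR1})
        finally:
            os._exit(1)
    try:
        deadline = time.monotonic() + timeout
        while not blocked_in(pid, syscall_numbers):
            if time.monotonic() > deadline:
                raise TimeoutError("child never blocked in the expected syscall")
            time.sleep(0.001)
        yield pid
    finally:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
```

A test would then wrap its body in "with fork_and_sigwait(numbers) as
pid:" and take the stack trace of pid inside the block.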
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The next commit will add some more syscall numbers that we need, so let's
settle on one place where we define syscall numbers.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the type of the address (slab allocated
or symbol) and returns a string representation of the address
accordingly. This will be useful for another helper function which
prints the stack trace with more information about each item on the
stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
This is a better place to put all of these generic helpers now. I think
these are all uncommon enough that it shouldn't be too much trouble to
move them.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.
Closes #185.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).
This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.
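
A toy model of why the head page has to be consulted, assuming the
kernel convention that a tail page stores the address of its head page
with bit 0 set in its compound_head field. Pages are modelled as
dictionary entries keyed by address; the field names and addresses are
made up for the example.

```python
def compound_head(pages, page):
    """Return the head page of page, or page itself if it is not a tail page."""
    head = pages[page]["compound_head"]
    if head & 1:
        return head - 1
    return page


def page_is_slab(pages, page):
    """Check the slab flag on the head page rather than on the page itself."""
    return pages[compound_head(pages, page)]["slab"]


pages = {
    0x1000: {"compound_head": 0, "slab": True},  # head page
    0x1040: {"compound_head": 0x1000 | 1, "slab": False},  # tail page
}

print(pages[0x1040]["slab"])  # False: the tail page itself is not marked
print(page_is_slab(pages, 0x1040))  # True: the head page is
```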
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:
  Traceback (most recent call last):
    File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
      find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
  KeyError: 'drgn_test_va'
Use the @skip_unless_have_test_kmod tag. The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.
Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Since _repr_pretty_() uses output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.
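
A sketch of that testing approach with unittest.mock. The Example class
is a hypothetical stand-in for an object whose _repr_pretty_() forwards
str() to the IPython pretty printer; only the call to p.text() is
checked, not the formatting itself.

```python
from unittest import mock


class Example:
    """Hypothetical object implementing the IPython pretty-printing hook."""

    def __str__(self):
        return "(int)42"

    def _repr_pretty_(self, p, cycle):
        p.text(str(self))


def check_repr_pretty(obj):
    """Assert that _repr_pretty_() passes str(obj) to the printer's text()."""
    printer = mock.Mock()
    obj._repr_pretty_(p=printer, cycle=False)
    printer.text.assert_called_once_with(str(obj))


check_repr_pretty(Example())
```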
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.
More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.
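
To illustrate what "simple bit tests" means, here is a sketch that
builds getters from a table of flag names and bit positions. The bit
numbers below are hypothetical placeholders (the real values depend on
the kernel version and configuration), and the flags word is a plain
integer rather than a struct page.

```python
# Hypothetical bit positions, for illustration only.
PAGE_FLAG_BITS = {"locked": 0, "uptodate": 2, "dirty": 4}


def make_page_flag_getter(bit):
    """Build a getter that tests a single bit of a page's flags word."""
    mask = 1 << bit

    def getter(flags):
        return bool(flags & mask)

    return getter


# Generate PageLocked, PageUptodate, PageDirty from the table.
getters = {
    f"Page{name.capitalize()}": make_page_flag_getter(bit)
    for name, bit in PAGE_FLAG_BITS.items()
}

flags = 0b100  # only the hypothetical "uptodate" bit is set
print(getters["PageUptodate"](flags), getters["PageLocked"](flags))
```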
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Modify how the test page is allocated to ensure we have a directly
mapped address which is not slab allocated for testing the negative case
of find_containing_slab_cache.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>