There are a bunch of places where we call .tp_alloc() directly, which is
very verbose. Add a macro that removes the boilerplate.
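A minimal sketch of what such a macro could look like (the name call_tp_alloc
and the FooObject/FooObject_type naming convention are assumptions, not
necessarily what ends up in the tree):

/* Hypothetical helper: allocate an instance of FooObject via its type
 * object FooObject_type and cast the result back to the concrete type. */
#define call_tp_alloc(type) \
        ((type *)type##_type.tp_alloc(&type##_type, 0))

A call site then becomes something like
DrgnObject *ret = call_tp_alloc(DrgnObject); (DrgnObject being a hypothetical
example) instead of spelling out the tp_alloc call and the cast every time.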
Signed-off-by: Omar Sandoval <osandov@osandov.com>
-Wimplicit-fallthrough has a false positive because the compiler
apparently doesn't know that usage() never returns.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of string_builder_finalize(), which leaves the string_builder in
an undefined state (according to the documentation, at least), define
string_builder_null_terminate(), which documents exactly what it does.
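A sketch of roughly what the new function does (assuming the str and len
members and a reserve-for-append helper like the one described in the commit
below; the real implementation may differ):

/* Ensure there is room for a NUL byte and write it at sb->len without
 * counting it in the length, so the builder can keep being appended to. */
bool string_builder_null_terminate(struct string_builder *sb)
{
        if (!string_builder_reserve_for_append(sb, 1))
                return false;
        sb->str[sb->len] = '\0';
        return true;
}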
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Make string_builder_reserve() allocate an exact capacity, and add a
string_builder_reserve_for_append() wrapper that does the
next_power_of_two(current length + number to append) that all of the
current callers want.
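A sketch of the wrapper (next_power_of_two() and the len member are assumed
from context, and the overflow check is illustrative rather than the exact
code):

/* Reserve enough capacity to append n more bytes, growing geometrically. */
bool string_builder_reserve_for_append(struct string_builder *sb, size_t n)
{
        if (n > SIZE_MAX - sb->len)
                return false;
        return string_builder_reserve(sb, next_power_of_two(sb->len + n));
}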
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than documenting how to initialize a struct string_builder,
provide an initializer, STRING_BUILDER_INIT.
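For example, the initializer could be as simple as a zeroing compound literal
(a sketch; the actual definition may differ):

#define STRING_BUILDER_INIT ((struct string_builder){ 0 })

/* Usage: */
struct string_builder sb = STRING_BUILDER_INIT;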
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This time, in order to work on both GCC and Clang, use
__attribute__((__fallthrough__)) instead of /* fallthrough */ comments.
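One possible shape of the feature detection (a sketch; the macro name and
where it lives are assumptions):

/* Use the attribute where the compiler understands it and fall back to a
 * no-op statement otherwise. */
#ifdef __has_attribute
#if __has_attribute(__fallthrough__)
#define fallthrough __attribute__((__fallthrough__))
#endif
#endif
#ifndef fallthrough
#define fallthrough do {} while (0)
#endif

A switch case then ends with "fallthrough;" instead of a comment.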
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This reverts commit e05bfbddc2. Clang
doesn't support /* fallthrough */ comments, so we'll need to use
__attribute__((fallthrough)), which will need some additional feature
detection.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes
drgn_get_initial_registers() to think that the kernel is !SMP and use
CPU 0 instead, producing incorrect stack traces. This has also always
been wrong for Linux < 4.9 and on architectures that don't enable
CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be
((struct thread_info *)task->stack)->cpu.
Fix it by factoring out a new task_cpu() helper that handles all of the
above cases. Also add a test case for task_cpu() in case this changes
again.
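For reference, a kernel-side sketch of where the CPU number lives in each
case (this mirrors the kernel's own accessor and is not the drgn code, which
has to pick the right member at runtime based on the debugged kernel):

static inline unsigned int task_cpu(const struct task_struct *task)
{
#ifdef CONFIG_THREAD_INFO_IN_TASK
        return task->thread_info.cpu;   /* task->cpu before Linux 5.16 */
#else
        return ((struct thread_info *)task->stack)->cpu;
#endif
}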
Fixes: eea5422546 ("libdrgn: make Linux kernel stack unwinding more robust")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Every once in a while, a tests.linux_kernel.test_stack_trace test fails
with a "cannot unwind stack of running task" error or because it cannot
find pause or ppoll in the trace. I previously attempted to fix this
in commit a5845e63d4 ("tests: fix race condition in stack trace
tests"). However, now I'm seeing the forked process try to open
/etc/ld.so.cache before pausing for some reason. Since this can block,
proc_blocked() might return true before the process is actually in
pause(). If we get a stack trace while the process is in the wrong
syscall or in between calling another syscall and pause(), we will fail
as mentioned above.
Fix it in a few parts:
1. Use sigwait() instead of pause(); I doubt sigwait() is used anywhere
else while forking, and it won't get woken up by stray signals.
2. Wait until /proc/$pid/syscall shows that we're blocked in
rt_sigtimedwait or rt_sigtimedwait_time64 specifically.
3. Replace the manual fork_and_pause(), wait_until(proc_blocked, pid),
os.kill(pid, signal.SIGKILL), os.waitpid(pid, 0) sequence with a
context manager that takes care of all of that, fork_and_sigwait().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The next commit will add some more syscall numbers that we need, so let's
settle on one place where we define syscall numbers.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Jakub Kicinski reported that
prog.crashed_thread().stack_trace()[1]['does not exist'] segfaulted on a
vmcore he encountered. The segfault was a NULL pointer dereference of
dwarf_diename() of a DW_TAG_subprogram DIE in
drgn_find_in_dwarf_scopes(). The fix is to ignore DIEs without a name.
I was curious what this anonymous DW_TAG_subprogram was. It turned out
to be some dubious DWARF generated by Clang when a local variable is
defined via a macro. One such example comes from the following code in
arch/x86/events/intel/uncore.h:
static inline bool uncore_mmio_is_valid_offset(struct intel_uncore_box *box,
                                               unsigned long offset)
{
        if (offset < box->pmu->type->mmio_map_size)
                return true;
        pr_warn_once("perf uncore: Invalid offset 0x%lx exceeds mapped area of %s.\n",
                     offset, box->pmu->type->name);
        return false;
}
pr_warn_once() expands to:
#define pr_warn_once(fmt, ...)                                  \
        printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)

#define printk_once(fmt, ...)                                   \
({                                                              \
        static bool __section(".data.once") __print_once;      \
        bool __ret_print_once = !__print_once;                  \
                                                                \
        if (!__print_once) {                                    \
                __print_once = true;                            \
                printk(fmt, ##__VA_ARGS__);                     \
        }                                                       \
        unlikely(__ret_print_once);                             \
})
For some reason, Clang generates an anonymous, top-level
DW_TAG_subprogram DIE to contain the __print_once variable:
 <1><1cf86e>: Abbrev Number: 62 (DW_TAG_subprogram)
 <2><1cf86f>: Abbrev Number: 61 (DW_TAG_variable)
    <1cf870>   DW_AT_name      : (indirect string, offset: 0x34fb2e): __print_once
    <1cf874>   DW_AT_type      : <0x1c574c>
    <1cf878>   DW_AT_decl_file : 1
    <1cf879>   DW_AT_decl_line : 229
    <1cf87a>   DW_AT_location  : 16 byte block: 3 2c 84 66 83 ff ff ff ff 94 1 31 1e 30 22 9f (DW_OP_addr: ffffffff8366842c; DW_OP_deref_size: 1; DW_OP_lit1; DW_OP_mul; DW_OP_lit0; DW_OP_plus; DW_OP_stack_value)
Whereas GCC puts it in a DW_TAG_lexical_block DIE inside of the
DW_TAG_subprogram DIE for uncore_mmio_is_valid_offset():
 <1><3110b2>: Abbrev Number: 45 (DW_TAG_subprogram)
    <3110b3>   DW_AT_name        : (indirect string, offset: 0x2e13e): uncore_mmio_is_valid_offset
    <3110b7>   DW_AT_decl_file   : 4
    <3110b8>   DW_AT_decl_line   : 223
    <3110b9>   DW_AT_decl_column : 20
    <3110ba>   DW_AT_prototyped  : 1
    <3110ba>   DW_AT_type        : <0x2f416b>
    <3110be>   DW_AT_inline      : 3 (declared as inline and inlined)
    <3110bf>   DW_AT_sibling     : <0x311142>
 <2><3110ef>: Abbrev Number: 66 (DW_TAG_lexical_block)
 <3><3110f0>: Abbrev Number: 120 (DW_TAG_variable)
    <3110f1>   DW_AT_name        : (indirect string, offset: 0x2da3f): __print_once
    <3110f5>   DW_AT_decl_file   : 4
    <3110f6>   DW_AT_decl_line   : 229
    <3110f7>   DW_AT_decl_column : 2
    <3110f8>   DW_AT_type        : <0x2f416b>
    <3110fc>   DW_AT_location    : 9 byte block: 3 2c 28 48 83 ff ff ff ff (DW_OP_addr: ffffffff8348282c)
Regardless, we shouldn't crash on this input.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the type of the address (slab allocated
or symbol) and returns a string representation of the address
accordingly. This will be useful for another helper function which
prints the stack trace with more information about each item on the
stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
This is a better place to put all of these generic helpers now. I think
these are all uncommon enough that it shouldn't be too much trouble to
move them.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This will contain the new modules that Nhat is adding and be the new
home for some of the stuff currently in the top-level drgn.helpers
module.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.
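Kernel-side, the information we need lives in struct module itself (a sketch
of the relevant fields from include/linux/module.h; how drgn reads them may
differ):

/* When CONFIG_SMP: base address and size of the module's per-CPU area. */
struct module {
        /* ... */
        void __percpu *percpu;
        unsigned int percpu_size;
        /* ... */
};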
Closes #185.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).
This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
but reading it all at once is still much faster than doing several
smaller reads of only what we need.
In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If drgn_object_set_buffer_from_internal() (used to implement
drgn_object_set_from_buffer(), drgn_object_slice(), and
drgn_object_reinterpret()) sets an object to a primitive type from a
buffer that comes from the same object, then drgn_object_reinit() will
free the value and then drgn_value_serialize() will access the freed
value, probably resulting in garbage. Handle this case the same way we
do if the result type is encoded as a buffer, by first copying to a
temporary value.
This doesn't affect usage through Python because objects are immutable
in the Python bindings.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
find_containing_slab_cache() is supposed to return NULL when it
encounters a page which does not exist. This is detected when accessing
the page flags gives us a fault error. However, this is not checked
correctly in the current implementation. This commit fixes that.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:
Traceback (most recent call last):
File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
KeyError: 'drgn_test_va'
Use the @skip_unless_have_test_kmod tag. The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.
Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
The only exception is the link to ps(1) in task_state_to_char() because
that needs to link to a specific section.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.
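A minimal sketch of the read(2) wrapper (the name read_all() and the error
convention are assumptions; the pread(2) variant just threads an offset
through):

#include <errno.h>
#include <unistd.h>

/* Read exactly count bytes unless EOF or a non-EINTR error occurs. */
static ssize_t read_all(int fd, void *buf, size_t count)
{
        size_t n = 0;
        while (n < count) {
                ssize_t r = read(fd, (char *)buf + n, count - n);
                if (r < 0) {
                        if (errno == EINTR)
                                continue;
                        return -1;
                }
                if (r == 0)
                        break;  /* EOF */
                n += r;
        }
        return n;
}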
drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The call to min() in drgn_read_memory_file() results in the following
warning on 32-bit architectures that I missed on review:
In file included from ../../libdrgn/memory_reader.c:10:
../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file':
../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast
36 | (void)(&unique_x == &unique_y); \
| ^~
../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl'
28 | #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <)
| ^~~~~~~~~~~~~
../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min'
284 | size_t readlen = min(file_end - file_offset, count);
| ^~~
We can fix it with a cast, and additionally do the call to min() earlier
and rework the logic a bit.
Fixes: 9684771d61 ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There's a lot more context here that we should write down. It's also
worth noting that it appears that GDB always zero fills the range
between p_filesz and p_memsz, so if we end up having any other issues
because of this, we might have to concede and go back to the behavior
before commit 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz
in core dumps").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Something changed recently in PackIt and it no longer generates a
changelog if the rpmautospec macro is present. This ends up breaking
EPEL 8 builds, which apparently don't support rpmautospec properly yet
(see https://pagure.io/fedora-infra/rpmautospec/issue/204).
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
Since _repr_pretty_() uses the output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Add pretty printing support in Jupyter notebook for Object, Type,
StackFrame, and StackTrace; it will print out their representation in
programming language syntax with str(), similar to what's being done in
interactive mode.
Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Apparently Sphinx doesn't like it when you use the same link text for
two different links. Fix it by adding an extra underscore, which makes
it an anonymous reference.
Fixes: 9c69d2dd4b ("README: update libkdumpfile installation instructions")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure was all
zeros, so the page was excluded and we were generating a FaultError when
trying to access the structure.
This change reverts a portion of the behaviour introduced by the commit
referenced below such that when we are debugging a kernel core, we go back
to the zero fill behaviour. To do this, we go back to creating segments
based on memsz instead of filesz and handle the filesz->memsz gap in
drgn_read_memory_file().
Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
Replace the old "Scriptable debugger library" and
"Debugger-as-a-library" taglines with the one we're using on GitHub,
"Programmable debugger". Make up for it by emphasizing that drgn can
also be used as a library a tiny bit more in the README.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we're documenting parameter and return types with annotations,
we can use only one line for the overload of functions that can take
either an object or a program and an integer.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of adding type information to directive descriptions with the
:type:, :rtype:, and :vartype: fields, document types with type
annotations. For functions and methods, we add the type annotations to
the signature. For variables and attributes, we use the :type: option.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.
More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn is now packaged for openSUSE. Add instructions for installing with
zypper or from source. Also reindent the Arch Linux instructions
correctly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
PyErr_SetObject() takes a reference on the exception value, so we need
to drop the reference we got when we created the value. Issue #196 ran
into this by reading tons of unmapped addresses.
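The general pattern is (a sketch with placeholder names such as FaultError,
not the exact drgn code):

/* PyErr_SetObject() takes its own reference to the value, so drop the one
 * we got from constructing it. */
PyObject *value = PyObject_CallFunction(FaultError, "sK", "message",
                                        (unsigned long long)address);
if (value) {
        PyErr_SetObject(FaultError, value);
        Py_DECREF(value);
}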
Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Modify how the test page is allocated to ensure we have a directly
mapped address that is not slab allocated, for testing the negative case
of find_containing_slab_cache().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.
Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
cgroup_bpf_prog_for_each() needed a minor update, but after fixing that,
all of the flavors pass all tests.
Signed-off-by: Omar Sandoval <osandov@osandov.com>