This will contain the new modules that Nhat is adding and be the new
home for some of the stuff currently in the top-level drgn.helpers
module.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.
Closes #185.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).
This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
but one read of that size is still much faster than several small reads
of only the fields we need.
In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If drgn_object_set_buffer_from_internal() (used to implement
drgn_object_set_from_buffer(), drgn_object_slice(), and
drgn_object_reinterpret()) sets an object to a primitive type from a
buffer that comes from the same object, then drgn_object_reinit() will
free the value and then drgn_value_serialize() will access the freed
value, probably resulting in garbage. Handle this case the same way we
do if the result type is encoded as a buffer, by first copying to a
temporary value.
This doesn't affect usage through Python because objects are immutable
in the Python bindings.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
find_containing_slab_cache() is supposed to return NULL when it
encounters a page that does not exist. This is detected when accessing
the page flags results in a fault error. However, the current
implementation does not check for this correctly. Fix it.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:
  Traceback (most recent call last):
    File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
      find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
  KeyError: 'drgn_test_va'
Use the @skip_unless_have_test_kmod tag. The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.
Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
The only exception is the link to ps(1) in task_state_to_char() because
that needs to link to a specific section.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.
drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.
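
For illustration only, the short-read handling that such a wrapper hides
can be sketched in Python. The name pread_all() is hypothetical and is not
one of the wrappers added by this change (those are in C). Python retries
EINTR on its own (PEP 475), so the sketch only has to deal with short
reads and end of file.

```python
import os


def pread_all(fd: int, count: int, offset: int) -> bytes:
    """Read up to count bytes at offset, looping over short reads.

    Hypothetical helper for illustration. Returns fewer than count bytes
    only if end of file is reached first.
    """
    chunks = []
    while count > 0:
        chunk = os.pread(fd, count, offset)
        if not chunk:
            # End of file: stop instead of spinning forever.
            break
        chunks.append(chunk)
        offset += len(chunk)
        count -= len(chunk)
    return b"".join(chunks)
```

Callers can then treat a result shorter than requested as end of file
rather than having to loop themselves.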
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The call to min() in drgn_read_memory_file() results in the following
warning on 32-bit architectures that I missed on review:
  In file included from ../../libdrgn/memory_reader.c:10:
  ../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file':
  ../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast
     36 |         (void)(&unique_x == &unique_y); \
        |                          ^~
  ../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl'
     28 | #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <)
        |                   ^~~~~~~~~~~~~
  ../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min'
    284 |                 size_t readlen = min(file_end - file_offset, count);
        |                                  ^~~
We can fix it with a cast, and additionally do the call to min() earlier
and rework the logic a bit.
Fixes: 9684771d61 ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There's a lot more context here that we should write down. It's also
worth noting that it appears that GDB always zero fills the range
between p_filesz and p_memsz, so if we end up having any other issues
because of this, we might have to concede and go back to the behavior
before commit 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz
in core dumps").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Something changed recently in PackIt and it no longer generates a
changelog if the rpmautospec macro is present. This ends up breaking
EPEL 8 builds, which apparently don't support rpmautospec properly yet
(see https://pagure.io/fedora-infra/rpmautospec/issue/204).
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
Since _repr_pretty_() uses the output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Add pretty printing support in Jupyter notebook for Object, Type,
StackFrame, and StackTrace; it will print out their representation in
programming language syntax with str(), similar to what's being done in
interactive mode.
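
As a rough sketch of the IPython protocol involved: a class opts in by
defining _repr_pretty_(self, p, cycle) and writing to the printer with
p.text(). The class name PrettyStr below is hypothetical and only stands
in for the drgn types; it is not the actual binding code.

```python
from unittest.mock import Mock


class PrettyStr:
    """Hypothetical stand-in for a type whose str() is its pretty form."""

    def __init__(self, text: str) -> None:
        self._text = text

    def __str__(self) -> str:
        return self._text

    def _repr_pretty_(self, p, cycle: bool) -> None:
        # IPython passes its pretty printer as p; emit the str() output.
        p.text(str(self))


# A mock printer is enough to check that p.text() receives str(obj).
printer = Mock()
obj = PrettyStr("(int)42")
obj._repr_pretty_(printer, False)
printer.text.assert_called_once_with("(int)42")
```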
Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Apparently Sphinx doesn't like it when you use the same link text for
two different links. Fix it by adding an extra underscore, which makes
it an anonymous reference.
Fixes: 9c69d2dd4b ("README: update libkdumpfile installation instructions")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure
was all zeros so the page was excluded and we were generating a
FaultError trying to access the structure.
This change reverts a portion of that behaviour such that when we are
debugging a kernel core we go back to the zero fill behaviour. To do this
we go back to creating segments based on memsz instead of filesz and
handling the filesz->memsz gap in drgn_read_memory_file.
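
The zero fill behaviour can be sketched roughly as follows. This is a
simplified Python model and not the code in drgn_read_memory_file; the
function name read_segment() and its parameters are assumptions made for
the example, with the whole file held in memory as bytes.

```python
def read_segment(
    file_data: bytes,
    p_offset: int,
    p_filesz: int,
    p_memsz: int,
    seg_offset: int,
    count: int,
) -> bytes:
    """Read count bytes at seg_offset within a segment.

    Bytes below p_filesz come from the file; bytes between p_filesz and
    p_memsz are returned as zeros. Hypothetical helper for illustration.
    """
    if seg_offset < 0 or count < 0 or seg_offset + count > p_memsz:
        raise ValueError("read is outside of the segment")
    # Number of requested bytes that are actually backed by the file.
    file_count = max(0, min(count, p_filesz - seg_offset))
    start = p_offset + seg_offset
    data = file_data[start : start + file_count]
    if len(data) != file_count:
        raise ValueError("file is truncated")
    # Everything past p_filesz is zero filled.
    return data + bytes(count - file_count)
```

With this model, a structure that straddles the end of the file-backed
data reads back with its tail as zeros instead of faulting.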
Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
Replace the old "Scriptable debugger library" and
"Debugger-as-a-library" taglines with the one we're using on GitHub,
"Programmable debugger". Make up for it by emphasizing that drgn can
also be used as a library a tiny bit more in the README.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we're documenting parameter and return types with annotations,
we can use only one line for the overload of functions that can take
either an object or a program and an integer.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of adding type information to directive descriptions with the
:type:, :rtype:, and :vartype: fields, document types with type
annotations. For functions and methods, we add the type annotations to
the signature. For variables and attributes, we use the :type: option.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.
More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.
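
To illustrate what "simple bit test" means here, a sketch in plain
Python follows. The bit numbers are made up for the example; in a real
kernel they come from enum pageflags and depend on the version and
configuration. make_flag_getter() is a hypothetical name, not part of the
generated helpers.

```python
from typing import Callable, Dict

# Hypothetical bit numbers, for illustration only.
EXAMPLE_PAGE_FLAG_BITS: Dict[str, int] = {
    "locked": 0,
    "uptodate": 2,
    "dirty": 3,
}


def make_flag_getter(bit: int) -> Callable[[int], bool]:
    """Return a getter that tests one bit of a page's flags word."""

    def getter(flags: int) -> bool:
        return bool(flags & (1 << bit))

    return getter


PageLocked = make_flag_getter(EXAMPLE_PAGE_FLAG_BITS["locked"])
PageUptodate = make_flag_getter(EXAMPLE_PAGE_FLAG_BITS["uptodate"])
PageDirty = make_flag_getter(EXAMPLE_PAGE_FLAG_BITS["dirty"])
```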
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn is now packaged for openSUSE. Add instructions for installing with
zypper or from source. Also reindent the Arch Linux instructions
correctly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
PyErr_SetObject() takes a reference on the exception value, so we need
to drop the reference we got when we created the value. Issue #196 ran
into this by reading tons of unmapped addresses.
Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Modify how the test page is allocated to ensure we have a directly
mapped address which is not slab allocated for testing the negative case
of find_containing_slab_cache.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.
Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
cgroup_bpf_prog_for_each() needed a minor update, but after fixing that,
all of the flavors pass all tests.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The kernel makes use of several lockless singly linked lists
(free_ipc_list, delayed_mntput_list, etc.), so having some helpers to
traverse these lists can be useful.
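
The traversal itself amounts to following next pointers from the head's
first node until NULL. A minimal model in plain Python, with hypothetical
Node and llist_for_each() names standing in for struct llist_node and the
real helpers:

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Node:
    """Hypothetical stand-in for an entry embedding struct llist_node."""

    value: int
    next: Optional["Node"] = None


def llist_for_each(first: Optional[Node]) -> Iterator[Node]:
    """Yield each node, starting at first, until next is None (NULL)."""
    node = first
    while node is not None:
        yield node
        node = node.next


# Build 1 -> 2 -> 3 and walk it.
head = Node(1, Node(2, Node(3)))
values = [node.value for node in llist_for_each(head)]
```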
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
We don't specifically need BusyBox; we just need a reasonable Linux
userspace, which we can assume is already available on the host, whether
it's coreutils+util-linux, BusyBox, or something else.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test command does this, and I always end up doing it when I'm doing
manual testing with the vmtest.vm CLI, so let's just do it by default.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Just picking up the newest version. Also fix the following warning:
WARNING: extlinks: Sphinx-6.0 will require a caption string to contain exactly one '%s' and all other '%' need to be escaped as '%%'.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
"Object finder" was renamed from "symbol finder" a while ago, but we
forgot to update the advanced usage documentation.
Fixes: 0c5df56fba ("libdrgn: replace symbol index with object index")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently checking whether the iterator has entered the
non-canonical range when fetching the last level of the page table, but
the cutover actually happens while we're in the last level. Fix it by
doing the check unconditionally.
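
For reference, a canonical address check can be sketched as below. This
assumes x86-64 with 48-bit virtual addresses, which this message does not
state, and is_canonical() is a hypothetical name rather than the check in
the page table iterator.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return whether a 64-bit address is canonical.

    An address is canonical if bits 63 down to va_bits - 1 are all equal.
    Assumes 64-bit addresses; va_bits=48 matches 4-level paging.
    """
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones
```

Under these assumptions, the non-canonical range runs from
0x0000800000000000 through 0xffff7fffffffffff, and an iterator that walks
upwards can cross into it in the middle of a last-level table.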
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Similarly to the helpers available to iterate over eBPF programs and
maps, add helpers for links and BTF objects. The implementation is very
straightforward.
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[Omar: add kernel version comments]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The helpers only work on Linux v4.15 and later, but it's easy to make
them work before that. We can also easily handle kernels without cgroup BPF
programs (either before Linux v4.10 or without CONFIG_CGROUP_BPF) and
yield nothing.
Signed-off-by: Omar Sandoval <osandov@osandov.com>