1356 Commits

Author SHA1 Message Date
Omar Sandoval
4e86a9ae56 Create drgn.helpers.common package
This will contain the new modules that Nhat is adding and be the new
home for some of the stuff currently in the top-level drgn.helpers
module.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-14 17:08:39 -07:00
Omar Sandoval
0f854d2d55 drgn.helpers.linux: remove unused "# type: ignore"
python/mypy#1422 was fixed a while ago.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-14 17:08:39 -07:00
Omar Sandoval
30c9ad452d libdrgn: linux_kernel: fix global per-CPU variables in kernel modules
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.

Closes #185.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:27:28 -07:00
Omar Sandoval
a52016c4cb libdrgn: linux_kernel: always use module list from core
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).

This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:11:59 -07:00
Omar Sandoval
94036f6daf libdrgn: linux_kernel: optimize reading module list
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
which is still much faster to read than only reading what we need.

In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:33 -07:00
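The trade-off described in the commit above (one larger read per module instead of several small ones) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `CountingMemory`, `RECORD`, and `parse_record` are hypothetical names, and the record layout is invented and does not match the real struct module.

```python
import struct


class CountingMemory:
    """Hypothetical memory reader that counts how many reads are issued."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.reads = 0

    def read(self, address: int, size: int) -> bytes:
        self.reads += 1
        return self.data[address:address + size]


# Invented layout: 8-byte next pointer, 8-byte state, 16-byte name.
RECORD = struct.Struct("<QQ16s")


def parse_record(mem: CountingMemory, address: int):
    # Fetch the whole record with a single read, then decode fields locally.
    raw = mem.read(address, RECORD.size)
    next_addr, state, name = RECORD.unpack(raw)
    return next_addr, state, name.split(b"\0", 1)[0].decode()


mem = CountingMemory(RECORD.pack(0, 1, b"drgn_test"))
record = parse_record(mem, 0)
```

Reading each field separately would cost one read per field (plus one per name character in the worst case); here `mem.reads` stays at 1 regardless of how many fields are decoded.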
Omar Sandoval
a2db11ebae libdrgn: object: fix use after free in drgn_object_set_from_buffer_internal()
If drgn_object_set_from_buffer_internal() (used to implement
drgn_object_set_from_buffer(), drgn_object_slice(), and
drgn_object_reinterpret()) sets an object to a primitive type from a
buffer that comes from the same object, then drgn_object_reinit() will
free the value and then drgn_value_serialize() will access the freed
value, probably resulting in garbage. Handle this case the same way we
do if the result type is encoded as a buffer, by first copying to a
temporary value.

This doesn't affect usage through Python because objects are immutable
in the Python bindings.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:33 -07:00
Omar Sandoval
67d0e8dab5 docs: remove stray backtick
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:12 -07:00
Omar Sandoval
e5c7acb4fb drgn.helpers.linux.slab: handle compound pages in find_containing_slab_cache()
Some slab caches for large objects (like task_struct) allocate slabs as
compound pages. Only the head page is marked as PageSlab(), so if
find_containing_slab_cache() gets an address that was allocated out of a
tail page, it will incorrectly return NULL. Fix it by always getting the
compound_head, and add a test case with large slab objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-09 16:35:28 -07:00
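The head-page lookup mentioned in the commit above can be sketched in plain Python. This assumes the encoding used by recent kernels, where a tail page stores the head page's address in its compound_head field with bit 0 set; `compound_head_address` is a hypothetical name, not a drgn helper, and the addresses are made up.

```python
def compound_head_address(page_address: int, compound_head_field: int) -> int:
    """Return the address of the head page for the given struct page.

    Assumes bit 0 of the compound_head field marks a tail page and the
    remaining bits hold the head page's address.
    """
    if compound_head_field & 1:
        return compound_head_field - 1
    # Not a tail page: the page is its own head.
    return page_address


head = 0xFFFFEA0000100000  # made-up struct page address
tail = head + 64  # made-up address of the next struct page
tail_head = compound_head_address(tail, head | 1)
own_head = compound_head_address(head, 0)
```

Resolving to the head first means a flag that is only set on the head page is checked on the right struct page, whichever page of the compound page the address falls in.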
Omar Sandoval
42e7d474d1 drgn.helpers.linux.mm: add compound page helpers
I had these helpers lying around from a couple of bugs related to
compound pages that I debugged.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-09 15:54:43 -07:00
Nhat Pham
be04182ae7 drgn.helpers.linux.slab: fix find_containing_slab_cache's behavior when the page does not exist
find_containing_slab_cache is supposed to return NULL when it encounters
a page that does not exist. This is detected when accessing the page
flags raises a fault error. However, the current implementation does not
check for this correctly. This commit fixes the issue.

Signed-off-by: Nhat Pham <nphamcs@gmail.com>
2022-08-30 17:44:28 -07:00
Peilin Ye
517d4bea18 tests: Add missing tags for test_find_containing_slab_cache_invalid()
Running test_find_containing_slab_cache_invalid() without the drgn_test
Linux kernel module gives a KeyError:

  Traceback (most recent call last):
    File ".../tests/linux_kernel/helpers/test_slab.py", line 169, in test_find_containing_slab_cache_invalid
      find_containing_slab_cache(self.prog, self.prog["drgn_test_va"]),
  KeyError: 'drgn_test_va'

Use the @skip_unless_have_test_kmod tag.  The test also needs a
@skip_unless_have_full_mm_support tag as pointed out by Omar, so add it
while we are at it.

Fixes: 79ea6589c2 ("drgn.helpers.linux.slab: add find_containing_slab_cache helper")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
2022-08-29 15:01:18 -07:00
Omar Sandoval
f8ba278bc1 libdrgn: fix include-what-you-use warnings
It's been a while since I've run this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
eb38d88f15 docs: link to man pages with :manpage: consistently
The only exception is the link to ps(1) in task_state_to_char() because
that needs to link to a specific section.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
b8cdfff250 libdrgn: add read(2) and pread(2) wrappers that don't return short reads
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.

drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
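The idea behind the wrappers in the commit above can be sketched in Python with os.pread(). This is an illustration only, not the libdrgn C code; `pread_all` is a hypothetical name. Python has retried EINTR on its own since version 3.5, so the sketch only needs to handle short reads.

```python
import os
import tempfile


def pread_all(fd: int, count: int, offset: int) -> bytes:
    """Read count bytes at offset, looping until done or end of file.

    Returns fewer than count bytes only if end of file is reached.
    """
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset)
        if not chunk:  # end of file
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    data = pread_all(f.fileno(), 4, 3)
    tail = pread_all(f.fileno(), 100, 8)
```

Callers can then treat a result shorter than requested as end of file rather than having to loop themselves.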
Omar Sandoval
56fda2a0cf libdrgn: fix min() warning on 32-bit architectures
The call to min() in drgn_read_memory_file() results in the following
warning on 32-bit architectures that I missed on review:

In file included from ../../libdrgn/memory_reader.c:10:
../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file':
../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast
   36 |         (void)(&unique_x == &unique_y);                                         \
      |                          ^~
../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl'
   28 | #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <)
      |                   ^~~~~~~~~~~~~
../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min'
  284 |                 size_t readlen = min(file_end - file_offset, count);
      |                                  ^~~

We can fix it with a cast, and additionally do the call to min() earlier
and rework the logic a bit.

Fixes: 9684771d61 ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
04d2dee964 libdrgn: elaborate on core dump p_filesz < p_memsz ambiguity
There's a lot more context here that we should write down. It's also
worth noting that it appears that GDB always zero fills the range
between p_filesz and p_memsz, so if we end up having any other issues
because of this, we might have to concede and go back to the behavior
before commit 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz
in core dumps").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Davide Cavalca
30731abe6b packit: disable rpmautospec
Something changed recently in PackIt and it no longer generates a
changelog if the rpmautospec macro is present. This ends up breaking
EPEL 8 builds, which apparently don't support rpmautospec properly yet
(see https://pagure.io/fedora-infra/rpmautospec/issue/204).

Signed-off-by: Davide Cavalca <dcavalca@fb.com>
2022-08-25 15:35:30 -07:00
Shung-Hsi Yu
e8d0c85811 test: add test for _repr_pretty_() method
Since _repr_pretty_() uses output of str(), and the latter is already
heavily tested in tests/test_language_c.py, we can simply test whether
p.text() is called instead of duplicating all the test cases.

Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
2022-08-25 13:52:28 -07:00
Shung-Hsi Yu
9335e227d6 libdrgn: python: add Jupyter pretty printing support
Add pretty printing support in Jupyter notebook for Object, Type,
StackFrame, and StackTrace; it will print out their representation in
programming language syntax with str(), similar to what's being done in
interactive mode.

Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
2022-08-25 13:52:11 -07:00
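The two commits above rely on IPython's _repr_pretty_(p, cycle) hook. A minimal sketch of both the hook and the testing approach (checking that p.text() is called with the str() output) might look like the following; `Thing` is a hypothetical stand-in class, not one of the drgn types.

```python
from unittest import mock


class Thing:
    """Hypothetical stand-in for a class with a meaningful str()."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __str__(self) -> str:
        return f"(int){self.value}"

    def _repr_pretty_(self, p, cycle: bool) -> None:
        # IPython passes a pretty printer and a flag that is true when
        # the object is part of a reference cycle.
        p.text("..." if cycle else str(self))


printer = mock.Mock()
Thing(42)._repr_pretty_(printer, False)
printer.text.assert_called_once_with("(int)42")
```

Because the hook only forwards str(), mocking the printer is enough; the formatting itself stays covered by the existing str() tests.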
Omar Sandoval
3865e8bdc8 docs: fix "Duplicate explicit target name" Sphinx warning
Apparently Sphinx doesn't like it when you use the same link text for
two different links. Fix it by adding an extra underscore, which makes
it an anonymous reference.

Fixes: 9c69d2dd4b ("README: update libkdumpfile installation instructions")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-25 13:26:36 -07:00
Glen McCready
9684771d61 libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure
was all zeros so the page was excluded and we were generating a
FaultError trying to access the structure.

This change reverts a portion of that behaviour such that when we are
debugging a kernel core we go back to the zero fill behaviour. To do this
we go back to creating segments based on memsz instead of filesz and
handling the filesz->memsz gap in drgn_read_memory_file.

Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
2022-08-25 11:59:39 -07:00
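The zero-fill behaviour described in the commit above can be sketched as follows. This is a simplified model rather than the drgn_read_memory_file() implementation: `read_segment` is a hypothetical function, and the segment is represented by an in-memory bytes object whose length plays the role of p_filesz.

```python
def read_segment(file_data: bytes, mem_size: int, offset: int, count: int) -> bytes:
    """Read count bytes at offset within a segment of mem_size bytes.

    file_data holds the part of the segment present in the file; the
    range from len(file_data) up to mem_size is treated as zeros.
    """
    if offset < 0 or count < 0 or offset + count > mem_size:
        raise ValueError("read outside of segment")
    from_file = file_data[offset:offset + count]
    # Pad with zeros for the part past the end of the file data.
    return from_file + bytes(count - len(from_file))


straddling = read_segment(b"abcd", 8, 2, 4)
excluded = read_segment(b"abcd", 8, 6, 2)
```

A read that straddles the end of the file data returns real bytes followed by zeros instead of failing, which matches the structure-across-a-page-boundary case from the commit message.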
Omar Sandoval
9c69d2dd4b README: update libkdumpfile installation instructions
It is also packaged in the AUR, so make the instructions more accurate
for each distro.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-19 01:26:50 -07:00
Omar Sandoval
ca373fe38a docs: use "programmable debugger" description consistently
Replace the old "Scriptable debugger library" and
"Debugger-as-a-library" taglines with the one we're using on GitHub,
"Programmable debugger". Make up for it by emphasizing that drgn can
also be used as a library a tiny bit more in the README.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-19 01:21:32 -07:00
Omar Sandoval
5b6a8c27a9 docs: make overloaded address helper documentation more concise
Now that we're documenting parameter and return types with annotations,
we can use only one line for the overload of functions that can take
either an object or a program and an integer.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-19 01:11:37 -07:00
Omar Sandoval
ca94b87268 drgndoc: format types as type annotations
Instead of adding type information to directive descriptions with the
:type:, :rtype:, and :vartype: fields, document types with type
annotations. For functions and methods, we add the type annotations to
the signature. For variables and attributes, we use the :type: option.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-19 01:07:37 -07:00
Omar Sandoval
d14f751475 drgn.helpers.linux.mm: add simple PageFlag() getters
There are a bunch of page flag getters in the kernel like
PageUptodate(), PageLocked(), etc., that kernel developers are
accustomed to using. Most of them are simple bit tests. Let's add
helpers for all of those. These are generated from
include/linux/page-flags.h in the Linux kernel source tree as of Linux
v6.0-rc1.

More complicated getters that need to do more than a simple flag check
(e.g., PageCompound()) will need to be added manually.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-18 15:50:15 -07:00
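The "simple bit test" getters from the commit above follow a pattern that can be sketched in plain Python. The bit numbers below are hypothetical placeholders (the real values depend on the kernel version and configuration), and `make_page_flag_getter` is an invented name; the sketch takes the raw flags value as an integer instead of a struct page object.

```python
from typing import Callable


def make_page_flag_getter(bit: int) -> Callable[[int], bool]:
    """Create a getter that tests a single bit of a page's flags word."""

    def getter(flags: int) -> bool:
        return bool(flags & (1 << bit))

    return getter


# Hypothetical bit numbers, for illustration only.
PG_LOCKED_BIT = 0
PG_UPTODATE_BIT = 2

PageLocked = make_page_flag_getter(PG_LOCKED_BIT)
PageUptodate = make_page_flag_getter(PG_UPTODATE_BIT)

flags = 0b101
locked = PageLocked(flags)
uptodate = PageUptodate(flags)
```

Getters that need more than one bit test, such as PageCompound(), do not fit this pattern, which is why the commit message leaves them to be written by hand.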
Michel Alexandre Salim
c0ed1a3203 Fix spelling error
abbrevation => abbreviation; caught by Debian's lintian

Signed-off-by: Michel Alexandre Salim <michel@michel-slm.name>
2022-08-17 21:45:51 -07:00
Omar Sandoval
eda56c153b README: add openSUSE installation instructions
drgn is now packaged for openSUSE. Add instructions for installing with
zypper or from source. Also reindent the Arch Linux instructions
correctly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-16 23:29:35 -07:00
Omar Sandoval
6c90315f6f python: fix FaultError reference leak
PyErr_SetObject() takes a reference on the exception value, so we need
to drop the reference we got when we created the value. Issue #196 ran
into this by reading tons of unmapped addresses.

Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-16 17:35:36 -07:00
Nhat Pham
79ea6589c2 drgn.helpers.linux.slab: add find_containing_slab_cache helper
This helper function identifies the slab cache (if any) the object at
the given address belongs to. This will be useful for a future helper
function which prints the stack trace with more information about each
item on the stack.

Signed-off-by: Nhat Pham <nphamcs@gmail.com>
2022-08-16 15:52:21 -07:00
Nhat Pham
93f8d07bcf tests: directly allocate the test page in test kernel module
Modify how the test page is allocated to ensure we have a directly
mapped address which is not slab allocated for testing the negative case
of find_containing_slab_cache.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
2022-08-16 15:52:21 -07:00
Omar Sandoval
a19203a73e libdrgn: fix QEMU guest memory dump Kconfig suggestion
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.

Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-15 15:11:56 -07:00
Omar Sandoval
decedc9734 setup.py: add 6.0 to vmtest kernels
cgroup_bpf_prog_for_each() needed a minor update, but after fixing that,
all of the flavors pass all tests.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-15 13:41:47 -07:00
Omar Sandoval
f5b2576314 drgn.helpers.linux.bpf: fix cgroup_bpf_prog_for_each() on Linux 6.0
The cgroup BPF program list was changed from a regular list to an hlist.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-15 13:40:43 -07:00
Omar Sandoval
6f3408829f docs: fix Sphinx "Title underline too short" warning
Fixes: 585bc6a3be ("Add helpers for lockless single lists (llist).")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-11 14:46:18 -07:00
Omar Sandoval
faaf01ad1b Add drgn.StackTrace.prog and drgn_stack_trace_program()
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-11 14:45:54 -07:00
Imran Khan
4296653090 tests: add test cases for Linux llist helpers.
Use the test kernel module to set up tests and add test_llist.py to
carry out testing.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
2022-08-08 08:22:32 -07:00
Imran Khan
585bc6a3be Add helpers for lockless single lists (llist).
The kernel makes use of several lockless singly linked lists
(free_ipc_list, delayed_mntput_list, etc.), so having some helpers to
traverse these lists can be useful.

Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
2022-08-08 08:22:32 -07:00
Omar Sandoval
b535b8f82e vmtest: don't use BusyBox
We don't specifically need BusyBox; we just need a reasonable Linux
userspace, which we can assume is already available on the host, whether
it's coreutils+util-linux, BusyBox, or something else.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-01 11:23:26 -07:00
Omar Sandoval
1b8d0ae82b vmtest.vm: change to host's working directory by default
The test command does this, and I always end up doing it when I'm doing
manual testing with the vmtest.vm CLI, so let's just do it by default.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-01 10:59:05 -07:00
Omar Sandoval
2c38ea5219 docs: update required Sphinx version to 5.1.1
Just picking up the newest version. Also fix the following warning:

  WARNING: extlinks: Sphinx-6.0 will require a caption string to contain exactly one '%s' and all other '%' need to be escaped as '%%'.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-27 10:04:20 -07:00
Omar Sandoval
0d95ac0d6e docs: fix stray reference to symbol finder
"Object finder" was renamed from "symbol finder" a while ago, but we
forgot to update the advanced usage documentation.

Fixes: 0c5df56fba ("libdrgn: replace symbol index with object index")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-27 09:40:43 -07:00
Omar Sandoval
e3ba4d2f99 drgn 0.0.20
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-25 16:52:28 -07:00
Omar Sandoval
c47dd9952e Update elfutils in manylinux wheels to 0.187
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-25 16:52:24 -07:00
Omar Sandoval
e9d16732d6 libdrgn: x86_64: fix page table iteration over non-canonical range
We're currently checking whether the iterator has entered the
non-canonical range when fetching the last level of the page table, but
the cutover actually happens while we're in the last level. Fix it by
doing the check unconditionally.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-24 00:03:45 -07:00
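The canonical-address check referred to in the commit above can be sketched in Python. This assumes 64-bit addresses with 48 bits of virtual address (4-level paging) by default; `is_canonical` is a hypothetical name and this is not the libdrgn page table iterator code.

```python
def is_canonical(address: int, va_bits: int = 48) -> bool:
    """Return whether a 64-bit address is canonical for va_bits of virtual address.

    An address is canonical when bits 63 through va_bits - 1 are all
    zeros or all ones.
    """
    top = address >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


last_lower = is_canonical(0x00007FFFFFFFFFFF)
first_hole = is_canonical(0x0000800000000000)
first_upper = is_canonical(0xFFFF800000000000)
```

The boundary between 0x00007FFFFFFFFFFF and 0x0000800000000000 can be crossed while stepping through entries of a single last-level table, which is why checking only when a new table is fetched misses the cutover.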
Omar Sandoval
43f045ae1a tests: add BPF helper tests
These require a fair bit of scaffolding, but it's worth it to fill one
of our major testing gaps.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 23:17:04 -07:00
Omar Sandoval
3b2a4d7b20 tests: factor out temporary cgroup creation function
Some BPF tests want a temporary cgroup to test with.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 17:35:24 -07:00
Omar Sandoval
901c1fb190 tests: factor out function for raising OSError from ctypes call
We duplicate this in a few places, and for the BPF tests we will want it
again.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 17:34:53 -07:00
Quentin Monnet
764a858ee6 helpers: Add BPF helpers for iterating over BPF links and BTF objects
Similarly to the helpers available to iterate over eBPF programs and
maps, add helpers for links and BTF objects. The implementation is very
straightforward.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
[Omar: add kernel version comments]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-21 11:32:32 -07:00
Omar Sandoval
d20b446d2c drgn.helpers.linux.bpf: handle more kernel versions in cgroup_bpf_prog_for_each{,_effective}()
The helpers only work since Linux v4.15, but it's easy to make them work
before that. We can also easily handle kernels without cgroup BPF
programs (either before Linux v4.10 or without CONFIG_CGROUP_BPF) and
yield nothing.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-20 00:15:12 -07:00