DEFINE_BINARY_SEARCH_TREE_TYPE() doesn't need these. This is preparation
for a potential new use of a BST. But it's also a good cleanup on its
own and allows us to move some code out of memory_reader.h and into
memory_reader.c. (This is similar to commit 1339dc6a2f ("libdrgn:
hash_table: move entry_to_key to DEFINE_HASH_TABLE_FUNCTIONS()").)
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of getting the address range from the sections we find, get it
directly from /proc/modules or from the `struct module`. (We already had
partial code to get the address range, but I can't remember why I didn't
use it.)
The real motivation for this is the upcoming module rework: it'll allow
us to report the module and its address range before iterating through
its sections. But it also means that we don't need the heuristic to
ignore special sections that shouldn't be considered part of the address
range (e.g., .init, .data..percpu [the latter of which we should be
ignoring, but we get away with not doing so because it's excluded from
sysfs]).
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Like drgn_error_fwrite(), but writes to a file descriptor instead of a
stdio stream. This will be used for logging.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn_error_fwrite() only calls string_builder_append_error() to get
special formatting for DRGN_ERROR_OS, but DRGN_ERROR_FAULT also needs
special formatting. Rather than needing to keep drgn_error_fwrite() and
string_builder_append_error() in sync, define them both in terms of a
common macro.
Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Since most Linux distros enable this, we should make sure we test it. It
was added for SLUB in Linux kernel commit 2482ddec670f ("mm: add SLUB
free list pointer obfuscation") (in v4.14), so we'll still get test
coverage of the non-hardened codepath while 4.9 is around.
It was also added for SLAB in Linux kernel commit 3404be67bf73
("mm/slab: expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB") (in
v5.9), although that currently doesn't change the in-memory data
structures.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Most Linux distros (I checked Fedora, Debian, and Arch) enable this,
which requires us to de-obfuscate the pointers in the freelist.
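For reference, the de-obfuscation is an XOR, so the same operation
encodes and decodes. The sketch below assumes a 64-bit kernel and uses
made-up values for the per-cache random value and the addresses; the
function names are illustrative. Newer kernels also byte-swap the
address of the pointer before XORing, which is modeled by the swab_addr
flag.

```python
MASK64 = (1 << 64) - 1


def swab64(x: int) -> int:
    # Reverse the byte order of a 64-bit value.
    return int.from_bytes(x.to_bytes(8, "little"), "big")


def freelist_ptr(stored: int, ptr_addr: int, random: int,
                 swab_addr: bool = True) -> int:
    # stored:   value read from the freelist slot
    # ptr_addr: address of the slot itself
    # random:   per-cache random value
    # XOR is its own inverse, so this both obfuscates and de-obfuscates.
    addr = swab64(ptr_addr) if swab_addr else ptr_addr
    return (stored ^ random ^ addr) & MASK64


# Made-up example values.
random = 0x1234_5678_9ABC_DEF0
slot = 0xFFFF_8880_0123_4560
next_free = 0xFFFF_8880_0123_4580

obfuscated = freelist_ptr(next_free, slot, random)
print(hex(freelist_ptr(obfuscated, slot, random)))
```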
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This has been useful to run manually before, but I haven't added it to
the CI because it was somewhat noisy. But it reports some really useful
warnings, so let's configure it for our needs and add it to pre-commit.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several have snuck in since the last time I did this in commit
5541fad063 ("Fix some flake8 errors"). Prepare for adding flake8 to
pre-commit by fixing them.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Splitting the tests between tests/linux_kernel and tests/helpers/linux
means that we have to set up the unit tests twice, including loading
debug info. Python 3.7 and newer have a way to get around this, but
we're still sort of supporting Python 3.6. Move them under one path to
speed up test runs.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This one is considerably more complicated than the linked list one, but
it should catch lots of kinds of red-black tree mistakes.
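To give a feel for the kinds of mistakes involved, here is a
self-contained sketch of red-black tree validation on plain Python
objects. It is not the drgn helper: the Node class, the function name,
and the exact set of checks (parent pointers, red-red violations, black
height, key ordering) are assumptions made for illustration.

```python
class ValidationError(Exception):
    pass


class Node:
    def __init__(self, key, red, left=None, right=None):
        self.key, self.red = key, red
        self.left, self.right = left, right
        self.parent = None
        for child in (left, right):
            if child is not None:
                child.parent = self


def validate_toy_rbtree(root):
    """Return the black height of the tree or raise ValidationError."""
    if root is None:
        return 0
    if root.red:
        raise ValidationError("root is red")

    def visit(node, lo, hi):
        if node is None:
            return 1  # Leaves (None) count as black.
        if (lo is not None and node.key < lo) or (
            hi is not None and node.key > hi
        ):
            raise ValidationError(f"key {node.key} is out of order")
        heights = []
        for child, child_lo, child_hi in (
            (node.left, lo, node.key),
            (node.right, node.key, hi),
        ):
            if child is not None:
                if child.parent is not node:
                    raise ValidationError(f"bad parent at key {child.key}")
                if node.red and child.red:
                    raise ValidationError(f"red-red at key {child.key}")
            heights.append(visit(child, child_lo, child_hi))
        if heights[0] != heights[1]:
            raise ValidationError(f"black height differs at key {node.key}")
        return heights[0] + (0 if node.red else 1)

    return visit(root, None, None)


good = Node(2, False, Node(1, True), Node(3, True))
print(validate_toy_rbtree(good))
```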
Signed-off-by: Omar Sandoval <osandov@osandov.com>
At LSF/MM+BPF 2022, Ted Ts'o pitched me the idea of using drgn to
validate the consistency of kernel data structures. I really liked this
idea, especially for big, complicated data structures. But first, let's
start small: document the concept of a "validator", which is just a
special kind of helper, and add some basic validator versions of linked
list helpers.
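As a sketch of the idea (on plain Python objects rather than kernel
memory, with hypothetical names), a validator for a circular
doubly-linked list walks it like the normal iteration helper would, but
also checks that every prev pointer agrees with the next pointer that
led to it:

```python
class ValidationError(Exception):
    pass


class ListNode:
    def __init__(self):
        # Like an initialized struct list_head: points at itself.
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    last = head.prev
    new.prev, new.next = last, head
    last.next = new
    head.prev = new


def validate_list(head):
    """Return the number of entries or raise ValidationError."""
    seen = {id(head)}
    prev, node, count = head, head.next, 0
    while node is not head:
        if id(node) in seen:
            raise ValidationError("cycle that does not return to the head")
        if node.prev is not prev:
            raise ValidationError(f"entry {count}: prev does not point back")
        seen.add(id(node))
        prev, node = node, node.next
        count += 1
    if head.prev is not prev:
        raise ValidationError("head.prev does not point at the last entry")
    return count


head = ListNode()
entries = [ListNode() for _ in range(3)]
for entry in entries:
    list_add_tail(entry, head)
print(validate_list(head))
```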
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current test case for slab_cache_for_each_allocated_object() is
barely more than a smoke test. It missed the bug fixed by the previous
commit. Now that we have the test kernel module, we can do a lot better
by comparing against the exact list of objects that are allocated.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
SLUB and SLAB both have a mechanism where they'll remove some objects
from the per-slab freelist to keep them in a per-CPU cache. SLUB's is a
per-CPU freelist just like the per-slab one, and SLAB's is a per-CPU
array of free entries. We're not checking those lists, which means that
we're returning free objects as allocated. Fix it by checking against
the per-CPU lists in addition to the per-slab freelist.
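In other words, an object is allocated only if it is on neither kind of
free list. A minimal sketch of that set logic, with object addresses
modeled as plain integers and hypothetical names:

```python
def allocated_objects(slab_objects, slab_freelist, percpu_free_lists):
    # Union of the per-slab freelist and every per-CPU list of free
    # objects (a freelist for SLUB, an array for SLAB).
    free = set(slab_freelist)
    for cpu_list in percpu_free_lists:
        free.update(cpu_list)
    return [obj for obj in slab_objects if obj not in free]


objects = [0x1000, 0x1040, 0x1080, 0x10C0]
# 0x1040 is on the per-slab freelist; 0x10c0 is cached by CPU 1.
print([hex(o) for o in allocated_objects(objects, [0x1040], [[], [0x10C0]])])
```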
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Just like we did for lists, use a test data structure instead of the vma
tree for a process. This also allows us to test RB_EMPTY_NODE, which we
couldn't test before.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than picking a couple of lists which we hope won't change, use
the newly added test kernel module to define a few lists and test
against those. This also gives us proper tests for list_is_singular().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that commit d999703f94 ("vmtest: add kernel module build
dependencies to kernel packages") added the files necessary to build a
test kernel module, add the module (currently a stub) and the
scaffolding necessary to build and load it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have several helpers that are difficult to test because there are no
userspace APIs to test them against. Radix trees, slab objects, and
hlists are a few examples. For lists and rbtrees, luckily we can find
instances of those data structures exposed to userspace in some way, but
this is somewhat brittle.
We'd like a way to be able to unit test directly against kernel code
that sets things up in a way that is easy to test against. The easiest
way to do this is with a custom kernel module.
There are two options for how to enable this:
1. Build the custom kernel module as part of the vmtest kernel build and
package it with the vmtest kernel package.
2. Include the artifacts needed for kernel module builds in the vmtest
kernel package, then build the kernel module when running tests.
The latter makes the kernel packages significantly larger (on a build of
v5.18-rc6, 46M -> 53M compressed, 173M -> 217M decompressed), but it has
the huge advantage that it does not require a vmtest kernel rebuild to
add or modify tests. In order to optimize for making it easy to add new
helpers with test cases, this is the approach I chose.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now `python3 -m vmtest.manage -K --no-build` can be used to get the list
of latest kernel releases that need to be built. Also rename --dry-run
to --no-upload since --no-build is also a dry run in a way.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Slab cache merging can cause major confusion when debugging using slab
cache helpers. Add a helper to detect whether a slab cache is merged and
an explanation of the implications.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
These are ported from https://github.com/josefbacik/debug-scripts.
Signed-off-by: alexlzhu <alexlzhu@fb.com>
[Omar: make test case check for exception on SLOB]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to 2<<24
bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).
Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.
However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.
Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This will be used for partial 128-bit object support. There are other
places that should probably be converted to use it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The tiny flavor was failing due to a bug fixed in Linux kernel commit
c12cd77cb028 ("mm/vmalloc: fix spinning drain_vmap_work after reading
from /proc/vmcore") (in v5.18-rc3). Now that the fix is in, we can add
5.18 with no changes required.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The 32-bit and 64-bit variants have different register sizes, so they're
different architectures in drgn. For now, put them in the same file so
that they can share the relocation implementation. We'll need to figure
out how to handle registers later.
P.S. RISC-V has the weirdest relocations so far. /proc/kcore also
appears to be broken.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The only relocation type I saw in Debian's kernel module debug info was
R_ARM_ABS32. R_ARM_REL32 is easy. The Linux kernel supports a bunch of
other ones that don't seem relevant to debug info.
Unfortunately, I wasn't able to test this because /proc/kcore doesn't
exist on Arm. This apparently goes all the way back to 2003:
https://lwn.net/Articles/45315/.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The only relocation types I saw in Debian's kernel module debug info
were R_AARCH64_ABS64 and R_AARCH64_ABS32. R_AARCH64_ABS16,
R_AARCH64_PREL64, R_AARCH64_PREL32, and R_AARCH64_PREL16 are all easy.
The remaining types supported by the Linux kernel are for movw and
immediate instructions, which aren't relevant to debug info.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The only relocation type I saw in Debian's kernel module debug info was
R_386_32. R_386_PC32 is easy. The Linux kernel also supports
R_386_PLT32, but that's the same story as R_X86_64_PLT32 in x86-64, so
we don't implement it for now.
I was torn between naming it i386, x86, or IA-32. With x86, it isn't
immediately clear whether x86-64 is included. No one other than Intel
calls it IA-32. i386 might incorrectly imply that it is strictly the
original i386 instruction set with no later extensions, but the more
general meaning is used frequently in the Linux world (e.g., Debian and
QEMU both call it i386), so I went with that in the end.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
One of the biggest things we depend on libdwfl for is applying
relocations on architectures other than x86-64. I'm exploring the
possibility of removing the libdwfl dependency, so I'm going to add
relocation implementations for more architectures, starting with ppc64.
R_PPC64_ADDR32 and R_PPC64_ADDR64 were the only ones I saw in Debian's
kernel module debug info. R_PPC64_REL32 and R_PPC64_REL64 are
straightforward. The Linux kernel also implements R_PPC64_TOC*, which
don't seem relevant to debugging information, and R_PPC64_REL24 and
R_PPC64_REL16*, which I'd prefer to have a real example of.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Implement R_X86_64_32S and R_X86_64_PC64. I haven't seen these for debug
info in the wild, but they're supported by the Linux kernel and they're
easy to support. The only other type of relocation currently supported
by the kernel is R_X86_64_PLT32, which is trickier. For kernel modules,
it's equivalent to R_X86_64_PC32 (see Linux kernel commit b21ebf2fb4cd
("x86: Treat R_X86_64_PLT32 as R_X86_64_PC32")), but that doesn't seem to
be true in general. It doesn't seem applicable to debug info sections,
so hopefully we don't need to worry about it.
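For reference, the x86-64 psABI defines R_X86_64_32S as S + A truncated
to 32 bits, where the truncated value must sign-extend back to the full
64-bit result, and R_X86_64_PC64 as S + A - P. The sketch below computes
the bytes to store for each. The function names are illustrative, and
whether the overflow check belongs in a debug info reader is an
assumption here, not something this change states.

```python
MASK64 = (1 << 64) - 1


def reloc_x86_64_32s(S: int, A: int) -> bytes:
    # S: symbol value, A: addend.
    value = (S + A) & MASK64
    low = value & 0xFFFFFFFF
    # Sign-extend the low 32 bits and compare against the full value.
    sign_extended = low - (1 << 32) if low & 0x80000000 else low
    if sign_extended & MASK64 != value:
        raise OverflowError("value does not fit in a signed 32-bit field")
    return low.to_bytes(4, "little")


def reloc_x86_64_pc64(S: int, A: int, P: int) -> bytes:
    # P: address of the location being relocated.
    return ((S + A - P) & MASK64).to_bytes(8, "little")


# Made-up kernel-style address: fits because the upper bits are all ones.
print(reloc_x86_64_32s(0xFFFFFFFF81000000, 0x10).hex())
print(reloc_x86_64_pc64(0x2000, -8, 0x1000).hex())
```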
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In preparation for supporting ELF relocations for more architectures,
generalize ELF relocations to handle SHT_REL sections/ElfN_Rel.
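The difference between the two section types is where the addend lives:
an ElfN_Rela entry carries an explicit addend, while for ElfN_Rel the
addend is whatever is already stored at the location being relocated.
The sketch below shows that for a 32-bit absolute relocation (S + A) on
a little-endian target; the function name and signature are
hypothetical.

```python
import struct


def apply_abs32(section: bytearray, offset: int, sym_value: int,
                addend=None) -> None:
    if addend is None:
        # SHT_REL: the implicit addend is read from the section contents.
        (addend,) = struct.unpack_from("<I", section, offset)
    # SHT_RELA: the explicit addend is used and the old contents ignored.
    value = (sym_value + addend) & 0xFFFFFFFF
    struct.pack_into("<I", section, offset, value)


rel_section = bytearray(struct.pack("<I", 4))
apply_abs32(rel_section, 0, 0x1000)  # ElfN_Rel style
print(rel_section.hex())

rela_section = bytearray(struct.pack("<I", 4))
apply_abs32(rela_section, 0, 0x1000, addend=8)  # ElfN_Rela style
print(rela_section.hex())
```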
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Check that size_t makes sense and make sure int_key_hash_pair() doesn't
get an integer type larger than it supports. I can't imagine either of
these failing in practice, but make our assumptions explicit.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We call hash_combine() with a uint64_t in
drgn_debug_info_module_key_hash_pair() and drgn_type_dedupe_hash_pair().
On 32-bit systems, this only uses the least-significant 32 bits. Use
hash_64_to_32() on 32-bit and hash_128_to_64() on 64-bit to ensure that
we use all bits if we're given a type larger than size_t, and sanity
check that we're not given anything larger than we support.
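To see why truncation is a problem, compare it with folding the upper
half into the lower half. The fold below is a deliberately simple XOR
stand-in chosen for illustration; it is not the mixing function that
hash_64_to_32() actually uses.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def truncate_to_32(key: int) -> int:
    # What passing a uint64_t through a 32-bit size_t effectively does.
    return key & MASK32


def fold_64_to_32(key: int) -> int:
    # Hypothetical stand-in: XOR the upper 32 bits into the lower 32 bits
    # so that every input bit influences the result.
    key &= MASK64
    return (key ^ (key >> 32)) & MASK32


# Two keys that differ only in their upper 32 bits.
a = 0x0000000100000042
b = 0x0000000200000042
print(truncate_to_32(a) == truncate_to_32(b))
print(fold_64_to_32(a) == fold_64_to_32(b))
```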
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If gh.download() in _download_kernel() raises an exception (e.g.,
because of an HTTP 403 error due to rate-limiting), then we will try to
exit the subprocess context managers for zstd and tar.
subprocess.Popen.__exit__() closes stdin if it is a pipe and then waits
for the subprocess. The context managers are exited in reverse order, so
first we'll try to wait for tar. But that will never exit because its
stdin (piped from zstd) is still open. Fix it by always explicitly
closing zstd's stdin once the tar process is running.
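The pattern can be sketched as follows. To keep the example
self-contained, both children are Python one-liners that copy stdin to
stdout; they stand in for zstd and tar. The produce callback stands in
for the download, and all names here are hypothetical.

```python
import subprocess
import sys

# Stand-in for zstd/tar: copy stdin to stdout until EOF.
COPY = [
    sys.executable,
    "-c",
    "import shutil, sys; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def run_pipeline(produce):
    with subprocess.Popen(
        COPY, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    ) as first, subprocess.Popen(
        COPY, stdin=first.stdout, stdout=subprocess.DEVNULL
    ) as second:
        # The second process has its own copy of the pipe now.
        first.stdout.close()
        try:
            produce(first.stdin)
        finally:
            # Without this, an exception from produce() would make
            # second.__exit__() wait forever: the second process only
            # sees EOF once the first one exits, and the first one only
            # exits once its stdin is closed.
            first.stdin.close()
    return first.returncode, second.returncode


def failing_download(pipe):
    raise RuntimeError("HTTP 403")


try:
    run_pipeline(failing_download)
except RuntimeError as e:
    print("pipeline exited cleanly after:", e)
```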
Signed-off-by: Omar Sandoval <osandov@osandov.com>
GitHub release assets don't allow "~" in filenames; they are changed
into ".". This means that we can't take advantage of the fact that "~"
sorts before anything else in version sort order, so add a sub-version
to the localversion which is 1 for the default flavor and 0 for
everything else.
Signed-off-by: Omar Sandoval <osandov@osandov.com>