The helpers currently only work on Linux v4.15 and newer, but it's easy
to make them work on older kernels. We can also easily handle kernels
without cgroup BPF programs (either before Linux v4.10 or without
CONFIG_CGROUP_BPF) and yield nothing.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This is the same idea as commit 4da28ba0a1 ("helpers: only lookup type
once for for_each_entry helpers").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
These currently only work on Linux v5.13 and newer, and it's not worth
the effort to support older versions. Let's at least document it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We currently don't have any tests for the BPF helpers or the
bpf_inspect.py tool. As a result, the latter is broken on newer kernel
versions. Before we can add tests, we need the vmtest kernel to support
BPF.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
A path is the most convenient way to find a cgroup if we don't already
have a pointer to it from another structure.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I originally thought this would be too difficult, but it's fairly
straightforward to parse /proc/mounts and allows us to avoid some setup
and cleanup.
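
As a rough illustration of how little parsing is involved, here is a
minimal sketch in Python. The names Mount, parse_mounts(), and
find_mount() are hypothetical and are not taken from the drgn tree; the
sketch only assumes the documented /proc/mounts layout (whitespace
separated fields with octal escapes).

```python
import re
from typing import Iterator, NamedTuple, Optional


class Mount(NamedTuple):
    source: str
    target: str
    fstype: str
    options: str


def _unescape(field: str) -> str:
    # Special characters such as spaces are escaped as 3-digit octal
    # sequences (e.g. "\040").
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_mounts(text: str) -> Iterator[Mount]:
    # Each line is: source target fstype options dump pass
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # Ignore malformed or empty lines.
        yield Mount(*(_unescape(f) for f in fields[:4]))


def find_mount(text: str, fstype: str) -> Optional[str]:
    # Return the mount point of the first mount with the given type.
    for mount in parse_mounts(text):
        if mount.fstype == fstype:
            return mount.target
    return None
```

A caller would pass the contents of /proc/mounts and reuse the returned
mount point instead of mounting (and later unmounting) the filesystem
itself.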
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This looks up a kernfs node from a path. It will be used to look up
cgroups by path. This is based on kernfs_walk_ns() from the Linux
kernel, but it doesn't handle namespaced kernfs nodes yet.
kernfs_walk_ns() in the kernel is actually built on another function,
kernfs_find_ns(), but I don't think the latter is very useful as a
helper.
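
The walk itself can be sketched with a toy tree. This is only a model:
Node and walk() are hypothetical names, and a plain dict stands in for
the kernel's real child lookup. Like the helper described above, it
ignores namespaces.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    children: Dict[str, "Node"] = field(default_factory=dict)


def walk(root: Node, path: str) -> Optional[Node]:
    # Descend one path component at a time, starting from the root.
    node = root
    for component in path.split("/"):
        if not component:
            continue  # Skip empty components from leading or repeated slashes.
        child = node.children.get(component)
        if child is None:
            return None  # No such entry.
        node = child
    return node
```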
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.
To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.
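
The idea can be sketched in Python with a toy DIE tree. Die,
containing_namespaces(), and qualified_name() are hypothetical names
used only for illustration; the real code works on DWARF DIEs in C, and
this sketch only considers DW_TAG_namespace ancestors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Die:
    tag: str
    name: Optional[str] = None
    parent: Optional["Die"] = None


def containing_namespaces(die: Die) -> List[str]:
    # Walk up the ancestors, collecting namespace names innermost first.
    names = []
    ancestor = die.parent
    while ancestor is not None:
        if ancestor.tag == "DW_TAG_namespace" and ancestor.name is not None:
            names.append(ancestor.name)
        ancestor = ancestor.parent
    names.reverse()  # Outermost namespace first.
    return names


def qualified_name(die: Die, has_namespaces: bool) -> str:
    # Languages without namespaces (e.g. C) skip the ancestor walk.
    if not has_namespaces:
        return die.name
    return "::".join(containing_namespaces(die) + [die.name])
```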
Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
Currently, DIE references are specified as an index into the list of the
unit DIE's children. This has a few issues:
* It's hard to figure out what references what at a glance.
* Changes to tests sometimes need to renumber these indices.
* DIEs at lower levels in the tree cannot be referenced.
Replace it with explicit "labels" which are referred to by name.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we made the other memory management helpers generic, the last
thing to implement for AArch64 is page table walking. This looks a lot
like the x86-64 equivalent but has to support the various page and
virtual address sizes that can be configured for AArch64.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This is always 0 on x86-64, but on AArch64, the start of physical memory
can be at a much higher address.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
AArch64 has changed the location of vmemmap multiple times, and not all
of these can be easily distinguished. Rather than resorting to kernel
version checks, this replaces the vmemmap architecture callback with a
generic approach that gets the vmemmap address directly from the
mem_section table.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
On x86-64, the difference between virtual addresses in the direct map
and the corresponding physical addresses is called PAGE_OFFSET, so we
exposed that via an architecture callback and the Linux kernel object
finder. However, this doesn't translate to other architectures. Namely,
on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both
PAGE_OFFSET and PHYS_OFFSET have varied over time and between
configurations.
We can remove the architecture callback and avoid version-specific logic
by letting the page table tell us the offset. We just need an address in
the direct map, which is easy to find since this includes kmalloc and
memblock allocations.
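
The arithmetic is simple enough to sketch in Python. Here translate
stands in for a page table walk, and the function name and the example
constants are illustrative assumptions rather than values taken from the
code.

```python
from typing import Callable


def direct_map_offset(
    translate: Callable[[int], int], direct_map_address: int
) -> int:
    # The offset is the difference between a virtual address in the
    # direct map and the physical address that it translates to.
    return direct_map_address - translate(direct_map_address)


# x86-64-like layout: the offset is simply PAGE_OFFSET.
X86_PAGE_OFFSET = 0xFFFF888000000000
x86_offset = direct_map_offset(
    lambda virt: virt - X86_PAGE_OFFSET, X86_PAGE_OFFSET + 0x1000
)

# AArch64-like layout: the offset is PAGE_OFFSET - PHYS_OFFSET.
ARM_PAGE_OFFSET = 0xFFFF000000000000
ARM_PHYS_OFFSET = 0x40000000
arm_offset = direct_map_offset(
    lambda virt: virt - ARM_PAGE_OFFSET + ARM_PHYS_OFFSET,
    ARM_PAGE_OFFSET + 0x1000,
)
```

Once the offset is known, converting any other direct map address is a
single subtraction, with no per-architecture constants involved.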
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add helpers for converting physical addresses to and from virtual
addresses, PFNs, and struct pages.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We currently test the functions to convert between virtual addresses,
PFNs, and struct pages with an mmap'd region and /proc/self/pagemap. Use
the test kernel module to test them more directly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
linux_helper_read_vm() has logic to merge adjacent physical address
ranges returned by the page table iterator. However, the check for
whether the ranges are adjacent is incorrect. Fix it.
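
For reference, the intended check can be sketched in Python: two ranges
are adjacent when the previous start plus the previous size equals the
next start. merge_adjacent() is a hypothetical name, and this does not
reproduce the original bug, which is not described here.

```python
from typing import Iterable, List, Tuple


def merge_adjacent(ranges: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    # Each range is (physical start address, size in bytes), in the
    # order returned by the iterator.
    merged: List[Tuple[int, int]] = []
    for start, size in ranges:
        if merged and merged[-1][0] + merged[-1][1] == start:
            # The new range begins exactly where the previous one ends.
            prev_start, prev_size = merged[-1]
            merged[-1] = (prev_start, prev_size + size)
        else:
            merged.append((start, size))
    return merged
```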
Signed-off-by: Omar Sandoval <osandov@osandov.com>
pgtable_iterator_x86_64::table is only used if
pgtable_iterator_x86_64::index indicates that it has any cached entries,
so there's no point initializing table since we initialize index to
indicate that nothing is cached.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
AArch64 will need different sizes of page table iterators depending on
the page size and virtual address size. Rather than the static
pgtable_iterator_arch_size, allow architectures to define callbacks for
allocating and freeing a page table iterator. Also remove the generic
page table iterator wrapper and just pass that information to the
iterator function.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently only testing whether we can translate user addresses.
Test a kernel address with the kernel page table, too.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than computing it every time we need it, compute it once when we
parse PAGE_SIZE from VMCOREINFO (and validate that PAGE_SIZE is a power
of two). This will be more important for AArch64 page table walking.
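
As a sketch, assuming the derived value is the page shift (log2 of
PAGE_SIZE), which this message does not name explicitly, the validation
and computation look like this in Python. The function name is
hypothetical.

```python
def page_shift_from_page_size(page_size: int) -> int:
    # A power of two has exactly one bit set, so clearing the lowest
    # set bit must leave zero.
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError(f"PAGE_SIZE {page_size} is not a power of two")
    # For a power of two, the shift is the index of the single set bit.
    return page_size.bit_length() - 1
```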
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The test kmod build has the following warning that I somehow didn't
notice before:
  WARNING: modpost: /home/osandov/repos/drgn-main/tests/linux_kernel/kmod/drgn_test.o(.init.text+0x3ac): Section mismatch in reference from the function init_module() to the function .exit.text:drgn_test_exit()
  The function __init init_module() references
  a function __exit drgn_test_exit().
  This is often seen when error handling in the init function
  uses functionality in the exit path.
  The fix is often to remove the __exit annotation of
  drgn_test_exit() so it may be used outside an exit section.
Remove the __exit annotation as suggested.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For local testing with vmtest, we just want an extracted kernel package,
so save the trouble of compressing the package only to extract it and
allow vmtest.kbuild to output the directory directly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Very similar to a541e9b170, but adds partial support for floats (as
opposed to integers) that aren't 32 or 64 bits wide.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
Issue #182 reported that a core dump created by QEMU's dump-guest-memory
command confuses drgn: by default, it only has NT_PRSTATUS notes and
QEMU state notes for each CPU, so drgn thinks it's a userspace core
dump, and it doesn't have the necessary VMCOREINFO to use it as a Linux
kernel core dump.
It turns out that QEMU and Linux can cooperate to add a VMCOREINFO note
to the guest memory dump, which suffices for drgn. Let's detect a QEMU
guest memory dump without a VMCOREINFO note and include instructions on
how to capture a QEMU dump that makes drgn happy.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Our vmtest kernels have CONFIG_KALLSYMS_ALL, but distro kernels may not,
in which case variable symbols are not added to /proc/kallsyms. Then,
the Linux kernel debug info tests can't find our test symbol and fail.
Define a global function symbol and use it for the debug info tests
instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Several of the mm tests currently fail on architectures for which we
haven't implemented virtual address translation and related
functionality (i.e., anything other than x86-64). Only run those tests
on x86-64 for now.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some architectures, including AArch64, don't have the pause() syscall.
glibc implements pause(3) with ppoll() on those architectures. Our stack
trace tests check for "pause" in the stack trace, so they fail on
AArch64. Update the tests to check for both "pause" and "poll".
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we track RA_SIGN_STATE and get the pointer authentication code
mask, we can remove the pointer authentication code from the return
address while unwinding. Add a new architecture callback,
->demangle_return_address(), for this purpose.
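
The bit manipulation can be sketched in Python. This models the
architectural behavior of stripping a PAC, where the PAC bits are
replaced with copies of bit 55; strip_pac() is a hypothetical name, the
masks below are example values, and the actual callback may be simpler.

```python
def strip_pac(address: int, pac_mask: int) -> int:
    # Bit 55 selects the upper (kernel) or lower (user) address range.
    if address & (1 << 55):
        # Upper range: the PAC bits must all be ones.
        return address | pac_mask
    # Lower range: the PAC bits must all be zeros.
    return address & ~pac_mask


# Example masks, assuming a 48-bit virtual address size.
USER_PAC_MASK = 0x007F000000000000
KERNEL_PAC_MASK = 0xFFFF000000000000

user_ra = strip_pac(0x0035AAAABBBBCCCC, USER_PAC_MASK)
kernel_ra = strip_pac(0xA5A5800010001234, KERNEL_PAC_MASK)
```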
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In order to support removing the pointer authentication code (PAC) from
return addresses on AArch64, we need to know what bits are being used
for the PAC. We can get this from the NT_ARM_PAC_MASK note in userspace
core dumps and from the NUMBER(KERNELPACMASK) field in VMCOREINFO for
Linux kernel core dumps.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We currently have 5 names that we match against, and there are more on
the way, so we might as well use a memswitch.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In an upcoming commit, we will parse the AArch64 pointer authentication
code mask either from the VMCOREINFO note or the NT_ARM_PAC_MASK note.
Since it doesn't always come from VMCOREINFO, it doesn't make sense to
put it in struct vmcoreinfo; struct drgn_program makes more sense. So,
make parse_vmcoreinfo() take struct drgn_program instead of struct
vmcoreinfo, rename it to drgn_program_parse_vmcoreinfo(), and replace
struct vmcoreinfo with an anonymous struct in struct drgn_program.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The RA_SIGN_STATE pseudo-register indicates whether the return address
is signed with a pointer authentication code. Add it to the register
definitions. It can be set through a normal CFI register rule or the
vendor-specific DW_CFA_AARCH64_negate_ra_state rule.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This will be used to implement DW_CFA_AARCH64_negate_ra_state. Also fix
a typographical error in a nearby comment.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add the basic register definitions and stack unwinding support
functions. Pointer authentication support will be added in subsequent
commits.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Reading the ABI specification, I realized that fallback_unwind_ppc64()
is completely wrong. Fix it.
Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The usage of the link register in DWARF is a little confusing. On entry
to a function, the link register contains the address that should be
returned to. However, for DWARF, the link register is usually used as
the CFI return_address_register, which means that in an unwound frame,
it will contain the same thing as the program counter. I initially
thought that this was a mistake, believing that the link register should
contain the _next_ return address. However, after a return (with the blr
instruction), the link register will indeed contain the same address as
the program counter. This is consistent with our documentation of
register values for function call frames: "the register values are the
values when control returns to this frame".
So, rename our internal "ra" register to "lr", expose it to the API, and
add a little more documentation to the ppc64 initial register code.
Fixes: 221a218704 ("libdrgn: add powerpc stack trace support")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In addition to the general-purpose registers, struct pt_regs also
provides the cs and ss segment registers and the rflags register.
elf_gregset_t provides the other segment registers as well. We should
expose all of those.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Currently, register definitions are split across two files:
arch_foo.defs lists the names of registers, and arch_foo.c defines the
layout used to store registers in memory. The main rationale for this
was that the layout could be processed entirely by the C preprocessor,
but the register names needed an AWK script that we wanted to keep
minimal. But since commit af6f5a887d ("libdrgn: replace gen_arch.awk
with gen_arch_inc_strswitch.py"), arch_foo.defs is processed by a Python
script.
Let's define both the register names and the register layout in a new
file, arch_foo_defs.py, which is processed by gen_arch_inc_strswitch.py.
This has a few benefits:
* It puts all of the register definitions for an architecture in one
  place.
* It is easier to maintain than preprocessor magic. (It also makes it
  trivial to support registers that don't exist in DWARF, which would've
  been harder to do with our preprocessor code.)
* It gets rid of our DSL in favor of Python (which also lets us reduce
  repetition for the ppc64 definitions).
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This hasn't been used since commit eec67768aa ("libdrgn: replace
elfutils DWARF unwinder with our own").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn_stack_frame_register() gets the register value with copy_lsbytes()
and then byte swaps it if the program's byte order is different from the
host's. But, copy_lsbytes() already fixes the byte order, so this ends
up with the original (wrong) byte order. We also don't need to zero out
the integer that we copy into since copy_lsbytes() also does that.
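
The effect of the extra swap can be modeled in Python. The two functions
below are hypothetical stand-ins that model only the end result of
copy_lsbytes() and of a 64-bit byte swap, not the C implementations.

```python
def copy_lsbytes_to_int(src: bytes, src_little_endian: bool) -> int:
    # Model of the end result: the value is already converted to the
    # host's representation, whatever the source byte order was.
    return int.from_bytes(src, "little" if src_little_endian else "big")


def bswap64(value: int) -> int:
    # Reverse the byte order of a 64-bit value.
    return int.from_bytes(value.to_bytes(8, "little"), "big")


# A register saved by a big-endian program.
register_bytes = (0x1122334455667788).to_bytes(8, "big")

# Already correct after the copy.
correct = copy_lsbytes_to_int(register_bytes, src_little_endian=False)

# The old code swapped again, undoing the conversion.
double_swapped = bswap64(correct)
```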
Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When running a Python script with the drgn CLI, an import of
a module in the current directory does not find that module.
This is surprising for regular Python users.
According to the Python documentation, sys.path is modified when
a script is passed on the command line:
"If the script name refers directly to a Python file, the
directory containing that file is added to the start of
sys.path, and the file is executed as the __main__ module."
However, it does not set the path if the passed-in script is a
zipfile. Use pkgutil.get_importer() to check if this is the case
and only add the path if it returns None.
Add the same operation in drgn so that importing modules from
the same directory as the script works as expected.
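
A minimal sketch of the check follows. script_sys_path_entry() is a
hypothetical helper written for illustration; only
pkgutil.get_importer() and the os.path functions are real APIs.

```python
import os
import pkgutil
from typing import Optional


def script_sys_path_entry(script: str) -> Optional[str]:
    # get_importer() returns None for a plain Python file. It returns
    # an importer for a zipfile (or a directory), in which case the
    # containing directory should not be added.
    if pkgutil.get_importer(script) is None:
        return os.path.dirname(os.path.abspath(script))
    return None
```

The caller would insert a non-None result at the start of sys.path
before executing the script.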
Link: https://docs.python.org/3/using/cmdline.html#using-on-interface-options
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
This needed the previous small update to
tests.linux_kernel.test_debug_info.TestModuleDebugInfo.
tests.linux_kernel.helpers.test_tc.TestTc will also only work with
pyroute2 >= 0.6.10 (see svinota/pyroute2#899). No changes needed to drgn
itself.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Linux kernel commit a0e286b6a5b6 ("loop: remove lo_refcount and avoid
lo_mutex in ->open / ->release") (in v5.19-rc1) removed the lo_open
symbol that we use for
tests.linux_kernel.test_debug_info.TestModuleDebugInfo. Replace it with
a symbol from the test kernel module so we don't need to worry about it
going away.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Linux kernel commit 31cb50b5590f ("kbuild: check static EXPORT_SYMBOL*
by script instead of modpost") (in v5.19-rc1) added this script to the
build process, and the latest vmtest kernel build failed without it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Currently, we identify explicitly-reported kernel modules by the module
name that we get from the .modinfo or the .gnu.linkonce.this_module
section. However, objcopy --only-keep-debug (used for some Linux distros'
separate debug files) does not keep these sections. This means that
passing a file processed by objcopy --only-keep-debug to, e.g., drgn -s,
fails with "could not find kernel module name".
Instead of using the module name as the identifier, let's use the
module's GNU build ID. We can get it on a live system from
/sys/module/<module>/notes/, and on a core dump from struct
module::notes_attrs (which is the implementation of that sysfs
directory).
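
For illustration, extracting the build ID from the raw contents of a
note section can be sketched in Python as follows. parse_gnu_build_id()
is a hypothetical name, and the sketch assumes the standard ELF note
layout with 4-byte alignment.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3


def _align4(n: int) -> int:
    return (n + 3) & ~3


def parse_gnu_build_id(notes: bytes, little_endian: bool = True) -> Optional[bytes]:
    # Each note is a header (namesz, descsz, type) followed by the name
    # and the descriptor, each padded to a multiple of 4 bytes.
    fmt = "<III" if little_endian else ">III"
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, note_type = struct.unpack_from(fmt, notes, offset)
        offset += 12
        name = notes[offset : offset + namesz]
        offset += _align4(namesz)
        desc = notes[offset : offset + descsz]
        offset += _align4(descsz)
        if (
            name == b"GNU\0"
            and note_type == NT_GNU_BUILD_ID
            and len(desc) == descsz
        ):
            return desc
    return None
```

On a live system, the input would be the bytes of a file such as
/sys/module/<module>/notes/.note.gnu.build-id.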
This was split out of my larger debug info discovery rework, which will
make more use of the build ID.
Closes #178.
Signed-off-by: Omar Sandoval <osandov@osandov.com>