PyUnicode_FSConverter() already handles os.PathLike, so we only need to
handle None and save the string and length.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
It feels icky to write code that, for example, passes a uint64_t to
PyLong_FromUnsignedLongLong(). In practice it's fine, but it's much
nicer to have conversion functions specifically for the stdint.h types.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This DWARF tag is used by C++ classes which take a variable number
of template parameters, such as std::variant and std::tuple.
Signed-off-by: Alastair Robertson <ajor@meta.com>
While reviewing #214, I realized that we have very little documentation
for drgn_architecture_info (and platform internals in general). Let's
document all of the important stuff, and in particular how to add
support for new architectures.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
While documenting struct drgn_architecture_info, I realized that
demangle_return_address() is difficult to explain. It's more
straightforward to define this functionality as demangling any registers
that are mangled when using CFI rather than just the return address
register.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
c_parse_specifier_qualifier_list() checks whether an identifier starts
with "size_t" or "ptrdiff_t" to decide whether to return the size_t or
ptrdiff_t type. This incorrectly matches stuff like like "size_tea" and
"ptrdiff_tee". Fix this by making it an exact comparison.
Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).
Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For upcoming changes, we will need loaded (SHF_ALLOC) sections for
modules. Some separate debug files (e.g., those created with objcopy
--only-keep-debug) don't have those sections. Let's get the loaded file
from libdwfl with dwfl_module_getelf() and save it in a struct
drgn_elf_file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we track the debug file ourselves, we can avoid calling libdwfl
in a bunch of places. By tracking the bias ourselves, we can avoid a
bunch more.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
It's confusing that we have a platform both for the program and for each
module. They usually match, but they're not required to. For example,
the user can manually add a file with a different platform just to read
its debug info. Our rule is that if we're parsing anything from the
module, we use the module platform; and otherwise, use the program
platform. There are a couple of places where the platforms must match:
when using call frame information (CFI) or registers. Let's make all of
this more clear in the code (by using the module's platform even when it
must match the program's platform) and in comments. No functional
change.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If a DW_TAG_pointer_type DIE doesn't specify its size with
DW_AT_byte_size, we currently default to the program's address size.
However, the DWARF we're parsing could be for a platform with a
different address size. It's more correct to use the CU's address size.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have some generic helpers that we'd like to add (for example, #210)
that need to know the stack pointer of a frame. These shouldn't need to
hard-code register names for different architectures. Add a generic
shortcut, StackFrame.sp.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
qsort_r is a non-standard glibc extension and turns out to be the only
thing that prevents drgn from working on a musl system. "Fix" the use of
qsort_r by switching it to qsort with a thread local variable for the
parameter.
Tested in a clean chroot install of musl voidlinux.
Signed-off-by: Boris Burkov <boris@bur.io>
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
find_dwarf_enumerator() needs to check that the return value of
dwarf_diename() is not NULL before calling strcmp(). This is similar to
commit 330c71b5b5 ("libdrgn: dwarf_info: fix segfault on anonymous
DIEs during scope search"), although I haven't seen this one happen in
practice.
Fixes: bc85767e5f ("libdrgn: support looking up parameters and variables in stack traces")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I wanted to make REUSE pass, but I'm not sure what to do about trivial
files. REUSE suggests using CC0, but Fedora no longer allows CC0. I'll
punt that until later. For now, let's add notices to some code files.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.
All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rename DW_TAG_{UNKNOWN_FORMAT,BUF_LEN} to
DW_TAG_STR_{UNKNOWN_FORMAT,BUF_LEN} to make it more clear that they're
for dw_tag_str.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Our container_of() and array_size() were copied from the Linux kernel
and use some really ugly BUILD_BUG_ON_ZERO() and BUILD_BUG_ON_MSG()
macros. C11 has _Static_assert, which is much nicer. We just have to
shoehorn it into an expression, which we do with clever use of _Generic
and sizeof a struct type definition. (We could accomplish the same idea
with a comma expression, but GCC warns when the left-hand operand of a
comma expression has no effect. We could also do it with a compound
statement, but it's cooler to do it with standard C11.)
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.
Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're eventually going to add a drgn.Module class, which logically
should go in a file called module.c. But, we already have a module.c
with module-level definitions. Rename that file to main.c to free up
module.c
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are a bunch of places where we call .tp_alloc() directly, which is
very verbose. Add a macro that removes the boilerplate.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
-Wimplicit-fallthrough has a false positive because the compiler
apparently doesn't know that usage() never returns.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of string_builder_finalize(), which leaves the string_builder in
an undefined state (according to the documentation, at least), define
string_builder_null_terminate(), which documents exactly what it does.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Make string_builder_reserve() allocate an exact capacity, and add a
string_builder_reserve_for_append() wrapper that does the
next_power_of_two(current length + number to append) that all of the
current callers want.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than documenting how to initialize a struct string_builder,
provide an initializer, STRING_BUILDER_INIT.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This time, in order to work on both GCC and Clang, use
__attribute__((__fallthrough__)) instead of /* fallthrough */ comments.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This reverts commit e05bfbddc2. Clang
doesn't support /* fallthrough */ comments, so we'll need to use
__attribute__((falthrough)), which will need some additional feature
detection.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
task->cpu was moved to task->thread_info.cpu in Linux 5.16, which causes
drgn_get_initial_registers() to think that the kernel is !SMP and use
CPU 0 instead, producing incorrect stack traces. This has also always
been wrong for Linux < 4.9 and on architectures that don't enable
CONFIG_THREAD_INFO_IN_TASK; in those cases, it should be
((struct thread_info *)task->stack)->cpu.
Fix it by factoring out a new task_cpu() helper that handles all of the
above cases. Also add a test case for task_cpu() in case this changes
again.
Fixes: eea5422546 ("libdrgn: make Linux kernel stack unwinding more robust")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Jakub Kicinski reported that
prog.crashed_thread().stack_trace()[1]['does not exist'] segfaulted on a
vmcore he encountered. The segfault was a NULL pointer dereference of
dwarf_diename() of a DW_TAG_subprogram DIE in
drgn_find_in_dwarf_scopes(). The fix is to ignore DIEs without a name.
I was curious what this anonymous DW_TAG_subprogram was. It turned out
to be some dubious DWARF generated by Clang when a local variable is
defined via a macro. One such example comes from the following code in
arch/x86/events/intel/uncore.h:
static inline bool uncore_mmio_is_valid_offset(struct intel_uncore_box *box,
unsigned long offset)
{
if (offset < box->pmu->type->mmio_map_size)
return true;
pr_warn_once("perf uncore: Invalid offset 0x%lx exceeds mapped area of %s.\n",
offset, box->pmu->type->name);
return false;
}
pr_warn_once() expands to:
#define pr_warn_once(fmt, ...) \
printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
#define printk_once(fmt, ...) \
({ \
static bool __section(".data.once") __print_once; \
bool __ret_print_once = !__print_once; \
\
if (!__print_once) { \
__print_once = true; \
printk(fmt, ##__VA_ARGS__); \
} \
unlikely(__ret_print_once); \
})
For some reason, Clang generates an anonymous, top-level
DW_TAG_subprogram DIE to contain the __print_once variable:
<1><1cf86e>: Abbrev Number: 62 (DW_TAG_subprogram)
<2><1cf86f>: Abbrev Number: 61 (DW_TAG_variable)
<1cf870> DW_AT_name : (indirect string, offset: 0x34fb2e): __print_once
<1cf874> DW_AT_type : <0x1c574c>
<1cf878> DW_AT_decl_file : 1
<1cf879> DW_AT_decl_line : 229
<1cf87a> DW_AT_location : 16 byte block: 3 2c 84 66 83 ff ff ff ff 94 1 31 1e 30 22 9f (DW_OP_addr: ffffffff8366842c; DW_OP_deref_size: 1; DW_OP_lit1; DW_OP_mul; DW_OP_lit0; DW_OP_plus; DW_OP_stack_value)
Whereas GCC puts it in a DW_TAG_lexical block DIE inside of the
DW_TAG_subprogram DIE for uncore_mmio_is_valid_offset():
<1><3110b2>: Abbrev Number: 45 (DW_TAG_subprogram)
<3110b3> DW_AT_name : (indirect string, offset: 0x2e13e): uncore_mmio_is_valid_offset
<3110b7> DW_AT_decl_file : 4
<3110b8> DW_AT_decl_line : 223
<3110b9> DW_AT_decl_column : 20
<3110ba> DW_AT_prototyped : 1
<3110ba> DW_AT_type : <0x2f416b>
<3110be> DW_AT_inline : 3 (declared as inline and inlined)
<3110bf> DW_AT_sibling : <0x311142>
<2><3110ef>: Abbrev Number: 66 (DW_TAG_lexical_block)
<3><3110f0>: Abbrev Number: 120 (DW_TAG_variable)
<3110f1> DW_AT_name : (indirect string, offset: 0x2da3f): __print_once
<3110f5> DW_AT_decl_file : 4
<3110f6> DW_AT_decl_line : 229
<3110f7> DW_AT_decl_column : 2
<3110f8> DW_AT_type : <0x2f416b>
<3110fc> DW_AT_location : 9 byte block: 3 2c 28 48 83 ff ff ff ff (DW_OP_addr: ffffffff8348282c)
Regardless, we shouldn't crash on this input.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.
Closes#185.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).
This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
which is still much faster to read than only reading what we need.
In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If drgn_object_set_buffer_from_internal() (used to implement
drgn_object_set_from_buffer(), drgn_object_slice(), and
drgn_object_reinterpret()) sets an object to a primitive type from a
buffer that comes from the same object, then drgn_object_reinit() will
free the value and then drgn_value_serialize() will access the freed
value, probably resulting in garbage. Handle this case the same way we
do if the result type is encoded as a buffer, by first copying to a
temporary value.
This doesn't affect usage through Python because objects are immutable
in the Python bindings.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.
drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The call to min() in drgn_read_memory_file() results in the following
warning on 32-bit architectures that I missed on review:
In file included from ../../libdrgn/memory_reader.c:10:
../../libdrgn/memory_reader.c: In function 'drgn_read_memory_file':
../../libdrgn/minmax.h:36:26: warning: comparison of distinct pointer types lacks a cast
36 | (void)(&unique_x == &unique_y); \
| ^~
../../libdrgn/minmax.h:28:19: note: in expansion of macro 'cmp_once_impl'
28 | #define min(x, y) cmp_once_impl(x, y, PP_UNIQUE(_x), PP_UNIQUE(_y), <)
| ^~~~~~~~~~~~~
../../libdrgn/memory_reader.c:284:34: note: in expansion of macro 'min'
284 | size_t readlen = min(file_end - file_offset, count);
| ^~~
We can fix it with a cast, and additionally do the call to min() earlier
and rework the logic a bit.
Fixes: 9684771d61 ("libdrgn: Zero fill excluded pages in kernel core dumps rather than FaultError")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There's a lot more context here that we should write down. It's also
worth noting that it appears that GDB always zero fills the range
between p_filesz and p_memsz, so if we end up having any other issues
because of this, we might have to concede and go back to the behavior
before commit 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz
in core dumps").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add pretty printing support in Jupyter notebook for Object, Type,
StackFrame, and StackTrace; it will print out their representation in
programming language syntax with str(), similar to what's being done in
interactive mode.
Link: https://ipython.readthedocs.io/en/stable/api/generated/IPython.lib.pretty.html#extending
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
makedumpfile will exclude zero pages. We found a core file where a
structure straddled a page boundary and the end of the structure
was all zeros so the page was excluded and we were generating a
FaultError trying to access the structure.
This change reverts a portion of that behaviour such that when we are
debugging a kernel core we go back to the zero fill behaviour. To do this
we go back to creating segments based on memsz instead of filesz and
handling the filesz->memsz gap in drgn_read_memory_file.
Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Glen McCready <gkm@mysteryinc.ca>
Replace the old "Scriptable debugger library" and
"Debugger-as-a-library" taglines with the one we're using on GitHub,
"Programmable debugger". Make up for it by emphasizing that drgn can
also be used as a library a tiny bit more in the README.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
PyErr_SetObject() takes a reference on the exception value, so we need
to drop the reference we got when we created the value. Issue #196 ran
into this by reading tons of unmapped addresses.
Fixes: 80fef04c70 ("Add address attribute to FaultError exception")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The config option is and always has been CONFIG_FW_CFG_SYSFS, not
CONFIG_FW_CFG. Also suggest the user-visible CONFIG_KEXEC instead of the
internal CONFIG_CRASH_CORE.
Fixes: 2bd861f719 ("libdrgn: program: detect QEMU guest memory dumps without VMCOREINFO")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
If we only have the stack trace available, it's useful to get the
program it came from. This'll be used eventually for helpers that take a
stack trace.
Signed-off-by: Omar Sandoval <osandov@osandov.com>