We need to keep the Program alive for its types to stay valid, not just
the objects the Program has pinned. (I have no idea why I changed this
in commit 565e0343ef ("libdrgn: make symbol index pluggable with
callbacks").)
For operations where we don't have a type available, we currently fall
back to C. Instead, we should guess the language of the program and use
that as the default. The heurisitic implemented here gets the language
of the CU containing "main" (except for the Linux kernel, which is
always C). In the future, we should allow manually overriding the
automatically determined language.
For types obtained from DWARF, we determine it from the language of the
CU. For other types, it can be specified manually or fall back to the
default (C). Then, we can use the language for operations where the type
is available.
While we're here, make generate_dwarf_constants.py use the bundled
dwarf.h, generate code that black is happy with, and use the keyword
list from the standard library.
UTS_RELEASE is currently only accessible once debug info is loaded with
prog.load_debug_info(main=True). This makes it difficult to get the
release, find the appropriate vmlinux, then load the found vmlinux. We
can add vmcoreinfo_object_find as part of set_core_dump(), which makes
it possible to do the following:
prog = drgn.Program()
prog.set_core_dump(core_dump_path)
release = prog['UTS_RELEASE'].string_()
vmlinux_path = find_vmlinux(release)
prog.load_debug_info([vmlinux_path])
The only downside is that this ends up using the default definition of
char rather than what we would get from the debug info, but that
shouldn't be a big problem.
The osrelease is accessible via init_uts_ns.name.release, but we can
also get it straight out of vmcoreinfo, which will be useful for the
next change. UTS_RELEASE is the name of the macro defined in the kernel.
This continues the conversion from the last commit. Members and
parameters are basically the same, so we can do them together. Unlike
enumerators, these don't make sense to unpack or access as sequences.
Currently, type members, enumerators, and parameters are all represented
by tuples in the Python bindings. This is awkward to document and
implement. Instead, let's replace these tuples with proper types,
starting with the easiest one, TypeEnumerator. This one still makes
sense to treat as a sequence so that it can be unpacked as (name,
value).
Currently, we print:
>>> prog.symbol(prog['init_task'])
Traceback (most recent call last):
File "<console>", line 1, in <module>
TypeError: cannot convert 'struct task_struct' to index
It's not obvious what it means to convert to an index. Instead, let's
use the error message raised by operator.index():
TypeError: 'struct task_struct' object cannot be interpreted as an integer
Decouple some of the responsibilities of FaultError to
OutOfBoundsError so consumers can differentiate between
invalid memory accesses and running out of bounds in
drgn Objects which may be based on valid memory address.
After thinking about this some more, I decided that although it makes
sense for scripts to convert a type to an IntEnum class, I'd prefer that
the helpers take and return drgn Objects rather than these classes.
We'd like to have more control over how objects are formatted. I
considered defining a custom string format specification syntax, but
that's not easily discoverable. Instead, let's get rid of the current
format specification support and replace it with a normal method.
We currently have no test coverage for helpers. This is a problem, as
they can be fairly complicated and are susceptible to breaking with new
kernel versions. It's actually not too hard to test user-facing
subsystems on the running kernel as long as we're root and have debug
info for vmlinux, so let's add several tests for those. Specific data
structures will be a little trickier to test, so for now I'm not
covering those.
This implements the first step at supporting C++: class types. In
particular, this adds a new drgn_type_kind, DRGN_TYPE_CLASS, and support
for parsing DW_TAG_class_type from DWARF. Although classes are not valid
in C, this adds support for pretty printing them, for completeness.
In order to retrieve registers from stack traces, we need to know what
registers are defined for a platform. This adds a small DSL for defining
registers for an architecture. The DSL is parsed by an awk script that
generates the necessary tables, lookup functions, and enum definitions.
It's annoying to do obj.type_.size, and that doesn't even work for every
type. Add sizeof() that does the right thing whether it's given a Type
or Object.
For the following source code:
int arr[] = {};
GCC emits the following DWARF:
DWARF section [ 4] '.debug_info' at offset 0x40:
[Offset]
Compilation unit at offset 0:
Version: 4, Abbreviation section offset: 0, Address size: 8, Offset size: 4
[ b] compile_unit abbrev: 1
producer (strp) "GNU C17 9.2.0 -mtune=generic -march=x86-64 -g"
language (data1) C99 (12)
name (strp) "test.c"
comp_dir (strp) "/home/osandov"
stmt_list (sec_offset) 0
[ 1d] array_type abbrev: 2
type (ref4) [ 34]
sibling (ref4) [ 2d]
[ 26] subrange_type abbrev: 3
type (ref4) [ 2d]
upper_bound (sdata) -1
[ 2d] base_type abbrev: 4
byte_size (data1) 8
encoding (data1) signed (5)
name (strp) "ssizetype"
[ 34] base_type abbrev: 5
byte_size (data1) 4
encoding (data1) signed (5)
name (string) "int"
[ 3b] variable abbrev: 6
name (string) "arr"
decl_file (data1) test.c (1)
decl_line (data1) 1
decl_column (data1) 5
type (ref4) [ 1d]
external (flag_present) yes
location (exprloc)
[ 0] addr .bss+0 <arr>
Note the DW_AT_upper_bound of -1. We end up parsing this as UINT64_MAX
and returning a "DW_AT_upper_bound is too large" error. It appears that
GCC is simply emitting the array length minus one, so let's treat these
as having a length of zero.
Fixes#19.
There are a few places (e.g., Program.symbol(), Program.read()) where it
makes sense to accept, e.g., a drgn.Object with integer type. Replace
index_arg() with a converter function and use it everywhere that we use
the "K" format for PyArg_Parse*.
PlatformFlags and PrimitiveType got squashed into one string because of
a missing comma, and execscript was never added. Fix it and add some
test cases for it.
We're too inconsistent with how we use these for them to be useful (and
it's impossible to distinguish between a format error and some other
error from libelf/libdw/libdwfl), so let's just get rid of them and make
it all DRGN_ERROR_OTHER/Exception.
For stack trace support, we'll need to have some architecture-specific
functionality. drgn's current notion of an architecture doesn't actually
include the instruction set architecture. This change expands it to a
"platform", which includes the ISA as well as the existing flags.
struct drgn_symbol doesn't really represent a symbol; it's just an
object which hasn't been fully initialized (see c2be52dff0 ("libdrgn:
rename object index to symbol index"), it used to be called a "partial
object"). For stack traces, we're going to have a notion of a symbol
that more closely represents an ELF symbol, so let's get rid of the
temporary struct drgn_symbol representation and just return an object
directly.
Currently, repr() of structure and union types goes arbitrarily deep
(except for cycles). However, for lots of real-world types, this is
easily deeper than Python's recursion limit, so we can't get a useful
repr() at all:
>>> repr(prog.type('struct task_struct'))
Traceback (most recent call last):
File "<console>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while getting the repr of an object
Instead, only print one level of structure and union types.
A couple of times, I've broken help(drgn) by formatting a function
signature in a way that the inspect module doesn't understand (namely,
it crashes on Enum default arguments). Let's add a simple test that the
documentation at least renders.
libdwfl is the elfutils "DWARF frontend library". It has high-level
functionality for looking up symbols, walking stack traces, etc. In
order to use this functionality, we need to report our debugging
information through libdwfl. For userspace programs, libdwfl has a much
better implementation than drgn for automatically finding debug
information from a core dump or PID. However, for the kernel, libdwfl
has a few issues:
- It only supports finding debug information for the running kernel, not
vmcores.
- It determines the vmlinux address range by reading /proc/kallsyms,
which is slow (~70ms on my machine).
- If separate debug information isn't available for a kernel module, it
finds it by walking /lib/modules/$(uname -r)/kernel; this is repeated
for every module.
- It doesn't find kernel modules with names containing both dashes and
underscores (e.g., aes-x86_64).
Luckily, drgn already solved all of these problems, and with some
effort, we can keep doing it ourselves and report it to libdwfl.
The conversion replaces a bunch of code for dealing with userspace core
dump notes, /proc/$pid/maps, and relocations.