Add powerpc specific register information required to retrive the
stack traces of the tasks on both live system and from the core dump.
It uses the existing DSL format to define platform registers and
helper functions to initial them. It also adds architecture specific
information to enable powerpc. Current support is for little-endian
powerpc only.
Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
enum drgn_register_number in the public libdrgn API and
drgn.Register.number in the Python bindings are basically exports of
DWARF register numbers. They only exist as a way to identify registers
that's lighter weight than string lookups. libdrgn already has struct
drgn_register, so we can use that to identify registers in the public
API and remove enum drgn_register_number. This has a couple of benefits:
we don't depend on DWARF numbering in our API, and we don't have to
generate drgn.h from the architecture files. The Python bindings can
just use string names for now. If it seems useful, StackFrame.register()
can take a Register in the future, we'll just need to be careful to not
allow Registers from the wrong platform.
While we're changing the API anyways, also change it so that registers
have a list of names instead of one name. This isn't needed for x86-64
at the moment, but will be for architectures that have multiple names
for the same register (like ARM).
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In preparation for adding a "real", internal-only struct
drgn_stack_frame, replace the existing struct drgn_stack_frame with
explicit trace/frame arguments.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add struct drgn_type_template_parameter to libdrgn, the corresponding
TypeTemplateParameter to the Python bindings, and support for parsing
them from DWARF.
With this, support for templates is almost, but not quite, complete. The
main wart is that DW_TAG_name of compound types includes the template
parameters, so the type tag includes it as well. We should remove that
from the tag and instead have the type formatting code add it only when
getting the full type name.
Based on a patch from Jay Kamat.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In order to support static members, methods, default function arguments,
and value template parameters, we need to be able to store a drgn_object
in a drgn_type_member or drgn_type_parameter. These are all cases where
we want lazy evaluation, so we can replace drgn_lazy_type with a new
drgn_lazy_object which implements the same idea but for objects. Types
can still be represented with an absent object.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Getting the bit field size of a member will soon require evaluating the
lazy type, so return it from drgn_member_type() instead of accessing it
directly.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In preparation for struct drgn_type referencing struct drgn_object, move
the former after the latter.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I was going to add an Object.available_ attribute, but that made me
realize that the naming is somewhat ambiguous, as a reference object
with an invalid address might also be considered "unavailable" by users.
Use the name "absent" instead, which is more clear: the object isn't
there at all.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The == operator on drgn.Type is only intended for testing. It's
expensive and slow and not what people usually want. It's going to get
even more awkward to define once types can refer to objects (for
template parameters and static members and such). Let's replace == with
a new identical() function only available in unit tests. Then, remove
the operator from the Python bindings as well as the underlying libdrgn
drgn_type_eq() and drgn_qualified_type_eq() functions.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
offsetof() can almost be implemented with Type.member(name).offset, but
that doesn't parse member designators. Add an offsetof() function that
does (and add drgn_type_offsetof() in libdrgn).
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Add drgn_type_has_member() to libdrgn and Type.has_member() to the
Python bindings. This can simplify some version checks, like the one in
_for_each_block_device() since commit 9a10a927b0 ("helpers: fix
for_each_{disk,partition}() on kernels >= v5.1").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64)
by moving the bit_offset and little_endian members out of the value and
reference structs.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are a couple of reasons that it was the wrong choice to have a
bit_offset for value objects:
1. When we store a buffer with a bit_offset, we're storing useless
padding bits.
2. bit_offset describes a location, or in other words, part of an
address. This makes sense for references, but not for values, which
are just a bag of bytes.
Get rid of union drgn_value.bit_offset in libdrgn, make
Object.bit_offset None for value objects, and disallow passing
bit_offset to the Object() constructor when creating a value. bit_offset
can still be passed when creating an object from a buffer, but we'll
shift the bytes down as necessary to store the value with no offset.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are some situations where we can find an object but can't
determine its value, like local variables that have been optimized out,
inlined functions without a concrete instance, and pure virtual methods.
It's still useful to get some information from these objects, namely
their types. Let's add the concept of an "unavailable" object, which is
an object with a known type but unknown value/address.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I'd like to use the name drgn_object_kind to distinguish between values
and references. "Encoding" is more accurate than "kind", anyways.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The Doxygen documentation for libdrgn has bit-rotted over time. Bring
back the Internal module, clean up a few renamed members and parameters,
and fix broken parsing caused by the generic definition macros.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I recently hit a couple of CI failures caused by relying on transitive
includes that weren't always present. include-what-you-use is a
Clang-based tool that helps with this. It's a bit finicky and noisy, so
this adds scripts/iwyu.py to make running it more convenient (but not
reliable enough to automate it in Travis).
This cleans up all reasonable include-what-you-use warnings and
reorganizes a few header files.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn was originally my side project, but for awhile now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
I've found that I do this manually a lot (e.g., when digging through a
task's stack). Add shortcuts for reading unsigned integers and a note
for how to manually read other formats.
For C++ support, we need to add an array of template parameters to
struct drgn_type. struct drgn_type already has arrays for members,
enumerators, and parameters embedded at the end of the structure,
because no type needs more than one of those. However, struct, union,
and class types may need members and template parameters. We could add a
separate array of templates, but then it gets confusing having two
methods of storing arrays in struct drgn_type. Let's make these arrays
separate instead of embedding them.
For operations where we don't have a type available, we currently fall
back to C. Instead, we should guess the language of the program and use
that as the default. The heurisitic implemented here gets the language
of the CU containing "main" (except for the Linux kernel, which is
always C). In the future, we should allow manually overriding the
automatically determined language.
For types obtained from DWARF, we determine it from the language of the
CU. For other types, it can be specified manually or fall back to the
default (C). Then, we can use the language for operations where the type
is available.
Decouple some of the responsibilities of FaultError to
OutOfBoundsError so consumers can differentiate between
invalid memory accesses and running out of bounds in
drgn Objects which may be based on valid memory address.
Currently the drgn version number is defined in drgn.h.in, and configure
and setup.py both parse it out of there. However, now that we're
generating drgn.h anyways, it's easier to make configure.ac the source
of truth.
In preparation for making drgn_pretty_print_object() more flexible
(i.e., not always "pretty"), rename it to drgn_format_object(). For
consistency, let's rename drgn_pretty_print_type_name(),
drgn_pretty_print_type(), and drgn_pretty_print_stack_trace(), too.
If we only want debugging information for vmlinux and not kernel
modules, it'd be nice to only load the former. This adds a load_main
parameter to drgn_program_load_debug_info() which specifies just that.
For now, it's only implemented for the Linux kernel. While we're here,
let's make the paths parameter optional for the Python bindings.
This implements the first step at supporting C++: class types. In
particular, this adds a new drgn_type_kind, DRGN_TYPE_CLASS, and support
for parsing DW_TAG_class_type from DWARF. Although classes are not valid
in C, this adds support for pretty printing them, for completeness.
When debugging the Linux kernel, it's inconvenient to have to get the
task_struct of a thread in order to get its stack trace. This adds
support for looking it up solely by PID. In that case, we do the
find_task() inside of libdrgn. This also gives us stack trace support
for userspace core dumps almost for free since we already added support
for NT_PRSTATUS.