Move some of the responsibilities of FaultError to OutOfBoundsError so
that consumers can distinguish invalid memory accesses from
out-of-bounds accesses in drgn Objects, which may be based on a valid
memory address.
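
A minimal sketch of how a consumer might use the distinction. The two
exception classes and the read_element() helper below are hypothetical
stand-ins defined only for illustration; they are not drgn's definitions,
and treating the two errors as unrelated classes is an assumption.

```python
class FaultError(Exception):
    """Stand-in: the underlying memory could not be read."""


class OutOfBoundsError(Exception):
    """Stand-in: the access fell outside the bounds of the object."""


def read_element(buffer, size, index):
    """Hypothetical accessor over an object holding `size` elements."""
    if buffer is None:
        # No backing memory at all: an invalid memory access.
        raise FaultError("could not read memory")
    if not 0 <= index < size:
        # The memory is valid, but the index is outside the object.
        raise OutOfBoundsError("out of bounds of value")
    return buffer[index]


def describe(buffer, size, index):
    """Report the two failure modes separately."""
    try:
        return f"value {read_element(buffer, size, index)}"
    except OutOfBoundsError:
        return "out of bounds"
    except FaultError:
        return "invalid memory"
```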

When we're checking whether the element that we formatted on one line
would fit on the previous line, we check whether the previous line is
empty with remaining_columns == start_columns. This is never true, as
remaining_columns is always set to start_columns - 1 at most, and it
only decreases from there until we start a new line.
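
As an illustration of the pitfall, here is a small sketch of greedy line
packing that tracks whether the current line is empty explicitly instead
of inferring it from a column counter. The function and its variable
names are hypothetical and only loosely mirror the description above;
this is not the C code being fixed.

```python
def wrap(elements, columns):
    """Greedily pack "element," pieces into lines of at most `columns`."""
    lines = []
    current = ""
    start_columns = columns
    remaining_columns = start_columns
    for element in elements:
        piece = element + ","
        # Emptiness is tracked directly, not via the column counter.
        line_is_empty = current == ""
        # A separating space is needed unless the line is empty.
        needed = len(piece) if line_is_empty else len(piece) + 1
        if not line_is_empty and needed > remaining_columns:
            # Does not fit: start a new line and reset the counter.
            lines.append(current)
            current = ""
            remaining_columns = start_columns
            needed = len(piece)
        current += piece if current == "" else " " + piece
        remaining_columns -= needed
    if current:
        lines.append(current)
    return lines
```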

drgn_object_truthiness() is a misnomer, as truthiness is a
language-specific concept. Instead, invert the return value and rename
it to drgn_object_is_zero(), which more accurately conveys the meaning.
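
The inversion can be sketched as follows. Both functions are
hypothetical Python stand-ins for scalar values only, not the C
implementation: the language-neutral check asks whether the value is
zero, and a language-specific notion of truthiness is layered on top.

```python
def object_is_zero(value):
    """Language-neutral check: is this scalar value equal to zero?"""
    return value == 0


def c_truthiness(value):
    """C-style truthiness built on the neutral check: nonzero is true."""
    return not object_is_zero(value)
```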

In preparation for making drgn_pretty_print_object() more flexible
(i.e., not always "pretty"), rename it to drgn_format_object(). For
consistency, let's rename drgn_pretty_print_type_name(),
drgn_pretty_print_type(), and drgn_pretty_print_stack_trace(), too.

This implements the first step toward supporting C++: class types. In
particular, this adds a new drgn_type_kind, DRGN_TYPE_CLASS, and support
for parsing DW_TAG_class_type from DWARF. Although classes are not valid
in C, this adds support for pretty printing them, for completeness.

This makes several improvements to the hash table API.

The first two changes make things more general in order to be consistent
with the upcoming binary search tree API:

- Items are renamed to entries.
- Positions are renamed to iterators.
- hash_table_empty() is added.

One change makes the definition API more convenient:

- It is no longer necessary to pass the types into
  DEFINE_HASH_{MAP,SET}_FUNCTIONS().

A few changes take some good ideas from the C++ STL:

- hash_table_insert() now fails on duplicates instead of overwriting.
- hash_table_delete_iterator() returns the next iterator.
- hash_table_next() returns an iterator instead of modifying it.

One change reduces memory usage:

- The lower-level DEFINE_HASH_TABLE() is cleaned up and exposed as an
  alternative to DEFINE_HASH_MAP() and DEFINE_HASH_SET(). This allows us
  to get rid of the duplicated key where a hash map value already embeds
  the key (the DWARF index file table) and gets rid of the need to make
  a dummy hash set entry to do a search (the pointer and array type
  caches).
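
The STL-inspired behaviors listed above can be illustrated with a toy
table. This is a Python sketch of the semantics only; the class, its
method names, and the use of keys as iterators are assumptions made for
illustration and do not reflect the C macros.

```python
class ToyHashTable:
    """Toy table illustrating the STL-like behaviors described above."""

    def __init__(self):
        self._entries = {}  # dicts preserve insertion order

    def insert(self, key, value):
        """Fail (return False) on a duplicate key instead of overwriting."""
        if key in self._entries:
            return False
        self._entries[key] = value
        return True

    def first(self):
        """Return an iterator (here, a key) to the first entry, or None."""
        return next(iter(self._entries), None)

    def next(self, it):
        """Return the iterator following `it`; `it` itself is unchanged."""
        keys = list(self._entries)
        i = keys.index(it) + 1
        return keys[i] if i < len(keys) else None

    def delete_iterator(self, it):
        """Delete the entry at `it` and return the next iterator."""
        following = self.next(it)
        del self._entries[it]
        return following

    def get(self, key):
        """Return the value stored for `key`."""
        return self._entries[key]

    def empty(self):
        """Return True if the table has no entries."""
        return not self._entries
```

Returning the next iterator from delete_iterator() lets a caller remove
entries while walking the table without tracking the successor itself.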

Currently, programs can be created for three main use-cases: core dumps,
the running kernel, and a running process. However, internally, the
program memory, types, and symbols are pluggable. Expose that as a
callback API, which makes it possible to use drgn in much more creative
ways.
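
To make the idea of pluggable memory concrete, here is a minimal sketch
of a callback-based memory reader. ToyProgram, its methods, and the
callback signature (address, count, offset) are hypothetical and defined
here only for illustration; they are not presented as drgn's API.

```python
class ToyProgram:
    """Toy program whose memory is served by registered callbacks."""

    def __init__(self):
        self._segments = []

    def add_memory_segment(self, address, size, read_fn):
        """Register read_fn to serve reads in [address, address + size)."""
        self._segments.append((address, size, read_fn))

    def read(self, address, size):
        """Dispatch a read to the most recently added covering segment."""
        for start, length, read_fn in reversed(self._segments):
            if start <= address and address + size <= start + length:
                return read_fn(address, size, address - start)
        raise LookupError(f"no segment covers {address:#x}")


# Usage: back a segment with an in-memory buffer instead of a core dump.
data = bytes(range(16))
prog = ToyProgram()
prog.add_memory_segment(
    0x1000, len(data), lambda address, count, offset: data[offset:offset + count]
)
```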

Similar to "libdrgn: make memory reader pluggable with callbacks", we
want to support custom type indexes (imagine, e.g., using drgn to parse
a binary format). For now, this disables the DWARF index tests; we'll
have a better way to test them later, so let's not bother adding more
test scaffolding.

Comparisons between void * and other pointer types are currently
rejected, which is a bug. Fix it by allowing all pointer comparisons
regardless of the referenced type. Although this isn't valid by the C
standard, GCC and Clang both allow it by default (with a warning).

The current mixed Python/C implementation works well, but it has a
couple of important limitations:

- It's too slow for some common use cases, like iterating over large
  data structures.
- It can't be reused in utilities written in other languages.

This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:

- Types are now represented by a single Type class rather than the messy
  polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
  functions.
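
As an example of one of these changes, the move from a set of strings to
a bitmask for qualifiers can be sketched with the standard library's
enum.Flag. The class and member names below are stand-ins chosen for
illustration and are not claimed to match the bindings exactly.

```python
import enum


class Qualifiers(enum.Flag):
    """Stand-in bitmask of type qualifiers."""

    CONST = enum.auto()
    VOLATILE = enum.auto()
    RESTRICT = enum.auto()


# Before: a set of strings such as {"const", "volatile"}.
# After: a single value combined with bitwise OR.
qualifiers = Qualifiers.CONST | Qualifiers.VOLATILE
is_const = Qualifiers.CONST in qualifiers
is_restrict = Qualifiers.RESTRICT in qualifiers
```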

The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x as fast when using the C API directly.

Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.