Commit Graph

40 Commits

Author SHA1 Message Date
Omar Sandoval
5f17281926 libdrgn: make drgn_object::is_reference an enum
To prepare for a new kind of object, replace the is_reference bool with
an enum drgn_object_kind.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 13:37:58 -08:00
Omar Sandoval
36068a0ea8 Fix trailing commas for Black v20.8b1
Black was recently changed to treat a trailing comma as an indicator to
put each item/argument on its own line. We have a bunch of places where
something previously had to be split into multiple lines, then was
edited to fit on one line, but Black kept the trailing comma. Now this
update wants to unnecessarily split it back up. For now, let's get rid
of these commas. Hopefully in the future Black has a way to opt out of
this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-27 11:31:29 -07:00
Omar Sandoval
a97f6c4fa2 Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.

Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.

So, let's reimagine types as being owned by a program. This has a few
parts:

1. In libdrgn, simple types are now created by factory functions,
   drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
   and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
   Program class.
5. While we're changing the API, the parameters to pointer_type() and
   array_type() are reordered to be more logical (and to allow
   pointer_type() to take a default size of None for the program's
   default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
   argument only.

A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 17:41:09 -07:00
Omar Sandoval
971a2d3687 libdrgn/python: make Objects fully immutable
The model has always been that drgn Objects are immutable, but for some
reason I went through the trouble of allowing __init__() to reinitialize
an already initialized Object. Instead, let's fully initialize the
Object in __new__() and get rid of __init__().
2020-05-18 00:07:49 -07:00
Omar Sandoval
ab876f3dbd libdrgn/python: allow specifying Object value positionally
It's annoying to have to do value= when creating objects, especially in
interactive mode. Let's allow passing in the value positionally so that
`Object(prog, "int", value=0)` becomes `Object(prog, "int", 0)`. It's
clear enough that this is creating an int with value 0.
2020-05-18 00:07:49 -07:00
Omar Sandoval
8b264f8823 Update copyright headers to Facebook and add missing headers
drgn was originally my side project, but for awhile now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
2020-05-15 15:13:02 -07:00
Omar Sandoval
63299e0701 libdrgn: actually use uint64_t for two's complement unary ops
UNARY_OP_SIGNED_2C() uses a union of int64_t and uint64_t to avoid
signed integer overflow... except that there's a typo and the uint64_t
is actually an int64_t. Fix it and add a test that would catch it with
-fsanitize=undefined.
2020-05-08 13:50:24 -07:00
Omar Sandoval
26ef465007 libdrgn/python: add proper type for members and parameters
This continues the conversion from the last commit. Members and
parameters are basically the same, so we can do them together. Unlike
enumerators, these don't make sense to unpack or access as sequences.
2020-02-12 15:40:19 -08:00
Omar Sandoval
7c70a1a384 libdrgn/python: add proper type for enumerators
Currently, type members, enumerators, and parameters are all represented
by tuples in the Python bindings. This is awkward to document and
implement. Instead, let's replace these tuples with proper types,
starting with the easiest one, TypeEnumerator. This one still makes
sense to treat as a sequence so that it can be unpacked as (name,
value).
2020-02-12 15:37:41 -08:00
Omar Sandoval
9de2cc8410 libdrgn/python: make Object.__index__() TypeError message clearer
Currently, we print:

>>> prog.symbol(prog['init_task'])
Traceback (most recent call last):
  File "<console>", line 1, in <module>
TypeError: cannot convert 'struct task_struct' to index

It's not obvious what it means to convert to an index. Instead, let's
use the error message raised by operator.index():

TypeError: 'struct task_struct' object cannot be interpreted as an integer
2020-02-11 09:19:53 -08:00
Serapheim Dimitropoulos
ad82e9623a Introduce OutOfBoundsError
Decouple some of the responsibilities of FaultError to
OutOfBoundsError so consumers can differentiate between
invalid memory accesses and running out of bounds in
drgn Objects which may be based on valid memory address.
2020-02-04 14:59:31 -08:00
Omar Sandoval
660276a0b8 Format Python code with Black
I'm not a fan of 100% of the Black coding style, but I've spent too much
time manually formatting Python code, so let's just pull the trigger.
2020-01-14 11:51:58 -08:00
Omar Sandoval
1443d17fb4 libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_ELEMENTS 2019-12-19 11:43:54 -08:00
Omar Sandoval
db66952b2e libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_MEMBERS 2019-12-19 11:43:54 -08:00
Omar Sandoval
c8434e9a9e libdrgn: add DRGN_FORMAT_OBJECT_ELEMENT_INDICES 2019-12-19 11:43:54 -08:00
Omar Sandoval
cfceb491db libdrgn: add DRGN_FORMAT_OBJECT_MEMBER_NAMES 2019-12-19 11:43:54 -08:00
Omar Sandoval
4fad941ec1 libdrgn: add DRGN_FORMAT_OBJECT_{MEMBERS,ELEMENTS}_SAME_LINE 2019-12-19 11:43:54 -08:00
Omar Sandoval
6bb8da04a0 libdrgn: omit trailing comma when formatting one-line array
This is somewhat arbitrary, but I think it looks more natural to only
use the trailing comma for multi-line initializers.
2019-12-19 11:43:54 -08:00
Omar Sandoval
d77b7bd7e3 libdrgn: add DRGN_FORMAT_OBJECT_{TYPE_NAME,MEMBER_TYPE_NAMES,ELEMENT_TYPE_NAMES} 2019-12-19 11:43:54 -08:00
Omar Sandoval
89307c532a libdrgn: add DRGN_FORMAT_OBJECT_CHAR 2019-12-19 11:43:54 -08:00
Omar Sandoval
7cee597fff libdrgn: add DRGN_FORMAT_OBJECT_STRING 2019-12-19 11:43:54 -08:00
Omar Sandoval
5865fa4d16 libdrgn: add DRGN_FORMAT_OBJECT_SYMBOLIZE 2019-12-19 11:43:54 -08:00
Omar Sandoval
f58bc4bf3a libdrgn: add DRGN_FORMAT_OBJECT_DEREFERENCE 2019-12-19 11:43:54 -08:00
Omar Sandoval
cf3a07bdfb libdrgn: python: replace Object.__format__ with Object.format_
We'd like to have more control over how objects are formatted. I
considered defining a custom string format specification syntax, but
that's not easily discoverable. Instead, let's get rid of the current
format specification support and replace it with a normal method.
2019-12-19 11:43:52 -08:00
Amlan Nayak
0df2152307 Add basic class type support
This implements the first step at supporting C++: class types. In
particular, this adds a new drgn_type_kind, DRGN_TYPE_CLASS, and support
for parsing DW_TAG_class_type from DWARF. Although classes are not valid
in C, this adds support for pretty printing them, for completeness.
2019-11-18 10:36:40 -08:00
Omar Sandoval
b8c657d760 libdrgn: python: add sizeof()
It's annoying to do obj.type_.size, and that doesn't even work for every
type. Add sizeof() that does the right thing whether it's given a Type
or Object.
2019-10-18 11:47:32 -07:00
Omar Sandoval
430732093d libdrgn: python: add converter for byteorder
Rather than open-coding the conversion where we need it, make it a
proper converter function.
2019-10-15 21:21:21 -07:00
Omar Sandoval
55a9700435 libdrgn: python: accept integer-like arguments in more places
There are a few places (e.g., Program.symbol(), Program.read()) where it
makes sense to accept, e.g., a drgn.Object with integer type. Replace
index_arg() with a converter function and use it everywhere that we use
the "K" format for PyArg_Parse*.
2019-10-15 21:10:11 -07:00
Omar Sandoval
0c5df56fba libdrgn: replace symbol index with object index
struct drgn_symbol doesn't really represent a symbol; it's just an
object which hasn't been fully initialized (see c2be52dff0 ("libdrgn:
rename object index to symbol index"), it used to be called a "partial
object"). For stack traces, we're going to have a notion of a symbol
that more closely represents an ELF symbol, so let's get rid of the
temporary struct drgn_symbol representation and just return an object
directly.
2019-07-29 17:04:47 -07:00
Omar Sandoval
b5b024ecac tests: move common helpers to top-level 2019-07-29 17:04:47 -07:00
Omar Sandoval
baba1ff3f0 libdrgn: make program components pluggable
Currently, programs can be created for three main use-cases: core dumps,
the running kernel, and a running process. However, internally, the
program memory, types, and symbols are pluggable. Expose that as a
callback API, which makes it possible to use drgn in much more creative
ways.
2019-05-10 12:41:07 -07:00
Omar Sandoval
7282c40a75 libdrgn: fix crash in drgn_object_slice()
We need to set the value after we've reinitialized the object, otherwise
drgn_object_deinit() may try to free a buffer that we've already
overwritten. This also adds a test which triggers the crash.
2019-04-24 17:57:36 -07:00
Omar Sandoval
97f5cf70c6 libdrgn: fix C array and function casting
Casting an array or function should first convert the array or function
into a pointer.
2019-04-12 16:40:12 -07:00
Omar Sandoval
1db8d11f84 libdrgn: allow void pointer arithmetic
This is a GCC extension, but it's used pretty often in practice.
2019-04-12 16:06:24 -07:00
Omar Sandoval
7719d76820 libdrgn: allow comparison of function pointers and incomplete arrays 2019-04-12 15:49:15 -07:00
Omar Sandoval
309dc82789 libdrgn: allow comparing any pointer types in C
There's a bug that we don't allow comparisons between void * and other
pointer types, so let's fix it by allowing all pointer comparisons
regardless of the referenced type. Although this isn't valid by the C
standard, GCC and Clang both allow it by default (with a warning).
2019-04-12 15:44:08 -07:00
Omar Sandoval
73090f6128 libdrgn: fix error message for cast to incomplete type
The errant type is the one we're trying to cast to, not the object's
type. This fixes an abort in drgn_error_incomplete_type().
2019-04-12 13:24:51 -07:00
Omar Sandoval
435640faf6 Fix some linter errors 2019-04-11 15:51:20 -07:00
Omar Sandoval
321e2c210e libdrgn/python: format pointers in hex for Object.__repr__() 2019-04-05 10:36:54 -07:00
Omar Sandoval
75c3679147 Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:

- It's too slow for some common use cases, like iterating over large
  data structures.
- It can't be reused in utilities written in other languages.

This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:

- Types are now represented by a single Type class rather than the messy
  polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
  functions.

The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.

Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-04-02 14:12:07 -07:00