996 Commits

Author SHA1 Message Date
Omar Sandoval
0d35dec8ee libdrgn: python: define Py_RETURN_BOOL
And use it instead of an if statement with
Py_RETURN_TRUE/Py_RETURN_FALSE or PyBool_FromLong().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-28 11:35:09 -08:00
Omar Sandoval
46343ae08d libdrgn: get rid of struct drgn_stack_frame
In preparation for adding a "real", internal-only struct
drgn_stack_frame, replace the existing struct drgn_stack_frame with
explicit trace/frame arguments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-27 11:22:34 -08:00
Omar Sandoval
71c6ac6927 libdrgn: use drgn_debug_info_module instead of Dwfl_Module in more places
It's easier to go from drgn_debug_info_module to Dwfl_Module than the
other direction, and I'd rather use the "higher-level"
drgn_debug_info_module wherever possible. So, store
drgn_debug_info_module in the DWARF index (which also saves a
dereference while building the index), and pass around
drgn_debug_info_module when parsing types/objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-27 11:17:41 -08:00
Omar Sandoval
9e3b3a36cf setup.py: fix black error
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:45:11 -08:00
Omar Sandoval
7c670d6faa Remove unused 'type: ignore' comment for pkgutil.read_code()
mypy 0.800 has a stub for pkgutil.read_code(), so we don't need to
ignore it anymore.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:27:10 -08:00
Omar Sandoval
2977bee278 Fix reexport of drgn.__version__
mypy 0.800 is stricter about reexports: "from foo import X as Y" is only
considered a reexport if X and Y are the same name (see
python/mypy#9515). mypy 0.800 fails with:

  drgn/internal/cli.py:46: error: Module has no attribute "__version__"

Rename drgn.internal.version.version to __version__ so that
drgn/__init__.py can reexport it with import __version__ as __version__.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:23:06 -08:00
Omar Sandoval
09a9220c60 setup.py: add 5.11 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 16:59:57 -08:00
Omar Sandoval
0a396d60f3 helpers: fix block device helpers on v5.11
Linux v5.11 combined struct block_device and struct hd_struct, which
breaks for_each_disk(), for_each_partition(), part_devt(), and
part_name(). Update the helpers to handle the new scheme.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 16:56:03 -08:00
Omar Sandoval
bbefc573d8 libdrgn: debug_info: make sure DW_TAG_template_value_parameter has value
Otherwise, an invalid DW_TAG_template_value_parameter can be confused
for a type parameter.

Fixes: 352c31e1ac ("Add support for C++ template parameters")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 12:07:46 -08:00
Omar Sandoval
5f170ea3f3 helpers: add per_cpu()
The correct way to access global per-CPU variables
(per_cpu_ptr(prog[name].address_of_(), cpu)) has been a common source of
confusion (see #77). Add an analogue to the per_cpu() macro in the
kernel as a shortcut and document it as the easiest method for getting a
global per-CPU variable: per_cpu(prog[name], cpu).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 11:40:05 -08:00
Omar Sandoval
2c612ea97f libdrgn: fix address of global per-CPU variables with KASLR
The address of a per-CPU variable is really an offset into the per-CPU
area, but we're applying the load bias (i.e., KASLR offset) to it as if
it were an address, resulting in an invalid pointer when it's eventually
passed to per_cpu_ptr().

Fix this by applying the bias only if the address is in the module's
address range. This heuristic avoids any Linux kernel-specific logic;
hopefully it doesn't have any undesired side effects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:14:50 -08:00
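
The heuristic from the commit above can be sketched as a small function. This is a toy model, not libdrgn code: `apply_load_bias` and its parameters are hypothetical names, the example addresses are made up, and checking the range against the biased address is an assumption about how the test is performed.

```python
def apply_load_bias(address: int, bias: int, start: int, end: int) -> int:
    """Return the address to use for a variable found in debug info.

    start and end describe the module's loaded range [start, end).
    Assumption: the range check is done on the biased address.
    """
    biased = address + bias
    if start <= biased < end:
        # Ordinary variable: relocate it by the load bias (KASLR offset).
        return biased
    # Outside the module's range: probably an offset rather than an
    # address (for example a per-CPU variable), so leave it untouched.
    return address


# Made-up example values for illustration only.
BIAS = 0x1E000000
START = 0xFFFFFFFF9F000000
END = 0xFFFFFFFFA2000000

ordinary = apply_load_bias(0xFFFFFFFF82000000, BIAS, START, END)
per_cpu_offset = apply_load_bias(0x1A000, BIAS, START, END)
print(hex(ordinary), hex(per_cpu_offset))
```

The per-CPU offset comes back unchanged, so it can be handed to a per-CPU lookup that adds the per-CPU base itself.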
Omar Sandoval
a7962e9477 libdrgn: debug_info: pass around Dwfl_Module instead of bias
We're going to need the module start and end in
drgn_object_from_dwarf_variable(), so pass the Dwfl_Module around and
get the bias when we need it. This means we don't need the bias from
drgn_dwarf_index_get_die(), so get rid of that, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:12:29 -08:00
Omar Sandoval
81a203c48f helpers: fix for_each_{possible,online,present}_cpu() on v4.4
Also reorder the definitions to alphabetical order and add tests.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:08:48 -08:00
Omar Sandoval
048952f9a6 libdrgn: x86-64: fix rsp of initial stack frame
We're using task->thread.sp for rsp in the initial frame for both the
struct inactive_task_frame path and frame pointer path. This is not
correct for either.

For kernels with struct inactive_task_frame, task->thread.sp points to
the struct inactive_task_frame. The stack pointer in the initial
frame is the address immediately after the struct inactive_task_frame.

For kernels without struct inactive_task_frame, task->thread.sp points
to the saved rbp. We follow that rbp to the rbp and return address for
the initial frame; its stack pointer is the address immediately after
those.

Fixes: 10142f922f ("Add basic stack trace support")
Fixes: 51596f4d6c ("libdrgn: x86-64: remove garbage initial stack frame on old kernels")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-15 10:57:08 -08:00
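
The two cases described in the commit above can be sketched with a toy memory model. The function names, the `read_u64` callback, and the example addresses are hypothetical; the size of struct inactive_task_frame is passed in rather than assumed.

```python
WORD_SIZE = 8  # bytes per saved register on x86-64


def initial_rsp_with_inactive_task_frame(thread_sp: int, frame_size: int) -> int:
    """thread_sp points at the struct; rsp is the address right after it."""
    return thread_sp + frame_size


def initial_frame_from_frame_pointer(read_u64, thread_sp: int) -> dict:
    """thread_sp points at a saved rbp; follow it to the initial frame."""
    frame_pointer = read_u64(thread_sp)
    rbp = read_u64(frame_pointer)              # previous rbp
    rip = read_u64(frame_pointer + WORD_SIZE)  # return address
    rsp = frame_pointer + 2 * WORD_SIZE        # address right after both
    return {"rbp": rbp, "rip": rip, "rsp": rsp}


# Toy memory: address -> 64-bit value (made-up values).
memory = {
    0x1000: 0x2000,              # saved rbp at task->thread.sp
    0x2000: 0x3000,              # previous rbp
    0x2008: 0xFFFFFFFF81001234,  # return address
}
frame = initial_frame_from_frame_pointer(memory.__getitem__, 0x1000)
print({name: hex(value) for name, value in frame.items()})
```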
Omar Sandoval
277c34e876 CONTRIBUTING: add guidelines for good commits
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-12 16:56:54 -08:00
Omar Sandoval
352c31e1ac Add support for C++ template parameters
Add struct drgn_type_template_parameter to libdrgn, the corresponding
TypeTemplateParameter to the Python bindings, and support for parsing
them from DWARF.

With this, support for templates is almost, but not quite, complete. The
main wart is that DW_AT_name of compound types includes the template
parameters, so the type tag includes it as well. We should remove that
from the tag and instead have the type formatting code add it only when
getting the full type name.

Based on a patch from Jay Kamat.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
b6958f920c libdrgn: debug_info: move object parsing code in debug_info.c
In preparation for calling the object parsing code from the type parsing
code, move it up in the file (and update the coding style in
drgn_object_from_dwarf_enumerator() while we're at it).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
be1bb279aa libdrgn: debug_info: pass DIE bias when parsing types
This will be needed for types containing reference objects.

Based on a patch from Jay Kamat.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
d35243b354 libdrgn: replace lazy types with lazy objects
In order to support static members, methods, default function arguments,
and value template parameters, we need to be able to store a drgn_object
in a drgn_type_member or drgn_type_parameter. These are all cases where
we want lazy evaluation, so we can replace drgn_lazy_type with a new
drgn_lazy_object which implements the same idea but for objects. Types
can still be represented with an absent object.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
190062f470 libdrgn: get drgn_type_member.bit_field_size through drgn_member_type()
Getting the bit field size of a member will soon require evaluating the
lazy type, so return it from drgn_member_type() instead of accessing it
directly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
359177295d libdrgn: move type definitions in drgn.h
In preparation for struct drgn_type referencing struct drgn_object, move
the former after the latter.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
934dd36302 libdrgn: remove unused name parameter from drgn_object_from_dwarf_{subprogram,variable}()
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 12:16:29 -08:00
Omar Sandoval
85a5605e37 CI: run apt-get update before apt-get install
Apparently the package index can be out of date on the newly brought up
VM, leading to 404s, so make sure to update it first.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 11:22:42 -08:00
Omar Sandoval
a57c26ed32 libdrgn: fix zero-length array GCC < 9.0 workaround for qualified types
We're not applying the zero-length array workaround when the array type
is qualified. Make sure we pass through can_be_incomplete_array when
parsing DW_TAG_{const,restrict,volatile,atomic}_type.

Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 11:21:57 -08:00
Omar Sandoval
ca7682650d libdrgn: rename drgn_type_from_dwarf_child() to drgn_type_from_dwarf_attr()
The type comes from the DW_AT_type attribute of the DIE, not a child
DIE, so this is a better name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 11:02:27 -08:00
Omar Sandoval
798f0887a5 libdrgn: simplify language fall back handling
If the language for a DWARF type is not found or unrecognized, we should
fall back to the global default, not the program default (the program
default language is for language-specific operations on the program, so
DWARF parsing shouldn't depend on it). Add a fall_back parameter to
drgn_language_from_die() and use it in DWARF parsing, and replace
drgn_language_or_default() with a drgn_default_language variable.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 10:46:35 -08:00
Omar Sandoval
a8be40ca60 libdrgn: python: fix Program_hold_object() reference leak
We should only increment a held object's reference count when it is
initially inserted into the set; subsequent holds are no-ops.

Fixes: a8d632b4c1 ("libdrgn/python: use F14 instead of PyDict for Program::objects")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-06 17:48:29 -08:00
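
The intended hold semantics from the commit above can be modeled in a few lines. This is a toy sketch with explicit counters, not the C implementation; `Holder` and its attributes are hypothetical names.

```python
class Holder:
    """Toy model of a program that keeps held objects alive."""

    def __init__(self):
        self._held = set()
        self.refcounts = {}  # object key -> references taken by this holder

    def hold(self, key) -> bool:
        """Hold key; return True only if it was newly inserted."""
        if key in self._held:
            # Already held: a repeated hold must not take another reference.
            return False
        self._held.add(key)
        self.refcounts[key] = self.refcounts.get(key, 0) + 1
        return True


holder = Holder()
first = holder.hold("obj")
second = holder.hold("obj")
print(first, second, holder.refcounts["obj"])
```

Incrementing the count on every call instead would leave one reference per extra hold that is never dropped, which is the leak being fixed.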
Omar Sandoval
b87070f98c libdrgn: fix vector_shrink_to_fit() with size 0
realloc(ptr, 0) is equivalent to free(ptr). It may return NULL, in which
case vector_do_shrink_to_fit() won't update the vector's data and
capacity. A subsequent append will then try to reuse the previous
allocation, causing a use-after-free. free() empty vectors explicitly
instead.

Fixes: 8d52536271 ("libdrgn: add common vector implementation")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-06 17:47:39 -08:00
Omar Sandoval
d6a840ec30 libdrgn: deinitialize empty members/parameters/enumerators when deduplicating
Right now, an empty builder vector will not have anything to free, but
if we start pre-reserving these later, it will be a leak.

Fixes: c7af566c6e ("libdrgn: deduplicate all types with no members/parameters/enumerators")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-06 17:47:25 -08:00
Omar Sandoval
c7af566c6e libdrgn: deduplicate all types with no members/parameters/enumerators
Even if a compound, function, or enumerated type is complete, we can
still deduplicate it as long as it doesn't have members, parameters, or
enumerators.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-06 01:59:48 -08:00
Omar Sandoval
1631c11f37 Migrate to GitHub Actions
Travis CI is no longer offering free open source CI, so migrate to
GitHub Actions. The only downside is that GitHub Actions doesn't support
nested virtualization, but we can work around that by falling back to
slow emulation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-01 02:59:50 -08:00
Omar Sandoval
988e9e7190 libdrgn/python: add Object.absent_
Without this, the only way to check whether an object is absent in
Python is to try to use the object and catch the ObjectAbsentError.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-29 15:06:40 -08:00
Omar Sandoval
30cfa40a72 libdrgn: rename "unavailable" objects to "absent" objects
I was going to add an Object.available_ attribute, but that made me
realize that the naming is somewhat ambiguous, as a reference object
with an invalid address might also be considered "unavailable" by users.
Use the name "absent" instead, which is more clear: the object isn't
there at all.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-29 14:58:26 -08:00
Omar Sandoval
c2eec00ae0 libdrgn/python: use None instead of 0 for TypeMember.bit_field_size
Make TypeMember.bit_field_size consistent with Object.bit_field_size_ by
using None to represent a non-bit field instead of 0.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-25 01:53:23 -08:00
Omar Sandoval
7d7aa7bf7b libdrgn/python: remove Type == operator
The == operator on drgn.Type is only intended for testing. It's
expensive and slow and not what people usually want. It's going to get
even more awkward to define once types can refer to objects (for
template parameters and static members and such). Let's replace == with
a new identical() function only available in unit tests. Then, remove
the operator from the Python bindings as well as the underlying libdrgn
drgn_type_eq() and drgn_qualified_type_eq() functions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-22 03:11:38 -08:00
Omar Sandoval
523fd26959 libdrgn: don't allow casting to non-scalar types at all
Currently, we try to emulate the GNU C extension of casting a struct
type to itself. This does a deep type comparison, which is expensive. We
could take a shortcut like only comparing the kind and type name, but
seeing as standard C only allows casting to a scalar type, let's drop
support for casting to a struct (or other non-scalar) type entirely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-22 02:46:05 -08:00
Omar Sandoval
40004e5c8f libdrgn/python: add offsetof()
offsetof() can almost be implemented with Type.member(name).offset, but
that doesn't parse member designators. Add an offsetof() function that
does (and add drgn_type_offsetof() in libdrgn).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 16:46:41 -08:00
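
To illustrate what parsing a member designator involves, here is a minimal sketch over a toy type model. The dictionary-based types, the `offsetof` helper below, and the example layout are all hypothetical and are not drgn's data structures or its implementation.

```python
import re

# One designator step: ".name" (or a leading "name"), or "[index]".
_TOKEN = re.compile(r"(?:\.|^)([A-Za-z_]\w*)|\[(\d+)\]")


def offsetof(type_, designator: str) -> int:
    """Return the byte offset of a designator such as "points[2].y"."""
    if not designator or designator.startswith("."):
        raise ValueError("invalid member designator")
    offset = 0
    pos = 0
    while pos < len(designator):
        match = _TOKEN.match(designator, pos)
        if match is None:
            raise ValueError("invalid member designator")
        name, index = match.groups()
        if name is not None:
            if type_["kind"] != "struct":
                raise TypeError("member access on a non-struct type")
            member_offset, type_ = type_["members"][name]
            offset += member_offset
        else:
            if type_["kind"] != "array":
                raise TypeError("subscript on a non-array type")
            offset += int(index) * type_["element_size"]
            type_ = type_["element"]
        pos = match.end()
    return offset


# Toy layout: struct shape { int id; struct point { int x, y; } points[4]; }
INT = {"kind": "int"}
POINT = {"kind": "struct", "members": {"x": (0, INT), "y": (4, INT)}}
POINTS = {"kind": "array", "element": POINT, "element_size": 8}
SHAPE = {"kind": "struct", "members": {"id": (0, INT), "points": (8, POINTS)}}

print(offsetof(SHAPE, "points[2].y"))
```

A plain name lookup handles only the first step; the loop is what lets nested members and array subscripts contribute to the offset.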
Omar Sandoval
a595e52d22 libdrgn/python: add Type.has_member()
Add drgn_type_has_member() to libdrgn and Type.has_member() to the
Python bindings. This can simplify some version checks, like the one in
_for_each_block_device() since commit 9a10a927b0 ("helpers: fix
for_each_{disk,partition}() on kernels >= v5.1").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 16:38:48 -08:00
Omar Sandoval
fd04463596 libdrgn/python: add Type.member()
In Python, looking up a member in a drgn Type by name currently looks
something like:

  member = [member for member in type.members if member.name == "foo"][0]

Add a Type.member(name) method, which is both easier and more efficient.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 16:10:23 -08:00
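
The difference between the list-comprehension lookup and a dedicated method can be sketched with a toy type. `Member` and `StructType` are hypothetical classes for illustration, and the choice of LookupError for a missing member is an assumption rather than a statement about drgn's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    offset: int


class StructType:
    """Toy type with a name index built once, at construction."""

    def __init__(self, members):
        self.members = list(members)
        self._by_name = {member.name: member for member in self.members}

    def member(self, name: str) -> Member:
        try:
            return self._by_name[name]
        except KeyError:
            raise LookupError(f"no member named {name!r}") from None


point = StructType([Member("x", 0), Member("y", 4)])

# Old style: scan every member and index the resulting list.
scanned = [member for member in point.members if member.name == "y"][0]
# New style: a single indexed lookup with a clearer error on failure.
looked_up = point.member("y")
print(scanned == looked_up, looked_up.offset)
```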
Omar Sandoval
e72ecd0e2c libdrgn: replace drgn_program_member_info() with drgn_type_find_member()
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:40:54 -08:00
Omar Sandoval
cf9a068820 libdrgn/python: fix reference counting on Type.members and Type.parameters
The TypeMember and TypeParameter instances referring to a libdrgn
drgn_lazy_type are only valid as long as the Type containing them is
still alive. Hold a reference on the containing Type from LazyType. We
can do this without growing LazyType by getting rid of the enum state
and using sentinel values for LazyType::lazy_type as the state.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:09:12 -08:00
Omar Sandoval
738ae2c75f libdrgn: pack struct drgn_object better
We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64)
by moving the bit_offset and little_endian members out of the value and
reference structs.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
abafdd965f Remove bit_offset from value objects
There are a couple of reasons that it was the wrong choice to have a
bit_offset for value objects:

1. When we store a buffer with a bit_offset, we're storing useless
   padding bits.
2. bit_offset describes a location, or in other words, part of an
   address. This makes sense for references, but not for values, which
   are just a bag of bytes.

Get rid of union drgn_value.bit_offset in libdrgn, make
Object.bit_offset None for value objects, and disallow passing
bit_offset to the Object() constructor when creating a value. bit_offset
can still be passed when creating an object from a buffer, but we'll
shift the bytes down as necessary to store the value with no offset.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
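
Shifting the bytes down when a value is created from a buffer with a bit offset can be sketched as follows. This is a simplified model that assumes little-endian byte order and an unsigned value; `value_from_buffer` is a hypothetical helper, not a drgn function.

```python
def value_from_buffer(buf: bytes, bit_size: int, bit_offset: int = 0) -> bytes:
    """Extract bit_size bits starting at bit_offset, stored with no offset.

    Assumes little-endian byte order, where bit 0 is the least
    significant bit of the first byte.
    """
    raw = int.from_bytes(buf, "little")
    value = (raw >> bit_offset) & ((1 << bit_size) - 1)
    # Store only the bytes the value needs; the padding bits are dropped.
    return value.to_bytes((bit_size + 7) // 8, "little")


# A 5-bit field starting at bit 4 of a two-byte buffer.
stored = value_from_buffer(bytes([0b10110000, 0b00000001]), bit_size=5, bit_offset=4)
print(stored, int.from_bytes(stored, "little"))
```

The two-byte input shrinks to a single byte because the four padding bits below the field and the bits above it are no longer stored.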
Omar Sandoval
d495d65108 Use @overload for drgn.Object() constructors
Instead of the current tangle of requirements on arguments, use
overloads.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
c801e5e9b1 drgndoc: format __init__() signature separately from class
Having the signature in the class line is awkward, especially when
__init__() is overloaded. Instead, document __init__() separately, but
refer to it by the name of the class. There might still be a better way
to represent this, but this is at least better than before.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
d7c7094992 drgndoc: fix unnecessary parentheses around tuple subscripts on Python 3.9
Python 3.9 stopped emitting ast.Index nodes, which broke skipping
parentheses around tuples when they're used as subscripts (e.g., for
generic type annotations). Fix it by removing ast.Index nodes in the
pretransformation step on old versions and then handling the new layout
where the ast.Tuple node is directly in ast.Subscript.slice. While we're
here, make sure that we don't skip the parentheses for an empty tuple in
a subscript.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
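
The layout change described in the commit above can be handled by normalizing the slice before inspecting it. This sketch uses only the standard `ast` module; the helper names are hypothetical and are not drgndoc's actual functions.

```python
import ast
import sys


def subscript_slice(node: ast.Subscript) -> ast.AST:
    """Return the slice expression, unwrapping ast.Index on Python < 3.9."""
    slice_node = node.slice
    if sys.version_info < (3, 9) and isinstance(slice_node, ast.Index):
        slice_node = slice_node.value
    return slice_node


def can_skip_parentheses(node: ast.Subscript) -> bool:
    """True for a non-empty tuple subscript such as Dict[str, int]."""
    slice_node = subscript_slice(node)
    # An empty tuple must keep its parentheses: "Tuple[()]", not "Tuple[]".
    return isinstance(slice_node, ast.Tuple) and len(slice_node.elts) > 0


def parse_subscript(source: str) -> ast.Subscript:
    return ast.parse(source, mode="eval").body


for source in ("Dict[str, int]", "Tuple[()]", "List[int]"):
    print(source, can_skip_parentheses(parse_subscript(source)))
```

The version check short-circuits before `ast.Index` is touched on newer interpreters, so the deprecated class is only referenced where it is still produced by the parser.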
Omar Sandoval
22c1d87aec libdrgn: cache page_offset and vmemmap as objects instead of uint64_t
This is a little cleaner and saves on conversions back and forth between
C values and objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:40:07 -08:00
Omar Sandoval
bce9ef5f8d libdrgn: linux kernel: remove THREAD_SIZE object finder
THREAD_SIZE is still broken and I haven't looked into the root cause
(see commit 95be142d17 ("tests: disable THREAD_SIZE test")). We don't
need it anymore anyways, so let's remove it entirely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:08:13 -08:00
Omar Sandoval
51596f4d6c libdrgn: x86-64: remove garbage initial stack frame on old kernels
On old kernels, we set the initial frame as containing only rbp and let
libdwfl unwind it assuming frame pointers from there. This means that
the initial frame has a garbage rip. Follow the frame pointer and set
the previous rbp and return address ourselves instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:02:54 -08:00
Omar Sandoval
6e189027be libdrgn: x86-64: pass frame object as const
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 01:55:36 -08:00