877 Commits

Author SHA1 Message Date
Omar Sandoval
7eab40aaeb libdrgn: rename drgn_error_debug_info() to drgn_error_debug_info_scn()
An upcoming change will introduce a similar function for when the
section isn't known. Rename the original so that the new one can take
its name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-10 02:07:16 -08:00
Omar Sandoval
56c4003db7 setup.py: add 5.12 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-09 13:51:52 -08:00
Jay Kamat
4552d78f4a libdrgn: debug_info: try to find DIE specification when parsing type
Currently, we look up incomplete types by name, which can fail if the
name is ambiguous or the type is unnamed. Try finding the complete type
via the DW_AT_specification map in the DWARF index first.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2021-03-08 15:24:24 -08:00
Jay Kamat
3823b21e17 libdrgn: dwarf_index: use DIE address instead of section offset
To support indexing DWARF 4 type units, we need to be able to
differentiate between DIEs in .debug_info and .debug_types. We can't do
that with just a section offset, so instead store the address of the DIE
in the index and specification map.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2021-03-08 15:24:24 -08:00
Omar Sandoval
ca1a2598fd libdrgn: python: add missing function name to Object.format_() exceptions
The ":function name" is missing from the PyArg_ParseTupleAndKeywords()
call in DrgnObject_format(), so errors say, for example, "'foo' is an
invalid keyword argument for this function" instead of "for format_()".

Fixes: cf3a07bdfb ("libdrgn: python: replace Object.__format__ with Object.format_")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-08 14:27:33 -08:00
Omar Sandoval
a24c0f5b33 libdrgn: clean up usage of drgn_stop
Use drgn_not_found where it's more appropriate, and check explicitly
against drgn_stop instead of err->code == DRGN_ERROR_STOP.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-05 12:46:06 -08:00
Omar Sandoval
4680b93103 libdrgn: improve truncate_signed() and truncate_unsigned()
truncate_signed() requires 5 operations (compute a mask for the lower
bits, AND it, compute the sign extension mask, XOR it, subtract it) and
a branch. We can do it in 3 operations and no branches if we assume that
the compiler does an arithmetic shift for signed integers, which we
already depend on. Then, we can remove sign_extend(), which is the same
as truncate_signed() except it assumes that the upper bits are zero to
save on a couple of operations.

Similarly, for truncate_unsigned() we can remove the branch.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-26 16:05:49 -08:00
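
The shift-based approach described in the commit above can be modeled in Python. This is only a sketch under stated assumptions: the 64-bit width, the function bodies, and the `_as_int64()` helper are illustrative and are not taken from libdrgn. Python integers are unbounded, so the sketch masks to 64 bits explicitly to mimic C's `uint64_t` and `int64_t`; in C the reinterpretation is a cast, not a branch.

```python
MASK64 = (1 << 64) - 1


def _as_int64(value):
    """Reinterpret the low 64 bits of value as a two's complement int64.

    Hypothetical helper: stands in for a C cast to int64_t.
    """
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def truncate_signed(value, bit_size):
    """Keep the low bit_size bits of value and sign-extend them.

    Models the C expression (int64_t)((uint64_t)value << shift) >> shift,
    which relies on >> being an arithmetic shift for signed integers.
    """
    assert 1 <= bit_size <= 64
    shift = 64 - bit_size
    return _as_int64(value << shift) >> shift


def truncate_unsigned(value, bit_size):
    """Keep the low bit_size bits of value and zero the rest."""
    assert 1 <= bit_size <= 64
    shift = 64 - bit_size
    return ((value << shift) & MASK64) >> shift
```

Shifting left discards the unwanted upper bits, and shifting back right refills them with either the sign bit (signed) or zeros (unsigned), so no mask needs to be computed per call.
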
Omar Sandoval
b5ed892481 Fix some include-what-you-use warnings and update for Bear 3
Bear 3 changed the CLI arguments, so update scripts/iwyu.py for it and
clean up some new warnings.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-26 16:05:49 -08:00
Omar Sandoval
98e1947d26 libdrgn: require non-NULL drgn_architecture_info::register_by_name
Instead of checking whether it's NULL, define a stub for
arch_info_unknown.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-26 16:05:49 -08:00
Omar Sandoval
25eb2abb1a libdrgn: add drgn_platform getters
Add low-level getters equivalent to the drgn_program platform-related
helpers and use them in places where we have checked or can assume that
the platform is known.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-26 16:05:49 -08:00
Omar Sandoval
e04eda9880 libdrgn: define HOST_LITTLE_ENDIAN
As a minor cleanup, instead of writing __BYTE_ORDER__ ==
__ORDER_LITTLE_ENDIAN__ everywhere, define and use HOST_LITTLE_ENDIAN.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-26 16:05:49 -08:00
Jay Kamat
c22e501295 libdrgn: debug_info: fix parsing specifications of declarations
drgn_compound_type_from_dwarf() and drgn_enum_type_from_dwarf() check
the DW_AT_declaration flag to decide whether the type is a declaration
of an incomplete type or a definition of a complete type. However, they
check DW_AT_declaration with dwarf_attr_integrate(), which follows the
DW_AT_specification reference if it is present. The DIE referenced by
DW_AT_specification typically is a declaration, so this erroneously
identifies definitions as declarations. Additionally, if
drgn_debug_info_find_complete() finds the same definition, we can end up
recursing until we hit the DWARF parsing depth limit. Fix it by not
using dwarf_attr_integrate() for DW_AT_declaration.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2021-02-25 10:46:34 -08:00
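
The misidentification described in the commit above can be illustrated with a toy model. The dictionaries and the two lookup functions below are hypothetical stand-ins, not the elfutils API: `attr()` looks only at the DIE itself, while `attr_integrate()` also follows `DW_AT_specification`, a simplification of the behavior the commit attributes to `dwarf_attr_integrate()`.

```python
def attr(die, name):
    """Look up an attribute on this DIE only."""
    return die["attrs"].get(name)


def attr_integrate(die, name):
    """Look up an attribute, following DW_AT_specification references.

    Simplified model: only DW_AT_specification is followed here.
    """
    while die is not None:
        value = die["attrs"].get(name)
        if value is not None:
            return value
        die = die["attrs"].get("DW_AT_specification")
    return None


# A declaration of an incomplete type, and the definition that refers to it.
declaration = {"attrs": {"DW_AT_name": "foo", "DW_AT_declaration": True}}
definition = {"attrs": {"DW_AT_specification": declaration, "DW_AT_byte_size": 8}}

# Following the reference finds the flag on the declaration (the bug).
buggy_is_declaration = bool(attr_integrate(definition, "DW_AT_declaration"))
# Checking only the DIE itself gives the right answer (the fix).
fixed_is_declaration = bool(attr(definition, "DW_AT_declaration"))
```

Following the reference is still the right choice for attributes such as the name, which the definition legitimately inherits from its declaration.
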
Omar Sandoval
85dec2b8f6 tests: move C-specific tests from test_object to test_language_c
TestCLiteral, TestCIntegerPromotion, TestCCommonRealType,
TestCOperators, and TestCPretty in test_object all test various
operations on objects, but since they're testing language-specific
behavior, they belong in test_language_c.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 16:11:19 -08:00
Omar Sandoval
aaa98ccae3 libdrgn: consistently use __ for __attribute__ names
In some places, we add __ preceding and following an attribute name, and
in some places, we don't. Let's make it consistent. We might as well opt
for the __ to make clashes with macros less likely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 03:16:23 -08:00
Omar Sandoval
0e1f85516a vmtest: manage: use str() instead of repr() for Path in error message
Otherwise, the path is formatted as "PosixPath('...')", which is ugly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 03:08:36 -08:00
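
A small illustration of the difference the commit above refers to. The path and message text are made up for the example, and `PurePosixPath` is used so the snippet behaves the same on any platform; with `PosixPath` the repr prefix would be `PosixPath(...)` as quoted in the commit.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/tmp/build/vmlinuz")  # hypothetical path

# repr() exposes the class name, which is noise in a user-facing message.
with_repr = f"{path!r} does not exist"
# str() (the default for f-strings) yields just the path.
with_str = f"{path} does not exist"
```
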
Omar Sandoval
006b62e530 vmtest: manage: use real name for logging, not asyncio
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 03:08:36 -08:00
Omar Sandoval
bc06f3ae59 vmtest: manage: delete temporary install directory
The install directory contains redundant copies of the modules already
in the build tree and built package, so clean it up on success.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 03:08:36 -08:00
Omar Sandoval
c54ef80412 libdrgn: add missing LIBDRGN_PUBLIC exports
drgn_object_dereference_offset() and drgn_object_member_dereference()
are both in drgn.h.in but aren't exported. They should be.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 02:42:12 -08:00
Omar Sandoval
3ecb31de9f libdrgn: update stale references in drgn_object_slice() comment
drgn_program_member_info() was replaced by drgn_type_find_member() in
commit e72ecd0e2c ("libdrgn: replace drgn_program_member_info() with
drgn_type_find_member()"). drgn_object_pointer_offset() never existed;
it's supposed to be drgn_object_dereference_offset().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 02:41:20 -08:00
Omar Sandoval
da1e72f0d5 libdrgn: remove drgn_{,qualified_}type_eq() from drgn.h.in
The definitions were removed but these public declarations weren't.

Fixes: 7d7aa7bf7b ("libdrgn/python: remove Type == operator")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 02:37:36 -08:00
Omar Sandoval
55e3a58e06 libdrgn: python: use correct member offset when creating object from value
We need to use the offset of the member in the outermost object type,
not the offset in the immediate containing type in the case of nested
anonymous structs.

Fixes: e72ecd0e2c ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-21 02:29:59 -08:00
Omar Sandoval
9fda010789 Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:

- Byte order is only meaningful for scalars. What does it mean for a
  struct to be big endian? A struct doesn't have a most or least
  significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
  byte order (DW_AT_endianity). The only producer I could find that uses
  this is GCC for the scalar_storage_order type attribute, and it only
  uses it for base types, not variables. GDB only seems to check it for
  base types, as well.

So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.

The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-19 21:41:29 -08:00
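
The model in the commit above, where byte order belongs to scalar types and not to objects, might be sketched as follows. The `IntType` and `Member` classes and the `read_member()` function are hypothetical names for illustration and are not drgn's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntType:
    size: int        # size in bytes
    signed: bool
    byteorder: str   # "little" or "big"; a property of the scalar type


@dataclass(frozen=True)
class Member:
    name: str
    type: IntType
    offset: int      # byte offset within the containing struct


def read_member(buf, member):
    """Decode one scalar member using the byte order of its own type."""
    raw = buf[member.offset:member.offset + member.type.size]
    return int.from_bytes(raw, member.type.byteorder, signed=member.type.signed)


le_u32 = IntType(4, False, "little")
be_u32 = IntType(4, False, "big")
members = [Member("a", le_u32, 0), Member("b", be_u32, 4)]

# The struct is just bytes; it has no byte order of its own.
buf = bytes([1, 0, 0, 0, 0, 0, 0, 2])
values = {m.name: read_member(buf, m) for m in members}
```

Because the struct carries no byte order, its binary representation is simply `buf`, while each scalar member is still decoded correctly.
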
Omar Sandoval
72b4aa9669 libdrgn: clean up object initialization
Rename struct drgn_object_type to struct drgn_operand_type, add a new
struct drgn_object_type which contains all of the type-related fields
from struct drgn_object, and use it to implement drgn_object_type() and
drgn_object_type_operand(), which are replacements for
drgn_object_set_common() and drgn_object_type_encoding_and_size(). This
cleans up a lot of the boilerplate around initializing objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-19 17:43:14 -08:00
Omar Sandoval
78316a28fb libdrgn: remove half-baked support for complex types
We've nominally supported complex types since commit 75c3679147
("Rewrite drgn core in C"), but parsing them from DWARF has been
incorrect from the start (they don't have a DW_AT_type attribute like we
assume), and we never implemented proper support for complex objects.
Drop the partial implementation; we can bring it back (properly) if
someone requests it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-17 14:56:33 -08:00
Omar Sandoval
f09ab62b73 drgn 0.0.9
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-17 02:19:09 -08:00
Omar Sandoval
9a066b409f docs: mention that default arguments are not yet parsed from DWARF
TypeParameter.default_argument is currently basically a placeholder
because we don't parse it from DWARF and compilers don't emit it, so
document that. See #82.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-17 02:16:23 -08:00
Omar Sandoval
36df5fc076 libdrgn: ppc64: fix fetching cr fields from pt_regs
The condition register fields are numbered from most significant to
least significant. Also, the CFI for unwinding the condition register
fields restores them in their position in the condition register, so do
the same when initially populating them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-17 00:45:14 -08:00
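
The field numbering in the commit above can be sketched as follows, assuming a 32-bit condition register made of eight 4-bit fields, cr0 through cr7. The function name is hypothetical; it returns the field left in its original bit position, matching what the commit says the CFI does.

```python
def cr_field_in_place(cr, n):
    """Return field crN of a 32-bit condition register value.

    cr0 is the most significant 4 bits and cr7 the least significant.
    The result keeps the field at its position within the register.
    """
    assert 0 <= n <= 7
    shift = 28 - 4 * n
    return cr & (0xF << shift)
```

For example, with `cr = 0x12345678`, field cr0 is `0x10000000` and field cr7 is `0x8`.
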
Omar Sandoval
547333d8ca docs: document GNU Awk version dependency
gen_arch.awk uses PROCINFO["sorted_in"] and arrays of arrays, both of
which were introduced in GNU Awk 4.0 according to
https://www.gnu.org/software/gawk/manual/html_node/Feature-History.html.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-09 16:50:58 -08:00
Kamalesh Babulal
221a218704 libdrgn: add powerpc stack trace support
Add powerpc-specific register information required to retrieve the
stack traces of tasks on both a live system and from a core dump.
It uses the existing DSL format to define platform registers and
helper functions to initialize them. It also adds architecture-specific
information to enable powerpc. Current support is for little-endian
powerpc only.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
2021-01-29 11:31:59 -08:00
Omar Sandoval
b899a10836 Remove register numbers from API and add register aliases
enum drgn_register_number in the public libdrgn API and
drgn.Register.number in the Python bindings are basically exports of
DWARF register numbers. They only exist as a way to identify registers
that's lighter weight than string lookups. libdrgn already has struct
drgn_register, so we can use that to identify registers in the public
API and remove enum drgn_register_number. This has a couple of benefits:
we don't depend on DWARF numbering in our API, and we don't have to
generate drgn.h from the architecture files. The Python bindings can
just use string names for now. If it seems useful, StackFrame.register()
can take a Register in the future; we'll just need to be careful not to
allow Registers from the wrong platform.

While we're changing the API anyways, also change it so that registers
have a list of names instead of one name. This isn't needed for x86-64
at the moment, but will be for architectures that have multiple names
for the same register (like ARM).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-28 17:47:45 -08:00
Omar Sandoval
10e6464769 libdrgn: python: clean up module creation
Add a helper based on PyModule_AddType() from Python 3.9 and use it to
simplify PyInit__drgn(). Also handle errors in PyInit__drgn().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-28 12:41:13 -08:00
Omar Sandoval
0d35dec8ee libdrgn: python: define Py_RETURN_BOOL
And use it instead of an if statement with
Py_RETURN_TRUE/Py_RETURN_FALSE or PyBool_FromLong().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-28 11:35:09 -08:00
Omar Sandoval
46343ae08d libdrgn: get rid of struct drgn_stack_frame
In preparation for adding a "real", internal-only struct
drgn_stack_frame, replace the existing struct drgn_stack_frame with
explicit trace/frame arguments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-27 11:22:34 -08:00
Omar Sandoval
71c6ac6927 libdrgn: use drgn_debug_info_module instead of Dwfl_Module in more places
It's easier to go from drgn_debug_info_module to Dwfl_Module than the
other direction, and I'd rather use the "higher-level"
drgn_debug_info_module wherever possible. So, store
drgn_debug_info_module in the DWARF index (which also saves a
dereference while building the index), and pass around
drgn_debug_info_module when parsing types/objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-27 11:17:41 -08:00
Omar Sandoval
9e3b3a36cf setup.py: fix black error
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:45:11 -08:00
Omar Sandoval
7c670d6faa Remove unused 'type: ignore' comment for pkgutil.read_code()
mypy 0.800 has a stub for pkgutil.read_code(), so we don't need to
ignore it anymore.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:27:10 -08:00
Omar Sandoval
2977bee278 Fix reexport of drgn.__version__
mypy 0.800 is stricter about reexports: "from foo import X as Y" is only
considered a reexport if X and Y are the same name (see
python/mypy#9515). mypy 0.800 fails with:

  drgn/internal/cli.py:46: error: Module has no attribute "__version__"

Rename drgn.internal.version.version to __version__ so that
drgn/__init__.py can reexport it with import __version__ as __version__.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-22 11:23:06 -08:00
Omar Sandoval
09a9220c60 setup.py: add 5.11 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 16:59:57 -08:00
Omar Sandoval
0a396d60f3 helpers: fix block device helpers on v5.11
Linux v5.11 combined struct block_device and struct hd_struct, which
breaks for_each_disk(), for_each_partition(), part_devt(), and
part_name(). Update the helpers to handle the new scheme.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 16:56:03 -08:00
Omar Sandoval
bbefc573d8 libdrgn: debug_info: make sure DW_TAG_template_value_parameter has value
Otherwise, an invalid DW_TAG_template_value_parameter can be confused
for a type parameter.

Fixes: 352c31e1ac ("Add support for C++ template parameters")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 12:07:46 -08:00
Omar Sandoval
5f170ea3f3 helpers: add per_cpu()
The correct way to access global per-CPU variables
(per_cpu_ptr(prog[name].address_of_(), cpu)) has been a common source of
confusion (see #77). Add an analogue to the per_cpu() macro in the
kernel as a shortcut and document it as the easiest method for getting a
global per-CPU variable: per_cpu(prog[name], cpu).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 11:40:05 -08:00
Omar Sandoval
2c612ea97f libdrgn: fix address of global per-CPU variables with KASLR
The address of a per-CPU variable is really an offset into the per-CPU
area, but we're applying the load bias (i.e., KASLR offset) to it as if
it were an address, resulting in an invalid pointer when it's eventually
passed to per_cpu_ptr().

Fix this by applying the bias only if the address is in the module's
address range. This heuristic avoids any Linux kernel-specific logic;
hopefully it doesn't have any undesired side effects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:14:50 -08:00
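
The heuristic from the commit above might look like the following sketch. The function name and parameters are hypothetical, and checking the biased address against a half-open loaded range is an assumption; the commit only says that the bias is applied if the address is in the module's address range.

```python
def biased_address(address, bias, module_start, module_end):
    """Apply the load bias only to addresses that belong to the module.

    Assumption: [module_start, module_end) is the loaded (biased) range.
    """
    candidate = address + bias
    if module_start <= candidate < module_end:
        return candidate
    # Probably not a real address (e.g., a per-CPU offset): leave it alone.
    return address
```

With a made-up bias of `0x1000000` and a loaded range starting at `0xffffffff82000000`, a kernel text address is relocated, while a small per-CPU offset such as `0x1a000` is returned unchanged and can then be passed to per_cpu_ptr().
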
Omar Sandoval
a7962e9477 libdrgn: debug_info: pass around Dwfl_Module instead of bias
We're going to need the module start and end in
drgn_object_from_dwarf_variable(), so pass the Dwfl_Module around and
get the bias when we need it. This means we don't need the bias from
drgn_dwarf_index_get_die(), so get rid of that, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:12:29 -08:00
Omar Sandoval
81a203c48f helpers: fix for_each_{possible,online,present}_cpu() on v4.4
Also reorder the definitions to alphabetical order and add tests.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-21 10:08:48 -08:00
Omar Sandoval
048952f9a6 libdrgn: x86-64: fix rsp of initial stack frame
We're using task->thread.sp for rsp in the initial frame for both the
struct inactive_task_frame path and frame pointer path. This is not
correct for either.

For kernels with struct inactive_task_frame, task->thread.sp points to
the struct inactive_task_frame. The stack pointer in the initial
frame is the address immediately after the struct inactive_task_frame.

For kernels without struct inactive_task_frame, task->thread.sp points
to the saved rbp. We follow that rbp to the rbp and return address for
the initial frame; its stack pointer is the address immediately after
those.

Fixes: 10142f922f ("Add basic stack trace support")
Fixes: 51596f4d6c ("libdrgn: x86-64: remove garbage initial stack frame on old kernels")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-15 10:57:08 -08:00
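
The two cases in the commit above reduce to simple address arithmetic. The sketch below uses hypothetical function names, a caller-supplied `read_u64()` memory reader, and assumes 8-byte stack slots with the return address stored directly above the saved rbp; it is not libdrgn's code.

```python
def initial_rsp_with_inactive_task_frame(thread_sp, frame_size):
    """thread_sp points to the struct inactive_task_frame; the initial
    frame's stack pointer is the address immediately after it."""
    return thread_sp + frame_size


def initial_frame_without_inactive_task_frame(read_u64, thread_sp):
    """thread_sp points to the saved rbp; follow it to the rbp and return
    address of the initial frame."""
    frame_pointer = read_u64(thread_sp)       # the saved rbp
    rbp = read_u64(frame_pointer)             # rbp for the initial frame
    return_address = read_u64(frame_pointer + 8)
    rsp = frame_pointer + 16                  # immediately after those two
    return rbp, return_address, rsp


# Toy stack contents, keyed by address (values are made up).
memory = {0x1000: 0x2000, 0x2000: 0x3000, 0x2008: 0xFFFFFFFF81234567}
frame = initial_frame_without_inactive_task_frame(memory.__getitem__, 0x1000)
```
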
Omar Sandoval
277c34e876 CONTRIBUTING: add guidelines for good commits
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-12 16:56:54 -08:00
Omar Sandoval
352c31e1ac Add support for C++ template parameters
Add struct drgn_type_template_parameter to libdrgn, the corresponding
TypeTemplateParameter to the Python bindings, and support for parsing
them from DWARF.

With this, support for templates is almost, but not quite, complete. The
main wart is that DW_AT_name of compound types includes the template
parameters, so the type tag includes it as well. We should remove that
from the tag and instead have the type formatting code add it only when
getting the full type name.

Based on a patch from Jay Kamat.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
b6958f920c libdrgn: debug_info: move object parsing code in debug_info.c
In preparation for calling the object parsing code from the type parsing
code, move it up in the file (and update the coding style in
drgn_object_from_dwarf_enumerator() while we're at it).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
be1bb279aa libdrgn: debug_info: pass DIE bias when parsing types
This will be needed for types containing reference objects.

Based on a patch from Jay Kamat.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00
Omar Sandoval
d35243b354 libdrgn: replace lazy types with lazy objects
In order to support static members, methods, default function arguments,
and value template parameters, we need to be able to store a drgn_object
in a drgn_type_member or drgn_type_parameter. These are all cases where
we want lazy evaluation, so we can replace drgn_lazy_type with a new
drgn_lazy_object which implements the same idea but for objects. Types
can still be represented with an absent object.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-01-08 17:39:51 -08:00