Commit Graph

808 Commits

Author SHA1 Message Date
Omar Sandoval
fd04463596 libdrgn/python: add Type.member()
In Python, looking up a member in a drgn Type by name currently looks
something like:

  member = [member for member in type.members if member.name == "foo"][0]

Add a Type.member(name) method, which is both easier and more efficient.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 16:10:23 -08:00
Omar Sandoval
e72ecd0e2c libdrgn: replace drgn_program_member_info() with drgn_type_find_member()
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:40:54 -08:00
Omar Sandoval
cf9a068820 libdrgn/python: fix reference counting on Type.members and Type.parameters
The TypeMember and TypeParameter instances referring to a libdrgn
drgn_lazy_type are only valid as long as the Type containing them is
still alive. Hold a reference on the containing Type from LazyType. We
can do this without growing LazyType by getting rid of the enum state
and using sentinel values for LazyType::lazy_type as the state.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:09:12 -08:00
Omar Sandoval
738ae2c75f libdrgn: pack struct drgn_object better
We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64)
by moving the bit_offset and little_endian members out of the value and
reference structs.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
abafdd965f Remove bit_offset from value objects
There are a couple of reasons that it was the wrong choice to have a
bit_offset for value objects:

1. When we store a buffer with a bit_offset, we're storing useless
   padding bits.
2. bit_offset describes a location, or in other words, part of an
   address. This makes sense for references, but not for values, which
   are just a bag of bytes.

Get rid of union drgn_value.bit_offset in libdrgn, make
Object.bit_offset None for value objects, and disallow passing
bit_offset to the Object() constructor when creating a value. bit_offset
can still be passed when creating an object from a buffer, but we'll
shift the bytes down as necessary to store the value with no offset.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
d495d65108 Use @overload for drgn.Object() constructors
Instead of the current tangle of requirements on arguments, use
overloads.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
c801e5e9b1 drgndoc: format __init__() signature separately from class
Having the signature in the class line is awkward, especially when
__init__() is overloaded. Instead, document __init__() separately, but
refer to it by the name of the class. There might still be a better way
to represent this, but this is at least better than before.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
d7c7094992 drgndoc: fix unnecessary parentheses around tuple subscripts on Python 3.9
Python 3.9 stopped emitting ast.Index nodes, which broke skipping
parentheses around tuples when they're used as subscripts (e.g., for
generic type annotations). Fix it by removing ast.Index nodes in the
pretransformation step on old versions and then handling the new layout
where the ast.Tuple node is directly in ast.Subscript.slice. While we're
here, make sure that we don't skip the parentheses for an empty tuple in
a subscript.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
22c1d87aec libdrgn: cache page_offset and vmemmap as objects instead of uint64_t
This is a little cleaner and saves on conversions back and forth between
C values and objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:40:07 -08:00
Omar Sandoval
bce9ef5f8d libdrgn: linux kernel: remove THREAD_SIZE object finder
THREAD_SIZE is still broken and I haven't looked into the root cause
(see commit 95be142d17 ("tests: disable THREAD_SIZE test")). We don't
need it anymore anyways, so let's remove it entirely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:08:13 -08:00
Omar Sandoval
51596f4d6c libdrgn: x86-64: remove garbage initial stack frame on old kernels
On old kernels, we set the initial frame as containing only rbp and let
libdwfl unwind it assuming frame pointers from there. This means that
the initial frame has a garbage rip. Follow the frame pointer and set
the previous rbp and return address ourselves instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:02:54 -08:00
Omar Sandoval
6e189027be libdrgn: x86-64: pass frame object as const
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 01:55:36 -08:00
Omar Sandoval
3187453689 libdrgn: x86-64: remove unused read
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 01:38:11 -08:00
Omar Sandoval
ffa2e0acf1 libdrgn: add missing break in drgn_object_copy()
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-09 10:20:37 -08:00
Omar Sandoval
c4b0d313ac README: update Travis CI badge to travis-ci.com
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 14:38:15 -08:00
Omar Sandoval
97fbedec1f libdrgn: return unavailable objects for DWARF objects without value or address
Now that we have the concept of unavailable objects, use it for DWARF
where appropriate.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 14:15:09 -08:00
Omar Sandoval
6bd0c2b4d2 libdrgn: add concept of "unavailable" objects
There are some situations where we can find an object but can't
determine its value, like local variables that have been optimized out,
inlined functions without a concrete instance, and pure virtual methods.
It's still useful to get some information from these objects, namely
their types. Let's add the concept of an "unavailable" object, which is
an object with a known type but unknown value/address.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 13:58:19 -08:00
Omar Sandoval
5f17281926 libdrgn: make drgn_object::is_reference an enum
To prepare for a new kind of object, replace the is_reference bool with
an enum drgn_object_kind.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 13:37:58 -08:00
Omar Sandoval
edb1fe7f2f libdrgn: rename drgn_object_kind to drgn_object_encoding
I'd like to use the name drgn_object_kind to distinguish between values
and references. "Encoding" is more accurate than "kind", anyways.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 12:02:26 -08:00
Omar Sandoval
2710b4d2aa libdrgn: add macros for strict enum switch statements
There are several places where we'd like to enforce that every
enumeration is handled in a switch. Add SWITCH_ENUM() and
SWITCH_ENUM_DEFAULT() macros for that and use them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 12:02:23 -08:00
Omar Sandoval
a4dbd7bf95 libdrgn: remove unused DRGN_NUM_ARCH
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-04 12:02:23 -08:00
Omar Sandoval
e39d6cec68 travis: add Python 3.9
Python 3.9 was released back in October. No changes to drgn are
required.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-02 11:26:41 -08:00
Omar Sandoval
3360170336 libdrgn: only install page table memory reader when supported
If virtual address translation isn't implemented for the target
architecture, then we shouldn't add the page table memory reader. If we
do, we get a DRGN_ERROR_INVALID_ARGUMENT error from
linux_helper_read_vm() instead of a DRGN_ERROR_FAULT error as expected.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-27 01:27:30 -08:00
Omar Sandoval
5975d19580 libdrgn: report better errors when parsing DWARF/kmod index
If the DWARF index encounters any error while parsing, it returns an
error saying only "debug information is truncated", which makes it hard
to track down parsing errors. The kmod index parser silently swallows
errors. For both, replace the mread functions with a higher-level
binary_buffer interface that can include more information including the
location of the error. For example:

  /tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-13 17:00:07 -08:00
Omar Sandoval
756e5d27ad libdrgn: debug_info: put sections in an array (again)
Back in commit 9ce9094ee0 ("libdrgn: dwarf_index: don't copy sections
into each CU"), I changed the sections to be individual members. The
next change will be easier if they're in an array.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-11 16:22:04 -08:00
Omar Sandoval
f0a3629c26 libdrgn: debug_info: add dwarf_tag_str() and use it for error messages
There are several places where we manually pass around the string name
of a tag so it can be used for error messages. Do it programatically
instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-11 16:22:04 -08:00
Omar Sandoval
3885697696 drgn 0.0.8
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-11 13:32:04 -08:00
Omar Sandoval
dc867781e3 docs: remove stale comment in Program.pointer_type()
When pointer_type() became a Program method, I forgot to remove the
reference to the old pointer_type() method.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-11 11:56:18 -08:00
Omar Sandoval
2933a41ce1 setup.py: add 5.10 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-11 11:29:11 -08:00
Omar Sandoval
388b6a1090 Generate version.py instead of using pkg_resources
We get the version of drgn with pkg_resources.get_distribution() in two
places: setup.py (when using an sdist) and the CLI. The former causes
problems because in some cases, pip doesn't find the drgn distribution
that's currently being built. The latter adds significant latency to
startup. On my laptop, just importing pkg_resources takes 130 ms. We can
solve both of these problems by generating a file containing the version
instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-20 02:40:16 -07:00
Omar Sandoval
e7caa24176 tests: test kernel module debug info loading
Now that vmtest supports kernel modules, test that we load them
correctly.

Closes #74.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-18 01:19:27 -07:00
Omar Sandoval
4431b4f918 vmtest: enable kernel modules
We currently build with CONFIG_MODULES=n for simplicity. However, this
means that we don't test kernel module support at all. Let's enable
module support. This requires changing how we distribute kernels. Now,
the /lib/modules/$(uname -r) directory (including the vmlinux and
vmlinuz) is bundled up as a tarball. We extract it, then mount it with
VirtFS, and do some extra setup for device nodes. (We lose the ability
to run kernel builds directly, but I've never actually used that
functionality.)

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-18 01:13:01 -07:00
Omar Sandoval
198e7d6999 setup.py: drop 5.5 from vmtest kernels
Linux 5.5 built by GCC 10 doesn't boot due to new optimizations breaking
how the kernel sets up the stack canary
(https://lkml.org/lkml/2020/3/14/186 has the details). 5.5 has been EOL
since April, so the fix was never backported. We could backport it
ourselves for vmtest builds or build with GCC 9, but it's not worth the
extra effort to test an EOL kernel. Let's just drop it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-15 12:50:53 -07:00
Omar Sandoval
de98b40495 vmtest: use pathlib
pathlib is a bit nicer than os.path for most cases, so try it out
starting with vmtest.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-14 11:37:31 -07:00
Omar Sandoval
fa081e32b9 libdrgn: update module section iterator for Linux v5.8
Linux v5.8 changed the module section structure, so we need to get the
section name differently.

Closes #73.

Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 13:07:36 -07:00
Omar Sandoval
1c6465f0b0 libdrgn: fix infinite loop on error caching kernel module sections
If cache_kernel_module_sections() in report_loaded_kernel_module()
fails, we continue to the next iteration without advancing to the next
kernel module. Then, we fail on that same kernel module and repeat. Make
sure that we go to the next kernel module.

Fixes: 423d2cd500 ("libdrgn: dwarf_index: rework file reporting")
Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 12:25:22 -07:00
Omar Sandoval
431b91ddb5 libdrgn: fix use-after-free in kernel module reporting error case
We're freeing path and then using it to report an error.

This has some weird knock-on effects. Since we freed the path, the error
message contains garbage. So, PyErr_SetString() can't decode it as a
UTF-8 string. The end result is a MissingDebugInfoError with no message.

Fix it by creating the error before freeing the path.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 11:55:06 -07:00
Omar Sandoval
2b325b9262 libdrgn: add an environment variable to disable use of /proc/modules and /sys/module
We use /proc/modules and /sys/module to find loaded kernel modules for
the running kernel instead of walking the module list in the core dump
as an optimization. To make it easier to test the core dump path, add an
environment variable to disable the optimization.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 11:24:39 -07:00
Omar Sandoval
661a5c56c3 libdrgn: refactor kernel module iterator
The next commit will allow using the offline path for the live kernel,
so the offline naming won't make much sense. Fold the offline path into
the top-level functions, and make the live path an escape hatch. Also
add some comments and improve naming for the file and directory handles
and update the coding style.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 23:22:01 -07:00
Omar Sandoval
ce8540e39c libdrgn: get rid of kernel_module_iterator::notes*
These were added in commit e5874ad18a ("libdrgn: use libdwfl"), but
they have never been used. Remove them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 17:23:31 -07:00
Omar Sandoval
3c5d22637e libdrgn: clean up hash function APIs and improve documentation
Use *_hash_pair() for hash functions that do the full double hashing and
return a struct hash_pair and hash_*() for other hashing utility
functions. Also change some of the equality function names to be more
symmetric and improve the documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 16:20:08 -07:00
Omar Sandoval
761da83ddd libdrgn: add {min,max}_iconst() and rewrite min() and max()
min() and max() from the Linux kernel go through the trouble of
resulting in a constant expression if the arguments are constant
expressions, but they can't be used outside of a function due to their
use of ({ }). This means that they can't be used for, e.g., enumerators
or global arrays. Let's simplify min() and max() and instead add
explicit min_iconst() and max_iconst() macros that can be used
everywhere that an integer constant expression is required. We can then
use it in hash_table.h. While we're here, let's split these into their
own header file and document them better.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-10 23:48:03 -07:00
Omar Sandoval
fa44171ba1 libdrgn: split bit operations into their own header
And improve their documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-09 17:44:15 -07:00
Omar Sandoval
cae79d2676 libdrgn: add preprocessor utility macros
These will be used in upcoming changes.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-09 16:36:59 -07:00
Omar Sandoval
4cbb9b552a libdrgn: fix comparison of types with anonymous members
drgn_type_members_eq() skips comparing the types of anonymous members.
Fix that and add a test for it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-08 17:32:46 -07:00
Omar Sandoval
de6a4e07ae libdrgn: fix Doxygen
The Doxygen documentation for libdrgn has bit-rotted over time. Bring
back the Internal module, clean up a few renamed members and parameters,
and fix broken parsing caused by the generic definition macros.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-30 01:32:33 -07:00
Omar Sandoval
2704fd17c3 libdrgn: update Doxyfile
doxygen warns about a few obsolete Doxyfile options. Update it with
doxygen -u.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-29 23:04:08 -07:00
Omar Sandoval
d829401e5f vmtest: also disable onoatimehack on QEMU 5.0.1
The fix was backported to QEMU's 5.0 stable branch and released in
5.0.1.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-23 16:42:15 -07:00
Omar Sandoval
286c09844e Clean up #includes with include-what-you-use
I recently hit a couple of CI failures caused by relying on transitive
includes that weren't always present. include-what-you-use is a
Clang-based tool that helps with this. It's a bit finicky and noisy, so
this adds scripts/iwyu.py to make running it more convenient (but not
reliable enough to automate it in Travis).

This cleans up all reasonable include-what-you-use warnings and
reorganizes a few header files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-23 16:29:42 -07:00
Omar Sandoval
fdbe336386 libdrgn: use -isystem for elfutils headers
The elfutils header files should be treated as if they were in the
standard location, so use -isystem instead of -I.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-22 15:45:10 -07:00