974 Commits

Author SHA1 Message Date
Omar Sandoval
c768e97394 libdrgn: python: use _Thread_local instead of PyThreadState for drgn_in_python
Using a Python dictionary for this is much more heavyweight than just
using a thread-local variable (with no benefit as far as I can tell).
This also gets rid of a call to _PyDict_GetItem().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-22 01:17:12 -07:00
Omar Sandoval
08498967f7 libdrgn: configure with large file support
/proc/pid/mem is indexed by address. On 32-bit systems, addresses may be
out of the range of a 32-bit signed off_t. This results in pread()
returning EINVAL in drgn_read_memory_file(). Use AC_SYS_LARGEFILE in
configure.ac so that we use 64-bit off_t by default.

Closes #98.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-21 13:34:31 -07:00
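Illustration (not part of the commit): a minimal Python sketch of the range problem described above. A signed 32-bit off_t tops out at 2**31 - 1, so an address in the upper half of a 32-bit address space cannot be passed as a file offset. The helper name fits_in_off_t is hypothetical.

```python
def fits_in_off_t(address: int, off_t_bits: int) -> bool:
    """Return True if address is representable as a signed off_t of the given width."""
    return 0 <= address <= (1 << (off_t_bits - 1)) - 1


# An address in the upper half of a 32-bit address space.
addr = 0xC0000000
needs_large_file_support = not fits_in_off_t(addr, 32)
ok_with_64_bit_off_t = fits_in_off_t(addr, 64)
```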
Davide Cavalca
6332f9846a examples: add missing shebangs
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
2021-04-20 11:20:06 -07:00
Davide Cavalca
31cbc187ce add COPYING to sdist
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
2021-04-20 11:17:51 -07:00
Omar Sandoval
738261290f CI: temporarily disable vmtest
With the added Clang tests, apparently vmtest is generating excessive
traffic on Dropbox. Disable it on GitHub Actions until I can work out a
new solution.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-04 23:06:19 -07:00
Omar Sandoval
78b4188dd9 drgn 0.0.11
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:50:09 -07:00
Omar Sandoval
e7367a4a94 libdrgn: Makefile: remove generated source files from CLEANFILES
We don't actually want make clean to remove the generated files that are
included in a distribution tarball, because then the user will need to
regenerate them, and they might not have the dependencies installed.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:31:14 -07:00
Omar Sandoval
a4b9d68a8c Use GPL-3.0-or-later license identifier instead of GPL-3.0+
Apparently the latter is deprecated and the former is preferred.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:10:35 -07:00
Omar Sandoval
c7dc814978 CI: build with -Wall -Werror
This is the documented way that drgn should be built for development, so
let's enforce it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 17:28:09 -07:00
Omar Sandoval
113b2700a8 CI: test with GCC and Clang
Every time I try to build drgn with Clang, there are a few things that
need fixing. Let's test it so that it stays in good shape.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 17:28:05 -07:00
Omar Sandoval
76d3348a6d libdrgn: hash_table: mark table##_delete_iterator() as unused
GCC doesn't warn about table##_delete_iterator() being unused because it
is inline, but Clang does, so add the unused attribute.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 16:28:46 -07:00
Omar Sandoval
acf722d315 libdrgn: hash_table: remove unused table##_chunk_set_capacity_scale()
The folly implementation calls this elsewhere, but we only need it in
table##_chunk_mark_eof(), so it was folded in there.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 16:27:27 -07:00
Omar Sandoval
b772432a86 libdrgn: cfi: don't rely on member containing a flexible array
Clang enables -Wgnu-variable-sized-type-not-at-end by default, which
warns for DRGN_CFI_ROW():

  arch_x86_64.c:735:27: warning: field 'row' with variable sized type 'struct drgn_cfi_row' not at the end of a struct or class is a GNU extension
        [-Wgnu-variable-sized-type-not-at-end]
          .default_dwarf_cfi_row = DRGN_CFI_ROW(

DRGN_CFI_ROW() is gnarly anyways, so instead of having it expand to a
pointer expression relying on this GCC extension, make it expand to an
initializer. Then, we can initialize default_dwarf_cfi_row as a separate
variable rather than directly in the initializer for struct
drgn_architecture_info.

This still relies on a GCC extension for static initialization of
flexible array members, but apparently Clang is okay with that one by
default (-Wgnu-flexible-array-initializer must be enabled explicitly or
by -Wgnu or -Wpedantic).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 16:19:21 -07:00
Omar Sandoval
5c86e30b6e libdrgn: work around Clang __muloti4 for the third time
See commit 0cb77b303c ("libdrgn: work around Clang __muloti4 again")
and commit 2dd14ad522 ("libdrgn: work around "undefined reference to
'__muloti4'" when using Clang"). These keep sneaking in because I don't
have an old enough version of Clang lying around.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 15:30:07 -07:00
Omar Sandoval
301cc3f139 libdrgn: fix UBSan "applying zero offset to null pointer" errors
There are a couple of places where we compute `NULL + 0`, which is
undefined behavior. Add a helper to do this safely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 13:38:29 -07:00
Omar Sandoval
9c31f11e35 libdrgn: object: fix UBSan error for uninitialized boolean
drgn_object_reinit() and drgn_object_copy() can both load from an
uninitialized little_endian field, causing UBSan errors like:

  libdrgn/object.h:105:27: runtime error: load of value 68, which is not a valid value for type '_Bool'

This only happens when little_endian isn't valid for the type and won't
be used anyways, but it's easy enough to work around.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 13:38:13 -07:00
Omar Sandoval
c9dc7fd574 libdrgn: type: fix memcpy() undefined behavior
It's undefined behavior to pass NULL to memcpy() even if the length is
zero. See also commit a17215e984 ("libdrgn: dwarf_index: fix memcpy()
undefined behavior").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 13:38:13 -07:00
Davide Cavalca
7ca157316f tests: properly escape regexp strings
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
2021-04-02 10:37:33 -07:00
Davide Cavalca
081d7773e1 tests: rename test_type_dies for pytest compatibility
Signed-off-by: Davide Cavalca <dcavalca@fb.com>
2021-04-02 10:37:14 -07:00
Omar Sandoval
39f17a52b8 docs: add missing css, favicon, and logo files to sdist
Closes #96.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-02 10:30:28 -07:00
Omar Sandoval
beb0c9d640 drgn 0.0.10
2021-03-31 13:32:05 -07:00
Omar Sandoval
f285764f8a Include full libdrgn distribution in drgn sdist
Building drgn from an sdist currently requires autotools and gawk
because libdrgn in the sdist is more or less a git checkout. It's more
user-friendly to include the autotools output and generated code. Do
this by extending the sdist command to include a full libdrgn
distribution with `make distdir`.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-30 23:19:38 -07:00
Omar Sandoval
587ecd4df8 README: add pip to installation dependencies
The README instructs the user to install with pip, but doesn't actually
mention that pip needs to be installed.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-30 22:51:46 -07:00
Omar Sandoval
ce7a5e62f8 README: fix installation dependencies for old Debian and Ubuntu
The libelf-dev and libdw-dev packages on Debian Stretch, Ubuntu Xenial,
and older are missing dependencies on liblzma-dev and zlib1g-dev, which
causes pkg-config to fail when running configure. Add them explicitly
for old versions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-30 22:42:11 -07:00
Omar Sandoval
630d39e345 libdrgn: add ORC unwinder
The Linux kernel has its own stack unwinding format for x86-64 called
ORC: https://www.kernel.org/doc/html/latest/x86/orc-unwinder.html. It is
essentially a simplified, less complete version of DWARF CFI. ORC is
generated by analyzing machine code, so it is present for all but a few
ignored functions. In contrast, DWARF CFI is generated by the compiler
and is therefore missing for functions written in assembly and inline
assembly (which is widespread in the kernel).

This implements an ORC stack unwinder: it applies ELF relocations to the
ORC sections, adds a new DRGN_CFI_RULE_REGISTER_ADD_OFFSET CFI rule
kind, parses and efficiently stores ORC data, and translates ORC to drgn
CFI rules. This will allow us to stack trace through assembly code,
interrupts, and system calls.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-29 10:01:52 -07:00
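Illustration (not part of the commit): a rough Python sketch of what a "register plus offset" CFI rule could compute. The class and function names are hypothetical, and the semantics are inferred only from the rule's name (DRGN_CFI_RULE_REGISTER_ADD_OFFSET), not taken from libdrgn.

```python
from dataclasses import dataclass

U64_MASK = 0xFFFFFFFFFFFFFFFF


@dataclass(frozen=True)
class RegisterAddOffset:
    """Hypothetical rule: value = registers[regname] + offset."""

    regname: str
    offset: int


def apply_rule(rule: RegisterAddOffset, registers: dict) -> int:
    # Wrap to 64 bits, as register arithmetic on x86-64 would.
    return (registers[rule.regname] + rule.offset) & U64_MASK


# E.g. an unwind entry saying "CFA = sp + 16".
regs = {"sp": 0x7FFF0000}
cfa = apply_rule(RegisterAddOffset("sp", 16), regs)
```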
Omar Sandoval
090064f20d libdrgn: x86-64: support R_X86_64_PC32 relocation type
This is used for .orc_unwind_ip for kernel modules.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:16:36 -07:00
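Illustration (not part of the commit): per the x86-64 psABI, R_X86_64_PC32 stores S + A - P as a 32-bit value, where S is the symbol value, A the addend, and P the address of the place being relocated. A minimal Python sketch follows; the function name and parameters are hypothetical, and the overflow check the ABI calls for is omitted.

```python
import struct


def apply_pc32(buf: bytearray, r_offset: int, place_addr: int,
               sym_value: int, addend: int) -> None:
    """Write the low 32 bits of S + A - P at r_offset (x86-64 is little-endian)."""
    value = (sym_value + addend - place_addr) & 0xFFFFFFFF
    struct.pack_into("<I", buf, r_offset, value)


section = bytearray(8)
# The relocated field lives at address 0x1004 and refers to a symbol at 0x1000.
apply_pc32(section, r_offset=4, place_addr=0x1004, sym_value=0x1000, addend=0)
relative = struct.unpack_from("<i", section, 4)[0]
```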
Omar Sandoval
e0aaaf203d libdrgn: generalize applying ELF relocations
To support unwinding with ORC, we need to apply relocations to
.orc_unwind_ip, which libdwfl doesn't do. That means that we always need
to apply relocations on x86-64, not just as a fast path when the file's
byte order matches the host's. So, generalize handling of 64- vs 32-bit
and little- vs big-endian relocations, and move the handling of
relocation types to an arch-specific callback.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:16:35 -07:00
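Illustration (not part of the commit): one way to handle 64- vs 32-bit and little- vs big-endian relocation targets in a single code path, sketched in Python. The function name is hypothetical; libdrgn's actual helpers and arch callback are not shown here.

```python
import struct


def write_reloc_value(buf: bytearray, offset: int, value: int,
                      is_64_bit: bool, little_endian: bool) -> None:
    """Store value at offset using the target's width and byte order."""
    fmt = ("<" if little_endian else ">") + ("Q" if is_64_bit else "I")
    mask = (1 << (64 if is_64_bit else 32)) - 1
    struct.pack_into(fmt, buf, offset, value & mask)


big32 = bytearray(4)
write_reloc_value(big32, 0, 0x11223344, is_64_bit=False, little_endian=False)

little64 = bytearray(8)
write_reloc_value(little64, 0, 1, is_64_bit=True, little_endian=True)
```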
Omar Sandoval
63672be809 libdrgn: linux_kernel: save module .init section addresses
Linux kernel modules usually contain ELF relocations in DWARF and ORC
sections for symbols in .init sections. Since we ignore .init sections
entirely in cache_kernel_module_sections(), these relocations end up
being based on an address of 0 (so, e.g., a function from .init.text
could be reported as having an address of 0x0). It makes a little more
sense to use the address where the .init section was before it was
freed. So, let's update the sections' sh_addr but continue ignoring them
for determining the module's address range.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:13:47 -07:00
Omar Sandoval
da180b7274 libdrgn: handle errors from elf_strptr()
For some reason, we consistently ignore errors from elf_strptr(), but we
shouldn't.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 14:28:16 -07:00
Omar Sandoval
12723a0c08 tests: clean up tests.helpers.linux.test_debug_info
Split the two modes into separate tests and move the environment
variable fiddling into a separate helper function.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 12:49:06 -07:00
Omar Sandoval
e5bc41f16c libdrgn: add latest elf.h and dwarf.h to support elfutils 0.165
The oldest LTS version of Ubuntu, 16.04, has elfutils 0.165. This
version is missing some ELF and DWARF definitions used by drgn. Add
copies of elf.h from glibc 2.33 and dwarf.h and elfutils/known-dwarf.h
from elfutils 0.183 to get the latest definitions and drop the minimum
required version of elfutils further to 0.165.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-21 23:18:39 -07:00
Serapheim Dimitropoulos
a68abd5de4 libdrgn: stretch minimum supported version of libelf to 0.170
Currently libdrgn requires libelf to be of version 0.175 or
later. This patch allows the library to be compiled with libelf
0.170 (the newest version supported by Ubuntu 18.04 LTS).

Signed-off-by: Serapheim Dimitropoulos <serapheim@delphix.com>
2021-03-21 14:28:29 -07:00
Omar Sandoval
da0280016c libdrgn: python: identify bit fields in TypeMember.__repr__
If a member is a bit field, then we should format it with the underlying
Object so that it shows the bit field size.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-17 12:02:53 -07:00
Omar Sandoval
55354b3038 libdrgn: use flexible array for pgtable_iterator::arch
There's no reason to use GCC's zero-length array extension for this. Use
a standard flexible array instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-16 16:18:49 -07:00
Omar Sandoval
38d4330fec libdrgn: clean up stale comment references and Doxygen warnings
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-16 16:15:43 -07:00
Omar Sandoval
671947d185 libdrgn: remove unused drgn_program::attached_dwfl_state
I missed this when I removed the code that used it.

Fixes: eec67768aa ("libdrgn: replace elfutils DWARF unwinder with our own")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-16 15:41:07 -07:00
Omar Sandoval
4c5c5f3842 Remove bundled version of elfutils
We currently bundle a version of elfutils with patches to export
additional stack tracing functionality. This has a few drawbacks:

- Most of drgn's build time is actually building elfutils.
- Distributions don't like packages that bundle versions of other
  packages.
- elfutils, and thus drgn, can't be built with clang.

Now that we've replaced the elfutils DWARF unwinder with our own, we
don't need the patches, so we can drop the bundled elfutils and fix
these issues.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-16 00:52:09 -07:00
Omar Sandoval
eec67768aa libdrgn: replace elfutils DWARF unwinder with our own
The elfutils DWARF unwinder has a couple of limitations:

1. libdwfl doesn't have an interface for getting register values, so we
   have to bundle a patched version of elfutils with drgn.
2. Error handling is very awkward: dwfl_getthread_frames() can return an
   error even on success, so we have to squirrel away our own errors in
   the callback.

Furthermore, there are a couple of things that will be easier with our
own unwinder:

1. Integrating unwinding using ORC will be easier when we're handling
   unwinding ourselves.
2. Support for local variables isn't too far away now that we have DWARF
   expression evaluation.

Now that we have the register state, CFI, and DWARF expression pieces in
place, stitch them together with the new unwinder, and tweak the public
API a bit to reflect it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:43:12 -07:00
Omar Sandoval
35a1af7ad6 libdrgn: add DWARF expression evaluation
For DW_CFA_def_cfa_expression, DW_CFA_expression, and
DW_CFA_val_expression, we need to be able to evaluate a DWARF expression. Add an
interface for this. It doesn't yet support operations that aren't
applicable to CFI or some more exotic operations.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:36:38 -07:00
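Illustration (not part of the commit): DWARF expressions run on a simple stack machine. The Python sketch below handles only a handful of opcodes (values from the DWARF specification) and is not drgn's interface; the function name and the initial-stack parameter are hypothetical.

```python
DW_OP_const1u = 0x08
DW_OP_minus = 0x1C
DW_OP_mul = 0x1E
DW_OP_plus = 0x22
DW_OP_lit0 = 0x30
DW_OP_lit31 = 0x4F

U64_MASK = 0xFFFFFFFFFFFFFFFF


def eval_dwarf_expr(expr: bytes, initial_stack=()) -> int:
    """Evaluate a tiny subset of DWARF expression opcodes; return the top of stack."""
    stack = list(initial_stack)
    i = 0
    while i < len(expr):
        op = expr[i]
        i += 1
        if DW_OP_lit0 <= op <= DW_OP_lit31:
            stack.append(op - DW_OP_lit0)
        elif op == DW_OP_const1u:
            stack.append(expr[i])  # one-byte unsigned operand
            i += 1
        elif op in (DW_OP_plus, DW_OP_minus, DW_OP_mul):
            top = stack.pop()
            second = stack.pop()
            if op == DW_OP_plus:
                result = second + top
            elif op == DW_OP_minus:
                result = second - top
            else:
                result = second * top
            stack.append(result & U64_MASK)
        else:
            raise ValueError(f"unsupported opcode {op:#x}")
    return stack[-1]
```

For example, the bytes 0x08 0x64 0x35 0x1c push 100, push 5, and subtract. A CFI expression could seed the stack through initial_stack.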
Omar Sandoval
fdaf7790a9 libdrgn: add DWARF call frame information parsing
In preparation for adding our own unwinder, add support for parsing and
finding DWARF/EH call frame information. Use a generic representation of
call frame information so that we can support other formats like ORC in
the future.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:36:38 -07:00
Omar Sandoval
0a6aaaae5d libdrgn: define structure for storing processor register values
libdwfl stores registers in an array of uint64_t indexed by the DWARF
register number. This is suboptimal for a couple of reasons:

1. Although the DWARF specification states that registers should be
   numbered for "optimal density", in practice this isn't the case. ABIs
   include unused ranges of numbers and don't order registers based on
   how likely they are to be known (e.g., caller-saved registers usually
   aren't recovered while unwinding the stack, but they are often
   numbered before callee-saved registers).
2. This precludes support for registers larger than 64 bits, like SSE
   registers.

For our own unwinder, we want to store registers in an
architecture-specific format to solve both of these problems.

So, have each architecture define its layout with registers arranged for
space efficiency and convenience when parsing saved registers from core
dumps. Instead of generating an arch_foo.c file from arch_foo.c.in,
separately define the logical register order in an arch_foo.defs file,
and use it to generate an arch_foo.inc file that is included from
arch_foo.c. The layout is defined as a macro in arch_foo.c. While we're
here, drop some register definitions that aren't useful at the moment.

Then, define struct drgn_register_state to efficiently store registers
in the defined format.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 16:36:38 -07:00
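Illustration (not part of the commit): a Python sketch of why indexing by DWARF register number wastes space. The DWARF numbers are the x86-64 psABI ones; the compact layout below is made up for the example and is not drgn's actual layout.

```python
# DWARF register numbers from the x86-64 psABI (16 is the return address column).
DWARF_REGNO = {
    "rax": 0, "rdx": 1, "rcx": 2, "rbx": 3, "rsi": 4, "rdi": 5,
    "rbp": 6, "rsp": 7, "r8": 8, "r9": 9, "r10": 10, "r11": 11,
    "r12": 12, "r13": 13, "r14": 14, "r15": 15, "rip": 16,
}

# Registers typically recovered while unwinding.
UNWOUND = ["rip", "rsp", "rbp", "rbx", "r12", "r13", "r14", "r15"]

# Hypothetical layout: likely-known registers first, everything else after.
LAYOUT = UNWOUND + [r for r in DWARF_REGNO if r not in UNWOUND]
SLOT = {name: i for i, name in enumerate(LAYOUT)}


def slots_by_dwarf_number(known):
    """Slots needed when the array is indexed by DWARF register number."""
    return max(DWARF_REGNO[r] for r in known) + 1


def slots_by_layout(known):
    """Slots needed when the array follows the compact layout."""
    return max(SLOT[r] for r in known) + 1
```

With only the unwound registers known, the DWARF-indexed array needs 17 slots while the compact layout needs 8.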
Omar Sandoval
cc1a5606d0 libdrgn: debug_info: save platform per module
Stack unwinding depends on some platform-specific information. If for
some reason a program has debugging information with different
platforms, then we need to make sure that while we're unwinding the
stack, we don't end up in a frame with a different platform, because the
registers won't make sense. Additionally, we should parse debugging
information using the module's platform rather than the program's
platform, which may not match. So, cache the platform derived from each
module's ELF file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 12:13:48 -07:00
Omar Sandoval
6065fc87af libdrgn: debug_info: save .debug_frame, .eh_frame, .text, and .got
These sections are needed for stack unwinding. However, .debug_frame and
.eh_frame don't need to be read right away, and .text and .got don't
need to be read at all, so partition them accordingly. Also, check that
the sections are specifically SHT_PROGBITS rather than not SHT_NOBITS.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 12:13:48 -07:00
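Illustration (not part of the commit): the difference between "not SHT_NOBITS" and "specifically SHT_PROGBITS", sketched in Python with the section type constants from elf.h. The function names are hypothetical.

```python
SHT_PROGBITS = 1
SHT_NOTE = 7
SHT_NOBITS = 8


def looser_check(sh_type: int) -> bool:
    """Accept any section type that has file contents."""
    return sh_type != SHT_NOBITS


def stricter_check(sh_type: int) -> bool:
    """Accept only program-defined data sections."""
    return sh_type == SHT_PROGBITS


# A note section passes the looser check but not the stricter one.
note_passes_loose = looser_check(SHT_NOTE)
note_passes_strict = stricter_check(SHT_NOTE)
```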
Omar Sandoval
744cc414d3 libdrgn: add copy_lsbytes()
It will be used to copy register values.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 12:13:48 -07:00
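Illustration (not part of the commit): the commit message does not spell out the semantics of copy_lsbytes(). The Python sketch below assumes that it copies the least significant bytes of a source value into a destination of a possibly different size and byte order, zero-filling any extra destination bytes. The signature here is hypothetical.

```python
def copy_lsbytes(src: bytes, src_little_endian: bool,
                 dst_size: int, dst_little_endian: bool) -> bytes:
    """Return dst_size bytes holding the least significant bytes of src."""
    # Normalize to least-significant-byte-first order.
    lsb_first = src if src_little_endian else src[::-1]
    # Truncate or zero-extend to the destination size.
    lsb_first = lsb_first[:dst_size].ljust(dst_size, b"\0")
    return lsb_first if dst_little_endian else lsb_first[::-1]


# Big-endian 0x12345678 narrowed to a 2-byte little-endian destination.
narrowed = copy_lsbytes(b"\x12\x34\x56\x78", False, 2, True)
# Little-endian 0x0201 widened to a 4-byte big-endian destination.
widened = copy_lsbytes(b"\x01\x02", True, 4, False)
```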
Omar Sandoval
b0a6d12501 libdrgn: binary_buffer: add binary_buffer_next_[us]int()
These will be used for parsing .debug_frame, .eh_frame, and DWARF
expressions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-15 12:13:45 -07:00
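Illustration (not part of the commit): a Python sketch of reading a signed or unsigned integer whose size is only known at run time, with a bounds check. That the [us]int variants take a run-time size is an assumption, and the function name and return convention are hypothetical.

```python
def next_int(buf: bytes, pos: int, size: int,
             little_endian: bool, signed: bool):
    """Read a size-byte integer at pos; return (value, new position)."""
    if pos + size > len(buf):
        raise EOFError(f"expected at least {size} bytes")
    byteorder = "little" if little_endian else "big"
    value = int.from_bytes(buf[pos:pos + size], byteorder, signed=signed)
    return value, pos + size


data = b"\xff\xfe"
signed_value, end = next_int(data, 0, 2, little_endian=True, signed=True)
unsigned_value, _ = next_int(data, 0, 2, little_endian=False, signed=False)
```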
Omar Sandoval
b55a5f7f4b libdrgn: binary_buffer: add binary_buffer_next_sN()
Along with _into_s64 and _into_u64 variants. These will be used for
parsing .eh_frame and DWARF expressions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-10 02:07:30 -08:00
Omar Sandoval
e5219b13e3 libdrgn: binary_buffer: add binary_buffer_next_sleb128()
Revive it from all the way back in commit 90fbec02fc ("dwarfindex:
delete unused read_sleb128() and read_strlen()") and add an _into_u64
variant. These will be used for parsing .debug_frame, .eh_frame, and
DWARF expressions.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-10 02:07:30 -08:00
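Illustration (not part of the commit): signed LEB128 as defined by the DWARF specification, decoded in Python. Each byte carries 7 bits of the value, the high bit marks continuation, and bit 6 of the final byte is the sign. The function name and return convention are hypothetical, and bounds checking is left to Python's IndexError.

```python
def read_sleb128(data: bytes, pos: int = 0):
    """Decode one SLEB128 value starting at pos; return (value, new position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            break
    if byte & 0x40:
        # Sign-extend from the last bit that was read.
        result -= 1 << shift
    return result, pos
```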
Omar Sandoval
7eab40aaeb libdrgn: rename drgn_error_debug_info() to drgn_error_debug_info_scn()
An upcoming change will introduce a similar function for when the
section isn't known. Rename the original so that the new one can take
its name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-10 02:07:16 -08:00
Omar Sandoval
56c4003db7 setup.py: add 5.12 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-09 13:51:52 -08:00
Jay Kamat
4552d78f4a libdrgn: debug_info: try to find DIE specification when parsing type
Currently, we look up incomplete types by name, which can fail if the
name is ambiguous or the type is unnamed. Try finding the complete type
via the DW_AT_specification map in the DWARF index first.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2021-03-08 15:24:24 -08:00
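Illustration (not part of the commit): the lookup order described above, sketched in Python with plain dictionaries standing in for the DWARF index. All names and data structures here are hypothetical.

```python
def find_complete_type(decl_addr, name, specification_map, types_by_name):
    """Prefer the specification map; fall back to an unambiguous name lookup."""
    definition = specification_map.get(decl_addr)
    if definition is not None:
        return definition
    candidates = types_by_name.get(name, []) if name else []
    if len(candidates) == 1:
        return candidates[0]
    return None  # unnamed or ambiguous: the type stays incomplete


spec_map = {0x100: "struct foo (complete, DIE 0x200)"}
by_name = {"foo": ["struct foo (A)", "struct foo (B)"]}

# The name alone is ambiguous, but the specification map resolves it.
via_map = find_complete_type(0x100, "foo", spec_map, by_name)
via_name = find_complete_type(0x300, "foo", spec_map, by_name)
```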