16 Commits

Author SHA1 Message Date
Omar Sandoval
8b264f8823 Update copyright headers to Facebook and add missing headers
drgn was originally my side project, but for a while now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
2020-05-15 15:13:02 -07:00
Jeff Mahoney
bf05d9bf3f libdrgn: allow building without OpenMP
The configure script allows the user to build without any OpenMP
implementation, but dwarf_index.c uses the locking APIs unconditionally.
This compiles but fails at runtime.

Add simple stubs for the locking API. This is useful when debugging
crashes in DWARF indexing during development.
2020-04-08 12:33:40 -07:00
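
A minimal sketch of the kind of stubs bf05d9bf3f describes, assuming the build only needs the four basic OpenMP lock routines. The stub bodies and the locked_increment() helper are illustrative assumptions, not the code from the commit. When the compiler defines _OPENMP, the real omp.h is used; otherwise single-threaded stand-ins with the same names take its place.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/*
 * Assumed single-threaded stand-ins for the OpenMP lock API. With only one
 * thread there is nothing to exclude, so the "lock" is just a flag.
 */
typedef struct {
    int held;
} omp_lock_t;

static inline void omp_init_lock(omp_lock_t *lock) { lock->held = 0; }
static inline void omp_destroy_lock(omp_lock_t *lock) { (void)lock; }
static inline void omp_set_lock(omp_lock_t *lock) { lock->held = 1; }
static inline void omp_unset_lock(omp_lock_t *lock) { lock->held = 0; }
#endif

/*
 * Hypothetical caller: increments a counter under the lock. The same source
 * compiles with or without OpenMP because the lock API is always defined.
 */
static int locked_increment(omp_lock_t *lock, int *counter)
{
    int value;

    omp_set_lock(lock);
    value = ++*counter;
    omp_unset_lock(lock);
    return value;
}
```

Callers such as locked_increment() need no #ifdef of their own, which is the point of stubbing the API rather than guarding each call site.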
Omar Sandoval
78192cd61e libdrgn: add environment variable to see more missing debug info errors
Sometimes, I'd like to see all of the missing debug info errors rather
than just the first 5. Allow setting this through the
DRGN_MAX_DEBUG_INFO_ERRORS environment variable.
2019-10-02 17:22:12 -07:00
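
As a rough illustration of 78192cd61e, the limit could be read along these lines. Only the variable name DRGN_MAX_DEBUG_INFO_ERRORS and the default of 5 come from the commit message; the function names and the fallback behavior on malformed input are assumptions made for this sketch.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Default taken from the commit message ("just the first 5"). */
#define DEFAULT_MAX_ERRORS 5U

/*
 * Hypothetical helper: parse a decimal limit, falling back to the default
 * when the string is missing, empty, malformed, or out of range.
 */
static unsigned int parse_max_errors(const char *value)
{
    char *end;
    unsigned long parsed;

    /* Require a leading digit so signs and whitespace are rejected. */
    if (!value || !isdigit((unsigned char)value[0]))
        return DEFAULT_MAX_ERRORS;
    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno || *end != '\0' || parsed > UINT_MAX)
        return DEFAULT_MAX_ERRORS;
    return (unsigned int)parsed;
}

/* Hypothetical accessor: consult the environment each time it is called. */
static unsigned int max_debug_info_errors(void)
{
    return parse_max_errors(getenv("DRGN_MAX_DEBUG_INFO_ERRORS"));
}
```

Keeping the parsing separate from the getenv() call makes the fallback rules easy to exercise without touching the process environment.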
Omar Sandoval
423d2cd500 libdrgn: dwarf_index: rework file reporting
Currently, the interface between the DWARF index, libdwfl, and the code
which finds and reports vmlinux/kernel modules is spaghetti. The DWARF
index tracks Dwfl_Modules via their userdata. However, despite
conceptually being owned by the DWARF index, the reporting code reports
the Dwfl_Modules and sets up the userdata. These Dwfl_Modules and
drgn_dwfl_module_userdatas are messy to track and pass between the
layers.

This reworks the architecture so that the DWARF index owns the Dwfl
instance and files are reported to the DWARF index; the DWARF index
takes care of reporting to libdwfl internally. In addition to making the
interface for the reporter much cleaner, this improves a few things as a
side effect:

- We now deduplicate on build ID in addition to path.
- We now skip searching for vmlinux and/or kernel modules if they were
  already indexed.
- We now support compressed ELF files via libdwelf.
- We can now load default debug info at the same time as additional
  debug info.
2019-10-02 17:22:11 -07:00
Omar Sandoval
e5874ad18a libdrgn: use libdwfl
libdwfl is the elfutils "DWARF frontend library". It has high-level
functionality for looking up symbols, walking stack traces, etc. In
order to use this functionality, we need to report our debugging
information through libdwfl. For userspace programs, libdwfl has a much
better implementation than drgn for automatically finding debug
information from a core dump or PID. However, for the kernel, libdwfl
has a few issues:

- It only supports finding debug information for the running kernel, not
  vmcores.
- It determines the vmlinux address range by reading /proc/kallsyms,
  which is slow (~70ms on my machine).
- If separate debug information isn't available for a kernel module, it
  finds it by walking /lib/modules/$(uname -r)/kernel; this is repeated
  for every module.
- It doesn't find kernel modules with names containing both dashes and
  underscores (e.g., aes-x86_64).

Luckily, drgn already solved all of these problems, and with some
effort, we can keep doing it ourselves and report it to libdwfl.

The conversion replaces a bunch of code for dealing with userspace core
dump notes, /proc/$pid/maps, and relocations.
2019-07-15 12:27:48 -07:00
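
The dash/underscore problem listed in e5874ad18a comes up because a kernel module's file name may contain dashes while the loaded module's name uses underscores. One way to match the two, shown here as a sketch and not as drgn's actual implementation, is to compare names with '-' and '_' treated as the same character. The function names are hypothetical.

```c
#include <stdbool.h>

/* Map '-' to '_' so both spellings compare equal. */
static char normalize_module_char(char c)
{
    return c == '-' ? '_' : c;
}

/*
 * Hypothetical helper: return true if two module names are equal when
 * dashes and underscores are considered interchangeable.
 */
static bool module_names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (normalize_module_char(*a) != normalize_module_char(*b))
            return false;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same point. */
    return *a == *b;
}
```

With this comparison, "aes-x86_64" and "aes_x86_64" are treated as the same module, while "aes-x86_64" and "aes-x86" are not.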
Omar Sandoval
a9a2cb7cac libdrgn: dwarf_index: move bswap from file to compilation unit
Remove an indirection.
2019-07-15 12:27:38 -07:00
Omar Sandoval
b7e1b6ede6 libdrgn: dwarf_index: rename drgn_dwarf_index_iterator_next() output parameter
2019-07-15 12:27:24 -07:00
Omar Sandoval
9f9bec4762 libdrgn: use common vector where applicable
This converts several open-coded dynamic arrays to the new common vector
implementation:

- drgn_lexer stack
- Array dimension array for DWARF parsing
- drgn_program_read_c_string()
- DWARF index directory name hashes
- DWARF index file name hashes
- DWARF index abbreviation table
- DWARF index shard entries
2019-07-15 12:27:16 -07:00
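
For readers unfamiliar with the pattern 9f9bec4762 refers to, a "common vector" replaces hand-written grow-and-append logic with one reusable dynamic array. The sketch below is a fixed-type illustration with hypothetical names (u64_vector and friends); libdrgn's own vector is a generic implementation whose interface is not shown in this log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of 64-bit values. */
struct u64_vector {
    uint64_t *data;
    size_t size;     /* number of elements in use */
    size_t capacity; /* number of elements allocated */
};

static void u64_vector_init(struct u64_vector *vec)
{
    vec->data = NULL;
    vec->size = 0;
    vec->capacity = 0;
}

static void u64_vector_deinit(struct u64_vector *vec)
{
    free(vec->data);
}

/* Append one value, doubling the allocation when full. False on failure. */
static bool u64_vector_append(struct u64_vector *vec, uint64_t value)
{
    if (vec->size == vec->capacity) {
        size_t new_capacity = vec->capacity ? 2 * vec->capacity : 4;
        uint64_t *new_data;

        /* Guard the byte-count multiplication against overflow. */
        if (new_capacity > SIZE_MAX / sizeof(*vec->data))
            return false;
        new_data = realloc(vec->data, new_capacity * sizeof(*vec->data));
        if (!new_data)
            return false;
        vec->data = new_data;
        vec->capacity = new_capacity;
    }
    vec->data[vec->size++] = value;
    return true;
}
```

Doubling the capacity keeps appends amortized constant time, which is why the same helper can stand in for each of the open-coded arrays listed above.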
Omar Sandoval
8a59a7e819 libdrgn: don't preallocate DWARF index memory
This doesn't make things any faster in my benchmarks, and it complicates
DWARF index initialization.
2019-07-08 16:23:38 -07:00
Omar Sandoval
ec33f9bf73 libdrgn: get rid of DWARF index flags
We always index everything, so simplify the code a bit.
2019-07-08 16:23:38 -07:00
Omar Sandoval
dcddaa2cc1 libdrgn: revamp hash table API
This makes several improvements to the hash table API.

The first two changes make things more general in order to be consistent
with the upcoming binary search tree API:

- Items are renamed to entries.
- Positions are renamed to iterators.
- hash_table_empty() is added.

One change makes the definition API more convenient:

- It is no longer necessary to pass the types into
  DEFINE_HASH_{MAP,SET}_FUNCTIONS().

A few changes take some good ideas from the C++ STL:

- hash_table_insert() now fails on duplicates instead of overwriting.
- hash_table_delete_iterator() returns the next iterator.
- hash_table_next() returns an iterator instead of modifying it.

One change reduces memory usage:

- The lower-level DEFINE_HASH_TABLE() is cleaned up and exposed as an
  alternative to DEFINE_HASH_MAP() and DEFINE_HASH_SET(). This allows us
  to get rid of the duplicated key where a hash map value already embeds
  the key (the DWARF index file table) and gets rid of the need to make
  a dummy hash set entry to do a search (the pointer and array type
  caches).
2019-05-24 17:48:05 -07:00
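
The "fails on duplicates" behavior from dcddaa2cc1 can be illustrated with a toy fixed-size map. Everything here is an assumption made for the example: the toy_map names, the linear probing, and the 1/0/-1 return convention are not libdrgn's generated hash_table_insert() API.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAP_SLOTS 16 /* fixed capacity keeps the sketch short */

struct toy_map_entry {
    unsigned int key;
    int value;
    bool used;
};

/* Zero-initialize to get an empty map. Deletion is not supported. */
struct toy_map {
    struct toy_map_entry slots[TOY_MAP_SLOTS];
};

/* Linear probing lookup; returns NULL if the key is absent. */
static struct toy_map_entry *toy_map_search(struct toy_map *map,
                                            unsigned int key)
{
    for (size_t i = 0; i < TOY_MAP_SLOTS; i++) {
        struct toy_map_entry *entry =
            &map->slots[(key + i) % TOY_MAP_SLOTS];

        if (!entry->used)
            return NULL;
        if (entry->key == key)
            return entry;
    }
    return NULL;
}

/*
 * Returns 1 if the entry was inserted, 0 if the key already exists (the
 * existing value is left untouched), or -1 if the map is full.
 */
static int toy_map_insert(struct toy_map *map, unsigned int key, int value)
{
    for (size_t i = 0; i < TOY_MAP_SLOTS; i++) {
        struct toy_map_entry *entry =
            &map->slots[(key + i) % TOY_MAP_SLOTS];

        if (!entry->used) {
            entry->key = key;
            entry->value = value;
            entry->used = true;
            return 1;
        }
        if (entry->key == key)
            return 0;
    }
    return -1;
}
```

As with insertion into a C++ std::map, a caller that wants to overwrite must search for the existing entry and update it explicitly.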
Omar Sandoval
9b563170f8 libdrgn: make load_debug_info() API saner
Rather than exposing the underlying open and load steps of DWARF index,
simplify it down to a single load step.
2019-05-13 15:04:27 -07:00
Omar Sandoval
baba1ff3f0 libdrgn: make program components pluggable
Currently, programs can be created for three main use-cases: core dumps,
the running kernel, and a running process. However, internally, the
program memory, types, and symbols are pluggable. Expose that as a
callback API, which makes it possible to use drgn in much more creative
ways.
2019-05-10 12:41:07 -07:00
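
To make the "callback API" idea in baba1ff3f0 concrete, here is a sketch of a pluggable memory reader. The callback signature, struct names, and error convention are hypothetical and chosen for the example; they are not the signatures libdrgn exposes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: fill buf with count bytes from address. 0 = ok. */
typedef int memory_read_fn(void *buf, uint64_t address, size_t count,
                           void *arg);

/* A program's memory source is just a callback plus its private state. */
struct memory_reader {
    memory_read_fn *read;
    void *arg;
};

/* Example backend: serve reads from one in-memory segment. */
struct buffer_segment {
    uint64_t start;
    const unsigned char *data;
    size_t size;
};

static int buffer_read(void *buf, uint64_t address, size_t count, void *arg)
{
    const struct buffer_segment *seg = arg;
    uint64_t offset;

    if (address < seg->start)
        return -1;
    offset = address - seg->start;
    /* Reject reads that start or end past the segment. */
    if (offset > seg->size || count > seg->size - offset)
        return -1;
    memcpy(buf, seg->data + offset, count);
    return 0;
}

/* Generic entry point: callers never see which backend is plugged in. */
static int reader_read(const struct memory_reader *reader, void *buf,
                       uint64_t address, size_t count)
{
    return reader->read(buf, address, count, reader->arg);
}
```

Swapping buffer_read() for a function that reads a core dump or a live process changes the data source without touching any code that calls reader_read().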
Omar Sandoval
640b1c011d libdrgn: embed DWARF index in DWARF info cache
2019-05-06 14:55:34 -07:00
Omar Sandoval
2ed8e3148c libdrgn: get architecture info from core file instead of DWARF index
2019-05-06 14:55:34 -07:00
Omar Sandoval
75c3679147 Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:

- It's too slow for some common use cases, like iterating over large
  data structures.
- It can't be reused in utilities written in other languages.

This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:

- Types are now represented by a single Type class rather than the messy
  polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
  functions.

The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.

Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-04-02 14:12:07 -07:00
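
The "qualifiers are a bitmask" change in 75c3679147 can be pictured with a small C sketch. The enum and helper names are hypothetical and are not libdrgn's definitions; the point is only that each qualifier occupies one bit, so a set of qualifiers is a single integer.

```c
#include <stdbool.h>

/* Hypothetical qualifier flags: one bit per C type qualifier. */
enum toy_qualifiers {
    TOY_QUALIFIER_CONST = 1 << 0,
    TOY_QUALIFIER_VOLATILE = 1 << 1,
    TOY_QUALIFIER_RESTRICT = 1 << 2,
    TOY_QUALIFIER_ATOMIC = 1 << 3,
};

/* Test membership with a bitwise AND instead of a set lookup. */
static bool has_qualifier(unsigned int qualifiers, enum toy_qualifiers q)
{
    return (qualifiers & q) != 0;
}
```

Combining qualifiers is a bitwise OR (for example, TOY_QUALIFIER_CONST | TOY_QUALIFIER_VOLATILE for a const volatile type), and comparing two qualifier sets is a plain integer comparison.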