We have a few places where we parse raw ELF notes independently of
libelf, which isn't complicated but also not trivial thanks to alignment
requirements. In preparation for adding another place, factor the
parsing out into a common helper. Also document the complexities around
figuring out the correct alignment.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This one doesn't need any changes to the callback signature, just the
new interface. We also keep add_object_finder() for compatibility.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Like for symbol finders, we want extra flexibility around configuring
type finders. The type finder callback signature also has a couple of
warts: it doesn't take the program since it was added before types
needed to be constructed from a program, and it is called separately for
each type kind since originally type lookups were for only one kind.
While we're adding a new interface, let's fix these warts: pass the
program and a set of type kinds. However, we need to keep the old
add_type_finder() interface for backwards compatibility.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Currently, the bare-bones add_symbol_finder() interface only allows
adding a symbol finder that is called before any existing finders. It'd
be useful to be able to specify the order that symbol finders should be
called in and to selectively enable and disable them. To do that, we
also need finders to have a name to identify them by. So, replace
add_symbol_finder() (which hasn't been in a release yet) with a set of
interfaces providing this flexibility: register_symbol_finder(),
set_enabled_symbol_finders(), registered_symbol_finders(), and
enabled_symbol_finders(). Also change the callback signature to take the
program.
In particular, this flexibility will be very useful for a plugin system:
pre-installed plugins can register symbol finders that the user can
choose to enable or disable.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that the symbol finder API is created, we can move the ELF symbol
implementation into the debug_info.c file, where it more logically
belongs. The only change to these functions in the move is to declare
elf_symbols_search as static.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Symbol lookup is not yet modular, like type or object lookup. However,
making it modular would enable easier development and prototyping of
alternative Symbol providers, such as Linux kernel module symbol tables,
vmlinux kallsyms tables, and BPF function symbols. To begin with, create
a modular Symbol API within libdrgn, and refactor the ELF symbol search
to use it.
For now, we leave drgn_program_find_symbol_by_address_internal() alone.
Its conversion will require some surgery, since the new API can return
errors, whereas this function cannot.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Type units don't have a skeleton unit, so we need to walk over all of
the units in the split DWARF file to find them. Instead of doing this in
a second pass, rework drgn_dwarf_index_read_cus(): instead of
substituting skeleton units with their respective split units, call
drgn_dwarf_index_read_cus() recursively on the split DWARF file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're determining which callbacks to use for the Dwfl handle too early,
before the program flags are set. Instead of creating it later, shift
the flag checks to the callback itself.
Fixes: c85dd74f3e ("libdrgn: embed drgn_debug_info in drgn_program")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than the indirect link between verbage in the docs and getting
dependencies, just include the URL. Indent the related modules for
clarity.
Signed-off-by: Alex Gartrell <alexgartrell@gmail.com>
I missed this easy conversion back in commit ee51244dc1 ("libdrgn: add
_cleanup_free_ scope guard, no_cleanup_ptr(), and return_ptr()").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Commit b16dad8a36 ("libdrgn: support SHT_REL relocations") added our
own implementation of SHT_REL relocations, which are used by Arm and
i386. However, it failed to remove the check that skips over all
non-SHT_RELA sections, so we've been falling back to the (slow) libdwfl
implementation this whole time.
Fixes: b16dad8a36 ("libdrgn: support SHT_REL relocations")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In commit c4a122ead6 ("libdrgn: dwarf_info: scalably index all DIEs
per name"), I noted that indexing the Linux kernel was slower with 80
threads than with 8. After experimenting on multiple systems, I
determined that the slowdown was because of hyperthreading; limiting the
number of threads to the number of cores (not hyperthreads) was usually
faster, and never slower.
Unfortunately, OpenMP doesn't have a convenient way to express this, so
we have to parse sysfs and explicitly specify the number of threads
ourselves.
Compared to commit c4a122ead6 ("libdrgn: dwarf_info: scalably index
all DIEs per name"), this makes indexing the large C++ application half
a second faster, and the Linux kernel 70 ms faster, all while using less
resources.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The upcoming DWARF index rework will make it too difficult to roll back
in the middle of DWARF indexing. It also doesn't make sense for the
planned module API. Let's chuck that code and instead save the error to
return forever like we do for index_namespace().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current generic vector API is pretty minimal and exposes its
internal members as part of the public interface. This has worked well
but prevents us from changing the vector implementation. In particular,
I'd like to have "small vector" variants that can store some entries
directly in the vector structure, use a smaller integer type for the
size and capacity, or both.
So, let's make the generated vector type "private" and add accessor
functions. This is very verbose in some cases, but it'll grant us much
more flexibility. While we're changing every user anyways, let's also
make use of _cleanup_(vector_deinit) where possible.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The lack of a semicolon after these macros has always confused tooling
like cscope. We could add semicolons everywhere now, but let's enforce
it for the future, too. Let's add a dummy struct forward declaration at
the end of each macro that enforces this requirement and also provides a
useful error message.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We've addressed all of the smaller differences with GNU Debug Fission
and split DWARF 5, so now all that remains is the DWARF index.
The general approach is: in drgn_dwarf_index_read_cus(), for each CU,
ask libdw for the "sub-DIE". For skeleton CUs, this is the split CU DIE
from the .dwo file. From that Dwarf_Die, we can get the Dwarf_CU and
then the Dwarf handle. Then, we wrap that in a struct drgn_elf_file
(cached in a hash table in the struct drgn_module), which the DWARF
index can work with from there.
Additionally, a couple of places (.debug_addr parsing and stack trace
local variable lookup) need to be updated to use the correct
drgn_elf_file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're getting (null) file paths in error messages (e.g., #233) because
libdwfl doesn't always return the debug file path. Fall back to the
loaded file path, which is better than nothing until we get rid of
libdwfl.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
This false positive appears to only trigger on 32-bit. I reproduced it
with GCC 10 and 12.
Fixes#242.
Reported-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).
Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For upcoming changes, we will need loaded (SHF_ALLOC) sections for
modules. Some separate debug files (e.g., those created with objcopy
--only-keep-debug) don't have those sections. Let's get the loaded file
from libdwfl with dwfl_module_getelf() and save it in a struct
drgn_elf_file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Now that we track the debug file ourselves, we can avoid calling libdwfl
in a bunch of places. By tracking the bias ourselves, we can avoid a
bunch more.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
It's confusing that we have a platform both for the program and for each
module. They usually match, but they're not required to. For example,
the user can manually add a file with a different platform just to read
its debug info. Our rule is that if we're parsing anything from the
module, we use the module platform; and otherwise, use the program
platform. There are a couple of places where the platforms must match:
when using call frame information (CFI) or registers. Let's make all of
this more clear in the code (by using the module's platform even when it
must match the program's platform) and in comments. No functional
change.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.
All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.
Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Rather than documenting how to initialize a struct string_builder,
provide an initializer, STRING_BUILDER_INIT.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In preparation for supporting ELF relocations for more architectures,
generalize ELF relocations to handle SHT_REL sections/ElfN_Rel.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Peilin Ye reported a couple of related crashes in drgn caused by Linux
kernel modules which had been processed with objcopy --only-keep-debug
(although he notes that since binutils-gdb commit 8c803a2dd7d3
("elf_backend_section_flags and _bfd_elf_init_private_section_data") (in
binutils v2.35), objcopy --only-keep-debug doesn't seem to work for
kernel modules).
If given an SHT_NOBITS section, elf_getdata() returns an Elf_Data with
d_buf = NULL and d_size set to the size in the section header, which is
often non-zero. There are a few places where this can cause us to
dereference a NULL pointer:
* In relocate_elf_sections() for the relocated section data.
* In relocate_elf_sections() for the symbol table section data.
* In get_kernel_module_name_from_modinfo().
* In get_kernel_module_name_from_this_module().
Fix it by checking the section type or directly checking Elf_Data::d_buf
everywhere that could potentially get an SHT_NOBITS section. This is
based on a PR from Peilin Ye.
Closes#145.
Reported-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
relocate_elf_section() shouldn't need to deal with reading the sections.
Pull that logic out into relocate_elf_file() (which will be shared with
REL-style relocations when we support those) and rename
relocate_elf_section() to apply_elf_relas().
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Most of the places where we call elf_getdata() (via read_elf_section())
deal with SHT_PROGBITS sections. elf_getdata() always returns the
literal contents of the file for SHT_PROGBITS sections (usually straight
out of the mmap'd file).
The exceptions are the SHT_RELA and SHT_SYMTAB sections in
relocate_elf_section(). For these, if the byte order or alignment of the
file do not match the host, elf_getdata() allocates a new buffer and
converts the contents. relocate_elf_section() also handles unaligned
buffers and swaps the byte order, so it mistakenly ends up with the
original byte order of the file.
Rather than removing that from relocate_elf_section(), let's avoid the
extra allocation and use elf_rawdata(), which always returns the literal
contents of the file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
dwfl_module_getdwarf() may call into debuginfod_find_executable() or
debuginfod_find_debuginfo(), which aren't thread-safe. So, let's put the
initial call of dwfl_module_getdwarf() (which is the call that may go
into the debuginfod client) into a critical section.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
There are a few reasons for this:
1. dwfl_core_file_report() crashes on elfutils 0.183-0.185. Those
versions are still used by several distros.
2. In order to support --main-symbols and --symbols properly, we need to
report things ourselves.
3. I'm considering moving away from libdwfl in the long term.
We provide an escape hatch for now: setting the environment variable
DRGN_USE_LIBDWFL_REPORT=1 opts out of drgn's reporting and uses
libdwfl's.
Fixes#130.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
I implemented the case of a segment in a core file with p_filesz <
p_memsz by treating the difference as zero bytes. This is correct for
ET_EXEC and ET_DYN, but for ET_CORE, it actually means that the memory
existed in the program but was not saved. For userspace core dumps, this
typically happens for read-only file mappings. For kernel core dumps,
makedumpfile does this to indicate memory that was excluded.
Instead, let's return a DRGN_FAULT_ERROR if an attempt is made to read
from these bytes. In the future, we need to read from the
executable/library files when we can.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Issue #130 reported an "unknown attribute form 0x1f20" from drgn. 0x1f20
is DW_FORM_GNU_ref_alt, which is a reference to a DIE in an alternate
file. Similarly, DW_FORM_GNU_strp_alt is a string in an alternate file.
The alternate file is specified by the .gnu_debugaltlink section. This
is generated by dwz, which is used by at least Fedora and Debian.
libdwfl already finds the alternate debug info file, so we can save its
.debug_info and .debug_str and use those to support DW_FORM_GNU_ref_alt
and DW_FORM_GNU_strp_alt in the DWARF index.
Imported units are going to be more work to support in the DWARF index,
but this at least lets drgn start up.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead of iterating through every segment, we can just look at the
first and last loadable segments. This even works for vmlinux on x86-64
and Arm which have some special, relocatable segments.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The documentation for libdwelf states that "functions starting with
dwelf_elf will take a (libelf) Elf object as first argument and might
set elf_errno on error". So, we should be using drgn_error_libelf(), not
drgn_error_libdwfl(). While we're here, close the Elf handle before the
file descriptor for consistency.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When explicitly reporting a debugging information file for a userspace
program, userspace_report_debug_info() currently always reports it with
a load address range of [0, 0) (i.e., not actually loaded into the
program). This is because for ET_DYN and ET_REL files, we have to
determine the address range by inspecting the core dump or program
state, which is a bit involved.
However, ET_EXEC is much easier: we can get the address range from the
segment headers. In fact, we already implemented this for vmlinux files,
so we can reuse that with a modification to make it more permissive.
ET_CORE debug info files don't make much sense, but libdwfl seems to
treat a reported ET_CORE file the same as ET_EXEC (see
dwfl_report_elf()), so we do, too.
Unfortunately, most executables on modern Linux distributions are
ET_DYN, but this will at least make testing easier.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The upcoming introduction of a higher level data structure to represent
a namespace has implications on the organization of the DWARF index and
debug info management code. Basically, we're going to want to track what
is currently known as struct drgn_dwarf_index_namespace as part of the
new struct drgn_namespace. That only leaves the DWARF specification map
and list of CUs in struct drgn_dwarf_index, which doesn't make much
sense anymore. Instead, let's:
* Move the specification map and CUs into struct drgn_dwarf_info.
* Rename struct drgn_dwarf_index_namespace to struct
drgn_namespace_dwarf_index to indicate that it is the "DWARF index for
a namespace" rather than a "namespace of a DWARF index".
* Move the DWARF index implementation into dwarf_info.c. The DWARF index
and debugging information management have always been coupled, so this
makes it more explicit and is more convenient.
* Improve documentation and naming in the DWARF index implementation.
Now, the only DWARF-specific code outside of dwarf_info.c is for stack
tracing, but we'll leave that for another day.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c
Signed-off-by: Omar Sandoval <osandov@osandov.com>
debug_info.c currently contains code for managing ELF files with
debugging information, for parsing DWARF, and for parsing ORC. Let's
split it up, starting by moving ORC support to its own file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>