In commit 26291647eb ("libdrgn: dwarf_index: handle
DW_AT_specification DIEs with two passes"), I claimed that the
specification map didn't need to be sharded "because there typically
aren't enough of these in a program to cause contention". This is true
for the Linux kernel, but not for large C++ applications. Instead of
sharding, though, we can avoid synchronization entirely by having each
indexing thread build its own specification map and then merging them at
the end. This reduces the time to index one large, statically linked C++
application from 15 seconds to 8.5 seconds! As expected, it makes no
significant performance difference for the Linux kernel.
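As a rough illustration of the per-thread approach, here is a minimal
sketch. The names struct spec_entry, struct spec_map, spec_map_append(),
and spec_map_merge() are hypothetical, and a growable array stands in for
the real map; the point is only that each indexing thread appends to a map
nobody else touches, and the merge runs once after the threads finish.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical entry: a declaration's address and its definition's. */
struct spec_entry {
	uintptr_t declaration;
	uintptr_t definition;
};

/* Hypothetical per-thread map; only the owning thread ever modifies it. */
struct spec_map {
	struct spec_entry *entries;
	size_t size, capacity;
};

/* Append to a map. No locking is needed because the map is not shared. */
static int spec_map_append(struct spec_map *map, uintptr_t declaration,
			   uintptr_t definition)
{
	if (map->size == map->capacity) {
		size_t new_capacity = map->capacity ? 2 * map->capacity : 16;
		struct spec_entry *tmp =
			realloc(map->entries, new_capacity * sizeof(*tmp));
		if (!tmp)
			return -1;
		map->entries = tmp;
		map->capacity = new_capacity;
	}
	map->entries[map->size].declaration = declaration;
	map->entries[map->size].definition = definition;
	map->size++;
	return 0;
}

/* Merge the per-thread maps; run this after all threads have finished. */
static int spec_map_merge(struct spec_map *dst, const struct spec_map *maps,
			  size_t num_maps)
{
	for (size_t i = 0; i < num_maps; i++) {
		for (size_t j = 0; j < maps[i].size; j++) {
			if (spec_map_append(dst, maps[i].entries[j].declaration,
					    maps[i].entries[j].definition))
				return -1;
		}
	}
	return 0;
}
```

The merge is single-threaded, so its cost is proportional to the total
number of entries, but no thread ever waits on another while indexing.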
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The upcoming rework of the DWARF index needs entries in the DWARF index
to be as small as possible. The first thing we can get rid of is the
struct drgn_elf_file * in struct drgn_dwarf_index_die and struct
drgn_dwarf_specification. Instead, we can sort the struct
drgn_dwarf_index_cu_vector index_cus by start address, then do a binary
search on the DIE address to find the CU and file containing it.
As a result of this change, struct drgn_dwarf_index_die no longer
contains enough information for drgn_dwarf_index_get_die() to convert it
into a libdw Dwarf_Die. But, after the last two commits,
drgn_dwarf_index_get_die() is now always called immediately after
drgn_dwarf_index_iterator_next(). So, let's get rid of
drgn_dwarf_index_get_die() and make drgn_dwarf_index_iterator_next()
return the Dwarf_Die and struct drgn_elf_file *.
We offset the cost of the binary search in index_cus by storing the
libdw Dwarf_CU * in struct drgn_dwarf_index_cu. This allows us to avoid
calling dwarf_offdie{,_types}(), which does a (slower) binary tree
search to find the Dwarf_CU * anyway.
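The lookup described above can be sketched roughly as follows. struct
index_cu and find_cu() are hypothetical, simplified stand-ins (an integer
file_id takes the place of struct drgn_elf_file *), and the sketch assumes
the CUs are sorted by start address and do not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct drgn_dwarf_index_cu. */
struct index_cu {
	uintptr_t start;	/* address of the first byte of the CU */
	size_t len;		/* length of the CU in bytes */
	int file_id;		/* stands in for struct drgn_elf_file * */
};

/*
 * Return the CU containing addr, or NULL if there is none. cus must be
 * sorted by start address and must not overlap.
 */
static const struct index_cu *find_cu(const struct index_cu *cus,
				      size_t num_cus, uintptr_t addr)
{
	size_t lo = 0, hi = num_cus;

	/* Find the first CU whose start address is greater than addr. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (cus[mid].start <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;
	/* The candidate is the last CU starting at or before addr. */
	const struct index_cu *cu = &cus[lo - 1];
	return addr - cu->start < cu->len ? cu : NULL;
}
```

Because the CU and file are recovered from the address alone, an index
entry in this scheme only needs to store the DIE address.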
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The lack of a semicolon after these macros has always confused tooling
like cscope. We could add semicolons everywhere now, but let's enforce
it for the future, too. Let's add a dummy struct forward declaration at
the end of each macro that enforces this requirement and also provides a
useful error message.
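For illustration, here is a minimal sketch of the trick.
DEFINE_SUM_FUNCTION and struct macro_must_be_followed_by_semicolon are
hypothetical names made up for this example, not the macros this commit
touches. The macro ends with a struct forward declaration that has no
semicolon of its own, so the caller must supply one.

```c
#include <stddef.h>

/*
 * Hypothetical example macro. The trailing forward declaration is
 * harmless (repeating it is legal C), but it is incomplete without a
 * semicolon, so every use of the macro must be followed by one.
 */
#define DEFINE_SUM_FUNCTION(name, type)					\
static type name(const type *values, size_t n)				\
{									\
	type sum = 0;							\
	for (size_t i = 0; i < n; i++)					\
		sum += values[i];					\
	return sum;							\
}									\
struct macro_must_be_followed_by_semicolon

/* Omitting either semicolon below is a compile error. */
DEFINE_SUM_FUNCTION(sum_ints, int);
DEFINE_SUM_FUNCTION(sum_sizes, size_t);
```

If the semicolon is left out, the compiler's complaint will usually
mention the dummy struct's name, which is why it is worth giving it a
descriptive one.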
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Instead, reuse struct drgn_dwarf_index_cu for the pending CUs. This is
mainly so that we can save more information in the pending CU in a later
change. It also lets us merge our per-thread pending CU arrays with
memcpy() instead of element-by-element, but I didn't measure a
performance difference one way or the other.
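A minimal sketch of the memcpy()-based merge, assuming the per-thread
array and the final array share one plain-data element type. struct
pending_cu, struct pending_cu_array, and pending_cus_concat() are
hypothetical names for this example only.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified CU record; plain data, so a bytewise copy is fine. */
struct pending_cu {
	const char *buf;
	size_t len;
};

struct pending_cu_array {
	struct pending_cu *data;
	size_t size;
};

/*
 * Append all of src to dst with a single memcpy(). Returns 0 on success
 * and -1 on allocation failure, in which case dst is unchanged.
 */
static int pending_cus_concat(struct pending_cu_array *dst,
			      const struct pending_cu_array *src)
{
	if (src->size == 0)
		return 0;
	struct pending_cu *tmp =
		realloc(dst->data, (dst->size + src->size) * sizeof(*tmp));
	if (!tmp)
		return -1;
	memcpy(&tmp[dst->size], src->data, src->size * sizeof(*tmp));
	dst->data = tmp;
	dst->size += src->size;
	return 0;
}
```

With two different element types, the same merge would need a loop that
converts each element individually.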
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).
Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
drgn is currently licensed as GPLv3+. Part of the long-term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.
All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.
Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The upcoming introduction of a higher level data structure to represent
a namespace has implications for the organization of the DWARF index and
debug info management code. Basically, we're going to want to track what
is currently known as struct drgn_dwarf_index_namespace as part of the
new struct drgn_namespace. That only leaves the DWARF specification map
and list of CUs in struct drgn_dwarf_index, which doesn't make much
sense anymore. Instead, let's:
* Move the specification map and CUs into struct drgn_dwarf_info.
* Rename struct drgn_dwarf_index_namespace to struct
  drgn_namespace_dwarf_index to indicate that it is the "DWARF index for
  a namespace" rather than a "namespace of a DWARF index".
* Move the DWARF index implementation into dwarf_info.c. The DWARF index
  and debugging information management have always been coupled, so this
  makes it more explicit and is more convenient.
* Improve documentation and naming in the DWARF index implementation.
Now, the only DWARF-specific code outside of dwarf_info.c is for stack
tracing, but we'll leave that for another day.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c.
Signed-off-by: Omar Sandoval <osandov@osandov.com>