Commit Graph

1371 Commits

Author SHA1 Message Date
Omar Sandoval
adfb04579b libdrgn: linux: add idle_thread() helper
PR #129 will need to get the idle thread for a CPU when the idle thread
crashed. Add a helper for this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:40:57 -08:00
Omar Sandoval
b916e6905b libdrgn: linux: translate per_cpu_ptr() helper to C
The next change will add a C helper that needs per_cpu_ptr().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:39:50 -08:00
Omar Sandoval
92f25e2974 vmtest: enable logging when running vmtest.vm CLI
Specifically, we want logs from vmtest.download.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:34:45 -08:00
Omar Sandoval
6732148a11 tests: use NOBITS section for ELF symbols
Currently, we create a section filled with zeroes to contain the symbols
in our ELF symbol tests. We can just use a NOBITS section with no file
data instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-17 16:46:12 -08:00
Kevin Svetlitski
2b47583c73 Rewrite linux helper iterators in C
In preparation for introducing an API to represent threads, the linux
helper iterators, radix_tree_for_each, idr_for_each, for_each_pid, and
for_each_task have been rewritten in C. This will allow them to be
accessed from libdrgn, which will be necessary for the threads API.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2021-12-17 16:24:54 -08:00
Omar Sandoval
0f68cd44e2 vmtest: mount /dev/shm in VM
PR #133 adds a test case using multiprocessing.Barrier(), which needs
/dev/shm.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-17 13:01:18 -08:00
Kevin Svetlitski
9add9529eb Ensure compile_commands.json contains -Wall
This minor change is a quality of life improvement ensuring developers
receive more warnings and diagnostics in their editors.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2021-12-17 12:08:22 -08:00
Stephen Brennan
f1cc88378a Silence mypy warnings
With mypy 0.920, two warnings appear on current main:

$ mypy --strict --no-warn-return-any drgn _drgn.pyi
drgn/helpers/linux/__init__.py:36: error: Need type annotation for "__all__" (hint: "__all__: List[<type>] = ...")
drgn/helpers/linux/__init__.py:38: error: unused "type: ignore" comment
Found 2 errors in 1 file (checked 33 source files)

The "unused" type:ignore directive was necessary for prior versions, so
add --no-warn-unused-ignores, so that we pass on multiple versions.
Apply a List[str] annotation to the __all__ variable to silence the
other error.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2021-12-16 14:46:27 -08:00
Alakesh Haloi
c4fbf7e589 libdrgn: fix for compilation error
On gcc version 7.3, we get following compilation error

  CC       libdrgnimpl_la-dwarf_info.lo
../../libdrgn/dwarf_info.c:181:51: error: initializer element is not
constant
 static const size_t DRGN_DWARF_INDEX_NUM_SHARDS = 1 <<
DRGN_DWARF_INDEX_SHARD_BITS;

This fixes the compilation error on older versions of gcc

Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com>
2021-12-14 11:48:00 -08:00
Omar Sandoval
6fb304e99a Skip DCO check for draft pull requests
Draft pull requests can have temporary commits, so it doesn't make much
sense to check for sign-offs. Skip the check on drafts, making sure it
runs when a draft is changed to a normal pull request.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-13 12:14:42 -08:00
Omar Sandoval
609b4cc352 CONTRIBUTING: document some libdrgn coding conventions
Document conventions for init/deinit functions, create/destroy
functions, and functions which modify a struct drgn_object.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-13 11:40:15 -08:00
Omar Sandoval
1b54a25632 drgn 0.0.16
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 14:52:02 -08:00
Omar Sandoval
3a9ef1b6ca cli: print download progress in script mode, too
Instead of gating on script mode vs interactive mode, let's gate on
--quiet.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 13:36:56 -08:00
Omar Sandoval
a70e5d7893 cli: print debuginfod client progress
Running drgn on a system with debuginfod can appear to hang while the
debuginfod client downloads debug info. In interactive mode, let's set
the DEBUGINFOD_PROGRESS environment variable to get progress updates.
The output isn't super informative, but it's better than silence.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 12:52:02 -08:00
Omar Sandoval
061094187b libdrgn: debug_info: serialize initial calls to dwfl_module_getdwarf
dwfl_module_getdwarf() may call into debuginfod_find_executable() or
debuginfod_find_debuginfo(), which aren't thread-safe. So, let's put the
initial call of dwfl_module_getdwarf() (which is the call that may go
into the debuginfod client) into a critical section.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 20:37:14 +00:00
Omar Sandoval
ffcce8a745 Add a few files to source distributions
In particular, the Fedora RPM build needs pytest.ini. CONTRIBUTING.rst
should be included along the same lines as README.rst. libdrgn/Doxyfile
should be included so that users with a source distribution can build
the libdrgn documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 17:24:24 -08:00
Omar Sandoval
08e634c158 drgn 0.0.15
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:20:42 -08:00
Omar Sandoval
ad23378977 Update elfutils and libkdumpfile in manylinux wheels
Use the latest version of elfutils (0.186) and libkdumpfile (0.4.1). We
can drop the elfutils patch since 0.186 has the fix (and we have our own
workaround), but we need a new patch to build libkdumpfile.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:13:09 -08:00
Omar Sandoval
8ebdcb7109 libdrgn: memory_reader: remove unnecessary include
Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:12:11 -08:00
Omar Sandoval
8b2bf85e49 libdrgn: dwarf_info: fix garbage return from drgn_array_type_from_dwarf()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:21 -08:00
Omar Sandoval
8a41adc1b0 libdrgn: language_c: add missing error check in c_parse_abstract_declarator()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:15 -08:00
Omar Sandoval
f09fd13ef6 libdrgn: helpers: add missing error check in linux_helper_pid_task()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:06 -08:00
Omar Sandoval
e6abfeac03 libdrgn: debug_info: report userspace core dump debug info ourselves
There are a few reasons for this:

1. dwfl_core_file_report() crashes on elfutils 0.183-0.185. Those
   versions are still used by several distros.
2. In order to support --main-symbols and --symbols properly, we need to
   report things ourselves.
3. I'm considering moving away from libdwfl in the long term.

We provide an escape hatch for now: setting the environment variable
DRGN_USE_LIBDWFL_REPORT=1 opts out of drgn's reporting and uses
libdwfl's.

Fixes #130.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 12:11:10 -08:00
Omar Sandoval
02912ca7d0 libdrgn: fix handling of p_filesz < p_memsz in core dumps
I implemented the case of a segment in a core file with p_filesz <
p_memsz by treating the difference as zero bytes. This is correct for
ET_EXEC and ET_DYN, but for ET_CORE, it actually means that the memory
existed in the program but was not saved. For userspace core dumps, this
typically happens for read-only file mappings. For kernel core dumps,
makedumpfile does this to indicate memory that was excluded.

Instead, let's return a DRGN_FAULT_ERROR if an attempt is made to read
from these bytes. In the future, we need to read from the
executable/library files when we can.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 00:02:44 -08:00
Omar Sandoval
844d82848c libdrgn: add partial support for .gnu_debugaltlink
Issue #130 reported an "unknown attribute form 0x1f20" from drgn. 0x1f20
is DW_FORM_GNU_ref_alt, which is a reference to a DIE in an alternate
file. Similarly, DW_FORM_GNU_strp_alt is a string in an alternate file.
The alternate file is specified by the .gnu_debugaltlink section. This
is generated by dwz, which is used by at least Fedora and Debian.

libdwfl already finds the alternate debug info file, so we can save its
.debug_info and .debug_str and use those to support DW_FORM_GNU_ref_alt
and DW_FORM_GNU_strp_alt in the DWARF index.

Imported units are going to be more work to support in the DWARF index,
but this at least lets drgn start up.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-07 13:49:09 -08:00
Omar Sandoval
aef144c944 libdrgn: debug_info: improve elf_address_range()
Instead of iterating through every segment, we can just look at the
first and last loadable segments. This even works for vmlinux on x86-64
and Arm which have some special, relocatable segments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 13:33:55 -08:00
Omar Sandoval
10c66d4e99 libdrgn: get correct error when dwelf_elf_gnu_build_id() fails
The documentation for libdwelf states that "functions starting with
dwelf_elf will take a (libelf) Elf object as first argument and might
set elf_errno on error". So, we should be using drgn_error_libelf(), not
drgn_error_libdwfl(). While we're here, close the Elf handle before the
file descriptor for consistency.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 01:51:54 -08:00
Omar Sandoval
2c6e36847f Remove some include-what-you-use workarounds
include-what-you-use 0.17 fixed a couple of issues we were working
around with a mapping file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 01:51:54 -08:00
Omar Sandoval
91f6d03ee8 libdrgn: fix note name matching
The current code matches the desired note name as a prefix, but we need
an exact match.

Fixes: 75c3679147 ("Rewrite drgn core in C")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-03 12:03:19 -08:00
Omar Sandoval
0e318754fe libdrgn: don't swallow errors in relocate_elf_file()
Fixes: 62d98b3016 ("libdrgn: fold ELF relocation code into dwarf_index")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-03 11:31:05 -08:00
Omar Sandoval
0315ade709 tests: handle CONFIG_KALLSYMS=n and CONFIG_KALLSYMS_ALL=n
If CONFIG_KALLSYMS_ALL=n, then /proc/kallsyms won't include lo_fops,
which is a data symbol. Use a function symbol, lo_open, instead. Also
check whether /proc/kallsyms exists in the first place.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-02 03:46:06 -08:00
Omar Sandoval
36f7e8b59b README: add libtool to build dependencies for Debian and Arch
Fixes: 1b7badad0a ("docs: expand and reorganize installation instructions")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-02 02:01:46 -08:00
Omar Sandoval
3914bb8e29 libdrgn: fix type names referring to anonymous types
A pointer, array, or function referring to an anonymous type currently
includes the full type definition in its type name. This creates very
badly formatted objects for, e.g., drgn's own hash table types. Instead,
use "struct <anonymous>" in the type name.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-23 00:57:42 -08:00
Omar Sandoval
d18be05b7a README: mention Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 16:01:39 -08:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
93dc02a271 setup.py: add 5.16 to vmtest kernels
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 14:52:46 -08:00
Omar Sandoval
cdee38af7a tests: use different symbol for kernel module debug info test
Linux kernel commit 47e9624616c8 ("block: remove support for cryptoloop
and the xor transfer") removed the loop_register_transfer function. We
only used that symbol because it and loop_unregister_transfer were the
only global symbols in the loop module. Now that we can get local
symbols by name, we can use the "lo_fops" symbol, which is unlikely to
be removed or renamed.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 14:40:06 -08:00
Omar Sandoval
ff40f65f0d libdrgn: allow symbol name lookup to get local symbols
Global symbols are preferred over weak symbols, and weak symbols are
preferred over other symbols.

dwfl_module_addrinfo() seems to have the same preference, so document
address lookups as having the same behavior. (This is actually incorrect
in the case of STB_GNU_UNIQUE, as dwfl_module_addrinfo() treats anything
other than STB_GLOBAL, STB_WEAK, and STB_LOCAL as having the lowest
precedence, but STB_GNU_UNIQUE is so obscure that it probably doesn't
matter.)

Based on work from Stephen Brennan. Closes #121.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 14:30:57 -08:00
Omar Sandoval
07d00b7b11 tests: add tests for ELF symbols
Add some scaffolding to generate ELF files with symbol tables and use it
to test symbol lookups and Elf_Sym -> drgn.Symbol translation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-19 17:04:20 -08:00
Omar Sandoval
c84d7e8c15 tests: generate ELF constants from elf.h
Generalize generate_dwarf_constants.py for ELF and replace tests/elf.py
with the generated version.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-19 17:02:32 -08:00
Omar Sandoval
cb8bf339c8 tests: elfwriter: don't add sections if there aren't any
Only add SHT_NULL and .shstrtab sections if there are other sections to
be added. This allows us to create core dumps with no sections, like
core dumps on Linux.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-19 15:12:31 -08:00
Omar Sandoval
681d8453ce tests: elfwriter: set e_phoff to zero if there are no segments
readelf warns that a non-zero e_phoff with a zero e_phnum is invalid:

  Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-19 15:11:54 -08:00
Omar Sandoval
4808ef72ee libdrgn: debug_info: get address range of reported ET_EXEC files
When explicitly reporting a debugging information file for a userspace
program, userspace_report_debug_info() currently always reports it with
a load address range of [0, 0) (i.e., not actually loaded into the
program). This is because for ET_DYN and ET_REL files, we have to
determine the address range by inspecting the core dump or program
state, which is a bit involved.

However, ET_EXEC is much easier: we can get the address range from the
segment headers. In fact, we already implemented this for vmlinux files,
so we can reuse that with a modification to make it more permissive.

ET_CORE debug info files don't make much sense, but libdwfl seems to
treat a reported ET_CORE file the same as ET_EXEC (see
dwfl_report_elf()), so we do, too.

Unfortunately, most executables on modern Linux distributions are
ET_DYN, but this will at least make testing easier.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-19 14:58:10 -08:00
Omar Sandoval
c3f31e28f9 libdrgn: reorganize and move DWARF index into dwarf_info.c
The upcoming introduction of a higher level data structure to represent
a namespace has implications on the organization of the DWARF index and
debug info management code. Basically, we're going to want to track what
is currently known as struct drgn_dwarf_index_namespace as part of the
new struct drgn_namespace. That only leaves the DWARF specification map
and list of CUs in struct drgn_dwarf_index, which doesn't make much
sense anymore. Instead, let's:

* Move the specification map and CUs into struct drgn_dwarf_info.
* Rename struct drgn_dwarf_index_namespace to struct
  drgn_namespace_dwarf_index to indicate that it is the "DWARF index for
  a namespace" rather than a "namespace of a DWARF index".
* Move the DWARF index implementation into dwarf_info.c. The DWARF index
  and debugging information management have always been coupled, so this
  makes it more explicit and is more convenient.
* Improve documentation and naming in the DWARF index implementation.

Now, the only DWARF-specific code outside of dwarf_info.c is for stack
tracing, but we'll leave that for another day.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:55 -08:00
Omar Sandoval
5591d199b1 libdrgn: debug_info: split DWARF support into its own file
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:54 -08:00
Omar Sandoval
c6b2bc4181 libdrgn: debug_info: split ORC support into its own file
debug_info.c currently contains code for managing ELF files with
debugging information, for parsing DWARF, and for parsing ORC. Let's
split it up, starting by moving ORC support to its own file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:04 -08:00
Jay Kamat
3700bb75b8 libdrgn: Follow typedefs in enum backing type lookup
In C++ enums can be a typedef to an int, not just an int itself.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2021-11-18 13:48:31 -08:00
Omar Sandoval
a90ffdfb67 libdrgn: dwarf_index: actually index namespaces in parallel
index_namespace() uses `#pragma omp for` instead of `#pragma omp
parallel for`, and it's not already in a parallel section. So, we're
indexing namespaces single-threaded, despite sharding the index. Oops.

Fixes: d1beb0184a ("libdrgn: add support for objects in C++ namespaces")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-17 18:11:08 -08:00
Omar Sandoval
2642f85a1a libdrgn: dwarf_index: avoid OpenMP when accessing indexed namespace
index_namespace() sets up an OpenMP loop everytime it is called.
However, if the namespace has no pending DIEs, this is unnecessary
overhead for every DWARF index lookup. Bail early if there are no
pending DIEs (i.e., because we already indexed the namespace). In a
microbenchmark, this was a 10x speed improvement for DWARF index
iterator initialization. For a Python prog.type() lookup benchmark, it
was a 10% speedup.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-17 18:10:33 -08:00
Omar Sandoval
12ddb87c26 libdrgn: dwarf_info: simplify DWARF index iterator code
We can save a pointer to the shard itself instead of the namespace and
shard index. We can also simplify drgn_dwarf_index_iterator_next()
further.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-17 18:08:58 -08:00