1343 Commits

Author SHA1 Message Date
Omar Sandoval
7393808a7d cli: diagnose when someone tries to run a binary as a script
This is a common mistake:

  $ drgn core_dump
  Traceback (most recent call last):
    File "/usr/bin/drgn", line 33, in <module>
      sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')())
    File "/usr/lib/python3.10/site-packages/drgn/internal/cli.py", line 133, in main
      runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__")
    File "/usr/lib/python3.10/runpy.py", line 268, in run_path
      code, fname = _get_code_from_file(run_name, path_name)
    File "/usr/lib/python3.10/runpy.py", line 242, in _get_code_from_file
      code = compile(f.read(), fname, 'exec')
  ValueError: source code string cannot contain null bytes

The user intends to debug the core dump, but they've actually specified
the core dump as a Python script to run. The error message from the
runpy internals does not make that clear. So, let's catch this earlier
by doing a quick-and-dirty test of the file magic to see if it looks
like a core dump or other ELF file. If so, we exit with a more helpful
message:

  $ drgn core_dump
  error: core_dump is a core dump
  Did you mean "-c core_dump"?
  $ drgn /usr/bin/ls
  error: /usr/bin/ls is a binary, not a drgn script
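
A minimal sketch of what such a file magic test could look like, assuming only the standard ELF header layout. The function name classify_file_magic is hypothetical and is not taken from drgn's cli.py; the actual check in drgn may differ.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ET_CORE = 4  # e_type value for core files in the ELF specification


def classify_file_magic(header: bytes) -> str:
    """Return "core", "elf", or "other" based on the first bytes of a file.

    Hypothetical helper for illustration; not drgn's implementation.
    """
    if header[:4] != ELF_MAGIC:
        return "other"
    if len(header) < 18:
        # Too short to contain e_type, but it still starts like an ELF file.
        return "elf"
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    byte_order = ">" if header[5] == 2 else "<"
    # e_type is a 16-bit field immediately after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return "core" if e_type == ET_CORE else "elf"
```

A caller would read the first 18 bytes of the script argument and print the more helpful error when the result is not "other".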

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-21 14:51:17 -08:00
Omar Sandoval
c40543b15c tests: add test cases for generic flag decode helpers
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-15 11:45:09 -08:00
Stephen Brennan
7970a60818 Add methods to return multiple matching symbols
Currently we can look up symbols by name or address, but this will only
return one symbol, prioritizing the global symbols. However, symbols may
share the same name, and symbols may also overlap address ranges, so
it's possible for searches to return multiple results. Add functions
which can return a list of multiple matching symbols.
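
To illustrate the difference between the two kinds of lookup, here is a toy sketch over an in-memory symbol table. The Sym type and the function names are hypothetical and do not reflect the API added by this commit.

```python
from typing import List, NamedTuple


class Sym(NamedTuple):
    # Hypothetical, simplified symbol record for illustration only.
    name: str
    address: int
    size: int
    is_global: bool


def symbols_by_name(table: List[Sym], name: str) -> List[Sym]:
    """Return every symbol with the given name."""
    return [s for s in table if s.name == name]


def symbols_by_address(table: List[Sym], address: int) -> List[Sym]:
    """Return every symbol whose address range contains the address."""
    return [s for s in table if s.address <= address < s.address + s.size]


def symbol_by_name(table: List[Sym], name: str) -> Sym:
    """Return a single match, preferring global symbols."""
    matches = symbols_by_name(table, name)
    if not matches:
        raise LookupError(f"no symbol named {name!r}")
    # max() returns the first global symbol if there is one, otherwise
    # the first match.
    return max(matches, key=lambda s: s.is_global)
```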

Signed-off-by: Stephen Brennan <stephen@brennan.io>
2022-01-15 11:44:33 -08:00
Stephen Brennan
fb99f6dbe6 ci: Use pre-commit to run linters
Now that pre-commit is added, replace the manual commands for mypy,
isort, and black with equivalent pre-commit commands. This allows us to
avoid duplicating linter arguments. It also allows us to pin the linters
used in CI by way of the .pre-commit-config.yaml file, ensuring
reproducible lint errors.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-01-14 13:31:16 -08:00
Stephen Brennan
52b96aed88 Run pre-commit on all files
`pre-commit run --all-files` results in the following minor
updates, which appear to be caused by my own failure to run linters.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-01-14 13:31:16 -08:00
Stephen Brennan
ae377984d4 Add pre-commit
During PRs, lint and mypy errors can show up in the CI tests, which is
useful, but can introduce unnecessary churn on the PR as small lint
fixes are pushed. This commit adds (optional) support for pre-commit, a
tool which can be configured to run as a git pre-commit hook, running
linters on all changed code to catch issues before you push your code.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-01-14 13:31:16 -08:00
Omar Sandoval
e2fc4ce2ac helpers: add a helper for decoding page flags
As well as a couple of generic helpers backing it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-12 16:57:10 -08:00
Kevin Svetlitski
301cc767ba Implement a new API for representing threads
Previously, drgn had no way to represent a thread – retrieving a stack
trace (the only extant thread-specific operation) was achieved by
requiring the user to directly provide a tid.

This commit introduces the scaffolding for the design outlined in
issue #92, and implements the corresponding methods for userspace core
dumps, the live Linux kernel, and Linux kernel core dumps. Future work
will build on top of this commit to support live userspace processes.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Kevin Svetlitski
78139b6ba3 libdrgn: add Linux kernel task iterator
The thread API needs a way to iterate over all task_structs in the
kernel. Previously, we translated the existing for_each_task helper,
which supports iterating through specific PID namespaces by walking
through the PID radix tree or PID hashtable. However, we don't need
specific namespaces for the thread API, so we can instead use the much
simpler linked lists of thread groups and threads.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Omar Sandoval
95c4e2d748 Revert "Rewrite linux helper iterators in C"
This reverts commit 2b47583c73. After
Kevin had completed this, we realized that there is a simpler method for
iterating through tasks from libdrgn, which the next commit will
implement. Revert the translation, but keep the improved
tests.helpers.linux.test_pid.TestPid.test_for_each_task.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-11 17:28:17 -08:00
Kevin Svetlitski
32a968deb0 vmtest: only disable SMP for the capture kernel when not using KVM acceleration
Disabling SMP is necessary to work around a bug in QEMU's handling of
the capture kernel, but makes the tests run much slower. However, this
bug only appears to manifest when KVM acceleration is disabled, so the
testing harness has been modified to only disable SMP when this is true.

[Omar: use an environment variable instead of touching a file]
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Kevin Svetlitski
d3c9e24115 tests: make all tests inherit from drgn's TestCase class
The majority of test cases already inherited from drgn's TestCase class.
The few outliers that inherited directly from unittest.TestCase have
been brought in line with the other tests.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-11 17:28:17 -08:00
Kevin Svetlitski
ac2cadabcd Add framework for testing in kdump
Now that the vmtest kernel supports kdump, add a script that can be used
to crash and enter the kdump environment on demand. Use that to crash
after running the normal test suite so that we can run tests against
/proc/vmcore. vmcore tests live in their own directory; presently the
only test is a simple sanity check that ensures we can attach to
/proc/vmcore.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-01-07 14:03:00 -08:00
Omar Sandoval
69c069b09f libdrgn: allow NULL argument to drgn_stack_trace_destroy()
This is one place where I broke the convention that I just documented.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-06 18:23:27 -08:00
Omar Sandoval
2ce41c22ae CONTRIBUTING: mention that _destroy functions should allow NULL
This is another undocumented convention.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-06 18:21:46 -08:00
Omar Sandoval
2a0b4c8848 vmtest: also add kexec_file_load() syscall config options
We can avoid the need for the kexec tool if we load the kdump kernel
ourselves, which is much easier with kexec_file_load(). Add the config
options to enable it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 23:32:49 -08:00
Omar Sandoval
ba93fd5a71 vmtest: add kdump kernel config options
We would like to test drgn against kernel core dumps (e.g., for #129).
One option would be to include some vmcore files in the repository and
test against those. But those can be huge, and we'd need a lot of them
to test different kernel versions. Instead, we can run vmtest, enable
kdump, and trigger a crash. To do that, we first need to enable a few
kernel config options.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 17:39:32 -08:00
Omar Sandoval
2ff58a4d45 libdrgn: linux: make per_cpu_ptr() support !SMP kernels
Kernels built without multiprocessing support don't have
__per_cpu_offset; instead, per_cpu_ptr() is a no-op. Make the helper do
the same and update the test case to work on !SMP as well.
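
The behavior described above can be sketched on plain integers as follows. This is an illustration only: the real helper operates on drgn objects, and the per_cpu_offset parameter here is a hypothetical stand-in for the kernel's __per_cpu_offset array (None when the kernel was built without SMP).

```python
from typing import Optional, Sequence

U64_MASK = 0xFFFFFFFFFFFFFFFF  # assumes 64-bit kernel addresses


def per_cpu_ptr(
    ptr: int, cpu: int, per_cpu_offset: Optional[Sequence[int]]
) -> int:
    """Return the address of the given CPU's copy of a per-CPU variable."""
    if per_cpu_offset is None:
        # !SMP: there is only one copy, so the pointer is used as is.
        return ptr
    # SMP: add this CPU's offset, wrapping like 64-bit pointer arithmetic.
    return (ptr + per_cpu_offset[cpu]) & U64_MASK
```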

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 16:51:15 -08:00
Omar Sandoval
b341c212f4 tests: fix black error
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 16:06:23 -08:00
Omar Sandoval
d72a9043b0 libdrgn: linux: replace idle_thread() with idle_task()
I missed that the kernel has an idle_task() function which uses
cpu_rq()->idle instead of idle_threads; the latter is technically
architecture-specific. So, replace idle_thread() with idle_task(), which
is architecture-independent and more consistent with the kernel.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 16:03:25 -08:00
Peilin Ye
ed7f864532 helpers: Add SOCKET_I() and SOCK_INODE()
Add helpers to convert between sockets and inodes.  As an example:

	>>> file = fget(task, fd)
	>>> sock = SOCKET_I(file.f_inode)
	>>> sock.type.value_()
	2
	>>> import socket
	>>> int(socket.SOCK_DGRAM)
	2
	>>> inode = SOCK_INODE(sock)

Also add tests for the new helpers to tests/helpers/linux/test_net.py.

Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
2021-12-21 14:55:25 -08:00
Peilin Ye
bc95749975 tests: Rename "sock" to "skt" in test_sk_fullsock()
Reserve "sock" for "struct socket *" objects, according to our kernel
naming convention.

Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
2021-12-21 14:55:25 -08:00
Omar Sandoval
adfb04579b libdrgn: linux: add idle_thread() helper
PR #129 will need to get the idle thread for a CPU when the idle thread
crashed. Add a helper for this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:40:57 -08:00
Omar Sandoval
b916e6905b libdrgn: linux: translate per_cpu_ptr() helper to C
The next change will add a C helper that needs per_cpu_ptr().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:39:50 -08:00
Omar Sandoval
92f25e2974 vmtest: enable logging when running vmtest.vm CLI
Specifically, we want logs from vmtest.download.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-21 14:34:45 -08:00
Omar Sandoval
6732148a11 tests: use NOBITS section for ELF symbols
Currently, we create a section filled with zeroes to contain the symbols
in our ELF symbol tests. We can just use a NOBITS section with no file
data instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-17 16:46:12 -08:00
Kevin Svetlitski
2b47583c73 Rewrite linux helper iterators in C
In preparation for introducing an API to represent threads, the linux
helper iterators, radix_tree_for_each, idr_for_each, for_each_pid, and
for_each_task have been rewritten in C. This will allow them to be
accessed from libdrgn, which will be necessary for the threads API.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2021-12-17 16:24:54 -08:00
Omar Sandoval
0f68cd44e2 vmtest: mount /dev/shm in VM
PR #133 adds a test case using multiprocessing.Barrier(), which needs
/dev/shm.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-17 13:01:18 -08:00
Kevin Svetlitski
9add9529eb Ensure compile_commands.json contains -Wall
This minor change is a quality of life improvement ensuring developers
receive more warnings and diagnostics in their editors.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2021-12-17 12:08:22 -08:00
Stephen Brennan
f1cc88378a Silence mypy warnings
With mypy 0.920, two warnings appear on current main:

$ mypy --strict --no-warn-return-any drgn _drgn.pyi
drgn/helpers/linux/__init__.py:36: error: Need type annotation for "__all__" (hint: "__all__: List[<type>] = ...")
drgn/helpers/linux/__init__.py:38: error: unused "type: ignore" comment
Found 2 errors in 1 file (checked 33 source files)

The "unused" type:ignore directive was necessary for prior versions, so
add --no-warn-unused-ignores, so that we pass on multiple versions.
Apply a List[str] annotation to the __all__ variable to silence the
other error.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2021-12-16 14:46:27 -08:00
Alakesh Haloi
c4fbf7e589 libdrgn: fix for compilation error
On gcc version 7.3, we get the following compilation error:

  CC       libdrgnimpl_la-dwarf_info.lo
../../libdrgn/dwarf_info.c:181:51: error: initializer element is not constant
 static const size_t DRGN_DWARF_INDEX_NUM_SHARDS = 1 << DRGN_DWARF_INDEX_SHARD_BITS;

This fixes the compilation error on older versions of gcc.

Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com>
2021-12-14 11:48:00 -08:00
Omar Sandoval
6fb304e99a Skip DCO check for draft pull requests
Draft pull requests can have temporary commits, so it doesn't make much
sense to check for sign-offs. Skip the check on drafts, making sure it
runs when a draft is changed to a normal pull request.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-13 12:14:42 -08:00
Omar Sandoval
609b4cc352 CONTRIBUTING: document some libdrgn coding conventions
Document conventions for init/deinit functions, create/destroy
functions, and functions which modify a struct drgn_object.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-13 11:40:15 -08:00
Omar Sandoval
1b54a25632 drgn 0.0.16
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 14:52:02 -08:00
Omar Sandoval
3a9ef1b6ca cli: print download progress in script mode, too
Instead of gating on script mode vs interactive mode, let's gate on
--quiet.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 13:36:56 -08:00
Omar Sandoval
a70e5d7893 cli: print debuginfod client progress
Running drgn on a system with debuginfod can appear to hang while the
debuginfod client downloads debug info. In interactive mode, let's set
the DEBUGINFOD_PROGRESS environment variable to get progress updates.
The output isn't super informative, but it's better than silence.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 12:52:02 -08:00
Omar Sandoval
061094187b libdrgn: debug_info: serialize initial calls to dwfl_module_getdwarf
dwfl_module_getdwarf() may call into debuginfod_find_executable() or
debuginfod_find_debuginfo(), which aren't thread-safe. So, let's put the
initial call of dwfl_module_getdwarf() (which is the call that may go
into the debuginfod client) into a critical section.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-09 20:37:14 +00:00
Omar Sandoval
ffcce8a745 Add a few files to source distributions
In particular, the Fedora RPM build needs pytest.ini. CONTRIBUTING.rst
should be included along the same lines as README.rst. libdrgn/Doxyfile
should be included so that users with a source distribution can build
the libdrgn documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 17:24:24 -08:00
Omar Sandoval
08e634c158 drgn 0.0.15
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:20:42 -08:00
Omar Sandoval
ad23378977 Update elfutils and libkdumpfile in manylinux wheels
Use the latest version of elfutils (0.186) and libkdumpfile (0.4.1). We
can drop the elfutils patch since 0.186 has the fix (and we have our own
workaround), but we need a new patch to build libkdumpfile.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:13:09 -08:00
Omar Sandoval
8ebdcb7109 libdrgn: memory_reader: remove unnecessary include
Fixes: 02912ca7d0 ("libdrgn: fix handling of p_filesz < p_memsz in core dumps")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 15:12:11 -08:00
Omar Sandoval
8b2bf85e49 libdrgn: dwarf_info: fix garbage return from drgn_array_type_from_dwarf()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:21 -08:00
Omar Sandoval
8a41adc1b0 libdrgn: language_c: add missing error check in c_parse_abstract_declarator()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:15 -08:00
Omar Sandoval
f09fd13ef6 libdrgn: helpers: add missing error check in linux_helper_pid_task()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:06 -08:00
Omar Sandoval
e6abfeac03 libdrgn: debug_info: report userspace core dump debug info ourselves
There are a few reasons for this:

1. dwfl_core_file_report() crashes on elfutils 0.183-0.185. Those
   versions are still used by several distros.
2. In order to support --main-symbols and --symbols properly, we need to
   report things ourselves.
3. I'm considering moving away from libdwfl in the long term.

We provide an escape hatch for now: setting the environment variable
DRGN_USE_LIBDWFL_REPORT=1 opts out of drgn's reporting and uses
libdwfl's.

Fixes #130.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 12:11:10 -08:00
Omar Sandoval
02912ca7d0 libdrgn: fix handling of p_filesz < p_memsz in core dumps
I implemented the case of a segment in a core file with p_filesz <
p_memsz by treating the difference as zero bytes. This is correct for
ET_EXEC and ET_DYN, but for ET_CORE, it actually means that the memory
existed in the program but was not saved. For userspace core dumps, this
typically happens for read-only file mappings. For kernel core dumps,
makedumpfile does this to indicate memory that was excluded.

Instead, let's return a DRGN_FAULT_ERROR if an attempt is made to read
from these bytes. In the future, we need to read from the
executable/library files when we can.
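
A simplified sketch of the distinction, assuming a single segment held in memory. FaultError and read_segment are hypothetical names used for illustration; they stand in for DRGN_FAULT_ERROR and libdrgn's memory reader and are not the actual implementation.

```python
ET_CORE = 4  # e_type value for core files in the ELF specification


class FaultError(Exception):
    """Hypothetical stand-in for DRGN_FAULT_ERROR."""


def read_segment(
    file_data: bytes,
    p_offset: int,
    p_vaddr: int,
    p_filesz: int,
    p_memsz: int,
    e_type: int,
    address: int,
    size: int,
) -> bytes:
    """Read size bytes at address from one program segment."""
    start = address - p_vaddr
    end = start + size
    if start < 0 or end > p_memsz:
        raise FaultError("address range is outside the segment")
    # Number of requested bytes that are actually backed by the file.
    in_file = max(0, min(end, p_filesz) - start)
    data = file_data[p_offset + start : p_offset + start + in_file]
    missing = size - in_file
    if missing:
        if e_type == ET_CORE:
            # The memory existed in the program but was not saved.
            raise FaultError("memory was not saved in the core dump")
        # ET_EXEC and ET_DYN: the remainder is zero-filled.
        data += bytes(missing)
    return data
```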

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 00:02:44 -08:00
Omar Sandoval
844d82848c libdrgn: add partial support for .gnu_debugaltlink
Issue #130 reported an "unknown attribute form 0x1f20" from drgn. 0x1f20
is DW_FORM_GNU_ref_alt, which is a reference to a DIE in an alternate
file. Similarly, DW_FORM_GNU_strp_alt is a string in an alternate file.
The alternate file is specified by the .gnu_debugaltlink section. This
is generated by dwz, which is used by at least Fedora and Debian.

libdwfl already finds the alternate debug info file, so we can save its
.debug_info and .debug_str and use those to support DW_FORM_GNU_ref_alt
and DW_FORM_GNU_strp_alt in the DWARF index.

Imported units are going to be more work to support in the DWARF index,
but this at least lets drgn start up.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-07 13:49:09 -08:00
Omar Sandoval
aef144c944 libdrgn: debug_info: improve elf_address_range()
Instead of iterating through every segment, we can just look at the
first and last loadable segments. This even works for vmlinux on x86-64
and Arm which have some special, relocatable segments.
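
A rough sketch of the idea, assuming the program headers have already been parsed into simple records. The Phdr type is hypothetical, and the sketch relies on the ELF specification's requirement that PT_LOAD entries appear in ascending p_vaddr order; the actual elf_address_range() in libdrgn may handle more cases.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

PT_LOAD = 1  # program header type for loadable segments


class Phdr(NamedTuple):
    # Hypothetical, simplified program header record.
    p_type: int
    p_vaddr: int
    p_memsz: int


def elf_address_range(phdrs: Sequence[Phdr]) -> Optional[Tuple[int, int]]:
    """Return (start, end) covering all loadable segments, or None."""
    loadable = [p for p in phdrs if p.p_type == PT_LOAD]
    if not loadable:
        return None
    # PT_LOAD entries are sorted by p_vaddr, so only the first and last
    # loadable segments determine the range.
    first, last = loadable[0], loadable[-1]
    return first.p_vaddr, last.p_vaddr + last.p_memsz
```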

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 13:33:55 -08:00
Omar Sandoval
10c66d4e99 libdrgn: get correct error when dwelf_elf_gnu_build_id() fails
The documentation for libdwelf states that "functions starting with
dwelf_elf will take a (libelf) Elf object as first argument and might
set elf_errno on error". So, we should be using drgn_error_libelf(), not
drgn_error_libdwfl(). While we're here, close the Elf handle before the
file descriptor for consistency.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 01:51:54 -08:00
Omar Sandoval
2c6e36847f Remove some include-what-you-use workarounds
include-what-you-use 0.17 fixed a couple of issues we were working
around with a mapping file.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-06 01:51:54 -08:00