Commit Graph

74 Commits

Author SHA1 Message Date
Omar Sandoval
fc3ea4184a libdrgn: use new include-what-you-use exported declarations and fix warnings
include-what-you-use/include-what-you-use#1164 fixed
include-what-you-use/include-what-you-use#971 so that we can export
forward declarations instead of hacking around it. I can't reproduce the
issue with BINARY_OP_SIGNED_2C anymore either, so we can remove that
hack, too. Also fix any other warnings.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-05-24 00:25:25 -07:00
Omar Sandoval
18a8f69ad8 libdrgn: linux_kernel: add object finder for jiffies
We have a lot of examples that use jiffies, but they stopped working
long ago on x86-64 (since Linux kernel commit d8ad6d39c35d ("x86_64: Fix
jiffies ODR violation") (in v5.8 and backported to stable releases)) and
never worked on other architectures. This is because jiffies is defined
in the Linux kernel's linker script. #277 proposed updating the examples
to use jiffies_64, but I would guess that most kernel developers are
familiar with jiffies and many have never seen jiffies_64. jiffies is
also a nicer name to type in live demos. Let's add a case to the Linux
kernel object finder to get the jiffies variable.

Reported-by: Martin Liska <mliska@suse.cz>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-22 11:15:37 -08:00
Omar Sandoval
87b7292aa5 Relicense drgn from GPLv3+ to LGPLv2.1+
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.

All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:05:16 -07:00
Omar Sandoval
30c9ad452d libdrgn: linux_kernel: fix global per-CPU variables in kernel modules
The .data..percpu section is excluded from /sys/module and struct
module::sect_attrs, which means that we default its address to 0. This
results in global per-CPU variables in kernel modules being relocated
starting from 0 rather than the offset of the per-CPU allocation made
for the module, which in turn causes those variables to appear to
contain the wrong data. Fix it by manually getting the per-CPU address
from struct module.

Closes #185.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:27:28 -07:00
Omar Sandoval
a52016c4cb libdrgn: linux_kernel: always use module list from core
For the next fix, we need the address of the .data..percpu section,
which is only available directly from the struct module and not from
anywhere in /proc or /sys. Get rid of the /proc/modules fast path (and
update the name of the testing environment variable from
DRGN_USE_PROC_AND_SYS_MODULES to DRGN_USE_SYS_MODULE).

This has some small overhead (~20ms longer startup time in my
benchmarks) and means that we no longer determine the loaded modules if
vmlinux is missing, but fixing the per-CPU issue is more important.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:11:59 -07:00
Omar Sandoval
94036f6daf libdrgn: linux_kernel: optimize reading module list
An upcoming fix requires us to always use the module list from the core
dump rather than /proc/modules. However, with the existing code, this
would cause a major startup time regression for the live kernel, mainly
because reading from /proc/kcore is stupidly slow. We currently do 3 +
strlen(module->name) reads for every module. We can reduce this to 1
read per module by reading the entire struct module at once. The size of
struct module is ~700-900 bytes depending on the kernel configuration,
which is still much faster to read than only reading what we need.

In some benchmarks that I did with DRGN_USE_PROC_AND_SYS_MODULES=0, this
reduced the time spent in the kernel module iterator from ~2.5ms per
module to ~0.4ms per module.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-12 16:08:33 -07:00
Omar Sandoval
f8ba278bc1 libdrgn: fix include-what-you-use warnings
It's been awhile since I've run this.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
b8cdfff250 libdrgn: add read(2) and pread(2) wrappers that don't return short reads
We have a couple of loops that deal with short reads/EINTR from read(2)
and pread(2), and upcoming changes would need to add more. Add some
wrappers to abstract this away.

drgn_read_memory_file() still needs the loop so it can fault on the
exact offset that returns EIO.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-08-26 12:43:20 -07:00
Omar Sandoval
b28bd9f0a3 libdrgn: linux_kernel: get vmemmap generically
AArch64 has changed the location of vmemmap multiple times, and not all
of these can be easily distinguished. Rather than restorting to kernel
version checks, this replaces the vmemmap architecture callback with a
generic approach that gets the vmemmap address directly from the
mem_section table.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
a213573b23 libdrgn: linux_kernel: make virt_to_phys() and phys_to_virt() generic
On x86-64, the difference between virtual addresses in the direct map
and the corresponding physical addresses is called PAGE_OFFSET, so we
exposed that via an architecture callback and the Linux kernel object
finder. However, this doesn't translate to other architectures. Namely,
on AArch64, the difference is PAGE_OFFSET - PHYS_OFFSET, and both
PAGE_OFFSET and PHYS_OFFSET have varied over time and between
configurations.

We can remove the architecture callback and avoid version-specific logic
by letting the page table tell us the offset. We just need an address in
the direct map, which is easy to find since this includes kmalloc and
memblock allocations.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:11 -07:00
Omar Sandoval
b3a6d6a35f libdrgn: linux_kernel: cache PAGE_SHIFT derived from PAGE_SIZE
Rather than computing it every time we need it, compute it once when we
parse PAGE_SIZE from VMCOREINFO (and validate that PAGE_SIZE is a power
of two). This will be more important for AArch64 page table walking.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-07-14 12:05:09 -07:00
Omar Sandoval
3cba315293 libdrgn: linux_kernel: use memswitch for drgn_program_parse_vmcoreinfo()
We currently have 5 names that we match against, and there are more on
the way, so we might as well use a memswitch.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
9da9f6a871 libdrgn: fold struct vmcoreinfo into struct drgn_program
In an upcoming commit, we will parse the AArch64 pointer authentication
code mask either from the VMCOREINFO note or the NT_ARM_PAC_MASK note.
Since it doesn't always come from VMCOREINFO, it doesn't make sense to
put it in struct vmcoreinfo; struct drgn_program makes more sense. So,
make parse_vmcoreinfo() take struct drgn_program instead of struct
vmcoreinfo, rename it to drgn_program_parse_vmcoreinfo(), and replace
struct vmcoreinfo with an anonymous struct in struct drgn_program.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
ebfcabd5a8 libdrgn: linux_kernel: match explicitly-reported kernel modules by build ID
Currently, we identify explicitly-reported kernel modules by the module
name that we get from the .modinfo or the .gnu.linkonce.this_module
section. However, objcopy --only-keep-debug (used for some Linux distro's
separate debug files) does not keep these sections. This means that
passing a file processed by objcopy --only-keep-debug to, e.g., drgn -s,
fails with "could not find kernel module name".

Instead of using the module name as the identifier, let's use the
module's GNU build ID. We can get it on a live system from
/sys/module/<module>/notes/, and on a core dump from struct
module::notes_attrs (which is the implementation of that sysfs
directory).

This was split out of my larger debug info discovery rework, which will
make more use of the build ID.

Closes #178.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-01 14:21:12 -07:00
Omar Sandoval
931d9c999d libdrgn: linux_kernel: get module address range directly
Instead of getting the address range from the sections we find, get it
directly from /proc/modules or from the `struct module`. (We already had
partial code to get the address range, but I can't remember why I didn't
use it.)

The real motivation for this is the upcoming module rework: it'll allow
us to report the module and its address range before iterating through
its sections. But it also means that we don't need the heuristic to
ignore special sections that shouldn't be considered part of the address
range (e.g., .init, .data..percpu [the latter of which we should be
ignoring but get away with not because it's excluded from sysfs]).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-05-23 16:27:49 -07:00
Omar Sandoval
4f5249775d Fix various lints
Some functions that could be static found by -Wmissing-prototypes, some
include-what-you-use warnings, some missing SPDX identifiers. These
lints should be automated at some point.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-17 10:45:42 -08:00
Omar Sandoval
914ad8c53d libdrgn: use memswitch for linux_kernel_object_find
Replace the hand-written if-else ladder of memcmp() calls with a
memswitch.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-02-08 02:03:11 -08:00
Omar Sandoval
929b7de266 libdrgn: handle reading data from SHT_NOBITS sections
Peilin Ye reported a couple of related crashes in drgn caused by Linux
kernel modules which had been processed with objcopy --only-keep-debug
(although he notes that since binutils-gdb commit 8c803a2dd7d3
("elf_backend_section_flags and _bfd_elf_init_private_section_data") (in
binutils v2.35), objcopy --only-keep-debug doesn't seem to work for
kernel modules).

If given an SHT_NOBITS section, elf_getdata() returns an Elf_Data with
d_buf = NULL and d_size set to the size in the section header, which is
often non-zero. There are a few places where this can cause us to
dereference a NULL pointer:

* In relocate_elf_sections() for the relocated section data.
* In relocate_elf_sections() for the symbol table section data.
* In get_kernel_module_name_from_modinfo().
* In get_kernel_module_name_from_this_module().

Fix it by checking the section type or directly checking Elf_Data::d_buf
everywhere that could potentially get an SHT_NOBITS section. This is
based on a PR from Peilin Ye.

Closes #145.

Reported-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-01-27 12:23:09 -08:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
0e3054a0ba libdrgn: make addresses wrap around when reading memory
Define that addresses for memory reads wrap around after the maximum
address rather than the current unpredictable behavior. This is done by:

1. Reworking drgn_memory_reader to work with an inclusive address range
   so that a segment can contain UINT64_MAX. drgn_memory_reader remains
   agnostic to the maximum address and requires that address ranges do
   not overflow a uint64_t.
2. Adding the overflow/wrap-around logic to
   drgn_program_add_memory_segment() and drgn_program_read_memory().
3. Changing direct uses of drgn_memory_reader_reader() to
   drgn_program_read_memory() now that they are no longer equivalent.

(For some platforms, a fault might be more appropriate than wrapping
around, but this is a step in the right direction.)

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-06-03 17:49:29 -07:00
Omar Sandoval
a4b9d68a8c Use GPL-3.0-or-later license identifier instead of GPL-3.0+
Apparently the latter is deprecated and the former is preferred.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-04-03 01:10:35 -07:00
Omar Sandoval
63672be809 libdrgn: linux_kernel: save module .init section addresses
Linux kernel modules usually contain ELF relocations in DWARF and ORC
sections for symbols in .init sections. Since we ignore .init sections
entirely in cache_kernel_module_sections(), these relocations end up
being based on an address of 0 (so, e.g., a function from .init.text
could be reported as having an address of 0x0). It makes a little more
sense to use the address where the .init section was before it was
freed. So, let's update the sections' sh_addr but continue ignoring them
for determining the module's address range.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 15:13:47 -07:00
Omar Sandoval
da180b7274 libdrgn: handle errors from elf_strptr()
For some reason, we consistently ignore errors from elf_strptr(), but we
shouldn't.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-26 14:28:16 -07:00
Omar Sandoval
a24c0f5b33 libdrgn: clean up usage of drgn_stop
Use drgn_not_found where it's more appropriate, and check explicitly
against drgn_stop instead of err->code == DRGN_ERROR_STOP.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-03-05 12:46:06 -08:00
Omar Sandoval
9fda010789 Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:

- Byte order is only meaningful for scalars. What does it mean for a
  struct to be big endian? A struct doesn't have a most or least
  significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
  byte order (DW_AT_endianity). The only producer I could find that uses
  this is GCC for the scalar_storage_order type attribute, and it only
  uses it for base types, not variables. GDB only seems to use to check
  it for base types, as well.

So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.

The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-19 21:41:29 -08:00
Omar Sandoval
30cfa40a72 libdrgn: rename "unavailable" objects to "absent" objects
I was going to add an Object.available_ attribute, but that made me
realize that the naming is somewhat ambiguous, as a reference object
with an invalid address might also be considered "unavailable" by users.
Use the name "absent" instead, which is more clear: the object isn't
there at all.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-29 14:58:26 -08:00
Omar Sandoval
e72ecd0e2c libdrgn: replace drgn_program_member_info() with drgn_type_find_member()
Now that types are associated with their program, we don't need to pass
the program separately to drgn_program_member_info() and can replace it
with a more natural drgn_type_find_member() API that takes only the type
and member name. While we're at it, get rid of drgn_member_info and
return the drgn_type_member and bit_offset directly. This also fixes a
bug that drgn_error_member_not_found() ignores the member name length.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-15 14:40:54 -08:00
Omar Sandoval
abafdd965f Remove bit_offset from value objects
There are a couple of reasons that it was the wrong choice to have a
bit_offset for value objects:

1. When we store a buffer with a bit_offset, we're storing useless
   padding bits.
2. bit_offset describes a location, or in other words, part of an
   address. This makes sense for references, but not for values, which
   are just a bag of bytes.

Get rid of union drgn_value.bit_offset in libdrgn, make
Object.bit_offset None for value objects, and disallow passing
bit_offset to the Object() constructor when creating a value. bit_offset
can still be passed when creating an object from a buffer, but we'll
shift the bytes down as necessary to store the value with no offset.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-14 12:29:17 -08:00
Omar Sandoval
22c1d87aec libdrgn: cache page_offset and vmemmap as objects instead of uint64_t
This is a little cleaner and saves on conversions back and forth between
C values and objects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:40:07 -08:00
Omar Sandoval
bce9ef5f8d libdrgn: linux kernel: remove THREAD_SIZE object finder
THREAD_SIZE is still broken and I haven't looked into the root cause
(see commit 95be142d17 ("tests: disable THREAD_SIZE test")). We don't
need it anymore anyways, so let's remove it entirely.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-12-10 02:08:13 -08:00
Omar Sandoval
5975d19580 libdrgn: report better errors when parsing DWARF/kmod index
If the DWARF index encounters any error while parsing, it returns an
error saying only "debug information is truncated", which makes it hard
to track down parsing errors. The kmod index parser silently swallows
errors. For both, replace the mread functions with a higher-level
binary_buffer interface that can include more information including the
location of the error. For example:

  /tmp/mybinary: .debug_info+0x4: expected at least 56 bytes, have 55

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-11-13 17:00:07 -08:00
Omar Sandoval
fa081e32b9 libdrgn: update module section iterator for Linux v5.8
Linux v5.8 changed the module section structure, so we need to get the
section name differently.

Closes #73.

Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 13:07:36 -07:00
Omar Sandoval
1c6465f0b0 libdrgn: fix infinite loop on error caching kernel module sections
If cache_kernel_module_sections() in report_loaded_kernel_module()
fails, we continue to the next iteration without advancing to the next
kernel module. Then, we fail on that same kernel module and repeat. Make
sure that we go to the next kernel module.

Fixes: 423d2cd500 ("libdrgn: dwarf_index: rework file reporting")
Reported-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 12:25:22 -07:00
Omar Sandoval
431b91ddb5 libdrgn: fix use-after-free in kernel module reporting error case
We're freeing path and then using it to report an error.

This has some weird knock-on effects. Since we freed the path, the error
message contains garbage. So, PyErr_SetString() can't decode it as a
UTF-8 string. The end result is a MissingDebugInfoError with no message.

Fix it by creating the error before freeing the path.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 11:55:06 -07:00
Omar Sandoval
2b325b9262 libdrgn: add an environment variable to disable use of /proc/modules and /sys/module
We use /proc/modules and /sys/module to find loaded kernel modules for
the running kernel instead of walking the module list in the core dump
as an optimization. To make it easier to test the core dump path, add an
environment variable to disable the optimization.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-13 11:24:39 -07:00
Omar Sandoval
661a5c56c3 libdrgn: refactor kernel module iterator
The next commit will allow using the offline path for the live kernel,
so the offline naming won't make much sense. Fold the offline path into
the top-level functions, and make the live path an escape hatch. Also
add some comments and improve naming for the file and directory handles
and update the coding style.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 23:22:01 -07:00
Omar Sandoval
ce8540e39c libdrgn: get rid of kernel_module_iterator::notes*
These were added in commit e5874ad18a ("libdrgn: use libdwfl"), but
they have never been used. Remove them.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 17:23:31 -07:00
Omar Sandoval
3c5d22637e libdrgn: clean up hash function APIs and improve documentation
Use *_hash_pair() for hash functions that do the full double hashing and
return a struct hash_pair and hash_*() for other hashing utility
functions. Also change some of the equality function names to be more
symmetric and improve the documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-12 16:20:08 -07:00
Omar Sandoval
fa44171ba1 libdrgn: split bit operations into their own header
And improve their documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-10-09 17:44:15 -07:00
Omar Sandoval
286c09844e Clean up #includes with include-what-you-use
I recently hit a couple of CI failures caused by relying on transitive
includes that weren't always present. include-what-you-use is a
Clang-based tool that helps with this. It's a bit finicky and noisy, so
this adds scripts/iwyu.py to make running it more convenient (but not
reliable enough to automate it in Travis).

This cleans up all reasonable include-what-you-use warnings and
reorganizes a few header files.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-23 16:29:42 -07:00
Omar Sandoval
f83bb7c71b libdrgn: move debugging information tracking into drgn_debug_info
Debugging information tracking is currently in two places: drgn_program
finds debugging information, and drgn_dwarf_index stores it. Both of
these responsibilities make more sense as part of drgn_debug_info, so
let's move them there. This prepares us to track extra debugging
information that isn't pertinent to indexing.

This also reworks a couple of details of loading debugging information:

- drgn_dwarf_module and drgn_dwfl_module_userdata are consolidated into
  a single structure, drgn_debug_info_module.
- The first pass of DWARF indexing now happens in parallel with reading
  compilation units (by using OpenMP tasks).

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-22 10:58:24 -07:00
Omar Sandoval
7a85b4188e libdrgn: clean up read.h helpers and avoid undefined pointer behavior
There are a couple of related ways that we can cause undefined behavior
when parsing a malformed DWARF or depmod index file:

1. There are several places where we increment the cursor to skip past
   some data. It is undefined behavior if the result points out of
   bounds of the data, even if we don't attempt to dereference it.
2. read_in_bounds() checks that ptr <= end. This pointer comparison is
   only defined if ptr and end both point to elements of the same array
   object or one past the last element. If ptr has gone past end, then
   this comparison is likely undefined anyways.

Fix it by adding a helper to skip past data with bounds checking. Then,
all of the helpers can assume that ptr <= end and maintain that
invariant. while we're here and auditing all of the call sites, let's
clean up the API and rename it from read_foo() to the less generic
mread_foo().

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-09-02 17:13:16 -07:00
Omar Sandoval
a97f6c4fa2 Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.

Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.

So, let's reimagine types as being owned by a program. This has a few
parts:

1. In libdrgn, simple types are now created by factory functions,
   drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
   and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
   Program class.
5. While we're changing the API, the parameters to pointer_type() and
   array_type() are reordered to be more logical (and to allow
   pointer_type() to take a default size of None for the program's
   default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
   argument only.

A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 17:41:09 -07:00
Omar Sandoval
c31208f69c libdrgn: fold drgn_type_index into drgn_program
This is preparation for associating types with a program.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 17:36:35 -07:00
Omar Sandoval
948cda2941 libdrgn: add vector/hash table initializers and update coding style
Declaring a local vector or hash table and separately initializing it
with vector_init()/hash_table_init() is annoying. Add macros that can be
used as initializers.

This exposes several places where the C89 style of placing all
declarations at the beginning of a block is awkward. I adopted this
style from the Linux kernel, which uses C89 and thus requires this
style. I'm now convinced that it's usually nicer to declare variables
where they're used. So let's officially adopt the style of mixing
declarations and code (and ditch the blank line after declarations) and
update the functions touched by this change.
2020-07-01 12:48:24 -07:00
Omar Sandoval
e4c52c5422 libdrgn: linux_kernel: use names for kmod index constants
This makes it much easier to follow along with the code and understand
the format.
2020-06-30 15:14:21 -07:00
Omar Sandoval
eea5422546 libdrgn: make Linux kernel stack unwinding more robust
drgn has a couple of issues unwinding stack traces for kernel core
dumps:

1. It can't unwind the stack for the idle task (PID 0), which commonly
   appears in core dumps.
2. It uses the PID in PRSTATUS, which is racy and can't actually be
   trusted.

The solution for both of these is to look up the PRSTATUS note by CPU
instead of PID.

For the live kernel, drgn refuses to unwind the stack of tasks in the
"R" state. However, the "R" state is running *or runnable*, so in the
latter case, we can still unwind the stack. The solution for this is to
look at on_cpu for the task instead of the state.
2020-05-20 12:03:00 -07:00
Omar Sandoval
4d8597f0f8 libdrgn: add THREAD_SIZE to Linux kernel object finder
Despite the naming, this is the kernel stack size.
2020-05-19 17:10:54 -07:00
Omar Sandoval
8b264f8823 Update copyright headers to Facebook and add missing headers
drgn was originally my side project, but for awhile now it's also been
my work project. Update the copyright headers to reflect this, and add a
copyright header to various files that were missing it.
2020-05-15 15:13:02 -07:00
Omar Sandoval
2d1481f5ab libdrgn: add page table walker kernel memory reader
Now that we can walk page tables, we can use it in a memory reader that
reads kernel memory via the kernel page table. This means that we don't
need libkdumpfile for ELF vmcores anymore (although I'll keep the
functionality around until this code has been validated more).
2020-05-08 17:37:56 -07:00