To try out our new testing framework, move some simple Python unit tests
for the internal lexer API to C unit tests.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
So far we've been getting away with only unit testing through Python.
However, there's plenty of (existing and upcoming) internal code that
would be nice to unit test directly in C. For a framework, I opted for
check (https://libcheck.github.io/check/) because it is minimal, mature,
and available on all major distros. Add the autotools scaffolding,
including a copy of the checkmk script from check 0.15.2 since RHEL and
CentOS don't package it. We check the dependencies at configure time but
only fail if they're not available at `make check` time. Also wire up
`setup.py test` to run `make check`.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The C standard treats an empty variable argument list as a single, empty
argument, so PP_NARGS() currently expands to 1. But this is surprising,
especially for PP_OVERLOAD(). Use the , ##__VA_ARGS__ GNU C extension to
make PP_NARGS() expand to 0 instead. (We could also use __VA_OPT__(,) to
achieve the same thing. It has the advantage of being standardized for
C23, but the huge disadvantage that it's only available on relatively
recent versions of GCC and Clang.) Also check that the extension is
supported in configure.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Building with GCC 7.3 fails with:
../../libdrgn/hash_table.h:340:43: error: initializer element is not constant
static const size_t hash_table_max_size = SIZE_MAX >> hash_table_size_shift;
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Port folly commit a20494d7b2cc ("Shrink F14 maps"), which shrinks tables
using the basic storage policy by 8 bytes. This was performance and
memory-usage neutral for startup, but it would probably save some memory
when lots of namespaces are accessed.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
When rehashing a hash table using the vector storage policy, we're
prefetching the index items, but the folly implementation prefetches the
actual entries (because we're about to recalculate their hashes).
Fixes: f94b0262c6 ("libdrgn: hash_table: implement vector storage policy")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We still need a union and some careful casting in a couple of places,
but this is overall much cleaner.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
(vector_size_type)-1 / sizeof(vector_entry_type) is not a limit;
(vector_size_type)-1 is.
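
As a worked illustration of the difference, the sketch below assumes a hypothetical 8-bit size type and 16-byte entries (neither number comes from the code). It also assumes the size field stores a count of entries rather than bytes, in which case the largest value the type can hold is itself the limit:

```python
SIZE_TYPE_BITS = 8  # hypothetical width of vector_size_type
ENTRY_SIZE = 16     # hypothetical sizeof(vector_entry_type)

# Equivalent of (vector_size_type)-1: the largest value the size type can hold.
size_type_max = (1 << SIZE_TYPE_BITS) - 1

# Dividing by the entry size treats the size type as if it counted bytes.
divided_limit = size_type_max // ENTRY_SIZE

# The size type counts entries, so its maximum value is the entry limit.
entry_limit = size_type_max

print(divided_limit, entry_limit)
```

With these assumed numbers, the divided expression would cap the vector far below what the size type can actually represent.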
Fixes: b450a7b02b ("libdrgn: vector: support using a smaller type for size/capacity")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Avoid a repeat of commit f34f1c278f ("libdrgn/python: fix #includes in
symbol.c") by replacing automake's default, global -I. -I$(srcdir) with
-iquote . only for libdrgnimpl.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Our internal Buck build of drgn doesn't use -I$(srcdir) like automake
does, so #include "drgn.h" and #include "symbol.h" in
libdrgn/python/symbol.c don't work. "drgn.h" is included by "drgnpy.h",
so we can drop that one and use a relative path for "symbol.h" instead.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Expose the Symbol finder API so that Python code can be used to look up
additional symbols by name or address.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Previously, Symbol objects could not be constructed in Python. However,
in order to allow Python Symbol finders, this needs to be changed.
Unfortunately, Symbol name lifetimes are tricky to manage. We introduce
a lifetime enumeration to handle this. The lifetime may be "static",
i.e. longer than the life of the program; "external", i.e. longer than
the life of the symbol, but no guarantees beyond that; or "owned", i.e.
owned by the Symbol itself.
Symbol objects constructed in Python are "external". The Symbol struct
owns the pointer to the drgn_symbol, and it holds a reference to the
Python object keeping the name valid (either the program, or a PyUnicode
object).
The added complexity is justified by the fact that most symbols are from
the ELF file, and thus share a lifetime with the Program. It would be a
waste to constantly strdup() these strings, just to support a small
number of Symbols created by Python code.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Now that the symbol finder API is created, we can move the ELF symbol
implementation into the debug_info.c file, where it more logically
belongs. The only change to these functions in the move is to declare
elf_symbols_search as static.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
The drgn_program_find_symbol_by_address_internal() function is used when
libdrgn itself may want to look up a symbol: in particular, when
formatting stack traces or objects. It does less work by possibly
already having a Dwfl_Module looked up, and by avoiding memory
allocation of a symbol, and it's more convenient because it doesn't
return any errors, including on lookup failure.
Unfortunately, the new symbol finder API breaks all of these properties:
the returned symbol is now allocated via malloc() which needs cleanup on
error, and errors can be returned by any finder via the lookup API.
What's more, the finder API doesn't allow specifying an already-known
module. Thankfully, error handling can be improved using the cleanup
API, and looking up a module for an address is usually a reasonably
cheap binary tree operation.
Switch the internal method over to the new finder API. The major
difference now is simply that lookup failures don't result in an error:
they simply result in a NULL symbol.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
The following commit will modify it to use
drgn_program_symbols_search(), a static function declared below. Move it
underneath in preparation. No changes to the function.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Symbol lookup is not yet modular, like type or object lookup. However,
making it modular would enable easier development and prototyping of
alternative Symbol providers, such as Linux kernel module symbol tables,
vmlinux kallsyms tables, and BPF function symbols. To begin with, create
a modular Symbol API within libdrgn, and refactor the ELF symbol search
to use it.
For now, we leave drgn_program_find_symbol_by_address_internal() alone.
Its conversion will require some surgery, since the new API can return
errors, whereas this function cannot.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
By using __attribute__((__packed__)), we shrink each enum from the
default integer size of four bytes, down to the minimum size of one.
This reduces the size of drgn_symbol from 32 bytes down to 26, with 6
bytes of padding. It doesn't have a practical benefit yet, but adding
fields to struct drgn_symbol in the future may not increase the size.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
It can be confusing and misleading to see a FaultError for a strange
address that is actually physical.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Python 3.13.0a4 added a new public function, PyLong_AsNativeBytes(), to
replace the private _PyLong_AsByteArray(). It also modified the
signature of _PyLong_AsByteArray(). Let's use PyLong_AsNativeBytes()
when it's available. (PyLong_AsNativeBytes() also has the exact overflow
behavior we wanted, so it's a win-win.)
Closes #385.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
My elfutils patches to support .dwp files were just merged and included
in release 0.191. libdw does all of the heavy lifting; we just need to
apply the section offsets when we parse DWARF ourselves. We still need
to support older versions of elfutils, so add a stub.
Closes #317.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Type units don't have a skeleton unit, so we need to walk over all of
the units in the split DWARF file to find them. Instead of doing this in
a second pass, rework drgn_dwarf_index_read_cus(): instead of
substituting skeleton units with their respective split units, call
drgn_dwarf_index_read_cus() recursively on the split DWARF file.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
_PyDict_GetItemIdWithError() and _PyDict_SetItemId() have
straightforward replacements, so no need to fight this upstream.
Closes #361.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
We never free drgn_type::template_parameters.
Fixes: 352c31e1ac ("Add support for C++ template parameters")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
In tag-based KASAN modes, TCR_EL1.TBI1 is enabled, which causes the
top 8 bits of virtual addresses to be ignored for address translation
purposes. Do the same when reading from memory. There is no harm in doing
so unconditionally, as the architecture does not support >56 bit VA sizes.
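
A minimal sketch of what ignoring the top byte can look like, written in Python for illustration only. It replaces bits 63:56 with copies of bit 55, which is one common way to strip a tag; the function name `untag` is hypothetical, and this commit may mask the address differently:

```python
def untag(addr: int) -> int:
    """Replace the top 8 bits of a 64-bit address with copies of bit 55."""
    addr &= (1 << 64) - 1           # treat the input as a 64-bit value
    low = addr & ((1 << 56) - 1)    # keep bits 55:0
    if addr & (1 << 55):            # bit 55 set: upper half of the address space
        return low | (0xFF << 56)
    return low                      # bit 55 clear: lower half


# A tagged pointer (tag 0xF2) and the same pointer with the tag stripped.
print(hex(untag(0xF2FF000012345678)))
```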
Signed-off-by: Peter Collingbourne <pcc@google.com>
This function is useful when you only have a list of PCs
and not the full stack trace, for example when working with
the stack depot.
Signed-off-by: Peter Collingbourne <pcc@google.com>
We don't allow this because "value objects with a scalar type cannot be
reinterpreted, as their memory layout in the program is not known". That
doesn't really make sense: we already support reconstructing the
in-memory representation with drgn_object_read_bytes().
Implement this by making drgn_object_slice() support slicing all
objects, using drgn_object_read_bytes() when necessary, then make
drgn_object_reinterpret() a trivial wrapper around it.
Closes #378.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Our existing flexible array uses all have extra scaffolding around
them, so this isn't applicable for those, but PR #376 can make use of
it.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The generated header causes confusion for some tooling. The only reason
we're generating it is to substitute the version number. For now, let's
just manually duplicate the version number from configure.ac. We could
probably do something fancier like what autoconf itself does [1], but
that looks much more involved than simply adding a step to my release
runbook.
Closes #375.
1: https://lists.gnu.org/archive/html/autoconf/2007-12/msg00027.html
Signed-off-by: Omar Sandoval <osandov@osandov.com>
For the most part, this simply entails adding the correct decorator.
There are some notable conversions:
- All of the memory management helpers that already had
(prog: Program, addr: IntegerLike) and (addr: Object) overloads are
now much simpler and support keyword arguments.
- Helpers that already took a Program or an Object are also now much
simpler and support keyword arguments.
- Helpers that previously took a Program and positional parameters with
a default value (path_lookup(), for_each_mount(), print_mounts()) had
those parameters converted to keyword-only. This is not
backwards-compatible, unfortunately.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Some feedback that I've gotten that resonated with me was that it feels
silly and too verbose to always pass prog to helpers, like
for_each_task(prog), find_task(prog, pid), etc. Passing prog makes sense
from the library point of view, but it's not great for interactive
usability. And even when you include usage as a library, the vast
majority of use cases only need one program.
So, let's introduce the concept of the "default program". It has a
getter (get_default_prog()) and a setter (set_default_prog()). The CLI
sets it automatically. Library users can do it manually if they want to.
It is a per-thread setting.
Upcoming commits will update all of our helpers and functions that take
a Program to make it optional and default to the default program.
P.S. This was inspired by asyncio, which has many interfaces that take
an optional loop parameter but default to the current loop. Cf.
asyncio.get_event_loop() and asyncio.set_event_loop().
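
A per-thread getter and setter like the ones described above could be sketched as follows. This is a minimal illustration, not the actual implementation: the storage mechanism and the exception name `NoDefaultProgramError` are assumptions, and plain strings stand in for Program objects:

```python
import threading


class NoDefaultProgramError(Exception):
    """Assumed error type for when no default program has been set."""


# Each thread sees its own attributes on this object.
_local = threading.local()


def set_default_prog(prog):
    """Set the default program for the calling thread only."""
    _local.prog = prog


def get_default_prog():
    """Return the calling thread's default program, or raise if unset."""
    prog = getattr(_local, "prog", None)
    if prog is None:
        raise NoDefaultProgramError("no default program set")
    return prog
```

Because the setting is stored per thread in this sketch, a value set in the main thread is not visible from a newly started thread until that thread calls `set_default_prog()` itself.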
Signed-off-by: Omar Sandoval <osandov@osandov.com>
Return DRGN_ERROR_INVALID_ARGUMENT with a reasonable error message if
libkdumpfile cannot open a file.
The error output currently looks something like:
Traceback (most recent call last):
  File "/research/bin/drgn", line 33, in <module>
    sys.exit(load_entry_point('drgn', 'console_scripts', 'drgn')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/research/src/drgn/drgn/cli.py", line 264, in _main
    prog.set_core_dump(args.core)
Exception: kdump_set_number_attr(KDUMP_ATTR_FILE_FD): File #0: Unknown file format
Replace it with this:
error: file #0: Unknown file format
This is similar to the error message when trying to open a file that is
neither KDUMP nor ELF:
error: not an ELF core file
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Reduce the error to a warning. Support for the flattened KDUMP format was
added in libkdumpfile-0.5.3, so a sufficiently recent libkdumpfile can open
flattened files directly.
However, the flattened file format requires scanning the whole file first
to build a map of flattened file segments, so opening a large file may be
too slow. Issue a warning, so users know they have the option to reassemble
the vmcore.
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
To allow logging in the function, let it take prog as a parameter. OTOH
the file descriptor then need not be passed separately.
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
The current linux_helper_task_iterator implementation does a loop over
the tasks list and an inner loop over task->thread_group. However, Linux
kernel commit 8e1f385104ac ("kill task_struct->thread_group") (in
v6.7-rc1) broke this.
Rework the implementation to loop over task->signal->thread_head,
equivalent to the kernel's for_each_process_thread() macro. This works
for all supported kernel versions and 6.7.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
The current page table walker will on average read around half of the
entire page table for each level. This is inefficient, especially when
debugging a remote target which may have a low bandwidth connection to
the debugger. Address this by only reading one PTE per level.
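
The idea of reading one PTE per level can be sketched as below. This is a simplified Python model, not the aarch64 walker itself: it assumes 4 KiB pages, four levels of 512 entries, and a descriptor format reduced to a valid bit plus an output address, and it ignores block mappings. `translate()` and `make_reader()` are hypothetical names:

```python
import struct

PAGE_SHIFT = 12      # assumed 4 KiB pages
BITS_PER_LEVEL = 9   # assumed 512 entries per table
LEVELS = 4           # assumed 48-bit virtual addresses
ENTRY_SIZE = 8       # bytes per page table entry
VALID = 1 << 0       # simplified: bit 0 marks a valid entry
ADDR_MASK = ((1 << 48) - 1) & ~((1 << PAGE_SHIFT) - 1)


def translate(read_phys, table_base, virt_addr):
    """Walk the tables, reading exactly one entry per level."""
    table = table_base
    for level in range(LEVELS):
        shift = PAGE_SHIFT + BITS_PER_LEVEL * (LEVELS - 1 - level)
        index = (virt_addr >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        # Read only the 8-byte entry we need, not the surrounding table.
        raw = read_phys(table + ENTRY_SIZE * index, ENTRY_SIZE)
        (entry,) = struct.unpack("<Q", raw)
        if not entry & VALID:
            raise LookupError(f"address {virt_addr:#x} is not mapped")
        table = entry & ADDR_MASK
    return table | (virt_addr & ((1 << PAGE_SHIFT) - 1))


def make_reader(entries):
    """Return a fake physical memory reader and a log of the reads it served."""
    log = []

    def read_phys(addr, size):
        log.append((addr, size))
        return struct.pack("<Q", entries.get(addr, 0))

    return read_phys, log
```

Under these assumptions a translation costs four 8-byte reads, which matters most when every read goes over a slow link to a remote target.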
I've only done the aarch64 page table walker because that's all that I
needed, but in principle the other page table walkers could work in a
similar way.
Signed-off-by: Peter Collingbourne <pcc@google.com>
This reverts commit 747e02857d (except for
the test improvements). Peter Collingbourne noticed that the change I
used to test the performance of reading a single PTE at a time [1]
didn't cache higher level entries. Keeping that caching makes the
regression I was worried about negligible. So, there's no reason to add
the extra complexity of the hint.
1: https://github.com/osandov/drgn/pull/312#issuecomment-1754082129
Signed-off-by: Omar Sandoval <osandov@osandov.com>
- Fix indentation that used seven spaces instead of a tab.
- Use //-style comments.
- Put "imports" first.
- Call after setting up all other types so that future changes can set
up aliases referring to those types.
Signed-off-by: Omar Sandoval <osandov@osandov.com>