Commit Graph

728 Commits

Author SHA1 Message Date
Omar Sandoval
d4e0771f87 libdrgn: return error from drgn_program_{is_little_endian,bswap,is_64_bit}()
Most places that call these check has_platform and return an error, and
those that don't can live with the extra check.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 16:56:28 -07:00
Omar Sandoval
a8d632b4c1 libdrgn/python: use F14 instead of PyDict for Program::objects
Program::objects is used to store references to objects that must stay
alive while the Program is alive. It is currently a PyDict where the
keys are the object addresses as PyLong and the values are the objects
themselves. This has two problems:

1. Allocating the key as a full object is obviously wasteful.
2. PyDict doesn't have an API for reserving capacity ahead of time,
   which we want for an upcoming change.

Both of these are easily fixed by using our own hash table.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 16:56:28 -07:00
Omar Sandoval
b0f9403ebf drgndoc: directly use name passed as argument to drgndoc directive
E.g., drgndoc:: foo.bar() should emit py:method:: foo.bar() regardless
of a previous py:module directive.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 16:56:28 -07:00
Omar Sandoval
93e33513da drgndoc: bring back :exclude:
It's still useful to have an escape hatch for names we don't want
documented.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-26 16:36:55 -07:00
Omar Sandoval
d40526d85d scripts: add Python include header path to cscope
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-25 18:07:31 -07:00
arsarwade
6f6c5f272f libdrgn: export function drgn_object_init() (#70)
drgn_object_init() is available in the drgn.h file and seems to be a
required call before calling drgn_program_find_object().

Without this, trying to call drgn_object_init() from an external C
application results in an undefined reference.

Signed-off-by: Aditya Sarwade <asarwade@fb.com>
2020-08-21 10:24:52 -07:00
Omar Sandoval
903a44d0dd travis: upgrade to Ubuntu 20.04
This picks up a newer version of QEMU and lets us use udevadm trigger
-w. Let's also explicitly add "os: linux" to silence the config
validation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 17:57:39 -07:00
Omar Sandoval
7fb196cfbf vmtest: don't use onoatimehack on QEMU 5.1.0
As of QEMU commit a5804fcf7b22 ("9pfs: local: ignore O_NOATIME if we
don't have permissions") (in v5.1.0), QEMU handles O_NOATIME sanely, so
we don't need the LD_PRELOAD hack. Since we're adding a version check,
make the multidevs check based on the version, too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 17:55:57 -07:00
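The version check mentioned in 7fb196cfbf could look roughly like the sketch below. It is only an illustration: the function names are hypothetical, and it assumes the version banner has the form "QEMU emulator version X.Y.Z", which is not taken from the vmtest code itself.

```python
import re
from typing import Tuple


def parse_qemu_version(banner: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a QEMU version banner.

    Assumes a banner such as "QEMU emulator version 5.1.0".
    """
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"could not parse QEMU version from {banner!r}")
    major, minor, patch = (int(group) for group in match.groups())
    return major, minor, patch


def needs_onoatime_hack(version: Tuple[int, int, int]) -> bool:
    """The commit says QEMU handles O_NOATIME sanely as of v5.1.0."""
    return version < (5, 1, 0)


old = parse_qemu_version("QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6)")
new = parse_qemu_version("QEMU emulator version 5.1.0")
```

Comparing version tuples rather than strings keeps a release like 5.10.0 from sorting before 5.2.0.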
Omar Sandoval
656d85f2fe travis: check Python code with black, isort, and mypy
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 16:55:07 -07:00
Omar Sandoval
4e770fb18a Format imports with isort
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 16:55:07 -07:00
Omar Sandoval
8c7c80e2f7 Fix mypy --strict warnings
The remaining warnings are all no-any-return, which is hard to avoid in
drgn.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 16:28:02 -07:00
Omar Sandoval
0cf3320a89 Add type annotations to helpers
Now that drgndoc can handle overloads and we have the IntegerLike and
Path aliases, we can add type annotations to all helpers. There are also
a couple of functional changes that snuck in here to make annotating
easier.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 16:28:02 -07:00
Omar Sandoval
e4a2676cac drgndoc: support @typing.overload()
One of the blockers for adding type annotations to helpers is that some
helpers need to be overloaded, but drgndoc doesn't support that. This
adds support. Each function now tracks all of its overloaded signatures,
each of which may be documented separately. The formatted output (for
functions/methods and classes with __init__()) combines all of the
documented overloads.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
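As a rough illustration of the overload tracking described in e4a2676cac (not drgndoc's actual code), a documentation tool could gather every @typing.overload definition of a name together with its own docstring. The scale() function and the collect_overloads() helper are made up for this sketch.

```python
import ast
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

# Hypothetical module source with two documented overloads.
SOURCE = '''
import typing

@typing.overload
def scale(value: int, factor: int) -> int:
    """Scale a single integer."""
    ...

@typing.overload
def scale(value: list, factor: int) -> list:
    """Scale every element of a list."""
    ...

def scale(value, factor):
    if isinstance(value, list):
        return [v * factor for v in value]
    return value * factor
'''


def is_overload(decorator: ast.expr) -> bool:
    # Matches both "@overload" and "@typing.overload".
    if isinstance(decorator, ast.Name):
        return decorator.id == "overload"
    if isinstance(decorator, ast.Attribute):
        return decorator.attr == "overload"
    return False


def collect_overloads(source: str) -> Dict[str, List[Tuple[str, Optional[str]]]]:
    """Map each function name to (argument list, docstring) per overload."""
    signatures = defaultdict(list)
    for node in ast.parse(source).body:
        if isinstance(node, ast.FunctionDef) and any(
            is_overload(d) for d in node.decorator_list
        ):
            args = ", ".join(a.arg for a in node.args.args)
            signatures[node.name].append((args, ast.get_docstring(node)))
    return dict(signatures)


overloads = collect_overloads(SOURCE)
```

The undecorated implementation is skipped, so only the two documented signatures end up under "scale".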
Omar Sandoval
64a04a6c4f drgndoc: include attributes based on presence of docstring
We can get rid of the :include: and :exclude: options by deciding solely
based on whether a node has a docstring. Empty docstrings can be used to
indicate nodes that should be included with no additional content. The
__init__() method must now also have a docstring in order to be
documented. Additionally, the directives are now fully formatted by the
Formatter rather than being split between the Formatter and
DrgnDocDirective.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
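The docstring-based rule from 64a04a6c4f can be sketched with the standard ast module. This is an assumption-laden illustration rather than drgndoc's implementation: it only looks at top-level functions and classes, and documented_names() is a hypothetical helper.

```python
import ast
from typing import Dict

# Hypothetical module: one documented function, one included with an
# empty docstring, and one with no docstring at all.
SOURCE = '''
def documented():
    """Has real documentation."""

def included_bare():
    ""

def hidden():
    pass
'''


def documented_names(source: str) -> Dict[str, str]:
    """Return the nodes that would be documented, keyed by name."""
    result = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            # None means "no docstring"; an empty string still counts.
            if doc is not None:
                result[node.name] = doc
    return result


names = documented_names(SOURCE)
```

The key detail is testing for None rather than truthiness, so that an empty docstring includes the node with no additional content.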
Omar Sandoval
f41cc7fb48 drgndoc: recursively document names imported with alias
The helpers implemented in C have Python wrappers only for the purpose
of documentation. This is because drgndoc ignores all imports when
recursively documenting attributes. However, mypy uses the convention
that aliased imports (i.e., import ... as ... or from ... import ... as
...) are considered re-exported, so we can follow that convention and
include aliased imports. (mypy also considers attributes in __all__ as
re-exported, so we should probably follow that in the future, too, but
for now aliased imports are enough). This lets us get rid of the Python
wrappers.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
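The aliased-import convention described in f41cc7fb48 can be illustrated with a small ast walk. This is a sketch, not drgndoc's code; the _helpers module name and reexported_names() are placeholders, and the source is only parsed, never imported.

```python
import ast
from typing import List

# Hypothetical module source mixing plain and aliased imports.
SOURCE = '''
import os
import os.path as osp
from collections import OrderedDict
from collections import defaultdict as dd
from _helpers import list_for_each as list_for_each
'''


def reexported_names(source: str) -> List[str]:
    """Names bound by "import ... as ..." or "from ... import ... as ..."."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # asname is None unless the import used an "as" clause.
                if alias.asname is not None:
                    names.append(alias.asname)
    return names


exported = reexported_names(SOURCE)
```

Note that "import x as x" counts as aliased even though the name does not change, which is what makes it usable as an explicit re-export marker.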
Omar Sandoval
192d35c609 drgndoc: support relative imports
Mainly for completeness, as I don't really like using them in my own
projects.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
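For reference, resolving a relative import to an absolute module name (the problem 192d35c609 addresses) can be sketched with importlib.util.resolve_name(). The package name pkg.sub and the absolute_module() helper are assumptions for illustration; drgndoc may resolve names differently.

```python
import ast
import importlib.util


def absolute_module(node: ast.ImportFrom, package: str) -> str:
    """Resolve the module named by a "from ... import" statement.

    node.level counts the leading dots; node.module is None for
    statements like "from . import sibling".
    """
    relative = "." * node.level + (node.module or "")
    return importlib.util.resolve_name(relative, package)


tree = ast.parse(
    "from ..util import escape\n"
    "from . import sibling\n"
    "from os import path\n"
)
# Pretend the statements live in a module inside the package "pkg.sub".
resolved = [absolute_module(node, "pkg.sub") for node in tree.body]
```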
Omar Sandoval
a270525f8b drgndoc: save all modules and classes traversed to resolve name
This will be used to support relative imports.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
Omar Sandoval
4a3b8fb8e6 drgndoc: fix mypy --strict errors
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:21:29 -07:00
Omar Sandoval
2d49ef657b Add Path type alias
Rather than duplicating Union[str, bytes, os.PathLike] everywhere, add
an alias. Also make it explicitly os.PathLike[str] or os.PathLike[bytes]
to get rid of some mypy --strict errors.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:20:29 -07:00
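A minimal sketch of the kind of alias 2d49ef657b describes. The exact definition in drgn may differ; to_text() is a hypothetical consumer, and the string forward references are used here only so the snippet also runs on Python versions where os.PathLike cannot be subscripted at runtime.

```python
import os
import pathlib
from typing import Union

# One alias instead of repeating the Union in every signature.
Path = Union[str, bytes, "os.PathLike[str]", "os.PathLike[bytes]"]


def to_text(path: Path) -> str:
    """Normalize any accepted path form to str (hypothetical helper)."""
    return os.fsdecode(path)


from_str = to_text("/proc/kcore")
from_bytes = to_text(b"/proc/kcore")
from_pathlike = to_text(pathlib.PurePosixPath("/proc/kcore"))
```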
Omar Sandoval
66c5cc83a6 Add IntegerLike type annotation
Lots of interfaces in drgn transparently turn an integer Object into an
int by using __index__(), so add an IntegerLike protocol for this and
use it everywhere applicable.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-20 11:16:50 -07:00
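The idea behind 66c5cc83a6 can be sketched as a typing.Protocol with an __index__() method. The protocol body below is an assumption about its shape, and FakeObject and page_number() are stand-ins invented for this example rather than drgn APIs.

```python
import operator
from typing import Protocol


class IntegerLike(Protocol):
    """Anything that can be losslessly converted to int via __index__()."""

    def __index__(self) -> int: ...


class FakeObject:
    """Hypothetical stand-in for an integer drgn Object."""

    def __init__(self, value: int) -> None:
        self._value = value

    def __index__(self) -> int:
        return self._value


def page_number(addr: IntegerLike, page_shift: int = 12) -> int:
    # operator.index() calls __index__(), so plain ints and
    # integer-like objects are handled the same way.
    return operator.index(addr) >> page_shift


from_int = page_number(0x3000)
from_object = page_number(FakeObject(0x5FFF))
```

Plain int already implements __index__(), so it satisfies the protocol without any wrapper.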
Omar Sandoval
2345325ac1 drgndoc: handle implicit classmethods
The __init_subclass__ and __class_getitem__ methods are always class
methods even if not decorated as such, so format them accordingly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-11 23:18:42 -07:00
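The language behavior that 2345325ac1 relies on can be seen in a short example. The Base and Child classes are invented for illustration; neither special method carries a @classmethod decorator, yet both receive the class as their first argument.

```python
class Base:
    registry = []

    # Implicitly a class method: called with the new subclass as cls.
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Base.registry.append(cls.__name__)

    # Implicitly a class method: called for Base[...] and Child[...].
    def __class_getitem__(cls, item):
        return f"{cls.__name__}[{item.__name__}]"


class Child(Base):
    pass


base_label = Base[int]
child_label = Child[str]
```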
Omar Sandoval
b8aa2dcfc5 drgndoc: format None, True, and False as roles
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-11 22:51:56 -07:00
Omar Sandoval
be85631471 travis: fix spurious VM crashes
Every few builds or so, a vmtest VM crashes after printing "x86: Booting
SMP configuration:". After some difficult debugging, I determined that
the crash happens in arch/x86/realmode/rm/trampoline_64.S (the code that
initializes secondary CPUs) at the ljmp from startup_32 to startup_64.
The real problem happens earlier in startup_32:

	movl	$pa_trampoline_pgd, %eax
	movl	%eax, %cr3

Sometimes, the store to CR3 "fails" and CR3 remains zero, which causes
the later ljmp to triple fault.

This can be reproduced by the following script:

	#!/bin/sh

	curl -L 'https://www.dropbox.com/sh/2mcf2xvg319qdaw/AABFKsISWRpndNZ1gz60O-qSa/x86_64/vmlinuz-5.8.0-rc7-vmtest1?dl=1' -o vmlinuz

	cat > commands.gdb << "EOF"
	set confirm off
	target remote :1234

	# arch/x86/realmode/rm/trampoline_64.S:startup_32 after CR3 store.
	hbreak *0x9ae09 if $cr3 == 0
	command
	info registers eax cr3
	quit 1
	end

	# kernel/smp.c:smp_init() after all CPUs have been brought up. If we get here,
	# the bug wasn't triggered.
	hbreak *0xffffffff81ed4484
	command
	kill
	quit 0
	end

	continue
	EOF

	while true; do
		qemu-system-x86_64 -cpu host -enable-kvm -smp 64 -m 128M \
			-nodefaults -display none -serial file:/dev/stdout -no-reboot \
			-kernel vmlinuz -append 'console=0,115200 panic=-1 nokaslr' \
			-s -S &

		gdb -batch -x commands.gdb || exit 1
	done

This seems to be a problem with nested virtualization that was fixed by
Linux kernel commit b4d185175bc1 ("KVM: VMX: give unrestricted guest
full control of CR3") (in v4.17). Apparently, the Google Cloud hosts
that Travis runs on are missing this fix. We obviously can't patch those
hosts, but we can work around it. Disabling unrestricted guest support
in the Travis VM causes CR3 stores in the nested vmtest VM to be
emulated, bypassing the bug.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-08-04 16:36:09 -07:00
Omar Sandoval
20bcde1f1d drgn 0.0.7
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 23:32:32 -07:00
Omar Sandoval
9f3fadd3de README: use code-block instead of highlight
PyPI's RST parser apparently doesn't know the highlight directive, which
snuck into the README in commit 4de147e478 ("Add CONTRIBUTING.rst").
Use code-block instead.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 23:28:23 -07:00
Omar Sandoval
025989871b drgn 0.0.6
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 17:25:54 -07:00
Omar Sandoval
e3309765f9 helpers: add kaslr_offset() and move pgtable_l5_enabled()
Make the KASLR offset available to Python in a new
drgn.helpers.linux.boot module, and move pgtable_l5_enabled() there,
too.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 17:00:16 -07:00
Omar Sandoval
d118fda740 vmtest: check that downloaded file is not truncated
My work VPN is apparently closing HTTP connections prematurely, which
exposed that urllib won't catch incomplete reads if copied through
shutil.copyfileobj(). Check it explicitly.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 12:19:46 -07:00
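An explicit truncation check like the one d118fda740 mentions might be sketched as follows. This is not the vmtest code: copy_checked() is a hypothetical helper, the expected length is passed in by the caller (for example from a Content-Length header), and in-memory streams stand in for the HTTP response and the output file.

```python
import io
import shutil
from typing import BinaryIO


def copy_checked(src: BinaryIO, expected_length: int, dest: BinaryIO) -> int:
    """Copy src to dest and fail if fewer bytes arrived than expected."""
    start = dest.tell()
    # copyfileobj() simply stops at EOF, so a short read goes unnoticed.
    shutil.copyfileobj(src, dest)
    written = dest.tell() - start
    if written != expected_length:
        raise OSError(
            f"truncated download: got {written} of {expected_length} bytes"
        )
    return written


complete = copy_checked(io.BytesIO(b"abcd"), 4, io.BytesIO())

try:
    copy_checked(io.BytesIO(b"ab"), 4, io.BytesIO())
    truncated_detected = False
except OSError:
    truncated_detected = True
```

Using dest.tell() requires a seekable destination; counting bytes in a manual read loop would avoid that assumption.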
Omar Sandoval
d45aafe43c vmtest: manage: remove 3.16 blacklist
3.16 is EOL and no longer included in the list of releases.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 12:00:35 -07:00
Omar Sandoval
27e73fb84b docs: fix broken link to drgn.h
drgn.h is generated from drgn.h.in since commit d60c6a1d68 ("libdrgn:
add register information to platform").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-27 11:56:36 -07:00
Omar Sandoval
e7f353c118 libdrgn: hash_table: clean up coding style
Clean up the coding style of the remaining few places that the last
couple of changes didn't rewrite.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 11:53:05 -07:00
Omar Sandoval
f94b0262c6 libdrgn: hash_table: implement vector storage policy
The folly F14 implementation provides 3 storage policies: value, node,
and vector. The default F14FastMap/F14FastSet chooses between the value
and vector policies based on the value size.

We currently only implement the value policy, as the node policy is easy
to emulate and the vector policy would've added more complexity. This
adds support for the vector policy (adding even more C abuse :) and
automatically chooses the policy the same way as folly. It'd be easy to
add a way to choose the policy if needed.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 11:53:00 -07:00
Omar Sandoval
9ea11a7c26 libdrgn: hash_table: port reserve optimization
The only major change to the folly F14 implementation since I originally
ported it is commit 3d169f4365cf ("memory savings for F14 tables with
explicit reserve()"). That is a small improvement for small tables and a
large improvement for vector tables, which are about to be added.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 01:42:37 -07:00
Omar Sandoval
2eab47ce9e libdrgn: hash_table: use posix_memalign() instead of aligned_alloc()
posix_memalign() doesn't have the restriction that the size must be a
multiple of the alignment like aligned_alloc() does in C11.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 01:42:37 -07:00
Omar Sandoval
2409868409 libdrgn: hash_table: define chunk alignment constant
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 01:42:37 -07:00
Omar Sandoval
209eaee485 setup.py: import setuptools before distutils
setuptools recently started warning if distutils is imported before it.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-18 01:41:12 -07:00
Omar Sandoval
6d4af7e17e libdrgn: dwarf_info_cache: handle variables with DW_AT_const_value
Compile-time constants have DW_AT_const_value instead of DW_AT_location.
We can translate those to a value object.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-13 15:23:51 -07:00
Omar Sandoval
213c148ce6 libdrgn: dwarf_info_cache: handle DW_AT_endianity
Variables can have a non-default endianity. Handle it and clean up
variable endian handling.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-13 14:26:58 -07:00
Omar Sandoval
c840072d05 libdrgn: make drgn_object_set_buffer() take a void *
It's awkward to make callers cast to char *.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-13 10:25:03 -07:00
Omar Sandoval
f1eaf5b14c libdrgn: add load_debug_info example program
Really it's more of a test program than an example program. It's useful
for benchmarking, testing with valgrind, etc. It's not built by default,
but it can be built manually with:

  $ make -C build/temp.* examples/load_debug_info

And run with:

  $ ./build/temp.*/examples/load_debug_info

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-10 16:18:58 -07:00
Omar Sandoval
3028da4d1d libdrgn: compare language in drgn_type_eq()
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-08 22:07:49 -07:00
Omar Sandoval
1409b56d24 travis.yml: remove unnecessary sudo from echo
2020-07-08 18:39:37 -07:00
Omar Sandoval
27744108e1 setup.py: add 5.8 to vmtest kernels
2020-07-08 18:39:37 -07:00
Omar Sandoval
95be142d17 tests: disable THREAD_SIZE test
GCC 10 doesn't generate a DIE for union thread_union, which breaks our
THREAD_SIZE object finder. The previous change removed our internal
dependency on THREAD_SIZE, so disable this test while I investigate why
GCC changed.
2020-07-08 18:36:10 -07:00
Omar Sandoval
1b47b866b4 libdrgn: go back to trusting PRSTATUS PID
Commit eea5422546 ("libdrgn: make Linux kernel stack unwinding more
robust") overlooked that if the task is running in userspace, the stack
pointer in PRSTATUS obviously won't match the kernel stack pointer.
Let's bite the bullet and use the PID. If the race shows up in practice,
we can try to come up with another workaround.
2020-07-08 18:34:16 -07:00
Omar Sandoval
4de147e478 Add CONTRIBUTING.rst
This documents best practices for contributing to drgn. We now require a
DCO sign-off.

Also clean up some related areas in the documentation.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-07 17:44:02 -07:00
Omar Sandoval
293418294a libdrgn: assume compiler uses sane integer implementation
I once tried to implement a generic arithmetic right shift macro without
relying on any implementation-defined behavior, but this turned out to
be really hard. drgn is fairly tied to GCC and GCC-compatible compilers
(like Clang), so let's just assume GCC's model [1]: modular conversion
to signed types, two's complement signed bitwise operators, and sign
extension for signed right shift.

1: https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html
2020-07-07 17:18:17 -07:00
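The integer model assumed in 293418294a can be modeled in Python, whose integers are unbounded and whose >> already sign-extends. The helpers below are illustrative only and are not part of libdrgn; they mimic modular conversion to a fixed-width signed type and a sign-extending right shift.

```python
def to_signed(value: int, bits: int) -> int:
    """Modular conversion of any integer to a signed type of `bits` bits."""
    value &= (1 << bits) - 1      # reduce modulo 2**bits
    if value >> (bits - 1):       # sign bit set: wrap to the negative range
        value -= 1 << bits
    return value


def arithmetic_shift_right(value: int, shift: int, bits: int) -> int:
    """Signed right shift with sign extension, as GCC documents it."""
    return to_signed(value, bits) >> shift


all_ones = to_signed(0xFFFFFFFF, 32)
minimum = to_signed(0x80000000, 32)
wrapped = to_signed(300, 8)
shifted = arithmetic_shift_right(0xFFFFFFF0, 4, 32)
```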
Omar Sandoval
948cda2941 libdrgn: add vector/hash table initializers and update coding style
Declaring a local vector or hash table and separately initializing it
with vector_init()/hash_table_init() is annoying. Add macros that can be
used as initializers.

This exposes several places where the C89 style of placing all
declarations at the beginning of a block is awkward. I adopted this
style from the Linux kernel, which uses C89 and thus requires this
style. I'm now convinced that it's usually nicer to declare variables
where they're used. So let's officially adopt the style of mixing
declarations and code (and ditch the blank line after declarations) and
update the functions touched by this change.
2020-07-01 12:48:24 -07:00
Omar Sandoval
e4c52c5422 libdrgn: linux_kernel: use names for kmod index constants
This makes it much easier to follow along with the code and understand
the format.
2020-06-30 15:14:21 -07:00
Omar Sandoval
03d8cb0e32 libdrgn: fix hash_pair_from_non_avalanching_hash() on 64-bit without SSE 4.2
We were forgetting to mask away the extra bits. There are two places
that we use the tag without converting it to a uint8_t:
hash_table_probe_delta(), which is mostly benign since we mask it by the
chunk mask anyways; and table_chunk_match() without SSE 2, which
completely breaks.

While we're here, let's align the comments better.
2020-06-24 13:33:08 -07:00