29 Commits

Author SHA1 Message Date
Jay Kamat
08cb38cc2f Expand DW_AT_upper_bound quirk on zero size arrays
GCC appears to emit a data8 value of -1 when reporting zero-length arrays
when compiling C++ code. This patch adds support and a test for that behavior.

dwarf_info.c: Remove check for sdata on quirk for array length == 0

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2023-02-21 16:44:20 -08:00
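The zero-length array quirk in commit 08cb38cc2f can be illustrated with a minimal sketch. It assumes the raw DW_AT_upper_bound value has already been read into a 64-bit unsigned integer; the function name array_length_from_upper_bound() is hypothetical and is not taken from dwarf_info.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not drgn's actual code): derive an array length from
 * the raw 64-bit DW_AT_upper_bound value. An upper bound of -1 stored in an
 * unsigned form such as data8 reads back as all ones, so it is treated as
 * "zero elements" regardless of whether the form was signed.
 */
static uint64_t array_length_from_upper_bound(uint64_t raw_upper_bound)
{
	if (raw_upper_bound == UINT64_MAX)
		return 0; /* -1 in two's complement: zero-length array */
	return raw_upper_bound + 1; /* bounds are inclusive */
}
```

With this sketch, an upper bound of 9 gives a length of 10, and an all-ones upper bound gives a length of 0.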
Omar Sandoval
94443457aa libdrgn: handle GNU Debug Fission attributes, forms, and opcodes
These are all equivalent to their DWARF 5 counterparts, which we already
support:

* DW_FORM_GNU_addr_index <-> DW_FORM_addrx
* DW_FORM_GNU_str_index <-> DW_FORM_strx
* DW_AT_GNU_addr_base <-> DW_AT_addr_base
* DW_OP_GNU_addr_index <-> DW_OP_addrx
* DW_OP_GNU_const_index <-> DW_OP_constx

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2023-02-08 13:25:45 -08:00
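One way to picture the equivalences listed in commit 94443457aa is a small normalization step that maps each GNU Debug Fission code to its DWARF 5 counterpart before the usual handling runs. This is only a sketch: the numeric values are the ones assigned in the DWARF 5 standard and the GNU extensions, while the function names canonical_form() and canonical_op() are hypothetical and do not come from libdrgn.

```c
#include <stdint.h>

/* Form and opcode values from DWARF 5 and the GNU Debug Fission extension. */
enum {
	DW_FORM_strx = 0x1a,
	DW_FORM_addrx = 0x1b,
	DW_FORM_GNU_addr_index = 0x1f01,
	DW_FORM_GNU_str_index = 0x1f02,
	DW_OP_addrx = 0xa1,
	DW_OP_constx = 0xa2,
	DW_OP_GNU_addr_index = 0xfb,
	DW_OP_GNU_const_index = 0xfc,
};

/* Hypothetical helper: map a GNU form to its DWARF 5 equivalent. */
static uint64_t canonical_form(uint64_t form)
{
	switch (form) {
	case DW_FORM_GNU_addr_index:
		return DW_FORM_addrx;
	case DW_FORM_GNU_str_index:
		return DW_FORM_strx;
	default:
		return form; /* already a standard form */
	}
}

/* Hypothetical helper: map a GNU expression opcode to its DWARF 5 equivalent. */
static uint8_t canonical_op(uint8_t op)
{
	switch (op) {
	case DW_OP_GNU_addr_index:
		return DW_OP_addrx;
	case DW_OP_GNU_const_index:
		return DW_OP_constx;
	default:
		return op; /* already a standard opcode */
	}
}
```

A real parser might instead list both codes as labels of the same switch case; the normalization form is used here only because it is easy to read in isolation.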
Kevin Svetlitski
7e6efe6649 Add support for looking up types in namespaces
Looking up objects in namespaces is already well-supported by `drgn`.
These changes bring the same functionality to type lookup, so that
`prog.type('struct A::B::C::MyType')` works in an analogous fashion to
`prog['A::B::C::MyVar']`.

Signed-off-by: Kevin Svetlitski <svetlitski@meta.com>
2023-01-19 10:19:36 -08:00
Alastair Robertson
7180304c88 libdrgn: dwarf_info: Support DW_TAG_GNU_template_parameter_pack
This DWARF tag is used by C++ classes which take a variable number
of template parameters, such as std::variant and std::tuple.

Signed-off-by: Alastair Robertson <ajor@meta.com>
2022-12-05 15:33:46 -08:00
Omar Sandoval
18b12a5c7b libdrgn: get .eh_frame from the correct file
We're currently getting .eh_frame from the debug file. However, since
.eh_frame is an SHF_ALLOC section, it is actually in the loaded file,
and may not be in the debug file. This causes us to fail to unwind in
modules whose debug file was created with objcopy --only-keep-debug
(which is typical for Linux distro debug files).

Fix it by getting .eh_frame from the loaded file. To make this easier,
we split .eh_frame and .debug_frame data into two separate tables. We
also don't bother deduplicating them anymore, since GCC and Clang only
seem to generate one or the other in practice.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
bcb53d712b libdrgn: bypass libdwfl with struct drgn_elf_file
Now that we track the debug file ourselves, we can avoid calling libdwfl
in a bunch of places. By tracking the bias ourselves, we can avoid a
bunch more.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
34f122144a libdrgn: debug_info: wrap ELF file information in new struct drgn_elf_file
struct drgn_module contains a bunch of information about the debug info
file. Let's pull it out into its own structure, struct drgn_elf_file.
This will be reused for the "main"/"loaded" file in an upcoming change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 13:37:29 -08:00
Omar Sandoval
b3bab1c5b0 libdrgn: make module vs. program platform difference more clear
It's confusing that we have a platform both for the program and for each
module. They usually match, but they're not required to. For example,
the user can manually add a file with a different platform just to read
its debug info. Our rule is that if we're parsing anything from the
module, we use the module platform; and otherwise, use the program
platform. There are a couple of places where the platforms must match:
when using call frame information (CFI) or registers. Let's make all of
this more clear in the code (by using the module's platform even when it
must match the program's platform) and in comments. No functional
change.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 12:53:45 -08:00
Omar Sandoval
85f423dfb8 libdrgn: dwarf_info: get default pointer size from CU
If a DW_TAG_pointer_type DIE doesn't specify its size with
DW_AT_byte_size, we currently default to the program's address size.
However, the DWARF we're parsing could be for a platform with a
different address size. It's more correct to use the CU's address size.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-28 12:53:45 -08:00
Boris Burkov
c8ff8728f7 Support systems without qsort_r
qsort_r is a non-standard glibc extension and turns out to be the only
thing that prevents drgn from working on a musl system. "Fix" the use of
qsort_r by switching it to qsort with a thread local variable for the
parameter.

Tested in a clean chroot install of musl voidlinux.

Signed-off-by: Boris Burkov <boris@bur.io>
2022-11-03 12:57:55 -04:00
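The technique described in commit c8ff8728f7, replacing qsort_r with qsort plus a thread-local variable, can be sketched as follows. The example sorts an array of indices by an external key array; all names here (sort_keys, compare_by_key, sort_indices_by_key) are hypothetical and chosen for illustration, and the sketch assumes a C11 compiler for _Thread_local.

```c
#include <stdlib.h>

/*
 * The extra argument that qsort_r would pass to the comparison function is
 * stashed in a thread-local variable instead, so concurrent sorts on other
 * threads do not interfere with each other.
 */
static _Thread_local const int *sort_keys;

static int compare_by_key(const void *a, const void *b)
{
	size_t i = *(const size_t *)a;
	size_t j = *(const size_t *)b;

	/* Returns -1, 0, or 1 without risking integer overflow. */
	return (sort_keys[i] > sort_keys[j]) - (sort_keys[i] < sort_keys[j]);
}

/* Sort indices[0..n) in ascending order of keys[indices[k]]. */
static void sort_indices_by_key(size_t *indices, size_t n, const int *keys)
{
	sort_keys = keys; /* set the "parameter" before sorting */
	qsort(indices, n, sizeof(indices[0]), compare_by_key);
	sort_keys = NULL; /* do not leave a dangling pointer behind */
}
```

The trade-off is that the comparison function is no longer reentrant within a single thread, which is acceptable as long as the comparison itself never triggers another sort that uses the same variable.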
Stephen Brennan
5f3a91f80d Add StackFrame.locals() method
The StackFrame's __getitem__() method allows looking up names in the
scope of a stack frame, which is an incredibly useful tool for
debugging. However, the names are not discoverable -- you must already
be looking at the source code or some other source to know what names
can be queried. To fix this, add a locals() method to StackFrame, which
lists names that can be queried in the scope. Since this method is named
locals(), it stops at the function scope and doesn't include globals or
class members.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2022-11-02 22:40:33 -07:00
Omar Sandoval
b3a5051ff4 libdrgn: dwarf_info: handle DW_TAG_enumerator DIE with missing or invalid DW_AT_name
find_dwarf_enumerator() needs to check that the return value of
dwarf_diename() is not NULL before calling strcmp(). This is similar to
commit 330c71b5b5 ("libdrgn: dwarf_info: fix segfault on anonymous
DIEs during scope search"), although I haven't seen this one happen in
practice.

Fixes: bc85767e5f ("libdrgn: support looking up parameters and variables in stack traces")
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-02 22:19:44 -07:00
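The guard described in commit b3a5051ff4 amounts to checking a possibly NULL name before comparing it. A minimal sketch, assuming the name has already been fetched (for example from dwarf_diename(), which returns NULL when a DIE has no usable DW_AT_name); the helper name die_name_matches() is hypothetical:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: compare a DIE name against the name being searched
 * for. A NULL die_name (anonymous DIE or invalid DW_AT_name) never matches,
 * and strcmp() is never called with a NULL argument.
 */
static bool die_name_matches(const char *die_name, const char *wanted)
{
	return die_name != NULL && strcmp(die_name, wanted) == 0;
}
```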
Omar Sandoval
87b7292aa5 Relicense drgn from GPLv3+ to LGPLv2.1+
drgn is currently licensed as GPLv3+. Part of the long term vision for
drgn is that other projects can use it as a library providing
programmatic interfaces for debugger functionality. A more permissive
license is better suited to this goal. We decided on LGPLv2.1+ as a good
balance between software freedom and permissiveness.

All contributors not employed by Meta were contacted via email and
consented to the license change. The only exception was the author of
commit c4fbf7e589 ("libdrgn: fix for compilation error"), who did not
respond. That commit reverted a single line of code to one originally
written by me in commit 640b1c011d ("libdrgn: embed DWARF index in
DWARF info cache").

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 17:05:16 -07:00
Omar Sandoval
d465071651 libdrgn: replace copies of elfutils headers with generated files
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-11-01 15:41:53 -07:00
Omar Sandoval
99dc927f38 libdrgn: dwarf_info: rename dw_tag_str constants
Rename DW_TAG_{UNKNOWN_FORMAT,BUF_LEN} to
DW_TAG_STR_{UNKNOWN_FORMAT,BUF_LEN} to make it more clear that they're
for dw_tag_str.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-31 14:22:45 -07:00
Omar Sandoval
70af25849c libdrgn: rename drgn_debug_info_module to drgn_module
Eventually, modules will be exposed as part of the public libdrgn API,
so they should have a clean name. Additionally, the module API I'm
currently working on will allow modules for which we don't have the
debug info file, so "debug info module" would be a misnomer.

Also rename drgn_dwarf_module_info to drgn_module_dwarf_info and
drgn_orc_module_info to drgn_module_orc_info to fit the new naming
scheme better.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-05 16:52:46 -07:00
Omar Sandoval
05a3695d5b libdrgn: enable -Wimplicit-fallthrough, take 2
This time, in order to work on both GCC and Clang, use
__attribute__((__fallthrough__)) instead of /* fallthrough */ comments.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-10-04 23:36:01 -07:00
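As an illustration of the annotation used in commit 05a3695d5b, the sketch below marks each intentional fall-through with __attribute__((__fallthrough__)), which GCC (7 and later) and recent Clang both accept as a null statement under -Wimplicit-fallthrough. The function sum_down_from() is a made-up example, not code from libdrgn.

```c
/* Made-up example: add up n + (n - 1) + ... + 1 for n in 1..3. */
static int sum_down_from(int n)
{
	int total = 0;

	switch (n) {
	case 3:
		total += 3;
		__attribute__((__fallthrough__)); /* intentional */
	case 2:
		total += 2;
		__attribute__((__fallthrough__)); /* intentional */
	case 1:
		total += 1;
		break;
	default:
		break;
	}
	return total;
}
```

Unlike a comment, the attribute is part of the token stream, so it survives preprocessing and is recognized by both compilers in the same way.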
Omar Sandoval
330c71b5b5 libdrgn: dwarf_info: fix segfault on anonymous DIEs during scope search
Jakub Kicinski reported that
prog.crashed_thread().stack_trace()[1]['does not exist'] segfaulted on a
vmcore he encountered. The segfault was a NULL pointer dereference of
dwarf_diename() of a DW_TAG_subprogram DIE in
drgn_find_in_dwarf_scopes(). The fix is to ignore DIEs without a name.

I was curious what this anonymous DW_TAG_subprogram was. It turned out
to be some dubious DWARF generated by Clang when a local variable is
defined via a macro. One such example comes from the following code in
arch/x86/events/intel/uncore.h:

static inline bool uncore_mmio_is_valid_offset(struct intel_uncore_box *box,
					       unsigned long offset)
{
	if (offset < box->pmu->type->mmio_map_size)
		return true;

	pr_warn_once("perf uncore: Invalid offset 0x%lx exceeds mapped area of %s.\n",
		     offset, box->pmu->type->name);

	return false;
}

pr_warn_once() expands to:

#define pr_warn_once(fmt, ...)					\
	printk_once(KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
#define printk_once(fmt, ...)					\
({								\
	static bool __section(".data.once") __print_once;	\
	bool __ret_print_once = !__print_once;			\
								\
	if (!__print_once) {					\
		__print_once = true;				\
		printk(fmt, ##__VA_ARGS__);			\
	}							\
	unlikely(__ret_print_once);				\
})

For some reason, Clang generates an anonymous, top-level
DW_TAG_subprogram DIE to contain the __print_once variable:

 <1><1cf86e>: Abbrev Number: 62 (DW_TAG_subprogram)
 <2><1cf86f>: Abbrev Number: 61 (DW_TAG_variable)
    <1cf870>   DW_AT_name        : (indirect string, offset: 0x34fb2e): __print_once
    <1cf874>   DW_AT_type        : <0x1c574c>
    <1cf878>   DW_AT_decl_file   : 1
    <1cf879>   DW_AT_decl_line   : 229
    <1cf87a>   DW_AT_location    : 16 byte block: 3 2c 84 66 83 ff ff ff ff 94 1 31 1e 30 22 9f         (DW_OP_addr: ffffffff8366842c; DW_OP_deref_size: 1; DW_OP_lit1; DW_OP_mul; DW_OP_lit0; DW_OP_plus; DW_OP_stack_value)

Whereas GCC puts it in a DW_TAG_lexical_block DIE inside of the
DW_TAG_subprogram DIE for uncore_mmio_is_valid_offset():

 <1><3110b2>: Abbrev Number: 45 (DW_TAG_subprogram)
    <3110b3>   DW_AT_name        : (indirect string, offset: 0x2e13e): uncore_mmio_is_valid_offset
    <3110b7>   DW_AT_decl_file   : 4
    <3110b8>   DW_AT_decl_line   : 223
    <3110b9>   DW_AT_decl_column : 20
    <3110ba>   DW_AT_prototyped  : 1
    <3110ba>   DW_AT_type        : <0x2f416b>
    <3110be>   DW_AT_inline      : 3    (declared as inline and inlined)
    <3110bf>   DW_AT_sibling     : <0x311142>
 <2><3110ef>: Abbrev Number: 66 (DW_TAG_lexical_block)
 <3><3110f0>: Abbrev Number: 120 (DW_TAG_variable)
    <3110f1>   DW_AT_name        : (indirect string, offset: 0x2da3f): __print_once
    <3110f5>   DW_AT_decl_file   : 4
    <3110f6>   DW_AT_decl_line   : 229
    <3110f7>   DW_AT_decl_column : 2
    <3110f8>   DW_AT_type        : <0x2f416b>
    <3110fc>   DW_AT_location    : 9 byte block: 3 2c 28 48 83 ff ff ff ff      (DW_OP_addr: ffffffff8348282c)

Regardless, we shouldn't crash on this input.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-09-21 14:12:16 -07:00
Michel Alexandre Salim
c0ed1a3203 Fix spelling error
abbrevation => abbreviation; caught by Debian's lintian

Signed-off-by: Michel Alexandre Salim <michel@michel-slm.name>
2022-08-17 21:45:51 -07:00
Jay Kamat
063850325f libdrgn: dwarf: look up complete types in namespaces
drgn_debug_info_find_complete() looks up the name of the incomplete type
in the global namespace. This is incorrect for C++: we need to look it
up in the namespace that the DIE is in.

To find the containing namespace, we need to do a DIE ancestor walk. We
don't want to do this for C, so add a flag indicating whether a language
has namespaces to struct drgn_language. If it's true, then we do the
ancestor walk and then look up the name in the appropriate namespace.

Signed-off-by: Jay Kamat <jaygkamat@gmail.com>
2022-07-15 16:02:56 -07:00
Kevin Svetlitski
661d6a186c Add support for UTF character base types
Previously `drgn` did not recognize the `DW_ATE_UTF` encoding for base
types, and consequently could not handle `char8_t`, `char16_t`, or
`char32_t`. This has been remedied, and a corresponding test case added
to prevent regressions.

Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>
2022-07-06 09:44:16 -07:00
Omar Sandoval
4d1b608507 libdrgn: aarch64: add RA_SIGN_STATE pseudo-register and DW_CFA_AARCH64_negate_ra_state
The RA_SIGN_STATE pseudo-register indicates whether the return address
is signed with a pointer authentication code. Add it to the register
definitions. It can be set through a normal CFI register rule or the
vendor-specific DW_CFA_AARCH64_negate_ra_state rule.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-06-26 09:18:07 -07:00
Omar Sandoval
a541e9b170 libdrgn: support reference and absent objects with >64-bit integer types
GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and
unsigned __int128. Clang additionally has N-bit integers of up to 2<<24
bits with _ExtInt(N), which was standardized in C23 as _BitInt(N).

Currently, we disallow creating objects with a >64-bit integer type. Jay
Kamat reported that this would cause errors when examining some
binaries. The reason we disallow this is that we don't have a way to
represent or do operations on >64-bit values. We could make use of a
bignum library like GMP to do this in the future.

However, for now, we can loosen this restriction and at least allow
reference and absent objects with big integer types. This requires
enforcing two things: that we never create a value object with a >64-bit
integer type, and that we never read the value of a reference object
with a >64-bit integer type.

Co-authored-by: Jay Kamat <jaygkamat@gmail.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2022-04-28 13:38:38 -07:00
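The two rules stated in commit a541e9b170 can be written down as a pair of predicates. This is a simplified sketch under stated assumptions: the enum and function names are hypothetical, the bit size refers to an integer type, and libdrgn's real object model carries more state than is shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified object kinds. */
enum object_kind { OBJECT_VALUE, OBJECT_REFERENCE, OBJECT_ABSENT };

/*
 * Rule 1: a value object may never have a >64-bit integer type, but
 * reference and absent objects may.
 */
static bool object_allowed(enum object_kind kind, uint64_t int_bit_size)
{
	return int_bit_size <= 64 || kind != OBJECT_VALUE;
}

/*
 * Rule 2: the value of a reference object with a >64-bit integer type may
 * never be read. Absent objects have no value to read at any size.
 */
static bool value_readable(enum object_kind kind, uint64_t int_bit_size)
{
	if (kind == OBJECT_ABSENT)
		return false;
	return int_bit_size <= 64;
}
```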
Alakesh Haloi
c4fbf7e589 libdrgn: fix for compilation error
On GCC version 7.3, we get the following compilation error:

  CC       libdrgnimpl_la-dwarf_info.lo
../../libdrgn/dwarf_info.c:181:51: error: initializer element is not
constant
 static const size_t DRGN_DWARF_INDEX_NUM_SHARDS = 1 <<
DRGN_DWARF_INDEX_SHARD_BITS;

This fixes the compilation error on older versions of GCC.

Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com>
2021-12-14 11:48:00 -08:00
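The error quoted in commit c4fbf7e589 is what older GCC versions report when a static initializer refers to something that is not an integer constant expression in C, such as another const-qualified variable. The sketch below shows a portable alternative using a macro and an enum constant. It assumes that was the cause; the names SHARD_BITS, NUM_SHARDS, and shard_of() are hypothetical stand-ins rather than the actual identifiers or the actual fix in dwarf_info.c.

```c
#include <stddef.h>
#include <stdint.h>

/* A macro expands to a literal, so it is a constant expression everywhere. */
#define SHARD_BITS 8

/*
 * An enum constant is also a true integer constant expression in C, unlike
 * "static const size_t n = ...;", which older compilers reject when it is
 * used in, or initialized from, another static initializer.
 */
enum { NUM_SHARDS = 1 << SHARD_BITS };

/* Usable as an array size at file scope. */
static size_t shard_entry_counts[NUM_SHARDS];

/* Hypothetical helper: pick a shard from the low bits of a hash. */
static size_t shard_of(uint64_t hash)
{
	return hash & (NUM_SHARDS - 1);
}
```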
Omar Sandoval
8b2bf85e49 libdrgn: dwarf_info: fix garbage return from drgn_array_type_from_dwarf()
Found with clang-static-analyzer.

Reported-by: Kevin Svetlitski <svetlitski@fb.com>
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-08 13:56:21 -08:00
Omar Sandoval
844d82848c libdrgn: add partial support for .gnu_debugaltlink
Issue #130 reported an "unknown attribute form 0x1f20" from drgn. 0x1f20
is DW_FORM_GNU_ref_alt, which is a reference to a DIE in an alternate
file. Similarly, DW_FORM_GNU_strp_alt is a string in an alternate file.
The alternate file is specified by the .gnu_debugaltlink section. This
is generated by dwz, which is used by at least Fedora and Debian.

libdwfl already finds the alternate debug info file, so we can save its
.debug_info and .debug_str and use those to support DW_FORM_GNU_ref_alt
and DW_FORM_GNU_strp_alt in the DWARF index.

Imported units are going to be more work to support in the DWARF index,
but this at least lets drgn start up.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-12-07 13:49:09 -08:00
Omar Sandoval
c0d8709b45 Update copyright headers to Meta
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-21 15:59:44 -08:00
Omar Sandoval
c3f31e28f9 libdrgn: reorganize and move DWARF index into dwarf_info.c
The upcoming introduction of a higher level data structure to represent
a namespace has implications on the organization of the DWARF index and
debug info management code. Basically, we're going to want to track what
is currently known as struct drgn_dwarf_index_namespace as part of the
new struct drgn_namespace. That only leaves the DWARF specification map
and list of CUs in struct drgn_dwarf_index, which doesn't make much
sense anymore. Instead, let's:

* Move the specification map and CUs into struct drgn_dwarf_info.
* Rename struct drgn_dwarf_index_namespace to struct
  drgn_namespace_dwarf_index to indicate that it is the "DWARF index for
  a namespace" rather than a "namespace of a DWARF index".
* Move the DWARF index implementation into dwarf_info.c. The DWARF index
  and debugging information management have always been coupled, so this
  makes it more explicit and is more convenient.
* Improve documentation and naming in the DWARF index implementation.

Now, the only DWARF-specific code outside of dwarf_info.c is for stack
tracing, but we'll leave that for another day.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:55 -08:00
Omar Sandoval
5591d199b1 libdrgn: debug_info: split DWARF support into its own file
Continuing the refactoring from the previous commit, move the DWARF code
from debug_info.c to its own file, leaving only the generic ELF file
management in debug_info.c.

Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-11-18 15:08:54 -08:00