JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-22 17:23:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	0fad8a591a	libdrgn: fix finding types beginning in size_t or ptrdiff_t c_parse_specifier_qualifier_list() checks whether an identifier starts with "size_t" or "ptrdiff_t" to decide whether to return the size_t or ptrdiff_t type. This incorrectly matches stuff like like "size_tea" and "ptrdiff_tee". Fix this by making it an exact comparison. Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-28 16:21:56 -08:00
Omar Sandoval	87b7292aa5	Relicense drgn from GPLv3+ to LGPLv2.1+ drgn is currently licensed as GPLv3+. Part of the long term vision for drgn is that other projects can use it as a library providing programmatic interfaces for debugger functionality. A more permissive license is better suited to this goal. We decided on LGPLv2.1+ as a good balance between software freedom and permissiveness. All contributors not employed by Meta were contacted via email and consented to the license change. The only exception was the author of commit `c4fbf7e589` ("libdrgn: fix for compilation error"), who did not respond. That commit reverted a single line of code to one originally written by me in commit `640b1c011d` ("libdrgn: embed DWARF index in DWARF info cache"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:05:16 -07:00
Omar Sandoval	03d5c2ebac	libdrgn: string_builder: replace string_builder_finalize() Instead of string_builder_finalize(), which leaves the string_builder in an undefined state (according to the documentation, at least), define string_builder_null_terminate(), which documents exactly what it does. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 15:55:04 -07:00
Omar Sandoval	d76a3a338f	libdrgn: string_builder: add dedicated initializer Rather than documenting how to initialize a struct string_builder, provide an initializer, STRING_BUILDER_INIT. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-10-05 15:32:07 -07:00
Jay Kamat	063850325f	libdrgn: dwarf: look up complete types in namespaces drgn_debug_info_find_complete() looks up the name of the incomplete type in the global namespace. This is incorrect for C++: we need to look it up in the namespace that the DIE is in. To find the containing namespace, we need to do a DIE ancestor walk. We don't want to do this for C, so add a flag indicating whether a language has namespaces to struct drgn_language. If it's true, then we do the ancestor walk and then look up the name in the appropriate namespace. Signed-off-by: Jay Kamat <jaygkamat@gmail.com>	2022-07-15 16:02:56 -07:00
prozak	e8dada0ec1	Enable prog.type to work with classes Also added test for CPP class type This is a prerequisite to #83 Signed-off-by: mykolal <nickolay.lysenko@gmail.com>	2022-02-22 14:55:23 -08:00
Omar Sandoval	4f5249775d	Fix various lints Some functions that could be static found by -Wmissing-prototypes, some include-what-you-use warnings, some missing SPDX identifiers. These lints should be automated at some point. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-17 10:45:42 -08:00
Omar Sandoval	9397a11605	libdrgn: export drgn_language instances libdrgn currently exports struct drgn_language pointers from drgn_program_language(), drgn_type_language(), and drgn_object_language(), but doesn't provide any way to do anything with them. Export our drgn_language instances and add drgn_language_name() so that they can at least be compared and printed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 13:07:42 -08:00
Omar Sandoval	5d65ebb04b	libdrgn: don't store language structures in one array In the next change, we want to export languages to the public libdrgn interface. I couldn't figure out any way to export array elements as their own symbols. I'd also rather not export the drgn_languages array indices as an enum because that would preclude ever having any sort of language plugin support. Instead, let's get rid of the drgn_languages array as it currently exists and have separate drgn_language structures. This also allows us to make a bunch of the C implementation functions static again. We keep the language numbers so that we can store per-language data efficiently (currently drgn_program::void_types and languages_py), as well as a drgn_languages array to go from the language number to the struct drgn_language. But, this is all internal and could be changed if we ever support language plugins. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-16 12:47:12 -08:00
Omar Sandoval	c49bba41b2	libdrgn: language_c: replace c_keywords with memswitch GCC or binutils on Fedora Rawhide for ARM seems to have a bug where c_keywords gets placed in the .data.rel.ro section (see https://www.airs.com/blog/archives/189): $ readelf -s .libs/libdrgnimpl_la-language_c.o \| grep -w c_keywords 475: 00000000 16 OBJECT LOCAL DEFAULT 175 c_keywords $ readelf -S .libs/libdrgnimpl_la-language_c.o \| grep -F '[175]' [175] .data.rel PROGBITS 00000000 051f90 000010 00 WA 0 0 4 $ readelf -s .libs/_drgn.so \| grep -w c_keywords 9267: 0008e84c 16 OBJECT LOCAL DEFAULT 21 c_keywords.lto_priv.0 $ readelf -S .libs/_drgn.so \| grep -F '[21]' [21] .data.rel.ro PROGBITS 0008e018 07e018 000a10 00 WA 0 0 8 This results in a crash on startup when c_keywords_init() attempts to populate c_keywords. While this appears to be a compiler or linker bug, I've been meaning to replace c_keywords with a static lookup function anyways. Now that we have gen_strswitch.py, we can use it to generate the lookup function. Add a script, gen_c_keywords_inc_strswitch.py, which generates an array mapping token kind to spelling, and a memswitch mapping spelling to token kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-04 20:26:35 -08:00
Omar Sandoval	8a41adc1b0	libdrgn: language_c: add missing error check in c_parse_abstract_declarator() Found with clang-static-analyzer. Reported-by: Kevin Svetlitski <svetlitski@fb.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-12-08 13:56:15 -08:00
Omar Sandoval	3914bb8e29	libdrgn: fix type names referring to anonymous types A pointer, array, or function referring to an anonymous type currently includes the full type definition in its type name. This creates very badly formatted objects for, e.g., drgn's own hash table types. Instead, use "struct <anonymous>" in the type name. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-23 00:57:42 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	d1745755f1	Fix some include-what-you-use warnings Also: * Rename struct string to struct nstring and move it to its own header. * Fix scripts/iwyu.py, which was broken by commit `5541fad063` ("Fix some flake8 errors"). * Add workarounds for a few outstanding include-what-you-use issues. There is still a false positive for include-what-you-use/include-what-you-use#970, but hopefully that is fixed soon. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-10 15:09:29 -08:00
Omar Sandoval	fba5947fec	libdrgn: add array_for_each() And use it in a few appropriate places. This should hopefully make it harder to make iteration mistakes like the one fixed by commit `4755cfac7c` ("libdrgn: dwarf_index: increment correct variable when rolling back"). While we're doing this, move ARRAY_SIZE() into a new header file with array_for_each() and make it lowercase. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-23 17:32:00 -07:00
Omar Sandoval	0e3054a0ba	libdrgn: make addresses wrap around when reading memory Define that addresses for memory reads wrap around after the maximum address rather than the current unpredictable behavior. This is done by: 1. Reworking drgn_memory_reader to work with an inclusive address range so that a segment can contain UINT64_MAX. drgn_memory_reader remains agnostic to the maximum address and requires that address ranges do not overflow a uint64_t. 2. Adding the overflow/wrap-around logic to drgn_program_add_memory_segment() and drgn_program_read_memory(). 3. Changing direct uses of drgn_memory_reader_reader() to drgn_program_read_memory() now that they are no longer equivalent. (For some platforms, a fault might be more appropriate than wrapping around, but this is a step in the right direction.) Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-06-03 17:49:29 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	301cc3f139	libdrgn: fix UBSan "applying zero offset to null pointer" errors There are a couple of places where we compute `NULL + 0`, which is undefined behavior. Add a helper to do this safely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-02 13:38:29 -07:00
Omar Sandoval	a24c0f5b33	libdrgn: clean up usage of drgn_stop Use drgn_not_found where it's more appropriate, and check explicitly against drgn_stop instead of err->code == DRGN_ERROR_STOP. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-03-05 12:46:06 -08:00
Omar Sandoval	25eb2abb1a	libdrgn: add drgn_platform getters Add low-level getters equivalent to the drgn_program platform-related helpers and use them in places where we have checked or can assume that the platform is known. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-26 16:05:49 -08:00
Omar Sandoval	aaa98ccae3	libdrgn: consistently use __ for __attribute__ names In some places, we add __ preceding and following an attribute name, and in some places, we don't. Let's make it consistent. We might as well opt for the __ to make clashes with macros less likely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 03:16:23 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to use to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	72b4aa9669	libdrgn: clean up object initialization Rename struct drgn_object_type to struct drgn_operand_type, add a new struct drgn_object_type which contains all of the type-related fields from struct drgn_object, and use it to implement drgn_object_type() and drgn_object_type_operand(), which are replacements for drgn_object_set_common() and drgn_object_type_encoding_and_size(). This cleans up a lot of the boilerplate around initializing objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 17:43:14 -08:00
Omar Sandoval	78316a28fb	libdrgn: remove half-baked support for complex types We've nominally supported complex types since commit `75c3679147` ("Rewrite drgn core in C"), but parsing them from DWARF has been incorrect from the start (they don't have a DW_AT_type attribute like we assume), and we never implemented proper support for complex objects. Drop the partial implementation; we can bring it back (properly) if someone requests it. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-17 14:56:33 -08:00
Omar Sandoval	190062f470	libdrgn: get drgn_type_member.bit_field_size through drgn_member_type() Getting the bit field size of a member will soon require evaluating the lazy type, so return it from drgn_member_type() instead of accessing it directly. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	e72ecd0e2c	libdrgn: replace drgn_program_member_info() with drgn_type_find_member() Now that types are associated with their program, we don't need to pass the program separately to drgn_program_member_info() and can replace it with a more natural drgn_type_find_member() API that takes only the type and member name. While we're at it, get rid of drgn_member_info and return the drgn_type_member and bit_offset directly. This also fixes a bug that drgn_error_member_not_found() ignores the member name length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:40:54 -08:00
Omar Sandoval	738ae2c75f	libdrgn: pack struct drgn_object better We can get struct drgn_object down from 40 bytes to 32 bytes (on x86-64) by moving the bit_offset and little_endian members out of the value and reference structs. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	6bd0c2b4d2	libdrgn: add concept of "unavailable" objects There are some situations where we can find an object but can't determine its value, like local variables that have been optimized out, inlined functions without a concrete instance, and pure virtual methods. It's still useful to get some information from these objects, namely their types. Let's add the concept of an "unavailable" object, which is an object with a known type but unknown value/address. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:58:19 -08:00
Omar Sandoval	5f17281926	libdrgn: make drgn_object::is_reference an enum To prepare for a new kind of object, replace the is_reference bool with an enum drgn_object_kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:37:58 -08:00
Omar Sandoval	edb1fe7f2f	libdrgn: rename drgn_object_kind to drgn_object_encoding I'd like to use the name drgn_object_kind to distinguish between values and references. "Encoding" is more accurate than "kind", anyways. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:26 -08:00
Omar Sandoval	2710b4d2aa	libdrgn: add macros for strict enum switch statements There are several places where we'd like to enforce that every enumeration is handled in a switch. Add SWITCH_ENUM() and SWITCH_ENUM_DEFAULT() macros for that and use them. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 12:02:23 -08:00
Omar Sandoval	3c5d22637e	libdrgn: clean up hash function APIs and improve documentation Use _hash_pair() for hash functions that do the full double hashing and return a struct hash_pair and hash_() for other hashing utility functions. Also change some of the equality function names to be more symmetric and improve the documentation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-12 16:20:08 -07:00
Omar Sandoval	761da83ddd	libdrgn: add {min,max}_iconst() and rewrite min() and max() min() and max() from the Linux kernel go through the trouble of resulting in a constant expression if the arguments are constant expressions, but they can't be used outside of a function due to their use of ({ }). This means that they can't be used for, e.g., enumerators or global arrays. Let's simplify min() and max() and instead add explicit min_iconst() and max_iconst() macros that can be used everywhere that an integer constant expression is required. We can then use it in hash_table.h. While we're here, let's split these into their own header file and document them better. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-10 23:48:03 -07:00
Omar Sandoval	fa44171ba1	libdrgn: split bit operations into their own header And improve their documentation. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-10-09 17:44:15 -07:00
Omar Sandoval	286c09844e	Clean up #includes with include-what-you-use I recently hit a couple of CI failures caused by relying on transitive includes that weren't always present. include-what-you-use is a Clang-based tool that helps with this. It's a bit finicky and noisy, so this adds scripts/iwyu.py to make running it more convenient (but not reliable enough to automate it in Travis). This cleans up all reasonable include-what-you-use warnings and reorganizes a few header files. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-09-23 16:29:42 -07:00
Omar Sandoval	e49a87a3d7	libdrgn: remove struct drgn_object::prog We can get it via the type now. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:21 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	c31208f69c	libdrgn: fold drgn_type_index into drgn_program This is preparation for associating types with a program. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:36:35 -07:00
Omar Sandoval	948cda2941	libdrgn: add vector/hash table initializers and update coding style Declaring a local vector or hash table and separately initializing it with vector_init()/hash_table_init() is annoying. Add macros that can be used as initializers. This exposes several places where the C89 style of placing all declarations at the beginning of a block is awkward. I adopted this style from the Linux kernel, which uses C89 and thus requires this style. I'm now convinced that it's usually nicer to declare variables where they're used. So let's officially adopt the style of mixing declarations and code (and ditch the blank line after declarations) and update the functions touched by this change.	2020-07-01 12:48:24 -07:00
Omar Sandoval	8b264f8823	Update copyright headers to Facebook and add missing headers drgn was originally my side project, but for awhile now it's also been my work project. Update the copyright headers to reflect this, and add a copyright header to various files that were missing it.	2020-05-15 15:13:02 -07:00
Omar Sandoval	3d59e042f4	libdrgn: don't open-code fls() c_integer_literal() has an open-coded equivalent of fls() that assumes that unsigned long long is 64 bits. Use fls() instead.	2020-05-08 00:20:42 -07:00
Omar Sandoval	0a100064c1	libdrgn: improve and rename DRGN_UNREACHABLE() DRGN_UNREACHABLE() currently expands to abort(), but assert() provides more information. If NDEBUG is defined, we can use __builtin_unreachable() instead. DRGN_UNREACHABLE() isn't drgn-specific, so this renames it to UNREACHABLE(). It's also not really related to errors, so this moves it to internal.h.	2020-05-07 15:16:22 -07:00
Omar Sandoval	a3248b51e3	libdrgn: fix use after free when formatting compound types compound_initializer_init_next() saves a pointer to the compound initializer stack and uses it after appending to the stack, which may have reallocated the stack.	2020-04-13 16:49:18 -07:00
Omar Sandoval	cae7336750	libdrgn: fix error when expecting identifier after tag in type name We should be looking at the kind of the previous token, not the kind of the unexpected token. Closes #52.	2020-03-13 11:07:46 -07:00
Jay Kamat	6c264b0eae	libdrgn: add language to struct drgn_type For types obtained from DWARF, we determine it from the language of the CU. For other types, it can be specified manually or fall back to the default (C). Then, we can use the language for operations where the type is available.	2020-02-26 19:55:42 -08:00
Omar Sandoval	9e2df9f217	libdrgn: put language definitions in one array This way, languages can be identified by an index, which will be useful for adding Python bindings for drgn_language and for adding a language field to drgn_type.	2020-02-26 19:55:42 -08:00
Jay Kamat	054cb54a01	libdrgn: Rename find_symbol to find_symbol_by_address	2020-02-12 14:06:49 -08:00
Serapheim Dimitropoulos	ad82e9623a	Introduce OutOfBoundsError Decouple some of the responsibilities of FaultError to OutOfBoundsError so consumers can differentiate between invalid memory accesses and running out of bounds in drgn Objects which may be based on valid memory address.	2020-02-04 14:59:31 -08:00

1 2

84 Commits