JakeHillion/drgn

mirror of https://github.com/JakeHillion/drgn.git synced 2024-12-23 01:33:06 +00:00

Author	SHA1	Message	Date
Omar Sandoval	d1ffd581bd	libdrgn: allow reinterpreting primitive scalar values We don't allow this because "value objects with a scalar type cannot be reinterpreted, as their memory layout in the program is not known". That doesn't really make sense: we already support reconstructing the in-memory representation with drgn_object_read_bytes(). Implement this by making drgn_object_slice() support slicing all objects, using drgn_object_read_bytes() when necessary, then make drgn_object_reinterpret() a trivial wrapper around it. Closes #378. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2024-01-12 15:50:23 -08:00
Omar Sandoval	60a289fdff	tests: make identical() stricter Require exact type matches, not subclasses, and fail hard for types we don't explicitly handle. This caught one place where we weren't testing what we thought we were. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-11-08 13:18:44 -08:00
Omar Sandoval	243f6fb7d5	libdrgn: support value objects with >64-bit integer types The Linux kernel's struct task_struct on AArch64 contains an array of __uint128_t: >>> task = find_task(prog, 1) >>> task.type_ struct task_struct * >>> task.thread.type_ struct thread_struct { struct cpu_context cpu_context; struct { unsigned long tp_value; unsigned long tp2_value; struct user_fpsimd_state fpsimd_state; } uw; enum fp_type fp_type; unsigned int fpsimd_cpu; void *sve_state; void *sme_state; unsigned int vl[2]; unsigned int vl_onexec[2]; unsigned long fault_address; unsigned long fault_code; struct debug_info debug; struct ptrauth_keys_user keys_user; struct ptrauth_keys_kernel keys_kernel; u64 mte_ctrl; u64 sctlr_user; u64 svcr; u64 tpidr2_el0; } >>> task.thread.uw.fpsimd_state.type_ struct user_fpsimd_state { __int128 unsigned vregs[32]; __u32 fpsr; __u32 fpcr; __u32 __reserved[2]; } As a result, printing a task_struct fails: >>> task Traceback (most recent call last): File "<console>", line 1, in <module> File "/host/home/osandov/repos/drgn3/drgn/cli.py", line 140, in _displayhook text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns) NotImplementedError: integer values larger than 64 bits are not yet supported PR #311 suggested treating >64-bit integers as byte arrays for now; I tried an alternate hack of handling >64-bit integers only in the pretty-printing code. Both of these had issues, though. Instead, let's push >64-bit integer support a little further and allow storing "big integer" value objects. We still don't support any operations on them, so this still doesn't complete #170. We store the raw bytes of the value for now, but we'll probably change this if we add support for operations (e.g., to store the value as an mp_limb_t array for GMP). We also print >64-bit integer types in hexadecimal for simplicity. This is inconsistent with the existing behavior of printing in decimal, but more readable. In the future, we might want to add heuristics to decide when to print in decimal vs hexadecimal for all sizes. Closes #311. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 14:21:46 -07:00
Omar Sandoval	3ce37c8002	libdrgn: python: fix creating compound value with 32-bit float member on big-endian This is similar to commit `155ec92ef2` ("libdrgn: fix reading 32-bit float object values on big-endian"). Fixes: `75c3679147` ("Rewrite drgn core in C") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-02 10:39:34 -07:00
Omar Sandoval	0bc79c877a	libdrgn: fix stray bits when reading bytes of bit field Signed-off-by: Omar Sandoval <osandov@osandov.com>	2023-08-01 16:31:17 -07:00
Omar Sandoval	87b7292aa5	Relicense drgn from GPLv3+ to LGPLv2.1+ drgn is currently licensed as GPLv3+. Part of the long term vision for drgn is that other projects can use it as a library providing programmatic interfaces for debugger functionality. A more permissive license is better suited to this goal. We decided on LGPLv2.1+ as a good balance between software freedom and permissiveness. All contributors not employed by Meta were contacted via email and consented to the license change. The only exception was the author of commit `c4fbf7e589` ("libdrgn: fix for compilation error"), who did not respond. That commit reverted a single line of code to one originally written by me in commit `640b1c011d` ("libdrgn: embed DWARF index in DWARF info cache"). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-11-01 17:05:16 -07:00
Shung-Hsi Yu	e8d0c85811	test: add test for _repr_pretty_() method Since _repr_pretty_() uses output of str(), and the latter is already heavily tested in tests/test_language_c.py, we can simply test whether p.text() is called instead of duplicating all the test cases. Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>	2022-08-25 13:52:28 -07:00
Kevin Svetlitski	5aaf3db6fc	libdrgn: support reference and absent objects with float types which aren't 32 or 64 bits Very similar to `a541e9b170`, but adds partial support for floats (as opposed to integers) which aren't 32 or 64 bits. Signed-off-by: Kevin Svetlitski <svetlitski@fb.com>	2022-07-06 15:47:18 -07:00
Omar Sandoval	a541e9b170	libdrgn: support reference and absent objects with >64-bit integer types GCC and Clang have 128-bit integer types on 64-bit targets: __int128 and unsigned __int128. Clang additionally has N-bit integers of up to 2<<24 bits with _ExtInt(N), which was standardized in C23 as _BitInt(N). Currently, we disallow creating objects with a >64-bit integer type. Jay Kamat reported that this would cause errors when examining some binaries. The reason we disallow this is that we don't have a way to represent or do operations on >64-bit values. We could make use of a bignum library like GMP to do this in the future. However, for now, we can loosen this restriction and at least allow reference and absent objects with big integer types. This requires enforcing two things: that we never create a value object with a >64-bit integer type, and that we never read the value of a reference object with a >64-bit integer type. Co-authored-by: Jay Kamat <jaygkamat@gmail.com> Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-04-28 13:38:38 -07:00
Omar Sandoval	7f232a4815	pre-commit: update Black Black 22.1.0 has some style changes: string prefixes are normalized and spaces around the power operator are removed. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2022-02-12 13:48:49 -08:00
Omar Sandoval	c0d8709b45	Update copyright headers to Meta Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-11-21 15:59:44 -08:00
Omar Sandoval	5541fad063	Fix some flake8 errors Mainly unused imports, unused variables, unnecessary f-strings, and regex literals missing the r prefix. I'm not adding it to the CI linter because it's too noisy, though. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-08-11 14:52:44 -07:00
Omar Sandoval	7335df114c	libdrgn: python: add Object.to_bytes_() And the libdrgn implementation, drgn_object_read_bytes(). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:12:34 -07:00
Omar Sandoval	9c00552007	libdrgn: python: add Object.from_bytes_() Add a way to create an object from raw bytes. One example where I've wanted this is creating a struct pt_regs from a PRSTATUS note or other source. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-07-26 17:06:58 -07:00
Omar Sandoval	a4b9d68a8c	Use GPL-3.0-or-later license identifier instead of GPL-3.0+ Apparently the latter is deprecated and the former is preferred. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-04-03 01:10:35 -07:00
Omar Sandoval	85dec2b8f6	tests: move C-specific tests from test_object to test_language_c TestCLiteral, TestCIntegerPromotion, TestCCommonRealType, TestCOperators, and TestCPretty in test_object all test various operations on objects, but since they're testing language-specific behavior, they belong in test_language_c. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 16:11:19 -08:00
Omar Sandoval	55e3a58e06	libdrgn: python: use correct member offset when creating object from value We need to use the offset of the member in the outermost object type, not the offset in the immediate containing type in the case of nested anonymous structs. Fixes: `e72ecd0e2c` ("libdrgn: replace drgn_program_member_info() with drgn_type_find_member()") Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-21 02:29:59 -08:00
Omar Sandoval	9fda010789	Track byte order in scalar types instead of objects Currently, reference objects and buffer value objects have a byte order. However, this doesn't always make sense for a couple of reasons: - Byte order is only meaningful for scalars. What does it mean for a struct to be big endian? A struct doesn't have a most or least significant byte; its scalar members do. - The DWARF specification allows either types or variables to have a byte order (DW_AT_endianity). The only producer I could find that uses this is GCC for the scalar_storage_order type attribute, and it only uses it for base types, not variables. GDB only seems to check it for base types, as well. So, remove the byte order from objects, and move it to integer, boolean, floating-point, and pointer types. This model makes more sense, and it means that we can get the binary representation of any object now. The only downside is that we can no longer support a bit offset for non-scalars, but as far as I can tell, nothing needs that. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 21:41:29 -08:00
Omar Sandoval	72b4aa9669	libdrgn: clean up object initialization Rename struct drgn_object_type to struct drgn_operand_type, add a new struct drgn_object_type which contains all of the type-related fields from struct drgn_object, and use it to implement drgn_object_type() and drgn_object_type_operand(), which are replacements for drgn_object_set_common() and drgn_object_type_encoding_and_size(). This cleans up a lot of the boilerplate around initializing objects. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-02-19 17:43:14 -08:00
Omar Sandoval	d35243b354	libdrgn: replace lazy types with lazy objects In order to support static members, methods, default function arguments, and value template parameters, we need to be able to store a drgn_object in a drgn_type_member or drgn_type_parameter. These are all cases where we want lazy evaluation, so we can replace drgn_lazy_type with a new drgn_lazy_object which implements the same idea but for objects. Types can still be represented with an absent object. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2021-01-08 17:39:51 -08:00
Omar Sandoval	988e9e7190	libdrgn/python: add Object.absent_ Without this, the only way to check whether an object is absent in Python is to try to use the object and catch the ObjectAbsentError. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 15:06:40 -08:00
Omar Sandoval	30cfa40a72	libdrgn: rename "unavailable" objects to "absent" objects I was going to add an Object.available_ attribute, but that made me realize that the naming is somewhat ambiguous, as a reference object with an invalid address might also be considered "unavailable" by users. Use the name "absent" instead, which is more clear: the object isn't there at all. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-29 14:58:26 -08:00
Omar Sandoval	7d7aa7bf7b	libdrgn/python: remove Type == operator The == operator on drgn.Type is only intended for testing. It's expensive and slow and not what people usually want. It's going to get even more awkward to define once types can refer to objects (for template parameters and static members and such). Let's replace == with a new identical() function only available in unit tests. Then, remove the operator from the Python bindings as well as the underlying libdrgn drgn_type_eq() and drgn_qualified_type_eq() functions. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 03:11:38 -08:00
Omar Sandoval	523fd26959	libdrgn: don't allow casting to non-scalar types at all Currently, we try to emulate the GNU C extension of casting a struct type to itself. This does a deep type comparison, which is expensive. We could take a shortcut like only comparing the kind and type name, but seeing as standard C only allows casting to a scalar type, let's drop support for casting to a struct (or other non-scalar) type entirely. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-22 02:46:05 -08:00
Omar Sandoval	40004e5c8f	libdrgn/python: add offsetof() offsetof() can almost be implemented with Type.member(name).offset, but that doesn't parse member designators. Add an offsetof() function that does (and add drgn_type_offsetof() in libdrgn). Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 16:46:41 -08:00
Omar Sandoval	e72ecd0e2c	libdrgn: replace drgn_program_member_info() with drgn_type_find_member() Now that types are associated with their program, we don't need to pass the program separately to drgn_program_member_info() and can replace it with a more natural drgn_type_find_member() API that takes only the type and member name. While we're at it, get rid of drgn_member_info and return the drgn_type_member and bit_offset directly. This also fixes a bug that drgn_error_member_not_found() ignores the member name length. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-15 14:40:54 -08:00
Omar Sandoval	abafdd965f	Remove bit_offset from value objects There are a couple of reasons that it was the wrong choice to have a bit_offset for value objects: 1. When we store a buffer with a bit_offset, we're storing useless padding bits. 2. bit_offset describes a location, or in other words, part of an address. This makes sense for references, but not for values, which are just a bag of bytes. Get rid of union drgn_value.bit_offset in libdrgn, make Object.bit_offset None for value objects, and disallow passing bit_offset to the Object() constructor when creating a value. bit_offset can still be passed when creating an object from a buffer, but we'll shift the bytes down as necessary to store the value with no offset. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-14 12:29:17 -08:00
Omar Sandoval	6bd0c2b4d2	libdrgn: add concept of "unavailable" objects There are some situations where we can find an object but can't determine its value, like local variables that have been optimized out, inlined functions without a concrete instance, and pure virtual methods. It's still useful to get some information from these objects, namely their types. Let's add the concept of an "unavailable" object, which is an object with a known type but unknown value/address. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:58:19 -08:00
Omar Sandoval	5f17281926	libdrgn: make drgn_object::is_reference an enum To prepare for a new kind of object, replace the is_reference bool with an enum drgn_object_kind. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-12-04 13:37:58 -08:00
Omar Sandoval	36068a0ea8	Fix trailing commas for Black v20.8b1 Black was recently changed to treat a trailing comma as an indicator to put each item/argument on its own line. We have a bunch of places where something previously had to be split into multiple lines, then was edited to fit on one line, but Black kept the trailing comma. Now this update wants to unnecessarily split it back up. For now, let's get rid of these commas. Hopefully in the future Black has a way to opt out of this. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-27 11:31:29 -07:00
Omar Sandoval	a97f6c4fa2	Associate types with program I originally envisioned types as dumb descriptors. This mostly works for C because in C, types are fairly simple. However, even then the drgn_program_member_info() API is awkward. You should be able to look up a member directly from a type, but we need the program for caching purposes. This has also held me back from adding offsetof() or has_member() APIs. Things get even messier with C++. C++ template parameters can be objects (e.g., template <int N>). Such parameters would best be represented by a drgn object, which we need a drgn program for. Static members are a similar case. So, let's reimagine types as being owned by a program. This has a few parts: 1. In libdrgn, simple types are now created by factory functions, drgn_foo_type_create(). 2. To handle their variable length fields, compound types, enum types, and function types are constructed with a "builder" API. 3. Simple types are deduplicated. 4. The Python type factory functions are replaced by methods of the Program class. 5. While we're changing the API, the parameters to pointer_type() and array_type() are reordered to be more logical (and to allow pointer_type() to take a default size of None for the program's default pointer size). 6. Likewise, the type factory methods take qualifiers as a keyword argument only. A big part of this change is updating the tests and splitting up large test cases into smaller ones in a few places. Signed-off-by: Omar Sandoval <osandov@osandov.com>	2020-08-26 17:41:09 -07:00
Omar Sandoval	971a2d3687	libdrgn/python: make Objects fully immutable The model has always been that drgn Objects are immutable, but for some reason I went through the trouble of allowing __init__() to reinitialize an already initialized Object. Instead, let's fully initialize the Object in __new__() and get rid of __init__().	2020-05-18 00:07:49 -07:00
Omar Sandoval	ab876f3dbd	libdrgn/python: allow specifying Object value positionally It's annoying to have to do value= when creating objects, especially in interactive mode. Let's allow passing in the value positionally so that `Object(prog, "int", value=0)` becomes `Object(prog, "int", 0)`. It's clear enough that this is creating an int with value 0.	2020-05-18 00:07:49 -07:00
Omar Sandoval	8b264f8823	Update copyright headers to Facebook and add missing headers drgn was originally my side project, but for awhile now it's also been my work project. Update the copyright headers to reflect this, and add a copyright header to various files that were missing it.	2020-05-15 15:13:02 -07:00
Omar Sandoval	63299e0701	libdrgn: actually use uint64_t for two's complement unary ops UNARY_OP_SIGNED_2C() uses a union of int64_t and uint64_t to avoid signed integer overflow... except that there's a typo and the uint64_t is actually an int64_t. Fix it and add a test that would catch it with -fsanitize=undefined.	2020-05-08 13:50:24 -07:00
Omar Sandoval	26ef465007	libdrgn/python: add proper type for members and parameters This continues the conversion from the last commit. Members and parameters are basically the same, so we can do them together. Unlike enumerators, these don't make sense to unpack or access as sequences.	2020-02-12 15:40:19 -08:00
Omar Sandoval	7c70a1a384	libdrgn/python: add proper type for enumerators Currently, type members, enumerators, and parameters are all represented by tuples in the Python bindings. This is awkward to document and implement. Instead, let's replace these tuples with proper types, starting with the easiest one, TypeEnumerator. This one still makes sense to treat as a sequence so that it can be unpacked as (name, value).	2020-02-12 15:37:41 -08:00
Omar Sandoval	9de2cc8410	libdrgn/python: make Object.__index__() TypeError message clearer Currently, we print: >>> prog.symbol(prog['init_task']) Traceback (most recent call last): File "<console>", line 1, in <module> TypeError: cannot convert 'struct task_struct' to index It's not obvious what it means to convert to an index. Instead, let's use the error message raised by operator.index(): TypeError: 'struct task_struct' object cannot be interpreted as an integer	2020-02-11 09:19:53 -08:00
Serapheim Dimitropoulos	ad82e9623a	Introduce OutOfBoundsError Decouple some of the responsibilities of FaultError to OutOfBoundsError so consumers can differentiate between invalid memory accesses and running out of bounds in drgn Objects which may be based on valid memory address.	2020-02-04 14:59:31 -08:00
Omar Sandoval	660276a0b8	Format Python code with Black I'm not a fan of 100% of the Black coding style, but I've spent too much time manually formatting Python code, so let's just pull the trigger.	2020-01-14 11:51:58 -08:00
Omar Sandoval	1443d17fb4	libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_ELEMENTS	2019-12-19 11:43:54 -08:00
Omar Sandoval	db66952b2e	libdrgn: add DRGN_FORMAT_OBJECT_IMPLICIT_MEMBERS	2019-12-19 11:43:54 -08:00
Omar Sandoval	c8434e9a9e	libdrgn: add DRGN_FORMAT_OBJECT_ELEMENT_INDICES	2019-12-19 11:43:54 -08:00
Omar Sandoval	cfceb491db	libdrgn: add DRGN_FORMAT_OBJECT_MEMBER_NAMES	2019-12-19 11:43:54 -08:00
Omar Sandoval	4fad941ec1	libdrgn: add DRGN_FORMAT_OBJECT_{MEMBERS,ELEMENTS}_SAME_LINE	2019-12-19 11:43:54 -08:00
Omar Sandoval	6bb8da04a0	libdrgn: omit trailing comma when formatting one-line array This is somewhat arbitrary, but I think it looks more natural to only use the trailing comma for multi-line initializers.	2019-12-19 11:43:54 -08:00
Omar Sandoval	d77b7bd7e3	libdrgn: add DRGN_FORMAT_OBJECT_{TYPE_NAME,MEMBER_TYPE_NAMES,ELEMENT_TYPE_NAMES}	2019-12-19 11:43:54 -08:00
Omar Sandoval	89307c532a	libdrgn: add DRGN_FORMAT_OBJECT_CHAR	2019-12-19 11:43:54 -08:00
Omar Sandoval	7cee597fff	libdrgn: add DRGN_FORMAT_OBJECT_STRING	2019-12-19 11:43:54 -08:00
Omar Sandoval	5865fa4d16	libdrgn: add DRGN_FORMAT_OBJECT_SYMBOLIZE	2019-12-19 11:43:54 -08:00
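
Two of the commits listed above describe related ideas: byte order belongs to the scalar type rather than to the object (9fda010789), and >64-bit integer values are stored as raw bytes and printed in hexadecimal (243f6fb7d5). The following is a minimal pure-Python sketch of that idea, not drgn's implementation. The helper name `format_unsigned` is hypothetical, and the exact output format (no zero padding) is an assumption made for illustration.

```python
def format_unsigned(raw: bytes, byteorder: str) -> str:
    """Format raw bytes as an unsigned integer.

    Hypothetical helper for illustration only; it does not call drgn.
    `byteorder` is "little" or "big" and stands in for the byte order
    that the commit log says is tracked on the scalar type.
    """
    # Decode the stored bytes according to the type's byte order.
    value = int.from_bytes(raw, byteorder)
    # Assumption: values wider than 64 bits are shown in hexadecimal,
    # everything else in decimal, mirroring the behavior described above.
    if len(raw) * 8 > 64:
        return hex(value)
    return str(value)


# A 128-bit value (16 bytes) is shown in hexadecimal.
wide = (2**64).to_bytes(16, "little")
print(format_unsigned(wide, "little"))

# The same four bytes decode differently depending on byte order.
narrow = b"\x00\x00\x00\xff"
print(format_unsigned(narrow, "big"))
print(format_unsigned(narrow, "little"))
```

Keeping the byte order as a parameter of the decoding step, rather than as a property of the byte buffer, matches the reasoning in the commit message: a bag of bytes has no most or least significant byte until it is interpreted as a scalar.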


68 Commits