// Copyright (c) Meta Platforms, Inc. and affiliates.
// SPDX-License-Identifier: LGPL-2.1-or-later

Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
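
The move from "a set of strings" to "a bitmask" for qualifiers, mentioned in the commit message above, can be sketched roughly as follows. This is only an illustration: the `sketch_` names and the flag values are hypothetical and are not taken from libdrgn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical qualifier flags: one bit per qualifier. */
enum sketch_qualifiers {
	SKETCH_QUALIFIER_CONST = 1 << 0,
	SKETCH_QUALIFIER_VOLATILE = 1 << 1,
	SKETCH_QUALIFIER_RESTRICT = 1 << 2,
	SKETCH_QUALIFIER_ATOMIC = 1 << 3,
	SKETCH_ALL_QUALIFIERS = (1 << 4) - 1,
};

/* Return whether every qualifier in wanted is also set in quals. */
static bool sketch_has_qualifiers(unsigned int quals, unsigned int wanted)
{
	/* Reject bits that do not correspond to any known qualifier. */
	assert((wanted & ~(unsigned int)SKETCH_ALL_QUALIFIERS) == 0);
	return (quals & wanted) == wanted;
}
```

With this representation, combining qualifiers is a bitwise OR and testing for one is a bitwise AND, with no string comparison or set lookup involved.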

/**
 * @file
 *
 * Program internals.
 *
 * See @ref ProgramInternals.
 */

#ifndef DRGN_PROGRAM_H
#define DRGN_PROGRAM_H

libdrgn: use libdwfl
libdwfl is the elfutils "DWARF frontend library". It has high-level
functionality for looking up symbols, walking stack traces, etc. In
order to use this functionality, we need to report our debugging
information through libdwfl. For userspace programs, libdwfl has a much
better implementation than drgn for automatically finding debug
information from a core dump or PID. However, for the kernel, libdwfl
has a few issues:
- It only supports finding debug information for the running kernel, not
vmcores.
- It determines the vmlinux address range by reading /proc/kallsyms,
which is slow (~70ms on my machine).
- If separate debug information isn't available for a kernel module, it
finds it by walking /lib/modules/$(uname -r)/kernel; this is repeated
for every module.
- It doesn't find kernel modules with names containing both dashes and
underscores (e.g., aes-x86_64).
Luckily, drgn already solved all of these problems, and with some
effort, we can keep doing it ourselves and report it to libdwfl.
The conversion replaces a bunch of code for dealing with userspace core
dump notes, /proc/$pid/maps, and relocations.
2019-07-15 08:51:30 +01:00
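
The last issue in the list above concerns module names that mix dashes and underscores. One way such names could be compared is sketched below, under the assumption that a dash and an underscore should be treated as the same character in a module name. The function name is hypothetical, and this is not presented as the approach drgn or libdwfl actually takes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Compare two module names, treating '-' and '_' as equivalent
 * (an assumption made for this sketch).
 */
static bool sketch_module_names_equal(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	for (;; a++, b++) {
		char ca = *a == '-' ? '_' : *a;
		char cb = *b == '-' ? '_' : *b;

		if (ca != cb)
			return false;
		/* Both strings ended at the same position. */
		if (ca == '\0')
			return true;
	}
}
```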

#include <elfutils/libdwfl.h>
#include <libelf.h>
#include <sys/types.h>
#ifdef WITH_LIBKDUMPFILE
#include <libkdumpfile/kdumpfile.h>
#endif

#include "debug_info.h"
#include "drgn.h"
#include "hash_table.h"
#include "language.h"
#include "memory_reader.h"
#include "object_index.h"
#include "platform.h"
#include "pp.h"
libdrgn: introduce Symbol Finder API
Symbol lookup is not yet modular, like type or object lookup. However,
making it modular would enable easier development and prototyping of
alternative Symbol providers, such as Linux kernel module symbol tables,
vmlinux kallsyms tables, and BPF function symbols. To begin with, create
a modular Symbol API within libdrgn, and refactor the ELF symbol search
to use it.
For now, we leave drgn_program_find_symbol_by_address_internal() alone.
Its conversion will require some surgery, since the new API can return
errors, whereas this function cannot.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2024-03-02 00:46:53 +00:00
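
The idea of modular symbol providers described above can be sketched as a chain of callbacks that are tried in order until one succeeds. All names below are hypothetical and simplified for illustration. In particular, the commit message notes that the real API can return errors, which this sketch reduces to a found/not-found result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, minimal symbol record. */
struct sketch_symbol {
	const char *name;
	uint64_t address;
};

/* A provider callback: returns true and fills *ret if it knows the name. */
typedef bool sketch_symbol_find_fn(const char *name, void *arg,
				   struct sketch_symbol *ret);

/* Singly linked list of registered providers. */
struct sketch_symbol_finder {
	sketch_symbol_find_fn *fn;
	void *arg;
	struct sketch_symbol_finder *next;
};

/* Ask each provider in turn; the first one that finds the name wins. */
static bool sketch_find_symbol(const struct sketch_symbol_finder *finders,
			       const char *name, struct sketch_symbol *ret)
{
	assert(name != NULL && ret != NULL);
	for (const struct sketch_symbol_finder *f = finders; f; f = f->next) {
		if (f->fn(name, f->arg, ret))
			return true;
	}
	return false;
}

/*
 * Example provider: linear search of a table passed as arg and
 * terminated by an entry whose name is NULL.
 */
static bool sketch_table_finder(const char *name, void *arg,
				struct sketch_symbol *ret)
{
	for (const struct sketch_symbol *sym = arg; sym->name; sym++) {
		if (strcmp(sym->name, name) == 0) {
			*ret = *sym;
			return true;
		}
	}
	return false;
}
```

A new provider (for example one backed by a different symbol table) would only need to supply its own callback and be linked into the list; the lookup loop stays unchanged.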
#include "symbol.h"
#include "type.h"
#include "vector.h"

/**
 * @defgroup Internals Internals
 *
 * Internal implementation
 *
 * @{
 *
 * @defgroup ProgramInternals Programs
 *
 * Program internals.
 *
 * @{
 */

struct drgn_thread {
	struct drgn_program *prog;
	uint32_t tid;
	struct nstring prstatus;
	struct drgn_object object;
};

DEFINE_VECTOR_TYPE(drgn_typep_vector, struct drgn_type *);
DEFINE_VECTOR_TYPE(drgn_prstatus_vector, struct nstring);
DEFINE_HASH_TABLE_TYPE(drgn_thread_set, struct drgn_thread);

struct drgn_program {
	/** @privatesection */

	/*
	 * Memory/core dump.
	 */
	struct drgn_memory_reader reader;
	/* ELF core dump or /proc/pid/mem file segments. */
	struct drgn_memory_file_segment *file_segments;
	/* ELF core dump. Not valid for live programs or kdump files. */
	Elf *core;
	/* File descriptor for ELF core dump, kdump file, or /proc/pid/mem. */
	int core_fd;
	/* PID of live userspace program. */
	pid_t pid;
#ifdef WITH_LIBKDUMPFILE
	kdump_ctx_t *kdump_ctx;
#endif

	/*
	 * Types.
	 */
	/** Callbacks for finding types. */
	struct drgn_type_finder *type_finders;
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
	/** Void type for each language. */
	struct drgn_type void_types[DRGN_NUM_LANGUAGES];
	/** Cache of primitive types. */
	struct drgn_type *primitive_types[DRGN_PRIMITIVE_TYPE_NUM];
	/** Cache of deduplicated types. */
	struct drgn_dedupe_type_set dedupe_types;
	/**
	 * List of created types that are not deduplicated: types with non-empty
	 * lists of members, parameters, template parameters, or enumerators.
	 *
	 * Members, parameters, and template parameters contain lazily-evaluated
	 * objects, so they cannot be easily deduplicated.
	 *
	 * Enumerators could be deduplicated, but it's probably not worth the
	 * effort to hash and compare them.
	 */
	struct drgn_typep_vector created_types;
	/** Cache for @ref drgn_program_find_member(). */
	struct drgn_member_map members;
	/**
	 * Set of types which have already been cached in @ref
	 * drgn_program::members.
	 */
	struct drgn_type_set members_cached;

	/*
	 * Debugging information.
	 */
	struct drgn_object_index oindex;
	struct drgn_debug_info dbinfo;
	struct drgn_symbol_finder *symbol_finders;

	/*
	 * Program information.
	 */
	/* Default language of the program. */
	const struct drgn_language *lang;
	struct drgn_platform platform;
	bool has_platform;
	enum drgn_program_flags flags;

	/*
	 * Threads/stack traces.
	 */
	union {
		/*
		 * For the Linux kernel, PRSTATUS notes indexed by CPU. See
		 * drgn_get_initial_registers() for why we don't use the PID
		 * map.
		 */
		struct drgn_prstatus_vector prstatus_vector;
		/* For userspace programs, threads indexed by PID. */
		struct drgn_thread_set thread_set;
	};
	struct drgn_thread *main_thread;
	struct drgn_thread *crashed_thread;
	/*
	 * AArch64 instruction pointer authentication code mask, parsed either
	 * from NT_ARM_PAC_MASK or VMCOREINFO.
	 */
	uint64_t aarch64_insn_pac_mask;
	bool core_dump_notes_cached;
	bool prefer_orc_unwinder;

	/*
	 * Linux kernel-specific.
	 */
	/* The important parts of the VMCOREINFO note of a Linux kernel core. */
	struct {
		/** <tt>uname -r</tt> */
		char osrelease[128];
		/** PAGE_SIZE of the kernel. */
		uint64_t page_size;
		/**
		 * The offset from the compiled address of the kernel image to its
		 * actual address in memory.
		 *
		 * This is non-zero if kernel address space layout randomization (KASLR)
		 * is enabled.
		 */
		uint64_t kaslr_offset;
		/** Kernel page table. */
		uint64_t swapper_pg_dir;
		/** Length of mem_section array (i.e., NR_SECTION_ROOTS). */
		uint64_t mem_section_length;
		/** VA_BITS on AArch64. */
		uint64_t va_bits;
libdrgn: x86_64: avoid recursive address translation for swapper_pg_dir
Most core dumps contain some virtual address mappings: usually at a
minimum, the kernel's direct map is represented in ELF vmcores via a
segment. So normally, drgn can rely on the vmcore to read the virtual
address of swapper_pg_dir. However, some vmcores only contain physical
address information, so when drgn reads memory at swapper_pg_dir, it
needs to first translate that address, thus causing a recursive
translation error like below:
>>> prog["slab_caches"]
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/home/stepbren/repos/drgn/drgn/cli.py", line 141, in _displayhook
text = value.format_(columns=shutil.get_terminal_size((0, 0)).columns)
_drgn.FaultError: recursive address translation; page table may be missing from core dump: 0xffffffff9662aff8
Debuggers like crash, as well as libkdumpfile, contain fallback code
which can translate swapper_pg_dir in order to bootstrap this address
translation. In fact, the above error does not occur in drgn when using
libkdumpfile. So, let's add this fallback case to drgn as well. Other
architectures will need to have equivalent support added.
Co-authored-by: Illia Ostapyshyn <ostapyshyn@sra.uni-hannover.de>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2024-05-28 23:33:43 +01:00
		/** phys_base on x86_64 */
		uint64_t phys_base;
		/** Whether 5-level paging was enabled on x86-64. */
		bool pgtable_l5_enabled;
		/** PAGE_SHIFT of the kernel (derived from PAGE_SIZE). */
		int page_shift;

		/** The original vmcoreinfo data, to expose as an object */
		char *raw;
		size_t raw_size;
	} vmcoreinfo;
	/*
	 * Difference between a virtual address in the direct mapping and the
	 * physical address it maps to.
	 */
	uint64_t direct_mapping_offset;
	/* Cached vmemmap. */
	struct drgn_object vmemmap;
	/* Page table iterator. */
	struct pgtable_iterator *pgtable_it;
	/*
	 * Whether we are currently in address translation. Used to prevent
	 * address translation from recursing.
	 */
	bool in_address_translation;
	/* Whether @ref drgn_program::direct_mapping_offset has been cached. */
	bool direct_mapping_offset_cached;

	/*
	 * Logging.
	 */
	drgn_log_fn *log_fn;
	void *log_arg;
	enum drgn_log_level log_level;

	/*
	 * Blocking callbacks.
	 */
	drgn_program_begin_blocking_fn *begin_blocking_fn;
	drgn_program_end_blocking_fn *end_blocking_fn;
	void *blocking_arg;
};
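
The `in_address_translation` flag in the structure above is documented as preventing address translation from recursing. A minimal sketch of that kind of guard is shown below. The types, function names, and status codes are hypothetical stand-ins and do not reflect how libdrgn implements translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the program state. */
struct sketch_program {
	bool in_address_translation;
};

enum sketch_status { SKETCH_OK, SKETCH_ERR_RECURSIVE };

/* A page table walk; it may itself need to read memory. */
typedef enum sketch_status sketch_walk_fn(struct sketch_program *prog,
					  uint64_t virt, uint64_t *phys_ret);

/* Translate virt, refusing to start a translation inside another one. */
static enum sketch_status sketch_translate(struct sketch_program *prog,
					   sketch_walk_fn *walk, uint64_t virt,
					   uint64_t *phys_ret)
{
	assert(prog != NULL && walk != NULL && phys_ret != NULL);
	if (prog->in_address_translation)
		return SKETCH_ERR_RECURSIVE;
	prog->in_address_translation = true;
	enum sketch_status status = walk(prog, virt, phys_ret);
	/* Always clear the flag so that later translations can proceed. */
	prog->in_address_translation = false;
	return status;
}

/* A walk that succeeds without needing any further translation. */
static enum sketch_status sketch_identity_walk(struct sketch_program *prog,
					       uint64_t virt,
					       uint64_t *phys_ret)
{
	(void)prog;
	*phys_ret = virt;
	return SKETCH_OK;
}

/* A walk that has to translate again, as if the page table were unmapped. */
static enum sketch_status sketch_recursive_walk(struct sketch_program *prog,
						uint64_t virt,
						uint64_t *phys_ret)
{
	return sketch_translate(prog, sketch_recursive_walk, virt, phys_ret);
}
```

In this sketch, a walk that re-enters translation gets an error back instead of recursing without bound, which is the situation the "recursive address translation" error in the commit message earlier in this file describes.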

/** Initialize a @ref drgn_program. */
void drgn_program_init(struct drgn_program *prog,
		       const struct drgn_platform *platform);

/** Deinitialize a @ref drgn_program. */
void drgn_program_deinit(struct drgn_program *prog);

/**
 * Set the @ref drgn_platform of a @ref drgn_program if it hasn't been set
 * yet.
 */
void drgn_program_set_platform(struct drgn_program *prog,
			       const struct drgn_platform *platform);
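
The "set if it hasn't been set yet" behavior documented for `drgn_program_set_platform()` can be illustrated with a small set-once sketch. The structures and the function below are hypothetical simplifications and are not the libdrgn implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced platform description. */
struct sketch_platform {
	int arch;
};

/* Hypothetical program state: a platform plus a "has one" flag. */
struct sketch_program {
	struct sketch_platform platform;
	bool has_platform;
};

/* Record platform only if the program does not have one yet. */
static void sketch_set_platform(struct sketch_program *prog,
				const struct sketch_platform *platform)
{
	assert(prog != NULL && platform != NULL);
	if (!prog->has_platform) {
		prog->platform = *platform;
		prog->has_platform = true;
	}
}
```

Under this scheme the first caller decides the platform, and later calls leave it unchanged.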

/**
 * Implement @ref drgn_program_from_core_dump() on an initialized @ref
|
|
|
* drgn_program.
|
|
|
|
*/
|
|
|
|
struct drgn_error *drgn_program_init_core_dump(struct drgn_program *prog,
|
2019-05-10 07:53:16 +01:00
|
|
|
const char *path);
|
2019-03-22 23:27:46 +00:00
|
|
|
|
2023-08-16 07:38:04 +01:00
|
|
|
/**
|
|
|
|
* Implement @ref drgn_program_from_core_dump_fd() on an initialized @ref
|
|
|
|
* drgn_program.
|
|
|
|
*/
|
|
|
|
struct drgn_error *drgn_program_init_core_dump_fd(struct drgn_program *prog,
|
|
|
|
int fd);
|
|
|
|
|
2019-03-22 23:27:46 +00:00
|
|
|
/**
|
2019-05-10 07:53:16 +01:00
|
|
|
* Implement @ref drgn_program_from_kernel() on an initialized @ref
|
|
|
|
* drgn_program.
|
2019-03-22 23:27:46 +00:00
|
|
|
*/
|
2019-05-10 07:53:16 +01:00
|
|
|
struct drgn_error *drgn_program_init_kernel(struct drgn_program *prog);
|
2019-03-22 23:27:46 +00:00
|
|
|
|
|
|
|
/**
|
2019-05-10 07:53:16 +01:00
|
|
|
* Implement @ref drgn_program_from_pid() on an initialized @ref drgn_program.
|
2019-03-22 23:27:46 +00:00
|
|
|
*/
|
|
|
|
struct drgn_error *drgn_program_init_pid(struct drgn_program *prog, pid_t pid);
|
|
|
|
|
2023-09-29 20:00:31 +01:00
|
|
|
struct drgn_error *
|
|
|
|
drgn_program_add_object_finder_impl(struct drgn_program *prog,
|
|
|
|
struct drgn_object_finder *finder,
|
|
|
|
drgn_object_find_fn fn, void *arg);
|
|
|
|
|
2020-08-25 02:01:25 +01:00
|
|
|
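/**
* Return whether a @ref drgn_program is little-endian, or an error if its
* platform has not been set.
*/
static inline struct drgn_error *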
|
|
|
|
drgn_program_is_little_endian(struct drgn_program *prog, bool *ret)
|
2019-03-22 23:27:46 +00:00
|
|
|
{
|
2020-08-25 02:01:25 +01:00
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program byte order is not known");
|
|
|
|
}
|
2021-02-23 22:06:41 +00:00
|
|
|
*ret = drgn_platform_is_little_endian(&prog->platform);
|
2020-08-25 02:01:25 +01:00
|
|
|
return NULL;
|
2019-07-29 08:57:28 +01:00
|
|
|
}
|
|
|
|
|
2020-04-26 23:29:00 +01:00
|
|
|
/**
|
|
|
|
* Return whether a @ref drgn_program has a different endianness than the host
|
|
|
|
* system.
|
|
|
|
*/
|
2020-08-25 02:01:25 +01:00
|
|
|
static inline struct drgn_error *
|
|
|
|
drgn_program_bswap(struct drgn_program *prog, bool *ret)
|
2020-04-26 23:29:00 +01:00
|
|
|
{
|
2021-02-23 22:06:41 +00:00
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program byte order is not known");
|
|
|
|
}
|
|
|
|
*ret = drgn_platform_bswap(&prog->platform);
|
2020-08-25 02:01:25 +01:00
|
|
|
return NULL;
|
2020-04-26 23:29:00 +01:00
|
|
|
}
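The byte-swap flag above is typically consumed when decoding values read from the target. The following is a minimal, self-contained sketch of that idea, not libdrgn code: `demo_host_is_little_endian()`, `demo_needs_bswap()`, and `demo_read_u32()` are hypothetical names, and `__builtin_bswap32()` assumes GCC or Clang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Detect host byte order at run time by inspecting the first byte of 1. */
static bool demo_host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Hypothetical analogue of the bswap query: true if target != host. */
static bool demo_needs_bswap(bool target_is_little_endian)
{
	return target_is_little_endian != demo_host_is_little_endian();
}

/* Decode a 32-bit value from target memory, swapping only when needed. */
static uint32_t demo_read_u32(const void *buf, bool bswap)
{
	uint32_t value;

	assert(buf != NULL);
	memcpy(&value, buf, sizeof(value));
	return bswap ? __builtin_bswap32(value) : value;
}
```

Computing the flag once and passing it down keeps the per-read cost to a single branch, which is one plausible reason to expose it as a boolean rather than re-deriving the byte order on every access.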
|
|
|
|
|
2020-08-25 02:01:25 +01:00
|
|
|
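/**
* Return whether a @ref drgn_program is 64-bit, or an error if its platform
* has not been set.
*/
static inline struct drgn_error *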
|
|
|
|
drgn_program_is_64_bit(struct drgn_program *prog, bool *ret)
|
2019-07-29 08:57:28 +01:00
|
|
|
{
|
2020-08-25 02:01:25 +01:00
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program word size is not known");
|
|
|
|
}
|
2021-02-23 22:06:41 +00:00
|
|
|
*ret = drgn_platform_is_64_bit(&prog->platform);
|
2020-08-25 02:01:25 +01:00
|
|
|
return NULL;
|
2019-03-22 23:27:46 +00:00
|
|
|
}
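The accessors above share one convention: return an error pointer when the platform is unknown, otherwise write the result through `ret` and return `NULL`. The sketch below illustrates that convention and how a caller might propagate the error. Every name in it (`struct demo_error`, `struct demo_program`, `demo_program_is_64_bit()`, `demo_pointer_size()`) is a hypothetical stand-in, and the statically allocated error replaces whatever `drgn_error_create()` does internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct drgn_error and struct drgn_program. */
struct demo_error {
	const char *message;
};

struct demo_program {
	bool has_platform;
	bool is_64_bit;
};

/* Static error object; the real library allocates errors differently. */
static struct demo_error demo_no_platform = {
	"program word size is not known",
};

/* Error-or-NULL accessor: *ret is written only on success. */
static struct demo_error *
demo_program_is_64_bit(const struct demo_program *prog, bool *ret)
{
	assert(prog != NULL && ret != NULL);
	if (!prog->has_platform)
		return &demo_no_platform;
	*ret = prog->is_64_bit;
	return NULL;
}

/* Caller: propagate the error unchanged, otherwise use the value. */
static struct demo_error *
demo_pointer_size(const struct demo_program *prog, size_t *ret)
{
	bool is_64_bit;
	struct demo_error *err = demo_program_is_64_bit(prog, &is_64_bit);

	if (err)
		return err;
	*ret = is_64_bit ? 8 : 4;
	return NULL;
}
```

Leaving the output untouched on failure means a caller can never mistake a stale default for a real answer as long as it checks the returned pointer first.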
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
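/**
* Get the address size of a @ref drgn_program, or an error if its platform
* has not been set.
*/
static inline struct drgn_error *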
|
2021-02-23 22:06:41 +00:00
|
|
|
drgn_program_address_size(struct drgn_program *prog, uint8_t *ret)
|
2020-07-16 00:34:56 +01:00
|
|
|
{
|
2021-02-23 22:06:41 +00:00
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program address size is not known");
|
|
|
|
}
|
|
|
|
*ret = drgn_platform_address_size(&prog->platform);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
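/**
* Get the address mask of a @ref drgn_program, or an error if its platform
* has not been set.
*/
static inline struct drgn_error *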
|
|
|
|
drgn_program_address_mask(const struct drgn_program *prog, uint64_t *ret)
|
|
|
|
{
|
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program address size is not known");
|
|
|
|
}
|
|
|
|
*ret = drgn_platform_address_mask(&prog->platform);
|
2020-07-16 00:34:56 +01:00
|
|
|
return NULL;
|
2023-06-27 23:22:22 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
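/**
* Mask @p address to the address size of a @ref drgn_program and strip any
* architecture-specific tag bits, in place. Returns an error if the platform
* has not been set.
*/
static inline struct drgn_error *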
|
|
|
|
drgn_program_untagged_addr(const struct drgn_program *prog, uint64_t *address)
|
|
|
|
{
|
|
|
|
if (!prog->has_platform) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_INVALID_ARGUMENT,
|
|
|
|
"program address size is not known");
|
|
|
|
}
|
|
|
|
*address &= drgn_platform_address_mask(&prog->platform);
|
|
|
|
if (prog->platform.arch->untagged_addr)
|
|
|
|
*address = prog->platform.arch->untagged_addr(*address);
|
|
|
|
return NULL;
|
2020-07-16 00:34:56 +01:00
|
|
|
}
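The two steps in the function above (mask to the address size, then apply an optional per-architecture hook) can be sketched in isolation. This is an illustration under stated assumptions, not libdrgn's implementation: `demo_address_mask()` assumes the mask is simply the low `8 * address_size` bits, and `demo_clear_top_byte()` is a made-up hook that clears an 8-bit tag, which is not necessarily how any real architecture callback behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-architecture hook type. */
typedef uint64_t demo_untag_fn(uint64_t address);

/* Assumed mask: all ones for 8-byte addresses, else the low 8*size bits. */
static uint64_t demo_address_mask(uint8_t address_size)
{
	return address_size >= 8 ?
	       UINT64_MAX : (UINT64_C(1) << (8 * address_size)) - 1;
}

/* Made-up hook: treat the top byte as a tag and clear it. */
static uint64_t demo_clear_top_byte(uint64_t address)
{
	return address & UINT64_C(0x00ffffffffffffff);
}

/* Mask first, then let the architecture strip its tag bits, if any. */
static uint64_t demo_untagged_addr(uint64_t address, uint8_t address_size,
				   demo_untag_fn *untag)
{
	address &= demo_address_mask(address_size);
	if (untag)
		address = untag(address);
	return address;
}
```

Making the hook optional lets architectures without tagged pointers pay nothing beyond the mask.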
|
|
|
|
|
2021-11-19 00:46:59 +00:00
|
|
|
struct drgn_error *drgn_thread_dup_internal(const struct drgn_thread *thread,
|
|
|
|
struct drgn_thread *ret);
|
|
|
|
|
|
|
|
void drgn_thread_deinit(struct drgn_thread *thread);
|
|
|
|
|
2020-05-20 19:30:00 +01:00
|
|
|
/**
|
|
|
|
* Find the @c NT_PRSTATUS note for the given CPU.
|
|
|
|
*
|
|
|
|
* This is only valid for the Linux kernel.
|
|
|
|
*
|
|
|
|
* @param[out] ret Returned note data. If not found, <tt>ret->str</tt> is set to
|
|
|
|
* @c NULL and <tt>ret->len</tt> is set to zero.
|
|
|
|
*/
|
|
|
|
struct drgn_error *drgn_program_find_prstatus_by_cpu(struct drgn_program *prog,
|
|
|
|
uint32_t cpu,
|
2023-10-30 22:47:30 +00:00
|
|
|
struct nstring *ret);
|
2020-05-20 19:30:00 +01:00
|
|
|
|
2019-10-24 22:26:45 +01:00
|
|
|
/**
|
|
|
|
* Find the @c NT_PRSTATUS note for the given thread ID.
|
|
|
|
*
|
2020-05-20 19:30:00 +01:00
|
|
|
* This is only valid for userspace programs.
|
2019-10-24 22:26:45 +01:00
|
|
|
*
|
|
|
|
* @param[out] ret Returned note data. If not found, <tt>ret->str</tt> is set to
|
|
|
|
* @c NULL and <tt>ret->len</tt> is set to zero.
|
|
|
|
*/
|
2020-05-20 19:30:00 +01:00
|
|
|
struct drgn_error *drgn_program_find_prstatus_by_tid(struct drgn_program *prog,
|
|
|
|
uint32_t tid,
|
2021-11-10 23:09:29 +00:00
|
|
|
struct nstring *ret);
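Both lookup functions documented above treat "not found" as a successful call that reports an empty result, rather than as an error. A minimal sketch of that contract follows. `struct demo_nstring` mirrors only the `str` and `len` fields named in the comments; `struct demo_note`, `struct demo_error`, and `demo_find_note_by_tid()` are hypothetical, and the linear scan stands in for whatever cache the library actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the two fields the documentation mentions. */
struct demo_nstring {
	const char *str;
	size_t len;
};

/* Hypothetical error type; never instantiated in this sketch. */
struct demo_error {
	const char *message;
};

/* Hypothetical cached note keyed by thread ID. */
struct demo_note {
	uint32_t tid;
	const char *data;
	size_t size;
};

/*
 * Look up a note by thread ID. A miss is not an error: the function returns
 * NULL and reports an empty result (str == NULL, len == 0).
 */
static struct demo_error *
demo_find_note_by_tid(const struct demo_note *notes, size_t num_notes,
		      uint32_t tid, struct demo_nstring *ret)
{
	assert(ret != NULL);
	for (size_t i = 0; i < num_notes; i++) {
		if (notes[i].tid == tid) {
			ret->str = notes[i].data;
			ret->len = notes[i].size;
			return NULL;
		}
	}
	ret->str = NULL;
	ret->len = 0;
	return NULL;
}
```

Callers therefore need two checks: the returned error pointer for genuine failures, and `ret->str` for whether anything was found.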
|
2019-10-24 22:26:45 +01:00
|
|
|
|
2020-02-11 22:54:09 +00:00
|
|
|
/**
|
|
|
|
* Cache the @c NT_PRSTATUS note provided by @p data in @p prog.
|
|
|
|
*
|
|
|
|
* @param[in] data The pointer to the note data.
|
|
|
|
* @param[in] size Size of data in note.
|
2021-11-19 00:46:59 +00:00
|
|
|
* @param[out] ret Thread ID from note.
|
2020-02-11 22:54:09 +00:00
|
|
|
*/
|
|
|
|
struct drgn_error *drgn_program_cache_prstatus_entry(struct drgn_program *prog,
|
2021-11-19 00:46:59 +00:00
|
|
|
const char *data,
|
|
|
|
size_t size,
|
|
|
|
uint32_t *ret);
|
2020-02-11 22:54:09 +00:00
|
|
|
|
2019-07-25 08:47:13 +01:00
|
|
|
/**
|
2024-03-02 00:46:53 +00:00
|
|
|
* Like @ref drgn_program_find_symbol_by_address(), but returns @c NULL rather
|
|
|
|
* than a lookup error if the symbol was not found.
|
2019-12-10 19:02:55 +00:00
|
|
|
*
|
2024-03-02 00:46:53 +00:00
|
|
|
* @param[in] address Address to search for.
|
|
|
|
* @param[out] ret The symbol found by the lookup, if any.
|
|
|
|
* @return @c NULL unless an error other than a failed lookup was encountered.
|
2019-07-25 08:47:13 +01:00
|
|
|
*/
|
2024-03-02 00:46:53 +00:00
|
|
|
struct drgn_error *
|
|
|
|
drgn_program_find_symbol_by_address_internal(struct drgn_program *prog,
|
|
|
|
uint64_t address,
|
|
|
|
struct drgn_symbol **ret);
|
2019-07-25 08:47:13 +01:00
|
|
|
|
libdrgn: introduce Symbol Finder API
Unlike type and object lookup, symbol lookup is not yet modular. However,
making it modular would enable easier development and prototyping of
alternative Symbol providers, such as Linux kernel module symbol tables,
vmlinux kallsyms tables, and BPF function symbols. To begin with, create
a modular Symbol API within libdrgn, and refactor the ELF symbol search
to use it.
For now, we leave drgn_program_find_symbol_by_address_internal() alone.
Its conversion will require some surgery, since the new API can return
errors, whereas this function cannot.
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
2024-03-02 00:46:53 +00:00
|
|
|
struct drgn_error *
|
|
|
|
drgn_program_add_symbol_finder_impl(struct drgn_program *prog,
|
|
|
|
struct drgn_symbol_finder *finder,
|
|
|
|
drgn_symbol_find_fn fn, void *arg);
|
2023-07-18 18:14:26 +01:00
|
|
|
/**
|
|
|
|
* Call before a blocking (I/O or long-running) operation.
|
|
|
|
*
|
|
|
|
* Must be paired with @ref drgn_program_end_blocking().
|
|
|
|
*
|
|
|
|
* @return Opaque pointer to pass to @ref drgn_program_end_blocking().
|
|
|
|
*/
|
|
|
|
void *drgn_program_begin_blocking(struct drgn_program *prog);
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Call after a blocking (I/O or long-running) operation.
|
|
|
|
*
|
|
|
|
* @param[in] state Return value of @ref drgn_program_begin_blocking().
|
|
|
|
*/
|
|
|
|
void drgn_program_end_blocking(struct drgn_program *prog, void *state);
|
|
|
|
|
|
|
|
struct drgn_blocking_guard_struct {
|
|
|
|
struct drgn_program *prog;
|
|
|
|
void *state;
|
|
|
|
};
|
|
|
|
|
|
|
|
static inline struct drgn_blocking_guard_struct
|
|
|
|
drgn_blocking_guard_init(struct drgn_program *prog)
|
|
|
|
{
|
|
|
|
return (struct drgn_blocking_guard_struct){
|
|
|
|
prog, drgn_program_begin_blocking(prog),
|
|
|
|
};
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void
|
|
|
|
drgn_blocking_guard_cleanup(struct drgn_blocking_guard_struct *guard)
|
|
|
|
{
|
|
|
|
drgn_program_end_blocking(guard->prog, guard->state);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Scope guard that wraps @ref drgn_program_begin_blocking() and @ref
|
|
|
|
* drgn_program_end_blocking().
|
|
|
|
*/
|
|
|
|
#define drgn_blocking_guard(prog) \
|
|
|
|
struct drgn_blocking_guard_struct PP_UNIQUE(guard) \
|
|
|
|
__attribute__((__cleanup__(drgn_blocking_guard_cleanup), __unused__)) = \
|
|
|
|
drgn_blocking_guard_init(prog)
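The macro above relies on the GCC/Clang `__cleanup__` variable attribute so that the end call runs on every exit path from the enclosing scope. The sketch below reproduces the pattern with hypothetical stand-ins (`struct demo_program`, `demo_begin_blocking()`, `demo_end_blocking()`, `struct demo_guard`) that only count calls; it does not use `PP_UNIQUE` and names the guard variable directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drgn_program; it only counts calls. */
struct demo_program {
	int begin_calls;
	int end_calls;
};

/* Stand-in for the begin call: returns an opaque state pointer. */
static void *demo_begin_blocking(struct demo_program *prog)
{
	prog->begin_calls++;
	return prog; /* any opaque pointer is enough for this sketch */
}

/* Stand-in for the end call: must receive the state from begin. */
static void demo_end_blocking(struct demo_program *prog, void *state)
{
	assert(state == prog);
	prog->end_calls++;
}

struct demo_guard {
	struct demo_program *prog;
	void *state;
};

/* Invoked automatically when the guard variable goes out of scope. */
static void demo_guard_cleanup(struct demo_guard *guard)
{
	demo_end_blocking(guard->prog, guard->state);
}

/* The guard ends the blocking section on both the early and normal return. */
static int demo_blocking_work(struct demo_program *prog, int fail_early)
{
	struct demo_guard guard
	__attribute__((__cleanup__(demo_guard_cleanup), __unused__)) = {
		prog, demo_begin_blocking(prog),
	};

	if (fail_early)
		return -1;
	return 0;
}
```

Tying the end call to scope exit removes the need to repeat it before each `return`, which is where hand-paired begin/end calls tend to go wrong.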
|
|
|
|
|
2020-09-30 09:32:33 +01:00
|
|
|
/**
|
|
|
|
* @}
|
|
|
|
* @}
|
|
|
|
*/
|
2019-03-22 23:27:46 +00:00
|
|
|
|
|
|
|
#endif /* DRGN_PROGRAM_H */
|