// Copyright (c) Meta Platforms, Inc. and affiliates.
// SPDX-License-Identifier: GPL-3.0-or-later
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00

/**
 * @file
 *
 * Memory reading interface.
 *
 * See @ref MemoryReader.
 */

#ifndef DRGN_MEMORY_READER_H
#define DRGN_MEMORY_READER_H

#include "binary_search_tree.h"
#include "drgn.h"

/**
 * @ingroup Internals
 *
 * @defgroup MemoryReader Memory reader
 *
 * Memory reading interface.
 *
 * @ref drgn_memory_reader provides a common interface for registering regions
 * of memory in a program and reading from memory.
 *
 * @ref drgn_memory_reader does not have a notion of the maximum address or
 * address overflow/wrap-around. Those must be handled at a higher layer.
 *
 * @{
 */

DEFINE_BINARY_SEARCH_TREE_TYPE(drgn_memory_segment_tree,
			       struct drgn_memory_segment)

/**
 * Memory reader.
 *
 * A memory reader maps the segments of memory in an address space to callbacks
 * which can be used to read memory from those segments.
 */
struct drgn_memory_reader {
	/** Virtual memory segments. */
	struct drgn_memory_segment_tree virtual_segments;
	/** Physical memory segments. */
	struct drgn_memory_segment_tree physical_segments;
};
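
/*
 * Illustrative sketch only, not part of this header. The reader above keeps
 * its segments in binary search trees keyed by address; the snippet below
 * models the same "address to segment" lookup with a sorted array instead, so
 * that it compiles on its own. The names example_segment and
 * example_find_segment are hypothetical, and the sketch assumes segments are
 * sorted by min_address and do not overlap.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a registered segment: an inclusive range. */
struct example_segment {
	uint64_t min_address; /* first address covered (inclusive) */
	uint64_t max_address; /* last address covered (inclusive) */
};

/*
 * Return the segment containing address, or NULL if no segment covers it.
 * segs must be sorted by min_address and must not overlap (an assumption of
 * this sketch, not something stated by the header).
 */
static const struct example_segment *
example_find_segment(const struct example_segment *segs, size_t n,
		     uint64_t address)
{
	size_t lo = 0, hi = n;

	/* Find the first segment whose min_address is greater than address. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (segs[mid].min_address <= address)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (lo == 0)
		return NULL;

	/* The candidate is the last segment starting at or below address. */
	const struct example_segment *seg = &segs[lo - 1];

	return address <= seg->max_address ? seg : NULL;
}
```

/*
 * Because both bounds are inclusive, an address equal to max_address still
 * belongs to the segment, and a segment may end at UINT64_MAX without any
 * overflow in the comparison.
 */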

/**
 * Initialize a @ref drgn_memory_reader.
 *
 * The reader is initialized with no segments.
 */
void drgn_memory_reader_init(struct drgn_memory_reader *reader);

/** Deinitialize a @ref drgn_memory_reader. */
void drgn_memory_reader_deinit(struct drgn_memory_reader *reader);

/** Return whether a @ref drgn_memory_reader has no segments. */
bool drgn_memory_reader_empty(struct drgn_memory_reader *reader);

/**
 * Add a segment to a @ref drgn_memory_reader.
 *
 * @param[in] reader Memory reader.
 * @param[in] min_address Start address (inclusive).
 * @param[in] max_address End address (inclusive). Must be `>= min_address`.
 * @param[in] read_fn Callback to read from segment.
 * @param[in] arg Argument to pass to @p read_fn.
 * @param[in] physical Whether to add a physical memory segment.
 * @return @c NULL on success, non-@c NULL on error.
 */
struct drgn_error *
drgn_memory_reader_add_segment(struct drgn_memory_reader *reader,
			       uint64_t min_address, uint64_t max_address,
			       drgn_memory_read_fn read_fn, void *arg,
			       bool physical);
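
/*
 * Illustrative sketch only, not part of this header. It shows two things a
 * caller of the function above might need: turning a (start, size) pair into
 * the inclusive max_address, and a read callback shaped like
 * drgn_read_memory_file() that serves bytes from a plain buffer. The types
 * example_error and example_buffer_segment and both functions are
 * hypothetical stand-ins so the snippet compiles without drgn.h. The meaning
 * given to offset (distance of address from the segment start) is an
 * assumption of the sketch.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct drgn_error. */
struct example_error {
	const char *message;
};

/* Hypothetical backing store for one segment: a plain byte buffer. */
struct example_buffer_segment {
	const uint8_t *data;
	uint64_t size;
};

/*
 * Compute the inclusive end address of a segment from its start and size.
 * Returns false if the segment is empty or would wrap past UINT64_MAX.
 */
static bool example_inclusive_max(uint64_t min_address, uint64_t size,
				  uint64_t *max_address)
{
	if (size == 0 || size - 1 > UINT64_MAX - min_address)
		return false;
	*max_address = min_address + (size - 1);
	return true;
}

/*
 * Read callback with the same parameter list as drgn_read_memory_file().
 * Returns NULL on success and a non-NULL error otherwise.
 */
static struct example_error *
example_read_buffer(void *buf, uint64_t address, size_t count,
		    uint64_t offset, void *arg, bool physical)
{
	static struct example_error fault = { "read out of bounds" };
	const struct example_buffer_segment *seg = arg;

	(void)address;  /* unused: the buffer is indexed by offset */
	(void)physical; /* unused: the buffer has a single address space */
	if (offset > seg->size || count > seg->size - offset)
		return &fault;
	memcpy(buf, seg->data + offset, count);
	return NULL;
}
```

/*
 * With these helpers, a 0x1000-byte buffer mapped at 0x1000 would be
 * registered with min_address 0x1000 and max_address 0x1fff.
 */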

/**
 * Read from a @ref drgn_memory_reader.
 *
 * @param[in] reader Memory reader.
 * @param[out] buf Buffer to read into.
 * @param[in] address Starting address in memory to read.
 * @param[in] count Number of bytes to read. `address + count - 1` must be
 * `<= UINT64_MAX`
 * @param[in] physical Whether @c address is physical.
 * @return @c NULL on success, non-@c NULL on error.
 */
struct drgn_error *drgn_memory_reader_read(struct drgn_memory_reader *reader,
					   void *buf, uint64_t address,
					   size_t count, bool physical);
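
/*
 * Illustrative sketch only, not part of this header. The reader has no notion
 * of address wrap-around, so the requirement that `address + count - 1` be
 * `<= UINT64_MAX` has to be enforced by the caller. The hypothetical helper
 * below checks it without ever computing a sum that could overflow. Treating
 * a zero-length read as valid is an assumption of the sketch.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return whether the inclusive range [address, address + count - 1] fits in
 * the 64-bit address space.
 */
static bool example_read_range_ok(uint64_t address, size_t count)
{
	/* Assumption: an empty read touches no address and is always fine. */
	if (count == 0)
		return true;
	/* Rearranged so that nothing on either side can wrap around. */
	return (uint64_t)(count - 1) <= UINT64_MAX - address;
}
```

/*
 * A higher layer could call such a check first and either reject the request
 * or split it before handing it to the reader.
 */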

/** Argument for @ref drgn_read_memory_file(). */
struct drgn_memory_file_segment {
	/** Offset in the file where the segment starts. */
	uint64_t file_offset;
	/**
	 * Size of the segment in the file. This may be less than the size of
	 * the segment in memory.
	 */
	uint64_t file_size;
	/** File descriptor. */
	int fd;
	/**
	 * If @c true, EIO is treated as a fault. Otherwise, it is treated as an
	 * OS error.
	 */
	bool eio_is_fault;
	/**
	 * If @c true, reads between @ref file_size and the size of the segment
	 * in memory will be returned as zeroes. Otherwise, such reads will
	 * result in a fault.
	 */
	bool zerofill;
};

/** @ref drgn_memory_read_fn which reads from a file. */
struct drgn_error *drgn_read_memory_file(void *buf, uint64_t address,
					 size_t count, uint64_t offset,
					 void *arg, bool physical);
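
/*
 * Illustrative sketch only, not part of this header and not the actual
 * implementation of drgn_read_memory_file(). It models the documented
 * file_size and zerofill behaviour with an in-memory array standing in for
 * the file contents, so no file descriptor is needed. The names
 * example_file_segment and example_read_file_segment are hypothetical, and
 * the sketch assumes the caller has already confirmed that the read lies
 * within the segment's size in memory.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct drgn_memory_file_segment. */
struct example_file_segment {
	const uint8_t *file_data; /* bytes that would sit at file_offset */
	uint64_t file_size;       /* number of bytes backed by the file */
	bool zerofill;            /* zero-fill reads beyond file_size? */
};

/*
 * Read count bytes starting offset bytes into the segment.
 * Returns true on success and false where the real callback would fault.
 */
static bool example_read_file_segment(void *buf, size_t count, uint64_t offset,
				      const struct example_file_segment *seg)
{
	uint8_t *out = buf;
	size_t from_file = 0;

	/* Work out how many of the requested bytes are backed by the file. */
	if (offset < seg->file_size) {
		uint64_t avail = seg->file_size - offset;

		from_file = count < avail ? count : (size_t)avail;
	}
	/* Bytes beyond file_size are only readable when zerofill is set. */
	if (from_file < count && !seg->zerofill)
		return false;
	if (from_file > 0)
		memcpy(out, seg->file_data + offset, from_file);
	memset(out + from_file, 0, count - from_file);
	return true;
}
```

/*
 * With a 4-byte file part, a 4-byte read at offset 2 returns the last two
 * file bytes followed by two zeroes when zerofill is set, and fails when it
 * is not.
 */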

/** @} */

#endif /* DRGN_MEMORY_READER_H */