// Copyright (c) Meta Platforms, Inc. and affiliates.
// SPDX-License-Identifier: LGPL-2.1-or-later
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
/**
 * @file
 *
 * Serialization and deserialization to and from memory.
 *
 * See @ref SerializationDeserialization.
 */

#ifndef DRGN_SERIALIZE_H
#define DRGN_SERIALIZE_H

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#include "minmax.h"

/**
 * @ingroup Internals
 *
 * @defgroup SerializationDeserialization Serialization/deserialization
 *
 * Serialization and deserialization of bits to and from memory.
 *
 * @{
 */

/** Truncate a signed integer to @p bit_size bits with sign extension. */
static inline int64_t truncate_signed(int64_t svalue, uint64_t bit_size)
{
	return (int64_t)((uint64_t)svalue << (64 - bit_size)) >> (64 - bit_size);
}

/** Truncate an unsigned integer to @p bit_size bits. */
static inline uint64_t truncate_unsigned(uint64_t uvalue, uint64_t bit_size)
{
	return uvalue << (64 - bit_size) >> (64 - bit_size);
}

/** Truncate a signed 8-bit integer to @p bit_size bits with sign extension. */
static inline int8_t truncate_signed8(int8_t svalue, int bit_size)
{
	return (int8_t)((uint8_t)svalue << (8 - bit_size)) >> (8 - bit_size);
}

/** Truncate an unsigned 8-bit integer to @p bit_size bits. */
static inline uint8_t truncate_unsigned8(uint8_t uvalue, int bit_size)
{
	return (uint8_t)(uvalue << (8 - bit_size)) >> (8 - bit_size);
}

/**
 * Copy the @p src_size least-significant bytes from @p src to the @p dst_size
 * least-significant bytes of @p dst.
 *
 * If `src_size > dst_size`, the extra bytes are discarded. If `src_size <
 * dst_size`, the extra bytes are zero-filled.
 */
static inline void copy_lsbytes(void *dst, size_t dst_size,
				bool dst_little_endian, const void *src,
				size_t src_size, bool src_little_endian)
{
	char *d = dst;
	const char *s = src;
	size_t size = min(dst_size, src_size);
	if (dst_little_endian) {
		if (src_little_endian) {
			memcpy(d, s, size);
		} else {
			for (size_t i = 0; i < size; i++)
				d[i] = s[src_size - 1 - i];
		}
		memset(d + size, 0, dst_size - size);
	} else {
		memset(d, 0, dst_size - size);
		if (src_little_endian) {
			for (size_t i = dst_size - size; i < dst_size; i++)
				d[i] = s[dst_size - 1 - i];
		} else {
			memcpy(d + dst_size - size, s + src_size - size, size);
		}
	}
}

/**
 * Return a bit mask with bits `[bit_offset, 7]` set.
 *
 * @param[in] lsb0 See @ref copy_bits().
 */
static inline uint8_t copy_bits_first_mask(unsigned int bit_offset, bool lsb0)
{
	return lsb0 ? 0xff << bit_offset : 0xff >> bit_offset;
}

/**
 * Return a bit mask with bits `[0, last_bit % 8]` set.
 *
 * @param[in] lsb0 See @ref copy_bits().
 */
static inline uint8_t copy_bits_last_mask(uint64_t last_bit, bool lsb0)
{
	return lsb0 ? 0xff >> (7 - last_bit % 8) : 0x7f80 >> (last_bit % 8);
}

/**
 * Copy @p bit_size bits from @p src at bit offset @p src_bit_offset to @p dst
 * at bit offset @p dst_bit_offset.
 *
 * @param[in] dst Destination buffer.
 * @param[in] dst_bit_offset Offset in bits from the beginning of @p dst to copy
 * to. Must be < 8.
 * @param[in] src Source buffer.
 * @param[in] src_bit_offset Offset in bits from the beginning of @p src to copy
 * from. Must be < 8.
 * @param[in] bit_size Number of bits to copy.
 * @param[in] lsb0 If @c true, bits within a byte are numbered from least
 * significant (0) to most significant (7); if @c false, they are numbered from
 * most significant (0) to least significant (7). This determines the
 * interpretation of @p dst_bit_offset and @p src_bit_offset.
 */
void copy_bits(void *dst, unsigned int dst_bit_offset, const void *src,
	       unsigned int src_bit_offset, uint64_t bit_size, bool lsb0);

/**
 * Serialize bits to a memory buffer.
 *
 * Note that this does not perform any bounds checking, so the caller must check
 * that <tt>bit_offset + bit_size</tt> is within the buffer.
 *
 * @param[in] buf Memory buffer to write to.
 * @param[in] bit_offset Offset in bits from the beginning of @p buf to where to
 * write. This is interpreted differently based on @p little_endian.
 * @param[in] uvalue Bits to write, in host order.
 * @param[in] bit_size Number of bits in @p uvalue. This must be greater than
 * zero and no more than 64. Note that this is not checked or truncated, so if
 * @p uvalue has more than this many bits, the results will likely be incorrect.
 * @param[in] little_endian Whether the bits should be written out in
 * little-endian order.
 */
void serialize_bits(void *buf, uint64_t bit_offset, uint64_t uvalue,
		    uint8_t bit_size, bool little_endian);

/**
 * Deserialize bits from a memory buffer.
 *
 * Note that this does not perform any bounds checking, so the caller must check
 * that <tt>bit_offset + bit_size</tt> is within the buffer.
 *
 * @param[in] buf Memory buffer to read from.
 * @param[in] bit_offset Offset in bits from the beginning of @p buf to where to
 * read from. This is interpreted differently based on @p little_endian.
 * @param[in] bit_size Number of bits to read. This must be greater than zero
 * and no more than 64.
 * @param[in] little_endian Whether the bits should be interpreted in
 * little-endian order.
 * @return The read bits in host order.
 */
uint64_t deserialize_bits(const void *buf, uint64_t bit_offset,
			  uint8_t bit_size, bool little_endian);

/** @} */

#endif /* DRGN_SERIALIZE_H */