2020-05-15 23:13:02 +01:00
|
|
|
// Copyright (c) Facebook, Inc. and its affiliates.
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
// SPDX-License-Identifier: GPL-3.0+
|
|
|
|
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <assert.h>
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
#include <dwarf.h>
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <elf.h>
|
2020-10-22 23:34:47 +01:00
|
|
|
#include <elfutils/known-dwarf.h>
|
2019-04-28 20:13:29 +01:00
|
|
|
#include <elfutils/libdw.h>
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <elfutils/libdwelf.h>
|
2020-09-24 00:02:02 +01:00
|
|
|
#include <errno.h>
|
|
|
|
#include <fcntl.h>
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <gelf.h>
|
|
|
|
#include <inttypes.h>
|
2020-09-24 00:02:02 +01:00
|
|
|
#include <stdarg.h>
|
|
|
|
#include <stdio.h>
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <stdlib.h>
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
#include <string.h>
|
2020-09-16 01:42:53 +01:00
|
|
|
#include <unistd.h>
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
#include "debug_info.h"
|
2020-09-24 00:02:02 +01:00
|
|
|
#include "error.h"
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
#include "hash_table.h"
|
2020-09-16 01:42:53 +01:00
|
|
|
#include "language.h"
|
2020-12-18 19:01:29 +00:00
|
|
|
#include "lazy_object.h"
|
2020-09-16 01:42:53 +01:00
|
|
|
#include "linux_kernel.h"
|
2020-07-13 23:21:55 +01:00
|
|
|
#include "object.h"
|
2020-09-24 00:02:02 +01:00
|
|
|
#include "path.h"
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
#include "program.h"
|
2020-04-23 00:23:26 +01:00
|
|
|
#include "type.h"
|
2020-09-24 00:02:02 +01:00
|
|
|
#include "util.h"
|
2019-07-09 23:50:45 +01:00
|
|
|
#include "vector.h"
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-10-22 23:34:47 +01:00
|
|
|
#define DW_TAG_UNKNOWN_FORMAT "unknown DWARF tag 0x%02x"
|
|
|
|
#define DW_TAG_BUF_LEN (sizeof(DW_TAG_UNKNOWN_FORMAT) - 4 + 2 * sizeof(int))
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Get the name of a DWARF tag.
|
|
|
|
*
|
|
|
|
* @return Static string if the tag is known or @p buf if the tag is unknown
|
|
|
|
* (populated with a description).
|
|
|
|
*/
|
|
|
|
static const char *dw_tag_str(int tag, char buf[DW_TAG_BUF_LEN])
|
|
|
|
{
|
|
|
|
switch (tag) {
|
|
|
|
#define DWARF_ONE_KNOWN_DW_TAG(name, value) case value: return "DW_TAG_" #name;
|
|
|
|
DWARF_ALL_KNOWN_DW_TAG
|
|
|
|
#undef DWARF_ONE_KNOWN_DW_TAG
|
|
|
|
default:
|
|
|
|
sprintf(buf, DW_TAG_UNKNOWN_FORMAT, tag);
|
|
|
|
return buf;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/** Like @ref dw_tag_str(), but takes a @c Dwarf_Die. */
|
|
|
|
static const char *dwarf_tag_str(Dwarf_Die *die, char buf[DW_TAG_BUF_LEN])
|
|
|
|
{
|
|
|
|
return dw_tag_str(dwarf_tag(die), buf);
|
|
|
|
}
|
|
|
|
|
2020-10-23 00:09:31 +01:00
|
|
|
static const char * const drgn_debug_scn_names[] = {
|
|
|
|
[DRGN_SCN_DEBUG_INFO] = ".debug_info",
|
|
|
|
[DRGN_SCN_DEBUG_ABBREV] = ".debug_abbrev",
|
|
|
|
[DRGN_SCN_DEBUG_STR] = ".debug_str",
|
|
|
|
[DRGN_SCN_DEBUG_LINE] = ".debug_line",
|
|
|
|
};
|
|
|
|
|
2020-10-29 00:35:09 +00:00
|
|
|
|
|
|
|
struct drgn_error *drgn_error_debug_info(struct drgn_debug_info_module *module,
|
|
|
|
enum drgn_debug_info_scn scn,
|
|
|
|
const char *ptr, const char *message)
|
|
|
|
{
|
|
|
|
const char *name = dwfl_module_info(module->dwfl_module, NULL, NULL,
|
|
|
|
NULL, NULL, NULL, NULL, NULL);
|
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER, "%s: %s+%#tx: %s",
|
|
|
|
name, drgn_debug_scn_names[scn],
|
|
|
|
ptr - (const char *)module->scns[scn]->d_buf,
|
|
|
|
message);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *drgn_debug_info_buffer_error(struct binary_buffer *bb,
|
|
|
|
const char *pos,
|
|
|
|
const char *message)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info_buffer *buffer =
|
|
|
|
container_of(bb, struct drgn_debug_info_buffer, bb);
|
|
|
|
return drgn_error_debug_info(buffer->module, buffer->scn, pos, message);
|
|
|
|
}
|
|
|
|
|
2020-09-16 01:42:53 +01:00
|
|
|
DEFINE_VECTOR_FUNCTIONS(drgn_debug_info_module_vector)
|
|
|
|
|
|
|
|
static inline struct hash_pair
|
2020-10-13 00:06:23 +01:00
|
|
|
drgn_debug_info_module_key_hash_pair(const struct drgn_debug_info_module_key *key)
|
2020-09-16 01:42:53 +01:00
|
|
|
{
|
2020-10-13 00:06:23 +01:00
|
|
|
size_t hash = hash_bytes(key->build_id, key->build_id_len);
|
2020-09-16 01:42:53 +01:00
|
|
|
hash = hash_combine(hash, key->start);
|
|
|
|
hash = hash_combine(hash, key->end);
|
|
|
|
return hash_pair_from_avalanching_hash(hash);
|
|
|
|
}
|
|
|
|
static inline bool
|
2020-10-13 00:06:23 +01:00
|
|
|
drgn_debug_info_module_key_eq(const struct drgn_debug_info_module_key *a,
|
|
|
|
const struct drgn_debug_info_module_key *b)
|
2020-09-16 01:42:53 +01:00
|
|
|
{
|
|
|
|
return (a->build_id_len == b->build_id_len &&
|
|
|
|
memcmp(a->build_id, b->build_id, a->build_id_len) == 0 &&
|
|
|
|
a->start == b->start && a->end == b->end);
|
|
|
|
}
|
|
|
|
DEFINE_HASH_TABLE_FUNCTIONS(drgn_debug_info_module_table,
|
2020-10-13 00:06:23 +01:00
|
|
|
drgn_debug_info_module_key_hash_pair,
|
|
|
|
drgn_debug_info_module_key_eq)
|
2020-09-16 01:42:53 +01:00
|
|
|
|
2020-10-13 00:06:23 +01:00
|
|
|
DEFINE_HASH_TABLE_FUNCTIONS(c_string_set, c_string_key_hash_pair,
|
|
|
|
c_string_key_eq)
|
2020-09-16 01:42:53 +01:00
|
|
|
|
|
|
|
/**
|
|
|
|
* @c Dwfl_Callbacks::find_elf() implementation.
|
|
|
|
*
|
|
|
|
* Ideally we'd use @c dwfl_report_elf() instead, but that doesn't take an @c
|
|
|
|
* Elf handle, which we need for a couple of reasons:
|
|
|
|
*
|
|
|
|
* - We usually already have the @c Elf handle open in order to identify the
|
|
|
|
* file.
|
|
|
|
* - For kernel modules, we set the section addresses in the @c Elf handle
|
|
|
|
* ourselves instead of using @c Dwfl_Callbacks::section_address().
|
|
|
|
*
|
|
|
|
* Additionally, there's a special case for vmlinux. It is usually an @c ET_EXEC
|
|
|
|
* ELF file, but when KASLR is enabled, it needs to be handled like an @c ET_DYN
|
|
|
|
* file. libdwfl has a hack for this when @c dwfl_report_module() is used, but
|
|
|
|
* @ref dwfl_report_elf() bypasses this hack.
|
|
|
|
*
|
|
|
|
* So, we're stuck using @c dwfl_report_module() and this dummy callback.
|
|
|
|
*/
|
|
|
|
static int drgn_dwfl_find_elf(Dwfl_Module *dwfl_module, void **userdatap,
|
|
|
|
const char *name, Dwarf_Addr base,
|
|
|
|
char **file_name, Elf **elfp)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info_module *module = *userdatap;
|
|
|
|
/*
|
|
|
|
* libdwfl consumes the returned path, file descriptor, and ELF handle,
|
|
|
|
* so clear the fields.
|
|
|
|
*/
|
|
|
|
*file_name = module->path;
|
|
|
|
int fd = module->fd;
|
|
|
|
*elfp = module->elf;
|
|
|
|
module->path = NULL;
|
|
|
|
module->fd = -1;
|
|
|
|
module->elf = NULL;
|
|
|
|
return fd;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Uses drgn_dwfl_find_elf() if the ELF file was reported directly and falls
|
|
|
|
* back to dwfl_linux_proc_find_elf() otherwise.
|
|
|
|
*/
|
|
|
|
static int drgn_dwfl_linux_proc_find_elf(Dwfl_Module *dwfl_module,
|
|
|
|
void **userdatap, const char *name,
|
|
|
|
Dwarf_Addr base, char **file_name,
|
|
|
|
Elf **elfp)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info_module *module = *userdatap;
|
|
|
|
if (module->elf) {
|
|
|
|
return drgn_dwfl_find_elf(dwfl_module, userdatap, name, base,
|
|
|
|
file_name, elfp);
|
|
|
|
}
|
|
|
|
return dwfl_linux_proc_find_elf(dwfl_module, userdatap, name, base,
|
|
|
|
file_name, elfp);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Uses drgn_dwfl_find_elf() if the ELF file was reported directly and falls
|
|
|
|
* back to dwfl_build_id_find_elf() otherwise.
|
|
|
|
*/
|
|
|
|
static int drgn_dwfl_build_id_find_elf(Dwfl_Module *dwfl_module,
|
|
|
|
void **userdatap, const char *name,
|
|
|
|
Dwarf_Addr base, char **file_name,
|
|
|
|
Elf **elfp)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info_module *module = *userdatap;
|
|
|
|
if (module->elf) {
|
|
|
|
return drgn_dwfl_find_elf(dwfl_module, userdatap, name, base,
|
|
|
|
file_name, elfp);
|
|
|
|
}
|
|
|
|
return dwfl_build_id_find_elf(dwfl_module, userdatap, name, base,
|
|
|
|
file_name, elfp);
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* @c Dwfl_Callbacks::section_address() implementation.
|
|
|
|
*
|
|
|
|
* We set the section header @c sh_addr in memory instead of using this, but
|
|
|
|
* libdwfl requires the callback pointer to be non-@c NULL. It will be called
|
|
|
|
* for any sections that still have a zero @c sh_addr, meaning they are not
|
|
|
|
* present in memory.
|
|
|
|
*/
|
|
|
|
static int drgn_dwfl_section_address(Dwfl_Module *module, void **userdatap,
|
|
|
|
const char *name, Dwarf_Addr base,
|
|
|
|
const char *secname, Elf32_Word shndx,
|
|
|
|
const GElf_Shdr *shdr, Dwarf_Addr *addr)
|
|
|
|
{
|
|
|
|
*addr = -1;
|
|
|
|
return DWARF_CB_OK;
|
|
|
|
}
|
|
|
|
|
|
|
|
static const Dwfl_Callbacks drgn_dwfl_callbacks = {
|
|
|
|
.find_elf = drgn_dwfl_find_elf,
|
|
|
|
.find_debuginfo = dwfl_standard_find_debuginfo,
|
|
|
|
.section_address = drgn_dwfl_section_address,
|
|
|
|
};
|
|
|
|
|
|
|
|
static const Dwfl_Callbacks drgn_linux_proc_dwfl_callbacks = {
|
|
|
|
.find_elf = drgn_dwfl_linux_proc_find_elf,
|
|
|
|
.find_debuginfo = dwfl_standard_find_debuginfo,
|
|
|
|
.section_address = drgn_dwfl_section_address,
|
|
|
|
};
|
|
|
|
|
|
|
|
static const Dwfl_Callbacks drgn_userspace_core_dump_dwfl_callbacks = {
|
|
|
|
.find_elf = drgn_dwfl_build_id_find_elf,
|
|
|
|
.find_debuginfo = dwfl_standard_find_debuginfo,
|
|
|
|
.section_address = drgn_dwfl_section_address,
|
|
|
|
};
|
|
|
|
|
|
|
|
static void
|
|
|
|
drgn_debug_info_module_destroy(struct drgn_debug_info_module *module)
|
|
|
|
{
|
|
|
|
if (module) {
|
|
|
|
drgn_error_destroy(module->err);
|
|
|
|
elf_end(module->elf);
|
|
|
|
if (module->fd != -1)
|
|
|
|
close(module->fd);
|
|
|
|
free(module->path);
|
|
|
|
free(module->name);
|
|
|
|
free(module);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
drgn_debug_info_module_finish_indexing(struct drgn_debug_info *dbinfo,
|
|
|
|
struct drgn_debug_info_module *module)
|
|
|
|
{
|
|
|
|
module->state = DRGN_DEBUG_INFO_MODULE_INDEXED;
|
|
|
|
if (module->name) {
|
|
|
|
int ret = c_string_set_insert(&dbinfo->module_names,
|
|
|
|
(const char **)&module->name,
|
|
|
|
NULL);
|
|
|
|
/* drgn_debug_info_update_index() should've reserved enough. */
|
|
|
|
assert(ret != -1);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_dwfl_module_removed_arg {
|
|
|
|
struct drgn_debug_info *dbinfo;
|
|
|
|
bool finish_indexing;
|
|
|
|
bool free_all;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int drgn_dwfl_module_removed(Dwfl_Module *dwfl_module, void *userdatap,
|
|
|
|
const char *name, Dwarf_Addr base,
|
|
|
|
void *_arg)
|
|
|
|
{
|
|
|
|
struct drgn_dwfl_module_removed_arg *arg = _arg;
|
|
|
|
/*
|
|
|
|
* userdatap is actually a void ** like for the other libdwfl callbacks,
|
|
|
|
* but dwfl_report_end() has the wrong signature for the removed
|
|
|
|
* callback.
|
|
|
|
*/
|
|
|
|
struct drgn_debug_info_module *module = *(void **)userdatap;
|
|
|
|
if (arg->finish_indexing && module &&
|
|
|
|
module->state == DRGN_DEBUG_INFO_MODULE_INDEXING)
|
|
|
|
drgn_debug_info_module_finish_indexing(arg->dbinfo, module);
|
|
|
|
if (arg->free_all || !module ||
|
|
|
|
module->state != DRGN_DEBUG_INFO_MODULE_INDEXED) {
|
|
|
|
drgn_debug_info_module_destroy(module);
|
|
|
|
} else {
|
|
|
|
/*
|
|
|
|
* The module was already indexed. Report it again so libdwfl
|
|
|
|
* doesn't remove it.
|
|
|
|
*/
|
|
|
|
Dwarf_Addr end;
|
|
|
|
dwfl_module_info(dwfl_module, NULL, NULL, &end, NULL, NULL,
|
|
|
|
NULL, NULL);
|
|
|
|
dwfl_report_module(arg->dbinfo->dwfl, name, base, end);
|
|
|
|
}
|
|
|
|
return DWARF_CB_OK;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void drgn_debug_info_free_modules(struct drgn_debug_info *dbinfo,
|
|
|
|
bool finish_indexing, bool free_all)
|
|
|
|
{
|
|
|
|
for (struct drgn_debug_info_module_table_iterator it =
|
|
|
|
drgn_debug_info_module_table_first(&dbinfo->modules); it.entry; ) {
|
|
|
|
struct drgn_debug_info_module *module = *it.entry;
|
|
|
|
struct drgn_debug_info_module **nextp = it.entry;
|
|
|
|
do {
|
|
|
|
struct drgn_debug_info_module *next = module->next;
|
|
|
|
if (finish_indexing &&
|
|
|
|
module->state == DRGN_DEBUG_INFO_MODULE_INDEXING) {
|
|
|
|
drgn_debug_info_module_finish_indexing(dbinfo,
|
|
|
|
module);
|
|
|
|
}
|
|
|
|
if (free_all ||
|
|
|
|
module->state != DRGN_DEBUG_INFO_MODULE_INDEXED) {
|
|
|
|
if (module == *nextp) {
|
|
|
|
if (nextp == it.entry && !next) {
|
|
|
|
it = drgn_debug_info_module_table_delete_iterator(&dbinfo->modules,
|
|
|
|
it);
|
|
|
|
} else {
|
|
|
|
if (!next)
|
|
|
|
it = drgn_debug_info_module_table_next(it);
|
|
|
|
*nextp = next;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
void **userdatap;
|
|
|
|
dwfl_module_info(module->dwfl_module,
|
|
|
|
&userdatap, NULL, NULL, NULL,
|
|
|
|
NULL, NULL, NULL);
|
|
|
|
*userdatap = NULL;
|
|
|
|
drgn_debug_info_module_destroy(module);
|
|
|
|
} else {
|
|
|
|
if (!next)
|
|
|
|
it = drgn_debug_info_module_table_next(it);
|
|
|
|
nextp = &module->next;
|
|
|
|
}
|
|
|
|
module = next;
|
|
|
|
} while (module);
|
|
|
|
}
|
|
|
|
|
|
|
|
dwfl_report_begin(dbinfo->dwfl);
|
|
|
|
struct drgn_dwfl_module_removed_arg arg = {
|
|
|
|
.dbinfo = dbinfo,
|
|
|
|
.finish_indexing = finish_indexing,
|
|
|
|
.free_all = free_all,
|
|
|
|
};
|
|
|
|
dwfl_report_end(dbinfo->dwfl, drgn_dwfl_module_removed, &arg);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *
|
|
|
|
drgn_debug_info_report_error(struct drgn_debug_info_load_state *load,
|
|
|
|
const char *name, const char *message,
|
|
|
|
struct drgn_error *err)
|
|
|
|
{
|
|
|
|
if (err && err->code == DRGN_ERROR_NO_MEMORY) {
|
|
|
|
/* Always fail hard if we're out of memory. */
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
if (load->num_errors == 0 &&
|
|
|
|
!string_builder_append(&load->errors,
|
|
|
|
"could not get debugging information for:"))
|
|
|
|
goto err;
|
|
|
|
if (load->num_errors < load->max_errors) {
|
|
|
|
if (!string_builder_line_break(&load->errors))
|
|
|
|
goto err;
|
|
|
|
if (name && !string_builder_append(&load->errors, name))
|
|
|
|
goto err;
|
|
|
|
if (name && (message || err) &&
|
|
|
|
!string_builder_append(&load->errors, " ("))
|
|
|
|
goto err;
|
|
|
|
if (message && !string_builder_append(&load->errors, message))
|
|
|
|
goto err;
|
|
|
|
if (message && err &&
|
|
|
|
!string_builder_append(&load->errors, ": "))
|
|
|
|
goto err;
|
|
|
|
if (err && !string_builder_append_error(&load->errors, err))
|
|
|
|
goto err;
|
|
|
|
if (name && (message || err) &&
|
|
|
|
!string_builder_appendc(&load->errors, ')'))
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
load->num_errors++;
|
|
|
|
drgn_error_destroy(err);
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
err:
|
|
|
|
drgn_error_destroy(err);
|
|
|
|
return &drgn_enomem;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_debug_info_report_module(struct drgn_debug_info_load_state *load,
|
|
|
|
const void *build_id, size_t build_id_len,
|
|
|
|
uint64_t start, uint64_t end, const char *name,
|
|
|
|
Dwfl_Module *dwfl_module, const char *path,
|
|
|
|
int fd, Elf *elf, bool *new_ret)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info *dbinfo = load->dbinfo;
|
|
|
|
struct drgn_error *err;
|
|
|
|
char *path_key = NULL;
|
|
|
|
|
|
|
|
if (new_ret)
|
|
|
|
*new_ret = false;
|
|
|
|
|
|
|
|
struct hash_pair hp;
|
|
|
|
struct drgn_debug_info_module_table_iterator it;
|
|
|
|
if (build_id_len) {
|
|
|
|
struct drgn_debug_info_module_key key = {
|
|
|
|
.build_id = build_id,
|
|
|
|
.build_id_len = build_id_len,
|
|
|
|
.start = start,
|
|
|
|
.end = end,
|
|
|
|
};
|
2020-10-13 00:06:23 +01:00
|
|
|
hp = drgn_debug_info_module_table_hash(&key);
|
2020-09-16 01:42:53 +01:00
|
|
|
it = drgn_debug_info_module_table_search_hashed(&dbinfo->modules,
|
|
|
|
&key, hp);
|
|
|
|
if (it.entry &&
|
|
|
|
(*it.entry)->state == DRGN_DEBUG_INFO_MODULE_INDEXED) {
|
|
|
|
/* We've already indexed this module. */
|
|
|
|
err = NULL;
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!dwfl_module) {
|
|
|
|
path_key = realpath(path, NULL);
|
|
|
|
if (!path_key) {
|
|
|
|
path_key = strdup(path);
|
|
|
|
if (!path_key) {
|
|
|
|
err = &drgn_enomem;
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
dwfl_module = dwfl_report_module(dbinfo->dwfl, path_key, start,
|
|
|
|
end);
|
|
|
|
if (!dwfl_module) {
|
|
|
|
err = drgn_error_libdwfl();
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void **userdatap;
|
|
|
|
dwfl_module_info(dwfl_module, &userdatap, NULL, NULL, NULL, NULL, NULL,
|
|
|
|
NULL);
|
|
|
|
if (*userdatap) {
|
|
|
|
/* We've already reported this file at this offset. */
|
|
|
|
err = NULL;
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
if (new_ret)
|
|
|
|
*new_ret = true;
|
|
|
|
|
|
|
|
struct drgn_debug_info_module *module = malloc(sizeof(*module));
|
|
|
|
if (!module) {
|
|
|
|
err = &drgn_enomem;
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
module->state = DRGN_DEBUG_INFO_MODULE_NEW;
|
|
|
|
module->build_id = build_id;
|
|
|
|
module->build_id_len = build_id_len;
|
|
|
|
module->start = start;
|
|
|
|
module->end = end;
|
|
|
|
if (name) {
|
|
|
|
module->name = strdup(name);
|
|
|
|
if (!module->name) {
|
|
|
|
err = &drgn_enomem;
|
|
|
|
free(module);
|
|
|
|
goto free;
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
module->name = NULL;
|
|
|
|
}
|
|
|
|
module->dwfl_module = dwfl_module;
|
2020-10-23 00:09:31 +01:00
|
|
|
memset(module->scns, 0, sizeof(module->scns));
|
2020-09-16 01:42:53 +01:00
|
|
|
module->path = path_key;
|
|
|
|
module->fd = fd;
|
|
|
|
module->elf = elf;
|
|
|
|
module->err = NULL;
|
|
|
|
module->next = NULL;
|
|
|
|
|
|
|
|
/* path_key, fd and elf are owned by the module now. */
|
|
|
|
|
|
|
|
if (!drgn_debug_info_module_vector_append(&load->new_modules,
|
|
|
|
&module)) {
|
|
|
|
drgn_debug_info_module_destroy(module);
|
|
|
|
return &drgn_enomem;
|
|
|
|
}
|
|
|
|
if (build_id_len) {
|
|
|
|
if (it.entry) {
|
|
|
|
/*
|
|
|
|
* The first module with this build ID is in
|
|
|
|
* new_modules, so insert it after in the list, not
|
|
|
|
* before.
|
|
|
|
*/
|
|
|
|
module->next = (*it.entry)->next;
|
|
|
|
(*it.entry)->next = module;
|
|
|
|
} else if (drgn_debug_info_module_table_insert_searched(&dbinfo->modules,
|
|
|
|
&module,
|
|
|
|
hp,
|
|
|
|
NULL) < 0) {
|
|
|
|
load->new_modules.size--;
|
|
|
|
drgn_debug_info_module_destroy(module);
|
|
|
|
return &drgn_enomem;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
*userdatap = module;
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
free:
|
|
|
|
elf_end(elf);
|
|
|
|
if (fd != -1)
|
|
|
|
close(fd);
|
|
|
|
free(path_key);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *
|
|
|
|
drgn_debug_info_report_elf(struct drgn_debug_info_load_state *load,
|
|
|
|
const char *path, int fd, Elf *elf, uint64_t start,
|
|
|
|
uint64_t end, const char *name, bool *new_ret)
|
|
|
|
{
|
|
|
|
|
|
|
|
struct drgn_error *err;
|
|
|
|
const void *build_id;
|
|
|
|
ssize_t build_id_len = dwelf_elf_gnu_build_id(elf, &build_id);
|
|
|
|
if (build_id_len < 0) {
|
|
|
|
err = drgn_debug_info_report_error(load, path, NULL,
|
|
|
|
drgn_error_libdwfl());
|
|
|
|
close(fd);
|
|
|
|
elf_end(elf);
|
|
|
|
return err;
|
|
|
|
} else if (build_id_len == 0) {
|
|
|
|
build_id = NULL;
|
|
|
|
}
|
|
|
|
return drgn_debug_info_report_module(load, build_id, build_id_len,
|
|
|
|
start, end, name, NULL, path, fd,
|
|
|
|
elf, new_ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int drgn_debug_info_report_dwfl_module(Dwfl_Module *dwfl_module,
|
|
|
|
void **userdatap,
|
|
|
|
const char *name, Dwarf_Addr base,
|
|
|
|
void *arg)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info_load_state *load = arg;
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
|
|
|
if (*userdatap) {
|
|
|
|
/*
|
|
|
|
* This was either reported from drgn_debug_info_report_elf() or
|
|
|
|
* already indexed.
|
|
|
|
*/
|
|
|
|
return DWARF_CB_OK;
|
|
|
|
}
|
|
|
|
|
|
|
|
const unsigned char *build_id;
|
|
|
|
GElf_Addr build_id_vaddr;
|
|
|
|
int build_id_len = dwfl_module_build_id(dwfl_module, &build_id,
|
|
|
|
&build_id_vaddr);
|
|
|
|
if (build_id_len < 0) {
|
|
|
|
err = drgn_debug_info_report_error(load, name, NULL,
|
|
|
|
drgn_error_libdwfl());
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
} else if (build_id_len == 0) {
|
|
|
|
build_id = NULL;
|
|
|
|
}
|
|
|
|
Dwarf_Addr end;
|
|
|
|
dwfl_module_info(dwfl_module, NULL, NULL, &end, NULL, NULL, NULL, NULL);
|
|
|
|
err = drgn_debug_info_report_module(load, build_id, build_id_len, base,
|
|
|
|
end, NULL, dwfl_module, name, -1,
|
|
|
|
NULL, NULL);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
return DWARF_CB_OK;
|
|
|
|
|
|
|
|
err:
|
|
|
|
drgn_error_destroy(err);
|
|
|
|
return DWARF_CB_ABORT;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
userspace_report_debug_info(struct drgn_debug_info_load_state *load)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
|
|
|
for (size_t i = 0; i < load->num_paths; i++) {
|
|
|
|
int fd;
|
|
|
|
Elf *elf;
|
|
|
|
err = open_elf_file(load->paths[i], &fd, &elf);
|
|
|
|
if (err) {
|
|
|
|
err = drgn_debug_info_report_error(load, load->paths[i],
|
|
|
|
NULL, err);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* We haven't implemented a way to get the load address for
|
|
|
|
* anything reported here, so for now we report it as unloaded.
|
|
|
|
*/
|
|
|
|
err = drgn_debug_info_report_elf(load, load->paths[i], fd, elf,
|
|
|
|
0, 0, NULL, NULL);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (load->load_default) {
|
|
|
|
Dwfl *dwfl = load->dbinfo->dwfl;
|
|
|
|
struct drgn_program *prog = load->dbinfo->prog;
|
|
|
|
if (prog->flags & DRGN_PROGRAM_IS_LIVE) {
|
|
|
|
int ret = dwfl_linux_proc_report(dwfl, prog->pid);
|
|
|
|
if (ret == -1) {
|
|
|
|
return drgn_error_libdwfl();
|
|
|
|
} else if (ret) {
|
|
|
|
return drgn_error_create_os("dwfl_linux_proc_report",
|
|
|
|
ret, NULL);
|
|
|
|
}
|
|
|
|
} else if (dwfl_core_file_report(dwfl, prog->core,
|
|
|
|
NULL) == -1) {
|
|
|
|
return drgn_error_libdwfl();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *apply_relocation(Elf_Data *data, uint64_t r_offset,
|
|
|
|
uint32_t r_type, int64_t r_addend,
|
|
|
|
uint64_t st_value)
|
|
|
|
{
|
|
|
|
char *p;
|
|
|
|
|
|
|
|
p = (char *)data->d_buf + r_offset;
|
|
|
|
switch (r_type) {
|
|
|
|
case R_X86_64_NONE:
|
|
|
|
break;
|
|
|
|
case R_X86_64_32:
|
|
|
|
if (r_offset > SIZE_MAX - sizeof(uint32_t) ||
|
|
|
|
r_offset + sizeof(uint32_t) > data->d_size) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid relocation offset");
|
|
|
|
}
|
|
|
|
*(uint32_t *)p = st_value + r_addend;
|
|
|
|
break;
|
|
|
|
case R_X86_64_64:
|
|
|
|
if (r_offset > SIZE_MAX - sizeof(uint64_t) ||
|
|
|
|
r_offset + sizeof(uint64_t) > data->d_size) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid relocation offset");
|
|
|
|
}
|
|
|
|
*(uint64_t *)p = st_value + r_addend;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"unimplemented relocation type %" PRIu32,
|
|
|
|
r_type);
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *relocate_section(Elf_Scn *scn, Elf_Scn *rela_scn,
|
|
|
|
Elf_Scn *symtab_scn,
|
|
|
|
uint64_t *sh_addrs, size_t shdrnum)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
Elf_Data *data, *rela_data, *symtab_data;
|
|
|
|
const Elf64_Rela *relocs;
|
|
|
|
const Elf64_Sym *syms;
|
|
|
|
size_t num_relocs, num_syms;
|
|
|
|
size_t i;
|
|
|
|
GElf_Shdr *shdr, shdr_mem;
|
|
|
|
|
|
|
|
err = read_elf_section(scn, &data);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
err = read_elf_section(rela_scn, &rela_data);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
err = read_elf_section(symtab_scn, &symtab_data);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
relocs = (Elf64_Rela *)rela_data->d_buf;
|
|
|
|
num_relocs = rela_data->d_size / sizeof(Elf64_Rela);
|
|
|
|
syms = (Elf64_Sym *)symtab_data->d_buf;
|
|
|
|
num_syms = symtab_data->d_size / sizeof(Elf64_Sym);
|
|
|
|
|
|
|
|
for (i = 0; i < num_relocs; i++) {
|
|
|
|
const Elf64_Rela *reloc = &relocs[i];
|
|
|
|
uint32_t r_sym, r_type;
|
|
|
|
uint16_t st_shndx;
|
|
|
|
uint64_t sh_addr;
|
|
|
|
|
|
|
|
r_sym = ELF64_R_SYM(reloc->r_info);
|
|
|
|
r_type = ELF64_R_TYPE(reloc->r_info);
|
|
|
|
|
|
|
|
if (r_sym >= num_syms) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid relocation symbol");
|
|
|
|
}
|
|
|
|
st_shndx = syms[r_sym].st_shndx;
|
|
|
|
if (st_shndx == 0) {
|
|
|
|
sh_addr = 0;
|
|
|
|
} else if (st_shndx < shdrnum) {
|
|
|
|
sh_addr = sh_addrs[st_shndx - 1];
|
|
|
|
} else {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid symbol section index");
|
|
|
|
}
|
|
|
|
err = apply_relocation(data, reloc->r_offset, r_type,
|
|
|
|
reloc->r_addend,
|
|
|
|
sh_addr + syms[r_sym].st_value);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Mark the relocation section as empty so that libdwfl doesn't try to
|
|
|
|
* apply it again.
|
|
|
|
*/
|
|
|
|
shdr = gelf_getshdr(rela_scn, &shdr_mem);
|
|
|
|
if (!shdr)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
shdr->sh_size = 0;
|
|
|
|
if (!gelf_update_shdr(rela_scn, shdr))
|
|
|
|
return drgn_error_libelf();
|
|
|
|
rela_data->d_size = 0;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Before the debugging information in a relocatable ELF file (e.g., Linux
|
|
|
|
* kernel module) can be used, it must have ELF relocations applied. This is
|
|
|
|
* usually done by libdwfl. However, libdwfl is relatively slow at it. This is a
|
|
|
|
* much faster implementation. It is only implemented for x86-64; for other
|
|
|
|
* architectures, we can fall back to libdwfl.
|
|
|
|
*/
|
|
|
|
static struct drgn_error *apply_elf_relocations(Elf *elf)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
GElf_Ehdr ehdr_mem, *ehdr;
|
|
|
|
size_t shdrnum, shstrndx;
|
|
|
|
uint64_t *sh_addrs;
|
|
|
|
Elf_Scn *scn;
|
|
|
|
|
|
|
|
ehdr = gelf_getehdr(elf, &ehdr_mem);
|
|
|
|
if (!ehdr)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
|
|
|
|
if (ehdr->e_type != ET_REL ||
|
|
|
|
ehdr->e_machine != EM_X86_64 ||
|
|
|
|
ehdr->e_ident[EI_CLASS] != ELFCLASS64 ||
|
|
|
|
ehdr->e_ident[EI_DATA] !=
|
2021-02-23 19:10:22 +00:00
|
|
|
(HOST_LITTLE_ENDIAN ? ELFDATA2LSB : ELFDATA2MSB)) {
|
2020-09-16 01:42:53 +01:00
|
|
|
/* Unsupported; fall back to libdwfl. */
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (elf_getshdrnum(elf, &shdrnum))
|
|
|
|
return drgn_error_libelf();
|
|
|
|
if (shdrnum > 1) {
|
|
|
|
sh_addrs = calloc(shdrnum - 1, sizeof(*sh_addrs));
|
|
|
|
if (!sh_addrs)
|
|
|
|
return &drgn_enomem;
|
|
|
|
|
|
|
|
scn = NULL;
|
|
|
|
while ((scn = elf_nextscn(elf, scn))) {
|
|
|
|
size_t ndx;
|
|
|
|
|
|
|
|
ndx = elf_ndxscn(scn);
|
|
|
|
if (ndx > 0 && ndx < shdrnum) {
|
|
|
|
GElf_Shdr *shdr, shdr_mem;
|
|
|
|
|
|
|
|
shdr = gelf_getshdr(scn, &shdr_mem);
|
|
|
|
if (!shdr) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
sh_addrs[ndx - 1] = shdr->sh_addr;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
sh_addrs = NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (elf_getshdrstrndx(elf, &shstrndx)) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
scn = NULL;
|
|
|
|
while ((scn = elf_nextscn(elf, scn))) {
|
|
|
|
GElf_Shdr *shdr, shdr_mem;
|
|
|
|
const char *scnname;
|
|
|
|
|
|
|
|
shdr = gelf_getshdr(scn, &shdr_mem);
|
|
|
|
if (!shdr) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (shdr->sh_type != SHT_RELA)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
scnname = elf_strptr(elf, shstrndx, shdr->sh_name);
|
|
|
|
if (!scnname)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
if (strstartswith(scnname, ".rela.debug_")) {
|
|
|
|
Elf_Scn *info_scn, *link_scn;
|
|
|
|
|
|
|
|
info_scn = elf_getscn(elf, shdr->sh_info);
|
|
|
|
if (!info_scn) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
link_scn = elf_getscn(elf, shdr->sh_link);
|
|
|
|
if (!link_scn) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
err = relocate_section(info_scn, scn, link_scn,
|
|
|
|
sh_addrs, shdrnum);
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
out:
|
|
|
|
free(sh_addrs);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_get_debug_sections(struct drgn_debug_info_module *module)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
|
|
|
if (module->elf) {
|
|
|
|
err = apply_elf_relocations(module->elf);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Note: not dwfl_module_getelf(), because then libdwfl applies
|
|
|
|
* ELF relocations to all sections, not just debug sections.
|
|
|
|
*/
|
|
|
|
Dwarf_Addr bias;
|
|
|
|
Dwarf *dwarf = dwfl_module_getdwarf(module->dwfl_module, &bias);
|
|
|
|
if (!dwarf)
|
|
|
|
return drgn_error_libdwfl();
|
|
|
|
Elf *elf = dwarf_getelf(dwarf);
|
|
|
|
if (!elf)
|
|
|
|
return drgn_error_libdw();
|
|
|
|
|
2020-10-29 00:35:09 +00:00
|
|
|
module->little_endian = elf_getident(elf, NULL)[EI_DATA] == ELFDATA2LSB;
|
2020-09-16 01:42:53 +01:00
|
|
|
|
|
|
|
size_t shstrndx;
|
|
|
|
if (elf_getshdrstrndx(elf, &shstrndx))
|
|
|
|
return drgn_error_libelf();
|
|
|
|
|
|
|
|
Elf_Scn *scn = NULL;
|
|
|
|
while ((scn = elf_nextscn(elf, scn))) {
|
|
|
|
GElf_Shdr shdr_mem;
|
|
|
|
GElf_Shdr *shdr = gelf_getshdr(scn, &shdr_mem);
|
|
|
|
if (!shdr)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
|
|
|
|
if (shdr->sh_type == SHT_NOBITS || (shdr->sh_flags & SHF_GROUP))
|
|
|
|
continue;
|
|
|
|
|
|
|
|
const char *scnname = elf_strptr(elf, shstrndx, shdr->sh_name);
|
|
|
|
if (!scnname)
|
|
|
|
continue;
|
|
|
|
|
2020-10-23 00:09:31 +01:00
|
|
|
for (size_t i = 0; i < DRGN_NUM_DEBUG_SCNS; i++) {
|
|
|
|
if (!module->scns[i] &&
|
|
|
|
strcmp(scnname, drgn_debug_scn_names[i]) == 0) {
|
|
|
|
err = read_elf_section(scn, &module->scns[i]);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
2020-09-16 01:42:53 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Truncate any extraneous bytes so that we can assume that a pointer
|
|
|
|
* within .debug_str is always null-terminated.
|
|
|
|
*/
|
2020-10-23 00:09:31 +01:00
|
|
|
Elf_Data *debug_str = module->scns[DRGN_SCN_DEBUG_STR];
|
|
|
|
if (debug_str) {
|
|
|
|
const char *buf = debug_str->d_buf;
|
|
|
|
const char *nul = memrchr(buf, '\0', debug_str->d_size);
|
2020-09-16 01:42:53 +01:00
|
|
|
if (nul)
|
2020-10-23 00:09:31 +01:00
|
|
|
debug_str->d_size = nul - buf + 1;
|
2020-09-16 01:42:53 +01:00
|
|
|
else
|
2020-10-23 00:09:31 +01:00
|
|
|
debug_str->d_size = 0;
|
2020-09-16 01:42:53 +01:00
|
|
|
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_debug_info_read_module(struct drgn_debug_info_load_state *load,
|
|
|
|
struct drgn_dwarf_index_update_state *dindex_state,
|
|
|
|
struct drgn_debug_info_module *head)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
struct drgn_debug_info_module *module;
|
|
|
|
for (module = head; module; module = module->next) {
|
|
|
|
err = drgn_get_debug_sections(module);
|
|
|
|
if (err) {
|
|
|
|
module->err = err;
|
|
|
|
continue;
|
|
|
|
}
|
2020-10-23 00:09:31 +01:00
|
|
|
if (module->scns[DRGN_SCN_DEBUG_INFO] &&
|
|
|
|
module->scns[DRGN_SCN_DEBUG_ABBREV]) {
|
2020-09-16 01:42:53 +01:00
|
|
|
module->state = DRGN_DEBUG_INFO_MODULE_INDEXING;
|
|
|
|
drgn_dwarf_index_read_module(dindex_state, module);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* We checked all of the files and didn't find debugging information.
|
|
|
|
* Report why for each one.
|
|
|
|
*
|
|
|
|
* (If we did find debugging information, we discard errors on the
|
|
|
|
* unused files.)
|
|
|
|
*/
|
|
|
|
err = NULL;
|
|
|
|
#pragma omp critical(drgn_debug_info_read_module_error)
|
|
|
|
for (module = head; module; module = module->next) {
|
|
|
|
const char *name =
|
|
|
|
dwfl_module_info(module->dwfl_module, NULL, NULL, NULL,
|
|
|
|
NULL, NULL, NULL, NULL);
|
|
|
|
if (module->err) {
|
|
|
|
err = drgn_debug_info_report_error(load, name, NULL,
|
|
|
|
module->err);
|
|
|
|
module->err = NULL;
|
|
|
|
} else {
|
|
|
|
err = drgn_debug_info_report_error(load, name,
|
|
|
|
"no debugging information",
|
|
|
|
NULL);
|
|
|
|
}
|
|
|
|
if (err)
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_debug_info_update_index(struct drgn_debug_info_load_state *load)
|
|
|
|
{
|
|
|
|
if (!load->new_modules.size)
|
|
|
|
return NULL;
|
|
|
|
struct drgn_debug_info *dbinfo = load->dbinfo;
|
|
|
|
if (!c_string_set_reserve(&dbinfo->module_names,
|
|
|
|
c_string_set_size(&dbinfo->module_names) +
|
|
|
|
load->new_modules.size))
|
|
|
|
return &drgn_enomem;
|
|
|
|
struct drgn_dwarf_index_update_state dindex_state;
|
|
|
|
drgn_dwarf_index_update_begin(&dindex_state, &dbinfo->dindex);
|
|
|
|
/*
|
|
|
|
* In OpenMP 5.0, this could be "#pragma omp parallel master taskloop"
|
|
|
|
* (added in GCC 9 and Clang 10).
|
|
|
|
*/
|
|
|
|
#pragma omp parallel
|
|
|
|
#pragma omp master
|
|
|
|
#pragma omp taskloop
|
|
|
|
for (size_t i = 0; i < load->new_modules.size; i++) {
|
|
|
|
if (drgn_dwarf_index_update_cancelled(&dindex_state))
|
|
|
|
continue;
|
|
|
|
struct drgn_error *module_err =
|
|
|
|
drgn_debug_info_read_module(load, &dindex_state,
|
|
|
|
load->new_modules.data[i]);
|
|
|
|
if (module_err)
|
|
|
|
drgn_dwarf_index_update_cancel(&dindex_state, module_err);
|
|
|
|
}
|
|
|
|
struct drgn_error *err = drgn_dwarf_index_update_end(&dindex_state);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
drgn_debug_info_free_modules(dbinfo, true, false);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *
|
|
|
|
drgn_debug_info_report_flush(struct drgn_debug_info_load_state *load)
|
|
|
|
{
|
|
|
|
struct drgn_debug_info *dbinfo = load->dbinfo;
|
|
|
|
dwfl_report_end(dbinfo->dwfl, NULL, NULL);
|
|
|
|
struct drgn_error *err = drgn_debug_info_update_index(load);
|
|
|
|
dwfl_report_begin_add(dbinfo->dwfl);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
load->new_modules.size = 0;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_debug_info_report_finalize_errors(struct drgn_debug_info_load_state *load)
|
|
|
|
{
|
|
|
|
if (load->num_errors > load->max_errors &&
|
|
|
|
(!string_builder_line_break(&load->errors) ||
|
|
|
|
!string_builder_appendf(&load->errors, "... %u more",
|
|
|
|
load->num_errors - load->max_errors))) {
|
|
|
|
free(load->errors.str);
|
|
|
|
return &drgn_enomem;
|
|
|
|
}
|
|
|
|
if (load->num_errors) {
|
|
|
|
return drgn_error_from_string_builder(DRGN_ERROR_MISSING_DEBUG_INFO,
|
|
|
|
&load->errors);
|
|
|
|
} else {
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *drgn_debug_info_load(struct drgn_debug_info *dbinfo,
|
|
|
|
const char **paths, size_t n,
|
|
|
|
bool load_default, bool load_main)
|
|
|
|
{
|
|
|
|
struct drgn_program *prog = dbinfo->prog;
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
|
|
|
if (load_default)
|
|
|
|
load_main = true;
|
|
|
|
|
|
|
|
const char *max_errors = getenv("DRGN_MAX_DEBUG_INFO_ERRORS");
|
|
|
|
struct drgn_debug_info_load_state load = {
|
|
|
|
.dbinfo = dbinfo,
|
|
|
|
.paths = paths,
|
|
|
|
.num_paths = n,
|
|
|
|
.load_default = load_default,
|
|
|
|
.load_main = load_main,
|
|
|
|
.new_modules = VECTOR_INIT,
|
|
|
|
.max_errors = max_errors ? atoi(max_errors) : 5,
|
|
|
|
};
|
|
|
|
dwfl_report_begin_add(dbinfo->dwfl);
|
|
|
|
if (prog->flags & DRGN_PROGRAM_IS_LINUX_KERNEL)
|
|
|
|
err = linux_kernel_report_debug_info(&load);
|
|
|
|
else
|
|
|
|
err = userspace_report_debug_info(&load);
|
|
|
|
dwfl_report_end(dbinfo->dwfl, NULL, NULL);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* userspace_report_debug_info() reports the main debugging information
|
|
|
|
* directly with libdwfl, so we need to report it to dbinfo.
|
|
|
|
*/
|
|
|
|
if (!(prog->flags & DRGN_PROGRAM_IS_LINUX_KERNEL) && load_main &&
|
|
|
|
dwfl_getmodules(dbinfo->dwfl, drgn_debug_info_report_dwfl_module,
|
|
|
|
&load, 0)) {
|
|
|
|
err = &drgn_enomem;
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
|
|
|
|
err = drgn_debug_info_update_index(&load);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If this fails, it's too late to roll back. This can only fail with
|
|
|
|
* enomem, so it's not a big deal.
|
|
|
|
*/
|
|
|
|
err = drgn_debug_info_report_finalize_errors(&load);
|
|
|
|
out:
|
|
|
|
drgn_debug_info_module_vector_deinit(&load.new_modules);
|
|
|
|
return err;
|
|
|
|
|
|
|
|
err:
|
|
|
|
drgn_debug_info_free_modules(dbinfo, false, false);
|
|
|
|
free(load.errors.str);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
bool drgn_debug_info_is_indexed(struct drgn_debug_info *dbinfo,
|
|
|
|
const char *name)
|
|
|
|
{
|
|
|
|
return c_string_set_search(&dbinfo->module_names, &name).entry != NULL;
|
|
|
|
}
|
|
|
|
|
2020-10-13 00:06:23 +01:00
|
|
|
DEFINE_HASH_TABLE_FUNCTIONS(drgn_dwarf_type_map, ptr_key_hash_pair,
|
|
|
|
scalar_key_eq)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-07-13 21:30:36 +01:00
|
|
|
/**
|
|
|
|
* Return whether a DWARF DIE is little-endian.
|
|
|
|
*
|
|
|
|
* @param[in] check_attr Whether to check the DW_AT_endianity attribute. If @c
|
|
|
|
* false, only the ELF header is checked and this function cannot fail.
|
|
|
|
* @return @c NULL on success, non-@c NULL on error.
|
|
|
|
*/
|
|
|
|
static struct drgn_error *dwarf_die_is_little_endian(Dwarf_Die *die,
|
|
|
|
bool check_attr, bool *ret)
|
|
|
|
{
|
|
|
|
Dwarf_Attribute endianity_attr_mem, *endianity_attr;
|
|
|
|
Dwarf_Word endianity;
|
|
|
|
if (check_attr &&
|
|
|
|
(endianity_attr = dwarf_attr_integrate(die, DW_AT_endianity,
|
|
|
|
&endianity_attr_mem))) {
|
|
|
|
if (dwarf_formudata(endianity_attr, &endianity)) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid DW_AT_endianity");
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
endianity = DW_END_default;
|
|
|
|
}
|
|
|
|
switch (endianity) {
|
|
|
|
case DW_END_default: {
|
|
|
|
Elf *elf = dwarf_getelf(dwarf_cu_getdwarf(die->cu));
|
|
|
|
*ret = elf_getident(elf, NULL)[EI_DATA] == ELFDATA2LSB;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
case DW_END_little:
|
|
|
|
*ret = true;
|
|
|
|
return NULL;
|
|
|
|
case DW_END_big:
|
|
|
|
*ret = false;
|
|
|
|
return NULL;
|
|
|
|
default:
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"unknown DW_AT_endianity");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/** Like dwarf_die_is_little_endian(), but returns a @ref drgn_byte_order. */
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
static struct drgn_error *dwarf_die_byte_order(Dwarf_Die *die, bool check_attr,
|
2020-07-13 21:30:36 +01:00
|
|
|
enum drgn_byte_order *ret)
|
|
|
|
{
|
|
|
|
bool little_endian;
|
|
|
|
struct drgn_error *err = dwarf_die_is_little_endian(die, check_attr,
|
|
|
|
&little_endian);
|
|
|
|
/*
|
|
|
|
* dwarf_die_is_little_endian() can't fail if check_attr is false, so
|
|
|
|
* the !check_attr test suppresses maybe-uninitialized warnings.
|
|
|
|
*/
|
|
|
|
if (!err || !check_attr)
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
*ret = drgn_byte_order_from_little_endian(little_endian);
|
2020-07-13 21:30:36 +01:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static int dwarf_type(Dwarf_Die *die, Dwarf_Die *ret)
|
|
|
|
{
|
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
|
|
|
|
|
|
|
if (!(attr = dwarf_attr_integrate(die, DW_AT_type, &attr_mem)))
|
|
|
|
return 1;
|
|
|
|
|
|
|
|
return dwarf_formref_die(attr, ret) ? 0 : -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int dwarf_flag(Dwarf_Die *die, unsigned int name, bool *ret)
|
|
|
|
{
|
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
|
|
|
|
2021-02-22 20:03:53 +00:00
|
|
|
if (!(attr = dwarf_attr(die, name, &attr_mem))) {
|
|
|
|
*ret = false;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return dwarf_formflag(attr, ret);
|
|
|
|
}
|
|
|
|
|
|
|
|
static int dwarf_flag_integrate(Dwarf_Die *die, unsigned int name, bool *ret)
|
|
|
|
{
|
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!(attr = dwarf_attr_integrate(die, name, &attr_mem))) {
|
|
|
|
*ret = false;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return dwarf_formflag(attr, ret);
|
|
|
|
}
|
|
|
|
|
2019-04-30 20:38:41 +01:00
|
|
|
/**
|
|
|
|
* Parse a type from a DWARF debugging information entry.
|
|
|
|
*
|
|
|
|
* This is the same as @ref drgn_type_from_dwarf() except that it can be used to
|
|
|
|
* work around a bug in GCC < 9.0 that zero length array types are encoded the
|
|
|
|
* same as incomplete array types. There are a few places where GCC allows
|
|
|
|
* zero-length arrays but not incomplete arrays:
|
|
|
|
*
|
|
|
|
* - As the type of a member of a structure with only one member.
|
|
|
|
* - As the type of a structure member other than the last member.
|
|
|
|
* - As the type of a union member.
|
|
|
|
* - As the element type of an array.
|
|
|
|
*
|
|
|
|
* In these cases, we know that what appears to be an incomplete array type must
|
|
|
|
* actually have a length of zero. In other cases, a subrange DIE without
|
|
|
|
* DW_AT_count or DW_AT_upper_bound is ambiguous; we return an incomplete array
|
|
|
|
* type.
|
|
|
|
*
|
2020-09-12 01:41:23 +01:00
|
|
|
* @param[in] dbinfo Debugging information.
|
2021-01-27 19:12:33 +00:00
|
|
|
* @param[in] module Module containing @p die.
|
2019-04-30 20:38:41 +01:00
|
|
|
* @param[in] die DIE to parse.
|
|
|
|
* @param[in] can_be_incomplete_array Whether the type can be an incomplete
|
|
|
|
* array type. If this is @c false and the type appears to be an incomplete
|
|
|
|
* array type, its length is set to zero instead.
|
|
|
|
* @param[out] is_incomplete_array_ret Whether the encoded type is an incomplete
|
|
|
|
* array type or a typedef of an incomplete array type (regardless of @p
|
|
|
|
* can_be_incomplete_array).
|
|
|
|
* @param[out] ret Returned type.
|
|
|
|
* @return @c NULL on success, non-@c NULL on error.
|
|
|
|
*/
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_type_from_dwarf_internal(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, bool can_be_incomplete_array,
|
2019-04-30 20:38:41 +01:00
|
|
|
bool *is_incomplete_array_ret,
|
|
|
|
struct drgn_qualified_type *ret);
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Parse a type from a DWARF debugging information entry.
|
|
|
|
*
|
2020-09-12 01:41:23 +01:00
|
|
|
* @param[in] dbinfo Debugging information.
|
2021-01-27 19:12:33 +00:00
|
|
|
* @param[in] module Module containing @p die.
|
2019-04-30 20:38:41 +01:00
|
|
|
* @param[in] die DIE to parse.
|
|
|
|
* @param[out] ret Returned type.
|
|
|
|
* @return @c NULL on success, non-@c NULL on error.
|
|
|
|
*/
|
|
|
|
static inline struct drgn_error *
|
2021-01-27 19:12:33 +00:00
|
|
|
drgn_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
|
|
|
struct drgn_qualified_type *ret)
|
2019-04-30 20:38:41 +01:00
|
|
|
{
|
2021-01-27 19:12:33 +00:00
|
|
|
return drgn_type_from_dwarf_internal(dbinfo, module, die, true, NULL,
|
|
|
|
ret);
|
2019-04-30 20:38:41 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* Parse a type from the @c DW_AT_type attribute of a DWARF debugging
|
|
|
|
* information entry.
|
|
|
|
*
|
2020-09-12 01:41:23 +01:00
|
|
|
* @param[in] dbinfo Debugging information.
|
2021-01-27 19:12:33 +00:00
|
|
|
* @param[in] module Module containing @p die.
|
2021-01-08 19:02:27 +00:00
|
|
|
* @param[in] die DIE with @c DW_AT_type attribute.
|
|
|
|
* @param[in] lang Language of @p die if it is already known, @c NULL if it
|
|
|
|
* should be determined from @p die.
|
2020-02-26 20:12:25 +00:00
|
|
|
* @param[in] can_be_void Whether the @c DW_AT_type attribute may be missing,
|
|
|
|
* which is interpreted as a void type. If this is false and the @c DW_AT_type
|
|
|
|
* attribute is missing, an error is returned.
|
|
|
|
* @param[in] can_be_incomplete_array See @ref drgn_type_from_dwarf_internal().
|
|
|
|
* @param[in] is_incomplete_array_ret See @ref drgn_type_from_dwarf_internal().
|
|
|
|
* @param[out] ret Returned type.
|
|
|
|
* @return @c NULL on success, non-@c NULL on error.
|
2019-04-30 20:38:41 +01:00
|
|
|
*/
|
2020-10-22 23:34:47 +01:00
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_type_from_dwarf_attr(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
2021-01-21 17:07:03 +00:00
|
|
|
const struct drgn_language *lang,
|
2021-01-08 19:59:31 +00:00
|
|
|
bool can_be_void, bool can_be_incomplete_array,
|
2021-01-08 19:02:27 +00:00
|
|
|
bool *is_incomplete_array_ret,
|
|
|
|
struct drgn_qualified_type *ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
2020-02-26 21:22:51 +00:00
|
|
|
struct drgn_error *err;
|
2020-10-22 23:34:47 +01:00
|
|
|
char tag_buf[DW_TAG_BUF_LEN];
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
2021-01-08 19:02:27 +00:00
|
|
|
if (!(attr = dwarf_attr_integrate(die, DW_AT_type, &attr_mem))) {
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (can_be_void) {
|
2021-01-08 19:02:27 +00:00
|
|
|
if (!lang) {
|
|
|
|
err = drgn_language_from_die(die, true, &lang);
|
2020-02-26 21:22:51 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
2021-01-08 19:02:27 +00:00
|
|
|
ret->type = drgn_void_type(dbinfo->prog, lang);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers = 0;
|
|
|
|
return NULL;
|
|
|
|
} else {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"%s is missing DW_AT_type",
|
2021-01-08 19:02:27 +00:00
|
|
|
dwarf_tag_str(die, tag_buf));
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-10-22 23:34:47 +01:00
|
|
|
Dwarf_Die type_die;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!dwarf_formref_die(attr, &type_die)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
2020-10-22 23:34:47 +01:00
|
|
|
"%s has invalid DW_AT_type",
|
2021-01-08 19:02:27 +00:00
|
|
|
dwarf_tag_str(die, tag_buf));
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
2021-01-27 19:12:33 +00:00
|
|
|
return drgn_type_from_dwarf_internal(dbinfo, module, &type_die,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
can_be_incomplete_array,
|
|
|
|
is_incomplete_array_ret, ret);
|
|
|
|
}
|
|
|
|
|
2021-01-08 20:25:29 +00:00
|
|
|
static struct drgn_error *
|
|
|
|
drgn_object_from_dwarf_enumerator(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const char *name,
|
|
|
|
struct drgn_object *ret)
|
2021-01-08 20:25:29 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
struct drgn_qualified_type qualified_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf(dbinfo, module, die, &qualified_type);
|
2021-01-08 20:25:29 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
const struct drgn_type_enumerator *enumerators =
|
|
|
|
drgn_type_enumerators(qualified_type.type);
|
|
|
|
size_t num_enumerators = drgn_type_num_enumerators(qualified_type.type);
|
|
|
|
for (size_t i = 0; i < num_enumerators; i++) {
|
|
|
|
if (strcmp(enumerators[i].name, name) != 0)
|
|
|
|
continue;
|
|
|
|
|
|
|
|
if (drgn_enum_type_is_signed(qualified_type.type)) {
|
|
|
|
return drgn_object_set_signed(ret, qualified_type,
|
|
|
|
enumerators[i].svalue, 0);
|
|
|
|
} else {
|
|
|
|
return drgn_object_set_unsigned(ret, qualified_type,
|
|
|
|
enumerators[i].uvalue,
|
|
|
|
0);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
UNREACHABLE();
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_object_from_dwarf_subprogram(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, struct drgn_object *ret)
|
2021-01-08 20:25:29 +00:00
|
|
|
{
|
|
|
|
struct drgn_qualified_type qualified_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_error *err = drgn_type_from_dwarf(dbinfo, module, die,
|
2021-01-08 20:25:29 +00:00
|
|
|
&qualified_type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
Dwarf_Addr low_pc;
|
|
|
|
if (dwarf_lowpc(die, &low_pc) == -1)
|
|
|
|
return drgn_object_set_absent(ret, qualified_type, 0);
|
2021-01-21 17:07:03 +00:00
|
|
|
Dwarf_Addr bias;
|
2021-01-27 19:12:33 +00:00
|
|
|
dwfl_module_info(module->dwfl_module, NULL, NULL, NULL, &bias, NULL,
|
|
|
|
NULL, NULL);
|
2021-01-08 20:25:29 +00:00
|
|
|
return drgn_object_set_reference(ret, qualified_type, low_pc + bias, 0,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
0);
|
2021-01-08 20:25:29 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_object_from_dwarf_constant(struct drgn_debug_info *dbinfo, Dwarf_Die *die,
|
|
|
|
struct drgn_qualified_type qualified_type,
|
|
|
|
Dwarf_Attribute *attr, struct drgn_object *ret)
|
|
|
|
{
|
|
|
|
struct drgn_object_type type;
|
2021-02-18 22:20:13 +00:00
|
|
|
struct drgn_error *err = drgn_object_type(qualified_type, 0, &type);
|
2021-01-08 20:25:29 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
Dwarf_Block block;
|
|
|
|
if (dwarf_formblock(attr, &block) == 0) {
|
2021-02-18 22:20:13 +00:00
|
|
|
if (block.length < drgn_value_size(type.bit_size)) {
|
2021-01-08 20:25:29 +00:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"DW_AT_const_value block is too small");
|
|
|
|
}
|
|
|
|
return drgn_object_set_from_buffer_internal(ret, &type,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
block.data, 0);
|
2021-02-18 22:20:13 +00:00
|
|
|
} else if (type.encoding == DRGN_OBJECT_ENCODING_SIGNED) {
|
2021-01-08 20:25:29 +00:00
|
|
|
Dwarf_Sword svalue;
|
|
|
|
if (dwarf_formsdata(attr, &svalue)) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid DW_AT_const_value");
|
|
|
|
}
|
2021-02-18 22:20:13 +00:00
|
|
|
drgn_object_set_signed_internal(ret, &type, svalue);
|
|
|
|
return NULL;
|
|
|
|
} else if (type.encoding == DRGN_OBJECT_ENCODING_UNSIGNED) {
|
2021-01-08 20:25:29 +00:00
|
|
|
Dwarf_Word uvalue;
|
|
|
|
if (dwarf_formudata(attr, &uvalue)) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"invalid DW_AT_const_value");
|
|
|
|
}
|
2021-02-18 22:20:13 +00:00
|
|
|
drgn_object_set_unsigned_internal(ret, &type, uvalue);
|
|
|
|
return NULL;
|
2021-01-08 20:25:29 +00:00
|
|
|
} else {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"unknown DW_AT_const_value form");
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_object_from_dwarf_variable(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, struct drgn_object *ret)
|
2021-01-08 20:25:29 +00:00
|
|
|
{
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
/*
|
|
|
|
* The DWARF 5 specifications mentions that data object entries can have
|
|
|
|
* DW_AT_endianity, but that doesn't seem to be used in practice. It
|
|
|
|
* would be inconvenient to support, so ignore it for now.
|
|
|
|
*/
|
2021-01-08 20:25:29 +00:00
|
|
|
struct drgn_qualified_type qualified_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_error *err = drgn_type_from_dwarf_attr(dbinfo, module,
|
2021-01-21 17:07:03 +00:00
|
|
|
die, NULL, true,
|
|
|
|
true, NULL,
|
2021-01-08 20:25:29 +00:00
|
|
|
&qualified_type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
|
|
|
if ((attr = dwarf_attr_integrate(die, DW_AT_location, &attr_mem))) {
|
|
|
|
Dwarf_Op *loc;
|
|
|
|
size_t nloc;
|
|
|
|
if (dwarf_getlocation(attr, &loc, &nloc))
|
|
|
|
return drgn_error_libdw();
|
|
|
|
if (nloc != 1 || loc[0].atom != DW_OP_addr) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"DW_AT_location has unimplemented operation");
|
|
|
|
}
|
2021-01-21 18:14:50 +00:00
|
|
|
uint64_t address = loc[0].number;
|
|
|
|
Dwarf_Addr start, end, bias;
|
2021-01-27 19:12:33 +00:00
|
|
|
dwfl_module_info(module->dwfl_module, NULL, &start, &end, &bias,
|
|
|
|
NULL, NULL, NULL);
|
2021-01-21 18:14:50 +00:00
|
|
|
/*
|
|
|
|
* If the address is not in the module's address range, then
|
|
|
|
* it's probably something special like a Linux per-CPU variable
|
|
|
|
* (which isn't actually a variable address but an offset).
|
|
|
|
* Don't apply the bias in that case.
|
|
|
|
*/
|
|
|
|
if (start <= address + bias && address + bias < end)
|
|
|
|
address += bias;
|
|
|
|
return drgn_object_set_reference(ret, qualified_type, address,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
0, 0);
|
2021-01-08 20:25:29 +00:00
|
|
|
} else if ((attr = dwarf_attr_integrate(die, DW_AT_const_value,
|
|
|
|
&attr_mem))) {
|
|
|
|
return drgn_object_from_dwarf_constant(dbinfo, die,
|
|
|
|
qualified_type, attr,
|
|
|
|
ret);
|
|
|
|
} else {
|
2021-01-21 20:01:58 +00:00
|
|
|
if (dwarf_tag(die) == DW_TAG_template_value_parameter) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"DW_AT_template_value_parameter is missing value");
|
|
|
|
}
|
2021-01-08 20:25:29 +00:00
|
|
|
return drgn_object_set_absent(ret, qualified_type, 0);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_base_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
2021-01-21 17:07:03 +00:00
|
|
|
const struct drgn_language *lang,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
struct drgn_type **ret)
|
|
|
|
{
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
struct drgn_error *err;
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
const char *name = dwarf_diename(die);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!name) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_base_type has missing or invalid DW_AT_name");
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Attribute attr;
|
|
|
|
Dwarf_Word encoding;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!dwarf_attr_integrate(die, DW_AT_encoding, &attr) ||
|
|
|
|
dwarf_formudata(&attr, &encoding)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_base_type has missing or invalid DW_AT_encoding");
|
|
|
|
}
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
int size = dwarf_bytesize(die);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (size == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_base_type has missing or invalid DW_AT_byte_size");
|
|
|
|
}
|
|
|
|
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
enum drgn_byte_order byte_order;
|
|
|
|
err = dwarf_die_byte_order(die, true, &byte_order);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
switch (encoding) {
|
|
|
|
case DW_ATE_boolean:
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
return drgn_bool_type_create(dbinfo->prog, name, size,
|
|
|
|
byte_order, lang, ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
case DW_ATE_float:
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
return drgn_float_type_create(dbinfo->prog, name, size,
|
|
|
|
byte_order, lang, ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
case DW_ATE_signed:
|
|
|
|
case DW_ATE_signed_char:
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_int_type_create(dbinfo->prog, name, size, true,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
byte_order, lang, ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
case DW_ATE_unsigned:
|
|
|
|
case DW_ATE_unsigned_char:
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_int_type_create(dbinfo->prog, name, size, false,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
byte_order, lang, ret);
|
2021-02-17 22:56:33 +00:00
|
|
|
/* We don't support complex types yet. */
|
|
|
|
case DW_ATE_complex_float:
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
default:
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_base_type has unknown DWARF encoding 0x%llx",
|
|
|
|
(unsigned long long)encoding);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
2019-11-15 01:12:47 +00:00
|
|
|
* DW_TAG_structure_type, DW_TAG_union_type, DW_TAG_class_type, and
|
|
|
|
* DW_TAG_enumeration_type can be incomplete (i.e., have a DW_AT_declaration of
|
|
|
|
* true). This tries to find the complete type. If it succeeds, it returns NULL.
|
2021-03-05 20:46:06 +00:00
|
|
|
* If it can't find a complete type, it returns &drgn_not_found. Otherwise, it
|
|
|
|
* returns an error.
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
*/
|
|
|
|
static struct drgn_error *
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_debug_info_find_complete(struct drgn_debug_info *dbinfo, uint64_t tag,
|
|
|
|
const char *name, struct drgn_type **ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_iterator it;
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_dwarf_index_iterator_init(&it, &dbinfo->dindex.global, name,
|
2020-05-11 18:16:05 +01:00
|
|
|
strlen(name), &tag, 1);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
/*
|
|
|
|
* Find a matching DIE. Note that drgn_dwarf_index does not contain DIEs
|
|
|
|
* with DW_AT_declaration, so this will always be a complete type.
|
|
|
|
*/
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_die *index_die =
|
|
|
|
drgn_dwarf_index_iterator_next(&it);
|
|
|
|
if (!index_die)
|
2021-03-05 20:46:06 +00:00
|
|
|
return &drgn_not_found;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
/*
|
|
|
|
* Look for another matching DIE. If there is one, then we can't be sure
|
|
|
|
* which type this is, so leave it incomplete rather than guessing.
|
|
|
|
*/
|
2020-09-02 21:05:45 +01:00
|
|
|
if (drgn_dwarf_index_iterator_next(&it))
|
2021-03-05 20:46:06 +00:00
|
|
|
return &drgn_not_found;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
Dwarf_Die die;
|
2021-01-21 17:07:03 +00:00
|
|
|
err = drgn_dwarf_index_get_die(index_die, &die);
|
2020-09-02 21:05:45 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
struct drgn_qualified_type qualified_type;
|
2021-01-21 17:07:03 +00:00
|
|
|
err = drgn_type_from_dwarf(dbinfo, index_die->module, &die,
|
|
|
|
&qualified_type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
*ret = qualified_type.type;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
struct drgn_dwarf_member_thunk_arg {
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module;
|
2020-12-18 19:01:29 +00:00
|
|
|
Dwarf_Die die;
|
|
|
|
bool can_be_incomplete_array;
|
|
|
|
};
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static struct drgn_error *
|
2020-12-18 19:01:29 +00:00
|
|
|
drgn_dwarf_member_thunk_fn(struct drgn_object *res, void *arg_)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
struct drgn_dwarf_member_thunk_arg *arg = arg_;
|
|
|
|
if (res) {
|
|
|
|
struct drgn_qualified_type qualified_type;
|
|
|
|
err = drgn_type_from_dwarf_attr(drgn_object_program(res)->_dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
arg->module, &arg->die, NULL,
|
|
|
|
false,
|
2020-12-18 19:01:29 +00:00
|
|
|
arg->can_be_incomplete_array,
|
|
|
|
NULL, &qualified_type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
|
|
|
uint64_t bit_field_size;
|
|
|
|
if ((attr = dwarf_attr_integrate(&arg->die, DW_AT_bit_size,
|
|
|
|
&attr_mem))) {
|
|
|
|
Dwarf_Word bit_size;
|
|
|
|
if (dwarf_formudata(attr, &bit_size)) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"DW_TAG_member has invalid DW_AT_bit_size");
|
|
|
|
}
|
|
|
|
bit_field_size = bit_size;
|
|
|
|
} else {
|
|
|
|
bit_field_size = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
err = drgn_object_set_absent(res, qualified_type,
|
|
|
|
bit_field_size);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
free(arg);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
parse_member_offset(Dwarf_Die *die, union drgn_lazy_object *member_object,
|
|
|
|
bool little_endian, uint64_t *ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The simplest case is when we have DW_AT_data_bit_offset, which is
|
|
|
|
* already the offset in bits from the beginning of the containing
|
|
|
|
* object to the beginning of the member (which may be a bit field).
|
|
|
|
*/
|
|
|
|
attr = dwarf_attr_integrate(die, DW_AT_data_bit_offset, &attr_mem);
|
|
|
|
if (attr) {
|
|
|
|
Dwarf_Word bit_offset;
|
|
|
|
if (dwarf_formudata(attr, &bit_offset)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member has invalid DW_AT_data_bit_offset");
|
|
|
|
}
|
|
|
|
*ret = bit_offset;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Otherwise, we might have DW_AT_data_member_location, which is the
|
|
|
|
* offset in bytes from the beginning of the containing object.
|
|
|
|
*/
|
|
|
|
attr = dwarf_attr_integrate(die, DW_AT_data_member_location, &attr_mem);
|
|
|
|
if (attr) {
|
|
|
|
Dwarf_Word byte_offset;
|
|
|
|
if (dwarf_formudata(attr, &byte_offset)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member has invalid DW_AT_data_member_location");
|
|
|
|
}
|
|
|
|
*ret = 8 * byte_offset;
|
|
|
|
} else {
|
|
|
|
*ret = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* In addition to DW_AT_data_member_location, a bit field might have
|
|
|
|
* DW_AT_bit_offset, which is the offset in bits of the most significant
|
|
|
|
* bit of the bit field from the most significant bit of the containing
|
|
|
|
* object.
|
|
|
|
*/
|
|
|
|
attr = dwarf_attr_integrate(die, DW_AT_bit_offset, &attr_mem);
|
|
|
|
if (attr) {
|
|
|
|
Dwarf_Word bit_offset;
|
|
|
|
if (dwarf_formudata(attr, &bit_offset)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member has invalid DW_AT_bit_offset");
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the architecture is little-endian, then we must compute
|
|
|
|
* the location of the most significant bit from the size of the
|
|
|
|
* member, then subtract the bit offset and bit size to get the
|
|
|
|
* location of the beginning of the bit field.
|
|
|
|
*
|
|
|
|
* If the architecture is big-endian, then the most significant
|
|
|
|
* bit of the bit field is the beginning.
|
|
|
|
*/
|
|
|
|
if (little_endian) {
|
2020-12-18 19:01:29 +00:00
|
|
|
err = drgn_lazy_object_evaluate(member_object);
|
|
|
|
if (err)
|
|
|
|
return err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
|
|
|
attr = dwarf_attr_integrate(die, DW_AT_byte_size,
|
|
|
|
&attr_mem);
|
|
|
|
/*
|
|
|
|
* If the member has an explicit byte size, we can use
|
|
|
|
* that. Otherwise, we have to get it from the member
|
|
|
|
* type.
|
|
|
|
*/
|
2020-12-18 19:01:29 +00:00
|
|
|
uint64_t byte_size;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (attr) {
|
|
|
|
Dwarf_Word word;
|
|
|
|
if (dwarf_formudata(attr, &word)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member has invalid DW_AT_byte_size");
|
|
|
|
}
|
|
|
|
byte_size = word;
|
|
|
|
} else {
|
2020-12-18 19:01:29 +00:00
|
|
|
if (!drgn_type_has_size(member_object->obj.type)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member bit field type does not have size");
|
|
|
|
}
|
2020-12-18 19:01:29 +00:00
|
|
|
err = drgn_type_sizeof(member_object->obj.type,
|
|
|
|
&byte_size);
|
|
|
|
if (err)
|
|
|
|
return err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
2020-12-18 19:01:29 +00:00
|
|
|
*ret += 8 * byte_size - bit_offset - member_object->obj.bit_size;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
} else {
|
|
|
|
*ret += bit_offset;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
static struct drgn_error *
|
2021-01-27 19:12:33 +00:00
|
|
|
parse_member(struct drgn_debug_info *dbinfo,
|
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
|
|
|
bool little_endian, bool can_be_incomplete_array,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_compound_type_builder *builder)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
2020-12-18 19:01:29 +00:00
|
|
|
struct drgn_error *err;
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
const char *name;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if ((attr = dwarf_attr_integrate(die, DW_AT_name, &attr_mem))) {
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
name = dwarf_formstring(attr);
|
|
|
|
if (!name) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_member has invalid DW_AT_name");
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
name = NULL;
|
|
|
|
}
|
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
struct drgn_dwarf_member_thunk_arg *thunk_arg =
|
|
|
|
malloc(sizeof(*thunk_arg));
|
|
|
|
if (!thunk_arg)
|
|
|
|
return &drgn_enomem;
|
2021-01-27 19:12:33 +00:00
|
|
|
thunk_arg->module = module;
|
2020-12-18 19:01:29 +00:00
|
|
|
thunk_arg->die = *die;
|
|
|
|
thunk_arg->can_be_incomplete_array = can_be_incomplete_array;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
union drgn_lazy_object member_object;
|
|
|
|
drgn_lazy_object_init_thunk(&member_object, dbinfo->prog,
|
|
|
|
drgn_dwarf_member_thunk_fn, thunk_arg);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
uint64_t bit_offset;
|
2020-12-18 19:01:29 +00:00
|
|
|
err = parse_member_offset(die, &member_object, little_endian,
|
|
|
|
&bit_offset);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
err = drgn_compound_type_builder_add_member(builder, &member_object,
|
|
|
|
name, bit_offset);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
|
|
|
|
err:
|
2020-12-18 19:01:29 +00:00
|
|
|
drgn_lazy_object_deinit(&member_object);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
return err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
2021-01-09 01:28:27 +00:00
|
|
|
struct drgn_dwarf_die_thunk_arg {
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module;
|
2021-01-09 01:28:27 +00:00
|
|
|
Dwarf_Die die;
|
|
|
|
};
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_dwarf_template_type_parameter_thunk_fn(struct drgn_object *res, void *arg_)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
struct drgn_dwarf_die_thunk_arg *arg = arg_;
|
|
|
|
if (res) {
|
|
|
|
struct drgn_qualified_type qualified_type;
|
|
|
|
err = drgn_type_from_dwarf_attr(drgn_object_program(res)->_dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
arg->module, &arg->die, NULL,
|
|
|
|
true, true, NULL,
|
2021-01-09 01:28:27 +00:00
|
|
|
&qualified_type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
err = drgn_object_set_absent(res, qualified_type, 0);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
free(arg);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
|
|
|
drgn_dwarf_template_value_parameter_thunk_fn(struct drgn_object *res,
|
|
|
|
void *arg_)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
struct drgn_dwarf_die_thunk_arg *arg = arg_;
|
|
|
|
if (res) {
|
|
|
|
err = drgn_object_from_dwarf_variable(drgn_object_program(res)->_dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
arg->module, &arg->die,
|
|
|
|
res);
|
2021-01-09 01:28:27 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
free(arg);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
parse_template_parameter(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_object_thunk_fn *thunk_fn,
|
2021-01-09 01:28:27 +00:00
|
|
|
struct drgn_template_parameters_builder *builder)
|
|
|
|
{
|
|
|
|
char tag_buf[DW_TAG_BUF_LEN];
|
|
|
|
|
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
|
|
|
const char *name;
|
|
|
|
if ((attr = dwarf_attr_integrate(die, DW_AT_name, &attr_mem))) {
|
|
|
|
name = dwarf_formstring(attr);
|
|
|
|
if (!name) {
|
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"%s has invalid DW_AT_name",
|
|
|
|
dwarf_tag_str(die, tag_buf));
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
name = NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
bool defaulted;
|
2021-02-22 20:03:53 +00:00
|
|
|
if (dwarf_flag_integrate(die, DW_AT_default_value, &defaulted)) {
|
2021-01-09 01:28:27 +00:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"%s has invalid DW_AT_default_value",
|
|
|
|
dwarf_tag_str(die, tag_buf));
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_dwarf_die_thunk_arg *thunk_arg =
|
|
|
|
malloc(sizeof(*thunk_arg));
|
|
|
|
if (!thunk_arg)
|
|
|
|
return &drgn_enomem;
|
2021-01-27 19:12:33 +00:00
|
|
|
thunk_arg->module = module;
|
2021-01-09 01:28:27 +00:00
|
|
|
thunk_arg->die = *die;
|
|
|
|
|
|
|
|
union drgn_lazy_object argument;
|
|
|
|
drgn_lazy_object_init_thunk(&argument, dbinfo->prog, thunk_fn,
|
|
|
|
thunk_arg);
|
|
|
|
|
|
|
|
struct drgn_error *err =
|
|
|
|
drgn_template_parameters_builder_add(builder, &argument, name,
|
|
|
|
defaulted);
|
|
|
|
if (err)
|
|
|
|
drgn_lazy_object_deinit(&argument);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static struct drgn_error *
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_compound_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const struct drgn_language *lang,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
enum drgn_type_kind kind, struct drgn_type **ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-10-22 23:34:47 +01:00
|
|
|
char tag_buf[DW_TAG_BUF_LEN];
|
2019-11-15 01:12:47 +00:00
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr = dwarf_attr_integrate(die, DW_AT_name,
|
|
|
|
&attr_mem);
|
|
|
|
const char *tag;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (attr) {
|
|
|
|
tag = dwarf_formstring(attr);
|
2019-11-15 01:12:47 +00:00
|
|
|
if (!tag) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
2019-11-15 01:12:47 +00:00
|
|
|
"%s has invalid DW_AT_name",
|
2020-10-22 23:34:47 +01:00
|
|
|
dwarf_tag_str(die, tag_buf));
|
2019-11-15 01:12:47 +00:00
|
|
|
}
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
} else {
|
|
|
|
tag = NULL;
|
|
|
|
}
|
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
bool declaration;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (dwarf_flag(die, DW_AT_declaration, &declaration)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
2019-11-15 01:12:47 +00:00
|
|
|
"%s has invalid DW_AT_declaration",
|
2020-10-22 23:34:47 +01:00
|
|
|
dwarf_tag_str(die, tag_buf));
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
if (declaration && tag) {
|
2020-10-22 23:34:47 +01:00
|
|
|
err = drgn_debug_info_find_complete(dbinfo, dwarf_tag(die), tag,
|
|
|
|
ret);
|
2021-03-05 20:46:06 +00:00
|
|
|
if (err != &drgn_not_found)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_compound_type_builder builder;
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_compound_type_builder_init(&builder, dbinfo->prog, kind);
|
2021-01-09 01:28:27 +00:00
|
|
|
|
|
|
|
int size;
|
2020-07-13 21:30:36 +01:00
|
|
|
bool little_endian;
|
2021-01-09 01:28:27 +00:00
|
|
|
if (declaration) {
|
|
|
|
size = 0;
|
|
|
|
} else {
|
|
|
|
size = dwarf_bytesize(die);
|
|
|
|
if (size == -1) {
|
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"%s has missing or invalid DW_AT_byte_size",
|
|
|
|
dwarf_tag_str(die, tag_buf));
|
|
|
|
}
|
|
|
|
dwarf_die_is_little_endian(die, false, &little_endian);
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Die member = {}, child;
|
2020-07-01 20:37:16 +01:00
|
|
|
int r = dwarf_child(die, &child);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
while (r == 0) {
|
2021-01-09 01:28:27 +00:00
|
|
|
switch (dwarf_tag(&child)) {
|
|
|
|
case DW_TAG_member:
|
|
|
|
if (!declaration) {
|
|
|
|
if (member.addr) {
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_member(dbinfo, module,
|
2021-01-21 17:07:03 +00:00
|
|
|
&member,
|
|
|
|
little_endian, false,
|
|
|
|
&builder);
|
2021-01-09 01:28:27 +00:00
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
member = child;
|
2020-03-24 20:24:24 +00:00
|
|
|
}
|
2021-01-09 01:28:27 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_template_type_parameter:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_template_parameter(dbinfo, module, &child,
|
2021-01-09 01:28:27 +00:00
|
|
|
drgn_dwarf_template_type_parameter_thunk_fn,
|
|
|
|
&builder.template_builder);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
break;
|
|
|
|
case DW_TAG_template_value_parameter:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_template_parameter(dbinfo, module, &child,
|
2021-01-09 01:28:27 +00:00
|
|
|
drgn_dwarf_template_value_parameter_thunk_fn,
|
|
|
|
&builder.template_builder);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
break;
|
|
|
|
default:
|
|
|
|
break;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
r = dwarf_siblingof(&child, &child);
|
|
|
|
}
|
|
|
|
if (r == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"libdw could not parse DIE children");
|
|
|
|
goto err;
|
|
|
|
}
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
/*
|
|
|
|
* Flexible array members are only allowed as the last member of a
|
|
|
|
* structure with at least one other member.
|
|
|
|
*/
|
|
|
|
if (member.addr) {
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_member(dbinfo, module, &member, little_endian,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
kind != DRGN_TYPE_UNION &&
|
|
|
|
builder.members.size > 0,
|
|
|
|
&builder);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
|
2021-01-09 01:28:27 +00:00
|
|
|
err = drgn_compound_type_create(&builder, tag, size, !declaration, lang,
|
|
|
|
ret);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
err:
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
drgn_compound_type_builder_deinit(&builder);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2020-03-24 20:24:24 +00:00
|
|
|
static struct drgn_error *
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
parse_enumerator(Dwarf_Die *die, struct drgn_enum_type_builder *builder,
|
2020-03-24 20:24:24 +00:00
|
|
|
bool *is_signed)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
const char *name = dwarf_diename(die);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!name) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumerator has missing or invalid DW_AT_name");
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
|
|
|
if (!(attr = dwarf_attr_integrate(die, DW_AT_const_value, &attr_mem))) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumerator is missing DW_AT_const_value");
|
|
|
|
}
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_error *err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (attr->form == DW_FORM_sdata ||
|
|
|
|
attr->form == DW_FORM_implicit_const) {
|
|
|
|
Dwarf_Sword svalue;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (dwarf_formsdata(attr, &svalue))
|
|
|
|
goto invalid;
|
|
|
|
err = drgn_enum_type_builder_add_signed(builder, name,
|
|
|
|
svalue);
|
|
|
|
/*
|
|
|
|
* GCC before 7.1 didn't include DW_AT_encoding for
|
|
|
|
* DW_TAG_enumeration_type DIEs, so we have to guess the sign
|
|
|
|
* for enum_compatible_type_fallback().
|
|
|
|
*/
|
|
|
|
if (!err && svalue < 0)
|
|
|
|
*is_signed = true;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
} else {
|
|
|
|
Dwarf_Word uvalue;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (dwarf_formudata(attr, &uvalue))
|
|
|
|
goto invalid;
|
|
|
|
err = drgn_enum_type_builder_add_unsigned(builder, name,
|
|
|
|
uvalue);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
return err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
invalid:
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"DW_TAG_enumerator has invalid DW_AT_const_value");
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* GCC before 5.1 did not include DW_AT_type for DW_TAG_enumeration_type DIEs,
|
|
|
|
* so we have to fabricate the compatible type.
|
|
|
|
*/
|
|
|
|
static struct drgn_error *
|
2020-09-12 01:41:23 +01:00
|
|
|
enum_compatible_type_fallback(struct drgn_debug_info *dbinfo,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
Dwarf_Die *die, bool is_signed,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
const struct drgn_language *lang,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
struct drgn_type **ret)
|
|
|
|
{
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
int size = dwarf_bytesize(die);
|
|
|
|
if (size == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumeration_type has missing or invalid DW_AT_byte_size");
|
|
|
|
}
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
enum drgn_byte_order byte_order;
|
|
|
|
dwarf_die_byte_order(die, false, &byte_order);
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_int_type_create(dbinfo->prog, "<unknown>", size, is_signed,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
byte_order, lang, ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_enum_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
2021-01-21 17:07:03 +00:00
|
|
|
const struct drgn_language *lang,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_type **ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-07-01 20:37:16 +01:00
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
Dwarf_Attribute attr_mem;
|
2020-07-01 20:37:16 +01:00
|
|
|
Dwarf_Attribute *attr = dwarf_attr_integrate(die, DW_AT_name,
|
|
|
|
&attr_mem);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
const char *tag;
|
|
|
|
if (attr) {
|
|
|
|
tag = dwarf_formstring(attr);
|
|
|
|
if (!tag)
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumeration_type has invalid DW_AT_name");
|
|
|
|
} else {
|
|
|
|
tag = NULL;
|
|
|
|
}
|
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
bool declaration;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (dwarf_flag(die, DW_AT_declaration, &declaration)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumeration_type has invalid DW_AT_declaration");
|
|
|
|
}
|
|
|
|
if (declaration && tag) {
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_debug_info_find_complete(dbinfo,
|
|
|
|
DW_TAG_enumeration_type,
|
|
|
|
tag, ret);
|
2021-03-05 20:46:06 +00:00
|
|
|
if (err != &drgn_not_found)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (declaration) {
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_incomplete_enum_type_create(dbinfo->prog, tag, lang,
|
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_enum_type_builder builder;
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_enum_type_builder_init(&builder, dbinfo->prog);
|
2020-07-01 20:37:16 +01:00
|
|
|
bool is_signed = false;
|
|
|
|
Dwarf_Die child;
|
|
|
|
int r = dwarf_child(die, &child);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
while (r == 0) {
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (dwarf_tag(&child) == DW_TAG_enumerator) {
|
|
|
|
err = parse_enumerator(&child, &builder, &is_signed);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
r = dwarf_siblingof(&child, &child);
|
|
|
|
}
|
|
|
|
if (r == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"libdw could not parse DIE children");
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
struct drgn_type *compatible_type;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
r = dwarf_type(die, &child);
|
|
|
|
if (r == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_enumeration_type has invalid DW_AT_type");
|
|
|
|
goto err;
|
|
|
|
} else if (r) {
|
2020-09-12 01:41:23 +01:00
|
|
|
err = enum_compatible_type_fallback(dbinfo, die, is_signed,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
lang, &compatible_type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
} else {
|
|
|
|
struct drgn_qualified_type qualified_compatible_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf(dbinfo, module, &child,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
&qualified_compatible_type);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
compatible_type = qualified_compatible_type.type;
|
|
|
|
if (drgn_type_kind(compatible_type) != DRGN_TYPE_INT) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_AT_type of DW_TAG_enumeration_type is not an integer type");
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
err = drgn_enum_type_create(&builder, tag, compatible_type, lang, ret);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
err:
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
drgn_enum_type_builder_deinit(&builder);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_typedef_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const struct drgn_language *lang,
|
2020-02-26 21:22:51 +00:00
|
|
|
bool can_be_incomplete_array,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
bool *is_incomplete_array_ret,
|
|
|
|
struct drgn_type **ret)
|
|
|
|
{
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
const char *name = dwarf_diename(die);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (!name) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_typedef has missing or invalid DW_AT_name");
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_qualified_type aliased_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_error *err = drgn_type_from_dwarf_attr(dbinfo, module, die,
|
|
|
|
lang, true,
|
2021-01-08 19:02:27 +00:00
|
|
|
can_be_incomplete_array,
|
|
|
|
is_incomplete_array_ret,
|
|
|
|
&aliased_type);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_typedef_type_create(dbinfo->prog, name, aliased_type, lang,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_pointer_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const struct drgn_language *lang,
|
2020-02-26 21:22:51 +00:00
|
|
|
struct drgn_type **ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_qualified_type referenced_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_error *err = drgn_type_from_dwarf_attr(dbinfo, module, die,
|
|
|
|
lang, true, true,
|
|
|
|
NULL,
|
2021-01-08 19:02:27 +00:00
|
|
|
&referenced_type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
|
|
|
uint64_t size;
|
|
|
|
if ((attr = dwarf_attr_integrate(die, DW_AT_byte_size, &attr_mem))) {
|
|
|
|
Dwarf_Word word;
|
|
|
|
if (dwarf_formudata(attr, &word)) {
|
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"DW_TAG_pointer_type has invalid DW_AT_byte_size");
|
|
|
|
}
|
|
|
|
size = word;
|
|
|
|
} else {
|
2021-02-23 22:06:41 +00:00
|
|
|
uint8_t address_size;
|
|
|
|
err = drgn_program_address_size(dbinfo->prog, &address_size);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
2021-02-23 22:06:41 +00:00
|
|
|
size = address_size;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
}
|
|
|
|
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
/*
|
|
|
|
* The DWARF 5 specification doesn't mention DW_AT_endianity for
|
|
|
|
* DW_TAG_pointer_type DIEs, and GCC as of version 10.2 doesn't emit it
|
|
|
|
* even for pointers stored in the opposite byte order (e.g., when using
|
|
|
|
* scalar_storage_order), but it probably should.
|
|
|
|
*/
|
|
|
|
enum drgn_byte_order byte_order;
|
|
|
|
dwarf_die_byte_order(die, false, &byte_order);
|
2020-09-12 01:41:23 +01:00
|
|
|
return drgn_pointer_type_create(dbinfo->prog, referenced_type, size,
|
Track byte order in scalar types instead of objects
Currently, reference objects and buffer value objects have a byte order.
However, this doesn't always make sense for a couple of reasons:
- Byte order is only meaningful for scalars. What does it mean for a
struct to be big endian? A struct doesn't have a most or least
significant byte; its scalar members do.
- The DWARF specification allows either types or variables to have a
byte order (DW_AT_endianity). The only producer I could find that uses
this is GCC for the scalar_storage_order type attribute, and it only
uses it for base types, not variables. GDB only seems to use to check
it for base types, as well.
So, remove the byte order from objects, and move it to integer, boolean,
floating-point, and pointer types. This model makes more sense, and it
means that we can get the binary representation of any object now.
The only downside is that we can no longer support a bit offset for
non-scalars, but as far as I can tell, nothing needs that.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2021-02-18 00:13:23 +00:00
|
|
|
byte_order, lang, ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
struct array_dimension {
|
|
|
|
uint64_t length;
|
|
|
|
bool is_complete;
|
|
|
|
};
|
|
|
|
|
2019-07-09 23:50:45 +01:00
|
|
|
DEFINE_VECTOR(array_dimension_vector, struct array_dimension)
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static struct drgn_error *subrange_length(Dwarf_Die *die,
|
|
|
|
struct array_dimension *dimension)
|
|
|
|
{
|
|
|
|
Dwarf_Attribute attr_mem;
|
|
|
|
Dwarf_Attribute *attr;
|
|
|
|
Dwarf_Word word;
|
|
|
|
|
|
|
|
if (!(attr = dwarf_attr_integrate(die, DW_AT_upper_bound, &attr_mem)) &&
|
|
|
|
!(attr = dwarf_attr_integrate(die, DW_AT_count, &attr_mem))) {
|
|
|
|
dimension->is_complete = false;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (dwarf_formudata(attr, &word)) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_subrange_type has invalid %s",
|
|
|
|
attr->code == DW_AT_upper_bound ?
|
|
|
|
"DW_AT_upper_bound" :
|
|
|
|
"DW_AT_count");
|
|
|
|
}
|
|
|
|
|
|
|
|
dimension->is_complete = true;
|
2019-10-18 10:59:47 +01:00
|
|
|
/*
|
|
|
|
* GCC emits a DW_FORM_sdata DW_AT_upper_bound of -1 for empty array
|
|
|
|
* variables without an explicit size (e.g., `int arr[] = {};`).
|
|
|
|
*/
|
|
|
|
if (attr->code == DW_AT_upper_bound && attr->form == DW_FORM_sdata &&
|
|
|
|
word == (Dwarf_Word)-1) {
|
|
|
|
dimension->length = 0;
|
|
|
|
} else if (attr->code == DW_AT_upper_bound) {
|
2019-10-16 01:07:27 +01:00
|
|
|
if (word >= UINT64_MAX) {
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return drgn_error_create(DRGN_ERROR_OVERFLOW,
|
2019-10-16 01:07:27 +01:00
|
|
|
"DW_AT_upper_bound is too large");
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
dimension->length = (uint64_t)word + 1;
|
|
|
|
} else {
|
|
|
|
if (word > UINT64_MAX) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OVERFLOW,
|
2019-10-16 01:07:27 +01:00
|
|
|
"DW_AT_count is too large");
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
dimension->length = word;
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_array_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const struct drgn_language *lang,
|
2020-02-26 21:22:51 +00:00
|
|
|
bool can_be_incomplete_array,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
bool *is_incomplete_array_ret,
|
|
|
|
struct drgn_type **ret)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-07-01 20:37:16 +01:00
|
|
|
struct array_dimension_vector dimensions = VECTOR_INIT;
|
2019-07-09 23:50:45 +01:00
|
|
|
struct array_dimension *dimension;
|
2020-07-01 20:37:16 +01:00
|
|
|
Dwarf_Die child;
|
|
|
|
int r = dwarf_child(die, &child);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
while (r == 0) {
|
|
|
|
if (dwarf_tag(&child) == DW_TAG_subrange_type) {
|
2019-07-09 23:50:45 +01:00
|
|
|
dimension = array_dimension_vector_append_entry(&dimensions);
|
|
|
|
if (!dimension)
|
|
|
|
goto out;
|
|
|
|
err = subrange_length(&child, dimension);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
r = dwarf_siblingof(&child, &child);
|
|
|
|
}
|
|
|
|
if (r == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"libdw could not parse DIE children");
|
|
|
|
goto out;
|
|
|
|
}
|
2019-07-09 23:50:45 +01:00
|
|
|
if (!dimensions.size) {
|
|
|
|
dimension = array_dimension_vector_append_entry(&dimensions);
|
|
|
|
if (!dimension)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
goto out;
|
2019-07-09 23:50:45 +01:00
|
|
|
dimension->is_complete = false;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
struct drgn_qualified_type element_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, false, false,
|
|
|
|
NULL, &element_type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
2019-07-09 23:50:45 +01:00
|
|
|
*is_incomplete_array_ret = !dimensions.data[0].is_complete;
|
2020-07-01 20:37:16 +01:00
|
|
|
struct drgn_type *type;
|
2019-07-09 23:50:45 +01:00
|
|
|
do {
|
|
|
|
dimension = array_dimension_vector_pop(&dimensions);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (dimension->is_complete) {
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_array_type_create(dbinfo->prog, element_type,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
dimension->length, lang,
|
|
|
|
&type);
|
2019-07-09 23:50:45 +01:00
|
|
|
} else if (dimensions.size || !can_be_incomplete_array) {
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_array_type_create(dbinfo->prog, element_type,
|
|
|
|
0, lang, &type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
} else {
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_incomplete_array_type_create(dbinfo->prog,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
element_type,
|
|
|
|
lang, &type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
if (err)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
element_type.type = type;
|
|
|
|
element_type.qualifiers = 0;
|
2019-07-09 23:50:45 +01:00
|
|
|
} while (dimensions.size);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
|
|
|
*ret = type;
|
|
|
|
err = NULL;
|
|
|
|
out:
|
2019-07-09 23:50:45 +01:00
|
|
|
array_dimension_vector_deinit(&dimensions);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
static struct drgn_error *
|
2021-01-08 19:59:31 +00:00
|
|
|
drgn_dwarf_formal_parameter_thunk_fn(struct drgn_object *res, void *arg_)
|
2020-12-18 19:01:29 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2021-01-09 01:28:27 +00:00
|
|
|
struct drgn_dwarf_die_thunk_arg *arg = arg_;
|
2020-12-18 19:01:29 +00:00
|
|
|
if (res) {
|
|
|
|
struct drgn_qualified_type qualified_type;
|
|
|
|
err = drgn_type_from_dwarf_attr(drgn_object_program(res)->_dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
arg->module, &arg->die, NULL,
|
|
|
|
false, true, NULL,
|
2020-12-18 19:01:29 +00:00
|
|
|
&qualified_type);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
err = drgn_object_set_absent(res, qualified_type, 0);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
free(arg);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
static struct drgn_error *
|
2021-01-27 19:12:33 +00:00
|
|
|
parse_formal_parameter(struct drgn_debug_info *dbinfo,
|
|
|
|
struct drgn_debug_info_module *module, Dwarf_Die *die,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_function_type_builder *builder)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
Dwarf_Attribute attr_mem, *attr;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
const char *name;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if ((attr = dwarf_attr_integrate(die, DW_AT_name, &attr_mem))) {
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
name = dwarf_formstring(attr);
|
|
|
|
if (!name) {
|
2019-08-15 23:03:42 +01:00
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"DW_TAG_formal_parameter has invalid DW_AT_name");
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
name = NULL;
|
|
|
|
}
|
|
|
|
|
2021-01-09 01:28:27 +00:00
|
|
|
struct drgn_dwarf_die_thunk_arg *thunk_arg =
|
2021-01-08 19:59:31 +00:00
|
|
|
malloc(sizeof(*thunk_arg));
|
2020-12-18 19:01:29 +00:00
|
|
|
if (!thunk_arg)
|
|
|
|
return &drgn_enomem;
|
2021-01-27 19:12:33 +00:00
|
|
|
thunk_arg->module = module;
|
2021-01-08 19:59:31 +00:00
|
|
|
thunk_arg->die = *die;
|
2020-12-18 19:01:29 +00:00
|
|
|
|
|
|
|
union drgn_lazy_object default_argument;
|
|
|
|
drgn_lazy_object_init_thunk(&default_argument, dbinfo->prog,
|
|
|
|
drgn_dwarf_formal_parameter_thunk_fn,
|
|
|
|
thunk_arg);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-12-18 19:01:29 +00:00
|
|
|
struct drgn_error *err =
|
|
|
|
drgn_function_type_builder_add_parameter(builder,
|
|
|
|
&default_argument,
|
|
|
|
name);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
2020-12-18 19:01:29 +00:00
|
|
|
drgn_lazy_object_deinit(&default_argument);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
return err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_function_type_from_dwarf(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, const struct drgn_language *lang,
|
2020-02-26 21:22:51 +00:00
|
|
|
struct drgn_type **ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-10-22 23:34:47 +01:00
|
|
|
char tag_buf[DW_TAG_BUF_LEN];
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
struct drgn_function_type_builder builder;
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_function_type_builder_init(&builder, dbinfo->prog);
|
2020-07-01 20:37:16 +01:00
|
|
|
bool is_variadic = false;
|
|
|
|
Dwarf_Die child;
|
|
|
|
int r = dwarf_child(die, &child);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
while (r == 0) {
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
switch (dwarf_tag(&child)) {
|
|
|
|
case DW_TAG_formal_parameter:
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (is_variadic) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"%s has DW_TAG_formal_parameter child after DW_TAG_unspecified_parameters child",
|
2020-10-22 23:34:47 +01:00
|
|
|
dwarf_tag_str(die,
|
|
|
|
tag_buf));
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
goto err;
|
|
|
|
}
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_formal_parameter(dbinfo, module, &child,
|
|
|
|
&builder);
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
if (err)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
goto err;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
break;
|
|
|
|
case DW_TAG_unspecified_parameters:
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (is_variadic) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"%s has multiple DW_TAG_unspecified_parameters children",
|
2020-10-22 23:34:47 +01:00
|
|
|
dwarf_tag_str(die,
|
|
|
|
tag_buf));
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
is_variadic = true;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
break;
|
2021-01-09 01:28:27 +00:00
|
|
|
case DW_TAG_template_type_parameter:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_template_parameter(dbinfo, module, &child,
|
2021-01-09 01:28:27 +00:00
|
|
|
drgn_dwarf_template_type_parameter_thunk_fn,
|
|
|
|
&builder.template_builder);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
break;
|
|
|
|
case DW_TAG_template_value_parameter:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = parse_template_parameter(dbinfo, module, &child,
|
2021-01-09 01:28:27 +00:00
|
|
|
drgn_dwarf_template_value_parameter_thunk_fn,
|
|
|
|
&builder.template_builder);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
break;
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
default:
|
|
|
|
break;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
r = dwarf_siblingof(&child, &child);
|
|
|
|
}
|
|
|
|
if (r == -1) {
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"libdw could not parse DIE children");
|
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
|
2020-07-01 20:37:16 +01:00
|
|
|
struct drgn_qualified_type return_type;
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, true, true,
|
|
|
|
NULL, &return_type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
goto err;
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
err = drgn_function_type_create(&builder, return_type, is_variadic,
|
|
|
|
lang, ret);
|
|
|
|
if (err)
|
|
|
|
goto err;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
|
|
|
|
|
|
|
err:
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
drgn_function_type_builder_deinit(&builder);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
2019-04-30 20:38:41 +01:00
|
|
|
static struct drgn_error *
|
2021-01-21 17:07:03 +00:00
|
|
|
drgn_type_from_dwarf_internal(struct drgn_debug_info *dbinfo,
|
2021-01-27 19:12:33 +00:00
|
|
|
struct drgn_debug_info_module *module,
|
|
|
|
Dwarf_Die *die, bool can_be_incomplete_array,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
bool *is_incomplete_array_ret,
|
|
|
|
struct drgn_qualified_type *ret)
|
|
|
|
{
|
2020-09-12 01:41:23 +01:00
|
|
|
if (dbinfo->depth >= 1000) {
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return drgn_error_create(DRGN_ERROR_RECURSION,
|
|
|
|
"maximum DWARF type parsing depth exceeded");
|
|
|
|
}
|
|
|
|
|
2021-03-05 23:15:33 +00:00
|
|
|
/* If we got a declaration, try to find the definition. */
|
|
|
|
bool declaration;
|
|
|
|
if (dwarf_flag(die, DW_AT_declaration, &declaration))
|
|
|
|
return drgn_error_libdw();
|
|
|
|
if (declaration) {
|
|
|
|
uintptr_t die_addr;
|
|
|
|
if (drgn_dwarf_index_find_definition(&dbinfo->dindex,
|
|
|
|
(uintptr_t)die->addr,
|
|
|
|
&module, &die_addr)) {
|
|
|
|
Dwarf_Addr bias;
|
|
|
|
Dwarf *dwarf = dwfl_module_getdwarf(module->dwfl_module,
|
|
|
|
&bias);
|
|
|
|
if (!dwarf)
|
|
|
|
return drgn_error_libdwfl();
|
|
|
|
uintptr_t start = (uintptr_t)module->scns[DRGN_SCN_DEBUG_INFO]->d_buf;
|
|
|
|
if (!dwarf_offdie(dwarf, die_addr - start, die))
|
|
|
|
return drgn_error_libdw();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_dwarf_type_map_entry entry = {
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
.key = die->addr,
|
|
|
|
};
|
2020-09-12 01:41:23 +01:00
|
|
|
struct hash_pair hp = drgn_dwarf_type_map_hash(&entry.key);
|
|
|
|
struct drgn_dwarf_type_map_iterator it =
|
|
|
|
drgn_dwarf_type_map_search_hashed(&dbinfo->types, &entry.key,
|
|
|
|
hp);
|
2019-05-16 09:25:43 +01:00
|
|
|
if (it.entry) {
|
|
|
|
if (!can_be_incomplete_array &&
|
|
|
|
it.entry->value.is_incomplete_array) {
|
2020-09-12 01:41:23 +01:00
|
|
|
it = drgn_dwarf_type_map_search_hashed(&dbinfo->cant_be_incomplete_array_types,
|
|
|
|
&entry.key, hp);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
2019-05-16 09:25:43 +01:00
|
|
|
if (it.entry) {
|
|
|
|
ret->type = it.entry->value.type;
|
|
|
|
ret->qualifiers = it.entry->value.qualifiers;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
const struct drgn_language *lang;
|
2021-01-08 18:46:35 +00:00
|
|
|
struct drgn_error *err = drgn_language_from_die(die, true, &lang);
|
2020-02-26 21:22:51 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers = 0;
|
2020-09-12 01:41:23 +01:00
|
|
|
dbinfo->depth++;
|
2019-05-16 09:25:43 +01:00
|
|
|
entry.value.is_incomplete_array = false;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
switch (dwarf_tag(die)) {
|
|
|
|
case DW_TAG_const_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, true,
|
|
|
|
can_be_incomplete_array,
|
2021-01-08 19:06:26 +00:00
|
|
|
&entry.value.is_incomplete_array,
|
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers |= DRGN_QUALIFIER_CONST;
|
|
|
|
break;
|
|
|
|
case DW_TAG_restrict_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, true,
|
|
|
|
can_be_incomplete_array,
|
2021-01-08 19:06:26 +00:00
|
|
|
&entry.value.is_incomplete_array,
|
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers |= DRGN_QUALIFIER_RESTRICT;
|
|
|
|
break;
|
|
|
|
case DW_TAG_volatile_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, true,
|
|
|
|
can_be_incomplete_array,
|
2021-01-08 19:06:26 +00:00
|
|
|
&entry.value.is_incomplete_array,
|
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers |= DRGN_QUALIFIER_VOLATILE;
|
|
|
|
break;
|
|
|
|
case DW_TAG_atomic_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_type_from_dwarf_attr(dbinfo, module, die, lang, true,
|
|
|
|
can_be_incomplete_array,
|
2021-01-08 19:06:26 +00:00
|
|
|
&entry.value.is_incomplete_array,
|
|
|
|
ret);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
ret->qualifiers |= DRGN_QUALIFIER_ATOMIC;
|
|
|
|
break;
|
|
|
|
case DW_TAG_base_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_base_type_from_dwarf(dbinfo, module, die, lang,
|
2021-01-08 19:59:31 +00:00
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_structure_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_compound_type_from_dwarf(dbinfo, module, die, lang,
|
|
|
|
DRGN_TYPE_STRUCT,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_union_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_compound_type_from_dwarf(dbinfo, module, die, lang,
|
|
|
|
DRGN_TYPE_UNION,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
&ret->type);
|
2019-11-15 01:12:47 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_class_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_compound_type_from_dwarf(dbinfo, module, die, lang,
|
|
|
|
DRGN_TYPE_CLASS,
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_enumeration_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_enum_type_from_dwarf(dbinfo, module, die, lang,
|
2021-01-08 19:59:31 +00:00
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_typedef:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_typedef_type_from_dwarf(dbinfo, module, die, lang,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
can_be_incomplete_array,
|
2019-05-16 09:25:43 +01:00
|
|
|
&entry.value.is_incomplete_array,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
&ret->type);
|
|
|
|
break;
|
|
|
|
case DW_TAG_pointer_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_pointer_type_from_dwarf(dbinfo, module, die, lang,
|
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
case DW_TAG_array_type:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_array_type_from_dwarf(dbinfo, module, die, lang,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
can_be_incomplete_array,
|
2019-05-16 09:25:43 +01:00
|
|
|
&entry.value.is_incomplete_array,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
&ret->type);
|
|
|
|
break;
|
|
|
|
case DW_TAG_subroutine_type:
|
|
|
|
case DW_TAG_subprogram:
|
2021-01-27 19:12:33 +00:00
|
|
|
err = drgn_function_type_from_dwarf(dbinfo, module, die, lang,
|
|
|
|
&ret->type);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
break;
|
|
|
|
default:
|
2019-08-15 23:03:42 +01:00
|
|
|
err = drgn_error_format(DRGN_ERROR_OTHER,
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
"unknown DWARF type tag 0x%x",
|
|
|
|
dwarf_tag(die));
|
|
|
|
break;
|
|
|
|
}
|
2020-09-12 01:41:23 +01:00
|
|
|
dbinfo->depth--;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
2019-05-16 09:25:43 +01:00
|
|
|
entry.value.type = ret->type;
|
|
|
|
entry.value.qualifiers = ret->qualifiers;
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_dwarf_type_map *map;
|
2019-05-16 09:25:43 +01:00
|
|
|
if (!can_be_incomplete_array && entry.value.is_incomplete_array)
|
2020-09-12 01:41:23 +01:00
|
|
|
map = &dbinfo->cant_be_incomplete_array_types;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
else
|
2020-09-12 01:41:23 +01:00
|
|
|
map = &dbinfo->types;
|
|
|
|
if (drgn_dwarf_type_map_insert_searched(map, &entry, hp, NULL) == -1) {
|
Associate types with program
I originally envisioned types as dumb descriptors. This mostly works for
C because in C, types are fairly simple. However, even then the
drgn_program_member_info() API is awkward. You should be able to look up
a member directly from a type, but we need the program for caching
purposes. This has also held me back from adding offsetof() or
has_member() APIs.
Things get even messier with C++. C++ template parameters can be objects
(e.g., template <int N>). Such parameters would best be represented by a
drgn object, which we need a drgn program for. Static members are a
similar case.
So, let's reimagine types as being owned by a program. This has a few
parts:
1. In libdrgn, simple types are now created by factory functions,
drgn_foo_type_create().
2. To handle their variable length fields, compound types, enum types,
and function types are constructed with a "builder" API.
3. Simple types are deduplicated.
4. The Python type factory functions are replaced by methods of the
Program class.
5. While we're changing the API, the parameters to pointer_type() and
array_type() are reordered to be more logical (and to allow
pointer_type() to take a default size of None for the program's
default pointer size).
6. Likewise, the type factory methods take qualifiers as a keyword
argument only.
A big part of this change is updating the tests and splitting up large
test cases into smaller ones in a few places.
Signed-off-by: Omar Sandoval <osandov@osandov.com>
2020-07-16 00:34:56 +01:00
|
|
|
/*
|
|
|
|
* This will "leak" the type we created, but it'll still be
|
|
|
|
* cleaned up when the program is freed.
|
|
|
|
*/
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return &drgn_enomem;
|
|
|
|
}
|
|
|
|
if (is_incomplete_array_ret)
|
2019-05-16 09:25:43 +01:00
|
|
|
*is_incomplete_array_ret = entry.value.is_incomplete_array;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_error *drgn_debug_info_find_type(enum drgn_type_kind kind,
|
|
|
|
const char *name, size_t name_len,
|
|
|
|
const char *filename, void *arg,
|
|
|
|
struct drgn_qualified_type *ret)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_debug_info *dbinfo = arg;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
uint64_t tag;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
switch (kind) {
|
2019-04-22 20:30:15 +01:00
|
|
|
case DRGN_TYPE_INT:
|
|
|
|
case DRGN_TYPE_BOOL:
|
|
|
|
case DRGN_TYPE_FLOAT:
|
|
|
|
tag = DW_TAG_base_type;
|
|
|
|
break;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
case DRGN_TYPE_STRUCT:
|
|
|
|
tag = DW_TAG_structure_type;
|
|
|
|
break;
|
|
|
|
case DRGN_TYPE_UNION:
|
|
|
|
tag = DW_TAG_union_type;
|
|
|
|
break;
|
2019-11-15 01:12:47 +00:00
|
|
|
case DRGN_TYPE_CLASS:
|
|
|
|
tag = DW_TAG_class_type;
|
|
|
|
break;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
case DRGN_TYPE_ENUM:
|
|
|
|
tag = DW_TAG_enumeration_type;
|
|
|
|
break;
|
|
|
|
case DRGN_TYPE_TYPEDEF:
|
|
|
|
tag = DW_TAG_typedef;
|
|
|
|
break;
|
|
|
|
default:
|
2020-05-07 23:02:33 +01:00
|
|
|
UNREACHABLE();
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_iterator it;
|
2020-09-12 01:41:23 +01:00
|
|
|
err = drgn_dwarf_index_iterator_init(&it, &dbinfo->dindex.global, name,
|
2020-05-11 18:16:05 +01:00
|
|
|
name_len, &tag, 1);
|
|
|
|
if (err)
|
|
|
|
return err;
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_die *index_die;
|
|
|
|
while ((index_die = drgn_dwarf_index_iterator_next(&it))) {
|
|
|
|
Dwarf_Die die;
|
2021-01-21 17:07:03 +00:00
|
|
|
err = drgn_dwarf_index_get_die(index_die, &die);
|
2020-09-02 21:05:45 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
2019-04-22 20:30:15 +01:00
|
|
|
if (die_matches_filename(&die, filename)) {
|
2021-01-21 17:07:03 +00:00
|
|
|
err = drgn_type_from_dwarf(dbinfo, index_die->module,
|
|
|
|
&die, ret);
|
2019-04-22 20:30:15 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
/*
|
|
|
|
* For DW_TAG_base_type, we need to check that the type
|
|
|
|
* we found was the right kind.
|
|
|
|
*/
|
|
|
|
if (drgn_type_kind(ret->type) == kind)
|
|
|
|
return NULL;
|
2019-04-20 00:05:19 +01:00
|
|
|
}
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
2019-07-23 23:32:26 +01:00
|
|
|
return &drgn_not_found;
|
2019-04-28 16:42:28 +01:00
|
|
|
}
|
|
|
|
|
2019-04-30 20:38:41 +01:00
|
|
|
struct drgn_error *
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_debug_info_find_object(const char *name, size_t name_len,
|
|
|
|
const char *filename,
|
|
|
|
enum drgn_find_object_flags flags, void *arg,
|
|
|
|
struct drgn_object *ret)
|
2019-04-30 20:38:41 +01:00
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_debug_info *dbinfo = arg;
|
2019-04-30 20:38:41 +01:00
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
struct drgn_dwarf_index_namespace *ns = &dbinfo->dindex.global;
|
2020-05-11 18:16:05 +01:00
|
|
|
if (name_len >= 2 && memcmp(name, "::", 2) == 0) {
|
|
|
|
/* Explicit global namespace. */
|
|
|
|
name_len -= 2;
|
|
|
|
name += 2;
|
|
|
|
}
|
|
|
|
const char *colons;
|
|
|
|
while ((colons = memmem(name, name_len, "::", 2))) {
|
|
|
|
struct drgn_dwarf_index_iterator it;
|
|
|
|
uint64_t ns_tag = DW_TAG_namespace;
|
|
|
|
err = drgn_dwarf_index_iterator_init(&it, ns, name,
|
|
|
|
colons - name, &ns_tag, 1);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
struct drgn_dwarf_index_die *index_die =
|
|
|
|
drgn_dwarf_index_iterator_next(&it);
|
|
|
|
if (!index_die)
|
|
|
|
return &drgn_not_found;
|
|
|
|
ns = index_die->namespace;
|
|
|
|
name_len -= colons + 2 - name;
|
|
|
|
name = colons + 2;
|
|
|
|
}
|
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
uint64_t tags[3];
|
|
|
|
size_t num_tags = 0;
|
2019-04-30 20:38:41 +01:00
|
|
|
if (flags & DRGN_FIND_OBJECT_CONSTANT)
|
|
|
|
tags[num_tags++] = DW_TAG_enumerator;
|
|
|
|
if (flags & DRGN_FIND_OBJECT_FUNCTION)
|
|
|
|
tags[num_tags++] = DW_TAG_subprogram;
|
|
|
|
if (flags & DRGN_FIND_OBJECT_VARIABLE)
|
|
|
|
tags[num_tags++] = DW_TAG_variable;
|
|
|
|
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_iterator it;
|
2020-05-11 18:16:05 +01:00
|
|
|
err = drgn_dwarf_index_iterator_init(&it, ns, name, strlen(name), tags,
|
|
|
|
num_tags);
|
|
|
|
if (err)
|
|
|
|
return err;
|
2020-09-02 21:05:45 +01:00
|
|
|
struct drgn_dwarf_index_die *index_die;
|
|
|
|
while ((index_die = drgn_dwarf_index_iterator_next(&it))) {
|
|
|
|
Dwarf_Die die;
|
2021-01-21 17:07:03 +00:00
|
|
|
err = drgn_dwarf_index_get_die(index_die, &die);
|
2020-09-02 21:05:45 +01:00
|
|
|
if (err)
|
|
|
|
return err;
|
2019-04-30 20:38:41 +01:00
|
|
|
if (!die_matches_filename(&die, filename))
|
|
|
|
continue;
|
|
|
|
switch (dwarf_tag(&die)) {
|
2019-07-24 00:26:29 +01:00
|
|
|
case DW_TAG_enumeration_type:
|
2021-01-21 17:07:03 +00:00
|
|
|
return drgn_object_from_dwarf_enumerator(dbinfo,
|
|
|
|
index_die->module,
|
|
|
|
&die, name,
|
2021-01-08 19:59:31 +00:00
|
|
|
ret);
|
2019-04-30 20:38:41 +01:00
|
|
|
case DW_TAG_subprogram:
|
2021-01-21 17:07:03 +00:00
|
|
|
return drgn_object_from_dwarf_subprogram(dbinfo,
|
|
|
|
index_die->module,
|
|
|
|
&die, ret);
|
2019-04-30 20:38:41 +01:00
|
|
|
case DW_TAG_variable:
|
2021-01-21 17:07:03 +00:00
|
|
|
return drgn_object_from_dwarf_variable(dbinfo,
|
|
|
|
index_die->module,
|
|
|
|
&die, ret);
|
2019-04-30 20:38:41 +01:00
|
|
|
default:
|
2020-05-07 23:02:33 +01:00
|
|
|
UNREACHABLE();
|
2019-04-30 20:38:41 +01:00
|
|
|
}
|
|
|
|
}
|
2019-07-23 23:32:26 +01:00
|
|
|
return &drgn_not_found;
|
2019-04-30 20:38:41 +01:00
|
|
|
}
|
|
|
|
|
2020-09-16 01:42:53 +01:00
|
|
|
struct drgn_error *drgn_debug_info_create(struct drgn_program *prog,
|
|
|
|
struct drgn_debug_info **ret)
|
2019-04-28 16:42:28 +01:00
|
|
|
{
|
2020-09-16 01:42:53 +01:00
|
|
|
struct drgn_debug_info *dbinfo = malloc(sizeof(*dbinfo));
|
2020-09-12 01:41:23 +01:00
|
|
|
if (!dbinfo)
|
2019-04-28 16:42:28 +01:00
|
|
|
return &drgn_enomem;
|
2020-09-16 01:42:53 +01:00
|
|
|
dbinfo->prog = prog;
|
|
|
|
const Dwfl_Callbacks *dwfl_callbacks;
|
|
|
|
if (prog->flags & DRGN_PROGRAM_IS_LINUX_KERNEL)
|
|
|
|
dwfl_callbacks = &drgn_dwfl_callbacks;
|
|
|
|
else if (prog->flags & DRGN_PROGRAM_IS_LIVE)
|
|
|
|
dwfl_callbacks = &drgn_linux_proc_dwfl_callbacks;
|
|
|
|
else
|
|
|
|
dwfl_callbacks = &drgn_userspace_core_dump_dwfl_callbacks;
|
|
|
|
dbinfo->dwfl = dwfl_begin(dwfl_callbacks);
|
|
|
|
if (!dbinfo->dwfl) {
|
2020-09-12 01:41:23 +01:00
|
|
|
free(dbinfo);
|
2020-09-16 01:42:53 +01:00
|
|
|
return drgn_error_libdwfl();
|
2019-09-25 01:13:53 +01:00
|
|
|
}
|
2020-09-16 01:42:53 +01:00
|
|
|
drgn_debug_info_module_table_init(&dbinfo->modules);
|
|
|
|
c_string_set_init(&dbinfo->module_names);
|
|
|
|
drgn_dwarf_index_init(&dbinfo->dindex);
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_dwarf_type_map_init(&dbinfo->types);
|
|
|
|
drgn_dwarf_type_map_init(&dbinfo->cant_be_incomplete_array_types);
|
|
|
|
dbinfo->depth = 0;
|
|
|
|
*ret = dbinfo;
|
2019-04-28 16:42:28 +01:00
|
|
|
return NULL;
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
|
|
|
|
2020-09-12 01:41:23 +01:00
|
|
|
void drgn_debug_info_destroy(struct drgn_debug_info *dbinfo)
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
{
|
2020-09-12 01:41:23 +01:00
|
|
|
if (!dbinfo)
|
2019-05-10 07:53:16 +01:00
|
|
|
return;
|
2020-09-12 01:41:23 +01:00
|
|
|
drgn_dwarf_type_map_deinit(&dbinfo->cant_be_incomplete_array_types);
|
|
|
|
drgn_dwarf_type_map_deinit(&dbinfo->types);
|
|
|
|
drgn_dwarf_index_deinit(&dbinfo->dindex);
|
2020-09-16 01:42:53 +01:00
|
|
|
c_string_set_deinit(&dbinfo->module_names);
|
|
|
|
drgn_debug_info_free_modules(dbinfo, false, true);
|
|
|
|
assert(drgn_debug_info_module_table_empty(&dbinfo->modules));
|
|
|
|
drgn_debug_info_module_table_deinit(&dbinfo->modules);
|
|
|
|
dwfl_end(dbinfo->dwfl);
|
2020-09-12 01:41:23 +01:00
|
|
|
free(dbinfo);
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
}
|
2020-09-24 00:02:02 +01:00
|
|
|
|
|
|
|
struct drgn_error *open_elf_file(const char *path, int *fd_ret, Elf **elf_ret)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
|
|
|
|
*fd_ret = open(path, O_RDONLY);
|
|
|
|
if (*fd_ret == -1)
|
|
|
|
return drgn_error_create_os("open", errno, path);
|
|
|
|
*elf_ret = dwelf_elf_begin(*fd_ret);
|
|
|
|
if (!*elf_ret) {
|
|
|
|
err = drgn_error_libelf();
|
|
|
|
goto err_fd;
|
|
|
|
}
|
|
|
|
if (elf_kind(*elf_ret) != ELF_K_ELF) {
|
|
|
|
err = drgn_error_create(DRGN_ERROR_OTHER, "not an ELF file");
|
|
|
|
goto err_elf;
|
|
|
|
}
|
|
|
|
return NULL;
|
|
|
|
|
|
|
|
err_elf:
|
|
|
|
elf_end(*elf_ret);
|
|
|
|
err_fd:
|
|
|
|
close(*fd_ret);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *find_elf_file(char **path_ret, int *fd_ret, Elf **elf_ret,
|
|
|
|
const char * const *path_formats, ...)
|
|
|
|
{
|
|
|
|
struct drgn_error *err;
|
|
|
|
size_t i;
|
|
|
|
|
|
|
|
for (i = 0; path_formats[i]; i++) {
|
|
|
|
va_list ap;
|
|
|
|
int ret;
|
|
|
|
char *path;
|
|
|
|
int fd;
|
|
|
|
Elf *elf;
|
|
|
|
|
|
|
|
va_start(ap, path_formats);
|
|
|
|
ret = vasprintf(&path, path_formats[i], ap);
|
|
|
|
va_end(ap);
|
|
|
|
if (ret == -1)
|
|
|
|
return &drgn_enomem;
|
|
|
|
fd = open(path, O_RDONLY);
|
|
|
|
if (fd == -1) {
|
|
|
|
free(path);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
elf = dwelf_elf_begin(fd);
|
|
|
|
if (!elf) {
|
|
|
|
close(fd);
|
|
|
|
free(path);
|
|
|
|
continue;
|
|
|
|
}
|
|
|
|
if (elf_kind(elf) != ELF_K_ELF) {
|
|
|
|
err = drgn_error_format(DRGN_ERROR_OTHER,
|
|
|
|
"%s: not an ELF file", path);
|
|
|
|
elf_end(elf);
|
|
|
|
close(fd);
|
|
|
|
free(path);
|
|
|
|
return err;
|
|
|
|
}
|
|
|
|
*path_ret = path;
|
|
|
|
*fd_ret = fd;
|
|
|
|
*elf_ret = elf;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
*path_ret = NULL;
|
|
|
|
*fd_ret = -1;
|
|
|
|
*elf_ret = NULL;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *read_elf_section(Elf_Scn *scn, Elf_Data **ret)
|
|
|
|
{
|
|
|
|
GElf_Shdr shdr_mem, *shdr;
|
|
|
|
Elf_Data *data;
|
|
|
|
|
|
|
|
shdr = gelf_getshdr(scn, &shdr_mem);
|
|
|
|
if (!shdr)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
if ((shdr->sh_flags & SHF_COMPRESSED) && elf_compress(scn, 0, 0) < 0)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
data = elf_getdata(scn, NULL);
|
|
|
|
if (!data)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
*ret = data;
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct drgn_error *elf_address_range(Elf *elf, uint64_t bias,
|
|
|
|
uint64_t *start_ret, uint64_t *end_ret)
|
|
|
|
{
|
|
|
|
uint64_t start = UINT64_MAX, end = 0;
|
|
|
|
size_t phnum, i;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Get the minimum and maximum addresses from the PT_LOAD segments. We
|
|
|
|
* ignore memory ranges that start beyond UINT64_MAX, and we truncate
|
|
|
|
* ranges that end beyond UINT64_MAX.
|
|
|
|
*/
|
|
|
|
if (elf_getphdrnum(elf, &phnum) != 0)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
for (i = 0; i < phnum; i++) {
|
|
|
|
GElf_Phdr phdr_mem, *phdr;
|
|
|
|
uint64_t segment_start, segment_end;
|
|
|
|
|
|
|
|
phdr = gelf_getphdr(elf, i, &phdr_mem);
|
|
|
|
if (!phdr)
|
|
|
|
return drgn_error_libelf();
|
|
|
|
if (phdr->p_type != PT_LOAD || !phdr->p_vaddr)
|
|
|
|
continue;
|
|
|
|
if (__builtin_add_overflow(phdr->p_vaddr, bias,
|
|
|
|
&segment_start))
|
|
|
|
continue;
|
|
|
|
if (__builtin_add_overflow(segment_start, phdr->p_memsz,
|
|
|
|
&segment_end))
|
|
|
|
segment_end = UINT64_MAX;
|
|
|
|
if (segment_start < segment_end) {
|
|
|
|
if (segment_start < start)
|
|
|
|
start = segment_start;
|
|
|
|
if (segment_end > end)
|
|
|
|
end = segment_end;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if (start >= end) {
|
|
|
|
return drgn_error_create(DRGN_ERROR_OTHER,
|
|
|
|
"ELF file has no loadable segments");
|
|
|
|
}
|
|
|
|
*start_ret = start;
|
|
|
|
*end_ret = end;
|
|
|
|
return NULL;
|
|
|
|
}
|