# Copyright (c) Facebook, Inc. and its affiliates.
# SPDX-License-Identifier: GPL-3.0-or-later

Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00

from collections import namedtuple
import os.path

from tests.dwarf import DW_AT, DW_FORM, DW_TAG
libdrgn: use libdwfl
libdwfl is the elfutils "DWARF frontend library". It has high-level
functionality for looking up symbols, walking stack traces, etc. In
order to use this functionality, we need to report our debugging
information through libdwfl. For userspace programs, libdwfl has a much
better implementation than drgn for automatically finding debug
information from a core dump or PID. However, for the kernel, libdwfl
has a few issues:
- It only supports finding debug information for the running kernel, not
vmcores.
- It determines the vmlinux address range by reading /proc/kallsyms,
which is slow (~70ms on my machine).
- If separate debug information isn't available for a kernel module, it
finds it by walking /lib/modules/$(uname -r)/kernel; this is repeated
for every module.
- It doesn't find kernel modules with names containing both dashes and
underscores (e.g., aes-x86_64).
Luckily, drgn already solved all of these problems, and with some
effort, we can keep doing it ourselves and report it to libdwfl.
The conversion replaces a bunch of code for dealing with userspace core
dump notes, /proc/$pid/maps, and relocations.
2019-07-15 08:51:30 +01:00

from tests.elf import ET, PT, SHT
from tests.elfwriter import ElfSection, create_elf_file

DwarfAttrib = namedtuple("DwarfAttrib", ["name", "form", "value"])
DwarfDie = namedtuple("DwarfDie", ["tag", "attribs", "children"])
# children defaults to None so that leaf DIEs can omit it.
DwarfDie.__new__.__defaults__ = (None,)


def _append_uleb128(buf, value):
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            buf.append(byte | 0x80)
        else:
            buf.append(byte)
            break
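
As a rough illustration of what the unsigned LEB128 loop above produces, the sketch below repeats the same loop under the hypothetical name `append_uleb128` so that it runs on its own, and wraps it in an assumed helper, `encode_uleb128`, that is not part of the original file. Each output byte carries seven bits of the value, least significant group first, with the high bit set on every byte except the last.

```python
def append_uleb128(buf, value):
    # Same loop as _append_uleb128, repeated so this block is self-contained.
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            buf.append(byte | 0x80)  # more 7-bit groups follow
        else:
            buf.append(byte)  # final group: high bit clear
            break


def encode_uleb128(value):
    # Hypothetical convenience wrapper, for illustration only.
    buf = bytearray()
    append_uleb128(buf, value)
    return bytes(buf)


small = encode_uleb128(2)
medium = encode_uleb128(128)
large = encode_uleb128(624485)
print(small.hex(), medium.hex(), large.hex())  # 02 8001 e58e26
```

Values below 128 fit in a single byte, which is why small abbreviation codes and tags cost one byte each in the generated sections.
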


def _append_sleb128(buf, value):
    while True:
        byte = value & 0x7F
        value >>= 7
        if (not value and not (byte & 0x40)) or (value == -1 and (byte & 0x40)):
            buf.append(byte)
            break
        else:
            buf.append(byte | 0x80)
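
The signed variant stops only when the remaining value is pure sign extension and the sign bit (0x40) of the last emitted group agrees with it. The sketch below repeats the loop under the hypothetical name `append_sleb128`, with an assumed wrapper `encode_sleb128`, to show a few encodings. It relies on Python's arithmetic right shift for negative integers, as the original does.

```python
def append_sleb128(buf, value):
    # Same loop as _append_sleb128, repeated so this block is self-contained.
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift: negative values converge to -1
        if (not value and not (byte & 0x40)) or (value == -1 and (byte & 0x40)):
            buf.append(byte)
            break
        else:
            buf.append(byte | 0x80)


def encode_sleb128(value):
    # Hypothetical convenience wrapper, for illustration only.
    buf = bytearray()
    append_sleb128(buf, value)
    return bytes(buf)


pos = encode_sleb128(127)
neg = encode_sleb128(-2)
neg_big = encode_sleb128(-123456)
print(pos.hex(), neg.hex(), neg_big.hex())  # ff00 7e c0bb78
```

Note that 127 needs two bytes here: its low seven bits have bit 0x40 set, so a second, zero group is required to mark the value as non-negative.
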


def _compile_debug_abbrev(unit_dies, use_dw_form_indirect):
    buf = bytearray()
    code = 1

    def aux(die):
        nonlocal code
        _append_uleb128(buf, code)
        code += 1
        _append_uleb128(buf, die.tag)
        buf.append(bool(die.children))
        for attrib in die.attribs:
            _append_uleb128(buf, attrib.name)
            _append_uleb128(
                buf, DW_FORM.indirect if use_dw_form_indirect else attrib.form
            )
        buf.append(0)
        buf.append(0)
        if die.children:
            for child in die.children:
                aux(child)

    for die in unit_dies:
        aux(die)
    buf.append(0)
    return buf
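
To make the abbreviation table layout concrete, here is a simplified sketch of the same traversal without the `use_dw_form_indirect` option. The names `Attrib`, `Die`, `append_uleb128`, and `compile_abbrev` are hypothetical stand-ins, and the `DW_*` constants are plain integers set to the values the DWARF standard assigns, rather than the enums from `tests.dwarf`.

```python
from collections import namedtuple

# Hypothetical stand-ins for the tests.dwarf enums (DWARF standard values).
DW_TAG_compile_unit = 0x11
DW_TAG_base_type = 0x24
DW_AT_name = 0x03
DW_AT_byte_size = 0x0B
DW_FORM_string = 0x08
DW_FORM_data1 = 0x0B

Attrib = namedtuple("Attrib", ["name", "form", "value"])
Die = namedtuple("Die", ["tag", "attribs", "children"], defaults=(None,))


def append_uleb128(buf, value):
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            buf.append(byte | 0x80)
        else:
            buf.append(byte)
            break


def compile_abbrev(unit_dies):
    buf = bytearray()
    code = 1

    def aux(die):
        nonlocal code
        append_uleb128(buf, code)  # abbreviation code, unique per DIE
        code += 1
        append_uleb128(buf, die.tag)
        buf.append(bool(die.children))  # has-children flag: 1 or 0
        for attrib in die.attribs:
            append_uleb128(buf, attrib.name)
            append_uleb128(buf, attrib.form)
        buf.append(0)  # (name, form) = (0, 0) ends the attribute list
        buf.append(0)
        for child in die.children or ():
            aux(child)

    for die in unit_dies:
        aux(die)
    buf.append(0)  # abbreviation code 0 ends the table
    return bytes(buf)


unit = Die(
    DW_TAG_compile_unit,
    [Attrib(DW_AT_name, DW_FORM_string, "foo.c")],
    [Die(DW_TAG_base_type, [Attrib(DW_AT_byte_size, DW_FORM_data1, 4)])],
)
abbrev = compile_abbrev([unit])
print(abbrev.hex())
```

Every DIE gets its own abbreviation code in depth-first order, which keeps the writer simple at the cost of a larger table than a real compiler would emit.
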


def _compile_debug_info(unit_dies, little_endian, bits, use_dw_form_indirect):
    byteorder = "little" if little_endian else "big"
    die_offsets = []
    relocations = []
    code = 1
    decl_file = 1

    def aux(buf, die, depth):
        nonlocal code, decl_file
        if depth == 1:
            die_offsets.append(len(buf))
        _append_uleb128(buf, code)
        code += 1
        for attrib in die.attribs:
            if use_dw_form_indirect:
                _append_uleb128(buf, attrib.form)
            if attrib.name == DW_AT.decl_file:
                value = decl_file
                decl_file += 1
            else:
                value = attrib.value
            if attrib.form == DW_FORM.addr:
                buf.extend(value.to_bytes(bits // 8, byteorder))
            elif attrib.form == DW_FORM.data1:
                buf.append(value)
            elif attrib.form == DW_FORM.data2:
                buf.extend(value.to_bytes(2, byteorder))
            elif attrib.form == DW_FORM.data4:
                buf.extend(value.to_bytes(4, byteorder))
            elif attrib.form == DW_FORM.data8:
                buf.extend(value.to_bytes(8, byteorder))
            elif attrib.form == DW_FORM.udata:
                _append_uleb128(buf, value)
            elif attrib.form == DW_FORM.sdata:
                _append_sleb128(buf, value)
            elif attrib.form == DW_FORM.block1:
                buf.append(len(value))
                buf.extend(value)
            elif attrib.form == DW_FORM.string:
                buf.extend(value.encode())
                buf.append(0)
            elif attrib.form == DW_FORM.ref4:
                relocations.append((len(buf), value))
                buf.extend(b"\0\0\0\0")
            elif attrib.form == DW_FORM.ref_sig8:
                buf.extend((value + 1).to_bytes(8, byteorder))
            elif attrib.form == DW_FORM.sec_offset:
                buf.extend(b"\0\0\0\0")
            elif attrib.form == DW_FORM.flag_present:
                pass
            elif attrib.form == DW_FORM.exprloc:
                _append_uleb128(buf, len(value))
                buf.extend(value)
            else:
                assert False, attrib.form
        if die.children:
            for child in die.children:
                aux(buf, child, depth + 1)
            buf.append(0)

    debug_info = bytearray()
    debug_types = bytearray()
    tu_id = 1
    for die in unit_dies:
        relocations.clear()
        die_offsets.clear()
        buf = debug_info if die.tag == DW_TAG.compile_unit else debug_types
        orig_len = len(buf)
        buf.extend(b"\0\0\0\0")  # unit_length
        buf.extend((4).to_bytes(2, byteorder))  # version
        buf.extend((0).to_bytes(4, byteorder))  # debug_abbrev_offset
        buf.append(bits // 8)  # address_size

        if die.tag == DW_TAG.type_unit:
            buf.extend(tu_id.to_bytes(8, byteorder))  # type_signature
            tu_id += 1
            # For now, we assume that the first child is the type.
            relocations.append((len(buf), 0))
            buf.extend(b"\0\0\0\0")  # type_offset

        aux(buf, die, 0)

        unit_length = len(buf) - orig_len - 4
        buf[orig_len : orig_len + 4] = unit_length.to_bytes(4, byteorder)

        for offset, index in relocations:
            die_offset = die_offsets[index] - orig_len
            buf[offset : offset + 4] = die_offset.to_bytes(4, byteorder)
    return debug_info, debug_types
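
The unit header is written with a zeroed `unit_length` that is patched once the unit's size is known. The sketch below isolates that placeholder-then-patch pattern. The function `append_unit` and its `payload` parameter are hypothetical: the payload stands in for the encoded DIEs, and the header fields follow the 32-bit DWARF 4 compilation unit layout used above.

```python
def append_unit(buf, payload, address_size=8, byteorder="little"):
    # Hypothetical helper: appends one unit to buf and returns its start offset.
    orig_len = len(buf)
    buf.extend(b"\0\0\0\0")  # unit_length placeholder
    buf.extend((4).to_bytes(2, byteorder))  # version
    buf.extend((0).to_bytes(4, byteorder))  # debug_abbrev_offset
    buf.append(address_size)  # address_size
    buf.extend(payload)  # stands in for the encoded DIEs

    # unit_length counts everything after the 4-byte length field itself.
    unit_length = len(buf) - orig_len - 4
    buf[orig_len : orig_len + 4] = unit_length.to_bytes(4, byteorder)
    return orig_len


section = bytearray()
first = append_unit(section, b"\x01\x02\x03")
second = append_unit(section, b"\x04")
first_length = int.from_bytes(section[first : first + 4], "little")
second_length = int.from_bytes(section[second : second + 4], "little")
print(first, first_length, second, second_length)  # 0 10 14 8
```

Because several units share one buffer, all offsets are taken relative to `orig_len`, which is also why the `ref4` relocations in the original subtract `orig_len` from each recorded DIE offset.
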
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
|
2021-02-18 00:27:04 +00:00
|
|
|
|
|
|
|
def _compile_debug_line(unit_dies, little_endian):
|
2019-05-10 07:53:16 +01:00
|
|
|
buf = bytearray()
|
Rewrite drgn core in C
The current mixed Python/C implementation works well, but it has a
couple of important limitations:
- It's too slow for some common use cases, like iterating over large
data structures.
- It can't be reused in utilities written in other languages.
This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:
- Types are now represented by a single Type class rather than the messy
polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
functions.
The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.
Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
|
|
|
    byteorder = "little" if little_endian else "big"

    buf.extend(b"\0\0\0\0")  # unit_length
    buf.extend((4).to_bytes(2, byteorder))  # version
    buf.extend(b"\0\0\0\0")  # header_length
    buf.append(1)  # minimum_instruction_length
    buf.append(1)  # maximum_operations_per_instruction
    buf.append(1)  # default_is_stmt
    buf.append(1)  # line_base
    buf.append(1)  # line_range
    buf.append(1)  # opcode_base
    # standard_opcode_lengths is empty because opcode_base is 1.

    def compile_include_directories(die):
        for attrib in die.attribs:
            if attrib.name != DW_AT.decl_file:
                continue
            dirname = os.path.dirname(attrib.value)
            if dirname:
                buf.extend(dirname.encode("ascii"))
                buf.append(0)
        if die.children:
            for child in die.children:
                compile_include_directories(child)

    for die in unit_dies:
        compile_include_directories(die)
    buf.append(0)  # terminates include_directories

    decl_file = 1
    directory = 1

    def compile_file_names(die):
        nonlocal decl_file, directory
        for attrib in die.attribs:
            if attrib.name != DW_AT.decl_file:
                continue
            dirname, basename = os.path.split(attrib.value)
            buf.extend(basename.encode("ascii"))
            buf.append(0)
            # directory index
            if dirname:
                _append_uleb128(buf, directory)
                directory += 1
            else:
                _append_uleb128(buf, 0)
            _append_uleb128(buf, 0)  # mtime
            _append_uleb128(buf, 0)  # size
        if die.children:
            for child in die.children:
                compile_file_names(child)

    for die in unit_dies:
        compile_file_names(die)
    buf.append(0)

    unit_length = len(buf) - 4
    buf[:4] = unit_length.to_bytes(4, byteorder)
    # header_length excludes the 2-byte version and its own 4-byte field.
    header_length = unit_length - 6
    buf[6:10] = header_length.to_bytes(4, byteorder)
    return buf


UNIT_HEADER_TYPES = frozenset({DW_TAG.type_unit, DW_TAG.compile_unit})


def compile_dwarf(
    dies, little_endian=True, bits=64, *, lang=None, use_dw_form_indirect=False
):
    if isinstance(dies, DwarfDie):
        dies = (dies,)
    assert all(isinstance(die, DwarfDie) for die in dies)

    if dies and dies[0].tag in UNIT_HEADER_TYPES:
        unit_dies = dies
    else:
        unit_dies = (DwarfDie(DW_TAG.compile_unit, (), dies),)
    assert all(die.tag in UNIT_HEADER_TYPES for die in unit_dies)

    unit_attribs = [DwarfAttrib(DW_AT.stmt_list, DW_FORM.sec_offset, 0)]
    if lang is not None:
        unit_attribs.append(DwarfAttrib(DW_AT.language, DW_FORM.data1, lang))
    cu_attribs = unit_attribs + [
        DwarfAttrib(DW_AT.comp_dir, DW_FORM.string, "/usr/src")
    ]

    unit_dies = [
        DwarfDie(
            die.tag,
            list(die.attribs)
            + (cu_attribs if die.tag == DW_TAG.compile_unit else unit_attribs),
            die.children,
        )
        for die in unit_dies
    ]

    debug_info, debug_types = _compile_debug_info(
        unit_dies, little_endian, bits, use_dw_form_indirect
    )

    sections = [
        ElfSection(p_type=PT.LOAD, vaddr=0xFFFF0000, data=b""),
        ElfSection(
            name=".debug_abbrev",
            sh_type=SHT.PROGBITS,
            data=_compile_debug_abbrev(unit_dies, use_dw_form_indirect),
        ),
        ElfSection(name=".debug_info", sh_type=SHT.PROGBITS, data=debug_info),
        ElfSection(
            name=".debug_line",
            sh_type=SHT.PROGBITS,
            data=_compile_debug_line(unit_dies, little_endian),
        ),
        ElfSection(name=".debug_str", sh_type=SHT.PROGBITS, data=b"\0"),
    ]
    if debug_types:
        sections.append(
            ElfSection(name=".debug_types", sh_type=SHT.PROGBITS, data=debug_types)
        )

    return create_elf_file(ET.EXEC, sections, little_endian=little_endian, bits=bits)