drgn/tests
Omar Sandoval 5e2d70f09e dwarfindex: add file name to DIE index entry key
A name and tag are not always enough to uniquely identify a type or
variable. For example, "struct workspace" in the Linux kernel can refer
to one of at least three types; fs/btrfs/{lzo,zlib,zstd}.c each have
their own struct workspace type. We can, however, also differentiate
DIEs on the file they were declared in.

The naive thing to do would be to include the file name as a string in
the hash table entry. However, that means we must allocate and
canonicalize each path in the line number program header and pay an
extra cache miss plus string comparison when adding a new entry.

We can get rid of the cache miss and string comparison if we instead map
the file name to a unique identifier. The foolproof way to do this would
be to create another big hash table of file names and use the hash table
entry index as the unique identifier. However, for this, we'd still need
to allocate and canonicalize each path as well as worry about another big
hash table.

Once we observe that we can get away with "almost certainly unique"
instead of "truly unique" identifiers, the next logical step is to just
use a hash of the file name as the identifier. With a 64-bit hash and
the ~50k files in the kernel, the probability of a collision is roughly
1 in 10 billion. Even in the unlikely event that there is a collision,
it only matters if the files with colliding names also have colliding
DIEs, which brings things pretty close to the realm of impossibility.
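
The collision estimate above can be sanity-checked with the standard
birthday approximation, p ~= 1 - exp(-n(n-1)/(2N)), where n is the number
of file names and N = 2^64 is the size of the hash space. The following is
a minimal sketch, not part of drgn; the function name is made up for
illustration and the 50,000 figure is the rough file count quoted above.

```python
import math


def collision_probability(n_items: int, hash_bits: int) -> float:
    """Birthday approximation for at least one collision among n_items
    values drawn uniformly from a space of 2**hash_bits values."""
    space = 2.0 ** hash_bits
    exponent = n_items * (n_items - 1) / (2.0 * space)
    # -expm1(-x) == 1 - exp(-x), but stays accurate for very small x.
    return -math.expm1(-exponent)


# ~50k file names hashed to 64 bits (assumed figures from the text above).
p = collision_probability(50_000, 64)
print(f"collision probability: {p:.2e} (about 1 in {1 / p:,.0f})")
```

The result is on the order of 1e-10, which is consistent with the
order-of-magnitude figure given above.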

After this change, DwarfIndex.find() returns a list of DIEs matching the
name and tag. The callers will be updated to use the list in upcoming
changes.
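
To illustrate the idea (not the actual implementation), here is a toy
index that keys entries on the name, the tag, and a 64-bit hash of the
file name, and whose find() returns every DIE matching the name and tag.
ToyDieIndex, file_name_hash(), the choice of BLAKE2b, and the string
stand-ins for DIEs are all assumptions made for this sketch.

```python
import hashlib
from collections import defaultdict

DW_TAG_structure_type = 0x13  # tag value from the DWARF standard


def file_name_hash(path: str) -> int:
    """Hash a file name down to a 64-bit integer identifier."""
    digest = hashlib.blake2b(path.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyDieIndex:
    """Hypothetical index: (name, tag) -> {file name hash: DIE}."""

    def __init__(self):
        self._entries = defaultdict(dict)

    def add(self, name, tag, path, die):
        # Only the first DIE for a given (name, tag, file hash) is kept.
        self._entries[(name, tag)].setdefault(file_name_hash(path), die)

    def find(self, name, tag):
        # Return all DIEs with this name and tag, one per declaring file.
        return list(self._entries.get((name, tag), {}).values())


index = ToyDieIndex()
for source in ("lzo", "zlib", "zstd"):
    path = f"fs/btrfs/{source}.c"
    index.add("workspace", DW_TAG_structure_type, path, f"DIE from {path}")
    # A second declaration from the same file does not add a new entry.
    index.add("workspace", DW_TAG_structure_type, path, "duplicate")

matches = index.find("workspace", DW_TAG_structure_type)
print(len(matches), matches)
```

Here the three "struct workspace" definitions stay distinct because they
come from different files, while a repeated declaration from the same
file collapses into the existing entry.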
2018-06-21 23:13:39 -07:00
__init__.py type: improve type handling 2018-02-24 19:37:51 -08:00
test_corereader.py Implement core dump reading in C 2018-05-24 17:55:47 -07:00
test_leb128.py dwarf: rewrite drgn.dwarf in pure Python 2018-03-26 01:51:20 -07:00
test_memberdesignator.py Add member designator parser 2018-05-13 00:41:20 -07:00
test_program.py program: handle pointers to typedefs of struct/union types 2018-05-25 22:32:03 -07:00
test_rlcompleter.py Add better rlcompleter 2018-05-13 23:51:42 -07:00
test_type.py corereader: implement type reads in C 2018-05-25 00:41:12 -07:00
test_typeindex.py dwarfindex: add file name to DIE index entry key 2018-06-21 23:13:39 -07:00
test_typename.py typename: use a better canonical formatting for integer types 2018-04-29 23:56:31 -07:00