Rewrite drgn core in C

The current mixed Python/C implementation works well, but it has a
couple of important limitations:

- It's too slow for some common use cases, like iterating over large
  data structures.
- It can't be reused in utilities written in other languages.

This replaces the internals with a new library written in C, libdrgn. It
includes Python bindings with mostly the same public interface as
before, with some important improvements:

- Types are now represented by a single Type class rather than the messy
  polymorphism in the Python implementation.
- Qualifiers are a bitmask instead of a set of strings.
- Bit fields are not considered a separate type.
- The lvalue/rvalue terminology is replaced with reference/value.
- Structure, union, and array values are better supported.
- Function objects are supported.
- Program distinguishes between lookups of variables, constants, and
  functions.

The C rewrite is about 6x as fast as the original Python when using the
Python bindings, and about 8x when using the C API directly.

Currently, the exposed API in C is fairly conservative. In the future,
the memory reader, type index, and object index APIs will probably be
exposed for more flexibility.
2019-03-22 23:27:46 +00:00
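The commit message says qualifiers became a bitmask instead of a set of strings. The following is only a rough sketch of that idea in Python; the `Qualifier` class and its members are hypothetical names chosen for illustration and are not taken from drgn's actual definitions.

```python
import enum


class Qualifier(enum.Flag):
    """Hypothetical qualifier bitmask; not drgn's real definition."""

    CONST = enum.auto()
    VOLATILE = enum.auto()


# Old style (as described in the commit message): a set of strings.
old_style = {"const", "volatile"}

# New style: flags combined with bitwise OR into a single value.
new_style = Qualifier.CONST | Qualifier.VOLATILE

# Membership tests work on both representations.
print("const" in old_style)          # True
print(Qualifier.CONST in new_style)  # True
```

A bitmask like this maps directly onto a C integer, which is presumably why it suits a library whose core is written in C.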
import itertools
import os.path
import unittest
2019-05-10 07:53:16 +01:00
from drgn import filename_matches
2019-04-02 07:12:12 +01:00
from tests.libdrgn import PathIterator, path_ends_with


# normpath("//") returns "//". See https://bugs.python.org/issue26329.
def my_normpath(path):
    path = os.path.normpath(path)
    if path[:2] == "//":
        return path[1:]
    else:
        return path


# Given a sequence of components, generate all of the possible combinations of
# joining or not joining those components with '/'.
def join_combinations(components):
    if len(components) > 1:
        for join in itertools.product([False, True], repeat=len(components) - 1):
            combination = [components[0]]
            for i in range(1, len(components)):
                if join[i - 1]:
                    combination[-1] += "/" + components[i]
                else:
                    combination.append(components[i])
            yield combination
    else:
        yield components
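As a quick illustration of what the `join_combinations` helper above generates, here is a self-contained sketch that repeats the helper and prints its output for a three-component input. The input `("a", "b", "c")` is just an example value.

```python
import itertools


def join_combinations(components):
    # Same logic as the test helper: for each gap between components,
    # either join the neighbours with "/" or keep them separate.
    if len(components) > 1:
        for join in itertools.product([False, True], repeat=len(components) - 1):
            combination = [components[0]]
            for i in range(1, len(components)):
                if join[i - 1]:
                    combination[-1] += "/" + components[i]
                else:
                    combination.append(components[i])
            yield combination
    else:
        yield components


for combination in join_combinations(("a", "b", "c")):
    print(combination)
# ['a', 'b', 'c']
# ['a', 'b/c']
# ['a/b', 'c']
# ['a/b/c']
```

With n components there are n - 1 gaps, so the helper yields 2^(n - 1) variants, which lets each test case exercise every way of splitting the same path across arguments.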


class TestPathIterator(unittest.TestCase):
    def assertComponents(self, path_components, expected, combinations=True):
        if combinations:
            cases = join_combinations(path_components)
        else:
            cases = (path_components,)
        for case in cases:
            with self.subTest(case=case):
                self.assertEqual(list(PathIterator(*case)), expected)

    def test_empty(self):
        self.assertEqual(list(PathIterator()), [])
        self.assertEqual(list(PathIterator("")), [])
        self.assertEqual(list(PathIterator("", "")), [])

    def test_simple(self):
        self.assertComponents(("a",), ["a"])
        self.assertComponents(("abc", "def"), ["def", "abc"])
        self.assertComponents(("abc", "def", "ghi"), ["ghi", "def", "abc"])

    def test_root(self):
        self.assertComponents(("/",), [""])
        self.assertComponents(("/", ""), [""])
        self.assertComponents(("", "/"), [""])
        self.assertComponents(("", "/", ""), [""])

    def test_absolute(self):
        self.assertComponents(("/root",), ["root", ""])
        self.assertComponents(("/./usr",), ["usr", ""])
        self.assertComponents(("/home", "user"), ["user", "home", ""])
        self.assertComponents(("foo", "/root"), ["root", ""], combinations=False)

    def test_redundant_slash(self):
        self.assertComponents(("a/",), ["a"])
        self.assertComponents(("a//",), ["a"])
        self.assertComponents(("//",), [""])
        self.assertComponents(("//a",), ["a", ""])
        self.assertComponents(("///a",), ["a", ""])

    def test_dot(self):
        self.assertComponents(("a", "."), ["a"])
        self.assertComponents((".", "a"), ["a"])
        self.assertComponents((".", "a", "."), ["a"])

    def test_dot_dot(self):
        self.assertComponents(("a", "b", ".."), ["a"])
        self.assertComponents(("a", "..", "b"), ["b"])

    def test_relative_dot_dot(self):
        self.assertComponents(("..", "one", "two"), ["two", "one", ".."])
        self.assertComponents(("one", "..", "..", "two"), ["two", ".."])
        self.assertComponents(("one", "two", "..", "..", ".."), [".."])

    def test_dot_dot_above_root(self):
        self.assertComponents(("/..", "one", "two"), ["two", "one", ""])
        self.assertComponents(("/one", "..", "..", "two"), ["two", ""])
        self.assertComponents(("/one", "two", "..", "..", ".."), [""])

    def test_current_directory(self):
        self.assertComponents((".",), [])
        self.assertComponents(("", "."), [], combinations=False)
        self.assertComponents((".", ""), [])
        self.assertComponents((".", "."), [])
        self.assertComponents(("foo", ".."), [])
        self.assertComponents(("a", "b", "..", ".."), [])

    def assertPathEndsWith(self, haystack, needle):
        self.assertTrue(path_ends_with(PathIterator(*haystack), PathIterator(*needle)))
        self.assertTrue(
            filename_matches(os.path.join(*haystack), os.path.join(*needle))
        )

    def assertNotPathEndsWith(self, haystack, needle):
        self.assertFalse(path_ends_with(PathIterator(*haystack), PathIterator(*needle)))
        self.assertFalse(
            filename_matches(os.path.join(*haystack), os.path.join(*needle))
        )

    def test_path_ends_with(self):
        self.assertPathEndsWith(("ab/cd/ef",), ("ef",))
        self.assertPathEndsWith(("ab/cd/ef",), ("cd/ef",))
        self.assertNotPathEndsWith(("ab/cd/ef",), ("d/ef",))
        self.assertNotPathEndsWith(("ab/cd", "/ef"), ("cd/ef",))
        self.assertPathEndsWith(("/abc",), ("abc",))
        self.assertNotPathEndsWith(("abc",), ("/abc",))
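The tests above pin down the observable behaviour of `PathIterator` and `path_ends_with`: components come out last-to-first, an absolute path ends with an empty string for the root, and `.`, `..` and redundant slashes are normalized away. The pure-Python model below is a sketch inferred only from those expectations. The names `reversed_components` and `path_ends_with_model` are hypothetical, and this is not how libdrgn implements the iterator.

```python
def reversed_components(*paths):
    """Model of the expected PathIterator output (an assumption, not libdrgn code)."""
    stack = []
    absolute = False
    for path in paths:
        if path.startswith("/"):
            # A later absolute path discards everything before it.
            stack.clear()
            absolute = True
        for part in path.split("/"):
            if part in ("", "."):
                continue  # Redundant slashes and "." contribute nothing.
            if part == "..":
                if stack and stack[-1] != "..":
                    stack.pop()
                elif not absolute:
                    stack.append("..")  # Relative paths may climb upwards.
                # ".." above the root of an absolute path is dropped.
            else:
                stack.append(part)
    result = stack[::-1]
    if absolute:
        result.append("")  # The root is reported as an empty component.
    return result


def path_ends_with_model(haystack, needle):
    """True if the needle's components are a suffix of the haystack's."""
    h = reversed_components(*haystack)
    n = reversed_components(*needle)
    return h[: len(n)] == n


print(reversed_components("/home", "user"))             # ['user', 'home', '']
print(reversed_components("one", "..", "..", "two"))    # ['two', '..']
print(path_ends_with_model(("ab/cd/ef",), ("cd/ef",)))  # True
print(path_ends_with_model(("ab/cd/ef",), ("d/ef",)))   # False
```

Yielding components in reverse makes the suffix check a simple prefix comparison of the two sequences, which is one plausible reason for the iteration order the tests expect.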