Summary: Adding support for the nullptr clang type (clang::BuiltinType::NullPtr). While there I augmented the exception message to include the type name that is missing.
Differential Revision: D53272742
CodeGen v2 permits template parameters to be qualified. This means that if we
call `make_field` with a template parameter it will be qualified. However, we
don't qualify the types when generating meta functions such as `NameProvider`
and `TypeHandler`. This means these qualified types don't match up with the
expected type.
Use `std::decay_t` when forwarding the type to `NameProvider` and `TypeHandler`
so they're always the base type that they were generated with. Most of this is
covered by `make_field`, but there are direct references to `TypeHandler<Ctx,
T>` in a lot of `TypeHandler::type` fields. Fix the problematic types manually
for now, there may need to be a better solution with meta functions for this in
the future.
Test Plan:
- CI
- Added a test for `std::unique_ptr<const uint64_t>` to exercise this. Failed
before, passes after.
- Added a test for `std::unique_ptr<const std::vector<uint64_t>>` to test a
non-primitive type. Failed before, passes after.
* fix string type sso computation
* fix rest of sbo/sso calculation
* make placement of uintptr_t cast consistent
* separate check-inline into function
Summary:
OI was previously using `std::regex_match` to match container names. This was bad because `libstdc++`'s implementation of regex is awful. In the case of limited inlining it was causing a stack overflow when running CodeGen for large types (I think types with large names but I never got to the bottom of it).
Replace this with the competent `boost::regex_match` that we already have a dependency on.
Reviewed By: ajor
Differential Revision: D53002752
Thrift isset was failing with a SEGV if the struct contained padding. This is
because we indexed the `isset_indexes` data structure using our field index
rather than the index of the field in Thrift. This then gave a rubbish index
for any exceeding which happens if we have added padding in the middle of the
struct, and this index was looked up in the bitset which can cause a SEGV.
Track a new index `thriftFieldIdx` which is only incremented if we've looked up
a Thrift index.
Namespaced the generated Thrift structs while I was there. This isn't necessary
anymore but cleans things up.
Test plan:
- Added a test case with lots of padding. These don't run in the CI but it
passes locally.
- `FILTER='OilIntegration.*' make test` - no failures
- `FILTER='OidIntegration.*' make test` - no new failures
Previously AlignmentCalc calculates the alignment and sets packing for every
type except a member with explicit alignment. Change this to check whether an
alignment has been previously set for a type before calculating it. Use this in
ClangTypeParser where the full alignment of the type is available.
Remove explicitly aligning members by the type because that was previously
reserved for members with explicit alignment. AlignmentCalc will correctly
align a member to the underlying type without this. Explicit member alignment
is still missing, as before this change.
Test plan:
- CI
- Too little. Gets further into a production type than without this change.
A previous change enabled running OIL tests with specific features enabled.
This highlighted that pointer code generation under TreeBuilder-v2 was very
broken. This change updates pointer code generation to work and enables the
skipped tests. All enabled tests need `expected_json_v2` added to them due to
formatting differences.
Reformatted and rewrote the basic type handler that handles primitives and
pointers. Removed the reliance on `features` to decide whether to generate for
TreeBuilder-v2 as the intermediate features have been removed. Pointers are
treated as containers with a capacity of 1 and a length of 0 if null/a cycle
and 1 if followed. This holds for void pointers where, although they aren't
followed, the length is still set.
There were a couple of other changes needed to enable these tests on TBv2 that
aren't worth their own issues and PRs, I sneaked them in here.
Extra changes:
- Added `Pointer` and `Reference` to TopoSorter so they generate
`NameProvider` instances. It might be worth visiting the graph differently
for `NameProvider` as it requires so many instances that others generators do
not. Will consider that in the future.
- Follow typedefs when calculating exclusive size for a type.
Closes#458.
Test plan:
- CI
- Enabled previously disabled tests.
Some of the logic that makes Thrift isset work for TreeBuilder-v2 in DrgnParser
(JIT OIL) wasn't ported to ClangTypeParser meaning it doesn't work in
Ahead-of-Time (AoT) OIL.
Add the template parameter name reconstruction for enum values to
ClangTypeParser.
Test plan:
- Tested with Thrift isset enabled on an internal type. Doesn't build before,
does build after.
Support the capture-thrift-isset feature with TreeBuilder-v2. Fairly minor
changes here except the type of the Enum in a template parameter now matters.
We follow the previous behaviour of capturing a value for each field in a
struct that has an `isset_bitset`. This value is a VarInt captured before the
C++ contents of the member. It has 3 values: 0 (not set), 1 (set), and 2
(unavailable). These are handled by the processor and represented in the output
as `false`, `true`, and `std::nullopt_t` respectively.
Changes:
- Add a simple Thrift isset processor before any fields that have Thrift isset.
- Store the fully qualified names of enum types in DrgnParser - it already
worked out this information anyway for naming the values and this is
consistent with classes.
- Forward all enum template parameters under their input name under the
assumption that they will all be policy type things like `IssetBitsetOption`.
This could turn out to be wrong.
Test plan:
- CI (doesn't test thrift changes but covers other regressions)
- Updated Thrift enum tests for new format.
- `FILTER='OilIntegration.*' make test` - Thrift tests failed before, succeed
after.
C++ has a concept of Primitive which holds in the type graph. However we don't
currently expose this information to the end user. Expose this from the OIL
iterator to allow future features like primitive rollups.
This affects containers like maps which have a fake `[]` element with no type.
They use this to group together the key/value in a map and to account for any
per element storage overhead. Currently the decision is to make the fake `[]`
element a primitive if all of its children are primitives. This allows for more
effective primitive rollups if that is implemented. This implementation detail
may be changed in future.
Test Plan:
- CI
- Updated simple tests.
Array members are currently being named "TODO" (whoops). Include arrays in
TopoSorter so each one can have a `NameProvider` generated in CodeGen. Then
pass array elements through `make_field`.
Test plan:
- CI
- Add array member names to an array test.
TreeBuilder did not consider a zero-length array like a container and
never read the array's sizeof stored in the data buffer, leading to a
mismatch between bytes written vs read out of the buffer.
Now, `TreeBuilder::isContainer` does consider zero-length array like
a container and properly consume all the object sizes in the buffer.
The recursive template implemented for `validate_size` does not support
incomplete types, like zero-length array.
By splitting the `validate_size` struct in two parts:
1. `validate_size_eq` that does the actual size check, and
2. `validate_size` that preserves the previous interface and inherit
from `validate_size_eq`,
we get the same interface and feature than previously, but without the
recursive template that doesn't support incomplete types.
Summary:
We have a good type representation in the Type Graph of an incomplete type and
the underlying type that represents. However, this incomplete type still ends
up in the generated code as `void` which loses information. For example, a
container that can't contain void may fail to compile because it was
initialised with `void` but really its because the type it was supposed to be
initialised with (say, `Foo`) had incomplete debug information.
This change identifies that a type is incomplete in the output by generating it
as an incomplete type `struct Incomplete<struct Foo>`. This allows us to name
the type correctly in the TreeBuilder output and filter for incomplete types,
as well as getting appropriate compiler errors if it mustn't be incomplete.
Test Plan:
- CI
- Added a unit test to namegen.
- Enabled and added an extra pointers_incomplete test.
This change is tricky to test because it isn't really user visible. The types
still use their `inputName` which is unchanged in any successful output - this
change is used so the compiler fails with a more detailed error.
Add the option to calculate total size (inclusive size) by wrapping the
existing iterator. This change provides a new iterator, `SizedIterator`, which
wraps an existing iterator and adds a new field `size` to the output element.
This is achieved with a two pass algorithm on the existing iterator:
1. Gather metadata for each element. This includes the total size up until that
element and the range of elements that should be included in the size.
2. Return the result from the underlying iterator with the additional
field.
This algorithm is `O(N)` time on the number of elements in the iterator and
`O(N)` time, storing 16 bytes per element. This isn't super expensive but is a
lot more than the current algorithm which requires close to constant space.
Because of this I've implemented it as a wrapper on the iterator rather than on
by default, though it is now on in every one of our integration test cases.
Test plan:
- Added to the integration tests for full coverage.
Attempting to complete a type which can't be completed currently fails oilgen.
For incomplete arrays, which we know are not possible to complete, return false
deliberately.
`requireCompleteType` likely needs to not fail in all cases in the future. For
now this works.
Test plan:
- `std::unique_ptr<long[]>` used to fail the generation. Now it can
successfully codegen.
Unlike DWARF, the Clang AST is capable of correctly calculating the alignment
for each member. If we do this then AlignmentCalc doesn't traverse into the
member to attempt to calculate the alignment.
This check might be wrong if the field has explicit alignment. That case can be
covered when we have proper integration testing and a repro.
Test plan:
- Without this lots of static asserts occur. With this it's okay.
Previously ClangTypeParser assumed all RecordTypes were structs. This is fine
for structs and classes but completely incorrect for unions. Check which type
it is and give type graph the correct one.
Test plan:
- Unions static assert without this change because their size/alignment is
wrong.
ClangTypeParser has emitted a duplicate type for `std::allocatr<int8_t>`.
Rather than fixing this, add the same check the compiler will do for the
duplicate templates that `addNames` emits. That is, `template<>
NameProvider<Foo>` will collide if `Foo` is used twice. We can do this by
adding a set of these strings for now. If this shows up regularly it will
likely make sense to deduplicate the type graph with a deduplication pass.
Test plan:
- Fixes the issue in prod. This change is quite logical.
Summary:
oilgen: migrate to source parsing
Using debug information generated from partial source (that is, not the final
binary) has been insufficient to generally generate OIL code.
A particular example is pointers to templates:
```cpp
#include <oi/oi.h>
template <typename T>
struct Foo {
T t;
};
template <typename T>
struct Bar {
Foo<T>& f;
};
void foo(const Bar<int>& b) {
oi::introspect(b);
}
```
The pointer/reference to `Foo<int>` appears in DWARF with
`DW_AT_declaration(true)` because it could be specialised before its usage.
However, with OIL, we are creating an implicit usage site in the
`oi::introspect` call that the compiler is unable to see.
This change reworks OILGen to work from a Clang command line rather than debug
information. We setup and run a compiler on the source, giving us access to an
AST and Semantic Analyser. We then:
- Find the `oi::introspect` template.
- Iterate through each of its callsites for their type.
- Run `ClangTypeParser::parse` on each type.
- Run codegen.
- Compile into an object file.
Having access to the semantic analyser allows us to forcefully complete a type,
as it would be if it was used in the initial code.
Test Plan:
hope
`buck2 run fbcode//mode/opt fbcode//object-introspection/oil/examples/compile-time:compile-time`
Reviewed By: tyroguru
Differential Revision: D51854477
Pulled By: JakeHillion
Currently we rely on `SymbolService::getTypeName` for getting the hash that's
included in the generated function's name. The value of this must stay the same
to match with the value expected by OIDebugger - changing it causes failure to
relocate when attaching with OID and JIT OIL.
Calculate this name in the `codegenFromDrgn` method and pass it through where
appropriate rather than passing the `drgn_type` itself through.
We don't need to name the type like that when using AoT OIL. Let's
hash the linkage name instead as that is more unique.
Test Plan:
- CI
We previously moved container identification later in CodeGen in order
to preserve information for AlignmentCalc.
However, Flattener needs to know if a class is a container in order to
apply its special handling for this case.
This new approach moves container identification in front of Flattener,
but has Container own a type node, representing its layout. This
underlying type node can be used for calculating a container's
alignment in a later pass.
MutationTracker could only store Type nodes, while ResultTracker is
templated on the result type so can store anything.
Template the Visitor base class on the return type of visit() functions.
This sets us up for allowing visitors to return different results from
their visit() functions in the future.
This will be used in a future commit introducing DrgnExporter, where we
cache drgn_type* results while walking the type graph.
They must not appear in the final generated code as we'd end up with
invalid types with void members, e.g.:
struct Foo {
int a;
void myIncompleteMember;
int c;
};
Removing them from the type graph early also ensures that padding is
calculated correctly.
Summary:
Changes jitlog to use a memfd, an anonymous in memory file descriptor, rather
than a file on disk. Also clean up this fd at the end of an OID run rather than
leaving it in the hope it's valid next time.
A previous attempt to land this used a `char*` from the OID process space in
the remote target syscall. Somehow this works with our integration test target,
but not globally. Changed to use the previous behaviour of putting the syscall
arg in the empty text segment. In doing this I noticed that the text segment
wouldn't be initialised at this point on a fresh process, so we were copying
into effectively an uninitialised address. Move the jit log fd setup to after
the segment setup accordingly.
Test plan:
- CI
- Tested on an integration test target as before. Works.
- Created a new target that definitely doesn't have this string in (simple for
loop). Failed before, works now.
Example:
```sh
$ OID_TEST_ARGS='-fjit-logging' stest OidIntegration.simple_struct
...
I1121 02:57:36.136890 500897 OIDebugger.cpp:269] Outputting JIT logs:
I1121 02:57:36.136896 500897 OIDebugger.cpp:272] JITLOG: SimpleStruct
@00007ffc639be180
I1121 02:57:36.136899 500897 OIDebugger.cpp:272] JITLOG: a @00007ffc639be180
I1121 02:57:36.136901 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be180
I1121 02:57:36.136904 500897 OIDebugger.cpp:272] JITLOG: b @00007ffc639be184
I1121 02:57:36.136905 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be184
I1121 02:57:36.136907 500897 OIDebugger.cpp:272] JITLOG: c @00007ffc639be188
I1121 02:57:36.136909 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be188
I1121 02:57:36.136911 500897 OIDebugger.cpp:278] Finished outputting JIT logs.
...
```
CodeGen v1 and CodeGen v2 must be in sync in order for CodeGen v2 and
TreeBuilder v1 to be compatible. This change updates CodeGen v1 to use
the same set of containers as CodeGen v2.
This specifically fixes "-ftype-graph -Ftree-builder-v2 -Fcapture-thrift-isset"
for Thrift types.
The integration test "thrift_isset_no_capture" covers this, but this bug
was missed as the Thrift tests do not run in CI.
Summary:
tbv2: add dynamic context passed through all functions
Previously for we had some shared state between all requests, noticeably the
pointers set. This change adds a by reference value to all requests which can
hold additional mutable state. The pointers set is moved into this mutable
state for OIL, which means each concurrent request will have its own pointer
set. Doing things this way allows more features to be added in the future
without such a big code modification.
Closes https://github.com/facebookexperimental/object-introspection/issues/404
Pull Request resolved: https://github.com/facebookexperimental/object-introspection/pull/410
Test Plan: - CI
Differential Revision: D51394035
Pulled By: JakeHillion
fbshipit-source-id: 55d2ba9b5e056148a29dc821020cfc3d94e5175a
Summary:
tbv2: replace DB template param with Ctx
TreeBuilder v2 adds a DB template parameter to every function. This is used as
part of the static type to decide what type of DataBuffer is being used:
currently `BackInserterDataBuffer<std::vector<uint8_t>>` for OIL and it would
be `DataSegmentDataBuffer` for OID.
This change replaces the `DB` template parameter with a more general `Ctx`. Due
to issues with dependent naming it also adds a `using DB` to each `TypeHandler`
which has the same function as before. This allows us to add more "static
context" (typedefs and constants) to functions without changing this signature
again, because changing the signature of everything is a massive pain.
Currently this change achieves nothing because Ctx contains only DB in a static
wrapper. In the next change I'm going to pass a reference of type Ctx around to
add a "dynamic context" to invocations which will contain the pointer array. In
future we'll then be able to add either static or dynamic context without any
signature adjustments.
Test Plan:
- CI
---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/facebookexperimental/object-introspection/pull/409).
* https://github.com/facebookexperimental/object-introspection/issues/410
* __->__ https://github.com/facebookexperimental/object-introspection/issues/409
Reviewed By: ajor
Differential Revision: D51352092
Pulled By: JakeHillion
Summary:
Currently there are two features between CodeGen v2 (TypeGraph) and TreeBuilder
v2. These are TypedDataSegment and TreeBuilderTypeChecking. Each of these
features currently has a full set of tests run in the CI and each have specific
exclusions.
Collapse these features into TreeBuilder v2. This allows for significantly
simplified testing as any OIL tests run under TreeBuilder v2 and any OID tests
run under TreeBuilder v1.
The reasoning behind this is I no longer intend to partially roll out this
feature. Full TreeBuilder v2 applies different conditions to containers than
the intermediate states, and writing these only to have them never deployed is
a waste of time.
Test Plan:
- it builds
- CI
Adds the range-v3 library which supports features that otherwise wouldn't be
available until C++23 or C++26. I caught a couple of uses that suit it but this
will allow us to use more in future.
Test Plan:
- CI
Not all containers have 8-byte alignment, so if we want to avoid lots of
manual logic for calculating container alignment on a case-by-case
basis, we must calculate alignment from the member variables before the
Class nodes have been replaced by Container nodes.
Leave it to the new mutator pass IdentifyContainers to replace Class
nodes with Container nodes where appropriate.
This will allow us to run passes over the type graph before identifying
containers, and therefore before we have lost information about the
internal details of the container (e.g. alignment of member variables).