TODO: Replace the references to local paths.
oilgen (the basis of Ahead Of Time compilation for OIL) has never been passed
through our large test suite and has instead had more focused testing and large
examples. When consuming DWARF information in a similar fashion to JIT OIL this
was okay, but with the new Clang AST based mechanism it means we have very
little coverage.
This change adds an oilgen test for every test case that has an oil test.
Relying on the build system to create the test target as before would make it
difficult to have failing tests, so we move the build into the integration test
runner. This involves:
1. Writing the input source to a file.
2. Consuming it with oilgen to get the implementation object file.
3. Compiling the input source and linking it with this file.
4. Running the newly created target.
This approach can give the full error message at any stage that fails and will
fail the test appropriately. The downside is the build system integration is
more difficult, as we need the correct compiler flags for the target and to use
the correct compiler. It would be very tricky to replicate this in a build
system that's not CMake, so we will likely only run these tests in open source.
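The staged flow above can be sketched as a small runner that stops at the first failing stage and reports that stage's name and message. This is only an illustration: `Stage`, `Failure`, and `runStages` are hypothetical names, and the real runner invokes oilgen and the compiler rather than the stand-in lambdas used here. It assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the integration test runner's API.
struct Stage {
  std::string name;
  // Returns true on success; on failure writes the full error into `err`.
  std::function<bool(std::string& err)> run;
};

struct Failure {
  std::string stage;
  std::string message;
};

// Runs the stages in order and returns the first failure, if any.
std::optional<Failure> runStages(const std::vector<Stage>& stages) {
  for (const auto& stage : stages) {
    std::string err;
    if (!stage.run(err)) {
      return Failure{stage.name, err};
    }
  }
  return std::nullopt;
}

void exampleRunStages() {
  bool ranTarget = false;
  std::vector<Stage> stages = {
      {"write source", [](std::string&) { return true; }},
      {"oilgen",
       [](std::string& err) {
         err = "generation failed";
         return false;
       }},
      {"compile and link", [](std::string&) { return true; }},
      {"run target",
       [&](std::string&) {
         ranTarget = true;
         return true;
       }},
  };
  auto failure = runStages(stages);
  assert(failure && failure->stage == "oilgen");
  assert(!ranTarget);  // stages after the failure are skipped
}
```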
Test plan:
- CI
Attempting to complete a type which can't be completed currently fails oilgen.
For incomplete arrays, which we know are not possible to complete, return false
deliberately.
`requireCompleteType` likely needs to not fail in all cases in the future. For
now this works.
Test plan:
- `std::unique_ptr<long[]>` used to fail the generation. Now it can
successfully codegen.
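To illustrate why incomplete arrays are a special case: `long[]` is an array of unknown bound, which the language never allows to be completed, even though `std::unique_ptr<long[]>` itself is a complete type. The trait below is a hypothetical helper written for this sketch, not part of oilgen.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical trait for illustration: true only for arrays of unknown bound.
template <typename T>
struct IsIncompleteArray : std::false_type {};
template <typename T>
struct IsIncompleteArray<T[]> : std::true_type {};

static_assert(IsIncompleteArray<long[]>::value, "long[] has no known bound");
static_assert(!IsIncompleteArray<long[4]>::value, "long[4] is complete");
static_assert(!IsIncompleteArray<long>::value, "long is not an array");

// The smart pointer is complete and stores a pointer to the element type.
static_assert(
    std::is_same<std::unique_ptr<long[]>::element_type, long>::value,
    "element_type of unique_ptr<long[]> is long");
static_assert(sizeof(std::unique_ptr<long[]>) >= sizeof(long*),
              "unique_ptr<long[]> has a known size");
```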
Unlike DWARF, the Clang AST is capable of correctly calculating the alignment
for each member. If we do this then AlignmentCalc doesn't traverse into the
member to attempt to calculate the alignment.
This check might be wrong if the field has explicit alignment. That case can be
covered when we have proper integration testing and a repro.
Test plan:
- Without this change many static asserts fail. With it they pass.
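As a small illustration of the explicit-alignment caveat, a field's alignment can differ from the alignment of its type. The struct names below are made up for this sketch.

```cpp
#include <cstddef>

struct Plain {
  char tag;
  char value;  // aligned like its type: 1 byte
};

struct Explicit {
  char tag;
  alignas(16) char value;  // field alignment overrides the type's alignment
};

static_assert(alignof(char) == 1, "char is 1-byte aligned");
static_assert(offsetof(Plain, value) == 1, "no padding needed");
static_assert(offsetof(Explicit, value) == 16, "padded up to 16");
static_assert(alignof(Explicit) == 16, "struct inherits the field alignment");
static_assert(sizeof(Explicit) == 32, "size is rounded up to the alignment");
```

Calculating alignment from the member's type alone would give 1 for `Explicit::value`, which is why the per-field value matters.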
Previously ClangTypeParser assumed all RecordTypes were structs. This is fine
for structs and classes but completely incorrect for unions. Check which type
it is and give type graph the correct one.
Test plan:
- Without this change the static asserts fail for unions because their
size/alignment is wrong.
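A short example of why treating a union as a struct produces the wrong size: the same members laid out as a struct occupy more space than when they overlap in a union. The type names are illustrative only.

```cpp
#include <cstddef>

union AsUnion {
  char c;
  double d;
};

struct AsStruct {
  char c;
  double d;
};

// Union members all start at offset 0 and share storage.
static_assert(offsetof(AsUnion, d) == 0, "union members overlap");
static_assert(sizeof(AsUnion) == sizeof(double), "sized by largest member");
static_assert(alignof(AsUnion) == alignof(double), "strictest member alignment");

// Struct members are laid out one after another.
static_assert(offsetof(AsStruct, d) >= sizeof(char), "d follows c");
static_assert(sizeof(AsStruct) > sizeof(AsUnion), "struct layout is larger");
```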
Summary:
oilgen: migrate to source parsing
Using debug information generated from partial source (that is, not the final
binary) has been insufficient to generally generate OIL code.
A particular example is pointers to templates:
```cpp
#include <oi/oi.h>

template <typename T>
struct Foo {
  T t;
};

template <typename T>
struct Bar {
  Foo<T>& f;
};

void foo(const Bar<int>& b) {
  oi::introspect(b);
}
```
The pointer/reference to `Foo<int>` appears in DWARF with
`DW_AT_declaration(true)` because it could be specialised before its usage.
However, with OIL, we are creating an implicit usage site in the
`oi::introspect` call that the compiler is unable to see.
This change reworks OILGen to work from a Clang command line rather than debug
information. We set up and run a compiler on the source, giving us access to an
AST and Semantic Analyser. We then:
- Find the `oi::introspect` template.
- Iterate through each of its callsites for their type.
- Run `ClangTypeParser::parse` on each type.
- Run codegen.
- Compile into an object file.
Having access to the semantic analyser allows us to forcefully complete a type,
as it would be if it was used in the initial code.
Test Plan:
hope
`buck2 run fbcode//mode/opt fbcode//object-introspection/oil/examples/compile-time:compile-time`
Reviewed By: tyroguru
Differential Revision: D51854477
Pulled By: JakeHillion
We previously moved container identification later in CodeGen in order
to preserve information for AlignmentCalc.
However, Flattener needs to know if a class is a container in order to
apply its special handling for this case.
This new approach moves container identification in front of Flattener,
but has Container own a type node, representing its layout. This
underlying type node can be used for calculating a container's
alignment in a later pass.
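A minimal sketch of the idea, assuming heavily simplified node types: the container keeps the class layout it was identified from, so a later pass can still derive alignment from the members. The field names and `alignmentOf` are hypothetical and do not mirror the real type graph API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for type graph nodes (assumptions for illustration).
struct Member {
  std::string name;
  std::size_t align;
};

struct Class {
  std::string name;
  std::vector<Member> members;
};

struct Container {
  std::string name;
  Class underlying;  // layout of the class this container replaced
};

// Alignment of a class is the strictest alignment among its members.
std::size_t alignmentOf(const Class& c) {
  std::size_t align = 1;
  for (const auto& m : c.members) {
    align = std::max(align, m.align);
  }
  return align;
}

// A container defers to the layout node it owns.
std::size_t alignmentOf(const Container& c) {
  return alignmentOf(c.underlying);
}

void exampleContainerAlignment() {
  Container c{"MyContainer", Class{"MyContainer", {{"flag", 1}, {"count", 4}}}};
  assert(alignmentOf(c) == 4);
}
```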
MutationTracker could only store Type nodes, while ResultTracker is
templated on the result type so can store anything.
Template the Visitor base class on the return type of visit() functions.
This sets us up for allowing visitors to return different results from
their visit() functions in the future.
This will be used in a future commit introducing DrgnExporter, where we
cache drgn_type* results while walking the type graph.
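A rough sketch of a visitor base class templated on its return type, using two made-up node types. The real `Visitor` covers many more node kinds; `SizeOf` and `Describe` are hypothetical passes that exist only to show two different result types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified node types (assumptions for illustration).
struct Primitive {
  std::size_t size;
};
struct Class {
  std::string name;
  std::size_t size;
};

// Base class templated on the type returned from visit().
template <typename T>
class Visitor {
 public:
  virtual ~Visitor() = default;
  virtual T visit(const Primitive&) = 0;
  virtual T visit(const Class&) = 0;
};

// One visitor returns sizes...
class SizeOf : public Visitor<std::size_t> {
 public:
  std::size_t visit(const Primitive& p) override { return p.size; }
  std::size_t visit(const Class& c) override { return c.size; }
};

// ...another returns strings, using the same base.
class Describe : public Visitor<std::string> {
 public:
  std::string visit(const Primitive&) override { return "primitive"; }
  std::string visit(const Class& c) override { return "class " + c.name; }
};

void exampleVisitors() {
  Class foo{"Foo", 12};
  SizeOf sizeOf;
  Describe describe;
  assert(sizeOf.visit(foo) == 12);
  assert(describe.visit(foo) == "class Foo");
}
```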
Members with incomplete types must not appear in the final generated code as
we'd end up with invalid types with void members, e.g.:
    struct Foo {
      int a;
      void myIncompleteMember;
      int c;
    };
Removing them from the type graph early also ensures that padding is
calculated correctly.
Adds the range-v3 library which supports features that otherwise wouldn't be
available until C++23 or C++26. I caught a couple of uses that suit it but this
will allow us to use more in future.
Test Plan:
- CI
Not all containers have 8-byte alignment, so if we want to avoid lots of
manual logic for calculating container alignment on a case-by-case
basis, we must calculate alignment from the member variables before the
Class nodes have been replaced by Container nodes.
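For a concrete case, the alignment of a standard container depends on what it stores internally. The exact values are implementation-defined, so this sketch only asserts the relationship, which holds on mainstream standard libraries.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// std::array<char, N> only stores chars, so it is typically 1-byte aligned.
constexpr std::size_t kCharArrayAlign = alignof(std::array<char, 4>);

// std::vector stores pointers, so it is typically pointer-aligned.
constexpr std::size_t kVectorAlign = alignof(std::vector<int>);

static_assert(kCharArrayAlign < kVectorAlign,
              "not every container shares the same alignment");
```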
Leave it to the new mutator pass IdentifyContainers to replace Class
nodes with Container nodes where appropriate.
This will allow us to run passes over the type graph before identifying
containers, and therefore before we have lost information about the
internal details of the container (e.g. alignment of member variables).
For the containers which are allowed to be declared with incomplete
types, it is only the contained types which are allowed to be
incomplete. Other template parameters (e.g. allocators) must always be
defined before use.
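As an example of the rule, since C++17 `std::vector` may be instantiated with an incomplete element type, as long as the type is complete before any member of the vector is used. The allocator, by contrast, must already be complete. `Node` and `countNodes` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
  int value;
  // Node is still incomplete here; only the contained type may be.
  // The default std::allocator<Node> is itself a complete type.
  std::vector<Node> children;
};

// Counts a node and all of its descendants.
std::size_t countNodes(const Node& node) {
  std::size_t count = 1;
  for (const auto& child : node.children) {
    count += countNodes(child);
  }
  return count;
}

void exampleIncompleteElement() {
  Node root{1, {Node{2, {}}, Node{3, {}}}};
  assert(countNodes(root) == 3);
}
```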
Lots of places rely on reference stability of ContainerInfo objects
(CodeGen's deduplication, Container nodes' containerInfo_ member).
In the key capture work, we need to be able to append to this list,
which would invalidate references before this change.
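One common way to keep references stable while appending is to store each element behind its own heap allocation. This sketch shows that technique and is not necessarily how the change implements it; `InfoList` and `ContainerInfoLike` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for ContainerInfo (assumption for illustration).
struct ContainerInfoLike {
  std::string typeName;
};

class InfoList {
 public:
  // Appends an entry and returns a reference that stays valid on later appends,
  // because only the vector of pointers is reallocated, not the entries.
  const ContainerInfoLike& add(std::string name) {
    items_.push_back(
        std::make_unique<ContainerInfoLike>(ContainerInfoLike{std::move(name)}));
    return *items_.back();
  }

  std::size_t size() const { return items_.size(); }

 private:
  std::vector<std::unique_ptr<ContainerInfoLike>> items_;
};

void exampleStableReferences() {
  InfoList list;
  const ContainerInfoLike& first = list.add("std::vector");
  for (int i = 0; i < 1000; ++i) {
    list.add("extra");  // would invalidate `first` in a std::vector of values
  }
  assert(first.typeName == "std::vector");
  assert(list.size() == 1001);
}
```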
Dummy and DummyAllocator nodes had been changed to use NodeIds, but
were still printed out in full when visited for a second time.
[[nodiscard]] prevents future bugs of this type by turning them into
compilation errors.
Example of the now-fixed bug:
[1] Container: std::map<int32_t, int32_t, DummySizedOperator<0, 0, 8>, std::allocator<std::pair<int32_t const, int32_t>>>
      Param
        Primitive: int32_t
      Param
        Primitive: int32_t
      Param
[2]     Dummy [less<int>]
      Param
        ...
[3] Container: std::map<int32_t, int32_t, DummySizedOperator<0, 0, 8>, std::allocator<std::pair<int32_t const, int32_t>>>
      Param
        Primitive: int32_t
      Param
        Primitive: int32_t
      Param
        [2]
        Dummy [less<int>]
      Param
        ...
With this patch, the second "Dummy" line will not be printed.
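A sketch of how `[[nodiscard]]` guards against this class of bug, assuming a tracker whose `visit` reports whether a node was already seen. `VisitTracker` and `printDummy` are hypothetical. Compilers typically report an ignored `[[nodiscard]]` result as a warning, which becomes an error in builds that treat warnings as errors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class VisitTracker {
 public:
  explicit VisitTracker(std::size_t size) : seen_(size, false) {}

  // Marks the node as visited and returns whether it had been visited before.
  [[nodiscard]] bool visit(std::size_t id) {
    bool wasSeen = seen_[id];
    seen_[id] = true;
    return wasSeen;
  }

 private:
  std::vector<bool> seen_;
};

// Prints the node in full the first time and only its ID afterwards.
std::string printDummy(VisitTracker& tracker, std::size_t id) {
  std::string prefix = "[" + std::to_string(id) + "]";
  // Writing `tracker.visit(id);` on its own line would drop the result and
  // always print in full; [[nodiscard]] makes the compiler flag that.
  if (tracker.visit(id)) {
    return prefix;
  }
  return prefix + " Dummy";
}

void exampleNodiscard() {
  VisitTracker tracker(4);
  assert(printDummy(tracker, 2) == "[2] Dummy");
  assert(printDummy(tracker, 2) == "[2]");
}
```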
We only want to do the extra work if it's explicitly requested.
chaseRawPointers is already explicitly requested whenever it's needed
and readEnumValues currently isn't needed at all.
Summary:
Update to clang-15 compiler and libraries as clang-12 is ancient.
The changes to oilgen are necessary because the new internal toolchain is stricter about linking PIC to PIC. In certain modes we build with PIC but try to link a non-PIC oilgen artifact. Add the ability to build the oilgen artifacts with PIC, which resolves this.
Reviewed By: ttreyer
Differential Revision: D46220858
Types within containers were previously named TODO. This sorts it out so
they're named as their most resolved type. The current implementation
skips Typedef names.
The TypeGraph class should only be responsible for storing Type nodes.
Traversing the graph and tracking which nodes have been visited should
not be included there.
Passes now take a NodeTrackerHolder as an input parameter, which
provides access to a zeroed-out NodeTracker.
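The split of responsibilities might look roughly like the following, where the holder resets the tracker each time a pass asks for it. The class names come from the description above, but the members and the `reset`/`get` methods are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Tracks which node IDs have been visited during one traversal.
class NodeTracker {
 public:
  explicit NodeTracker(std::size_t size) : visited_(size, false) {}

  // Marks the node as visited and returns whether it had been visited before.
  bool visit(std::size_t id) {
    bool wasVisited = visited_[id];
    visited_[id] = true;
    return wasVisited;
  }

  void reset() { std::fill(visited_.begin(), visited_.end(), false); }

 private:
  std::vector<bool> visited_;
};

// Hands out the tracker in a zeroed-out state so passes cannot see stale marks.
class NodeTrackerHolder {
 public:
  explicit NodeTrackerHolder(NodeTracker& tracker) : tracker_(tracker) {}

  NodeTracker& get() {
    tracker_.reset();
    return tracker_;
  }

 private:
  NodeTracker& tracker_;
};

void exampleTrackerHolder() {
  NodeTracker tracker(4);
  NodeTrackerHolder holder(tracker);

  NodeTracker& firstPass = holder.get();
  assert(!firstPass.visit(1));
  assert(firstPass.visit(1));

  NodeTracker& secondPass = holder.get();  // zeroed again
  assert(!secondPass.visit(1));
}
```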
Type Graph deduplicates and modifies names to better fit the generated
code, for example `int32_t[4]` becomes `OIArray<int32_t, 4>` and `struct
MyStruct` might become `struct MyStruct_0`.
Add an `inputName` which better represents the original input code which
can be used when building the tree.
This removes Printer's legacy behaviour of generating an ID for each
node as it gets printed. This old method meant that if new nodes were
added to or removed from a graph, every ID after the new/removed node
would change.
Now IDs are stable so it is easier to follow specific nodes through
multiple transformation passes in CodeGen.
Names which were generated on-demand are now stored in member variables,
which are set during the ctor and can be regenerated when required (by
NameGen).
We previously only marked as packed if there was no tail padding, which
was not a sufficient condition.
The new AlignmentCalcTest.PackedMembers test case is an example which
would previously not have been marked as packed.
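As an illustration (not the test case itself), here is a packed layout whose size is still a multiple of its members' natural alignment, so a check that only looks at the end of the struct cannot tell it is packed. The misaligned member offset is what gives it away. This uses the GCC/Clang `packed` attribute, and `PackedExample` is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>

struct __attribute__((packed)) PackedExample {
  char tag;
  std::int32_t value;  // placed at offset 1 instead of 4
  char rest[3];
};

// The size is a multiple of the natural alignment of the widest member...
static_assert(sizeof(PackedExample) == 8, "1 + 4 + 3 bytes, no padding");
static_assert(sizeof(PackedExample) % alignof(std::int32_t) == 0,
              "size alone looks like an unpacked layout");

// ...but a member sits at an offset its type would not naturally allow.
static_assert(offsetof(PackedExample, value) == 1, "value is misaligned");
static_assert(offsetof(PackedExample, value) % alignof(std::int32_t) != 0,
              "only possible if the struct is packed");
```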
CodeGen v1 does not record anything for pointers to incomplete types, not
even the address, as is done for other pointers.
Introduce a new Primitive type "Incomplete". This behaves identically to
"Void", but allows us to tell whether a type was defined as void or if
it ended up like that because of incomplete DWARF information.
This extracts the compatibility logic from AddPadding, which allows for it to be
simplified and will make it easier to extend and eventually remove in the
future. No functional changes.
This lets us remove fields from types when they are no longer needed,
speeding up later passes.
A secondary benefit of pruning unused types means that we sometimes
remove types for which we can't generate correct C++ code. This can
allow us to CodeGen for complex types which reference these broken types
without actually requiring them (e.g. as template parameters).
Add a new feature flag "prune-type-graph" to control this pass. It makes
sense to prune most of the time, but for testing CodeGen functionality
on a wider range of types, it will be useful to have the option to not
prune.
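The pruning of unused types can be pictured as a reachability walk from the root types, keeping only what is visited. This is a simplified sketch over integer node IDs; `TypeNode` and `reachableFrom` are hypothetical and ignore the field-level pruning described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified type graph node: IDs of the types it still references.
struct TypeNode {
  std::vector<std::size_t> refs;
};

// Returns, for each node, whether it is reachable from any root.
std::vector<bool> reachableFrom(const std::vector<TypeNode>& graph,
                                const std::vector<std::size_t>& roots) {
  std::vector<bool> keep(graph.size(), false);
  std::vector<std::size_t> stack(roots.begin(), roots.end());
  while (!stack.empty()) {
    std::size_t id = stack.back();
    stack.pop_back();
    if (keep[id]) {
      continue;  // already visited
    }
    keep[id] = true;
    for (std::size_t ref : graph[id].refs) {
      stack.push_back(ref);
    }
  }
  return keep;
}

void examplePrune() {
  // Node 0 references node 1; node 2 is no longer referenced by anything
  // reachable, so it can be dropped even if it could not be generated.
  std::vector<TypeNode> graph(3);
  graph[0].refs = {1};
  graph[2].refs = {1};

  std::vector<bool> keep = reachableFrom(graph, {0});
  assert(keep[0] && keep[1]);
  assert(!keep[2]);
}
```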