Commit Graph

481 Commits

Author SHA1 Message Date
Jake Hillion
20cd48ac63 clangparser: provide correct kind for classes/unions
Previously ClangTypeParser assumed all RecordTypes were structs. This is fine
for structs and classes but completely incorrect for unions. Check which type
it is and give type graph the correct one.

Test plan:
- Unions static assert without this change because their size/alignment is
  wrong.
2023-12-20 16:15:02 +00:00
Jake Hillion
6d898bed95 tbv2: account for duplicate types when emitting name providers
ClangTypeParser has emitted a duplicate type for `std::allocatr<int8_t>`.
Rather than fixing this, add the same check the compiler will do for the
duplicate templates that `addNames` emits. That is, `template<>
NameProvider<Foo>` will collide if `Foo` is used twice. We can do this by
adding a set of these strings for now. If this shows up regularly it will
likely make sense to deduplicate the type graph with a deduplication pass.

Test plan:
- Fixes the issue in prod. This change is quite logical.
2023-12-20 16:14:48 +00:00
Jake Hillion
5a2ca8b059 tbv2: implement folly::IOBuf
folly::IOBuf does not have TreeBuilder v2 container support. Add it.

The implementation is a direct clone of v1. It still lacks tests.

Test Plan:
- It codegens on a prod type.
- No runtime testing... Bad form, I know.
- Issue created to add integration tests: https://github.com/facebookexperimental/object-introspection/issues/436
2023-12-20 16:13:50 +00:00
Jake Hillion
55989a9156 oilgen: migrate to source parsing (#421)
Summary:
oilgen: migrate to source parsing

Using debug information generated from partial source (that is, not the final
binary) has been insufficient to generally generate OIL code.

A particular example is pointers to templates:
```cpp
#include <oi/oi.h>
template <typename T>
struct Foo {
  T t;
};
template <typename T>
struct Bar {
  Foo<T>& f;
};
void foo(const Bar<int>& b) {
  oi::introspect(b);
}
```

The pointer/reference to `Foo<int>` appears in DWARF with
`DW_AT_declaration(true)` because it could be specialised before its usage.
However, with OIL, we are creating an implicit usage site in the
`oi::introspect` call that the compiler is unable to see.

This change reworks OILGen to work from a Clang command line rather than debug
information. We setup and run a compiler on the source, giving us access to an
AST and Semantic Analyser. We then:
- Find the `oi::introspect` template.
- Iterate through each of its callsites for their type.
- Run `ClangTypeParser::parse` on each type.
- Run codegen.
- Compile into an object file.

Having access to the semantic analyser allows us to forcefully complete a type,
as it would be if it was used in the initial code.


Test Plan:
hope

`buck2 run fbcode//mode/opt fbcode//object-introspection/oil/examples/compile-time:compile-time`

Reviewed By: tyroguru

Differential Revision: D51854477

Pulled By: JakeHillion
2023-12-19 13:26:25 -08:00
Jake Hillion
37b89d789d codegen: remove reliance on drgn type for top level name
Currently we rely on `SymbolService::getTypeName` for getting the hash that's
included in the generated function's name. The value of this must stay the same
to match with the value expected by OIDebugger - changing it causes failure to
relocate when attaching with OID and JIT OIL.

Calculate this name in the `codegenFromDrgn` method and pass it through where
appropriate rather than passing the `drgn_type` itself through.

We don't need to name the type like that when using AoT OIL. Let's
hash the linkage name instead as that is more unique.

Test Plan:
- CI
2023-12-19 15:35:59 +00:00
Alastair Robertson
2060a0491e CodeGen v2: Enable independent running without CodeGen v1
Create DrgnExporter to translate Type Graph "Type" nodes into drgn_type
structs, suitable for use in OICache and TreeBuilder.
2023-12-15 14:57:24 +00:00
Alastair Robertson
a0164e5cc7 TypeGraph: Make Class types use fully qualified names as their input names
This will ensure we continue to get fully qualified names in
user-visible output when we switch to CodeGen v2.
2023-12-15 14:45:01 +00:00
Alastair Robertson
688d483c0c TypeGraph: Fix handling for classes which inherit from containers
We previously moved container identification later in CodeGen in order
to preserve information for AlignmentCalc.

However, Flattener needs to know if a class is a container in order to
apply its special handling for this case.

This new approach moves container identification in front of Flattener,
but has Container own a type node, representing its layout. This
underlying type node can be used for calculating a container's
alignment in a later pass.
2023-12-14 18:02:45 +00:00
Alastair Robertson
aa87c3f2d1 NameGen: Override inputName for anonymous members 2023-12-14 17:42:48 +00:00
Alastair Robertson
c874f72ae2 OID: Make CodeGen v2 (TypeGraph) the default 2023-12-14 17:42:03 +00:00
Jake Hillion
35afd15ee4 capture_keys: store dynamic type path components more efficiently 2023-12-14 16:05:33 +00:00
952d3e75c8 ci: move formatting checks to nix 2023-12-14 15:31:07 +00:00
Alastair Robertson
8bf7dbae9f Type Graph: Replace MutationTracker with the more general ResultTracker
MutationTracker could only store Type nodes, while ResultTracker is
templated on the result type so can store anything.

Template the Visitor base class on the return type of visit() functions.

This sets us up for allowing visitors to return different results from
their visit() functions in the future.

This will be used in a future commit introducing DrgnExporter, where we
cache drgn_type* results while walking the type graph.
2023-12-14 13:43:19 +00:00
Jon Haslam
8193d271a8
Really, really make sure arrays have a name (#430) 2023-12-13 16:28:31 +00:00
Thierry Treyer
d79b55cfd2 Update integration tests 2023-12-13 11:59:21 +00:00
Thierry Treyer
33b67e6caf Update drgn to Omar's branch 2023-12-13 11:59:21 +00:00
Thierry Treyer
9047f69db4 Check bounds when processing unique_ptr 2023-12-13 11:59:21 +00:00
Alastair Robertson
7cc7aa8882 CodeGen: Remove Incomplete members from Classes
They must not appear in the final generated code as we'd end up with
invalid types with void members, e.g.:
  struct Foo {
    int a;
    void myIncompleteMember;
    int c;
  };

Removing them from the type graph early also ensures that padding is
calculated correctly.
2023-12-12 18:50:15 +00:00
Alastair Robertson
6b780add4a Integration Tests: Fix thrift_unions tests
With a recent Thrift update, we now must also define a destructor for
this type.
2023-12-12 18:33:52 +00:00
Alastair Robertson
4fdf44b92d TypeGraph: Delete unused function in IdentifyContainers pass
It was accidentally copied from the TypeIdentifier pass.
2023-11-21 14:07:08 +00:00
Jake Hillion
9e2b48d713 jitlog: use a memfd and glog
Summary:

Changes jitlog to use a memfd, an anonymous in memory file descriptor, rather
than a file on disk. Also clean up this fd at the end of an OID run rather than
leaving it in the hope it's valid next time.

A previous attempt to land this used a `char*` from the OID process space in
the remote target syscall. Somehow this works with our integration test target,
but not globally. Changed to use the previous behaviour of putting the syscall
arg in the empty text segment. In doing this I noticed that the text segment
wouldn't be initialised at this point on a fresh process, so we were copying
into effectively an uninitialised address. Move the jit log fd setup to after
the segment setup accordingly.

Test plan:
- CI
- Tested on an integration test target as before. Works.
- Created a new target that definitely doesn't have this string in (simple for
  loop). Failed before, works now.

Example:
```sh
$ OID_TEST_ARGS='-fjit-logging' stest OidIntegration.simple_struct
...
I1121 02:57:36.136890 500897 OIDebugger.cpp:269] Outputting JIT logs:
I1121 02:57:36.136896 500897 OIDebugger.cpp:272] JITLOG: SimpleStruct
@00007ffc639be180
I1121 02:57:36.136899 500897 OIDebugger.cpp:272] JITLOG: a @00007ffc639be180
I1121 02:57:36.136901 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be180
I1121 02:57:36.136904 500897 OIDebugger.cpp:272] JITLOG: b @00007ffc639be184
I1121 02:57:36.136905 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be184
I1121 02:57:36.136907 500897 OIDebugger.cpp:272] JITLOG: c @00007ffc639be188
I1121 02:57:36.136909 500897 OIDebugger.cpp:272] JITLOG: obj @00007ffc639be188
I1121 02:57:36.136911 500897 OIDebugger.cpp:278] Finished outputting JIT logs.
...
```
2023-11-21 12:00:13 +00:00
Jon Haslam
8f71efc2d0 Revert "jitlog: use a memfd and glog"
This reverts commit 0aa6ac4e74.
2023-11-20 19:40:44 +00:00
Jake Hillion
0aa6ac4e74 jitlog: use a memfd and glog 2023-11-20 17:44:44 +00:00
Jon Haslam
08311d2bf9
Make oitb actually log TB output (#415) 2023-11-20 13:24:28 +00:00
Alastair Robertson
8e8db376d8 OICodeGen: Respect ContainerInfo::requiredFeatures
CodeGen v1 and CodeGen v2 must be in sync in order for CodeGen v2 and
TreeBuilder v1 to be compatible. This change updates CodeGen v1 to use
the same set of containers as CodeGen v2.

This specifically fixes "-ftype-graph -Ftree-builder-v2 -Fcapture-thrift-isset"
for Thrift types.

The integration test "thrift_isset_no_capture" covers this, but this bug
was missed as the Thrift tests do not run in CI.
2023-11-16 18:11:14 +00:00
Jake Hillion
b117150f83 tbv2: add dynamic context passed through all functions (#410)
Summary:
tbv2: add dynamic context passed through all functions

Previously for we had some shared state between all requests, noticeably the
pointers set. This change adds a by reference value to all requests which can
hold additional mutable state. The pointers set is moved into this mutable
state for OIL, which means each concurrent request will have its own pointer
set. Doing things this way allows more features to be added in the future
without such a big code modification.

Closes https://github.com/facebookexperimental/object-introspection/issues/404

Pull Request resolved: https://github.com/facebookexperimental/object-introspection/pull/410

Test Plan: - CI

Differential Revision: D51394035

Pulled By: JakeHillion

fbshipit-source-id: 55d2ba9b5e056148a29dc821020cfc3d94e5175a
2023-11-16 08:03:32 -08:00
Alastair Robertson
352b71b02b OID: Don't run tree builder if we are dumping the data segment
We generally only dump the data segment when there is some issue running
tree builder and we want to process it off-host.
2023-11-16 15:25:09 +00:00
Jake Hillion
592e182e0f tbv2: replace DB template param with Ctx (#409)
Summary:
tbv2: replace DB template param with Ctx

TreeBuilder v2 adds a DB template parameter to every function. This is used as
part of the static type to decide what type of DataBuffer is being used:
currently `BackInserterDataBuffer<std::vector<uint8_t>>` for OIL and it would
be `DataSegmentDataBuffer` for OID.

This change replaces the `DB` template parameter with a more general `Ctx`. Due
to issues with dependent naming it also adds a `using DB` to each `TypeHandler`
which has the same function as before. This allows us to add more "static
context" (typedefs and constants) to functions without changing this signature
again, because changing the signature of everything is a massive pain.

Currently this change achieves nothing because Ctx contains only DB in a static
wrapper. In the next change I'm going to pass a reference of type Ctx around to
add a "dynamic context" to invocations which will contain the pointer array. In
future we'll then be able to add either static or dynamic context without any
signature adjustments.


Test Plan:
- CI

---
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/facebookexperimental/object-introspection/pull/409).
* https://github.com/facebookexperimental/object-introspection/issues/410
* __->__ https://github.com/facebookexperimental/object-introspection/issues/409

Reviewed By: ajor

Differential Revision: D51352092

Pulled By: JakeHillion
2023-11-15 11:52:17 -08:00
Jon Haslam
02306bf80e
Make pointer tracking array facility fail gracefully when it is full (#411) 2023-11-15 16:13:54 +00:00
Jake Hillion
f9cb0115e1 types: remove now unused handlers
Summary:

Handlers were added in an intermediate form of tbv2 but those intermediate
forms have now been removed. Remove all the handlers to make grepping/find and
replaces easier across the types.

Test Plan:

- CI
2023-11-15 14:27:07 +00:00
Jake Hillion
e9d8df0ca4 oil: add support for std::list
Summary:

Remove the now useless `handler` and adds the `traversal_func` and `processor`
entries for `std::list`. This type is a bit weird as most of our sequential
containers don't have any overhead on storing the element. I went for the same
approach we take for maps where we have a shared `[]` element covering the map
overhead and below that a `key` & `value`. As we only have a single element
under it which doesn't have a logical name I went for `*`.

Closes #315.

Test Plan:
- CI
- Copied the relevant `std::vector` tests and updated the existing one.
2023-11-15 14:14:02 +00:00
Jake Hillion
3871d92abb collapse TreeBuilderV2 features
Summary:

Currently there are two features between CodeGen v2 (TypeGraph) and TreeBuilder
v2. These are TypedDataSegment and TreeBuilderTypeChecking. Each of these
features currently has a full set of tests run in the CI and each have specific
exclusions.

Collapse these features into TreeBuilder v2. This allows for significantly
simplified testing as any OIL tests run under TreeBuilder v2 and any OID tests
run under TreeBuilder v1.

The reasoning behind this is I no longer intend to partially roll out this
feature. Full TreeBuilder v2 applies different conditions to containers than
the intermediate states, and writing these only to have them never deployed is
a waste of time.

Test Plan:
- it builds
- CI
2023-11-13 19:43:03 +00:00
Jake Hillion
25426127bb add range-v3 library
Adds the range-v3 library which supports features that otherwise wouldn't be
available until C++23 or C++26. I caught a couple of uses that suit it but this
will allow us to use more in future.

Test Plan:
- CI
2023-11-13 18:42:04 +00:00
Jake Hillion
393f8aab42 clang-format: disable bin packing
Bin packing often makes code hard to read. Disable it entirely.

Test plan:
- CI
2023-11-13 18:19:53 +00:00
Alastair Robertson
c207972af6 TypeGraph: Calculate alignment before identifying containers
Not all containers have 8-byte alignment, so if we want to avoid lots of
manual logic for calculating container alignment on a case-by-case
basis, we must calculate alignment from the member variables before the
Class nodes have been replaced by Container nodes.
2023-11-06 13:16:30 +00:00
Alastair Robertson
2c5fb5d845 TypeGraph: Stop identifying containers in DrgnParser
Leave it to the new mutator pass IdentifyContainers to replace Class
nodes with Container nodes where appropriate.

This will allow us to run passes over the type graph before identifying
containers, and therefore before we have lost information about the
internal details of the container (e.g. alignment of member variables).
2023-11-06 11:45:57 +00:00
Alastair Robertson
98008f9054 TypeGraph: Add IdentifyContainers mutator pass 2023-11-06 11:28:00 +00:00
Alastair Robertson
451678b19b TypeGraph: Create MutationTracker helper class 2023-11-06 11:28:00 +00:00
Alastair Robertson
30678b1dad TypeGraph: Create Mutator and RecursiveMutator base classes 2023-11-06 11:28:00 +00:00
Alastair Robertson
81ae075a68 DrgnParserTest: Replace testContains with testGlob
It now uses '*' as a wildcard character for more powerful matching of
type graphs.
2023-11-06 11:05:09 +00:00
Jake Hillion
315a5cde6c integration: move pid back to end 2023-11-02 18:27:27 +00:00
Jake Hillion
6e1635ce1e remove oil v1 leftovers 2023-10-30 18:18:13 +00:00
Jake Hillion
6f623e95a4 incomplete: handle drgn returning a nullptr name 2023-10-30 16:47:50 +00:00
Jake Hillion
4563cbe713 tbv2: improve equality for iterator 2023-10-25 17:05:42 +01:00
Alastair Robertson
e7581ad915 TopoSorter: Only allow certain params to be incomplete
For the containers which are allowed to be declared with incomplete
types, it is only the contained types which are allowed to be
incomplete. Other template parameters (e.g. allocators) must always be
defined before use.
2023-10-25 17:01:44 +01:00
Jake Hillion
5daed4cc72 tbv2: add a repeat instruction 2023-10-25 16:05:12 +01:00
Jake Hillion
6f3580143e tbv2: change instruction stack from deque to vector 2023-10-25 15:56:01 +01:00
Jake Hillion
cc6c7bf21f compiler: update to c++20 2023-10-25 15:55:18 +01:00
Jake Hillion
fcb50df611 oil: make AoT introspect noinline 2023-10-25 13:23:47 +01:00
Jake Hillion
67739840fe add portability.h (#386)
Summary:

Spin around our open/closed source checks. Previously we defined `OSS_ENABLE` in open source builds. This change defines `OI_META_INTERNAL` instead. This is nicer, as external users don't have to do anything special to get a working build.

Use this new macro to define a boolean constant in a new header `Portability.h`. This is inspired by Folly, and makes the internal build easier - definitions in Buck2 have to propagate up from a dependency instead of down from one. Annoyingly we can't use `if constexpr` for a lot of the previous `#ifdef` blocks as we conditionally include the headers. Longer term we could fix this by exposing a header interface but no source, allowing these to build but not be compiled in. For now I did something weird: I defined a function style macro in `Portability.h` based on the compile time macro. This forces you to have included `Portability.h` before using it to ensure the definition is everywhere. Open to feedback as I haven't seen anyone else do this.

Reviewed By: tyroguru

Differential Revision: D50000454
2023-10-24 03:03:16 -07:00