object-introspection/oi/OIGenerator.cpp

208 lines
6.4 KiB
C++
Raw Normal View History

2022-12-19 14:37:51 +00:00
/*
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
2023-04-26 16:20:53 +01:00
#include "oi/OIGenerator.h"
2022-12-19 14:37:51 +00:00
#include <glog/logging.h>
#include <boost/core/demangle.hpp>
2022-12-19 14:37:51 +00:00
#include <fstream>
#include <iostream>
2023-09-19 19:16:37 +01:00
#include <string_view>
#include <unordered_map>
2022-12-19 14:37:51 +00:00
#include <variant>
#include "oi/CodeGen.h"
#include "oi/Config.h"
2023-04-26 16:20:53 +01:00
#include "oi/DrgnUtils.h"
#include "oi/Headers.h"
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
#include "oi/type_graph/DrgnParser.h"
#include "oi/type_graph/TypeGraph.h"
2022-12-19 14:37:51 +00:00
namespace oi::detail {
2022-12-19 14:37:51 +00:00
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
using type_graph::DrgnParser;
using type_graph::DrgnParserOptions;
using type_graph::Type;
using type_graph::TypeGraph;
std::unordered_map<std::string, std::string>
OIGenerator::oilStrongToWeakSymbolsMap(drgnplusplus::program& prog) {
static constexpr std::string_view strongSymbolPrefix =
"oi::IntrospectionResult oi::introspect<";
static constexpr std::string_view weakSymbolPrefix =
"oi::IntrospectionResult oi::introspectImpl<";
std::unordered_map<std::string, std::pair<std::string, std::string>>
templateArgsToSymbolsMap;
2023-05-19 16:18:04 +01:00
auto symbols = prog.find_all_symbols();
for (drgn_symbol* sym : *symbols) {
auto symName = drgnplusplus::symbol::name(sym);
2023-09-19 19:16:37 +01:00
if (symName == nullptr || *symName == '\0')
continue;
auto demangled = boost::core::demangle(symName);
if (demangled.starts_with(strongSymbolPrefix)) {
auto& matchedSyms = templateArgsToSymbolsMap[demangled.substr(
strongSymbolPrefix.length())];
if (!matchedSyms.first.empty()) {
LOG(WARNING) << "non-unique symbols found: `" << matchedSyms.first
<< "` and `" << symName << '`';
}
matchedSyms.first = symName;
} else if (demangled.starts_with(weakSymbolPrefix)) {
auto& matchedSyms =
templateArgsToSymbolsMap[demangled.substr(weakSymbolPrefix.length())];
if (!matchedSyms.second.empty()) {
LOG(WARNING) << "non-unique symbols found: `" << matchedSyms.second
<< "` and `" << symName << "`";
}
matchedSyms.second = symName;
}
}
std::unordered_map<std::string, std::string> strongToWeakSymbols;
for (auto& [_, val] : templateArgsToSymbolsMap) {
if (val.first.empty() || val.second.empty()) {
continue;
}
strongToWeakSymbols[std::move(val.first)] = std::move(val.second);
}
return strongToWeakSymbols;
}
std::unordered_map<std::string, drgn_qualified_type>
2022-12-19 14:37:51 +00:00
OIGenerator::findOilTypesAndNames(drgnplusplus::program& prog) {
auto strongToWeakSymbols = oilStrongToWeakSymbolsMap(prog);
std::unordered_map<std::string, drgn_qualified_type> out;
2022-12-19 14:37:51 +00:00
for (drgn_qualified_type& func : drgnplusplus::func_iterator(prog)) {
std::string strongLinkageName;
2022-12-19 14:37:51 +00:00
{
2023-04-03 16:44:07 +01:00
const char* linkageNameCstr;
if (auto err = drgnplusplus::error(
drgn_type_linkage_name(func.type, &linkageNameCstr))) {
// throw err;
continue;
2022-12-19 14:37:51 +00:00
}
strongLinkageName = linkageNameCstr;
2022-12-19 14:37:51 +00:00
}
std::string weakLinkageName;
if (auto search = strongToWeakSymbols.find(strongLinkageName);
search != strongToWeakSymbols.end()) {
weakLinkageName = search->second;
} else {
continue; // not an oil strong symbol
2022-12-19 14:37:51 +00:00
}
// IntrospectionResult (*)(const T&)
CHECK(drgn_type_has_parameters(func.type)) << "functions have parameters";
CHECK(drgn_type_num_parameters(func.type) == 1)
<< "introspection func has one parameter";
2022-12-19 14:37:51 +00:00
auto* params = drgn_type_parameters(func.type);
drgn_qualified_type tType;
if (auto err =
drgnplusplus::error(drgn_parameter_type(&params[0], &tType))) {
2022-12-19 14:37:51 +00:00
throw err;
}
if (drgn_type_has_name(tType.type)) {
LOG(INFO) << "found OIL type: " << drgn_type_name(tType.type);
} else {
LOG(INFO) << "found OIL type: (no name)";
}
out.emplace(std::move(weakLinkageName), tType);
2022-12-19 14:37:51 +00:00
}
return out;
}
int OIGenerator::generate(fs::path& primaryObject, SymbolService& symbols) {
2022-12-19 14:37:51 +00:00
drgnplusplus::program prog;
{
std::array<const char*, 1> objectPaths = {{primaryObject.c_str()}};
if (auto err = drgnplusplus::error(drgn_program_load_debug_info(
prog.get(), std::data(objectPaths), std::size(objectPaths), false,
false))) {
LOG(ERROR) << "error loading debug info program: " << err;
throw err;
}
}
auto oilTypes = findOilTypesAndNames(prog);
2022-12-19 14:37:51 +00:00
std::map<Feature, bool> featuresMap = {
{Feature::TypeGraph, true},
{Feature::TypedDataSegment, true},
{Feature::TreeBuilderTypeChecking, true},
{Feature::TreeBuilderV2, true},
{Feature::Library, true},
{Feature::PackStructs, true},
{Feature::PruneTypeGraph, true},
};
2022-12-19 14:37:51 +00:00
OICodeGen::Config generatorConfig{};
OICompiler::Config compilerConfig{};
compilerConfig.usePIC = pic;
auto features = config::processConfigFiles(configFilePaths, featuresMap,
compilerConfig, generatorConfig);
if (!features) {
2022-12-19 14:37:51 +00:00
LOG(ERROR) << "failed to process config file";
return -1;
}
generatorConfig.features = *features;
compilerConfig.features = *features;
2022-12-19 14:37:51 +00:00
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
std::vector<CodeGen::DrgnRequest> reqs{};
reqs.reserve(oilTypes.size());
for (const auto& [linkageName, type] : oilTypes) {
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
reqs.emplace_back(
CodeGen::DrgnRequest{.ty = type.type, .linkageName = linkageName});
2022-12-19 14:37:51 +00:00
}
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
CodeGen codegen{generatorConfig, symbols};
std::string code;
codegen.codegenFromDrgns(reqs, code);
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
std::string sourcePath = sourceFileDumpPath;
if (sourceFileDumpPath.empty()) {
// This is the path Clang acts as if it has compiled from e.g. for debug
// information. It does not need to exist.
sourcePath = "oil_jit.cpp";
} else {
std::ofstream outputFile(sourcePath);
outputFile << code;
}
OICompiler compiler{{}, compilerConfig};
if (!compiler.compile(code, sourcePath, outputPath)) {
return -1;
}
codegenv2: improve multi argument generation (#395) Summary: Adds a new function `CodeGen::codegenFromDrgns` which runs multiple drgn root types through one `DrgnParser`. Stores the naming for the output within this function as we previously decided that when doing the actual codegen, which is too late with multiple root types. This new function is used in `OIGenerator` when you have multiple OIL calls within one input object. Rather than producing separate `.o`s with likely duplicate code for each callsite, we produce a single `.o` that contains both calls. For calls with types with significant overlap this should be a significant reduction in work. The downside is the calls can't have different features, but this could be solved in the future by namespacing the code based on features/config within the generated `.cpp` - the pros seem to outweigh the con. This should also be used with `oid` in multi probe mode as it would again significantly reduce the work it has to do. I didn't do this yet as it requires changing the cache. As I've got a big refactor cache ongoing at the minute it makes sense to wait until after that's landed to make this change. Test Plan: Generally tested with GitHub CI. Tested the new multi call OILGen with the new example seen below. Outputs the following code with two root calls: P869569859 - note that `VectorOfStrings_0` is the only instance, whereas previously we'd have generated it in two files. Differential Revision: D50835153 Pulled By: JakeHillion
2023-10-31 17:22:56 +00:00
LOG(INFO) << "object introspection generation complete, generated for "
<< oilTypes.size() << "sites.";
return 0;
2022-12-19 14:37:51 +00:00
}
} // namespace oi::detail