@blockspacer
Created March 12, 2019 16:00
cftf in docker

Clang From The Future

CFTF is a source-to-source compiler for C++ developers who want to make use of C++14 and C++17 features on platforms/toolchains that usually don't support doing so. To this end, the tool converts modern C++ language features into their equivalent "old school" versions so that your current toolchain can process it.
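As a rough illustration of what "converting to old school versions" means, the sketch below pairs a few C++14/17 constructs (shown as comments) with C++11 code a person might write by hand. The function names are made up for this example, and the lowered forms are not CFTF's actual output.

```cpp
#include <cassert>
#include <utility>

// C++14/17 input, kept as comments because a C++11 compiler rejects it:
//   auto answer() { return 5; }                       // return type deduction
//   static_assert(sizeof(long) >= sizeof(int));       // message omitted
//   auto [quot, rem] = std::make_pair(7 / 2, 7 % 2);  // structured binding

// Hand-lowered C++11 equivalents (names are illustrative only):
int answer() { return 5; }                        // deduced type spelled out
static_assert(sizeof(long) >= sizeof(int), "");   // empty message added

int quotient_plus_remainder() {
    // The pair lives in one hidden variable; each binding becomes a reference.
    auto hidden = std::make_pair(7 / 2, 7 % 2);
    int& quot = hidden.first;
    int& rem = hidden.second;
    return quot + rem;
}

// Small self-check for the lowered forms.
void check_lowered_forms() {
    assert(answer() == 5);
    assert(quotient_plus_remainder() == 4);
}
```

In every case the lowered code says the same thing with older syntax, which is why a preprocessing step can do the translation.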

CFTF is intended to be used as a preprocessor for other compilers and hence integrates transparently into your existing build system. When using CMake, this process is very easy to set up.

In theory, CFTF works with any compiler, although currently only compilers with gcc/clang-style command-line interfaces have been tested in practice. Patches for MSVC support or other platforms are very much welcome!

Why?

A lot of the features added in C++14 and C++17 are pure syntactic sugar, so it always bothered me that we have to wait for a compiler update rather than being able to "just" make our current compiler see through the abstraction. CFTF is my attempt at teaching existing compilers how new language features work.

There are a number of use cases for this:

  • Early adoption of new standards while waiting for official support from your toolchain vendor
  • Porting an existing C++14/17 code base to a toolchain that doesn't receive any vendor updates anymore
  • Enabling use of libraries implemented in C++17 such as hana or range-v3 in a codebase that uses C++11 apart from those libraries

Build and Usage Instructions

To build CFTF, install libclang 6.0 and then use

mkdir build
cd build
CC=clang-6.0 CXX=clang++-6.0 cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DLLVM_ROOT_DIR=/usr/lib/llvm-6.0 -DCMAKE_C_COMPILER=clang-6.0 -DCMAKE_CXX_COMPILER=clang++-6.0 ..
cmake --build .

Or with docker:

mkdir build
# build Dockerfile
sudo -E docker build -t cfrf-docker-gcc .
# Now let’s check if our image has been created. 
sudo -E docker images
# mounts $PWD to /home/u/cfrf and runs command
sudo -E docker run --rm -v "$PWD":/home/u/cfrf -w /home/u/cfrf/build cfrf-docker-gcc cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ..
sudo -E docker run --rm -v "$PWD":/home/u/cfrf -w /home/u/cfrf/build cfrf-docker-gcc cmake --build .

Build tests:

CXX=../../build/cftf cmake -DCMAKE_CXX_FLAGS="-frontend-compiler=/usr/bin/g++" ..
CXX=../../build/cftf cmake --build .
cat /tmp/n3638_return_type_deduction_cftf_out.cpp.cpp
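The doubled .cpp.cpp suffix in the path above comes from how the tool names its intermediate file (see GetOutputFilename in the sources further down). The following is a minimal re-creation of that scheme using only std::string. The function name output_name_for and the explicit tmp_dir parameter are assumptions made for this sketch; the tool itself asks the filesystem library for the temporary directory.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Sketch of the naming scheme: <tmp_dir>/<stem>_cftf_out<original ext>.cpp
std::string output_name_for(const std::string& input, const std::string& tmp_dir) {
    // Start of the bare file name: just past the last '/', or 0 if there is none.
    std::string::size_type name_begin = input.find_last_of('/');
    name_begin = (name_begin == std::string::npos) ? 0 : name_begin + 1;

    // Start of the extension: the last '.', or the end of the string.
    std::string::size_type ext_begin = input.find_last_of('.');
    if (ext_begin == std::string::npos) {
        ext_begin = input.size();
    }
    // A '.' inside a directory name must not count as the extension.
    ext_begin = std::max(ext_begin, name_begin);

    return tmp_dir + "/" + input.substr(name_begin, ext_begin - name_begin)
         + "_cftf_out" + input.substr(ext_begin) + ".cpp";
}

// Small self-check using the test file name from the commands above.
void check_output_name() {
    assert(output_name_for("n3638_return_type_deduction.cpp", "/tmp")
           == "/tmp/n3638_return_type_deduction_cftf_out.cpp.cpp");
}
```

Only the bare file name is kept, so two inputs with the same name in different directories map to the same output file. The TODO in the source about prefixing the name with a unique token refers to this.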

Then, to compile a CMake-based C++14/17 project with your existing toolchain compiler (e.g. g++), use:

CXX=/usr/local/bin/cftf cmake -DCMAKE_CXX_FLAGS="-frontend-compiler=/usr/bin/g++" ..

If the clang compiler executable is not installed on the system, you also need to add libclang's resource directory to CMAKE_CXX_FLAGS via -resource-dir=/usr/lib64/clang/6.0.1 (the actual path depends on your libclang installation).

Projects not using CMake currently need to resort to hacky solutions. One method is to rename your existing compiler executable and put a copy of the cftf executable in its old place. To point CFTF to the frontend compiler, set the CFTF_FRONTEND_CXX environment variable, e.g. export CFTF_FRONTEND_CXX=/usr/bin/g++.
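The two ways of naming the frontend compiler (the -frontend-compiler= flag and the CFTF_FRONTEND_CXX variable) could be combined roughly as follows. This is a sketch only: resolve_frontend_compiler is a hypothetical helper, and the precedence (flag first, environment second) is an assumption, not something the sources in this gist confirm.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: returns the frontend compiler path, or an empty
// string if neither the flag nor the environment variable is present.
std::string resolve_frontend_compiler(int argc, const char* const argv[]) {
    static const char prefix[] = "-frontend-compiler=";
    const std::size_t prefix_len = sizeof(prefix) - 1;  // exclude the trailing '\0'

    // Assumed precedence: an explicit command-line flag wins.
    for (int i = 1; i < argc; ++i) {
        if (std::strncmp(argv[i], prefix, prefix_len) == 0) {
            return argv[i] + prefix_len;
        }
    }

    // Otherwise fall back to the environment variable, if set.
    const char* env = std::getenv("CFTF_FRONTEND_CXX");
    return env ? env : "";
}

// Small self-check for the flag path.
void check_frontend_flag() {
    const char* args[] = { "cftf", "-frontend-compiler=/usr/bin/g++", "main.cpp" };
    assert(resolve_frontend_compiler(3, args) == "/usr/bin/g++");
}
```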

To make sure CFTF functions correctly, you can try out the tests/ (using their standalone CMakeLists.txt).

Current Status

CFTF is ready enough to be tried out "for fun", but it's still mostly a proof-of-concept. I encourage you to try it out if the idea sounds useful to you, but do note it's not ready for use in production currently. I'm hoping with feedback from the community this will soon change, though!

At the moment, all of the following C++ features will be converted to C++11-compatible code, with support for more features planned in the future:

  • Structured bindings
  • Constexpr if
  • Function return type deduction, e.g. auto func() { return 5; }
  • Optional static assertion messages, e.g. static_assert(sizeof(T) > 4)
  • Fold expressions (soon!)
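For constexpr if, the sources in this gist work on fully specialized functions: the condition is kept, the taken branch is kept, and the untaken branch is either dropped or replaced with an empty body. The sketch below shows that shape by hand. The describe function and the Widget type are invented for the example, and the exact text CFTF emits may differ.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// C++17 input, kept as a comment because a C++11 compiler rejects it:
//   template<typename T>
//   std::string describe(T value) {
//       std::string result;
//       if constexpr (std::is_integral<T>::value) { result = "integer"; }
//       else { result = value.name(); }   // only compiles for non-integral T
//       return result;
//   }

struct Widget {
    std::string name() const { return "widget"; }
};

// The primary template is only declared; each used specialization is written out.
template<typename T>
std::string describe(T value);

// T = int: the condition is true, so the else-branch is removed entirely.
template<>
std::string describe<int>(int) {
    std::string result;
    if (std::is_integral<int>::value) { result = "integer"; }
    return result;
}

// T = Widget: the condition is false, so the then-branch becomes an empty body.
template<>
std::string describe<Widget>(Widget value) {
    std::string result;
    if (std::is_integral<Widget>::value) {} else { result = value.name(); }
    return result;
}

// Small self-check for both specializations.
void check_describe() {
    assert(describe(42) == "integer");
    assert(describe(Widget()) == "widget");
}
```

Removing the untaken branch matters because it may not compile for the given type at all, as with value.name() for int here.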

Furthermore, CFTF can convert parameter pack expansions to C++98-compatible code.
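The fold-expression rewrite in this gist (VisitCXXFoldExpr) emits calls named like fold_expr_add_left(init, args...), which is itself a pack expansion. The helper definitions are not part of the gist, so the one below is a hypothetical stand-in that fixes the accumulator type to the type of the initial value. The function sum_3 then shows what a hand-expanded, non-variadic version for exactly three int arguments could look like; its name and the parameter names ts1 to ts3 are illustrative.

```cpp
#include <cassert>

// C++17 input (comment only):
//   template<typename... Ts> int sum(Ts... ts) { return (0 + ... + ts); }

// Hypothetical C++11 helper. Left fold: ((init + a) + b) + ...
template<typename T>
T fold_expr_add_left(T init) { return init; }

template<typename T, typename U, typename... Rest>
T fold_expr_add_left(T init, U next, Rest... rest) {
    // Assumption of this sketch: every partial sum converts back to T.
    return fold_expr_add_left<T>(static_cast<T>(init + next), rest...);
}

// Step 1: the fold expression becomes a call that uses a pack expansion (C++11).
template<typename... Ts>
int sum(Ts... ts) { return fold_expr_add_left(0, ts...); }

// Step 2: for one concrete call such as sum(1, 2, 3), the pack is expanded into
// separately named parameters, leaving code a C++98 compiler accepts.
int sum_3(int ts1, int ts2, int ts3) { return ((0 + ts1) + ts2) + ts3; }

// Small self-check: both forms agree.
void check_sum() {
    assert(sum(1, 2, 3) == 6);
    assert(sum_3(1, 2, 3) == 6);
}
```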

Future

The current feature list is small compared to the total list of C++14 and C++17 changes. The set of supported features is intentionally kept small for now until support for them works robustly and correctly in all weird corner cases that might arise.

That said, once things are rock-stable, support for new features will be added, and I will also explore ways of supporting pre-C++11 targets.

#ifndef CFTF_AST_VISITOR_HPP
#define CFTF_AST_VISITOR_HPP
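#include <clang/AST/RecursiveASTVisitor.h>
#include <clang/Rewrite/Core/Rewriter.h>
// Standard headers for the containers and utilities used in the declarations below
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>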
namespace cftf {
// TODO: This should be moved elsewhere
namespace features {
inline bool constexpr_if = true;
}
class RewriterBase;
class ASTVisitor : public clang::RecursiveASTVisitor<ASTVisitor> {
using Parent = clang::RecursiveASTVisitor<ASTVisitor>;
public:
ASTVisitor(clang::ASTContext& context, clang::Rewriter& rewriter_);
bool VisitDeclStmt(clang::DeclStmt* stmt);
bool VisitVarDecl(clang::VarDecl* decl);
bool VisitSizeOfPackExpr(clang::SizeOfPackExpr* expr);
bool TraversePackExpansionExpr(clang::PackExpansionExpr* expr);
bool VisitPackExpansionExpr(clang::PackExpansionExpr* expr);
bool VisitDeclRefExpr(clang::DeclRefExpr* expr);
bool VisitCXXFoldExpr(clang::CXXFoldExpr* expr);
bool TraverseCXXFoldExpr(clang::CXXFoldExpr* expr);
// Used to explicitly specialize function templates which are otherwise specialized implicitly
bool TraverseFunctionTemplateDecl(clang::FunctionTemplateDecl* decl);
bool VisitFunctionDecl(clang::FunctionDecl* decl);
bool VisitIfStmt(clang::IfStmt* stmt);
bool VisitDecompositionDecl(clang::DecompositionDecl* decl);
bool VisitStaticAssertDecl(clang::StaticAssertDecl* decl);
bool shouldTraversePostOrder() const;
bool shouldVisitTemplateInstantiations() const { return true; }
struct FunctionTemplateInfo {
// The DeclRefExpr are parameter packs referenced in the pack expansion
struct ParamPackExpansionInfo {
clang::PackExpansionExpr* expr;
std::vector<clang::DeclRefExpr*> referenced_packs;
};
std::vector<ParamPackExpansionInfo> param_pack_expansions;
bool in_param_pack_expansion = false;
// List of all Decls in this template. Used for reverse-lookup in
// implicit specializations to find templated Decls from specialized
// ones.
std::vector<clang::Decl*> decls;
// Assumes there is only a single one of these.
// Don't use this for ParmVarDecls, since they might be parameter packs
// (i.e. multiple specialized Decls per single templated Decl)!
clang::Decl* FindTemplatedDecl(clang::SourceManager& sm, clang::Decl* specialized) const;
};
struct CurrentFunctionInfo {
clang::FunctionDecl* decl;
FunctionTemplateInfo* template_info;
struct Parameter {
// decl in the primary function template
clang::ParmVarDecl* templated;
// Decl and unique name for parameter(s) in a CFTF-specialized
// function template. Note that there may be multiple of these,
// since multiple arguments may be passed for a single variadic
// template parameter
struct SpecializedParameter {
clang::ParmVarDecl* decl;
// Unique name generated for this argument: Generally, we just
// copy over the templated parameter name into this field,
// however for variadic parameters this would cause all
// arguments generated from a parameter pack to be assigned the
// same name.
std::string unique_name;
};
// decl in the specialized function.
// If "templated" is a parameter pack, there may be multiple decls here (or none)
// NOTE: This is populated only if the parameter is named!
std::vector<SpecializedParameter> specialized;
};
std::vector<Parameter> parameters;
clang::ParmVarDecl* FindTemplatedParamDecl(clang::ParmVarDecl* specialized) const;
const std::vector<Parameter::SpecializedParameter>& FindSpecializedParamDecls(clang::ParmVarDecl* templated) const;
};
private:
// Gets the SourceLocation just past the end of the token at the given location
clang::SourceLocation getLocForEndOfToken(clang::SourceLocation end);
// Gets the string of the contents enclosed by the two SourceLocations extended to the end of the last token
std::string GetClosedStringFor(clang::SourceLocation begin, clang::SourceLocation end);
clang::ASTContext& context;
std::unique_ptr<RewriterBase> rewriter;
bool IsInFullySpecializedFunction() const {
return current_function.has_value();
}
std::optional<CurrentFunctionInfo> current_function;
// Only valid while traversing a template function.
// In particular, not valid in implicit specializations
FunctionTemplateInfo* current_function_template = nullptr;
std::unordered_map<clang::FunctionDecl*, FunctionTemplateInfo> function_templates;
};
} // namespace cftf
#endif // CFTF_AST_VISITOR_HPP

cftf in docker

  • dockerfile supports proxy
  • filesystem -> experimental/filesystem
  • updated readme
  • updated cmakelists include dirs

# 3.10 is the version I'm using, but if you're on a lower version feel free to comment out this line and report back results...
cmake_minimum_required(VERSION 3.10)
project(cftf)
# set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -stdlib=libc++ -lc++abi")
find_package(Clang REQUIRED CONFIG HINTS /usr/bin /usr/lib64/cmake/clang)
find_package(LLVM REQUIRED CONFIG HINTS /usr/lib/llvm-6.0 /usr/lib64/cmake/llvm)
add_executable(cftf constexpr_if.cpp main.cpp rewriter.cpp structured_bindings.cpp template_specializer.cpp)
target_link_libraries(cftf LLVM clangAST clangBasic clangFrontend clangLex clangTooling clangRewrite stdc++fs)
target_include_directories(cftf PUBLIC ${LLVM_INCLUDE_DIRS})
set_property(TARGET cftf PROPERTY CXX_STANDARD 17)
set_property(TARGET cftf PROPERTY CXX_STANDARD_REQUIRED ON)
install(TARGETS cftf RUNTIME DESTINATION "${CMAKE_INSTALL_PREFIX}/bin")
#include "ast_visitor.hpp"
#include "rewriter.hpp"
#include <cassert>
#include <iostream>
namespace cftf {
/**
* Traverses the templated version of a function and tries to find a match for the given if statement.
* This is used for "constexpr if", where clang helpfully strips away untaken branches in specialized
* functions. Looking up the original statement allows us to get the SourceLocation information of
* those branches we need to do our transformations reliably.
*/
class FinderForOriginalIfStmt : public clang::RecursiveASTVisitor<FinderForOriginalIfStmt> {
public:
FinderForOriginalIfStmt(clang::FunctionDecl* func, clang::IfStmt* stmt) : specialized_stmt(stmt) {
// Make sure "func" is indeed a specialization
assert (func->getPrimaryTemplate() != nullptr);
{
TraverseFunctionDecl(func->getPrimaryTemplate()->getTemplatedDecl());
}
}
operator clang::IfStmt*() const {
return match;
}
bool VisitIfStmt(clang::IfStmt* stmt) {
if (specialized_stmt->getLocStart() == stmt->getLocStart()) {
match = stmt;
// Abort traversal for this check
return false;
}
return true;
}
bool needs_specialization = false;
clang::IfStmt* specialized_stmt;
clang::IfStmt* match = nullptr;
};
bool ASTVisitor::VisitIfStmt(clang::IfStmt* stmt) {
if (features::constexpr_if && stmt->isConstexpr()) {
// TODO: Only use this if VisitFunctionDecl has ensured we can modify the current function!
}
if (!IsInFullySpecializedFunction()) {
return true;
}
clang::Expr* cond = stmt->getCond();
assert(cond);
bool result;
bool eval_succeeded = cond->EvaluateAsBooleanCondition(result, context);
// TODO: For newer clang:
// clang::EvalResult result;
// bool eval_succeeded = cond->EvaluateAsConstantExpr(result, clang::Expr::EvaluateForCodeGen, context);
if (!eval_succeeded) {
// This shouldn't have compiled in the first place: "if constexpr" requires a constant expression
std::cerr << "Couldn't evaluate constexpr if condition!" << std::endl;
cond->dump();
// TODO: This actually does happen currently when we run this in a function template. Just silently ignore it for now hence! (This is covered by our tests)
//assert(false);
return true;
}
clang::Stmt* branch = result ? stmt->getThen() : stmt->getElse();
assert(current_function);
if (current_function->decl->getPrimaryTemplate()) {
// In function template specializations, clang overly helpfully strips out untaken else-branches right away...
// While that could have been very convenient, it breaks our design since these branches will still be in the
// StagingRewriter, and now we don't get the SourceLocations of what needs to be removed.
// Hence, we need to traverse the entire function we're in and find the "if constexpr" corresponding to the one in the
// specialized function. We detect this correspondence based on stmt->getLocStart(), which hopefully shouldn't
// have changed.
// TODO: This probably breaks down for manually specialized functions
clang::IfStmt* original_statement = FinderForOriginalIfStmt(current_function->decl, stmt);
assert(original_statement);
branch = result ? original_statement->getThen() : original_statement->getElse();
stmt = original_statement;
}
if (branch) {
// Remove all parts of the statement that we statically know aren't needed.
//
// We keep:
// * The conditional (including everything enclosed by the parentheses following "if constexpr")
// * The body of the branch that succeeded (including curly braces {}, if any)
//
// We throw away:
// * "if constexpr", "else", "else if", and the parentheses surrounding their conditions
// (required "if"/"else" keywords might be re-added manually later)
// * Bodies of branches that are not taken (replaced with an empty body {})
//
// Instead of replacing the entire IfStmt with only what's needed, we smartly remove all unneeded parts individually.
// This ensures the associated SourceLocations stay valid and hence rewrite rules in
// nested nodes apply properly.
//
// NOTE: Naively we might even go as far as removing the conditional entirely;
// however, we do need to carry around any variables defined in the condition
// since they are valid even in else-branches.
// Initializing expression comes first, then the condition variable declaration, then a plain condition.
// Only getCond() is guaranteed to return a non-null value.
// Test for and assign in the appropriate order.
clang::SourceLocation cond_first = stmt->getCond()->getLocStart();
if (stmt->getInit()) {
cond_first = stmt->getInit()->getLocStart();
} else if (stmt->getConditionVariableDeclStmt()) {
cond_first = stmt->getConditionVariableDeclStmt()->getLocStart();
}
// Don't need to check for initializing statement here since that precedes the condition anyway
clang::SourceLocation cond_last = stmt->getCond()->getLocEnd();
if (stmt->getConditionVariableDeclStmt()) {
cond_last = stmt->getConditionVariableDeclStmt()->getLocEnd();
}
rewriter->ReplaceTextExcludingEndToken({ stmt->getLocStart(), cond_first }, "if (");
rewriter->RemoveTextExcludingEndToken({ getLocForEndOfToken(cond_last), branch->getLocStart() });
if (result) {
rewriter->InsertTextAfter(cond_last, ") ");
} else {
// Add an empty branch body
rewriter->InsertTextAfter(cond_last, ") {} else\n");
}
rewriter->RemoveTextIncludingEndToken({ getLocForEndOfToken(branch->getLocEnd()), stmt->getLocEnd() });
} else {
// Condition was false and no else-branch has been given, so just remove the entire statement
rewriter->RemoveTextIncludingEndToken({ stmt->getLocStart(), stmt->getLocEnd() });
}
return true;
}
} // namespace cftf
FROM ubuntu:18.04
# MAINTAINER alex
# Give docker the rights to access X-server
# xhost +local:docker
# Run a terminal in container
# sudo docker run -it --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix my-docker-gcc
# an example of how to build (with Makefile generated from cmake) inside the container
# sudo docker run --rm -v "$PWD":/home/u/NetCapStatistics -w /home/u/NetCapStatistics/build alext234/my-docker-gcc cmake -DCMAKE_CXX_COMPILER=g++ -DCMAKE_BUILD_TYPE=Release ..
# sudo docker run --rm -v "$PWD":/home/u/NetCapStatistics -w /home/u/NetCapStatistics/build alext234/my-docker-gcc make
# https://askubuntu.com/a/1013396
# RUN export DEBIAN_FRONTEND=noninteractive
# Set it via ARG as this only is available during build:
ARG DEBIAN_FRONTEND=noninteractive
ENV LC_ALL C.UTF-8
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
#ENV TERM screen
ENV PATH=/usr/lib/clang/6.0/include:/usr/lib/llvm-6.0/include/:$PATH
ARG APT="apt-get -qq --no-install-recommends"
# https://www.peterbe.com/plog/set-ex
RUN set -ex
# Turn off SSL verification on the whole system (very bad).
# node
RUN echo 'NODE_TLS_REJECT_UNAUTHORIZED=0' >> ~/.bashrc
# npm
RUN echo "strict-ssl=false" >> ~/.npmrc
# Append (>>) rather than overwrite (>), so the strict-ssl line above is kept
RUN echo "registry=http://registry.npmjs.org/" >> ~/.npmrc
# ruby
RUN echo ':ssl_verify_mode: 0' >> ~/.gemrc
# yum
RUN echo "sslverify=false" >> /etc/yum.conf
RUN echo "sslverify=false" >> ~/.yum.conf
# apt
RUN echo "Acquire::http::Verify-Peer \"false\";" >> /etc/apt.conf
RUN echo "Acquire::https::Verify-Peer \"false\";" >> /etc/apt.conf
RUN echo "Acquire::http::Verify-Peer \"false\";" >> ~/.apt.conf
RUN echo "Acquire::https::Verify-Peer \"false\";" >> ~/.apt.conf
RUN echo "Acquire::http::Verify-Peer \"false\";" >> /etc/apt/apt.conf.d/00proxy
RUN echo "Acquire::https::Verify-Peer \"false\";" >> /etc/apt/apt.conf.d/00proxy
# wget
RUN echo "check-certificate = off" >> /etc/.wgetrc
RUN echo "check-certificate = off" >> ~/.wgetrc
# curl
RUN echo "insecure" >> /etc/.curlrc
RUN echo "insecure" >> ~/.curlrc
RUN $APT update
RUN $APT install -y --reinstall software-properties-common
RUN $APT install -y gnupg2 wget
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key --no-check-certificate | apt-key add -
# NOTE: need to set at least empty http-proxy
RUN apt-key adv --keyserver-options http-proxy=$http_proxy --keyserver keyserver.ubuntu.com --recv-keys 1E9377A2BA9EF27F
RUN apt-key adv --keyserver-options http-proxy=$http_proxy --keyserver keyserver.ubuntu.com --recv-keys 94558F59
RUN apt-key adv --keyserver-options http-proxy=$http_proxy --keyserver keyserver.ubuntu.com --recv-keys 2EA8F35793D8809A
RUN apt-add-repository "deb http://ppa.launchpad.net/ubuntu-toolchain-r/test/ubuntu $(lsb_release -sc) main"
RUN apt-add-repository -y "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-5.0 main"
RUN apt-add-repository -y "deb http://apt.llvm.org/xenial/ llvm-toolchain-xenial-6.0 main"
RUN apt-add-repository -y "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-7 main"
RUN apt-add-repository -y "deb http://apt.llvm.org/bionic/ llvm-toolchain-bionic-8 main"
# update and install dependencies
RUN $APT update
RUN $APT install -y \
ca-certificates \
software-properties-common \
git \
wget \
locales
RUN $APT update
RUN $APT install -y \
make \
git \
curl \
vim \
vim-gnome
RUN $APT install -y cmake
RUN $APT install -y \
build-essential \
clang-6.0 python-lldb-6.0 lldb-6.0 lld-6.0 llvm-6.0-dev \
clang-tools-6.0 libclang-common-6.0-dev libclang-6.0-dev \
libc++abi-dev libc++-dev libclang-common-6.0-dev libclang1-6.0 libclang-6.0-dev
#include "ast_visitor.hpp"
#include "rewriter.hpp"
#include <clang/Lex/Lexer.h>
#include <clang/Frontend/FrontendAction.h>
#include <clang/Frontend/ASTConsumers.h>
#include <clang/Frontend/CompilerInstance.h>
#include <clang/Tooling/Tooling.h>
#include <clang/Rewrite/Core/Rewriter.h>
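#include <llvm/Support/CommandLine.h>
#include <llvm/Support/raw_os_ostream.h>
//#include <filesystem>
#include <experimental/filesystem>
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <map>
#include <memory>
#include <numeric>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>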
namespace fs = std::experimental::filesystem;
#ifdef __linux__
namespace llvm {
/**
* http://lists.llvm.org/pipermail/llvm-dev/2017-January/109621.html
* We can't rebuild llvm, but we can define symbol missed in llvm build.
*/
//int DisableABIBreakingChecks = 1;
}
#endif
namespace ct = clang::tooling;
namespace cftf {
static llvm::cl::OptionCategory tool_category("tool options");
ASTVisitor::ASTVisitor(clang::ASTContext& context, clang::Rewriter& rewriter_)
: context(context), rewriter(std::unique_ptr<RewriterBase>(new Rewriter(rewriter_))) {
}
bool ASTVisitor::VisitCXXFoldExpr(clang::CXXFoldExpr* expr) {
std::cerr << "Visiting CXX fold expression" << std::endl;
std::cerr << " " << std::flush;
auto* pattern = expr->getPattern();
pattern->dumpColor();
std::cerr << std::endl;
auto& sm = rewriter->getSourceMgr();
const auto pattern_base_str = GetClosedStringFor(pattern->getLocStart(), pattern->getLocEnd());
// TODO: Support operators: + - * / % ^ & | << >>, all of these with an = at the end; ==, !=, <, >, <=, >=, &&, ||, ",", .*, ->*
using namespace std::literals::string_view_literals;
std::map<clang::BinaryOperatorKind, std::string_view> operators;
operators[clang::BO_Add] = "add"sv;
operators[clang::BO_Sub] = "sub"sv;
operators[clang::BO_Mul] = "mul"sv;
operators[clang::BO_Div] = "div"sv;
operators[clang::BO_Rem] = "mod"sv;
operators[clang::BO_Xor] = "xor"sv;
operators[clang::BO_And] = "and"sv;
operators[clang::BO_Or] = "or"sv;
operators[clang::BO_Shl] = "shl"sv;
operators[clang::BO_Shr] = "shr"sv;
operators[clang::BO_AddAssign] = "add_assign"sv;
operators[clang::BO_SubAssign] = "sub_assign"sv;
operators[clang::BO_MulAssign] = "mul_assign"sv;
operators[clang::BO_DivAssign] = "div_assign"sv;
operators[clang::BO_RemAssign] = "mod_assign"sv;
operators[clang::BO_XorAssign] = "xor_assign"sv;
operators[clang::BO_AndAssign] = "and_assign"sv;
operators[clang::BO_OrAssign] = "or_assign"sv;
operators[clang::BO_ShlAssign] = "shl_assign"sv;
operators[clang::BO_ShrAssign] = "shr_assign"sv;
operators[clang::BO_Assign] = "assign"sv;
operators[clang::BO_EQ] = "equals"sv;
operators[clang::BO_NE] = "notequals"sv;
operators[clang::BO_LT] = "less"sv;
operators[clang::BO_GT] = "greater"sv;
operators[clang::BO_LE] = "lessequals"sv;
operators[clang::BO_GE] = "greaterequals"sv;
operators[clang::BO_LAnd] = "land"sv;
operators[clang::BO_LOr] = "lor"sv;
operators[clang::BO_Comma] = "comma"sv;
auto fold_op = expr->getOperator();
if (fold_op == clang::BO_PtrMemD || fold_op == clang::BO_PtrMemI) {
// TODO: These might just work, actually...
throw std::runtime_error("Fold expressions on member access operators not supported, yet!");
}
auto init_value_str = expr->getInit() ? GetClosedStringFor(expr->getInit()->getLocStart(), expr->getInit()->getLocEnd()) : "";
// TODO: What value category should we use for the arguments?
// Currently, assignment operators take lvalue-refs, and anything else copies by value
auto pattern_str = std::string("fold_expr_").append(operators.at(fold_op));
if (expr->isLeftFold()) {
pattern_str += "_left(";
if (expr->getInit()) {
pattern_str += init_value_str + ", ";
}
} else {
pattern_str += "_right(";
}
pattern_str += pattern_base_str + "...";
if (expr->isRightFold() && expr->getInit()) {
pattern_str += ", " + init_value_str;
}
pattern_str += ")";
std::cerr << " Pattern: \"" << pattern_str << '"' << std::endl;
rewriter->ReplaceTextIncludingEndToken({expr->getLocStart(), expr->getLocEnd()}, pattern_str);
return true;
}
bool ASTVisitor::TraverseCXXFoldExpr(clang::CXXFoldExpr* expr) {
// We currently can't perform any nested replacements within a fold expression.
// Hence, visit this node but none of its children, and instead process those in the next pass
std::cerr << "Traversing fold expression: " << GetClosedStringFor(expr->getLocStart(), expr->getLocEnd()) << std::endl;
Parent::WalkUpFromCXXFoldExpr(expr);
return true;
}
bool ASTVisitor::VisitStaticAssertDecl(clang::StaticAssertDecl* decl) {
if (decl->getMessage() == nullptr) {
// Add empty assertion message
auto assert_cond = GetClosedStringFor(decl->getAssertExpr()->getLocStart(), decl->getAssertExpr()->getLocEnd());
auto& sm = rewriter->getSourceMgr();
auto new_assert = std::string("static_assert(") + assert_cond + ", \"\")";
rewriter->ReplaceTextIncludingEndToken({decl->getLocStart(), decl->getLocEnd()}, new_assert);
}
return true;
}
bool ASTVisitor::shouldTraversePostOrder() const {
// ACTUALLY, visit top-nodes first; that way, we can withhold further transformations in its child nodes if necessary
return false;
// Visit leaf-nodes first (so we transform the innermost expressions first)
//return true;
}
clang::SourceLocation ASTVisitor::getLocForEndOfToken(clang::SourceLocation end) {
return clang::Lexer::getLocForEndOfToken(end, 0, rewriter->getSourceMgr(), {});
}
std::string ASTVisitor::GetClosedStringFor(clang::SourceLocation begin, clang::SourceLocation end) {
auto& sm = rewriter->getSourceMgr();
auto begin_data = sm.getCharacterData(begin);
auto end_data = sm.getCharacterData(getLocForEndOfToken(end));
return std::string(begin_data, end_data - begin_data);
}
class ASTConsumer : public clang::ASTConsumer {
public:
ASTConsumer(clang::Rewriter& rewriter) : rewriter(rewriter) {}
void HandleTranslationUnit(clang::ASTContext& context) override {
ASTVisitor{context, rewriter}.TraverseDecl(context.getTranslationUnitDecl());
}
private:
clang::Rewriter& rewriter;
};
static std::string GetOutputFilename(llvm::StringRef input_filename) {
auto slash_pos = input_filename.find_last_of('/');
if (slash_pos == std::string::npos) {
slash_pos = 0;
} else {
++slash_pos;
}
// Insert "_cftf_out" before the file extension (or at the end of the filename if there is no file extension),
// and then add ".cpp"
auto period_pos = input_filename.find_last_of('.');
if (period_pos == std::string::npos) {
period_pos = input_filename.size();
}
period_pos = std::max(period_pos, slash_pos);
// TODO: Prefix the output filename with some unique token for the current compilation flags and working directory
std::string output = fs::temp_directory_path() / std::string(input_filename.begin() + slash_pos, input_filename.begin() + period_pos);
output += "_cftf_out";
std::copy(input_filename.begin() + period_pos, input_filename.end(), std::back_inserter(output));
output += ".cpp";
return output;
}
ct::CompilationDatabase* global_compilation_database = nullptr;
class FrontendAction : public clang::ASTFrontendAction {
public:
FrontendAction() {}
void EndSourceFileAction() override {
clang::SourceManager& sm = rewriter.getSourceMgr();
// TODO: Handle stdin
auto filename = sm.getFileEntryForID(sm.getMainFileID())->getName().data();
auto commands = global_compilation_database->getCompileCommands(filename);
assert(commands.size() == 1);
auto out_filename = GetOutputFilename(commands[0].Filename);
std::ofstream output(out_filename);
llvm::raw_os_ostream llvm_output_stream(output);
rewriter.getEditBuffer(sm.getMainFileID()).write(llvm_output_stream);
}
std::unique_ptr<clang::ASTConsumer> CreateASTConsumer(clang::CompilerInstance& ci, clang::StringRef file) override {
rewriter.setSourceMgr(ci.getSourceManager(), ci.getLangOpts());
return llvm::make_unique<ASTConsumer>(rewriter);
}
private:
clang::Rewriter rewriter;
};
} // namespace cftf
struct ParsedCommandLine {
// Indexes into argv, referring to detected input filenames
std::vector<size_t> input_filename_arg_indices;
// Indexes into argv, referring to command line arguments that need to be forwarded to the internal libtooling pass
std::vector<size_t> input_indexes;
std::string frontend_compiler;
};
ParsedCommandLine ParseCommandLine(size_t argc, const char* argv[]) {
using namespace std::string_literals;
// TODO: Strip CFTF-specific options from argument list
// TODO: Add options specific to the CFTF pass to argument list
// Indexes into argv, referring to detected input filenames (including the executable name)
std::vector<size_t> input_filename_arg_indices;
// Indexes into known command line arguments
std::vector<size_t> input_indexes;
std::string frontend_compiler;
for (size_t arg_idx = 1; arg_idx < argc; ++arg_idx) {
auto arg = argv[arg_idx];
auto is_cpp_file = [](const char* str) -> bool {
// TODO: This is a very incomplete heuristic:
// * Not everybody uses .cpp/.cxx for C++ files
// * The source language can be overridden by user flags
auto len = std::strlen(str);
if ((len > 3) && (0 == std::strcmp(str + len - 4, ".cpp"))) {
return true;
}
if ((len > 3) && (0 == std::strcmp(str + len - 4, ".cxx"))) {
return true;
}
return false;
};
if (std::strcmp(arg, "-") == 0) {
// stdin. TODO.
std::cerr << "Using stdin as input not handled yet" << std::endl;
std::exit(1);
} else if (arg[0] == '-') {
// This argument is some sort of flag.
if (std::strcmp(arg, "-c") == 0) {
// Compile only (no linking)
input_indexes.push_back(arg_idx);
} else if (std::strcmp(arg, "-o") == 0) {
// Output filename
// Needed to generate a suitable filename for the generated, intermediate C++ code
input_indexes.push_back(arg_idx++);
input_indexes.push_back(arg_idx);
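} else if (arg[1] == 'D' || arg[1] == 'I' || std::strcmp(arg, "-isystem") == 0) {
// Preprocessor define or include path
input_indexes.push_back(arg_idx);
// A bare "-D"/"-I" (nothing attached) or "-isystem" takes its value from the next argument
if (arg[2] == '\0' || std::strcmp(arg, "-isystem") == 0) {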
if (arg_idx + 1 == argc) {
std::cerr << "Invalid input: Expected symbol after \"-D\" or \"-isystem\", got end of command line" << std::endl;
std::exit(1);
}
// Include the actual definition, too.
// Note that if a value is provided, it's provided via "=VALUE", i.e. it cannot be space-separated.
input_indexes.push_back(++arg_idx);
}
} else if (std::strncmp(arg, "-std=", std::strlen("-std=")) == 0) {
// Set C++ language standard version
input_indexes.push_back(arg_idx);
} else if (std::strncmp(arg, "-frontend-compiler=", std::strlen("-frontend-compiler=")) == 0) {
// CFTF-internal option
frontend_compiler = arg + std::strlen("-frontend-compiler=");
} else if (false) {
// TODO: -i, -isystem, -iquote, -idirafter
// TODO: -stdlib?
} else {
std::cerr << "Ignoring command line option \"" << arg << "\"" << std::endl;
}
} else if (is_cpp_file(arg)) {
// TODO: Does this catch all inputs? Is there a way to specify inputs via non-positional command line arguments?
std::cerr << "Detected input cpp file \"" << arg << "\"" << std::endl;
input_filename_arg_indices.push_back(arg_idx);
}
}
return { std::move(input_filename_arg_indices), std::move(input_indexes), std::move(frontend_compiler) };
}
struct InternalCommandLine {
std::vector<const char*> args;
std::string internal_storage;
};
InternalCommandLine BuildInternalCommandLine(const ParsedCommandLine& parsed_cmdline, const char* argv[]) {
using namespace std::string_literals;
auto&& [ input_filename_arg_indices, input_indexes, ignored ] = parsed_cmdline;
// Build a restricted command line that only includes all input files
InternalCommandLine internal_command_line;
std::vector<const char*>& internal_argv = internal_command_line.args;
internal_command_line.internal_storage.reserve(1000); // TODO: HACK!!
std::transform(input_indexes.begin(), input_indexes.end(), std::back_inserter(internal_argv),
[&argv, &storage=internal_command_line.internal_storage](size_t idx) {
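// Points at the position where this argument is about to be appended.
// Stays valid only as long as the reserve() above prevents reallocation.
const char* ptr = storage.data() + storage.size();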
storage += argv[idx];
storage += '\0';
return ptr;
});
std::transform(input_filename_arg_indices.begin(), input_filename_arg_indices.end(), std::back_inserter(internal_argv), [argv](size_t idx) { return argv[idx]; });
return internal_command_line;
}
static std::string GetClangResourceDirectory() {
char resource_dir[PATH_MAX];
std::FILE* pipe = popen("clang -print-resource-dir", "r");
if (!pipe) {
std::cerr << "popen failed, falling back to user-specified resource directory for libclang" << std::endl;
return "";
}
auto* resource_line = std::fgets(resource_dir, sizeof(resource_dir), pipe);
// Streams opened with popen must be closed with pclose, not fclose
pclose(pipe);
if (!resource_line) {
std::cerr << "Error: Clang couldn't find its own resource-directory?" << std::endl;
return "";
}
// Strip new-line: Clang should always print this in normal operation
std::string ret = resource_line;
assert(ret.back() == '\n');
ret.pop_back();
return ret;
}
class CompilationDatabase : public ct::CompilationDatabase {
std::vector<std::string> infiles;
std::vector<ct::CompileCommand> commands;
public:
CompilationDatabase(const ParsedCommandLine& parsed_cmdline, const InternalCommandLine& internal_cmdline, const char* argv[]) {
assert(parsed_cmdline.input_filename_arg_indices.size() <= 1);
if (!parsed_cmdline.input_filename_arg_indices.empty()) {
std::transform(parsed_cmdline.input_filename_arg_indices.begin(), parsed_cmdline.input_filename_arg_indices.end(), std::back_inserter(infiles),
[argv](size_t index) {
return argv[index];
});
ct::CompileCommand cmd;
cmd.Directory = "."; // Current working directory
cmd.Filename = argv[parsed_cmdline.input_filename_arg_indices[0]];
// The first argument is always skipped over, since it's just the executable name. We add it here so libtooling doesn't skip over data that's actually important
cmd.CommandLine.push_back("cftf");
std::copy(internal_cmdline.args.begin(), internal_cmdline.args.end(), std::back_inserter(cmd.CommandLine));
// Override the resource-directory, which defaults to a path
// relative to the current working directory. This is used to
// locate standard library headers though, so we really want to
// use the resource directory of the actual toolchain instead
// TODO: Only specify this when not already provided by the user
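// If the lookup failed, leave the argument out so that a user-specified directory is used
auto resource_dir = GetClangResourceDirectory();
if (!resource_dir.empty()) {
cmd.CommandLine.push_back("-resource-dir=" + resource_dir);
}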
if (cmd.Filename == "-") {
std::cerr << "stdin is not supported yet" << std::endl;
std::exit(1);
} else {
cmd.Output = cftf::GetOutputFilename(cmd.Filename);
}
commands.emplace_back(cmd);
}
}
std::vector<ct::CompileCommand> getCompileCommands(llvm::StringRef) const override {
// TODO: Take the given path into consideration
return commands;
}
std::vector<std::string> getAllFiles() const override {
return infiles;
}
};
int main(int argc, const char* argv[]){
auto parsed_cmdline = ParseCommandLine(static_cast<size_t>(argc), argv);
auto internal_argv = BuildInternalCommandLine(parsed_cmdline, argv);
CompilationDatabase compilation_database(parsed_cmdline, internal_argv, argv);
cftf::global_compilation_database = &compilation_database;
// Run FrontendAction on each input file
for (auto& file : compilation_database.getAllFiles()) {
std::cerr << "Processing file " << file << std::endl;
for (auto& cmd : compilation_database.getCompileCommands(file)) {
std::cerr << " Directory: " << cmd.Directory << std::endl;
std::cerr << " Command: ";
for (auto& cmd2 : cmd.CommandLine) {
std::cerr << cmd2 << " ";
}
std::cerr << std::endl;
std::cerr << " Output: " << cmd.Output << std::endl;
}
std::cerr << std::endl;
ct::ClangTool tool(compilation_database, file);
// TODO: Use a custom DiagnosticsConsumer to silence the redundant warning output
int result = tool.run(ct::newFrontendActionFactory<cftf::FrontendAction>().get());
if (result != 0) {
std::exit(1);
}
}
const char* frontend_command = !parsed_cmdline.frontend_compiler.empty() ? parsed_cmdline.frontend_compiler.c_str() : std::getenv("CFTF_FRONTEND_CXX");
if (!frontend_command || frontend_command[0] == 0) {
std::cerr << "Error: -frontend-compiler not set, nor was CFTF_FRONTEND_CXX set" << std::endl;
std::exit(1);
}
// Replace original input filenames with the corresponding cftf output
std::string modified_cmdline = frontend_command;
for (size_t arg_idx = 1; arg_idx < static_cast<size_t>(argc); ++arg_idx) {
using namespace std::literals::string_literals;
auto arg = argv[arg_idx];
if (parsed_cmdline.input_filename_arg_indices.end() != std::find(parsed_cmdline.input_filename_arg_indices.begin(), parsed_cmdline.input_filename_arg_indices.end(), arg_idx)) {
auto compile_commands = compilation_database.getCompileCommands(arg);
assert(!compile_commands.empty());
if (compile_commands.size() > 1) {
std::cerr << "Compiling the same file multiple times is not supported yet. Please raise a bug report if you run into this issue" << std::endl;
std::exit(1);
}
auto temp_output_filename = compile_commands[0].Output;
std::cerr << "Replacing presumable input argument \"" << arg << "\" with \"" << temp_output_filename << "\"" << std::endl;
// TODO: Wrap filename in quotes!
modified_cmdline += " "s + temp_output_filename;
// TODO: If "-o" has not been supplied, explicitly add it here.
// This is needed because the default filename chosen depends
// on the input filename, but since we replace the original
// input filename with our intermediate output file, the
// final output will be named differently unless we
// explicitly specify it
} else if (std::strncmp(arg, "-frontend-compiler=", std::strlen("-frontend-compiler=")) == 0) {
// CFTF-specific argument => silently drop it from the command line
} else if (/* DISABLES CODE */ (false) && std::strncmp(arg, "-std=", std::strlen("-std=")) == 0) {
// TODO: Should downgrade the C++ version requirements from gnu++17/c++17 to 14 or 11
} else {
// Other argument; just copy this to the new command line
// TODO: Wrap arguments in quotes or escape them!
modified_cmdline += " "s + arg;
}
}
// Add file path to the include directories to make ""-includes work
// TODO: Instead of doing this, we could just rewrite #include statements for absolute file paths
if (!parsed_cmdline.input_filename_arg_indices.empty()) {
// TODO: This needs to be done for every input file, so it won't work when compiling multiple source files stored in different folders...
assert(parsed_cmdline.input_filename_arg_indices.size() == 1);
auto path = fs::absolute(argv[parsed_cmdline.input_filename_arg_indices[0]]).parent_path();
modified_cmdline += " -I\"" + path.string() + "\"";
}
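// Trigger proper compilation and report a failure of the frontend compiler to the caller
std::cerr << "Invoking \"" << modified_cmdline << "\"" << std::endl;
int status = std::system(modified_cmdline.c_str());
return (status == 0) ? 0 : 1;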
}
#include <type_traits>
constexpr auto test1() {
return 5;
}
constexpr decltype(auto) test1(int& a) {
return a;
}
template<typename T>
constexpr auto test2() {
if constexpr (sizeof(T) == sizeof(int)) {
return char{};
} else {
return int{};
}
}
#if CFTF_SUPPORT_ALIASING_FUNCTION_NAMES
// Same function name as the previous test2 but with a dummy parameter
template<typename T>
constexpr auto test2(int a) {
if constexpr (sizeof(T) == sizeof(char)) {
return char{};
} else {
return T{};
}
}
#endif
template<typename T>
constexpr auto test3a(T& val) {
return val;
}
template<typename T>
constexpr decltype(auto) test3b(T& val) {
return val;
}
template<typename T>
constexpr decltype(auto) test3c(T val) {
return val;
}
template<typename T>
constexpr decltype(auto) test3d(T val) {
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wreturn-stack-address"
return (val);
#pragma clang diagnostic pop
}
// Trailing return type specifications shouldn't be matched by return type deduction
constexpr auto test4() -> char {
return 5;
}
static_assert(std::is_same<decltype(test1()), int>::value);
int dummy;
static_assert(std::is_same<decltype(test1(dummy)), int&>::value);
static_assert(std::is_same<decltype(test2<char>()), int>::value);
static_assert(std::is_same<decltype(test2<int>()), char>::value);
#if CFTF_SUPPORT_ALIASING_FUNCTION_NAMES
static_assert(std::is_same<decltype(test2<char>(dummy)), char>::value);
static_assert(std::is_same<decltype(test2<int>(dummy)), int>::value);
#endif
static_assert(std::is_same<decltype(test3a(dummy)), int>::value);
static_assert(std::is_same<decltype(test3b(dummy)), int&>::value);
static_assert(std::is_same<decltype(test3c(dummy)), int>::value);
static_assert(std::is_same<decltype(test3d(dummy)), int&>::value);
static_assert(std::is_same<decltype(test4()), char>::value);
// Static assert without assertion message (since C++17)
static_assert(true);
// Static assert with assertion message (since C++11)
static_assert(true, "assertion message");
#include "rewriter.hpp"
#include <clang/Rewrite/Core/Rewriter.h>
#include <clang/Lex/Lexer.h>
namespace cftf {
bool RewriterBase::InsertTextAfter(clang::SourceLocation loc, llvm::StringRef new_str) {
loc = clang::Lexer::getLocForEndOfToken(loc, 0, getSourceMgr(), {});
return ReplaceTextExcludingEndToken({loc, loc}, new_str);
}
bool RewriterBase::RemoveTextIncludingEndToken(clang::SourceRange range) {
return ReplaceTextIncludingEndToken(range, "");
}
bool RewriterBase::RemoveTextExcludingEndToken(clang::SourceRange range) {
return ReplaceTextExcludingEndToken(range, "");
}
bool Rewriter::ReplaceTextIncludingEndToken(clang::SourceRange range, llvm::StringRef new_str) {
return rewriter.ReplaceText(range, new_str);
}
bool Rewriter::ReplaceTextExcludingEndToken(clang::SourceRange range, llvm::StringRef ref) {
// It seems that clang::Rewriter provides no way of replacing contents up to a given location without also removing the token at that location,
// so remove the character range first and then insert the new content at its beginning.
if (range.getBegin() != range.getEnd()) {
if (rewriter.RemoveText(clang::CharSourceRange::getCharRange(range))) {
return true;
}
}
return rewriter.InsertTextBefore(range.getBegin(), ref);
}
clang::SourceManager& Rewriter::getSourceMgr() {
return rewriter.getSourceMgr();
}
} // namespace cftf
#ifndef CFTF_REWRITER_HPP
#define CFTF_REWRITER_HPP
namespace clang {
class Rewriter;
class SourceLocation;
class SourceRange;
class SourceManager;
}
namespace llvm {
class StringRef;
}
namespace cftf {
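/**
* Abstract interface akin to clang::Rewriter.
* This allows us to do more sophisticated source manipulation than through clang::Rewriter alone.
* As with clang::Rewriter, the bool-returning functions return false on success and true on failure.
*/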
class RewriterBase {
public:
virtual ~RewriterBase() = default;
virtual bool ReplaceTextIncludingEndToken(clang::SourceRange, llvm::StringRef) = 0;
virtual bool ReplaceTextExcludingEndToken(clang::SourceRange, llvm::StringRef) = 0;
bool InsertTextAfter(clang::SourceLocation, llvm::StringRef);
bool RemoveTextIncludingEndToken(clang::SourceRange);
bool RemoveTextExcludingEndToken(clang::SourceRange);
virtual clang::SourceManager& getSourceMgr() = 0;
};
/**
* clang::Rewriter based Rewriter that directly operates on the contents of an input source file
*/
class Rewriter final : public RewriterBase {
public:
Rewriter(clang::Rewriter& rewriter) : rewriter(rewriter) {}
private:
bool ReplaceTextIncludingEndToken(clang::SourceRange, llvm::StringRef) override;
bool ReplaceTextExcludingEndToken(clang::SourceRange, llvm::StringRef) override;
clang::SourceManager& getSourceMgr() override;
clang::Rewriter& rewriter;
};
} // namespace cftf
#endif // CFTF_REWRITER_HPP
#include "ast_visitor.hpp"
#include "rewriter.hpp"
#include <clang/AST/ASTContext.h>
#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <string>
namespace cftf {
// TODO: Should this operate with QualTypes instead?
static bool TypeHasStdTupleSizeSpecialization(clang::ASTContext& context, const clang::Type* type) {
auto tu_decl = context.getTranslationUnitDecl();
auto& id = context.Idents.get("std");
auto std_id = context.DeclarationNames.getIdentifier(&id);
auto lookup_result = tu_decl->lookup(std_id);
assert(lookup_result.size() < 2);
if (lookup_result.size() == 0) {
// No standard library headers included
// => std::tuple_size is not available
return false;
}
clang::NamespaceDecl* std_decl = clang::dyn_cast<clang::NamespaceDecl>(lookup_result.front());
assert(std_decl);
auto tuple_size_lookup_result = std_decl->lookup(&context.Idents.get("tuple_size"));
assert(tuple_size_lookup_result.size() < 2);
if (tuple_size_lookup_result.size() == 0) {
// <tuple> is not included => std::tuple_size is not available
return false;
}
auto tuple_size_decl = clang::dyn_cast<clang::ClassTemplateDecl>(tuple_size_lookup_result.front());
assert(tuple_size_decl);
std::cerr << "Checking if any std::tuple_size specializations match" << std::endl;
auto specs = tuple_size_decl->specializations();
auto spec = std::find_if(std::begin(specs), std::end(specs),
[type](clang::ClassTemplateSpecializationDecl* spec) {
auto&& template_args = spec->getTemplateArgs();
assert(template_args.size() == 1);
bool match = (template_args[0].getAsType().getTypePtrOrNull() == type->getAs<clang::RecordType>());
// NOTE: tuple_size is only considered when
// a proper definition is provided
// (plain forward declarations will
// not have any effect)
if (match && !spec->isCompleteDefinition()) {
std::cerr << "Skipping incomplete specialization" << std::endl;
return false;
}
return match;
});
return (spec != std::end(specs));
}
// TODO: Support bit fields. I guess we could just create a local struct type to define a bitfield of the given size?
bool ASTVisitor::VisitDecompositionDecl(clang::DecompositionDecl* decl) {
auto init_expr = decl->getInit();
assert(init_expr);
auto unqualified_type = init_expr->getType().getTypePtrOrNull();
assert(unqualified_type);
std::string temp_name;
for (auto* binding : decl->bindings()) {
if (!temp_name.empty()) {
temp_name += '_';
}
temp_name += binding->getNameAsString();
}
temp_name = "cftf_binding_group_" + temp_name;
auto rewritten_decl = "/*" + GetClosedStringFor(decl->getLocStart(), decl->getLocEnd()) + "*/\n";
rewritten_decl += "auto " + temp_name + " = " + GetClosedStringFor(decl->getInit()->getLocStart(), decl->getInit()->getLocEnd()) + ";\n";
if (unqualified_type->isArrayType()) {
// TODO: For auto, create a new array as a copy of the reference one
// TODO: Decompose array elements; decomposed type is "add_reference_t<cv-qualified element_type>"
// TODO: Assign decomposed elements to "referenced_array[i]"
std::cerr << "Decomposing array" << std::endl;
// TODO: Implement.
return true;
} else if (TypeHasStdTupleSizeSpecialization(context, unqualified_type)) {
std::cerr << "Decomposing via get<>" << std::endl;
// TODO: The decomposed types are references to
// "std::tuple_element<i, E>::type", but we still need to decide
// what kind of reference. For now, we just use "auto(&&)"
// everywhere, but that may not be sufficient in the future. In
// some cases, the bindings shouldn't be C++-references at all,
// but instead should act like transparent references (the type
// of which behaves like a value)
auto&& bindings = decl->bindings();
for (auto binding_it = bindings.begin(); binding_it != bindings.end(); ++binding_it) {
auto* binding = *binding_it;
auto index = std::distance(bindings.begin(), binding_it);
auto* var = binding->getHoldingVar();
auto type_string = clang::QualType::getAsString(binding->getType().getSplitDesugaredType(), clang::PrintingPolicy{{}});
// TODO: Does a plain "get" properly ADL-match all
// cases here? Things to consider:
// * std::get
// * Free function get(Object)
// * Member function get()
// * Friend member function get(Object)
rewritten_decl += type_string + " " + var->getNameAsString() + " = get<" + std::to_string(index) + ">(" + temp_name + ");\n";
}
} else {
// Decomposed types are references to the respective data members
std::cerr << "Decomposing struct" << std::endl;
auto bindings = decl->bindings();
auto init_expr_record = init_expr->getType().getTypePtr()->getAsCXXRecordDecl();
if (init_expr_record) {
init_expr->getType().dump();
struct FindNonEmptyBase {
clang::CXXRecordDecl* operator()(clang::CXXBaseSpecifier& base) {
auto type = base.getType().getTypePtr();
assert(type);
auto record = type->getAsCXXRecordDecl();
assert(record);
std::cerr << "Checking base ";
type->dump();
std::cerr << std::endl;
return (*this)(record);
}
clang::CXXRecordDecl* operator()(clang::CXXRecordDecl* record) {
auto&& fields = record->fields();
if (fields.begin() != fields.end()) {
std::cerr << "Done." << std::endl;
return record;
}
for (auto&& base : record->bases()) {
auto match = (*this)(base);
if (match) {
return match;
}
}
return nullptr;
}
};
// Current structure may not have any direct data members, in which case they must have been inherited from a parent
auto non_empty_base = FindNonEmptyBase{}(init_expr_record);
assert(non_empty_base);
auto fields = non_empty_base->fields();
auto fields_it = fields.begin();
for (size_t i = 0; i < bindings.size(); ++i) {
assert(fields_it != fields.end());
assert(fields_it->getFieldIndex() == i);
auto type_string = clang::QualType::getAsString(fields_it->getType().getSplitDesugaredType(), clang::PrintingPolicy{{}});
rewritten_decl += type_string + " " + bindings[i]->getNameAsString() + " = " + temp_name + "." + fields_it->getNameAsString() + ";\n";
++fields_it;
}
assert(fields_it == fields.end());
// The rewritten declaration is applied once at the end of this function
} else {
// This will happen in dependent contexts. TODO!
std::cerr << "Right-hand side is not a CXXRecordDecl; not sure what to do with this..." << std::endl;
}
}
rewriter->ReplaceTextIncludingEndToken(decl->getSourceRange(), rewritten_decl);
return true;
}
} // namespace cftf
#include "ast_visitor.hpp"
#include "rewriter.hpp"
#include <clang/Lex/Lexer.h>
#include <clang/AST/Expr.h>
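#include <algorithm>
#include <cassert>
#include <iostream>
#include <iterator>
#include <limits>
#include <numeric>
#include <string>
#include <variant>
#include <vector>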
namespace cftf {
static auto SourceRangeLength(clang::SourceManager& sm, clang::SourceRange range) {
auto begin_data = sm.getCharacterData(range.getBegin());
auto end_data = sm.getCharacterData(range.getEnd());
return (end_data - begin_data);
}
static auto SourceRangeToString(clang::SourceManager& sm, clang::SourceRange range) {
auto begin_data = sm.getCharacterData(range.getBegin());
return std::string(begin_data, SourceRangeLength(sm, range));
}
static bool IsSubRange(clang::SourceManager& sm, clang::SourceRange inner, clang::SourceRange outer) {
return (sm.isPointWithin(inner.getBegin(), outer.getBegin(), outer.getEnd()) &&
sm.isPointWithin(inner.getEnd(), outer.getBegin(), outer.getEnd()));
}
/**
* Rewriter that takes a copy of the given range and performs manipulations on
* it based on original SourceLocations but without modifying the original text
*
* In contrast to clang::Rewriter, this class allows for "hierarchical"
* rewriting, where multiple rewrite rules might operate on nested parts
* of a single expression (e.g. a parameter pack expansion including
* binary literals).
*
* This rewriter also supports copy-operations ("instances") such that
* consecutive edits of common subexpressions are visible in all instances
* while still allowing individual edits to each instance.
*
* For example, when specializing the expression
* "my_function((0b0010 * ts)...)"
* for ts=<5, 10, 'c'>, the following operations may be performed:
* a) The parameter pack expansion rule creates three instances of the
* subexpression "(0b0010 * ts)..."
* b) The separator ", " is added to the first two instances
* c) The DeclRefExpr matcher replaces "ts" in each instance with a unique
* numbered identifier (ts1, ts2, ts3)
* d) The IntegerLiteral matcher replaces 0b0010 with 2 in all instances
* The difference between (c) and (d) is that (c) edits each instance
* separately whereas in (d), the rewriter class automatically distributes the
* edit across all instances.
* The resulting generated source code is
* "my_function((2 * ts1), (2 * ts2), (2 * ts3))".
*/
class HierarchicalRewriter final : public RewriterBase {
struct SourceNode {
// Half-open interval of source contained by this node. Beginning is included, end is not.
clang::SourceRange range;
// True if this is original, unmodified source data; false otherwise.
// If false, the contents of this child may not be split up during rewrites. In other words,
// the child must either be left untouched or replaced as a whole.
bool rewriteable;
bool IsLeaf() const {
return std::holds_alternative<std::string>(data);
}
/// Child nodes or node content
std::variant<std::vector<SourceNode>, std::string> data;
std::vector<SourceNode>& GetChildren() {
return std::get<std::vector<SourceNode>>(data);
}
const std::vector<SourceNode>& GetChildren() const {
return std::get<std::vector<SourceNode>>(data);
}
std::string& GetContent() {
return std::get<std::string>(data);
}
std::string Concatenate() const {
if (IsLeaf()) {
return std::get<std::string>(data);
} else {
return std::accumulate(GetChildren().cbegin(), GetChildren().cend(), std::string{},
[](const std::string& str, const SourceNode& node) {
return str + node.Concatenate();
});
}
}
size_t node_id = 0;
size_t instance_id = 0;
static constexpr size_t all_instances = std::numeric_limits<size_t>::max();
bool operator==(const SourceNode& oth) const {
return range == oth.range && rewriteable == oth.rewriteable && data == oth.data && instance_id == oth.instance_id;
}
};
size_t running_node_id = 1000;
public:
struct InstanceHandle {
private:
InstanceHandle(SourceNode& node, SourceNode& parent) : node_id(node.node_id), parent_id(parent.node_id) {}
InstanceHandle(size_t node_id, size_t parent_id) : node_id(node_id), parent_id(parent_id) {}
// TODO: To support nested instances, these will likely need to be std::vectors of nodes in the future
size_t node_id;
size_t parent_id;
friend class HierarchicalRewriter;
};
HierarchicalRewriter(clang::SourceManager& sm, clang::SourceRange range)
: sm(sm), root{range, true, SourceRangeToString(sm, range)} {
}
std::string GetContents() const {
return root.Concatenate();
}
InstanceHandle MakeInstanceHandle(clang::SourceRange subrange) {
return MakeInstanceHandle(root, subrange);
}
SourceNode& FindNode(SourceNode& parent, size_t id) {
auto ptr = FindNodeHelper(parent, id);
assert(ptr);
return *ptr;
}
InstanceHandle CreateNewInstance(const InstanceHandle& base_instance) {
auto& parent = FindNode(root, base_instance.parent_id);
auto node = FindNode(parent, base_instance.node_id); // Copy intended
auto& children = parent.GetChildren();
auto node_it = std::find(children.begin(), children.end(), node);
// Find the most recent instance of this node, and insert a new instance after it
while (std::next(node_it) != children.end() && node_it->instance_id < std::next(node_it)->instance_id) {
++node_it;
}
auto new_instance_id = 1 + node_it->instance_id;
auto new_instance = children.insert(node_it + 1, node);
new_instance->instance_id = new_instance_id;
new_instance->node_id = ++running_node_id;
return InstanceHandle { new_instance->node_id, base_instance.parent_id };
}
bool ReplaceTextExcludingEndToken(InstanceHandle& instance, clang::SourceRange replaced_range, llvm::StringRef new_str) {
return ReplaceTextExcludingEndToken(FindNode(root, instance.node_id), replaced_range, new_str);
}
private:
SourceNode* FindNodeHelper(SourceNode& parent, size_t id) {
if (parent.node_id == id) {
return &parent;
}
if (parent.IsLeaf()) {
return nullptr;
} else {
for (auto& child : parent.GetChildren()) {
auto ptr = FindNodeHelper(child, id);
if (ptr) return ptr;
}
return nullptr;
}
}
// Returns a reference to the inner node
SourceNode& SplitNodeAt(SourceNode& node, clang::SourceRange subrange) {
std::vector<SourceNode> children;
size_t index_for_inner_node = 0;
auto left_range = clang::SourceRange { node.range.getBegin(), subrange.getBegin() };
if (left_range.getBegin() != left_range.getEnd()) {
children.emplace_back(SourceNode { left_range, true, GetHalfOpenStringFor(left_range), ++running_node_id });
index_for_inner_node = 1;
}
children.emplace_back(SourceNode { subrange, true, GetHalfOpenStringFor(subrange), ++running_node_id });
auto right_range = clang::SourceRange { subrange.getEnd(), node.range.getEnd() };
if (right_range.getBegin() != right_range.getEnd()) {
children.emplace_back(SourceNode { right_range, true, GetHalfOpenStringFor(right_range), ++running_node_id });
}
// If the node wasn't wholly covered, we should have more than one child now
assert(children.size() > 1);
node.data = std::move(children);
return node.GetChildren()[index_for_inner_node];
}
bool ReplaceTextIncludingEndToken(clang::SourceRange subrange, llvm::StringRef new_str) override {
clang::SourceRange extended_range {subrange.getBegin(), clang::Lexer::getLocForEndOfToken(subrange.getEnd(), 0, sm, {}) };
return ReplaceTextExcludingEndToken(extended_range, new_str);
}
// TODO: Remove this
public:
std::string GetHalfOpenStringFor(clang::SourceRange range) {
auto begin_data = sm.getCharacterData(range.getBegin());
auto end_data = sm.getCharacterData(range.getEnd());
return std::string(begin_data, end_data - begin_data);
}
bool ReplaceTextExcludingEndToken(SourceNode& node, clang::SourceRange replaced_range, llvm::StringRef new_str) {
if (node.IsLeaf()) {
if (node.range == replaced_range) {
// Node coincides with replaced range, so just replace it directly
node.rewriteable = false;
node.data = new_str.str();
} else {
// Split this leaf into (up to) 3 children and replace the inner part
auto& inner_node = SplitNodeAt(node, replaced_range);
inner_node.rewriteable = false;
inner_node.data = new_str.str();
}
} else {
// Recurse into the smallest child that wholly covers replaced_range
auto& children = node.GetChildren();
auto child_it = std::find_if(children.begin(), children.end(),
[&](SourceNode& child) {
return IsSubRange(sm, replaced_range, child.range);
});
if (child_it == children.end()) {
// No single child contains the entire replaced range, so replace all the children of this node that are fully covered by it.
auto child_is_fully_covered = [&](SourceNode& child) {
bool fully_covered = IsSubRange(sm, child.range, replaced_range);
return fully_covered;
};
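auto first_child = std::find_if(children.begin(), children.end(), child_is_fully_covered);
// At least one child must be covered; otherwise "first_child + 1" below would be invalid
assert(first_child != children.end());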
// If replacement string is non-empty, replace the first matching child in-place and drop all other children.
// Otherwise, drop all matching children
auto first_child_to_remove = new_str.empty() ? first_child : (first_child + 1);
auto children_to_remove = std::remove_if(first_child_to_remove, children.end(), child_is_fully_covered);
if (!new_str.empty()) {
clang::SourceLocation range_end = (children_to_remove != children.end()) ? children.back().range.getEnd() : first_child->range.getEnd();
*first_child = SourceNode { { first_child->range.getBegin(), range_end }, false, new_str.str(), ++running_node_id };
}
children.erase(children_to_remove, children.end());
} else {
// Recurse into the child (each instance separately)
size_t instance = 0;
auto instance_it = child_it;
while (instance_it != children.end() && instance_it->instance_id == instance) {
ReplaceTextExcludingEndToken(*instance_it, replaced_range, new_str);
++instance;
++instance_it;
}
// We should have reached either the end of children or the start of another block of instances
assert(instance_it == children.end() || instance_it->instance_id == 0);
}
}
// Report success
return false;
}
bool ReplaceTextExcludingEndToken(clang::SourceRange subrange, llvm::StringRef new_str) override {
return ReplaceTextExcludingEndToken(root, subrange, new_str);
}
InstanceHandle MakeInstanceHandle(SourceNode& parent, clang::SourceRange subrange) {
if (parent.IsLeaf()) {
if (parent.range == subrange) {
// Bleh, generate a nested child, just so we don't need to look up the proper parent ourselves now...
SourceNode child = std::move(parent);
parent = SourceNode { child.range, child.rewriteable, std::vector(1, child), ++running_node_id };
return InstanceHandle { parent.GetChildren()[0], parent };
} else {
auto& inner_node = SplitNodeAt(parent, subrange);
return InstanceHandle { inner_node, parent };
}
} else {
auto& children = parent.GetChildren();
auto child_it = std::find_if(children.begin(), children.end(),
[&](SourceNode& child) {
return IsSubRange(sm, subrange, child.range);
});
if (child_it == children.end()) {
// Not implemented, currently. Not sure if we need this?
assert(false);
throw nullptr;
} else {
// Recurse into the child
// TODO: Support nested instances.
// We will need to capture all different branches that
// can reach the given subrange in the InstanceHandle
// for this!
assert(std::next(child_it) == children.end() || std::next(child_it)->instance_id == 0);
return MakeInstanceHandle(*child_it, subrange);
}
}
}
clang::SourceManager& getSourceMgr() override {
return sm;
}
clang::SourceManager& sm;
SourceNode root;
};
/**
* Returns true if the function
* i) uses "auto" or "decltype(auto)" for the return type
* ii) does not use trailing return type specification
*/
static bool FunctionReturnTypeIsDeducedFromBody(clang::ASTContext& context, clang::FunctionDecl* decl) {
clang::QualType returntype = decl->getReturnType();
auto auto_type = returntype->getContainedAutoType();
if (!auto_type) {
auto_type = returntype.getDesugaredType(context)->getContainedAutoType();
}
return (auto_type && !auto_type->hasAutoForTrailingReturnType());
}
/**
* Utility AST visitor that traverses a function and checks whether there is a need
* to explicitly specialize it (e.g. because other transformations depend on types being well-known)
*/
class FunctionNeedsExplicitSpecializationChecker : public clang::RecursiveASTVisitor<FunctionNeedsExplicitSpecializationChecker> {
public:
FunctionNeedsExplicitSpecializationChecker(clang::FunctionDecl* decl) {
// If this is a function template specialization, continue, otherwise we trivially don't need to specialize this function
if (decl->getPrimaryTemplate() != nullptr && !decl->getTemplateSpecializationInfo()->isExplicitSpecialization()) {
TraverseFunctionDecl(decl);
needs_specialization = true; // TODO: Revert
}
}
operator bool() const {
return needs_specialization;
}
bool VisitIfStmt(clang::IfStmt* stmt) {
if (features::constexpr_if && stmt->isConstexpr()) {
needs_specialization = true;
// Abort traversal for this check
return false;
}
return true;
}
bool needs_specialization = false;
};
/**
* Utility AST visitor that determines the size of the parameter pack expanded
* by the given PackExpansionExpr based on an implicitly specialized
* FunctionDecl.
*
* This helper is provided because libclang provides no direct means of getting
* the size of a parameter pack used for a specialization of a variadic
* function template.
*
* Internally, this helper traverses the entire function (up to the point of
* parameter pack expansion) to find the immediate children of expansion_expr
* in the function body and to count the number of their appearances.
* When using this helper, be cautious about the performance implications of
* this full traversal.
*
* @note One might be tempted to assume we could just calculate the parameter
* pack size as the difference between the total number of arguments used
* for the current specialization and the number of non-variadic template
* parameters. E.g. for "template<typename T, typename... Us> void f()"
* and the specialization "f<int, char, char>", this would yield the
* correct value 3 - 1 = 2.
* However, that breaks e.g. for function templates like
* "template<typename... Ts, typename... Us> void f(Us... u)".
* Maybe this approach could work with less naive inference rules, but
* I haven't further explored that idea.
*
* @note We also can't get this info from
* FunctionDecl::getTemplateSpecializationInfo (which provides a
* template argument list), since we don't know the expanded parameter
* pack. We might get away with just taking any parameter pack contained
* within the expanded expression, but there are contrived (and evil)
* examples of referencing multiple parameter packs in the same
* expansion.
* That said, maybe this could be used as a faster default, and the full
* function traversal could be used as a reliable fallback for contrived
* examples.
*/
class DetermineParameterPackSizeVisitor : public clang::RecursiveASTVisitor<DetermineParameterPackSizeVisitor> {
public:
DetermineParameterPackSizeVisitor(clang::FunctionDecl* decl, clang::PackExpansionExpr* expansion_expr) : expr(expansion_expr->getPattern()) {
TraverseFunctionDecl(decl);
}
operator size_t() const {
return count;
}
bool VisitStmt(clang::Stmt* stmt) {
// Find the first statement that was generated from the parameter pack expansion.
// We recognize this statement by comparing its StmtClass and SourceRange against those of the pattern
auto is_generated_stmt = [this](clang::Stmt* candidate) {
auto expected_stmt_class = expr->getStmtClass();
if (clang::CXXUnresolvedConstructExpr::classof(expr) && clang::CXXFunctionalCastExpr::classof(candidate)) {
// CXXUnresolvedConstructExprs get turned into
// CXXFunctionalCastExprs in implicit specializations.
// The SourceRange doesn't change, so we can still use it for
// the purpose of comparison.
// This was observed e.g. in "func(T{}...)".
expected_stmt_class = candidate->getStmtClass();
} else if (clang::ImplicitCastExpr::classof(candidate)) {
// Generated expressions are often wrapped in a generated
// ImplicitCastExpr, so unwrap that one by referring to the
// child instead.
// In particular, this occurs in expressions like "func((t)...)"
// There should only be one child in this expression
assert(std::distance(candidate->child_begin(), candidate->child_end()) == 1);
candidate = *candidate->child_begin();
}
return candidate->getStmtClass() == expected_stmt_class &&
candidate->getSourceRange() == expr->getSourceRange();
};
auto count = std::count_if(stmt->child_begin(), stmt->child_end(), is_generated_stmt);
if (count) {
this->count = count;
// Abort traversal once we found the expression. Returning false from a Visit* method makes RecursiveASTVisitor stop the entire traversal, not just the current subtree.
return false;
} else {
// Keep looking for a generated statement
// NOTE: If the parameter pack was empty, this algorithm will need to scan the entire function to detect that :/
return true;
}
}
clang::Expr* expr; // Pattern of the given PackExpansionExpr
size_t count = 0;
};
clang::ParmVarDecl* ASTVisitor::CurrentFunctionInfo::FindTemplatedParamDecl(clang::ParmVarDecl* specialized) const {
auto it = std::find_if(parameters.begin(), parameters.end(),
[specialized](const Parameter& param) {
auto it = std::find_if(param.specialized.begin(), param.specialized.end(),
[=](const Parameter::SpecializedParameter& parameter) {
return (parameter.decl == specialized);
});
return (it != param.specialized.end());
});
if (it == parameters.end()) {
return nullptr;
}
return it->templated;
}
const std::vector<ASTVisitor::CurrentFunctionInfo::Parameter::SpecializedParameter>& ASTVisitor::CurrentFunctionInfo::FindSpecializedParamDecls(clang::ParmVarDecl* templated) const {
auto it = std::find_if(parameters.begin(), parameters.end(),
[templated](const Parameter& param) {
return (param.templated == templated);
});
assert (it != parameters.end());
return it->specialized;
}
// TODO: Move elsewhere
bool ASTVisitor::VisitSizeOfPackExpr(clang::SizeOfPackExpr* expr) {
if (!current_function) {
return true;
}
rewriter->ReplaceTextIncludingEndToken({ expr->getLocStart(), expr->getLocEnd() }, "/*" + GetClosedStringFor(expr->getLocStart(), expr->getLocEnd()) + "*/" + std::to_string(expr->getPackLength()));
return true;
}
bool ASTVisitor::VisitPackExpansionExpr(clang::PackExpansionExpr* expr) {
if (!current_function_template)
return true;
// NOTE: We only ever visit this once, in the general template. So we need to iterate over all implicit specializations of this function and fill in the gaps ourselves later.
current_function_template->param_pack_expansions.push_back(FunctionTemplateInfo::ParamPackExpansionInfo{expr, std::vector<clang::DeclRefExpr*>{}});
std::cerr << "Visiting pack expansion, registering to " << current_function_template << std::endl;
return true;
}
bool ASTVisitor::TraversePackExpansionExpr(clang::PackExpansionExpr* expr) {
if (!current_function_template)
return true;
assert(!current_function_template->in_param_pack_expansion);
current_function_template->in_param_pack_expansion = true;
Parent::TraversePackExpansionExpr(expr);
current_function_template->in_param_pack_expansion = false;
return true;
}
bool ASTVisitor::VisitDeclRefExpr(clang::DeclRefExpr* expr) {
// Record uses of parameter packs within pack expansions
if (!current_function_template || !current_function_template->in_param_pack_expansion) {
return true;
}
auto parm_var_decl = clang::dyn_cast<clang::ParmVarDecl>(expr->getDecl());
if (parm_var_decl && parm_var_decl->isParameterPack()) {
current_function_template->param_pack_expansions.back().referenced_packs.push_back(expr);
}
return true;
}
static std::string MakeUniqueParameterPackName(clang::ParmVarDecl* decl, size_t index) {
// Just append a 1-based counter for now
// TODO: This will break if there is already a parameter with the new name
return decl->getNameAsString() + std::to_string(1 + index);
}
// Replace a function's return type with the given string.
// Note that if a function template specialization should be
// rewritten, "decl" should be passed the templated FunctionDecl
// instead since it's used to gather the SourceLocations.
static void ReplaceReturnType(RewriterBase& rewriter, clang::FunctionDecl& decl, llvm::StringRef new_type) {
// NOTE: For "decltype(auto)", decl.getReturnTypeSourceRange() actually
// stops at "(auto)", so we instead use its start location and then
// replace everything up to the function name
rewriter.ReplaceTextExcludingEndToken({decl.getReturnTypeSourceRange().getBegin(), decl.getNameInfo().getLoc()}, new_type.str() + " ");
}
bool ASTVisitor::TraverseFunctionTemplateDecl(clang::FunctionTemplateDecl* decl) {
if (context.getFullLoc(decl->getLocStart()).isInSystemHeader()) {
// Skip system header contents
return true;
}
WalkUpFromFunctionTemplateDecl(decl);
std::cerr << "Visiting FunctionTemplateDecl:" << decl << std::endl;
auto templated_decl = decl->getTemplatedDecl();
{
// This is the actual template definition (i.e. not one of the
// specializations generated implicitly by clang). We do a prepass over
// the template definition to gather a list of things that would be
// difficult to rewrite otherwise, such as parameter pack expansions.
auto [it, ignored] = function_templates.emplace(templated_decl, FunctionTemplateInfo{});
current_function_template = &it->second;
std::cerr << "Template: " << templated_decl->getNameAsString() << std::endl;
// TODO: Will we traverse this decl twice now?
Parent::TraverseFunctionDecl(templated_decl);
current_function_template = nullptr;
}
// The rest of this function is concerned with generating explicit
// specializations from what's an implicit template specialization in
// libclang's AST. Hence, return early from this code path.
// TODO: There may be multiple overloads with the same function name but
// different sets of deduced return values. To make sure we support
// all of these, we need to append a *mangled* version of the
// function name here!
const std::string auto_deduction_helper_struct_name = "cftf_deduced_return_type_" + decl->getNameAsString();
const bool deduce_return_type = FunctionReturnTypeIsDeducedFromBody(context, templated_decl);
for (auto* specialized_decl : decl->specializations()) {
std::cerr << "Specialization " << specialized_decl << std::endl;
bool specialize = FunctionNeedsExplicitSpecializationChecker(specialized_decl);
decltype(rewriter) old_rewriter;
CurrentFunctionInfo current_function = { specialized_decl, {} };
std::string template_argument_string; // TODO: This is initialized below (two indentation levels deeper), which is rather ugly...
if (specialize) {
current_function.template_info = &function_templates.at(templated_decl);
// Temporarily exchange our clang::Rewriter with an internal rewriter that writes to a copy of the current function (which will act as an explicit instantiation)
// TODO: This will fail sooner or later; functions can be nested e.g. by declaring a class inside a function!
// TODO: Should probably use the locations from templated_decl instead!
old_rewriter = std::exchange(rewriter, std::make_unique<HierarchicalRewriter>(rewriter->getSourceMgr(), clang::SourceRange{ specialized_decl->getLocStart(), getLocForEndOfToken(specialized_decl->getLocEnd()) }));
// Add template argument list for this specialization
{
llvm::raw_string_ostream ss(template_argument_string);
assert(specialized_decl->getTemplateSpecializationArgs());
auto&& template_args = specialized_decl->getTemplateSpecializationArgs()->asArray();
for (auto it = template_args.begin(); it != template_args.end(); ++it) {
if (it != template_args.begin()) {
ss << ", ";
}
clang::LangOptions policy; // TODO: Get this from the proper source!
if (it->getKind() == clang::TemplateArgument::Pack) {
// Print each item in the parameter pack individually
for (auto pack_it = it->pack_begin(); pack_it < it->pack_end(); ++pack_it) {
if (pack_it != it->pack_begin()) {
ss << ", ";
}
pack_it->print(policy, ss);
}
} else {
it->print(policy, ss);
}
}
ss.flush();
// TODO: Templated_decl locs!
rewriter->InsertTextAfter(specialized_decl->getLocation(), '<' + template_argument_string + '>');
}
// The template generally contains references to the template parameters (in the body and in the function parameter list).
// This is a problem in our generated specializations, which don't define the template parameters (i.e. there is no
// "template<typename T>" preceding them) but must use the actual template arguments instead.
// We address this as follows:
// * In the specialization body, we insert type aliases and constants at the top to manually declare template parameters.
// This is much easier than trying to manually replace all occurrences of template parameters with concrete arguments.
// * The parameter list is replaced by the FunctionDecl parameter list provided by clang. Stringifying this correctly
// is reasonably easy and gets rid of all template parameter references automatically.
//
// NOTE: We only need to replace anything for non-empty parameter lists, but note that a specialization's parameter list
// may well be empty while the actual template function's parameter list is not. In particular, this happens for
// template functions of the form
//
// template<typename... T> void func(T... t)
//
// when specialized for empty parameter packs.
std::transform(templated_decl->param_begin(), templated_decl->param_end(), std::back_inserter(current_function.parameters),
[&](clang::ParmVarDecl* templated_param_decl) {
// TODO: Unify this with FindTemplatedDecl!
auto is_same_decl = [&](const clang::ParmVarDecl* specialized_param_decl) {
// Unfortunately, there doesn't seem to be a better way to do this than to compare the parameters by name...
return (specialized_param_decl->getName() == templated_param_decl->getName());
};
auto first_it = std::find_if (specialized_decl->param_begin(), specialized_decl->param_end(), is_same_decl);
auto last_it = std::find_if_not(first_it, specialized_decl->param_end(), is_same_decl);
CurrentFunctionInfo::Parameter ret { templated_param_decl, {} };
if (first_it + 1 == last_it) {
// Just one argument
ret.specialized.push_back({*first_it, templated_param_decl->getNameAsString()});
} else {
// Templated parameter refers to a parameter pack for which multiple (or no) parameters were generated;
// to prevent name collisions, generate a unique name for each of them.
for (auto it = first_it; it != last_it; ++it) {
std::string unique_name = MakeUniqueParameterPackName(templated_param_decl, std::distance(first_it, it));
ret.specialized.push_back({*it, std::move(unique_name)});
}
}
return ret;
});
// Remove empty parameter packs from the specialized signature
// (Non-empty parameter packs are handled in VisitVarDecl)
for (auto templated_parameter_it = templated_decl->param_begin();
templated_parameter_it != templated_decl->param_end();
++templated_parameter_it) {
auto* templated_parameter = *templated_parameter_it;
if (templated_parameter->isParameterPack() && current_function.FindSpecializedParamDecls(templated_parameter).empty()) {
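// NOTE: Compare against param_begin(); comparing against param_end() is never true inside this loop
const bool is_first_parameter = (templated_parameter_it == templated_decl->param_begin());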
const bool is_last_parameter = (std::next(templated_parameter_it) == templated_decl->param_end());
// Remove the parameter (including any preceding or following commas)
clang::SourceLocation start_loc = is_first_parameter ? templated_decl->parameters().front()->getLocStart() : templated_parameter->getLocStart();
if (is_last_parameter) {
// Delete up to the end of the function signature
clang::SourceLocation end_loc = templated_decl->parameters().back()->getLocEnd();
rewriter->ReplaceTextIncludingEndToken({start_loc, end_loc}, "");
} else {
// Delete up to the beginning of the next parameter
auto end_loc = (*std::next(templated_parameter_it))->getLocStart();
rewriter->ReplaceTextExcludingEndToken({start_loc, end_loc}, "");
}
}
}
}
// From here on below, assume we have a self-contained definition that we can freely rewrite code in
this->current_function = current_function;
// Patch up body (parameter pack expansions, fold expressions)
/*if (decl2->getPrimaryTemplate())*/ {
auto current_function_template_it = function_templates.find(templated_decl);
if (current_function_template_it != function_templates.end()) {
auto current_function_template = &current_function_template_it->second;
assert(current_function_template);
for (auto [pack_expansion_expr, pack_uses] : current_function_template->param_pack_expansions) {
auto rewriter = static_cast<HierarchicalRewriter*>(this->rewriter.get());
auto* pattern = pack_expansion_expr->getPattern();
auto range_end = clang::Lexer::getLocForEndOfToken(pack_expansion_expr->getEllipsisLoc(), 0, rewriter->getSourceMgr(), {});
auto base_instance = rewriter->MakeInstanceHandle({pattern->getLocStart(), range_end});
size_t pack_length = DetermineParameterPackSizeVisitor { specialized_decl, pack_expansion_expr };
std::vector<HierarchicalRewriter::InstanceHandle> instances;
for (size_t instance_id = 0; instance_id < pack_length; ++instance_id) {
instances.push_back(rewriter->CreateNewInstance(base_instance));
}
// Delete the original pack expansion first, then re-add one copy for each parameter pack element
rewriter->ReplaceTextExcludingEndToken(base_instance, {pattern->getLocStart(), range_end}, "/*" + GetClosedStringFor(pattern->getLocStart(), range_end) + " of size " + std::to_string(pack_length) + "*/");
for (size_t instance_id = 0; instance_id < pack_length; ++instance_id) {
// Insert separators for all but the last instance
const char* replacement = ", ";
if (instance_id == pack_length - 1) {
// No separator needed, so just remove the ellipsis
replacement = "";
}
rewriter->ReplaceTextExcludingEndToken(instances[instance_id], {pack_expansion_expr->getEllipsisLoc(), range_end}, replacement);
// We generate unique names for function parameters
// expanded from parameter packs. Those now need to be
// patched into the function body whenever the parameter
// pack is referenced.
for (auto* pack_expr : pack_uses) {
clang::SourceRange range = { pack_expr->getLocStart(), clang::Lexer::getLocForEndOfToken(pack_expr->getLocEnd(), 0, rewriter->getSourceMgr(), {}) };
auto parm_var_decl = clang::dyn_cast<clang::ParmVarDecl>(pack_expr->getDecl());
assert(parm_var_decl && parm_var_decl->isParameterPack());
const auto& unique_name = current_function.FindSpecializedParamDecls(parm_var_decl)[instance_id].unique_name;
rewriter->ReplaceTextExcludingEndToken(instances[instance_id], range, unique_name);
}
// TODO: Also patch "T..." uses in the function body
}
}
}
}
Parent::TraverseFunctionDecl(specialized_decl);
this->current_function = std::nullopt;
if (specialize) {
// Fix up references to template parameters in the specialization by adding an explicit
// declaration of them at the top of the specialization body
auto template_parameters = decl->getTemplateParameters();
auto specialization_args = specialized_decl->getTemplateSpecializationArgs()->asArray();
assert(template_parameters);
assert(template_parameters->size() == specialization_args.size());
std::string aliases = "\n";
auto parameter_it = template_parameters->begin();
auto argument_it = specialization_args.begin();
for (; parameter_it != template_parameters->end(); ++parameter_it, ++argument_it) {
auto& parameter = *parameter_it;
auto& argument = *argument_it;
assert(parameter);
if (parameter->getNameAsString().empty()) {
// If the parameter was never named, we don't need to reexport it
continue;
}
switch (argument.getKind()) {
case clang::TemplateArgument::Type:
// e.g. template<typename Type>
aliases += "using " + parameter->getNameAsString() + " = " + argument.getAsType().getAsString() + ";\n";
break;
case clang::TemplateArgument::Integral:
// e.g. template<int Val>
// TODO: Get the actual (possibly const-qualified) type!
aliases += "auto " + parameter->getNameAsString() + " = " + argument.getAsIntegral().toString(10) + ";\n";
break;
case clang::TemplateArgument::Declaration:
// e.g. template<void* Ptr> with Ptr=&some_global_variable
std::cerr << "WARNING: TemplateArgument::Declaration not supported yet" << std::endl;
aliases += "TODO " + parameter->getNameAsString() + " = TODO;\n";
break;
case clang::TemplateArgument::NullPtr:
// e.g. template<void* Ptr> with Ptr=nullptr
aliases += "decltype(nullptr) " + parameter->getNameAsString() + " = nullptr;\n";
break;
case clang::TemplateArgument::Template:
// e.g. template<template<typename> Templ>
// TODO: How should we handle these? Function bodies can't include templates!
// TODO: Instead of ignoring this error, abort specializing this template
std::cerr << "WARNING: Template template parameters unsupported" << std::endl;
aliases += "TODO template<typename> " + parameter->getNameAsString() + " = TODO;\n";
break;
case clang::TemplateArgument::Pack:
// e.g. template<typename... Types>
// e.g. template<int... Vals>
// e.g. template<template<typename>... Templs>
// We don't need to do anything here, since we expand all parameter packs in the function body
std::cerr << "WARNING: Variadic templates support is incomplete" << std::endl;
break;
default:
std::cerr << "WARNING: Unsupported template argument type: " << static_cast<int>(argument.getKind()) << std::endl;
aliases += "TODO " + parameter->getNameAsString() + " = TODO;\n";
assert(false);
break;
}
}
// TODO: Templated_decl locations
rewriter->InsertTextAfter(specialized_decl->getBody()->getLocStart(), aliases);
ReplaceReturnType(*rewriter, *templated_decl, specialized_decl->getReturnType().getAsString());
// Return type deduction: Specialize helper type trait for the
// template arguments used in this function specialization
std::string deduced_return_type;
if (deduce_return_type) {
deduced_return_type = "template<>\nstruct " + auto_deduction_helper_struct_name + "<";
deduced_return_type += template_argument_string;
deduced_return_type += "> {\n using type = ";
deduced_return_type += specialized_decl->getReturnType().getAsString();
deduced_return_type += ";\n};\n";
}
// Finalize the generated specialization
std::swap(rewriter, old_rewriter);
std::string content = static_cast<HierarchicalRewriter*>(old_rewriter.get())->GetContents();
rewriter->InsertTextAfter(specialized_decl->getLocEnd(), "\n\n// Specialization generated by CFTF\n" + deduced_return_type + "\ntemplate<>\n" + content);
}
}
// TODO: Only if we actually explicitly specialized anything!
// Now that all explicit specializations have been generated, remove
// the original template function definition since it still contains
// unmodified "future" C++ code
rewriter->ReplaceTextIncludingEndToken(templated_decl->getBody()->getSourceRange(), ";");
// Replace "auto"/"decltype(auto)" return type with a deduced type
// (introduced in C++14 via N3638). Since the deduced type may depend on
// template parameters, this is done using a helper type trait that maps
// template arguments to the deduced return type
if (deduce_return_type) {
auto* template_parameters = decl->getTemplateParameters();
// First, declare the helper type trait (which will have a separate
// definition generated for each implicit specialization)
std::string deduced_return_type_decl = "template";
deduced_return_type_decl += GetClosedStringFor(template_parameters->getLAngleLoc(), template_parameters->getRAngleLoc());
deduced_return_type_decl += "\nstruct " + auto_deduction_helper_struct_name + ";\n\n";
// NOTE: templated_decl->getLocStart() starts *after* the
// template<typename> part, so we indeed need decl->getLocStart()
// here instead
rewriter->ReplaceTextExcludingEndToken({decl->getLocStart(), decl->getLocStart()}, deduced_return_type_decl);
// Second, replace "auto" by referring to the helper type trait
std::string deduced_return_type_string = "typename " + auto_deduction_helper_struct_name + "<";
bool first_parameter = true;
for (auto& parameter : template_parameters->asArray()) {
if (!first_parameter) {
deduced_return_type_string += ", ";
}
first_parameter = false;
deduced_return_type_string += parameter->getNameAsString();
}
deduced_return_type_string += ">::type";
ReplaceReturnType(*rewriter, *templated_decl, deduced_return_type_string);
}
return true;
}
bool ASTVisitor::VisitFunctionDecl(clang::FunctionDecl* decl) {
if (decl->getDescribedFunctionTemplate() || decl->isFunctionTemplateSpecialization()) {
// If this function is templated (either a generic definition or a
// specialization), skip it since we handled it in
// TraverseFunctionTemplateDecl already
return true;
}
if (FunctionReturnTypeIsDeducedFromBody(context, decl)) {
ReplaceReturnType(*rewriter, *decl, decl->getReturnType().getAsString());
}
return true;
}
static std::string RebuildVarDecl(clang::SourceManager& sm, clang::VarDecl* decl) {
// TODO: Turn types like pseudo-code "(int[5])&& array" (currently printed as "int &&[5] t") into "int (&&t)[5]"
std::string new_decl = clang::QualType::getAsString(decl->getType().getSplitDesugaredType(), clang::PrintingPolicy{{}});
new_decl += ' ' + decl->getName().str();
if (auto init = decl->getInit()) {
new_decl += " = " + SourceRangeToString(sm, { init->getLocStart(), clang::Lexer::getLocForEndOfToken(init->getLocEnd(), 0, sm, {})});
}
return new_decl;
}
namespace ranges {
template<typename It, typename EndIt>
struct reverse_iterator {
reverse_iterator& operator++() {
--it;
return *this;
}
auto operator* () {
return *std::prev(it);
}
auto operator* () const {
return *std::prev(it);
}
bool operator!=(reverse_iterator oth) {
return it != oth.it;
}
It it;
};
template<typename Rng>
struct reversed {
using ForwardIt = decltype(std::declval<Rng>().begin());
using ForwardEndIt = decltype(std::declval<Rng>().end());
using iterator = reverse_iterator<ForwardIt, ForwardEndIt>;
reversed(Rng&& rng) : rng(std::forward<Rng>(rng)) {}
iterator begin() const {
return { rng.end() };
}
iterator end() const {
return { rng.begin() };
}
Rng&& rng;
};
} // namespace ranges
bool ASTVisitor::VisitDeclStmt(clang::DeclStmt* stmt) {
if (!IsInFullySpecializedFunction()) {
return true;
}
// The types used in declarations might be dependent on template
// parameters. That's not an issue usually since we re-export template
// parameter names in the specialized template, however for parameter packs
// this cannot be done. In those cases, we just replace the declaration
// type by the desugared type to get rid of the template parameter uses.
//
// Clang doesn't provide us with the SourceLocations to the type, so
// we need to replace the entire declaration with a manually crafted one
// instead of replacing just the type.
//
// When doing these rewrites, we need to be careful about multiple
// variables declared in the same statement
// (e.g. "stuff<T> first, *second = &first;").
// The easiest way to make sure we do this correctly is to just split up
// the declarations into separate statements.
for (auto decl : ranges::reversed(stmt->decls())) {
if (auto var_decl = clang::dyn_cast<clang::VarDecl>(decl)) {
// TODO: This needs to be more sophisticated for inplace-defined struct types!
auto new_decl = RebuildVarDecl(rewriter->getSourceMgr(), var_decl) + ';';
rewriter->InsertTextAfter(clang::Lexer::getLocForEndOfToken(stmt->getLocEnd(), 0, rewriter->getSourceMgr(), {}), new_decl);
} else if (clang::StaticAssertDecl::classof(decl)) {
// Nothing to do
} else {
std::cerr << "WARNING: Unimplemented Decl: " << decl->getDeclKindName() << std::endl;
}
}
// Delete the old declaration(s)
rewriter->ReplaceTextIncludingEndToken(stmt->getSourceRange(), "");
return true;
}
clang::Decl* ASTVisitor::FunctionTemplateInfo::FindTemplatedDecl(clang::SourceManager& sm, clang::Decl* specialized) const {
auto is_templated_decl = [&sm,specialized](clang::Decl* candidate) {
// Heuristic to match Decls against each other:
// * DeclKind must be the same
// * The SourceRange of the specialized Decl must be fully covered by
// the templated one (they don't need to be equal because specialized
// Decls may e.g. exclude the "..." from parameter packs, etc)
return candidate->getKind() == specialized->getKind() &&
sm.isPointWithin(specialized->getLocStart(), candidate->getLocStart(), candidate->getLocEnd()) &&
sm.isPointWithin(specialized->getLocEnd(), candidate->getLocStart(), candidate->getLocEnd());
};
auto match_it = std::find_if(decls.begin(), decls.end(), is_templated_decl);
if (match_it == decls.end()) {
return nullptr;
}
return *match_it;
}
bool ASTVisitor::VisitVarDecl(clang::VarDecl* decl) {
if (!IsInFullySpecializedFunction()) {
if (current_function_template) {
current_function_template->decls.push_back(decl);
}
return true;
}
// TODO: If the type of the declared variable is dependent on a template parameter, replace it
// NOTE: We only need to replace dependent types, but since it would be extra work to determine whether a type is dependent, we just apply this transformation to all types for now
// TODO: To reduce the danger of incorrect transformations, we should probably put in the extra work :/
{
// NOTE: getParents() only seems to return reliable results when called
// on the templated declaration rather than on decl directly.
auto templated_decl = current_function->template_info->FindTemplatedDecl(rewriter->getSourceMgr(), decl);
assert(templated_decl);
auto parents = context.getParents(*templated_decl);
auto node_is_decl_stmt = [](auto& node) {
auto ptr = node.template get<clang::Stmt>();
return ptr && clang::DeclStmt::classof(ptr);
};
if (std::any_of(parents.begin(), parents.end(), node_is_decl_stmt)) {
std::cerr << "Skipping VarDecl visitation because it was already handled in VisitDeclStmt" << std::endl;
return true;
}
}
// Ideally, we'd just rewrite the type in this declaration. However, libclang provides no way to get the SourceRange for this, so we instead rebuild a new declaration from scratch...
// NOTE: We silently drop the default arguments from the specialized
// signature. Keeping them certainly wouldn't be legal C++, since
// they are just the same as specified in the templated function
// declaration.
auto new_decl = RebuildVarDecl(rewriter->getSourceMgr(), decl);
if (auto pv_decl = clang::dyn_cast<clang::ParmVarDecl>(decl)) {
// This is part of a function signature, so the declaration needs more
// complicated treatment than other declarations since it could have
// been generated from an expanded parameter pack
auto templated = current_function->FindTemplatedParamDecl(pv_decl);
assert(templated);
if (templated->isParameterPack()) {
std::string addendum;
bool first_parameter = true;
for (auto& parameter_and_unique_name : current_function->FindSpecializedParamDecls(templated)) {
auto* parameter = parameter_and_unique_name.decl;
if (!first_parameter) {
addendum += ", ";
}
first_parameter = false;
// TODO: For on-the-fly declared template arguments like e.g. in "func<struct unnamed>()", getAsString will print spam such as "struct(anonymous namespace)::unnamed". We neither want that namespace nor do we want the "struct" prefix!
// NOTE: CppInsights has a lot more code to handle getting the parameter name and type...
addendum += parameter->getType().getAsString();
if (!parameter->getNameAsString().empty()) {
addendum += " ";
// Parameters generated from a parameter pack are all assigned the same name,
// so we need to distinguish the generated parameter names manually.
addendum += parameter_and_unique_name.unique_name;
}
}
// NOTE: decl->getLocEnd() and templated->getLocEnd() return
// different results in some cases. E.g. for
// "decltype(T{}) t", the former coincides with the start of
// the declaration, whereas the latter correctly spans the
// entire declaration. Similarly, for unnamed parameter packs
// such as "T...", decl->getLocEnd() stops before the
// ellipsis, whereas templated->getLocEnd() includes it
rewriter->ReplaceTextIncludingEndToken(templated->getSourceRange(), "/*Expansion of " + GetClosedStringFor(decl->getLocStart(), templated->getLocEnd()) + "{-*/" + addendum + "/*-}*/");
} else {
auto old_decl = GetClosedStringFor(decl->getLocStart(), decl->getLocEnd());
if (new_decl != old_decl) {
rewriter->ReplaceTextIncludingEndToken(decl->getSourceRange(), "/*" + old_decl + "*/" + new_decl);
} else {
std::cerr << "Skipped rewrite due to matching declarations" << std::endl;
}
}
} else {
// NOTE: In particular, VarDecls should always have been handled as
// part of VisitDeclStmt
std::cerr << "Unknown VarDecl kind " << decl->getDeclKindName() << std::endl;
assert(false);
}
return true;
}
} // namespace cftf
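
To make the return type deduction rewrite in `TraverseFunctionTemplateDecl` more concrete, here is a hand-written sketch of the shape of output one would expect for a small C++14 input. This is not actual CFTF output: the function `twice` and its body are hypothetical, and only the `cftf_deduced_return_type_` prefix, the re-exported `using T = ...;` alias, and the `template<>` specialization layout are taken from the code above. The assumed input is `template<typename T> auto twice(T value) { return static_cast<T>(value + value); }`, instantiated with `T = int`.

```cpp
#include <type_traits>

// Step 1: the helper type trait is declared ahead of the template. It gets
// one explicit specialization per implicit instantiation found in the AST.
template<typename T>
struct cftf_deduced_return_type_twice;

// Step 2: the original template keeps its signature, but "auto" is replaced
// by a lookup into the helper trait and the body is replaced by ";".
template<typename T>
typename cftf_deduced_return_type_twice<T>::type twice(T value);

// Step 3: for the instantiation twice<int>, the trait is specialized with the
// return type that clang deduced ...
template<>
struct cftf_deduced_return_type_twice<int> {
    using type = int;
};

// ... and the function body is emitted as an explicit specialization. The
// template parameter is re-exported as a local alias so that the unchanged
// body can keep referring to "T".
template<>
int twice<int>(int value) {
    using T = int;
    return static_cast<T>(value + value);
}

// The rewritten code is plain C++11, and the deduced type is preserved.
static_assert(std::is_same<decltype(twice(21)), int>::value,
              "twice<int> should still return int");
```

Note that the trait specialization has to appear before the function specialization, since the primary template's return type is looked up through it. This matches the order in which the two strings are concatenated when the generated specialization is inserted.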
int main() {
// Do nothing. Currently, all tests are implemented via static_asserts
}
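
Since the tests are expressed as `static_assert`s, a check for the parameter pack rewriting could look roughly like the sketch below. It is hand-written to mirror what the visitor code above appears to generate, not captured from a real CFTF run: `sum3` and `forward_sum` are hypothetical names, and the placement of the leftover comments is approximate. The assumed input is `template<typename... T> constexpr int forward_sum(T... t) { return sum3(t...) + sizeof...(T); }`, instantiated with three `int` arguments. The names `t1`, `t2`, `t3` follow the 1-based counter scheme of `MakeUniqueParameterPackName`.

```cpp
// Hypothetical ordinary function that the template forwards its pack to.
constexpr int sum3(int a, int b, int c) { return a + b + c; }

// The original template after rewriting: its body has been replaced by ";".
template<typename... T>
constexpr int forward_sum(T... t);

// Specialization generated by CFTF
// - the pack parameter "T... t" is expanded into uniquely named parameters
// - the expansion "t..." is replaced by one copy per pack element
// - "sizeof...(T)" is replaced by the pack length
template<>
constexpr int forward_sum<int, int, int>(/*Expansion of T... t{-*/int t1, int t2, int t3/*-}*/) {
    return sum3(/*t... of size 3*/t1, t2, t3) + /*sizeof...(T)*/3;
}

// 1 + 2 + 3 from sum3, plus 3 for the pack length.
static_assert(forward_sum(1, 2, 3) == 9, "expanded pack should be forwarded element by element");
```

The functions are marked `constexpr` here only so that the result can be checked at compile time. The rewriting itself does not depend on that.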
@blockspacer

# include the minikube IP (192.168.99.100)
export NO_PROXY=192.168.99.100,.....

# NOTE DOCKER_OPTS below

# sudo -E DOCKER_OPTS='--insecure-registry registry.docker.io --insecure-registry production.cloudflare.docker.com' \
#  docker build  \
#  --build-arg http_proxy=http://172.17.0.1:3128 \
#  --build-arg https_proxy=http://172.17.0.1:3128 \
#  --build-arg no_proxy=localhost,127.0.0.*,10.*,192.168.*,*.somecorp.ru,*.mycorp.ru \
#  --build-arg HTTP_PROXY=http://172.17.0.1:3128 \
#  --build-arg HTTPS_PROXY=http://172.17.0.1:3128 \
#  --build-arg NO_PROXY=localhost,127.0.0.*,10.*,192.168.*,*.somecorp.ru,*.mycorp.ru \
#  --no-cache -t cpp-docker-cxxctp .
