Created
May 13, 2020 17:54
-
-
Save ganler/af0cc52df8e1783dcd21f7d2c9a69b49 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// Getting start with LLVM | |
// https://www.youtube.com/watch?v=3QQuhL-dSys | |
/* | |
* LLVM Primer | |
* Transformation Examples | |
* Middle-End Pass & Backend Pass | |
*/ | |
/* | |
* LLVM: | |
* | |
* Intermediate Representation is a generic assembly language. | |
* - Easy to handle many transformations. (e.g., replace instruction A with instruction B) | |
* - Easy to lower to different targets. | |
* - Optimize IR code -> Optimize Target Codes. | |
* | |
* Instructions: | |
* | |
* - Have an opcode. | |
* - Take a set of input operands. | |
* - Produce one or zero result values. | |
* // %sum = add i32 %a, 10 | |
* - Call instructions. | |
* - Instructions has types | |
* - [Basic Block] ~ label <-> Invocation - Finalization(see my gist: https://gist.github.com/ganler/7b77f8d722562b0059d1982bede733d4) | |
* - [Functions] First basic block of a function is the special entry block. | |
* - [Module] Top level container | |
* - Function. | |
* - Declaration. | |
* - Global vars. | |
* - Others... | |
*/ | |
/* | |
* // Instructions/User/Uses | |
* - Result value of an instruction is referred to by an identifier(not variables) | |
@code | |
%sum = add i32 %a, 10 | |
%cond = icmp eq %sum, 99 | |
@endcode | |
* - % + DefName: e.g., %sum = add i32 %a, 10 | |
* - %sum is a def. | |
* - %cond uses %sum. | |
* - %cond is a user of %sum. | |
* | |
* - You can query the users of some def. | |
* - All uses must be reachable from their def. | |
* | |
* - 3 different formats of LLVM IR. | |
* -> textual: .ll (human readable) | |
* -> bitcode: .bc (efficient for storage) | |
* -> in-memory: in-memory representation. | |
*/ | |
/* | |
* IR Transformation Examples. | |
* | |
* >> Removing Dead Blocks. | |
@code | |
entry: | |
ret i32 1 | |
bb: # Dead block. | |
ret i32 2 | |
@endcode | |
* | |
*/ | |
#include <llvm/IR/Function.h> | |
#include <llvm/IR/CFG.h> | |
// https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/trace-gen-llvm/ | |
// phi <T> [var_i, $entry_i], ... // if it'called from $entry_i then return var_i; | |
bool remove_dead_blks(llvm::Function& f) { | |
bool changed = false; | |
// * Remove blk without preds. | |
// * Destroy uses when destroying blocks. | |
// * Remove the blk label in his successors. | |
for(llvm::BasicBlock& blk : llvm::make_early_inc_range(f)) { | |
// A good blk should have predecessors. | |
if (!llvm::pred_empty(&blk)) // Filter out blocks with predecessors. | |
continue; // If you have predecessors, it means you have users. | |
// A good blk should have users(blk.users()). | |
// Entry should not be removed. | |
if (&f.getEntryBlock() == &blk) // Users have implicit predecessors. (caller) | |
continue; | |
// If any one is using label of `blk` but not jumping into it. Remove it! | |
for (auto* succ : llvm::successors(&blk)) | |
succ->removePredecessor(&blk); | |
while(!blk.empty()) { | |
// If blk is a dead blk, the uses of his definitions should go empty. | |
// and all his sub instructions should be removed. | |
auto& instr = blk.back(); | |
// Pop back. | |
instr.replaceAllUsesWith(llvm::UndefValue::get(instr.getType())); | |
instr.eraseFromParent(); | |
} | |
// Aha, got it! Delete it! | |
blk.eraseFromParent(); | |
// Indicate we made changes. | |
changed = true; | |
} | |
return changed; | |
} | |
// * Another Example of simplifying Conditional Branches. | |
/* >>> Practices | |
* https://zhuanlan.zhihu.com/p/66793637 | |
* | |
* llvm ir: | |
* - format: `.bc`(bitcode) `.ll`(human readable) | |
* - hr: `clang -S -emit-llvm main.c` | |
* - bitcode: `clang -c -emit-llvm main.c` | |
* - .ll -> .bc: llvm-as | |
* - .bc -> .ll: llvm-dis | |
* - llvm-link: link multiple .bc. (You can do it across languages!) | |
* : $ llvm-link factorial.bc main.bc -o linked.bc # lined.bc | |
* - llc --march=x86-64 linked.bc | |
* - Strong type lang. | |
*/ | |
/* | |
* Middle-End passes are trying to minimize the number of IR instructions and simplify control flow. | |
* | |
* Not modelled in IR. | |
* | |
* - Instruction Set. | |
* - Instruction Lat. | |
* - Available Registers | |
* | |
* TargetTransformationInfo. | |
* - Hook to query. | |
* - How big is a cache line. | |
* - GetXXXCost. | |
* | |
* Why do we need TargetTransformationInfo? | |
* - merge multiple instructions. | |
* - Codegen Passes. | |
* | |
* IR -> MIR(virtual/physical register) -> Machine Code | |
* | |
*/ |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment