C++ links: Performance Modeling: Criticality
https://github.com/MattPD/cpplinks / Performance / Modeling
Criticality, Critical Path Analysis
- Calipers: A Criticality-aware Framework for Modeling Processor Performance
- 2022
- Hossein Golestani, Rathijit Sen, Vinson Young, Gagan Gupta
- https://arxiv.org/abs/2201.05884
- https://github.com/microsoft/calipers
- Critical Path based Microarchitectural Bottleneck Analysis for Out-of-Order Execution
- IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol.E102-A, No.6, Jun. 2019
- Teruo Tanimoto, Takatsugu Ono, Koji Inoue
- https://doi.org/10.1587/transfun.E102.A.758
- https://search.ieice.org/bin/index.php?category=A&year=2019&vol=E102-A&num=6&lang=E
- A whirlwind introduction to dataflow graphs
- Concurrency Analysis in Dynamic Dataflow Graphs
- IEEE Transactions on Emerging Topics in Computing, 1–1 (2018)
- Tiago Alves, Leandro A. J. Marzulo, Felipe M. G. Franca, Sandip Kundu
- http://ieeexplore.ieee.org/document/8269827/
- GDP: Using Dataflow Properties to Accurately Estimate Interference-Free Performance at Runtime
- High-Performance Computer Architecture (HPCA) 2018
- Magnus Jahre, Lieven Eeckhout
- http://www.idi.ntnu.no/~jahre/hpca18-gdp-author-copy.pdf
- Graph-based Dynamic Performance (GDP)
- Overview of GDP Performance Accounting - https://www.youtube.com/watch?v=IsqFxTCDbEc
- Dependence Graph Model for Accurate Critical Path Analysis on Out-of-Order Processors
- Teruo Tanimoto, Takatsugu Ono, Koji Inoue
- Journal of Information Processing Volume 25 (2017)
- https://doi.org/10.2197/ipsjjip.25.983
- https://www.jstage.jst.go.jp/article/ipsjjip/25/0/25_983/_article/-char/en
- "We improve the dependence graph model based on two key microarchitectural details that have not been taken into account in previous models: dynamic variation of branch misprediction penalty and a modern SQ (store queue) design which combines the SQ and the writeback buffer (WBB)."
- Enhanced Dependence Graph Model for Critical Path Analysis on Modern Out-of-Order Processor
- IEEE Computer Architecture Letters, vol. 16, no. 2, 2017
- T. Tanimoto, T. Ono, K. Inoue and H. Sasaki
- https://www.semanticscholar.org/paper/Enhanced-Dependence-Graph-Model-for-Critical-Path-Tanimoto-Ono/2050aac668e6fe6edbbff82265ebec8d3c365600
- http://ieeexplore.ieee.org/document/7882625/
- RpStacks: Fast and Accurate Processor Design Space Exploration Using Representative Stall-Event Stacks
- MICRO 2014
- Jaewon Lee, Hanhwi Jang, Jangwoo Kim
- https://www.semanticscholar.org/paper/RpStacks%3A-Fast-and-Accurate-Processor-Design-Space-Lee-Jang/7248a77e6b0c46225c175f61fa9fbcec94dc034f
- Using Criticality to Attack Performance Bottlenecks
- Interaction Cost and Shotgun Profiling
- ACM Trans. on Architecture and Compiler Optimizations (TACO), September 2004
- Brian A. Fields, Rastislav Bodik, Mark D. Hill, Chris J. Newburn
- http://www.cs.wisc.edu/multifacet/authorizer_links/taco04_icost.html
- http://www.cs.wisc.edu/multifacet/papers/taco04_icost.pdf
- Interaction Cost: For When Event Counts Just Don't Add Up
- IEEE Micro Special Issue: Micro's Top Picks from Microarchitecture Conferences, November-December 2004
- Brian A. Fields, Rastislav Bodik, Mark D. Hill, and Chris J. Newburn
- http://www.cs.wisc.edu/multifacet/papers/ieeemicro04_icost.pdf
- Execution Slack and Criticality
- Nikos Hardavellas - Guest Lecture: Computer Architecture (SCS 15-740), School of Computer Science, Carnegie Mellon, September 2003
- http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/www/lectures/criticality_slack.pdf
- Quantifying Instruction Criticality for Shared Memory Multiprocessors
- SPAA 2003
- http://people.ee.duke.edu/~sorin/papers/spaa03_criticality.pdf
- https://www.semanticscholar.org/paper/Quantifying-instruction-criticality-for-shared-mem-Li-Lebeck/5637e7074aa67d07a54f975a568f445411f8f609
- https://www.researchgate.net/publication/221257693_Quantifying_instruction_criticality_for_shared_memory_multiprocessors
- Using Interaction Costs for Microarchitectural Bottleneck Analysis
- MICRO-36 2003
- B. A. Fields, R. Bodik, M. D. Hill, C. J. Newburn
- https://www.microarch.org/micro36/html/pdf/fields-UsingInteractionCost.pdf
- http://dx.doi.org/10.1109/MICRO.2003.1253198
- http://www.cs.wisc.edu/multifacet/papers/micro03_icost.pdf
- http://www.cs.wisc.edu/multifacet/papers/micro03_icost.ppt
- http://doi.acm.org/10.1145/1022969.1022971
- Non-vital Loads
- HPCA 2002
- Ryan Rakvic, Bryan Black, Deepak Limaye, John P. Shen
- http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/criticality/non_vital_loads_hpca02.pdf
- Quantifying Instruction Criticality
- Parallel Architectures and Compilation Techniques (PACT) 2002
- Eric Tune, Dean M. Tullsen, Brad Calder
- https://cseweb.ucsd.edu/~calder/papers/PACT-02-CP.pdf
- "The slack of an instruction is the number of cycles that the instruction can be delayed without increasing the execution time of the program."
- "We define the tautness for an instruction as the number of cycles by which execution time is reduced when the result of that instruction was made available to other instructions immediately."
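The two quoted definitions can be made concrete on a toy dependence graph. The following is only a minimal sketch under simplifying assumptions, not the paper's method: the `Instr` type and the functions `execution_time`, `slack`, and `tautness` are hypothetical names, execution time is taken to be the longest dependence chain, a delay is modeled as extra latency, and "result available immediately" is modeled as zero latency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: one node per dynamic instruction, listed in
// program order, so every producer index is smaller than its consumer's.
struct Instr {
    int latency;                    // cycles from start to result
    std::vector<std::size_t> deps;  // indices of producer instructions
};

// Execution time in this model: the length of the longest dependence chain.
int execution_time(const std::vector<Instr>& prog) {
    std::vector<int> finish(prog.size(), 0);
    int total = 0;
    for (std::size_t i = 0; i < prog.size(); ++i) {
        int start = 0;
        for (std::size_t d : prog[i].deps) start = std::max(start, finish[d]);
        finish[i] = start + prog[i].latency;
        total = std::max(total, finish[i]);
    }
    return total;
}

// Slack: the largest delay (modeled as extra latency) that instruction i
// tolerates before the execution time grows. Brute force, for clarity.
int slack(std::vector<Instr> prog, std::size_t i) {
    assert(i < prog.size());
    const int before = execution_time(prog);
    int delay = 0;
    for (;;) {
        prog[i].latency += 1;
        if (execution_time(prog) > before) return delay;
        ++delay;
    }
}

// Tautness: cycles saved when the result of instruction i is available
// immediately (modeled as latency zero).
int tautness(std::vector<Instr> prog, std::size_t i) {
    assert(i < prog.size());
    const int before = execution_time(prog);
    prog[i].latency = 0;
    return before - execution_time(prog);
}
```

For a three-instruction program in which instruction 2 consumes instruction 0 (latency 3) and instruction 1 (latency 1), instruction 0 lies on the critical path and instruction 1 does not, so in this model a critical instruction has zero slack and non-zero tautness, and the reverse holds for the non-critical one.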
- Slack: Maximizing Performance Under Technological Constraints
- International Symposium on Computer Architecture ISCA 2002
- Brian Fields, Rastislav Bodik, Mark D. Hill
- https://www.semanticscholar.org/paper/Slack%3A-Maximizing-Performance-Under-Technological-Fields-Bod%C3%ADk/15a0fdfe4566cc6b14e49542b147b039bb19fce0
- https://minds.wisconsin.edu/bitstream/handle/1793/8670/file_1.pdf?sequence=1
- http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/slack/isca02.pdf
- paper: http://www.cs.wisc.edu/multifacet/papers/isca02_slack.pdf
- slides: http://www.cs.wisc.edu/multifacet/papers/isca02_slack_talk.ppt
- The slack of an instruction i is the number of cycles i can be delayed without increasing the overall execution time.
- explicit slack, i.e., the actual value of an instruction’s slack
- implicit slack, i.e., whether an instruction can tolerate the delay of a particular slow (non-uniform) resource
- Three variants:
- Local slack of a dynamic instruction i is the maximum number of cycles the execution of i can be delayed without delaying any subsequent instructions.
- Global slack of a dynamic instruction i is the maximum number of cycles the execution of i can be delayed without delaying the last instruction in the program.
- Apportioned slack captures slack available when we desire to delay multiple instructions simultaneously. ... More formally, let S be an assignment of some amount of slack (possibly zero) to each instruction in such a way that the last instruction is not delayed. Given an assignment of slack S, the apportioned slack of instruction i is S(i), i.e., the slack assigned to i.
- After execution, the slack is computed by determining how far latencies can be extended without lengthening the critical path of the graph.
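The difference between local and global slack can be illustrated with a forward and a backward pass over a small dependence graph. This is a minimal sketch under stated assumptions rather than the paper's model: `Node`, `SlackResult`, and `compute_slack` are hypothetical names, the graph contains only data-dependence edges with fixed latencies, and an instruction with no consumers is assumed to be bounded by the end of execution.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy graph: nodes are dynamic instructions in program order.
struct Node {
    int latency;                    // cycles from start to result
    std::vector<std::size_t> deps;  // producers; each index must be < own index
};

struct SlackResult {
    std::vector<int> local;   // delay that postpones no consumer
    std::vector<int> global;  // delay that does not lengthen total time
    int total;                // critical-path length
};

SlackResult compute_slack(const std::vector<Node>& g) {
    const std::size_t n = g.size();
    std::vector<int> start(n, 0), finish(n, 0);
    int total = 0;

    // Forward pass: earliest start and finish time of each instruction.
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t d : g[i].deps) {
            assert(d < i);  // producers precede consumers
            start[i] = std::max(start[i], finish[d]);
        }
        finish[i] = start[i] + g[i].latency;
        total = std::max(total, finish[i]);
    }

    // Backward pass: latest finish time that keeps `total` unchanged, and
    // the earliest start among direct consumers (assumed `total` if none).
    std::vector<int> latest(n, total), next_start(n, total);
    for (std::size_t i = n; i-- > 0;) {
        const int latest_start = latest[i] - g[i].latency;
        for (std::size_t d : g[i].deps) {
            latest[d] = std::min(latest[d], latest_start);
            next_start[d] = std::min(next_start[d], start[i]);
        }
    }

    SlackResult r{std::vector<int>(n), std::vector<int>(n), total};
    for (std::size_t i = 0; i < n; ++i) {
        r.local[i] = next_start[i] - finish[i];
        r.global[i] = latest[i] - finish[i];
    }
    return r;
}
```

On a graph where instruction 0 feeds instruction 1, and instruction 3 waits for both instruction 1 and a long-latency instruction 2, instruction 0 has no local slack (any delay postpones instruction 1) but positive global slack, because instruction 1 can absorb the delay before instruction 3 needs its result. Apportioned slack is not computed here, since it depends on a chosen assignment S.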
- Focusing Processor Policies via Critical-Path Prediction
- International Symposium on Computer Architecture ISCA 2001
- Brian Fields, Shai Rubin, Rastislav Bodik
- https://dl.acm.org/citation.cfm?id=379253
- http://carch.mycpanel2.princeton.edu/wordpress/2017/03/12/3132017-focusing-processor-policies-via-critical-path-prediction/
- http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/slack/isca01a.pdf
- https://www.semanticscholar.org/paper/Focusing-processor-policies-via-critical-path-pred-Fields-Rubin/01bae80d26360a4e9161a4b3d2b27b7f1a179996
- http://slidesprint.com/focusing-processor-policies-via-criticalpath-prediction-ppt
- a dependence-graph-based model of the critical path of a microexecution
- Locality vs. Criticality
- ISCA 2001
- Srikanth T. Srinivasan, Roy Dz-ching Ju, Alvin R. Lebeck, Chris Wilkerson
- http://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/criticality/locality-vs-criticality-isca01.pdf