alexcrichton · September 24, 2017 14:42
diff --git a/a-summary.md b/a-summary.md
diff --git a/compiletime1 b/compiletime1
 This is the amount of time it takes to compile the `cargo` crate with the
 following settings. Here, `lto=true` means that ThinLTO is enabled for just
 the `cargo` crate itself.

 cgus= 1 opt_level=2 lto=true  Duration { secs: 126, nanos: 295528106 }
 cgus= 1 opt_level=2 lto=false Duration { secs: 126, nanos: 413709122 }
 cgus= 1 opt_level=3 lto=true  Duration { secs: 124, nanos: 142468752 }
 cgus= 1 opt_level=3 lto=false Duration { secs: 120, nanos: 743886461 }
 cgus= 2 opt_level=2 lto=true  Duration { secs: 93, nanos: 27925735 }
 cgus= 2 opt_level=2 lto=false Duration { secs: 70, nanos: 59352787 }
 cgus= 2 opt_level=3 lto=true  Duration { secs: 96, nanos: 455173050 }
 cgus= 2 opt_level=3 lto=false Duration { secs: 70, nanos: 290531058 }
 cgus= 3 opt_level=2 lto=true  Duration { secs: 63, nanos: 776553622 }
 cgus= 3 opt_level=2 lto=false Duration { secs: 46, nanos: 874165672 }
 cgus= 3 opt_level=3 lto=true  Duration { secs: 63, nanos: 953188908 }
 cgus= 3 opt_level=3 lto=false Duration { secs: 47, nanos: 541836063 }
 cgus= 4 opt_level=2 lto=true  Duration { secs: 61, nanos: 111946079 }
 cgus= 4 opt_level=2 lto=false Duration { secs: 45, nanos: 426991057 }
 cgus= 4 opt_level=3 lto=true  Duration { secs: 62, nanos: 464272517 }
 cgus= 4 opt_level=3 lto=false Duration { secs: 45, nanos: 969150948 }
 cgus= 8 opt_level=2 lto=true  Duration { secs: 55, nanos: 296671216 }
 cgus= 8 opt_level=2 lto=false Duration { secs: 39, nanos: 349206076 }
 cgus= 8 opt_level=3 lto=true  Duration { secs: 56, nanos: 582326934 }
 cgus= 8 opt_level=3 lto=false Duration { secs: 39, nanos: 808054774 }
 cgus=16 opt_level=2 lto=true  Duration { secs: 45, nanos: 830051522 }
 cgus=16 opt_level=2 lto=false Duration { secs: 31, nanos: 280283218 }
 cgus=16 opt_level=3 lto=true  Duration { secs: 47, nanos: 504148287 }
 cgus=16 opt_level=3 lto=false Duration { secs: 32, nanos: 703304794 }
 cgus=32 opt_level=2 lto=true  Duration { secs: 42, nanos: 300037233 }
 cgus=32 opt_level=2 lto=false Duration { secs: 28, nanos: 80229071 }
 cgus=32 opt_level=3 lto=true  Duration { secs: 43, nanos: 666276767 }
 cgus=32 opt_level=3 lto=false Duration { secs: 29, nanos: 218090742 }
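
To compare the ThinLTO and non-LTO rows directly, the `Duration { secs, nanos }` values can be converted to seconds and divided. The following is a minimal sketch, not part of the original measurements: the names `parse_row`, `load`, and `thinlto_overhead` are hypothetical, and the four sample rows are copied from the table above.

```python
import re

# Matches one row of the table above, e.g.
# "cgus=16 opt_level=2 lto=true  Duration { secs: 45, nanos: 830051522 }"
ROW = re.compile(
    r"cgus=\s*(\d+)\s+opt_level=(\d+)\s+lto=(true|false)\s+"
    r"Duration \{ secs: (\d+), nanos: (\d+) \}"
)

def parse_row(line):
    """Return ((cgus, opt_level, lto), seconds), or None if the line is not a row."""
    m = ROW.search(line)
    if m is None:
        return None
    cgus, opt, lto, secs, nanos = m.groups()
    return (int(cgus), int(opt), lto == "true"), int(secs) + int(nanos) / 1e9

def load(text):
    """Collect every parsable row of `text` into a dict keyed by settings."""
    rows = (parse_row(line) for line in text.splitlines())
    return dict(row for row in rows if row is not None)

def thinlto_overhead(times, cgus, opt_level):
    """Ratio of the ThinLTO build time to the non-LTO build time."""
    return times[(cgus, opt_level, True)] / times[(cgus, opt_level, False)]

# Four rows copied from the table above.
SAMPLE = """\
cgus= 1 opt_level=2 lto=true  Duration { secs: 126, nanos: 295528106 }
cgus= 1 opt_level=2 lto=false Duration { secs: 126, nanos: 413709122 }
cgus=16 opt_level=2 lto=true  Duration { secs: 45, nanos: 830051522 }
cgus=16 opt_level=2 lto=false Duration { secs: 31, nanos: 280283218 }
"""

times = load(SAMPLE)
for cgus in (1, 16):
    print(f"cgus={cgus:>2}  ThinLTO/no-LTO = {thinlto_overhead(times, cgus, 2):.2f}")
```

A ratio above 1 means the ThinLTO build took longer than the non-LTO build at the same settings.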
diff --git a/compiletime2 b/compiletime2
 This is a comparison of how long it takes to compile the `cargo` crate
 in *debug* mode, using the specified number of codegen units.

 cgus= 1 Duration { secs: 25, nanos: 987906757 }
 cgus= 2 Duration { secs: 19, nanos: 634406885 }
 cgus= 4 Duration { secs: 16, nanos: 760811018 }
 cgus= 8 Duration { secs: 14, nanos: 575499872 }
 cgus=16 Duration { secs: 14, nanos: 149127165 }
 cgus=32 Duration { secs: 14, nanos: 495588126 }
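
The debug-mode numbers are easier to read as a speedup over a single codegen unit. This is a small sketch under the assumption that each timing is simply `secs + nanos / 1e9`; the helper name `speedup_vs_single_cgu` is hypothetical, and the values are transcribed from the table above.

```python
# Debug-mode timings from the table above, in seconds (secs + nanos / 1e9).
DEBUG_SECONDS = {
    1: 25.987906757,
    2: 19.634406885,
    4: 16.760811018,
    8: 14.575499872,
    16: 14.149127165,
    32: 14.495588126,
}

def speedup_vs_single_cgu(times):
    """Map each codegen-unit count to (time with 1 unit) / (time with N units)."""
    base = times[1]
    return {cgus: base / t for cgus, t in times.items()}

for cgus, s in speedup_vs_single_cgu(DEBUG_SECONDS).items():
    print(f"cgus={cgus:>2}  {DEBUG_SECONDS[cgus]:6.2f}s  x {s:.2f}")
```

In this data set the shortest time is at 16 codegen units, and 32 units is slightly slower than 16.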
diff --git a/runtime1 b/runtime1
 This is the result of `cargo benchcmp` on the `regex` crate benchmark suite. It
 compares one codegen unit without ThinLTO (the default today) against
 8 codegen units with ThinLTO enabled.
 
 name                                    cgu-1-lto-false ns/iter  cgu-8-lto-true ns/iter  diff ns/iter   diff %  speedup 
 misc::anchored_literal_long_non_match   19 (20526 MB/s)          26 (15000 MB/s)                    7   36.84%   x 0.73 
 misc::anchored_literal_short_match      22 (1181 MB/s)           24 (1083 MB/s)                     2    9.09%   x 0.92 
 misc::anchored_literal_short_non_match  18 (1444 MB/s)           26 (1000 MB/s)                     8   44.44%   x 0.69 
 misc::easy0_1K                          15 (70066 MB/s)          16 (65687 MB/s)                    1    6.67%   x 0.94 
 misc::easy0_1MB                         18 (58255722 MB/s)       20 (52430150 MB/s)                 2   11.11%   x 0.90 
 misc::easy0_32                          15 (3933 MB/s)           17 (3470 MB/s)                     2   13.33%   x 0.88 
 misc::easy0_32K                         15 (2186333 MB/s)        17 (1929117 MB/s)                  2   13.33%   x 0.88 
 misc::literal                           15 (3400 MB/s)           14 (3642 MB/s)                    -1   -6.67%   x 1.07 
 misc::match_class                       62 (1306 MB/s)           66 (1227 MB/s)                     4    6.45%   x 0.94 
 misc::medium_1K                         15 (70133 MB/s)          17 (61882 MB/s)                    2   13.33%   x 0.88 
 misc::medium_1MB                        18 (58255777 MB/s)       21 (49933523 MB/s)                 3   16.67%   x 0.86 
 misc::medium_32                         16 (3750 MB/s)           17 (3529 MB/s)                     1    6.25%   x 0.94 
 misc::medium_32K                        15 (2186400 MB/s)        17 (1929176 MB/s)                  2   13.33%   x 0.88 
 misc::replace_all                       163                      175                               12    7.36%   x 0.93 
 misc::reverse_suffix_no_quadratic       5,230 (1529 MB/s)        4,198 (1905 MB/s)             -1,032  -19.73%   x 1.25 
 regexdna::subst1                        895,716 (5675 MB/s)      825,029 (6161 MB/s)          -70,687   -7.89%   x 1.09 
 sherlock::name_alt2                     112,522 (5287 MB/s)      119,522 (4977 MB/s)            7,000    6.22%   x 0.94 
 sherlock::name_alt3                     123,715 (4808 MB/s)      130,415 (4561 MB/s)            6,700    5.42%   x 0.95 
 sherlock::name_alt5                     116,989 (5085 MB/s)      123,665 (4810 MB/s)            6,676    5.71%   x 0.95 
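
The last three columns of the table above are consistent with simple arithmetic on the two `ns/iter` values. The sketch below assumes those formulas (difference, difference relative to the old value, and old divided by new); the function name `benchcmp_columns` is hypothetical and is not taken from the `cargo benchcmp` source.

```python
def benchcmp_columns(old_ns, new_ns):
    """Recompute the `diff ns/iter`, `diff %`, and `speedup` columns.

    Assumed formulas, checked against rows of the table above:
      diff    = new - old
      diff %  = 100 * diff / old
      speedup = old / new   (below 1 means the new build is slower)
    """
    diff = new_ns - old_ns
    diff_pct = 100.0 * diff / old_ns
    speedup = old_ns / new_ns
    return diff, round(diff_pct, 2), round(speedup, 2)

# misc::anchored_literal_long_non_match: 19 ns/iter -> 26 ns/iter
print(benchcmp_columns(19, 26))
# misc::reverse_suffix_no_quadratic: 5,230 ns/iter -> 4,198 ns/iter
print(benchcmp_columns(5230, 4198))
```

Read this way, a positive `diff %` with a speedup below 1 is a regression for the ThinLTO build, and a negative `diff %` with a speedup above 1 is an improvement.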
diff --git a/runtime2 b/runtime2
 This is the result of `cargo benchcmp` on the `regex` crate benchmark suite. It
 compares one codegen unit without ThinLTO (the default today) against
 16 codegen units with ThinLTO enabled.

 name                                    cgu-1-lto-false ns/iter  cgu-16-lto-true ns/iter  diff ns/iter   diff %  speedup 
 misc::anchored_literal_long_non_match   19 (20526 MB/s)          26 (15000 MB/s)                     7   36.84%   x 0.73 
 misc::anchored_literal_short_non_match  18 (1444 MB/s)           27 (962 MB/s)                       9   50.00%   x 0.67 
 misc::easy0_1K                          15 (70066 MB/s)          17 (61823 MB/s)                     2   13.33%   x 0.88 
 misc::easy0_1MB                         18 (58255722 MB/s)       19 (55189631 MB/s)                  1    5.56%   x 0.95 
 misc::easy0_32                          15 (3933 MB/s)           16 (3687 MB/s)                      1    6.67%   x 0.94 
 misc::easy0_32K                         15 (2186333 MB/s)        16 (2049687 MB/s)                   1    6.67%   x 0.94 
 misc::hard_1K                           60 (17516 MB/s)          64 (16421 MB/s)                     4    6.67%   x 0.94 
 misc::hard_32                           60 (983 MB/s)            64 (921 MB/s)                       4    6.67%   x 0.94 
 misc::hard_32K                          60 (546583 MB/s)         64 (512421 MB/s)                    4    6.67%   x 0.94 
 misc::literal                           15 (3400 MB/s)           14 (3642 MB/s)                     -1   -6.67%   x 1.07 
 misc::medium_1K                         15 (70133 MB/s)          16 (65750 MB/s)                     1    6.67%   x 0.94 
 misc::medium_1MB                        18 (58255777 MB/s)       20 (52430200 MB/s)                  2   11.11%   x 0.90 
 misc::medium_32K                        15 (2186400 MB/s)        16 (2049750 MB/s)                   1    6.67%   x 0.94 
 misc::replace_all                       163                      181                                18   11.04%   x 0.90 
 misc::reverse_suffix_no_quadratic       5,230 (1529 MB/s)        4,197 (1906 MB/s)              -1,033  -19.75%   x 1.25 
 regexdna::subst1                        895,716 (5675 MB/s)      815,747 (6231 MB/s)           -79,969   -8.93%   x 1.10 
 sherlock::name_alt2                     112,522 (5287 MB/s)      119,102 (4995 MB/s)             6,580    5.85%   x 0.94 
 sherlock::name_alt3                     123,715 (4808 MB/s)      129,966 (4577 MB/s)             6,251    5.05%   x 0.95 
 sherlock::name_alt5                     116,989 (5085 MB/s)      123,255 (4826 MB/s)             6,266    5.36%   x 0.95 
 sherlock::repeated_class_negation       79,450,010 (7 MB/s)      85,060,104 (6 MB/s)         5,610,094    7.06%   x 0.93