import collections
import random
import timeit

pop = [
    (b := random.randint(1900, 2000), random.randint(b + 1, b + 100))
    for _ in range(10000)
]


# 5.049 s (with 10000 people [1,100] years old)
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    living_per_year = {}
    for birth, death in population:
        for year in range(birth, death):
            if year in living_per_year:
                living_per_year[year] += 1
            else:
                living_per_year[year] = 1
    return living_per_year


print(sorted(tuple(count_living_per_year(pop).items())))
print(
    timeit.timeit(
        "count_living_per_year(pop)",
        globals=globals(),
        number=100,
    )
)


# 3.697 s (with 10000 people [1,100] years old)
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    living_per_year = {}
    for birth, death in population:
        for year in range(birth, death):
            living_per_year[year] = living_per_year.get(year, 0) + 1
    return living_per_year


print(sorted(tuple(count_living_per_year(pop).items())))
print(
    timeit.timeit(
        "count_living_per_year(pop)",
        globals=globals(),
        number=100,
    )
)


# 3.929 s (with 10000 people [1,100] years old)
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    living_per_year = collections.defaultdict(int)
    for birth, death in population:
        for year in range(birth, death):
            living_per_year[year] += 1
    return living_per_year


print(sorted(tuple(count_living_per_year(pop).items())))
print(
    timeit.timeit(
        "count_living_per_year(pop)",
        globals=globals(),
        number=100,
    )
)


# 4.084 s (with 10000 people [1,100] years old)
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    return collections.Counter(
        year for birth, death in population for year in range(birth, death)
    )


print(sorted(tuple(count_living_per_year(pop).items())))
print(
    timeit.timeit(
        "count_living_per_year(pop)",
        globals=globals(),
        number=100,
    )
)
Thanks :D
You will find a pure Python optimized version here: https://gist.github.com/dorianturba/e9c4210a5545d8447ecd35dad8a43355#file-count_living_per_year_perf-py-L104
My benchmark reaches 0.09 s for this.
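The linked gist is not reproduced here, but one common pure-Python optimization in the same spirit (a sketch, not necessarily the linked code) is to record only birth/death deltas and take a running sum, instead of touching every person-year:

```python
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    # +1 the year someone is born, -1 the year they die; the running
    # sum over years is then the number of people alive that year.
    delta: dict[int, int] = {}
    for birth, death in population:
        delta[birth] = delta.get(birth, 0) + 1
        delta[death] = delta.get(death, 0) - 1
    living_per_year = {}
    alive = 0
    for year in range(min(delta), max(delta) + 1):
        alive += delta.get(year, 0)
        if alive:  # match the loop versions: only years with someone alive
            living_per_year[year] = alive
    return living_per_year
```

This performs O(len(population)) dict updates plus one pass over the year span, rather than one dict update per person-year.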
A conceptually simpler approach that achieves the same, and gets me down to about 20 milliseconds on my machine (between 17 and 22 across several runs), or about 0.02 s: https://gist.github.com/JSzitas/921726d1a91320991b19331465a1f06e
As an advantage, the years are, obviously, sorted. If you insist that we do not know the range ahead of time, you can just use e.g.
struct year_deque {
    uint64_t max_year, min_year;
    std::deque<uint64_t> population;
};
where you would adjust min_year and max_year as new entries were added (and accordingly push to the front or back of the deque). This should still be fairly fast, but I did not go to the trouble of that, because I am not paid to make this fast :) Same reason there is no multi-threading or SIMD.
P.S.: If you insist on using maps, you might want to use std::unordered_map instead of std::map; your code will be faster.
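For comparison in the gist's own language, the same grow-at-both-ends idea can be sketched with `collections.deque` (an illustration of the struct above, not the linked C++ code; note that CPython deque indexing costs O(distance from the nearer end), so a plain list may be faster in practice):

```python
import collections


def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    counts: collections.deque[int] = collections.deque()
    min_year = max_year = 0  # meaningful only once counts is non-empty
    for birth, death in population:
        if not counts:
            min_year = max_year = birth
            counts.append(0)
        while birth < min_year:  # grow at the front
            counts.appendleft(0)
            min_year -= 1
        while death - 1 > max_year:  # grow at the back
            counts.append(0)
            max_year += 1
        for year in range(birth, death):
            counts[year - min_year] += 1
    return {min_year + i: c for i, c in enumerate(counts) if c}
```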
Here's a faster numpy implementation:

import numpy as np


def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    return dict(
        np.stack(
            np.unique(
                np.concatenate([np.arange(*item) for item in population]),
                return_counts=True,
            ),
            axis=1,
        )
    )

Having to return a dict slows it down a bit.
Here's an updated numpy method, but it's not as quick as the optimized collections.Counter method:
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    births, deaths = np.array(population).T

    def counter(arr, lo, hi):
        temp = np.zeros(hi - lo + 1)
        values, value_counts = np.unique(arr, return_counts=True)
        temp[values - lo] = value_counts
        return temp

    min_year, max_year = births.min(), deaths.max()
    births_count = counter(births, min_year, max_year)
    deaths_count = counter(deaths, min_year, max_year)
    delta_living = births_count - deaths_count
    living_per_year = np.cumsum(delta_living)
    return dict(np.stack((np.arange(min_year, max_year + 1), living_per_year), axis=1))
Looks like the numpy.unique method is not quite as efficient. ChatGPT helped speed it up a little more, using bincount instead of unique:
def count_living_per_year(population: list[tuple[int, int]]) -> dict[int, int]:
    births, deaths = np.array(population).T
    min_year = births.min()
    max_year = deaths.max()
    years = np.arange(min_year, max_year + 1)
    birth_counts = np.bincount(births - min_year, minlength=max_year - min_year + 1)
    death_counts = np.bincount(deaths - min_year, minlength=max_year - min_year + 1)
    living_per_year = np.cumsum(birth_counts - death_counts)
    return dict(zip(years, living_per_year))
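As a quick sanity check (not part of the thread), the bincount method can be verified against the gist's straightforward dict version; the only difference is that the cumulative-sum dict also carries zero-count years inside the span:

```python
import random

import numpy as np


def count_living_naive(population):
    # Reference: one dict increment per person-year, as in the gist.
    living = {}
    for birth, death in population:
        for year in range(birth, death):
            living[year] = living.get(year, 0) + 1
    return living


def count_living_bincount(population):
    births, deaths = np.array(population).T
    min_year, max_year = births.min(), deaths.max()
    years = np.arange(min_year, max_year + 1)
    birth_counts = np.bincount(births - min_year, minlength=max_year - min_year + 1)
    death_counts = np.bincount(deaths - min_year, minlength=max_year - min_year + 1)
    return dict(zip(years, np.cumsum(birth_counts - death_counts)))


pop = [
    (b := random.randint(1900, 2000), random.randint(b + 1, b + 100))
    for _ in range(1000)
]
fast = count_living_bincount(pop)
# Drop zero-count years before comparing against the sparse dict.
assert {int(y): int(c) for y, c in fast.items() if c} == count_living_naive(pop)
```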
I looked at bincount but wasn't sure how to set minlength; it looks obvious now!
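For reference, minlength only pads the np.bincount result with trailing zeros, so the birth and death count arrays cover the same year span and can be subtracted elementwise (a minimal illustration):

```python
import numpy as np

# bincount counts occurrences of each non-negative integer.
offsets = np.array([0, 0, 2])  # e.g. birth years minus the first year
print(np.bincount(offsets))               # [2 0 1] -- stops at the largest value
print(np.bincount(offsets, minlength=5))  # [2 0 1 0 0] -- padded to length 5
```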
My Timings:
11.3829692
10.4960101
9.0663184
10.8802563
(Windows OS, Python 3.9.4 (CPython))
===========================
The code for the same task, using the same concepts, without careful optimizations. Built with MSVC using the standard release-build toolchain settings for x86_64:
0.898 seconds.
Good news: this Python implementation is only about 10 times slower than the C++ code. That's pretty good.
(Of course, it is sometimes worth trading execution time for the speed of prototyping in Python.)
CODE:

```cpp
#include <stdio.h>       /* printf, scanf, puts, NULL */
#include <stdlib.h>      /* srand, rand */
#include <time.h>        /* time */
#include <map>           /* ordered map */
#include <unordered_map> /* unordered map */
#include <iostream>      /* iostream */
#include <chrono>        /* time */

// with 10000 people [1,100] years old
constexpr size_t kPeople = 10000;
constexpr size_t kLowBoundAge = 1;
constexpr size_t kHighBoundAge = 100;
constexpr size_t kAgeLen = kHighBoundAge - kLowBoundAge;  // Age is in [kLowBoundAge, kLowBoundAge + kAgeLen]

int main()
{
    std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
    // ... (the rest of the snippet was cut off in the original comment)
}
```