Build the following and make it run as fast as you possibly can using Python 3 (vanilla). The faster it runs, the more you will impress us!
Your code should:
- Download this 2.2GB file: https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv
- Count the lines in the file
- Calculate the average value of the tip_amount field.
All of that in the most efficient way you can come up with.
That's it. Make it fly!
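The counting and averaging steps above could be sketched roughly as follows. This is a minimal illustration and not necessarily the main.py that was benchmarked. The function name `count_and_average` is hypothetical, and the code assumes a single header line containing a `tip_amount` column, no quoted fields with embedded commas, and that blank lines should be skipped. It reads the file in binary mode to avoid text-decoding overhead and counts data rows only (the header is excluded).

```python
def count_and_average(path, column=b"tip_amount"):
    """Return (number of data rows, mean of `column`) for a comma-separated file.

    Assumptions (not confirmed by the original write-up): one header line,
    no quoted fields containing commas, blank lines are ignored.
    """
    # Binary mode with a large buffer skips text decoding on every line.
    with open(path, "rb", buffering=1 << 20) as f:
        header = f.readline().rstrip(b"\r\n").split(b",")
        idx = header.index(column)  # position of the tip_amount column

        rows = 0
        total = 0.0
        for line in f:
            if not line.strip():
                continue  # skip empty lines
            # float() accepts bytes and ignores surrounding whitespace
            total += float(line.split(b",")[idx])
            rows += 1

    mean = total / rows if rows else 0.0
    return rows, mean
```

Calling `count_and_average("yellow_tripdata_2016-01.csv")` and printing the two returned values would give a line count and an average in one pass over the file. Doing both in a single loop matters here because the file is larger than the RAM of a small machine, so it should be streamed once, not loaded whole.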
import urllib.request

# Download the CSV (about 2.2 GB) into the current directory.
urllib.request.urlretrieve(
    "https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv",
    "yellow_tripdata_2016-01.csv")

(NOTE: I didn't include the download block in the benchmark, because network speed would distort the timing.)
root@ubuntu-1gb-fra1-01:~# time python3 main.py
10906858 1.7506631158122512
real	0m40.243s
user	0m37.004s
sys	0m2.140s
root@ubuntu-1gb-fra1-01:~# uname -a
Linux ubuntu-1gb-fra1-01 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
root@ubuntu-1gb-fra1-01:~# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 79
model name	: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
stepping	: 1
microcode	: 0x1
cpu MHz		: 2199.998
cache size	: 30720 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch vnmi ept fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt arat
bugs		:
bogomips	: 4399.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:
root@ubuntu-1gb-fra1-01:~# cat /proc/meminfo
MemTotal:        1016156 kB
MemFree:           68416 kB
MemAvailable:     818396 kB
Buffers:            1152 kB
Cached:           872720 kB
SwapCached:            0 kB
Active:           457612 kB
Inactive:         441676 kB
Active(anon):      28232 kB
Inactive(anon):     2728 kB
Active(file):     429380 kB
Inactive(file):   438948 kB
Unevictable:        3656 kB
Mlocked:            3656 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         29072 kB
Mapped:            12020 kB
Shmem:              3124 kB
Slab:              30380 kB
SReclaimable:      18772 kB
SUnreclaim:        11608 kB
KernelStack:        1840 kB
PageTables:         2160 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      508076 kB
Committed_AS:     202332 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:      4096 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       53240 kB
DirectMap2M:      995328 kB
DirectMap1G:           0 kB