Mateusz Dorobek mateuszdorobek

@mateuszdorobek
mateuszdorobek / iris.csv
Last active February 14, 2025 14:40 — forked from netj/iris.csv
sepal_length sepal_width petal_length petal_width species
5.1 3.5 1.4 .2 Setosa
4.9 3 1.4 .2 Setosa
4.7 3.2 1.3 .2 Setosa
4.6 3.1 1.5 .2 Setosa
5 3.6 1.4 .2 Setosa
5.4 3.9 1.7 .4 Setosa
4.6 3.4 1.4 .3 Setosa
5 3.4 1.5 .2 Setosa
4.4 2.9 1.4 .2 Setosa
year name_last name_first team position salary years total_value avg_annual
2024 Miller Erik Giants LHP 740000
2024 McCann Kyle Athletics C 740000
2024 Roupp Landen Giants RHP 740000
2024 Duarte Daniel Twins RHP 740000
2024 Murphy Penn Astros RHP 740000
2024 Chavez Jesse Braves RHP 740000
2024 Dunn Oliver Brewers INF 740000
2024 Barnhart Tucker Diamondbacks C 740000
2024 Jones Jared Pirates RHP 740000
mateuszdorobek / hw_25000.csv
Last active September 21, 2024 21:55
The file contains height and weight data for 25,000 individuals. Each record has three values: index, height (inches), and weight (pounds).
Index Height Weight
1 65.78331 112.9925
2 71.51521 136.4873
3 69.39874 153.0269
4 68.2166 142.3354
5 67.78781 144.2971
6 68.69784 123.3024
7 69.80204 141.4947
8 70.01472 136.4623
9 67.90265 112.3723
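Since the description gives heights in inches and weights in pounds, converting to metric units is a likely first step when comparing against other data. The following is a minimal sketch, not part of the gist: the function name `to_metric` is hypothetical, and the conversion factors are the standard exact definitions (1 in = 2.54 cm, 1 lb = 0.45359237 kg).

```python
# Exact conversion factors (by definition of the international inch and pound).
INCH_TO_CM = 2.54
POUND_TO_KG = 0.45359237


def to_metric(height_in, weight_lb):
    """Convert a (height in inches, weight in pounds) pair to (cm, kg).

    Hypothetical helper for illustration; not defined by the dataset.
    """
    return height_in * INCH_TO_CM, weight_lb * POUND_TO_KG


# First record from the preview above: index 1, 65.78331 in, 112.9925 lb.
height_cm, weight_kg = to_metric(65.78331, 112.9925)
print(round(height_cm, 2), round(weight_kg, 2))
```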
mateuszdorobek / spaceship_titanic.csv
Created October 22, 2022 16:01
Dataset to predict whether a passenger was transported to an alternate dimension during the Spaceship Titanic's collision with the spacetime anomaly. Source: https://www.kaggle.com/competitions/spaceship-titanic
PassengerId,HomePlanet,CryoSleep,Cabin,Destination,Age,VIP,RoomService,FoodCourt,ShoppingMall,Spa,VRDeck,Name,Transported
0001_01,Europa,False,B/0/P,TRAPPIST-1e,39.0,False,0.0,0.0,0.0,0.0,0.0,Maham Ofracculy,False
0002_01,Earth,False,F/0/S,TRAPPIST-1e,24.0,False,109.0,9.0,25.0,549.0,44.0,Juanna Vines,True
0003_01,Europa,False,A/0/S,TRAPPIST-1e,58.0,True,43.0,3576.0,0.0,6715.0,49.0,Altark Susent,False
0003_02,Europa,False,A/0/S,TRAPPIST-1e,33.0,False,0.0,1283.0,371.0,3329.0,193.0,Solam Susent,False
0004_01,Earth,False,F/1/S,TRAPPIST-1e,16.0,False,303.0,70.0,151.0,565.0,2.0,Willy Santantines,True
0005_01,Earth,False,F/0/P,PSO J318.5-22,44.0,False,0.0,483.0,0.0,291.0,0.0,Sandie Hinetthews,True
0006_01,Earth,False,F/2/S,TRAPPIST-1e,26.0,False,42.0,1539.0,3.0,0.0,0.0,Billex Jacostaffey,True
0006_02,Earth,True,G/0/S,TRAPPIST-1e,28.0,False,0.0,0.0,0.0,0.0,,Candra Jacostaffey,True
0007_01,Earth,False,F/3/S,TRAPPIST-1e,35.0,False,0.0,785.0,17.0,216.0,0.0,Andona Beston,True
index transaction_date house_age nearest_mass_rapid_transit_station_distance convenience_stores_cnt latitude longitude unit_area_price
1 2012.917 32 84.87882 10 24.98298 121.54024 37.9
2 2012.917 19.5 306.5947 9 24.98034 121.53951 42.2
3 2013.583 13.3 561.9845 5 24.98746 121.54391 47.3
4 2013.500 13.3 561.9845 5 24.98746 121.54391 54.8
5 2012.833 5 390.5684 5 24.97937 121.54245 43.1
6 2012.667 7.1 2175.03 3 24.96305 121.51254 32.1
7 2012.667 34.5 623.4731 7 24.97933 121.53642 40.3
8 2013.417 20.3 287.6025 6 24.98042 121.54228 46.7
9 2013.500 31.7 5512.038 1 24.95095 121.48458 18.8
mateuszdorobek / diabetes.csv
Last active October 21, 2022 19:41
Source: https://www.kaggle.com/datasets/mathchi/diabetes-data-set This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict, based on diagnostic measurements, whether a patient has diabetes.
pregnancies glucose blood_pressure skin_thickness insulin bmi diabetes_pedigree_function age class
6 148 72 35 0 33.6 0.627 50 1
1 85 66 29 0 26.6 0.351 31 0
8 183 64 0 0 23.3 0.672 32 1
1 89 66 23 94 28.1 0.167 21 0
0 137 40 35 168 43.1 2.288 33 1
5 116 74 0 0 25.6 0.201 30 0
3 78 50 32 88 31 0.248 26 1
10 115 0 0 0 35.3 0.134 29 0
2 197 70 45 543 30.5 0.158 53 1
mateuszdorobek / room_occupancy_detection.csv
Created September 2, 2022 19:52
Based on the Occupancy Detection Data Set: experimental data used for binary classification (room occupancy) from temperature, humidity, light, and CO2. Ground-truth occupancy was obtained from time-stamped pictures taken every minute.
date,temperature,humidity,light,CO2,humidity_ratio,occupancy
2015-02-11 14:48:00,21.76,31.1333333333333,437.333333333333,1029.66666666667,0.0050210108902138,1
2015-02-04 17:51:00,23.18,27.272,426.0,721.25,0.0047929881765052,1
2015-02-11 14:49:00,21.79,31.0,437.333333333333,1000.0,0.0050085812748017,1
2015-02-04 17:51:59,23.15,27.2675,429.5,714.0,0.0047834409493106,1
2015-02-11 14:50:00,21.7675,31.1225,434.0,1003.75,0.0050215691326541,1
2015-02-04 17:53:00,23.15,27.245,426.0,713.5,0.0047794635244219,1
2015-02-11 14:51:00,21.7675,31.1225,439.0,1009.5,0.0050215691326541,1
2015-02-04 17:54:00,23.15,27.2,426.0,708.25,0.0047715088260817,1
2015-02-11 14:51:59,21.79,31.1333333333333,437.333333333333,1005.66666666667,0.0050302977786788,1
We can make this file beautiful and searchable if this error is corrected: It looks like row 5 should actually have 31 columns, instead of 19 in line 4.
,duration,credit_amount,age,class,is_female,checking_status__0<=X<200,checking_status__<0,checking_status__>=200,checking_status__no_checking,credit_history__all_paid,credit_history__critical/other_existing_credit,credit_history__delayed_previously,credit_history__existing_paid,credit_history__no_credits/all_paid,savings_status__100<=X<500,savings_status__500<=X<1000,savings_status__<100,savings_status__>=1000,savings_status__no_known_savings,employment__1<=X<4,employment__4<=X<7,employment__<1,employment__>=7,employment__unemployed,personal_status__div/dep/mar,personal_status__male_div/sep,personal_status__male_mar/wid,personal_status__male_single,credit_amount__log,age__log
0,6,1169,67,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,7.063903961472068,4.204692619390966
1,48,5951,22,1,1,1,0,0,0,0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,8.691314551644853,3.091042453358316
2,12,2096,49,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,7.647786045440933,3.8918202981106265
3,42,7882,45,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0
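The column-count message above this preview (a row with 19 fields where the header has 31) describes a problem that can be located programmatically. The following is a minimal sketch, assuming a comma-separated file with a header on the first line; the function name `find_ragged_rows` and the miniature sample are hypothetical and are not taken from the gist.

```python
import csv
import io


def find_ragged_rows(text, delimiter=","):
    """Return (expected_fields, [(line_number, field_count), ...]).

    The list holds every data row whose field count differs from the
    header's. Line numbers are 1-based, with the header on line 1.
    Hypothetical helper for illustration only.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    ragged = []
    for line_number, row in enumerate(reader, start=2):
        if len(row) != expected:
            ragged.append((line_number, len(row)))
    return expected, ragged


# Made-up miniature sample: the second data row is one field short.
sample = "id,a,b\n0,1,2\n1,3\n2,4,5\n"
expected, ragged = find_ragged_rows(sample)
print(expected, ragged)
```

To check a real file, read its contents into a string first (for example with `open(path, newline="").read()`) and pass that to the function.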
marketing_spendings income
0 243.26 403.92
1 159.72 498.93
2 145.79 306.38
3 105.41 106.48
4 133.95 286.55
5 131.37 313.01
6 140.15 344.26
7 124.69 325.73
8 141.08 327.92
mateuszdorobek / heights_weights_sample.csv
Created July 25, 2022 21:21
Height and weight sample dataset based on SOCR Data Dinov 020108 HeightsWeights
height weight
0 170.5 54.63
1 171.0 59.99
2 172.47 63.72
3 181.9 63.61
4 182.3 60.29
5 169.52 59.25
6 171.2 60.83
7 169.52 54.23
8 177.83 59.9