- Links
- Kaggle Username and password :
- [email protected] | ********
'''
Logistic Regression
'''
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_curve, precision_score, classification_report
lgmodel = LogisticRegression(max_iter=100, C=1e5)
lgmodel.fit(X_train, y_train)
ypred = lgmodel.predict(X_test)
print(lgmodel.score(X_train, y_train))
print(confusion_matrix(y_test, ypred))
# Turn into classes using pd.qcut() (quantile-based bins)
df_train_filt2['Fareclass'] = pd.qcut(df_train_filt2['Fare'], 4, labels=[1,2,3,4])
df_train_filt2['Ageclass'] = pd.qcut(df_train_filt2['Age'], 5, labels=[1,2,3,4,5])
df_test_filt1['Fareclass'] = pd.qcut(df_test_filt1['Fare'], 4, labels=[1,2,3,4])
df_test_filt1['Ageclass'] = pd.qcut(df_test_filt1['Age'], 5, labels=[1,2,3,4,5])
df_train_filt2.drop(['Fare','Age'], axis=1, inplace=True)
df_test_filt1.drop(['Fare','Age'], axis=1, inplace=True)
# Encode the string columns, which the model cannot train on directly
df_train_filt2=pd.get_dummies(df_train_filt2, columns=\
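The snippet above bins the train and test frames independently, so the same fare can land in different classes in each frame. One possible alternative, sketched here under assumptions, is to learn the quantile edges on the training column and reuse them on the test column. The fare values below are made up for illustration, and widening the outer edges to infinity is an added choice, not something the original does.

```python
import numpy as np
import pandas as pd

# Hypothetical fares, for illustration only.
train_fare = pd.Series([7.25, 8.05, 13.0, 26.0, 31.0, 53.1, 71.3, 512.3])
test_fare = pd.Series([7.9, 30.0, 600.0])

labels = [1, 2, 3, 4]

# Learn the quartile edges from the training data only.
train_class, edges = pd.qcut(train_fare, 4, labels=labels, retbins=True)

# Assumption: widen the outer edges so unseen test values still fall in a bin.
edges[0], edges[-1] = -np.inf, np.inf

# Apply the training edges to the test data.
test_class = pd.cut(test_fare, bins=edges, labels=labels)
print(test_class.tolist())
```

With this approach a given class label means the same fare range in both frames, which is usually what a model trained on the first frame expects.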
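# Get the data and see the dataset
import os
import matplotlib.pyplot as plt
import pandas as pd
path = os.getcwd()
filepath = []
# Collect every CSV file in the working directory (os.path.join works on any OS)
for file in os.listdir(path):
    if file.endswith('.csv'):
        filepath.append(os.path.join(path, file))
# Use a separate loop variable so the directory path is not overwritten
for csv_path in filepath:
    if "train" in os.path.basename(csv_path):
        df_train = pd.read_csv(csv_path)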
# Now fill in the NA values for Age
import numpy as np
mean_age = df_train_filt2['Age'].mean()
# Index labels of the rows whose Age is missing
listofAgeind_ = list(df_train_filt2[df_train_filt2['Age'].isna()].index)
# Draw replacement ages from a normal distribution (std of 10) around the mean
list_of_ages = np.random.normal(mean_age, 10, len(listofAgeind_))
plt.plot(list_of_ages, linewidth=2, color='g', label='Age')
plt.show()
for idx, age in zip(listofAgeind_, list_of_ages):
    df_train_filt2.loc[idx, 'Age'] = age
print(df_train_filt2.count())
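Because the replacement ages above are drawn from an unbounded normal distribution, a draw can occasionally come out negative, and the result changes on every run. A minimal sketch of one way to handle both follows. The helper name `fill_missing_ages`, the fixed seed, and the clipping to the observed range are assumptions added for illustration, not part of the original snippet.

```python
import numpy as np
import pandas as pd

def fill_missing_ages(df, column="Age", scale=10.0, seed=0):
    """Return a copy of df with NaNs in `column` replaced by clipped normal draws."""
    out = df.copy()
    observed = out[column].dropna()
    missing = out[column].isna()
    # A seeded generator makes the imputation reproducible (assumed choice).
    rng = np.random.default_rng(seed)
    draws = rng.normal(observed.mean(), scale, int(missing.sum()))
    # Clip to the observed range so no negative or implausibly large ages appear.
    draws = np.clip(draws, observed.min(), observed.max())
    out.loc[missing, column] = draws
    return out

# Hypothetical data, for illustration only.
df_demo = pd.DataFrame({"Age": [22.0, np.nan, 35.0, np.nan, 58.0]})
filled = fill_missing_ages(df_demo)
print(filled)
```

Returning a copy keeps the original frame unchanged, which makes it easier to compare the distributions before and after imputation.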
- https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-x86_64.sh
- cd /tmp
- curl -O https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-x86_64.sh
- check the sha256sum: sha256sum Anaconda3-5.1.0-Linux-x86_64.sh
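The checksum step can also be done from Python when `sha256sum` is not at hand. The following is a rough sketch using the standard-library `hashlib` module. The helper names are hypothetical, and the demo hashes a throwaway temporary file rather than the real installer; for the installer, pass its path and compare against the hash published by the vendor.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected_hex):
    """Compare a file's digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()

# Demo on a throwaway file so the sketch runs without the installer present.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_digest = sha256_of_file(demo_path)
demo_ok = matches_expected(demo_path, demo_digest.upper())
os.remove(demo_path)
print(demo_digest)
```

Reading in chunks keeps memory use flat even for an installer of several hundred megabytes.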
This list is meant to be both a quick guide and a reference for further research into these topics. It's basically a summary of that comp sci course you never took or forgot about, so there's no way it can cover everything in depth. It will also be available as a gist on GitHub for everyone to edit and add to.
### Array
#### Definition:
- Stores data elements based on a sequential, most commonly 0-based, index.
- Based on tuples from set theory.
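As a small illustration of the definition above, the sketch below uses a Python list as a stand-in for an array (an assumption made for convenience, since a Python list is a dynamic array rather than a fixed-size one). The values are made up.

```python
# Hypothetical values; a Python list stands in for an array here.
fares = [7.25, 71.28, 8.05]

# Indexing is 0-based: the first element is at index 0,
# and the last is at index len(fares) - 1.
first = fares[0]
last = fares[len(fares) - 1]

# Elements are addressed by position, so one can be overwritten in place.
fares[1] = 53.10

print(first, last, fares)
```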