@ogyalcin
Created August 2, 2018 12:21
Clean the Test Dataset
import pandas as pd  # `test` is assumed to be the already-loaded test DataFrame
test['Age'] = test['Age'].fillna(test['Age'].median())  # Age: fill missing values with the median
test['Fare'] = test['Fare'].fillna(test['Fare'].median())  # Fare: fill missing values with the median
d = {1: '1st', 2: '2nd', 3: '3rd'}  # Pclass: map numeric classes to labels so they are treated as categories
test['Pclass'] = test['Pclass'].map(d)
test['Embarked'] = test['Embarked'].fillna(test['Embarked'].value_counts().index[0])  # Embarked: fill with the most frequent value
ids = test[['PassengerId']]  # Keep the passenger IDs before dropping the column
test = test.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)  # Drop unnecessary columns
categorical_vars = test[['Pclass', 'Sex', 'Embarked']]  # Get dummies of the categorical variables
dummies = pd.get_dummies(categorical_vars, drop_first=True)
test = test.drop(['Pclass', 'Sex', 'Embarked'], axis=1)  # Drop the original categorical variables
test = pd.concat([test, dummies], axis=1)  # Instead, concat the new dummy variables
# test.head()
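The snippet above expects a `test` DataFrame to exist already. As a rough, self-contained sketch, the same steps can be wrapped in a function and run on a small made-up frame. The function name `clean_test_frame` and the toy rows are assumptions for illustration only and are not part of the original gist. The column names follow the ones the snippet uses, plus `SibSp` and `Parch` as example numeric columns.

```python
import pandas as pd


def clean_test_frame(df):
    """Apply the cleaning steps from the snippet to a copy of `df` (hypothetical helper)."""
    df = df.copy()
    # Fill missing numeric values with the column median
    df['Age'] = df['Age'].fillna(df['Age'].median())
    df['Fare'] = df['Fare'].fillna(df['Fare'].median())
    # Turn the numeric class into labels so get_dummies treats it as categorical
    df['Pclass'] = df['Pclass'].map({1: '1st', 2: '2nd', 3: '3rd'})
    # Fill missing ports with the most frequent value
    df['Embarked'] = df['Embarked'].fillna(df['Embarked'].value_counts().index[0])
    # Keep the IDs, then drop columns that are not used as features
    ids = df[['PassengerId']]
    df = df.drop(['PassengerId', 'Name', 'Ticket', 'Cabin'], axis=1)
    # Replace the categorical columns with dummy variables
    cat_cols = ['Pclass', 'Sex', 'Embarked']
    dummies = pd.get_dummies(df[cat_cols], drop_first=True)
    df = pd.concat([df.drop(cat_cols, axis=1), dummies], axis=1)
    return df, ids


# Made-up rows, only to exercise the function
toy = pd.DataFrame({
    'PassengerId': [892, 893, 894, 895],
    'Pclass': [3, 1, 2, 3],
    'Name': ['A', 'B', 'C', 'D'],
    'Sex': ['male', 'female', 'male', 'female'],
    'Age': [34.0, None, 62.0, 27.0],
    'SibSp': [0, 1, 0, 0],
    'Parch': [0, 0, 0, 0],
    'Ticket': ['t1', 't2', 't3', 't4'],
    'Fare': [7.8, 7.0, None, 8.6],
    'Cabin': [None, None, None, None],
    'Embarked': ['Q', 'S', None, 'S'],
})

cleaned, ids = clean_test_frame(toy)
print(cleaned.columns.tolist())
print(cleaned.isna().sum().sum())
```

One thing to watch with `drop_first=True`: the dummy columns depend on which categories appear in the frame, so if the test set is missing a category that the training set has, the columns may not line up and may need to be aligned separately.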