@esenthil2018
Created May 27, 2022 19:41
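import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf

# raw_dataset is assumed to be the Auto MPG DataFrame loaded in an earlier step.
dataset = raw_dataset.copy()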
dataset.tail()
dataset.isna().sum()
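# Drop rows with missing values
dataset = dataset.dropna()
# Map the numeric Origin codes to country names
dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})
# One-hot encode Origin; dtype=float keeps the new columns numeric
# (recent pandas versions default to bool, which would not convert to a float array later)
dataset = pd.get_dummies(dataset, columns=['Origin'], prefix='', prefix_sep='', dtype=float)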
dataset.tail()
# 80/20 train/test split; random_state=0 makes the sample reproducible
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)
sns.pairplot(train_dataset[['MPG', 'Cylinders', 'Displacement', 'Weight']], diag_kind='kde')
train_dataset.describe().transpose()
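# Normalization
train_dataset.describe().transpose()[['mean', 'std']]
# Assumption: train_features is not defined in this snippet; here it is taken to be
# the training columns with the 'MPG' label removed.
train_features = train_dataset.drop(columns=['MPG'])
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(np.array(train_features))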
print(normalizer.mean.numpy())
first = np.array(train_features[:1])
with np.printoptions(precision=2, suppress=True):
    print('First example:', first)
    print()
    print('Normalized:', normalizer(first).numpy())
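
The normalization step above can be sanity-checked by hand. The sketch below is not part of the original gist: it uses a hypothetical toy array (`toy_features`) instead of the Auto MPG data and plain NumPy instead of the Keras layer, and it assumes the layer standardizes each column as (x - mean) / std using the population standard deviation. Note that `describe()` reports the sample standard deviation (ddof=1), so its `std` column will differ slightly from what the layer uses.

```python
import numpy as np

# Hypothetical toy feature matrix standing in for train_features (3 rows, 2 columns).
toy_features = np.array([[1.0, 10.0],
                         [2.0, 20.0],
                         [3.0, 60.0]])

# Per-column statistics, computed over the rows (axis=0).
mean = toy_features.mean(axis=0)
std = toy_features.std(axis=0)  # population std (ddof=0)

# Standardize: each column ends up with mean 0 and unit variance.
normalized = (toy_features - mean) / std

with np.printoptions(precision=2, suppress=True):
    print('Column means:', mean)
    print('Normalized:')
    print(normalized)
```

If the assumption holds, running the same comparison on `np.array(train_features)` and `normalizer(...)` should agree up to floating-point tolerance.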