@raghavrv
Created February 15, 2017 18:18
Generating an artificial dataset with outliers
import numpy as np
from sklearn.datasets import make_classification

# Clean data: two features on different scales (x10 and x100)
n_classes = 2
X_clean, y_clean = make_classification(
    n_samples=500, n_features=2, n_redundant=0, n_classes=n_classes,
    scale=(10, 100), random_state=0)

# Outliers: a small batch whose features are scaled 10x larger in magnitude,
# with the sign of the first feature flipped
X_outliers, y_outliers = make_classification(
    n_samples=10, n_features=2, n_redundant=0, n_classes=n_classes,
    # scale=(10, 100), random_state=1)
    scale=(-100, 1000), random_state=1)

# Append the outliers to the clean samples
X = np.concatenate((X_clean, X_outliers), axis=0)
y = np.concatenate((y_clean, y_outliers), axis=0)
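
A quick sanity check can confirm that the outliers were appended as intended. The sketch below is not part of the original gist: it regenerates the same data so it runs on its own, and the name `is_outlier` is a hypothetical helper introduced only for illustration. It prints the combined shapes and the per-feature spread of each batch, without assuming any particular numeric result.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same data generation as in the gist above
X_clean, y_clean = make_classification(
    n_samples=500, n_features=2, n_redundant=0,
    scale=(10, 100), random_state=0)
X_outliers, y_outliers = make_classification(
    n_samples=10, n_features=2, n_redundant=0,
    scale=(-100, 1000), random_state=1)
X = np.concatenate((X_clean, X_outliers), axis=0)
y = np.concatenate((y_clean, y_outliers), axis=0)

# Hypothetical helper: boolean mask marking the rows that came from the
# outlier batch (they are the last rows after concatenation)
is_outlier = np.arange(X.shape[0]) >= X_clean.shape[0]

print("X shape:", X.shape, "y shape:", y.shape)
print("number of outlier rows:", int(is_outlier.sum()))

# Per-feature spread of each batch, to compare the two scales by eye
print("clean std per feature:  ", X[~is_outlier].std(axis=0))
print("outlier std per feature:", X[is_outlier].std(axis=0))
```

Keeping a mask like this around is one way to colour or filter the outliers later, for example when plotting the effect of different scalers.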