Guillaume Lemaitre glemaitre

PRs

Need Development

Test script

from __future__ import division, print_function

import platform
import sys

from time import time

python examples/model_selection/grid_search_text_feature_extraction.py 

==========================================================
Sample pipeline for text feature extraction and evaluation
==========================================================

The dataset used in this example is the 20 newsgroups dataset which will be
automatically downloaded and then cached and reused for the document
classification example.

	"""
	This is a real case using the Adult Census dataset, available at:
	https://archive.ics.uci.edu/ml/datasets/Adult

	It shows that adding smoothing noise does not have any influence on the
	classification performance, but allows for a better understanding when
	manually checking the QuantileTransformer.
	"""
	import numpy as np
	import pandas as pd

	from sklearn.preprocessing import QuantileTransformer

	X = np.array([0] * 1 + [0.5] * 7 + [1] * 2).reshape(-1, 1)

	qt = QuantileTransformer(n_quantiles=10)
	qt.fit(X)
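
To check the fitted transformer by hand, one option is to look at the learned quantiles and at the transformed sample. The sketch below repeats the setup above so that it runs on its own. The variable names `quantiles` and `X_trans` are illustrative and not part of the original script. With this many tied values, the learned quantiles are expected to repeat 0.5 several times.

```python
import numpy as np

from sklearn.preprocessing import QuantileTransformer

# Same toy sample as above: one 0, seven 0.5 and two 1.
X = np.array([0] * 1 + [0.5] * 7 + [1] * 2).reshape(-1, 1)

qt = QuantileTransformer(n_quantiles=10)
qt.fit(X)

# quantiles_ stores one column of learned landmarks per feature.
quantiles = qt.quantiles_.ravel()
print("learned quantiles:", np.round(quantiles, 3))

# With the default uniform output distribution, values are mapped into [0, 1].
X_trans = qt.transform(X)
print("transformed sample:", np.round(X_trans.ravel(), 3))
```

Printing both arrays side by side makes it easier to see how tied input values are mapped, which is the kind of manual check the description above refers to.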

	# a behaviour which is not desired, but that frankly should