shaypal5 · August 1, 2022 17:36
diff --git a/pdp_post_adv2.py b/pdp_post_adv2.py
 >>> mp = MyPipelineAndModel(
 ...     savings_max_val=101,
 ...     drop_gender=False,
 ...     standardize=True,
 ...     ohencode_country=True,
 ...     savings_bin_val=1,
 ...     pca_threshold=25,
 ...     fit_intercept=True)
 >>> mp
 <PdPipeline -> LogisticRegression>
 >>> mp.estimator
 LogisticRegression()
 >>> mp.pipeline
 A pdpipe pipeline:
 [ 0]  Drop columns Columns with at least 0.2 missing value rate
 [ 1]  Drop rows by label values
 [ 2]  Encode label values
 [ 3]  Drop columns 'Name'
 [ 4]  Apply dataframe method set_index with kwargs {'keys': 'id'}
 [ 5]  Drop rows by qualifier <RowQualifier: Qualify rows with X[Savings] >
      101>
 [ 6]  Assign column Viking with df[Country].isin(['Denmark', 'Finland']) &
      ~df[Bearded]
 [ 7]  Assign column YearlyGrands with df[Savings] * 1000 / df[Age]
 [ 8]  Bin Savings by [1].
 [ 9]  One-hot encode 'Country'
 [10]  Tokenize Quote
 [11]  Stemming tokens in Quote...
 [12]  Remove stopwords from Quote
 [13]  Count-vectorizing column Quote.
 [14]  Decompose columns Columns that start with Quote with PCA
 [15]  Encode 'Savings_bin', 'Gender'
 [16]  Scale columns Columns of dtypes <class 'numpy.number'>
 [17]  Drop columns 'Bearded'
 [18]  Transform input dataframes to the following schema: <Learnable Schema>
 [19]  Validates conditions