psyyz10 · July 14, 2016 00:54
diff --git a/KeystoneML b/KeystoneML
 Transfer Baogang project to KeystoneML and Spark
 ====
 Contents
 --
 [TOC]
 
 Useful Links
 --
 [KeystoneML Source Code](https://github.com/amplab/keystone)
 [A KeystoneML Example](https://github.com/amplab/keystone-example)
 [Programming Guide](http://keystone-ml.org/programming_guide.html)

 Some Concepts
 --
 ### Pipelines
 A Pipeline is a dataflow that takes some input data and maps it to some output data through a series of `nodes`. 
 ```scala
 package workflow

 trait Pipeline[A, B] {
  // ...
  def apply(in: A): B
  def apply(in: RDD[A]): RDD[B]
  //...
  final def andThen[C](next: Pipeline[B, C]): Pipeline[A, C] = //...
 }
 ```
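The chaining behaviour of `andThen` can be illustrated without Spark. The following is only a sketch under stated assumptions: `MiniPipeline` and `MiniPipelineDemo` are hypothetical names, a plain `Seq` stands in for the `RDD`, and none of this is the actual KeystoneML implementation.

```scala
// Hypothetical stand-in for illustration; this is not the KeystoneML trait.
// A plain Seq plays the role of the RDD.
trait MiniPipeline[A, B] { self =>
  def apply(in: A): B
  def apply(in: Seq[A]): Seq[B] = in.map(x => apply(x))

  // Feeds this pipeline's output into `next`.
  final def andThen[C](next: MiniPipeline[B, C]): MiniPipeline[A, C] =
    new MiniPipeline[A, C] {
      def apply(in: A): C = next(self(in))
    }
}

object MiniPipelineDemo {
  val trim = new MiniPipeline[String, String] {
    def apply(in: String): String = in.trim
  }
  val length = new MiniPipeline[String, Int] {
    def apply(in: String): Int = in.length
  }
  // The output type of `trim` must match the input type of `length`.
  val trimmedLength: MiniPipeline[String, Int] = trim andThen length

  val single = trimmedLength("  steel ")
  val batch = trimmedLength(Seq(" a ", "bc"))
}
```

The type parameters make the compiler reject chains whose intermediate types do not line up, which is the main benefit of composing stages this way.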
 ### Nodes
 Nodes come in two flavors: `Transformers` and `Estimators`. 
 #### Transformers
 A Transformer takes an input and deterministically transforms it into an output.
 ``` scala
 package workflow

 abstract class Transformer[A, B : ClassTag] extends TransformerNode[B] with Pipeline[A, B] {
  def apply(in: A): B
  def apply(in: RDD[A]): RDD[B] = in.map(apply)
  //...
 }
 ```
 #### Estimators
 An `Estimator` takes training data as an `RDD` in its `fit()` method and outputs a `Transformer`.
 ```scala
 package workflow

 abstract class Estimator[A, B] extends EstimatorNode {
  protected def fit(data: RDD[A]): Transformer[A, B]
  // ...
 }
 ```
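To make the fit-then-transform relationship concrete, here is a minimal sketch that does not depend on Spark or KeystoneML. `MiniTransformer`, `MiniEstimator`, and `MeanCenterer` are hypothetical names invented for this illustration, and a `Seq` again stands in for the `RDD`.

```scala
// Hypothetical stand-ins; Seq replaces RDD, and these are not KeystoneML classes.
trait MiniTransformer[A, B] {
  def apply(in: A): B
  def apply(in: Seq[A]): Seq[B] = in.map(x => apply(x))
}

trait MiniEstimator[A, B] {
  def fit(data: Seq[A]): MiniTransformer[A, B]
}

// Learns the mean of the training data; the fitted transformer subtracts it.
object MeanCenterer extends MiniEstimator[Double, Double] {
  def fit(data: Seq[Double]): MiniTransformer[Double, Double] = {
    val mean = if (data.isEmpty) 0.0 else data.sum / data.size
    new MiniTransformer[Double, Double] {
      def apply(in: Double): Double = in - mean
    }
  }
}

object EstimatorDemo {
  val centerer = MeanCenterer.fit(Seq(1.0, 2.0, 3.0)) // learned mean is 2.0
  val centered = centerer(Seq(1.0, 2.0, 3.0))
}
```

The estimator holds no state of its own; everything it learns lives in the transformer that `fit` returns, so the fitted transformer can be chained like any other node.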
 #### Chaining Nodes and Building Pipelines
 ```scala
 val trainLabels: RDD[Vector[Double]] = //...
 val trainImages: RDD[Image] = //...

 val pipe = GrayScaler andThen 
  ImageVectorizer andThen 
  (LinearMapEstimator(), trainImages, trainLabels) andThen 
  MaxClassifier
 ```
 Some useful packages for our project: 
 ---
 Nodes
 : **images**: nodes useful for image processing
 : **learning**: estimators for the learning process (these extend `Estimator`)
 : **stats**: nodes useful for statistics
 : **nlp**
 : **util**: utility functions

 Loaders
 : load data (these extend `Transformer`)

 Evaluation
 : evaluation criteria

 Pipelines
 : the runnable programs
 : **images**: example image classification pipelines

 Utils
 : **images**, containing **ImageUtils**: utility functions (we should add our own functions to this class)

 Some Methods Implementation Tips
 --
 Fft2: `breeze.signal.fourierTr`
 Ifft2: `breeze.signal.iFourierTr`
 Convolution: `breeze.signal.convolve`; however, it only supports vector convolution. A matrix signature is declared, but its implementation is still on the to-do list.
 ConnectedComponentLabeler: implement using DFS.
 BinaryConverter: check whether the image is grayscale; if not, convert it to grayscale first, then convert it to binary.
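
The DFS approach suggested for the ConnectedComponentLabeler could look roughly like the sketch below. It assumes 4-connectivity and takes a plain `Array[Array[Boolean]]` rather than a KeystoneML `Image` or a Breeze matrix; the object and method names are hypothetical.

```scala
// Hypothetical sketch: labels connected foreground regions with an explicit-stack DFS.
object ConnectedComponents {
  // Offsets of the four direct neighbours (assumption: 4-connectivity).
  private val neighbours = Seq((1, 0), (-1, 0), (0, 1), (0, -1))

  /** Background pixels keep label 0; regions are numbered 1, 2, ... in scan order. */
  def label(binary: Array[Array[Boolean]]): Array[Array[Int]] = {
    val rows = binary.length
    val cols = if (rows == 0) 0 else binary(0).length
    val labels = Array.fill(rows, cols)(0)
    var next = 0
    for (r <- 0 until rows; c <- 0 until cols) {
      if (binary(r)(c) && labels(r)(c) == 0) {
        next += 1
        labels(r)(c) = next
        var stack = List((r, c)) // explicit stack avoids deep recursion
        while (stack.nonEmpty) {
          val (y, x) = stack.head
          stack = stack.tail
          for ((dy, dx) <- neighbours) {
            val ny = y + dy
            val nx = x + dx
            if (ny >= 0 && ny < rows && nx >= 0 && nx < cols &&
                binary(ny)(nx) && labels(ny)(nx) == 0) {
              labels(ny)(nx) = next
              stack = (ny, nx) :: stack
            }
          }
        }
      }
    }
    labels
  }
}
```

An explicit stack is used instead of recursion because a large connected region in a high-resolution image could otherwise overflow the call stack.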

 **Pipeline problem**:
 How can I build a non-linear pipeline?
 For example, suppose I have an image `im: Image` and pass it to a pipeline `p: Pipeline[Image, Seq[Box]]`, so the output is `boxes: Seq[Box]`. Each `Box` is a group of coordinates that I want to crop from the image. I then want to apply an image-cropping transformer, which also needs `im`. How can I pass `im` to the `ImageCropper`? In other words, how can I add an edge from the initial node to the `ImageCropper` node?

 One way to accomplish this is to build your pipeline piecewise, e.g.:

      val data: RDD[Image]
      val prefix: Pipeline[Image, Image] // Some preprocessing steps; I'm assuming this is the logic you want to avoid duplicating.
      val boxExtractor: Pipeline[Image, Seq[Box]] // This is your `p`.
      val imageCropper: Transformer[Image, Image]
      val pipe1 = prefix andThen boxExtractor
      val pipe2 = prefix andThen imageCropper

      val combinedPipeline: Pipeline[Image, Seq[Any]] = Pipeline.gather(Seq(pipe1, pipe2))

 Running `combinedPipeline` on an input image yields a `Seq[Any]`; in this case, a sequence of size 2 whose elements are a `Seq[Box]` and a (cropped) `Image`.
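
The idea behind gathering branches can be sketched in plain Scala. This is an illustration under assumptions, not KeystoneML's `Pipeline.gather`: here a "pipeline" is just a function, and `GatherSketch` and its members are hypothetical names.

```scala
// Hypothetical stand-in: each branch is a plain function from the shared input type.
object GatherSketch {
  /** Runs every branch on the same input and collects the results in branch order. */
  def gather[A, B](branches: Seq[A => B]): A => Seq[B] =
    in => branches.map(branch => branch(in))

  // Two toy branches with different result types, hence the common supertype Any.
  val wordCount: String => Any = s => s.split(" ").length
  val upper: String => Any = s => s.toUpperCase

  val combined: String => Seq[Any] = gather(Seq(wordCount, upper))
  val result = combined("hot rolled steel")
}
```

Because the branches return different types, the combined result is only typed as `Seq[Any]`, so a downstream stage has to cast each element back by position. That loss of type information is the price of this approach.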


 Pipeline Design
 --

 ### KeystoneML Baogang Pipeline Graph

 ![KeystoneML Baogang Pipeline Graph](https://lh3.googleusercontent.com/-M_RzGusBS7c/V4RCiVAFtqI/AAAAAAAAAEE/jAKmKxrGxbIPba2zo2-5GrEMDyeGeNxAQCLcB/s0/Baogang.PNG "Baogang.PNG")

 Our project has two pipelines: one for training and one for inference.

 The pseudocode for the training part is shown in Code 1.1 below.

 **Code 1.1**
 ```scala
 object BaogangTraining extends Logging {
   def run(sc: SparkContext, config: BaogangConfig): Pipeline[Image, Int] = {
     val numClasses = 2
     val trainData = BaogangLoader(sc, config.trainLocation).cache()

     val trainImages = ImageExtractor(trainData)
     val labelExtractor = LabelExtractor andThen
       ClassLabelIndicatorsFromIntLabels(numClasses) andThen
       new Cacher[DenseVector[Double]]
     val trainLabels = labelExtractor(trainData)

     // `a` is a placeholder for the target size of the rescaled image.
     val predictor = ImageReScaler(a, a) andThen
       GrayScaler andThen
       new Cacher[Image] andThen // GrayScaler still outputs an Image
       ImageVectorizer andThen
       // (new StandardScaler, trainImages) andThen
       ConvolutionalNormalization andThen
       new Cacher[DenseVector[Double]] andThen
       (ConvolutionalTrainer(config), trainImages, trainLabels)

     // `usefulImageExtractor` is the extractor defined in Code 1.2.
     val testData = ImageExtractor(BaogangLoader(sc, config.trainLocation)).cache()
     val processedTestImage = usefulImageExtractor.apply(testData)
     val testPredicted = predictor(processedTestImage)

     predictor // return the fitted pipeline
   }
 }
 ```
 The workflow is:

 ImageExtractor -> ImageReScaler -> GrayScaler -> ImageVectorizer (-> StandardScaler) -> ConvolutionalNormalization -> ConvolutionalTrainer

 The signatures of the corresponding objects are shown in Code 1.3.

 The pseudocode for the inference part is shown in Code 1.2 below. The image processing part has two sub-pipelines.

 **Code 1.2**
 ```scala
 object BaogangInference extends Logging {
   def run(sc: SparkContext, config: BaogangConfig): Pipeline[Image, Int] = {
     val testImages = ImageExtractor(BaogangLoader(sc, config.testLocation)).cache()

     // Container keeps the original image next to the extracted boxes,
     // so that ImageBoxCropper can crop from it.
     val scrapProcessor = new Container(GrayScaler andThen
       new ImFilter("replicate", k1) andThen
       new ImFilter("replicate", k2) andThen
       new BinaryConvertor(0.1) andThen
       ConnectedComponentLabeler andThen
       new UsefulBoxExtractor) andThen
       new ImageBoxCropper

     val acidProcessor = new Container(GrayScaler andThen
       ImageReScaler andThen
       ImageMatrixizer andThen
       FFT2 andThen
       ToSaltMaper andThen
       MatToGrayConvertor andThen
       new UsefulBoxExtractor) andThen
       new ImageBoxCropper

     // The combined extractor must be defined before it is applied.
     val usefulImageExtractor = Pipeline.gather { scrapProcessor :: acidProcessor :: Nil } andThen Combiner
     val processedTestImage = usefulImageExtractor.apply(testImages)

     val predictor = LoadBaogangPredictor()
     val predict = predictor(processedTestImage)

     usefulImageExtractor andThen predictor // return the full inference pipeline
   }
 }
 ```
 The workflow is:

 p1 = ImageExtractor -> GrayScaler -> ImFilter("replicate", k1) -> ImFilter("replicate", k2) -> BinaryConvertor(0.1) -> ConnectedComponentLabeler -> BoxExtractor -> ImageBoxCropper

 p2 = ImageExtractor -> GrayScaler -> ImageReScaler -> ImageMatrixizer -> FFT2 -> ToSaltMaper -> MatToGrayConvertor -> BoxExtractor -> ImageBoxCropper

 p1 + p2 -> LoadBaogangPredictor -> predict

 The signatures of the corresponding objects are shown in Code 1.3.

 **Code 1.3**
 ``` scala
 // Signatures only; bodies marked ??? are still to be implemented.
 ImFilter(imName: String) extends Transformer[Image, Image]
 BinaryConvertor(threshold: Double) extends Transformer[Image, Image] // Code 1.2 passes 0.1
 ConnectedComponentLabeler extends Transformer[Image, DenseMatrix[Double]]
 BoxExtractor extends Transformer[DenseMatrix[Double], Seq[BoundingBox]] {
   override def apply(labelMatrix: DenseMatrix[Double]): Seq[BoundingBox] =
     boundingBoxGroups(labelMatrix)

   def boundingBoxGroups(labelMatrix: DenseMatrix[Double]): Seq[BoundingBox] = ???
 }
 LengthFilter(length: Int) extends Transformer[Seq[BoundingBox], Seq[BoundingBox]]
 // Output is Seq[Image] so that the gathered branches match Combiner's input type.
 ImageBoxCropper extends Transformer[(Image, Seq[BoundingBox]), Seq[Image]] {
   // in._1 is the source image, in._2 holds the boxes to crop from it.
   override def apply(in: (Image, Seq[BoundingBox])): Seq[Image] =
     in._2.map(box => cropImage(box, in._1))

   def cropImage(box: BoundingBox, image: Image): Image = ???
 }

 ImageReScaler extends Transformer[Image, Image]
 ImageMatrixizer extends Transformer[Image, DenseMatrix[Double]]
 case object FFT2 extends Transformer[DenseMatrix[Double], DenseMatrix[Double]]
 ToSaltMaper extends Transformer[DenseMatrix[Double], DenseMatrix[Double]]
 MatToGrayConvertor extends Transformer[DenseMatrix[Double], Image]
 Combiner extends Transformer[Seq[Seq[Image]], Image]
 // Pairs every input with the output of the wrapped pipeline.
 class Container[A, B](p1: Pipeline[A, B]) extends Pipeline[A, (A, B)] {
   override def apply(in: RDD[A]): RDD[(A, B)] =
     in.zip(p1.apply(in))
 }

 class ConvolutionalPredictor(Configurations ...) extends LabelEstimator[DenseVector[Double], DenseVector[Double], DenseVector[Double]]

 class Displayer extends Transformer[(BoundingBox, Image, Int), Unit]
 ```
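
The reason for wrapping a branch in a container is that the cropper needs both the original image and the boxes found in it. The sketch below shows that pairing with plain functions. It rests on assumptions: `Img`, `Box`, `container`, and `cropAll` are hypothetical names, images are nested `Vector`s, and boxes are assumed to lie inside the image.

```scala
object ContainerSketch {
  // Hypothetical types for illustration; not KeystoneML's Image or BoundingBox.
  type Img = Vector[Vector[Int]]
  final case class Box(top: Int, left: Int, height: Int, width: Int)

  /** Wraps a stage so its output is returned together with the original input. */
  def container[A, B](stage: A => B): A => (A, B) = in => (in, stage(in))

  /** Crops one sub-image per box from the image it is paired with. */
  val cropAll: ((Img, Seq[Box])) => Seq[Img] = { case (image, boxes) =>
    boxes.map { b =>
      image.slice(b.top, b.top + b.height).map(_.slice(b.left, b.left + b.width))
    }
  }

  // Toy box extractor: always proposes the top-left 2x2 patch.
  val findBoxes: Img => Seq[Box] = _ => Seq(Box(0, 0, 2, 2))

  // The container adds the "edge" from the input to the cropping stage.
  val cropPipeline: Img => Seq[Img] = container(findBoxes) andThen cropAll

  val image: Img = Vector(Vector(1, 2, 3), Vector(4, 5, 6), Vector(7, 8, 9))
  val crops = cropPipeline(image)
}
```

Compared with gathering two branches into a `Seq[Any]`, returning a tuple keeps both element types visible to the compiler, so the cropping stage needs no casts.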

