vkurpad · February 12, 2020 21:49
diff --git a/Reprocessor b/Reprocessor
 For this scenario we are going to extend the [km-aml solution accelerator](https://github.com/microsoft/solution-accelerator-km-aml) to use a custom corpus. The goal is to start with a corpus of data and work through the following steps:
 1. Skim through the documents to identify a set of entities that should be recognized
 2. Create a list of entities
 3. Create an enrichment pipeline with a skill that takes in the list of entities and labels the text with IOB tags
 4. Train a custom entity classifier on this labeled dataset
 5. Update the enrichment pipeline to use the newly minted entity classifier
 6. Reprocess the documents to now identify the labeled entities and other similar entities
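
Step 3 above, labeling text with IOB tags from a list of entities, can be sketched roughly as follows. This is a minimal illustration and not the accelerator's actual skill: the function name `iob_tag`, the single `ENTITY` label, and the regex tokenization are all assumptions made for the example.

```python
import re

# Hypothetical tokenizer: words and single punctuation marks become tokens.
TOKEN_PATTERN = r"\w+|[^\w\s]"


def iob_tag(text, entities, label="ENTITY"):
    """Tag each token of `text` with B-/I-/O using a list of entity phrases."""
    tokens = re.findall(TOKEN_PATTERN, text)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)

    # Match longer phrases first so "Cognitive Search" wins over "Search".
    phrases = sorted(
        (re.findall(TOKEN_PATTERN, e.lower()) for e in entities),
        key=len,
        reverse=True,
    )
    for phrase in phrases:
        n = len(phrase)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if lowered[i:i + n] == phrase and window_free:
                tags[i] = "B-" + label          # first token of the entity
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label      # remaining tokens of the entity
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sample = "Azure Cognitive Search indexes documents."
    for token, tag in iob_tag(sample, ["Cognitive Search", "Search"]):
        print(token, tag)
```

Matching is case-insensitive and exact on tokens, so the output is only as good as the entity list. That is the point of steps 4 to 6: the classifier trained on these labels is expected to generalize to similar entities that the list does not contain.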