@vkurpad
Last active February 12, 2020 21:49
Building an Azure Cognitive Search enrichment pipeline that supports rapid iterations
For this scenario, we are going to extend the [km-aml solution accelerator](https://github.com/microsoft/solution-accelerator-km-aml) to use a custom corpus. Starting with a corpus of data, the steps are:
1. Skim through the documents to identify a set of entities that should be recognized
2. Create a list of entities
3. Create an enrichment pipeline with a skill that takes in the list of entities and labels the text with IOB (inside, outside, beginning) tags
4. Train a custom entity classifier on this labeled dataset
5. Update the enrichment pipeline to use the newly minted entity classifier
6. Reprocess the documents to now identify the labeled entities and other similar entities
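
The labeling in step 3 can be sketched roughly as follows. This is a minimal illustration, not the accelerator's actual skill: the function name `iob_tag`, the whitespace tokenization, the case-insensitive longest-match rule, and the sample labels are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def iob_tag(tokens: List[str], entities: Dict[str, str]) -> List[Tuple[str, str]]:
    """Label tokens with IOB tags using a phrase -> label dictionary.

    Hypothetical helper: matching is case-insensitive and prefers the
    longest entity phrase that starts at the current token.
    """
    # Split each entity phrase into lowercase words; try longer phrases first.
    phrases = sorted(
        ((phrase.lower().split(), label) for phrase, label in entities.items()),
        key=lambda item: len(item[0]),
        reverse=True,
    )
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)  # "O" marks tokens outside any entity

    i = 0
    while i < len(tokens):
        for words, label in phrases:
            n = len(words)
            if n and lowered[i:i + n] == words:
                tags[i] = f"B-{label}"          # first token of the entity
                for j in range(i + 1, i + n):
                    tags[j] = f"I-{label}"      # remaining tokens of the entity
                i += n
                break
        else:
            i += 1  # no entity starts here; move to the next token
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sample_entities = {"Contoso": "ORG", "New York City": "LOC", "New York": "LOC"}
    sample_tokens = "Contoso opened an office in New York City".split()
    for token, tag in iob_tag(sample_tokens, sample_entities):
        print(f"{token}\t{tag}")
```

The token/tag pairs produced this way are the kind of labeled dataset that step 4 would train on; a real skill would likely need a proper tokenizer and handling for punctuation and overlapping entities.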