@ikwattro
Created July 27, 2018 11:58

Create an NLP pipeline

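// Register a Stanford-based pipeline named "transcript" with tokenization,
// named-entity recognition, and dependency parsing enabled.
CALL ga.nlp.processor.addPipeline({
  name: 'transcript',
  textProcessor: 'com.graphaware.nlp.processor.stanford.StanfordTextProcessor',
  processingSteps: {tokenize: true, ner: true, dependencies: true}
})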

Run the caption text analysis

// Annotate each Caption node, one per transaction (batchSize: 1),
// and link it to the resulting annotated text.
CALL apoc.periodic.iterate(
  'MATCH (n:Caption) RETURN n',
  'CALL ga.nlp.annotate({
     text: n.text,
     id: id(n),
     pipeline: "transcript",
     checkLanguage: false
   })
   YIELD result MERGE (n)-[:HAS_ANNOTATED_TEXT]->(result)',
  {batchSize: 1, iterateList: false})

Extract best keywords from captions

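// Run TextRank keyword extraction on each annotated text,
// one per transaction, with useDependencies enabled.
CALL apoc.periodic.iterate(
  'MATCH (n:Caption)-[:HAS_ANNOTATED_TEXT]->(at) RETURN at',
  'CALL ga.nlp.ml.textRank({
     annotatedText: at,
     useDependencies: true
   })
   YIELD result RETURN count(*)',
  {batchSize: 1, iterateList: false})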
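
Inspect the extracted keywords

Once TextRank has run, the keywords can be inspected with a query along these lines. This is a sketch and not part of the original steps: it assumes that TextRank stores its results as `Keyword` nodes linked to the annotated text by a `DESCRIBES` relationship, with the keyword text in a `value` property. I believe these are the GraphAware NLP defaults, but check them against your own graph and adjust the names if they differ.

```cypher
// Assumption: keywords are stored as (:Keyword)-[:DESCRIBES]->(annotated text)
// and the keyword text lives in k.value. Verify against your own graph.
MATCH (n:Caption)-[:HAS_ANNOTATED_TEXT]->(at)<-[:DESCRIBES]-(k:Keyword)
RETURN k.value AS keyword, count(DISTINCT n) AS captions
ORDER BY captions DESC
LIMIT 20
```

The query lists up to 20 keywords, ordered by the number of distinct captions each one describes.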