uvicorn[standard]
gunicorn
nltk
easynmt
sacremoses
FROM amazon/aws-lambda-python:3.9
COPY ./requirements.txt .
RUN yum -y install gcc-c++
RUN pip install --no-cache-dir torch --extra-index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache-dir -r requirements.txt
ENV HOME=/tmp
Your task is to find unique individuals in the given text. As a result, return an array of objects in the given format:
[
  {
    firstName: string,
    lastName: string,
    presumedGender: "male" | "female" | "unknown"
  }
]
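A minimal Python sketch of how this prompt might be wired up, assuming the model replies with strict JSON (quoted keys). The names `build_payload`, `parse_people`, and `ALLOWED_GENDERS` are hypothetical and not part of the original files. Serializing with `json.dumps` escapes quotes and newlines in the input text, and the parser rejects replies that do not match the format above and drops case-insensitive duplicates.

```python
import json

# Assumption: the prompt above is sent as the system message.
SYSTEM_PROMPT = (
    "Your task is to find unique individuals in the given text. "
    "As a result, return an array of objects in the given format: "
    "[{firstName: string, lastName: string, "
    "presumedGender: 'male' | 'female' | 'unknown'}]"
)

ALLOWED_GENDERS = {"male", "female", "unknown"}


def build_payload(text: str, model: str = "gpt-4") -> dict:
    """Build a chat-style request body with the text as the user message."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": text},
        ],
    }


def parse_people(raw: str) -> list:
    """Parse a reply and keep only unique, well-formed person objects."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    seen = set()
    people = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError("expected an array of objects")
        first = item.get("firstName")
        last = item.get("lastName")
        gender = item.get("presumedGender")
        if not isinstance(first, str) or not isinstance(last, str):
            raise ValueError("firstName and lastName must be strings")
        if not isinstance(gender, str) or gender not in ALLOWED_GENDERS:
            raise ValueError(f"unexpected presumedGender: {gender!r}")
        # Treat names that differ only by letter case as the same person.
        key = (first.casefold(), last.casefold())
        if key in seen:
            continue
        seen.add(key)
        people.append(
            {"firstName": first, "lastName": last, "presumedGender": gender}
        )
    return people
```

For example, `json.dumps(build_payload(text))` yields a valid request body even when `text` contains double quotes, and `parse_people` raises `ValueError` for a reply whose `presumedGender` is `"null"`.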
{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "Your task is to find unique individuals in the given text. As a result return an array of objects in given format: [{firstName: string, lastName: string, presumedGender: 'male' | 'female' | 'unknown'}] As an answer I expect only an array of objects."
    },
    {
      "role": "user",
      "content": "${textGoesHere}"
[
  {
    "firstName": "Lisa",
    "lastName": "Turner",
    "presumedGender": "female"
  },
  {
    "firstName": "Mark",
    "lastName": "Davis",
    "presumedGender": "male"
[
  {
    "text": "Lisa Turner",
    "person": {
      "firstName": "lisa",
      "lastName": "turner",
      "honorific": "",
      "presumed_gender": "female"
    }
  },
[
  {
    "text": "Lisa Turner, Mark Davis",
    "person": {
      "firstName": "lisa turner",
      "lastName": "mark davis",
      "honorific": "",
      "presumed_gender": "null"
    }
  }
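import json

import spacy  # required by spacy.load() below
from datasets import load_dataset, Dataset
from flask import Flask, request
from nltk.tokenize import sent_tokenize, word_tokenize

# Load the spaCy transformer pipeline and add the SpanMarker NER component
# (the "span_marker" factory requires the span_marker package to be installed)
nlp = spacy.load("en_core_web_trf")
nlp.add_pipe("span_marker", config={"model": "lxyuan/span-marker-bert-base-multilingual-cased-multinerd"})

app = Flask(__name__)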
import spacy

nlp = spacy.load("en_core_web_trf")
nlp.add_pipe("span_marker", config={"model": "lxyuan/span-marker-bert-base-multilingual-cased-multinerd"})


def extract_people(text: str):
    entities = nlp(text)
    full_names = set()
    for entity in entities.ents:
from span_marker import SpanMarkerModel

modelPreTrained = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-mbert-base-multinerd")
modelPreTrained.try_cuda()


def extract_people(text: str):
    # predict() returns a list of dicts (with keys such as "span" and "label"),
    # not a spaCy Doc, so iterate over the list directly
    entities = modelPreTrained.predict(text)
    full_names = set()
    for entity in entities:
        if entity['label'] == 'PER':
            # Check if the entity has both a first name and a last name