Created
December 3, 2022 15:14
search app generated with
"""
Query:
Write a gradio app that shows results of searching in a list of texts:
- tokenizing texts with nltk
- using rank_bm25 library
- displaying results as dataframe under the search box
Response:
Here is an example of a gradio app that uses the rank_bm25 library to search through a list of texts, tokenizes the texts and queries using nltk, and shows the results as a dataframe under the search box:
"""
import gradio as gr
import nltk
import pandas as pd
from rank_bm25 import BM25Okapi

# word_tokenize needs the punkt tokenizer data; newer nltk releases look for "punkt_tab" instead
for resource in ("punkt", "punkt_tab"):
    nltk.download(resource, quiet=True)

# list of texts to search through
texts = [
    "Lorem ipsum dolor sit amet",
    "consectetur adipiscing elit",
    "sed do eiusmod tempor incididunt",
    "ut labore et dolore magna aliqua",
]


def tokenize(text):
    # tokenize with nltk; lowercase so that matching is case-insensitive
    return nltk.word_tokenize(text.lower())


# create the BM25Okapi ranking model; it expects a list of token lists, not raw strings
bm25 = BM25Okapi([tokenize(text) for text in texts])


# function to search the texts using BM25Okapi and return the top matching results as a dataframe
def search_texts(query):
    # tokenize the query the same way as the texts
    query_tokens = tokenize(query)
    # use the BM25Okapi model to score the relevance of each text to the query
    scores = bm25.get_scores(query_tokens)
    # create a dataframe of the texts and their relevance scores
    df = pd.DataFrame({"Text": texts, "Score": scores})
    # sort the dataframe by relevance score and return the top 3 results
    return df.sort_values(by="Score", ascending=False).head(3)


# create the gradio app with a text input, a dataframe output, and the "search_texts" function
# (gr.Textbox and gr.Dataframe replace the deprecated gr.inputs / gr.outputs namespaces)
app = gr.Interface(
    fn=search_texts,
    inputs=gr.Textbox(label="Search query"),
    outputs=gr.Dataframe(headers=["Text", "Score"]),
    live=True,
)

if __name__ == "__main__":
    app.launch()
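To make the scoring step less of a black box, here is a minimal, standard-library-only sketch of Okapi BM25 over the same four texts. It is an illustration, not the rank_bm25 implementation: the function name `bm25_scores`, the whitespace tokenization, and the always-positive IDF variant are assumptions chosen for brevity, so the absolute scores will differ from what `BM25Okapi.get_scores` returns even though the ranking idea is the same.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with a basic Okapi BM25.

    Hypothetical helper for illustration; k1 and b are common default values.
    """
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # document frequency: in how many documents each term occurs
    doc_freq = Counter(term for doc in corpus_tokens for term in set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        # longer-than-average documents are penalised through this term
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            # smoothed, always-positive IDF (an assumption, not rank_bm25's exact formula)
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# whitespace tokenization stands in for nltk.word_tokenize to keep the sketch dependency-free
corpus = [
    text.lower().split()
    for text in [
        "Lorem ipsum dolor sit amet",
        "consectetur adipiscing elit",
        "sed do eiusmod tempor incididunt",
        "ut labore et dolore magna aliqua",
    ]
]
scores = bm25_scores("dolor sit".split(), corpus)
best = max(range(len(corpus)), key=scores.__getitem__)
print(best, scores)
```

Only the first text contains the query terms, so it is the only one with a non-zero score; texts that share no term with the query score exactly zero, which is why the app can return rows with a score of 0 when fewer than three texts match.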