## run mongodb of files through SPECTER's API https://github.com/allenai/paper-embedding-public-apis
from typing import Dict, List
import json
import requests

URL = "https://model-apis.semanticscholar.org/specter/v1/invoke"
MAX_BATCH_SIZE = 16

def chunks(lst, chunk_size=MAX_BATCH_SIZE):
    """Splits a longer list to respect batch size"""
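The snippet above is cut off after the docstring. A minimal sketch of how such a batching helper might be completed, assuming it yields successive slices of at most `chunk_size` items (the loop body is an assumption, not the original author's code):

```python
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 16  # same constant as in the snippet above


def chunks(lst: Sequence, chunk_size: int = MAX_BATCH_SIZE) -> Iterator[Sequence]:
    """Yield successive slices of lst, each holding at most chunk_size items."""
    # Assumed body: step through the list in strides of chunk_size.
    for start in range(0, len(lst), chunk_size):
        yield lst[start:start + chunk_size]
```

With 40 items and the default batch size, this yields batches of 16, 16, and 8 items, so each request to the API stays within the batch limit.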
FROM python:3.7
EXPOSE 8501
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip3 install -r requirements.txt
COPY . .
CMD streamlit run app.py
import bq_helper
from bq_helper import BigQueryHelper

# https://www.kaggle.com/sohier/introduction-to-the-bq-helper-package
stackOverflow = bq_helper.BigQueryHelper(active_project="bigquery-public-data",
                                         dataset_name="stackoverflow")
bq_assistant = BigQueryHelper("bigquery-public-data", "stackoverflow")
bq_assistant.list_tables()
# ['badges',
#  'comments',
library(tidyverse)

set.seed(123)

returns <- read_csv("returns.csv") %>%
  select(Year, equities_sp, treasury_10yr) %>%
  gather(key = "Asset", value = "Returns", -Year) %>%
  mutate(Asset = ifelse(Asset == "equities_sp",
                        "Asset A: High risk, high return",
                        "Asset B: Low risk, low return"))
library(httr)
library(tidyverse)

start_time <- Sys.time()

getSimulation <- function(i, numSim = 1000) {
  url <- "http://rw-simulation.herokuapp.com/get_returns_array?stock_array="
  array <- paste0("[", paste(rep(i, numSim), collapse = ","), "]")
  f <- GET(paste0(url, array))
  json <- content(f, "text")
file_output = 'data/emails-hits.jsonl'  # name of output file

(df_hits[['Message-ID', 'From', 'To', 'Subject', 'Body']]
    .rename(columns={"Body": "text", "Message-ID": "id", "From": "from", "To": "to", "Subject": "subject"})  # optional rename
    .groupby(['text'])  # group by text
    .apply(lambda x: x[['id', 'from', 'to', 'subject']].to_dict(orient='list'))  # columns to nest under meta
    .reset_index()
    .rename(columns={0: 'meta'})  # name the nested column meta
    .to_json(file_output, orient='records', lines=True))
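The chained pandas expression above groups rows by `text` and nests the remaining columns as lists under `meta`, writing one JSON object per line. The resulting record shape can be illustrated without pandas. This is a plain-Python sketch, not the original method; `nest_by_text` and `to_jsonl` are hypothetical helper names, and the sample rows in the usage note are made up for illustration.

```python
import json
from collections import defaultdict


def nest_by_text(rows, meta_keys=("id", "from", "to", "subject")):
    """Group rows by their "text" value and collect the other fields as lists."""
    grouped = defaultdict(lambda: {key: [] for key in meta_keys})
    for row in rows:
        for key in meta_keys:
            grouped[row["text"]][key].append(row[key])
    # Sort by text to mirror the default sorted group order of pandas groupby.
    return [
        {"text": text, "meta": meta}
        for text, meta in sorted(grouped.items(), key=lambda item: item[0])
    ]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)
```

For example, two rows sharing the same `text` collapse into a single record whose `meta["id"]` list holds both ids, which is the structure the `.groupby(...).apply(...)` step produces before `.to_json(..., lines=True)` writes it out.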
library(rethinking)
library(shiny)

f <- alist(
  W ~ dbinom(W + L, p),  # binomial likelihood
  p ~ dunif(0, 1)        # uniform prior
)

ui <- fluidPage(
  titlePanel("Globe Tossing Problem"),
import prodigy
import requests

def get_stream():
    res = requests.get("https://owen-wilson-wow-api.herokuapp.com/wows/random?results=10").json()
    for i in res:
        movie = i["movie"]
        url = i["video"]["480p"]
        yield {"video": url, "text": movie}
from prodigy.components.loaders import JSONL
import prodigy
import matplotlib as mpl
import spacy
import shap  # shap requires numba, which requires numpy < 1.22; you may need to downgrade numpy to 1.21.6

def predict(texts):
    """Convert a list of texts to bare strings and use textcat to predict"""
    texts = [str(text) for text in texts]
    results = []
from prodigy.components.db import connect

# pull examples from dataset
db = connect()
examples = db.get_dataset("textcat-samp")

# convert rejects into accepts with a "not_"-prefixed label
new_examples = []
for eg in examples:
    if eg["answer"] == "reject":
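The loop above is cut off at the `if` branch. One way such a conversion might look is sketched below on plain dictionaries. It assumes each example carries a top-level string `"label"` next to `"answer"`, which the original snippet does not show. `rejects_to_negative_accepts` is a hypothetical name, and the rest of the original loop may differ.

```python
import copy


def rejects_to_negative_accepts(examples, prefix="not_"):
    """Return copies of examples where each reject becomes an accept of a prefixed label."""
    converted = []
    for eg in examples:
        eg = copy.deepcopy(eg)  # leave the caller's examples untouched
        # Assumption: the label lives in a top-level "label" field.
        if eg.get("answer") == "reject" and "label" in eg:
            eg["label"] = prefix + eg["label"]
            eg["answer"] = "accept"
        converted.append(eg)
    return converted
```

Working on copies keeps the examples pulled from the dataset unchanged, so the converted list can be inspected before anything is written back.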