# Conor, Fernando, Davide + wisdom of the crowd solution for
# https://github.com/marjaimate/runlength
defmodule Runlength do
  def encode(string) do
    encode(string, "")
  end
  def encode("", acc) do
    acc
'<?xml version="1.0" encoding="UTF-8"?>\n<TEI xmlns="http://www.tei-c.org/ns/1.0" \nxmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" \nxsi:schemaLocation="http://www.tei-c.org/ns/1.0 /opt/grobid/grobid-home/schemas/xsd/Grobid.xsd"\n xmlns:xlink="http://www.w3.org/1999/xlink">\n\t<teiHeader xml:lang="fr">\n\t\t<encodingDesc>\n\t\t\t<appInfo>\n\t\t\t\t<application version="0.5.1-SNAPSHOT" ident="GROBID" when="2018-03-20T16:03+0000">\n\t\t\t\t\t<ref target="https://github.com/kermitt2/grobid">GROBID - A machine learning software for extracting information from scholarly documents</ref>\n\t\t\t\t</application>\n\t\t\t</appInfo>\n\t\t</encodingDesc>\n\t\t<fileDesc>\n\t\t\t<titleStmt>\n\t\t\t\t<title level="a" type="main">Manuscript Title Author Name 1 2</title>\n\t\t\t</titleStmt>\n\t\t\t<publicationStmt>\n\t\t\t\t<publisher/>\n\t\t\t\t<availability status="unknown"><licence/></availability>\n\t\t\t</publicationStmt>\n\t\t\t<sourceDesc>\n\t\t\t\t<biblStruct>\n\t\t\t\t\t<analytic>\n\t\t\t\t\t\t<title level="a"
Lorem ipsum dolor sit amet, consectetur adipiscing elit. In euismod nisl vel tortor dignissim porttitor. Cras vitae auctor diam, sed bibendum magna. Mauris ut ligula tempus, ullamcorper sapien rhoncus, venenatis nisi. Suspendisse scelerisque, dolor eget vestibulum iaculis, tellus turpis eleifend libero, eget scelerisque purus risus vel magna. Nunc sit amet leo luctus, placerat ligula sit amet, elementum ipsum. Mauris consequat nibh vitae erat posuere ultrices. Morbi quis sem ac nisi porta dapibus. Etiam scelerisque non orci non luctus.
Duis in risus quis nibh laoreet ornare ut in urna. Vivamus at turpis egestas, tempor nibh viverra, tristique metus. Aliquam interdum tristique sapien eu interdum. Sed non est efficitur, placerat turpis ultrices, tincidunt nibh. Phasellus nibh arcu, feugiat in arcu quis, egestas blandit velit. Morbi convallis feugiat magna. Etiam iaculis dui faucibus nisl congue efficitur. Aliquam erat volutpat. Proin vitae leo suscipit, lacinia nisi elementum, gravida diam. Suspendisse ornare,
# One can check contributions/edits at
# https://en.wikipedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user
# profile pages are at
# https://en.wikipedia.org/wiki/User:USERNAME
SELECT contributor_username, COUNT(id) AS counts
FROM [bigquery-public-data:samples.wikipedia]
WHERE comment LIKE '%grammar%'
GROUP BY contributor_username
ORDER BY counts DESC LIMIT 10;
Suppose that we want to break
{Hello}, {World}
into
{Hello},
{World}
To do so, use:
=SUBSTITUTE(B2, "}, {", CONCATENATE("},", CHAR(10), "{"))
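The same substitution can be sketched outside a spreadsheet. The Python snippet below is only an illustration, not part of the original gist: it replaces every "}, {" separator with "}," followed by a line feed (the character that CHAR(10) produces) and "{". The function name break_after_comma is hypothetical.

```python
def break_after_comma(cell: str) -> str:
    """Put each {...} group on its own line.

    Mirrors the spreadsheet formula: every '}, {' separator becomes
    '},' + line feed + '{'. The function name is hypothetical.
    """
    return cell.replace("}, {", "},\n{")


if __name__ == "__main__":
    # Prints the two groups on separate lines.
    print(break_after_comma("{Hello}, {World}"))
```

Text without a "}, {" separator is returned unchanged, which matches how SUBSTITUTE behaves when the search string is not found.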
import os
import numpy as np
import shutil
from sklearn.model_selection import train_test_split

cats = ['negative', 'positives']
for cat in cats:
    print(cat)
    # data_folder is not defined in this excerpt; it must be set before this loop runs
    if not os.path.exists(data_folder + "/train/" + cat):
        os.makedirs(data_folder + "/train/" + cat)
import prodigy
from prodigy.components.loaders import Images
from prodigy.util import split_string

def add_label_to_stream(stream, label):
    for eg in stream:
        # The 'label' you get from the command line is a list
        # so let's just assume it's always one and take the first
        eg["label"] = label[0]
        yield eg