Neutral
Neutral
Neutral
Positive
Neutral
Neutral
Negative
Neutral
Neutral
Neutral
{
  "embeddings": [
    {
      "tensorName": "The soft VSM with non-regularized word embeddings on the TWITTER dataset",
      "tensorShape": [
        3108,
        3
      ],
      "tensorPath": "https://gist.githubusercontent.com/Witiko/860f86ca52c89ee97714371ac2a91a62/raw/8df9801310d78223e67520fad47ba2cc7db0ac2d/docsim-dense_scm-twitter-1-False-True-True-800--1.0-2-vectors.csv",
      "metadataPath": "https://gist.githubusercontent.com/Witiko/860f86ca52c89ee97714371ac2a91a62/raw/8df9801310d78223e67520fad47ba2cc7db0ac2d/docsim-dense_scm-twitter-1-False-True-True-800--1.0-2-metadata.csv"
#!/bin/sh
# Produces the mean amount of financial support by extracting project codes from a PDF document and querying starfos.tacr.cz.
#
# Usage: ./get-mean-tacr-support.sh FILE, where FILE is a PDF document with a table of supported projects, such as
# https://www.tacr.cz/wp-content/uploads/documents/2019/10/29/1572358378_Vyhlaseni_vysledku_eTA_na_web_-_podporene.pdf
set -e
# Quote the pattern so that the shell cannot expand it as a filename glob.
pdfgrep 'TL[0-9]+' "$1" |
sed -r 's/.*\s(TL[0-9]+)(\s.*|$)/\1/'
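The extraction step of the script above can be sketched in Python as follows. This is only an illustration of the `sed` substitution, not part of the original script: the function name `extract_project_code` and the sample lines are made up, and the sketch returns `None` for lines that do not match, whereas `sed` would pass such lines through unchanged.

```python
import re

# Same pattern as the sed expression: anything, whitespace, a TL code,
# then either whitespace plus the rest of the line or the end of the line.
PROJECT_CODE = re.compile(r'.*\s(TL[0-9]+)(?:\s.*|$)')


def extract_project_code(line):
    """Return the TL project code found in a table row, or None if there is none.

    Hypothetical helper that mirrors the sed step; it needs whitespace
    before the code, exactly as the original expression does.
    """
    match = PROJECT_CODE.match(line)
    return match.group(1) if match else None


# Made-up sample rows, only to exercise the pattern.
sample_lines = [
    "1 TL02000123 Example project title",
    "A header row without any project code",
]
codes = [extract_project_code(line) for line in sample_lines]
```

Because the leading `.*` is greedy, a row that contains several codes yields the last one, which matches the behaviour of the `sed` expression.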
# -*- coding:utf-8 -*-
from itertools import dropwhile
import json
import re
import sys
import matplotlib.pyplot as plt
def interpret_soft_cosine_measure(doc1, doc2, dictionary, similarity_matrix):
    word_pair_importances = dict()
    for word1_id, word1_weight in doc1:
        for word2_id, word2_weight in doc2:
            word_similarity = similarity_matrix.matrix[word1_id, word2_id]
            word_pair_importance = word1_weight * word_similarity * word2_weight
            if word_pair_importance == 0:
                continue
            word1 = dictionary.id2token[word1_id]
            word2 = dictionary.id2token[word2_id]
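The preview above stops mid-function, so the rest of the original is not known. As a minimal self-contained sketch of the same idea, the code below computes the word-pair importances with plain Python containers and shows that they sum to the unnormalized soft inner product of the two documents. Everything here is an assumption for illustration: `id2token` is a plain dict, `similarity` is a nested list standing in for `similarity_matrix.matrix`, and the toy vocabulary and weights are made up.

```python
from math import sqrt


def word_pair_importances(doc1, doc2, id2token, similarity):
    """Map (word1, word2) to weight1 * similarity * weight2, skipping zero pairs."""
    importances = {}
    for word1_id, word1_weight in doc1:
        for word2_id, word2_weight in doc2:
            importance = word1_weight * similarity[word1_id][word2_id] * word2_weight
            if importance == 0:
                continue
            importances[(id2token[word1_id], id2token[word2_id])] = importance
    return importances


def soft_inner_product(doc1, doc2, similarity):
    """Sum of weight1 * similarity * weight2 over all word pairs."""
    return sum(w1 * similarity[i][j] * w2 for i, w1 in doc1 for j, w2 in doc2)


def soft_cosine(doc1, doc2, similarity):
    """Soft inner product normalized by the soft norms of both documents."""
    norm = sqrt(soft_inner_product(doc1, doc1, similarity)
                * soft_inner_product(doc2, doc2, similarity))
    return soft_inner_product(doc1, doc2, similarity) / norm if norm else 0.0


# Toy data (hypothetical): three words, where "cat" and "kitten" are related.
id2token = {0: "cat", 1: "kitten", 2: "car"}
similarity = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
doc1 = [(0, 1.0), (2, 1.0)]  # bag of words as (word id, weight) pairs
doc2 = [(1, 2.0), (2, 1.0)]

importances = word_pair_importances(doc1, doc2, id2token, similarity)
score = soft_cosine(doc1, doc2, similarity)
```

Sorting `importances` by value gives a ranking of the word pairs that contribute most to the similarity score, which is presumably the purpose of the function previewed above.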