# The [Anki repository](http://www.manythings.org/anki/) offers many sentence pairs for language learning, which makes it well suited to training a translation network.
# Judging the quality of a translation is easier with some understanding of both languages, so in my case
# the [Dutch-English](http://www.manythings.org/anki/nld-eng.zip),
# [French-English](http://www.manythings.org/anki/fra-eng.zip)
# and [German-English](http://www.manythings.org/anki/deu-eng.zip) sets were a good fit.
import string
import re
from pickle import dump
from unicodedata import normalize
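A minimal sketch of how sentence pairs like these might be cleaned before training. It assumes each line of the downloaded file holds an English sentence and its translation separated by a tab, with any further columns ignored. The helper names `clean_sentence` and `parse_pairs` are hypothetical and are not taken from the original snippet; the cleaning steps are one plausible use of the imports above, not necessarily the author's.

```python
import re
import string
from unicodedata import normalize

# Translation table that deletes ASCII punctuation.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)
# Matches any character that is not printable ASCII.
_NON_PRINTABLE = re.compile("[^%s]" % re.escape(string.printable))


def clean_sentence(sentence):
    """Lowercase, strip accents and punctuation, keep purely alphabetic tokens."""
    # Decompose accented characters, then drop the non-ASCII combining marks.
    ascii_text = normalize("NFD", sentence).encode("ascii", "ignore").decode("ascii")
    words = []
    for word in ascii_text.lower().split():
        word = word.translate(_PUNCT_TABLE)
        word = _NON_PRINTABLE.sub("", word)
        if word.isalpha():
            words.append(word)
    return " ".join(words)


def parse_pairs(text):
    """Split tab-separated lines into (source, target) pairs of cleaned sentences."""
    pairs = []
    for line in text.strip().split("\n"):
        columns = line.split("\t")
        if len(columns) < 2:
            continue  # skip lines without a translation column
        pairs.append((clean_sentence(columns[0]), clean_sentence(columns[1])))
    return pairs
```

Stripping accents keeps the vocabulary small but loses information in languages such as French and German, so this step may be worth dropping when accents matter for the translation.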
import argparse
import matplotlib.pyplot as plt
import numpy as np
from keras.layers.core import Dense
from keras.models import Sequential
from numpy import array
from scipy import signal
'''Sequence to sequence example in Keras (character-level).
This script demonstrates how to implement a basic character-level
sequence-to-sequence model. We apply it to translating
short English sentences into short French sentences,
character-by-character. Note that it is fairly unusual to
do character-level machine translation, as word-level
models are more common in this domain.
# Summary of the algorithm
# http://nlpforhackers.io/training-ner-large-dataset/
# http://gmb.let.rug.nl/data.php
import os
from nltk import conlltags2tree
def to_conll_iob(annotated_sentence):
    """
    `annotated_sentence` = list of triplets [(w1, t1, iob1), ...]
    Transform a pseudo-IOB notation: O, PERSON, PERSON, O, O, LOCATION, O
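The snippet above is cut off before the function body. As a rough sketch of the kind of conversion the docstring describes, the following assumes the intended target is the usual `B-`/`I-` prefixed scheme, where the first token of an entity gets `B-` and each following token of the same entity gets `I-`. The name `to_conll_iob_sketch` is hypothetical, and this is not the original implementation.

```python
def to_conll_iob_sketch(annotated_sentence):
    """Convert (word, tag, label) triplets from pseudo-IOB to B-/I- prefixed labels."""
    converted = []
    previous_label = "O"
    for word, tag, label in annotated_sentence:
        if label == "O":
            new_label = "O"  # tokens outside any entity stay unchanged
        elif label == previous_label:
            new_label = "I-" + label  # continues the entity started earlier
        else:
            new_label = "B-" + label  # first token of a new entity
        converted.append((word, tag, new_label))
        previous_label = label
    return converted


# Hypothetical example sentence with the label pattern from the docstring.
sentence = [
    ("Yesterday", "NN", "O"),
    ("John", "NNP", "PERSON"),
    ("Smith", "NNP", "PERSON"),
    ("went", "VBD", "O"),
    ("to", "TO", "O"),
    ("London", "NNP", "LOCATION"),
    (".", ".", "O"),
]
labels = [label for _, _, label in to_conll_iob_sketch(sentence)]
```

Because pseudo-IOB does not mark entity boundaries, two adjacent entities of the same type are merged into one by this approach.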
var fs = require('fs');
var path = require('path');
var d3 = require('d3');
const jsdom = require("jsdom");
const JSDOM = jsdom.JSDOM;
var chartWidth = 500, chartHeight = 500;
var arc = d3.svg.arc()
    .outerRadius(chartWidth / 2 - 10)
require(mxnet)
batch.size = 32
seq.len = 32
num.hidden = 16
num.embed = 16
num.lstm.layer = 1
num.round = 1
learning.rate = 0.1
wd = 0.00001
clip_gradient = 1
from gensim.utils import simple_preprocess
tokenize = lambda x: simple_preprocess(x)
# tokenize("We can load the vocabulary from the JSON file, and generate a reverse mapping (from index to word, so that we can decode an encoded string if we want)?!")
import os
import json
import numpy as np
from gensim.models import Word2Vec
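The commented-out sentence above mentions loading a vocabulary from JSON and building a reverse mapping from index to word. A minimal sketch of that idea follows, assuming the vocabulary is stored as a JSON object mapping each word to an integer index. The layout, the helper names `build_reverse_vocab` and `decode`, and the `<unk>` placeholder are all assumptions for illustration; an inline JSON string stands in for the file.

```python
import json


def build_reverse_vocab(vocab):
    """Invert a word -> index mapping into an index -> word mapping."""
    return {index: word for word, index in vocab.items()}


def decode(indices, reverse_vocab, unknown="<unk>"):
    """Turn a sequence of indices back into a space-separated string."""
    return " ".join(reverse_vocab.get(index, unknown) for index in indices)


# Hypothetical vocabulary; a real one would be read with json.load(file_handle).
vocab = json.loads('{"we": 0, "can": 1, "load": 2}')
reverse_vocab = build_reverse_vocab(vocab)
decoded = decode([0, 1, 2], reverse_vocab)
```

The inversion is only lossless when every word has a unique index; if two words share an index, the later one wins.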