This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| 语料库在线 http://www.cncorpus.org | |
| n 名词 | |
| nt 时间名词 | |
| nd 方位名词 | |
| nl 处所名词 | |
| nh 人名 | |
| nhf 姓 | |
| nhg 名 | |
| nn 族名 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| http://zh.wikipedia.org/w/api.php?action=query&titles=%E8%B4%9D%E5%A1%9E%E5%B0%94%E6%9B%B2%E7%BA%BF&redirects=&converttitles=&prop=revisions&rvprop=content&format=json |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| DBC2SBC = (str, flag) -> | |
| result='' | |
| return no if str.length <= 0 | |
| for i in [0...str.length] | |
| str1=str.charCodeAt(i) | |
| if !flag | |
| if str1 < 127 | |
| result += String.fromCharCode str.charCodeAt(i) + 65248 | |
| else |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| POS = { | |
| "n": { #1. 名词 (1个一类,7个二类,5个三类) | |
| "n":"名词", | |
| "nr":"人名", | |
| "nr1":"汉语姓氏", | |
| "nr2":"汉语名字", | |
| "nrj":"日语人名", | |
| "nrf":"音译人名", | |
| "ns":"地名", | |
| "nsf":"音译地名", |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| 台 臺 | |
| 啓 啟 | |
| 老板 老闆 | |
| 開髮 開發 | |
| 爲 為 | |
| 裏 裡 | |
| 衆 眾 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| 'use strict' | |
| _ = require 'lodash' | |
| MIN_FLOAT = -3.14e100 | |
| Object::default = (prop, value) -> | |
| @[prop] = value unless @hasOwnProperty(prop) | |
| Object::getValue = (prop, value) -> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| logAdd = (x, y) -> | |
| maximum = Math.max x, y | |
| minimum = Math.min x, y | |
| return maximum if Math.abs(maximum - minimum) > 30 | |
| return maximum + Math.log 1 + Math.exp(maximum - minimum) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| obj = do (obj) -> | |
| res = {} | |
| keys = Object.keys(obj).sort (a, b) -> obj[b] - obj[a] | |
| res[name] = obj[name] for name in keys | |
| res |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| sum = (arr) -> arr.reduce ((a,b) -> a+b), 0 | |
| cmp = (x, y) -> (if x > y then 1 else (if x < y then -1 else 0)) | |
| keys = Object.keys(list).sort((a,b)-> list[b]-list[a]) # sort by object value DESC | |
| # initial array with default value | |
| _.range(3).map(function () { return 'a' }) | |
| # string to bigram array | |
| toBigrams = (str) -> | |
| oneGrams = str.split('') |
This file has been truncated, but you can view the full file.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| 阿爸 a1'ba4 18137 | |
| 阿昌族 a1'chang1'zu2 50849 | |
| 阿斗 a1'dou3 42632 | |
| 阿飞 a1'fei1 48603 | |
| 阿富汗 a1'fu4'han4 3461 | |
| 阿訇 a1'hong1 34432 | |
| 阿拉伯数字 a1'la1'bo2'shu4'zi4 35937 | |
| 阿拉伯语 a1'la1'bo2'yu3 30476 | |
| 阿妈 a1'ma1 16220 |