#!/usr/bin/env python
# -*- coding:utf8 -*-
from __future__ import unicode_literals
import codecs
import gzip
import re
import sys
import unicodedata
@hiropppe
hiropppe / wv_cosine_similarity_matrix_for_small_data.ipynb
Last active December 25, 2020 17:56
How to build a cosine similarity matrix from a smallish wv (word vectors) created with Gensim
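The notebook itself is not displayed here, so the following is only a minimal sketch of one way to build such a matrix, not the gist's own code. It assumes the word vectors are available as a 2-D NumPy array with one row per word (for a Gensim `KeyedVectors` object, the `vectors` attribute). The function name `cosine_similarity_matrix` is hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Return the (n, n) matrix of pairwise cosine similarities of n row vectors.

    Hypothetical helper for illustration; `vectors` is assumed to be
    array-like with shape (n_words, dim).
    """
    vectors = np.asarray(vectors, dtype=np.float64)
    # L2 norm of each row, kept as a column so it broadcasts over the row.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Leave all-zero rows as zero vectors instead of dividing by zero.
    norms[norms == 0.0] = 1.0
    unit = vectors / norms
    # Dot products of unit vectors are cosine similarities.
    return unit @ unit.T
```

The result is a dense n x n array, which is why this approach only suits a small vocabulary: memory grows quadratically with the number of words.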
@hiropppe
hiropppe / check_variance_to_identify_garbage_text.ipynb
Created December 25, 2020 17:50
Sample suggesting that the variance of characters or words could be used to detect prank (garbage) text
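The notebook is not displayed here, so its actual method is unknown. As one possible reading of the description, the sketch below computes the population variance of per-character occurrence counts. The function name `char_count_variance` and the choice to ignore whitespace are assumptions made for illustration.

```python
from collections import Counter


def char_count_variance(text):
    """Population variance of how often each distinct character occurs.

    Hypothetical helper: whitespace is ignored, and empty input yields 0.0.
    """
    counts = list(Counter(ch for ch in text if not ch.isspace()).values())
    if not counts:
        return 0.0
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)
```

Under this definition, a string dominated by one repeated character scores high, while a string whose characters each occur equally often scores zero. Whether a useful threshold exists for real data would have to be checked against actual samples.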
@hiropppe
hiropppe / nohup_with_time.sh
Created January 9, 2025 00:22
Shell memo: when you want to time a command run with nohup
bash -c "time (sleep 3 && echo \"zzz..\" && sleep 2 && echo \"woke up\")" 2>&1
$ sh ./test.sh
$ nohup sh ./test.sh &
$ nohup sh ./test.sh > test.log &
@hiropppe
hiropppe / split_tel_number.py
Created March 6, 2025 02:52
Code memo: splitting phone numbers that have no hyphens
TEL_DIGITS = {
"050": 4, # IP phone
"070": 4, # mobile phone / PHS
"080": 4, # mobile phone
"090": 4, # mobile phone
# others
"020": 3,
"0120": 3,
"0800": 3,
"0570": 3,