rongzhe Azure-rong

  • Sun Yat-Sen University
  • Guangzhou, China
@Azure-rong
Azure-rong / similarity feature.py
Last active August 29, 2015 13:57
Feature extraction: Review tf-idf similarity score
#! /usr/bin/env python2.7
#coding=utf-8
"""
Compute the similarity feature between the editorial review and product reviews.
This module uses gensim to build a tf-idf model of the reviews and computes the similarity between each review and a given text.
It therefore needs two inputs: an Excel file containing all reviews and a txt file containing the editorial review.
"""
@Azure-rong
Azure-rong / textprocessing.py
Last active August 29, 2015 13:57
Preprocessing: Read data and Chinese text processing
#! /usr/bin/env python2.7
#coding=utf-8
"""
Read data from Excel and txt files.
Chinese word segmentation, part-of-speech tagging, and sentence-splitting functions.
"""
import xlrd
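The preview stops at the first import, so the sentence-cutting step named in the docstring is not visible. As a rough idea of what such a step can look like, here is a minimal sketch that splits Chinese text on sentence-ending punctuation. The function name `cut_sentences` and the chosen punctuation set are assumptions, not taken from the gist, and the sketch is written for Python 3 string handling rather than the gist's Python 2.7.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: a hypothetical sentence splitter, not the gist's code.
# Assumes Python 3, where str is Unicode.
import re

# A run of non-terminator characters followed by any trailing terminators.
_SENTENCE = re.compile(r'[^。！？!?；;\n]+[。！？!?；;]*')


def cut_sentences(text):
    """Split text into sentences, keeping the closing punctuation."""
    pieces = (match.group().strip() for match in _SENTENCE.finditer(text))
    return [piece for piece in pieces if piece]


sentences = cut_sentences("屏幕很好。电池一般！值得买吗？")
```

Keeping the punctuation attached to each sentence means a question or exclamation can still be told apart in later processing.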