### GENERATE WORD-FREQUENCY MATRICES
### author: Thiago Marzagao
### contact: marzagao ddott 1 at osu ddott edu
### supported encoding: UTF8
### supported character sets:
### Basic Latin (Unicode 0-128)
### Latin 1 Supplement (Unicode 129-255)
### Latin Extended-A (Unicode 256-382)
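
The preview above stops at the header, so the script body is not shown. As a minimal sketch of what a word-frequency matrix restricted to the listed character sets could look like, the following uses only the standard library. The names `tokenize`, `word_frequency_matrix` and `MAX_CODE_POINT` are hypothetical, and treating every other character as a word separator is an assumption, not the author's documented behavior.

```python
from collections import Counter

# Upper bound of the supported ranges as quoted in the header above
# (assumption: anything beyond it is treated as a separator).
MAX_CODE_POINT = 382


def tokenize(text):
    """Lowercase the text and split it into runs of supported letters."""
    words, current = [], []
    for ch in text.lower():
        if ch.isalpha() and ord(ch) <= MAX_CODE_POINT:
            current.append(ch)
        else:
            if current:
                words.append("".join(current))
                current = []
    if current:
        words.append("".join(current))
    return words


def word_frequency_matrix(docs):
    """Return (vocabulary, rows) with one row of word counts per document."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set().union(*counts)) if counts else []
    rows = [[c[w] for w in vocabulary] for c in counts]
    return vocabulary, rows
```

Calling `word_frequency_matrix(["São Paulo, são", "paulo paulo"])` yields one column per distinct word and one row per document.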
### WORDSCORES (LBG-2003)
### author: Thiago Marzagao
### contact: marzagao ddott 1 at osu ddott edu
import os
import numpy as np
import pandas as pd
ipath = '/Users/username/inputdata/' # folder containing the CSV files
opath = '/Users/username/outputdata/' # folder where output will be saved
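
"LBG-2003" presumably refers to the Wordscores method of Laver, Benoit and Garry (2003). Since the preview ends before any computation, here is a hedged, standard-library sketch of the core scoring step only: word scores are the reference scores weighted by how strongly each word is associated with each reference text, and a virgin text's score is the frequency-weighted mean of its scored words. The function name and the dictionary inputs are hypothetical, and the rescaling of virgin scores described in the paper is omitted.

```python
def wordscores(reference_counts, reference_scores, virgin_counts):
    """Score one virgin text from reference texts with known positions.

    reference_counts: {text_name: {word: count}}
    reference_scores: {text_name: known score}
    virgin_counts:    {word: count}
    Returns (word_scores, virgin_score).
    """
    # Relative frequency of each word within each reference text: F_wr
    rel_freq = {}
    for name, counts in reference_counts.items():
        total = sum(counts.values())
        rel_freq[name] = {w: c / total for w, c in counts.items()}

    # P_wr = F_wr / sum_r F_wr, then S_w = sum_r P_wr * A_r
    vocabulary = set().union(*reference_counts.values())
    word_scores = {}
    for w in vocabulary:
        denom = sum(rel_freq[r].get(w, 0.0) for r in rel_freq)
        word_scores[w] = sum(
            rel_freq[r].get(w, 0.0) / denom * reference_scores[r]
            for r in rel_freq
        )

    # Virgin text: only words that received a score are counted.
    scored = {w: c for w, c in virgin_counts.items() if w in word_scores}
    total = sum(scored.values())
    if total == 0:
        raise ValueError("virgin text shares no words with the references")
    virgin_score = sum(c / total * word_scores[w] for w, c in scored.items())
    return word_scores, virgin_score
```

In practice the counts would come from the CSV files in `ipath`; the dictionaries here stand in for those word-frequency tables.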
### FIGHTIN' WORDS (MCQ-2008)
### author: Thiago Marzagao
### contact: marzagao ddott 1 at osu ddott edu
import os
import sys
import pandas as pd
import numpy as np
from numpy import matrix as m
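
"MCQ-2008" presumably refers to the "Fightin' Words" paper by Monroe, Colaresi and Quinn (2008). The preview shows only the imports, so the following is a sketch under stated assumptions rather than the author's code: it computes z-scores of the log-odds ratio between two groups using a flat Dirichlet prior (`prior` pseudo-counts per word). The paper also discusses informative priors, which this sketch does not implement, and the function name is hypothetical.

```python
import math


def fightin_words_zscores(counts_a, counts_b, prior=0.01):
    """Return {word: z-score}; positive values lean toward group A.

    counts_a, counts_b: {word: count} for the two groups being compared.
    prior: flat Dirichlet pseudo-count added to every word (assumption).
    """
    vocabulary = set(counts_a) | set(counts_b)
    if len(vocabulary) < 2:
        raise ValueError("need at least two distinct words")
    prior_total = prior * len(vocabulary)
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())

    z_scores = {}
    for w in vocabulary:
        y_a = counts_a.get(w, 0)
        y_b = counts_b.get(w, 0)
        # Smoothed log-odds of the word within each group
        log_odds_a = math.log((y_a + prior) / (n_a + prior_total - y_a - prior))
        log_odds_b = math.log((y_b + prior) / (n_b + prior_total - y_b - prior))
        delta = log_odds_a - log_odds_b
        # Approximate variance of the log-odds difference
        variance = 1.0 / (y_a + prior) + 1.0 / (y_b + prior)
        z_scores[w] = delta / math.sqrt(variance)
    return z_scores
```

Words with large positive or negative z-scores are the ones that most distinguish the two groups; words with few occurrences are pulled toward zero by the variance term.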
import re
import math
import pickle
import logging
import gensim
import numpy as np
import pandas as pd
from casenames import casenames
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO, filename='output.log')
import pickle
import gensim
import logging
import pandas as pd
from casenames import casenames
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO, filename='output.log')
# set number of topics
num_topics = 50
import os
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import AdaBoostRegressor
# set input path (path to LSA or LDA results)
ipath = '/home/ubuntu/results/lsa/results.csv'
casenames = [
    'Afghanistan1992',
    'Afghanistan1993',
    'Afghanistan1994',
    'Afghanistan1995',
    'Afghanistan1996',
    'Afghanistan1997',
    'Afghanistan1998',
    'Afghanistan1999',
    'Afghanistan2000',
#!/usr/bin/env python
import os
import time
import pickle
import numpy as np
import pandas as pd
# set paths
basepath = '/fs/lustre/osu6994/hdf5/'
* extracting variance estimates by state (to be used in R)
reg pt2 party lgdpcap bolsagdp rural illiteracy nonadequate AL AM AP BA CE DF ES GO MA MG MS MT PA PB PE PI PR RJ RN RO RR RS SC SE SP TO
predict double eps, residual
robvar eps, by(state)
by state, sort: egen sd_eps = sd(eps)
generate double gw_wt = 1/sd_eps^2
tabstat sd_eps gw_wt, by(state)
* running initial diagnostics (note: failed; too many observations for spatwmat)
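
The Stata lines above compute a per-state standard deviation of the regression residuals and turn it into an inverse-variance weight (`gw_wt = 1/sd_eps^2`). For readers who do not use Stata, that one step can be sketched in Python as follows. The function name and the list-based inputs are hypothetical; the sketch assumes every state has at least two residuals and a non-zero spread.

```python
from collections import defaultdict
from statistics import stdev


def groupwise_weights(residuals, states):
    """Return {state: 1 / sd(residuals in that state) ** 2}.

    Uses the sample standard deviation (n - 1 denominator), which is
    what Stata's egen sd() computes.
    """
    by_state = defaultdict(list)
    for eps, state in zip(residuals, states):
        by_state[state].append(eps)
    return {
        state: 1.0 / stdev(values) ** 2
        for state, values in by_state.items()
    }
```

States whose residuals vary more receive smaller weights, which is the usual motivation for this kind of groupwise heteroskedasticity correction.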
### preliminary stuff
setwd("/Users/thiagomarzagao/desktop/PROJECT")
library(foreign)
library(MASS)
library(car)
library(lmtest)
library(spdep)
library(sphet)
library(Matrix)
library(spgwr)