@elyase
elyase / gist:6785287
Last active December 24, 2015 10:39
R vs Python comparison http://nbviewer.ipython.org/6785287
{
"metadata": {
"name": "Untitled0"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
{
"metadata": {
"name": "Scrape Victor"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
{
"metadata": {
"name": "stumbleupon"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
--------- survived ~ C(pclass) + C(title) + C(sex) + fare + age + sibsp + parch + C(embarked) ---------
Logistic Regression: 0.518866276531
Support Vector Machines: 0.554240545026
ExtraTree Classifier: 0.520438466242
Random Forest: 0.51729408682
Gradient Boosting: 0.516507991964
--------- survived ~ fare + C(pclass) + C(sex) ---------
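# Read the Titanic training data; treat survived and pclass as factors
train <- read.csv("C:/Users/vcomas/OIQ/Kaggle/Titanic/train.csv",
                  colClasses = c("survived" = "factor", "pclass" = "factor"))
str(train)
library(randomForest)
library(ipred)
# Out-of-bag error rate after all 100 trees
randomForest(survived ~ fare + sex + pclass, data = train,
             ntree = 100, keep.forest = FALSE)$err.rate[100, "OOB"]
# 10-fold cross-validated misclassification error
errorest(survived ~ fare + sex + pclass + age, data = train,
         model = randomForest, estimator = "cv",
         est.para = control.errorest(k = 10), ntree = 10, mtry = 2)$err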
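The R snippet above estimates error with 10-fold cross-validation. As a rough Python sketch of that idea (not a reproduction of ipred's internals), the following pools misclassifications over held-out folds. The helper names, the majority-class stand-in model, and the synthetic labels are all assumptions made for illustration; none of them come from the gist or from the Titanic data.

```python
import random
from collections import Counter


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_error(X, y, fit, predict, k=10, seed=0):
    """Misclassification rate pooled over the k held-out folds."""
    wrong = 0
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        wrong += sum(predict(model, X[i]) != y[i] for i in held_out)
    return wrong / len(y)


# Hypothetical stand-in model: always predict the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Synthetic labels (NOT the Titanic data): 70 zeros and 30 ones.
X = [[0.0]] * 100
y = [0] * 70 + [1] * 30
err = cv_error(X, y, fit_majority, predict_majority, k=10)
print(err)
```

Any real classifier can be swapped in by passing different `fit` and `predict` callables; the majority baseline is only there to keep the sketch dependency-free.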
examp1 <- "When discussing performance with colleagues, teaching, sending a bug report or searching for guidance on mailing lists and here on SO, a reproducible example is often asked and always helpful. What are your tips for creating an excellent example? How do you paste data structures from r in a text format? What other information should you include? Are there other tricks in addition to using dput(), dump() or structure()? When should you include library() or require() statements? Which reserved words should one avoid, in addition to c, df, data, etc? How does one make a great r reproducible example?"
examp2 <- "Sometimes the problem really isn't reproducible with a smaller piece of data, no matter how hard you try, and doesn't happen with synthetic data (although it's useful to show how you produced synthetic data sets that did not reproduce the problem, because it rules out some hypotheses). Posting the data to the web somewhere and providing a URL may be necessary. If the data can't be released to t
{
"metadata": {
"name": "Finance Programming Challenge"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
@elyase
elyase / gist:5337846
Created April 8, 2013 15:48
Finance Programming Challenge
2011-01-10 16:00:00,1000000.0
2011-01-11 16:00:00,1000000.0
2011-01-12 16:00:00,1000000.0
2011-01-13 16:00:00,1004815.0
2011-01-14 16:00:00,1004815.0
2011-01-18 16:00:00,1004815.0
2011-01-19 16:00:00,1004815.0
2011-01-20 16:00:00,1004815.0
2011-01-21 16:00:00,1004815.0
2011-01-24 16:00:00,1004815.0
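A minimal sketch of how the rows above could be turned into returns, assuming each row is "timestamp,portfolio value" (the gist does not label the columns). The function names are hypothetical and only the Python standard library is used.

```python
import csv
from datetime import datetime
from io import StringIO

# The ten rows shown above, copied verbatim.
RAW = """\
2011-01-10 16:00:00,1000000.0
2011-01-11 16:00:00,1000000.0
2011-01-12 16:00:00,1000000.0
2011-01-13 16:00:00,1004815.0
2011-01-14 16:00:00,1004815.0
2011-01-18 16:00:00,1004815.0
2011-01-19 16:00:00,1004815.0
2011-01-20 16:00:00,1004815.0
2011-01-21 16:00:00,1004815.0
2011-01-24 16:00:00,1004815.0
"""


def load_values(text):
    """Parse 'timestamp,value' rows into (datetime, float) pairs."""
    rows = []
    for stamp, value in csv.reader(StringIO(text)):
        rows.append((datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), float(value)))
    return rows


def period_returns(values):
    """Simple return between each pair of consecutive observations."""
    return [b / a - 1.0 for a, b in zip(values, values[1:])]


rows = load_values(RAW)
values = [v for _, v in rows]
returns = period_returns(values)
total_return = values[-1] / values[0] - 1.0
print(len(returns), total_return)
```

Note that the timestamps skip weekends and a holiday, so these are returns between consecutive rows rather than strict calendar-day returns.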
{
"metadata": {
"name": "Luis"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{