@timjurka
timjurka / cancer.R
Created October 1, 2012 03:43
How to classify breast cancer as benign or malignant using RTextTools.
# FILE: Classifying Breast Cancer as Benign or Malignant
# AUTHOR: Timothy P. Jurka
library(RTextTools)
# GET THE BREAST CANCER DATA; ATTRIBUTES ARE DESCRIBED IN http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.names
data <- read.csv("http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data", header=FALSE)
data <- data[-1]
# ADD TEXTUAL DESCRIPTORS FOR EACH MASS CHARACTERISTIC FOR THE DOCUMENT-TERM MATRIX
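The preview above stops at the descriptor step, so the author's actual code is not shown here. As a rough illustration of what "adding textual descriptors" could look like, the sketch below prefixes each numeric value with a short name for its column, so that every (characteristic, value) pair becomes a distinct word such as `thickness5`. The descriptor names, the `sample_rows` data frame, and the `labeled` and `documents` variables are all assumptions made for this example, not part of the original gist.

```r
# Hypothetical two-row sample in the layout left after stripping the ID column:
# nine characteristics per mass (the class column is omitted here for brevity).
sample_rows <- data.frame(
  V2 = c(5, 8), V3 = c(1, 10), V4 = c(1, 10), V5 = c(1, 8), V6 = c(2, 7),
  V7 = c("1", "10"), V8 = c(3, 9), V9 = c(1, 7), V10 = c(1, 1),
  stringsAsFactors = FALSE
)

# Assumed descriptor names, one per characteristic column (not from the gist).
descriptors <- c("thickness", "cellsize", "cellshape", "adhesion", "epithelial",
                 "nuclei", "chromatin", "nucleoli", "mitoses")

# Prefix every value with its column's descriptor, e.g. 5 becomes "thickness5".
labeled <- sample_rows
for (i in seq_along(descriptors)) {
  labeled[[i]] <- paste0(descriptors[i], labeled[[i]])
}

# Collapse each row into one space-separated "document" string.
documents <- do.call(paste, unname(as.list(labeled)))
print(documents)
```

Strings of this shape are the kind of input a document-term matrix builder such as RTextTools' `create_matrix` works with; whether the original gist builds them this way is not visible in the preview.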
@timjurka
timjurka / cancer_classification.R
Created October 17, 2012 20:28
Classifying Breast Cancer as Benign or Malignant Using RTextTools
library(RTextTools) # LOAD THE RTextTools PACKAGE
set.seed(95616) # SET THE SEED FOR REPLICABILITY
url <- "http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data"
data <- read.csv(url,header=FALSE) # GET THE BREAST CANCER DATA
data <- data[-1] # STRIP PATIENT IDs
diagnosis <- data[,10] # GET THE DEPENDENT VARIABLE: THE DIAGNOSIS
characteristics <- data[,1:9] # GET THE CHARACTERISTICS OF THE MASS
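The column split above can be checked offline on a few rows before downloading the full file. The sketch below is only an illustration: the three rows are sample lines in the file's comma-separated format, and the `sample_*` names, the `diagnosis_label` factor, and the use of `na.strings = "?"` are assumptions that do not appear in the gist. It relies on the dataset documentation's coding of the class column (2 for benign, 4 for malignant) and on missing values being written as `?`.

```r
# Sample rows in the file's format: ID, nine characteristics, class (assumed sample).
raw <- "1000025,5,1,1,1,2,1,3,1,1,2
1017122,8,10,10,8,7,10,9,7,1,4
1057013,8,4,5,1,2,?,7,3,1,4"

# Assumption: treat "?" as NA so every characteristic column stays numeric.
sample_data <- read.csv(text = raw, header = FALSE, na.strings = "?")
sample_data <- sample_data[-1]                  # strip patient IDs, as in the gist

sample_diagnosis <- sample_data[, 10]           # class column: 2 or 4
sample_characteristics <- sample_data[, 1:9]    # the nine mass characteristics

# Map the numeric class codes to readable labels.
diagnosis_label <- factor(sample_diagnosis, levels = c(2, 4),
                          labels = c("benign", "malignant"))

# Count missing values per characteristic column.
missing_per_column <- colSums(is.na(sample_characteristics))
print(diagnosis_label)
print(missing_per_column)
```

Note that the gist itself calls `read.csv` without `na.strings`, in which case a column containing `?` is read as text rather than as numbers.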