Lokesh Sharma lokeshh


(taken from GSOC proposal)

...

contrast_interact: This is used to code interaction terms. In a dataframe with columns ‘a’ and ‘b’, ‘a:b’ is an interaction term. Again, we need to code this term to produce some number of variables, but in this case the coding is somewhat different. I’ll explain with an example how to code ‘a:b’, and one can generalize the behavior. Let’s say column ‘a’ has m categories and ‘b’ has n categories. If ‘a’ has been mentioned in our regression expression, then we code column ‘b’ with n-1 variables; similarly, if ‘b’ has been mentioned in the regression expression, then we code column ‘a’ with m-1 variables. If ‘a’ hasn’t been mentioned in our regression expression, then ‘b’ is coded with n variables; similarly, if ‘b’ hasn’t been mentioned in our regression expression, then ‘a’ is coded with m variables.
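The two-column rule can be sketched in a few lines of Ruby. The helper name `interaction_coding_sizes` and its arguments are hypothetical and exist only for illustration; they are not part of Statsample or Daru. The sketch assumes that a column loses one variable exactly when the other column of the interaction appears on its own in the regression expression.

```ruby
# Hypothetical helper for illustration; not part of Statsample or Daru.
# levels:       { 'a' => m, 'b' => n }, the category counts of the two columns
# main_effects: names that appear on their own in the regression expression
def interaction_coding_sizes(levels, main_effects)
  first, second = levels.keys
  {
    # 'a' loses one variable only when 'b' is mentioned on its own.
    first  => levels[first]  - (main_effects.include?(second) ? 1 : 0),
    # 'b' loses one variable only when 'a' is mentioned on its own.
    second => levels[second] - (main_effects.include?(first) ? 1 : 0)
  }
end

levels = { 'a' => 3, 'b' => 4 }
p interaction_coding_sizes(levels, [])       # neither column mentioned
p interaction_coding_sizes(levels, ['a'])    # only 'a' mentioned
p interaction_coding_sizes(levels, %w[a b])  # both mentioned
```

With m = 3 and n = 4, mentioning only ‘a’ leaves ‘a’ at 3 variables and reduces ‘b’ to 3, while mentioning both gives 2 and 3.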

Here’s a general rule to follow when we have more than a two-way interaction. Say we have ‘a:b:c’ and we need to decide whether to code ‘a’ w

Implement Formula language and categorical variable support in Statsample::Regression

Below is how Statsample performs regression.

def self.multiple(ds, y_var, opts = Hash.new)
  missing_data = opts[:missing_data].nil? ? :listwise : opts.delete(:missing_data)
  if missing_data == :pairwise
    Statsample::Regression::Multiple::RubyEngine.new(ds, y_var, opts)
  else
=begin
Usage:
To convert expressions involving '*', '/' and brackets to expressions only involving '+' and ':' use #from_formula
>> Formula.new.from_formula '(a+b):(c+d)'
=> "a:c+a:d+b:c+b:d"
>> Formula.new.from_formula 'a*b*c*d'
=> "a+b+a:b+c+a:c+b:c+a:b:c+d+a:d+b:d+a:b:d+c:d+a:c:d+b:c:d+a:b:c:d"
=end
class Formula
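The `'a*b*c*d'` expansion in the usage comment can be reproduced with a small fold. This is a sketch and not the gist's implementation: `expand_star` is a hypothetical name, and it only handles a flat product of factors, whereas `from_formula` is described as also handling '/' and brackets.

```ruby
# Hypothetical sketch; covers only a flat 'x*y*z' product, not '/', brackets or '+'.
# Expands the product into main effects and ':' interaction terms.
def expand_star(expression)
  factors = expression.split('*').map(&:strip)
  terms = factors.reduce([]) do |acc, factor|
    # Keep the existing terms, add the new factor,
    # then cross the new factor with every existing term.
    acc + [factor] + acc.map { |term| "#{term}:#{factor}" }
  end
  terms.join('+')
end

puts expand_star('a*b')
puts expand_star('a*b*c*d')
```

Each new factor doubles the number of terms and adds one, which is why four factors produce the fifteen terms listed in the usage comment.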
class Formula
  # The Formula class will be used by Statsample to parse a regression formula.
  # It consists of two data members.
  attr_accessor :left_terms # Stores all the left terms. For example: ['y'] in case of 'y ~ a*b'
  attr_accessor :right_terms # Stores all the right terms. For example: ['a', 'b', ['a', 'b']]
  # in case of 'y ~ a*b'
  
  # It will expose a 'from_formula' function to Statsample to parse formulas.
  # For example, to parse the formula 'y ~ a*b', Statsample will call the following function
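As a rough illustration of the two data members, here is a minimal stand-in class. `FormulaSketch` is a hypothetical name, and the sketch assumes the formula has already been reduced to '+' and ':' only, so 'y ~ a*b' is written as 'y ~ a + b + a:b'. It is not the class from the proposal.

```ruby
# Hypothetical stand-in for the Formula class described above. It only handles
# formulas already written with '+' and ':' (no '*', '/' or brackets).
class FormulaSketch
  attr_accessor :left_terms, :right_terms

  def from_formula(formula)
    left, right = formula.split('~').map(&:strip)
    @left_terms = left.split('+').map(&:strip)
    @right_terms = right.split('+').map do |term|
      parts = term.split(':').map(&:strip)
      # A plain column stays a string; an interaction becomes an array of names.
      parts.size == 1 ? parts.first : parts
    end
    self
  end
end

formula = FormulaSketch.new.from_formula('y ~ a + b + a:b')
p formula.left_terms
p formula.right_terms
```

Storing an interaction as a nested array keeps it distinguishable from a main effect without any further string parsing.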
module CategoricalData
  def size
    20
  end
end
class Vector
  def initialize(type = nil)
    self.extend(CategoricalData) if type == :category
  end
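A self-contained version of this pattern shows that `extend` inside `initialize` affects only the instance being built. The closing `end` of `Vector` and the two example objects are additions for illustration; the value 20 is the placeholder from the snippet.

```ruby
module CategoricalData
  def size
    20 # placeholder value taken from the snippet above
  end
end

class Vector
  def initialize(type = nil)
    # Mix the module into this one object only, not into every Vector.
    extend(CategoricalData) if type == :category
  end
end

categorical = Vector.new(:category)
plain = Vector.new

puts categorical.size           # prints 20
puts plain.respond_to?(:size)   # prints false
```

Only the vector created with `:category` gains `size`; the plain vector is left untouched.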
(venv) lokeshh:~/workspace/sciruby-notebooks (update_notebooks) $ gem uninstall statsample
You have requested to uninstall the gem:
statsample-2.0.1
sciruby-full-0.2.7 depends on statsample (~> 2.0)
statsample-bivariate-extension-1.2.0 depends on statsample (~> 2.0)
statsample-glm-0.2.1 depends on statsample (~> 2.0)
If you remove this gem, these dependencies will not be met.
Continue with Uninstall? [yN] y
<html lang='en'>
<head>
<title>Nyaplot</title>
<script src="http://cdnjs.cloudflare.com/ajax/libs/require.js/2.1.14/require.min.js"></script>
<script>if(window['d3'] === undefined ||
window['Nyaplot'] === undefined){
var path = {"d3":"https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min","downloadable":"http://cdn.rawgit.com/domitry/d3-downloadable/master/d3-downloadable"};
NoMethodError: undefined method `keys' for nil:NilClass
/var/lib/gems/2.1.0/gems/daru-0.1.2/lib/daru/index.rb:122:in `key'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/staff/dataset.rb:252:in `get_daru_columns'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/staff/dataset.rb:281:in `init_daru_vector'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/staff/dataset.rb:67:in `initialize'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/plot.rb:273:in `new'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/plot.rb:273:in `dataset_from_any'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/plot.rb:289:in `block in parse_datasets_array'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/plot.rb:289:in `map'
/var/lib/gems/2.1.0/gems/gnuplotrb-0.3.3/lib/gnuplotrb/plot.rb:289:in `parse_datasets_array'

Daru

Pull Requests [Merged]:

@lokeshh
lokeshh / Courses.md
Last active September 6, 2022 12:01
My courses

Below is a list of online MOOCs I've successfully completed:

edX Courses:

Massive Open Online Courses (MOOCs) by MIT:

  • Introduction to Computer Science and Programming Using Python [MITx - 6.00.1x] Certificate

  • Introduction to Computational Thinking and Data Science [MITx - 6.00.2x] Certificate