Skip to content

Instantly share code, notes, and snippets.

@ramnathv
Last active September 2, 2022 14:31
Show Gist options
  • Save ramnathv/10012123 to your computer and use it in GitHub Desktop.
Save ramnathv/10012123 to your computer and use it in GitHub Desktop.
Rmd to IPython Notebook and HTML

I had blogged earlier about a proof-of-concept for how to convert an R Markdown Notebook into an IPython Notebook. The idea was to allow R users to author reproducible documents in R Markdown, but be able to seamlessly convert into HTML or an IPython Notebook, which allows greater interactivity.

This is an update to the same post that makes use of the native R kernel for the IPython Notebook, being developed by @takluyver.

As noted earlier, the bulk of the work here is done by notedown, a python module that helps create IPython notebooks from markdown. The approach I have taken here is to convert the RMarkdown file into a Markdown file using a custom knitr script, and then using notedown to render it as an IPython notebook.

If you want to test this out, you will first need to install notedown

$ pip install notedown

The next step is to convert the Rmd source into html and ipynb. For the sake of convenience, I have added a Makefile that will do it all. Just run make all and it will create example.html using knit2html and example.ipynb using knitr and notedown. So you get two birds with one stone!

Now, to open the ipython notebook, you will need to install the native R kernel. Installation instructions can be found here. Once you have installed all dependencies, you can open the notebook from the command line using:

ipython notebook --KernelManager.kernel_cmd="['R', '-e', 'IRkernel::main(\'{connection_file}\')']"

The notebook will not contain any output, since I set eval = F while converting it. You will need to select Cell > Run All from the toolbar to automatically execute all cells and embed output. I am presuming there might be a mechanism in the IPython Notebook architecture to trigger this execution from the command line.

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>Introduction to ggplot2</title>
<!-- Styles for R syntax highlighter -->
<style type="text/css">
pre .operator,
pre .paren {
color: rgb(104, 118, 135)
}
pre .literal {
color: #990073
}
pre .number {
color: #099;
}
pre .comment {
color: #998;
font-style: italic
}
pre .keyword {
color: #900;
font-weight: bold
}
pre .identifier {
color: rgb(0, 0, 0);
}
pre .string {
color: #d14;
}
</style>
<!-- R syntax highlighter -->
<script type="text/javascript">
var hljs=new function(){function m(p){return p.replace(/&/gm,"&amp;").replace(/</gm,"&lt;")}function f(r,q,p){return RegExp(q,"m"+(r.cI?"i":"")+(p?"g":""))}function b(r){for(var p=0;p<r.childNodes.length;p++){var q=r.childNodes[p];if(q.nodeName=="CODE"){return q}if(!(q.nodeType==3&&q.nodeValue.match(/\s+/))){break}}}function h(t,s){var p="";for(var r=0;r<t.childNodes.length;r++){if(t.childNodes[r].nodeType==3){var q=t.childNodes[r].nodeValue;if(s){q=q.replace(/\n/g,"")}p+=q}else{if(t.childNodes[r].nodeName=="BR"){p+="\n"}else{p+=h(t.childNodes[r])}}}if(/MSIE [678]/.test(navigator.userAgent)){p=p.replace(/\r/g,"\n")}return p}function a(s){var r=s.className.split(/\s+/);r=r.concat(s.parentNode.className.split(/\s+/));for(var q=0;q<r.length;q++){var p=r[q].replace(/^language-/,"");if(e[p]){return p}}}function c(q){var p=[];(function(s,t){for(var r=0;r<s.childNodes.length;r++){if(s.childNodes[r].nodeType==3){t+=s.childNodes[r].nodeValue.length}else{if(s.childNodes[r].nodeName=="BR"){t+=1}else{if(s.childNodes[r].nodeType==1){p.push({event:"start",offset:t,node:s.childNodes[r]});t=arguments.callee(s.childNodes[r],t);p.push({event:"stop",offset:t,node:s.childNodes[r]})}}}}return t})(q,0);return p}function k(y,w,x){var q=0;var z="";var s=[];function u(){if(y.length&&w.length){if(y[0].offset!=w[0].offset){return(y[0].offset<w[0].offset)?y:w}else{return w[0].event=="start"?y:w}}else{return y.length?y:w}}function t(D){var A="<"+D.nodeName.toLowerCase();for(var B=0;B<D.attributes.length;B++){var C=D.attributes[B];A+=" "+C.nodeName.toLowerCase();if(C.value!==undefined&&C.value!==false&&C.value!==null){A+='="'+m(C.value)+'"'}}return A+">"}while(y.length||w.length){var v=u().splice(0,1)[0];z+=m(x.substr(q,v.offset-q));q=v.offset;if(v.event=="start"){z+=t(v.node);s.push(v.node)}else{if(v.event=="stop"){var p,r=s.length;do{r--;p=s[r];z+=("</"+p.nodeName.toLowerCase()+">")}while(p!=v.node);s.splice(r,1);while(r<s.length){z+=t(s[r]);r++}}}}return z+m(x.substr(q))}function j(){function q(x,y,v){if(x.compiled){return}var u;var s=[];if(x.k){x.lR=f(y,x.l||hljs.IR,true);for(var w in x.k){if(!x.k.hasOwnProperty(w)){continue}if(x.k[w] instanceof Object){u=x.k[w]}else{u=x.k;w="keyword"}for(var r in u){if(!u.hasOwnProperty(r)){continue}x.k[r]=[w,u[r]];s.push(r)}}}if(!v){if(x.bWK){x.b="\\b("+s.join("|")+")\\s"}x.bR=f(y,x.b?x.b:"\\B|\\b");if(!x.e&&!x.eW){x.e="\\B|\\b"}if(x.e){x.eR=f(y,x.e)}}if(x.i){x.iR=f(y,x.i)}if(x.r===undefined){x.r=1}if(!x.c){x.c=[]}x.compiled=true;for(var t=0;t<x.c.length;t++){if(x.c[t]=="self"){x.c[t]=x}q(x.c[t],y,false)}if(x.starts){q(x.starts,y,false)}}for(var p in e){if(!e.hasOwnProperty(p)){continue}q(e[p].dM,e[p],true)}}function d(B,C){if(!j.called){j();j.called=true}function q(r,M){for(var L=0;L<M.c.length;L++){if((M.c[L].bR.exec(r)||[null])[0]==r){return M.c[L]}}}function v(L,r){if(D[L].e&&D[L].eR.test(r)){return 1}if(D[L].eW){var M=v(L-1,r);return M?M+1:0}return 0}function w(r,L){return L.i&&L.iR.test(r)}function K(N,O){var M=[];for(var L=0;L<N.c.length;L++){M.push(N.c[L].b)}var r=D.length-1;do{if(D[r].e){M.push(D[r].e)}r--}while(D[r+1].eW);if(N.i){M.push(N.i)}return f(O,M.join("|"),true)}function p(M,L){var N=D[D.length-1];if(!N.t){N.t=K(N,E)}N.t.lastIndex=L;var r=N.t.exec(M);return r?[M.substr(L,r.index-L),r[0],false]:[M.substr(L),"",true]}function z(N,r){var L=E.cI?r[0].toLowerCase():r[0];var M=N.k[L];if(M&&M instanceof Array){return M}return false}function F(L,P){L=m(L);if(!P.k){return L}var r="";var O=0;P.lR.lastIndex=0;var M=P.lR.exec(L);while(M){r+=L.substr(O,M.index-O);var N=z(P,M);if(N){x+=N[1];r+='<span class="'+N[0]+'">'+M[0]+"</span>"}else{r+=M[0]}O=P.lR.lastIndex;M=P.lR.exec(L)}return r+L.substr(O,L.length-O)}function J(L,M){if(M.sL&&e[M.sL]){var r=d(M.sL,L);x+=r.keyword_count;return r.value}else{return F(L,M)}}function I(M,r){var L=M.cN?'<span class="'+M.cN+'">':"";if(M.rB){y+=L;M.buffer=""}else{if(M.eB){y+=m(r)+L;M.buffer=""}else{y+=L;M.buffer=r}}D.push(M);A+=M.r}function G(N,M,Q){var R=D[D.length-1];if(Q){y+=J(R.buffer+N,R);return false}var P=q(M,R);if(P){y+=J(R.buffer+N,R);I(P,M);return P.rB}var L=v(D.length-1,M);if(L){var O=R.cN?"</span>":"";if(R.rE){y+=J(R.buffer+N,R)+O}else{if(R.eE){y+=J(R.buffer+N,R)+O+m(M)}else{y+=J(R.buffer+N+M,R)+O}}while(L>1){O=D[D.length-2].cN?"</span>":"";y+=O;L--;D.length--}var r=D[D.length-1];D.length--;D[D.length-1].buffer="";if(r.starts){I(r.starts,"")}return R.rE}if(w(M,R)){throw"Illegal"}}var E=e[B];var D=[E.dM];var A=0;var x=0;var y="";try{var s,u=0;E.dM.buffer="";do{s=p(C,u);var t=G(s[0],s[1],s[2]);u+=s[0].length;if(!t){u+=s[1].length}}while(!s[2]);if(D.length>1){throw"Illegal"}return{r:A,keyword_count:x,value:y}}catch(H){if(H=="Illegal"){return{r:0,keyword_count:0,value:m(C)}}else{throw H}}}function g(t){var p={keyword_count:0,r:0,value:m(t)};var r=p;for(var q in e){if(!e.hasOwnProperty(q)){continue}var s=d(q,t);s.language=q;if(s.keyword_count+s.r>r.keyword_count+r.r){r=s}if(s.keyword_count+s.r>p.keyword_count+p.r){r=p;p=s}}if(r.language){p.second_best=r}return p}function i(r,q,p){if(q){r=r.replace(/^((<[^>]+>|\t)+)/gm,function(t,w,v,u){return w.replace(/\t/g,q)})}if(p){r=r.replace(/\n/g,"<br>")}return r}function n(t,w,r){var x=h(t,r);var v=a(t);var y,s;if(v){y=d(v,x)}else{return}var q=c(t);if(q.length){s=document.createElement("pre");s.innerHTML=y.value;y.value=k(q,c(s),x)}y.value=i(y.value,w,r);var u=t.className;if(!u.match("(\\s|^)(language-)?"+v+"(\\s|$)")){u=u?(u+" "+v):v}if(/MSIE [678]/.test(navigator.userAgent)&&t.tagName=="CODE"&&t.parentNode.tagName=="PRE"){s=t.parentNode;var p=document.createElement("div");p.innerHTML="<pre><code>"+y.value+"</code></pre>";t=p.firstChild.firstChild;p.firstChild.cN=s.cN;s.parentNode.replaceChild(p.firstChild,s)}else{t.innerHTML=y.value}t.className=u;t.result={language:v,kw:y.keyword_count,re:y.r};if(y.second_best){t.second_best={language:y.second_best.language,kw:y.second_best.keyword_count,re:y.second_best.r}}}function o(){if(o.called){return}o.called=true;var r=document.getElementsByTagName("pre");for(var p=0;p<r.length;p++){var q=b(r[p]);if(q){n(q,hljs.tabReplace)}}}function l(){if(window.addEventListener){window.addEventListener("DOMContentLoaded",o,false);window.addEventListener("load",o,false)}else{if(window.attachEvent){window.attachEvent("onload",o)}else{window.onload=o}}}var e={};this.LANGUAGES=e;this.highlight=d;this.highlightAuto=g;this.fixMarkup=i;this.highlightBlock=n;this.initHighlighting=o;this.initHighlightingOnLoad=l;this.IR="[a-zA-Z][a-zA-Z0-9_]*";this.UIR="[a-zA-Z_][a-zA-Z0-9_]*";this.NR="\\b\\d+(\\.\\d+)?";this.CNR="\\b(0[xX][a-fA-F0-9]+|(\\d+(\\.\\d*)?|\\.\\d+)([eE][-+]?\\d+)?)";this.BNR="\\b(0b[01]+)";this.RSR="!|!=|!==|%|%=|&|&&|&=|\\*|\\*=|\\+|\\+=|,|\\.|-|-=|/|/=|:|;|<|<<|<<=|<=|=|==|===|>|>=|>>|>>=|>>>|>>>=|\\?|\\[|\\{|\\(|\\^|\\^=|\\||\\|=|\\|\\||~";this.ER="(?![\\s\\S])";this.BE={b:"\\\\.",r:0};this.ASM={cN:"string",b:"'",e:"'",i:"\\n",c:[this.BE],r:0};this.QSM={cN:"string",b:'"',e:'"',i:"\\n",c:[this.BE],r:0};this.CLCM={cN:"comment",b:"//",e:"$"};this.CBLCLM={cN:"comment",b:"/\\*",e:"\\*/"};this.HCM={cN:"comment",b:"#",e:"$"};this.NM={cN:"number",b:this.NR,r:0};this.CNM={cN:"number",b:this.CNR,r:0};this.BNM={cN:"number",b:this.BNR,r:0};this.inherit=function(r,s){var p={};for(var q in r){p[q]=r[q]}if(s){for(var q in s){p[q]=s[q]}}return p}}();hljs.LANGUAGES.cpp=function(){var a={keyword:{"false":1,"int":1,"float":1,"while":1,"private":1,"char":1,"catch":1,"export":1,virtual:1,operator:2,sizeof:2,dynamic_cast:2,typedef:2,const_cast:2,"const":1,struct:1,"for":1,static_cast:2,union:1,namespace:1,unsigned:1,"long":1,"throw":1,"volatile":2,"static":1,"protected":1,bool:1,template:1,mutable:1,"if":1,"public":1,friend:2,"do":1,"return":1,"goto":1,auto:1,"void":2,"enum":1,"else":1,"break":1,"new":1,extern:1,using:1,"true":1,"class":1,asm:1,"case":1,typeid:1,"short":1,reinterpret_cast:2,"default":1,"double":1,register:1,explicit:1,signed:1,typename:1,"try":1,"this":1,"switch":1,"continue":1,wchar_t:1,inline:1,"delete":1,alignof:1,char16_t:1,char32_t:1,constexpr:1,decltype:1,noexcept:1,nullptr:1,static_assert:1,thread_local:1,restrict:1,_Bool:1,complex:1},built_in:{std:1,string:1,cin:1,cout:1,cerr:1,clog:1,stringstream:1,istringstream:1,ostringstream:1,auto_ptr:1,deque:1,list:1,queue:1,stack:1,vector:1,map:1,set:1,bitset:1,multiset:1,multimap:1,unordered_set:1,unordered_map:1,unordered_multiset:1,unordered_multimap:1,array:1,shared_ptr:1}};return{dM:{k:a,i:"</",c:[hljs.CLCM,hljs.CBLCLM,hljs.QSM,{cN:"string",b:"'\\\\?.",e:"'",i:"."},{cN:"number",b:"\\b(\\d+(\\.\\d*)?|\\.\\d+)(u|U|l|L|ul|UL|f|F)"},hljs.CNM,{cN:"preprocessor",b:"#",e:"$"},{cN:"stl_container",b:"\\b(deque|list|queue|stack|vector|map|set|bitset|multiset|multimap|unordered_map|unordered_set|unordered_multiset|unordered_multimap|array)\\s*<",e:">",k:a,r:10,c:["self"]}]}}}();hljs.LANGUAGES.r={dM:{c:[hljs.HCM,{cN:"number",b:"\\b0[xX][0-9a-fA-F]+[Li]?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+(?:[eE][+\\-]?\\d*)?L\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\b\\d+\\.(?!\\d)(?:i\\b)?",e:hljs.IMMEDIATE_RE,r:1},{cN:"number",b:"\\b\\d+(?:\\.\\d*)?(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"number",b:"\\.\\d+(?:[eE][+\\-]?\\d*)?i?\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"keyword",b:"(?:tryCatch|library|setGeneric|setGroupGeneric)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\.",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\.\\.\\d+(?![\\w.])",e:hljs.IMMEDIATE_RE,r:10},{cN:"keyword",b:"\\b(?:function)",e:hljs.IMMEDIATE_RE,r:2},{cN:"keyword",b:"(?:if|in|break|next|repeat|else|for|return|switch|while|try|stop|warning|require|attach|detach|source|setMethod|setClass)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"literal",b:"(?:NA|NA_integer_|NA_real_|NA_character_|NA_complex_)\\b",e:hljs.IMMEDIATE_RE,r:10},{cN:"literal",b:"(?:NULL|TRUE|FALSE|T|F|Inf|NaN)\\b",e:hljs.IMMEDIATE_RE,r:1},{cN:"identifier",b:"[a-zA-Z.][a-zA-Z0-9._]*\\b",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"<\\-(?!\\s*\\d)",e:hljs.IMMEDIATE_RE,r:2},{cN:"operator",b:"\\->|<\\-",e:hljs.IMMEDIATE_RE,r:1},{cN:"operator",b:"%%|~",e:hljs.IMMEDIATE_RE},{cN:"operator",b:">=|<=|==|!=|\\|\\||&&|=|\\+|\\-|\\*|/|\\^|>|<|!|&|\\||\\$|:",e:hljs.IMMEDIATE_RE,r:0},{cN:"operator",b:"%",e:"%",i:"\\n",r:1},{cN:"identifier",b:"`",e:"`",r:0},{cN:"string",b:'"',e:'"',c:[hljs.BE],r:0},{cN:"string",b:"'",e:"'",c:[hljs.BE],r:0},{cN:"paren",b:"[[({\\])}]",e:hljs.IMMEDIATE_RE,r:0}]}};
hljs.initHighlightingOnLoad();
</script>
<!-- MathJax scripts -->
<script type="text/javascript" src="https://c328740.ssl.cf1.rackcdn.com/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
</script>
<style type="text/css">
body, td {
font-family: sans-serif;
background-color: white;
font-size: 13px;
}
body {
max-width: 800px;
margin: auto;
line-height: 20px;
}
tt, code, pre {
font-family: 'DejaVu Sans Mono', 'Droid Sans Mono', 'Lucida Console', Consolas, Monaco, monospace;
}
h1 {
font-size:2.2em;
}
h2 {
font-size:1.8em;
}
h3 {
font-size:1.4em;
}
h4 {
font-size:1.0em;
}
h5 {
font-size:0.9em;
}
h6 {
font-size:0.8em;
}
a:visited {
color: rgb(50%, 0%, 50%);
}
pre, img {
max-width: 100%;
}
pre code {
display: block; padding: 0.5em;
}
code {
font-size: 92%;
border: 1px solid #ccc;
}
code[class] {
background-color: #F8F8F8;
}
table, td, th {
border: none;
}
blockquote {
color:#666666;
margin:0;
padding-left: 1em;
border-left: 0.5em #EEE solid;
}
hr {
height: 0px;
border-bottom: none;
border-top-width: thin;
border-top-style: dotted;
border-top-color: #999999;
}
@media print {
* {
background: transparent !important;
color: black !important;
filter:none !important;
-ms-filter: none !important;
}
body {
font-size:12pt;
max-width:100%;
}
a, a:visited {
text-decoration: underline;
}
hr {
visibility: hidden;
page-break-before: always;
}
pre, blockquote {
padding-right: 1em;
page-break-inside: avoid;
}
tr, img {
page-break-inside: avoid;
}
img {
max-width: 100% !important;
}
@page :left {
margin: 15mm 20mm 15mm 10mm;
}
@page :right {
margin: 15mm 10mm 15mm 20mm;
}
p, h2, h3 {
orphans: 3; widows: 3;
}
h2, h3 {
page-break-after: avoid;
}
}
</style>
</head>
<body>
<h2>Introduction to ggplot2</h2>
<a href="example.Rmd">Download</a>
<p>This is a short demo on how to convert an R Markdown Notebook into an IPython Notebook using knitr and notedown.</p>
<p>This is an introduction to <a href="http://github.com/hadley/ggplot2">ggplot2</a>. You can view the source as an R Markdown document, if you are using an IDE like RStudio, or as an IPython notebook, thanks to <a href="https://github.com/aaren/notedown">notedown</a>.</p>
<p>We need to first make sure that we have <code>ggplot2</code> and its dependencies installed, using the <code>install.packages</code> function.</p>
<p>Now that we have it installed, we can get started by loading it into our workspace</p>
<pre><code class="r">library(ggplot2)
</code></pre>
<p>We are now fully set to try and create some amazing plots. </p>
<h4>Data</h4>
<p>We will use the ubiqutous <a href="http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/iris.html">iris</a> dataset.</p>
<pre><code class="r">head(iris)
</code></pre>
<pre><code>## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
</code></pre>
<h4>Simple Plot</h4>
<p>Let us create a simple scatterplot of <code>Sepal.Length</code> with <code>Petal.Length</code>.</p>
<pre><code class="r">ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point()
</code></pre>
<p><img src="" alt="plot of chunk unnamed-chunk-3"/> </p>
<p>The basic idea in <code>ggplot2</code> is to map different plot aesthetics to variables in the dataset. In this plot, we map the x-axis to the variable <code>Sepal.Length</code> and the y-axis to the variable <code>Petal.Length</code>.</p>
<h4>Add Color</h4>
<p>Now suppose, we want to color the points based on the <code>Species</code>. <code>ggplot2</code> makes it really easy, since all you need to do is map the aesthetic <code>color</code> to the variable <code>Species</code>.</p>
<pre><code class="r">ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(aes(color = Species))
</code></pre>
<p><img src="" alt="plot of chunk unnamed-chunk-4"/> </p>
<p>Note that I could have included the color mapping right inside the <code>ggplot</code> line, in which case this mapping would have been applicable globally through all layers. If that doesn&#39;t make any sense to you right now, don&#39;t worry, as we will get there by the end of this tutorial.</p>
<h4>Add Line</h4>
<p>We are interested in the relationship between <code>Petal.Length</code> and <code>Sepal.Length</code>. So, let us fit a regression line through the scatterplot. Now, before you start thinking you need to run a <code>lm</code> command and gather the predictions using <code>predict</code>, I will ask you to stop right there and read the next line of code.</p>
<pre><code class="r">ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point() + geom_smooth(method = &quot;lm&quot;,
se = F)
</code></pre>
<p><img src="" alt="plot of chunk unnamed-chunk-5"/> </p>
<p>If you are like me when the first time I ran this, you might be thinking this is voodoo! I thought so too, but apparently it is not. It is the beauty of <code>ggplot2</code> and the underlying notion of grammar of graphics.</p>
<p>You can extend this idea further and have a regression line plotted for each <code>Species</code>.</p>
<pre><code class="r">ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) + geom_point() +
geom_smooth(method = &quot;lm&quot;, se = F)
</code></pre>
<p><img src="" alt="plot of chunk unnamed-chunk-6"/> </p>
</body>
</html>
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

Introduction to ggplot2

This is a short demo on how to convert an R Markdown Notebook into an IPython Notebook using knitr and notedown.

This is an introduction to ggplot2. You can view the source as an R Markdown document, if you are using an IDE like RStudio, or as an IPython notebook, thanks to notedown.

We need to first make sure that we have ggplot2 and its dependencies installed, using the install.packages function.

Now that we have it installed, we can get started by loading it into our workspace

library(ggplot2)

We are now fully set to try and create some amazing plots.

Data

We will use the ubiqutous iris dataset.

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Simple Plot

Let us create a simple scatterplot of Sepal.Length with Petal.Length.

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point()

plot of chunk unnamed-chunk-3

The basic idea in ggplot2 is to map different plot aesthetics to variables in the dataset. In this plot, we map the x-axis to the variable Sepal.Length and the y-axis to the variable Petal.Length.

Add Color

Now suppose, we want to color the points based on the Species. ggplot2 makes it really easy, since all you need to do is map the aesthetic color to the variable Species.

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point(aes(color = Species))

plot of chunk unnamed-chunk-4

Note that I could have included the color mapping right inside the ggplot line, in which case this mapping would have been applicable globally through all layers. If that doesn't make any sense to you right now, don't worry, as we will get there by the end of this tutorial.

Add Line

We are interested in the relationship between Petal.Length and Sepal.Length. So, let us fit a regression line through the scatterplot. Now, before you start thinking you need to run a lm command and gather the predictions using predict, I will ask you to stop right there and read the next line of code.

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) + geom_point() + geom_smooth(method = "lm", 
    se = F)

plot of chunk unnamed-chunk-5

If you are like me when the first time I ran this, you might be thinking this is voodoo! I thought so too, but apparently it is not. It is the beauty of ggplot2 and the underlying notion of grammar of graphics.

You can extend this idea further and have a regression line plotted for each Species.

ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) + geom_point() + 
    geom_smooth(method = "lm", se = F)

plot of chunk unnamed-chunk-6

## Introduction to ggplot2
[Download](example.Rmd)
This is a short demo on how to convert an R Markdown Notebook into an IPython Notebook using knitr and notedown.
This is an introduction to [ggplot2](http://github.com/hadley/ggplot2). You can view the source as an R Markdown document, if you are using an IDE like RStudio, or as an IPython notebook, thanks to [notedown](https://github.com/aaren/notedown).
We need to first make sure that we have `ggplot2` and its dependencies installed, using the `install.packages` function.
Now that we have it installed, we can get started by loading it into our workspace
```{r}
library(ggplot2)
```
We are now fully set to try and create some amazing plots.
#### Data
We will use the ubiqutous [iris](http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/iris.html) dataset.
```{r}
head(iris)
```
#### Simple Plot
Let us create a simple scatterplot of `Sepal.Length` with `Petal.Length`.
```{r}
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
geom_point()
```
The basic idea in `ggplot2` is to map different plot aesthetics to variables in the dataset. In this plot, we map the x-axis to the variable `Sepal.Length` and the y-axis to the variable `Petal.Length`.
#### Add Color
Now suppose, we want to color the points based on the `Species`. `ggplot2` makes it really easy, since all you need to do is map the aesthetic `color` to the variable `Species`.
```{r}
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
geom_point(aes(color = Species))
```
Note that I could have included the color mapping right inside the `ggplot` line, in which case this mapping would have been applicable globally through all layers. If that doesn't make any sense to you right now, don't worry, as we will get there by the end of this tutorial.
#### Add Line
We are interested in the relationship between `Petal.Length` and `Sepal.Length`. So, let us fit a regression line through the scatterplot. Now, before you start thinking you need to run a `lm` command and gather the predictions using `predict`, I will ask you to stop right there and read the next line of code.
```{r}
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
geom_point() +
geom_smooth(method = 'lm', se = F)
```
If you are like me when the first time I ran this, you might be thinking this is voodoo! I thought so too, but apparently it is not. It is the beauty of `ggplot2` and the underlying notion of grammar of graphics.
You can extend this idea further and have a regression line plotted for each `Species`.
```{r}
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
geom_point() +
geom_smooth(method = 'lm', se = F)
```
#!/usr/bin/Rscript
library(knitr)
opts_chunk$set(eval = FALSE, tidy = FALSE)
knit('example.Rmd', output = 'example.md')
all: example.ipynb example.html
example.md: example.Rmd
./knit
example.ipynb: example.md
notedown example.md | sed '/%%r/d' > example.ipynb
example.html: example.Rmd
R -e "knitr::knit2html('example.Rmd')"
@lwaldron
Copy link

Thanks, this approach worked for me! Although I had to make a couple changes:

  1. rmarkdown::render('example.Rmd') instead of knitr, and
  2. %%R (capital R) in the notedown command

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment