---
title: "i think therefore..."
author: "Pietà Schofield"
date: "9 March 2015"
output:
  ioslides_presentation:
    fig_caption: true
    fig_width: 10
    fig_height: 5
    wide: true
    highlight: kate
    css: presentation.css
---

```{r setup, echo=FALSE, warning=FALSE, message=FALSE, results="hide"}
#
# Include this so I can manage references in this document
#
require(RefManageR)
BibOptions(check.entries = FALSE, style = "markdown", cite.style = "authoryear",
           bib.style = "authoryear")
bib <- ReadBib("/Users/pschofield/git_tree/biblio/bioinf.bib",check=FALSE)
#
# This little bit of trickery is rather complex knitr-foo thanks to Ramnath Vaidyanathan
#
# http://ramnathv.github.io/posts/verbatim-chunks-knitr/index.html
#
# It enables code chunks to be included with the surrounding brackets and options
# for illustrative purposes
#
knitr::opts_chunk$set(tidy = FALSE)
knitr::knit_hooks$set(source = function(x, options){
  if (!is.null(options$verbatim) && options$verbatim){
    # drop the verbatim option itself so it does not show up on the slide
    opts = gsub(",\\s*verbatim\\s*=\\s*TRUE\\s*", "", options$params.src)
    bef = sprintf('\n\n    ```{r %s}\n', opts)
    stringr::str_c(
      bef,
      knitr:::indent_block(paste(x, collapse = '\n'), "    "),
      "\n    ```\n"
    )
  } else {
    stringr::str_c("\n\n```", tolower(options$engine), "\n",
      paste(x, collapse = '\n'), "\n```\n\n"
    )
  }
})
#
# a little bit of R from Yihui Xie to put in verbatim inline R
#
# http://stackoverflow.com/questions/20409172/how-to-display-verbatim-inline-r-code-with-backticks-using-rmarkdown
#
rinline <- function(code) {
  sprintf('``` `r %s` ```', code)
}
```

# “non-reproducible single occurrences are of no significance to science” | `r TextCite(bib,"popper:1935")` Logik der Forschung

##

*"The distinction between replication and reproducibility is, from what I understand, that*
*'replicable' means 'other people get exactly the same results when doing exactly the same thing',*
*while*
*'reproducible' means 'something similar happens in other people's hands'.*
*The latter is far stronger, in general, because it indicates that your results
are not merely some quirk of your setup and may actually be right."*

`r TextCite(bib, "brown:2015a")`

##

*"Statisticians and computer scientists - if there is no code, there is no paper*

*So I have a new policy when evaluating CV's of candidates for jobs, or when I'm reading a paper as a referee. If the paper is about a new statistical method or machine learning algorithm and there is no software available for that method - I simply mentally cross it off the CV. If I'm reading a data analysis and there isn't code that reproduces their analysis - I mentally cross it off."*

`r TextCite(bib,"leek:2015a")`

##

*Myth 3: We need new platforms for reproducible computational science.*

*Engineers like building stuff. It sure is easier (and hence more fun, at least in the short term) than doing science. But what we need right now is scientists actually using stuff that already exists, not engineers building new stuff that no one will ever use.*

*...to a first approximation, IPython Notebook and knitr have won.*

`r TextCite(bib,"brown:2014a")`

# Open Research

## Transparent scientific analysis - distributing analysis/code and data

- Publishing data
    - public databases, repositories
    - ?
      (the data I work on isn't mine to share)
- Publishing analysis
    - scripted analysis is simpler to distribute than mouse-click sequences
    - literate scripts, ipython-notebooks or **_knitr_** / **_sweave_**
    - make an R package `r TextCite(bib, c("wickham:2015","leek:2014"))`
    - post it on GitHub
- Publishing results
    - open journals, [arXiv](http://arxiv.org), [figshare](http://figshare.com), [F1000](http://f1000research.com)

# *"One of these days I'm gonna get organizized."* | [Bickle (1976)](https://www.youtube.com/watch?v=YP4hYtwGFlI) Taxi Driver

##

- **Literate programming** `r TextCite(bib,"knuth:1984")`, embed the code within the natural language
description of the logic behind the code (cweb, noweb, **_knitr_**, ipython-notebooks)

$$ versus $$

- **Documentation generation** structured comments embedded in the code are extracted to produce documentation
(perldoc, javadoc, sphinx, doxygen)

##

*"Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do."*

`r TextCite(bib,"knuth:1984")`

# I R

![I R Baboon copyright owned by Cartoon Network ](I_R_Baboon.gif)

# Package Development (**_devtools_**)

## **_devtools_** `r TextCite(bib, "wickhamchang:2015")`

- what RStudio is doing in the background when you make a package
- creates a directory structure
- creates and manages certain key config files
    - DESCRIPTION needs hand editing
    - NAMESPACE automatically managed
- generates documentation from structured comments - **_roxygen_**
- tools for automated testing - **_testthat_**
- tools for generating vignettes (how-tos and tutorial documentation)

##

```
packagename
 |
 |- README
 |- DESCRIPTION
 |- NAMESPACE
 |- R
 |  ` - This is where your commented R scripts live
 |
 |- man
 |  ` - This is where the autogenerated help goes
 |
 |- tests
 |  ` - This is where the test structure and code goes
 |
 |- vignettes
 :  ` - This is where the how-tos and tutorials go
 :
 :..src
    `.. This is potentially where cpp (Rcpp) code would live
```

## {.smaller}

```{r desc, echo=FALSE, comment=NA}
writeLines(readLines("/Users/pschofield/git_hub/SpikeNorm/DESCRIPTION"))
```

## {.smaller}

```{r subScript, echo=FALSE, comment=NA}
writeLines(readLines("/Users/pschofield/git_hub/pietalib/R/subScript.R"))
```

## edit your functions then rinse and repeat

```
# create the documentation from the roxygen comments in the R sources
devtools::document()
# load the package for testing
devtools::load_all()
# run the test scripts stored in the tests subdirectory
devtools::test()
# eventually install the package so it can be used outwith the source directory
devtools::install(reload = TRUE)
```

# Literate Analysis (**_rmarkdown_**)

## **_knitr_**, `r TextCite(bib,"xie:2013")`

embed the code for the analysis within a natural language description of the analysis.

evolution of **_sweave_**

- combined writing natural language in latex
- with embedded chunks of R code

**_knitr_** permits

- latex, rmarkdown and html as the natural language format
- output to Word, HTML, PDF
- output as report, poster, presentation, interactive document

## **_rmarkdown_** file starts with a header

```
---
title: "how i R(oll)"
author: "Pietà Schofield"
date: "9 March 2015"
output:
  ioslides_presentation:
    fig_caption: true
    fig_width: 10
    fig_height: 7
    wide: true
    css: presentation.css
---
```

## **_rmarkdown_**

write natural language descriptive text in a Markdown dialect interspersed with chunks of R code

for example to list the content of the SpikeNorm package's DESCRIPTION file

```{r, eval=FALSE, comment=NA, verbatim=TRUE}
#
# Specify the file to open
#
desFile <- "/Users/pschofield/git_hub/SpikeNorm/DESCRIPTION"
#
# read the file and write it to stdout, which will be sent to
# the chunk output
#
writeLines(readLines(desFile))
#
```

## or to generate and display a graph

```{r, fig.cap="Some Random Stuff", verbatim=TRUE, eval=FALSE}
#
# Plot anything R can plot
#
plot(1:10+rnorm(10),1:10+rnorm(10),pch="x",
     xlab="expected", ylab="measured",main="Demo Plot")
#
# One option is to just send it to the default device and knitr
# captures it and puts it in a temporary place
#
abline(0,1,col="red")
#
# alternatively write it to a file and link to the file in the markdown
#
```

##

```{r, fig.cap="Some Random Stuff", echo=FALSE}
plot(1:10+rnorm(10),1:10+rnorm(10),pch="x",
     xlab="expected", ylab="measured",main="Demo Plot")
abline(0,1,col="red")
```

## **_knitr_** and hence **_rmarkdown_**

interface with a program called **_pandoc_** by [MacFarlane](http://johnmacfarlane.net/pandoc/)

**_pandoc_** will convert the markdown or latex generated by **_knitr_** into

- HTML ( html_document, ioslides_presentation, slidy_presentation )
- PDF ( pdf_document, beamer_presentation )
- Word docx ( word_document )

(**NB:** **_pandoc_** is written in [haskell](https://www.haskell.org)! which makes it sort of cool in itself)

# RStudio

##

- [RStudio](http://ts-ug.lifesci.dundee.ac.uk:8787) makes this stuff very simple
    - literate analysis **_knitr_** (**_rmarkdown_**)
    - package creation (**_devtools_**)
- Do as I say not as I do
    - it is all just R functions under the bonnet of RStudio
    - so if you are passionately (pathologically) addicted to vim (emacs), there is always the [vim-R-plugin](http://www.vim.org/scripts/script.php?script_id=2628) ([ESS](http://ess.r-project.org))
    - I believe eclipse has a plugin [StatET](http://www.walware.de/goto/statet) too

## {.flexbox .vcenter}

[Note to self do the RStudio demo bit here](http://ts-ug.lifesci.dundee.ac.uk:8787)
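
## {.smaller}

To show what "just R functions under the bonnet" means, here is a minimal sketch of roughly what the Knit button does. It assumes **_rmarkdown_** and **_pandoc_** are installed; the throwaway file and its contents are made up for illustration, and I have not checked the exact call RStudio itself makes.

```r
# Sketch only: the temporary Rmd file and its contents are hypothetical,
# and this approximates rather than reproduces what RStudio runs.
rmd <- tempfile(fileext = ".Rmd")

# write a tiny literate document: header, some prose, one R chunk
writeLines(c(
  "---",
  "title: \"demo\"",
  "output: html_document",
  "---",
  "",
  "Some text, then a chunk:",
  "",
  "```{r}",
  "summary(1:10)",
  "```"
), rmd)

# knit the chunks and hand the resulting markdown to pandoc;
# render() returns the path of the file it produced
out <- rmarkdown::render(rmd, output_format = "html_document", quiet = TRUE)
file.exists(out)
```

The same goes for the package side: as far as I can tell the Build pane is a front end to the `devtools::document()`, `devtools::test()` and `devtools::install()` calls shown earlier.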
# Bibliography

##

It is possible to include references from a bibtex library with **_knitcitations_**

I prefer **_RefManageR_** `r TextCite(bib,"mclean:2014")`

```{r, eval=FALSE,verbatim=TRUE}
# load the package
require(RefManageR)
mybibfile <- "/Users/pschofield/git_tree/biblio/bioinf.bib"
# specify the bibliography options
BibOptions(check.entries = FALSE, style = "markdown",
           cite.style = "authoryear", bib.style = "authoryear")
# load the file
bib <- ReadBib(mybibfile, check=FALSE)
```

Then you can include a citation with `r rinline('TextCite(bib,"refkey")')` in your text as you type

Finally you add a code chunk

```{r, eval=FALSE,verbatim=TRUE}
PrintBibliography(bib)
```

## Normally code chunks appear without the options and ticks

```{r, eval=FALSE}
#
# load the RefManageR package so I can have a central bib
# file rather than it having to be in the same directory as
# the markdown file
#
require(RefManageR)
mybibfile <- "/Users/pschofield/git_tree/biblio/bioinf.bib"
#
# specify the bibliography options
#
BibOptions(check.entries = FALSE, style = "markdown",
           cite.style = "authoryear", bib.style = "authoryear")
#
bib <- ReadBib(mybibfile, check=FALSE)
```

but I have been showing the decorations for illustrative purposes

Normally they are also syntax highlighted

## {.smaller}

```{r biblo, echo=FALSE, results="asis" }
PrintBibliography(bib)
```

## This is work in progress

I hope I am getting better at it

- do all my analysis in rmarkdown scripts
    - even script in rmarkdown to submit cluster jobs remotely via ssh
- generate HTML pages in my public_html directory on the cluster [http://www.compbio.dundee.ac.uk/user/pschofield](http://www.compbio.dundee.ac.uk/user/pschofield), currently not all my pages are public as I don't own the data, you can have access, just ask
- all my code/scripts are in git (some on ningal some on GitHub) again you can have access
- put all my useful little functions in an R package
- write my teaching materials and presentations in rmarkdown
- produce [posters](http://www.compbio.dundee.ac.uk/user/pschofield/public/docs/rnaseq_poster.pdf) with latex/sweave **_knitr_**

The code for this presentation can be found [here](http://www.compbio.dundee.ac.uk/user/pschofield/talks/gjb_lab/reproresear.Rmd)

# Thank You.

## {.flexbox .vcenter}

![The image of I.R.Baboon is copyright to Cartoon Network](I_R_Baboon.gif)

```{r echo=FALSE,eval=FALSE}
system(paste0("open -a /Applications/Opera.app ",
  rmarkdown::render("/Users/pschofield/git_tree/teaching/reproresear.Rmd",
                    output_format="ioslides_presentation",
                    output_dir="/homes/pschofield/public_html/talks/gjb_lab/")
))
```