# KnitR

KnitR (knitr) is a package for the statistics software R that produces word-processing documents, PDFs, presentations and other formats with data embedded at the time the document is rendered. For example, current stock exchange rates can be fetched and analysed within R, and, depending on the analysis, phrases and results can be injected into the text. The package KnitR is often used within RStudio, a graphical user interface for calling commands and scripts for the underlying statistics software R (see Wikipedia:Knitr for details).

From the command line, up-to-date reports can be generated automatically by processing an R Markdown document; at processing time, the current data sources (e.g. monitoring data) are evaluated in the statistical or numerical analysis.

If learners are able to see the R code in the learning document, they can perform activities in the statistics software on their own. Furthermore, for research publications in Wikiversity^{[1]}, readers can

- reproduce the results,
- learn from the methodology,
- apply the R-code on their own data,
- check whether the algorithms are appropriate for the experimental design.

## Task for Learners

- Install RStudio and the package KnitR and create and process your first KnitR-document.
- Explore the concept of Scientific Hackathon and explain why KnitR can be used as a development environment for decision support products!
- Analyse the COVID-19 outbreak and the need for dynamic updates in 2020. What are the requirements and constraints for creating a dynamic report mechanism based on KnitR and R for the COVID-19 epidemic?

## Learning Modules

## Use of KnitR

We will explain the creation of a knitR document with a simple example document. We will look at each part of this simple document and explain the new features used here. A knitR document basically consists of two types of text:

- Text, which can be formatted with the R Markdown markup language, quite similar to the MediaWiki markup used to format Wikiversity articles
- R code snippets, which consist of R code that is executed when the document is rendered

The first part of our sample document looks like this:

```
---
title: "Descriptive Statistics of 10000 dice rolls - a simple KnitR example"
author: "Martin Papke"
date: "22 August 2018"
output: pdf_document
---
```

At the start of every knitR document, we specify a title and an author. Moreover, we have to tell the interpreter in what form we want knitR to produce the output; here we create a PDF document. Other possible outputs include Word and HTML documents.

Next, we load the packages we need in an R code snippet. Code snippets are separated by ``` from the text parts.

````
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(knitr)
library(readr)
library(dplyr)
library(ggplot2)
```
````

At the beginning of a code snippet, we have to specify in curly brackets the language used (here: R) and a name for the code snippet. We can give extra options, such as `include=FALSE` here, which prevents knitR from including this code snippet and its output in the output document (the code is still executed). Note that by default code snippets are included.
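The effect of such chunk options can be tried out without creating a file. The following is a minimal sketch, assuming the knitr package is installed; it uses the `text` argument of `knitr::knit()` to knit two tiny documents held in character vectors. The helper name `knit_snippet` and the sample chunk are made up for illustration.

```r
# Minimal sketch: compare the chunk options echo=FALSE and include=FALSE.
# Assumption: the knitr package is installed; knit_snippet() is a
# hypothetical helper, not part of knitr.
knit_snippet <- function(option) {
  src <- c(
    "Some text.",
    "",
    sprintf("```{r, %s}", option),
    "1 + 1",
    "```"
  )
  # knit() returns the knitted Markdown as one string when `text` is used
  knitr::knit(text = src, quiet = TRUE)
}

if (requireNamespace("knitr", quietly = TRUE)) {
  hidden_code <- knit_snippet("echo=FALSE")     # result shown, code hidden
  hidden_all  <- knit_snippet("include=FALSE")  # code runs, nothing shown
  cat(hidden_code, "\n---\n", hidden_all, "\n")
} else {
  hidden_code <- hidden_all <- NULL
  message("knitr is not installed; skipping the demonstration")
}
```

With `echo=FALSE` only the code is hidden, while `include=FALSE` hides both the code and its results, although the chunk still runs.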

```
# A simple KnitR example
## Data import
[...]
```

Headings are marked with `#` and `##` for level 1 and level 2 headings.

````
## Statistics
Now we can do some statistics
```{r statistics}
dicemean <- mean(dice)
dicemedian <- median(dice)
```
So, the mean of our dice throws is $\bar x = `r dicemean`$ and the median is `r dicemedian`. We
now count the absolute frequencies of the dice results:
```{r statistics2}
dicetable <- table(dice)
```
````

What we see in this part is that LaTeX markup can be used in the text to present mathematical formulae. We also see how to include the result of an R command in the document, namely by `` `r command` ``. In this way, e.g. the results of calculations can be part of our document, as shown with the mean and the median above.
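The excerpts of the sample document do not show how `dice` is loaded. To follow along in an R console, the sketch below uses simulated rolls as a stand-in; this is an assumption and not the data import of the original document. It computes the values that the inline expressions would insert and builds the resulting sentence by hand.

```r
# Stand-in for the elided data import (assumption: simulated rolls of a
# fair six-sided die; the original document loads its data differently).
set.seed(2018)                      # arbitrary seed, for repeatable output
dice <- sample(1:6, size = 10000, replace = TRUE)

# The values that the inline expressions insert into the text
dicemean   <- mean(dice)
dicemedian <- median(dice)
dicetable  <- table(dice)           # absolute frequency of each face

# Rounding keeps an inline number readable, e.g. `r round(dicemean, 2)`
sentence <- sprintf("The mean of our dice throws is %.2f and the median is %g.",
                    dicemean, dicemedian)
print(sentence)
print(dicetable)
```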

````
## Plots
In KnitR, plots can be placed directly into the document; just call the usual R plot command
```{r plot}
xy <- data.frame(dicetable)
ggplot(data=xy, aes(x=dice, y=Freq)) + geom_bar(stat="identity")
```
````

Plots created by the usual R code are simply placed into the document, as shown above.
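The column names `dice` and `Freq` used in the plot call come from converting the frequency table into a data frame. The following sketch, which assumes simulated stand-in data because the data import is not shown, prints that structure and draws the same bar chart with base R in case ggplot2 is not installed.

```r
# Assumption: simulated stand-in data, since the data import is not shown.
set.seed(1)
dice <- sample(1:6, size = 10000, replace = TRUE)

dicetable <- table(dice)
xy <- data.frame(dicetable)   # columns: dice (factor) and Freq (integer)
str(xy)

# Base-R alternative to the ggplot2 call, drawing the same bar chart
barplot(xy$Freq, names.arg = xy$dice,
        xlab = "dice", ylab = "Freq")
```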

### A statistical example - checking for independence

We will reuse our dice data to check whether the even-numbered and the odd-numbered throws are independent. After loading the data as above, we use R's built-in $\chi^2$-test to check for independence by invoking

```
# test the contingency table
chi <- chisq.test(tbl)
```

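Now we can check whether there is (highly) significant evidence against independence by looking at the $p$-value; a small $p$-value means that the null hypothesis of independence is rejected:

```
# p-value of the chi-squared test; a small value speaks against independence
p <- chi$p.value
if (p < 0.01) {
  "highly significant evidence against independence"
} else if (p < 0.05) {
  "significant evidence against independence"
} else {
  "no significant evidence against independence"
}
```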
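The snippets above use a contingency table `tbl` without showing how it is built. One plausible construction, given here as an assumption and not necessarily the one used in the original document, pairs each odd-numbered throw with the throw that follows it. The data are simulated stand-ins.

```r
# Assumption: simulated stand-in data; `tbl` is built here in one plausible
# way, which is not necessarily how the original document does it.
set.seed(42)
dice <- sample(1:6, size = 10000, replace = TRUE)

odd_throws  <- dice[seq(1, length(dice), by = 2)]  # throws 1, 3, 5, ...
even_throws <- dice[seq(2, length(dice), by = 2)]  # throws 2, 4, 6, ...

# 6 x 6 contingency table: each odd-numbered throw vs. the throw after it
tbl <- table(odd = odd_throws, even = even_throws)

chi <- chisq.test(tbl)
p <- chi$p.value

# A small p-value would be evidence AGAINST independence
if (p < 0.05) {
  verdict <- "evidence against independence"
} else {
  verdict <- "no evidence against independence"
}
print(verdict)
```

The test has (6 - 1) * (6 - 1) = 25 degrees of freedom, which `chi$parameter` reports.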

## Compilation of a knitR source file

A knitR file can be compiled into the specified output format by running `rmarkdown::render(<FILENAME>)` from the R console.
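As a minimal sketch of such a call, the following code writes a tiny R Markdown file to a temporary directory and renders it. It assumes that the rmarkdown package and pandoc are installed; the file name and its content are made up for illustration.

```r
# Minimal sketch: write a tiny R Markdown file and render it.
# Assumptions: the rmarkdown package and pandoc are installed; the file
# name and its content are made up for illustration.
rmd_file <- file.path(tempdir(), "dice-report.Rmd")
writeLines(c(
  "---",
  "title: \"Dice report\"",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "mean(sample(1:6, 100, replace = TRUE))",
  "```"
), rmd_file)

out_file <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # output_format overrides the format named in the YAML header;
  # HTML is used here because PDF output additionally needs LaTeX
  out_file <- rmarkdown::render(rmd_file,
                                output_format = "html_document",
                                quiet = TRUE)
  print(out_file)
} else {
  message("rmarkdown or pandoc not available; nothing rendered")
}
```

The same call can be issued from a shell, e.g. `Rscript -e 'rmarkdown::render("dice-report.Rmd")'`, which is one way to generate reports automatically.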

## Some external knitR tutorials

- RStudio, an IDE to create knitR documents, also contains simple examples
- Karl Broman's minimal tutorial discusses the very basics of knitR, a very good point to get you started
- The knitR homepage does not only contain the source, but also some links and examples
- An introductory video on YouTube, which illustrates the generation of a knitR document using RStudio
- A nice short video to get you started in only five minutes

## Wiki to Markdown Conversion with PanDoc

The open-source tool PanDoc is called the "Swiss army knife" of document conversion. Assume we have a KnitR document of a scientific paper that contains the KnitR code chunks for processing the data that was analysed.

- Convert the Markdown document of the paper into a MediaWiki document with the PanDoc online converter:
  - Create a sample document with the knitr package in RStudio and save the R Markdown file with the extension `Rmd` to your hard drive.
  - Copy the content of your R Markdown document to the PanDoc online converter,
  - select `Markdown (pandoc)` as input format,
  - select `MediaWiki` as output format,
  - press the Convert button and analyse the generated MediaWiki syntax of the text.
- The R code chunks for the analysis of the data (e.g. loaded from a CSV file of a spreadsheet document) are converted into a <code> environment.
- This converted KnitR document is stored together with the scientific papers in the WikiJournal (e.g. WikiJournal of Medicine). If new data were sampled in the same way, applying the KnitR document to the new data performs the analysis in the same algorithmic way. This KnitR approach contributes to a workflow for Reproducible Science.
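The same conversion can also be run locally instead of in the online converter. The following is a minimal sketch that calls the pandoc command-line tool from R; it assumes that pandoc is installed and on the PATH, and the file names and content are made up for illustration.

```r
# Minimal sketch: convert Markdown to MediaWiki syntax with a local pandoc.
# Assumptions: the pandoc command-line tool is installed and on the PATH;
# file names and content are made up for illustration.
md_file   <- file.path(tempdir(), "paper.md")
wiki_file <- file.path(tempdir(), "paper.wiki")
writeLines(c(
  "# A simple KnitR example",
  "",
  "## Data import",
  "",
  "Some *emphasised* text.",
  "",
  "```r",
  "dicemean <- mean(dice)",
  "```"
), md_file)

wiki <- NULL
if (nzchar(Sys.which("pandoc"))) {
  # -f: input format, -t: output format, -o: output file
  system2("pandoc", c("-f", "markdown", "-t", "mediawiki",
                      shQuote(md_file), "-o", shQuote(wiki_file)))
  wiki <- readLines(wiki_file)
  cat(wiki, sep = "\n")
} else {
  message("pandoc not found; nothing converted")
}
```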

## Workflow for KnitR-Backend connected to Wikiversity/Wikipedia

The following workflow is currently **not implemented** in Wikiversity. This wiki resource explains the analogy between the KnitR workflow and the future benefits of a KnitR-like implementation for scientific publishing of dynamically generated learning resources in Wikiversity:

==My Section==
this is text in wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is
<dyncalc language="R" src="https://r-backend.example.com">
x <- rnorm(100)
y <- 3*x + rnorm(100)
cor(x, y)
</dyncalc>
Now we create a scatter plot of the data
<dyncalc language="R" src="https://r-backend.example.com">
{r scatterplot, fig.width=8, fig.height=6}
plot(x,y)
</dyncalc>
Now we create the text output depending on the analysis of the data.
<dyncalc language="R" src="https://r-backend.example.com">
if (cor(x, y) > 0) {
  print("The correlation of x and y is a positive number.")
}
</dyncalc>

After the Wikiversity document has been processed with the R backend specified in the *src* attribute, the result could be:

==My Section==
this is text in wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 0.93612
Now we create a scatter plot of the data
[[File:Scatterplot9923400384204.png]]
Now we create the text output depending on the analysis of the data.
The correlation of x and y is a positive number.

The proposed dynamic calculations are placed in a tag environment, just like the math tag for mathematical expressions.

- The encapsulated content in the **math** tag is rendered for the output.
- The encapsulated content in the **dyncalc** tag is submitted to a backend for calculation or for creating a scatter plot.

The encapsulated code is real R code that works in R or RStudio. A workflow for R/KnitR can be found in the R tutorial by K. Broman^{[2]}. The R code creates two vectors with random numbers and calculates their correlation with `cor()`. By default, the R code of a Wikiversity page should be processed only after the code has been modified, to limit the server load on the R backend. Another option is that learners download the source via an API and create the KnitR document offline on their mobile device. The Wikiversity community will decide whether this is an option to include in a learning environment for exploring the analysis of data and its interpretation.
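The R code inside the proposed tags can be checked directly in an R console. The sketch below repeats it with a fixed seed, which is added here only to make the run repeatable, and also prints the covariance, since `cor()` returns the correlation coefficient while `cov()` returns the covariance.

```r
# The code from the example tags, runnable in plain R.
# Assumption: the seed is added here only to make the run repeatable.
set.seed(123)
x <- rnorm(100)
y <- 3 * x + rnorm(100)

r_xy <- cor(x, y)   # correlation coefficient, always between -1 and 1
c_xy <- cov(x, y)   # covariance, not bounded to that range
print(c(correlation = r_xy, covariance = c_xy))

# Text output that depends on the analysis of the data
if (r_xy > 0) {
  msg <- "The correlation of x and y is a positive number."
} else {
  msg <- "The correlation of x and y is not positive."
}
print(msg)
```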

- The document language will be standard wiki markup, also known as wikitext or wikicode, which consists of the syntax and keywords used by the MediaWiki software to format a page (e.g. used in Wikipedia, Wikiversity, OLAT, ...).
- R code chunks will be recognized and interpreted by an R backend, or a reference to a versioned R script is inserted in the wiki document/article; any reference found will lead to a *Read* update of the wiki article if the data, script or document is updated. Diagrams are e.g. still PNG files in Wikimedia that are imported in a standard way that most authors of the wiki community will know. The difference between a standard wiki document and a wiki document with R code chunks is that any update of the data or of the script will call the R script again, and a new version of the output (diagrams as PNG files, numbers, dynamic text elements) is created. Basically the same concept is used for mathematical formulas in MediaWiki by TexVC^{[3]} and the Math extension for MediaWiki^{[4]}: the LaTeX sources are parsed and converted into images or MathML that can be displayed in a browser.
- SageMath is another potential candidate as a backend to perform the numerical and statistical analysis "on the fly" in a learning resource. The benefits are tremendous for learners and authors alike, because a learning task can be performed in SageMath by the learner, and the diagrams and maps for recent events can be visible in the document without the need to update diagrams, figures and statistical results with the most recent data again and again. The set of software packages available within SageMath is huge, and R is one package in the SageMath package list.
- PanDoc could be used to convert wiki code to Markdown for processing with the KnitR package.
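Re-running the analysis whenever the data or the script changes requires some form of change detection. The following sketch shows one possible mechanism that compares file checksums from `tools::md5sum()`. The helper `needs_update` and the file names are hypothetical and are not part of any existing MediaWiki extension.

```r
# Hypothetical helper: decide whether a report must be regenerated by
# comparing current file checksums with those stored from the last run.
needs_update <- function(files, stored) {
  current <- unname(tools::md5sum(files))
  is.null(stored) || !identical(current, stored)
}

# Demonstration with a temporary data file (names made up for illustration)
data_file <- file.path(tempdir(), "monitoring.csv")
writeLines(c("day,cases", "1,10"), data_file)

stored <- NULL
first_run <- needs_update(data_file, stored)   # no stored checksum yet
stored <- unname(tools::md5sum(data_file))     # remember the current state

unchanged <- needs_update(data_file, stored)   # nothing changed since then

writeLines(c("day,cases", "1,10", "2,15"), data_file)  # new data arrives
changed <- needs_update(data_file, stored)

print(c(first_run = first_run, unchanged = unchanged, changed = changed))
```

A backend built along these lines would call the R script again only when `needs_update()` reports a change, which also limits the server load.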

## Learning Task

In the previous section, the workflow of an integrated approach with KnitR was elaborated. Because this concept is not yet implemented as an extension in MediaWiki, the workflow cannot be performed with code chunks for mathematical calculations directly in the MediaWiki of Wikiversity. But it is possible to learn about the workflow in general:

- Install **R-Studio** and R on your computer.
- Install the **KnitR-Package** in R-Studio.
- Learn about **KnitR with Screencasts** (Youtube) and perform a basic KnitR tutorial so that you get the first dynamic report.
  - **Video - Learn Knitr in 5min (Youtube)** by Ram Narasimhan - retrieved 2016/10/11
  - **Video - Professional Report Writing with Sweave, LaTeX, and R (Youtube)** by Nicolas Yager - retrieved 2016/10/11
  - **knitr: Automatically embedding R output in documents (Youtube)** by Joshua Wiley 2013/10/14
- **Advanced Learners/Spatial Analysis:** Apply a scenario in the Risk Management Module for creating Risk Maps. What are additional requirements for the spatial analysis?
  - learn about the design of Geographic Information Systems (GIS) (Classification Resource by Dave Braunschweig),
  - install Open Source Quantum GIS^{[5]}, explore the QGIS manual and try to match the general elements of a GIS with the features offered in the QGIS manual (furthermore, explore QGIS screencasts on Youtube according to your level of expertise in GIS),
  - extract data from a GIS database and process the spatial data with an R script,
  - integrate generated maps and calculated R results in the KnitR document (Spatial Analysis with R, Youtube, published by *Domino Data Lab* 2016/02).
- **(Jupyter)** Analyse the Open Source framework Jupyter and analyse the differences and similarities between KnitR and Jupyter. When would you use KnitR and when would you use Jupyter?

## See also

- The R Programming wikibook
- PanDocElectron converter for documents
- Educational Content Sink
- Reproducibility
- Dynamic Document Generation
- HTML5 Processing of MediaWiki Markdown Articles - First milestone - access to MediaWiki source text in an HTML document with JavaScript - example shows download of Wikiversity:Swarm Intelligence in a textarea of an HTML file.
- Scientific Hackathon
- Open Community Approach
- Open Source Science
- COVID-19/Mathematical Modelling - Use case for dynamic reporting during an epidemiological spread.

## References

- [1] WikiJournal of Medicine - An open access journal with no publication costs. ISSN: 2002-4436. www.WikiJMed.org. Frequency: continuous. Since: March 2014. Publisher: Wikimedia Foundation.
- [2] Karl Broman, KnitR in a Nutshell - (accessed 2017/08/14) - http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
- [3] Schubotz, M. (2013). Making math searchable in Wikipedia. arXiv preprint arXiv:1304.5475.
- [4] Schubotz, M., & Wicke, G. (2014). Mathoid: Robust, scalable, fast and accessible math rendering for Wikipedia. In Intelligent Computer Mathematics (pp. 224-235). Springer, Cham.
- [5] Quantum Geographic Information System (QGIS) - Open Source Software Package for Linux, Windows, Mac (2017) - LTR 2.18.11, accessed 2017/08/14 - https://www.qgis.org/en/site/forusers/download.html

## External links

- KnitR Official website
- Repository on GitHub
- Example code on GitHub
- knitr package on CRAN
- R Statistics and R Studio as Graphical User Interface for R.
- Video - Learn Knitr in 5min (Youtube) by Ram Narasimhan - retrieved 2016/10/11
- Video - Professional Report Writing with Sweave, LaTeX, and R (Youtube) by Nicolas Yager - retrieved 2016/10/11
- knitr: Automatically embedding R output in documents by Joshua Wiley 2013/10/14
- PanDocElectron/How to create a html presentation
- Wikipedia:Pandoc to create MarkDown input for KnitR
- PanDocElectron Installation Guide for Linux Windows and Mac (Linux is supported currently with binaries for Ubuntu/Mint/Debian only)
- Online Demo for Pandoc for plain text documents to create MarkDown input for KnitR from other formats
- Pandoc Official Website Support for document converter
- Jupyter - concepts similar to those of the KnitR package in R, with a Python interface.