KnitR is a package for the statistical software R that produces word-processing documents, PDFs, presentations and other formats with data embedded at processing time. For example, current stock exchange rates can be fetched and analysed within R, and depending on the analysis, phrases and results can be injected into the text. KnitR is often used within RStudio, a graphical user interface for calling commands and scripts of the underlying statistical software R (see Wikipedia:Knitr for details).

Up-to-date reports can be generated automatically from the command line by processing an R Markdown document; at processing time the current data sources (e.g. monitoring data) are evaluated in the statistical or numerical analysis.
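A minimal sketch of such an automated run is shown here. The script name render_report.R, the input file report.Rmd and the date-stamped output name are assumptions made for illustration; only rmarkdown::render() itself is taken from this resource.

```r
# render_report.R (hypothetical script name); run with: Rscript render_report.R
input  <- "report.Rmd"   # hypothetical R Markdown source file
output <- paste0("report_", format(Sys.Date(), "%Y-%m-%d"), ".html")

if (file.exists(input) && requireNamespace("rmarkdown", quietly = TRUE)) {
  # Re-runs all code chunks, so the report reflects the data available right now.
  rmarkdown::render(input, output_format = "html_document", output_file = output)
} else {
  message("Nothing rendered: ", input, " or the rmarkdown package is missing.")
}
```

Scheduling this script (e.g. with a cron job) would produce one dated report per run.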

If learners can see the R code in the learning document, they can carry out the same activities in the statistics software on their own. Furthermore, for research publications in Wikiversity[1], readers can

  • reproduce the results,
  • learn from the methodology,
  • apply the R-code on their own data,
  • check whether the algorithms are appropriate for the experimental design.
[Figure: KnitR integration]

Task for Learners

  • Install RStudio and the package KnitR and create and process your first KnitR-document.
  • Explore the concept of Scientific Hackathon and explain why KnitR can be used as a development environment for decision support products!
  • Analyse the COVID-19 outbreak and the requirements of dynamic updates in 2020. What are the requirements and constraints for creating a dynamic report mechanism based on KnitR and R for the COVID-19 epidemic?

Learning Modules

Use of KnitR

We will explain the creation of a knitR document with a simple example document. We will look at each part of this simple document and explain the new features used here. A knitR document basically consists of two types of text:

  • Text, which can be formatted with the R Markdown markup language, quite similar to the MediaWiki markup used to format Wikiversity articles
  • R code snippets, which consist of R code that is executed when the document is rendered

The first part of our sample document looks like this:

---
title: "Descriptive Statistics of 10000 dice rolls - a simple KnitR example"
author: "Martin Papke"
date: "22 August 2018"
output: pdf_document
---

At the start of every knitR document, we specify a title and an author. We also have to tell the interpreter in which form knitR should produce the output; here we create a PDF document. Other possible outputs include Word and HTML documents.

Next, we load the packages we need in an R code snippet. Code snippets are separated from the text parts by ```.

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(ggplot2)  # needed for the plot further down
```

At the beginning of a code snippet, we specify in curly brackets the language used (here: R) and a name for the code snippet. Extra options can follow, such as include=FALSE here, which runs the snippet but keeps both its code and its output out of the output document. Note that by default code snippets are included.

# A simple KnitR example

## Data import

Headings are marked with # and ## for level 1 and level 2 headings.
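The chunk that creates the vector dice is not shown in the sample document. A minimal sketch of what it could contain, assuming the 10000 rolls are simulated (the seed, the variable n_rolls and the use of sample() are assumptions, not part of the original example):

```r
# Assumption: the dice data are simulated rather than read from a file.
set.seed(2018)            # fixed seed so that the report is reproducible
n_rolls <- 10000          # number of rolls named in the document title
dice <- sample(1:6, size = n_rolls, replace = TRUE)

head(dice)                # first few rolls
```

If the rolls were recorded in a file instead, a call such as read.csv() would take the place of sample().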

## Statistics

Now we can do some statistics 
```{r statistics}
  dicemean <- mean(dice)
  dicemedian <- median(dice)
```
So, the mean of our dice throws is $\bar x = `r dicemean`$ and the median is `r dicemedian`. We 
now count the absolute frequencies of the dice results: 
```{r statistics2}
  dicetable <- table(dice)
```

What we see in this part is that LaTeX markup can be used in the text to present mathematical formulae. We also see how to include the result of an R command in the document, namely with `r command`. In this way, results of calculations, such as the mean and the median above, become part of the document.
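Inline results are printed with R's default number of digits, which is often more than a report needs. The following sketch rounds a value before it is referenced inline; the simulated data and the helper variable mean_text are assumptions that do not appear in the sample document.

```r
# Assumption: simulated dice data stand in for the data of the sample document.
set.seed(2018)
dice <- sample(1:6, size = 10000, replace = TRUE)

dicemean  <- mean(dice)
# Round to two decimals and keep trailing zeros, e.g. 3.5 becomes "3.50".
mean_text <- format(round(dicemean, 2), nsmall = 2)
mean_text
```

In the text one could then write `r mean_text` instead of `r dicemean`.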

## Plots
In KnitR, plots can be placed directly in the document: just call the usual R plot command 
```{r plot}
  xy <- data.frame(dicetable)
  ggplot(data=xy, aes(x=dice, y=Freq)) + geom_bar(stat="identity")
```

Plots created by ordinary R code are placed directly in the document, as shown above.
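The plot chunk relies on the ggplot2 package. As a sketch of an alternative that needs no extra package, the same frequencies can be drawn with base R's barplot(); the simulated data and the axis labels are assumptions made for illustration.

```r
# Assumption: simulated dice data stand in for the data of the sample document.
set.seed(2018)
dice <- sample(1:6, size = 10000, replace = TRUE)

# Using a factor with fixed levels guarantees one bar per face of the die.
dicetable <- table(factor(dice, levels = 1:6))

# barplot() draws the chart and returns the x positions of the bar midpoints.
mids <- barplot(dicetable, xlab = "Result of the throw", ylab = "Absolute frequency")
```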

A statistical example - checking for independence

We will reuse our dice data to check whether the even-numbered and the odd-numbered throws are independent. After loading the data as above, we use R's built-in $\chi^2$-test to check for independence by invoking

  # test the contingency table 
  chi <- chisq.test(tbl)
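The contingency table tbl is not constructed in the snippet above. One possible construction, assuming that "odd" and "even" refer to the position of a throw in the sequence and that the data are simulated (the variable names odd_throws and even_throws are hypothetical), is:

```r
# Assumption: simulated dice data stand in for the data of the sample document.
set.seed(2018)
dice <- sample(1:6, size = 10000, replace = TRUE)

# Split the sequence into throws number 1, 3, 5, ... and throws number 2, 4, 6, ...
odd_throws  <- dice[seq(1, length(dice), by = 2)]
even_throws <- dice[seq(2, length(dice), by = 2)]

# Cross-tabulate each odd-numbered throw with the throw that follows it.
tbl <- table(odd  = factor(odd_throws,  levels = 1:6),
             even = factor(even_throws, levels = 1:6))

# test the contingency table
chi <- chisq.test(tbl)
chi$p.value
```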

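Now we can check whether the data give (highly) significant evidence against independence by looking at the $p$-value. The null hypothesis of the test is independence, so a small $p$-value speaks against it:

  p <- chi$p.value
  if (p < 0.01) {
    "highly significant evidence against independence"
  } else if (p < 0.05) {
    "significant evidence against independence"
  } else {
    "no significant evidence against independence"
  }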

Compilation of a knitR source file

A knitR file can be compiled into the specified output formats by running rmarkdown::render(<FILENAME>) from the R console.

Some external knitR tutorials

Wiki to Markdown Conversion with PanDoc

The open-source tool PanDoc is called the "Swiss army knife" of document conversion. Assume we have a KnitR document of a scientific paper that contains the KnitR code chunks for processing the data that were analysed.

  • Convert the Markdown document of the paper into a MediaWiki document with the PanDoc online converter:
    • Create a sample document with the knitr package in RStudio and save the R Markdown file with the extension Rmd to your hard drive.
    • Copy the content of your R Markdown document to the PanDoc online converter,
    • select Markdown (pandoc) as input format,
    • select MediaWiki as output format,
    • press the Convert button and analyse the generated MediaWiki syntax of the text.
  • The R code chunks for the analysis of the data (e.g. loaded from a CSV file or a spreadsheet document) are converted into a <code> environment.
  • This converted KnitR document is stored together with the scientific paper in the WikiJournal (e.g. WikiJournal of Medicine). If new data are sampled in the same way, applying the KnitR document to the new data performs the analysis in the same algorithmic way. This KnitR approach contributes to a workflow for Reproducible Science.
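The conversion described above can also be run locally instead of through the online converter. The following sketch assumes that the rmarkdown package and a pandoc installation are available; the temporary file names and the sample Markdown content are made up for illustration.

```r
# Hypothetical input: a tiny Markdown document written to a temporary file.
md_file   <- tempfile(fileext = ".md")
wiki_file <- tempfile(fileext = ".wiki")
writeLines(c("# A simple KnitR example", "", "Some *formatted* text."), md_file)

converted <- FALSE
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  # Same choice as in the online converter: Markdown in, MediaWiki out.
  rmarkdown::pandoc_convert(md_file, from = "markdown", to = "mediawiki",
                            output = wiki_file)
  converted <- file.exists(wiki_file)
  if (converted) cat(readLines(wiki_file), sep = "\n")
} else {
  message("rmarkdown or pandoc is not available; conversion skipped.")
}
```

An Rmd file would first have to be knitted to plain Markdown if the results of the code chunks, and not only their source, should appear in the wiki text.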

Workflow for KnitR-Backend connected to Wikiversity/Wikipedia

The following workflow is currently not implemented in Wikiversity. This wiki resource explains the analogy to the KnitR workflow and the future benefits that a KnitR-like implementation would bring to scientific publishing of dynamically generated learning resources in Wikiversity:

==My Section==
this is text in wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 
<dyncalc language="R" src="">
   x <- rnorm(100)
   y <- 3*x + rnorm(100)
   cor(x, y) 
</dyncalc>
Now we create a scatter plot of the data 
<dyncalc language="R" src="">
   {r scatterplot, fig.width=8, fig.height=6}
   plot(x, y)
</dyncalc>
Now we create the text output depending on the analysis of the data.
<dyncalc language="R" src="">
  if (cor(x, y) > 0) {
    print("The correlation of x and y is a positive number.")
  }
</dyncalc>

After processing the Wikiversity document with the R backend specified in the src attribute, the result could be:

==My Section==
this is text in wikiversity. Now we calculate the correlation of two vectors x and y dynamically. The correlation is 0.93612
Now we create a scatter plot of the data 
Now we create the text output depending on the analysis of the data.
The correlation of x and y is a positive number.
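The code inside the proposed tags can be tried out in an ordinary R session. The sketch below repeats the calculation with a fixed seed (the seed and the variable names r, s and msg are assumptions) and shows that cor() returns the correlation coefficient, which always lies between -1 and 1, while cov() returns the covariance.

```r
set.seed(42)               # assumed seed, only for reproducibility
x <- rnorm(100)
y <- 3 * x + rnorm(100)

r <- cor(x, y)             # correlation coefficient, between -1 and 1
s <- cov(x, y)             # covariance, not bounded by 1

# Text output that depends on the analysis of the data.
msg <- if (r > 0) {
  "The correlation of x and y is a positive number."
} else {
  "The correlation of x and y is not a positive number."
}
print(msg)
```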

The proposed environment for dynamic calculations is placed in a tag environment, just like the math tag for mathematical expressions.

  • The encapsulated content in the math tag is rendered for the output.
  • The encapsulated content in the dyncalc tag is submitted to a backend for calculation or for creating a scatter plot.

The encapsulated code is real R code that works in R or RStudio. A workflow for R/KnitR can be found in the R tutorial by K. Broman[2]. The R code creates two vectors with random numbers and calculates their correlation (cor() returns the correlation coefficient; the covariance would be obtained with cov()). By default, the R code of a Wikiversity page should be processed only after a code modification, to limit the server load on the R backend. Another option is that the learner downloads the source via an API and creates the KnitR output offline on a mobile device. The Wikiversity community will decide whether this is an option to include in a learning environment for exploring the analysis of data and its interpretation.

  • The document language will be standard wiki markup, also known as wikitext or wikicode, which consists of the syntax and keywords used by the MediaWiki software to format a page (e.g. used in Wikipedia, Wikiversity, OLAT,...).
  • R code chunks will be recognized and interpreted by an R backend, or a reference to a versioned R script is inserted in the wiki document/article; any such reference leads to an update of the wiki article if the data, the script or the document is updated. Diagrams are, for example, still PNG files in Wikimedia that are imported in the standard way most authors of the wiki community know. The difference between a standard wiki document and a wiki document with R code chunks is that any update of the data or of the script calls the R script again and creates a new version of the output (diagrams as PNG files, numbers, dynamic text elements). The same concept is basically used for mathematical formulas in MediaWiki by texvc[3] and by the Math extension for MediaWiki[4]. The LaTeX sources are parsed and converted into images or MathML that can be displayed in a browser.
  • SageMath is another potential candidate as a backend to perform the numerical and statistical analysis "on the fly" in a learning resource. The benefits are considerable for learners and authors alike: a learning task can be performed in SageMath by the learner, and diagrams and maps for recent events stay current in the document without authors having to update diagrams, figures and statistical results with the most recent data again and again. The collection of software packages available within SageMath is large, and R is one package in the SageMath package list.
  • PanDoc could be used to convert wiki code to Markdown for processing with the KnitR package.

Learning Task

The previous section elaborated the workflow of an integrated KnitR approach. Because this concept is not yet implemented as an extension in MediaWiki, the workflow cannot be performed directly with code chunks for mathematical calculations in the MediaWiki of Wikiversity. It is nevertheless possible to learn about the workflow in general.

See also


  1. WikiJournal of Medicine. An open access journal with no publication costs. ISSN: 2002-4436. Frequency: continuous. Since: March 2014. Publisher: Wikimedia Foundation.
  2. Broman, K. KnitR in a Nutshell (accessed 2017/08/14).
  3. Schubotz, M. (2013). Making math searchable in Wikipedia. arXiv preprint arXiv:1304.5475.
  4. Schubotz, M., & Wicke, G. (2014). Mathoid: Robust, scalable, fast and accessible math rendering for wikipedia. In Intelligent Computer Mathematics (pp. 224-235). Springer, Cham.
  5. Quantum Geographic Information System (QGIS). Open Source Software Package for Linux, Windows, Mac (2017), LTR 2.18.11 (accessed 2017/08/14).

External links