When working with big data, which is certainly the case in next-generation sequencing (NGS), it is important to have a set of tools or a system that supports the user in managing and analysing the available data.

The R System

In statistical data analysis, R has become very popular. The philosophy of R resembles that of Unix: build a system out of small tools. Hence the base of R is relatively small, but it can easily be extended by a large number of packages. The Comprehensive R Archive Network (CRAN) is the main repository for packages that extend the functionality of the R system.
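As a minimal sketch of the small base plus packages idea, the following lists the packages that ship with R and installs an extension from CRAN only when it is missing. The helper name `ensure_package` and the choice of mirror URL are assumptions made for illustration; they are not part of R itself.

```r
# Packages that ship with a standard R installation (priority "base")
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Hypothetical helper: install a package from CRAN only if it is missing.
# The mirror URL is an assumption; any CRAN mirror can be used instead.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so no download is triggered here
ensure_package("stats")
```

Once a package is installed, `library(pkg)` attaches it to the current session.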

Bioconductor

When working with data from the bio- or life sciences, Bioconductor is a very valuable resource. Bioconductor not only provides a large set of R packages but also offers standardized workflows and example datasets. In general, Bioconductor documentation is provided as vignettes, following the paradigm of reproducible research.
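Bioconductor packages are usually installed through the `BiocManager` package rather than directly with `install.packages()`. The sketch below wraps that in a helper; the name `install_bioc`, the mirror URL, and the example package `Biostrings` are choices made here for illustration. The installation itself is left commented out because it needs network access.

```r
# Hypothetical helper: install one or more Bioconductor packages.
# BiocManager is fetched from CRAN first if it is not yet available.
install_bioc <- function(pkgs) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager", repos = "https://cloud.r-project.org")
  }
  BiocManager::install(pkgs)
}

# Not run here, since both calls need an installed Bioconductor package:
# install_bioc("Biostrings")
# browseVignettes("Biostrings")

# Vignettes of a package that ships with R can be listed offline
grid_vignettes <- vignette(package = "grid")$results[, "Item"]
print(grid_vignettes)
```

The same `vignette(package = ...)` call works for any installed Bioconductor package, which is a quick way to find its documentation.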

Why R

First, R is very fast for prototyping. Second, R is easy to extend, either by writing packages in R or by using its interfaces to other languages.

Dirk Eddelbuettel explained why to use R in a Google Tech Talk.

R Crash Course

Learning how to use a system like R involves a learning curve. Some people claim that this curve is especially steep for R.

As an introduction, I have put together some slides that I would use to introduce R to an audience without prior knowledge. If you are interested, you can read more here …