# Setting up

## Bioinformatics Tools

We are going to be using the following tools throughout the course:

- `fastqc`
- `multiqc`
- `trimmomatic`
- `bwa`
- `samtools` (version > 1.0)
- `bcftools` (version > 1.0)

The easiest way to install all of the above is through [`conda`](https://conda.io/miniconda.html), as follows:

1. Download and install the latest version of miniconda:

```
curl -O -L https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
```

2. Add the required channels to conda (so that it knows where to look for packages):

```
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
```

3. Install the required tools:

```
conda install -c bioconda fastqc multiqc trimmomatic bwa samtools bcftools
```

You might also want to download and install [`IGV`](https://software.broadinstitute.org/software/igv/download), as it may be useful for several visualizations.

## Base Tools

Finally, the following tools are in all likelihood already installed on most major operating systems, but you should check for them as well:

- `git`
- `curl`
- `gunzip`
- `java`

## R / RStudio

[R](http://www.r-project.org/) is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use [RStudio](http://www.rstudio.com/).

**Windows**

Install R by downloading and running the [correct installer file](http://cran.r-project.org/bin/windows/base/release.htm) from [CRAN](http://cran.r-project.org/index.html). Also, please install the [RStudio IDE](http://www.rstudio.com/ide/download/desktop). Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on the .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.

**Linux**

You can download the binary files for your distribution from [CRAN](http://cran.r-project.org/index.html). Alternatively, you can use your package manager (e.g. for Debian/Ubuntu run `sudo apt-get install r-base`, and for Fedora run `sudo yum install R`). Also, please install the [RStudio IDE](http://www.rstudio.com/ide/download/desktop).

### Install the required packages

We will install a specific set of packages, together with the packages they depend on and suggest, so that everything that might be needed is available.

```
install.packages(
  c("tidyverse", "ggplot2", "rafalib", "caret", "ggpubr", "GGally",
    "ROCR", "Boruta", "party", "earth", "mlbench", "glmnet",
    "e1071", "randomForest", "neuralnet"),
  dependencies = c("Depends", "Suggests")
)
```

Also install Bioconductor and some Bioconductor-related libraries.

```
# Note: from Bioconductor 3.8 onwards, biocLite() has been replaced by BiocManager::install()
source("https://bioconductor.org/biocLite.R")
biocLite()
biocLite("factoextra")
biocLite("fpc")
biocLite("knitr")
biocLite("kableExtra")
biocLite("TxDb.Hsapiens.UCSC.hg38.knownGene")
biocLite("BSgenome.Hsapiens.UCSC.hg38")
biocLite("DiffBind")
biocLite("MotifDb")
biocLite("methylKit")
biocLite("genomation")
biocLite("ggplot2")
biocLite("TxDb.Mmusculus.UCSC.mm10.knownGene")
biocLite("AnnotationHub")
biocLite("annotatr")
biocLite("bsseq")
biocLite("DSS")
```

That's it, all done! You now have all the tools in place!
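
As an optional sanity check, the command-line tools listed in this guide can be looked up on the `PATH` in one go. The snippet below is only a minimal sketch: the helper name `check_tools` is made up for illustration and is not part of any of the tools above. It only tests whether each command can be found, and does not verify versions.

```shell
# Hypothetical helper (not part of the course material): report which of the
# given commands can be found on the PATH.
# Prints one "found: NAME" or "missing: NAME" line per tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the bioinformatics and base tools named in this guide.
check_tools fastqc multiqc trimmomatic bwa samtools bcftools git curl gunzip java \
    || echo "Some tools are missing; please revisit the steps above."
```

Any tool reported as `missing` most likely needs the corresponding installation step to be repeated, or the conda environment to be activated in the current shell.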