Commit 59cfe677 authored by Menger, Nino

Made minor changes to the readme for readability.

parent 0230b4e6
......@@ -15,15 +15,15 @@ Please read 'Thesis/NinoMenger_s1098386_Basftu_Thesis_v1.0.0.pdf' for an full ov
## Run the pipeline
The pipeline is built to run on the Anunna cluster, using the SLURM system to queue jobs. To run it, follow these steps:
###1- Download the entire project and place it on the Anunna cluster.
### 1- Download the entire project and place it on the Anunna cluster.
###2- Make sure your multi-sample VCF data is ready for use.
### 2- Make sure your multi-sample VCF data is ready for use.
The VCF data must contain multiple samples and must be split by chromosome across multiple VCF files.
Furthermore, a tab-separated annotation file must be present. This annotation file can contain a maximum of five columns, of which the first contains the sample identifiers. The other four columns can be used to classify the samples as the user wishes. For the original project, the following four classification columns were used: species, domestication status, continental origin and specific origin. For the 'Multisample_VCFs_Sscrofa11.1' data set, which was used to create the pipeline, a script capable of converting the original annotation file to a more organised one is included.
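Before starting the pipeline, it can help to sanity-check the annotation file against the layout described above. The following is a minimal sketch, not part of the pipeline itself: the function name `check_annotation` and the example rows are hypothetical, and the duplicate-identifier check is an added assumption that the README does not state.

```python
import csv
import io

MAX_COLUMNS = 5  # sample identifier + up to four classification columns


def check_annotation(handle):
    """Return a list of problems found in a tab-separated annotation file."""
    problems = []
    seen = set()
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        if len(row) > MAX_COLUMNS:
            problems.append(f"line {line_no}: {len(row)} columns (max {MAX_COLUMNS})")
        sample = row[0].strip()
        if not sample:
            problems.append(f"line {line_no}: empty sample identifier")
        elif sample in seen:
            # Assumption: identifiers should be unique; the README does not say so.
            problems.append(f"line {line_no}: duplicate sample identifier {sample!r}")
        seen.add(sample)
    return problems


if __name__ == "__main__":
    # Hypothetical example rows; the second row has one column too many.
    example = io.StringIO(
        "S1\tSus scrofa\tdomestic\tEurope\tNetherlands\n"
        "S2\tSus scrofa\twild\tAsia\tJapan\textra\n"
    )
    for problem in check_annotation(example):
        print(problem)
```

To check a real file, pass an open file handle instead of the `io.StringIO` example, for instance `check_annotation(open(path, newline=""))`.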
###3- Set up the 'SnakefileConfig.yaml' file
### 3- Set up the 'SnakefileConfig.yaml' file
This file lets the user set up the pipeline according to their wishes. For the 'Multisample_VCFs_Sscrofa11.1' data set, which was used to create the pipeline, all settings are already optimised. For other data sets, the variables listed below have to be changed. The downloaded config file can be used as an example of what the settings should look like.
......@@ -51,11 +51,11 @@ GROUPS: Define groups to seperatly visualise in plots. The user need to seperate
The column names do not refer to the column names in the annotation file, since those are unnamed. However, the names are used in the plots. The following regex statement can be used to select all samples from a certain column: ".*".
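To illustrate how a regex such as ".*" selects samples by the value in one annotation column, here is a small sketch. It is an assumption about the matching behaviour, not the pipeline's actual code: the helper `select_samples`, the use of `re.fullmatch`, and the example rows are all hypothetical.

```python
import re


def select_samples(rows, column, pattern):
    """Return sample identifiers whose value in `column` fully matches `pattern`.

    `rows` is a list of lists; index 0 of each row holds the sample identifier.
    """
    regex = re.compile(pattern)
    return [
        row[0]
        for row in rows
        if column < len(row) and regex.fullmatch(row[column])
    ]


# Hypothetical annotation rows: identifier, species, domestication status, continent.
rows = [
    ["S1", "Sus scrofa", "domestic", "Europe"],
    ["S2", "Sus scrofa", "wild", "Asia"],
    ["S3", "Sus cebifrons", "wild", "Asia"],
]

print(select_samples(rows, 2, ".*"))    # every sample: ['S1', 'S2', 'S3']
print(select_samples(rows, 2, "wild"))  # only wild samples: ['S2', 'S3']
```

With full matching, ".*" accepts any value in the column, so it selects every sample, while a narrower pattern restricts the group to matching values only.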
###4- Check the cluster config file: 'clusterConfig.yaml'.
### 4- Check the cluster config file: 'clusterConfig.yaml'.
This file is used by the SLURM system to reserve resources, email about progress and write log files. The resources are already set in the downloaded file; however, other quality-of-life settings can be altered if desired.
###5- Run the RUNME.sh
### 5- Run the RUNME.sh
Run the following command to run the pipeline: bash RUNME.sh
......