There are two folders each containing its own README file:
FTP
Folder containing scripts to create the FTP structure with decompression,
quality check and checksum.
PIPELINE
Folder containing scripts to trim, clean and correct raw data, and to make
the final files available for assembly.
ftp/mkfolders.py
Integral part of the project. Creates the folder structure for the project
and a list of all the files in their expected places, along with their project
information (library, species, etc.) as described in setup.py.
Saves the information in PROJECT_ROOT/project_description.csv.
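
As an illustration of the kind of listing described above, the sketch below
walks a project tree and writes one CSV row per file. It is not the code of
mkfolders.py: the function name, the column names and the
<root>/<species>/<library>/<file> layout are assumptions made for this
example; the real layout is the one defined in folderStruct.py.

```python
import csv
import os


def describe_project(project_root, out_name="project_description.csv"):
    """Write one CSV row per file found under project_root.

    Assumes a hypothetical layout of <root>/<species>/<library>/<file>.
    Files at any other depth (including the CSV itself) are skipped.
    """
    rows = []
    for dirpath, _dirnames, filenames in os.walk(project_root):
        rel = os.path.relpath(dirpath, project_root)
        parts = rel.split(os.sep)
        if len(parts) != 2:
            continue  # not a <species>/<library> folder
        species, library = parts
        for name in filenames:
            rows.append({
                "species": species,
                "library": library,
                "file": os.path.join(rel, name),
            })

    # os.walk gives no ordering guarantee, so sort for a stable output.
    rows.sort(key=lambda r: (r["species"], r["library"], r["file"]))

    out_path = os.path.join(project_root, out_name)
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["species", "library", "file"])
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

Calling describe_project(PROJECT_ROOT) returns the path of the CSV it wrote.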
ftp/folderStruct.py
This script defines the folder structure of the project and some properties of
the data, such as genome size.
behavior.py
This script contains all the program's variables.
setup.json
Dump of folderStruct.py and behavior.py, so that the settings can be read in
parallel and independently, and so that a description of all parameters used
is saved.
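
A minimal sketch of how such a dump could be written and read back. The
function names and the two top-level keys are assumptions for this example,
and the dictionaries passed in merely stand in for the contents of
folderStruct.py and behavior.py; the actual schema of setup.json is not
described here.

```python
import json


def dump_setup(folder_struct, behavior, path):
    """Save both settings dictionaries to a single JSON file.

    The "folderStruct" and "behavior" keys are hypothetical names
    chosen for this sketch.
    """
    setup = {"folderStruct": folder_struct, "behavior": behavior}
    with open(path, "w") as handle:
        json.dump(setup, handle, indent=2, sort_keys=True)


def load_setup(path):
    """Read the settings back; each worker process can call this on its own."""
    with open(path) as handle:
        return json.load(handle)
```

Because the file is plain JSON, each parallel job can load it without
importing the Python modules, and the file doubles as a record of the
parameters used for a run.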
Last updated: Thu Sep 13 12:33:27 CEST 2012
Variables:
  Qborder: 30 - calculates Q30
  Contamination Cleaning:
    454     : threshold = 85% identity
    Illumina: threshold = 95% identity
  Genome Size: 950,000,000 = 950 M bases
  Jellyfish:
    both-strands
    high   : 300 - size of the graphic
    mer-len: 19 - k-mer length
  Quake:
    ratio: 800
  Trim Fastq (SolexaQA DynamicTrim):
    h: 20 - minimum Phred quality
    l: 30 - minimum sequence length
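
To make the quality-related variables concrete, the sketch below applies
Qborder, h and l to the Phred scores of a single read. It is a plain-Python
approximation, not the pipeline's code and not a reimplementation of SolexaQA:
the function names are invented, and treating the h cutoff as inclusive
(quality >= h) is an assumption based on the "minimum Phred quality" wording
above.

```python
def q_fraction(quals, qborder=30):
    """Fraction of bases with Phred quality >= qborder (Q30 when qborder is 30)."""
    if not quals:
        return 0.0
    return sum(1 for q in quals if q >= qborder) / len(quals)


def longest_good_segment(quals, h=20):
    """Return (start, end) of the longest run with quality >= h.

    end is exclusive; on ties the first run wins; (0, 0) if no base passes.
    """
    best_start, best_end = 0, 0
    run_start = None
    for i, q in enumerate(quals):
        if q >= h:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start > best_end - best_start:
                best_start, best_end = run_start, i
            run_start = None
    # Close a run that extends to the end of the read.
    if run_start is not None and len(quals) - run_start > best_end - best_start:
        best_start, best_end = run_start, len(quals)
    return best_start, best_end


def trim_read(seq, quals, h=20, l=30):
    """Trim a read to its longest good segment; return None if shorter than l."""
    start, end = longest_good_segment(quals, h)
    if end - start < l:
        return None
    return seq[start:end], quals[start:end]
```

With the defaults above, a read is kept only if it contains at least 30
consecutive bases of quality 20 or higher.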