Commit 279b8995 authored by Smit, Sandra

Updated README

parent b9974272
@@ -56,26 +56,31 @@ Arguments is a list of key value pairs separated by whitespace.
**arguments:**
- *--database_path* or *-dp*: Path to the pangenome database.
- *--genomes-file* or *-gf*: a text file containing paths to FASTA files of genomes; each on a separate line.
- *--kmer-size* or *-ks*: the size of k-mers; if not given or out of range (6 <= K_SIZE <= 255), an optimal value is calculated automatically.
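The range rule for *--kmer-size* can be sketched as follows. This is only an illustration in Python, not PanTools' own code: the function name `kmer_size_or_auto` is hypothetical, and returning `None` merely stands in for "let the tool calculate an optimal value".

```python
from typing import Optional

# Valid range stated in the README: 6 <= K_SIZE <= 255
K_MIN, K_MAX = 6, 255


def kmer_size_or_auto(k_size: Optional[int]) -> Optional[int]:
    """Return k_size when it is usable, otherwise None.

    None is a placeholder (an assumption of this sketch) meaning that
    the k-mer size should be calculated automatically.
    """
    if k_size is not None and K_MIN <= k_size <= K_MAX:
        return k_size
    return None
```

A missing value and an out-of-range value are treated the same way here, matching the wording "if not given or out of range".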
- **build_panproteome** or **bp**: To build a pan-proteome out of a set of proteins.
**arguments:**
- *--database_path* or *-dp* : Path to the pangenome database.
- *--proteomes_file* or *-pf* : A text file containing paths to FASTA files of proteomes; each on a separate line.
- **add_genomes** or **ag**: To add new genomes to an available pan-genome.
**arguments:**
- *--database_path* or *-dp*: Path to the pangenome database.
- *--genomes-file* or *-gf*: a text file containing paths to FASTA files of genomes; each on a separate line.
- **add_annotations** or **aa**: To add new annotations to an available pan-genome.
**arguments:**
- *--database_path* or *-dp*: Path to the pangenome database.
- *--annotations-file* or *-af*: a text file in which each line contains a genome number and the path to the corresponding GFF file, separated by one space. Genomes are numbered in the order in which they were added to the pangenome. The protein sequences of the annotated genes are also stored in the folder "proteins" in the same path as the pangenome.
- *--connect_annotations* or *-ca*: connect the annotated genomic features to the nodes of gDBG.
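The annotations file layout described above (a genome number and a GFF path on each line, separated by one space) could be produced and read back roughly like this. The helper names and the example paths are hypothetical, and the sketch assumes that the first space on a line is the separator.

```python
from typing import Dict, List, Tuple


def format_annotations_file(gff_by_genome: Dict[int, str]) -> str:
    """Build the file content: one 'genome_number path' record per line."""
    lines = [f"{number} {path}" for number, path in sorted(gff_by_genome.items())]
    return "\n".join(lines) + "\n"


def parse_annotations_file(text: str) -> List[Tuple[int, str]]:
    """Read the records back as (genome_number, gff_path) pairs."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Assumption: the first space separates the number from the path.
        number, path = line.split(" ", 1)
        records.append((int(number), path))
    return records
```

For example, `format_annotations_file({1: "a.gff", 2: "b.gff"})` yields two lines, `1 a.gff` and `2 b.gff`, where the numbers follow the order in which the genomes were added.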
- **retrieve_features** or **rf** : To retrieve the sequence of annotated features from the pan-genome. For each genome a FASTA file containing the retrieved features will be stored in the output path. For example, genes.1.fasta contains all the genes annotated in genome 1.
@@ -83,35 +88,39 @@ Arguments is a list of key value pairs separated by whitespace.
- *--database_path* or *-dp* : Path to the pangenome database.
- *--output-path* or *-op* (**default value**: Database path determined by *-dp*) : Path to the output files.
- *--genome-numbers* or *-gn* : A text file containing genome_numbers for which the features will be retrieved.
- *--feature-type* or *-ft* (**default value**: gene) : The feature name; for example gene, mRNA, exon, tRNA, etc.
- **retrieve_regions** or **rr**: To retrieve the sequence of some genomic regions from the pan-genome. The resulting FASTA files will be stored in the output path.
**arguments:**
- *--database_path* or *-dp*: Path to the pangenome database.
- *--regions-file* or *-rf*: a text file containing one record per region, each with genome_number, sequence_number, begin and end positions separated by one space. The resulting FASTA file has the same name with an additional .fasta extension.
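A regions file with the four space-separated fields named above could be parsed as in the sketch below. The `Region` type and both function names are hypothetical, and the output-name helper assumes that "the same name" refers to the name of the regions file itself.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    genome_number: int
    sequence_number: int
    begin: int
    end: int


def parse_regions(text: str) -> List[Region]:
    """Parse one region per line: genome_number sequence_number begin end."""
    regions = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split(" ")
        if len(fields) != 4:
            raise ValueError(f"expected 4 space-separated fields, got: {line!r}")
        regions.append(Region(*(int(field) for field in fields)))
    return regions


def output_name(regions_file: str) -> str:
    """Assumed naming: the regions file name plus an extra .fasta extension."""
    return regions_file + ".fasta"
```

Under that assumption, a regions file called `regions.txt` would lead to an output file called `regions.txt.fasta`.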
- **retrieve_genomes** or **rg**: To retrieve the full sequence of some genomes. The resulting FASTA files will be stored in the output path.
**arguments:**
- *--database_path* or *-dp* : Path to the pangenome database.
- *--genome-numbers* or *-gn*: a text file containing the genome numbers to be retrieved, one per line. The resulting FASTA files are named like Genome_x.fasta.
- **group** or **g** : To create homology groups in the protein space of the pangenome (panproteome). The resulting homology groups will be stored in the output path.
**arguments**:
- *--database_path* or *-dp* : Path to the pangenome database.
- *--intersection-rate* or *-ir* (**default value**: 0.09, **valid range**: [0.001..0.1]) : The fraction of k-mers that needs to be shared by two intersecting proteins.
- *--min-protein-identity* or *-mpi* (**default value**: 95): the minimum similarity score. Should be in range [1-99].
- *--mcl-inflation* or *-mi* (**default value**: 9.6, **valid range**: (1..19)): The MCL inflation.
- *--contrast* or *-ct* (**default value**: 8, **valid range**: (0..10)) : The contrast factor.
- *--relaxation* or *-rn* (**default value**: 1, **valid range**: [1..8]) : The relaxation in homology calls.
- *--threads-number* or *-tn* (**default value**: 1) : The number of parallel working threads.
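The defaults and valid ranges listed for **group** can be collected into a small check, sketched below. Several points are assumptions of this sketch rather than documented behaviour: the dictionary keys and the function name are hypothetical, square brackets are read as inclusive bounds and round brackets as exclusive bounds, and the thread count is only required to be at least 1 because no range is stated for it.

```python
# Default values as listed in the README; the key names are hypothetical.
DEFAULTS = {
    "intersection_rate": 0.09,
    "min_protein_identity": 95,
    "mcl_inflation": 9.6,
    "contrast": 8,
    "relaxation": 1,
    "threads_number": 1,
}


def check_group_options(options: dict) -> dict:
    """Fill in defaults and raise ValueError for out-of-range values."""
    merged = {**DEFAULTS, **options}
    checks = {
        "intersection_rate": 0.001 <= merged["intersection_rate"] <= 0.1,  # [0.001..0.1]
        "min_protein_identity": 1 <= merged["min_protein_identity"] <= 99,  # [1-99]
        "mcl_inflation": 1 < merged["mcl_inflation"] < 19,  # (1..19), read as exclusive
        "contrast": 0 < merged["contrast"] < 10,  # (0..10), read as exclusive
        "relaxation": 1 <= merged["relaxation"] <= 8,  # [1..8]
        "threads_number": merged["threads_number"] >= 1,  # assumption: no range stated
    }
    bad = [name for name, ok in checks.items() if not ok]
    if bad:
        raise ValueError("out of range: " + ", ".join(bad))
    return merged
```

Calling `check_group_options({})` returns the defaults unchanged, while a value such as `mcl_inflation=19` is rejected under the exclusive reading of `(1..19)`.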
- **version** or **v**: To show the versions of PanTools and Neo4j.
- **help** or **h**: To show the manual of the tool.
## Visualization in the Neo4j browser