CpGAVAS: Chloroplast Genome Annotation, Visualization, Analysis and GenBank Submission Tool.

The complete genome sequences of chloroplasts are important for studies of phylogeny, RNA editing and species divergence. Recent developments in next-generation DNA sequencing technology have made whole-genome sequencing of chloroplasts technically feasible, and the number of completely sequenced chloroplast genomes has increased rapidly. With these whole-genome sequences available, annotating them conveniently and accurately has become a limiting step for downstream analyses. To facilitate the rapid annotation of chloroplast genomes, we developed CpGAVAS, which provides accurate genome annotation, generates circular chloroplast genome maps, reports useful analysis results for the annotated genome, and creates files that can be submitted to GenBank directly. We hope CpGAVAS will become an indispensable tool for researchers who study chloroplasts.

The Main Functions of CpGAVAS

1. "AnnotateGenome" - CpGAVAS takes a completely sequenced genome and returns three sets of results: a) the annotation results in GFF3 format, b) a circular map of the annotation, and c) basic analysis results for the genome. The user can then download the GFF3 file and edit it using annotation-editing software such as Apollo.

2. "AnnotateGene" - A genome might have unusual features, such as extremely short exons (6 to 9 bases long), trans-spliced genes, and others. The CpGAVAS genome annotation pipeline cannot yet identify these features consistently. This page allows users to run BLAST searches against particular genes to help identify these features.

3. "ViewAnnotatioinResults"- This module allows the retrieval and examination of the annotation results.

4. "UpdateAnnotatioinResults" - A manually curated gene annotation file in GFF3 format can be re-analyzed using this function, which regenerates the circular map and the analysis results.

5. "QuickDraw" - The user can upload a file in GFF3 or tab-delimited format and generate the circular map directly.

6. "PrepareDataBaseSubmission" - Following the instructions provided on this page, the user can generate the files for submission of the sequences to GenBank or EMBL.

7. "ExtractSeq" - After a newly sequenced genome has been annotated, its features can be compared with those of genomes that have already been sequenced. This page allows a user to retrieve the sequences for a list of mitochondria genes from a list of species.
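
When curating a downloaded GFF3 file by hand, it can help to list the features that fall in the very short size range mentioned under "AnnotateGene" so they can be checked first. The following is a minimal sketch, not part of CpGAVAS itself. It relies only on the standard GFF3 layout (nine tab-separated columns, 1-based inclusive coordinates). The feature type name "exon", the 9-base cutoff as a default, and the helper names `parse_gff3` and `short_features` are assumptions chosen for illustration; the feature types actually written by CpGAVAS may differ.

```python
from typing import Iterable, List, NamedTuple


class Feature(NamedTuple):
    seqid: str
    type: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int    # 1-based, inclusive (GFF3 convention)
    strand: str
    attributes: str


def parse_gff3(lines: Iterable[str]) -> List[Feature]:
    """Parse GFF3 feature lines, skipping comments, directives and blank lines."""
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a feature line (for example, embedded sequence data)
        features.append(
            Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6], cols[8])
        )
    return features


def short_features(features: List[Feature],
                   feature_type: str = "exon",  # assumed type name
                   max_len: int = 9) -> List[Feature]:
    """Return features of the given type whose length is at most max_len bases."""
    return [
        f for f in features
        if f.type == feature_type and (f.end - f.start + 1) <= max_len
    ]
```

A possible use is to open the downloaded file, pass the file object to `parse_gff3`, and print the result of `short_features` as a checklist for manual inspection.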

The Overall Workflow of CpGAVAS

The input for CpGAVAS is a chloroplast DNA sequence, and the output includes the gene models in GFF3 format, a circular map image, analysis results and files for GenBank submission. The workflow is shown below.

Figure: Workflow for CpGAVAS
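
Because the workflow starts from a single chloroplast DNA sequence, a quick local check of the input file can catch obvious problems before upload. The sketch below is only an illustration and is not part of CpGAVAS. It assumes the sequence is stored as a single-record FASTA file, and the function names `read_single_fasta` and `summarize` are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


def read_single_fasta(text: str) -> Tuple[str, str]:
    """Return (header, sequence) from FASTA text holding exactly one record."""
    header = None
    parts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected exactly one sequence record")
            header = line[1:]
        else:
            if header is None:
                raise ValueError("sequence data found before a FASTA header")
            parts.append(line.upper())
    if header is None:
        raise ValueError("no FASTA header found")
    return header, "".join(parts)


def summarize(seq: str) -> Dict[str, object]:
    """Report length, GC fraction over A/C/G/T, and any symbols outside ACGTN."""
    counts = Counter(seq)
    gc = counts["G"] + counts["C"]
    acgt = gc + counts["A"] + counts["T"]
    return {
        "length": len(seq),
        "gc_fraction": gc / acgt if acgt else 0.0,
        "unexpected_symbols": sorted(set(seq) - set("ACGTN")),
    }
```

Symbols reported under `unexpected_symbols` are not necessarily errors (IUPAC ambiguity codes are valid in FASTA), but they are worth reviewing before annotation.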

Last updated: July 15th, 2016.
For questions and comments, please send email to or

Center for Bioinformatics
Institute of Medicinal Plant Development
Peking Union Medical College
Chinese Academy of Medical Sciences
Address: No. 151, Malianwa North Road, Haidian District, Beijing 100093, P.R. China