A genome annotation pipeline for whole-genome shotgun sequencing projects
Lai, J.J., Hsieh, W.H., Tsai, C.W., Yang, C.T., Chen, Y.T., Hsu, Y.H., Huang, Y.H., Fu, J.L., Liu, Y.F., Chang, Y.C., and Yang, U.C.
Bioinformatics Research Center, Bioinformatics Program, Institute of Biochemistry, Dept. of Life Sciences, National Yang-Ming University, Taipei, Taiwan
Most genomes have been sequenced by shotgun methods. Typically, these genomes are annotated only after finishing. Even at that stage, not all projects have functional annotation, which is essential for interpreting biological regulation in a genome. Obtaining functional annotation at an early stage of a shotgun sequencing project will thus accelerate the discovery process. Because the sequences are not yet fully assembled, gene prediction is not accurate at this stage, and the most effective approach is to use protein sequences from model organisms to annotate the new genome. Genome interpretation goes a step further by classifying components into pathways and families, so that the functional aspects of a genome can be addressed. We have designed a Genome Annotation and INterpretation (GAIN) system to present functional annotation derived from a model organism. Users can search the annotation by keyword or by BLAST analysis, and can also browse the functional classifications by pathway or by controlled vocabulary terms in the Gene Ontology. In addition, the system keeps track of the relations among annotations, raw reads, contigs, the end sequences of linking clones, and super-contigs. Once a sequence is retrieved from the system, users can launch analysis tools without cutting and pasting the sequence. This user-centric analysis environment is particularly useful for annotators. We have used the Ling-Chi (Ganoderma lucidum) sequencing project to test this system. Although Neurospora crassa is closely related to Ling-Chi, yeast is better annotated; we therefore demonstrate the GAIN system by using yeast to annotate Ling-Chi. A user-friendly web interface can be found at http://ymbc.ym.edu.tw/gl/lgp.htm.
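
The relations the system tracks among annotations, raw reads, contigs, end sequences of linking clones, and super-contigs can be pictured as a small linked data model. The following Python sketch is only an illustration under stated assumptions: the class names, field names, identifiers, and the `search` and `trace` methods are all hypothetical and do not describe the actual GAIN schema or implementation. It shows how a keyword hit on an annotation might be traced back to its contig, super-contig, reads, and linking clones.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Read:
    read_id: str
    # Linking clone whose end sequence this read represents, if any (assumed field).
    clone_id: Optional[str] = None


@dataclass
class Contig:
    contig_id: str
    read_ids: List[str] = field(default_factory=list)


@dataclass
class SuperContig:
    supercontig_id: str
    contig_ids: List[str] = field(default_factory=list)


@dataclass
class Annotation:
    contig_id: str
    model_protein: str  # identifier of the model-organism protein (placeholder values below)
    description: str


class AssemblyIndex:
    """Hypothetical in-memory index linking annotations to assembly objects."""

    def __init__(self) -> None:
        self.reads: Dict[str, Read] = {}
        self.contigs: Dict[str, Contig] = {}
        self.supercontigs: Dict[str, SuperContig] = {}
        self.annotations: List[Annotation] = []

    def search(self, keyword: str) -> List[Annotation]:
        """Case-insensitive keyword search over annotation descriptions."""
        needle = keyword.lower()
        return [a for a in self.annotations if needle in a.description.lower()]

    def trace(self, annotation: Annotation) -> dict:
        """Follow an annotation back to its contig, super-contig, reads, and clones."""
        contig = self.contigs[annotation.contig_id]
        reads = [self.reads[r] for r in contig.read_ids]
        parents = [
            s.supercontig_id
            for s in self.supercontigs.values()
            if contig.contig_id in s.contig_ids
        ]
        return {
            "contig": contig.contig_id,
            "supercontig": parents[0] if parents else None,
            "reads": [r.read_id for r in reads],
            "clones": sorted({r.clone_id for r in reads if r.clone_id}),
        }


def build_demo() -> AssemblyIndex:
    """Build a toy index; every identifier here is made up for illustration."""
    idx = AssemblyIndex()
    for read in (Read("r1", "cloneA"), Read("r2"), Read("r3", "cloneA")):
        idx.reads[read.read_id] = read
    idx.contigs["c1"] = Contig("c1", ["r1", "r2"])
    idx.contigs["c2"] = Contig("c2", ["r3"])
    idx.supercontigs["s1"] = SuperContig("s1", ["c1", "c2"])
    idx.annotations.append(
        Annotation("c1", "yeast_protein_1", "putative protein kinase")
    )
    return idx


if __name__ == "__main__":
    index = build_demo()
    for hit in index.search("kinase"):
        print(hit.model_protein, index.trace(hit))
```

Keeping the links as identifiers rather than nested objects is one possible design choice: it allows contigs and super-contigs to be rebuilt as assembly progresses while annotations remain attached by ID. A production system would more plausibly keep these relations in a database.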