Our EST clusters build on the NCBI Pinus taeda UniGene set (Build #11). Each UniGene cluster is intended to contain the mRNA transcripts transcribed from a single locus in the genome. Because of the clustering algorithm UniGene uses, a cluster may also contain transcripts from paralogs or other members of a gene family. Conversely, transcript isoforms derived from the same gene may be split across different UniGene clusters. We adopted a unique approach in our EST clustering:
(1) retrieve the EST component lists of all UniGene clusters from NCBI
(2) use our cleaned EST sequences, annotated with the cDNA termini that delimit transcript ends
(3) conduct EST clustering within each individual UniGene cluster using CAP3 to create consensus (contig) sequences
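
The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the pipeline actually used here: the function names, the in-memory data layout, and the example identifiers are all hypothetical, and the sketch assumes the UniGene membership has already been parsed into (cluster ID, EST accession) pairs and that a `cap3` executable is available on the PATH.

```python
import subprocess
from collections import defaultdict
from pathlib import Path


def group_ests_by_cluster(membership):
    """Map each UniGene cluster ID to the list of its EST accessions.

    `membership` is an iterable of (cluster_id, est_accession) pairs
    (hypothetical layout; the real UniGene files need parsing first).
    """
    clusters = defaultdict(list)
    for cluster_id, est_id in membership:
        clusters[cluster_id].append(est_id)
    return dict(clusters)


def write_cluster_fasta(cluster_id, est_ids, sequences, out_dir):
    """Write the cleaned sequences of one cluster to its own FASTA file.

    `sequences` maps EST accession -> cleaned sequence string.
    ESTs without a cleaned sequence are skipped.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fasta_path = out_dir / f"{cluster_id}.fasta"
    with open(fasta_path, "w") as handle:
        for est_id in est_ids:
            seq = sequences.get(est_id)
            if seq:
                handle.write(f">{est_id}\n{seq}\n")
    return fasta_path


def run_cap3(fasta_path, cap3_binary="cap3"):
    """Assemble one cluster with CAP3 using its default settings.

    Assumes `cap3` is installed. CAP3 prints its assembly report to
    stdout and writes contigs next to the input as <input>.cap.contigs.
    """
    with open(f"{fasta_path}.cap.log", "w") as log:
        subprocess.run([cap3_binary, str(fasta_path)], stdout=log, check=True)
    return Path(f"{fasta_path}.cap.contigs")
```

Running CAP3 separately on each cluster keeps every assembly small and confined to ESTs that UniGene has already grouped together, so paralogs or isoforms mixed into one UniGene cluster have a chance to separate into distinct contigs.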
