So far, EST terminus information has been largely ignored in the major public-domain EST databases. This is mainly because most sequence processing programs and tools focus on "cleaning" or "trimming" spurious sequences in ESTs, including vector fragments, insert-flanking restriction endonuclease recognition sites, adapter sequences and/or poly(A)/(T) tails. However, our recent research showed that, without inspecting cDNA terminus structures, conventional bioinformatics pipelines run into problems when they process many raw EST trace files. As a result, they can produce unclean or under-trimmed EST sequences, which can have cascading, deleterious effects on the many downstream EST applications that use these sequences. On the other hand, many public EST sequences have been over-trimmed with regard to their terminus structures and have lost the directional, positional and structural information of the mRNA 3'/5' ends. It is clear to us that many GenBank EST sequences are either under-trimmed or over-trimmed. Using the following web interfaces, you can find out how many public sequences are incorrectly trimmed.
