3'-UTR SIRF: A database for identifying clusters of short interspersed repeats in 3' untranslated regions
Date
2007
Authors
Andken, Benjamin B.
Lim, In
Benson, Gary
Vincent, John
Ferenc, Matthew
Heinrich, Bianca
Man, Heng-Ye
Deshler, James
Version
OA Version
Citation
2007. "3'-UTR SIRF: A database for identifying clusters of short
interspersed repeats in 3' untranslated regions." BMC Bioinformatics, vol. 8, issue 1.
Abstract
BACKGROUND: Short (~5 nucleotides) interspersed repeats regulate several
aspects of post-transcriptional gene expression. Previously we developed an algorithm
(REPFIND) that assigns P-values to all repeated motifs in a given nucleic acid sequence and
reliably identifies clusters of short CAC-containing motifs required for mRNA localization
in Xenopus oocytes.
DESCRIPTION: To facilitate the identification of mammalian genes possessing
clusters of repeats that regulate post-transcriptional aspects of gene expression,
we used REPFIND to create a database of all repeated motifs in the 3'
untranslated regions (UTRs) of genes from the Mammalian Gene Collection (MGC). The MGC
database includes seven vertebrate species: human, cow, rat, mouse and three non-mammalian
vertebrate species. A web-based application was developed to search this database of
repeated motifs to generate species-specific lists of genes containing specific classes of
repeats in their 3'-UTRs. This computational tool is called 3'-UTR SIRF (Short Interspersed
Repeat Finder), and it reveals that hundreds of human genes contain an abundance of short
CAC-rich and CAG-rich repeats in their 3'-UTRs that are similar to those found in mRNAs
localized to the neurites of neurons. We tested four candidate mRNAs for localization in rat
hippocampal neurons by in situ hybridization. Our results show that two candidate CAC-rich
(Syntaxin 1B and Tubulin beta4) and two candidate CAG-rich (Sec61alpha and Syntaxin 1A)
mRNAs are localized to distal neurites, whereas two control mRNAs lacking repeated motifs in
their 3'-UTRs remain primarily in the cell body.
CONCLUSION: Computational data generated with
3'-UTR SIRF indicate that hundreds of mammalian genes have an abundance of short
CA-containing motifs that may direct mRNA localization in neurons. In situ hybridization
shows that four candidate mRNAs are localized to distal neurites of cultured hippocampal
neurons. These data suggest that short CA-containing motifs may be part of a widely utilized
genetic code that regulates mRNA localization in vertebrate cells. The use of 3'-UTR SIRF to
search for new classes of motifs that regulate other aspects of gene expression should yield
important information in future studies addressing cis-regulatory information located in
3'-UTRs.
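As a rough illustration of the kind of motif search described in the abstract, the following Python sketch counts short motifs in a sequence and locates the window where one motif is most densely clustered. It is not REPFIND and does not compute P-values. The function names, the toy sequence, and the window size are all assumptions made for this example only.

```python
from collections import Counter


def normalize(seq):
    """Uppercase the sequence and treat RNA 'U' as DNA 'T'."""
    return seq.upper().replace("U", "T")


def count_kmers(seq, k=3):
    """Count every overlapping k-mer in seq."""
    seq = normalize(seq)
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def motif_positions(seq, motif):
    """Return 0-based start positions of all overlapping motif occurrences."""
    seq, motif = normalize(seq), normalize(motif)
    m = len(motif)
    return [i for i in range(len(seq) - m + 1) if seq[i:i + m] == motif]


def densest_window(positions, window):
    """Return (start, count) for the span of `window` nucleotides that
    contains the most motif start positions; (None, 0) if there are none."""
    best_start, best_count = None, 0
    left = 0
    for right, pos in enumerate(positions):
        # Shrink from the left until all starts fit inside one window.
        while pos - positions[left] >= window:
            left += 1
        count = right - left + 1
        if count > best_count:
            best_start, best_count = positions[left], count
    return best_start, best_count


if __name__ == "__main__":
    # Toy sequence invented for illustration; not taken from the MGC.
    utr = "GGCACACTTCACGGCAC" + "T" * 10 + "CAG"
    cac = motif_positions(utr, "CAC")
    print("CAC starts:", cac)
    print("Densest 10-nt window:", densest_window(cac, 10))
```

A real analysis would also need a statistical model of the background nucleotide composition to decide whether a cluster is more frequent than expected by chance, which is the role the abstract attributes to REPFIND's P-values.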