http://tandem.bu.edu

Bagel is a new AJAX web browser that lets you view a database of existing annotation tracks as well as upload your own.

Tandem Repeats Finder is an application for finding tandem repeats in DNA sequences. It employs a stochastic model of repeats and associated statistical detection criteria. It is extremely fast and thorough, and is now regularly used to analyze new genomic sequences (see the UCSC Golden Path human genome browser for example).

Tandem repeats are ubiquitous sequence features in both prokaryotic and eukaryotic genomes. In humans, they are known to cause at least ten inherited neurological diseases, including fragile-X mental retardation, Huntington's disease, and myotonic dystrophy, and they are associated with a number of other major illnesses, including diabetes, epilepsy, and ovarian and other cancers. Additionally, they are the basis of DNA fingerprinting and have recently been used to discriminate between different bacterial strains, including anthrax strains.

Composition Alignment or, more whimsically, scrambled alignment, employs the mechanisms of string matching and string comparison yet avoids the overdependence of those methods on position-by-position matching. In composition alignment, we extend the matching concept to composition matching. Two strings have a composition match if their lengths are equal and they have the same nucleotide content.
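The definition of a composition match given above can be checked directly. The following is a minimal sketch of that definition only, not of the Composition Alignment algorithm itself; the function name `composition_match` is a hypothetical choice for illustration.

```python
from collections import Counter


def composition_match(s: str, t: str) -> bool:
    """Return True if s and t have equal length and the same nucleotide content.

    Hypothetical helper: it implements only the match definition,
    not the alignment procedure built on top of it.
    """
    return len(s) == len(t) and Counter(s) == Counter(t)


print(composition_match("ACGT", "TGCA"))  # True: same letters, different order
print(composition_match("AACG", "ACGG"))  # False: A and G counts differ
```

Because only the counts are compared, any permutation of a string matches it, which is what removes the dependence on position-by-position matching.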

Tandem Repeats Database (TRDB) is a public repository of information about tandem repeats in multiple genomes. Additionally, it provides a private workspace for researchers to use the Tandem Repeats Finder with added features. For example, it can store all of the repeats found in a sequence (as well as the original sequence) in a convenient manner for later processing. The repeats are organized into sets. A variety of tools are available for additional processing: 1) clustering repeats into families, 2) predicting copy number polymorphism based on sequence characteristics, 3) annotation of repeats and families, 4) advanced filtering capabilities, 5) data visualization, and 6) data downloading.

Inverted Repeats Finder is an application very similar to the Tandem Repeats Finder. It searches for inverted repeats. We have set up an Inverted Repeats Database (IRDB) web interface where you can try it out (you must register; it cannot be run through a guest account). It can store repeats and sequences and has many of the filtering capabilities of TRDB. There is also a convenient graphical interface for viewing alignments.

RNA Fold Support is a tool that can be used to evaluate a predicted RNA structure based on the mutations present in a given number of alleles.

FrameSplitter is a tool to identify conserved codons where the amino acid permits 4-fold degeneracy on the third nucleotide.
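To illustrate what 4-fold degeneracy at the third nucleotide means, here is a small sketch that flags codons belonging to the eight 4-fold degenerate families of the standard genetic code. It assumes an in-frame coding sequence and does not attempt the conservation analysis; the names `FOURFOLD_PREFIXES` and `fourfold_codon_indices` are hypothetical and are not taken from FrameSplitter.

```python
# First two bases of the eight codon families that are 4-fold degenerate
# in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_codon_indices(cds):
    """Return 0-based indices of codons whose third position is 4-fold degenerate.

    Assumes `cds` starts in frame; a trailing partial codon is ignored.
    """
    cds = cds.upper()
    hits = []
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 2] in FOURFOLD_PREFIXES:
            hits.append(i // 3)
    return hits


# Codons: ATG GCT TTT CCG TAA
print(fourfold_codon_indices("ATGGCTTTTCCGTAA"))  # [1, 3]
```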

History is a program that accepts one or more repeats in the TRF .dat file format and reconstructs duplication histories as described in [G. Benson and L. Dong, Reconstructing the Duplication History of a Tandem Repeat, Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology (ISMB-99)]. The output is a value, HistoryR, which can be used to predict length polymorphism as reported in [F. Denoeud, G. Vergnaud and G. Benson, Predicting Human Minisatellite Polymorphism, 2002 (submitted for publication)]. This feature is also available in TRDB.

Mutation Master accepts a multiple alignment data file (.MSF) and rapidly provides a visual display and tabulation of the site, frequency, number, and likelihood of point mutations. Alignment of many related sequences from viral or bacterial quasi-species can reveal important information about proteins, RNA, and DNA, including changes that correlate with pathogenicity, drug susceptibility, and sequence structure. Extracting this information manually from multiple alignments is often difficult, especially when a large number of long sequences are involved. Analysis of hepatitis C virus (HCV) protein sequences using Mutation Master has identified possible sites of amino acid structural interaction, and has revealed that ARFP, a novel protein encoded in an overlapping reading frame, is as conserved as conventional HCV proteins. See the paper J. Walewski, J. Gutierrez, W. Branch-Elliman, D. Stump, T. Keller, A. Rodriguez, G. Benson, and A. Branch. Mutation Master: Profiles of Substitutions in Hepatitis C Virus RNA of the Core, Alternate Reading Frame and NS2 Coding Regions, RNA 8:557-571, 2002.

Ongoing work includes developing similar tools to analyze RNA multiple alignments for structural clues including compensatory mutations.

Sequence Alignment Tool is a web interface to our sequence alignment library. You can submit two sequences and align them using various parameters and alignment algorithms. Composition Sequence Alignment Tool is a more composition-oriented alignment tool.

Sequence Fraction Tool is useful to cut out a fraction of a sequence at a specified base position.

1st Order Markov Chain Tool is a tool to generate a probability file based on a submitted DNA sequence. That file can then be used to generate a compatible Random Sequence using the next tool.

Random Sequence Generator will give you one or more random sequences of any size.
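The idea behind the two tools above can be sketched as follows: estimate first-order transition probabilities from a DNA sequence, then sample a new sequence from them. This is only an illustration under stated assumptions. The format of the tools' probability file is not described here, so the sketch keeps the probabilities in a plain dictionary, and the names `transition_probabilities` and `generate` are hypothetical.

```python
import random
from collections import defaultdict


def transition_probabilities(seq):
    """Estimate P(next base | current base) from adjacent pairs in seq."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    probs = {}
    for a, followers in counts.items():
        total = sum(followers.values())
        probs[a] = {b: c / total for b, c in followers.items()}
    return probs


def generate(probs, length, start, rng=None):
    """Sample a sequence of up to `length` bases, beginning with `start`.

    Stops early if the current base was never seen with a successor.
    """
    if length <= 0:
        return ""
    rng = rng or random.Random()
    out = [start]
    while len(out) < length:
        row = probs.get(out[-1])
        if not row:
            break
        symbols = list(row)
        weights = [row[s] for s in symbols]
        out.append(rng.choices(symbols, weights=weights, k=1)[0])
    return "".join(out)


model = transition_probabilities("ACGTACGA")
print(model["G"])  # {'T': 0.5, 'A': 0.5}
print(generate(model, 20, "A", random.Random(0)))
```

Passing a seeded `random.Random` makes the output reproducible, which is convenient when the generated sequence serves as a control.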

RNKP is a tool to calculate the probability of a run of K heads in a string of length N, with the probability of a head set to P.
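One common way to compute such a probability is a small dynamic program over the length of the current trailing run of heads. The sketch below assumes the quantity of interest is the probability of at least one run of K or more consecutive heads in N independent flips; whether RNKP uses this exact definition or this method is not stated here, and the function name `prob_run` is hypothetical.

```python
def prob_run(n, k, p):
    """Probability of at least one run of >= k heads in n flips with P(head) = p."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # state[j] = probability that no run of k has occurred yet and the
    # sequence currently ends in exactly j consecutive heads (0 <= j < k).
    state = [0.0] * k
    state[0] = 1.0
    hit = 0.0  # probability mass that has already reached a run of k
    for _ in range(n):
        nxt = [0.0] * k
        for j, mass in enumerate(state):
            nxt[0] += mass * (1.0 - p)   # a tail resets the run
            if j + 1 == k:
                hit += mass * p          # a head completes a run of k
            else:
                nxt[j + 1] += mass * p   # a head extends the run
        state = nxt
    return hit


print(prob_run(3, 2, 0.5))  # 0.375 (HHH, HHT, THH out of 8 outcomes)
```

The running time is proportional to N times K, so the computation stays cheap even for long strings.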

Synthetic Sequence Creator Tool is useful to create a list of FASTA sequences with planted homologous pairs.

K'nex DNA Models is an educational website on how to build your own DNA models. It is targeted at students in middle school, high school, and college.


  Quick Links:

Composition Alignment: download | web | sim
TRDB | IRDB
TRF : download | web
IRF : download
FHub | CHub
K'nex DNA Models

  Latest News:

Vntrseek version 1.05 is now available for download.
TRF version 4.07b is now available for download.
IRF command line version is now available for download.

  Page last updated: 11/29/12