Using Tandem Repeats Finder on the Command Line

Once the program is installed you can run it with no parameters to obtain information on proper usage syntax. If you installed the program as trf then by typing trf on the command line you will see the following output:


Please use: trf File Match Mismatch Delta PM PI Minscore MaxPeriod [options]

Where: (all weights, penalties, and scores are positive)
  File = sequences input file
  Match  = matching weight
  Mismatch  = mismatching penalty
  Delta = indel penalty
  PM = match probability (whole number)
  PI = indel probability (whole number)
  Minscore = minimum alignment score to report
  MaxPeriod = maximum period size to report
  [options] = one or more of the following:
        -m        masked sequence file
        -f        flanking sequence
        -d        data file
        -h        suppress html output
        -r        no redundancy elimination
        -l <n>    maximum TR length expected (in millions) (eg, -l 3 or -l=3 for 3 million)
        -ngs      more compact .dat output on multisequence files, returns 0 on success.
                  Output is printed to the screen, not a file. You may pipe input in with
                  this option using - for file name. Short 50 flanks are appended to .dat
                  output.

Note the sequence file should be in FASTA format:

>Name of sequence
aggaaacctgccatggcctcctggtgagctgtcctcatccactgctcgctgcctctccag
atactctgacccatggatcccctgggtgcagccaagccacaatggccatggcgccgctgt
actcccacccgccccaccctcctgatcctgctatggacatggcctttccacatccctgtg

Program Parameters:

Using recommended parameters the command line will look something like:

trf yoursequence.txt 2 7 7 80 10 50 500 -f -d -m

Once the program starts running it will print update messages to the screen. The word "Done" will be printed when the program finishes.

For single sequence input files there will be at least two HTML format output files, a repeat table file and an alignment file. If the number of repeats found is greater than 120, multiple linked repeat tables are produced. The links to the other tables appear at the top and the bottom of each table. To view the results start by opening the first repeat table file with your web browser. This file has the extension ".1.html". Alignment files can be accessed from the repeat table files. Alignment files end with the ".txt.html" extension.

For input files containing multiple sequences a summary page is produced that links to the output of individual sequences. This file has the extension "summary.html". You should start by opening this file if your input had multiple sequences in the same file. Also note that the output files of individual sequences will have an identifier of the form ".sn." ( n an integer) embedded in the name indicating the index of the sequence in the input file. The identifier is omitted for single sequence input files.

For more information on the output please see Table Explanation and Alignment Explanation.


Last revised October 20, 2022
Send any questions or comments to:
Gary Benson