What are the sources of data?
Unless otherwise stated, all raw data is from genome.ucsc.edu.
With what parameters was the public data processed?
Unless otherwise stated, the TRF (Tandem Repeats Finder) parameters used were 2,7,7,50,80,10,2000.
I processed a FASTA file with multiple sequences in it. How do I know which sequence the repeat came from?
A checkbox labeled "Group By Sequences" sits right under the filter and above the ordering text box. When it is checked, repeats are grouped by their source sequences, and a gray header with sequence information appears above each group.
Can I somehow filter out the repeats that came from particular sequences?
Once "Group By Sequences" is checked, uncheck the checkboxes in the headers of the sequence groups you want to keep, then use the "sequence checkboxes" + "checked/unchecked" filter combination to get rid of the repeats you don't want. Example: the "sequence checkboxes" + "unchecked" combination keeps only the repeats from sequences whose header checkboxes are unchecked.
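The keep/discard logic of that filter can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the site's actual code: the Repeat record, its field names, and the filter_by_checkbox function are hypothetical, and the sketch assumes every sequence checkbox starts out checked.

```python
from dataclasses import dataclass


@dataclass
class Repeat:
    # Hypothetical record; the field names are illustrative only.
    sequence: str  # header of the source FASTA sequence
    start: int
    end: int


def filter_by_checkbox(repeats, checked_sequences, keep="unchecked"):
    """Keep the repeats whose source-sequence checkbox is in the `keep` state."""
    if keep not in ("checked", "unchecked"):
        raise ValueError("keep must be 'checked' or 'unchecked'")
    want_checked = keep == "checked"
    return [r for r in repeats if (r.sequence in checked_sequences) == want_checked]


repeats = [
    Repeat("chr1", 100, 150),
    Repeat("chr2", 20, 60),
    Repeat("chr1", 900, 960),
]

# Assumption: all boxes start checked and the user unchecked chr1,
# so chr2 is the only sequence still checked.
checked = {"chr2"}

kept = filter_by_checkbox(repeats, checked, keep="unchecked")
print([(r.sequence, r.start, r.end) for r in kept])
```

Passing keep="checked" instead would keep the chr2 repeat and drop the two chr1 repeats.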
After I merge two sets, is there a way to tell which set a certain repeat came from?
Yes, use the "RUN ID" field to tell them apart.
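If the merged set is handled outside the site, the same idea applies: group the rows by their run ID. The sketch below is a hypothetical illustration; the dictionary layout, the "run_id" key name, and the ID values are assumptions, not the site's export format.

```python
from collections import defaultdict

# Hypothetical rows from two sets; each row carries the run ID of its set.
set_a = [
    {"run_id": 101, "sequence": "chr1", "start": 100},
    {"run_id": 101, "sequence": "chr1", "start": 900},
]
set_b = [
    {"run_id": 202, "sequence": "chr2", "start": 20},
]

# Merging is modeled here as simple concatenation of the rows.
merged = set_a + set_b

# Group the merged rows by run ID to recover which set each repeat came from.
by_run = defaultdict(list)
for row in merged:
    by_run[row["run_id"]].append(row)

for run_id, rows in sorted(by_run.items()):
    print(run_id, len(rows))
```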