Basic Introduction

[TDF is still under development; use it at your own risk.]

Method

TDF

Triplex Domain Finder (TDF) characterizes the triple helix forming potential of RNA and DNA regions. Our method tests whether particular regions of candidate lncRNAs are likely to form DNA binding domains (DBDs) with respect to potential target DNA regions (e.g. promoters of genes that are differentially regulated after knockdown of the lncRNA). Moreover, the DNA binding sites associated with the predicted DBDs are used to indicate potential target DNA regions, e.g. genes with high binding site coverage in their promoters. TDF provides two distinct statistical tests:
  1)  promoter test – used to evaluate the triple helix potential in promoters
  2)  genomic region test – used to test triple helix potential in a given set of genomic regions
The command line tool presents its results in a user-friendly, graphical HTML interface.

Download & Installation

Installation of TDF

If you have followed the generic instructions for the RGT suite installation, then you can start using TDF.

If you have any questions, comments, installation problems or bug reports, please access our discussion group.

Further installation instructions, including installation without pip, are found here.

Configure genome sequence and annotation data

Before running a test, the genome sequence and annotation data must be configured. On Mac OS, wget must be installed first (see complete installation).

cd ~/rgtdata
python setupGenomicData.py --mm9
python setupGenomicData.py --hg19

Note that this operation takes a few minutes to complete and downloads the genome FASTA files.

Example Promoter Test

This example describes the steps needed to find potential DNA binding domains of FENDRR in the promoter regions of genes that are up- or down-regulated after FENDRR siRNA, as presented in the TDF publication (to come).

To run TDF with the promoter test, you need as input:

  • the RNA sequence in FASTA format (What is FASTA?)
  • the gene list, given as gene symbols or Ensembl IDs

You can download the data for running TDF on the FENDRR example here. The FENDRR_mm9 directory contains two files:

  • FENDRR.fasta: The sequence of FENDRR in FASTA format.
  • fendrr_gene_list.txt: A list of genes UP/DOWN regulated after FENDRR siRNA.
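Before starting a long run, it can help to check that both input files parse as expected. The following is a minimal Python sketch, not part of TDF. The helper names read_fasta and read_gene_list are hypothetical, and the gene list layout (one gene per line, identifier in the first whitespace-separated column) is an assumption that should be checked against the downloaded file.

```python
from pathlib import Path


def read_fasta(path):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; sequence lines that follow belong to it.
            header = line[1:]
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def read_gene_list(path):
    """Return gene identifiers from a gene list file.

    Assumption (not confirmed by the TDF documentation): one gene per line,
    with the identifier in the first whitespace-separated column.
    """
    genes = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if fields:
            genes.append(fields[0])
    return genes
```

For example, len(read_fasta("FENDRR.fasta")) should be 1 if the file holds a single RNA, and len(read_gene_list("fendrr_gene_list.txt")) gives the number of genes that will be passed to the test.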

Run promoter test

Go to the directory with these files (FENDRR_mm9/). The promoter test can be executed with the following command:

rgt-TDF promotertest -r FENDRR.fasta -de fendrr_gene_list.txt -organism mm9 -rn FENDRR -o promoter_test/FENDRR/

After the command finishes, all result files are in the output directory (promoter_test/FENDRR/), including the HTML report promoter_test/FENDRR/index.html (see result webpage here). Complete usage instructions and more descriptive examples are found here.
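If you want to script the promoter test, for instance to run it for several RNAs, the command above can be assembled in Python. This is a minimal sketch that uses only the flags shown in the command above. The function names promoter_test_command and run_if_available are hypothetical helpers, not part of RGT.

```python
import shlex
import shutil
import subprocess


def promoter_test_command(rna_fasta, gene_list, organism, rna_name, out_dir):
    """Build the argument list for the promoter test command shown above."""
    return [
        "rgt-TDF", "promotertest",
        "-r", rna_fasta,        # RNA sequence in FASTA format
        "-de", gene_list,       # list of differentially regulated genes
        "-organism", organism,  # genome configured beforehand, e.g. mm9
        "-rn", rna_name,        # RNA name
        "-o", out_dir,          # output directory for the results
    ]


def run_if_available(command):
    """Run the command if its executable is on PATH.

    Returns the exit code, or None when the executable is not installed.
    """
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, check=False).returncode


command = promoter_test_command(
    "FENDRR.fasta", "fendrr_gene_list.txt", "mm9", "FENDRR",
    "promoter_test/FENDRR/",
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(command))
```

Calling run_if_available(command) from the FENDRR_mm9/ directory then starts the test. Passing the arguments as a list avoids shell quoting problems with unusual file names.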

Example Genomic Region Test

The genomic region test uses any set of genomic locations as target regions. This test should be used when target DNA regions are indicated by functional studies, such as ChIRP-Seq, CHART-Seq or ChIP-Seq.

The genomic region test requires two files: an RNA sequence in FASTA format and a set of genomic regions of interest in BED format.

Before running the test, please make sure you configure the hg19 genome data (here).

We demonstrate the genomic region test with the following files for TERC in hg19, taken from the example data obtained here:

  1. terc.fasta: RNA sequence of TERC in FASTA format.
  2. terc_peaks.bed: Target regions of TERC.
  3. Nregions_hg19.bed: The regions in the genome with “N” letters for masking in randomization.
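Because the region test runs for hours, a quick sanity check of the BED inputs can save a failed run. The sketch below is not part of TDF. It reads only the first three BED columns (chromosome, 0-based start, end) and rejects malformed intervals. The function name read_bed is a hypothetical helper.

```python
def read_bed(path):
    """Parse the first three BED columns into (chrom, start, end) tuples."""
    regions = []
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            line = line.strip()
            # Skip blank lines and the optional header lines found in some BED files.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            if len(fields) < 3:
                raise ValueError(f"line {number}: expected at least 3 columns")
            chrom, start, end = fields[0], int(fields[1]), int(fields[2])
            if start < 0 or end < start:
                raise ValueError(f"line {number}: invalid interval {start}-{end}")
            regions.append((chrom, start, end))
    return regions


def total_length(regions):
    """Sum of interval lengths in base pairs (overlaps are counted twice)."""
    return sum(end - start for _, start, end in regions)
```

For example, len(read_bed("terc_peaks.bed")) gives the number of target regions, and total_length(...) gives a rough idea of how much sequence they cover.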

Run genomic region test

Go to the directory with the files and execute the following command to run the test (it takes about 3 to 4 hours because the default is 10,000 randomizations):

rgt-TDF regiontest -r terc.fasta -bed terc_peaks.bed -rn TERC -f Nregions_hg19.bed -organism hg19 -o genomic_region_test/TERC

All results and graphics are then stored in the output directory (genomic_region_test/TERC); see the example results page here.
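To give an intuition for why the number of randomizations matters, the sketch below shows how an empirical p-value is commonly derived from randomized comparisons. This is a generic Monte Carlo illustration and not TDF's actual implementation: the function name empirical_p_value, the use of binding site counts as the statistic, and the +1 correction are all assumptions made for this example.

```python
def empirical_p_value(observed, randomized):
    """Generic Monte Carlo p-value (illustration only, not TDF's code).

    observed:   statistic measured on the real target regions
    randomized: the same statistic measured on each randomized region set

    The +1 in numerator and denominator is a common convention that keeps
    the estimate above zero; whether TDF applies it is not stated in this
    document.
    """
    at_least_as_extreme = sum(1 for value in randomized if value >= observed)
    return (at_least_as_extreme + 1) / (len(randomized) + 1)
```

Under this convention the smallest attainable p-value with n randomizations is 1 / (n + 1), so fewer randomizations shorten the run time but also limit how small a p-value can be reported.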

Complete usage instructions and more descriptive examples are found in the tutorial for the genomic region test.

Citation

S. Hanzelmann, C.C. Kuo, M. Kalwa, W. Wagner, I.G. Costa, Triplex Domain Finder: Detection of Triple Helix Binding Domains in Long Non-Coding RNAs, bioRxiv, 2016 [paper].