Tool Usage

TDF includes the following submodes:

  • promotertest: Promoter test evaluates the association between the given lncRNA and the target promoters.
  • regiontest: Genomic region test evaluates the association between the given lncRNA and the target regions by randomization.
  • get_ttss: Get TTSs in BED format from a single BED file.
  • integrate: Integrate the project’s links and generate project-level statistics.

We first introduce the parameters common to the two main tests (promoter test and genomic region test), then describe their test-specific parameters. The remaining submodes, get_ttss and integrate, are introduced afterward.

Common Inputs for both tests

TDF can be executed with the following command:

rgt-TDF {promotertest,regiontest} [required inputs] [options]

Where:

  • {promotertest,regiontest}: Selects the test to apply, either the promoter test or the genomic region test.
  • [required inputs]: Required input files and paths.
  • [options]: Additional input parameters or output options.

The inputs common to both tests are shown below:

Required Input for both tests

Option Name Type Description
-h, --help Show the help message and exit
 -r  PATH Input file name for RNA sequence (in fasta format)
 -rn  String Define the RNA name
 -o  PATH Output directory name for all the results and temporary files
 -organism  String Define the organism (hg19, hg38, mm9, mm10… etc)

Options

Option Name Type Default Description
 -t  String  RNA name Define the title name for the results under the Output name.
 -a  Float  0.05 Define the significance level for rejecting the null hypothesis
 -ccf  Integer  20 Define the cut off value for promoter counts
 -rt  Boolean  False Remove temporary files (fa, txp…etc)
 -log  Boolean  False Set the plots in log scale
 -ac  PATH  None Input file for RNA accessibility
 -accf  Integer  500 Define the cut off value for RNA accessibility
 -obed  Boolean  False Output the BED files for DNA binding sites.
 -showpa  Boolean  False Show parallel and antiparallel bindings in the plot separately.
 -filter_havana  Boolean  False Apply filtering to remove HAVANA entries.
 -protein_coding  Boolean  False Apply filtering to get only protein coding genes.
 -known_only  Boolean  False Apply filtering to get only known genes.
 -nofile  Boolean  False Don’t save any files in the output folder, except the statistics.

Options for TRIPLEXES

The arguments of the TRIPLEXES can be adjusted by the options below.

Option Name Type Default Description
 -l  Integer  20 Define the minimum length of triplex
 -e  Integer  20 Set the maximal error-rate in % tolerated
 -c  Integer  2 Sets the tolerated number of consecutive errors with respect to the canonical triplex rules, as such errors were found to greatly destabilize triplexes in vitro.
 -fr  String  off Activates the filtering of low complexity regions and repeats in the sequence data
 -fm  Integer  0 Method to quickly discard non-hits: 0 = greedy approach; 1 = q-gram filtering.
 -of  Integer  1 Define output formats of Triplexator
 -mf  Boolean  False Merge overlapping features into a cluster and report the spanning region.
 -rm  Boolean  False Set the multiprocessing
 -par  String  False Define other parameters for TRIPLEXES. Omit the first "-" and replace each space with an underscore. For example, to add "-G 80 -g 20", use "-par G_80_-g_20".
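
The encoding rule for -par can be sketched as a small shell helper. The function name to_par is hypothetical and not part of TDF; it only applies the two steps described above (drop the first "-", replace spaces with underscores) and prints the resulting option.

```shell
# Hypothetical helper (not part of TDF): convert raw TRIPLEXES arguments
# into the form expected by -par.
to_par() {
    # Drop the first "-" and replace every space with an underscore.
    printf '%s\n' "$1" | sed -e 's/^-//' -e 's/ /_/g'
}

par_value=$(to_par "-G 80 -g 20")

# Print the option as it would be appended to an rgt-TDF command line.
printf '%s\n' "-par $par_value"
```

For the input "-G 80 -g 20" this yields the same value as the example in the table, G_80_-g_20.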

Particular Inputs for promoter test

Required Input for promoter test

The target promoters can be defined in two ways:

  1. A gene list, which contains gene symbols or Ensembl IDs, one gene per line in plain text format. The argument, -de, should be used;
  2. Two BED files containing the regions of target promoters and non-target promoters (background). Two arguments, -bed and -bg, should be used together.

Option Name Type Description
 -de  PATH Input file for gene list (gene symbols or Ensembl ID)
 -bed  PATH Input BED file of the promoter regions of genes
 -bg  PATH Input BED file of the promoter regions of background genes
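
As a sketch, the two ways of defining target promoters correspond to the two command lines below. All file names, the RNA name and the output directories are hypothetical placeholders; only the flags are taken from the tables in this section. The commands are printed rather than executed (a dry run), so nothing runs until you paste them into a shell where rgt-TDF is installed.

```shell
# Hypothetical inputs; replace with your own files and names.
rna="lncRNA.fa"      # RNA sequence in FASTA format (-r)
name="myRNA"         # RNA name (-rn)
organism="hg38"      # organism (-organism)

# Way 1: a gene list (gene symbols or Ensembl IDs, one per line) via -de.
cmd_de="rgt-TDF promotertest -r $rna -rn $name -o out_de -organism $organism -de genes.txt"

# Way 2: target and background promoter regions via -bed and -bg together.
cmd_bed="rgt-TDF promotertest -r $rna -rn $name -o out_bed -organism $organism -bed targets.bed -bg background.bed"

# Dry run: print the commands instead of executing them.
echo "$cmd_de"
echo "$cmd_bed"
```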

Options for promoter test

Option Name Type Default Description
 -pl  Integer  1000 Define the promoter length
 -score  Boolean  False Load score column from input gene list or BED file for analysis.
 -scoreh  Boolean  False Use the header of scores from the given gene list or BED file.

Particular Inputs for region set test

Required Input for region set test

Option Name Type Description
 -bed  PATH Input BED file for interesting regions on DNA

Options for region set test

Option Name Type Default Description
 -mp  Integer  0 Define the number of threads for multiprocessing.
 -n  Integer  10000 Number of iterations (randomization)
 -f  PATH  None Input BED file as mask for randomization
 -score  Boolean  False Load score column from input BED file
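
A region test invocation might be assembled as in the sketch below. The file names, RNA name, output directory and the values chosen for -n and -mp are illustrative assumptions; the flags themselves come from the tables above. The command is only printed, not executed.

```shell
# Hypothetical inputs; replace with your own files and values.
# Required inputs (-r, -rn, -o, -organism, -bed) plus two options:
# -n sets the number of randomizations, -mp the number of threads.
set -- rgt-TDF regiontest \
    -r lncRNA.fa -rn myRNA -o out_region -organism mm10 \
    -bed regions.bed \
    -n 1000 \
    -mp 4

# Dry run: print the assembled command. Once the inputs exist,
# the same argument list can be executed with "$@".
region_cmd="$*"
echo "$region_cmd"
```

Building the argument list with set -- keeps each argument separate, so paths containing spaces would still be passed correctly when the list is executed.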

 

get_ttss

Get the TTSs of the given RNA sequence from a single BED file.

rgt-TDF get_ttss [options]

Option Name Type Default Description
 -h, --help show this help message and exit
-i  PATH Input BED file of the target regions
 -tts  PATH Output BED file of the TTSs
 -tfo  PATH Output BED file of the TFOs
 -r  PATH Input FASTA file of the RNA
 -organism  String Define the organism
 -l  Integer  20 [Triplexes] Define the minimum length of triplex
 -e  Integer  20 [Triplexes] Set the maximal error-rate in % tolerated
 -c  Integer  2 [Triplexes] Sets the tolerated number of consecutive errors with respect to the canonical triplex rules as such were found to greatly destabilize triplexes in vitro
 -fr  on/off  off [Triplexes] Activates the filtering of low complexity regions and repeats in the sequence data
 -fm  Integer  0 [Triplexes] Method to quickly discard non-hits: 0 = greedy approach; 1 = q-gram filtering.
 -of  Integer  1 [Triplexes] Define output formats of Triplexes
 -mf  Boolean  False [Triplexes] Merge overlapping features into a cluster and report the spanning region.
 -rm  Integer  1 [Triplexes] Set the multiprocessing
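
A minimal sketch of a get_ttss call, using only options from the table above. The file names and the minimum triplex length of 15 are hypothetical choices for illustration; the command is printed rather than executed.

```shell
# Hypothetical inputs and outputs; replace with your own paths.
targets="targets.bed"    # input BED file of the target regions (-i)
ttss_out="ttss.bed"      # output BED file of the TTSs (-tts)
rna="lncRNA.fa"          # input FASTA file of the RNA (-r)

# -l overrides the default minimum triplex length (illustrative value).
ttss_cmd="rgt-TDF get_ttss -i $targets -tts $ttss_out -r $rna -organism hg38 -l 15"

# Dry run: print the command instead of executing it.
echo "$ttss_cmd"
```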

integrate

Integrate the project’s links and generate project-level statistics.

rgt-TDF integrate [options]

Option Name Type Default Description
 -h, --help show this help message and exit
-path  PATH Define the path of the project.
 -exp  PATH Include expression score for ranking.