Tool Usage

HINT can be executed with the following command:

rgt-hint [analysis] [options] <input_files>

Where:
[analysis]: The type of analysis to perform, including footprinting, differential, tracks…

Footprinting

HINT now supports footprinting for different protocols. Run this command to see all available options:

rgt-hint footprinting --help

Protocols

Protocol Name   Type      Default   Description
--atac-seq      Boolean   False     If set, footprint calling will be executed based on ATAC-seq model
--dnase-seq     Boolean   False     If set, footprint calling will be executed based on DNase-seq model
--histone       Boolean   False     If set, footprint calling will be executed based on histone modification model

Options

Option Name         Type              Default               Description
--organism          String            hg19                  The organism on which the analysis is performed. All default files, such as genomes, are based on the chosen organism and the data.config file. For more information, see the documentation on rgtdata and the data.config file.
--hmm-file          FILE              Default HMM           List of HMM files separated by commas. If only one file is given, that HMM is applied to all histone signals; otherwise, the list must contain as many files as there are histone files. The order of the list should match the order of the histones in the input_matrix file. If the argument is not given, a default HMM is used.
--bias-table        FILE1_F,FILE1_R   Default bias tables   List of files with all possible k-mers (for any k) and their bias estimates. The input should have two files: one for the forward and one for the reverse strand. Each line should contain a k-mer and its bias estimate, separated by a tab. If the argument is not given, the default files are used.
--bias-correction   Boolean           False                 If set, footprint calling will be based on the bias-corrected DNase-seq signal. This option applies only to DNase-seq.
--bias-type         String            SH                    Type of protocol used to generate the DNase-seq data. Available options are 'SH' (DNase-seq single-hit protocol) and 'DH' (DNase-seq double-hit protocol).
--paired-end        Boolean           False                 Set this if your ATAC-seq data is paired-end sequenced. This option applies only to ATAC-seq data.
--output-location   PATH              current directory     Path where the output files will be written.
--output-prefix     STRING            footprints            The prefix for results files.

Inputs

Inputs are sequencing reads in BAM format and the genomic regions of interest in BED format. The regions are usually produced by a peak-calling tool such as MACS2.
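
As an illustration, a footprinting run on paired-end ATAC-seq data might be sketched as below. The function name run_footprinting, the file names reads.bam and peaks.bed, and the output directory hint_out are placeholders, not part of HINT. The order of the two input files (BAM first, then BED) is an assumption; check it against rgt-hint footprinting --help for your installed version.

```shell
# run_footprinting BAM BED OUTDIR   (hypothetical helper, not part of HINT)
# Prints the assembled command, then runs it only if rgt-hint is installed
# and both input files exist.
run_footprinting() {
    bam=$1 peaks=$2 outdir=$3
    # Only options listed in the tables above are used.
    # Input order (BAM, then BED) is an assumption.
    set -- rgt-hint footprinting --atac-seq --paired-end \
        --organism hg19 \
        --output-location "$outdir" --output-prefix footprints \
        "$bam" "$peaks"
    echo "$*"
    if command -v rgt-hint >/dev/null 2>&1 && [ -f "$bam" ] && [ -f "$peaks" ]; then
        mkdir -p "$outdir" && "$@"
    fi
}

# Placeholder file names; replace with your own data.
run_footprinting reads.bam peaks.bed hint_out
```

Printing the command before running it makes it easy to confirm which options are in effect.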

Output

Footprinting outputs a BED file containing all the footprints found by HINT within the queried regions, as well as a text file with some statistics about the input and output files. The 5th column of the BED file gives the number of reads around the predicted footprint (±100 bp) and can be used for filtering.
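
Filtering on the 5th column can be done with standard tools. The following is a minimal sketch using awk; the function name filter_footprints is hypothetical, and the threshold to use is up to you, since no recommended cutoff is given here.

```shell
# filter_footprints BED MIN_READS   (hypothetical helper, not part of HINT)
# Prints the BED lines whose 5th column (read count around the footprint)
# is greater than or equal to MIN_READS.
filter_footprints() {
    # "+ 0" forces a numeric comparison; non-numeric lines such as
    # track headers evaluate to 0 and are dropped for any MIN_READS > 0.
    awk -v min="$2" '($5 + 0) >= (min + 0)' "$1"
}
```

For example, filter_footprints footprints.bed 10 > footprints.filtered.bed would keep the footprints supported by at least 10 reads; the value 10 is arbitrary and only for illustration.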

Differential

Run this command to see all available options:

rgt-hint differential --help

Options

Option Name         Type      Default             Description
--organism          String    hg19                The organism on which the analysis is performed. All default files, such as genomes, are based on the chosen organism and the data.config file. For more information, see the documentation on rgtdata and the data.config file.
--mpbs-file1        FILE      None                Motif-predicted binding sites file for condition 1. Must be a BED file.
--mpbs-file2        FILE      None                Motif-predicted binding sites file for condition 2. Must be a BED file.
--reads-file1       FILE      None                The BAM file containing the DNase-seq or ATAC-seq reads for condition 1.
--reads-file2       FILE      None                The BAM file containing the DNase-seq or ATAC-seq reads for condition 2.
--window-size       INT       200                 The window size around the binding sites for differential analysis. This number is used to calculate the activity score for each transcription factor.
--factor1           FLOAT     None                The normalization factor for condition 1. If not given, it is computed automatically.
--factor2           FLOAT     None                The normalization factor for condition 2. If not given, it is computed automatically.
--condition1        STRING    condition1          The name of condition 1.
--condition2        STRING    condition2          The name of condition 2.
--bc                Boolean   False               If set, all analysis will be based on the bias-corrected signal.
--nc                INT       1                   The number of cores to use for the analysis.
--standardize       Boolean   False               If set, the signal will be rescaled to (0, 1) for plotting.
--output-profiles   Boolean   False               If set, the footprint profiles will be written to a text file in which each row is a specific instance of the given motif.
--output-location   PATH      current directory   Path where the output files will be written.
--output-prefix     STRING    footprints          The prefix for results files.
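
A differential run comparing two conditions might be sketched as follows. The function name run_differential and all file and condition names are placeholders. The sketch assumes the option for the second binding-site file is --mpbs-file2, by analogy with --reads-file2, and the value 4 for --nc is arbitrary. Option names can differ between HINT versions, so confirm them with rgt-hint differential --help.

```shell
# run_differential MPBS1 MPBS2 BAM1 BAM2 NAME1 NAME2 OUTDIR
# (hypothetical helper, not part of HINT)
# Prints the assembled command, then runs it only if rgt-hint is installed
# and all four input files exist.
run_differential() {
    mpbs1=$1 mpbs2=$2 bam1=$3 bam2=$4 name1=$5 name2=$6 outdir=$7
    # --mpbs-file2 is assumed by analogy with --reads-file2.
    set -- rgt-hint differential --organism hg19 --bc --nc 4 \
        --mpbs-file1 "$mpbs1" --mpbs-file2 "$mpbs2" \
        --reads-file1 "$bam1" --reads-file2 "$bam2" \
        --condition1 "$name1" --condition2 "$name2" \
        --output-location "$outdir"
    echo "$*"
    if command -v rgt-hint >/dev/null 2>&1 \
        && [ -f "$mpbs1" ] && [ -f "$mpbs2" ] \
        && [ -f "$bam1" ] && [ -f "$bam2" ]; then
        mkdir -p "$outdir" && "$@"
    fi
}

# Placeholder names; replace with your own data.
run_differential ctrl_mpbs.bed treat_mpbs.bed ctrl.bam treat.bam ctrl treat diff_out
```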

Output

Differential analysis creates a new folder named "condition1_condition2", which includes a scatter plot, a text file, and a sub-folder called "lineplot". The scatter plot shows the transcription factor (TF) activity dynamics between condition1 and condition2. Each dot represents a TF; the y-axis represents the difference in TF activity, and the names of TFs with significant differential activity are colored red (the x-axis is a random number for jittering purposes). The raw results can be found in the text file. Inside the lineplot directory, the bias-corrected ATAC-seq profiles for each TF, normalized using --factor1 and --factor2, are presented.