Tools & Data
Computational Analysis Tools & Additional Data
We developed a new method to calculate the minor-groove electrostatic potential in a high-throughput manner for any number of sequences of any length. The method is based on data mining of results from solving the non-linear Poisson-Boltzmann equation for many DNA fragments with diverse sequences. To model the DNA binding specificities of transcription factors using electrostatic potential, we included a statistical machine learning approach (MLR) that combines minor-groove electrostatic potential with DNA sequence features.
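The modeling step can be pictured as a regression of a binding measurement on a feature vector that joins sequence features with electrostatic potential values. The sketch below assumes that MLR stands for multiple linear regression and fits it by ordinary least squares in plain Python. The function name `fit_mlr`, the feature layout and the toy numbers are hypothetical, and the published model may differ (for example, by using regularization).

```python
def fit_mlr(X, y):
    """Fit y ~ intercept + X @ w by ordinary least squares.

    Solves the normal equations (X^T X) w = X^T y with Gauss-Jordan
    elimination. Returns [intercept, w_1, ..., w_p].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Toy data (hypothetical): column 0 stands in for a sequence feature,
# column 1 for a minor-groove electrostatic potential value.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]  # generated as 1 + 2*x0 + 3*x1
weights = fit_mlr(X, y)
```

In practice each row would hold the encoded features of one DNA sequence and `y` the measured binding signal for that sequence.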
TFBSshape is a motif database for analyzing structural profiles of transcription factor binding sites (TFBSs). This new release includes new entries from the JASPAR and UniPROBE databases, methylated TFBSs derived from in vitro high-throughput binding assays and in vivo methylated TFBSs. The structural profiles for each TFBS entry now include 13 shape features and minor groove electrostatic potential for DNA and four shape features for methylated DNA. We designed new tools for the shape-based alignment of TFBSs, for the comparison of methylated and unmethylated shape profiles, and for the design of shape-preserving nucleotide mutations in TFBSs.
DNAproDB is a database and structural analysis tool that offers data visualization, data processing and search functionality with which researchers can analyze, access and visualize structural data of DNA-protein complexes. The new release of DNAproDB supports any DNA secondary structure from typical B-form DNA to single-stranded DNA to G-quadruplexes. We have updated the structure of our data files to support complex DNA conformations. Support for chemically modified residues and nucleotides has been significantly improved along with the addition of new structural features and improved structural moiety assignment.
We developed a high-throughput method for predicting the effect of cytosine methylation on DNA shape and its subsequent influence on protein-DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
DNAproDB is a database and web-based visualization tool intended to simplify the structural analysis of DNA-protein complexes. It provides data on the structure of, and interactions between, DNA and proteins in complex, currently covering 2,568 structures from the PDB. These data can be used to analyze individual structures or to generate large datasets by constructing queries on a set of structural and interaction features using the search form. Users can also upload their own structures using the upload form and apply the same processing and visualization tools to unpublished data.
DNAshapeR is a software package implemented in the statistical programming language R that predicts DNA shape features in an ultra-fast, high-throughput manner from genomic sequencing data. The package takes either nucleotide sequences or genomic coordinates as input and generates various graphical representations for visualization and further analysis. DNAshapeR also encodes DNA sequence and shape features as user-defined combinations of k-mer and DNA shape features. The resulting feature matrices can be used directly as input to various machine learning software packages for further modeling studies.
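To illustrate what a combined k-mer and shape feature vector can look like, here is a minimal Python sketch. It is not the DNAshapeR API: the function names `kmer_one_hot` and `encode` are hypothetical, and the layout (one one-hot block per k-mer window, followed by the shape values) is an assumption made for illustration.

```python
from itertools import product


def kmer_one_hot(seq, k=1):
    """Concatenate one one-hot block of length 4**k per k-mer window."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    features = []
    for pos in range(len(seq) - k + 1):
        block = [0] * len(kmers)
        window = seq[pos:pos + k].upper()
        if window in index:  # ambiguous bases such as N leave the block at zero
            block[index[window]] = 1
        features.extend(block)
    return features


def encode(seq, shape_values, k=1):
    """Join k-mer one-hot features with per-position shape values."""
    return kmer_one_hot(seq, k) + list(shape_values)


# Shape values here are placeholders, not real predictions.
row = encode("AC", [0.5, 0.7], k=1)
```

Stacking one such row per sequence gives a feature matrix that a regression or classification package can consume.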
GBshape provides DNA shape annotations of entire genomes. The database currently contains annotations for minor groove width, roll, propeller twist, helix twist and hydroxyl radical cleavage for 98 different organisms. Additional genomes can easily be added in the provided framework. GBshape contains two major tools: a genome browser and a table browser. The genome browser provides a graphical representation of DNA shape annotations alongside standard genome browser annotations.
Our new TFBSshape database disentangles the complex relationships between DNA sequence, its 3D structure, and protein-DNA binding specificity. The TFBSshape database augments nucleotide sequence motifs with heat maps and quantitative predictions of DNA shape features for 739 TF datasets from 23 different species.
We developed a new method for predicting DNA shape in a high-throughput manner on a genome-wide scale. This approach predicts structural features (several helical parameters and minor groove width) for the entire yeast genome in less than one minute on a regular laptop. The prediction can be visualized as genome browser tracks and compared to other properties of the genome such as sequence conservation.
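The text above does not describe the algorithm. One way such predictions can be made fast enough for whole genomes is a sliding-window lookup, in which each position is assigned a precomputed value for the short sequence window centered on it. The sketch below assumes a window of five nucleotides. The function `predict_shape` and the toy table values are hypothetical placeholders, not real shape parameters.

```python
def predict_shape(seq, table, k=5):
    """Assign each position the table value of the k-mer centered on it.

    Positions too close to either end, and windows missing from the
    table, yield None.
    """
    half = k // 2
    seq = seq.upper()
    out = [None] * len(seq)
    for i in range(half, len(seq) - half):
        out[i] = table.get(seq[i - half:i + half + 1])
    return out


# Toy lookup table with made-up values, for illustration only.
toy_table = {"AAAAA": 3.4, "AAAAT": 3.6}
profile = predict_shape("AAAAAT", toy_table)
```

Each position costs one dictionary lookup, so the run time grows linearly with sequence length.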
Rao et al. Systematic prediction of DNA shape changes due to CpG methylation explains epigenetic effects on protein-DNA binding.
Epigenetics & Chromatin, in press (2018)
T.P. Chiu et al. Genome-wide prediction of minor-groove electrostatic potential enables biophysical modeling of protein-DNA binding.
Nucleic Acids Res. 45, 12565-12576 (2017)