WaVPeak is a wavelet smoothing and volume filtering-based method for automatic peak picking in NMR spectra. The program below has three options for the denoising step: 1) wavelet denoising; 2) median modified Wiener filter (MMWF)-based denoising; and 3) MMWF*-based denoising, where MMWF* is a novel variation of MMWF.


1. Zhi Liu, Ahmed Abbas, Bing-Yi Jing, and Xin Gao. WaVPeak: picking NMR peaks through wavelet transform and volume-based filtering. Bioinformatics (2012), 28(7): 914-920.

2. Carlo V. Cannistraci, Ahmed Abbas, and Xin Gao. Median modified Wiener filter for nonlinear adaptive spatial denoising of protein NMR multidimensional spectra. Submitted.

SECOM is a rapid and sensitive genome-scale protein domain identification method. 


1. Ming Fan, Ka-Chun Wong, Taewoo Ryu, Timothy Ravasi, and Xin Gao. SECOM: a novel hash seed and community detection-based approach for genome-scale protein domain identification. PLoS ONE (2012), doi: 10.1371/journal.pone.0039475

KAUSTNMF is a maximum correntropy criterion-based non-negative matrix factorization package.


1. Jingyan Wang, Xiaolei Wang, Quanquan Wang, Xinge You, Yongping Li, and Xin Gao. Non-negative matrix factorization by maximizing correntropy for cancer clustering. BMC Bioinformatics (2013). 14:107.

Two domain-based protein function prediction methods that encode domain recurrence and order information. The first is a modified version of the probabilistic model proposed by Forslund and Sonnhammer (2008). The second is a domain-based version of the naive Bayes model proposed by Silvescu et al. (2004).

1. Mario Messih, Meghana Chitale, Vladimir Bajic, Daisuke Kihara, and Xin Gao. Protein function prediction that considers domain recurrence and order. Bioinformatics (2012), 28(18): i444–i450. Earlier version appeared in the 11th European Conference on Computational Biology (ECCB2012).
A Benjamini-Hochberg-based algorithm to select true peaks from a large number of peaks, which has been implemented for WaVPeak and PICKY.

1. Ahmed Abbas, Xin-Bing Kong, Zhi Liu, Bing-Yi Jing, and Xin Gao. Automatic peak selection by a Benjamini-Hochberg-based algorithm. PLoS ONE (2013), 8(1): e53112.
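
As a rough illustration of the idea behind this entry, the sketch below implements the standard Benjamini-Hochberg step-up procedure, not the peak-selection algorithm from the paper itself. It assumes that each candidate peak already comes with a p-value; the function name `benjamini_hochberg`, the parameter `q`, and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Standard Benjamini-Hochberg step-up procedure.

    Returns the (sorted) indices of the hypotheses that are rejected
    while controlling the false discovery rate at level q.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Remember the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank / m * q:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])


# Hypothetical p-values for five candidate peaks (illustration only).
peak_p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
selected_peaks = benjamini_hochberg(peak_p_values, q=0.05)
print(selected_peaks)
```

In this toy input only the first two candidates pass their rank-dependent thresholds, so only those two would be kept as peaks.
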
AdapGrNMF is a toolkit for adaptive graph regularized nonnegative matrix factorization.
1. Jingyan Wang, Islam Almasri, and Xin Gao. Adaptive graph regularized nonnegative matrix factorization via feature selection. The 21st International Conference on Pattern Recognition (ICPR2012).

The program is able to predict the 12 main variants of human poly(A) motifs, i.e., AATAAA, ATTAAA, AAAAAG, AAGAAA, TATAAA, AATACA, AGTAAA, ACTAAA, GATAAA, CATAAA, AATATA, and AATAGA.


1. B. Xie, B. Jankovic, V. Bajic, L. Song and X. Gao. Prediction of poly(A) motifs in human DNA sequences. Bioinformatics. (2013). 29(13): i316-i325.
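
To make the list of motif variants concrete, the sketch below scans a DNA sequence for the 12 hexamers named above and reports where they occur. It only locates candidate sites; it does not reproduce the predictor described in the paper, which decides whether a candidate is a true poly(A) motif. The function name `find_candidate_motifs` and the example sequence are hypothetical.

```python
# The 12 main human poly(A) motif variants listed in the description.
POLYA_MOTIFS = (
    "AATAAA", "ATTAAA", "AAAAAG", "AAGAAA", "TATAAA", "AATACA",
    "AGTAAA", "ACTAAA", "GATAAA", "CATAAA", "AATATA", "AATAGA",
)


def find_candidate_motifs(sequence):
    """Return (start, hexamer) pairs for every motif variant occurrence.

    Positions are 0-based. Overlapping occurrences are all reported.
    """
    seq = sequence.upper()
    motifs = set(POLYA_MOTIFS)
    hits = []
    # Slide a window of length 6 over the sequence.
    for start in range(len(seq) - 5):
        hexamer = seq[start:start + 6]
        if hexamer in motifs:
            hits.append((start, hexamer))
    return hits


# Hypothetical example sequence (illustration only).
example_hits = find_candidate_motifs("ggAATAAAggATTAAAcc")
print(example_hits)
```

Each reported position would then be passed, together with its flanking sequence, to a classifier such as the one this entry describes.
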


1. Hiroyuki Kuwahara, Ming Fan, Suojin Wang and Xin Gao. A framework for scalable parameter estimation of gene circuit models using structural information. Bioinformatics. (2013). 29(13): i98-i107. [Highlighted by Nature Middle East]


1. Peng Chen, Jinyan Li, Limsoon Wong, Hiroyuki Kuwahara, Jianhua Huang, and Xin Gao. Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences. PROTEINS. (2013). 81(8): 1351-1362.

LigandRFs is a random forest-based approach to predict protein-ligand binding sites.


1. Peng Chen, Jianhua Huang, and Xin Gao. LigandRFs: predict ligand-binding sites from sequence information through random forests. Submitted.

ACRE is a Java-based tool that enumerates all the biochemical reaction networks that consist of user-created nodes from user-selected modules under user-specified constraints; applies Shinar and Feinberg's theorem (Shinar and Feinberg, Science, 2010); and determines which of the networks have the absolute concentration robustness property.
1. X Gao, I Almasri, B Arkasosy, and H Kuwahara. ACRE: Absolute concentration robustness exploration in module-based networks. Submitted.

The .zip file contains the following files:
1. Modules: a folder containing sample SBML files for frequently-seen modules.

2. Workspaces: a folder containing sample workspaces.
3. ACRE.jar: the executable application file.
4. UserManual_ACRE.pdf: the detailed user manual.


Our method trains a two-round support vector regression model for predicting protein-DNA binding affinity.
1. X. Wang, H. Kuwahara, and X. Gao. Modeling DNA affinity landscape through two-round support vector regression with weighted degree kernels. BMC Systems Biology. Accepted.
This tool has two components: one for NMR slice picking and the other for resonance assignment based on the picked slices. The required inputs are CBCACONH.ucsf and HNCACB.ucsf.
1. A. Abbas, X. Guo, B. Jing, and X. Gao. An automated framework for NMR resonance assignment through simultaneous slice picking and spin system forming. Journal of Biomolecular NMR. (2014). 59(2): 75-86.
PROSTA-inter is a sequence-order independent alignment tool that automatically determines and aligns interaction interfaces between two arbitrary types of complex structures (protein-protein, protein-DNA, and/or protein-RNA).
1. Xuefeng Cui, Hammad Naveed, and Xin Gao. Finding optimal interaction interface alignments between biological complexes. ISMB2015. Also appeared in Bioinformatics. (2015) 31(12): i133-i141.

iDTP is a novel integrated structure- and system-based approach to drug-target prediction that enables the large-scale discovery of new targets for small molecules, such as pharmaceutical drugs, co-factors and metabolites.


1. Hammad Naveed, Umar Hameed, Deborah Harrus, William Bourguet, Stefan Arold, and Xin Gao. "An integrated structure- and system-based framework to identify new targets of metabolites and known drugs". Bioinformatics. In press. 

This is a method to form spin systems using 13C labeled NMR spectra, i.e., CBCACON and CBCANCO.


1. Ahmed Abbas, Meshari Alazmi, Xianrong Guo, and Xin Gao. A novel slice-based spin system forming method for 13C NMR spectra. Submitted.

Gracob is a deterministic graph-based biclustering method that is designed to find maximal constant-column biclusters in any given two-dimensional data matrix, particularly growth phenotype data in which each row represents a gene deletion strain and each column represents a stress condition.


1. Majed Alzahrani, Hiroyuki Kuwahara, Wei Wang, and Xin Gao. Gracob: a novel graph-based constant-column biclustering method for mining growth phenotype data. Submitted.
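
To illustrate what a constant-column bicluster is, the sketch below checks whether a chosen set of rows and columns forms one, meaning that within the selected rows every selected column holds a single value. It is only a definition check under that reading, not the Gracob search algorithm, and it does not test maximality. The function name `is_constant_column_bicluster` and the toy matrix are hypothetical.

```python
def is_constant_column_bicluster(matrix, rows, cols):
    """Check the constant-column property for a candidate bicluster.

    matrix: list of rows (e.g. one row per gene deletion strain).
    rows, cols: index lists selecting the candidate submatrix.
    Returns True if each selected column takes exactly one value
    across the selected rows.
    """
    if not rows or not cols:
        return False
    return all(len({matrix[r][c] for r in rows}) == 1 for c in cols)


# Hypothetical discretized phenotype matrix: rows are strains,
# columns are stress conditions (illustration only).
toy_matrix = [
    [1, 5, 2],
    [1, 7, 2],
    [3, 5, 2],
]
print(is_constant_column_bicluster(toy_matrix, [0, 1], [0, 2]))
print(is_constant_column_bicluster(toy_matrix, [0, 1], [0, 1]))
```

The first call selects columns that are constant across strains 0 and 1, while the second includes a column whose values differ between them.
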

CMsearch is a protein homology detection method that simultaneously explores protein sequence space and protein structure space by cross-modal learning.


1. Xuefeng Cui, Zhiwu Lu, Sheng Wang, Jingyan Wang, and Xin Gao. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also structure prediction. Submitted.

The fingerprint contribution (FC) method is a regularized linear regression method based on chemical fingerprints that predicts the Gibbs free energy of biochemical reactions.
1. Meshari Alazmi, Hiroyuki Kuwahara, Othman Soufan, Lizhong Ding, and Xin Gao. Systematic selection of chemical fingerprint features improves the Gibbs energy prediction of biochemical reactions. Under review.