Computational Genomics and System Biology

Dr. JJ Wang lab

Dr. Wang, John Junwen

BEng (Huazhong Agric.); MSc (Penn, Jiangnan); PhD (UW-Seattle)


Email: (replace firstname with Junwen and lastname with Wang)
Senior Associate Consultant II
Department of Health Sciences Research
Center for Individualized Medicine
Mayo Clinic Arizona
Scottsdale, AZ 85259

Job opening:

Publications, Achievements, and Grants are available at:

       HKU Scholars Hub, and Google Scholar

Web Servers:

       ChIP-Array, EpiRegNet, GWASrap, GWAS3D, ProteoMirExpress


       GWASdb, SNVrap, dbPSHP, PhenoPPIOrth


       FastPval, co-evo, NRProF, FaSD, DDGni, FaSD-somatic, LpRGNI, SpliceNet

Research Description:

We employ computational and biological approaches to study the relationship between biological sequences and their functions. We focus on two areas:

Computational and transcriptional genomics:

Defining core promoters and the surrounding transcription factor binding sites (TFBS) is a crucial step toward understanding gene regulation. We have developed computational models to detect core promoters in the human genome. We have also defined DNA sequence motifs associated with the core promoter and explored their relations to known genetic networks. Recent studies have shown that many genes have multiple promoters. We discovered that, among these promoters, the 5'-most promoters are more likely to be located within a CpG island. We are exploring this finding both computationally and biochemically. Computationally, we investigate the structural and functional variations among different promoters with regard to their TFBS composition, CpG islands and promoter specificity. The computational findings are verified biochemically by DNA mutagenesis (i.e., introducing insertions and deletions that disrupt the TFBS), aiming to demonstrate the correlation between the presence of a TFBS and promoter function. We are also developing computational methods to discover the genetic and epigenetic signatures of human/mouse embryonic stem cell differentiation.

Genome variation and diseases:

Single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) are powerful tools for studying genetic diseases, such as cancers of the breast, colon and lung. There are more than 10 million SNPs in the human genome, but only a fraction have been associated with diseases. Discovering new disease-associated SNPs will improve the prediction, prevention and therapy of these diseases. We have developed algorithms to detect SNPs that lie within the binding sites of transcription factors or within putative microRNA targets. These SNPs are likely to alter normal gene regulation and cause disease. We are developing new probabilistic models to improve the detection of disease-associated SNPs and CNVs. In addition, we are developing analysis pipelines for The Cancer Genome Atlas (TCGA) project.
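The idea of flagging SNPs that fall inside transcription factor binding sites can be illustrated with a minimal interval-lookup sketch. This is not the lab's algorithm; it assumes a single chromosome, half-open [start, end) site coordinates, and made-up example positions, and the function names are hypothetical.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open [start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def snps_in_sites(snp_positions, sites):
    """Return the SNP positions that lie inside any binding-site interval."""
    merged = merge_intervals(sites)
    starts = [start for start, _ in merged]
    hits = []
    for pos in snp_positions:
        # Index of the last merged interval whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < merged[i][1]:
            hits.append(pos)
    return hits


# Hypothetical coordinates, for illustration only.
tfbs = [(100, 112), (108, 120), (300, 310)]
snps = [95, 100, 119, 120, 305]
print(snps_in_sites(snps, tfbs))
```

Merging the sites first keeps each lookup to a single binary search, which matters when millions of SNPs are checked against a large set of predicted sites.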

Current Lab Members:

  • Dr. Junwen John Wang; PI since March, 2008
  • Mr. Mulin Jun Li, RA (March 2010-June 2012); PhD student (June 2012-Nov. 2015); Postdoc since Nov, 2015; BSc., USTA; MSc., USTC
  • Mr. Panwen Wang, RA (Oct 2010-Dec 2011); PhD (Dec 2011-Oct 2015); Postdoc since Nov 2015; BSc., WHU; MSc., BUT
  • Ms. Yun Zhu, RA since May, 2012; MPhil student since June 2013; BSc. & MSc., Huazhong Agric
  • Mr. Zipeng Liu, PhD student since Sept 2012; BSc., CPU; MSc., CPU
  • Ms. Yiming Qin, PhD student since Sept 2013; BSc., Jilin
  • Ms. Zhong Joan Gu, Senior Research Assistant since July 2014; BSc., Jiangnan; MSc., Toledo
  • Mr. Shun H. Yip, PhD student since Oct 2013; BSc., Stony Brook University; MSc., Boston University
  • Mr. Zhenyang Guo, PhD student since Sept 2014; BSc., Southwest Jiaotong; MSc., CAS-Wuhan
  • Mr. Hang Xu, PhD student since Sept 2014; BSc., USTC (HKPF holder)
  • Mr. Hongcheng Yao, PhD student since Sept 2013; BSc., Nankai (UPF holder)

Previous Lab Members:

  • Dr. Jing Qin, RA (March 2010-July 2010); PhD student (Aug 2010-Nov 2013, UPF holder), Postdoc (Dec 2013-June 2015); BSc., ZJU; MPhil., CUHK; now Research Assistant Professor at CUHK.
  • Dr. Yan Wang, RA (March 2010-July 2010); PhD (Aug 2010-Sept 2014, UPF holder); BSc., PKU; now Scientist in Shanghai.
  • Dr. Hari Krishna Yalamanchili, PhD student (Jan. 2010~April, 2014); now Postdoc at Baylor College of Medicine, USA.
  • Dr. Weixin Jacky Wang, PhD student (Oct. 2009~Oct, 2013); now Postdoc at UPenn, USA.
  • Mr. Xiaorong Liu, Research Assistant (Feb. 2011~Aug, 2013); now Scientist at Shenzhen Third Hospital.
  • Dr. HongQiang Wang, PostDoc (May. 2012~March, 2013); now Associate Professor at the Institute of Intelligent Machines (IIM), Chinese Academy of Sciences (CAS).
  • Dr. Alan Lai, Research Assistant (Oct. 2011~March, 2012).
  • Mr. Shu Yang, MPhil student (Sept. 2008~July, 2011); now PhD student at UBC, Canada.
  • Dr. Kalpana Agrawal, part time RA (Nov. 2008~June, 2010).
  • Mr. Xinran Li, undergraduate FYP (Aug. 2008~July, 2009); now PhD student at UMich, USA.
  • Mr. Zhanyong Wang, Research Assistant (Mar. 2009-July, 2009); PhD at UCLA, USA; now Google Inc., USA.
  • Mr. Po Lo Paul Chan, undergraduate FYP (Sept. 2009~May, 2010).
  • Mr. Leung Hing Lok, undergraduate project student (Sept. 2009~May, 2010).
  • Ms. Pony Chan, undergraduate FYP (Sept. 2010~May, 2011).
  • Mr. Ocean Wong, undergraduate FYP (Sept. 2010~May, 2011).

Past Exchange Students/Summer Interns:

  • Mr. Xueya Zhou (May, 2011), from Tsinghua University, China
  • Ms. Ee Lyn Lim (Sept., 2010~Sept., 2010), from University of Oxford, UK
  • Mr. Long Chan (July, 2010~Aug, 2010), from Carleton College, USA
  • Mr. Kevin Mao (July, 2010~Aug, 2010), from Royal College of Surgeons in Ireland
  • Ms. Tina Yuen (July, 2010~Aug, 2010), from Royal College of Surgeons in Ireland
  • Ms. Vijitra Luang-In (July, 2010~Aug, 2010), from Imperial College London, UK
  • Ms. Ruijuan Li (May, 2010), from Tsinghua University, China
  • Mr. Yugang Hu (July, 2010), from NIBS, China
  • Ms. Grace Yip (July, 2009~Aug, 2009), from Imperial College London, UK

Selected Publications (name in bold: lab member, *Corresponding author):

  • Li MJ†, Liu Z†, Wang P, Wong MP, Nelson MR, Kocher JA, Yeager M, Sham PC, Chanock SJ, Xia Z, Wang JW* (2016): GWASdb2: a database for human genetic variants identified by genome-wide association studies. NAR , 44(D1):D869-76.
  • Sampson J*, Wheeler WA, Yeager M, Panagiotou O, Wang Z, …, Wang JW, …, Kraft P, Rothman N, Silverman DT, Slager S,  Chanock SJ, Chatterjee N: Analysis of heritability and shared heritability based on genome-wide association studies for thirteen cancer types. JNCI , doi: 10.1093/jnci/djv279.
  • Wang P†, Qin J†, Qin Y, Zhu Y, Wang LY, Li MJ, MQ Zhang, Wang JW* (2015) ChIP-Array2: integrating multiple OMICs data to construct gene regulatory networks. NAR , 43(W1):W264-9.
  • Nelson MR*, Tipney H, Painter JL, Shen S, Nicoletti P, Shen Y, Floratos A, Sham PC, Li MJ, Wang JW, Cardon LR, Whittaker J, Sanseau P (2015) The support of human genetic evidence for approved drug indications. Nat Genet , 47(8):856-60. (news on MedicalXpress, genomeWeb, BioPortfolio, ibtimes, bioworld, pharmaceutical J)
  • Hu J, Zhao Z, Yalamanchili HK, Wang JW, Ye K, Fan X* (2015) Bayesian detection of embryonic gene expression onset in C. elegans. Annals of Applied Statistics , 9(5):950-68.
  • Qin Y, Yalamanchili HK, Qin J, Yan B and Wang JW* (2015) The current status and challenges in computational analysis of genomic big data. Big Data Research , 2:12-8.
  • Li MJ, Deng J, Wang P, Yang W, Ho SL, Sham PC, Wang JW*, Li MX* (2015) wKGGSeq: a strategy-based and disease-targeted analysis framework for exome sequencing studies of inherited disorders. Human Mutation , 36(5):496-503.
  • Machiela MJ, Hsiung CA, Shu XO, Seow WJ, Wang Z, Matsuo K, Hong YC, …, Wang JW, …, Chanock SJ*, Rothman N*, Lan Q* (2015) Genetic variants associated with longer telomere length are associated with increased lung cancer risk among never-smoking women in Asia: A report from the Female Lung Cancer Consortium in Asia, International Journal of Cancer, 137(2):311-9.
  • Li MJ, Wang JW*: Current trend of annotating single nucleotide variation in humans - a case study on SNVrap. Methods , 79, 32-40.
  • Wang W, Wang PW, Xu F, Luo R, Wong MP, Lam TW, Wang JW* (2014) FaSD-somatic: a fast and accurate somatic SNV detection algorithm for cancer genome sequencing data. Bioinformatics , 30(17):2498-500.
  • Wang Z, Zhu B, Zhang M, Parikh H, Jia J, Chung CC, Sampson JN, Hoskins JW, Hutchinson A, Burdette L, Ibrahim A, Hautman C, Raj PS, Abnet CC, Adjei AA, Ahlbom A, Albanes D, Allen NE, Ambrosone CB, Aldrich M, Amiano P, …, Wang JW, …, Chanock SJ, Yeager M, Landi MT, Shi J, Chatterjee N, Amundadottir L*: Imputation and subset based association analysis across different cancer types identifies multiple independent risk loci in the TERT-CLPTM1L region on chromosome 5p15.33. HMG , 23(24):6616-33.
  • Yalamanchili HK, Li Z, Wong MP, Yao J*, Wang JW*: SpliceNet: Recovering Gene Networks with Splice Variant Resolution from RNA-seq data, NAR , 42(15):e121.
  • Xu M, Zhao G, Lv X, Liu G, Wang LY, Hao DL, Wang JW, Liu DP*, Liang CC: CTCF controls HOXA cluster silencing and mediates PRC2 repressive higher-order chromatin structure in NT2/D1 cells, MCB , 34 (20), 3867-3879.
  • Li MJ, Yan B, Sham PC, Wang JW* (2015) Exploring trait-associated genetic variants in transcription: approaches for identifying human regulatory variation affecting gene regulation. Briefings in Bioinformatics , 16 (3): 393-412.
  • Guan D, Shao J, Wang P, Zhao Z, Qin J, Deng Y, Boheler KR*, Wang JW*, Yan B* (2014) PTHGRN: unraveling post-translational hierarchical gene regulatory networks using protein-protein interaction, ChIP-seq and gene expression data. NAR , 42 (W1): W130-W136. (Cover story)
  • Qin J, Hu Y, Xu F, Yalamanchili HK, Wang JW* (2014) Inferring Gene Regulatory Networks from Integrative Omics Data via LASSO-type regularization methods. Methods , 67(3):294-303.
  • Li MJ, LY Wang, Z Xia, MP Wong, PC Sham, Wang JW* (2014) dbPSHP: a database of recent positive selection across human populations  Nucleic acids research , 42 (D1), D910-D916.  
  • Guan D, Shao J, Deng Y, Wang PW, Zhao Z, Liang Y, Wang JW* and Yan B* (2014) CMGRN: a web server for constructing multi-level gene regulatory networks using ChIP-seq and gene expression data. Bioinformatics , 30(8): 1190-92.
  • HK Yalamanchili, B Yan, Li MJ, J Qin, Z Zhao, FYL Chin, Wang JW* (2013) DDGni: Dynamic delay gene-network inference from high-temporal data using gapped local alignment  Bioinformatics , 30(3): 377-83.  
  • J Qin, Li MJ, P Wang, NS Wong, MP Wong, Z Xia, GSW Tsao, MQ Zhang, Wang JW* (2013) ProteoMirExpress: Inferring MicroRNA and Protein-centered Regulatory Networks from High-throughput Proteomic and mRNA Expression Data  Molecular & Cellular Proteomics , 12 (11), 3379-3387.  
  • P Wang, WF Lai, Li MJ, F Xu, HK Yalamanchili, R Lovell-Badge, Wang JW* (2013) Inference of Gene-Phenotype Associations via Protein-Protein Interaction and Orthology  PloS one , 8 (10), e77478.  
  • Li MJ, Wang LY, Xia ZY, Sham PC, and Wang JW* (2013) GWAS3D: detecting human regulatory variants by integrative analysis of genome-wide associations, chromosome interactions and histone modifications.  Nucleic Acids Research , 41(Web Server issue):W150-8.  
  • Pan X, Papasani M, Hao Y, Calamito M, Wei F, Quinn III W, Wang JW, Hodawadekar S, Zaprazna K, Liu H, Shi Y, Allman D, Cancro M, Basu A, Atchison ML* (2013) YY1 Controls Igk Repertoire and B Cell Development, and Localizes with Condensin on the Igk Locus.  EMBO J , doi:10.1038/emboj.2013.66. 
  • Lan Q*, Hsiung, CA, Matsuo K, Hong YC, ..., Wang JW, ..., Rothman, N (2012) Genome-wide association analysis identifies new lung cancer susceptibility loci in never-smoking women in Asia.  Nat Genet , 11;44(12):1330-5. 
  • Xu F†, Wang W†, Wang P, Li MJ, Sham PC, and Wang JW* (2012) A fast and accurate SNP detection algorithm for next-generation sequencing data.  Nat Commun , doi:10.1038/ncomms2256.  
  • Li MJ, Sham PC, and Wang JW* (2012) Genetic variants representation, annotation and prioritization in the post-GWAS era.  Cell Research , 22(10):1505-1508. 
  • Zhang G, Zhou B, Wang W, Zhang M, Zhao Y, Wang Z, Yang L, Zhai J, Feng CG, Wang JW*, and Chen X* (2012) A functional Single-Nucleotide Polymorphism in interleukin-6 promoter is associated with susceptibility to Tuberculosis.  The Journal of Infectious Diseases, 205:1697-1704.  
  • Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, Wong MP, Sham PC, Chanock S, and Wang JW* (2012) GWASdb: a database for human genetic variants identified by genome wide association studies. Nucleic Acids Research , 40(1):D1047-54.
  • Yalamanchili HK, Xiao QW, and Wang JW* (2012) A Neural Response Algorithm for Protein Function Prediction.  BMC Systems Biology , 6(S1):S19.  
  • Wang LY, Wang PW, Li MJ, Qin J, Wang XO, Zhang MQ, and Wang JW* (2011) EpiRegNet: constructing epigenetic regulatory networks from high throughput gene expression data for human. Epigenetics , 6(12):1505-12.  
  • Yang S, Yalamanchili HK, Li X, Yao KM, Sham PC, Zhang MQ, and Wang JW* (2011) Correlated evolution of transcription factors and their binding sites.  Bioinformatics , 27(21):2972-2978.  
  • Wang W, Wei Z, Lam T-W, and Wang JW* (2011) Next generation sequencing has lower sequence coverage and poorer SNP-detection capability in the regulatory regions.  Scientific Reports , 1:55.  
  • Zhang G, Chen X, Chan L, Zhang M, Zhu B, Wang L, Zhu X, Zhang J, Zhou B, and Wang JW* (2011) An SNP selection strategy identified IL-22 associating with susceptibility to tuberculosis in Chinese.  Scientific Reports , 1:20.  
  • Qin J, Li MJ, Wang P, Zhang MQ, and Wang JW* (2011) ChIP-Array: combinatory analysis of ChIP-seq/chip and microarray gene expression data to discover direct/indirect targets of a transcription factor. Nucleic Acids Research , 39:W430-436. (F1000 recommended)
  • Li MJ, Sham PC, and Wang JW* (2010) FastPval: a fast and memory efficient program to calculate very low p-values from empirical distribution.  Bioinformatics , 26(22):2897-99.  
  • Wang J*, Ungar LH, Tseng H, and Hannenhalli S (2007) MetaProm: a neural network based meta-predictor for alternative human promoter prediction. BMC  Genomics 8, 374. (*Corresponding author)
  • Hunter DJ, Kraft P, Jacobs KB, Cox DG, Yeager M, Hankinson SE, Wacholder S, Wang Z, Welch R, Hutchinson A, Wang J, Yu K, Chatterjee N, Orr N, Willett WC, Colditz GA, Ziegler RG, Berg CD, Buys SS, McCarty CA, Feigelson HS, Calle EE, Thun MJ, Hayes RB, Tucker M, Gerhard DS, Fraumeni JF Jr, Hoover RN, Thomas G, and Chanock SJ (2007) A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer.  Nat Genet  39, 870-874.