Professor of Bioinformatics
Director, Centre for Integrative Bioinformatics VU
Room P1.28
Bioinformatics Section
Dept. of Computer Science
Faculty of Sciences
Vrije Universiteit
De Boelelaan 1081A
1081 HV Amsterdam
The Netherlands
+31 20 59 87649 (direct)
+31 20 59 83563 (secr.)
+31 20 59 87653

Academic Education

1980: B.Sc. Hons. Mathematical Biology
1984: M.Sc. Biology, Majors Bioinformatics and Population Genetics, Minor Computer Science
1993: Ph.D. Bioinformatics (Cum Laude), University of Utrecht, The Netherlands. Thesis title: Local Interactions in Protein Folds: A Bioinformatic Approach

Work Experience

1980 - 1984: Assistant Research Consultant (multivariate statistics), one day per week, University of Utrecht, The Netherlands
1985 - 1988: Development Programmer and Team Leader with IBM Information Network Service Development Center (INSDC), Uithoorn, The Netherlands
1988 - 1996: Staff Researcher, Biocomputing department, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
1996 - 2002: Group leader, Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
2002 - 2003: MRC Senior Scientist, Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
2002 - Date: Professor of Bioinformatics, Free University, Amsterdam, The Netherlands
2002 - Date: Head of Bioinformatics Section, Faculty of Sciences, Free University, Amsterdam, The Netherlands
2003 - Date: Director of the Centre for Integrative Bioinformatics VU (IBIVU), Free University, Amsterdam, The Netherlands
2009 - Date: Director of the Netherlands Bioinformatics Centre (NBIC) Education Platform "BioWise"
2010 - Date: Scientific Director of the Netherlands Bioinformatics Centre (NBIC)

Research Interests

  • Bioinformatics methods and data integration
  • Genome analysis
  • Systems Biology
  • Protein structure prediction
    • Domain boundary prediction
    • Secondary structure prediction
  • Sequence database searching
  • Multiple sequence alignment
    • Multiple sequence alignment and (predicted) secondary or tertiary structure information
    • Multiple sequence alignment and profile analysis using local weighting schemes
    • Multiple sequence alignment parameter optimisation using genetic algorithms
    • Multiple sequence alignment quality control and iterative optimisation
    • Repeats-aware multiple alignment
  • Genomic and internal protein repeats detection
  • Micro-array (gene expression) analysis in the context of Ecogenomics

Editorial Board Memberships

Selected References

Heringa, J. and Argos, P. (1991a). Side-chain clusters in protein structures and their role in protein folding. J. Mol. Biol. 220, 151-171.
Heringa, J. and Argos, P. (1993b). A method to recognize distant repeats in protein sequences. Proteins Struct. Funct. Genet. 17, 391-411.
Heringa, J. (1994). The evolution and recognition of protein sequence repeats. Comput. Chem. 18, 233-243.
Heringa, J. and Taylor, W.R. (1997). Three-dimensional domain duplication, swapping and stealing. Curr. Opin. Struct. Biol. 7, 416-421.
Heringa, J. (1998). Detection of internal sequence repeats: how common are they? Curr. Opin. Struct. Biol. 8, 338-345.
Heringa, J. (1999). Two strategies for sequence comparison: Profile-preprocessed and secondary structure-induced multiple alignment. Comput. Chem. 23, 341-364.
Notredame, C., Higgins, D. and Heringa, J. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205-217.
Heringa, J. (2002). Local weighting schemes for protein multiple sequence alignment. Comput. Chem. 26, 459-477.
George, R.A. and Heringa, J. (2002). SNAPDRAGON: A new method to predict protein structural domain boundaries from sequence data. J. Mol. Biol. 316, 839-851.
George, R.A. and Heringa, J. (2002). Protein domain identification and improved sequence similarity searching using PSI-BLAST. Proteins Struct. Funct. Genet. 48, 672-681.
Szklarczyk, R. and Heringa, J. (2004). Tracking repeats using significance and transitivity. Bioinformatics 20 Suppl. 1, i311-i317.
Kleinjung, J., Romein, J., Lin, K. and Heringa, J. (2004). Contact-based sequence alignment. Nucleic Acids Res. 32(8), 2464-2473.
Lin, K., Simossis, V.A., Taylor, W.R. and Heringa, J. (2005). A Simple and Fast Secondary Structure Prediction Algorithm using Hidden Neural Networks. Bioinformatics 21(2), 152-159.
Simossis, V.A., Kleinjung, J. and Heringa, J. (2005). Homology-extended sequence alignment. Nucleic Acids Res. 33(3), 816-824.

Selected methods produced over the years (IBIVU web services):

  • REPRO - recognition of distant repeats in a single protein sequence
  • TRUST - recognition of repeats on a genomic scale
  • CLUSPROT - delineation of densely packed (side-chain) clusters in protein 3D structures
  • OBSTRUCT - construction of largest possible protein sequence data sets based on sequence similarity and 3D structural features.
  • SSPRED - Protein secondary structure prediction.
  • PRALINE - multiple sequence alignment toolkit
  • T-COFFEE - multiple sequence alignment
  • SnapDRAGON - Protein domain boundary prediction using 3D model building consistency based on multiple alignments and secondary structure prediction
  • DOMAINATION - Protein domain boundary prediction integrating the PSI-BLAST method with on-the-fly domain boundary recognition.
  • Scooby Do - Protein domain boundary prediction based on a model of the distribution of hydrophobic amino acids along the protein sequence
  • CAO - Contact Accepted mutatiOn: a new HMM-based mutation scheme for evolutionary probabilities of residue 3D contact preservation, integrated in a method to assess the quality of alignments where the structure is known for one or more input sequences.
  • AliCAO - a new alignment technique that uses the CAO evolutionary propensities associated with the residue contact network of a single protein to align a set of sequences. The alignments are of higher quality than those based on sequence information alone and approach the quality of structural alignments based on protein 3D superpositioning.