Top-down identification of unknown metabolites: structure generation and candidate rejection

Metabolite identification is one of the most important issues in metabolomics research, and no general methodology for it exists at the moment. Successful identification of metabolites will have a major impact on biomarker and omics research.

In this project, a new strategy will be set up to identify unknown metabolites by developing and applying rule- and algorithm-based methodologies, together with biology- and (bio)chemistry-based constraints, during structure generation and candidate rejection.

A structure generator based on graph theory was developed that is capable of calculating all possible chemical structures from an elemental composition. Furthermore, substructures derived from a mass spectral tree or from NMR can be included during structure generation in order to reduce the number of candidates dramatically. Further candidate rejection by incorporating other parameters into the structure generator, such as polarity and internal energy, is currently under investigation.
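To make the idea of graph-based structure generation concrete, the following is a minimal brute-force sketch in Python. It is not the project's generator: the function names, the valence table, and the post-hoc substructure filter are all assumptions made for illustration, and the exhaustive search is only practical for a handful of heavy atoms. Hydrogens are treated implicitly as filling the remaining free valences.

```python
from itertools import combinations, permutations, product

# Typical valences for a few elements (an assumption of this sketch).
VALENCE = {"C": 4, "N": 3, "O": 2}


def is_connected(n, bonds):
    """Return True if all n atoms are reachable from atom 0 via bonds."""
    seen, stack = {0}, [0]
    while stack:
        atom = stack.pop()
        for (i, j), order in bonds.items():
            if order and atom in (i, j):
                other = j if atom == i else i
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return len(seen) == n


def canonical_key(atoms, bonds):
    """Smallest (labels, bond orders) tuple over all atom renumberings."""
    n = len(atoms)
    keys = []
    for perm in permutations(range(n)):
        labels = tuple(atoms[p] for p in perm)
        orders = tuple(
            bonds[tuple(sorted((perm[i], perm[j])))]
            for i, j in combinations(range(n), 2)
        )
        keys.append((labels, orders))
    return min(keys)


def generate_structures(atoms, n_hydrogens):
    """Enumerate non-isomorphic connected bond graphs for the heavy atoms."""
    n = len(atoms)
    pairs = list(combinations(range(n), 2))
    seen, structures = set(), []
    # Try every assignment of bond order 0..3 to every atom pair.
    for orders in product(range(4), repeat=len(pairs)):
        bonds = dict(zip(pairs, orders))
        free = [VALENCE[a] for a in atoms]
        for (i, j), order in bonds.items():
            free[i] -= order
            free[j] -= order
        # Free valences must be non-negative and exactly match the H count.
        if min(free) < 0 or sum(free) != n_hydrogens:
            continue
        if not is_connected(n, bonds):
            continue
        key = canonical_key(atoms, bonds)
        if key not in seen:
            seen.add(key)
            structures.append(bonds)
    return structures


def has_bond(atoms, bonds, elem_a, elem_b, order):
    """Hypothetical substructure test: a bond of given order between two elements."""
    return any(
        o == order and {atoms[i], atoms[j]} == {elem_a, elem_b}
        for (i, j), o in bonds.items()
    )


atoms = ["C", "C", "O"]  # heavy atoms of C2H4O
c2h4o = generate_structures(atoms, 4)
carbonyls = [b for b in c2h4o if has_bond(atoms, b, "C", "O", 2)]
print(len(c2h4o), len(carbonyls))  # 3 1
```

For C2H4O the sketch finds three constitutional isomers (acetaldehyde, vinyl alcohol, and oxirane), and requiring a C=O bond leaves only acetaldehyde. Here the substructure constraint is applied as a filter after generation purely for simplicity; applying it during generation, as described above, avoids building the rejected candidates in the first place.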

Chemoinformatic tools were developed and applied to predict the metabolite-likeness of a generated chemical structure. These tools are intended to be incorporated into the metabolite identification workflow in order to rank a limited number of possible candidate structures by their metabolite-likeness.
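The ranking step can be illustrated with a small Python sketch. The scoring method used in the project is not described here, so the sketch swaps in a simple stand-in: each candidate is scored by its highest Tanimoto similarity to a set of reference metabolites, with structures represented as sets of named features. The function names, the feature sets, and the candidate labels are all hypothetical toy values.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def metabolite_likeness(features, reference_sets):
    """Stand-in score: best similarity to any reference metabolite."""
    return max((tanimoto(features, ref) for ref in reference_sets), default=0.0)


def rank_candidates(candidates, reference_sets):
    """Return candidate names sorted from most to least metabolite-like."""
    return sorted(
        candidates,
        key=lambda name: metabolite_likeness(candidates[name], reference_sets),
        reverse=True,
    )


# Toy feature sets, invented for illustration only.
reference_sets = [
    {"hydroxyl", "carboxyl", "alkyl"},
    {"amine", "carboxyl", "alkyl"},
]
candidates = {
    "cand_a": {"hydroxyl", "carboxyl", "alkyl"},
    "cand_b": {"peroxide", "alkyl"},
    "cand_c": {"amine", "alkyl"},
}
print(rank_candidates(candidates, reference_sets))
# ['cand_a', 'cand_c', 'cand_b']
```

Any other scoring function that maps a structure to a number could replace `metabolite_likeness` without changing the ranking step itself.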

Future work will concentrate on using biochemical and biological information for further candidate rejection.

The final aim of the project is to integrate the various tools developed in this project and other projects within the theme Metabolite Identification to obtain an improved workflow for metabolite identification. The first version of the improved metabolite identification workflow is currently being tested.