Prof. Przulj uses machine learning techniques to relate large amounts of omic data and recreate them in a computational prototype, the Integrated Cell, or iCell. Specifically, iCell fuses three tissue-specific molecular interaction networks: protein-protein interaction, gene co-expression, and genetic interaction networks. The fusion is performed with Non-negative Matrix Tri-Factorization, a machine learning technique originally proposed for co-clustering and dimensionality reduction that has recently been used for data integration.
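To make the fusion step more concrete, here is a minimal Python sketch of a shared-factor tri-factorization. It is not the authors' implementation. It assumes, for illustration only, that each network's adjacency matrix A_i is approximated as G S_i G^T, with a single non-negative factor G shared by all networks, and that the factors are fitted with heuristic multiplicative updates. The function name `fuse_networks`, the parameters, and the synthetic toy networks are all hypothetical.

```python
import numpy as np

def fuse_networks(adjs, k, n_iter=300, eps=1e-9, seed=0):
    """Fit A_i ~ G @ S_i @ G.T with one non-negative G shared by all networks.

    Illustrative sketch only: the objective and update rules are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = adjs[0].shape[0]
    G = rng.random((n, k)) + eps                 # shared gene-by-cluster factor
    Ss = [rng.random((k, k)) + eps for _ in adjs]  # one compressed network per input

    def loss():
        return sum(np.linalg.norm(A - G @ S @ G.T) ** 2 for A, S in zip(adjs, Ss))

    errors = [loss()]                            # error of the random initialisation
    for _ in range(n_iter):
        GtG = G.T @ G
        # Update each S_i with G held fixed.
        for i, A in enumerate(adjs):
            Ss[i] *= (G.T @ A @ G) / (GtG @ Ss[i] @ GtG + eps)
        # Update the shared G; the 1/4 exponent damps the step.
        num = sum(A @ G @ S.T + A.T @ G @ S for A, S in zip(adjs, Ss))
        den = sum(G @ (S @ GtG @ S.T + S.T @ GtG @ S) for S in Ss)
        G *= (num / (den + eps)) ** 0.25
        errors.append(loss())
    return G, Ss, errors

# Synthetic toy data: 30 "genes" in 3 planted groups, seen through 3 networks.
n, k = 30, 3
G_true = np.zeros((n, k))
G_true[np.arange(n), np.arange(n) // 10] = 1.0
patterns = [
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    np.array([[1.0, 0.3, 0.0], [0.3, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    np.array([[0.8, 0.0, 0.2], [0.0, 0.8, 0.0], [0.2, 0.0, 0.8]]),
]
adjs = [G_true @ S @ G_true.T for S in patterns]

G, Ss, errors = fuse_networks(adjs, k)
clusters = G.argmax(axis=1)  # assign each gene to its strongest shared component
print("initial error:", errors[0], "final error:", errors[-1])
```

In this sketch the shared factor G plays the role of the integrated view: a gene's row in G is shaped by all three networks at once, so structure that is weak in any single network can still show up in the fused representation.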
The authors of the Nature Communications article, titled "Towards a data-integrated cell", applied this method to reconstruct cells from four of the most common types of cancer (breast, prostate, lung and colon), and in all of them it proved useful in locating new genes related to these diseases. The method indicated 63 genes, and a biological validation process confirmed that at least 36 of them contribute to the irregular growth of the cells. The validation was carried out through gene deactivation experiments followed by cell viability tests and analysis of patient survival data.
The experiments revealed, for instance, that breast cancer patients with high expression of MRPL3, a mitochondrial ribosomal protein not previously linked to cancer, have reduced survival. This is an example of how the new method may be used to uncover new biomarker genes, which may be relevant to the stratification of cancer patients and the prediction of their survival.
Natasa Przulj is an ICREA Professor and has just joined the BSC as the leader of the Computational Integrative Network Biology group.
Alfonso Valencia, ICREA Professor and Director of BSC's Life Sciences Department, stated that "Natasa's iCells perfectly complement our BSC cancer genome analysis portfolio, and it is only the first one of the many strong computational methods that we are expecting to see developed by her new group in the coming years".
Prof. Przulj highlighted that this new method to analyze cells "enables the identification of perturbed genes in cancer that do not appear as perturbed in any data type alone. This discovery emphasizes the importance of integrative approaches to analyze biological data and paves the way towards comparative integrative analyses of all cells".
Possible applications range from various other diseases to aging, with the ultimate goal of uncovering the intrinsic principles of the inner organization of life on Earth.