MGPS & INRA
The microbiota in human health
and well-being.

InfoBioStat

The INFOBIOSTAT platform is a major player in our R&D activities. It participates in both quantitative (in-depth whole bacterial DNA sequencing) and functional (mechanisms governing host-bacteria interactions) metagenomic studies, and thus interacts with all three platforms at INRA.

Through close interactions with Sambo, MetaQuant and MetaFun, INFOBIOSTAT contributes its expertise to studying the genetic potential of the human microbiota in health and disease, and to modeling its diversity and dynamics in order to:

 

Stratify individuals by their microbiota composition

Identify biomarkers specific to a microbiota status

Model in silico diagnostic/prognostic tests
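As a rough illustration of the first goal, the sketch below stratifies individuals by gene richness, that is, the number of microbial genes detected in each sample. It is a minimal example under stated assumptions, not the method implemented by INFOBIOSTAT: the median cutoff, the function names and the toy counts are all hypothetical.

```python
from statistics import median


def gene_richness(counts):
    """Return the number of genes detected (count > 0) in one sample."""
    return sum(1 for c in counts.values() if c > 0)


def stratify_by_richness(samples):
    """Label each sample 'high' or 'low' relative to the median richness.

    Assumption: a simple median cutoff stands in for whatever
    stratification criterion a real study would use.
    """
    richness = {name: gene_richness(counts) for name, counts in samples.items()}
    cutoff = median(richness.values())
    return {name: ("high" if r > cutoff else "low") for name, r in richness.items()}


# Hypothetical toy gene-count table: sample -> {gene: read count}
samples = {
    "S1": {"geneA": 10, "geneB": 0, "geneC": 3},
    "S2": {"geneA": 5, "geneB": 2, "geneC": 1},
    "S3": {"geneA": 0, "geneB": 0, "geneC": 7},
    "S4": {"geneA": 4, "geneB": 9, "geneC": 0},
}

groups = stratify_by_richness(samples)
print(groups)
```

In practice the counts would first be normalized for sequencing depth, and the cutoff would be chosen from the data rather than fixed at the median.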

Cutting-edge, powerful informatics tools and technologies are required to process and mine the huge data volumes generated by metagenomics. In this context, the INFOBIOSTAT platform provides continuously optimized infrastructure solutions for:

Big-data storage

Data security, in an internal data storage facility

High-throughput data transmission

HPC computation

Database management (relational and/or real-time analytics)

Software development for validated processes

Dedicated GPU/HPC/Xeon Phi tools that help our INFOBIOSTAT experts overcome the big-data wall

Our know-how is demonstrated by several software packages developed by INFOBIOSTAT:

Meteor for the primary processing of sequencing data: construction of metagenomic catalogs and delivery of microbial gene abundance information (APP IDDN.FR.001.420008.000.R.P.2013.000.30000)

MetaOMineR for data mining: a software suite designed to process meta(gen)omics data, from microbial gene counts to the identification of biomarkers and ecosystems, diagnostic or prognostic modeling in the context of complex diseases, and integration with other available metadata (APP IDDN.FR.001.220005.000.R.P.2014.000.10000)

MetaProf for metagenomic catalog organisation around MGS (MetaGenomic Species)

Our development takes the following issues into account:

Delivery of processed results to the client within a defined schedule

Quality assurance

Big-data access for the client

 

The pool of expertise gathered in INFOBIOSTAT is deployed in a large number of projects and is recognized in several high-impact publications.

This powerful infrastructure is maintained and upgraded through a continuous technology watch, driven in particular by participation in collaborative projects:

We participated in OpenGPU (2010-2012), through which GPU-based calculation tools were implemented. Internal orchestration is handled with Active ProActive, a well-established tool since 2007. The platform is evolving to prepare for the future: larger datasets with more samples. To scale up and handle more than 4000 samples in a single set, we are developing a new database design with the company ParStream. To scale up the ability of R to handle large datasets (>0.1GB), we are involved in the ITEA3 project MACH (MAssive Calculations on Hybrid systems), in which we are preparing a version of R that uses GPU and Xeon Phi with native binaries.
