Digital: Microbiomes
Advances in microbiome ecology and data science are set to fuel the next industrial revolution, improving drug development, health, and wellbeing
Sven Sewitz at Eagle Genomics
The ‘bio revolution’ is forecast to have a direct economic impact of up to $4 trillion a year over the next 10 to 20 years (1). It also promises to provide some of the most convincing regenerative economic answers to the current climate crisis, with advances stemming from the biological sciences fuelling a new wave of innovation.
The terminology may differ, but when Boston Consulting Group talks about ‘nature co-design’, it is referring to the same megatrend: a new industrial revolution that will harness nature’s design principles and manufacturing capabilities to produce beneficial new materials, from the atomic level up (2). One of the components in this coming revolution is the microbiome, the ecosystem of bacteria, fungi, and viruses present in virtually all spheres of life: plants, animals, air, water, and soil.
When better understood, microbiome ecology promises to have a significant impact on health and wellbeing. In fact, the potential for microbiomes to speed the development of innovative medicines, improve global food production, and produce industry-leading, sustainable consumer goods is vast.
As an example, major food brands are drawing on advances in the understanding of the microbiome to make the human gut healthier and aid digestion, as well as to create opportunities for better food products. There is also great work taking place to build a new microbiome-based economic ecosystem. This work will revolutionise production and value chains across wide product ranges, including drugs, cosmetics, and food ingredients. It will lead to improved medical treatments, better wellbeing, and healthier nutrition.
Without doubt, the complex, multi-dimensional nature of microbiome data has proved to be a significant challenge.
With the right tools and, importantly, standardised data annotation and analysis approaches, it will soon be possible to unlock the promise of the microbiome and help support the accelerating bio revolution.
However, greatly improved data and process management methods are needed. Rapid and scalable innovation in the field is hampered by historically poor data curation and management processes that don’t comply with current FAIR (findability, accessibility, interoperability, and reusability) data practices.
This has led to inefficiencies arising from wasted time and effort associated with incessant data wrangling, inconsistent reproducibility of experiments, and poorly documented processes that do not readily transfer or scale to further stages of development. In order to make the bio revolution work, this needs to change.
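To make the idea of FAIR-compliant curation concrete, the following is a minimal sketch of an automated completeness check on a sample's metadata record. The field names and their grouping under each FAIR principle are illustrative assumptions, not a published standard, and the record itself is invented.

```python
# Hypothetical minimal metadata fields, grouped by FAIR principle.
# These field names are illustrative assumptions, not a published standard.
REQUIRED_FIELDS = {
    "findable": ["sample_id", "title"],
    "accessible": ["access_url"],
    "interoperable": ["taxonomy_db", "units"],
    "reusable": ["licence", "protocol"],
}


def missing_fair_fields(record):
    """Return {principle: [missing field names]} for one sample record."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        # A field counts as missing if it is absent or empty.
        missing = [f for f in fields if not record.get(f)]
        if missing:
            gaps[principle] = missing
    return gaps


# An invented record with an empty access URL and no protocol entry.
record = {
    "sample_id": "S001",
    "title": "Gut sample, cohort A",
    "access_url": "",
    "taxonomy_db": "NCBI Taxonomy",
    "units": "relative abundance",
    "licence": "CC-BY-4.0",
}

print(missing_fair_fields(record))
```

Run against every incoming dataset, a check of this kind would flag gaps at the point of capture rather than leaving them to be discovered during later data wrangling.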
Fortunately, progress is being made.
The complexity of multi-omics data management requires technology and platforms that can handle large volumes of information, process it, and provide valuable outputs. Such platforms must enable consumers of those data, be they scientists, data scientists, product marketers, researchers, members of legal and compliance, or business owners, to quickly see how complex data can be applied effectively to find real-time solutions.
In response to this growing demand, a multi-omics data management software industry has arisen to understand and apply complex data meaningfully. For example, R&D teams in science-focused industries are applying data fabrics. These are integrated layers of data and connecting processes that use “continuous analytics over existing, discoverable and inferenced metadata assets to support the design, deployment, and utilisation of integrated and reusable data across all environments”, according to Gartner (3). Fit-for-purpose data fabrics ensure comprehensive sets of microbiome-related data are available and can be exchanged, compared, and understood by non-data scientists. Gartner predicts that companies using the right combination of data fabrics, active metadata, and machine learning could reduce time to data delivery and improve value by 30% by 2030 (4).
Another crucial tool is AI, which can help to decode patterns that may not be obvious to a human observer. This is particularly relevant for microbiome data, the most important of which are high-dimensional genomic, transcriptomic, and metabolomic datasets.
Finally, the causal inference programming approach is proving highly effective for microbiome researchers (5). It is a valuable way to pick out causal relations in diverse data, allowing scientists to design new studies that delve deeper into root cause analysis. Causal inference work holds great potential for improving understanding of the environmental role of the microbiome.
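As a simple illustration of why causal reasoning matters, the sketch below uses stratification on a confounder, one of the most basic adjustment techniques and not necessarily the method used in the work cited above. The counts are synthetic and chosen so that diet drives both the presence of a microbe and the outcome, while the microbe itself has no effect.

```python
# Synthetic, made-up counts: (diet, microbe_present) -> (n_samples, n_with_outcome).
# Diet is the confounder: it influences both microbe presence and the outcome.
counts = {
    ("high_fibre", 1): (80, 40),
    ("high_fibre", 0): (20, 10),
    ("low_fibre", 1): (20, 2),
    ("low_fibre", 0): (80, 8),
}


def outcome_rate(microbe, diet=None):
    """Outcome rate for a microbe status, optionally within one diet stratum."""
    cells = [v for (d, m), v in counts.items()
             if m == microbe and (diet is None or d == diet)]
    n = sum(c[0] for c in cells)
    k = sum(c[1] for c in cells)
    return k / n


# Naive comparison: ignores diet entirely.
naive = outcome_rate(1) - outcome_rate(0)

# Adjusted comparison: compare within each diet stratum,
# then average the differences weighted by stratum size.
total = sum(n for n, _ in counts.values())
adjusted = 0.0
for diet in ("high_fibre", "low_fibre"):
    stratum_n = sum(n for (d, _), (n, _k) in counts.items() if d == diet)
    weight = stratum_n / total
    adjusted += weight * (outcome_rate(1, diet) - outcome_rate(0, diet))

print(f"naive difference:    {naive:.2f}")
print(f"adjusted difference: {adjusted:.2f}")
```

In this toy dataset the naive comparison suggests the microbe raises the outcome rate by 24 percentage points, while the diet-adjusted comparison shows no difference at all. Real microbiome studies involve many more variables, but the same logic motivates the use of causal methods over simple correlation.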
The microbiome industry is still in its infancy. What is encouraging for startups and investors entering the field is the fact that other early-stage industries that leveraged data effectively went on to deliver great rewards. Examples include small molecule drug discovery and semiconductor process and product development. What got them off the ground was a determination to move quickly to help customers by swift agreement on common standards, best practices, and sophisticated tools. Such an approach will also help the nascent microbiome industry.
Similarly, improved understanding of the microbiome is set to contribute significantly to an understanding of biology, resulting in a better quality of life for all. To get there, a deep, data-driven partnership between science and technology is a prerequisite. It is time to look beyond IT alone and consider the potential of the combination of IT and science to drive a truly revolutionary change.
References
Sven Sewitz is Director of Biodata Innovation at Eagle Genomics. He is an experienced and driven scientist, with an interdisciplinary background. He gained his PhD in molecular and cellular biology from Oxford University and trained in translational biology, bioinformatics and data science at the University of Cambridge.
He focuses on metagenomic data analysis and graph learning technologies.