Data processing technology and its growing role in genomics



Health Tech World hears from Seqera Labs, which is helping scientists accelerate the process of genomic sequencing through digital tech.

The history of genomic sequencing can be traced to the 1980s but it is only in the last several years that the field has scaled up.

Today, numerous types of sequencing take place in the lab. For example, whole-genome sequencing reads an organism’s entire DNA, RNA sequencing measures which genes are active, and ChIP sequencing maps where proteins bind to DNA.

This information can then be used in both a clinical and research setting. Disease prediction and therapeutic development are two key areas where genomic sequencing has the greatest potential.

However, a barrier exists between taking this data and turning it into something useful. Sequencing data is displayed in a huge text document of A’s, T’s, G’s and C’s – the four types of nucleotide bases found in a DNA molecule.
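To make the idea of a “text document” of bases concrete, here is a minimal Python sketch that counts the four bases in a short string. The read shown is made up for illustration and the function names are hypothetical; real sequencing files hold millions of such reads and are processed with dedicated tools.

```python
from collections import Counter


def base_counts(sequence: str) -> dict[str, int]:
    """Count how often each of the four DNA bases appears in a sequence."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def gc_content(sequence: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty read)."""
    counts = base_counts(sequence)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


# A made-up ten-base read, purely for illustration.
read = "ATGCGCTAAT"
print(base_counts(read))
print(gc_content(read))
```

Even a simple summary like this has to touch every character, which is part of why files measured in terabytes are so slow to work with.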

“Often the word big data is used, but this is almost unwieldy data,” Dr Evan Floden, CEO and co-founder of Seqera Labs, said. “It’s very different in terms of its dimensions.”

This presents some big challenges for researchers. The sheer size of the files – terabytes or even petabytes of data – makes processing them a monumental task. In addition, the digital tools required to analyse the data are complex and multifaceted.

It is not as simple as running the data through a single program to produce an output. An analysis can involve dozens of steps, each one requiring a different piece of software that may be proprietary or sourced externally. The result is a highly complex pipeline that takes a significant amount of time to develop.
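The idea of a pipeline, where each step hands its output to the next, can be sketched in a few lines of Python. This is a toy illustration only, not how Nextflow or any production pipeline is written: the three steps, their names and the sample reads are all invented here, and in practice each step would typically be a separate program.

```python
from typing import Callable

Reads = list[str]
Step = Callable[[Reads], Reads]


def drop_ambiguous(reads: Reads) -> Reads:
    """Discard reads containing 'N', the symbol for an unknown base."""
    return [r for r in reads if "N" not in r]


def trim_ends(reads: Reads, n: int = 2) -> Reads:
    """Cut n bases off each end; drop reads too short to trim."""
    return [r[n:-n] for r in reads if len(r) > 2 * n]


def deduplicate(reads: Reads) -> Reads:
    """Remove repeated reads while keeping the original order."""
    return list(dict.fromkeys(reads))


def run_pipeline(reads: Reads, steps: list[Step]) -> Reads:
    """Apply each step in turn, passing its output to the next step."""
    for step in steps:
        reads = step(reads)
        print(f"{step.__name__}: {len(reads)} reads remaining")
    return reads


# Made-up reads, purely for illustration.
raw = ["AACGTTGA", "AACGTTGA", "ACNNTGCA", "GGCATCGATT", "ACG"]
result = run_pipeline(raw, [drop_ambiguous, trim_ends, deduplicate])
print(result)
```

Even in this toy version, changing one step affects everything downstream, which hints at why pipelines with dozens of steps take so long to build and maintain.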

“[This] can often result in you spending a lot of time on the infrastructure and not necessarily focusing on the science of the problem you’re trying to solve,” Dr Floden said.

“There became a requirement for the technology, and particularly the software, to improve because it was just not feasible to be able to do it given the volume of [data]. The data generation is really driving the technology here.”

Seqera Labs aims to shorten the time it takes for researchers to analyse their data, using its open-source software, Nextflow, to turn overwhelming quantities of biological data into easy-to-analyse content.

The Barcelona-based firm primarily works with large pharma and mid-market biotech companies, along with a growing user base in the fields of public health and population genomics.

Dr Evan Floden, CEO and co-founder of Seqera Labs

The company recently partnered with Genomics England – the government-owned firm behind the 100,000 Genomes Project. The collaboration will help support Genomics England’s personalised medicine goals and enable it to scale up its pipelines and its capacity to develop models for disease diagnosis, prognosis and treatment response.

“The problem we’re solving is how scientists can first write those pipelines, run them and then share them at scale,” Dr Floden said.

“What we enable folks to do is provision that infrastructure – essentially set up the systems and the computation that they need,” Dr Floden explained.

“They can then use pipelines or develop pipelines themselves using modern software engineering practices. Then they can run them and share them in a reproducible way, with flexibility and all the cost benefits that come with running things efficiently.

“This is a concept of bringing modern software engineering practices to scientists.”

In addition to shaving down costs and time, these techniques have also fostered better collaboration between scientists.

Seqera Labs’ nf-core community pipelines see hundreds of experts working on the same analysis. The result is a collective-like approach to science.

To date, the community has released more than 80 bioinformatics pipelines, and nearly 1,000 reusable modules that are freely available and in use by bioinformaticians worldwide.

“Everyone can benefit from that,” Dr Floden said. “You can have hundreds of people using that pipeline independently. They can be running it on their laptop or running it in a cluster in South Africa, or in a pharma company in Boston. It allows more sets of eyes on it.”

Genomic sequencing software is still in its infancy but interest is growing. In its first year, Seqera Labs had a user base of just ten people. Now, Nextflow software is downloaded over 160,000 times per month, up from 55,000 in March 2022.

“We’re still pretty early and into this,” Dr Floden said. “There’s still a big need to make these kinds of tools accessible to everyone and make folks aware of them. It’s definitely growing, but there’s a lot more to do.”

The pandemic helped accelerate the adoption. When the virus hit, scientists were able to come together quickly and develop pipelines collaboratively within days and weeks.

“We were then able to set up that infrastructure and do the analysis where the sequencing was taking place,” Dr Floden said. “The fact that people could come together really quickly, work on that pipeline, work on that analysis very fast and spin up that whole infrastructure meant they were able to really mobilise that workforce.

“The UK in particular did a great job of this and of course, the variant tracking was really significant.

“I definitely think that within the public health area [Covid] has accelerated things. In that regard, it pushed things forward a few more years.”

Since the pandemic, several sequencing centres have begun using a “weather map of disease” to sequence and track viruses as they emerge.

“They are tracking things in a way which didn’t exist before,” Dr Floden said. “It’s great to see that governments are being a lot more prepared now and investing in that.”

Keeping Nextflow Tower open source has played a key role in fostering a community between scientists and encouraging collaboration. This feeds into the wider concept of open science, a movement that aims to make scientific research more accessible and transparent.

“Open Science is a little bit more broad,” Dr Floden said. “That kind of openness allows scientists to transparently share analyses. It allows those domain experts to come together and it builds a real community around the software which we develop. Open source is a key part of that.

“It’s great to see that the adoption of open source is growing,” he added. “Genomic analysis is just at the tip of that in terms of its adoption given the need, which is driven from sequencing demands. But it’d be fantastic to see this being propagated through the rest of the industry.”

Looking to the future, Dr Floden is particularly excited by the new discoveries in drug development that are being enabled by this rapidly advancing technology.

“It’s a really exciting time to be in this space,” he added.

“The different chemistries that are coming in, the decreasing cost of sequencing and the general innovation that is taking place in the sequencing space is really bleeding through into a lot of drug development.

“Every day there are fantastic discoveries.”

There is also potential for genomics to accelerate the push towards more personalised medicine and make these therapies cost-effective.

“At the moment, [personalised medicines] are relatively prohibitive in terms of their costs, even though they have fantastic outcomes. I’m really looking forward to what’s going to happen with the productionisation of personalised medicine and how that’s going to affect the sector.”
