WuXi NextCODE Takes on Cancer: Breakthroughs and Innovation in Sequencing using TCGA and AI

Sequencing reads of a sample prepared by the traditional whole-genome sequencing workflow for fresh-frozen samples and data generated using the SeqPlus whole-genome FFPE method. The green and purple indicate reads sequenced in the forward and reverse directions, respectively, and yellow represents bases with non-reference sequence. The center of the image shows a C to A mutation in each of the tumor samples.

Cancer is one of the most active fields in genomics, spurring mountains of research papers and scores of clinical trials. WuXi NextCODE (WXNC) is committed to pushing this field forward and so we had a special “Genomes for Breakfast” session devoted to this topic at the recent ASHG17 event. Featured talks addressed our pathbreaking work in how to extract impactful findings from the renowned TCGA dataset; get better sequencing results from FFPE samples; and apply deep learning to drug discovery, drug repurposing, and identifying subtypes for diagnostics and clinical trials.

The Cancer Genome Atlas (TCGA) is one of the most useful public genomic cancer databases available and has already led to numerous critical discoveries, including entirely new drug targets as well as better insights into tumor origination, development, and spread. It includes data from approximately 11,000 patients and covers 33 cancer types. Data types include WES, RNA-Seq, miRNA, CNV, methylation array, and clinical sample data. The data are large and complex, and one patient can contribute multiple samples, which is crucial to account for in any analysis.
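
To illustrate why the multiple-samples point matters, here is a minimal Python sketch that groups samples by patient before any counting is done. It relies on the public TCGA barcode convention, in which the first 12 characters identify the patient and the two digits after the third hyphen encode the sample type. The barcodes and the helper name `group_by_patient` are hypothetical, made up for illustration; this is not a description of WXNC's own processing.

```python
from collections import defaultdict

# A few sample-type codes from the TCGA barcode convention.
SAMPLE_TYPES = {
    "01": "primary tumor",
    "02": "recurrent tumor",
    "06": "metastatic",
    "10": "blood-derived normal",
    "11": "solid tissue normal",
}


def group_by_patient(barcodes):
    """Map each patient ID (first 12 characters) to its list of sample types."""
    patients = defaultdict(list)
    for barcode in barcodes:
        patient_id = barcode[:12]      # e.g. "TCGA-AA-0001"
        sample_code = barcode[13:15]   # two-digit sample-type code
        patients[patient_id].append(SAMPLE_TYPES.get(sample_code, "other"))
    return dict(patients)


# Hypothetical barcodes, invented for this example only.
example = [
    "TCGA-AA-0001-01A",
    "TCGA-AA-0001-10A",
    "TCGA-AA-0001-06A",
    "TCGA-BB-0002-01A",
]
grouped = group_by_patient(example)

# Patients with more than one sample would be double-counted
# if an analysis treated every sample as an independent individual.
multi = {p: s for p, s in grouped.items() if len(s) > 1}
```

Grouping first makes it easy to choose one representative sample per patient, or to pair tumor and normal samples deliberately rather than by accident.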

During his ASHG talk, Jim Lund, WXNC’s Director of Tumor Product Development, shared some insights into how we put this rich data source to work in concert with our own unique data and analytical tools, in a process he dubs “multiomics analysis.” He described how we specially process the data and use our unique analytical platform to help scientists find just what they are looking for. Researchers can search the data by cancer type, age of diagnosis, sex, ethnicity, year of diagnosis, sample type (e.g. metastatic, new primary), and more.

Multiple pivotal studies using this dataset have already been published, including some examining the prevalence of specific mutations across human cancer types as well as in-depth profiling of specific tumors, such as breast cancer and lung adenocarcinoma. Layering different types of data, such as reads from DNA and RNA, allows much more accurate detection of features such as variants with allele-specific effects on gene expression. The user-friendly but sophisticated data interface makes it easier to see such findings. Over the years, our own database and our capabilities have both grown exponentially, creating a powerful tool for multiomics cancer research. You can see Jim putting the portal through its paces in a recent webinar.

In his talk, Shannon Bailey described how Whole Genome Sequencing (WGS) can be applied to formalin-fixed paraffin-embedded (FFPE) tumor samples, which are stored by the hundreds of thousands in repositories around the world. Shannon is the Associate Director of our Cancer Genetics division. He pointed out that while these samples are abundant and often paired with extensive clinical and outcome data, there are specific hurdles to using these for the type of large-scale retrospective studies many groups are eager to carry out.

For one thing, the genetic material in such samples can be degraded, crosslinked, or present only in low quantities. Of these problems, the biggest is obtaining a sufficient quantity of quality DNA for sequencing. Numerous studies have found that these types of samples are difficult to work with and often yield very low success rates in gene sequencing studies. Clearly, fresh frozen samples provide much better results, but they are also much harder to obtain.

In response, our team has developed the WXNC SeqPlus FFPE extraction method, which provides substantially improved coverage compared to traditional methods and even approximates the results obtained with fresh frozen samples at 10X depth, with similar numbers of heterozygous and homozygous calls.

We tested SeqPlus in a study that comprised 516 tumor-normal pairs (i.e., 1,032 samples) that had been stored for 3 to 6 years. The targeted sequencing depth was 30X for the normal tissue and 70X for tumor tissue. The starting amount of DNA was 400 ng. The results were excellent, with SeqPlus delivering a coverage analysis just about 1% below what the fresh frozen control samples achieved. Further, a comparison of our analyses to results from the TCGA, using fresh frozen samples, showed striking similarity. These study results give us confidence that SeqPlus is a new “power tool” for FFPE sequencing studies. This webinar describes the process.
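
For a sense of scale, the study design above can be turned into a rough back-of-the-envelope calculation. The sketch below uses only the figures quoted in the text, plus one assumption that is not from the study: a haploid human genome size of about 3.1 gigabases. It estimates targeted sequence output, not file sizes or actual yield.

```python
PAIRS = 516            # tumor-normal pairs, from the text
NORMAL_DEPTH = 30      # targeted depth for normal tissue (X)
TUMOR_DEPTH = 70       # targeted depth for tumor tissue (X)
GENOME_SIZE_GB = 3.1   # ASSUMPTION: approximate human genome size in gigabases

# Each pair contributes one tumor and one normal sample.
total_samples = PAIRS * 2

# Targeted bases per pair: depth multiplied by genome size, summed over both samples.
gb_per_pair = (NORMAL_DEPTH + TUMOR_DEPTH) * GENOME_SIZE_GB

# Targeted bases for the whole study, in terabases (roughly 160).
total_tb = PAIRS * gb_per_pair / 1000

print(f"{total_samples} samples, about {total_tb:.0f} terabases targeted")
```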

Another area of great interest at WXNC is artificial intelligence (AI). We have been pioneers in using AI to pull novel insights out of multiple massive datasets. Leading this effort is Tom Chittenden, our Vice President of Statistical Sciences, Founding Director of the Advanced AI Research Labs, and a Lecturer on Pediatrics and Biological Engineering at Harvard Medical School and MIT. He also spoke at the breakfast series.

Our AI work improves the tools we already have and extends what they can do. For example, using our AI tools, we can improve functional annotation of missense variants to an accuracy of >99%, integrate multiple types of data to discover new genes and elaborate pathways, and improve tumor subtype and drug-response classification accuracy by combining DNA- and RNA-seq, among other data types. These tools can be used for such varied purposes as target discovery, drug repurposing, and defining responders and non-responders in clinical trials.

We’ve already helped to develop breakthrough results, such as identifying an intriguing new target for both cardiovascular and cancer drug discovery. We’ve also classified breast and lung cancer subtypes with 97% to 100% accuracy, classified 8,200 tumors of 22 TCGA cancer types with >99% accuracy, and discovered a completely novel pan-cancer molecular survival signature.

The power of our deepCODE AI tools is in part thanks to a novel, causal statistical-learning method and deep-learning classification strategy. But another advantage is that they were built on our global platform for genomic data, which underpins the majority of the world’s largest genomics efforts and includes all major global reference databases. Our database stores, manages, and integrates any type of genomic data and correlates it with phenotype, ‘omics’, biology, outcome, and virtually any other type of data that may be relevant to a particular medical challenge.

If you want to know more, I recently gave an interview to WXpress outlining WXNC’s AI strategy. As we continue to deepen our commitment to this field, I’m sure we’ll have more exciting results to share.

News Flash: Drawing a “Molecular Portrait” of Mutations in Brain Disease

WuXi NextCODE’s AI group is helping to advance cutting-edge applications across the breadth of our platform and with partners across the life sciences. Recently, they put some of their toolkit to work supporting exciting work by our colleagues at Boston Children’s Hospital and Harvard Medical School. Together, they have generated sequence data of unprecedented accuracy from single neurons, and we’ve been able to help with the analysis and the discovery of some very compelling mechanisms underlying neurodegenerative disease. Kudos to the BCH and HMS teams and to our AI group on this latest collaborative publication. That report is described below and on our new WuXi NextCODE blog.

WuXi NextCODE AI Team Helps to Draw Molecular Portrait of How Somatic Mutations May Contribute to Neurodegenerative Disease

  • Boston Children’s Hospital and Harvard Medical School-led study in Science leverages WXNC expertise in feature selection and pathway enrichment
  • Study shows how individual neurons accumulate mutations over time and how this process differs between normally aging people and those with early-onset disease

A study published yesterday provides the most direct and detailed picture to date of how single-letter mutations accumulate in the DNA sequence of neurons as we age, and how this process differs between neurologically healthy individuals and those with early-onset neurodegenerative disease. Entitled “Aging and neurodegeneration are associated with increased mutations in single human neurons,” the study is published in the online edition of Science.

Led by scientists from Boston Children’s Hospital, Harvard Medical School, MIT, and the Howard Hughes Medical Institute, the study analyzed sequence data from 161 single neurons taken postmortem from 15 neurologically normal people, ranging in age from four months to 82 years, and nine individuals with the early-onset neurodegenerative diseases Cockayne syndrome and xeroderma pigmentosum. A press release from Boston Children’s on the study and its impact is available here.

At a first level, this study utilizes important advances by the authors in techniques for accurately sequencing and reading mutations in the DNA of individual neuronal cells, a hurdle that has until now prevented directly testing the theory that such somatic mutations build up in neurons over time. With this data, the lead scientists were then able for the first time to observe directly in a substantial dataset the patterns of accumulation of somatic mutations in individual neurons in relation to age, region of the brain (in the prefrontal cortex and hippocampus), and disease state. From this they developed broad signatures for these three different types of variation.

The scientists’ next question was whether they could further tease apart the associational signature for early-onset disease to discover something further about the biological processes that were contributing to neurodegeneration. For that task, they called upon the expertise of their longtime collaborators at WuXi NextCODE’s Advanced Artificial Intelligence Laboratory. Tom Chittenden, WXNC’s vice president of statistical sciences, and Chandri Yandava and Pengwei Yang, senior bioinformatician and senior computational statistician, respectively, are co-authors on the study. They used techniques developed in our AI and deep-learning program to identify the most informative mutations from the vast original datasets, to map mutations onto the most informative genes, and to identify the biological pathways those genes are involved in.
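
For readers unfamiliar with pathway enrichment, the basic idea can be sketched with a standard over-representation test based on the hypergeometric distribution. This is a generic textbook approach shown purely for illustration. It is not the method used in the study, which the text describes only in general terms, and the function name and all gene counts below are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """Probability of seeing at least `overlap` pathway genes among the
    selected genes if the selection were random (hypergeometric upper tail)."""
    upper = min(pathway_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k)
        * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator


# Hypothetical numbers: 20,000 genes in total, a pathway of 100 genes,
# 200 genes flagged as carrying informative mutations, 8 of them in the pathway.
# Random selection would put about 1 flagged gene in the pathway on average.
p = enrichment_p_value(20000, 100, 200, 8)
print(f"enrichment p-value: {p:.2e}")
```

A small p-value suggests the pathway contains more flagged genes than chance would explain. In practice, many pathways are tested at once, so a multiple-testing correction would also be needed.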

“This extraordinary group, including Chris Walsh and Mike Lodato as well as their talented teams, has enabled us to take another step forward and to see better than ever before the progressive mutational burden in individual neurons,” said Tom Chittenden. “We’ve used our toolkit and functional enrichment models to identify the pathways being most impacted by these mutations. This has pointed the group to the importance of oxidative mutations affecting DNA repair and, particularly, in genes that are heavily transcribed.”

“What Tom’s group has done is helped us to model how, as the somatic mutation burden increases, the brain loses function. What we see is that the more genes are transcribed, the more likely they are to be damaged and lose function,” said Mike Lodato of Boston Children’s Hospital and Harvard Medical School, one of the six first authors on the paper. “At the same time, because genes interact through these pathways, linear increases in the number of mutations appears to lead to exponential loss of brain function. It is essentially a scenario of use it and lose it.”

The study authors note that the identification of these pathways and the apparently important role of oxidative mutations points to potential novel therapeutic approaches for neurodegenerative diseases. This study also paves the way for the group’s next challenge: to take these discoveries in severe early-onset neurodegenerative disease and apply them to improve our understanding of the mechanisms and pathways involved in other related conditions, including Alzheimer’s disease.

Tom Chittenden says this is a challenge that is going to call on his full arsenal of AI and deep-learning capabilities. “To address Alzheimer’s disease, we are looking not only at early-onset disease but at subtler phenotypes around mild cognitive impairment. We are going to have to bring in not just sequence data but also methylation data, mRNA, and many other data types. The results we are presenting today are a step in the right direction, however—going from association to causal inference models to identify dysregulated pathways involved in disease. This is how AI is going to help to provide novel understanding of disease and progression.”

News Flash: Key Advances in Big Genomics from WuXi NextCODE Highlighted

Jeff Gulcher, CSO and co-founder of WuXi NextCODE

WuXi NextCODE’s CSO and co-founder, Jeff Gulcher, spoke with Front Line Genomics at this year’s ASHG meeting about our recent breakthrough with FFPE sequencing, advancing toward using AI to diagnose cancer, how we are integrating complex datasets, and the importance of having a global platform. Here is a link to that article.

WuXi NextCODE Wins Scrip Award: As Reported in Our New Blog

WuXi NextCODE was awarded the 2017 Scrip Award for Best Specialist CRO for enabling the pharmaceutical industry to take advantage of genomics. Dr Mark Hughes (middle), WuXi NextCODE’s Business Development Director for Europe, accepted the award on behalf of the company.

Below is news about our recent Scrip Award for Best Specialist CRO. This story is from our new WuXi NextCODE blog.

We’re thrilled to share that Scrip, the global biopharmaceutical industry news and analysis service, has named WuXi NextCODE as specialist contract research organization of the year for 2017.

The award, announced at the 13th annual Scrip Awards dinner in London, singled out WuXi NextCODE’s unique Contract Genomics Organization (CGO) capabilities for putting the full power of our world-leading genomics platform into the hands of biotechnology and pharmaceutical companies around the world.

WuXi NextCODE was selected from among many leading service providers and half a dozen finalists by a panel of independent, senior industry experts from around the world.

“Our aim is to enable anyone to use the genome to the benefit of people everywhere, so it is particularly gratifying to see our work recognized by the industry that creates new medicine,” said Hannes Smarason, CEO of WuXi NextCODE.

“We have built our CGO business to make it possible for organizations around the world to access our comprehensive range of best-in-class capabilities in genomics. We make it possible for biotechnology and pharmaceutical companies to take advantage of the best technology and largest global datasets, and to do so precisely according to their needs rather than a major fixed cost.”

WuXi NextCODE Solutions
WuXi NextCODE’s global platform is unique in offering truly comprehensive and integrated solutions for sequencing and querying genomic Big Data. Solutions include:

  • Sourcing global cohorts
  • Experimental and trial design
  • Research and clinical DNA sequencing
  • Access to large datasets for diagnostics and research
  • State-of-the-art genomic sequencing of FFPE samples
  • The unrivalled GOR database management system
  • Clinical interpretation and case-control research analytics
  • AI and deep learning

Pioneering Rare Disease Diagnostics in China—An Interview with Fudan Children’s Hospital Clinicians at ASHG17

The first year of WuXi NextCODE’s partnership with Fudan Children’s Hospital has delivered 11,000 clinical reports and a diagnosis rate of 33%, matching the throughput and success rate of the world’s leading laboratories.

One year ago, WuXi NextCODE (WXNC) and the Children’s Hospital of Fudan University (CHFU) launched a joint laboratory to put the global gold standard in sequence-based rare disease diagnostics at the service of patients in China. In the first year of that joint effort, the partnership delivered some 11,000 clinical reports—with more than 1,000 new reports now being generated each month—and a diagnosis rate of 33%. This matches the throughput and success rate of the leading laboratories in the world.

Dr. Lin Yang of CHFU presented a summary of this remarkable progress at WXNC’s breakfast session on rare disease at the ASHG17 meeting held recently in Orlando. Afterwards, WXNC’s global communications lead, Edward Farmer, sat down to talk about this collaboration and what it means for patients, with Associate Professor and Laboratory Director Dr. Huijin Wang from the clinical team; Dr. Bing Bing Wu, director of the medical diagnostics laboratory; and Assistant Professor Dr. Xinran Dong, who leads the bioinformatics team, as well as WXNC Chief Scientific Officer, Jeff Gulcher.

Edward Farmer: It’s a real pleasure to have with us our colleagues from Fudan and to be able to hear about this collaboration in rare disease diagnostics and genome-based testing in the neonatal unit. To start us off, Jeff, can you tell us how this partnership came about and how you see its importance to rare disease diagnostics in China and worldwide?

Jeff Gulcher: It’s been a fantastic partnership that started about two-and-a-half years ago, when we began discussing the possibility of creating a joint laboratory. The aim was to take advantage of WXNC’s technology and sequencing expertise together with Fudan’s expertise, both on the medical side and the interpretation side. The goal was to enable whole exome or medical exome sequencing of very sick children. In parallel, we decided to see if sequencing is useful in the neonatal ICU setting.

Through this partnership, we’ve now sequenced a large number of children and worked together to make diagnoses. Our medical genetics teams have worked closely together to interpret these cases, and in about one-third of the cases, we’ve been able to deliver diagnoses that were not suspected by the treating physician. In many cases, that has led to different treatments, with better outcomes for the children.

Together, we have now sequenced over 11,000 pediatric cases, including some 2,500 neonatal ICU cases, and we are very pleased with this partnership.

Edward Farmer: We have with us several senior people from Fudan. Huijin, let’s begin with you, as director of our joint laboratory. Can you share with us your impressions of this partnership so far and some of your results?

Huijin Wang: We have had a very good experience with this collaboration. We have many cases and, each week, we have a case meeting with the Cambridge WXNC team and we discuss the data and variant curation for the more difficult ones. The results have been impactful for the patients. In many cases, we can deliver a clinical diagnosis, and some of these offer real treatment options.

I remember one case that first came to the neurological clinic with seizures and hypoglycemia. This child had presented with recurrent hypoglycemia at a very early age and was in the NICU. We sequenced the family and found a recessive variant in the FBP1 gene, which the patient had inherited from both parents.

After this diagnosis, the doctor was able to discuss the problem with the family and advise them on how to limit the child’s diet to avoid hypoglycemia. The child is now doing well and no longer experiencing hypoglycemic episodes. His family came back later, planning to have another child. We referred them for prenatal diagnosis, and they were able to have another child who is healthy. This was a very successful case and is the sort of story that encourages us and shows us the value of the work we are doing.

Edward Farmer: That is an encouraging result. Lin, as an attending physician, how do you see the impact of introducing this technology into China at scale?

Lin Yang: We have more and more children at our hospital with birth defects or congenital malformations, so we really want to get a diagnosis and whatever possible treatment for them, including new treatments when available.

The collaboration between our hospital and WXNC starts with us deciding whether the case is likely to be the result of a genetic disorder. If it is, we do pre-testing counseling for the whole family before taking DNA samples. We then use WXNC’s capabilities for the sequencing and analysis of the results. Finally, we need to interpret the sequencing results and report them to the parents. It is often very difficult for parents to understand “what is a gene,” “what is a mutation,” “what is the disorder,” and “how can your child benefit from a molecular diagnosis?” So that is a critical part of our work.

But more and more patients are choosing molecular diagnosis and, if they get a correct diagnosis early, they may find a useful and more targeted treatment earlier.

In the NICU, we have some patients that have immune deficiency disorders. These can be very serious conditions, as the children suffer from repeated infections. It is very hard on the whole family. For such cases, if you have a specific diagnosis, there is often a cure. This is very good news for these families in the NICU, as they now have the possibility of getting a molecular diagnosis and then a treatment.

Edward Farmer: Are there any specific examples or cases that you can share with us?

Lin Yang: I had a newborn patient who had very low platelet counts and petechiae (red spots from small bleeds) on his face and body. We found that he has a mutation in the WAS gene, inherited from his mother’s side of the family, which means that his bone marrow is not producing enough platelets. But with a hematopoietic cell transplantation [HCT, which can include bone marrow] from a relative or closely matched donor, he has every chance of being cured of the disease and becoming a healthy boy. He is now waiting for a matched donor.

Edward Farmer: Huijin, you’ve done amazing work so far, and I know you are only getting started, but I wonder what proportion of the patients you see are able to benefit from the work of your lab and the collaboration with WXNC?

Huijin Wang: Currently, we are delivering a diagnosis to about 30% of patients, and we are able to recommend specific clinical treatment for about 20% of our patients.

And very often, we can give some guidance, if not a cure. Sometimes just knowing exactly what the diagnosis is gives patients peace of mind and new options. For example, many can go to a specialty clinic. But just knowing the diagnosis is often a comfort.

There is also a big need, and as a national center of excellence our diagnostics can help people across the country. About 80% of our patients come from outside of Shanghai, so with a diagnosis, they can go back to where they live and take some action there.

Lin Yang: There is also a difference among different diseases. I think we are now able to provide actionable results to about 50% of patients with neuromuscular disorders, and for respiratory maybe something less than that. For NICU, it’s maybe 15% that get a diagnosis, but we want to boost all of these.

We can benefit many more patients with this technology. In our hospital and with the WXNC collaboration, we can see an increasing number of patients. But there are a lot of undiagnosed patients, and in many places, there is not yet access to molecular diagnostics, so we hope this capability spreads to other parts of the country as well.

Edward Farmer: And Xinran, as we’re talking about building the scale and reach of molecular diagnostics, perhaps you can tell us a bit about how you are dealing with all of this data.

Xinran Dong: We have collected a lot of data. And from my bioinformatics perspective, one of the things that the WXNC collaboration is helping us to do is to make good use of the data, both for our clinical cases and for research.

I see part of my job as helping to build this into one of the biggest databases on rare disease in China and maybe the world. This is going to help patients today and advance the discovery of new genes.

Edward Farmer: Clearly there is no lack of ambition here. I want to thank you all for your time, and we look forward to sharing more stories of our work together.