Marking Progress in Genomics: Reflections and Prospects

As leaders of our field gather in Vancouver for the annual American Society of Human Genetics Meeting (ASHG 2016), it is an excellent time to take stock of the past and clarify our perspectives for the future. For the field of genomics, this is an opportunity both to reflect on our accomplishments over the last few years and to consider what we can achieve in the years ahead.

Indeed, our accomplishments have been numerous and our goals are ambitious, yet achievable. Here, I would like to summarize five significant ways in which our work in genomics has been revolutionizing medicine and improving patient outcomes.  In addition, I would like to share my thoughts about five areas in which I believe our field can drive meaningful change over the next few years.

What We Have Achieved
1. Improvements in Sequencing Technology and Analytical Tools
The ever-increasing volume of genomic data is testimony to the dramatic increases in sequencing speed and efficiency over recent years.  At the same time, novel methods of analysis, like the powerful genomics platform employed here at WuXi NextCODE, have considerably advanced our understanding of genetic variations and their clinical significance.

2. Transformations in Cancer Treatment
As I have discussed here, the expanding use of genomic data to guide treatment decisions in oncology is transforming the way clinicians approach cancer treatment.  In addition, our growing understanding of genetic predispositions for certain cancers is helping high-risk individuals make informed choices about preventive care.

3. Progress in Rare Diseases
Genomic data has brought new hope to families struggling with rare diseases by shortening diagnostic odysseys, guiding treatment, and building communities.  I provide examples of the game-changing power of genomics in the diagnosis of rare diseases here.

4. Empowerment of Patients and Consumers
Patients and consumers are increasingly informed about the innovative and meaningful ways in which genomic data can guide healthcare decisions.  The successes in our field are empowering individuals to pursue personalized medicine and generating interest in direct-to-consumer testing.  I offer my thoughts about DTC genetic testing here.

5. Innovations in Cloud-Based Analysis
The vast and ever-growing quantity of genomic data and related information necessitates new approaches to storage and analysis.  As I have previously discussed, cloud-based computing is essential to continued success in genomics.  WuXi NextCODE’s Exchange is at the forefront of the accelerated research made possible by real-time collaboration and analysis in the cloud.

What We Can Achieve in the Years Ahead

1. Effective Communication and Collaboration
Realizing the full potential of big data and cloud-based computing will require new efforts to dismantle “data silos.”  I am encouraged by recent initiatives to facilitate collaboration in cancer research, and – as I have recently discussed – call upon researchers and clinicians throughout the field of genomics to improve communication among all stakeholders.

2. Policies for Research with Patient Data
Our field derives its greatest power from careful analysis of genomic data, and access to data is critical to effecting meaningful change in healthcare.  In order to gather this game-changing data – from patients, from consumers, and from population-wide studies – we need to develop and embrace policies that take into account the ‘biorights’ of patients.  Individuals who wish to contribute information for research should have the opportunity to do so, and all parties should clearly communicate the purposes and extent of data-sharing.

3. Integration for Clinical Trials
I perceive significant movement toward the development of clinical trials that test the efficacy of treatments tailored to specific genetic anomalies – and use genetic information to screen participants.  This is an area in which genomics will dramatically accelerate the development of personalized therapies that will surely improve patient outcomes.

4. Actionable Information from Population-Wide Genomic Studies
I believe that in the near future we will reap significant rewards from projects that gather population-wide genomic information.  Analysis of the data we are collecting around the world, which I describe here, is an essential step to reshaping healthcare practices worldwide.

5. Globalization of Genomic Products: ‘Think Globally, Act Locally’
The power of genomic information is now known throughout the globe, and can be applied in a multitude of positive ways.  With such widespread potential, individual countries and cultures will choose to advance and roll out genomics in their own distinct ways for the benefit of their citizens.  Companies that develop genomic products will need to adapt and design their products for use in specific markets.  At WuXi NextCODE, the first focus of our product portfolio for individual patients and families is in China, where we are delivering three offerings: population-optimized diagnostics, carrier screening, and whole-genome wellness scans.

Together these initiatives build upon our recent accomplishments and further the creation of data and analysis necessary for meaningful change in healthcare.

The genomic revolution in medicine that we envisage will be achieved through applied use of research and development that is:

  • Fueled by big data, including data provided by informed consumers and patients and data derived from population-wide studies;
  • Supported by clinical trials crafted to assess the safety and efficacy of treatments tailored to individual characteristics; and
  • Enabled by collaborative work and effective communication.

At WuXi NextCODE, we are energized by the prospects for genomics in the years to come. We are proud to be at the cutting edge, providing the tools and resources that researchers and clinicians need to harness the transformative power of genomic data. And—importantly—we are confident that our field will continue to drive meaningful changes in healthcare that improve patient outcomes.


Genomic Information and the Importance of Communication

Communicating clinically useful results both to doctors and patients will drive success

Around the world, researchers and clinicians are taking on the challenge of integrating genomic analysis into medical practice. Physicians and patients are increasingly aware of the potential utility of genomic data. As genomics continues to become a more powerful tool in healthcare, there is a clear and compelling need for a commitment to excellence in communication.

At WuXi NextCODE, we are proud to provide sequencing and analysis resources that help doctors:

  • Shorten diagnostic odysseys, as I have discussed here; and
  • Improve treatment choices, as I have discussed here.

Maximizing the opportunities afforded by the ‘big data’ of genomics necessitates collaboration and communication, which I discuss in more detail here. As part of our genomics business, we are dedicated to the highest standards of communication – indeed, we view effective communication as central to how our technologies will improve health in both the near and the long term.

The task of harnessing the vast and expanding quantity of genomic data to improve clinical care requires interpretation and discovery tools powerful enough to translate the data into clinically useful information. Leveraging that information to improve patient outcomes also requires clear and accurate communication:

  • Between researchers and clinicians;
  • Between specialists in different medical fields;

And, increasingly,

  • Between doctors and patients.

As the recent CLARITY Undiagnosed competition highlighted, applying genomic data to medical practice involves interpreting the sequenced genomes and identifying molecular diagnoses – and a third step: communicating clinically useful results both to doctors and to patients.

The CLARITY challenge winners, including WuXi NextCODE, were explicitly recognized for the quality and clinical utility of their reports.

Studies and surveys have shown that many people favor greater access to genetic information. Individuals want analysis of their genomes in order to:

  • Reveal their unique risk factors for inherited diseases;
  • Pinpoint a diagnosis if they are ill; and
  • Guide their decisions if they are seeking treatment.

Genomics is helping to inform patients in all these ways.

In addition, genomics demonstrates enormous potential to empower individuals.

The hundreds of thousands of people who purchase genomic testing through direct-to-consumer businesses like 23andMe are demonstrating a robust enthusiasm for gathering genomic information. And patients enrolled in clinical trials and donors participating in population-wide genomic studies express a desire to be more informed. Patients and consumers consistently seek resources that transform their personal genomic signatures into information they can use to make better healthcare and lifestyle decisions.

And most patients and consumers are willing – often eager – to share their genomic information to aid medical research and discovery. 23andMe reports, for example, that 80% of its customers consent to share their genomes for research.

It is unmistakably clear that, in the not-too-distant future, every individual in many countries around the world will have their genome sequenced. Throughout a person’s life, medical professionals will be able to access genomic information to guide health decisions – from identifying inherited conditions to assessing risk for complex diseases to calculating appropriate treatments, drugs, and even dosages for truly personalized healthcare.

The more effectively we communicate – the more we share information within the research community and parlay that into clinically useful information for patients – the greater the benefit to all.

As much as people understandably prefer simple, definitive answers to questions about their personal health, the information that genomics provides can be complex and even ambiguous. A genetic variant might be identified, for example, that can be tied to family medical history and translated into a probability or likelihood. This was the case for Angelina Jolie Pitt, who noted in her New York Times piece that her genomic analyses “gave [her] an estimated 87 percent risk of breast cancer and a 50 percent risk of ovarian cancer.” Percentage risks are nuanced, and individual perceptions of acceptable risk vary considerably. It is therefore difficult to define precisely the circumstances under which a genetic variant becomes clinically actionable.

Or a genetic variant might be identified which gives physicians clues but does not explicitly identify a specific disease. For example, a patient seeking a diagnosis may have a genetic variant that correlates to a number of diseases involving dysregulation of lipid metabolism. Identifying the variant provides physicians and caregivers with a clear direction for further analysis and treatment, but does not yield a conclusive diagnosis or prognosis.

Or a genetic variant might be identified which has yet to be understood as causing or playing a role in disease. Such a variant may occur by chance and have no medical relevance, or its meaning may be uncovered as science in the field advances. But for the person who is having the genomic information analyzed today, it offers no actionable information.

As all of these examples illustrate, effective communication about genomic information can be a significant challenge. There is a risk that poor communication will be a barrier to the adoption of genomic medicine, but if we strive to communicate clearly with patients and the public, our successes will likely accelerate more widespread use of genomics. The role of genomics in transforming health care will grow exponentially as we all endeavor to improve communication with patients, their families, and the public at large.

Our work at WuXi NextCODE is advancing the transformation of medical practice through genomics. As part of that vision, we recognize the critical importance of facilitating effective communication among all stakeholders. We provide the resources that enable researchers and clinicians to identify disease and inform treatment decisions. And we strive to add additional value by communicating about genomic information accurately and proactively, all with the ultimate goal of meaningfully improving patient outcomes.

Genomics: Big Data Leading to Big Opportunities

The Big Data of Genomics

The big data of genomics will continue to expand, and our approaches to analyzing genomic data need to continue to evolve to meet the growing demands of clinicians and researchers. Cloud-based platforms such as WuXi NextCODE’s Exchange are essential to address the fundamental big data challenge of genomics.

Beyond question, we are in the midst of an explosion of “Big Data” in many facets of human endeavors. In fact, data-storage leader IBM asserts that roughly 2.5 quintillion bytes of data are generated every day and 90% of the world’s data was created in the last two years.

An outpouring of articles in scientific journals and major newspapers has highlighted the promising potential of big data in medicine, including a special section in the current issue of Nature.  Genomics has become a major source of the growth of such big data, particularly as the cost of sequencing genomes has plummeted. The raw sequence data for just one person’s whole genome can take up as much as 100GB, and already hundreds of thousands of individual genomes have been sequenced.  With more than 2,500 high-throughput sequencing instruments currently used in 55 countries across the globe, more genomes are added every day. The aggregate amount of genomic data is growing explosively, and next-generation sequencing (NGS) data are estimated to have doubled in volume annually since 2007.
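To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses the 100GB-per-genome upper bound and the annual doubling quoted in the text; the genome counts and the end year in the usage lines are illustrative assumptions, not figures from any dataset.

```python
GB_PER_GENOME = 100  # upper-bound raw data size per whole genome, as quoted above


def raw_storage_petabytes(num_genomes, gb_per_genome=GB_PER_GENOME):
    """Raw sequence storage in petabytes (decimal units: 1 PB = 1,000,000 GB)."""
    return num_genomes * gb_per_genome / 1_000_000


def growth_factor(start_year, end_year):
    """How many times larger the data volume is if it doubles once per year."""
    return 2 ** (end_year - start_year)


if __name__ == "__main__":
    # Illustrative cohort sizes (assumptions chosen for the example).
    for n in (100_000, 1_000_000):
        print(f"{n:,} genomes: about {raw_storage_petabytes(n):,.0f} PB of raw data")
    # The end year is an assumption; only the 2007 starting point is from the text.
    print(f"Doubling annually from 2007 to 2015: x{growth_factor(2007, 2015)}")
```

At the quoted upper bound, 100,000 genomes come to roughly 10 petabytes of raw data, before any copies or derived files are counted.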

The accumulation of genomic data is a worldwide phenomenon.  Impressive population-wide sequencing efforts are leading the way, from 100,000 genomes in England, Saudi Arabia, and Iceland to 350,000 in Qatar to a million in both China and the U.S.

And earlier this month, the CEO of the Cleveland Clinic predicted that soon children will routinely have their whole genomes sequenced at birth, implying a near future in which tens of millions of new genomes are sequenced annually.

Turning Data into Resources

But sequencing genomes is not enough, and the creation of genomic big data is just the beginning.

Thanks to the analysis of big data in genomics and associated informatics, we are seeing meaningful progress in cancer care and the diagnosis of rare diseases, as I have discussed here and here. We clearly have a tremendous opportunity to use the big data of genomics to continue to drive a revolution in healthcare.

Yet there is a broad consensus that a ‘data bottleneck’ is hampering collaboration and discovery. Not all researchers and physicians confronting the current onslaught of genomic big data can readily determine how to use genetic information to prevent or treat disease. To succeed, researchers and physicians clearly need resources that:

  • Draw together useful data from disparate sources;
  • Facilitate analysis and collaboration; and
  • Improve clinical practice.

The power of genomic analysis needs to expand outward from major research centers and hospitals to the myriad clinics and community hospitals where many patients receive care. To have the greatest impact on the broadest population, clinicians throughout the world’s health systems need access to the big data generated by DNA sequencing, even—or perhaps especially—if they are not affiliated with research institutions. They also need to be able to make sense of the data they have access to.

Answers in the Cloud

Sequencing provides the raw data to uncover the genetic variants that contribute to disease. But the datasets are too big to transfer repeatedly—and too big even for smaller hospitals, labs, or clinics to store onsite. Key medical advancements require not only big data, but also tools and resources to generate, interpret, and share analysis of millions of genomes.
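A rough sketch of why repeated transfers are impractical follows. The 100GB-per-genome figure comes from earlier in this post; the 100 Mbps link speed and the cohort size are purely illustrative assumptions, and real-world throughput varies widely.

```python
def transfer_hours(num_genomes, gb_per_genome=100, link_mbps=100):
    """Hours needed to move raw sequence data over a link of `link_mbps`.

    Uses decimal units (1 GB = 8,000 megabits) and assumes the link is
    fully and continuously utilized, which is optimistic.
    """
    total_megabits = num_genomes * gb_per_genome * 8_000
    seconds = total_megabits / link_mbps
    return seconds / 3600


if __name__ == "__main__":
    print(f"1 genome: {transfer_hours(1):.1f} hours")
    print(f"1,000 genomes: {transfer_hours(1_000) / 24:.0f} days")
```

Under these assumptions a single genome takes a little over two hours to move, and a 1,000-genome cohort takes about three months, which is why bringing the analysis to the data tends to be more practical than moving the data to each analyst.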

Cloud-based platforms, such as WuXi NextCODE’s Exchange, are essential to address the fundamental big data challenge of genomics. Collaboration in the cloud works to dismantle existing “data silos”: genomic information hosted only on local servers and analyzed on idiosyncratic, closed platforms. The NextCODE Exchange, in contrast, is a browser-based hub that affords secure, seamless collaboration with colleagues around the world. Moreover, users get access to NextCODE’s tools for making the critical links between variation in the genome and disease and other phenotypes, backed by harmonized links to the most important public reference data.

And cloud-based computing is inherently scalable: resources for data storage and analysis expand as needed, allowing researchers and physicians to leverage massive datasets to improve patient care in the clinic. The big data of genomics will continue to expand, and our approaches to analyzing genomic data need to continue to evolve to meet the growing demands of clinicians and researchers.

At WuXi NextCODE, we have built upon our heritage of conducting the largest analysis of genomic data (deCODE’s path-breaking Icelandic analysis) by assembling an ever-growing database of human genomes. We are committed to driving the movement of sequence data into patient diagnosis and care through user-friendly, leading-edge analysis and informatics. I am confident that data analysis and collaboration in the cloud will revolutionize healthcare, and exceptionally proud that WuXi NextCODE’s Exchange is at the forefront of this exciting advancement.

Genomics Offers Game-Changing Solution to Rare Disease Diagnosis, Costs

Hannes Smarason Wuxi NextCODE

As genomics is used more and supported by ever-more robust analysis and interpretation, its potential to offer a solution to diagnosing rare diseases is truly game-changing.

I believe strongly in, and have previously blogged on, the potential for genomics to shift the care paradigm for rare diseases, and here I’d like to detail in particular the huge potential value genomics can add to rare disease diagnosis. According to the National Institutes of Health (NIH), there are over 7,000 rare diseases affecting between 25 and 30 million Americans, which is nearly 1 in 10 people, making the overall prevalence of rare diseases significant. Rare diseases can be chronic, progressive, debilitating, disabling, severe, and life-threatening.

When a patient presents with a spectrum of unusual symptoms, a costly scramble naturally begins to diagnose the patient’s disease. Some people refer to this diagnosis process for rare diseases as a “diagnostic odyssey,” as patients and their families are subjected to test after test while being handed from one doctor to another, oftentimes to medical centers far from their home. Too often, this odyssey yields no concrete diagnosis or—worse—misdiagnosis. The direct medical costs can be significant, and the indirect costs—the frustration and disillusion felt by the patients and the family—can be extraordinary.

Since NIH believes that approximately 80 percent of rare diseases have genetic origins, the potential for genomic sequencing, interpretation, and analysis to offer a solution here is truly game-changing. A recent article in Bloomberg BusinessWeek highlighted the medical histories of two patients who recently received a diagnosis informed by genomics. In both examples, genomic analyses brought an end to the burden, cost, and stress of a decades-long diagnostic odyssey:

  • Jackie Smith, 35, spent the 32 years from age 3 unable to receive a correct diagnosis that could account for her weak limbs and turned-in ankles, despite seeing many doctors on numerous occasions. Indeed, Jackie’s parents were told to “take the 3-year-old girl home and enjoy her while they could” … “[her disease] would probably kill her before she was old enough to drive.”  This past February, using genomic interpretation and analyses from WuXi NextCODE, Claritas Genomics definitively identified her condition as centronuclear myopathy in less than three weeks.
  • Dustin Bennett, 24, would tremble and violently jerk for hours or days at a time and had been developmentally delayed since childhood. After dozens of doctor visits and incorrect diagnoses—seizures, muscle disorders, mental health problems—a Mayo Clinic genomic-based analysis showed he has episodic ataxia type I, a neurological disease characterized by hours-long attacks with no clear trigger. Dustin, a 24-year-old who functions at a first-grade level, is now on the second round of a medication doctors say should help reduce the frequency and severity of his episodes.

As genomics is used more and supported by ever-more robust analysis and interpretation, I expect these types of clear successes to become even more commonplace. And the value to the healthcare system and the patient is clear, expressed powerfully in the Bloomberg BusinessWeek piece:

While there isn’t yet a cure, Smith is participating in research that may one day lead to treatments or more supportive care. “Just being connected feels good. I felt alone for a long time,” she says. “And I want to do it for the bigger picture, too. Not just for myself, but so I can be counted.”


Bringing Together Core Technologies Unlocks Genomic Data to Improve Healthcare


Within the “3-legged stool” of genomics-enabling technologies, lower-cost genome sequencing has reached a point of strong commercial viability, and the remaining two legs, genomic analysis tools and database storage, are rapidly evolving to support the use of genomic information in medical care.

The adoption of genome sequencing technology is rapidly expanding as medical centers around the world embrace its utility in informing healthcare decisions—an emerging reality of personalized medicine.

There are three important areas of technology that are driving the use of genomic data in healthcare:  genome sequencing, genomic analysis tools, and database storage.

The first of these, genome sequencing, has advanced to the point that it is more widely accessible, with the cost of sequencing a genome now at roughly $1,000 or less. This lower cost marks a critical milestone, enabling the use of sequencing as a mass-market product for medical care.

The second and third core genomic technologies—genomic analysis tools and database storage—are in the midst of evolution. Their progress and integration are critical for the next stage of adoption of genomic data into health care.

The rapidly evolving legs of the “3-legged stool” of genomics technology are genomic analysis tools and database storage.

  • Genomic Analysis Tools: Since the human genome was first sequenced more than a decade ago, an increasingly robust body of research has showcased the links between mutations identified in the genome and disease risk. Informatics tools have been developed by medical centers and genomics companies to apply to whole-genome samples. Increasingly, these genome analysis tools will need to adapt to the steady pace of new genomic linkages to disease and to operate at a level approaching “big data.”
  • Database Storage for Human Genomes: There are a growing number of robust databases of human genomes, including data for healthy people or those with certain diseases. When properly analyzed, these databases offer the potential to provide the medical community with a reference library against which to compare genetic data. Large-scale, high-quality databases are an essential element to cross-reference a patient genome to guide more informed medical decisions.
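The cross-referencing idea in the bullets above can be sketched as a simple lookup. This is a minimal illustration in Python, not any vendor's implementation: the `Variant` tuple, the `REFERENCE_LIBRARY` dictionary, and its placeholder annotations are all hypothetical and carry no clinical meaning.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """A variant identified by chromosome, position, and the base change."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Hypothetical reference library: made-up variants with placeholder notes.
REFERENCE_LIBRARY = {
    Variant("1", 1000, "A", "G"): "placeholder: reported in association with condition X",
    Variant("2", 2500, "C", "T"): "placeholder: reported as likely benign",
}


def cross_reference(patient_variants, reference=REFERENCE_LIBRARY):
    """Split a patient's variants into those found in the reference and those not."""
    known, unknown = {}, []
    for variant in patient_variants:
        if variant in reference:
            known[variant] = reference[variant]
        else:
            unknown.append(variant)
    return known, unknown


if __name__ == "__main__":
    patient = [Variant("1", 1000, "A", "G"), Variant("3", 42, "G", "A")]
    known, unknown = cross_reference(patient)
    print("Annotated:", known)
    print("Not in reference:", unknown)
```

Variants absent from the library are returned separately so they can be set aside for re-analysis as reference databases grow.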

Recently, two leading genomics companies—WuXi and NextCODE Health—have combined their technology capabilities in these two areas. WuXi has industry-leading capabilities to analyze, store, and manage the vast amounts of genomic data. NextCODE Health brings a leading-edge system for sequence-based clinical diagnostic applications and genome analysis.

The combination of WuXi’s foundational genomic database storage and management and NextCODE’s sophisticated genome analysis tools will integrate the key components that are most rapidly evolving to apply genomics to medical care.

Initiatives like these advance the state-of-the-art in genomic analysis and database storage, bringing us to the heart of helping the world to fully harness personalized medicine and providing tools directly to doctors to provide better diagnostics and treatments to patients.

The progress to date has been amazing. Yet the opportunities ahead are even more extraordinary to improve the speed, accuracy, and accessibility of genomic information to improve human health.

A New Era, New Vision for WuXi and NextCODE Health


WuXi PharmaTech has acquired NextCODE Health to create WuXi NextCODE Genomics, a global leader in genomic medicine. Pairing WuXi’s technology and existing reach with NextCODE’s leading analytics and database promises to advance the pace of genomics research today.

In the fast-paced genomics community, we continually look for new opportunities and strategies to enhance the value of genomics and use the increasingly robust body of genomic data for the advancement of clinical medicine.

We’re excited to announce a new, ambitious vision to do just that, with WuXi’s acquisition of NextCODE Health. NextCODE will be merged with WuXi’s existing Genome Center in a wholly owned subsidiary called WuXi NextCODE Genomics, with unique, comprehensive and global capabilities for using genomic data to deliver better medicine and improve healthcare.

WuXi, a Shanghai-based genomic laboratory service partner for companies in the pharma and biotech community, has already been collaborating with NextCODE to provide analysis services to customers of the WuXi Genome Center. Now, with the in-house capability to analyze, store, and manage the vast amount of genomic data, NextCODE’s industry-leading genome sequence analysis platform will expand WuXi’s core next-generation sequencing benefits and services.

Pairing WuXi’s technology and existing reach with NextCODE’s leading analytics and database promises to advance the pace of genomics research today. More importantly, however, this new era for NextCODE brings exciting opportunities to maximize the most advanced tools available today and contribute to major advances in genomic medicine.

Global Projects Move Genomic Medicine to the Next Level


NextCODE takes top marks in Genomics England analysis and interpretation “bake-off”: NextCODE’s proven population-scale platform delivered the best results in rare disease and cancer clinical interpretation, as well as secondary analysis and variant refinement.

New genomics-based technologies and tools are making their way into a range of exciting research programs and clinical studies around the world. Leading-edge organizations are quickly adopting hardware for sequencing and systems for collecting genomic data. Now, the focus has turned to analysis and interpretation – the critical component necessary to gain the insights from the sequence data that will transform medicine.

Earlier this year, Genomics England announced investments for broad sequencing and analysis of 100,000 human genomes. At the time, Genomics England had selected Illumina as its sequencing partner and was coordinating resources and centers to support the effort, including resourcing for analysis and interpretation. [See blog post here.] Other initiatives, such as the Qatar genomics program and the initiatives by Human Longevity and Regeneron, also represent the accelerated progress in seeking medical advancements from genomic data insights. [See blog post here.]

This week, Genomics England announced a select group of companies with advanced capabilities to move to the next stage of evaluation to provide clinical interpretation for the 100K Genomes Project. At the very top was NextCODE, which received top marks from Genomics England for its analytical capabilities across all the categories evaluated: rare disease interpretation, secondary pipeline analysis, and cancer interpretation. [See press release here.] The company’s advanced Genomically-Ordered Relational database, or GOR, combined with its clinical and discovery interfaces, offers the most advanced and reliable capabilities to support the ambitious tasks undertaken by Genomics England, and is already proven at population scale. [Read more on the GOR database here.]

The coming months will be a very exciting time for genomic medicine, with interpretation taking the spotlight as we take leaps toward the next stage of personalized medicine.

Population-Scale Research Efforts Enabled by Progress in Sequencing


Significant insights gained from population-scale genomic studies, based on the knowledge of genetic variation and disease causation, will help to enable a new reality of personalized medicine and treatment.

The ability to sequence whole genomes quickly and economically is driving interest in population-scale sequencing efforts that can reveal meaningful insights on a much more systematic basis than previous approaches. A range of large initiatives announced recently are prime examples of the trend in population sequencing, including industry programs by Regeneron and Human Longevity, and the 100,000 Genomes Project by Genomics England.

Perhaps better than any other effort since the founding of deCODE in Iceland, the establishment of a high-throughput Genomics Center at Sidra Medical and Research Center in Qatar embodies the movement toward these types of population studies. The eventual goal of the project is to sequence the entire Qatari population of some 300,000 people. But from the beginning, the Sidra facility will help advance genetic mapping projects, including the creation of an Arab consensus genome to obtain a better understanding of genetic variants that influence health across Arab populations and, indeed, beyond. In addition to these efforts, the center will focus on uncovering the causes of rare genetic diseases.

The significant insights that can be gained from population-scale studies, based on the knowledge of genetic variation and disease causation, will help to enable a new reality of personalized medicine and treatment. And this is where efficient, powerful, and industrial-scale analysis will become critical. NextCODE’s analytics and interpretation systems have already been tested at such scale, as they are based on the world’s first and largest population genomics effort, that of deCODE. [see blog post] Our systems will be useful tools to efficiently deliver insights based on the vast amount of data that will be generated by these major population-based efforts to improve the state of global healthcare.

Genomics and Rare Diseases: Hope for Solving Unanswered Questions


Leading institutions around the world are leveraging the power of advanced sequencing technology to solve some of the greatest unanswered questions in medicine.

As we learn more about disease biology and uncover new insights thanks to the availability of genomic technologies, we are making meaningful progress in identifying means to address many rare diseases for which there is little medical hope today.

With these new genomic tools and insights, a wide range of opportunities has emerged to improve the diagnosis and treatment of rare diseases. Over the past few years, DNA sequencing has begun to uncover the causes of rare diseases, and at the heart of each solved case is a patient and a family who have gained a new understanding of their condition. With time, these success stories in diagnosis will lead to more successes in treatment.

Now more than ever, there is hope that identifying the key mutations will lead to a better understanding of the biology of disease and then to novel therapies. Leaders in the field of genomics are advancing better and faster technologies that enable much more rapid analysis and interpretation of a patient’s genome to find answers. The critical first step is to obtain sufficient data to analyze, compare it against a robust database of reference data, and gain an accurate understanding of the mutations potentially associated with these rare conditions.

As researchers focus on specific areas, new partnerships are extending access to data and accelerating progress on rare diseases around the world. ACoRD at University College Dublin recently initiated genomic analysis collaborations to implement NextCODE’s proprietary database and analytical tools to mine whole-genome data for variants linked to autism spectrum disorders. [See blog post here] Another genomic analysis program, with ANZAC in Australia, applies advanced sequencing analysis technology to better understand X-linked Charcot-Marie-Tooth Syndrome, a rare and progressively debilitating neurodegenerative disorder. [See blog post here] More collaborations are in the works, and we’ll be talking about them as soon as we can.

We look forward to the results of these and other collaborations as leading institutions around the world make efforts to leverage the power of advanced sequencing technology to solve some of the greatest unanswered questions in medicine.

A Standard Database Architecture Will Build a Stronger Foundation for Genome Discoveries

The general adoption of the Genomically-Ordered Relational database (GOR) as a data standard for storing genomic data may greatly accelerate the spread of sequencing and its effectiveness as a tool for advancing medicine.

It is widely accepted that the ability to share the analysis and insights from DNA sequencing will be a key driver of discovery and innovation. But one current limitation to extending this knowledge is that sequencing and analysis platforms, as well as samples, are often proprietary to and stored at different institutions. Perhaps more important, the structures and formats in which genomic data has customarily been stored (the relational databases developed by the likes of IBM and Oracle) make it unwieldy to analyze as the amount of data grows, and very difficult to share. The upshot is that institutions cannot easily share and consolidate information to generate more robust analyses and clinically relevant insights. This presents a serious hurdle to discovery both in rare disorders, where samples need to be gathered in order to generate adequate analytical power, and in complex ones, where truly massive studies can tease apart different facets of disease and reveal their causes.

Over the past decade, a novel and comprehensive database model has been developed to remove this bottleneck, offering a flexible and fast means to overcome these problems. It is called the Genomically-Ordered Relational database, or GOR, and was designed to manage and query the detailed genomic data amassed by deCODE genetics in Iceland, the world’s first and still by far the largest and most comprehensive population-based genomic database.

The thinking behind the GOR is as simple as it is revolutionary. Genomic data is a sort of big data but one with an important difference: It is divided up in distinct packets—the chromosomes—and then arranged within each chromosome in linear fashion. The GOR makes use of this by storing and querying sequence data according to its unique position in the genome, rather than as huge files as long as the sequence. This radically reduces the data burden of querying even large numbers of whole genomes, at the same time making it possible to store and visualize instantly the raw sequence underlying an analysis.

In practice, the GOR thereby enables researchers to home in on specific variants without first having to call up entire patient genomes, and it separates raw data from annotations so that a query can focus on only the most relevant search components. It’s these types of functions and features that can be consistently applied across data storage systems to allow for more multi-institutional, collaborative research and consistency in outcomes worldwide.
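To make the idea of position-ordered storage more concrete, here is a minimal Python sketch. It is only an illustration of the general principle described above, not the GOR's actual implementation or interface: the class name PositionOrderedStore, the annotate helper, and all sample records are hypothetical. Rows are kept sorted by position within each chromosome, so a query for a region touches only the rows in that region, and annotations live in a separate table that is joined by position.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class PositionOrderedStore:
    """Toy table whose rows stay sorted by position within each chromosome.

    Hypothetical illustration only; this is not the GOR's real data model.
    """

    def __init__(self):
        self._positions = defaultdict(list)  # chrom -> sorted positions
        self._rows = defaultdict(list)       # chrom -> rows aligned with positions

    def add(self, chrom, pos, payload):
        # Insert at the index that keeps the chromosome's rows in position order.
        idx = bisect_right(self._positions[chrom], pos)
        self._positions[chrom].insert(idx, pos)
        self._rows[chrom].insert(idx, payload)

    def query(self, chrom, start, end):
        """Return (pos, payload) pairs with start <= pos <= end."""
        positions = self._positions.get(chrom, [])
        rows = self._rows.get(chrom, [])
        # Binary search finds the slice directly; rows outside it are never read.
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        return list(zip(positions[lo:hi], rows[lo:hi]))


def annotate(variants, annotations, chrom, start, end):
    """Join variants in a region with separately stored annotations by position."""
    result = []
    for pos, variant in variants.query(chrom, start, end):
        notes = [note for _, note in annotations.query(chrom, pos, pos)]
        result.append((pos, variant, notes))
    return result


# Made-up example records (not real data).
variants = PositionOrderedStore()
variants.add("chr1", 1500, {"sample": "S1", "ref": "A", "alt": "G"})
variants.add("chr1", 900, {"sample": "S2", "ref": "C", "alt": "T"})
variants.add("chr2", 1500, {"sample": "S1", "ref": "G", "alt": "A"})

annotations = PositionOrderedStore()
annotations.add("chr1", 1500, "hypothetical gene X, missense")

hits = annotate(variants, annotations, "chr1", 1000, 2000)
for pos, variant, notes in hits:
    print(pos, variant, notes)
```

Because variants and annotations sit in separate tables that share the same ordering, either one can be updated or shared on its own, and a region query never has to load a whole genome.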

Leaders in the genomic research community are now beginning to create coalitions and working groups to underpin and coordinate the adoption of standards for sharing genomic data. As these groups create flexible and efficient policy frameworks, the GOR is tested and ready to support the fundamental data requirements of global data sharing and the acceleration of discoveries in genome-based medicine. The general adoption of the GOR as a data standard for storing genomic data may greatly accelerate the spread of sequencing and its effectiveness as a tool for advancing medicine around the world.