Matchmaker Exchange Networks Genetic Databases For Rare Disease Diagnosis

By Ann Neuer, MBA

June 24, 2016 | An entire issue of Human Mutation was recently dedicated to the critical need for data sharing among scientists seeking a better understanding of the genetic makeup of rare diseases. That issue appeared in October 2015, and focused exclusively on Matchmaker Exchange (MME), a project launched in 2013 to create a network that connects several isolated rare disease databases using a common application programming interface (API) that supports queries and responses.

Why is this important?

At least 80% of rare diseases have been identified as having genetic origins, and there are many of them, an estimated 6,000, each needing treatments or cures. Though numerous collectively, each disease exists in only a small number of patients. According to the National Institutes of Health, a disease is considered rare if it affects fewer than 200,000 people in the United States. As a result, diagnosing these conditions is difficult, essentially a needle-in-a-haystack exercise, and that’s where a network of databases can help. If it can provide researchers and clinicians with data from just one more case with a deleterious variant in the same gene and phenotypic features in common, this may be sufficient evidence to identify the causative gene for a specific rare disease.

According to Anthony Philippakis, Chief Data Officer at the Broad Institute, “We are good at sequencing genomes, but not very good yet at reading them or interpreting them. By just looking at the genome, it’s difficult to definitively say that a certain mutation or variant is causal. We need the ability to aggregate many patients with similar genotypes to see if the phenotypes match up. That’s what MME allows us to do.”

A Federated Approach

MME is a demonstration project for the Global Alliance for Genomics and Health (GA4GH), an international coalition of partners that share genomic and clinical data. As part of this effort, MME is a matchmaking service that offers a systematic approach to rare disease gene discovery by harnessing the power of collective data contained in networked, yet separate, databases. There is no single centralized entry point for users, but rather, they must choose one of the existing MME services as a starting point.

This method, known as a “federated” system, has clear advantages. First, instead of having to query the disconnected databases individually, members need to make only one query to access all of the connected data. Second, members deposit their genetic data just once, and those data are then shared across the databases. Matching algorithms are defined by the individual databases, and queries are structured so that a gene or genotype, combined with phenotypic features, can be submitted. Members then receive responses containing any similar, or “matched,” cases. Philippakis comments, “What is wonderful about this approach is that it is a very focused use case. What the user wants to do can be easily stated, which is looking for other patients with the mutation in this gene or other patients with a given phenotype.”
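To make the federated pattern concrete, here is a minimal Python sketch of one query fanned out to several independent databases, each applying its own matching algorithm. It is an illustration only, not the MME API: the names Case, Node, gene_only, gene_and_phenotype, and federated_match are hypothetical, the gene symbols are placeholders, and the phenotype codes are merely HPO-style examples.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Case:
    """A de-identified case: one candidate gene plus phenotype terms (hypothetical schema)."""
    case_id: str
    gene: str
    phenotypes: frozenset


# Each database defines its own matcher: (query, stored case) -> score in [0, 1].
Matcher = Callable[[Case, Case], float]


def gene_only(query: Case, stored: Case) -> float:
    """Match purely on the candidate gene."""
    return 1.0 if query.gene == stored.gene else 0.0


def gene_and_phenotype(query: Case, stored: Case) -> float:
    """Require the same gene, then score by phenotype overlap (Jaccard index)."""
    if query.gene != stored.gene:
        return 0.0
    union = query.phenotypes | stored.phenotypes
    if not union:
        return 0.0
    return len(query.phenotypes & stored.phenotypes) / len(union)


@dataclass
class Node:
    """One autonomous database that keeps its own data and its own matcher."""
    name: str
    matcher: Matcher
    cases: List[Case] = field(default_factory=list)

    def match(self, query: Case) -> List[Tuple[float, Case]]:
        scored = [(self.matcher(query, c), c) for c in self.cases]
        return [(s, c) for s, c in scored if s > 0.0]


def federated_match(query: Case, nodes: List[Node]) -> List[Tuple[str, str, float]]:
    """Send one query to every node and merge the responses, best score first."""
    results = []
    for node in nodes:
        for score, case in node.match(query):
            results.append((node.name, case.case_id, score))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Example: two nodes with different matching rules answer the same query.
node_a = Node("node_a", gene_only, [
    Case("A1", "GENE1", frozenset({"HP:0001250"})),
    Case("A2", "GENE2", frozenset({"HP:0001250"})),
])
node_b = Node("node_b", gene_and_phenotype, [
    Case("B1", "GENE1", frozenset({"HP:0001250", "HP:0001263"})),
    Case("B2", "GENE1", frozenset({"HP:0000365"})),
])
query = Case("Q1", "GENE1", frozenset({"HP:0001250", "HP:0001263"}))
matches = federated_match(query, [node_a, node_b])
print(matches)
```

In this sketch, the query returns one matched case from each node. The data never leave the node that holds them; only the match results are pooled, which is the property the federated design is meant to preserve.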

There are other reasons why the federated approach is preferred over a single centralized database of all datasets. For each database, the data holder is able to maintain complete control over the data, but when sharing those data with another entity via a centralized database, a higher regulatory burden must be met. This is because a central database must ensure a level of security compatible with all data submitters, which is not an easy task. Also, regulations vary by country regarding the sharing of genomic information. Moreover, users may only want to share certain datasets, which can be better controlled by the use of an API. And, finally, there is the issue of phenotypes, which are dynamic within a patient, but static within a disconnected database, making them difficult to share “longitudinally,” meaning when measured at various time points.

Given these factors, a federated network in which multiple databases—each slightly different—are connected through APIs, and each database supports queries of the other networked databases, is a big step forward.

Heidi Rehm, Associate Professor of Pathology at Brigham & Women’s Hospital and Harvard Medical School, explains, “Each database may have a different primary focus, and whereas they all store data for rare diseases, some focus more on the genotype, others, on the phenotype, and still others focus on the candidate gene. With MME, we have a common set of critical fields that were put together by the members of the Exchange, and they are important for answering the main question we are asking, which is ‘Do you have another case with these attributes?’”

Autonomous Databases

The MME network allows each database to remain autonomous with respect to its own data schema, and no single database operates as the “central” database. Yet, because the data come from a consortium of organizations, responsible data sharing is a major issue. To address this concern, the founding members of the MME have set up a governance infrastructure. This includes establishing a set of requirements for participating matchmaking services, a user agreement for those wishing to use the MME, and a steering committee to govern the program. That committee is composed of a representative from each approved MME service, as well as program organizers and representation from GA4GH and the International Rare Diseases Research Consortium. One of the issues the steering committee has tackled is the level of consent needed for MME.

Rehm, who sits on the steering committee, explains, “We developed a two-level consent policy for the MME to address what level of consent is required of patients based on risk of being discoverable through MME. When only sharing gene candidates and structured phenotypes, consent is not needed, but when genotype data and other sensitive information are shared, patients must consent to data sharing on the MME.”
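The two-level rule Rehm describes can be written as a simple check. The Python sketch below is only an illustration of that rule as stated in the quote; the field names and the function consent_required are hypothetical and do not come from the MME policy documents.

```python
# Hypothetical field names for the low-risk tier described in the quote.
LOW_RISK_FIELDS = frozenset({"candidate_genes", "structured_phenotypes"})


def consent_required(shared_fields):
    """Return True if anything beyond the low-risk tier is being shared.

    Assumption: any field outside LOW_RISK_FIELDS (for example genotype
    data) is treated as sensitive and therefore requires patient consent.
    """
    return not (set(shared_fields) <= LOW_RISK_FIELDS)


print(consent_required(["candidate_genes", "structured_phenotypes"]))
print(consent_required(["candidate_genes", "genotype"]))
```

Under these assumptions, the first call reports that no consent is needed and the second reports that it is, because genotype data fall outside the low-risk tier.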

Currently, there are a dozen participants in MME, three of which have implemented the MME API and are capable of returning results from queries: DECIPHER, GeneMatcher, and PhenomeCentral. DECIPHER is an interactive database with a suite of tools designed to aid in the interpretation of genomic variants. It contains in excess of 20,000 open access patient records, meaning the patients have given consent for data sharing, plus another 25,000 anonymized records. GeneMatcher allows investigators to post a gene of interest and then connect with other investigators who post the same gene. The match is done automatically. GeneMatcher, which has nearly 2,200 candidate genes, was developed with support from the Baylor-Hopkins Center for Mendelian Genomics as part of the Centers for Mendelian Genomics network. PhenomeCentral uses algorithms to match cases on de-identified phenotypic and genotypic data, and currently has data on more than 1,000 patients.

Thus far, MME is making impressive headway. There have been seventeen publications, and looking ahead, other services plan to launch the MME API at various time points, adding to the network. Also, MME services will continue refining the matching algorithms and integrating additional supporting evidence as to why a candidate gene has been flagged in a given case. Soon, patients will be brought into the mix, and will be able to establish patient-led matchmaking.