Patient Matching Algorithm Challenge

Posted July 15, 2017

The goal of the Patient Matching Algorithm Challenge is to bring about greater transparency and data on the performance of existing patient matching algorithms, spur the adoption of performance metrics for patient data matching algorithm vendors, and positively impact other aspects of patient matching such as deduplication and linking to clinical data. Participants will be provided a data set and will have their answers evaluated and scored against a master key. Up to 6 cash prizes will be awarded with a total purse of up to $75,000.00.

Challenge Summary

This Challenge uses a large test data set, provided by ONC, against which participants must run their algorithms and provide their results for evaluation. A small set of true match pairs (that have been created and verified through manual review) exist within the large data set and will serve as the “answer key” against which participants’ submissions will be scored.

Participants will unlock and download the test data set at the time of registration. Participants will then run their algorithms and submit their results to the scoring server on this website. These submissions will receive performance scores and may appear on a Challenge leaderboard. Upon submitting results, participants will receive objective evaluation metrics (F-Scores) that can be used to guide system improvements; a total of 100 re-runs will be allowed. Up to six participants will be selected as winners for this Challenge and awarded cash prizes. Top prizes will be awarded to participants whose algorithms generate the highest F-Scores. Additionally, the algorithms with the best recall, the best precision, and the best first-run F-Score will each receive a cash prize.

Background Information

In late 2015, the Office of the National Coordinator for Health Information Technology (ONC) published “Connecting Health and Care for the Nation: A Shared Nationwide Interoperability Roadmap” (the Roadmap). The final Roadmap reflects the combined contributions of dozens of experts and hundreds of public comments received during its drafting phase. The Roadmap includes “Section L,” which was specifically framed to reflect the challenges health care faces with respect to accurate individual data matching. This section highlights matching’s overall importance to interoperability and the nation’s health IT infrastructure. Indeed, health care providers must be able to share patient health information and accurately match a patient to his or her data from a different provider in order for many anticipated interoperability benefits to be realized. Conversely, matching mistakes can contribute to adverse events, compromised safety and privacy, and increased health care costs due to repeat tests and other factors. The cost to manually correct mismatched patient records is estimated at $60 per record, not including the potential harm caused by a patient receiving the wrong treatment and potential legal fees.

Given the substantive impacts poor patient matching can have on care delivery, it is important for organizations to be able to quantify their patient matching algorithm’s performance and compare the results to industry-standard benchmarks and performance metrics. To date, the absence of such benchmarks, as well as a baseline for different use cases, has made it difficult to make advances in patient matching. Reports such as ONC’s Patient Identification and Matching Final Report list patient match rates ranging from 50% to the mid-90% range.

Every patient matching algorithm has blind spots, and there are methods to calculate the performance of a patient matching algorithm and help identify these blind spots. This is accomplished by giving an algorithm a known data set in order to see how many of the known linkages the algorithm can correctly identify. Matching algorithms can make two types of errors. The first error is the failure to find a matching pair (often referred to as a “false negative”), which is measured by “recall” in the field of information retrieval. The second type of error is a record that is matched when it should not be (often referred to as a “false positive”), which is measured by a metric known as “precision.” The harmonic mean of precision and recall generates the final metric pertinent to this Challenge, known as the “F-Score.”
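As a minimal sketch of the scoring described above, the snippet below computes precision, recall, and the F-Score for a set of predicted match pairs against a known answer key. The function name and the toy record IDs are illustrative, not the Challenge's actual scoring code.

```python
def score_matches(predicted, truth):
    """Score predicted match pairs against a known answer key.

    Pairs are stored as frozensets so (a, b) and (b, a) compare equal.
    """
    predicted = {frozenset(p) for p in predicted}
    truth = {frozenset(p) for p in truth}

    true_positives = len(predicted & truth)
    false_positives = len(predicted - truth)   # matched, but should not be
    false_negatives = len(truth - predicted)   # true pairs the algorithm missed

    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Example: 3 true pairs; the algorithm finds 2 of them plus 1 spurious pair,
# so precision, recall, and F-Score all come out to 2/3.
truth = [("p1", "p2"), ("p3", "p4"), ("p5", "p6")]
predicted = [("p1", "p2"), ("p3", "p4"), ("p7", "p8")]
precision, recall, f_score = score_matches(predicted, truth)
```

Note how the F-Score penalizes an algorithm that sacrifices one metric for the other, which is why the supplemental prizes for best precision and best recall each require the other metric to stay at or above 0.9.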

Important Note

The Patient Matching Challenge opened on 6/12/17 at noon EST.

Challenge Timeline

  • Announcement of Challenge: April 28, 2017
  • Registration Period Begins: May 10, 2017
  • Submission Period Begins: Upon availability of test data
  • Submission Period Ends: September 12, 2017
  • Winners Notified: 1 week from the end of submission period
  • Winners Announced: 1 week from winner notification date


On May 10, May 17, and May 24, 2017, online webinars were held. The recorded webinars and PowerPoint slides can be found below.

Date: May 10, 2017

Recorded Webinar Link
PowerPoint Slides

Date: May 17, 2017

Recorded Webinar Link
PowerPoint Slides

Date: May 24, 2017

Recorded Webinar Link
PowerPoint Slides

Challenge Requirements

The patient matching Challenge website will manage the submissions and provide the scoring results back to the participant. Results may be submitted as CSV, XML, or JSON files. In order for a submission to be eligible to win this Challenge, it must meet the following requirements:
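As an illustration of one of the accepted formats, the sketch below writes predicted match pairs to a CSV file. The column names (`record_id_1`, `record_id_2`) and file layout are assumptions for illustration only; the actual layout required by the Challenge scoring server is not specified here.

```python
import csv

def write_submission(pairs, path):
    """Write predicted match pairs to a CSV file (hypothetical layout)."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["record_id_1", "record_id_2"])  # assumed header
        for record_a, record_b in pairs:
            writer.writerow([record_a, record_b])

# Two hypothetical matched pairs from the test data set
write_submission([("1001", "2042"), ("1003", "2077")], "submission.csv")
```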

  • No HHS or ONC logo – The product must not use HHS’ or ONC’s logos or official seals and must not claim endorsement.
  • A product may be disqualified if it fails to function as expressed in the description provided by the Submitter, or if it provides inaccurate or incomplete information.
  • Submissions must be free of malware. Submitter agrees that ONC may conduct testing on the product to determine whether malware or other security threats may be present. ONC may disqualify the submission if, in ONC’s judgment, it may damage government or others’ equipment or operating environment.

Prizes

Highest F-Score

  • First Place: $25,000
  • Second Place: $20,000
  • Third Place: $15,000

Best in Category Supplemental Prizes (1 prize for each category at $5,000):

  • Best precision (with recall >= 0.9)
  • Best recall (with precision >= 0.9)
  • Best first F-Score run

Total Prize Purse: Up to $75,000
