Matt Holms, Vice President, Commercial Patient Engagement & Recruitment, Citeline
Key Takeaways
· 80% of clinical trials experience enrollment delays and massive sunk costs with underperforming sites.
· Only 4% of US physicians participate in clinical research, leaving 96% of patients inaccessible under other HCPs’ care.
· Tokenization links Citeline’s RWD and proprietary data: specifically, 300+ million claims lives, 245+ million lab lives, and 55+ million EMR patient lives, with 1.7+ million HCP National Provider Identifiers (NPIs).
· Traditional direct-to-patient advertising campaigns fail when protocol criteria are complex, most commonly in oncology and rare disease protocols.
· Last-mile operationalization requires fair market value compensation and patient concierge services.
Clinical trial site selection remains one of the pharma industry's most expensive guessing games. "If you look at the cost to get a site set up, it can range from anywhere from $40,000 to $60,000. When you hire a site, if they do not get you any patients, that's a tremendous sunk cost," Matt Holms explained at Clinical Innovation 2025. Industry data (Tufts Center for the Study of Drug Development) confirms that half of all study sites consistently underperform their enrollment targets, creating a cascade of delays and change orders that plague trial timelines.
The root cause extends far deeper than site performance metrics suggest. "Really it's only about 4% of physicians in the United States that participate in clinical research. Therefore, 96% of the patients out there in the US are not in the sites’ databases that you're hiring and that everybody is so fixated on," Holms emphasized. Holms, whose son participated in a Phase 2 clinical trial at Duke, treating symptoms of his autism spectrum disorder (ASD) with stem cells, brings personal urgency to solving patient access challenges in clinical research that extend beyond operational efficiency into therapeutic innovation itself.
Traditional Patient Recruitment Approaches Can't Solve a Scarcity Problem
Holms described the “first generation of patient recruitment” as the assumption that the clinical research sites sponsors selected would enroll 100% of the trial with patients from their own databases alone. The data consistently contradict that premise. "The duration of trials are 30% longer, 80% of studies are late, and 85% of studies require a significant protocol amendment," Holms noted, citing industry-wide performance metrics that reflect systemic rather than isolated failures.
Sponsors invest substantial resources in site selection and partner with CROs that claim superior investigator networks, yet all too often they face rescue scenarios six months into Phase 2 and Phase 3 studies when enrollment contributions collapse.
The principal investigator scarcity crisis operates at a scale most planning models ignore. A retrospective analysis covering 1999 to 2015 identified about 173,000 investigators who filed a Form FDA 1572 (the Statement of Investigator) during that 16-year period. "There were about 173,000 investigators; 50% of them did a clinical trial once and then said, ‘This isn't for me, we're done,’" Holms revealed. The attrition reflects operational realities: Research demands time, resources, and infrastructure that most clinical practices cannot sustain alongside patient care obligations.
Holms said the second generation of patient recruitment attempted to circumvent site database limitations through direct-to-patient advertising campaigns, often using a variety of outreach tactics to reach patients not known to the study sites. These central campaigns have proven effective in chronic indications where patients can viably answer IRB-approved pre-screening questions about conditions like COPD, asthma, or diabetes. However, complex protocol designs expose the approach’s fundamental weakness. Holms presented an ER-positive, HER2-negative breast cancer case in which patients responding to outreach advertisements could confirm their breast cancer diagnosis but could not answer many other critical protocol questions about receptor status, prior treatment lines, or biomarker results. The approach breaks down when patients cannot accurately self-report on the majority of a protocol’s key criteria at the first stage of responding to outreach. As a result, the pre-screened referrals reaching sites often lacked sufficient qualification to justify the administrative burden.
"A lot of these research sites are understaffed. They don't have the capacity to contact all the pre-screened referrals that are sent to them to begin with and often clinical trial agreements don’t account for remuneration for the time and effort required to process external referrals. This is where things fall down a lot of the time," Holms observed. Rescue campaigns are put in place as a last-ditch effort, but when a protocol is complex and the IRB/EC-approved pre-screener questions cover only a small percentage of the inclusion/exclusion (I/E) criteria, pre-screened referral volumes can overwhelm rather than support site operations.
Tokenization as Infrastructure: Linking RWD & Proprietary Data for Visibility at the Patient, Disease, and Provider Level
Holms explained that Citeline’s vision for the “third generation of patient recruitment” replaces assumption-based planning with data-linked patient identification at population scale. Citeline has aggregated real-world data across multiple dimensions, creating visibility into patient populations that traditional site databases cannot access. "We have 55 million-plus EMR lives, 245 million-plus lab lives, 300 million claims lives and have tokenized this data in combination with Citeline’s world-class proprietary data," Holms said.
Tokenization technology, developed by Datavant, eliminates data silos by assigning unique patient tokens that aggregate patient longitudinal data from sources such as EMR, lab, and claims information into unified patient profiles. The innovation extends beyond patient data aggregation. "We have about 1.7 million HCPs in the United States for which we have preferred contact information that's not in the public domain. And when we tokenize all of these data sources we're eliminating the silos," Holms noted. The patient tokens can be linked to National Provider Identifier numbers for HCPs, as well as facility NPIs, creating visibility into who treated patients, when they were treated, how they were treated, where treatment occurred, and what therapies were administered.
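The linking idea can be sketched in a few lines of Python. This is an illustration only, not Datavant's or Citeline's implementation: the keyed hash, the field names, the sample records, and the `make_token` and `link_records` helpers are all assumptions made for the example. In practice, tokens are generated where the data originates, so the party doing the linking never sees names or dates of birth.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical key for illustration; real tokenization services manage keys differently.
SECRET_KEY = b"demo-key-not-for-production"


def make_token(first_name: str, last_name: str, dob: str) -> str:
    """Derive a deterministic, non-reversible token from normalized identifiers."""
    normalized = "|".join(part.strip().lower() for part in (first_name, last_name, dob))
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(records):
    """Group records from separate sources into one longitudinal profile per token."""
    profiles = defaultdict(lambda: {"sources": set(), "npis": set(), "events": []})
    for rec in records:
        token = make_token(rec["first_name"], rec["last_name"], rec["dob"])
        profile = profiles[token]
        profile["sources"].add(rec["source"])
        profile["npis"].add(rec["npi"])  # which provider treated the patient
        profile["events"].append((rec["date"], rec["source"], rec["detail"]))
    for profile in profiles.values():
        profile["events"].sort()  # chronological patient journey
    return dict(profiles)


# Made-up sample records; the NPIs are placeholders, not real identifiers.
RECORDS = [
    {"source": "claims", "first_name": "Ana", "last_name": "Diaz", "dob": "1970-05-02",
     "npi": "1111111111", "date": "2024-01-10", "detail": "oncology visit"},
    {"source": "lab", "first_name": " ana", "last_name": "DIAZ ", "dob": "1970-05-02",
     "npi": "2222222222", "date": "2024-02-03", "detail": "biomarker panel"},
    {"source": "emr", "first_name": "Ana", "last_name": "Diaz", "dob": "1970-05-02",
     "npi": "1111111111", "date": "2024-03-15", "detail": "treatment note"},
    {"source": "claims", "first_name": "Ben", "last_name": "Okoro", "dob": "1982-09-30",
     "npi": "3333333333", "date": "2024-01-22", "detail": "pulmonology visit"},
]

profiles = link_records(RECORDS)
```

Because the token is derived the same way in every source, the three records for the first patient collapse into one profile that spans claims, lab, and EMR data and carries the NPIs of both treating providers.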
Protocol-specific algorithms query this data lake using key protocol inclusion-exclusion criteria, generating four alert types that fundamentally expand recruitment geography. PI-treated alerts identify patients in investigator databases. PI partner-treated alerts surface patients managed by colleagues in the same practice as an investigator. Affiliated HCP alerts reveal patients within the same health system but treated by non-research HCPs. Geotargeted HCP alerts identify nearby providers outside the hospital network. Weekly data refreshes keep protocol matching close to real time.
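One way to picture the four alert types is as a cascade of checks on the relationship between a matched patient's treating provider and a study investigator. The sketch below is a simplified illustration under stated assumptions: the `Provider` fields, the 50-mile radius, the placeholder NPIs, and the order of the checks are hypothetical and are not taken from Citeline's algorithms.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    npi: str
    practice_id: str
    health_system_id: str
    lat: float
    lon: float


def miles_between(a: Provider, b: Provider) -> float:
    """Great-circle distance in miles (haversine formula)."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_miles * math.asin(math.sqrt(h))


def classify_alert(treating: Provider, investigator: Provider,
                   radius_miles: float = 50.0) -> Optional[str]:
    """Return the alert type for a protocol-matched patient, or None if out of reach."""
    if treating.npi == investigator.npi:
        return "PI-treated"
    if treating.practice_id == investigator.practice_id:
        return "PI partner-treated"
    if treating.health_system_id == investigator.health_system_id:
        return "Affiliated HCP"
    if miles_between(treating, investigator) <= radius_miles:
        return "Geotargeted HCP"
    return None


# Hypothetical providers; coordinates are approximate city centers.
pi = Provider("1000000001", "practice-A", "system-1", 35.994, -78.899)          # Durham
partner = Provider("1000000002", "practice-A", "system-1", 35.994, -78.899)     # same practice
affiliated = Provider("1000000003", "practice-B", "system-1", 35.913, -79.056)  # same system
nearby = Provider("1000000004", "practice-C", "system-2", 35.780, -78.638)      # Raleigh
distant = Provider("1000000005", "practice-D", "system-3", 35.227, -80.843)     # Charlotte

alerts = [classify_alert(p, pi) for p in (pi, partner, affiliated, nearby, distant)]
```

Ordering the checks from the closest relationship to the loosest means each patient gets the single most actionable alert type, and providers beyond the radius generate no alert at all.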
Oncology economics illustrate why affiliated HCP identification matters strategically. "If you think about reimbursement in the United States with oncology, a lost patient can often translate to lost revenue for HCPs. So affiliated means they're part of the same health or hospital system and remunerated from the same source where there is a higher likelihood of referring patients to a PI within the same system," Holms explained.
The Implications of IRT Data Redaction for Results-Based Pricing
Privacy protection measures created unintended operational consequences that have destabilized recruitment vendor accountability models. Historically, recruitment companies verified that pre-screened referred patients either consented or randomized by matching Interactive Response Technology (IRT) data using unique identifiers like first and last initials combined with full date of birth. Sites updated vendor portals inconsistently, making IRT data the gold standard reconciliation mechanism for results-based pricing contracts for recruitment vendors meeting patient delivery milestones.
"There's been a big shift in the industry to redact that data where sponsors’ IRT systems typically are only collecting year of birth and/or gender now. This has led to unintended recruitment vendor behaviours where the only way a lot of recruitment companies are able to corroborate that they actually delivered a consented or randomized patient is to essentially bombard the sites with communications to validate if referrals signed an informed consent form (ICF) and ultimately randomized," Holms said.
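A small Python sketch shows why redaction undermines reconciliation. The records, field names, and helper functions are made up for the example and do not represent any real IRT system's schema. With initials plus full date of birth, each referral maps to at most one randomized subject. With only year of birth and gender, several subjects share the same key, so the vendor cannot tell its own referral from a patient the site found itself.

```python
from collections import Counter

# Hypothetical vendor referrals and IRT rows; all values are invented.
REFERRALS = [
    {"referral_id": "R1", "initials": "AB", "dob": "1968-03-14", "gender": "F"},
    {"referral_id": "R2", "initials": "CD", "dob": "1968-11-02", "gender": "F"},
]

IRT_ROWS = [
    {"subject_id": "S-001", "initials": "AB", "dob": "1968-03-14", "gender": "F"},  # vendor's referral R1
    {"subject_id": "S-002", "initials": "EF", "dob": "1968-07-21", "gender": "F"},  # site's own patient
]


def full_key(row):
    """Historical matching key: first and last initials plus full date of birth."""
    return (row["initials"], row["dob"])


def redacted_key(row):
    """Key available after redaction: year of birth and gender only."""
    return (row["dob"][:4], row["gender"])


def candidate_counts(referrals, irt_rows, key):
    """For each referral, count how many IRT subjects share its matching key."""
    irt_counts = Counter(key(row) for row in irt_rows)
    return {r["referral_id"]: irt_counts[key(r)] for r in referrals}


before = candidate_counts(REFERRALS, IRT_ROWS, full_key)     # {'R1': 1, 'R2': 0}
after = candidate_counts(REFERRALS, IRT_ROWS, redacted_key)  # {'R1': 2, 'R2': 2}
```

A count of exactly one is a verifiable delivery and a count of zero is a verifiable miss. Under the redacted key, both referrals match two subjects, so neither result can be confirmed from IRT data alone.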
Vendor accountability through risk-sharing is important, but the inability to validate results via IRT has led several experienced sponsors to realize that some risk-share models directly add to the burden on sites, because contacting sites is the only mechanism left for recruitment companies to get remunerated. It sounds counterintuitive, but some large sponsors have dropped risk-share requirements for their patient recruitment vendors because they prioritize preserving site relationships over vendor risk sharing.
The accountability dilemma also extends to referring physicians who operate outside investigator networks. "We get the question all the time, ‘What's in it for a non-research HCP to refer a patient that could fit a protocol’s I/E criteria?’ Non-investigator physicians face three simultaneous barriers: time required for chart review and coordination, revenue loss from reimbursement structures, and patient loss from clinical continuity," Holms said. Some sponsors are starting to assess fair market value compensation frameworks that address these barriers by compensating HCPs for their time and chart review work. It is critical to note that these fair market value models do not pay a non-research HCP for sending a referral, as this could violate anti-kickback statutes. Revenue and continuity concerns require broader solutions to facilitate patient access to clinical trials.
The Last Mile: Why Data Identification Is Only a Part of the Solution
Citeline launched its tokenized recruitment solution, Citeline PatientMatch, nearly two years ago, generating critical learnings about the gap between patient identification and patient randomization. "The biggest learning we've had is that the focus was initially just on the data, but the term, 'last mile' relates to services and support infrastructure to facilitate getting a patient to the site, whether it's through organic outreach or through referring HCPs. This is a huge focus in the industry right now," Holms said, emphasizing that data alerts alone do not address site capacity constraints or referring physician barriers.
The "last mile" encompasses service infrastructure that operationalizes data insights. Many sponsors’ medical affairs teams, often the medical science liaisons (MSLs), focus on building relationships and engaging potential referring non-investigator physicians around trial sites. Citeline’s comprehensive patient-matched alerts provide MSLs a roadmap for where these protocol-matched patients exist within an eligibility window at non-research HCP practices.
Cross-vendor collaboration defines Citeline's strategic approach. The company partners with organizations that have also tokenized data through Datavant, enabling token-sharing for direct-to-patient outreach where appropriate. Other partnerships focus on providing “last mile” patient concierge services that can operate on the protected health information (PHI) side to follow up with identified patients. "We want to partner with whoever can help with the last mile and use our data to be able to make patient access a reality in a lot of these indications that are an objective for sponsors, but also patients out there looking for trials," Holms said, underscoring that financial structures must evolve alongside data capabilities.
Alert specificity emerged as a critical success factor. "A big learning for us in the beginning was these alerts have to be specific enough to have detail for the site to then re-identify them in their EMR. We have continued to enhance the amount of details that we provide for alerts," Holms said. Measurement challenges persist where tracking conversion from identification to randomization requires coordination across MSLs, contracting teams, patient services, and sites — a cross-functional orchestration that extends beyond traditional recruitment vendor scope.
From Proprietary Advantage to Collaborative Ecosystem
Holms said third-generation recruitment represents a paradigm shift from simply "hire the best sites" to "identify patients both in sites and around sites at a specific moment in time based on their clinical pathway." The 96% of patients treated by physicians who don't conduct research become accessible through tokenized data linkage, but accessibility requires operational models that address physician incentives and site capacity simultaneously. "Now you need to be fixated on these sites. There's no denying that. But the assumption that they're going to enroll all of your study hasn't been the case for a long time," Holms reiterated, framing patient access as both a data and service delivery challenge rather than simply a site selection problem.
Upstream applications demonstrate the technology's strategic range. Sponsors increasingly use tokenized RWD for protocol design and feasibility analysis before finalizing inclusion-exclusion criteria, preventing enrollment challenges through evidence-based protocol optimization.
Therapeutic area segmentation matters: Direct-to-patient campaigns retain value in chronic disease indications where patients can appropriately answer pre-screening questions, while protocols in oncology, rare disease, and other complex indications require data-driven identification of protocol-matched patients and their treating physicians.
"We're certainly seeing a lot of sponsors who are having their MSLs and medical affairs engaging with HCPs that are not actually the PI, doing dinners with these HCPs and trying to assess appropriate ways to remunerate them for their time and efforts," Holms noted, describing infrastructure that scales referring physician engagement beyond manual MSL outreach.
Portfolio implications extend to cost-per-randomized-patient metrics that include sunk site costs. When half of sites fail to meet enrollment targets and setup costs range from $40,000 to $60,000, identifying patients around underperforming sites provides rescue options that preserve initial site investments. "I do believe that collaboration is key for us to be successful, and this is why Citeline is welcoming collaboration with other niche vendors that in some cases have been historical competitors," Holms said. Third-generation recruitment will succeed not through proprietary advantage alone but through orchestrated collaboration that converts data visibility into randomized patients.
In order to get you the highlights of Pharma Clinical Innovation USA 2025 faster, we are using generative AI technology to summarise the transcripts of the sessions. The conference organiser is checking the summary for accuracy. If you have any feedback about the summary and the post-event report, please contact Daisy.Beale@thomsonreuters.com
Discover more on this topic at Pharma USA 2026 (March 17-18, Philadelphia) - North America's largest cross-functional pharma gathering.