Aetion | Apr 30, 2025 | 7 min read

Feature Spotlight: Streamlining Propensity Score Adjustment with Substantiate

Nataile Schibell, MPH and Pippa Hodgkins

 

 

Addressing Confounding in Real-World Treatment Comparisons

 

In real-world settings, treatment decisions aren’t randomized—they reflect clinical judgment, patient characteristics, and systemic factors. These underlying differences introduce confounding, making it difficult to isolate true treatment effects.

Propensity score (PS) methods, such as matching, weighting, and high-dimensional PS, help address this bias by estimating the probability of treatment based on baseline covariates. When applied correctly, these techniques can balance treatment groups on measured covariates and support more credible comparisons in non-randomized studies. High-dimensional PS, as described in recent peer-reviewed research by Aetion co-founders Jeremy Rassen, Sc.D., and Sebastian Schneeweiss, M.D., Sc.D., offers a scalable, data-driven approach to covariate selection in complex datasets such as claims and EHRs.¹
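To make "estimating the probability of treatment based on baseline covariates" concrete, here is a minimal sketch that fits a logistic regression by plain gradient descent on a toy cohort. The data, the function names (`fit_propensity_model`, `propensity_scores`), and the learning-rate settings are hypothetical choices for illustration; this is not a description of how Substantiate fits its models.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity_model(X, treated, lr=0.5, n_iter=5000):
    """Fit P(treated = 1 | covariates) with logistic regression,
    using batch gradient descent. Returns (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad_intercept, grad = 0.0, [0.0] * p
        for xi, ti in zip(X, treated):
            linear = intercept + sum(c * x for c, x in zip(coefs, xi))
            err = sigmoid(linear) - ti  # predicted minus observed
            grad_intercept += err
            for j in range(p):
                grad[j] += err * xi[j]
        intercept -= lr * grad_intercept / n
        coefs = [c - lr * g / n for c, g in zip(coefs, grad)]
    return intercept, coefs

def propensity_scores(X, intercept, coefs):
    """Predicted probability of treatment for each patient."""
    return [sigmoid(intercept + sum(c * x for c, x in zip(coefs, xi))) for xi in X]

# Hypothetical baseline covariates per patient:
# [age in decades, centered at 60; prior hospitalization (0/1)]
X = [[-1.5, 0], [-1.0, 0], [-0.5, 1], [0.0, 0],
     [0.5, 1], [1.0, 1], [1.5, 0], [2.0, 1]]
treated = [0, 0, 1, 0, 1, 0, 1, 1]

intercept, coefs = fit_propensity_model(X, treated)
ps = propensity_scores(X, intercept, coefs)
```

Each value in `ps` is one patient's estimated probability of receiving the treatment given their baseline covariates. Matching and weighting both start from these scores.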

Aetion® Substantiate operationalizes these methods within a structured, transparent workflow, enabling teams to adjust for confounding with scientific rigor and day-to-day efficiency. From study design through reproducible outputs, Substantiate helps ensure that PS implementation is consistent, scalable, and aligned with regulatory expectations.

 

Why Use Substantiate for Propensity Score Adjustment

Propensity score methods are foundational to confounding adjustment in real-world evidence. But their value depends on consistent, transparent execution. Substantiate enables teams to implement these methods—matching, weighting, and high-dimensional PS—within a defined, audit-ready workflow that aligns with study protocols.

Rather than stitching together manual steps or relying on custom code, users can apply PS adjustments end-to-end within the platform’s Comparative Effectiveness Analysis Plan. Every step is linked, from cohort construction through covariate selection, model configuration, diagnostics, and output, ensuring alignment across studies and teams.

Image 1: Getting started with the Comparative Effectiveness Analysis Plan

Substantiate allows users to:

  • Select covariates and define baseline windows: Configure variables and time frames directly within the study design interface, anchored to key analytic milestones.
  • Choose the adjustment method: Apply PS matching, inverse probability of treatment weighting (IPTW), ATT/SMR weighting, overlap weighting, or high-dimensional PS—based on study requirements. Each analysis plan supports one matching and one weighting method, with additional methods available through sensitivity analyses.
  • Configure method-specific parameters: Adjust caliper width, matching ratios, trimming thresholds, or variable ranking strategy as needed.
  • Evaluate covariate balance: Review diagnostics, including standardized mean differences and distribution plots, to confirm baseline alignment.
  • Export study specifications and version-controlled outputs: Generate traceable documentation for internal governance or external submission, with every decision captured in-platform.
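As an illustration of the balance diagnostic mentioned in the list above, the sketch below computes a standardized mean difference (SMD) for one continuous covariate. The function name and the toy ages are hypothetical, and the pooled-standard-deviation form shown here is one common definition, not necessarily the exact formula the platform reports.

```python
import math

def standardized_mean_difference(treated_vals, comparator_vals):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2),
    using sample variances (n - 1 in the denominator)."""
    def mean(values):
        return sum(values) / len(values)

    def sample_var(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    diff = mean(treated_vals) - mean(comparator_vals)
    pooled_sd = math.sqrt((sample_var(treated_vals) + sample_var(comparator_vals)) / 2)
    if pooled_sd == 0:
        # No variation in either group: balanced only if the means agree.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd

# Hypothetical baseline ages before any adjustment
age_treated = [62, 70, 68, 75]
age_comparator = [55, 60, 58, 63]
smd_age = standardized_mean_difference(age_treated, age_comparator)
```

A commonly cited rule of thumb treats an absolute SMD below 0.1 as adequate balance. The toy groups here are far above that, which is the kind of imbalance matching or weighting is meant to reduce.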

This framework supports reproducibility and operational consistency while giving teams the flexibility to tailor methods to the complexity of their data. Substantiate brings structure to real-world analytics—so that scientifically sound methods scale reliably across studies, datasets, and therapeutic areas.

 

PS Matching in Substantiate

Propensity score matching is a widely used method for reducing confounding in real-world comparative studies. The goal is to create two groups with similar observed characteristics, so any outcome differences are more likely due to treatment than baseline differences.

In Substantiate, propensity score matching is built into the core study workflow. Analysts define the covariates and specify the time window for baseline measurement, and the platform generates propensity scores. Based on the configuration, Substantiate then matches patients across treatment arms using one of several matching algorithms, including 1:1 and variable-ratio methods.

 

Matching methods available in Substantiate:

  • 1:1 Nearest Neighbor Matching (without replacement): Creates tightly aligned patient pairs for precise comparability.
  • Variable-Ratio Matching (Parallel or Sequential): Expands matching flexibility by allowing each treated patient to be matched with multiple referents, preserving more sample size.
  • Caliper (optional parameter): Can be applied across all matching methods to restrict matches to a specified score distance, improving balance and match quality.
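The first and last items above can be sketched together: a greedy 1:1 nearest-neighbor match without replacement, with an optional caliper. This is a simplified illustration under stated assumptions. The function name, the patient IDs, the processing order (treated patients in ascending score order), and a caliper expressed on the raw propensity score scale are all hypothetical choices, not the platform's algorithm.

```python
def match_nearest_neighbor(treated_ps, comparator_ps, caliper=None):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated_ps / comparator_ps: dicts mapping patient id -> propensity score.
    caliper: maximum allowed absolute score distance (None = no limit).
    Returns a list of (treated_id, comparator_id) pairs.
    """
    available = dict(comparator_ps)  # comparators not yet used
    pairs = []
    for t_id, t_score in sorted(treated_ps.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining comparator by absolute score distance
        c_id = min(available, key=lambda k: abs(available[k] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match; treated patient stays unmatched
        pairs.append((t_id, c_id))
        del available[c_id]  # without replacement
    return pairs

# Hypothetical scores
treated_scores = {"T1": 0.30, "T2": 0.52, "T3": 0.90}
comparator_scores = {"C1": 0.28, "C2": 0.50, "C3": 0.60, "C4": 0.10}

pairs = match_nearest_neighbor(treated_scores, comparator_scores, caliper=0.05)
```

With the caliper in place, T3 goes unmatched because no remaining comparator is within 0.05 of its score. That is the trade-off the caliper makes: better match quality at the cost of some sample size.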

Matching is particularly useful when constructing trial-like cohorts—such as in external control arms, trial emulation studies, or regulatory-aligned analyses. Substantiate tracks and retains all produced study outputs, providing traceability of matching parameters and patient selection logic throughout the study lifecycle.

Image 2: Propensity score method selection options in the Comparative Effectiveness Analysis Plan

PS Weighting in Substantiate

Propensity score weighting adjusts for baseline differences by scaling how much each patient contributes to the analysis. Unlike matching, all patients are retained; their influence is weighted based on the probability of receiving treatment, rebalancing the population to improve comparability. This approach is particularly useful when preserving sample size is important or when treatment arms have sufficient overlap but full matching isn’t feasible. For a detailed explanation of these methods, see Understanding Propensity Score Weighting Methods.

In Substantiate, weighting is integrated directly into the comparative effectiveness workflow. Users can select from several weighting strategies and configure method-specific parameters—all within a structured, transparent interface that supports consistent application across studies.

Image 3: Propensity score overlap diagram showing patient density across treatment arms

Weighting options available in Substantiate:

  • Inverse Probability of Treatment Weighting (IPTW): Estimates the average treatment effect (ATE) across the full population.
  • ATT/SMR Weighting: Estimates the average treatment effect among the treated (ATT); commonly used when comparing real-world data to clinical trial cohorts.
  • Overlap Weighting: Focuses on patients with similar treatment probabilities, reducing the influence of outliers and improving balance.

Truncation and trimming can be applied to any of the above methods to reduce the impact of extreme weights. This is particularly important in studies with low overlap or high-dimensional covariate sets.
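For readers who want to see how the three weighting schemes differ, the sketch below writes out the standard textbook weight formulas as functions of the propensity score, along with simple versions of truncation (capping extreme weights) and trimming (excluding patients with extreme scores). Function names, thresholds, and the toy patients are hypothetical, and the platform's own parameterization may differ.

```python
def ps_weight(ps, treated, method):
    """Weight for one patient, given propensity score ps (0 < ps < 1)."""
    if not 0.0 < ps < 1.0:
        raise ValueError("propensity score must be strictly between 0 and 1")
    if method == "iptw":     # targets the average treatment effect (ATE)
        return 1.0 / ps if treated else 1.0 / (1.0 - ps)
    if method == "att":      # ATT/SMR: treated keep weight 1, comparators get odds
        return 1.0 if treated else ps / (1.0 - ps)
    if method == "overlap":  # emphasizes patients with scores near 0.5
        return 1.0 - ps if treated else ps
    raise ValueError(f"unknown method: {method}")

def truncate_weights(weights, max_weight):
    """Truncation: cap weights at max_weight; nobody is dropped."""
    return [min(w, max_weight) for w in weights]

def trim_by_score(patients, low, high):
    """Trimming: keep only patients whose score lies within [low, high]."""
    return [(ps, treated) for ps, treated in patients if low <= ps <= high]

# Hypothetical patients as (propensity score, treated?)
patients = [(0.10, True), (0.50, True), (0.90, False), (0.98, False)]

iptw = [ps_weight(ps, t, "iptw") for ps, t in patients]
capped = truncate_weights(iptw, max_weight=20.0)
kept = trim_by_score(patients, low=0.05, high=0.95)
```

The comparator with a score of 0.98 receives an IPTW weight of about 50, so a single patient would dominate the comparator arm. Truncation caps that influence, while trimming removes the patient from the analysis altogether.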

High-Dimensional Propensity Scores in Substantiate

In real-world datasets—especially claims and EHRs—the number of potential covariates can be extensive. Manually selecting covariates can be time-intensive and may require iterative clinical and methodological input. While investigator-defined models remain standard, they can be limited when working with unfamiliar therapeutic areas or large, exploratory datasets.

High-dimensional propensity score (hdPS) methods were developed to help address this complexity. Instead of relying solely on predefined covariate lists, hdPS uses structured, data-driven algorithms to systematically identify and rank covariates most likely to influence both treatment and outcome. This is especially useful when confounders are not well-characterized or when covariate definitions vary across datasets. Published validation has shown that hdPS can perform comparably to—or better than—investigator-defined models in high-dimensional settings.

Image 4: Patient Characteristics for patients with propensity scores generated by the high-dimensional propensity score model

In Substantiate, high-dimensional propensity score (hdPS) modeling is integrated directly into the workflow. The platform enables users to:

  • Specify data attributes: Choose from key data attributes, such as diagnoses, procedures, or medications, to define the covariate space used for automated selection in high-dimensional propensity score modeling.
  • Surface candidate covariates automatically: Identify potential confounders across the selected domains without manual extraction or coding.
  • Rank covariates based on user-selected logic: Choose from predefined options to prioritize covariates by prevalence (frequency) or bias potential (association with both treatment and outcome).
  • Generate the propensity score model: Build the PS model using the ranked covariates—entirely within the platform, with no external preprocessing required.

The process is fully transparent and easy to adjust. Users can control which data attributes are included, how far back in time to look for baseline information, and how covariates are prioritized—while relying on automation to identify those most likely to influence treatment and outcome. This helps ensure that studies remain consistent and repeatable across teams and datasets.
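The two ranking options can be illustrated with a small sketch. Ranking by prevalence simply orders candidate covariates by how often they occur. For bias potential, the published hdPS literature ranks binary covariates by the Bross bias multiplier, which combines the covariate's imbalance between arms with its association with the outcome. Whether Substantiate's "bias potential" option uses exactly this formula is an assumption here, and the covariate names and toy data are hypothetical.

```python
import math

def prevalence(rows, covariate):
    """Share of patients in rows with the binary covariate present."""
    return sum(r[covariate] for r in rows) / len(rows)

def bias_score(rows, covariate):
    """Absolute log of the Bross bias multiplier for a binary covariate."""
    treated = [r for r in rows if r["treated"] == 1]
    comparators = [r for r in rows if r["treated"] == 0]
    with_c = [r for r in rows if r[covariate] == 1]
    without_c = [r for r in rows if r[covariate] == 0]
    if not (treated and comparators and with_c and without_c):
        return 0.0
    p_c1 = prevalence(treated, covariate)      # prevalence among treated
    p_c0 = prevalence(comparators, covariate)  # prevalence among comparators
    risk_with = prevalence(with_c, "outcome")
    risk_without = prevalence(without_c, "outcome")
    if risk_with == 0 or risk_without == 0:
        return 0.0
    rr = risk_with / risk_without  # covariate-outcome relative risk
    if rr < 1:
        rr = 1.0 / rr              # treat protective associations symmetrically
    multiplier = (p_c1 * (rr - 1) + 1) / (p_c0 * (rr - 1) + 1)
    return abs(math.log(multiplier))

def rank_covariates(rows, covariates, by="bias"):
    """Order covariates from highest to lowest score."""
    if by not in ("bias", "prevalence"):
        raise ValueError(f"unknown ranking: {by}")
    score = bias_score if by == "bias" else prevalence
    return sorted(covariates, key=lambda c: score(rows, c), reverse=True)

# Hypothetical patient-level data
FIELDS = ("treated", "outcome", "dx_diabetes", "rx_statin", "px_imaging")
RAW = [
    (1, 1, 1, 1, 0), (1, 1, 1, 1, 0), (1, 0, 1, 0, 0),
    (1, 1, 1, 1, 1), (1, 0, 0, 1, 0), (1, 0, 0, 0, 0),
    (0, 0, 0, 1, 0), (0, 0, 0, 1, 0), (0, 1, 1, 0, 0),
    (0, 0, 0, 1, 0), (0, 0, 0, 0, 0), (0, 1, 0, 1, 0),
]
rows = [dict(zip(FIELDS, r)) for r in RAW]
candidates = ["dx_diabetes", "rx_statin", "px_imaging"]

by_bias = rank_covariates(rows, candidates, by="bias")
by_prevalence = rank_covariates(rows, candidates, by="prevalence")
```

In this toy data the statin flag is the most prevalent covariate but is equally common in both arms, so it ranks last on bias potential. The diabetes flag is imbalanced across arms and associated with the outcome, so it ranks first.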

Choose the Right PS Strategy—All in One Platform

Different study designs call for different adjustment strategies. Some require tightly matched cohorts for interpretability; others prioritize preserving the whole sample or minimizing variance. Substantiate supports multiple propensity score methods within a unified platform, giving researchers the flexibility to choose the right approach based on the question, the data, and the analytic constraints.

Propensity Score Methods and Their Application Across RWE Teams

| Method | Purpose | What It Does | Best Used When |
|---|---|---|---|
| Matching | Create balanced comparison groups | Matches patients with similar propensity scores; supports both 1:1 and variable-ratio configurations | Used by RWE teams conducting comparative effectiveness studies, regulatory-aligned analyses, and trial emulations |
| IPTW (Weighting) | Estimate the treatment effect across the full population | Applies weights to all patients to balance covariates between groups | Preferred by HEOR teams conducting population-level analyses where generalizability and full sample retention matter |
| ATT/SMR Weighting | Estimate the effect among those treated | Reweights the comparator arm to resemble the treated group | Common in safety or outcomes studies comparing real-world cohorts to trial populations or registry-based controls |
| Overlap Weighting | Focus on the most comparable subset of patients | Prioritizes patients with similar treatment probabilities; minimizes the influence of outliers | Ideal when treatment groups differ substantially; often used by methods teams and comparative safety researchers |
| High-Dimensional PS | Empirically identify key confounders in large datasets | Algorithmically selects and ranks covariates based on bias or prevalence to build the PS model | Used by data science teams working with large claims or EHR data when covariate selection is complex or uncertain |

 

Structure Matters: Substantiate Makes It Work

Substantiate equips research teams to implement propensity score methods—matching, weighting, and high-dimensional PS—within a consistent, transparent framework. The platform guides users from cohort construction through covariate selection, model configuration, and diagnostics, supporting both methodological rigor and day-to-day workflow efficiency.

All study inputs and outputs are version-controlled and fully documented, supporting internal reproducibility and external defensibility. Whether designing early feasibility analyses or generating comparative evidence for regulatory or payer decision-making, teams can rely on Substantiate to deliver consistency across studies and datasets.

Explore the Evidence Hub or contact our team to see how Substantiate powers reliable, scalable implementation of PS methods.
