Approved

Standard Operating Protocol: Cohort Definition Using OHDSI ATLAS

Purpose

This SOP describes a standardized process for defining cohorts in OHDSI ATLAS. It is tailored to CHoRUS Bridge2AI (B2AI) data converted to the OMOP CDM, but the procedures are generalizable to any ATLAS instance. The purpose is to ensure cohorts are defined correctly and consistently across teams, with clear documentation of inclusion/exclusion logic and adherence to best practices in observational research.

Scope

This SOP applies to all cohort definitions created and reviewed within the CHoRUS network’s OHDSI ATLAS environment. It covers cohort definitions for analyses such as incidence, characterization, and outcomes research. The scope includes steps from initial cohort design through cohort entry criteria specification, inclusion rules, censoring (exit) criteria, and cohort era settings.

Cohort Lifecycle & Governance

Lifecycle

Use this canonical lifecycle for every phenotype/cohort:

Clinical question → Phenotype design → ATLAS implementation → Phenotype Inventory Registration → Diagnostics → Revision → Approval → Release → Versioning → Reuse

The cohort definition is not "done" when it runs; it is done when it is validated, approved, versioned, and reusable.

Ownership and approval checkpoints

Phenotype owner:

A named individual or team responsible for the clinical/methodological intent and long-term maintenance of the phenotype.
The owner is the final decision-maker for acceptance of changes.

Approval is requested when:

A cohort is intended for cross-site use, publication, or inclusion in a phenotype library.
Any change alters who enters the cohort, when entry occurs, or time-at-risk / exit behavior.

A phenotype is considered "locked" when:

The definition has passed the minimum diagnostics checklist (see Diagnostics & Transportability), and
It has been approved by the phenotype owner + QA reviewer, and
A version tag has been assigned and recorded (see below).

Phenotype naming convention

Use a consistent naming scheme so cohorts are discoverable, comparable across sites, and auditable over time.

Naming template

B2AI-<Phenotype>-<Population/Setting>-<IndexAnchor>-v<MAJOR.MINOR.PATCH>

Examples

B2AI-ICU-Shock-VasopressorStart-Adult-v1.0.0
B2AI-ICU-MechanicalVentilation-Invasive-Adult-v1.1.0
B2AI-Infection-HAI-Suspected-ICU-v0.3.0
B2AI-Neuro-TBI-InpatientOrER-Adult-v2.0.0

Required fields

Prefix: B2AI
Phenotype name: stable clinical label (e.g., Sepsis, Shock, ARDS, AKI)
Population/setting: at least one of {Adult|Peds} and {ICU|Inpatient|ER|Outpatient}
Index anchor hint: the event used as time zero (e.g., Admission, Dx, DrugStart, Procedure, DeviceStart)
Version: vMAJOR.MINOR.PATCH (see policy below)

Optional fields (use when needed)

Intent tag: {Incident|Prevalent} (if relevant to interpretation)
Time window tag: {Early24h|Day0-2|Day0-7|28d} (if the window is defining)
Comparator tag: {Target|Comparator} (for population-level estimation (PLE) studies)

Naming rules (to reduce ambiguity)

Avoid local site names or ETL artifacts in the name (e.g., do not embed “SiteA” or “has_visit_detail_id”).
Prefer concise, consistent abbreviations (ICU, ER, RRT, MV, VTE).
If the definition is designed for reuse, freeze the name and increment version rather than renaming.
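
A name check along these lines can be run before registering a cohort. This is a minimal sketch, not an official validator: the pattern, the function name parse_cohort_name, and the requirement of at least two descriptive tokens are assumptions made for illustration. It only verifies the B2AI prefix, the hyphen-separated tokens, and the vMAJOR.MINOR.PATCH suffix; optional tags that themselves contain a hyphen (e.g., Day0-2) would be split into two tokens.

```python
import re

# Assumed pattern: B2AI prefix, two or more alphanumeric tokens, semantic version suffix.
NAME_PATTERN = re.compile(r"^B2AI(?:-[A-Za-z0-9]+){2,}-v(\d+)\.(\d+)\.(\d+)$")


def parse_cohort_name(name: str) -> dict:
    """Split a cohort name into prefix, descriptive tokens, and version tuple."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Name does not follow the B2AI template: {name!r}")
    tokens = name.split("-")
    return {
        "prefix": tokens[0],
        "tokens": tokens[1:-1],  # everything between the prefix and the version
        "version": tuple(int(part) for part in match.groups()),
    }


parsed = parse_cohort_name("B2AI-ICU-Shock-VasopressorStart-Adult-v1.0.0")
```

Because the examples in this SOP do not all place tokens in the same order, the sketch deliberately does not enforce token positions.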

Versioning & change control

Use a simple semantic policy: MAJOR.MINOR.PATCH (e.g., 1.2.0).

MAJOR (e.g., 1.x → 2.0): clinical meaning changes
Examples: new index event anchor, new domain (Condition → Procedure), new inclusion logic that changes population identity, changing incident↔prevalent intent, changing outcome definition.
MINOR (e.g., 1.1 → 1.2): analytic behavior changes, but clinical meaning remains the same
Examples: time-window adjustments (0-2d → 0-3d), tightening/loosening observation requirements, adding/removing a secondary restriction, altering era collapse gap.
PATCH (e.g., 1.2.0 → 1.2.1): non-functional edits
Examples: typos, formatting, better documentation text, link fixes (no logic changes).

Change control requirements:

Every change must be recorded with: what changed, why, expected impact, reviewer, date.
For MINOR/MAJOR changes: re-run diagnostics and re-approve.
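
The versioning policy above can be expressed as a small helper. This is an illustrative sketch; the function names bump_version and needs_rediagnostics are hypothetical and not part of ATLAS.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next MAJOR.MINOR.PATCH string for a given change type."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    if change == "MAJOR":   # clinical meaning changes
        return f"{major + 1}.0.0"
    if change == "MINOR":   # analytic behavior changes, same clinical meaning
        return f"{major}.{minor + 1}.0"
    if change == "PATCH":   # non-functional edits
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown change type: {change!r}")


def needs_rediagnostics(change: str) -> bool:
    """MINOR and MAJOR changes require re-running diagnostics and re-approval."""
    return change in {"MAJOR", "MINOR"}
```

For example, a typo fix on 1.2.0 yields 1.2.1 with no re-approval, while changing the era collapse gap yields 1.3.0 and triggers diagnostics again.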

Applicable Roles and Responsibilities

Data Analysts/Phenotype Developers: Responsible for creating cohort definitions in ATLAS following this SOP, including selecting appropriate initial events, inclusion and exit criteria, and verifying cohort logic.
Quality Assurance (QA) Team: Responsible for reviewing cohort definitions for correctness, completeness, and compliance with this SOP. QA ensures that all criteria (entry, inclusion, exit, era collapse) are appropriately applied and documented.
Compliance Officers/Data Stewards: Ensure that the cohort definitions (and any patient data they entail) adhere to regulatory and data use policies. They verify that sensitive events (e.g. death) are handled properly (e.g. as censoring events) and that cohort definitions are reproducible and auditable.
Project Leads/Scientists: Define the clinical intent of the cohort (phenotype) and confirm that the implemented cohort definition in ATLAS matches the intended definition. They provide clinical context (e.g. requiring certain observation times or prior conditions) and approve the final cohort definition.

Glossary

Prerequisites

Access to ATLAS: The user must have an active ATLAS account with permissions to create and edit cohort definitions. This SOP assumes the user can log into the CHoRUS ATLAS instance and navigate to the Cohort Definitions module.
OMOP CDM Familiarity: Users should understand the OMOP CDM tables and how clinical events are represented. A basic understanding of the data content in CHoRUS (e.g. types of source data and how they map to OMOP) is also required.
Concept Sets Prepared: Ideally, any needed concept sets (collections of standard codes for conditions, drugs, etc.) are created or available in ATLAS prior to cohort definition. ATLAS allows creating concept sets on the fly during cohort definition, but having them pre-defined (especially standard ones like "Inpatient Visit" or specific diagnosis categories) can speed up the process and ensure consistency.
Cohort Definition Design / Phenotype Intent Documented: Before implementing in ATLAS, document a clear clinical definition and methodological intent for the cohort (e.g., in a protocol or phenotype definition document). The document should cover the index (onset) anchor, rationale for inclusion/exclusion criteria, baseline and follow-up (time-at-risk) windows, handling of multiple events per person (episode vs first event), and the intended meaning of era/collapse settings.

Cohort Definition Workflow (OHDSI ATLAS)

This workflow describes how to build a cohort (computable phenotype) in ATLAS: define cohort entry events, optionally restrict those events, apply inclusion criteria, then define exit/censoring and era collapse settings. Use the attrition/diagnostics outputs to QA each stage.

Normative rule (Entry vs Inclusion semantics):

Entry criteria must define the biological onset anchor (the event you treat as “time zero”).

Inclusion criteria must define analytic eligibility (who you allow into the analysis after the anchor is defined).

This rule reduces variability across analysts and improves transportability across sites.

1. Cohort Entry Events

1.1 Add Initial Event vs Add Attribute

Click + Add Initial Event to add the primary entry criterion (domain + event type), e.g. Condition Occurrence, Drug Exposure, Procedure, Visit Occurrence, etc.
Click Add attribute… to refine the selected criterion (e.g., first diagnosis, age filter, occurrence count, date constraints).
Use Delete Criteria to remove an event/attribute/group you added by mistake.

Rule of thumb

Add Initial Event = defines what event anchors cohort entry.
Add attribute = adds constraints to that event.

1.2 Attach a Concept Set to the Initial Event

Add an initial event (e.g., Condition Occurrence).
Open the criterion’s dropdown and choose Import Concept Set.
Select an existing concept set or create a new one, then attach it to the criterion.

Concept set considerations (Required for reproducibility)

Document the following for each concept set used in entry or key restrictions:

Descendant explosion policy: are descendants included?
Impact: can massively change cohort size and clinical meaning.
Mapped vs unmapped concepts: prefer standard concepts; confirm whether your ETL maps source codes properly.
Impact: undercounting occurs if data exists only as unmapped/non-standard codes.
Domain drift: ensure concepts match the criterion domain (Condition concepts in Condition Occurrence, etc.).
Impact: silent undercounts and wrong populations.
Vocabulary release freeze (recommended when reusing cohorts): record the vocabulary version/date and freeze concept set membership for a release.
Impact: concept hierarchy changes across vocabulary releases can change cohort membership without any cohort logic edits.
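
Some of these checks can be partly automated against an exported concept set expression. The sketch below assumes an export layout with an items list whose entries carry a concept object plus an includeDescendants flag, and concept fields named CONCEPT_ID, STANDARD_CONCEPT, and DOMAIN_ID; verify these field names against your own export before relying on it. The function name audit_concept_set and the sample values are hypothetical.

```python
def audit_concept_set(expression: dict, expected_domain: str) -> dict:
    """Summarize descendant usage, non-standard concepts, and domain drift."""
    report = {"with_descendants": 0, "non_standard": [], "domain_mismatch": []}
    for item in expression.get("items", []):
        concept = item["concept"]
        if item.get("includeDescendants", False):
            report["with_descendants"] += 1
        if concept.get("STANDARD_CONCEPT") != "S":
            report["non_standard"].append(concept["CONCEPT_ID"])
        if concept.get("DOMAIN_ID") != expected_domain:
            report["domain_mismatch"].append(concept["CONCEPT_ID"])
    return report


# Illustrative values only; the concept IDs below are placeholders.
example_expression = {
    "items": [
        {"concept": {"CONCEPT_ID": 1001, "STANDARD_CONCEPT": "S", "DOMAIN_ID": "Condition"},
         "includeDescendants": True},
        {"concept": {"CONCEPT_ID": 1002, "STANDARD_CONCEPT": None, "DOMAIN_ID": "Observation"},
         "includeDescendants": False},
    ]
}
example_report = audit_concept_set(example_expression, expected_domain="Condition")
```

The report does not replace the written documentation; it only flags items a reviewer should look at.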

1.3 Continuous Observation Requirement (X days before / Y days after)

In Cohort Entry Events, configure:

with continuous observation of at least X days before and Y days after event index date

This requires an OBSERVATION_PERIOD spanning the index event with sufficient lookback/follow-up.

Technical meaning

The index date must occur inside OBSERVATION_PERIOD with enough coverage:

index date ≥ observation start + X days
index date ≤ observation end − Y days
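
The two inequalities above can be checked for a single observation period as follows. This is a minimal sketch with a hypothetical function name; it illustrates the rule and is not how ATLAS implements it internally.

```python
from datetime import date, timedelta


def has_continuous_observation(index_date: date, obs_start: date, obs_end: date,
                               prior_days: int, post_days: int) -> bool:
    """True if the observation period covers X days before and Y days after index."""
    earliest_allowed = obs_start + timedelta(days=prior_days)  # observation start + X
    latest_allowed = obs_end - timedelta(days=post_days)       # observation end - Y
    return earliest_allowed <= index_date <= latest_allowed


obs_start, obs_end = date(2020, 1, 1), date(2021, 12, 31)
ok_365_30 = has_continuous_observation(date(2021, 1, 1), obs_start, obs_end, 365, 30)
```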

Analytical meaning

Baseline covariate identifiability: Without enough prior observation, "absence" of history may mean "not observed," not "truly absent."
Incident cohort (new onset / new user): Require a washout period (e.g., ≥180–365 days of prior observation) with no prior evidence of the condition/exposure, so the index event is treated as the first observed start.
Prevalent cohort (existing / ongoing): Do not require absence during baseline; include anyone with evidence of the condition/exposure, even if it started before the available observation window.
Immortal time bias risk: If you require long post-index observation (large Y), you may exclude early deaths/early loss-to-follow-up, creating biased survival comparisons. Prefer handling death as a censoring/outcome strategy rather than excluding those patients via Y-days requirements.

Operational logic summary

Index date must be ≥ X days after observation start and ≥ Y days before observation end.
Be cautious: large Y can exclude patients who die/leave early (often better handled via censoring rather than exclusion).

Common configurations

0 / 0: no baseline or follow-up requirement (feasibility or prevalence-oriented cohorts).
365 / 0: require ≥1 year baseline (incident/new-user design baseline, covariate lookback).
0 / 30: require ≥30 days follow-up (short-term outcome observation).
365 / 365: strict baseline + follow-up requirement (protocol-driven designs; use cautiously; justify).

1.4 Limit Initial Events per Person

Use Limit initial events to:

All events per person: allow multiple cohort entries per person (episodic/recurrent-event analyses).
Earliest event per person: one entry per person at first qualifying event (incident cohorts).
Latest event per person: one entry per person at most recent qualifying event (rare; cross-sectional “latest occurrence” use cases).
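
The three options can be illustrated on a list of (person_id, event_date) pairs. The function name limit_events and the tuple layout are assumptions for this sketch.

```python
from collections import defaultdict
from datetime import date


def limit_events(events, mode: str):
    """Keep all, the earliest, or the latest event per person."""
    if mode == "all":
        return sorted(events)
    if mode not in {"earliest", "latest"}:
        raise ValueError(f"Unknown mode: {mode!r}")
    dates_by_person = defaultdict(list)
    for person_id, event_date in events:
        dates_by_person[person_id].append(event_date)
    pick = min if mode == "earliest" else max
    return sorted((pid, pick(dates)) for pid, dates in dates_by_person.items())


events = [(1, date(2022, 1, 5)), (1, date(2022, 6, 1)), (2, date(2021, 3, 3))]
first_events = limit_events(events, "earliest")
```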

Example

"All pneumonia episodes" → All events
"First-ever Myocardial Infarction" → Earliest
"Most recent opioid exposure" → Latest

1.5 Restrict Initial Events (Entry Event Restrictions)

Use the option "Restrict initial events to: having [all/any/at least/at most] of the following criteria" to add context that is tightly coupled to the index event.

Choose the group logic: all / any / at least X / at most X
Click Add criteria to group to add each required criterion.
Use Add attribute on each criterion to specify occurrence counts, timing, etc.

How to choose all / any / at least X / at most X (with examples)

Examples

ALL of the following (AND logic)
Use when every condition is required for the index event to be valid.
- Index: Sudden cardiac death-related Condition Occurrence
  Restrictions (ALL):
  1. Inpatient/ER visit context
    - Criterion: Visit Occurrence = concept set Inpatient or ER visit
    - Attributes:
      - Temporal: visit starts between All days before and 0 days after index start; visit ends between 0 days before and All days after index start (overlap)
      - Optional checkbox: Restrict to the same visit occurrence
  2. Adult population
    - Attribute: Age at occurrence ≥ 18 (computed from PERSON.year_of_birth (and month/day if available) relative to the event start date)
ANY of the following (OR logic)
Use when alternative pathways should qualify entry (one is sufficient).

Example A
Index (surveillance anchor): Inpatient visit start (or device exposure start, depending on the HAI definition)
Restrict initial events (ANY): evidence of suspected/confirmed infection after index

1) Positive culture signal within 0-7 days after index

Criterion: Measurement (or Observation, depending on local OMOP ETL) = microbiology culture/result concept set
Attributes:
- Occurrence count: at least 1
- Temporal anchor: event starts between 0 days after and 7 days after index start date
- (Optional, if supported by local modeling) result/value constraint indicating positive / organism detected
Note: Microbiology representation is ETL-specific; verify whether positivity is modeled as measurement value, value_as_concept, or separate result rows.

2) Broad-spectrum antibiotic exposure within 0-2 days after index

Criterion: Drug Exposure = concept set Broad-spectrum systemic antibacterials
Attributes:
- Occurrence count: at least 1
- Temporal anchor: drug exposure starts between 0 and 2 days after index start date
- (Optional, to approximate “new start”) baseline washout:
  - exactly 0 exposures to the same antibiotic concept set in 30 to 1 days before index
Note: Without washout, this identifies antibiotic exposure near index, not necessarily a new initiation.

3) Fever signal within 0–1 day after index

Criterion (ANY within this branch, depending on phenotype design):
- Measurement = body temperature, or
- Observation/Condition = fever concept set
Attributes (measurement-based pattern):
- Occurrence count: at least 1
- Temporal anchor: event starts between 0 and 1 day after index start date
- Value constraint: temperature ≥ 38.0°C (only if units are standardized in ETL)
Attributes (coded-fever pattern):
- Occurrence count: at least 1
- Temporal anchor: event starts between 0 and 1 day after index start date
Example B
- Index: Traumatic brain injury (TBI) (choose your index event, commonly Condition Occurrence: TBI)
  Restrictions (ANY):
  1. TBI diagnosis evidence
    - Criterion: Condition Occurrence = concept set TBI diagnoses
    - Attributes:
      - Occurrence count: at least 1 (default)
      - (Optional) Inpatient/ER context: add a Visit criterion and use same visit occurrence
  2. Head CT procedure evidence (alternative signal when diagnosis coding is incomplete)
    - Criterion: Procedure Occurrence = concept set CT head/brain
    - Attributes:
      - Occurrence count: at least 1
      - Temporal overlap with index (if index is visit-based) and/or
      - Restrict to the same visit occurrence (recommended when index is visit/encounter anchored)
AT LEAST X of the following (k-of-n logic; "≥X")
Use when cohort entry should depend on a minimum number (k) of criteria, symptoms, or markers being present out of a total set (n). Requiring a minimum evidence threshold across multiple indicators improves specificity (the true negative rate: the proportion of people without the condition who are correctly left out).

Example A
- Index: Sepsis suspicion at admission (often Inpatient Visit start as index)
  Restrictions (AT LEAST 2 of 3):
  1. Blood culture ordered (0-1 day)
    - Criterion: Procedure or Measurement = concept set Blood culture order/collection
    - Attributes: at least 1, starts 0-1 day after index start
  2. IV antibiotics started (0-1 day)
    - Criterion: Drug Exposure = concept set IV systemic antibiotics
    - Attributes: at least 1, starts 0-1 day after index start
  3. Lactate measured (0-1 day)
    - Criterion: Measurement = concept set Lactate
    - Attributes: at least 1, starts 0-1 day after index start
    - (Optional) value attribute: lactate ≥ 2 mmol/L (protocol-dependent)
Example B
- Index: Epilepsy phenotype (often Condition Occurrence: epilepsy as index)
  Restrictions (AT LEAST 2 of 3):
  1. Epilepsy diagnosis
    - Criterion: Condition Occurrence = concept set Epilepsy
    - Attributes: at least 1; (optional) earliest per person if incident
  2. Anti-seizure medication exposure
    - Criterion: Drug Exposure = concept set Anti-seizure meds (ASMs)
    - Attributes: at least 1; temporal within -180 to +30 days of index (example window)
  3. EEG performed
    - Criterion: Procedure Occurrence = concept set EEG
    - Attributes: at least 1; temporal within -180 to +30 days of index
AT MOST X of the following (upper-bound constraint; "≤X")
Use to enforce exclusions / mutual exclusivity (keep the cohort "clean") or limit competing signals.

Example A
- Index: Isolated TBI cohort
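  Restrictions (AT MOST 0 of the following):
  1. Polytrauma major injury diagnoses in the same visit
    - Criterion: Condition Occurrence = concept set Major trauma/polytrauma
    - Attributes:
      - Occurrence count: at least 1 (the group-level AT MOST 0 then excludes anyone who matches)
      - Restrict to the same visit occurrence (or temporal overlap with index visit)
  2. Penetrating trauma mechanism codes in the same visit
    - Criterion: Condition Occurrence (or Observation) = concept set Penetrating trauma
    - Attributes: at least 1, same visit occurrence (recommended)
  Note: Do not combine a group-level AT MOST 0 with criterion-level "exactly 0" counts; that double negative keeps only patients who have the excluded events. The equivalent alternative is ALL of the following with each criterion set to exactly 0.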
Example B
- Index: First-line monotherapy new user (e.g., ACE inhibitor initiation)
  Restrictions (AT MOST 0 of the following):
  1. Other antihypertensive classes during baseline window
    - Criterion: Drug Exposure = concept set Other antihypertensive drug classes
    - Attributes:
      - Occurrence count: at least 1 (the group-level AT MOST 0 then excludes anyone who matches)
      - Temporal: 365 to 1 days before index start (example baseline)
Example C
- Index: Clean baseline cohort for a safety/effectiveness study
  Restrictions (AT MOST 1 of the following):
  1. End-stage renal disease (ESRD)
  2. Metastatic cancer
  3. Transplant
    - Criterion: Condition Occurrence, with a separate criterion per condition concept set (a single combined High-risk comorbidities concept set cannot be counted at group level)
    - Attributes (each criterion):
      - Occurrence count: at least 1
      - Temporal: All days before to 0 days before index start (or a defined baseline window)
    - Group-level: at most 1 of the three criteria may be satisfied
  Note: Use cautiously; document the rationale because it changes generalizability.
Summary
ALL of the following (AND logic)
Use when everything on your list must be true.
Why you need it: You use ALL when you want a strict, precise cohort - e.g., the event must happen in an inpatient/ER visit and the person must be an adult - so you don’t include people who match only part of what you mean.
ANY of the following (OR logic)
Use when any one item on your list is enough.
Why you need it: You use ANY when the same clinical idea can show up in different ways in the data - e.g., infection suspicion might appear as positive culture OR antibiotics OR fever - so you don’t miss real cases just because one signal is missing.
AT LEAST X of the following (threshold rule; “X out of N”)
Use when you want more than one piece of evidence.
Why you need it: You use AT LEAST X when one signal alone is too weak or noisy - e.g., require 2 of 3 sepsis signals (culture, IV antibiotics, lactate) or 2 of 3 epilepsy signals (diagnosis, ASM medication, EEG) - to reduce false positives.
AT MOST X of the following (upper limit rule)
Use when you want to prevent entry if too many “disqualifying” things are present.
Why you need it: You use AT MOST X to keep the cohort clean and focused - e.g., exclude polytrauma to get "isolated TBI," or exclude other antihypertensive classes to ensure true monotherapy - so your cohort matches the study intent.
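
The four group options reduce to counting how many criteria in the group are satisfied. The sketch below is a conceptual illustration with a hypothetical function name; each entry in results is True when that criterion matched (for example, "at least 1 occurrence in the window").

```python
def group_satisfied(results, logic: str, x: int = 0) -> bool:
    """Evaluate all / any / at_least X / at_most X over per-criterion results."""
    met = sum(1 for result in results if result)
    if logic == "all":
        return met == len(results)
    if logic == "any":
        return met >= 1
    if logic == "at_least":
        return met >= x
    if logic == "at_most":
        return met <= x
    raise ValueError(f"Unknown group logic: {logic!r}")


# Sepsis suspicion: culture ordered, IV antibiotics started, no lactate measured.
sepsis_entry = group_satisfied([True, True, False], "at_least", x=2)

# Isolated TBI: neither polytrauma nor penetrating trauma matched, so entry is kept.
isolated_tbi = group_satisfied([False, False], "at_most", x=0)
```

Note that with at_most 0 the individual criteria describe the presence of the disqualifying event; the group setting supplies the negation.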

When to use restrictions

Context bound to the index event (e.g., "diagnosis occurs during inpatient visit").
Narrow time-window rules around index.
First/absence constraints tightly defining "entry".

1.6 Temporal Anchors (Event Start/End Windows)

ATLAS expresses timing relative to the index event:

event starts between [A] days Before/After and [B] days Before/After index start date
event ends between [C] days Before/After and [D] days Before/After index start date

Baseline window defines covariate observability.
Absence criteria define phenotype specificity.
Washout defines incident design.

Patterns with examples

1) Baseline prior history (lookback): "did the person have X before index?"
Use for eligibility, covariates, prior disease, prior treatment.

Pattern: start between 365 days before and 0 days before
Example: "Hypertension diagnosis in prior 365 days"
- Criterion: Condition Occurrence = concept set Hypertensive disorder
- Attributes:
  - Occurrence count: at least 1
  - Temporal: event starts 365d before → 0d before index start

2) Washout / new use: "ensure no prior exposure before index"
Use for incident/new-user designs (avoid prevalent/ongoing users).

Pattern: start between All days before and 1 day before + count = 0
Example: "No ACE inhibitor exposure in baseline"
- Criterion: Drug Exposure = concept set ACE inhibitors
- Attributes:
  - Occurrence count: exactly 0
  - Temporal: event starts All days before → 1d before index start

3) Post-index follow-up window: “capture outcomes after index”
Use for outcomes or downstream events.

Pattern: start between 0 days after and N days after (e.g., N = 30; the example below uses N = 7)
Example: “AKI within 7 days after surgery”
- Criterion: Condition Occurrence = concept set Acute kidney injury
- Attributes:
  - Occurrence count: at least 1
  - Temporal: event starts 0d after → 7d after index start

4) Overlap with index (classic for visits): "event must span the index date"
Use to require that something (usually a visit) covers the index date.

Pattern (overlap template):
- event starts between All days before and 0 days after index start
- event ends between 0 days before and All days after index start
- meaning: started on/before index and ended on/after index → index occurs during the event
Example: "Index diagnosis occurred during an inpatient/ER visit"
- Criterion: Visit Occurrence = concept set Inpatient or ER visit
- Attributes:
  - Temporal (overlap): start ≤ index AND end ≥ index (using the template above)
  - Optional checkbox: Restrict to the same visit occurrence
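
At the date level, the overlap template reduces to a single comparison. The helper below is a sketch with a hypothetical name; it assumes visit start and end dates are both populated.

```python
from datetime import date


def visit_spans_index(visit_start: date, visit_end: date, index_date: date) -> bool:
    """True when the visit started on/before index and ended on/after index."""
    return visit_start <= index_date <= visit_end


during_stay = visit_spans_index(date(2023, 5, 1), date(2023, 5, 5), date(2023, 5, 3))
```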

5) Same-day only (tight coupling): "must happen on index date"
Use when you truly want same-day events.

Pattern: start between 0 days before and 0 days after
Example: "Culture collected on index date"
- Criterion: Measurement/Procedure = concept set Blood culture collection/order
- Attributes:
  - Occurrence count: at least 1
  - Temporal: event starts 0d before → 0d after index start

Boundary note (important)

0 days before usually includes the same calendar day as the index.
1 day before enforces strictly before (excludes same-day events).
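
The boundary behavior can be made concrete with signed day offsets. This sketch assumes comparisons are made on calendar dates (not timestamps) and uses hypothetical names: negative offsets mean days before index, positive offsets mean days after, and None stands for "All".

```python
from datetime import date


def in_window(event_start: date, index_start: date, from_days, to_days) -> bool:
    """True if the event starts within [from_days, to_days] relative to index."""
    offset = (event_start - index_start).days
    lower_ok = from_days is None or offset >= from_days
    upper_ok = to_days is None or offset <= to_days
    return lower_ok and upper_ok


def count_in_window(event_starts, index_start: date, from_days, to_days) -> int:
    """Number of events whose start date falls inside the window."""
    return sum(1 for d in event_starts if in_window(d, index_start, from_days, to_days))


index = date(2024, 3, 10)
same_day_included = in_window(index, index, None, 0)    # "All before -> 0 before"
same_day_excluded = in_window(index, index, None, -1)   # "All before -> 1 before"
washout_ok = count_in_window([date(2024, 3, 10)], index, None, -1) == 0
```

In the last line, a drug exposure that starts on the index date itself does not violate a washout defined as "All days before → 1 day before".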

How to choose "ideal" temporal anchors (practical guidance)

A) Start from the research question + required time ordering
Define windows that match the clinical timeline you mean:

Baseline (eligibility/covariates): must occur before index
Exposure ascertainment: defines "on treatment" or "new use"
Outcome window: must occur after index, within a clinically plausible timeframe
Document the rationale (1-2 sentences per window).

B) Use clinical knowledge to avoid impossible timing
Pick anchors consistent with care processes and physiology:

labs/imaging often occur after symptom onset
antibiotics may start after culture collection but before culture results
"hospital-acquired" often requires a minimum time since admission (e.g., ≥48h); implement using visit-based index + timing restrictions.

C) Prefer literature / validated phenotypes when available
Reuse time windows from:

published cohort/phenotype definitions
OHDSI examples and training materials
network phenotype libraries
Even if adapted, keeping the same temporal structure improves comparability.

D) Stress-test empirically (recommended)
Before finalizing:

Check attrition/diagnostics (e.g., once generation is no longer running or has failed, open View Report) to see which temporal rule drives drop-offs
Review time-to-event distributions (days relative to index)
Run sensitivity windows (e.g., baseline 180 vs 365; outcome 30 vs 90) to confirm results are stable
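
A sensitivity check on the outcome window can be as simple as recounting events under each candidate window. The sketch below uses made-up day offsets purely for illustration; the function name is hypothetical, and None marks a person with no outcome event.

```python
def outcome_counts_by_window(days_to_event, windows):
    """Count outcomes that fall within 0..N days after index for each N."""
    return {
        window: sum(1 for d in days_to_event if d is not None and 0 <= d <= window)
        for window in windows
    }


# Hypothetical days from index to first outcome, one entry per person.
days_to_event = [3, 45, None, 10, 120, 88]
counts = outcome_counts_by_window(days_to_event, windows=[30, 90])
```

A large jump between windows suggests that the choice of window materially affects the result and should be justified in the phenotype documentation.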

Quick cheat sheet

Goal	Recommended anchor pattern
“Had X in prior year”	start 365 before → 0 before
“No X before index (washout)”	start All before → 1 before, count exactly 0
“Outcome within 30 days”	start 0 after → 30 after
“During same inpatient visit”	overlap template + same visit occurrence
“Same-day procedure/event”	start 0 before → 0 after

1.7 Visit Restrictions & “Restrict to the same visit occurrence”

Goal: Require that the index event happens during a specific visit type (e.g., inpatient/ER), and optionally ensure the linked events are from the same encounter record (same visit_occurrence_id).

How to do it

Under Restrict initial events, click Add criteria to group.
Add Visit Occurrence as a criterion (e.g., Inpatient Visit or ER Visit concept set).
Configure the overlap timing so the visit spans the index date:
- Visit starts between All days before and 0 days after index start
- Visit ends between 0 days before and All days after index start
  (the visit started on/before index and ended on/after index → index occurred "inside" the visit.)
Check Restrict to the same visit occurrence.

What "Restrict to the same visit occurrence" means

It forces the visit criterion and the index event to share the same visit_occurrence_id.
Without it, ATLAS can satisfy the visit criterion using any visit in the time window (a different encounter), which can create incorrect linkages.
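
The difference between a time-window match and a same-visit match can be sketched as follows. The record classes and the function name are hypothetical simplifications of OMOP rows, and the IDs and dates are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ClinicalEvent:
    start: date
    visit_occurrence_id: Optional[int]


@dataclass(frozen=True)
class Visit:
    visit_occurrence_id: int
    start: date
    end: date


def visit_criterion_met(index_event: ClinicalEvent, visits, same_visit: bool) -> bool:
    """Check whether any visit spans the index event, optionally requiring the same ID."""
    for visit in visits:
        overlaps = visit.start <= index_event.start <= visit.end
        same_id = index_event.visit_occurrence_id == visit.visit_occurrence_id
        if overlaps and (same_id or not same_visit):
            return True
    return False


# The diagnosis was recorded under visit 10, but an inpatient stay (visit 20) spans the date.
diagnosis = ClinicalEvent(start=date(2023, 5, 2), visit_occurrence_id=10)
inpatient_visits = [Visit(20, date(2023, 5, 1), date(2023, 5, 5))]
by_window_only = visit_criterion_met(diagnosis, inpatient_visits, same_visit=False)
by_same_visit = visit_criterion_met(diagnosis, inpatient_visits, same_visit=True)
```

Here the time-window match succeeds while the same-visit match fails, which is the incorrect linkage the checkbox is meant to prevent.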

When to use (typical)

When "during the same hospital encounter" is part of the clinical meaning of the phenotype.
When your index event can occur in many settings and you need to restrict to one (inpatient-only, ER-only, ICU-only).
When you want to reduce false matches due to multiple visits close in time.

Examples

Example A - "Inpatient/ER fever"

Index event: Fever (Measurement with a value constraint on core temperature, OR Observation/Condition using a fever concept set)
Restriction: Must occur during an inpatient/ER visit
- Criterion: Visit Occurrence = concept set Inpatient or ER visit
- Attributes:
  - Temporal overlap (visit spans index):
    - start: All before → 0 after
    - end: 0 before → All after
  - Restrict to the same visit occurrence: ON
    (fever must be recorded within that same inpatient/ER encounter.)

Example B - "Head CT during same encounter as suspected TBI"

Index event: Condition Occurrence = concept set TBI diagnoses
Restriction: Head CT must be performed in the same visit as the diagnosis
- Criterion: Procedure Occurrence = concept set CT head/brain
- Attributes:
  - Occurrence count: at least 1
  - Temporal: starts 0 before → 0 after (same day) or 0 before → 1 after (same or next calendar day, so up to roughly 48h)
  - Restrict to the same visit occurrence: ON
    (the CT is tied to the same encounter where TBI was diagnosed.)

1.8 Allow Events Outside Observation Period

What it does:

Allows a criterion to be satisfied by events that fall outside the person’s OBSERVATION_PERIOD.

Why this is risky:

Events outside OBSERVATION_PERIOD are typically treated as not reliably observable (data may be incomplete or not expected to exist there).

Recommendation

Keep OFF by default.
Turn ON only if your network explicitly defines valid events outside observation (rare, requires documentation).

Example (rare but justified case)

Your network defines Observation Periods (OP) conservatively (e.g., coverage starts later than true care history), but still stores verified historical diagnoses before OP start that you are instructed to use.
If you enable this, document the policy and validate with QA.

2. Inclusion Criteria

Definition: Inclusion criteria are post-entry filters applied after initial events are identified.
Why they matter: They improve clarity, modularity, and QA because ATLAS can report attrition per rule.

2.1 When to use Inclusion Criteria vs Entry Restrictions

Use Inclusion criteria for:
- modular eligibility rules (age, comorbidities)
- prior history requirements (baseline diagnoses/exposures)
- lab thresholds
- exclusions you want to audit ("how many were removed by this rule?")
Use Entry restrictions for:
- same-visit / overlap mechanics
- "must happen during inpatient visit"
- narrow context tied directly to the index event

Rule of thumb:
If you want to see a separate attrition line item for it, prefer Inclusion Criteria.
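
The idea of one attrition line per rule can be sketched as sequential filtering. This is a simplification for illustration: the function name, the dictionary fields, and the sample records are hypothetical, and the ATLAS inclusion report may present counts differently.

```python
def attrition(events, rules):
    """Apply named rules in order and record how many events remain after each."""
    remaining = list(events)
    report = [("Initial events", len(remaining))]
    for name, predicate in rules:
        remaining = [event for event in remaining if predicate(event)]
        report.append((name, len(remaining)))
    return report


# Hypothetical index events with pre-computed attributes.
index_events = [
    {"age": 17, "prior_cancer": False},
    {"age": 45, "prior_cancer": False},
    {"age": 60, "prior_cancer": True},
    {"age": 30, "prior_cancer": False},
]
rules = [
    ("Adult", lambda e: e["age"] >= 18),
    ("No prior cancer", lambda e: not e["prior_cancer"]),
]
attrition_report = attrition(index_events, rules)
```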

2.2 Add Inclusion Rules

Click New inclusion criteria.
Name it descriptively (e.g., "Adult", "Baseline observation ≥365d", "No prior cancer").
Click Add criteria to group and configure attributes.

Examples

Example A - "Adult (≥18 at index)"

Criterion: Person (age at index)
Attributes:
- Age ≥ 18 at cohort start date (index)
  (exclude pediatrics.)

Example B - "No prior cancer in baseline"

Criterion: Condition Occurrence = concept set Malignancy
Attributes:
- Occurrence count: exactly 0
- Temporal: starts 365 before → 1 before index
  (ensure cancer-free baseline for cleaner interpretation.)

Example C - "Baseline observation ≥365 days"

Prefer using the Cohort Entry continuous observation setting (section 1.3), but you can also enforce this with an inclusion rule if you need QA visibility in the attrition report. The goal is to require enough history to know what happened before index.

2.3 Limit Qualifying Events per Person (post-inclusion)

After inclusion rules, optionally set:

All / earliest / latest qualifying event per person

Examples

Incident design: Start with All candidate events (to test rules), then keep Earliest qualifying so each person contributes one index date.
Recurrent episodes: Keep All qualifying if multiple episodes per person are meaningful (e.g., repeated infections).
Cross-sectional "most recent" cohort: Keep Latest qualifying if your question needs the most recent event per person.

3. Cohort Exit

Definition: Exit logic defines when the cohort episode ends.
Cohort end = persistence rule (default end) + optional censoring events (end earlier if something happens).

3.1 Event Persistence (default end strategy)

Choose one:

End of continuous observation
- Follow people until their observation period ends (data coverage ends).
  Example: "Follow until we lose data coverage" (common for long follow-up).
Fixed duration relative to initial event
- End = (start date or end date of the initial event) + N days
- Offset from start date → everyone gets the same duration
- Offset from end date → duration depends on the event length (visit length, drug era length)
  Examples:
- "30-day outcome window after diagnosis" → start + 30 days
- "Follow 14 days after discharge" (visit has length) → end + 14 days
End of continuous drug exposure
- Cohort persists while drug exposure continues, allowing a gap (grace period).
  Examples:
- On-treatment safety: remain in cohort while taking Drug A with gap ≤30 days
- Treatment episode: end when exposure stops (plus optional extension)

Documentation tip: Write one sentence: "We used [persistence] because [clinical/analytic rationale]."
Example: "Fixed 90-day follow-up after index to capture medium-term outcomes."
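
The fixed-duration option can be illustrated with simple date arithmetic. The function name and the anchor argument are hypothetical; the sketch does not model any additional bounds ATLAS may apply to the end date.

```python
from datetime import date, timedelta


def fixed_duration_end(event_start: date, event_end: date,
                       offset_days: int, anchor: str = "start") -> date:
    """Cohort end as an offset from the initial event's start or end date."""
    if anchor == "start":
        base = event_start
    elif anchor == "end":
        base = event_end
    else:
        raise ValueError(f"Unknown anchor: {anchor!r}")
    return base + timedelta(days=offset_days)


visit_start, visit_end = date(2023, 3, 1), date(2023, 3, 10)
outcome_window_end = fixed_duration_end(visit_start, visit_end, 30, anchor="start")
post_discharge_end = fixed_duration_end(visit_start, visit_end, 14, anchor="end")
```

Anchoring on the start date gives every person the same follow-up length, while anchoring on the end date makes the total length depend on how long the initial event lasted.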

3.2 Death handling

Death can play different roles depending on study intent:

Death as an outcome (mortality studies):
- Define death as the outcome event (typically using DEATH logic/outcome cohort).
Death as censoring (effectiveness studies):
- Stop follow-up at death to avoid counting impossible time.
Death as a competing event:
- Recognize that death can preclude other outcomes; handle in analysis design/interpretation (not only cohort definition).

Operational rule:

Do not represent death via "death concept sets" in Condition/Observation; use DEATH logic (outcome cohort or censoring).

3.3 Censoring Events (Add Censoring Event)

Definition: Censoring events end the cohort early, before the persistence end, when an important event happens.

Click + Add Censoring Event
Add Death (common) and/or other events (e.g., treatment switch, competing outcome)

Examples

Example A - Death censoring

Censoring criterion: Death (DEATH table logic)
Attributes: usually none needed (any death)
(stop follow-up when the person dies.)

Example B - Censor on treatment switch

Censoring criterion: Drug Exposure = concept set Competing therapy
Attributes:
- Occurrence count: at least 1
- Temporal: event starts between 0 days after and All days after index
  (stop "on-treatment" follow-up once they start another therapy.)

Death handling (critical)

Do not search for "death" using condition/observation concept sets.
Use Death criterion (DEATH table logic) via censoring or as a dedicated outcome cohort.

Effective cohort end: the cohort end date is the earliest of:

persistence-defined end
first censoring event date (across all censoring criteria)
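
The "earliest of" rule can be sketched in a few lines of Python. This is an illustration only: `effective_cohort_end` is a hypothetical helper, and it assumes that only censoring events on or after the index date can end an episode.

```python
from datetime import date


def effective_cohort_end(index_date, persistence_end_date, censoring_dates):
    """Earliest of the persistence end and the first eligible censoring date."""
    # Assumption: censoring events before the index date are ignored.
    eligible = [d for d in censoring_dates if d >= index_date]
    return min([persistence_end_date, *eligible])


index = date(2024, 1, 1)
fixed_end = date(2024, 3, 31)            # persistence-defined end
events = [date(2023, 12, 1),             # before index, ignored
          date(2024, 2, 15),             # e.g., death or treatment switch
          date(2024, 3, 10)]

censored_end = effective_cohort_end(index, fixed_end, events)
uncensored_end = effective_cohort_end(index, fixed_end, [])
```

Here `censored_end` is the first censoring date after index, and `uncensored_end` falls back to the persistence-defined end.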

4. Cohort Era (Collapse)

Definition: Collapse settings control whether multiple qualifying episodes for the same person are merged into one longer episode.

4.1 Collapse gap size

Interpretation principle: Each era gap encodes a clinical assumption about what constitutes a single episode.

Set Specify era collapse gap size (days):

If the gap between one episode ending and the next starting is ≤ N days → ATLAS merges them into one longer era.
0 days = only overlapping or back-to-back episodes merge.
Larger gaps → fewer eras per person, longer era durations.
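
The merge rule above can be sketched as follows. This is a simplified illustration rather than the ATLAS implementation: `collapse_eras` is a hypothetical function, and the gap is assumed to be the date difference between one episode's end and the next episode's start.

```python
from datetime import date


def collapse_eras(episodes, gap_days):
    """Merge one person's (start, end) episodes whose gap is <= gap_days."""
    eras = []
    for start, end in sorted(episodes):
        if eras and (start - eras[-1][1]).days <= gap_days:
            # Close enough to the previous era: extend it.
            prev_start, prev_end = eras[-1]
            eras[-1] = (prev_start, max(prev_end, end))
        else:
            eras.append((start, end))
    return eras


# Two fills with an 11-day difference between first end and second start.
fills = [(date(2024, 1, 1), date(2024, 1, 30)),
         (date(2024, 2, 10), date(2024, 3, 10))]

merged = collapse_eras(fills, gap_days=30)    # one continuous era
separate = collapse_eras(fills, gap_days=0)   # two separate eras
```

Under this convention a gap of 0 merges episodes only when the next one starts on or before the previous end date.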

Document the intended meaning using a table such as:

Collapse gap / persistence	Encoded clinical assumption
0 days	Separate events/episodes
30 days	One episode of an acute event (e.g., repeated codes for same illness)
90 days	Patient state / prolonged clinical phase
180 days	Chronic/relapsing condition grouping
Drug persistence (end of continuous drug exposure)	Therapeutic treatment episode (on-treatment concept)

Operational guidance:

Choose gap based on clinical meaning and protocol/literature when available.
If uncertain: start with 0 and run sensitivity checks (0 vs 30 vs 180).

Real-life examples (what this looks like in practice)

Example 1 - Drug refills (statin / antihypertensive)

Patient fills lisinopril:
- Rx #1: Jan 1 (30 days supply) → ends Jan 30
- Next fill: Feb 10 → starts Feb 10
- Gap = 11 days (Jan 31 → Feb 10)
If collapse gap is 30 days, these two exposures are treated as one continuous treatment episode.
If collapse gap is 0 days, they become two separate episodes.
Why this matters: many patients refill late; a small grace period better matches real medication use.

Example 2 - Antibiotics (short courses)

Amoxicillin course:
- Jan 1-Jan 7, then another course Jan 20-Jan 27 (gap 13 days, Jan 7 → Jan 20)
With 30-day gap, ATLAS may merge into one "extended antibiotic episode," which could be wrong if you want distinct infection events.
Typical choice here is 0-7 days depending on the question.
Rule of thumb: for short-course meds, use smaller gaps unless you explicitly want "any use within a month."

Example 3 - Pneumonia diagnoses (billing repeats during one illness)

Diagnosis recorded:
- Jan 3, Jan 10, Jan 18 (same illness episode, repeated coding)
If your cohort exit is fixed (e.g., 30 days after diagnosis), then without collapse, you might create multiple overlapping pneumonia episodes.
Using collapse gap 30 days often merges these into one pneumonia episode.
Why: repeated codes within weeks often represent the same clinical episode, not reinfection.

Example 4 - Chronic disease follow-up codes (diabetes, hypertension)

Diabetes diagnosis appears repeatedly across visits (Mar, Apr, Jul, Oct).
Using a large collapse gap (e.g., 180 days) can produce one long era that spans months.
When appropriate: if your cohort is "people living with diabetes" (prevalent phenotype).
When not appropriate: if your cohort is "new diabetes diagnosis episode" (incident). For that, collapse is usually irrelevant because you’ll use earliest per person.

Example 5 - Relapsing/remitting conditions (COPD exacerbation, MS relapse)

COPD exacerbation encounters:
- Feb 1 (episode), Feb 20 (return visit), Mar 5 (another flare)
If you consider returns within ~30 days part of the same exacerbation, set gap 30 days.
If you want to count each flare separately, set gap 0-7 days.
Why: patients often re-present for the same exacerbation within weeks.

Example 6 - Hospitalizations (readmissions)

Discharge: May 1
Readmission: May 10 (gap 9 days)
If you collapse with 30 days, you may treat a readmission as the same "episode," which is often not desired if your outcome is readmission.
For hospitalization episodes, many studies keep 0 days (separate encounters), and handle readmission explicitly as an outcome.

Typical defaults

Drug episodes: gap 30 days (refill grace period)
Relapsing/chronic conditions: gap 180 days (group close recurrences)
Incident cohorts (earliest per person): collapse often irrelevant (only one episode anyway)

Simple advice

Pick the gap so it matches what you consider "the same episode" clinically:
- Short acute events (hospitalizations, procedures): usually 0 days
- Repeated coding for one illness (pneumonia follow-ups): often 30 days
- Long-term treatment (maintenance meds): often 30 days
- Chronic condition prevalence: large gaps (e.g., 180 days) can be reasonable
If unsure: run sensitivity checks (0 vs 30 vs 180) and compare:
- number of eras per person,
- average era duration,
- whether merging changes conclusions.
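
A sensitivity check of this kind might be sketched as below. The helper names and the toy data are hypothetical, and the gap is assumed to be the date difference between one episode's end and the next episode's start.

```python
from datetime import date
from statistics import mean


def collapse(episodes, gap_days):
    """Merge (start, end) episodes whose gap is <= gap_days."""
    eras = []
    for start, end in sorted(episodes):
        if eras and (start - eras[-1][1]).days <= gap_days:
            eras[-1] = (eras[-1][0], max(eras[-1][1], end))
        else:
            eras.append((start, end))
    return eras


def era_summary(episodes_by_person, gap_days):
    """Eras per person and mean era duration (inclusive days) for one gap."""
    eras = [era for eps in episodes_by_person.values()
            for era in collapse(eps, gap_days)]
    return {
        "gap_days": gap_days,
        "eras_per_person": len(eras) / len(episodes_by_person),
        "mean_era_days": mean([(end - start).days + 1 for start, end in eras]),
    }


# Toy data: person "A" has three single-day encounters, person "B" has one.
toy = {
    "A": [(date(2023, 2, 1), date(2023, 2, 1)),
          (date(2023, 2, 20), date(2023, 2, 20)),
          (date(2023, 3, 5), date(2023, 3, 5))],
    "B": [(date(2023, 6, 1), date(2023, 6, 1))],
}

summaries = [era_summary(toy, gap) for gap in (0, 30, 180)]
for row in summaries:
    print(row)
```

If the summaries barely change across gaps, the choice is unlikely to matter; if they change a lot, the gap should be justified explicitly in the documentation.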

4.2 Trimming (if available)

Defaults usually trim to:

observation period boundaries
censoring events

Recommendation

Prefer explicit study-window restrictions via entry/inclusion logic (e.g., "index date after 2020-01-01") rather than relying on trimming.

5. Diagnostics & Transportability

Minimal cohort diagnostics checklist

Concept prevalence / concept contribution
- confirm top contributing concepts match clinical intent
Time distributions
- index date distribution; event timing relative to index for key restrictions/outcomes
Negative control prevalence (minimum sanity check)
- run at least one "should be near-zero" check for obvious impossibilities (e.g., pediatric-only condition in adult-only cohort, sex-specific condition in opposite sex), or document an explicit rationale if no such check applies
Cross-site transportability sanity
- check whether cohort depends on site-specific ETL artifacts (e.g., only works if visits are linked a certain way)
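
As one way to operationalize the negative control check, the sketch below computes the share of cohort members who also carry an "impossible" condition. The function names, the person IDs, and the 1% threshold are all hypothetical; the SOP does not prescribe a threshold.

```python
def negative_control_prevalence(cohort_person_ids, control_person_ids):
    """Fraction of cohort members who also have the negative control event."""
    cohort = set(cohort_person_ids)
    if not cohort:
        raise ValueError("cohort is empty")
    return len(cohort & set(control_person_ids)) / len(cohort)


def passes_negative_control(prevalence, threshold=0.01):
    """True when prevalence is at or below the (assumed) near-zero threshold."""
    return prevalence <= threshold


# Toy example: an adult-only cohort of 200 people, 1 of whom has a
# pediatric-only condition recorded.
adult_cohort = range(1, 201)
pediatric_condition_persons = [17, 5001, 5002]   # only person 17 is in the cohort

prevalence = negative_control_prevalence(adult_cohort, pediatric_condition_persons)
ok = passes_negative_control(prevalence)
```

A failing check is a prompt to inspect concept sets and source mappings, not proof that the phenotype is wrong.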

General transportability principle (normative)

Phenotype definitions should represent the intended clinical definition, not compensate for source-specific ETL issues.

If the phenotype only "works" at one site due to an ETL quirk, fix mapping/ETL rather than encoding that quirk into the cohort logic.

6. Operational pre-release checklist

Before publishing/releasing a cohort:

Clinical question and phenotype intent documented (index anchor, eligibility, time-at-risk).
Concept set policies documented (descendants, mapping, domain, vocabulary freeze).
Continuous observation justified (baseline and follow-up rationale; bias considerations noted).
Entry vs inclusion semantics consistent with SOP rule.
Death role specified (outcome vs censoring vs competing event).
"Allow events outside observation period" is OFF or has documented justification + explicit approval.
Cohort era/collapse meaning documented (gap value + assumption table).
Minimal diagnostics completed and reviewed.
Version assigned; change log updated; owner + QA sign-off recorded.

7. Resources and References

The following office hours session provides additional context and demonstrations related to this SOP:

[02-12-26] Cohort Definitions Using ATLAS
- Video Recording | Transcript
