
Awaiting Approval

Before merging an approved SOP, the contributor should change this header from Awaiting Approval to Approved

Imaging SOP

Version History

Purpose

Guidance for Data Acquisition Sites regarding imaging data processing, harmonization, and upload

Procedures

Data Preparation

  1. Identify data extraction partners and methods (e.g., CTSI)

  2. Identify the sources of the data (e.g., PACS, site research repository)

  3. Determine which image modalities need to be extracted, along with the date range and the expected data size

  4. Confirm the data format you will receive (e.g., DICOM). DICOM is the target format for CHORUS.

  5. Confirm which identifiers link the imaging data to the patient (e.g., name, or a hospital ID such as MRN), along with the procedure codes and the time-shifting approach. Refer to the Standards SOP on data linkage and date shifting. Linkage IDs: a) Accession number = OMOP: image_occurrence_id; b) Procedure number = OMOP: procedure_occurrence_id

  6. Ensure accession numbers for specific image tests are part of the encounters of interest. The preferred method is using EHR search to filter tests based on encounter (e.g., CDW) and not pulling directly from PACS based on the patient identifiers. This is to simplify image test mapping to the primary encounter of interest for the multimodal data pull.

  7. Map the header metadata relevant to your local site to the header metadata selected by CHORUS (list of metadata fields below). Refer to points 12 and 13 for lookup table specifics for the deidentification tool. Including the RSNA LOINC codes that specify the type of study would be useful. Imaging studies should have a corresponding entry in the PROCEDURE_OCCURRENCE OMOP table, so that they can be linked to the rest of the data using the procedure_occurrence_id field. Refer to the multimodal linkage SOP for specifics about linking images to the other data: https://chorus-ai.github.io/Chorus_SOP/docs/Multimodal-Linkage/ . For date shifting, please refer to the Date Shifting SOP.

Study Level: AccessionNumber, StudyDescription, StudyInstanceUID, Modality, StudyDate, Manufacturer, ManufacturersModelName, StudyTime, MagneticFieldStrength, BodyPartExamined, Radiopharmaceutical, ContrastBolusAgent, ContrastBolusRoute

Series Level: SeriesDescription, SeriesInstanceUID, SliceThickness, ViewPosition, ImageLaterality, ImagesinAcquisition, TransducerType, TransducerFrequency, SeriesNumber

Image Level: N/A
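As an illustration of how the fields above could be collected into a table, here is a minimal Python sketch. It is not part of the CHORUS tooling. The field lists are only a subset of the fields named above, and the helper names (`extract_fields`, `read_header`, `write_table`) are hypothetical. `read_header` assumes the third-party pydicom package is installed; the other helpers use only the standard library and work on any object that exposes DICOM keywords as attributes.

```python
import csv
from types import SimpleNamespace

# Subset of the study- and series-level fields listed above (illustrative only).
STUDY_FIELDS = ["AccessionNumber", "StudyDescription", "StudyInstanceUID",
                "Modality", "StudyDate", "StudyTime", "Manufacturer",
                "BodyPartExamined"]
SERIES_FIELDS = ["SeriesDescription", "SeriesInstanceUID", "SeriesNumber",
                 "SliceThickness", "ViewPosition", "ImageLaterality"]


def extract_fields(header, fields):
    """Return {field: value as string}; absent attributes become ''."""
    row = {}
    for name in fields:
        value = getattr(header, name, None)
        row[name] = "" if value is None else str(value)
    return row


def read_header(path):
    """Read a DICOM header without pixel data (assumes pydicom is installed)."""
    import pydicom
    return pydicom.dcmread(path, stop_before_pixels=True)


def write_table(rows, fields, out_path):
    """Write the extracted rows to a CSV file with one column per field."""
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Stand-in for a real header so the sketch runs without any DICOM file.
    fake_header = SimpleNamespace(AccessionNumber="A1", Modality="CT",
                                  StudyDate="20200115")
    print(extract_fields(fake_header, STUDY_FIELDS))
```

Empty strings in the resulting rows make missing header values easy to count during a local quality review.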

  8. Identify metadata not covered by CHORUS that would be relevant to your site locally.

  9. Pursue local data quality review (e.g., data missingness).

  10. Identifiable crosswalk table creation (stays local): map every original DICOM metadata field to the deidentification procedure applied to it (e.g., replace, erase) and to the new field in the deidentified DICOM, so that the table can be used as a crosswalk. The crosswalk table stays local at the site and is not shared. Refer to points 12 and 13 below, depending on the DICOM deidentification status at your site.

  11. Deidentified DICOM metadata tables (for sharing): these enable imaging data querying without loading the DICOM data.

  12. DICOM with metadata already deidentified at your local site (skip to #13 if you do not have data already deidentified): DICOM files can be shared if the site already has an approved solution that may or may not require pixel deidentification, but you should still use the CHORUS-specific CTP version to process your data consistently.

  • https://github.com/chorus-ai/CTP-deid/tree/main
  • You will need to prepare two lookup tables, "image_map.csv" and "personal_map.csv"; examples are in CHoRUS_metadata_deid_instruction/pydicom/loopup_table/. PatientID and AccessionNumber in the DICOM metadata are replaced by person_id and image_occurrence_id from the OMOP table. Selected date tags (see repository Table 1) are shifted by a predefined number of days (specific to each PatientID). Ensure this shift is consistent across EHR and waveform data.
  • You will first need to run a pydicom script for wrangling these lookup tables and then run the CTP tool
  • For CTP set up instructions, contact Xiang Li, PhD: XLI60@mgh.harvard.edu
  13. DICOM with metadata that needs deidentification: You should use the CHORUS-specific CTP tool.
  • https://github.com/chorus-ai/CTP-deid/tree/main
  • You will need to prepare two lookup tables, "image_map.csv" and "personal_map.csv"; examples are in CHoRUS_metadata_deid_instruction/pydicom/loopup_table/. PatientID and AccessionNumber in the DICOM metadata are replaced by person_id and image_occurrence_id from the OMOP table. Selected date tags (see repository Table 1) are shifted by a predefined number of days (specific to each PatientID). Ensure this shift is consistent across EHR and waveform data.
  • You will first need to run a pydicom script for wrangling these lookup tables and then run the CTP tool
  • For CTP set up instructions, contact Xiang Li, PhD: XLI60@mgh.harvard.edu
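The ID replacement and per-patient date shift described above are performed by the CTP tool itself. The Python sketch below only illustrates the logic, for example to spot-check that a shifted date matches the shift applied to the EHR data. The lookup table column names (`PatientID`, `person_id`, `shift_days`) and the helper names are assumptions for illustration; the actual table layout is defined by the examples in the repository.

```python
import csv
import io
from datetime import datetime, timedelta


def shift_dicom_date(value, shift_days):
    """Shift a DICOM date string (YYYYMMDD) by shift_days; empty stays empty."""
    if not value:
        return value
    shifted = datetime.strptime(value, "%Y%m%d") + timedelta(days=shift_days)
    return shifted.strftime("%Y%m%d")


def load_person_map(csv_text):
    """Parse lookup table text into {PatientID: (person_id, shift_days)}.

    The column names used here are hypothetical.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {r["PatientID"]: (r["person_id"], int(r["shift_days"])) for r in reader}


def deidentify_header(header, person_map, image_map, date_tags=("StudyDate",)):
    """Return a copy of a header dict with IDs replaced and dates shifted.

    image_map is a hypothetical {AccessionNumber: image_occurrence_id} dict.
    """
    person_id, shift_days = person_map[header["PatientID"]]
    out = dict(header)
    out["PatientID"] = person_id
    out["AccessionNumber"] = image_map[header["AccessionNumber"]]
    for tag in date_tags:
        if tag in out:
            out[tag] = shift_dicom_date(out[tag], shift_days)
    return out


if __name__ == "__main__":
    person_map = load_person_map(
        "PatientID,person_id,shift_days\nMRN001,10001001,-10\n")
    original = {"PatientID": "MRN001", "AccessionNumber": "ACC9",
                "StudyDate": "20200115"}
    print(deidentify_header(original, person_map, {"ACC9": "555"}))
```

Keeping the shift as a single per-patient integer is what allows the same offset to be reused for EHR and waveform timestamps.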
  14. Test deidentified DICOM readability: load a file in Horos/OsiriX to ensure that the DICOM file opens
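Before opening files in a viewer, a quick scripted pre-check can catch files that were truncated or corrupted during processing. The sketch below relies on one fact about the DICOM file format: a DICOM Part 10 file starts with a 128-byte preamble followed by the four bytes "DICM". It does not replace the viewer test, since a file can pass this check and still fail to render. The function names are hypothetical.

```python
import os
import tempfile


def has_dicom_preamble(path):
    """True if the file has the 128-byte preamble followed by b'DICM'."""
    with open(path, "rb") as f:
        head = f.read(132)
    return len(head) == 132 and head[128:132] == b"DICM"


def find_files_without_preamble(root):
    """Walk root and list .dcm files that fail the preamble check."""
    failures = []
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith(".dcm"):
                path = os.path.join(folder, name)
                if not has_dicom_preamble(path):
                    failures.append(path)
    return failures


if __name__ == "__main__":
    # Build two small stand-in files so the sketch runs without real data.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "good.dcm"), "wb") as f:
        f.write(b"\x00" * 128 + b"DICM")
    with open(os.path.join(demo_dir, "bad.dcm"), "wb") as f:
        f.write(b"not a dicom file")
    print(find_files_without_preamble(demo_dir))
```

For a deeper check, each file could also be parsed with a DICOM library such as pydicom, which would surface header-level problems that the preamble test cannot see.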

  15. Folder and file structure and naming: ensure that folders and files are organized and named as follows:

The Images folder should contain all images for the patient, with images organized in the standard DICOM hierarchy with study/series folders.

Folder and File names should follow the format below:

  • Patient Identification: Typically includes a de-identified person ID.
  • Study Id: The study Id, may be a de-identified version of DICOM tags StudyInstanceUID or StudyId.
  • Series Id: The series Id, may be a de-identified version of DICOM tags SeriesInstanceUID or SeriesNumber.
  • Modality: Refers to the type of equipment used for the scan, such as MR (Magnetic Resonance), CT (Computed Tomography), XR (X-ray), or US (Ultrasound).
  • Instance Number: Represents the specific image number within a series.

Optional: Inclusion of patient id, study id, and series id in the filename helps to recover orphaned files but is optional.

Example of a DICOM Image folder and filenames:

root
└── 10001001
    └── Images
        └── 1234
            └── 5678
                ├── 10001001_1234_5678_CT_000.dcm
                ├── 10001001_1234_5678_CT_001.dcm
                ├── 10001001_1234_5678_CT_002.dcm
                ...

This example represents a CT scan for patient 10001001 with Study Id 1234 and Series Id 5678. The file ending in 000 is the first image in that series. Please make sure that these fields match the DICOM metadata tags.
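The naming convention above can be generated and checked programmatically. The following Python sketch builds a path in the layout shown in the example and verifies that the folder names agree with the IDs embedded in the filename. The function names are hypothetical. The three-digit zero padding is inferred from the example filenames, and the sketch assumes that the IDs themselves contain no underscores.

```python
import re
from pathlib import PurePosixPath

# person_study_series_MODALITY_instance.dcm, as in the example above.
FILENAME_RE = re.compile(
    r"^(?P<person_id>[^_]+)_(?P<study_id>[^_]+)_(?P<series_id>[^_]+)"
    r"_(?P<modality>[A-Z]+)_(?P<instance>\d+)\.dcm$"
)


def build_filename(person_id, study_id, series_id, modality, instance):
    """Compose a filename with a three-digit, zero-padded instance number."""
    return f"{person_id}_{study_id}_{series_id}_{modality}_{instance:03d}.dcm"


def build_path(root, person_id, study_id, series_id, modality, instance):
    """Compose root/<person>/Images/<study>/<series>/<filename>."""
    name = build_filename(person_id, study_id, series_id, modality, instance)
    return PurePosixPath(root, str(person_id), "Images",
                         str(study_id), str(series_id), name)


def path_is_consistent(path):
    """True if the folder names agree with the IDs embedded in the filename."""
    path = PurePosixPath(path)
    match = FILENAME_RE.match(path.name)
    if match is None or len(path.parts) < 5:
        return False
    person, images, study, series = path.parts[-5:-1]
    expected = (match["person_id"], "Images",
                match["study_id"], match["series_id"])
    return (person, images, study, series) == expected


if __name__ == "__main__":
    p = build_path("root", 10001001, 1234, 5678, "CT", 0)
    print(p, path_is_consistent(p))
```

A check like `path_is_consistent` only compares the path with the filename; confirming that both match the DICOM metadata tags still requires reading the headers.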

DICOM Pixel de-identification

  • under development

DICOM defacing

  • under development

Data Upload to Azure

  • refer to Azure data upload SOP

Reference Materials

RSNA CTP Documentation: https://mircwiki.rsna.org/index.php?title=MIRC_CTP