Dataset Structure

Schema Overview

The recommended data types below are chosen to suit loading the dataset into a dataframe or SQL database for analysis.

| Field name | Recommended type | Description | Sample values |
|---|---|---|---|
| Gender | Categorical / String | The biological sex of the patient. | Male, Female |
| Age | String (Mixed) | The age of the patient. Note: may vary depending on source image type. | Varies |
| Modality | String | The document type. All entries are SC (Secondary Capture). | SC |
| Description | String | Source image type and content category. | Dose Report Screenshot, Clinical Photo, Scanned Document |
| Size_raw | String | The file size as displayed in the UI. | 500 KB, 2 MB |
| Size_bytes | Float / Int (Derived) | The file size converted to a standard numerical unit for analysis. | 500000, 2000000 |
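The derivation of Size_bytes from Size_raw can be sketched as follows. This is a minimal example, not the dataset's actual conversion: it assumes decimal units (1 KB = 1,000 bytes), which is consistent with the sample values above (500 KB to 500000), and the function name `size_raw_to_bytes` is hypothetical.

```python
import re

# Decimal (SI) multipliers: an assumption inferred from the sample values,
# where "500 KB" corresponds to 500000.
_UNIT_FACTORS = {"B": 1, "KB": 1_000, "MB": 1_000_000, "GB": 1_000_000_000}

# A number (optionally with a decimal part), optional whitespace, then a unit.
_SIZE_PATTERN = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([KMG]?B)\s*$", re.IGNORECASE)


def size_raw_to_bytes(size_raw):
    """Convert a UI size string such as '500 KB' to an integer byte count.

    Returns None when the value is empty or does not match the expected pattern.
    """
    match = _SIZE_PATTERN.match(size_raw or "")
    if match is None:
        return None
    value, unit = match.groups()
    return round(float(value) * _UNIT_FACTORS[unit.upper()])
```

Returning None for unparseable strings, rather than raising, keeps the derived column easy to load as a nullable numeric type.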

Usage & considerations

Technical characteristics of secondary capture (SC)

Secondary capture purpose

SC objects enable storage of non-DICOM images within PACS using DICOM format. Common sources include: screenshots from non-DICOM modalities or applications, scanned paper documents and reports, clinical photographs (wounds, skin lesions, surgical sites, dermoscopy), imported studies from external facilities on CD/DVD, images from legacy systems without native DICOM support, dose monitoring screenshots, and patient consent forms or documentation.

Image source diversity

SC images originate from heterogeneous sources with significant quality variation: smartphone or digital cameras (variable resolution, lighting, and white balance), flatbed scanners (paper documents converted to digital format with potential skew or artifacts), screen captures from workstations (varying resolution and compression), external media from other institutions (CD/DVD imports with unknown acquisition parameters), and non-medical imaging devices. This diversity impacts automated image analysis and quality assessment.

DICOM encapsulation

SC objects wrap existing images (JPEG, PNG, TIFF, BMP, etc.) in a DICOM metadata structure, providing patient demographics, study context, and PACS integration capabilities. However, the original modality-specific acquisition metadata is typically absent or incomplete: SC images lack the technical parameters (kVp, mAs, field of view, pixel spacing) typical of native DICOM objects from medical imaging devices. The Conversion Type attribute indicates how the image was produced, for example from digitized film (DF), a digital interface (DI), a workstation (WSD), or as a synthetic image (SYN).
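As an illustration, the Conversion Type codes can be turned into readable labels with a small lookup. The sketch below lists the defined terms for the DICOM Conversion Type attribute (0008,0064); the helper name `describe_conversion_type` and the fallback labels are hypothetical, and the code does not read DICOM files itself.

```python
# Defined terms for the DICOM Conversion Type attribute (0008,0064).
CONVERSION_TYPES = {
    "DV": "Digitized Video",
    "DI": "Digital Interface",
    "DF": "Digitized Film",
    "WSD": "Workstation",
    "SD": "Scanned Document",
    "SI": "Scanned Image",
    "DRW": "Drawing",
    "SYN": "Synthetic Image",
}


def describe_conversion_type(code):
    """Return a readable label for a Conversion Type code.

    Missing values and unrecognized codes get explicit fallback labels
    instead of raising, since SC metadata is often incomplete.
    """
    if code is None:
        return "Missing"
    key = str(code).strip().upper()
    return CONVERSION_TYPES.get(key, f"Unknown ({key})")
```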

Quality considerations

SC images frequently exhibit: inconsistent resolution and aspect ratios (ranging from low-quality phone photos to high-resolution scans), variable compression artifacts from source format, non-standardized patient positioning and anatomical orientation, embedded text annotations or measurement graphics from source systems, potential PHI visibility in screenshots (patient names, medical record numbers, dates visible in captured interface), lack of geometric calibration information preventing accurate measurements, color inconsistencies and lighting variations in clinical photographs, and artifacts from document scanning (shadows, fold marks, coffee stains on paper documents).
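Two of these issues, low resolution and unusual aspect ratios, can be screened from pixel dimensions alone. The sketch below is a rough heuristic under stated assumptions: the function name `quality_flags` and both thresholds (`min_side=512`, `max_aspect=3.0`) are illustrative values, not recommendations from the dataset, and would need tuning per content type.

```python
def quality_flags(width, height, min_side=512, max_aspect=3.0):
    """Return a list of coarse quality flags for an image of the given size.

    The thresholds are hypothetical defaults chosen for illustration only.
    """
    if width <= 0 or height <= 0:
        return ["invalid_dimensions"]

    flags = []
    # Shortest side below the threshold suggests a low-resolution capture.
    if min(width, height) < min_side:
        flags.append("low_resolution")
    # Ratio of the longer side to the shorter side, always >= 1.
    aspect = max(width, height) / min(width, height)
    if aspect > max_aspect:
        flags.append("extreme_aspect_ratio")
    return flags
```

A check like this only covers geometry; compression artifacts, lighting, and visible PHI require inspecting the pixel data itself.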

Clinical documentation use cases

SC enables comprehensive patient records by incorporating supplementary visual documentation that complements primary diagnostic imaging. Wound care photography tracks healing progression over time. Pre-operative and post-operative surgical site images document anatomical appearance and surgical outcomes. Scanned outside facility reports provide comparison studies and prior imaging history. Dermatological photography captures skin lesions for teledermatology consultation and longitudinal monitoring. Consent forms, patient questionnaires, and clinical trial documentation maintain complete electronic health records within PACS.

Workflow integration

SC objects facilitate PACS-centric workflows by consolidating all patient-related imaging into a unified archive. Radiologists can review imported outside studies alongside current examinations. Clinicians access wound photographs and clinical documentation directly from PACS without switching to separate systems. Quality assurance teams capture dose report screenshots for radiation safety monitoring. IT departments use SC for legacy system migration, converting historical film archives and proprietary format images into DICOM-compliant objects.

Primary use cases

  • Developing robust image classification systems that handle heterogeneous image sources and significant quality variations across SC objects.
  • Training text detection and PHI scrubbing models to automatically identify and redact sensitive information visible in screenshots and scanned documents.
  • Building document classification systems to categorize SC images by content type: clinical photographs, dose reports, outside studies, scanned documents, consent forms, or other documentation categories.
  • Creating automated quality assessment models that flag SC images with insufficient diagnostic quality, incorrect orientation, excessive compression artifacts, or embedded PHI requiring redaction before sharing.
  • Analyzing PACS workflow patterns and SC usage trends to optimize storage allocation, identify workflow inefficiencies, and guide enterprise imaging strategy development.
  • Developing clinical photography standardization systems that provide guidance on optimal lighting, positioning, distance, and anatomical views for wound care and dermatological imaging.
  • Training models resilient to real-world image variability (diverse sources, resolutions, lighting conditions) for deployment in heterogeneous clinical environments where SC objects are common.
  • Building automated document processing pipelines that extract structured data from scanned reports, consent forms, and clinical documentation captured as SC objects.
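For the PHI scrubbing use case, a rule-based baseline is often a useful point of comparison before training a model. The sketch below assumes text has already been extracted from the image by an OCR step (not shown) and redacts two illustrative patterns. The patterns, labels, and the function name `redact_phi` are hypothetical and far from exhaustive; real PHI detection needs much broader coverage.

```python
import re

# Illustrative patterns only: a labelled medical record number and two
# common date layouts. Real screenshots contain many more PHI formats.
PHI_PATTERNS = [
    ("MRN", re.compile(r"\bMRN\s*[:#]?\s*\d{5,10}\b", re.IGNORECASE)),
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("DATE", re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")),
]


def redact_phi(text):
    """Replace matched patterns with a bracketed label.

    Returns the redacted text and a dict counting replacements per label.
    """
    counts = {}
    for label, pattern in PHI_PATTERNS:
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = counts.get(label, 0) + n
    return text, counts
```

The per-label counts can double as weak labels when assembling training data for a learned detector.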
