The data types are selected to best suit a dataframe or SQL database for analysis.
| Field name | Recommended type | Description | Sample values |
|---|---|---|---|
| Gender | Categorical / String | The biological sex of the patient. | Male, Female |
| Age | String (Mixed) | The age of the patient at the time of the study. Note: Inherits from parent imaging study. | Varies |
| Modality | String | The document type. All entries are SR (Structured Report). | SR |
| Description | String | Report type and content category. | Measurement Report, CAD Report, Key Object Selection |
| Size_raw | String | The file size as displayed in the UI. | 50 KB, 150 KB |
| Size_bytes | Float / Int | (Derived) The file size converted to bytes for numerical analysis. | 50000, 150000 |
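The derivation of `Size_bytes` from `Size_raw` could be sketched as follows. This is a minimal illustration, not the pipeline's actual converter: the function name `size_to_bytes` is hypothetical, and decimal units (1 KB = 1000 bytes) are assumed because they match the sample values in the table.

```python
import re

# Assumption: decimal multipliers (1 KB = 1000 bytes), chosen to match the
# sample values "50 KB" -> 50000 shown in the table.
_UNIT_FACTORS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3}

# A number, optional whitespace, then a unit made of letters.
_SIZE_PATTERN = re.compile(r"^\s*([\d.]+)\s*([A-Za-z]+)\s*$")


def size_to_bytes(size_raw):
    """Convert a UI size string such as '50 KB' to an integer byte count.

    Returns None when the string is missing or cannot be parsed, so the
    derived column can hold nulls instead of raising during a bulk load.
    """
    match = _SIZE_PATTERN.match(size_raw or "")
    if match is None:
        return None
    number, unit = match.groups()
    factor = _UNIT_FACTORS.get(unit.upper())
    if factor is None:
        return None
    try:
        return int(round(float(number) * factor))
    except ValueError:
        # Malformed numbers such as "1.2.3" end up here.
        return None
```

Returning `None` for unparseable input keeps the raw and derived columns aligned row by row; a stricter loader might prefer to raise instead.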
Structured Reports follow the DICOM SR standard, which defines a hierarchical tree structure for encoding clinical information. Content items include measurements, observations, codes from standard terminologies (SNOMED CT, RadLex), and relationships between findings. This structured format enables machine parsing and automated quality assurance.
Common SR types include: Measurement Reports (quantitative analysis results), CAD Reports (computer-aided detection findings), Key Object Selection (references to significant images), Dose Reports (radiation exposure documentation), and Comprehensive SR (full radiology reports with coded findings).
SR documents organize content as coded concepts linked through relationships (CONTAINS, HAS OBS CONTEXT, INFERRED FROM). Each content item has a concept name (drawn from a standard vocabulary), a value (numeric, text, or coded), and optional qualifiers. This semantic structure lets queries and analytics target specific concepts rather than free text.
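The content tree described above can be modeled with a small in-memory structure. The sketch below is an illustration only: the `ContentItem` class, its field names, and the sample concept names are hypothetical and do not reproduce the DICOM SR encoding or any particular library's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Union


@dataclass
class ContentItem:
    """One node of a simplified SR content tree (hypothetical model)."""
    concept_name: str                             # name from a standard vocabulary
    value: Union[float, str, None] = None         # numeric, text, or coded value
    relationship: Optional[str] = None            # relationship to the parent item
    qualifiers: Dict[str, str] = field(default_factory=dict)
    children: List["ContentItem"] = field(default_factory=list)


def walk(item: ContentItem) -> Iterator[ContentItem]:
    """Yield the item and all of its descendants in depth-first order."""
    yield item
    for child in item.children:
        yield from walk(child)


def find_by_concept(root: ContentItem, concept_name: str) -> List[ContentItem]:
    """Return every item in the tree whose concept name matches exactly."""
    return [item for item in walk(root) if item.concept_name == concept_name]


# Illustrative tree; the concept names and values are made up for this example.
report = ContentItem(
    concept_name="Measurement Report",
    children=[
        ContentItem("Observer Name", value="Reader A",
                    relationship="HAS OBS CONTEXT"),
        ContentItem(
            "Measurement Group",
            relationship="CONTAINS",
            children=[
                ContentItem("Diameter", value=12.5, relationship="CONTAINS",
                            qualifiers={"unit": "mm"}),
            ],
        ),
    ],
)

diameters = find_by_concept(report, "Diameter")
```

Here `diameters` holds the single `Diameter` item, whose value and unit qualifier can then be flattened into a dataframe row.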
SRs maintain references to source images through DICOM UIDs, enabling correlation between findings and imaging data. Spatial coordinates can be encoded to mark lesion locations. This linkage is essential for training AI models with ground-truth annotations derived from clinical reports.
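Correlating findings with their source images amounts to a lookup keyed by UID. The following sketch assumes findings have already been extracted into plain dictionaries; the UIDs are short placeholders, and the file paths, field names, and `link_findings` function are hypothetical.

```python
from typing import Dict, List, Tuple

# Placeholder UIDs; real DICOM UIDs are much longer dotted-decimal strings.
# The file paths are invented for illustration.
images: Dict[str, dict] = {
    "1.2.3.1": {"path": "series1/img001.dcm"},
    "1.2.3.2": {"path": "series1/img002.dcm"},
}

# Each finding carries the UID of the image it refers to and a 2D point
# (column, row) marking the lesion location on that image.
findings: List[dict] = [
    {"label": "lesion", "image_uid": "1.2.3.2", "point": (120.5, 240.0)},
    {"label": "lesion", "image_uid": "1.2.3.9", "point": (10.0, 10.0)},
]


def link_findings(findings: List[dict],
                  images: Dict[str, dict]) -> Tuple[List[dict], List[dict]]:
    """Split findings into those resolved to an image and those left dangling."""
    linked, unresolved = [], []
    for finding in findings:
        image = images.get(finding["image_uid"])
        if image is None:
            # The referenced image is not in the local index.
            unresolved.append(finding)
            continue
        linked.append({
            "label": finding["label"],
            "path": image["path"],
            "point": finding["point"],
        })
    return linked, unresolved


linked, unresolved = link_findings(findings, images)
```

Keeping unresolved references separate, rather than dropping them silently, makes it easier to audit how much of a report corpus can actually be used as ground truth.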
SRs are generated by PACS workstations, CAD systems, quantitative analysis tools, and voice recognition systems. They enable standardized reporting templates, automated data extraction for registries, quality metrics calculation, and clinical decision support through real-time rule evaluation.
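Rule evaluation over extracted values might look like the sketch below. Everything in it is an assumption made for illustration: the rule names, the measurement keys, and the threshold are arbitrary and carry no clinical meaning.

```python
from typing import Callable, Dict, List, Tuple

# A rule is a name paired with a predicate over the extracted measurements.
Rule = Tuple[str, Callable[[Dict[str, float]], bool]]

# Hypothetical rules; the 10.0 threshold is arbitrary, not a clinical guideline.
RULES: List[Rule] = [
    ("large_measurement", lambda m: m.get("diameter_mm", 0.0) > 10.0),
    ("missing_dose", lambda m: "dose_mgy" not in m),
]


def evaluate_rules(measurements: Dict[str, float],
                   rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules that fire, in definition order."""
    return [name for name, predicate in rules if predicate(measurements)]


fired = evaluate_rules({"diameter_mm": 12.5})
```

With the input above, both rules fire: the diameter exceeds the arbitrary threshold and no dose value is present.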