@alexanderkiel
Last active July 17, 2025 13:32

Flattening Concept

Problem Statement

Researchers like to have flat tables of the extracted DUP data. We already have TORCH, which extracts the DUP data. In that process, TORCH creates one FHIR bundle per patient and a single core bundle with data that is not connected to a specific patient, such as medications.

The patient bundles contain FHIR resources according to the CDS profiles. However, the data inside the resources is already minimised, which is beneficial for the flattening process. On top of that, we have the CRTDL description of every data extraction, which contains all constraints the FHIR data fulfills in addition to the constraints of the CDS profiles.

Flattening

The process of flattening converts the hierarchical data of a FHIR resource into one or more rows of a table. In the simplest case, where the hierarchical data contains no arrays or all arrays have at most one element, one row in the table can represent all of the hierarchical data without any loss.

Simple Example

The following FHIR observation resource:

{
  "resourceType": "Observation",
  "id": "DFVZ5VHCULKOGYN3",
  "code": {
    "coding": [
      {
        "system": "http://loinc.org",
        "code": "20570-8"
      }
    ]
  },
  "subject": {
    "reference": "Patient/DFVZ5VHCULKOGYMX"
  },
  "valueQuantity": {
    "value": 45,
    "unit": "percent"
  }
}

will be flattened into the following Observation table:

id code-loinc subject.type subject.id value.value value.unit
DFVZ5VHCULKOGYN3 20570-8 Patient DFVZ5VHCULKOGYMX 45 percent
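
The mapping from the resource to this row can be sketched in Python. This is only an illustration of the simple case, not the actual implementation: the function name `flatten_observation` is hypothetical, and the sketch assumes that every array has at most one element and that only LOINC codings occur.

```python
def flatten_observation(obs: dict) -> dict:
    """Flatten one Observation into a single row (hypothetical helper).

    Assumes all arrays contain at most one element.
    """
    row = {"id": obs["id"]}

    # Assumption: only LOINC codings are present in this simple example.
    for coding in obs.get("code", {}).get("coding", []):
        if coding["system"] == "http://loinc.org":
            row["code-loinc"] = coding["code"]

    # A reference like "Patient/<id>" is split into its type and id parts.
    ref_type, ref_id = obs["subject"]["reference"].split("/", 1)
    row["subject.type"] = ref_type
    row["subject.id"] = ref_id

    quantity = obs.get("valueQuantity", {})
    row["value.value"] = quantity.get("value")
    row["value.unit"] = quantity.get("unit")
    return row


observation = {
    "resourceType": "Observation",
    "id": "DFVZ5VHCULKOGYN3",
    "code": {"coding": [{"system": "http://loinc.org", "code": "20570-8"}]},
    "subject": {"reference": "Patient/DFVZ5VHCULKOGYMX"},
    "valueQuantity": {"value": 45, "unit": "percent"},
}

row = flatten_observation(observation)
```

The keys of `row` correspond to the column names of the Observation table.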

In addition to the data CSV file, we will also have a data dictionary CSV file.

column data-type description
id string identifier
code-loinc string code from http://loinc.org
subject.type resource-type type of the subject
subject.id string id of the subject
value.value number value of the observation
value.unit string unit of the value

Data types:

data-type description
resource-type one from http://hl7.org/fhir/ValueSet/resource-types|4.0.1
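
As a rough sketch, the data dictionary CSV file could be written with Python's standard `csv` module. The helper `to_csv` and the constant `DICTIONARY_ROWS` are hypothetical names; the rows are taken from the data dictionary above.

```python
import csv
import io

# Rows of the data dictionary for the simple Observation example.
DICTIONARY_ROWS = [
    {"column": "id", "data-type": "string", "description": "identifier"},
    {"column": "code-loinc", "data-type": "string", "description": "code from http://loinc.org"},
    {"column": "subject.type", "data-type": "resource-type", "description": "type of the subject"},
    {"column": "subject.id", "data-type": "string", "description": "id of the subject"},
    {"column": "value.value", "data-type": "number", "description": "value of the observation"},
    {"column": "value.unit", "data-type": "string", "description": "unit of the value"},
]


def to_csv(rows: list, fieldnames: list) -> str:
    """Serialise a list of row dicts as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


dictionary_csv = to_csv(DICTIONARY_ROWS, ["column", "data-type", "description"])
```

The same helper would work for the data CSV file, with the table's column names passed as `fieldnames`.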

Example with CRTDL

We need one table for each attribute group. The following CRTDL excerpt defines three attribute groups:

[
  {
    "id": "1",
    "name": "Patient",
    "groupReference": "https://www.medizininformatik-initiative.de/fhir/core/modul-person/StructureDefinition/PatientPseudonymisiert",
    "attributes": [
      {
        "attributeRef": "Patient.gender"
      }
    ]
  },
  {
    "id": "2",
    "name": "ObservationLab",
    "groupReference": "https://www.medizininformatik-initiative.de/fhir/core/modul-labor/StructureDefinition/ObservationLab",
    "attributes": [
      {
        "attributeRef": "Observation.code"
      },
      {
        "attributeRef": "Observation.subject",
        "linkedGroups": [
          "1"
        ]
      },
      {
        "attributeRef": "Observation.encounter",
        "linkedGroups": [
          "3"
        ]
      },
      {
        "attributeRef": "Observation.value"
      }
    ]
  },
  {
    "id": "3",
    "name": "Encounter",
    "includeReferenceOnly": true,
    "groupReference": "https://www.medizininformatik-initiative.de/fhir/core/modul-fall/StructureDefinition/KontaktGesundheitseinrichtung",
    "attributes": [
      {
        "attributeRef": "Encounter.subject",
        "linkedGroups": [
          "1"
        ]
      },
      {
        "attributeRef": "Encounter.hospitalization",
        "mustHave": true
      }
    ]
  }
]

Consider the following FHIR Observation and Encounter resources:

{
  "resourceType": "Observation",
  "id": "DFVZ5VHCULKOGYN3",
  "code": {
    "coding": [
      {
        "system": "http://loinc.org",
        "code": "20570-8"
      }
    ]
  },
  "subject": {
    "reference": "Patient/DFVZ5VHCULKOGYMX"
  },
  "encounter": {
    "reference": "Encounter/DFVZ5VHCULKOGXMX"
  },
  "valueQuantity": {
    "value": 45,
    "unit": "percent"
  }
}
{
  "resourceType": "Encounter",
  "id": "DFVZ5VHCULKOGXMX",
  "subject": {
    "reference": "Patient/DFVZ5VHCULKOGYMX"
  },
  "hospitalization": {
    "admitSource": {
      "coding": [
        {
          "system": "http://fhir.de/CodeSystem/dgkev/Aufnahmeanlass",
          "code": "N"
        }
      ],
      "text": "foo"
    }
  }
}

For the attribute groups 2 and 3, the tables Observation-2 and Encounter-3 are created.

Data Dictionary

table column data-type description
Observation-2 id string identifier
Observation-2 code-loinc string code from http://loinc.org
Observation-2 subject string id of the patient in table Patient-1
Observation-2 encounter string id of one encounter in table Encounter-3
Observation-2 value.value number value of the observation
Observation-2 value.unit string unit of the value
Encounter-3 id string identifier
Encounter-3 subject string id of the patient in table Patient-1
Encounter-3 hospitalization.admitSource.code-Aufnahmeanlass string code from http://fhir.de/CodeSystem/dgkev/Aufnahmeanlass
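
How the reference columns follow from the `linkedGroups` of the CRTDL can be sketched as below. The naming rule used here (resource type taken from the first `attributeRef`, followed by the attribute group id) is an assumption inferred from the table names in the data dictionary, and the functions `table_name` and `reference_columns` are hypothetical.

```python
def table_name(group: dict) -> str:
    """Derive a table name like "Observation-2" from an attribute group.

    Assumption: the resource type is the prefix of the first attributeRef.
    """
    resource_type = group["attributes"][0]["attributeRef"].split(".", 1)[0]
    return f"{resource_type}-{group['id']}"


def reference_columns(groups: list) -> list:
    """Return (table, column, target table) for every linked attribute."""
    tables = {group["id"]: table_name(group) for group in groups}
    result = []
    for group in groups:
        for attribute in group["attributes"]:
            for target_id in attribute.get("linkedGroups", []):
                column = attribute["attributeRef"].split(".", 1)[1]
                result.append((tables[group["id"]], column, tables[target_id]))
    return result


# Abbreviated version of the attribute groups from the CRTDL example.
groups = [
    {"id": "1", "attributes": [{"attributeRef": "Patient.gender"}]},
    {
        "id": "2",
        "attributes": [
            {"attributeRef": "Observation.code"},
            {"attributeRef": "Observation.subject", "linkedGroups": ["1"]},
            {"attributeRef": "Observation.encounter", "linkedGroups": ["3"]},
            {"attributeRef": "Observation.value"},
        ],
    },
    {
        "id": "3",
        "attributes": [
            {"attributeRef": "Encounter.subject", "linkedGroups": ["1"]},
            {"attributeRef": "Encounter.hospitalization", "mustHave": True},
        ],
    },
]

links = reference_columns(groups)
```

Each entry in `links` corresponds to one reference row of the data dictionary, for example the column subject of table Observation-2 pointing to table Patient-1.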

Reference

FHIR Data Types

CodeableConcept

For CodeableConcepts, only columns with the code values are generated. The text is omitted because all concepts in the CDS are coded. One code column is generated for each code system present in the data. The name of each code column ends with the name of the code system, which can be taken directly from the CodeSystem resource. For example, if condition resources carry both ICD-10 and SNOMED CT codes, there will be two columns, one named code-ICD10 and one named code-SCT.
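
A minimal sketch of this rule in Python follows. The function `codeable_concept_columns` and the mapping `SYSTEM_NAMES` are hypothetical; in practice the names would be read from the CodeSystem resources. The codes in the example are illustrative values only.

```python
# Hypothetical mapping from code system URL to the short name used as
# column suffix. The real names would come from the CodeSystem resources.
SYSTEM_NAMES = {
    "http://fhir.de/CodeSystem/bfarm/icd-10-gm": "ICD10",
    "http://snomed.info/sct": "SCT",
}


def codeable_concept_columns(prefix: str, concept: dict, system_names: dict) -> dict:
    """Create one column per code system; the text element is ignored."""
    columns = {}
    for coding in concept.get("coding", []):
        name = system_names[coding["system"]]
        columns[f"{prefix}-{name}"] = coding["code"]
    return columns


# Illustrative Condition.code with one coding per code system.
condition_code = {
    "coding": [
        {"system": "http://fhir.de/CodeSystem/bfarm/icd-10-gm", "code": "E11.90"},
        {"system": "http://snomed.info/sct", "code": "44054006"},
    ],
    "text": "omitted in the flat table",
}

columns = codeable_concept_columns("code", condition_code, SYSTEM_NAMES)
```

A coding whose system is missing from the mapping raises a `KeyError` in this sketch; how unknown code systems should be handled is left open here.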
