Skip to content

Instantly share code, notes, and snippets.

@maxnikulin
Created December 30, 2022 10:28
Show Gist options
  • Save maxnikulin/bd9f630466e1cac6aea162279ce5e15e to your computer and use it in GitHub Desktop.
Save maxnikulin/bd9f630466e1cac6aea162279ce5e15e to your computer and use it in GitHub Desktop.
glx2dicom - Convert Sirona GALAXIS (GALILEOS Viewer) CBCT format to DICOM

glx2dicom - Convert Sirona GALAXIS (GALILEOS Viewer) CBCT format to DICOM

Usage:

python3 glx2dicom.py --tag PatientName='Surname^Name' \
       --tag PatientBirthDate='19700101' \
       [--tag TAG=VALUE]... SRC_DIR OUT_DIR

This script is not intended for usage in clinics since nobody may assure correctness of conversion result. Perhaps the proper way to get CT from a Sirona device in DICOM format is to request to add an attachment with DICOM files in addition to the proprietary format. I would not be surprised if it is impossible without purchasing an extra option for the device.

I got a CD with cone beam computed tomography (CBCT) results and a proprietary Sirona GALILEOS Viewer (GALAXIS) application for MS Windows. I have not managed to get it running under wine, so I decided to convert it to a more wide spread file format.

The result may be explored using e.g. SightViewer (or dicomXplorer, another application create with SIGHT framework developed by IRCAD) or amide. Some applications do not have 3D features such as aeskulap, but still able to display individual slices.

Requirements: python3 with pydicom, GDCM, and lxml packages.

My guess concerning CT format is the following. It has an ID that is used in file names.

<ID>.gwg
file likely contain metadata such as patient name and birth date, but I have no idea which way it is encoded or encrypted. The file header (signature) is GWG16. It does not look like (possibly compressed) QDataStream. I am unsure what serializers are available in .NET. This script does not use the file, so it is necessary to specify metadata using command line options.
<ID>_vol_0
directory contains data and should be used as the argument of this script.
<ID>_vol_0/<ID>_proj_0
is an XML file with parameters that might be used to create panoramic projection. It seems the image is not saved and the vendor viewer computes it in runtime. Unused by this script.
<ID>_vol_0/<ID>_vol_0
is a gzipped XML file describing device settings and parameters of the saved files. There is a nested element having entity-encoded XML subtree as its content. This script takes some values, however more parameters may be obtained such as Xray tube current and spectrum peak.
<ID>_vol_0/<ID>_vol_0_<NNN>
files having different trailing number. This is CT slices written as gzip compressed file. They are 2 bytes per pixel (int16 or uint16) little endian raw greyscale images. Actually 12 bits are used despite the <ID>_vol_0/<ID>_vol_0 file declares 14 bits. Instead of using this script you may uncompress these files using e.g. zcat(1) and load them as raw images to e.g. ImageJ.

Some additions to the usual disclaimer:

  • I never faced DICOM format before.
  • This script is tested on a single CT, unsure ether it was produced by Galileos or Orthophos
  • Not all mandatory DICOM attributes are set.
  • Images are compressed using RLE method. Unsure if it is possible to easily use some lossless JPEG compression from, python however the gdcmconv(1) CLI tool can do it.
  • I am completely unsure concerning transfer function (DICOM Look up Table), so "Window" and "Rescale" attributes may be incorrect, see DICOM PS3.3 C.11 Look Up Tables and Presentation States
  • Likely it is possible to add the original files with parameters as custom DICOMDIR entries. It will make conversion lossless since the result may be updated later without access to source files. It is not done however.

glx2dicom 0.1

Copyright (C) 2022 Max Nikulin

License GPLv3+: GNU GPL version 3 or later https://gnu.org/licenses/gpl.html.

This is free software: you are free to change and redistribute it.

There is NO WARRANTY, to the extent permitted by law.

#!/usr/bin/env python3
# glx2dicom - Convert Sirona GALILEOS Viewer CBCT format to DICOM
#
# Copyright (C) 2022 Max Nikulin
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""glx2dicom - Convert Sirona GALAXIS (GALILEOS Viewer) CBCT format to DICOM
Usage::
python3 glx2dicom.py --tag PatientName='Surname^Name' \\
--tag PatientBirthDate='19700101' \\
[--tag TAG=VALUE]... SRC_DIR OUT_DIR
This script is not intended for usage in clinics
since nobody may assure correctness of conversion result.
Perhaps the proper way to get CT from a Sirona device in DICOM_ format
is to request to add an attachment with DICOM files
in addition to the proprietary format.
I would not be surprised if it is impossible without purchasing
an extra option for the device.
I got a CD with cone beam computed tomography (CBCT) results
and a proprietary Sirona GALILEOS Viewer (GALAXIS) application
for MS Windows. I have not managed to get it running under wine_,
so I decided to convert it to a more wide spread file format.
The result may be explored using e.g. SightViewer_ (or dicomXplorer_,
another application create with `SIGHT framework`_ developed by IRCAD)
or amide_. Some applications do not have 3D features such as
aeskulap_, but still able to display individual slices.
Requirements: python3_ with pydicom_, GDCM_, and lxml_ packages.
My guess concerning CT format is the following. It has an ID
that is used in file names.
``<ID>.gwg``
file likely contain metadata such as patient name
and birth date, but I have no idea which way it is encoded
or encrypted. The file header (signature) is ``GWG16``.
It does not look like (possibly compressed) ``QDataStream``.
I am unsure what serializers are available in .NET.
This script does not use the file, so it is necessary
to specify metadata using command line options.
``<ID>_vol_0``
directory contains data and should be used
as the argument of this script.
``<ID>_vol_0/<ID>_proj_0``
is an XML file with parameters that
might be used to create panoramic projection.
It seems the image is not saved and the vendor viewer
computes it in runtime. Unused by this script.
``<ID>_vol_0/<ID>_vol_0``
is a gzipped XML file describing
device settings and parameters of the saved files.
There is a nested element having entity-encoded XML
subtree as its content. This script takes some values,
however more parameters may be obtained such as Xray
tube current and spectrum peak.
``<ID>_vol_0/<ID>_vol_0_<NNN>``
files having different trailing number. This is CT slices written
as gzip compressed file. They are 2 bytes per pixel (int16 or uint16)
little endian raw greyscale images. Actually 12 bits are used
despite the ``<ID>_vol_0/<ID>_vol_0`` file declares 14 bits.
Instead of using this script you may uncompress these files
using e.g. ``zcat(1)`` and load them as raw images to e.g.
ImageJ_.
Some additions to the usual disclaimer:
- I never faced DICOM format before.
- This script is tested on a single CT,
unsure ether it was produced by Galileos or Orthophos
- Not all mandatory DICOM attributes are set.
- Images are compressed using RLE method. Unsure if it is possible
to easily use some lossless JPEG compression from, python however
the |gdcmconv|_ CLI tool can do it.
- I am completely unsure concerning transfer function (DICOM Look up Table),
so "Window" and "Rescale" attributes may be incorrect,
see |dicom ps3.3 C.11|_
- Likely it is possible to add the original files with parameters
as custom DICOMDIR entries. It will make conversion lossless
since the result may be updated later without access to source files.
It is not done however.
.. _aeskulap: https://sight.pages.ircad.fr/sight/
.. _amide: https://github.com/ferdymercury/amide
.. _DICOM: https://www.dicomstandard.org/current
.. |dicom ps3.3 C.11| replace:: DICOM PS3.3 C.11
 Look Up Tables and Presentation States
.. _`dicom ps3.3 C.11`: https://dicom.nema.org/medical/dicom/current/
output/chtml/part03/sect_C.11.html
.. _dicomXplorer: https://sight.pages.ircad.fr/sight-doc/
Introduction/src/applications.html#dicomxplorer
.. _GDCM: https://sourceforge.net/projects/gdcm/
.. |gdcmconv| replace:: ``gdcmconv(1)``
.. _gdcmconv: https://gdcm.sourceforge.net/html/gdcmconv.html
.. _ImageJ: https://github.com/imagej/ImageJ
.. _`lxml - XML and HTML with Python`:
.. _lxml: https://lxml.de/
.. _pydicom: https://pydicom.github.io/pydicom/
.. _python3: https://www.python.org/
.. _`SIGHT framework`: https://github.com/IRCAD/sight/
.. _SightViewer: https://www.ircad.fr/research/software/
.. _wine: https://www.winehq.org/
"""
__title__ = "glx2dicom"
__version__ = "0.1"
__author__ = "Max Nikulin"
__email__ = "[email protected]"
__description__ = \
"""Convert Sirona GALAXIS (GALILEOS Viewer) CBCT format to DICOM"""
__license__ = "GPLv3+"
__copyright__ = "Copyright (C) 2022 Max Nikulin"
import datetime
import gzip
from lxml import etree
# import numpy
from pathlib import Path
from pydicom import uid, datadict
from pydicom.dataset import FileDataset
from pydicom.dataset import FileMetaDataset
from pydicom.fileset import FileSet
import re
from typing import Any, Dict, List
usage = """%(prog)s --tag PatientName='Surname^Name' \\
--tag PatientBirthDate='19700101' \\
[--tag TAG=VALUE]... SRC_DIR OUT_DIR
or: %(prog)s [--help | --version]
"""
version = f"""%(prog)s {__version__}
{__copyright__}
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
"""
default_dicom_attrs = {
# Random UUID obtained using ``uid.generate_uid(None)``
'ImplementationClassUID': '2.25.88075347621178512596802224299896711910',
'SpecificCharacterSet': 'ISO_IR 100',
'TransferSyntaxUID': uid.ExplicitVRLittleEndian,
# PS3.3 C.7.1 patient
# No UID
# Likely encoded in ``.gwg` files
'PatientName': 'Test^Firstname',
'PatientID': '1234456789',
# TODO (0018, 5100) PatientPosition
# PS3.3 C.7.2 study
'StudyDescription': 'Dental CBCT Study',
# Physisian
# PS3.3 C.7.3 series
'Modality': 'CT',
'SeriesNumber': "1",
'SeriesDescription': "CT series",
# Unsure, perhaps "ORIGINAL" would be for images from which
# CT is reconstructed.
'ImageType': ['DERIVED', 'PRIMARY'],
'NumberOfFrames': 1,
# PS3.3 C.7.6 Image
# TODO 'PatientOrientation'
# PS3.3 C.7.6.2, 10.7
# Unsure, but do not see obviously wrong orientation.
'ImageOrientationPatient': [1.0, 0.0, 0.0, 0.0, 1.0, 0.0],
# PS3.3 C7.6.3, C7.6.3.3
'SamplesPerPixel': 1,
'PhotometricInterpretation': 'MONOCHROME2', # 2 bytes per pixel greyscale
# SightViewer-21.1.1 can not handle 0 (unsigned) 16 bit images
'PixelRepresentation': 1, # 2's component (unsigned)
'BitsAllocated': 16 # multiple of 8
}
uid_prefix = None
"""UID prefix used to generate e.g. image identifiers
``None``
UUID (default since I have not registered prefix),
``""``
prefix registered for ``pydicom``.
"""
def read_fbp_params(ct_dir: Path) -> Dict[str, Any]:
with gzip.open(ct_dir / ct_dir.name, "rb") as f:
tree = etree.fromstring(f.read())
# Other <LibParams> children:
# - TimeStamp: no seconds
# - SequenceID
# - CrossHairX, Y, Z.
assert tree.tag == "FBPParams"
params = tree.find("LibParams")
assert params is not None
assert params.find("ZoomFactor").text == "1"
# <DEVICEPLUGIN> (parsed from <ScanParamsXml> text)
# attributes that might be useful:
# - deviceid
# - dx89serialsno
# - p2kserialno
# - kv
# - ma
# - xraypulsewidth
scan = etree.fromstring(params.find("ScanParamsXml").text)
assert scan is not None
assert scan.tag == "DEVICEPLUGIN"
# print(etree.tostring(scan, pretty_print=True, encoding="unicode"))
image = scan.find('IMAGEDESCRIPTION/IMAGE')
assert image is not None
img_date = image.attrib['takedate']
img_time = image.attrib['taketime']
assert re.match("[0-9]{4}-[0-9]{2}-[0-9]{2}", img_date)
assert re.match("[0-9]{2}:[0-9]{2}:[0-9]{2}", img_time)
img_date = re.sub("-", "", img_date)
img_time = re.sub(":", "", img_time)
retval = {}
# PS3.3 C.7.6 Image
retval['ContentDate'] = img_date
retval['ContentTime'] = img_time
# PS3.3 C.7.6.3 Image Pixel Module
# PS3.3 C.7.6.3.3 Image Pixel Description Macro
rows = int(params.find('VolSizeX').text)
assert rows > 0
columns = int(params.find('VolSizeY').text)
assert columns > 0
retval['Rows'] = rows
retval['Columns'] = columns
# 14, but really 12
bits = int(params.find('ImageInBitDepth').text)
assert bits > 8 and bits <= 16
value_max = int(params.find('VoxelValueMax').text)
assert value_max < (2 << bits)
retval['BitsStored'] = bits
# PS3.3 C.7.6.3.3
# and PS3.5 8.1.1 Pixel Data Encoding of Related Data Elements
retval['HighBit'] = bits - 1
# PS3.3 C.7.6.2, 10.7
retval['PixelSpacing'] = [
float(params.find('VoxelSizeX').text),
float(params.find('VoxelSizeY').text)]
thickness = float(params.find('VoxelSizeZ').text)
assert thickness > 0.0
retval['SliceThickness'] = thickness
retval['SpacingBetweenSlices'] = thickness
z = float(params.find('CenterZ').text)
retval['ImagePositionPatient'] = [
float(params.find('CenterX').text)
- 0.5*(columns - 1)*retval['PixelSpacing'][0],
float(params.find('CenterY').text)
- 0.5*(rows - 1)*retval['PixelSpacing'][1],
z]
retval['SliceLocation'] = z # likely unused by applications
retval['SeriesDate'] = img_date
retval['SeriesTime'] = img_time
scan_id = params.find("ScanID").text
assert scan_id is not None
# PS3.3 C.7.2 study
# Unsure, perhaps a scan may contain more than single volume
retval['StudyID'] = scan_id
retval['StudyDate'] = img_date
retval['StudyTime'] = img_time
retval['FileSetID'] = scan_id
# PS3.3 C.11.2.1.2.1 Note 4
# Look Up Tables and Presentation States
# > VOI LUT Module
# > Window Center and Window Width
# > Default LINEAR Function
#
# Added with hope that sight viewer will use it to construct
# default transfer function instead of displaying white brick at startup
x1 = int(params.find('VoxelValueMin').text)
assert x1 >= 0
x2 = int(params.find('VoxelValueMax').text)
assert x2 > x1
retval['WindowCenter'] = (x1 + x2 + 1)//2
retval['WindowWidth'] = x2 - x1 + 1
# FIXME I am completely unsure in the following values,
# they are taken from a DICOM file sample,
# but they makes SightViewer behavior better.
# Otherwise it displays a white brick when a CT is loaded
# and it is necessary to adjust transfer function.
retval['RescaleIntercept'] = -1000
retval['RescaleType'] = 'HU'
retval['RescaleSlope'] = 1.0
return retval
def glx2dicom(srcdir: Path, dstdir: Path, dicom_attrs) -> None:
# It seems trailing "/" does not matter.
attrs = dicom_attrs.copy()
media_storage_sop_class_uid = attrs.pop('MediaStorageSOPClassUID', None)
sop_class_uid = attrs.pop('SOPClassUID', media_storage_sop_class_uid)
if not sop_class_uid:
sop_class_uid = uid.CTImageStorage
elif (
media_storage_sop_class_uid and
sop_class_uid != media_storage_sop_class_uid):
raise ValueError(
'SOPClassUID != MediaStorageSOPClassUID',
(sop_class_uid, media_storage_sop_class_uid))
file_meta = FileMetaDataset()
file_meta.MediaStorageSOPClassUID = sop_class_uid
ds = FileDataset(
None, {},
file_meta=file_meta,
is_implicit_VR=False,
is_little_endian=True)
# Not necessary in the case of
# ds.save_as(ds.filename, write_like_original=False)
# It seems ``FileSet.write()`` ensures proper format as well.
# ds.preamble = b'\0'*128
# PS3.3 C.12
# C.12.1.1.1.1 should be equal to File Meta Information header
ds.SOPClassUID = file_meta.MediaStorageSOPClassUID
# ``FileSet`` attribute, image values are generated using UID prefix
media_storage_sop_instace_uid = attrs.pop(
'MediaStorageSOPInstanceUID', None)
sop_instance_uid = attrs.pop(
'SOPInstanceUID', media_storage_sop_instace_uid)
if not sop_instance_uid:
sop_instance_uid = uid.generate_uid(uid_prefix)
elif (
media_storage_sop_instace_uid and
sop_instance_uid != media_storage_sop_instace_uid):
raise ValueError(
'SOPInstanceUID != MediaStorageSOPInstanceUID',
(sop_instance_uid, media_storage_sop_instace_uid))
# :dcm:`part03/sect_C.7.4.html` PS3.3 C.7.4 Frame of Reference Module
# Identifies frame of reference withing a series.
for tag in [
'StudyInstanceUID', 'SeriesInstanceUID', 'FrameOfReferenceUID']:
value = attrs.pop(tag, None) or uid.generate_uid(uid_prefix)
setattr(ds, tag, value)
fs = FileSet()
fs.MediaStorageSOPInstanceUID = sop_instance_uid
# PS3.3 C.12
instance_creation_date = attrs.pop('InstanceCreationDate', '')
instance_creation_time = attrs.pop('InstanceCreationTime', '')
if (not instance_creation_date) and (not instance_creation_date):
dt = datetime.datetime.now()
instance_creation_date = dt.strftime("%Y%m%d")
instance_creation_time = dt.strftime("%H%M%S.%f")
ds.InstanceCreationDate = instance_creation_date
ds.InstanceCreationTime = instance_creation_time
for k, v in attrs.items():
tag1 = (datadict.tag_for_keyword(k) & 0xffff0000) >> 16
obj = ds
if tag1 == 0x0002:
obj = file_meta
elif tag1 == 0x0004:
obj = fs
setattr(obj, k, v)
transfer_syntax_uid = attrs['TransferSyntaxUID']
photometric_interpretation = attrs['PhotometricInterpretation']
slice_spacing = attrs['SpacingBetweenSlices']
z_center = attrs['ImagePositionPatient'][2]
pixel_bytes, bits_allocated_remnant = divmod(attrs['BitsAllocated'], 8)
assert bits_allocated_remnant == 0
data_length = attrs['Rows']*attrs['Columns']*pixel_bytes
src = sorted(src_dir.glob(src_dir.name + '_[0-9]*[0-9]'))
if not len(src) > 0:
raise RuntimeError(f'No files found in {src_dir:r}', src_dir)
z_base = z_center - 0.5*slice_spacing*(len(src) + 1)
i = 1
for f in src:
print(f'{i:3d} {f}')
# It seems compressing image or adding dataset to fileset
# resets some attributes
ds.PhotometricInterpretation = photometric_interpretation
file_meta.TransferSyntaxUID = transfer_syntax_uid
file_meta.MediaStorageSOPInstanceUID = uid.generate_uid(uid_prefix)
# PS3.3 C.12
# C12.1.1.1.1 should be equal to File Meta Information header
ds.SOPInstanceUID = file_meta.MediaStorageSOPInstanceUID
# PS3.3 C.7.6 Image
ds.InstanceNumber = str(i) # Earlier "Image Number"
z = z_base + i*slice_spacing
ds.ImagePositionPatient[2] = z
ds.SliceLocation = z # likely unused by applications
with gzip.open(f, "rb") as img:
data = img.read()
assert len(data) == data_length
ds.PixelData = data
# Attempt to set ``SmallestImagePixelValue``
# and ``LargestImagePixelValue`` causes
# TypeError: object of type 'numpy.int16' has no len()
# on writing DICOM file. It does not matter if ``pixel_array``
# is obtained from ``ds`` or created explicitly.
# pixel_type = numpy.dtype(numpy.int16)
# pixel_type.newbyteorder('<')
# pixel_array = numpy.frombuffer(data, pixel_type)
# ds.SmallestImagePixelValue = pixel_array.min()
# ds.LargestImagePixelValue = pixel_array.max()
# ds.PixelData = pixel_array.tobytes()
# Docs mention 'pylibjpeg' plugin as well.
ds.compress(uid.RLELossless, encoding_plugin='gdcm')
# Adds to a temporary directory at first.
# It seems there is no option to control it.
fs.add(ds)
# TODO Should 'Rows' and 'Columns' attributes be added
# to each image directory record?
i += 1
fs.write(dstdir)
def create_argument_parser():
from argparse import ArgumentParser, RawDescriptionHelpFormatter
parser = ArgumentParser(
usage=usage,
description=__description__,
# Do not wrap --version
formatter_class=RawDescriptionHelpFormatter)
parser.add_argument("-V", "--version", action="version", version=version)
parser.add_argument(
"--tag", metavar="DICOM_TAG=VALUE",
default=[], action="append",
help="set or override DICOM tag")
parser.add_argument(
"src_dir", metavar="SRC_DIR",
help='Directory with source CT images'
', "ID_vol_0" adjucent to the "ID.gwg" file')
parser.add_argument(
"dst_dir", metavar="DST_DIR",
help='Directory to write the result DICOMDIR and images')
return parser
def cli_tags2dict(tags: List[str]):
retval = {}
errors = []
for kv in tags:
k, v = kv.split("=", 2)
try:
datadict.dictionary_VR(k)
# TODO convert integer values
retval[k] = v
except (KeyError, ValueError):
errors.append(f'Unknown DICOM tag: {kv!r}')
return retval, errors
def readme():
"""Print description intended for ``README.rst`` file.
- Merge the module docstring with disclaimer and license statement.
- Check metadata consistency.
Usage::
python3 -c 'from glx2dicom import readme; print(readme())'
"""
title, usage_par, usage_cmd, body = __doc__.split(sep='\n\n', maxsplit=3)
assert title == f'{__title__} - {__description__}'
use = re.sub(r'\n.*--version.*\n', '', usage)
assert re.sub(r'^ *', '', usage_cmd, flags=re.MULTILINE) == re.sub(
r'^ *', '',
use % dict(prog=f'python3 {__title__}.py'),
flags=re.MULTILINE)
title_bar = re.sub('.', '=', title)
print('\n'.join([
title_bar, title, title_bar, '', usage_par,
'', usage_cmd, '', body, '']),
end='')
print(
re.sub('<([^>]+)>', '\\1', version).strip()
.replace('\n', '\n\n') % dict(prog=__title__))
if __name__ == '__main__':
parser = create_argument_parser()
args = parser.parse_args()
src_dir = Path(args.src_dir)
# ``collections.ChainMap(dicom_attrs, default_dicom_attrs)``
# would not work because ``pop(key)`` removes element
# from first dict only and next time the ``key`` may be obtained
# from second dict.
dicom_attrs = default_dicom_attrs.copy()
glx_attrs = read_fbp_params(src_dir)
dicom_attrs.update(glx_attrs)
cli_attrs, errors = cli_tags2dict(args.tag)
if errors:
# "%(prog)s" is not supported here
parser.error("\n".join(errors))
dicom_attrs.update(cli_attrs)
# TODO --verbose option
# from pprint import pprint
# pprint(dicom_attrs)
glx2dicom(src_dir, Path(args.dst_dir), dicom_attrs)
@hengdm
Copy link

hengdm commented Mar 13, 2024

Thanks for the work. I got some files from my dentist which is a folder named "ID_vol_0" and a single file "ID.gwg", got error message "FileNotFoundError: [Errno 2] No such file or directory: 'src/src'". So I manually tar & gzip them, then I got following error:

Traceback (most recent call last):
  File "/Users/dm/Downloads/CT/glx2dicom.py", line 515, in <module>
    glx_attrs = read_fbp_params(src_dir)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/dm/Downloads/CT/glx2dicom.py", line 202, in read_fbp_params
    tree = etree.fromstring(f.read())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "src/lxml/etree.pyx", line 3264, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1989, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1876, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1164, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 633, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 743, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 672, in lxml.etree._raiseParseError
  File "<string>", line 
lxml.etree.XMLSyntaxError: Start tag expected, '<' not found, line 1, column 1

It's not the expected structure of the SRC_DIR? Any way to solve? Thanks in adv.

@maxnikulin
Copy link
Author

Thanks for the work. I got some files from my dentist which is a folder named "ID_vol_0" and a single file "ID.gwg", got error message "FileNotFoundError: [Errno 2] No such file or directory: 'src/src'". So I manually tar & gzip them

If you put ID_vol_0 (with some long ID) into the src folder then specify src/ID_vol_0 (not just src) as the SRC_DIR argument. Do not tar or gzip ID_vol_0 directory. Check that the ID_vol_0 file exists in that folder.

Ask your clinic, perhaps they can provide the same scan in DICOM format.

@sondreamer
Copy link

sondreamer commented Jun 1, 2024

Thanks for the work. I got some files from my dentist which is a folder named "ID_vol_0" and a single file "ID.gwg", got error message "FileNotFoundError: [Errno 2] No such file or directory: 'src/src'". So I manually tar & gzip them

If you put ID_vol_0 (with some long ID) into the src folder then specify src/ID_vol_0 (not just src) as the SRC_DIR argument. Do not tar or gzip ID_vol_0 directory. Check that the ID_vol_0 file exists in that folder.

Ask your clinic, perhaps they can provide the same scan in DICOM format.

i have '[Errno 2] No such file or directory' error when i enter src/id_vol_0 as SRC_DIR argument. problem is, it just appends last directories name to the argument ('no such directory src/id_vol_0/id_vol_0' when i specified just src/id_vol_0).
edit: ok i got it, it expects file of the same name in the directory, but still that file is present.

@maxnikulin
Copy link
Author

it expects file of the same name in the directory, but still that file is present.

Yes, it is layout of files that I have. So use the directory where you put your CT as SRC_DIR.

@sondreamer
Copy link

it expects file of the same name in the directory, but still that file is present.

Yes, it is layout of files that I have. So use the directory where you put your CT as SRC_DIR.

it appends the directories name eg 'CTdirectory' and says no such file or directory "CTdirectory/CTdirectory'.

@maxnikulin
Copy link
Author

it appends the directories name eg 'CTdirectory' and says no such file or directory "CTdirectory/CTdirectory'.

I have no idea what is the file layout in your case. I have a CD with a directory having a long ID and _vol_0 in its name. Path to this folder should be specified as the SRC_DIR argument. The directory contains a file having the same name is the directory and the script reads metadata from this file. Check that files and directories in your case are consistent to the structure described above.

@sondreamer
Copy link

it appends the directories name eg 'CTdirectory' and says no such file or directory "CTdirectory/CTdirectory'.

I have no idea what is the file layout in your case. I have a CD with a directory having a long ID and _vol_0 in its name. Path to this folder should be specified as the SRC_DIR argument. The directory contains a file having the same name is the directory and the script reads metadata from this file. Check that files and directories in your case are consistent to the structure described above.

mine's the same: CTdir/id_vol_0/id_vol_0. no matter if i specify CTdir or id_vol_0 as SRC_DIR it wont work

@maxnikulin
Copy link
Author

mine's the same: CTdir/id_vol_0/id_vol_0. no matter if i specify CTdir or id_vol_0 as SRC_DIR it wont work

If the *_vol_0 file exists in the directory with the same name then ask a friend or a colleague how to properly specify path to this directory. ls (or dir in the case of Windows) with the directory as its argument should not report an error.

@sondreamer
Copy link

mine's the same: CTdir/id_vol_0/id_vol_0. no matter if i specify CTdir or id_vol_0 as SRC_DIR it wont work

If the *_vol_0 file exists in the directory with the same name then ask a friend or a colleague how to properly specify path to this directory. ls (or dir in the case of Windows) with the directory as its argument should not report an error.

apologies, ran it again and it worked with CTdir/id_vol_0 as SRC_DIR. i applied forementioned "image date 'dirty hack' " because it didnt work before and apparently i didn't pay attention that i was entering wrong src_dir after.

@ragelalexandr
Copy link

ragelalexandr commented Nov 4, 2024

Hello! I have problems with code execution. Help me)

Traceback (most recent call last):
File "/home/rav/Development/mrt/glx2dicom.py", line 528, in
glx2dicom(src_dir, Path(args.dst_dir), dicom_attrs)
File "/home/rav/Development/mrt/glx2dicom.py", line 321, in glx2dicom
sop_class_uid = uid.CTImageStorage
File "/usr/lib/python3/dist-packages/pydicom/uid.py", line 44, in getattr
raise AttributeError(f"module {name} has no attribute {name}")
AttributeError: module pydicom.uid has no attribute CTImageStorage

@maxnikulin
Copy link
Author

sop_class_uid = uid.CTImageStorage
File "/usr/lib/python3/dist-packages/pydicom/uid.py", line 44, in getattr
raise AttributeError(f"module {name} has no attribute {name}")
AttributeError: module pydicom.uid has no attribute CTImageStorage

What version of pydicom do you use? It seems, CTImageStorage is still in its place:
https://github.com/pydicom/pydicom/blob/main/src/pydicom/uid.py#L680
What is the result of the following command in your case?

python3 -c "from pydicom import uid; print(uid.CTImageStorage)"

When inserting verbatim text, please, add triple backticks around to suppress treating some characters as markup, see https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/creating-and-highlighting-code-blocks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment