from skimage.draw import polygon
import numpy as np
import plistlib


def load_inbreast_mask(mask_path, imshape=(4084, 3328)):
    """
    This function loads an OsiriX xml region as a binary numpy array for INBREAST
    dataset

    @mask_path : Path to the xml file
    @imshape : The shape of the image as an array e.g. [4084, 3328]

    return: numpy array where positions in the roi are assigned a value of 1.
    """

    def load_point(point_string):
        # Parse "(x, y)" and return it in (row, col) order.
        x, y = tuple([float(num) for num in point_string.strip('()').split(',')])
        return y, x

    mask = np.zeros(imshape)
    with open(mask_path, 'rb') as mask_file:
        plist_dict = plistlib.load(mask_file, fmt=plistlib.FMT_XML)['Images'][0]
        numRois = plist_dict['NumberOfROIs']
        rois = plist_dict['ROIs']
        assert len(rois) == numRois
        for roi in rois:
            numPoints = roi['NumberOfPoints']
            points = roi['Point_px']
            assert numPoints == len(points)
            points = [load_point(point) for point in points]
            if len(points) <= 2:
                # Too few points for a polygon: mark the individual pixels.
                for point in points:
                    mask[int(point[0]), int(point[1])] = 1
            else:
                # Here x holds row indices and y holds column indices.
                x, y = zip(*points)
                x, y = np.array(x), np.array(y)
                poly_x, poly_y = polygon(x, y, shape=imshape)
                mask[poly_x, poly_y] = 1
    return mask
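For anyone who wants to test the parsing without the dataset at hand, here is a minimal sketch of the plist layout the code above reads (`Images[0]`, then `ROIs`, then `Point_px`), built in memory with `plistlib`. The ROI name and coordinates are made-up values for illustration, and `parse_point` is a hypothetical stand-in for the nested `load_point` helper.

```python
import plistlib


def parse_point(point_string):
    """Parse an OsiriX point string such as '(12.5, 40.0)' into (row, col)."""
    x, y = (float(num) for num in point_string.strip('()').split(','))
    return y, x  # swap to (row, col), as load_point does


# Synthetic annotation with one triangular ROI (made-up values).
annotation = {
    'Images': [{
        'NumberOfROIs': 1,
        'ROIs': [{
            'Name': 'Mass',
            'NumberOfPoints': 3,
            'Point_px': ['(10.0, 20.0)', '(30.0, 20.0)', '(20.0, 40.0)'],
        }],
    }],
}

# Round-trip through the XML plist format to mimic reading a file.
xml_bytes = plistlib.dumps(annotation, fmt=plistlib.FMT_XML)
image = plistlib.loads(xml_bytes, fmt=plistlib.FMT_XML)['Images'][0]

for roi in image['ROIs']:
    points = [parse_point(p) for p in roi['Point_px']]
    print(roi['Name'], points)
```

Writing `xml_bytes` to a file gives a small input that can be passed to `load_inbreast_mask` directly.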
Hi, thank you for your great work.
I have recently started using the INbreast dataset too.
Just a simple idea: I think this code could be extended to select specific ROI categories (e.g. Mass/Calcification/Cluster).
When I used your code as-is, all ROIs were loaded into a single label array.
So I suggest the following changes.
@roi_class_name is the ROI attribute name: None = all ROIs, "Calcification" (which also includes "Cluster") or "Mass".
'''
def load_inbreast_roimask(roi_class_name=None, mask_path='', imshape=(4084, 3328)):

    def load_point(point_string):
        x, y = tuple([float(num) for num in point_string.strip('()').split(',')])
        return y, x

    mask = np.zeros(imshape)
    with open(mask_path, 'rb') as mask_file:
        plist_dict = plistlib.load(mask_file, fmt=plistlib.FMT_XML)['Images'][0]
        numRois = plist_dict['NumberOfROIs']
        rois = plist_dict['ROIs']
        assert len(rois) == numRois
        for roi in rois:
            # to check dict in roi:
            # for k, v in roi.items():
            #     print(k, v)

            # here -start-
            if roi_class_name is not None:
                if roi_class_name == "Calcification":
                    if roi_class_name != roi['Name'] and roi["Name"] != "Cluster":
                        continue
                else:
                    if roi_class_name != roi['Name']:
                        continue
            # -end-
            numPoints = roi['NumberOfPoints']
            points = roi['Point_px']
            assert numPoints == len(points)
            points = [load_point(point) for point in points]
            if len(points) <= 2:
                for point in points:
                    mask[int(point[0]), int(point[1])] = 1
            else:
                x, y = zip(*points)
                x, y = np.array(x), np.array(y)
                poly_x, poly_y = polygon(x, y, shape=imshape)
                mask[poly_x, poly_y] = 1
    return mask.astype(np.uint8)
'''
Do you think these changes look good? (I'm not very confident about them...)
I look forward to your reply.
Hi @tatsunidas, what's the point of checking roi_class_name == "Calcification" inside the if roi_class_name is not None block? If you only want to skip the categories that differ from the one specified in the roi_class_name parameter, you can simply use the condition if roi_class_name != roi['Name'].
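One way to reconcile the two versions might be to map each requested class to the set of ROI names it should accept, so that "Calcification" still covers "Cluster" without nested `if` statements. A minimal sketch: `ROI_NAME_GROUPS` and `roi_matches` are hypothetical names, and the only ROI names assumed are the ones mentioned in this thread.

```python
# Requested class -> ROI names that count as that class.
# Only "Calcification" needs widening; other classes match by exact name.
ROI_NAME_GROUPS = {
    "Calcification": {"Calcification", "Cluster"},
}


def roi_matches(roi_name, roi_class_name=None):
    """Return True if an ROI with this name should be drawn into the mask."""
    if roi_class_name is None:
        return True  # no filter requested: keep every ROI
    accepted = ROI_NAME_GROUPS.get(roi_class_name, {roi_class_name})
    return roi_name in accepted
```

Inside the loop this would replace the marked block with `if not roi_matches(roi['Name'], roi_class_name): continue`.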
Hello @jasminjahanpuspo @Feyn-Man @jendelel, have you worked out how to convert the .xml files into PNG format? The code shown here does not work for all the files and produces an error.
Please reply and share your insights ASAP.
Thanks & Regards,
Satwik Sunnam
Hello @satwiksunnam19, try this code; I hope you find it useful: https://github.com/wentaozhu/inbreast
Hey @jasminjahanpuspo, I have no experience with Matlab. It would be great if you could give me step-by-step instructions for running the code.
Thanks & Regards,
Satwik Sunnam.
Hello @jasminjahanpuspo, I'm trying to replicate the work of this project: https://github.com/Holliemin9090/Mammographic-mass-CAD-via-pseudo-color-mammogram-and-Mask-R-CNN
I'm using the INbreast dataset.
For dataset preparation, I have converted both the images and the ROIs:
- Converted the DICOM images to PNG.
- Converted the XML files to PNG/JPG images.
My question is:
- I have more DICOM images than ROI images. How should I handle this mismatch?
Do you have any suggestions on this, and on implementing the GitHub project mentioned above?
Thanks & Regards,
S S
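One likely explanation for the mismatch is that images without annotated findings have no XML file at all. A possible way to handle it is to pair each DICOM with its XML by file ID and treat unpaired images as having an empty mask. A rough sketch, assuming the DICOM filenames start with the file ID followed by an underscore and the XML files are named `<file ID>.xml`; please check this against your copy of the dataset. `pair_images_with_rois` is a hypothetical helper.

```python
from pathlib import Path


def pair_images_with_rois(dicom_dir, xml_dir):
    """Map each DICOM path to its XML path, or to None if no annotation exists."""
    xml_by_id = {p.stem: p for p in Path(xml_dir).glob('*.xml')}
    pairs = {}
    for dcm in sorted(Path(dicom_dir).glob('*.dcm')):
        # Assumption: the file ID is the first '_'-separated token of the name.
        file_id = dcm.name.split('_')[0]
        pairs[dcm] = xml_by_id.get(file_id)
    return pairs
```

Images mapped to `None` can then either get an all-zero mask of the image's shape or be left out of training, depending on whether negative examples are wanted.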
Hello, I am using this code. It's pretty easy to understand, but somehow I get all zeros when I run it on Colab. Here is a screenshot of my code. Any kind of help will be appreciated.
Hi @jasminjahanpuspo, were you able to resolve this? I also tried the code and ran into the same issue.
@shubham-pipada Thank you for asking! Unfortunately, I haven't been able to solve this issue yet.
Hi, you can find a simplified version of this code (without the load_point function) on my fork 😄
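Without the screenshot it is hard to say what goes wrong, but two things may be worth checking. First, the mask only contains 0 and 1, so it looks completely black if it is saved or displayed as an 8-bit image without scaling (multiplying by 255 helps). Second, the file may contain no ROIs, or points that fall outside `imshape`. Below is a small diagnostic sketch for the second case, using only the plist keys the gist already reads; `summarize_rois` is a hypothetical helper.

```python
import plistlib


def summarize_rois(mask_path, imshape=(4084, 3328)):
    """List (name, number of points, all points inside imshape) for each ROI."""
    with open(mask_path, 'rb') as mask_file:
        image = plistlib.load(mask_file, fmt=plistlib.FMT_XML)['Images'][0]
    summary = []
    for roi in image['ROIs']:
        in_bounds = True
        for point_string in roi['Point_px']:
            x, y = (float(num) for num in point_string.strip('()').split(','))
            # x indexes columns and y indexes rows of the mask.
            if not (0 <= y < imshape[0] and 0 <= x < imshape[1]):
                in_bounds = False
        summary.append((roi.get('Name'), len(roi['Point_px']), in_bounds))
    return summary
```

An empty list means the file has no ROIs, and a `False` flag suggests that `imshape` does not match the image the annotation belongs to.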