-
-
Save rfezzani/b4b8852c5a48a901c1e94e09feb34743 to your computer and use it in GitHub Desktop.
from tifffile import TiffFile | |
import numpy as np | |
def get_crop(page, i0, j0, h, w): | |
"""Extract a crop from a TIFF image file directory (IFD). | |
Only the tiles englobing the crop area are loaded and not the whole page. | |
This is usefull for large Whole slide images that can't fit int RAM. | |
Parameters | |
---------- | |
page : TiffPage | |
TIFF image file directory (IFD) from which the crop must be extracted. | |
i0, j0: int | |
Coordinates of the top left corner of the desired crop. | |
h: int | |
Desired crop height. | |
w: int | |
Desired crop width. | |
Returns | |
------- | |
out : ndarray of shape (imagedepth, h, w, sampleperpixel) | |
Extracted crop. | |
""" | |
if not page.is_tiled: | |
raise ValueError("Input page must be tiled.") | |
im_width = page.imagewidth | |
im_height = page.imagelength | |
if h < 1 or w < 1: | |
raise ValueError("h and w must be strictly positive.") | |
if i0 < 0 or j0 < 0 or i0 + h >= im_height or j0 + w >= im_width: | |
raise ValueError("Requested crop area is out of image bounds.") | |
tile_width, tile_height = page.tilewidth, page.tilelength | |
i1, j1 = i0 + h, j0 + w | |
tile_i0, tile_j0 = i0 // tile_height, j0 // tile_width | |
tile_i1, tile_j1 = np.ceil([i1 / tile_height, j1 / tile_width]).astype(int) | |
tile_per_line = int(np.ceil(im_width / tile_width)) | |
out = np.empty((page.imagedepth, | |
(tile_i1 - tile_i0) * tile_height, | |
(tile_j1 - tile_j0) * tile_width, | |
page.samplesperpixel), dtype=page.dtype) | |
fh = page.parent.filehandle | |
jpegtables = page.tags.get('JPEGTables', None) | |
if jpegtables is not None: | |
jpegtables = jpegtables.value | |
for i in range(tile_i0, tile_i1): | |
for j in range(tile_j0, tile_j1): | |
index = int(i * tile_per_line + j) | |
offset = page.dataoffsets[index] | |
bytecount = page.databytecounts[index] | |
fh.seek(offset) | |
data = fh.read(bytecount) | |
tile, indices, shape = page.decode(data, index, jpegtables, | |
jpegheader=page.jpegheader) | |
im_i = (i - tile_i0) * tile_height | |
im_j = (j - tile_j0) * tile_width | |
out[:, im_i: im_i + tile_height, im_j: im_j + tile_width, :] = tile | |
im_i0 = i0 - tile_i0 * tile_height | |
im_j0 = j0 - tile_j0 * tile_width | |
return out[:, im_i0: im_i0 + h, im_j0: im_j0 + w, :] |
why you use page.parent.pages[0] instead of simple page.pages[0]
👍 Good point @HowcanoeWang!
Do you know how can I resample this kind of image?
I have an .NDPI image whose image size is around 186496 × 36608. I want to downsample this image as in the range of ((186496, 36608), (93248, 18304), (46624, 9152), (23312, 4576), (11656, 2288), (5828, 1144), (2914, 572), (1457, 286), (728, 143)). Is it possible?
Do you know how can I resample this kind of image?
I have an .NDPI image whose image size is around 186496 × 36608. I want to downsample this image as in the range of ((186496, 36608), (93248, 18304), (46624, 9152), (23312, 4576), (11656, 2288), (5828, 1144), (2914, 572), (1457, 286), (728, 143)). Is it possible?
This is a task like clipping raster geotiff by polygon
, maybe other python library (install by conda is recommended) like GDAL, osgeo, fiona have a neat function to do this job, rather than using this basic package.
Hi,
FYI I tried using the method on a NDPI image, and it almost worked out of the box but I got an error on the page.decode
in line 69 (Jpeg8Error), because the data didn't contain the JPEG headers.
Changing it to:
tile, indices, shape = page.decode(page.jpegheader+data, index, jpegtables)
and it works well.
In case anyone gets the same problem !
Excellent! Thank you @adfoucart 🙏. Another option is to use the jpegheader
argument:
tile, indices, shape = page.decode(data, index, jpegtables, jpegheader=page.jpegheader)
I will add this modification to the gist 😉
why you use
page.parent.pages[0]
instead of simplepage.pages[0]