import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def elastic_transform(image, alpha, sigma, random_state=None):
    """Elastic deformation of images as described in [Simard2003]_.

    .. [Simard2003] Simard, Steinkraus and Platt, "Best Practices for
       Convolutional Neural Networks applied to Visual Document Analysis", in
       Proc. of the International Conference on Document Analysis and
       Recognition, 2003.
    """
    if random_state is None:
        random_state = np.random.RandomState(None)

    shape = image.shape
    # Random displacement fields in [-1, 1], smoothed by a Gaussian and scaled by alpha.
    dx = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha
    dy = gaussian_filter((random_state.rand(*shape) * 2 - 1), sigma, mode="constant", cval=0) * alpha
    dz = np.zeros_like(dx)  # no displacement along the channel axis

    # Note: with the default indexing='xy', the grids have shape
    # (shape[1], shape[0], shape[2]), so this line only broadcasts for square images.
    x, y, z = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]))
    print(x.shape)
    indices = np.reshape(y+dy, (-1, 1)), np.reshape(x+dx, (-1, 1)), np.reshape(z, (-1, 1))

    distorted_image = map_coordinates(image, indices, order=1, mode='reflect')
    return distorted_image.reshape(image.shape)
How can I save this distorted image?
I tried with PIL and scipy, but the output is entirely black, effectively empty.
I can't even show the image with matplotlib (TypeError: Invalid dimensions for image data).
The error is clear, but how can I create the appropriate shape for images?
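One possible cause, offered as a guess and not a diagnosis: the array may be float data in [0, 1] (which looks black once cast to 8-bit) or may carry a trailing singleton channel such as (H, W, 1), which some matplotlib versions reject. The sketch below converts an array into a uint8 image with a shape Pillow accepts. The helper names `to_uint8_image` and `save_image` are hypothetical, and the rule "max <= 1 means the data is in [0, 1]" is an assumption.

```python
import numpy as np


def to_uint8_image(arr):
    """Return a uint8 copy of arr with shape (H, W) or (H, W, C)."""
    arr = np.asarray(arr)
    # Drop a trailing singleton channel: (H, W, 1) -> (H, W).
    if arr.ndim == 3 and arr.shape[2] == 1:
        arr = arr[:, :, 0]
    if arr.dtype != np.uint8:
        arr = arr.astype(np.float64)
        # Assumption: data whose maximum is <= 1 lives in [0, 1] and needs scaling.
        if arr.size and arr.max() <= 1.0:
            arr = arr * 255.0
        arr = np.clip(np.rint(arr), 0, 255).astype(np.uint8)
    return arr


def save_image(arr, path):
    """Save an array as an image file using Pillow."""
    # Imported here so that to_uint8_image can be used without Pillow installed.
    from PIL import Image
    Image.fromarray(to_uint8_image(arr)).save(path)
```

For example, `save_image(elastic_transform(image, alpha, sigma), "distorted.png")` would write the result to disk, assuming `image` holds ordinary pixel data.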
Hi, I loaded an RGB image whose shape is (400, 248, 3), but I got an error:
ValueError: operands could not be broadcast together with shapes (248,400,3) (400,248,3)
in this line:
indices = np.reshape(y + dy, (-1, 1)), np.reshape(x + dx, (-1, 1)), np.reshape(z, (-1, 1))
Can anyone help? Thanks!
You need to swap the first two shape arguments in the meshgrid call for x, y, z:
x, y, z = np.meshgrid(np.arange(shape[1]), np.arange(shape[0]), np.arange(shape[2]))
instead of
x, y, z = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]))
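To see where the broadcast error comes from, here is a small sketch that only inspects the grid shapes for the (400, 248, 3) image from the question. The variable names are illustrative.

```python
import numpy as np

shape = (400, 248, 3)  # non-square RGB image, as in the question above
axes = [np.arange(n) for n in shape]

# Default indexing='xy' swaps the first two output axes.
x_xy, _, _ = np.meshgrid(axes[0], axes[1], axes[2])
# Swapping the first two arguments gives grids with the image's own shape.
x_sw, _, _ = np.meshgrid(axes[1], axes[0], axes[2])

print(x_xy.shape)  # (248, 400, 3)
print(x_sw.shape)  # (400, 248, 3)
```

The first shape cannot be added to dx and dy, which have the image shape (400, 248, 3); the second can.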
With the correction I gave in the previous post, the algorithm works perfectly! Thanks.
Results on the OCR digit recognition dataset I'm currently building:
As for the sigma and alpha values, I built the dataset with 3 pairs of values as follows:
ELASTIC_ALPHA_SIGMA = ((1201, 10), (1501, 12), (991, 8))
Hello, I get an error: tuple index out of range
on this line: x, y, z = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]))
Does anyone have any advice?
Try printing the shape; you will find that it is something like [x, y], not [x, y, z].
This can happen because you may be using a grayscale image. Try reshaping the image to [x, y, 1].
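A minimal sketch of that reshape, using a hypothetical 28x28 grayscale array:

```python
import numpy as np

gray = np.zeros((28, 28))        # hypothetical grayscale image, shape (H, W)
gray3d = gray[:, :, np.newaxis]  # shape (H, W, 1), so shape[2] exists
restored = gray3d[:, :, 0]       # back to (H, W) once the transform is done

print(gray3d.shape, restored.shape)
```

The added axis has length 1, so it only changes the shape and not the pixel values.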
Hi!
How do you apply the same transformation to the mask? I have a set of satellite images, each with its corresponding road segmentation (black and white, (400, 400)).
How can I find the corresponding mathematical formula?
Same question here about applying the same transformation to the mask: the implementation is randomized on each run, making it impossible to apply exactly the same transform to both the original image and the mask image.
To apply the same transformation to the image and the mask, just fix the random_state for both calls. :)
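Fixing the seed draws the same random numbers only when both arrays have the same shape, because the gist draws one random value per array element. An alternative that sidesteps this, sketched here as a variant and not as the gist's own method, is to draw one 2-D displacement field and apply it to every channel of both the image and the mask. The function names `displacement_field` and `warp`, the array sizes, and the alpha and sigma values are all illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def displacement_field(shape2d, alpha, sigma, random_state):
    """Draw one smoothed random displacement (row offsets, column offsets)."""
    d_rows = gaussian_filter(random_state.rand(*shape2d) * 2 - 1, sigma,
                             mode="constant", cval=0) * alpha
    d_cols = gaussian_filter(random_state.rand(*shape2d) * 2 - 1, sigma,
                             mode="constant", cval=0) * alpha
    return d_rows, d_cols


def warp(array, d_rows, d_cols, order=1):
    """Apply the same 2-D displacement to an (H, W) or (H, W, C) array."""
    rows, cols = np.meshgrid(np.arange(d_rows.shape[0]),
                             np.arange(d_rows.shape[1]), indexing="ij")
    coords = np.array([rows + d_rows, cols + d_cols])
    if array.ndim == 2:
        return map_coordinates(array, coords, order=order, mode="reflect")
    # Warp each channel separately so values never mix across channels.
    channels = [map_coordinates(array[..., c], coords, order=order, mode="reflect")
                for c in range(array.shape[2])]
    return np.stack(channels, axis=-1)


rng = np.random.RandomState(0)
image = rng.rand(64, 48, 3)                       # stand-in RGB image
mask = (rng.rand(64, 48) > 0.5).astype(np.uint8)  # stand-in binary mask

d_rows, d_cols = displacement_field(mask.shape, alpha=34, sigma=4, random_state=rng)
warped_image = warp(image, d_rows, d_cols, order=1)
warped_mask = warp(mask, d_rows, d_cols, order=0)  # nearest neighbour keeps labels discrete
```

Using order=0 for the mask avoids interpolated in-between values, which should also keep integer class labels intact for multi-class masks.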
Can I apply it to a multi-class dataset for a segmentation task?
Can you elaborate a bit more, @jepperaskdk? I would love to make use of this, but I'm unsure what you mean by stacking the two images. Do you mean pulling in the original from one folder (X_img) and the corresponding mask from another (Y-Img), then separating them after processing? That is, literally running them all through and then separating the output images manually or with a script?
On second thought, I'm not sure if it works.
Inverting the shapes flipped the image for me. Setting the indexing of the meshgrid to 'ij' instead fixed this issue:
x, y, z = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]), indexing='ij')
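One way to check whether a given coordinate setup preserves orientation is to run it with zero displacement, where the output should equal the input. The sketch below does this for two variants on a small non-square test array. Variant B assumes the coordinates are passed to map_coordinates in axis order (rows, columns, channels), which the comment above does not state.

```python
import numpy as np
from scipy.ndimage import map_coordinates

image = np.arange(4 * 6 * 3, dtype=float).reshape(4, 6, 3)  # small non-square test image
shape = image.shape

# Variant A: default 'xy' indexing with the first two shape arguments swapped,
# coordinates passed as (y, x, z) like in the gist.
x, y, z = np.meshgrid(np.arange(shape[1]), np.arange(shape[0]), np.arange(shape[2]))
out_a = map_coordinates(image, (y.ravel(), x.ravel(), z.ravel()),
                        order=1, mode="reflect").reshape(shape)

# Variant B: 'ij' indexing, coordinates passed in axis order (assumption, see above).
r, c, ch = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]),
                       indexing="ij")
out_b = map_coordinates(image, (r.ravel(), c.ravel(), ch.ravel()),
                        order=1, mode="reflect").reshape(shape)

print(np.allclose(out_a, image), np.allclose(out_b, image))
```

If a variant returns a flipped or transposed array here, the same will happen once the random displacement is added.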
Since the interpolation is also done over the channel dimension, I got interpolation artifacts.
So I decided to do the interpolation channel-wise.
https://gist.github.com/mvoelk/0880f5de7c101c093165e1e46ce3f6e5
Which augmentation did you apply to the dataset, @bigfred76?
Thank you. But should the input image be square?