Last active
February 10, 2025 19:02
Detecting rotation and line spacing of image of page of text using Radon transform
"""
Automatically detect rotation and line spacing of an image of text using
Radon transform

If image is rotated by the inverse of the output, the lines will be
horizontal (though they may be upside-down depending on the original image)

It doesn't work with black borders
"""

from skimage.transform import radon
from PIL import Image
from numpy import asarray, mean, array, blackman
import numpy as np
from numpy.fft import rfft
import matplotlib.pyplot as plt

try:
    # More accurate peak finding from
    # https://gist.github.com/endolith/255291#file-parabolic-py
    from parabolic import parabolic

    def argmax(x):
        return parabolic(x, np.argmax(x))[0]
except ImportError:
    from numpy import argmax


def rms_flat(a):
    """
    Return the root mean square of all the elements of *a*, flattened out.
    """
    return np.sqrt(np.mean(np.abs(a) ** 2))


filename = 'skew-linedetection.png'

# Load file, converting to grayscale
I = asarray(Image.open(filename).convert('L'))
I = I - mean(I)  # Demean; make the brightness extend above and below zero
plt.subplot(2, 2, 1)
plt.imshow(I)

# Do the radon transform and display the result
sinogram = radon(I)

plt.subplot(2, 2, 2)
plt.imshow(sinogram.T, aspect='auto')
plt.gray()

# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = array([rms_flat(line) for line in sinogram.transpose()])
rotation = argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
plt.axhline(rotation, color='r')

# Plot the busy row
# (rotation is fractional when parabolic interpolation is available,
# so round it to the nearest column before indexing)
row = sinogram[:, int(round(rotation))]
N = len(row)
plt.subplot(2, 2, 3)
plt.plot(row)

# Take spectrum of busy row and find line spacing
window = blackman(N)
spectrum = rfft(row * window)
plt.plot(row * window)
frequency = argmax(abs(spectrum))
line_spacing = N / frequency  # pixels
print('Line spacing: {:.2f} pixels'.format(line_spacing))

plt.subplot(2, 2, 4)
plt.plot(abs(spectrum))
plt.axvline(frequency, color='r')
plt.yscale('log')
plt.show()

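
The line-spacing step can be tried on its own, without an image. The following is a minimal sketch, assuming a synthetic cosine stands in for the "busy" sinogram row. The helper names `parabolic_peak` and `estimate_period` are hypothetical, and `parabolic_peak` is a plain three-point parabola fit written for this illustration, not a copy of the `parabolic.py` gist referenced in the script.

```python
import numpy as np


def parabolic_peak(f, k):
    """Refine integer peak index k of f with a three-point parabola fit.

    Hypothetical helper for illustration only.
    """
    if k <= 0 or k >= len(f) - 1:
        return float(k)  # no neighbours on both sides, keep the integer bin
    a, b, c = f[k - 1], f[k], f[k + 1]
    denom = a - 2 * b + c
    if denom == 0:
        return float(k)  # flat top, nothing to refine
    return k + 0.5 * (a - c) / denom


def estimate_period(profile):
    """Estimate the dominant period (in samples) of a 1-D profile."""
    profile = np.asarray(profile, dtype=float)
    profile = profile - profile.mean()  # remove the DC offset
    n = len(profile)
    magnitude = np.abs(np.fft.rfft(profile * np.blackman(n)))
    k = int(np.argmax(magnitude[1:])) + 1  # ignore bin 0
    return n / parabolic_peak(magnitude, k)


# Synthetic profile: one "text line" every 16 pixels (assumed test signal)
n = 512
profile = np.cos(2 * np.pi * np.arange(n) / 16.0)
period = estimate_period(profile)
print('Estimated spacing: {:.2f} pixels'.format(period))
```

Without the parabola fit, the estimate can only take the values `N / k` for integer `k`, so the resolution gets coarser as the spacing grows relative to the row length.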
@zoldaten That's what I get for the example image, yes:
Rotation: 5.00 degrees
Line spacing: 13.63 pixels
@endolith
If you don't mind, I'll speed up your code a bit.
Right now it takes 0:00:01.228 (with skew-linedetection.png) on a Raspberry Pi.
The bigger the picture, the longer the inference time: with a 2 MiB picture it already takes 11 seconds.
I used cProfile and found that sinogram = radon(I) eats up all the time.
To speed it up, we need a smaller image.
So we need to replace:
I = asarray(Image.open(filename).convert('L'))
with this:
import sys
from PIL.Image import Resampling
I = Image.open(filename).convert('L')
I.thumbnail([sys.maxsize, 480], Resampling.LANCZOS)  # resize the image, keeping the aspect ratio; 480 is just an example and could probably be smaller
Now, on skew-linedetection.png, I get:
0:00:00.739
I haven't tested how the last part of the code works, since I only need the rotation in degrees, and it does return that result.
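
The downscaling idea above could be wrapped as a small loader. This is a sketch under stated assumptions, not tested against the gist: `load_downscaled_gray` and `max_height` are hypothetical names, a synthetic striped page replaces the real file, and `Image.LANCZOS` is used in place of `Resampling.LANCZOS` so that it also runs on older Pillow releases.

```python
import numpy as np
from PIL import Image


def load_downscaled_gray(image, max_height=480):
    """Return a demeaned grayscale array that is at most max_height tall.

    Hypothetical helper; max_height=480 follows the example value above.
    """
    gray = image.convert('L')
    # thumbnail() resizes in place, preserves the aspect ratio and
    # never enlarges the image
    gray.thumbnail((gray.width, max_height), Image.LANCZOS)
    arr = np.asarray(gray, dtype=float)
    return arr - arr.mean()  # demean, as the original script does


# Synthetic 800x1200 page with a dark row every 20 pixels (assumed input)
page = np.full((1200, 800), 255, dtype=np.uint8)
page[::20, :] = 0
small = load_downscaled_gray(Image.fromarray(page))
print(small.shape)
```

Scaling the image leaves the rotation angle unchanged, but the line spacing reported by the rest of the script would then be in pixels of the reduced image, so it would need to be multiplied by the original height divided by the reduced height to compare with the full-size result.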
@endolith thanks!
This is the output I see on skew-linedetection.png:
Is it correct?
And sometimes I get this:
I saw the remark
It doesn't work with black borders
What does it mean? Do you have an example image?