@schluppeck
Last active August 18, 2020 20:39
extract image features from pre-processed text images

Outline

Use morphological operations to turn a pixel-based image of text into geometric information (x, y locations, sizes, etc.) so that texts can be compared directly.

Pre-requisites

I used R (in RStudio) with some CRAN and Bioconductor packages: imager and tidyverse from CRAN, EBImage from Bioconductor. See the code for details.

Steps

  • load image
  • make sure it's grayscale
  • threshold
  • dilate / erode ("closing")
  • label (bwlabel)
  • compute different features
  • plot
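
The closing step is what merges individual letter strokes into word-sized blobs. As a minimal sketch of the idea, the base-R toy below runs a dilation followed by an erosion on a tiny binary matrix. It uses a hand-rolled 3x3 square element rather than EBImage's `closing()` with a disc brush, and the helper names (`morph`, `dilate_sq`, `erode_sq`, `closing_sq`) are made up for illustration.

```r
# Toy binary morphology with a (2r+1) x (2r+1) square structuring element.
# Hypothetical helpers for illustration only; the real script uses EBImage.
morph <- function(m, r, combine) {
  nr <- nrow(m)
  nc <- ncol(m)
  # zero-pad so every pixel has a full neighbourhood
  padded <- matrix(0, nr + 2 * r, nc + 2 * r)
  padded[r + seq_len(nr), r + seq_len(nc)] <- m
  out <- NULL
  # combine all shifted copies of the image, one per element offset
  for (dy in 0:(2 * r)) {
    for (dx in 0:(2 * r)) {
      win <- padded[dy + seq_len(nr), dx + seq_len(nc), drop = FALSE]
      out <- if (is.null(out)) win else combine(out, win)
    }
  }
  out
}
dilate_sq  <- function(m, r = 1) morph(m, r, pmax)  # max over neighbourhood
erode_sq   <- function(m, r = 1) morph(m, r, pmin)  # min over neighbourhood
closing_sq <- function(m, r = 1) erode_sq(dilate_sq(m, r), r)

# two 3x2 "strokes" separated by a 2-pixel gap
img <- matrix(0, nrow = 7, ncol = 11)
img[3:5, 3:4] <- 1
img[3:5, 7:8] <- 1

closed <- closing_sq(img, r = 1)
print(closed)
```

In this toy the gap between the two strokes is narrower than the element, so closing fills it and the two strokes become one blob, while the outer extent of the shape stays the same.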

Starting out with this:

[image: raw texts image]

we get this:

[image: blobbed texts image]

and then this:

[image: raw texts image]

# label and position information from text images
#
# ds 2020-08-18
library(imager)
# BiocManager::install("EBImage")
library(EBImage)
library(tidyverse)
# load image --------------------------------------------------------------
fname <- '1.png'
textImage <- EBImage::readImage(fname)
# and make sure it's grayscale
textImage <- channel(textImage, "gray")
# display
display(textImage)
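# adaptive threshold: compare each pixel with its local background
# (window set by w and h) plus an offset
textImage <- thresh(textImage, w = 10, h = 10, offset = 0.05)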
display(textImage)
# close gaps / turn text into blobs ----------------------------------------
# closing = dilation followed by erosion, here with a 7-pixel disc brush
textImage_blobbed <- closing(textImage, makeBrush(7, shape = 'disc'))
# give each connected blob its own integer label
textImage_labelled <- bwlabel(textImage_blobbed)
display(textImage_blobbed)
display(textImage_labelled)
# compute shape features + convert to a tibble (one row per blob)
fts <- computeFeatures.shape(textImage_labelled) %>%
  as_tibble()
# some useful metrics
glimpse(fts)
# moment features: m.cx / m.cy are the centroid of each labelled blob;
# m.majoraxis, m.eccentricity and m.theta come from an elliptical fit
# (see ?computeFeatures.moment for the exact definitions)
ftm <- computeFeatures.moment(textImage_labelled) %>%
  as_tibble()
glimpse(ftm)
# now plot moment data on image -------------------------------------------
d <- dim(textImage)  # d[1] = width (x), d[2] = height (y)
ftm %>%
  ggplot(aes(x = m.cx,
             y = m.cy,
             angle = m.theta,
             radius = m.majoraxis)) +
  # geom_spoke() draws from the centroid in one direction only, so this is
  # a rough visual guide to axis and angle, not an exact overlay
  geom_spoke() +
  geom_point(size = 2, color = "red", alpha = 0.5) +
  # reverse y so the origin is top-left, as in the image
  scale_y_reverse(limits = c(d[2], 0)) +
  scale_x_continuous(limits = c(0, d[1])) +
  labs(x = "pixels in x", y = "pixels in y", title = "centroids and major axis / theta") +
  theme_minimal()
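
The spoke in the plot above starts at the centroid and extends in one direction only. One possible alternative, sketched here under assumptions, is to compute both endpoints of the major axis so the line is centred on the centroid. The helper `axis_segments()` and the `toy` data are hypothetical; the sketch assumes `m.theta` is in radians and `m.majoraxis` is the full axis length, which should be checked against `?computeFeatures.moment`.

```r
# Hypothetical helper: endpoints of each blob's major axis, centred on the
# centroid. Assumes theta is in radians and majoraxis is the full axis length.
axis_segments <- function(cx, cy, theta, majoraxis) {
  half <- majoraxis / 2
  data.frame(
    x    = cx - half * cos(theta),
    y    = cy - half * sin(theta),
    xend = cx + half * cos(theta),
    yend = cy + half * sin(theta)
  )
}

# made-up moment values standing in for the real `ftm` tibble
toy <- data.frame(m.cx = c(10, 30), m.cy = c(20, 40),
                  m.theta = c(0, pi / 2), m.majoraxis = c(8, 6))
segs <- axis_segments(toy$m.cx, toy$m.cy, toy$m.theta, toy$m.majoraxis)
print(segs)

# optional: build the plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(segs) +
    ggplot2::geom_segment(ggplot2::aes(x = x, y = y, xend = xend, yend = yend)) +
    ggplot2::geom_point(data = toy, ggplot2::aes(x = m.cx, y = m.cy),
                        colour = "red") +
    ggplot2::scale_y_reverse()  # origin at top-left, as in the image
}
```

With the real data the call would be `axis_segments(ftm$m.cx, ftm$m.cy, ftm$m.theta, ftm$m.majoraxis)`. Whether the drawn angle matches the orientation of the text depends on the angle convention EBImage uses, so it is worth comparing the plot against `display(textImage_blobbed)` before relying on it.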