@afeld
Last active October 11, 2015 15:08
Jux Video Dimensions
# inspired by http://wiki.stdout.org/rcookbook/Graphs/Scatterplots%20(ggplot2)/#handling-overplotting
# to run:
# $ R < video_dimensions.R --no-save
# install.packages("ggplot2")
dat <- read.csv("/home/jux/videos.csv")
library(ggplot2)
# Make each dot partially transparent, with 1/40 opacity
# For heavy overplotting, try using smaller values
ggplot(dat, aes(x=width, y=height)) +
  geom_point(shape=19,    # Use solid circles
             alpha=1/40) +
  ggtitle("Dimensions of Videos on Jux")  # title is not an aesthetic, so set it with ggtitle()
# get a sample of the videos (capped at the row count so sample() cannot fail on small files)
samp <- dat[sample(nrow(dat), size=min(10000, nrow(dat))), ]
# Jitter the points
ggplot(samp, aes(x=width, y=height)) +
  geom_point(shape=1,     # Use hollow circles
             position=position_jitter(width=20, height=10),
             aes(color=service)) +
  ggtitle("Dimensions of Videos on Jux by Service")  # title is not an aesthetic, so set it with ggtitle()
# sort dimensions by count of occurrence
dimCount <- sort(table(paste(dat$width, dat$height, sep="x")), decreasing=TRUE)
# limit to top 10
dimCount <- head(dimCount, n=10)
barplot(dimCount, cex.names=0.6, main="Most Common Video Dimensions on Jux", xlab="Dimensions", ylab="Occurrences")
require 'benchmark'
require 'csv'

CSV.open('/home/jux/videos.csv', 'wb') do |csv|
  csv << %w(id width height service)
  Mongoid.unit_of_work(disable: :current) do
    puts Benchmark.measure {
      Video.all.only(:property_set).each do |vid|
        ps = vid.property_set
        dims = ps.try :dimensions
        width = dims.try :width
        height = dims.try :height
        # skip videos without both dimensions
        csv << [vid.id, width, height, ps.service] if width && height
      end
    }
  end
end
id,width,height,service
255835,459,344,youtube
345278,480,270,youtube
299833,480,270,youtube
257513,1280,720,vimeo
259259,640,480,vimeo
...
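
For a quick sanity check of the export without R, the tally that the R script builds with `table()` can be approximated in Ruby. This is only a minimal sketch: `dimension_counts` and `SAMPLE` are hypothetical names that do not appear in the original scripts, and the rows are just the five shown in the excerpt above, not the full file.

```ruby
require 'csv'

# Sample rows copied from the excerpt above; a real run would read the full export.
SAMPLE = <<~ROWS
  id,width,height,service
  255835,459,344,youtube
  345278,480,270,youtube
  299833,480,270,youtube
  257513,1280,720,vimeo
  259259,640,480,vimeo
ROWS

# Hypothetical helper: count rows per "WIDTHxHEIGHT" label, most common first.
def dimension_counts(csv_text, limit = 10)
  counts = Hash.new(0)
  CSV.parse(csv_text, headers: true).each do |row|
    counts["#{row['width']}x#{row['height']}"] += 1
  end
  # Negate the count so the largest tallies sort to the front
  counts.sort_by { |_, n| -n }.first(limit)
end

dimension_counts(SAMPLE).each { |dims, n| puts "#{dims}: #{n}" }
```

To run it against the real export, the sample text could be swapped for `File.read` on the CSV path, assuming the file has the same four-column header.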