Last active
October 11, 2015 15:08
Jux Video Dimensions
# inspired by http://wiki.stdout.org/rcookbook/Graphs/Scatterplots%20(ggplot2)/#handling-overplotting
# to run:
# $ R < video_dimensions.R --no-save
# install.packages("ggplot2")
library(ggplot2)
dat <- read.csv("/home/jux/videos.csv")
# Make each dot nearly transparent, with 1/40 opacity
# For heavier overplotting, try using smaller values
ggplot(dat, aes(x=width, y=height)) +
  geom_point(shape=19,      # Use solid circles
             alpha=1/40) +
  ggtitle("Dimensions of Videos on Jux")
# get a random sample of up to 10,000 videos
samp <- dat[sample(nrow(dat), size=min(10000, nrow(dat))), ]
# Jitter the points and color them by service
ggplot(samp, aes(x=width, y=height)) +
  geom_point(shape=1,      # Use hollow circles
             position=position_jitter(width=20, height=10),
             aes(color=service)) +
  ggtitle("Dimensions of Videos on Jux by Service")
# count each WIDTHxHEIGHT pair and sort by number of occurrences, most common first
dimCount <- sort(table(paste(dat$width, dat$height, sep="x")), decreasing=TRUE)
# limit to top 10
dimCount <- head(dimCount, n=10)
barplot(dimCount, cex.names=0.6, main="Most Common Video Dimensions on Jux", xlab="Dimensions", ylab="Occurrences")
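# Assumes the application environment is already loaded (for example a Rails
# console), so that Mongoid, the Video model, and ActiveSupport's `try` exist.
require 'benchmark'
require 'csv'

CSV.open('/home/jux/videos.csv', 'wb') do |csv|
  # header row
  csv << %w(id width height service)
  # turn off Mongoid's identity map for the current thread during the export
  Mongoid.unit_of_work(disable: :current) do
    puts Benchmark.measure {
      # load only the property_set field of each video
      Video.all.only(:property_set).each do |vid|
        ps = vid.property_set
        dims = ps.try :dimensions
        width = dims.try :width
        height = dims.try :height
        # skip videos that are missing either dimension
        csv << [vid.id, width, height, ps.service] if width && height
      end
    }
  end
end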
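The filtering rule in the export script (write a row only when both width and height are present) can be tried without a database. The sketch below is not the gist's code: `StubDimensions`, `StubPropertySet`, `StubVideo`, and `videos_to_csv` are hypothetical stand-ins for the Mongoid documents, the four records are made-up test data, and Ruby's safe-navigation operator `&.` takes the place of ActiveSupport's `try`.

```ruby
require 'csv'

# Hypothetical stand-ins for the Mongoid documents; not the real models.
StubDimensions  = Struct.new(:width, :height)
StubPropertySet = Struct.new(:dimensions, :service)
StubVideo       = Struct.new(:id, :property_set)

# Build CSV text from a list of videos, skipping any video that lacks
# a width or a height (same rule as the export script).
def videos_to_csv(videos)
  CSV.generate do |csv|
    csv << %w(id width height service)
    videos.each do |vid|
      ps = vid.property_set
      dims = ps&.dimensions   # `&.` returns nil instead of raising on nil
      width = dims&.width
      height = dims&.height
      # ps.service is only evaluated when the condition holds, so a nil
      # property set never reaches it.
      csv << [vid.id, width, height, ps.service] if width && height
    end
  end
end

# Made-up records covering each way a video can be incomplete.
videos = [
  StubVideo.new(1, StubPropertySet.new(StubDimensions.new(640, 480), 'vimeo')),
  StubVideo.new(2, StubPropertySet.new(nil, 'youtube')),                       # no dimensions
  StubVideo.new(3, nil),                                                       # no property set
  StubVideo.new(4, StubPropertySet.new(StubDimensions.new(480, nil), 'youtube')) # no height
]

puts videos_to_csv(videos)
```

Only the first record has both dimensions, so the output is the header row followed by a single data row.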
id,width,height,service
255835,459,344,youtube
345278,480,270,youtube
299833,480,270,youtube
257513,1280,720,vimeo
259259,640,480,vimeo
...
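As a quick sanity check on the exported file, the same "most common dimensions" tally that the R script draws as a bar plot can be sketched in Ruby. This is only an illustration: `top_dimensions` is a hypothetical helper, `SAMPLE` holds just the five rows shown above (not the full file), and breaking ties alphabetically by label is a choice made here for determinism, not something the R script does.

```ruby
require 'csv'

# The five sample rows shown above; the real file is much longer.
SAMPLE = <<~CSV_TEXT
  id,width,height,service
  255835,459,344,youtube
  345278,480,270,youtube
  299833,480,270,youtube
  257513,1280,720,vimeo
  259259,640,480,vimeo
CSV_TEXT

# Count rows per "WIDTHxHEIGHT" label and return the `limit` most common
# [label, count] pairs. Ties are ordered alphabetically by label.
def top_dimensions(csv_text, limit = 10)
  counts = Hash.new(0)
  CSV.parse(csv_text, headers: true).each do |row|
    counts["#{row['width']}x#{row['height']}"] += 1
  end
  counts.sort_by { |label, count| [-count, label] }.first(limit)
end

top_dimensions(SAMPLE).each { |label, count| puts "#{label}: #{count}" }
# prints:
# 480x270: 2
# 1280x720: 1
# 459x344: 1
# 640x480: 1
```

To run it against the exported file instead of the sample, pass `File.read('/home/jux/videos.csv')` as `csv_text`.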