Skip to content

Instantly share code, notes, and snippets.

@tkhduracell
Created January 14, 2019 19:21
Show Gist options
  • Save tkhduracell/2270300124180c9b7a0874853894af77 to your computer and use it in GitHub Desktop.
Save tkhduracell/2270300124180c9b7a0874853894af77 to your computer and use it in GitHub Desktop.
Parse Google Play Store version information, node.js, cheerio, scrape
const extractContent = module.exports.extractContent = $ => {
return {
lines: $("div:contains('What's New') > h2")
.parent()
.parent()
.find('content')
.get()
.map(block => block.children.map(node => {
return $(node).text() // Using dot list in release notes
.replace(/^\W*\*\W*/, '')
.trim()
}))
.map(items => items.filter(s => s !== ''))
.filter(s => s.indexOf('Read more') === -1)
.filter(s => s.indexOf('Collapse') === -1)
.pop(),
name: $("div:contains('Current Version') + span")
.get()
.map(itm => $(itm).text().trim())
.join(', '),
date: $("div:contains('Updated') + span")
.get()
.map(itm => $(itm).text().trim())
.join(', ')
}
}
module.exports.fetchLatestVersion = (log) => {
const url = 'https://play.google.com/store/apps/details?id=<pkg>'
log('Running Google Play parser...')
return new Promise((resolve, reject) => {
scraperjs.StaticScraper
.create(url)
.scrape(extractContent, data => resolve(data))
.catch(err => reject(err))
})
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment