@ricealexander
Last active August 25, 2025 17:27
Collect improperly formatted person page URLs in Grove
const staffLinks = []

// Paste this into the browser console, then navigate across the site and
// expand any "Read More" buttons. Once per second, the script scans the
// current page and collects any improper author links it finds.
setInterval(() => {
  const bylines = document.querySelectorAll('[class*="Promo"][class*="-authorName"] .Link')

  for (const byline of bylines) {
    const href = byline.href

    if (!href                        // 1. Skip invalid links (checked first so the string methods below are safe)
      || staffLinks.includes(href)   // 2. Skip links we've already noted, so repeated scans don't add duplicates
    ) continue

    if (href.endsWith('-1')          // 3. Look for duplicate permalinks
      || href.split('-').length > 3  // 4. Look for partner organizations (three or more hyphens in the URL)
    ) {
      staffLinks.push(href)
      continue
    }

    if (href.startsWith('https://www.stlpr.org/people/')  // 5. Skip links that already match our desired format
      || !href.startsWith('https://www.stlpr.org/')       // 6. Skip partner station links
    ) continue

    staffLinks.push(href)            // 7. Catch all links that remain
  }
}, 1000)
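
The checks in the script above can also be tried outside the browser. The following is a minimal sketch that restates them as a pure function. The name `isImproperAuthorLink` and the sample URLs are hypothetical and not part of the original gist. Only the `https://www.stlpr.org/` prefixes come from the script itself.

```javascript
// Hypothetical helper (not part of the original gist): restates the
// script's checks as a pure function so they can be run without a page.
const SITE = 'https://www.stlpr.org/'
const PEOPLE = SITE + 'people/'

function isImproperAuthorLink(href) {
  if (!href) return false                      // invalid link
  if (href.endsWith('-1')) return true         // duplicate permalink
  if (href.split('-').length > 3) return true  // three or more hyphens: treated as a partner organization
  if (href.startsWith(PEOPLE)) return false    // already in the desired format
  if (!href.startsWith(SITE)) return false     // partner station link
  return true                                  // any other link on the site
}

// Made-up URLs, for illustration only.
const samples = [
  'https://www.stlpr.org/people/jane-doe',
  'https://www.stlpr.org/people/jane-doe-1',
  'https://www.stlpr.org/jane-doe',
  'https://www.example.org/jane-doe',
  '',
]

const flagged = samples.filter(isImproperAuthorLink)
console.log(flagged)
```

With these samples, only the `-1` permalink and the on-site link outside `/people/` are flagged. In the browser, the collected results stay in the `staffLinks` array, which can be inspected from the console at any time while the script runs.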