@ianchanning
Last active July 19, 2025 11:48
Pinboard.in bookmarklet for Firefox (it should work in any browser) that saves an arXiv paper's title and abstract (as the description), plus authors, year, and subjects (as tags)

🧠 Lessons Learned: Building a Reliable Pinboard Bookmarklet

Creating a dependable Pinboard bookmarklet, especially one that extracts structured data from sites like arXiv.org, sounds simple, but there are surprising complexities. This guide documents the key lessons and quirks uncovered while building one.


💥 Common Pitfalls & Gotchas

1. later=yes silently disables description

If your Pinboard URL includes later=yes, Pinboard silently ignores the description field: no error, just no description saved.

✅ Fix:

  • Do not use later=yes
  • Use a window.open() popup instead, which allows the description to be preserved
window.open(pinURL, "Pinboard", "toolbar=no,width=700,height=350");
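
To keep the flag from creeping back in, the URL can be checked before the popup opens. The following is a minimal sketch, not part of the original bookmarklet: `hasLaterFlag` and `openPinboardPopup` are hypothetical helper names, and `openFn` stands in for `window.open` so the logic can be exercised outside a browser.

```javascript
/* Hypothetical helpers, for illustration only. */

/* True when the query string carries later=yes. */
function hasLaterFlag(pinURL) {
  return new URL(pinURL).searchParams.get("later") === "yes";
}

/* Opens the popup only when later=yes is absent.
   openFn is expected to behave like window.open. */
function openPinboardPopup(pinURL, openFn) {
  if (hasLaterFlag(pinURL)) {
    throw new Error("later=yes is set; the description would be dropped");
  }
  return openFn(pinURL, "Pinboard", "toolbar=no,width=700,height=350");
}
```

In a bookmarklet this could be called as `openPinboardPopup(pinURL, function (u, n, f) { return window.open(u, n, f); })`.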

2. Single-line comments break bookmarklets

Since bookmarklets are often pasted into a single-line input field (like the browser's bookmarks bar), using // will break the code: once the newlines are gone, everything after the first // is treated as a comment.

✅ Fix:

Use block comments (/* ... */) only.
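
The effect can be reproduced outside the bookmarks bar. The sketch below assumes that pasting into a single-line field replaces newlines with spaces; `toSingleLine` and `runFlattened` are hypothetical names used only for this demonstration.

```javascript
/* Mimics pasting multi-line source into a single-line field:
   newlines are replaced by spaces (assumed behaviour). */
function toSingleLine(src) {
  return src
    .split("\n")
    .map(function (line) {
      return line.trim();
    })
    .join(" ");
}

/* Runs the flattened source as a function body and returns its result. */
function runFlattened(src) {
  return new Function(toSingleLine(src))();
}

var withLineComment = "var x = 1; // set x\nreturn x + 1;";
var withBlockComment = "var x = 1; /* set x */\nreturn x + 1;";

/* The return statement is swallowed by the // comment: result is undefined. */
var lineResult = runFlattened(withLineComment);
/* The block comment ends where it should: result is 2. */
var blockResult = runFlattened(withBlockComment);
```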


3. DOM methods can fail silently

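APIs like getElementsByClassName may return unexpected results due to dynamic content or page structure: the call returns a live collection that is simply empty when nothing matches, so indexing into it yields undefined rather than raising an error.

✅ Fix: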

  • Use document.querySelector() or document.querySelectorAll()
  • Always check for null or undefined before accessing .textContent
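
The null check can be wrapped once and reused. This is a minimal sketch with a hypothetical helper name, `textOf`; `root` is anything that exposes `querySelector`, such as `document`.

```javascript
/* Hypothetical helper: returns the trimmed text of the first match,
   or a fallback (default "") when the element is missing. */
function textOf(root, selector, fallback) {
  var el = root.querySelector(selector);
  if (!el || typeof el.textContent !== "string") {
    return fallback === undefined ? "" : fallback;
  }
  return el.textContent.trim();
}
```

For example, `textOf(document, "blockquote.abstract")` yields an empty string instead of throwing when the page has no abstract block.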

4. Use document.title for the bookmark title

Avoid parsing the title manually from page elements. document.title is reliable and already includes the paper ID.

var title = document.title;

5. Always use encodeURIComponent

If any part of the Pinboard URL is unescaped, the whole request may break.

✅ Fix:

var esc = function (s) {
  return encodeURIComponent(s);
};

Use it for every URL component: url, title, description, tags, etc.
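
One way to make sure no component is missed is to build the whole query string through a single function. The sketch below is an illustration only: `buildQuery` is a hypothetical helper, and the arXiv ID and title are made-up sample values.

```javascript
/* Hypothetical helper: turns [name, value] pairs into an escaped query string. */
function buildQuery(pairs) {
  return pairs
    .map(function (pair) {
      return encodeURIComponent(pair[0]) + "=" + encodeURIComponent(pair[1]);
    })
    .join("&");
}

/* Sample values, not taken from a real page. */
var query = buildQuery([
  ["url", "https://arxiv.org/abs/1234.5678"],
  ["title", "Cats & Dogs"]
]);
/* query is "url=https%3A%2F%2Farxiv.org%2Fabs%2F1234.5678&title=Cats%20%26%20Dogs" */
```

Without escaping, the `&` in the title would end the title parameter early and start a bogus parameter named " Dogs".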


6. Normalize and trim the abstract

Long descriptions are okay, but Pinboard fields should be clean:

✅ Tips:

  • Strip the "Abstract:" prefix
  • Collapse all whitespace
  • trim() the final result
var desc = abs.textContent
  .replace(/^\s*Abstract:\s*/i, "")
  .replace(/\s+/g, " ")
  .trim();

7. Debug with alert() during development

Pinboard gives zero feedback when fields are dropped or broken. Use alert() to confirm the final URL.

alert(pinURL);
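
A fully escaped URL is hard to read in an alert box. As a rough sketch, the URL can be decoded back into one field per line first; `describePinURL` is a hypothetical helper built on the standard `URL` and `URLSearchParams` APIs.

```javascript
/* Hypothetical helper: decodes a Pinboard add URL into "name: value" lines. */
function describePinURL(pinURL) {
  var lines = [];
  new URL(pinURL).searchParams.forEach(function (value, name) {
    lines.push(name + ": " + value);
  });
  return lines.join("\n");
}
```

During development, `alert(describePinURL(pinURL));` then shows each field on its own line, which makes a missing or empty description easy to spot.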

🧩 Data Extracted from arXiv

We were able to reliably extract:

| Field | Method |
| --- | --- |
| Paper ID | Regex match from location.href |
| Authors | div.authors → normalized author:john_doe format |
| Subjects | td.tablecell.subjects → extract arXiv tags from parentheses |
| Year | div.dateline → 20XX year match |
| Title | From document.title |
| Abstract | From blockquote.abstract → normalized as description |
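
To see what each extraction produces, the same regexes can be run on plain strings. This is a sketch under assumptions: the function names are hypothetical, and the sample inputs are invented to resemble arXiv page text, so real pages may differ.

```javascript
/* Paper ID: digits and dots after /abs/, without the version suffix. */
function paperId(href) {
  var m = href.match(/arxiv\.org\/abs\/([\d\.]+)(v\d+)?/);
  return m ? m[1] : "";
}

/* Year: first standalone 20XX number in the dateline text. */
function yearTag(dateline) {
  var m = dateline.match(/\b(20\d{2})\b/);
  return m ? "year:" + m[1] : "";
}

/* Authors: comma-separated names become author:first_last tags. */
function authorTags(text) {
  return text
    .replace(/^Authors?:\s*/, "")
    .split(",")
    .map(function (s) {
      return "author:" + s.trim().toLowerCase().replace(/\s+/g, "_");
    });
}

/* Subjects: every parenthesised code becomes a subject: tag. */
function subjectTags(text) {
  return (text.match(/\(([^)]+)\)/g) || []).map(function (s) {
    return "subject:" + s.slice(1, -1);
  });
}

/* Made-up sample inputs. */
var id = paperId("https://arxiv.org/abs/2401.12345v2");
var year = yearTag("[Submitted on 19 Jul 2025]");
var authors = authorTags("Authors: John Doe, Jane Roe");
var subjects = subjectTags("Machine Learning (cs.LG); Artificial Intelligence (cs.AI)");
```

With these inputs, `id` is "2401.12345", `year` is "year:2025", `authors` is ["author:john_doe", "author:jane_roe"], and `subjects` is ["subject:cs.LG", "subject:cs.AI"].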

✅ Final Notes

Writing Pinboard bookmarklets isn't hard, until you care about quality:

  • ✂️ Remove later=yes
  • 📝 Don't use // comments
  • 🕵️ DOM is fickle: always check for nulls
  • 🧼 Normalize everything
  • 🚨 Alert before you open
  • 💡 Use the popup!

Made through trial, error, and a healthy amount of rebellion ⚔️

(function () {
  try {
    /* encode helper */
    var esc = function (s) {
      return encodeURIComponent(s);
    };
    /* arXiv ID */
    var url = location.href;
    var m = url.match(/arxiv\.org\/abs\/([\d\.]+)(v\d+)?/);
    var paper = m ? m[1] : "";
    /* Authors → ["author:..."] */
    var getAuthorTags = function () {
      var el = document.querySelector("div.authors");
      return el
        ? el.textContent
            .replace(/^Authors?:\s*/, "")
            .split(",")
            .map(function (s) {
              return "author:" + s.trim().toLowerCase().replace(/\s+/g, "_");
            })
        : [];
    };
    /* Subjects → ["subject:..."] */
    var getSubjectTags = function () {
      var el = document.querySelector("td.tablecell.subjects");
      return el
        ? (el.textContent.match(/\(([^)]+)\)/g) || []).map(function (s) {
            return "subject:" + s.slice(1, -1);
          })
        : [];
    };
    /* Year → "year:2025" */
    var getYearTag = function () {
      var el = document.querySelector("div.dateline");
      var m = el && el.textContent.match(/\b(20\d{2})\b/);
      return m ? "year:" + m[1] : "";
    };
    /* Abstract → "..." */
    var getDescription = function () {
      var el = document.querySelector("blockquote.abstract");
      return el
        ? el.textContent
            .replace(/^\s*Abstract:\s*/i, "")
            .replace(/\s+/g, " ")
            .trim()
        : "";
    };
    /* Title (via <title>) */
    var title = document.title;
    /* Assemble all tags */
    var tags = []
      .concat(
        "arxiv",
        paper ? ["paper:" + paper] : [],
        getAuthorTags(),
        getSubjectTags(),
        getYearTag() ? [getYearTag()] : [],
      )
      .join(" ");
    /* Pinboard URL (opens popup for description) */
    var pinURL =
      "https://pinboard.in/add?" +
      "url=" +
      esc(url) +
      "&title=" +
      esc(title) +
      "&description=" +
      esc(getDescription()) +
      "&tags=" +
      esc(tags);
    void window.open(pinURL, "Pinboard", "toolbar=no,width=700,height=350");
  } catch (e) {
    alert("pin-arxiv error: " + (e.message || e));
  }
})();
// Safer https://make-bookmarklets.com/ minified version.
// javascript:(function(){!function(){try{var t=function(t){return encodeURIComponent(t)},e=location.href,r=e.match(/arxiv\.org\/abs\/([\d\.]+)(v\d+)?/),n=r?r[1]:"",o=function(){var t=document.querySelector("div.dateline"),e=t&&t.textContent.match(/\b(20\d{2})\b/);return e?"year:"+e[1]:""},a=document.title,c=[].concat("arxiv",n?["paper:"+n]:[],(u=document.querySelector("div.authors"),u?u.textContent.replace(/^Authors?:\s*/,"").split(",").map((function(t){return"author:"+t.trim().toLowerCase().replace(/\s+/g,"_")})):[]),function(){var t=document.querySelector("td.tablecell.subjects");return t?(t.textContent.match(/\(([^)]+)\)/g)||[]).map((function(t){return"subject:"+t.slice(1,-1)})):[]}(),o()?[o()]:[]).join(" "),i="https://pinboard.in/add?url="+t(e)+"&title="+t(a)+"&description="+t(function(){var t=document.querySelector("blockquote.abstract");return t?t.textContent.replace(/^\s*Abstract:\s*/i,"").replace(/\s+/g," ").trim():""}())+"&tags="+t(c);window.open(i,"Pinboard","toolbar=no,width=700,height=350")}catch(t){alert("pin-arxiv error: "+(t.message||t))}var u}();}());