@smallnewer
Created August 29, 2013 12:37
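Matches every a tag in an HTML string and extracts its anchor text and link. The anchor text is the text that remains after all tags nested inside the a tag have been stripped out.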
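function d(html){
    // Use the HTML string passed in; fall back to the current page only when none is given.
    if (html == null) html = document.documentElement.innerHTML;
    var aList = [];
    // Match a tags: group 1 captures the quoted href value, group 2 the inner HTML.
    var reg = /<a\s+[^>]*?href=['"]([^'"]*?)['"][^>]*?>([\d\D]*?)<\/a>/gi;
    // Match any opening or closing tag so it can be stripped from the anchor text.
    var reg2 = /<\/?\w+[^>]*?>/gi;
    var res = null;
    while((res = reg.exec(html)) != null){
        aList.push({
            source: res[0],
            href: res[1],
            text: res[2].replace(reg2,"")
        });
    }
    return aList;
}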
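
A minimal usage sketch of the approach described above, run on a literal string instead of the live page. The name `extractAnchors` and the sample markup are assumptions made for illustration; the function restates the same two regular expressions so that the snippet runs on its own, for example in Node.js.

```javascript
// Hypothetical standalone variant of d(): takes the HTML as an argument
// and never touches `document`, so it also runs outside a browser.
function extractAnchors(html) {
    // Group 1: quoted href value. Group 2: everything between <a ...> and </a>.
    var anchorRe = /<a\s+[^>]*?href=['"]([^'"]*?)['"][^>]*?>([\d\D]*?)<\/a>/gi;
    // Any opening or closing tag nested inside the anchor.
    var tagRe = /<\/?\w+[^>]*?>/g;
    var list = [];
    var m;
    while ((m = anchorRe.exec(html)) !== null) {
        list.push({ source: m[0], href: m[1], text: m[2].replace(tagRe, "") });
    }
    return list;
}

// Made-up sample input: one anchor with nested tags, one with a single-quoted href.
var sample = '<p><a class="x" href="/docs"><b>Read</b> the <i>docs</i></a> or ' +
             "<a href='https://example.com'>visit</a></p>";
var links = extractAnchors(sample);
console.log(links);
```

Here the first entry should have the href `/docs` and the text `Read the docs`, and the second the href `https://example.com` and the text `visit`. Because this is a regular-expression match and not an HTML parser, it only picks up quoted href values and can be thrown off by unusual markup, such as a `>` inside an attribute value.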