- Can you verify 5555 amp links?
- sure I can!
The links are in a CSV file with this format:
ID,BRAND,AMP Error Type,AMP URL,Last detected
I need to send them one by one through amphtml-validator:
$ node_modules/.bin/amphtml-validator http://www.bonappetit.com/uncategorized/article/the-linkery-03-24-10/amp --format json
{"http://www.bonappetit.com/uncategorized/article/the-linkery-03-24-10/amp":{"status":"FAIL","errors":[{"severity":"ERROR","line":33,"col":3,"message":"Invalid URL protocol 'foodhttp:' for attribute 'href' in tag 'a'.","specUrl":"https://www.ampproject.org/docs/reference/spec#links","category":"DISALLOWED_HTML","code":"INVALID_URL_PROTOCOL","params":["href","a","foodhttp"]}]}}
We can use head and tail to select a specific row:
$ head -n 2 errors.csv | tail -n 1
1,Bon Appetit,Prohibited or invalid use of HTML Tag (Critical issue),http://www.bonappetit.com/recipe/spicy-italian-sausage/amp,4/3/17
$ head -n 3 errors.csv | tail -n 1
2,Bon Appetit,Prohibited or invalid use of HTML Tag (Critical issue),http://www.bonappetit.com/entertaining-style/gift-guides/article/the-7-best-culinary-bookstores-in-america/amp,4/3/17
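The same row selection can also be done with a single sed call. This is only a sketch: nth_line is a hypothetical helper name, and like head and tail it assumes every CSV row sits on exactly one line (no quoted fields with embedded newlines).

```shell
# Hypothetical helper (not part of the original notes): print line N of FILE.
# Line 1 is the CSV header, so the first data row is line 2.
nth_line() {
  # -n suppresses default output; on line N print it, then quit early
  # so sed does not read the rest of the file.
  sed -n "${1}{p;q;}" "$2"
}

# Usage: nth_line 3 errors.csv
```

It should print the same row as head -n 3 errors.csv | tail -n 1.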
From each row we need to select the URL:
$ head -n 3 errors.csv | tail -n 1 | csvcut -c 4
http://www.bonappetit.com/entertaining-style/gift-guides/article/the-7-best-culinary-bookstores-in-america/amp
From there we need to step the row index from 2 up to 5555. We can use seq for this:
$ seq 2 10
2
3
4
5
6
7
8
9
10
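If we would rather not hard-code the last index, it can be derived from the file itself. A minimal sketch, assuming the file ends with a newline (wc -l counts newline characters, so a final row without one would be missed); row_indices is a hypothetical helper name.

```shell
# Hypothetical helper: print the line numbers of all data rows in a CSV file,
# i.e. 2 up to the number of lines (line 1 is the header).
row_indices() {
  # Arithmetic expansion strips the padding some wc implementations add.
  last=$(( $(wc -l < "$1") ))
  # Guard: some seq implementations count downwards when last < 2.
  if [ "$last" -ge 2 ]; then
    seq 2 "$last"
  fi
}

# Usage: row_indices errors.csv
```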
And we are going to use parallel to speed up the process:
$ mkdir -p output
$ seq 2 5555 | parallel -j10 "head -n {} errors.csv | tail -n 1 | csvcut -c 4 | xargs node_modules/.bin/amphtml-validator --format json >> output/{}.json"
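Before launching thousands of validator calls it can help to dry-run the loop. The sketch below is a sequential stand-in, not the parallel command itself: validate_urls and the VALIDATOR variable are hypothetical names, and it assumes the URLs arrive on stdin one per line, in the same order as the data rows of the CSV.

```shell
# Hypothetical helper: read URLs from stdin and write one result file per URL.
# Files are numbered from 2 so they line up with the CSV line numbers.
# VALIDATOR is an assumed override; it defaults to the validator used here.
validate_urls() {
  out_dir=$1
  mkdir -p "$out_dir"
  n=1
  # "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    n=$((n + 1))
    # Skip blank lines but keep the numbering in step with the input.
    [ -n "$url" ] || continue
    # stdin is redirected so the command cannot swallow the URL list.
    ${VALIDATOR:-node_modules/.bin/amphtml-validator} "$url" --format json \
      < /dev/null > "$out_dir/$n.json"
  done
}

# Dry run: csvcut -c 4 errors.csv | tail -n +2 | VALIDATOR=echo validate_urls dry-run
```

With VALIDATOR=echo each output file just holds the command-line arguments that would have been passed, which makes it easy to spot a malformed URL before the real run.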
☕
Now we have a bunch of JSON files inside ./output. We can merge them together with:
$ for f in output/*.json; do (cat "${f}"; echo) >> output/output.dat; done
Finally we need to remove the occasional empty lines from our file:
$ sed '/^[[:space:]]*$/d' output/output.dat > clean.dat
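The merge and the clean-up can probably be folded into one step with awk. A sketch, assuming each JSON result is a single line (as in the validator output shown earlier); merge_json_lines is a hypothetical helper name.

```shell
# Hypothetical helper: concatenate every .json file in a directory,
# one record per line, dropping empty and whitespace-only lines.
merge_json_lines() {
  # NF is the number of fields on a line; it is 0 for blank lines,
  # so the pattern 'NF' prints only non-blank lines. awk also ends the
  # last record of each file at end-of-file, so a file without a
  # trailing newline does not run into the next one.
  awk 'NF' "$1"/*.json
}

# Usage: merge_json_lines output > clean.dat
```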
Our new file clean.dat is ready to go!
this is incredible, I'd suggest merging this doc into github.com/CondeNast/autopilot-services-validation