@gonzalo-bulnes
Last active January 9, 2021 02:09

Revisions

  1. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 4 additions and 4 deletions.
    8 changes: 4 additions & 4 deletions updating-links.md
    @@ -114,20 +114,20 @@ find ./docs -type f -exec sed -i -e 's/https:\/\/www.yubico.com\/products\/yubik
    # # the name of the file will be placed wherever the COMMAND says {} (double curly bracket)
    # sed -i -e 's/A/B/g' FILE # replaces all occurrences of A by B in FILE (case sensitive)
    # # note that the / characters are meaningful, so we should make sure any / in A or B
    # # is escaped as \/. Because \ is used for escaping, writing a \ takes two \\.
    # # is escaped as \/. Because \ is used for escaping, writing a \ takes two \\. :exploding_head:
    ```

    First, let's escape the `/` characters by replacing them with `\/`. Remember we'll have to write `\\\/` in **sed** to get `\/` in the file:
    ```bash
    sed -i -e 's/\//\\\//g' replace.sh
    ```

    Then let's replace the start of the line (represented by `^`) by `find ./ -type f -exec sed -i -e 's/`, ting care of escaping any `/` and `'`. Because the second `sed` is inside quotes, it is not a command just text:
    Then let's replace the start of the line (represented by `^`) by `find ./docs -type f -exec sed -i -e 's/`, taking care of escaping any `/` and `'`. A `'` cannot be escaped inside a single-quoted string, so it is written as `'\''` (close the quotes, add an escaped `'`, reopen the quotes). Because the second `sed` is inside quotes, it is not a command, just text:
    ```bash
    sed -i -e 's/^/find .\/docs -type f -exec sed -i -e '\''s\//g' replace.sh
    ```
    Then let's replace the space between the URLs by `/` (the separation between A and B in our example). Rember to escape the `/`:
    Then let's replace the space between the URLs by `/` (the separation between A and B in our example). Remember to escape the `/` (by `\/`):
    ```bash
    sed -i -e 's/ /\//g' replace.sh
    ```
    @@ -183,7 +183,7 @@ git add -p
    Limitations
    -----------
    You'll notice that some URLs get overwritten multiple times, resulting in, for example: `https://docs.securedrop.org/en/latest/enlatest/en/latest/`. I didn't think it worth trying to avoid it this time around. Comment below if you know how! : )
    You'll notice that some URLs get overwritten multiple times, resulting in, for example: `https://docs.securedrop.org/en/latest/enlatest/en/latest/`. I didn't think it worth trying to avoid it this time around. Comment below if you know how! :slightly_smiling_face:
    References
    ----------
  2. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion updating-links.md
    @@ -103,7 +103,7 @@ https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key https://w
    Observations:

    - there is a single space, and it is just between the original URL and the updated URL.
    - the URLs contain `/` characters, that we'll have to escape befoer using **sed**
    - the URLs contain `/` characters, that we'll have to escape before using **sed**
    - the original URL comes first, the updated URL second

    In order to update all URLs in all files, we want to achieve the following command (for each line):
  3. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions updating-links.md
    @@ -65,8 +65,8 @@ Edit those lines to only keep one original URL and one updated URL on each of th
    ```bash
    cat permanent-redirects.log | cut -d')' -f2 | cut -d' ' -f4,8 > old-new-links.log

    # cut -d')' -f2 # keeps the entire lines except their start up to the (first) closing bracket
    # cut -d' ' -f4,8 # splits the lines on spaces and only keeps the 4th and 8th segments (both URLs!)
    # cut -d')' -f2 # keeps (each) entire line except their start up to the (only) closing bracket
    # cut -d' ' -f4,8 # splits each line on spaces and only keeps the 4th and 8th segments (both URLs!)
    ```

    The new file looks like this (`tail old-new-links.log`):
  4. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion updating-links.md
    @@ -36,7 +36,7 @@ writing output... [100%] yubikey_setup
    (line 15) redirect https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key - permanently to https://www.yubico.com/authentication-standards/fido-u2f/
    ```

    Remove all the lines that are not reportin permanent redirects:
    Remove all the lines that are not reporting permanent redirects:
    ```bash
    cat linkcheck.log | grep redirect | grep permanently > permanent-redirects.log

  5. gonzalo-bulnes revised this gist Jan 7, 2021. No changes.
  6. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions updating-links.md
    @@ -108,9 +108,9 @@ Observations:

    In order to update all URLs in all files, we want to achieve the following command (for each line):
    ```bash
    find ./ -type f -exec sed -i -e 's/https:\/\/www.yubico.com\/products\/yubikey-hardware\/fido-u2f-security-key/https:\/\/www.yubico.com\/authentication-standards\/fido-u2f\//g' {} \;
    find ./docs -type f -exec sed -i -e 's/https:\/\/www.yubico.com\/products\/yubikey-hardware\/fido-u2f-security-key/https:\/\/www.yubico.com\/authentication-standards\/fido-u2f\//g' {} \;

    # find ./ -type f -exec COMMAND \; # applies COMMAND to each file in a directory and its sub-directories
    # find ./docs -type f -exec COMMAND \; # applies COMMAND to each file in the 'docs' directory and its sub-directories
    # # the name of the file will be placed wherever the COMMAND says {} (double curly bracket)
    # sed -i -e 's/A/B/g' FILE # replaces all occurrences of A by B in FILE (case sensitive)
    # # note that the / characters are meaningful, so we should make sure any / in A or B
    @@ -124,7 +124,7 @@ sed -i -e 's/\//\\\//g' replace.sh

    Then let's replace the start of the line (represented by `^`) by `find ./ -type f -exec sed -i -e 's/`, ting care of escaping any `/` and `'`. Because the second `sed` is inside quotes, it is not a command just text:
    ```bash
    sed -i -e 's/^/find .\/ -type f -exec sed -i -e '\''s\//g' replace.sh
    sed -i -e 's/^/find .\/docs -type f -exec sed -i -e '\''s\//g' replace.sh
    ```
    Then let's replace the space between the URLs by `/` (the separation between A and B in our example). Rember to escape the `/`:
  7. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 7 additions and 1 deletion.
    8 changes: 7 additions & 1 deletion updating-links.md
    @@ -183,4 +183,10 @@ git add -p
    Limitations
    -----------
    You'll notice that some URLs get overwritten multiple times, resulting in, for example: `https://docs.securedrop.org/en/latest/enlatest/en/latest/`. I didn't think it worth trying to avoid it this time around. Comment below if you know how! : )
    You'll notice that some URLs get overwritten multiple times, resulting in, for example: `https://docs.securedrop.org/en/latest/enlatest/en/latest/`. I didn't think it worth trying to avoid it this time around. Comment below if you know how! : )
    References
    ----------
    - https://stackoverflow.com/questions/6758963
    - https://stackoverflow.com/questions/5917576
  8. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion updating-links.md
    @@ -151,7 +151,7 @@ Applying the replacements from longer URL to shorter URL avoids some mix-ups (bu
    cat replace-unique.sh | perl -e 'print sort { length($b) <=> length($a) } <>' > replace-unique-sorted.sh
    ```
    At this point the file looks like this:
    At this point the file looks like this (`tail replace-unique-sorted.sh`):
    ```
    [snip]
  9. gonzalo-bulnes revised this gist Jan 7, 2021. 1 changed file with 100 additions and 0 deletions.
    100 changes: 100 additions & 0 deletions updating-links.md
    @@ -5,6 +5,9 @@ Updating links from `linkcheck` output
    cd securedrop-docs
    ```

    Getting a list of original and updated URLs
    -------------------------------------------

    First collect the output of `make docs-linkcheck`:
    ```bash
    make docs-linkcheck > linkcheck.log
    @@ -82,5 +85,102 @@ https://tails.boum.org/doc/first_steps/startup_options/administration_password/
    https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key https://www.yubico.com/authentication-standards/fido-u2f/
    ```

    Building a list of commands to update the URLs
    ----------------------------------------------

    Transform the list of URLs into a list of commands that replace every occurrence of each original URL in a directory with the updated one.

    Copy the URLs to a new file so that the list is not lost:
    ```bash
    cp old-new-links.log replace.sh
    ```

    Each line looks like this:
    ```
    https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key https://www.yubico.com/authentication-standards/fido-u2f/
    ```

    Observations:

    - there is a single space, and it is just between the original URL and the updated URL.
    - the URLs contain `/` characters, that we'll have to escape befoer using **sed**
    - the original URL comes first, the updated URL second

    In order to update all URLs in all files, we want to achieve the following command (for each line):
    ```bash
    find ./ -type f -exec sed -i -e 's/https:\/\/www.yubico.com\/products\/yubikey-hardware\/fido-u2f-security-key/https:\/\/www.yubico.com\/authentication-standards\/fido-u2f\//g' {} \;

    # find ./ -type f -exec COMMAND \; # applies COMMAND to each file in a directory and its sub-directories
    # # the name of the file will be placed wherever the COMMAND says {} (double curly bracket)
    # sed -i -e 's/A/B/g' FILE # replaces all occurrences of A by B in FILE (case sensitive)
    # # note that the / characters are meaningful, so we should make sure any / in A or B
    # # is escaped as \/. Because \ is used for escaping, writing a \ takes two \\.
    ```

    First, let's escape the `/` characters by replacing them with `\/`. Remember we'll have to write `\\\/` in **sed** to get `\/` in the file:
    ```bash
    sed -i -e 's/\//\\\//g' replace.sh
    ```

    Then let's replace the start of the line (represented by `^`) by `find ./ -type f -exec sed -i -e 's/`, ting care of escaping any `/` and `'`. Because the second `sed` is inside quotes, it is not a command just text:
    ```bash
    sed -i -e 's/^/find .\/ -type f -exec sed -i -e '\''s\//g' replace.sh
    ```
    Then let's replace the space between the URLs by `/` (the separation between A and B in our example). Rember to escape the `/`:
    ```bash
    sed -i -e 's/ /\//g' replace.sh
    ```
    Finally, let's replace the end of line (represented by `$`) by `/g' {} \;`. Take care as usual of escaping `/`, `\` and `'`:
    ```bash
    sed -i -e 's/$/\/g'\'' {} \\;/g' replace.sh
    ```
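The four steps above can also be folded into a single pass. This is only a minimal sketch, assuming the `old-new-links.log` format shown earlier (one `OLD NEW` pair per line, separated by a single space). The `build_commands` function name and the sample URLs are made up for illustration. Replacing the space before adding the `find` prefix keeps the spaces of the prefix itself intact, and using `|` as the delimiter for the last two expressions avoids some of the escaping (this assumes the directory name contains no `|` or `&`):

```bash
# Hypothetical helper: turns "OLD NEW" lines into find/sed commands.
# $1: file with the URL pairs, $2: directory to search (defaults to ./docs)
build_commands() {
  dir=${2:-./docs}
  # 1. escape every / as \/
  # 2. replace the single space between the URLs by /
  # 3. add the start of the command
  # 4. add the end of the command (double quotes, so $ and \ are escaped)
  sed -e 's/\//\\\//g' \
      -e 's/ /\//' \
      -e "s|^|find $dir -type f -exec sed -i -e 's/|" \
      -e "s|\$|/g' {} \\\\;|" \
      "$1"
}

# Sample data (illustrative only)
tmp=$(mktemp -d)
printf '%s\n' 'https://a.example/x https://b.example/y/' > "$tmp/pairs.log"
build_commands "$tmp/pairs.log" ./docs
```

The function only prints the commands, so its output can be redirected to `replace.sh` and inspected before anything is run.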
    Tips and tricks
    ---------------
    Some URLs may have been present multiple times, resulting in duplicate commands.
    Remove duplicate commands:
    ```bash
    cat replace.sh | uniq > replace-unique.sh
    ```
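Note that `uniq` only collapses duplicates that sit on adjacent lines, as they do in the sample above. If duplicates may end up scattered through the file, `sort -u` is a possible alternative: the original order is lost, but the next step re-sorts the lines by length anyway. A small sketch with made-up file contents:

```bash
tmp=$(mktemp -d)
# Sample data (illustrative only): the duplicate lines are not adjacent
printf '%s\n' 'cmd-a' 'cmd-b' 'cmd-a' > "$tmp/replace.sh"

uniq "$tmp/replace.sh"                                # still prints cmd-a twice
sort -u "$tmp/replace.sh" > "$tmp/replace-unique.sh"  # keeps one copy of each line
cat "$tmp/replace-unique.sh"
```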
    Applying the replacements from longer URL to shorter URL avoids some mix-ups (but not all of them!):
    ```bash
    cat replace-unique.sh | perl -e 'print sort { length($b) <=> length($a) } <>' > replace-unique-sorted.sh
    ```
    At this point the file looks like this:
    ```
    [snip]
    find ./ -type f -exec sed -i -e 's/https:\/\/pypi.python.org\/pypi\/authenticator/https:\/\/pypi.org\/project\/authenticator\//g' {} \;
    find ./ -type f -exec sed -i -e 's/https:\/\/pypi.python.org\/pypi\/html-linter\//https:\/\/pypi.org\/project\/html-linter\//g' {} \;
    find ./ -type f -exec sed -i -e 's/http:\/\/www.vagrantup.com\/downloads.html/https:\/\/www.vagrantup.com\/downloads.html/g' {} \;
    find ./ -type f -exec sed -i -e 's/http:\/\/docs.seleniumhq.org\/docs\//https:\/\/www.selenium.dev\/documentation\//g' {} \;
    find ./ -type f -exec sed -i -e 's/https:\/\/docs.securedrop.org\//https:\/\/docs.securedrop.org\/en\/stable\//g' {} \;
    find ./ -type f -exec sed -i -e 's/https:\/\/docs.securedrop.org/https:\/\/docs.securedrop.org\/en\/stable\//g' {} \;
    find ./ -type f -exec sed -i -e 's/http:\/\/weblate.securedrop.org\//https:\/\/weblate.securedrop.org\//g' {} \;
    find ./ -type f -exec sed -i -e 's/https:\/\/hstspreload.appspot.com\//https:\/\/hstspreload.org\//g' {} \;
    find ./ -type f -exec sed -i -e 's/http:\/\/www.ansible.com\//https:\/\/www.ansible.com\//g' {} \;
    find ./ -type f -exec sed -i -e 's/https:\/\/ossec.github.io\//https:\/\/www.ossec.net\//g' {} \;
    ```
    Usage
    -----
    Finally we can apply the replacement:
    ```bash
    sh replace-unique-sorted.sh
    ```
    And review the updated URLs as we add them to version control:
    ```bash
    git add -p
    ```
    Limitations
    -----------
    You'll notice that some URLs get overwritten multiple times, resulting in, for example: `https://docs.securedrop.org/en/latest/enlatest/en/latest/`. I didn't think it worth trying to avoid it this time around. Comment below if you know how! : )
  10. gonzalo-bulnes created this gist Jan 7, 2021.
    86 changes: 86 additions & 0 deletions updating-links.md
    @@ -0,0 +1,86 @@
    Updating links from `linkcheck` output
    ======================================

    ```bash
    cd securedrop-docs
    ```

    First collect the output of `make docs-linkcheck`:
    ```bash
    make docs-linkcheck > linkcheck.log
    ```

    It looks like this (`head linkcheck.log` and `tail linkcheck.log`):
    ```
    rm -rf _build/*
    make[1]: Leaving directory '/home/user/src/securedrop-docs/docs'
    Running Sphinx v2.3.1
    making output directory... done
    building [mo]: targets for 0 po files that are out of date
    building [linkcheck]: targets for 81 source files that are out of date
    updating environment: [new config] 81 added, 0 changed, 0 removed
    reading sources... [ 1%] admin
    reading sources... [ 2%] backup_and_restore
    [snip]
    (line 8) ok https://blog.torproject.org/v2-deprecation-timeline
    writing output... [ 98%] what_makes_securedrop_unique
    (line 76) ok https://www.reuters.com/article/us-media-cybercrime/journalists-media-under-attack-from-hackers-google-researchers-idUSBREA2R0EU20140328
    writing output... [100%] yubikey_setup
    (line 59) ok https://www.yubico.com/wp-content/uploads/2015/03/YubiKeyManual_v3.4.pdf
    (line 15) ok https://support.yubico.com/hc/en-us/articles/360016614780-OATH-HOTP-Yubico-Best-Practices-Guide
    (line 15) redirect https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key - permanently to https://www.yubico.com/authentication-standards/fido-u2f/
    ```

    Remove all the lines that are not reportin permanent redirects:
    ```bash
    cat linkcheck.log | grep redirect | grep permanently > permanent-redirects.log

    # grep redirect # only keeps the lines that contain the word 'redirect'
    # grep permanently # only keeps the lines that contain the word 'permanently'
    ```

    The new file looks like this (`tail permanent-redirects.log`):
    ```
    [snip]
    (line 1) redirect https://itunes.apple.com/us/app/freeotp-authenticator/id872559395 - permanently to https://apps.apple.com/us/app/freeotp-authenticator/id872559395
    (line 6) redirect https://itunes.apple.com/us/app/google-authenticator/id388497605 - permanently to https://apps.apple.com/us/app/google-authenticator/id388497605
    (line 7) redirect https://pypi.python.org/pypi/authenticator - permanently to https://pypi.org/project/authenticator/
    (line 14) redirect https://arstechnica.com/security/2013/12/scientist-developed-malware-covertly-jumps-air-gaps-using-inaudible-sound/ - permanently to https://arstechnica.com/information-technology/2013/12/scientist-developed-malware-covertly-jumps-air-gaps-using-inaudible-sound/
    (line 647) redirect https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks - permanently to https://blog.torproject.org/critique-website-traffic-fingerprinting-attacks
    (line 55) redirect https://docs.securedrop.org - permanently to https://docs.securedrop.org/en/stable/
    (line 14) redirect https://tails.boum.org/doc/first_steps/startup_options/administration_password/ - permanently to https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    (line 14) redirect https://tails.boum.org/doc/first_steps/startup_options/administration_password/ - permanently to https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    (line 15) redirect https://tails.boum.org/doc/first_steps/startup_options/administration_password/ - permanently to https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    (line 15) redirect https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key - permanently to https://www.yubico.com/authentication-standards/fido-u2f/
    ```

    Edit those lines to only keep one original URL and one updated URL on each of them:
    ```bash
    cat permanent-redirects.log | cut -d')' -f2 | cut -d' ' -f4,8 > old-new-links.log

    # cut -d')' -f2 # keeps the entire lines except their start up to the (first) closing bracket
    # cut -d' ' -f4,8 # splits the lines on spaces and only keeps the 4th and 8th segments (both URLs!)
    ```

    The new file looks like this (`tail old-new-links.log`):
    ```
    [snip]
    https://itunes.apple.com/us/app/freeotp-authenticator/id872559395 https://apps.apple.com/us/app/freeotp-authenticator/id872559395
    https://itunes.apple.com/us/app/google-authenticator/id388497605 https://apps.apple.com/us/app/google-authenticator/id388497605
    https://pypi.python.org/pypi/authenticator https://pypi.org/project/authenticator/
    https://arstechnica.com/security/2013/12/scientist-developed-malware-covertly-jumps-air-gaps-using-inaudible-sound/ https://arstechnica.com/information-technology/2013/12/scientist-developed-malware-covertly-jumps-air-gaps-using-inaudible-sound/
    https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks https://blog.torproject.org/critique-website-traffic-fingerprinting-attacks
    https://docs.securedrop.org https://docs.securedrop.org/en/stable/
    https://tails.boum.org/doc/first_steps/startup_options/administration_password/ https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    https://tails.boum.org/doc/first_steps/startup_options/administration_password/ https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    https://tails.boum.org/doc/first_steps/startup_options/administration_password/ https://tails.boum.org/doc/first_steps/welcome_screen/administration_password/
    https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key https://www.yubico.com/authentication-standards/fido-u2f/
    ```
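The two `cut` calls that produce this file depend on the exact number of spaces in each line. As a sketch of an alternative that does not, assuming every line of interest has the shape `(line N) redirect OLD - permanently to NEW` shown in the sample above, `awk` can pick the field that follows `redirect` and the last field. The `extract_pairs` name is made up, and the extra padding in the sample log is an assumption about the raw output:

```bash
# Hypothetical helper: prints "OLD NEW" for each permanent redirect found in $1.
extract_pairs() {
  awk '/redirect/ && /permanently/ {
    for (i = 1; i < NF; i++)
      if ($i == "redirect") { print $(i + 1), $NF; break }
  }' "$1"
}

# Sample data (illustrative only)
tmp=$(mktemp -d)
printf '%s\n' \
  '(line   59) ok        https://www.yubico.com/wp-content/uploads/2015/03/YubiKeyManual_v3.4.pdf' \
  '(line   15) redirect  https://www.yubico.com/products/yubikey-hardware/fido-u2f-security-key - permanently to https://www.yubico.com/authentication-standards/fido-u2f/' \
  > "$tmp/linkcheck.log"

extract_pairs "$tmp/linkcheck.log"
```

Since `awk` splits on runs of whitespace, the result does not change if the line numbers are padded differently.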