Created
September 2, 2016 12:27
Notepad++ Helper: remove duplicate lines
http://stackoverflow.com/a/16293580/43408
https://stackoverflow.com/questions/3958350/removing-duplicate-rows-in-notepad
Since Notepad++ version 6 you can use this regex in the search and replace dialog (regular expression mode, with the ". matches newline" option enabled):
^(.*?)$\s+?^(?=.*^\1$)
and replace with *nothing*. For every set of duplicate rows, this keeps only the last occurrence in the file.
- ^ - matches the start of a line.
- (.*?) - matches any characters 0 or more times, but as few as possible.
  (It matches exactly one row; the lazy quantifier is needed because the ". matches newline" option would otherwise let . run past the end of the row.)
  The matched row is captured by the parentheses and can be referenced as \1.
- $ - matches the end of the line.
- \s+?^ - matches the whitespace characters (newlines!) up to the start of the next row.
  ==> This removes the newline after the matched row, so that no empty row is left behind after the replacement.
- (?=.*^\1$) - a positive lookahead assertion. This is the important part of the regex:
  a row is only matched (and removed) when exactly the same row appears again somewhere later in the file.
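
To try the pattern outside Notepad++, the same regex can be run through Python's re module. This is only a sketch under a few assumptions: the function name remove_earlier_duplicates is made up for illustration, the text uses "\n" line endings, and re.MULTILINE plus re.DOTALL stand in for Notepad++'s line-anchored ^/$ and its ". matches newline" option. Python's regex engine is not the one Notepad++ uses, so edge cases (blank lines, trailing whitespace) may behave differently.

```python
import re

# Same pattern as in the note above.
# re.MULTILINE: ^ and $ match at every line boundary.
# re.DOTALL: "." also matches newlines, so the lookahead can scan later lines.
DUPLICATE_ROW = re.compile(r"^(.*?)$\s+?^(?=.*^\1$)", re.MULTILINE | re.DOTALL)


def remove_earlier_duplicates(text: str) -> str:
    """Drop every row that appears again later, keeping the last occurrence."""
    return DUPLICATE_ROW.sub("", text)


if __name__ == "__main__":
    sample = "a\nb\na\nc\nb\n"
    # Rows 1 ("a") and 2 ("b") are repeated further down, so they are removed.
    print(remove_earlier_duplicates(sample), end="")  # prints the rows a, c, b
```

Because the lookahead rescans the rest of the text for every row, this approach can get slow on large files; for those, a plain loop over the lines with a set of already-seen rows is likely the better tool.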