@gvlx
Created September 2, 2016 12:27
Notepad++ Helper: remove duplicate lines
http://stackoverflow.com/a/16293580/43408
https://stackoverflow.com/questions/3958350/removing-duplicate-rows-in-notepad
Since Notepad++ Version 6 you can use this regex in the search and replace dialogue:
^(.*?)$\s+?^(?=.*^\1$)
and replace with *nothing*. This keeps only the last occurrence of each duplicated row in the file.
- ^ - matches the start of the line.
- (.*?) - matches any characters 0 or more times, but as few as possible
(It matches exactly one row; the lazy quantifier is needed because of the ". matches newline" option).
The matched row is captured by the surrounding parentheses and is accessible using \1
- $ - matches the end of the line.
- \s+?^ - this part matches all whitespace characters (newlines!) till the start of the next row
==> This removes the newlines after the matched row, so that no empty row is left after the replacement.
- (?=.*^\1$) - this is a positive lookahead assertion. This is the important part of this regex:
a row is only matched (and removed) when exactly the same row follows somewhere later in the file.
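
The behaviour described above can be tried outside Notepad++ with a small Python sketch. This is only an approximation: Python's `re` module is not the engine Notepad++ uses, the flags `re.MULTILINE` and `re.DOTALL` are assumed stand-ins for the dialog's line-anchor behaviour and the ". matches newline" option, the input is assumed to use plain `\n` line endings, and the function name `remove_duplicate_lines` is made up for illustration.

```python
import re

# Same pattern as above. re.MULTILINE makes ^ and $ match at line boundaries;
# re.DOTALL plays the role of the ". matches newline" option (an assumption
# about how the Notepad++ dialog settings map onto Python flags).
DUPLICATE_LINE = re.compile(r"^(.*?)$\s+?^(?=.*^\1$)", re.MULTILINE | re.DOTALL)


def remove_duplicate_lines(text: str) -> str:
    """Drop every line that reappears later, keeping its last occurrence."""
    # Replacing each match with "" mirrors "replace with nothing" in the dialog.
    return DUPLICATE_LINE.sub("", text)


if __name__ == "__main__":
    sample = "a\nb\na\nc\nb\n"
    print(repr(remove_duplicate_lines(sample)))  # 'a\nc\nb\n'
```

In the sample, the first `a` and the first `b` are removed because identical rows follow later, so the surviving order is that of the last occurrences. Because the lookahead rescans the rest of the text for every row, this approach is likely to be slow on very large inputs.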