@bjhomer
Last active August 29, 2015 14:05
Semantically poor lines shouldn't match
context
context
-AAAA
-AAAA
-AAAA
+BBBB
+BBBB
+BBBB
+BBBB
{
- CCCC
- CCCC
- CCCC
+ DDDD
+ DDDD
+ DDDD
+ DDDD
- EEEE
- EEEE
+ FFFF
+ FFFF
}
context
context

context
context
-AAAA
-AAAA
-AAAA
-{
- CCCC
- CCCC
- CCCC
-
- EEEE
- EEEE
+BBBB
+BBBB
+BBBB
+BBBB
+{
+ DDDD
+ DDDD
+ DDDD
+ DDDD
+
+ FFFF
+ FFFF
}
context
context

bjhomer commented Aug 26, 2014

Proposed algorithm

When building a sequence of changed lines, a "semantically poor" line (containing only punctuation and whitespace) shall not terminate the sequence unless it is followed by a "semantically rich" line that is unchanged from the source to the destination.
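A rough sketch of how this rule could be applied as a post-processing pass over an existing line diff. This is only an illustration, not the implementation behind the diffs above: the `(tag, line)` representation and the names `is_semantically_poor` and `merge_poor_lines` are hypothetical, "punctuation" is assumed to mean ASCII punctuation, and a poor line at the very end of the input is assumed to terminate the sequence.

```python
import string


def is_semantically_poor(line: str) -> bool:
    """True if the line holds only punctuation and whitespace (or is empty)."""
    return all(ch in string.punctuation or ch.isspace() for ch in line)


def merge_poor_lines(ops):
    """Absorb unchanged "poor" lines that sit between two changes.

    `ops` is a list of (tag, line) pairs, where tag is ' ' (unchanged),
    '-' (removed) or '+' (added). This format is an assumption made
    for the sketch.
    """
    out = []
    i, n = 0, len(ops)
    while i < n:
        if ops[i][0] == ' ':
            out.append(ops[i])
            i += 1
            continue

        # Start of a changed sequence: collect removals and additions.
        removed, added = [], []
        while i < n:
            tag, line = ops[i]
            if tag == '-':
                removed.append(line)
                i += 1
            elif tag == '+':
                added.append(line)
                i += 1
            else:
                # Find the run of unchanged poor lines starting here.
                j = i
                while j < n and ops[j][0] == ' ' and is_semantically_poor(ops[j][1]):
                    j += 1
                # Terminate on a rich unchanged line, on a poor run that is
                # followed by a rich unchanged line, or at end of input.
                if j == i or j == n or ops[j][0] == ' ':
                    break
                # The poor run is followed by another change: absorb it by
                # treating each poor line as both removed and re-added.
                for _, poor in ops[i:j]:
                    removed.append(poor)
                    added.append(poor)
                i = j

        out.extend(('-', line) for line in removed)
        out.extend(('+', line) for line in added)
    return out
```

Fed the first diff above as input, this pass pulls the opening `{` into the changed sequence and leaves the closing `}` as context, since that brace is followed by an unchanged, semantically rich line.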

bjhomer commented Aug 26, 2014

Ideally, that last curly brace would be split as well, but I haven't found a good algorithm to handle it in a language-agnostic way. This is still just a text diff, entirely unaware of the semantics of the language; it's just prioritizing certain classes of text (i.e. non-punctuation) above other kinds.
