@tef
Created November 30, 2024 23:46

to be clear: histogram diff doesn't work like patience diff

The JGit HistogramDiff documentation claims:

"By always selecting a LCS position with the lowest occurrence count, this algorithm behaves exactly like Bram Cohen's patience diff whenever there is a unique common element available between the two sequences."

This isn't true. Histogram behaves like patience only when every token that is unique in the first file is also unique in the second file. If a token that is unique in the first file is repeated in the second file, histogram doesn't work like patience at all:

% cat a
2
cat
1
dog
2
% cat b
1
dog
2
cat
2
dog
1

cat is the only value that is unique in both files; dog and 1 are unique in the first file but each appears twice in the second
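
To make the counts concrete, here is a rough Python sketch that tallies occurrences in the two example files. It is not how git implements either algorithm; the helper names `unique_in_both` and `unique_in_first_only` are made up for illustration. It only shows which lines qualify as "unique in both files" (what patience anchors on) versus "unique in the first file but repeated in the second" (where the two algorithms can part ways).

```python
from collections import Counter

# The two example files, one list entry per line.
a = ["2", "cat", "1", "dog", "2"]
b = ["1", "dog", "2", "cat", "2", "dog", "1"]


def unique_in_both(a, b):
    """Lines that occur exactly once in a AND exactly once in b."""
    count_a, count_b = Counter(a), Counter(b)
    return [line for line in a if count_a[line] == 1 and count_b[line] == 1]


def unique_in_first_only(a, b):
    """Lines that occur once in a but more than once in b."""
    count_a, count_b = Counter(a), Counter(b)
    return [line for line in a if count_a[line] == 1 and count_b[line] > 1]


print(unique_in_both(a, b))        # ['cat']
print(unique_in_first_only(a, b))  # ['1', 'dog']
```
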

% git diff --patience a b
diff --git a/a b/b
index eb9810a..eae195b 100644
--- a/a
+++ b/b
@@ -1,6 +1,8 @@
+1
+dog
 2
 cat
-1
-dog
 2
+dog
+1
% git diff --histogram a b
diff --git a/a b/b
index eb9810a..eae195b 100644
--- a/a
+++ b/b
@@ -1,6 +1,8 @@
-2
-cat
 1
 dog
 2
+cat
+2
+dog
+1
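
Both outputs are valid diffs of the same size (three lines kept, two deleted, four added); they just anchor on different lines. A small Python sketch, with a hypothetical helper `is_subsequence`, checks that the kept lines from each output above really are a common subsequence of both files:

```python
a = ["2", "cat", "1", "dog", "2"]
b = ["1", "dog", "2", "cat", "2", "dog", "1"]

# Context (unchanged) lines read off the two diffs above.
patience_kept = ["2", "cat", "2"]
histogram_kept = ["1", "dog", "2"]


def is_subsequence(sub, seq):
    """True if sub appears in seq in order (not necessarily contiguously)."""
    it = iter(seq)
    # 'item in it' advances the iterator, so order is enforced.
    return all(item in it for item in sub)


for kept in (patience_kept, histogram_kept):
    print(kept, is_subsequence(kept, a) and is_subsequence(kept, b))
```

Patience keeps the match around cat, the one line unique in both files; histogram keeps 1 and dog, which are unique only in the first file.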