-
-
Save sebastien-p/1024553 to your computer and use it in GitHub Desktop.
// Please, see http://rosettacode.org/wiki/LZW_compression#JavaScript | |
// and http://en.wikipedia.org/wiki/Lempel–Ziv–Welch for more infos. | |
function ( | |
a // String to compress and placeholder for 'wc'. | |
){ | |
for ( | |
var b = a + "Ā", // Append first "illegal" character (charCode === 256). | |
c = [], // dictionary | |
d = 0, // dictionary size | |
e = d, // iterator | |
f = c, // w | |
g = c, // result | |
h; // c | |
h = b.charAt(e++); | |
) | |
c[h] = h.charCodeAt(), // Fill in the dictionary ... | |
f = 1 + c[a = f + h] ? a : (g[d++] = c[f], c[a] = d + 255, h); // ... and use it to compress data. | |
return g // Array of compressed data. | |
} |
function(a){for(var b=a+"Ā",c=[],d=0,e=d,f=c,g=c,h;h=b.charAt(e++);)c[h]=h.charCodeAt(),f=1+c[a=f+h]?a:(g[d++]=c[f],c[a]=d+255,h);return g} |
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE | |
Version 2, December 2004 | |
Copyright (C) 2011 Sebastien P. https://twitter.com/#!/_sebastienp | |
Special thanks to @subzey (you rock) and @kbjr ! | |
Everyone is permitted to copy and distribute verbatim or modified | |
copies of this license document, and changing it is allowed as long | |
as the name is changed. | |
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE | |
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION | |
0. You just DO WHAT THE FUCK YOU WANT TO. |
{ | |
"name": "LZWcompress", | |
"description": "JavaScript implementation of the Lempel–Ziv–Welch universal lossless data compression algorithm.", | |
"keywords": [ | |
"LZW", | |
"lossless", | |
"data", | |
"compression" | |
] | |
} |
<!DOCTYPE html> | |
<title>Foo</title> | |
<div>Expected value: <b>84,79,66,69,79,82,78,79,84,256,258,260,265,259,261,263</b></div> | |
<div>Actual value: <b id="ret"></b></div> | |
<script> | |
var LZWcompress = function(a){for(var b=a+"Ā",c=[],d=0,e=d,f=c,g=c,h;h=b.charAt(e++);)c[h]=h.charCodeAt(),f=1+c[a=f+h]?a:(g[d++]=c[f],c[a]=d+255,h);return g} | |
document.getElementById("ret").innerHTML = LZWcompress("TOBEORNOTTOBEORTOBEORNOT") | |
</script> |
And we also have to correct the "\0FOOBAR" bug (https://gist.github.com/1024553#gistcomment-35550) ...
@kbjr : thanks !
My suggestion:
function(a){a+='ħ';for(var b={},c=256,d,e,f=d=e=[],g,h;g=a[d++];)b[g]=g.charCodeAt(),e=1+b[h=e+g]?h:(f.push(b[e]),b[h]=c++,g);return f}
Please note that there is 135 chars but 136 bytes due to ħ.
e = 1 + b[h = e + g] ? … : …
fixes "\0FOOBAR" bug:
b[h=e+g]
is presumably non-negative. After adding 1 we get positive value («true») if operand was a number or NaN («false») if it was undefined.
Then, instead of using f.push(b[e])
twice we can just add extra «illegal» char with charCode > 255 to encoded string that in conjuction with previous char(s) cannot be in dictionary. And then drop the push of last char.
'ħ' is picked randomly as a tribute to Max Planck :)
@subzey : you rock !
The only thing left is to make enough room to write g = a.charAt(d++);
instead of g = a[d++];
!
function(h){for(var a=h+'ħ',b=[],c=0,d=b,e=b,f=b,g;g=a.charAt(d++);)b[g]=g.charCodeAt(),e=1+b[h=e+g]?h:(f[c++]=b[e],b[h]=c+255,g);return f}
139 chars, 140 bytes, works perfectly well in IE6, IE8, IE9 (didn't test in native IE7)
Optimizations applied:
c
is initially 0, not 256, used post-decrement instead of pre-decrementpush
changed to index-based assignment as ifc
starts with 0, index is always equal toc
- Rearranged variables declaration and «illegal» char appending, stripping one byte
@subzey : Wonderful ! IE7 should not be a problem. I did something like that to say goodbye to push
but gave up on it in the end.
Well done guys, this truly is awesome
you can remove the 0 in .charCodeAt(0) because the undefined will coerce to 0