-
-
Save sebastien-p/1024553 to your computer and use it in GitHub Desktop.
// Please, see http://rosettacode.org/wiki/LZW_compression#JavaScript | |
// and http://en.wikipedia.org/wiki/Lempel–Ziv–Welch for more infos. | |
function ( | |
a // String to compress and placeholder for 'wc'. | |
){ | |
for ( | |
var b = a + "Ā", // Append first "illegal" character (charCode === 256). | |
c = [], // dictionary | |
d = 0, // dictionary size | |
e = d, // iterator | |
f = c, // w | |
g = c, // result | |
h; // c | |
h = b.charAt(e++); | |
) | |
c[h] = h.charCodeAt(), // Fill in the dictionary ... | |
f = 1 + c[a = f + h] ? a : (g[d++] = c[f], c[a] = d + 255, h); // ... and use it to compress data. | |
return g // Array of compressed data. | |
} |
function(a){for(var b=a+"Ā",c=[],d=0,e=d,f=c,g=c,h;h=b.charAt(e++);)c[h]=h.charCodeAt(),f=1+c[a=f+h]?a:(g[d++]=c[f],c[a]=d+255,h);return g} |
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE | |
Version 2, December 2004 | |
Copyright (C) 2011 Sebastien P. https://twitter.com/#!/_sebastienp | |
Special thanks to @subzey (you rock) and @kbjr ! | |
Everyone is permitted to copy and distribute verbatim or modified | |
copies of this license document, and changing it is allowed as long | |
as the name is changed. | |
DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE | |
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION | |
0. You just DO WHAT THE FUCK YOU WANT TO. |
{ | |
"name": "LZWcompress", | |
"description": "JavaScript implementation of the Lempel–Ziv–Welch universal lossless data compression algorithm.", | |
"keywords": [ | |
"LZW", | |
"lossless", | |
"data", | |
"compression" | |
] | |
} |
<!DOCTYPE html> | |
<title>Foo</title> | |
<div>Expected value: <b>84,79,66,69,79,82,78,79,84,256,258,260,265,259,261,263</b></div> | |
<div>Actual value: <b id="ret"></b></div> | |
<script> | |
var LZWcompress = function(a){for(var b=a+"Ā",c=[],d=0,e=d,f=c,g=c,h;h=b.charAt(e++);)c[h]=h.charCodeAt(),f=1+c[a=f+h]?a:(g[d++]=c[f],c[a]=d+255,h);return g} | |
document.getElementById("ret").innerHTML = LZWcompress("TOBEORNOTTOBEORTOBEORNOT") | |
</script> |
@kbjr : thanks !
My suggestion:
function(a){a+='ħ';for(var b={},c=256,d,e,f=d=e=[],g,h;g=a[d++];)b[g]=g.charCodeAt(),e=1+b[h=e+g]?h:(f.push(b[e]),b[h]=c++,g);return f}
Please note that there is 135 chars but 136 bytes due to ħ.
e = 1 + b[h = e + g] ? … : …
fixes "\0FOOBAR" bug:
b[h=e+g]
is presumably non-negative. After adding 1 we get positive value («true») if operand was a number or NaN («false») if it was undefined.
Then, instead of using f.push(b[e])
twice we can just add extra «illegal» char with charCode > 255 to encoded string that in conjuction with previous char(s) cannot be in dictionary. And then drop the push of last char.
'ħ' is picked randomly as a tribute to Max Planck :)
@subzey : you rock !
The only thing left is to make enough room to write g = a.charAt(d++);
instead of g = a[d++];
!
function(h){for(var a=h+'ħ',b=[],c=0,d=b,e=b,f=b,g;g=a.charAt(d++);)b[g]=g.charCodeAt(),e=1+b[h=e+g]?h:(f[c++]=b[e],b[h]=c+255,g);return f}
139 chars, 140 bytes, works perfectly well in IE6, IE8, IE9 (didn't test in native IE7)
Optimizations applied:
c
is initially 0, not 256, used post-decrement instead of pre-decrementpush
changed to index-based assignment as ifc
starts with 0, index is always equal toc
- Rearranged variables declaration and «illegal» char appending, stripping one byte
@subzey : Wonderful ! IE7 should not be a problem. I did something like that to say goodbye to push
but gave up on it in the end.
Well done guys, this truly is awesome
And we also have to correct the "\0FOOBAR" bug (https://gist.github.com/1024553#gistcomment-35550) ...