Created
March 2, 2011 19:56
-
-
Save cslarsen/851611 to your computer and use it in GitHub Desktop.
Linear scan for palindromes
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* | |
* Find the longest palindrome in the text. | |
* | |
* This is Greplin's first challenge, and I originally solved it in Python. | |
* | |
* This algorithm is linear on the average input, but has a quadratic | |
* worst case running time. There exists an even better algorithm, but | |
* this should do. | |
* | |
* There might also be a small error below, but you get the general idea. | |
* | |
* Christian Stigen Larsen | |
*/ | |
#include <stdio.h> | |
static const char text[] = | |
"Fourscoreandsevenyearsagoourfaathersbroughtforthonthiscontainentanewnati" | |
"onconceivedinzLibertyanddedicatedtothepropositionthatallmenarecreatedequ" | |
"alNowweareengagedinagreahtcivilwartestingwhetherthatnaptionoranynartions" | |
"oconceivedandsodedicatedcanlongendureWeareqmetonagreatbattlefiemldoftzha" | |
"twarWehavecometodedicpateaportionofthatfieldasafinalrestingplaceforthose" | |
"whoheregavetheirlivesthatthatnationmightliveItisaltogetherfangandpropert" | |
"hatweshoulddothisButinalargersensewecannotdedicatewecannotconsecrateweca" | |
"nnothallowthisgroundThebravelmenlivinganddeadwhostruggledherehaveconsecr" | |
"ateditfaraboveourpoorponwertoaddordetractTgheworldadswfilllittlenotlenor" | |
"longrememberwhatwesayherebutitcanneverforgetwhattheydidhereItisforusthel" | |
"ivingrathertobededicatedheretotheulnfinishedworkwhichtheywhofoughthereha" | |
"vethusfarsonoblyadvancedItisratherforustobeherededicatedtothegreattdafsk" | |
"remainingbeforeusthatfromthesehonoreddeadwetakeincreaseddevotiontothatca" | |
"useforwhichtheygavethelastpfullmeasureofdevotionthatweherehighlyresolvet" | |
"hatthesedeadshallnothavediedinvainthatthisnationunsderGodshallhaveanewbi" | |
"rthoffreedomandthatgovernmentofthepeoplebythepeopleforthepeopleshallnotp" | |
"erishfromtheearth"; | |
void print_best(const char* l, const char* r) | |
{ | |
static int best = 2; | |
if ( r-l > best ) | |
printf("%.*s\n", (best=r-l), l); | |
} | |
int main() | |
{ | |
const char *l, *r; | |
for ( const char *p = text + 1; *p; ++p ) { | |
l = p; r = p - 1; | |
while ( *--l == *++r ); | |
print_best(l+1, r); | |
l = r = p; | |
while ( *--l == *++r ); | |
print_best(l+1, r); | |
} | |
return 0; | |
} |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The above code is quite fast as well. It runs through the complete works of Shakespeare in about 0.03 seconds of wall clock time, or a processing rate of 128 Mb/second, and grows linearly with the input. It has a worst case running time which is quadratic, though; if you feed it a string of same characters, for instance.
To find the palindromes in Shakespeare's collected works, I first modified the code to load the text from disk. I changed
print_best
to print all palindromes equal to or longer than the current best, so we get a longer list of palindromes. I also had to prepare Shakespeare's collected works into a format suitable for processing: First I removedthe name of which character is speaking (done by sed), then I converted all text to lowercase and deleted all non-alphanumeric characters.
Since Shakespeare wrote his plays more than 300 years before copyright law was invented, you can download and use it freely. I got mine
off a site with all the stuff in different directories. Here's what I did to prepare them, all in one jolly line of unix goodness:
I compiled my code and ran it on the file:
The longest palindromes appear in the lines:
from the play Richard III. The preparsing of the collected works above includes the character's name. Without it, the output is a bit different:
The second last character-by-character palindrome comes from the tragedy Troilus and Cressida: