@slightfoot
Created August 22, 2013 20:36
Comment Stripper.. not much else to say.
import java.io.IOException;
/**
 * Comment Stripper
 *
 * Usage: cat inputfile.blah | java CommentStripper > outputfile.blah
 *
 */
public class CommentStripper
{
	public static void main(String[] argv)
	{
		int in, i, state;
		// Row 0 lists the trigger characters; the last column of each row is the
		// default transition, taken when the current char has no non-zero entry.
		// A negative entry means "go to state |n| and drop the current char"; an
		// entry above 10 is a char code written in place of the current char
		// before the machine returns to state 1.
		int[][] states = {
			{'"','\'', '/', '*','\\','\n', '?',   0 }, // state index
			{  2,   3,  -4,   0,   0,   0,   0,   1 }, // state 1 - standard input
			{  1,   0,   0,   0,   0,   0,   0,   2 }, // state 2 - inside string
			{  0,   1,   0,   0,   0,   0,   0,   3 }, // state 3 - inside literal
			{  0,   0,  -5,  -6,  -7,   0,   0, '/' }, // state 4 - find comment
			{  0,   0,   0,   0,  -9,   1,   0,  -5 }, // state 5 - inside line comment
			{  0,   0,   0,  -7,   0,   0,   0,  -6 }, // state 6 - inside block comment
			{  0,   0, ' ',   0,   0,   0,   0,  -6 }, // state 7 - inside block comment ending
			{  0,   0,   0,   0,   0,  -4,   0,'\\' }, // state 8 - inside line continue for block comments
			{  0,   0,   0,   0,   0,  -5,   0,   5 }, // state 9 - inside line continue for line comments
		};
		state = 1;
		try{
			while((in = System.in.read()) != -1){
				// look for a trigger char that has a transition in the current state
				for(i = 0; i < 7; i++){
					if(in == states[0][i] && states[state][i] != 0){
						state = states[state][i];
						break;
					}
				}
				if(i == 7) state = states[state][7]; // else state
				if(state < 0) state = -state; // negative state tells us not to output char
				else{
					if(state > 10){ // states more than the state index are valid chars
						in = state;
						state = 1; // reset state
					}
					System.out.write(in);
				}
			}
		}
		catch(IOException e){
			System.err.println("Exception: " + e);
		}
	}
}
@haifenghuang

Very good code, but it is not strictly correct. When I run it with CommentStripper.java as input:
cat CommentStripper.java | /opt/java6/bin/java CommentStripper > b.log

The 16th line:
{'"','\'', '/', '*','\\','\n', '?', 0 }, // state index
becomes
{'"','\'', '/, '*','\\','\n', '?', 0 },

It should only remove "// state index", but it also changes '/' to '/.

I also tried some other source; here is the result:

#include <sys/stat.h>

becomes
#include <sys/tat.h>
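
Tracing the table by hand, both results appear to have the same two causes. State 4's default entry is '/', so when the character after a slash does not start a comment, the slash is written in place of that character and the character itself is lost (the 's' in sys/stat.h). States 2 and 3 have no backslash handling, so '\'' closes the literal at the escaped quote and the following '/' is then read outside a literal. Below is a minimal sketch of one way to avoid both, assuming a hand-written switch instead of the lookup table. The class name CommentStripperSketch, the strip method and the State enum are made up for illustration and are not the gist author's code. Line continuations and trigraphs are deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical rewrite for illustration only; not the gist author's code.
final class CommentStripperSketch {
    private enum State { CODE, SLASH, STRING, STRING_ESC, CHAR, CHAR_ESC, LINE, BLOCK, BLOCK_STAR }

    public static String strip(String src) {
        StringBuilder out = new StringBuilder(src.length());
        State state = State.CODE;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            switch (state) {
                case SLASH:
                    if (c == '/') { state = State.LINE; break; }
                    if (c == '*') { state = State.BLOCK; break; }
                    out.append('/');     // not a comment: emit the held slash
                    state = State.CODE;  // and still process c as ordinary code
                    // fall through
                case CODE:
                    if (c == '/') { state = State.SLASH; break; } // hold until next char
                    if (c == '"') state = State.STRING;
                    else if (c == '\'') state = State.CHAR;
                    out.append(c);
                    break;
                case STRING:
                    out.append(c);
                    if (c == '\\') state = State.STRING_ESC;
                    else if (c == '"') state = State.CODE;
                    break;
                case STRING_ESC:         // char after a backslash never ends the string
                    out.append(c);
                    state = State.STRING;
                    break;
                case CHAR:
                    out.append(c);
                    if (c == '\\') state = State.CHAR_ESC;
                    else if (c == '\'') state = State.CODE;
                    break;
                case CHAR_ESC:           // char after a backslash never ends the literal
                    out.append(c);
                    state = State.CHAR;
                    break;
                case LINE:               // drop everything up to the newline, keep the newline
                    if (c == '\n') { out.append(c); state = State.CODE; }
                    break;
                case BLOCK:
                    if (c == '*') state = State.BLOCK_STAR;
                    break;
                case BLOCK_STAR:         // a run of stars keeps waiting for the closing slash
                    if (c == '/') { out.append(' '); state = State.CODE; }
                    else if (c != '*') state = State.BLOCK;
                    break;
            }
        }
        if (state == State.SLASH) out.append('/'); // input ended right after a lone slash
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = System.in.read(chunk)) != -1) buf.write(chunk, 0, n);
        // ISO-8859-1 maps each byte to exactly one char, so non-ASCII bytes pass through unchanged.
        String src = new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
        byte[] result = strip(src).getBytes(StandardCharsets.ISO_8859_1);
        System.out.write(result, 0, result.length);
        System.out.flush();
    }
}
```

Once compiled, it is used the same way as the original: cat inputfile.blah | java CommentStripperSketch > outputfile.blah. Like the original, it replaces each block comment with a single space and keeps the newline that ends a line comment.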
