Created
December 24, 2012 00:25
Bug with pentaho output servlet
var out = _step_.getTrans().getServletPrintWriter();
out.println("<H1>");
out.println("Hello World!");
out.println("ときょ 東京 コーヒー");
out.println("</H1>");
I would expect the following to be the correct way to create a print writer that is UTF-8 aware:
// Get a writer to write the data in UTF-8
res.setContentType("text/html; charset=UTF-8");
out = new PrintWriter(new OutputStreamWriter(res.getOutputStream(), "UTF8"), true);
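To show why the writer's charset matters, here is a minimal standalone Java sketch. It is not Pentaho or Jetty code: the class name CharsetWriterDemo and the encode helper are made up for illustration, and it assumes the servlet writer was falling back to ISO-8859-1 (the servlet default), which has not been confirmed for this bug. It writes the same text through two PrintWriters and compares the resulting bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Pentaho or Jetty.
class CharsetWriterDemo {

    // Write text through a PrintWriter bound to the given charset
    // and return the raw bytes that reached the underlying stream.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charset), true);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // "東京" written with Unicode escapes so the source file encoding does not matter.
        String text = "\u6771\u4eac";

        // UTF-8 keeps both characters, three bytes each.
        System.out.println(encode(text, StandardCharsets.UTF_8).length); // prints 6

        // ISO-8859-1 cannot represent them, so the encoder substitutes '?'.
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1)); // prints ??
    }
}
```

The second result matches the ????? symptom: the characters are lost when the writer encodes them, before anything reaches the browser.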
Found a hack to work around this problem. The print writer returned by the call is a Jetty HttpConnection$1, an inner class with a private field that references the underlying HttpConnection object. Using some hacky reflection we can grab the HttpConnection object, force the content type, and then ask for a print writer that actually writes UTF-8.
// hack to get access to raw print writer from jetty
var out = _step_.getTrans().getServletPrintWriter();
var f = out.getClass().getDeclaredField("this$0");
f.setAccessible(true);
// force output stream to UTF-8
var httpConnection = f.get(out);
httpConnection.getResponse().setContentType("application/octet-stream; charset=UTF-8");
httpConnection.getResponse().addHeader("Content-Disposition","attachment;filename=out.txt");
out = httpConnection.getPrintWriter("UTF-8");
out.println("kanji = " + kanji);
out.println("hiragana = " + hiragana);
out.println("katakana = " + katakana);
out.println("romanji = " + romanji);
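The reflection step can be tried outside Kettle with a small standalone Java sketch. The class OuterFieldDemo and its methods are hypothetical and stand in for Jetty's HttpConnection and its inner writer class. Note that this$0 is a javac naming convention for the synthetic reference to the enclosing instance, not a documented API, and newer compilers may omit the field when the inner class never uses its enclosing instance, so treat this as fragile.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for an outer class such as Jetty's HttpConnection.
class OuterFieldDemo {
    private final String name;

    OuterFieldDemo(String name) {
        this.name = name;
    }

    // Returns an anonymous inner class instance. It reads `name`,
    // so it has to keep a reference to its enclosing instance.
    Runnable newInner() {
        return new Runnable() {
            @Override
            public void run() {
                System.out.println(name);
            }
        };
    }

    // Read the synthetic this$0 field to recover the enclosing instance.
    static Object enclosingInstance(Object inner) {
        try {
            Field f = inner.getClass().getDeclaredField("this$0");
            f.setAccessible(true);
            return f.get(inner);
        } catch (ReflectiveOperationException e) {
            // Field missing or not accessible: the trick does not apply here.
            throw new IllegalStateException("no accessible this$0 field", e);
        }
    }

    public static void main(String[] args) {
        OuterFieldDemo outer = new OuterFieldDemo("connection");
        Object found = enclosingInstance(outer.newInner());
        System.out.println(found == outer);
    }
}
```

Because the hack depends on a compiler-generated field name and on Jetty internals, it can break on any Jetty or Pentaho upgrade.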
Hello
I have the same problem with the Kettle Data Integration interface.
I am trying to extract data from XML files and insert it into a database table.
For this, I wrote a transformation that contains 2 steps linked to each other:
- step 1: extract data from XML
- step 2: insert data into a table
I mapped the fields of the flow (coming from the XML) to the fields of the table. When I execute the transformation, the records are inserted into the table, but all Japanese characters are replaced by '?', just like you posted above.
Since this is done through the graphical interface, I have no idea how to apply your hack to fix this problem. Would you have an idea?
Thanks
The gist snippet (the one calling getServletPrintWriter() directly) will yield an incorrect response, with the Japanese characters replaced by ????? as if UTF-8 encoding were not set on the print writer in Pentaho's output stream. I tried setting the encoding in an environment variable, but this still failed.
This should yield the Japanese text intact (ときょ 東京 コーヒー).
Logged a defect in Pentaho:
http://jira.pentaho.com/browse/PDI-6123