Sunday, February 8, 2009

UTF-8 and Hebrew in Tomcat

In Tomcat 5.0 and above, if your UTF-8 request parameters are received as gibberish, you might need to do the following:

In your server.xml, add the URIEncoding="UTF-8" and useBodyEncodingForURI="true" attributes to the Connector tag(s):

<Connector
    useBodyEncodingForURI="true"
    URIEncoding="UTF-8"
    acceptCount="100"
    enableLookups="false"
    maxSpareThreads="75"
    maxThreads="150"
    minSpareThreads="25"
    port="8080"
    redirectPort="8443" />

This should make GET requests work properly.
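
To see where the gibberish comes from, here is a minimal sketch that uses only the JDK (no Tomcat involved). It percent-encodes a Hebrew word as UTF-8, the way a browser typically puts it in a query string, and then decodes it once as ISO-8859-1 and once as UTF-8. ISO-8859-1 is assumed here to be what the Connector falls back to when URIEncoding is not set, as in older Tomcat versions. The class name UriDecodingDemo and its helper methods are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; not part of Tomcat or of the filter in this post.
class UriDecodingDemo {

    // Percent-encode a value as UTF-8, as a browser typically does for a query string.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decode percent-escapes, interpreting the resulting bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u05E9\u05DC\u05D5\u05DD"; // Hebrew "shalom", 4 characters
        String onTheWire = encodeUtf8(original);
        System.out.println(onTheWire); // %D7%A9%D7%9C%D7%95%D7%9D

        // Wrong charset: each of the 8 UTF-8 bytes becomes its own character.
        String wrong = decode(onTheWire, "ISO-8859-1");
        // Right charset: every pair of bytes becomes one Hebrew letter again.
        String right = decode(onTheWire, "UTF-8");
        System.out.println(wrong.length());         // 8
        System.out.println(right.equals(original)); // true
    }
}
```

The bytes on the wire are identical in both cases; only the charset chosen at decode time differs, which is the choice the URIEncoding attribute makes for the URL.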

The above does not fix POST requests: URIEncoding only covers the URL and its query string, while POST parameters arrive in the request body, which is decoded separately. If you ask the Tomcat people they'll mumble something about W3C, RFC, and RTFM. The short way to make this work for POST requests is to write a small filter that sets the request encoding properly. We are using something similar to this:

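package com.realcommerce.filters;

import java.io.IOException;
import javax.servlet.*;

// Forces UTF-8 decoding of request parameters for every request it is mapped to.
public class RequestEncodingFilter implements Filter {

    public void init(FilterConfig filterConfig) throws ServletException {
        // Nothing to initialize.
    }

    public void destroy() {
        // Nothing to clean up.
    }

    public void doFilter(ServletRequest request, ServletResponse response,
            FilterChain chain) throws IOException, ServletException {
        // Must be set before anything reads the request parameters.
        request.setCharacterEncoding("UTF-8");
        // Hand the request to the next filter (or servlet) in the chain; this is not a recursive call.
        chain.doFilter(request, response);
    }
}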

This made POST requests pass Hebrew (or any UTF-8) parameters properly.
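
Why the filter has to run before anything else touches the request can be sketched without a servlet container. The Servlet API documents that setCharacterEncoding must be called before request parameters are read, because the body is decoded with whatever encoding is in effect at that moment. The class below is a simplified, hypothetical stand-in for that parsing step (FormBodyDemo and parse are made-up names, and this is not Tomcat's actual code); it assumes a standard application/x-www-form-urlencoded body.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a container's form parsing; not Tomcat's actual code.
class FormBodyDemo {

    // Parse an application/x-www-form-urlencoded body using the given charset.
    static Map<String, String> parse(String body, String charset) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        try {
            for (String pair : body.split("&")) {
                if (pair.isEmpty()) {
                    continue; // skip empty segments such as a trailing '&'
                }
                int eq = pair.indexOf('=');
                String name = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                params.put(URLDecoder.decode(name, charset), URLDecoder.decode(value, charset));
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
        return params;
    }

    public static void main(String[] args) {
        // The same bytes a browser would POST for name=<Hebrew "shalom">, lang=he.
        String body = "name=%D7%A9%D7%9C%D7%95%D7%9D&lang=he";

        Map<String, String> latin1 = parse(body, "ISO-8859-1"); // no encoding set: gibberish
        Map<String, String> utf8 = parse(body, "UTF-8");        // encoding set first: correct

        System.out.println(latin1.get("name").length()); // 8
        System.out.println(utf8.get("name").length());   // 4
        System.out.println(utf8.get("lang"));            // he
    }
}
```

Plain ASCII values such as lang=he come out the same either way, which is why the problem only shows up once Hebrew (or any other non-ASCII text) is submitted.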

13 comments:

  1. I don't get it. You gave us a piece of code, but I can't understand where to put it, and moreover the doFilter function calls itself endlessly.

  2. Dear Anonymous.
    1) I appreciate your polite response.
    2) If you don't know what a Tomcat filter is, then try http://lmgtfy.com/?q=tomcat+filter.
    3) To use this filter, just open your favorite IDE (vi), paste the code, save it as a .java file, compile it, copy the .class file to your Tomcat project, then edit web.xml and add the filter there.
    4) The doFilter method does not call itself; it calls the next filter in the filter chain.

    Replies
    1. Thanks, man, for adding these lines too. Very clear now, thank you.

  3. Nir:

    Thanks for writing this up. It was very helpful and, contrary to Anonymous's comments, I felt it was very clear and succinct.

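  4. Hey, how can I install a filter in web.xml? I managed to compile the code, but how do I add it to web.xml?

    I added something like this:

    <filter>
      <filter-name>CharacterContentFilter</filter-name>
      <filter-class>CharacterContentFilter</filter-class>
    </filter>

    <filter-mapping>
      <filter-name>CharacterContentFilter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>

    I am kind of a newbie, so I am not sure if I am correct.

    Looking forward to your response.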

  5. Thanks. You really helped me.
    I added
    request.setCharacterEncoding("UTF-8");
    in the JSP,
    with no filters or anything, and it simply helped.
    Thanks

    Replies
    1. You guys rock!! I spent hours messing with this!!!
      Thanks!!!!

    2. Just one thing: what is written above, that it works for GET, does not work. Is there a way to sort this out?

    3. Do what I said in the post. If you do everything that's written there, step by step, everything will work.
      If you only do part of it, then not everything will work.
      That's life.. :)

    4. Bro, I've been stuck in this crap for a whole day.
      Wow, you made my day!!!!

      request.setCharacterEncoding("UTF-8");
      works like a charm

  6. Thank you Nir!!!!! just added:
    request.setCharacterEncoding("UTF-8");
    and it works perfectly.
    :-)

  7. Hi, thanks for the post. For some reason it's not working for me. I did all of the above and Tomcat is still turning my Hebrew GET params into ????. Is there anything else I might try?

  8. Gal, did you add UTF-8 encoding parameters to your @page directive? I don't have the syntax handy, but a simple Google search should help.

