Sunday, October 4, 2009

How to do rewrites in an IIS ISAPI

Note to self: If ever you need to do a rewrite in an IIS ISAPI Filter you just need to re-set the pHeaderInfo's url header. Something like this:

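strNewUrl = GET_REWRITE_FOR_URL(strUrl);
if ( strNewUrl.IsEmpty() == FALSE ) {
    // pfc is assumed to be the PHTTP_FILTER_CONTEXT your filter received
    pHeaderInfo->SetHeader(pfc, "url", (LPTSTR)(LPCTSTR)strNewUrl);
}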

This works in IIS4 and above.

Note that if you really need a good rewrite/redirect filter for IIS you should look at IIRF, but since it does not work on IIS4 I had to do this manually.

Thursday, June 18, 2009

Bing, Google, Nutch and Canonical URLs

Google announced a few months back that they started supporting canonical URLs. This is a great feature that we have already adopted on a couple of sites (mostly to keep tracking codes from messing with our search engine results).

Unfortunately, although Microsoft announced they would support this feature in Live, it has not yet been implemented in Bing as of now (Jun 2009).

Nutch, on the other hand, has this on their TODO list for 1.1.

Also, Google Search Appliance (GSA) is currently assumed by all not to support this (although nobody really knows if it does).
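To make the tracking-code case concrete, here is a minimal Python sketch that strips tracking parameters from a URL and builds the canonical link tag for the page head. The parameter names in TRACKING_PARAMS and the example.com URL are assumptions for illustration only, not what we actually use.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters; replace with whatever your campaigns use.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}

def canonical_url(url):
    """Return url without tracking parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the given page URL."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href

if __name__ == "__main__":
    print(canonical_link_tag("http://www.example.com/page?id=7&utm_source=newsletter"))
```

Every tracked variant of a page then points the crawler at the same clean URL.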

More about canonical URLs:

Wednesday, June 3, 2009

apc futex_wait lockdown make your apache freeze over

We had the typical LAMP setup going on, with Drupal as the base CMS and APC for bytecode cache. We needed a good caching engine so I figured why not use APC's user cache. Well, we tried the APC Cache Drupal Module which, with minor fixes, proved to work very nicely. That is, until we actually put the whole thing into production.

The first thing that happened was our apache hanging and not responding to any user requests. We suspected network issues, especially since netstat -na showed that all the apache processes were hanging on SYN_WAIT. However, since an apache restart solved the issue I started to suspect this was something else.

To make a long (very long) story short, I got strace on our prod machines and found out that apache was either hanging on futex(..., FUTEX_WAIT, ...) or looping infinitely on the same calls.

To make even a longer story short, I got gdb installed on those machines and the backtrace clearly indicated that the locks were from APC user-cache calls.

We decided to abandon APC user-cache and switch to memcached, which proved faster and had fewer lockdowns.

The funny thing is that when we talked about this over dinner the same evening, a developer from another team pointed me to this article by one of the APC leads (or something): How to Dismantle an APC Bomb, which has been around for over a year. I am surprised and shocked that such information is hidden so well and not mentioned anywhere in the docs. Moreover, I went through the APC code again after reading this post (I went through it once when I started analysing the problem) and it seems that this is not even close to being resolved. There are no patches, no TODOs, nothing of the sort. From reading the code, the entire user-cache needs a major rewrite. What gives?

(This is a post I wrote a couple of months ago and never had time to finish. Unfortunately this is still not fixed afaik.)
EDIT (07/2009): This has been reported as fixed. If anyone can confirm this, please send me a note so that I can update this post.

Thursday, May 14, 2009

GA_googleAddAttr is not defined error

If you are using Google Ad Manager you can define custom attributes for targeting.

If you are getting the JavaScript error "GA_googleAddAttr is not defined", just make sure to put the call(s) to GA_googleAddAttr in a separate <script> block, after the one that calls GS_googleEnableAllServices.

<script type="text/javascript">
GS_googleEnableAllServices();
</script>
<script type="text/javascript">
GA_googleAddAttr("ATTRB1", "whatever");
</script>
<script type="text/javascript">
GA_googleAddSlot("ca-pub-0000000000", "slotname_216x311");
</script>

Monday, March 23, 2009

V6 DECODE64 not working properly

It turns out that after all these years, V6's built-in DECODE64 and ENCODE64 actually do not work properly with some binary data.

package require base64; #from tcllib
proc COMPARE_BASE64 {} {
    set code {2lIU1ZNw5BYvKi79j4L/+GGmjAQK7uiBQG7elDdKZcE=}
    set vgn_result [DECODE64 $code]; #VGN built-in function
    set tcl_result [::base64::decode $code]; #TclLib tcl-only implementation
    return [join [list \
        [BIN2HEX $vgn_result] \
        [BIN2HEX $tcl_result] \
        [string compare $vgn_result $tcl_result ] \
    ] "\r\n"]
}
proc BIN2HEX { text } { binary scan $text H* result; return $result }

--- result ----

If you need to encode/decode base64 for use in non-Vignette systems, make sure you use tcllib's base64.

This was tested on Vignette's V6 but I am sure it's true for Storyserver 4.2 and V5 as well.
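If you want a third opinion from outside Tcl, a minimal Python sketch like the one below decodes the same test string with the standard library and prints it as hex, so it can be compared against the BIN2HEX output above. The function name decode_to_hex is mine; no expected hex is shown here on purpose, run it and compare.

```python
import base64
import binascii

# Same test string as in COMPARE_BASE64 above.
CODE = "2lIU1ZNw5BYvKi79j4L/+GGmjAQK7uiBQG7elDdKZcE="

def decode_to_hex(code):
    """Decode a base64 string and return the raw bytes as lowercase hex."""
    raw = base64.b64decode(code, validate=True)
    return binascii.hexlify(raw).decode("ascii")

if __name__ == "__main__":
    print(decode_to_hex(CODE))
```

The decoded value is 32 bytes long, so a correct decoder should give 64 hex characters.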

Thursday, March 19, 2009

MySQL indexes behaving badly

If you have a MySQL query that for no apparent reason stops working or no longer behaves as it used to, the first thing to do is EXPLAIN it. If the explain is unreasonable (for example, using different indexes than it used to, or using an index but still scanning millions of rows) then you should try to CHECK TABLE.

One thing that I accidentally found out was that CHECK TABLE also updates the key statistics. This made many problems magically disappear.

A few things that should be noted:
  1. CHECK TABLE will only update key statistics for MyISAM tables; for InnoDB use OPTIMIZE TABLE, or ANALYZE TABLE if all you want is to refresh the statistics.
  2. If you use OPTIMIZE TABLE to update the key statistics you should be aware that OPTIMIZE TABLE does other (good) things as well, so it usually takes much longer and it locks your table while doing it. Also, for InnoDB it actually rebuilds the table using ALTER TABLE.
  3. CHECK TABLE has several options. Only MEDIUM (the default) and EXTENDED update the key statistics (I think).
So, if your indexes stop working, or just misbehave, try to check them.
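The rule of thumb from the list above can be written down as a tiny Python helper that only builds the statement text. This is a sketch: the function name and the table name in the usage line are made up, and you would still run the statement through whatever MySQL client you normally use.

```python
def stats_refresh_statement(table, engine):
    """Return the statement this post suggests for refreshing key statistics."""
    # Quote the identifier and double any backtick inside it.
    quoted = "`%s`" % table.replace("`", "``")
    engine = engine.lower()
    if engine == "myisam":
        # MEDIUM is the default option; spelled out here for clarity.
        return "CHECK TABLE %s MEDIUM" % quoted
    if engine == "innodb":
        # Rebuilds the table, so expect it to take a while.
        return "OPTIMIZE TABLE %s" % quoted
    raise ValueError("no rule in this post for engine %r" % engine)

if __name__ == "__main__":
    print(stats_refresh_statement("orders", "MyISAM"))
```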

Friday, February 20, 2009

Vista fast user switching disappeared

The fast user switching option on my Vista Home machine suddenly disappeared. I remember it was there, then it was not. I don't know why this happened, but here is how to fix it.

A rough translation:

Run regedit.exe
Go to HKEY_LOCAL_MACHINE > SOFTWARE > Microsoft > Windows > CurrentVersion > Policies > System
Look for a value named HideFastUserSwitching. If it exists, change its data to 0 (zero). If it does not exist, create it as a DWORD and set it to zero.
You might need to log off before the change takes effect.

Sunday, February 8, 2009

utf8 and hebrew in tomcat

In tomcat 5.0 and above, if your UTF-8 request parameters are received as gibberish you might need to do the following:

In your server.xml add the URIEncoding="UTF-8" and useBodyEncodingForURI="true" to the Connector tag(s):


<Connector ... URIEncoding="UTF-8" useBodyEncodingForURI="true"
redirectPort="8443" />

This should make GET requests work properly.

For some reason the above does not work for POST requests. If you ask the tomcat people they'll mumble something about W3C, RFC, and RTFM. The short way to have this work for POST requests is to write a small filter to set the request encoding properly. We are using something similar to this:

package com.realcommerce.filters;
import java.io.IOException;
import javax.servlet.*;
public class RequestEncodingFilter implements Filter {
    public void init(FilterConfig filterConfig) throws ServletException {
        //Do Nothing
    }
    public void destroy() {
        //Do Nothing
    }
    public void doFilter(ServletRequest request,ServletResponse response,
            FilterChain chain) throws IOException, ServletException
    {
        //Set the encoding before anything reads the request parameters
        request.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }
}

This made POST requests pass Hebrew (or any UTF-8) parameters properly.
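For anyone wondering what the gibberish actually is: it is what you get when UTF-8 bytes are decoded as ISO-8859-1. The Python sketch below reproduces the effect outside Tomcat, assuming the container falls back to ISO-8859-1 when no encoding is set (the servlet default). The function names are mine.

```python
def misdecode(text):
    """Decode the UTF-8 bytes of text as ISO-8859-1, like a misconfigured server."""
    return text.encode("utf-8").decode("iso-8859-1")

def repair(garbled):
    """Undo misdecode(): recover the original bytes, then decode them as UTF-8."""
    return garbled.encode("iso-8859-1").decode("utf-8")

if __name__ == "__main__":
    original = "שלום"
    garbled = misdecode(original)
    # Each Hebrew letter is two bytes in UTF-8, so the garbled text is twice as long.
    print(len(original), len(garbled))
    print(repair(garbled) == original)
```

The round trip works because ISO-8859-1 maps every byte to a character, which is also why the server never complains and just hands you garbage.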

Tuesday, January 27, 2009

slow lstat, slow php, slow drupal

If you are short on time, skip to the bottom line.

Ok, the lstat() and stat() system calls are not really slow per se. But take a look at an apache strace -r dump from a drupal installation I was testing:

[user@host ~]# strace -o strace.load -r -s 256 -p <PID>
0.000081 getcwd("/mnt/var/www/html/drupal", 4096) = 32
0.000048 lstat("/mnt", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000805 lstat("/mnt/var", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.001596 lstat("/mnt/var/www", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000104 lstat("/mnt/var/www/html", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
0.000362 lstat("/mnt/var/www/html/drupal", {st_mode=S_IFDIR|0755, st_size=8192, ...}) = 0
0.000804 lstat("/mnt/var/www/html/drupal/sites", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.057575 lstat("/mnt/var/www/html/drupal/sites/all", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.000297 lstat("/mnt/var/www/html/drupal/sites/all/modules", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
0.014762 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet", {st_mode=S_IFDIR|0755, st_size=4096,
0.000214 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet/date", {st_mode=S_IFDIR|0755, st_size=4
0.000112 lstat("/mnt/var/www/html/drupal/sites/all/modules/Internet/date/date.module", {st_mode=S_IFREG|064
0.010816 open("/mnt/var/www/html/drupal/sites/all/modules/Internet/date/date.module", O_RDONLY) = 20
0.000835 fstat(20, {st_mode=S_IFREG|0644, st_size=44118, ...}) = 0
0.000069 lseek(20, 0, SEEK_CUR) = 0
0.000063 stat("././sites/all/modules/Internet/date/date.module", {st_mode=S_IFREG|0644, st_size=44118, ...}) = 0
0.000159 close(20) = 0

As you can see, for every file included in the system there are around 10-15 lstat calls. This means that if we have ~200 files included we have ~2500 lstat calls per page. A simple sum on the strace results gave me ~8 seconds per page of lstat time, which was over a third of my test page time. See?

[user@host ~]# cat strace.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d |awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
[user@host ~]# cat strace.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d | fgrep lstat | awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
If you really want to understand this shell script let me know. It deserves a post by itself.
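For those who would rather not decode the pipeline: strace -r prints, on each line, the time elapsed since the previous call started, so the pipeline shifts the numbers up one line to charge each delta to the call before it, and then sums them (all calls in the first command, only lstat in the second). A minimal Python sketch of the same idea follows; the function name and the regular expression are mine, and it assumes the plain "delta call(args) = result" layout shown in the dump above.

```python
import re
import sys

# Matches lines such as: 0.000048 lstat("/mnt", {...}) = 0
LINE = re.compile(r"^\s*(\d+\.\d+)\s+(\w+)\(")

def sum_strace_time(lines, syscall=None):
    """Sum strace -r deltas, charging each delta to the preceding call.

    With syscall=None every call is counted; otherwise only that call.
    """
    total = 0.0
    previous_call = None
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip prompts, truncated or unrelated lines
        delta, call = float(match.group(1)), match.group(2)
        if previous_call is not None and syscall in (None, previous_call):
            total += delta
        previous_call = call
    return total

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        dump = handle.readlines()
    print("all calls: %.6f" % sum_strace_time(dump))
    print("lstat only: %.6f" % sum_strace_time(dump, "lstat"))
```

Run it with the strace output file as the only argument, for example the strace.single file used above.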

Now we have 3 options: a) make lstat() faster, b) make php stop doing all those lstat() calls, c) make drupal include fewer files.

Well, (a) is not an option as our infrastructure team (aka sysadmins) insist there is nothing wrong with the filesystem and nothing can be done to enhance its performance. Since I gave up being a sysadmin when it was still called 'sysadmin', it would be hard for me to prove them wrong (especially since they might be right).

As (c) would require us to refactor most of our code in a very very 'not drupal way' and create very large php files, I was hoping to find a better solution.

By a rare chance of good fortune I came to learn about realpath_cache_size. Once I changed the foolish default to a reasonable 1M in php.ini, I lost most of the lstat() calls and over 10 seconds of processing time.

[user@host ~]# cat strace.2.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d |awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
[user@host ~]# cat strace.2.single | awk '{printf " " $1 "\n " $2}' |sed 1d | sed \$d | fgrep lstat | awk '{printf $2 "+"}' | sed s/.$/\\n/ | bc
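Why a cache helps so much is easy to see with a toy model: resolving a path means one lstat per path component, and most included files share almost all of their components. The Python sketch below counts calls with and without remembering prefixes that were already resolved. It is only a model of the idea, not PHP's actual realpath cache, and the second file name in the usage example is made up.

```python
def path_prefixes(path):
    """Return every prefix of an absolute path, shortest first."""
    parts = [part for part in path.split("/") if part]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]

def count_lstat_calls(paths, use_cache):
    """Count lstat calls in a toy model that needs one call per path prefix."""
    seen = set()
    calls = 0
    for path in paths:
        for prefix in path_prefixes(path):
            if use_cache and prefix in seen:
                continue  # already resolved, no system call needed
            calls += 1
            seen.add(prefix)
    return calls

if __name__ == "__main__":
    base = "/mnt/var/www/html/drupal/sites/all/modules/Internet/date/"
    files = [base + "date.module", base + "date.install"]
    print(count_lstat_calls(files, use_cache=False))
    print(count_lstat_calls(files, use_cache=True))
```

With only two files the uncached count is already nearly double the cached one; with ~200 includes under the same docroot the gap is what shows up in the strace sums.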

Here is what to do in php.ini

; ...php.ini...
; Determines the size of the realpath cache to be used by PHP. This value should
; be increased on systems where PHP opens many files to reflect the quantity of
; the file operations performed.
realpath_cache_size = 1M

; Duration of time, in seconds for which to cache realpath information for a given
; file or directory. For systems with rarely changing files, consider increasing this
; value.

For normal pages under normal load I was able to get a 10%-25% performance improvement. I think that's quite a bit for a configuration change.