Thursday, June 18, 2009

Bing, Google, Nutch and Canonical URLs

Google announced a few months back that they had started supporting canonical URLs. This is a great feature that we have already adopted on a couple of sites (mostly to keep tracking codes from messing with our search engine results).
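To make the tracking-code use case concrete, here is a rough Python sketch of how a page could compute its canonical URL and emit the tag. The parameter names in `TRACKING_PARAMS` and the example URL are assumptions for illustration, not what our sites actually use.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking parameters; replace with whatever your campaigns append.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def canonical_url(url):
    """Return url with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Render the <link> element that belongs in the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_link_tag("http://example.com/post?id=7&utm_source=newsletter"))
```

Every tracked variant of the page then points search engines at the same clean URL, which is the whole point of the feature.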

Unfortunately, although Microsoft announced they will support this feature in Live, it has not yet been implemented in Bing as of now (Jun 2009).

Nutch, on the other hand, had this on their TODO list for 1.1.

Also, the Google Search Appliance (GSA) is currently assumed by everyone not to support this (although nobody really knows whether it does).

More about canonical URLs:

Wednesday, June 3, 2009

APC futex_wait lockdown makes your Apache freeze over

We had the typical LAMP setup going on, with Drupal as the base CMS and APC as the bytecode cache. We needed a good caching engine, so I figured why not use APC's user cache. We tried the APC Cache Drupal Module, which, with minor fixes, proved to work very nicely. That is, until we actually put the whole thing into production.

The first problem we hit was Apache hanging and not responding to any user requests. We suspected network issues, especially since netstat -na showed that all the Apache processes were hanging on SYN_WAIT. However, since an Apache restart solved the issue, I started to suspect this was something else.
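For this kind of first-pass check, a small script that tallies connection states per port can be quicker than eyeballing the raw listing. This is only a sketch: it assumes the Linux `netstat -na` column layout, and the sample output below (addresses, ports and states) is made up for illustration, not taken from our machines.

```python
from collections import Counter


def tally_tcp_states(netstat_output, port):
    """Count TCP states for sockets whose local address ends in :port."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Linux layout: Proto Recv-Q Send-Q Local Foreign State
        if (
            len(fields) >= 6
            and fields[0].startswith("tcp")
            and fields[3].endswith(suffix)
        ):
            counts[fields[5]] += 1
    return counts


# Hypothetical sample; on a real box, feed in the output of `netstat -na`.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:80             10.0.0.9:51234          SYN_RECV
tcp        0      0 10.0.0.5:80             10.0.0.9:51235          SYN_RECV
tcp        0      0 10.0.0.5:22             10.0.0.7:40000          ESTABLISHED
"""

print(tally_tcp_states(SAMPLE, 80))
```

A pile-up in one state on port 80 tells you where connections are stuck, but, as we found out, not why.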

To make a long (very long) story short, I got strace onto our prod machines and found that Apache was either hanging on futex(.... FUTEX_WAIT ...) or looping infinitely on the same functions.

To make an even longer story short, I got gdb installed on those machines, and the backtrace clearly indicated that the locks came from APC user-cache calls.

We decided to abandon the APC user cache and switch to memcached, which proved faster and had fewer lockups.

The funny thing is that when we talked about this over dinner the same evening, a developer from another team pointed me to this article by one of the APC leads (or something): How to Dismantle an APC Bomb, which has been around for over a year. I am surprised and shocked that such information is hidden so well and not mentioned anywhere in the docs. Moreover, I went through the APC code again after reading this post (I went through it once when I started analysing the problem) and it seems that this is not even close to being resolved. There are no patches, no TODOs, nothing of the sort. From reading the code, the entire user cache needs a major rewrite. What gives?

(This is a post I wrote a couple of months ago and never had time to finish. Unfortunately this is still not fixed, AFAIK.)
EDIT (07/2009): reports this to be fixed. If anyone can confirm this, please send me a note so that I can update this post.