Sunday, October 4, 2009

How to do rewrites in an IIS ISAPI

Note to self: if you ever need to do a URL rewrite in an IIS ISAPI filter, you just need to reset the "url" header through pHeaderInfo. Something like this:

 
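// Read the requested URL from the preprocessed headers.
pHeaderInfo->GetHeader(pCtxt->m_pFC, "url",
    strUrl.GetBuffer(dwUrlSize + 1), &dwUrlSize);
strUrl.ReleaseBuffer();

// GET_REWRITE_FOR_URL stands in for your own lookup; an empty result means "no rewrite".
strNewUrl = GET_REWRITE_FOR_URL(strUrl);
if (!strNewUrl.IsEmpty()) {
    // Overwrite the "url" header so the request continues with the rewritten URL.
    pHeaderInfo->SetHeader(pCtxt->m_pFC, "url", (LPTSTR)(LPCTSTR)strNewUrl);
}
return SF_STATUS_REQ_NEXT_NOTIFICATION;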

This works in IIS 4 and above.

Note that if you really need a good rewrite/redirect filter for IIS you should look at IIRF, but since it doesn't work on IIS 4 I had to do this manually.

Thursday, June 18, 2009

Bing, Google, Nutch and Canonical URLs

Google announced a few months back that they started supporting canonical URLs. This is a great feature that we have already adopted on a couple of sites (mostly to keep tracking codes from messing with our search engine results).

Unfortunately, although Microsoft announced that they would support this feature in Live, it has not yet been implemented in Bing as of now (June 2009).

Nutch, on the other hand, has this on their TODO list for 1.1.

Also, the Google Search Appliance (GSA) is currently assumed by everyone not to support this (although nobody really knows whether it does).

More about canonical URLs:
http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
http://searchengineland.com/canonical-tag-16537
http://janeandrobot.com/library/url-referrer-tracking
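
For the tracking-code case, here is a minimal Python sketch (not taken from any of the articles above) of how a page could work out its own canonical URL and emit the tag. The TRACKING_PARAMS list and the example.com URL are made up for illustration; substitute whatever parameters your campaigns actually add.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters; not taken from any real site.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def canonical_url(url):
    """Return url with tracking parameters and the fragment removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


print(canonical_link_tag("http://example.com/article?id=7&utm_source=newsletter"))
```

Every tracked variant of a page then points at the same URL, which is the whole point of the tag.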

Monday, June 15, 2009

many group by strategies to select single rows per group

Xaprb gives a good explanation for selecting a single row from a group: How to select the first/least/max row per group in SQL at Xaprb.
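
As a quick illustration, here is a sketch of one common strategy (join each row against the aggregate of its group), run against an in-memory SQLite database from Python so it can be tried without a server. The table, columns and data are invented for this example, and the query is only one of several possible approaches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (category TEXT, name TEXT, price REAL);
    INSERT INTO products VALUES
        ('fruit',  'apple',  2.50),
        ('fruit',  'banana', 0.75),
        ('drinks', 'juice',  3.25),
        ('drinks', 'water',  1.00);
""")

# Join every row against the minimum price of its category, which leaves
# the cheapest row per category.
CHEAPEST_PER_CATEGORY = """
    SELECT p.category, p.name, p.price
    FROM products AS p
    JOIN (SELECT category, MIN(price) AS min_price
          FROM products
          GROUP BY category) AS m
      ON p.category = m.category AND p.price = m.min_price
    ORDER BY p.category
"""

rows = conn.execute(CHEAPEST_PER_CATEGORY).fetchall()
for row in rows:
    print(row)
```

Note that if two rows tie for the minimum in a group, this form returns both of them.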

Wednesday, June 3, 2009

apc futex_wait lockdown make your apache freeze over

We had the typical LAMP setup going on, with Drupal as the base CMS and APC as the bytecode cache. We needed a good caching engine, so I figured why not use APC's user cache. Well, we tried the APC Cache Drupal module which, with minor fixes, proved to work very nicely. That is, until we actually put the whole thing into production.

The first thing that happened was that apache hung and stopped responding to any user requests. We suspected network issues, especially since netstat -na showed that all the apache processes were hanging on SYN_WAIT. However, since an apache restart solved the issue, I started to suspect this was something else.

To make a long (very long) story short, I got strace onto our prod machines and found that apache was either hanging on futex(..., FUTEX_WAIT, ...) or looping endlessly on the same calls.

To make an even longer story short, I got gdb installed on those machines, and the backtrace clearly indicated that the locks came from APC user-cache calls.

We decided to abandon the APC user cache and switch to memcached, which proved faster and had fewer lockups.

The funny thing is that when we talked about this over dinner that same evening, a developer from another team pointed me to this article by one of the APC leads (or something): How to Dismantle an APC Bomb, which has been around for over a year. I am surprised and shocked that such information is hidden so well and not mentioned anywhere in the docs. Moreover, I went through the APC code again after reading that post (I had gone through it once when I started analysing the problem), and it seems that this is not even close to being resolved. There are no patches, no TODOs, nothing of the sort. From reading the code, the entire user cache needs a major rewrite. What gives?

(This is a post I wrote a couple of months ago and never had time to finish. Unfortunately this is still not fixed, AFAIK.)
EDIT (07/2009): http://pecl.php.net/bugs/bug.php?id=15179 reports this to be fixed. If anyone can confirm this, please send me a note so that I can update this post.

Thursday, May 14, 2009

GA_googleAddAttr is not defined error

If you are using Google Ad Manager you can define custom attributes for targeting.

If you are getting the JavaScript error "GA_googleAddAttr is not defined", just make sure to put the call(s) to GA_googleAddAttr in a separate <script> block, after the one that calls GS_googleEnableAllServices.


<script type="text/javascript">
GS_googleAddAdSenseService("ca-pub-0000000000");
GS_googleEnableAllServices();
</script>
<script type="text/javascript">
GA_googleAddAttr("ATTRB1", "whatever");
</script>
<script type="text/javascript">
GA_googleAddSlot("ca-pub-0000000000", "slotname_216x311");
</script>

Monday, March 23, 2009

V6 DECODE64 not working properly

It turns out that after all these years, V6's built-in DECODE64 and ENCODE64 actually do not work properly with some binary data.

proc COMPARE_BASE64 {} {
    set code {2lIU1ZNw5BYvKi79j4L/+GGmjAQK7uiBQG7elDdKZcE=}
    set vgn_result [DECODE64 $code]; # VGN built-in function
    set tcl_result [::base64::decode $code]; # tcllib Tcl-only implementation
    return [join [list \
        [BIN2HEX $vgn_result] \
        [BIN2HEX $tcl_result] \
        [string compare $vgn_result $tcl_result] \
    ] "\r\n"]
}
proc BIN2HEX { text } { binary scan $text H* result; return $result }
COMPARE_BASE64

--- result ----
da52145370e4162f2a2efd8f82fff861a68c040aeee881406e94374a65c1
da5214d59370e4162f2a2efd8f82fff861a68c040aeee881406ede94374a65c1
1

If you need to encode/decode base64 for use in non-Vignette systems, make sure you use tcllib's base64.

This was tested on Vignette's V6, but I am sure it's true for StoryServer 4.2 and V5 as well.
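
To double-check which of the two results is the right one, here is a small sketch that decodes the same string outside Vignette and Tcl altogether, using Python's standard library (Python is not part of the original setup; it just serves as a neutral third opinion). TCLLIB_HEX is the second line of the result shown above.

```python
import base64

CODE = "2lIU1ZNw5BYvKi79j4L/+GGmjAQK7uiBQG7elDdKZcE="
# Hex string printed for the tcllib decode in the result above.
TCLLIB_HEX = "da5214d59370e4162f2a2efd8f82fff861a68c040aeee881406ede94374a65c1"

decoded = base64.b64decode(CODE)
print(decoded.hex())
# A 44-character base64 string with one '=' of padding decodes to 32 bytes.
print(len(decoded), decoded.hex() == TCLLIB_HEX)
```

The decode comes out at 32 bytes and matches the tcllib line, while the DECODE64 line is only 30 bytes long.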

Thursday, March 19, 2009

MySQL indexes behaving badly

If you have a MySQL query that for no apparent reason stops working or no longer behaves as it used to, the first thing to do is EXPLAIN it. If the explain output is unreasonable (for example, it uses different indexes than it used to, or it uses an index but still scans millions of rows), then you should try CHECK TABLE.

One thing that I accidentally found out was that CHECK TABLE also updates the key statistics. This made many problems magically disappear.

A few things that should be noted:
  1. CHECK TABLE will only update key statistics for MyISAM tables; for InnoDB use OPTIMIZE TABLE (or ANALYZE TABLE, which refreshes the key statistics without rebuilding the table).
  2. If you use OPTIMIZE TABLE to update the key statistics, be aware that it does other (good) things as well, so it usually takes much longer and it locks your table while doing it. Also, for InnoDB it actually rebuilds the table using ALTER TABLE.
  3. CHECK TABLE has several options. Only MEDIUM (the default) and EXTENDED update the key statistics (I think).
So, if your indexes stop working, or just misbehave, try checking them.
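
If you want to script this, here is a rough Python sketch of the rule of thumb above: pick the statement by storage engine. The helper name stats_refresh_statement is made up for this example, and the sketch only builds the SQL string; running it through your MySQL client of choice is left out.

```python
def stats_refresh_statement(table, engine):
    """Return the statement suggested above for refreshing key statistics."""
    # Table names cannot be bound as query parameters, so quote the identifier
    # by hand (backticks inside the name are doubled).
    quoted = "`%s`" % table.replace("`", "``")
    engine = engine.lower()
    if engine == "myisam":
        # MEDIUM is the default CHECK TABLE option; spelled out for clarity.
        return "CHECK TABLE %s MEDIUM" % quoted
    if engine == "innodb":
        # Rebuilds the table, so expect it to be slow on large tables.
        return "OPTIMIZE TABLE %s" % quoted
    raise ValueError("no suggestion for engine %r" % engine)


print(stats_refresh_statement("orders", "MyISAM"))
print(stats_refresh_statement("orders", "InnoDB"))
```

The table name "orders" is just a placeholder; the engine of a real table shows up in the output of SHOW TABLE STATUS.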