Cisco PIX/Windows 2008-based NLB web farm TCP issues

    Hi all,

    I have a two-node web farm set up on Windows Server 2008, using IIS7 and the Windows NLB service.

    This works fine on the local subnet: traffic is distributed evenly and session state is maintained. The problem arises when the sites are accessed from the internet. The Cisco PIX (a 515E running 6.3, though I have also reproduced the issue on a 501 running 6.3(5) for testing purposes) drops packets on what, at first glance, appears to be a random basis. On further investigation, it appears that even the slightest latency on the inbound route causes the TCP connection to time out on the NLB cluster. This in turn causes the PIX to drop packets, apparently on the assumption that the recurrent SYN bits in the layer 4 header reflect a SYN attack.
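
    For anyone wanting to check a capture for the repeated-SYN pattern described above, here is a minimal sketch in Python. It assumes the capture has already been parsed into simple records; the Packet type, the function name and the sample addresses are all made up for illustration and do not come from any real capture tool or from my own traces.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a real capture format)."""
    time: float   # seconds since the start of the capture
    src: str      # source IP address
    sport: int    # source TCP port
    dst: str      # destination IP address
    dport: int    # destination TCP port
    syn: bool     # SYN flag set
    ack: bool     # ACK flag set


Flow = Tuple[str, int, str, int]


def repeated_syns(packets: Iterable[Packet]) -> Dict[Flow, int]:
    """Count bare SYNs (SYN set, ACK clear) per flow; keep flows with more than one."""
    counts = Counter(
        (p.src, p.sport, p.dst, p.dport)
        for p in packets
        if p.syn and not p.ack
    )
    return {flow: n for flow, n in counts.items() if n > 1}


# Made-up sample: one client sends its SYN three times, another connects normally.
SAMPLE = [
    Packet(0.0, "203.0.113.5", 50000, "192.0.2.10", 80, True, False),
    Packet(3.0, "203.0.113.5", 50000, "192.0.2.10", 80, True, False),
    Packet(9.0, "203.0.113.5", 50000, "192.0.2.10", 80, True, False),
    Packet(0.5, "203.0.113.6", 50001, "192.0.2.10", 80, True, False),
    Packet(0.6, "192.0.2.10", 80, "203.0.113.6", 50001, True, True),
]

if __name__ == "__main__":
    for flow, count in repeated_syns(SAMPLE).items():
        print(flow, count)
```

    A flow that shows up here with several bare SYNs and no completed handshake would be consistent with the timeouts above, though it would not by itself show whether the PIX or the NLB cluster is at fault.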

    There are a number of different sites and services set up on the farm, each using a unique address, and all of them are subject to static NAT on the PIX. The NLB service is using multicast, and after searching Google I have already ruled out the known ARP issues. I have also ruled out the known (F/B)ECN issue, as Cisco reports this as having been fixed in version 6.1 of the PIX software. Turning off FIXUP made no difference either.

    I have built a single-instance web server, and this works fine across the firewall, but of course it leaves us with no automated redundancy. Again, within the local subnet the NLB-based farm works fine.

    I thought I'd ask whether anyone has had a similar issue and can offer advice before I attempt to change the TCP parameters in the registry or decide to spend 8000 on the Cisco ACE.


    Last edited by spramwell; 2nd March 2009, 12:59.