XP/SP2: network file operations are slow, slow, slow ...


  • XP/SP2: network file operations are slow, slow, slow ...

    After installing Windows XP service pack 2, file operations over the network are insanely slow, at least when accessing other XP boxen that do not have the plague. File operations (reading, seeking and so on) take at least ten times as long as normal if an XP/SP2 box tries to access an XP/non-SP2 box; in the reverse direction everything is normal.

    I noticed the problem because all three boxen that run WinXP service pack 2 are virtually unusable with apps that access large files on other XP boxen (peer-to-peer network). So I wrote a little test program in Delphi that just scans the IFD information of a 20 MByte TIFF file containing 100 images. If the file resides on the same box then the program takes 10 ms; over the redirector (UNC path) to the same box it takes 35 ms; if the file resides on the network then the program takes 350 ms. The exception is when the program runs on an XP/SP2 box and the file resides on a non-SP2 XP box, in which case the program takes about 3500 ms, or ten times as long as normal. This is insane, especially if you consider that even the SP2 boxen work just fine if the file resides, say, on a Win95 box. It seems that only the direction XP/SP2 -> XP/non-SP2 causes problems, although I could not run any tests against Win2K yet. Just in case anyone wonders, I am not concerned about 3.5 seconds, but I am very much concerned about our production programs delivering only one tenth of normal throughput, which means a job that normally takes a couple of hours suddenly takes a whole bloody week.
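The Delphi test program itself is not shown in the thread. For readers who want to reproduce the measurement, here is a minimal Python sketch of that kind of scan, assuming a classic (non-BigTIFF) file: it follows the chain of IFDs, reading only the few bytes of each directory, and reports the elapsed time. The function names `scan_ifds` and `timed_scan` are made up for this illustration.

```python
import struct
import time


def _read_exact(f, n):
    """Read exactly n bytes or raise, so truncated files fail loudly."""
    data = f.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of file")
    return data


def scan_ifds(path):
    """Walk the IFD chain of a classic TIFF file; return the IFD offsets."""
    offsets = []
    with open(path, "rb") as f:
        header = _read_exact(f, 8)
        if header[:2] == b"II":
            endian = "<"          # little-endian ("Intel") byte order
        elif header[:2] == b"MM":
            endian = ">"          # big-endian ("Motorola") byte order
        else:
            raise ValueError("not a TIFF file: bad byte-order mark")
        magic, offset = struct.unpack(endian + "HI", header[2:8])
        if magic != 42:
            raise ValueError("not a classic TIFF file: bad magic number")

        seen = set()
        while offset != 0:
            if offset in seen:
                raise ValueError("IFD chain loops back on itself")
            seen.add(offset)
            offsets.append(offset)
            f.seek(offset)
            (count,) = struct.unpack(endian + "H", _read_exact(f, 2))
            _read_exact(f, count * 12)   # the directory entries, 12 bytes each
            (offset,) = struct.unpack(endian + "I", _read_exact(f, 4))
    return offsets


def timed_scan(path):
    """Return (number of IFDs, elapsed milliseconds) for one scan of path."""
    start = time.perf_counter()
    offsets = scan_ifds(path)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(offsets), elapsed_ms
```

Calling `timed_scan` once with a local path and once with a UNC path to the same file (for example a hypothetical `\\otherbox\share\test.tif`) gives the kind of comparison described above. The timing includes the open and close of the file, which matters for this problem.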

    In case it matters, we're normally running MS networking over NetBEUI, but switching to TCP/IP did not change anything except for making things slightly slower (naturally).

    Considering the curiously specific nature of the bug, it is likely that it can be fixed by something as simple as a registry entry. The other simple fix - running format c:\ and reinstalling XP without SP2 - I am somewhat loath to apply, considering that my development machine is among the boxen that have contracted the plague (not a choice of mine - it was delivered that way).

    Does anybody have an idea WTF is going on and how to fix SP2?

  • #2
    Maybe a stupid question, but worth asking: have you tried disabling the built-in firewall on the XP-SP2 boxes?
    Guy Teverovsky
    http://blogs.technet.com/b/isrpfeplat/
    "Smith & Wesson - the original point and click interface"


    • #3
      Originally posted by Guy (Antid0t)
      Maybe a stupid question, but worth asking: have you tried disabling the built-in firewall on the XP-SP2 boxes?
      Of course. Not that it would affect SMB traffic over NetBEUI anyway. I'll do a bit more experimentation at work today and post the results.


      • #4
        Indeed. It's been a while since I last touched NetBEUI...

        I would fire up a sniffer and start looking for name resolution or checksum errors. It is also worth a shot to set the interface to 100-FullDuplex instead of Auto.
        Guy Teverovsky
        http://blogs.technet.com/b/isrpfeplat/
        "Smith & Wesson - the original point and click interface"


        • #5
          Yes, the sniffer idea sounds good. I have refined my test program to give detailed timings for all operations and to use two different file access modes - sequential bulk read and sparse scan - and I also did some testing against Win2K boxen.
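The refined test program is not included in the thread either. As a rough illustration of what per-operation timings for the two access modes could look like, here is a Python sketch that times open, I/O and close separately. The function names, the 64 KiB block size, the 256 KiB stride and the 16-byte probe are all assumptions chosen for the example, not values from the original program.

```python
import os
import time


def _ms_since(start):
    """Milliseconds elapsed since a time.perf_counter() reading."""
    return (time.perf_counter() - start) * 1000.0


def _timed_open(path):
    """Open path unbuffered for reading; return (file, open time in ms)."""
    start = time.perf_counter()
    f = open(path, "rb", buffering=0)
    return f, _ms_since(start)


def _timed_close(f):
    """Close f; return the close time in ms."""
    start = time.perf_counter()
    f.close()
    return _ms_since(start)


def time_sequential(path, block_size=64 * 1024):
    """Sequential bulk read: read the whole file in fixed-size blocks."""
    f, open_ms = _timed_open(path)
    start = time.perf_counter()
    total = 0
    while True:
        chunk = f.read(block_size)
        if not chunk:
            break
        total += len(chunk)
    io_ms = _ms_since(start)
    close_ms = _timed_close(f)
    return {"open_ms": open_ms, "io_ms": io_ms,
            "close_ms": close_ms, "bytes": total}


def time_sparse(path, stride=256 * 1024, probe_size=16):
    """Sparse scan: seek to every stride-th offset and read a few bytes."""
    size = os.path.getsize(path)      # looked up before any timing starts
    f, open_ms = _timed_open(path)
    start = time.perf_counter()
    total = 0
    for pos in range(0, size, stride):
        f.seek(pos)
        total += len(f.read(probe_size))
    io_ms = _ms_since(start)
    close_ms = _timed_close(f)
    return {"open_ms": open_ms, "io_ms": io_ms,
            "close_ms": close_ms, "bytes": total}
```

Keeping the three phases apart is the point of the design: a delay that shows up only in `open_ms` and `close_ms` while `io_ms` matches a healthy system points at open/close handling rather than at raw transfer speed.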

          To make a long story short, it is only XP SP2 systems that misbehave and it is not the hardware - the same machines work normally when booting into NT4/Win2K or a clean WinXP install without SP2. The problems occur only if an XP SP2 system opens a file on another machine but not if another machine opens a file on an XP SP2 box (which is another indication that the hardware works alright).

          There are two kinds of misbehaviour. One is more rare and it looks exactly as if file caching were non-operational and the maximum throughput divided by ten. I have not been able to study this in detail as it only happens if any of the XP SP2 systems accesses a particular box that runs XP SP1.

          The second kind of misbehaviour is common - i.e. occurs on every XP SP2 box accessing any other computer - and it involves unexplainable gratuitous delays on file open and close operations. These delays are on the order of half a second, spread somewhat randomly between 300 ms and 900 ms. In this case the timings for file seek and read operations are identical to timings on healthy systems.

          Yes, I know this looks exactly like an infection with a virus/trojan or an anti-virus/anti-trojan program (the only difference between these two kinds of malware being basically that the user installs the latter knowingly and voluntarily; both are equally bad for system stability and performance). But that was also my first idea when the problems first surfaced a while ago, and so I took the machines off the LAN and conducted a full forensic. Besides, one of the SP2 boxen is my personal box at home and this is as clean as a whistle - never connected to the Internet, only programs executed/installed that came on a bona-fide manufacturer's disk or were compiled from source. I gave it a full forensic anyway but came up with nothing amiss.

          Incidentally, the half-second delay on file close was the thing that broke our production app - it contains a bought component that does several thousand open/close ops per file, but normally this inefficiency is of no consequence as it amounts to a total penalty of 3 to 5 seconds per file against a total processing time of about 200 seconds per file. In the case of XP SP2 the inefficiency obviously is of consequence, and the cumulative penalty per file is on the order of 1000 seconds, which is several times the raw processing time. The app has been fixed - partially by eliminating network file access via the expedient of local caching, partially by duplicating some of the functionality of the bought component with decently written code. However, I'd still like to know WTF is going on in XP SP2, especially the strange delay on file close that looks like an infection symptom.
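The actual fix is not shown in the thread. The sketch below illustrates the general idea of local caching in Python: copy the remote file to local disk once, so the network sees a single open and close, then let the open/close-heavy code work on the local copy. The names `process_with_local_copy` and `cumulative_penalty_s` are invented for this example, and the callback stands in for whatever the real component does.

```python
import os
import shutil
import tempfile


def cumulative_penalty_s(open_close_ops, mean_delay_s):
    """Total delay if every open/close pair costs mean_delay_s seconds."""
    return open_close_ops * mean_delay_s


def process_with_local_copy(remote_path, process, cache_dir=None):
    """Copy remote_path to a local temp file and run process() on the copy.

    process is any callable that takes a file path. The temp file is
    removed afterwards, whether or not process() raises.
    """
    suffix = os.path.splitext(remote_path)[1]
    fd, local_path = tempfile.mkstemp(suffix=suffix, dir=cache_dir)
    os.close(fd)                      # only the name is needed here
    try:
        # One bulk transfer: a single open/close on the network side.
        shutil.copyfile(remote_path, local_path)
        return process(local_path)
    finally:
        os.remove(local_path)
```

As a plausibility check on the numbers above, assuming 2000 open/close pairs per file (the post only says "several thousand") and a mean delay of 0.5 seconds, `cumulative_penalty_s(2000, 0.5)` gives 1000 seconds, which matches the reported order of magnitude. With the local copy, that per-operation delay is paid once instead of thousands of times.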

          If I learn anything new or get around to sniffing packets I'll keep you posted, but since I committed the error of fixing the symptoms of the problem it is unlikely that I'll be able to wrangle some time for conducting a proper investigation.
