server 2003 blue screens always at same time

  • server 2003 blue screens always at same time

    I have a Poweredge 2950 running Server 2003 x64.

    It has blue-screened every 18 to 20 days since January, always just before 4 AM.

    The error is always very similar:

    Error code 0000000000000050, parameter1 fffffadfba868ef5, parameter2 0000000000000000, parameter3 fffffadf292bc290, parameter4 0000000000000000.


    Error code 0000000000000050, parameter1 fffffadff633e28e, parameter2 0000000000000000, parameter3 fffffadf29289290, parameter4 0000000000000000.

    Error code 0000000000000050, parameter1 fffffadf6e2b0518, parameter2 0000000000000000, parameter3 fffffadf292bc290, parameter4 0000000000000000.
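Stop code 0x50 is PAGE_FAULT_IN_NONPAGED_AREA. A minimal sketch for pulling the fields out of the three event log lines above, assuming the parameter meanings from Microsoft's bug check reference as I understand them (parameter 1 is the address that was referenced, parameter 2 is 0 for a read, parameter 3 is the address of the instruction that made the reference, parameter 4 is reserved). The function and variable names are made up for illustration.

```python
import re

# Matches the "Error code ..., parameter1 ..." lines quoted in the post.
LINE_RE = re.compile(
    r"Error code (?P<code>[0-9a-fA-F]+), "
    r"parameter1 (?P<p1>[0-9a-fA-F]+), "
    r"parameter2 (?P<p2>[0-9a-fA-F]+), "
    r"parameter3 (?P<p3>[0-9a-fA-F]+), "
    r"parameter4 (?P<p4>[0-9a-fA-F]+)"
)


def decode_bugcheck(line):
    """Split one event log line into named fields.

    The field names reflect an ASSUMED reading of the 0x50 parameters;
    check them against the bug check reference before relying on them.
    """
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a recognised bug check line")
    code, p1, p2, p3 = (int(m.group(k), 16) for k in ("code", "p1", "p2", "p3"))
    return {
        "code": code,
        "referenced_address": p1,
        "operation": "read" if p2 == 0 else "not a read",
        "faulting_instruction": p3,
    }


# The three lines exactly as reported in the post.
REPORTED_LINES = [
    "Error code 0000000000000050, parameter1 fffffadfba868ef5, parameter2 0000000000000000, parameter3 fffffadf292bc290, parameter4 0000000000000000.",
    "Error code 0000000000000050, parameter1 fffffadff633e28e, parameter2 0000000000000000, parameter3 fffffadf29289290, parameter4 0000000000000000.",
    "Error code 0000000000000050, parameter1 fffffadf6e2b0518, parameter2 0000000000000000, parameter3 fffffadf292bc290, parameter4 0000000000000000.",
]

decoded = [decode_bugcheck(line) for line in REPORTED_LINES]
instruction_addresses = {d["faulting_instruction"] for d in decoded}

for d in decoded:
    print(
        f"code=0x{d['code']:X} {d['operation']} of 0x{d['referenced_address']:016x} "
        f"from 0x{d['faulting_instruction']:016x}"
    )
print(f"distinct instruction addresses: {len(instruction_addresses)}")
```

Under that reading, all three crashes are reads, the referenced address differs every time, and two of the three share the same instruction address.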


    Exchange 2007 is running on the server.

    I ran the Dell server update utility and the BIOS/firmware/drivers are up to date. All MS patches are up to date.

    It has a PERC 5 controller with a RAID 5 array. It also has a Dell RD1000 internal removable hard drive.

    When I first experienced this issue I saw that the RAID controller was changing from write-back to write-through because the backup battery was low. I replaced the backup battery.


    I ran the Dell hardware diagnostics and no errors were reported. I rearranged/reseated the memory modules.

    windbg gives this when I analyze the minidump:

    Probably caused by : disk.sys ( disk!memcpy+60 )

    There is 40% free space on C: and D:. The Exchange store is on D:. Just to clean things up and for troubleshooting, I created a new store database and moved all of the mailboxes to it.

    What seems unusual is that the blue screen always happens at the same time of day and at a similar interval. Why would that be?

    Backup is by StorageCraft ShadowProtect. No jobs run from midnight to 6 AM to allow for Exchange auto maintenance. The Exchange maintenance was set to end at 4 AM, but I moved the start and end back one hour, so it now ends at 3 AM.

    Any thoughts are appreciated.

  • #2
    Re: server 2003 blue screens always at same time

    My first impression is hardware fault. We've got several of nearly the exact same model server config as you, less the RD1000. And we've never seen anything like this.

    Assuming you've gone back since before January and looked at whatever you've changed since then and not found anything, I'd say take the RD1000 out of the equation. That's the one unique piece of kit between your setup and mine, so let's start there.
    *RicklesP*
    MSCA (2003/XP), Security+, CCNA

    ** Remember: credit where credit is due, and reputation points as appropriate **


    • #3
      Re: server 2003 blue screens always at same time

      Thanks for the reply. My instinct is a hardware problem too, but why would a hardware problem be so precise? Every blue screen is within minutes of 4 AM, roughly every 18 days. Every hardware issue I have seen has been close to random.
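One way to put numbers on "so precise" is to list the crash times from the System event log and look at the gaps between them and the spread in time of day. A rough sketch follows; the timestamps are HYPOTHETICAL placeholders (the thread does not give the actual dates) and `crash_pattern` is a made-up helper name, so substitute the real bug check event times.

```python
from datetime import datetime


def crash_pattern(timestamps):
    """Return (gaps between consecutive crashes in days, minutes after midnight)."""
    stamps = sorted(timestamps)
    gaps_days = [
        (later - earlier).total_seconds() / 86400
        for earlier, later in zip(stamps, stamps[1:])
    ]
    minutes_after_midnight = [t.hour * 60 + t.minute for t in stamps]
    return gaps_days, minutes_after_midnight


# HYPOTHETICAL crash times, for illustration only. Replace with the real ones.
crashes = [
    datetime(2010, 1, 10, 3, 55),
    datetime(2010, 1, 29, 3, 57),
    datetime(2010, 2, 16, 3, 52),
]

gaps, minutes = crash_pattern(crashes)
spread = max(minutes) - min(minutes)

print("gaps in days:", [round(g, 2) for g in gaps])
print("time-of-day spread in minutes:", spread)
```

A time-of-day spread of a few minutes alongside gaps that vary by a day or two would suggest something scheduled daily around 4 AM is the trigger, with some other condition deciding on which day it actually fails.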

      The problem with getting the RD1000 out of the picture is it is the backup target for this server and another server. It is internal so it would take at least 18 days after I power the server back up to know if it was the problem.

      Windows could be doing something at 4 AM which might involve the RD1000 but the only thing I know of that writes to it is the backup software. It doesn't run from a little before midnight until 6 AM.


      • #4
        Re: server 2003 blue screens always at same time

        Based on your original post, it looks like you may have SCSI hardware issues:
        windbg gives this when I analyze the minidump:

        Probably caused by : disk.sys ( disk!memcpy+60 )
        That points at either your RAID 5 array or the RD1000, or the controller(s) that access them. Or, heaven forbid, something associated with the memory bus access to the controllers. Are they both on the same PERC controller, or are you using different interfaces?

        Can you afford the time to at least temporarily remove the RD1000? Just long enough to reboot the server, re-run the windbg test, and see if the same result comes back. If something changes, you're on the right track. If not, you've got nearly 3 weeks to come up with something else. As an interim measure you might consider scheduling a restart task once every 2 weeks, on a Sunday maybe, and see if the 18-20 day fail period stays constant. I know that generally it's considered bad form to restart a server without just cause, but if it helps maintain system stability until you can come up with a smoking gun, won't that be worth it?

        Just had a quick search try with our friend . Found this link:
        http://forums.techarena.in/windows-s...elp/927981.htm

        Not terribly info-rich, but see if this rings any bells with you.
        *RicklesP*


        • #5
          Re: server 2003 blue screens always at same time

          The RD1000 is just a laptop hard drive in a cartridge. The SATA cable plugs into a separate socket on the motherboard. I don't think the PERC controller is used.

          windbg is the Windows debugger. You can use it for crash dump analysis. In this case you load the minidump file and it analyzes it.

          I agree with the idea in the link you provided. I have never seen a hardware issue be so precise. This happens every 18~20 days just before 4 AM. If it was just hardware you would think it would be a lot more random.


          • #6
            Re: server 2003 blue screens always at same time

            I was aware of the windbg app's use; the interesting bit is that it calls out the disk.sys issue. As it appears to be the only suggestion of a problem area at the moment, it seemed reasonable to suggest taking the RD1000 out of the equation just long enough to re-run windbg and see if that same error flags up again. Is there a window of time to allow you to do that with a reboot and then run the test, before your next backup cycle?

            Without more info, I'm afraid I can't suggest anything else just now.
            *RicklesP*


            • #7
              Re: server 2003 blue screens always at same time

              Thanks for the ideas. Were you thinking that disk.sys is used only to talk to the RD1000 and not the disks that are on the PERC controller? I assumed disk.sys was also used to talk to the hard drives on the PERC controller. That's why I wasn't focusing on the RD1000. Do you think my assumption was incorrect?


              • #8
                Re: server 2003 blue screens always at same time

                Disk.sys is used for all of the disk activity, not just for one device or the other. The idea is to eliminate one of them from the system and then re-run windbg to see if the same results show up. I mentioned the RD1000 first simply because it sounded as if it was something that could be temporarily removed without causing the users any downtime. That's all. You've got to start somewhere, and that sounded like the easiest thing to try.
                *RicklesP*
