Random Network Outages

  • Random Network Outages

    Hi

    I've asked this question on the MS forums and would like to ask whether anyone here has an inkling of what might be happening.

    ____________________________________________________________

    We have a networking issue that I hope someone may be able to help me with.

    We run a Windows domain - single domain, single subnet. Three domain controllers: Windows 2012 R2 Standard, Windows 2008 Standard (32bit) and Windows Server 2003 Standard. The Win 2008 DC holds all the FSMO roles. DNS is set up on the 2008 and 2012 DCs, with the 2008 DC having a secondary DNS server installation. DHCP is set up on the 2008 DC. The 2003 server only functions as a DC - no roles apart from that are installed and it will be demoted very soon.

    Two member servers: Windows Storage Server 2008 (64bit) and Windows 2012 Standard. The Storage server hosts 99% of our data with the other 1% being on the 2008 DC. Two email server software installations (not Exchange) - one on the storage server and one on the 2012 R2 member server. NPS Routing and Remote Access is configured on the 2012 member server to handle VPN connections.

    35 client PC's: 34 run Windows 7 and one runs Vista.

    We have two Draytek routers on the network - one acts as the gateway to the Internet and the other provides wireless coverage. There are two networked printers - a Ricoh MFD 'workgroup' printer and a small mono Brother.

    Network shares are accessed via DFS. The Servers, the Ricoh printer and routers have static IP's, most of the clients and the Brother printer use DHCP.

    All cables terminate at one of two patch panels which then feed to one or more switches. Small desktop switches are used to expand the network where needed. The network is divided into two segments, hence two patch panels, but both run under the same 192.168.0.xxx/255.255.255.0 subnet.

    Before we upgraded all our computers to Windows 7 the network was fine. The present network was built using CAT5 cabling in 2004 (we'd used BNC before that). We rarely had any network issues and when we did it would affect all clients. When I first introduced Windows 7 it was on three PC's and one or more of them would randomly have problems accessing the network and Internet.

    Since I upgraded all our machines to Windows 7, we have been seeing one or more machines experience network problems most days.

    What happens is the (any) computer will suddenly stop - Applications accessing files across the network e.g. email and Access will hang and report as (not responding) and the 'busy' cursor appears. Try and save an office document and the busy cursor appears and the application hangs. The Start menu is not accessible - for example I always have my Taskbar hidden and when this happens on my machine moving the mouse to the bottom of the screen does nothing, the Taskbar stays hidden. Sometimes, the Taskbar may appear, but nothing happens when the Start button is clicked. When trying to access shared folders via Computer the green bar slowly moves through the Address Bar and after a while it reports the share is not accessible. The affected computers are also unable to open any web pages.

    The hang will last for anything from 20 seconds to a minute or more, after which the computer will continue operating normally. On very rare occasions the computer will not recover after 10 minutes or more and I have to force a shutdown, but I assume this is not directly related to the issues I am seeing.

    The short outages are completely random and leave no trace of a problem in the System or Application Logs. When a machine does not recover (which is very rare) the System Log reports that a DNS server could not be reached.

    This will happen on a single PC and others will be fine. But, it may affect several PC's over the course of a day.

    When the outage happens I can Winkey+R, open cmd.exe and successfully ping the DNS servers by IP and name.
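
The manual ping test above only shows the state of the network at the moment someone thinks to run it. A minimal sketch of an unattended version follows, assuming Python is available on an affected PC: it times a DNS lookup and a TCP connection on every pass and appends the result to a CSV file, so the start and end of each hang get a timestamp. The host name, IP address, port and log path passed to `run` are placeholders chosen for illustration, not values from this network.

```python
import csv
import os
import socket
import time
from datetime import datetime


def timed(action):
    """Run action() and return (succeeded, seconds_taken, error_text)."""
    start = time.monotonic()
    try:
        action()
        return True, time.monotonic() - start, ""
    except OSError as exc:  # covers DNS failures, refusals and timeouts
        return False, time.monotonic() - start, str(exc)


def probe(name, address, port, timeout=3.0):
    """Time one DNS lookup and one TCP connect; return a row for the log."""
    def connect():
        with socket.create_connection((address, port), timeout=timeout):
            pass

    dns = timed(lambda: socket.getaddrinfo(name, None))
    tcp = timed(connect)
    return {
        "time": datetime.now().isoformat(timespec="seconds"),
        "dns_ok": dns[0], "dns_secs": round(dns[1], 3), "dns_error": dns[2],
        "tcp_ok": tcp[0], "tcp_secs": round(tcp[1], 3), "tcp_error": tcp[2],
    }


def run(name, address, port, log_path, interval=5.0, passes=None):
    """Append one probe row per interval; passes=None runs until stopped."""
    fields = ["time", "dns_ok", "dns_secs", "dns_error",
              "tcp_ok", "tcp_secs", "tcp_error"]
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    done = 0
    with open(log_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        if is_new:
            writer.writeheader()
        while passes is None or done < passes:
            writer.writerow(probe(name, address, port))
            handle.flush()  # keep the log current if the PC has to be reset
            done += 1
            time.sleep(interval)
```

A call such as `run("fileserver.example.local", "192.168.0.10", 445, "probe_log.csv")` would probe every five seconds (445 is the SMB file-sharing port). Rows where the lookup succeeds but the connect is slow or fails, or the reverse, would help narrow down whether name resolution or the file server connection is stalling.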

    All the systems are up to date. I ensure that Windows Updates are installed on the Servers when they are released and the clients are updated the next day via WSUS.

    I am pretty sure that this is also affecting Active Directory. I am seeing transient errors on the Domain Controllers where, for example, a Global Catalog can not be contacted. Both DC's are GC's and when I run nltest to test the connection to a GC within 30 minutes of the error being reported the test passes. I assume the servers are also experiencing these random outages.

    The problem I have is that because they are random and because, on the clients at least, no errors are logged I cannot reproduce the problem and have no idea what may be causing the issue.

    The problem is not a general network issue as it randomly affects one client at a time - if it was a general network problem I would expect all the clients to lose connectivity.

    Has anyone else seen a similar issue and know what the cause was, please?

    Thanks!
    A recent poll suggests that 6 out of 7 dwarfs are not happy

  • #2
    Re: Random Network Outages

    You've got 3 different versions of AD installs going on there. What functional level are you running at?

    If you have 3 DCs and have your GC on more than 1, it has to be on all the DCs. If you're going to demote your 2003 DC, do it sooner rather than later.

    To see whether a particular DC is causing your problem, try verifying which DC is authenticating the failing PC. An easy way to do this is to open a cmd prompt as an admin and type 'echo %logonserver%'. This should give you the name of the DC which the PC is using for domain authentication. If the PCs which fail all refer to the same DC, start digging there.
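
If it helps to collect this from several PCs over a few days, here is a rough sketch that reads the same LOGONSERVER environment variable and appends it, with a timestamp and the computer name, to a text file. The file name `logonserver.log` is a placeholder, and the script assumes Python is installed on the clients.

```python
import os
import socket
from datetime import datetime


def logon_record(env=None, now=None, host=None):
    """Build one log line: timestamp, computer name, authenticating DC."""
    env = os.environ if env is None else env
    now = datetime.now() if now is None else now
    host = socket.gethostname() if host is None else host
    # Windows sets LOGONSERVER to a value such as \\DC01;
    # the variable is absent on non-Windows systems.
    server = env.get("LOGONSERVER", "unknown").lstrip("\\")
    return f"{now.isoformat(timespec='seconds')}\t{host}\t{server}"


def append_record(path="logonserver.log"):
    """Append the current record to a text file (placeholder path)."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(logon_record() + "\n")
```

Staff could run `append_record()` from a shortcut whenever a hang is noticed. Note that LOGONSERVER is set at logon, so it shows the DC that authenticated the session rather than whichever DC the PC is talking to at that moment.
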
    *RicklesP*
    MSCA (2003/XP), Security+, CCNA

    ** Remember: credit where credit is due, and reputation points as appropriate **

    • #3
      Re: Random Network Outages

      Thanks for that.

      Yes - I had wondered if having 3 different OS's for the DC's might have a bearing on this. The Functional Level is 2003 because that's the lowest that's present. That DC will be demoted within the next couple of weeks.

      All the DC's are GC's.

      I will ask staff to determine their logon server when this happens.

      Thanks again.

      • #4
        Re: Random Network Outages

        The 2003 DC was demoted last weekend and the domain and forest functional level has been raised to 2008.

        Staff have been keeping tabs on the outages since the demotion and the logon server varies so I guess it's unlikely that this can be attributed to a particular server.

        I'm beginning to suspect it is a particular application that we frequently use which is responsible for this but I need to make sure.

        Thanks again for the assistance

        • #5
          Re: Random Network Outages

          Are you using redirected desktops/profiles etc?
          Have you checked disk queue lengths on the servers?
          (Also, are these manageable switches, i.e. can you check what's going on with them?)

          • #6
            Re: Random Network Outages

            Thanks for the input.

            We don't use redirected profiles/desktops.

            The switches are unmanaged so we can't check those. However, they have all been replaced during the last two or three years.

            Using perfmon, average disk queue lengths for our data server show regular spikes for the C:\ drive which vary from 10 - 40 and 100. They do not plateau unless I move data from C: to D:. Disk Queue lengths for the drive containing our data rarely go above 40 and average between 10 and 20.

            The data server is a PowerVault NX3000. The 80GB C: drive is mirrored (2 drives) and the 2.7TB data drive is striped (4 drives).

            Perfmon shows negligible disk queue lengths for the domain controllers.

            I have created a data collector set and set the sampling rate to 1 second and scheduled it to run for the next hour - but I have to say that while I've used perfmon in the past my experience with it is minimal so I don't really know if this will be useful or if it is too much.
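
One-second samples over an hour are a lot to read by eye, so a short script can pull out just the spikes. The sketch below assumes the collector log has been converted to CSV (for example with `relog`, using `-f csv`), that the first column holds the timestamp, and that the threshold of 40 is only a starting guess based on the figures above. The column name passed in must match the counter header in the actual file.

```python
import csv
import io


def find_spikes(rows, column, threshold):
    """Return (start, end, peak) for each run of samples above threshold.

    rows is an iterable of dicts as produced by csv.DictReader; the first
    value in each row is taken to be the timestamp.
    """
    spikes = []
    current = None  # [start, end, peak] of the run in progress
    for row in rows:
        stamp = next(iter(row.values()))
        try:
            value = float(row[column])
        except (TypeError, ValueError):
            continue  # blank or malformed sample; skip it
        if value > threshold:
            if current is None:
                current = [stamp, stamp, value]
            else:
                current[1] = stamp
                current[2] = max(current[2], value)
        elif current is not None:
            spikes.append(tuple(current))
            current = None
    if current is not None:
        spikes.append(tuple(current))
    return spikes


def spikes_in_csv(text, column, threshold):
    """Convenience wrapper for CSV text already read into memory."""
    return find_spikes(csv.DictReader(io.StringIO(text)), column, threshold)
```

Comparing the start and end times of each spike with the times staff report a freeze would show whether the disk queue on the data server lines up with the outages or is unrelated.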

            Thanks again for the help. I'll post back later when I have more info.

            • #7
              Re: Random Network Outages

              Hi, I've seen antivirus/security software cause this before. I think the workaround was to add some exclusions or remove the product.
              Please remember to award reputation points if you have received good advice.
              I do tend to think 'outside the box' so others may not always share the same views.

              MCITP -W7,
              MCSA+Messaging, CCENT, ICND2 slowly getting around to.

              • #8
                Re: Random Network Outages

                Thanks for that. The problem I have is that this is truly random and there are no log file entries indicating that a problem has occurred. Yesterday, my PC worked fine all day long but the two days before that saw two and three periods of unresponsiveness. While I can't take the chance of disabling our security software I will contact them and ask if they are aware of such an issue. Would you PM me the name of the software you saw this happen with, please?

                For the record, my PC is an i7 Quad 3.2 (8 cores) with 8GB RAM and a dedicated graphics card. The data server has a Xeon processor (8 cores) with 12GB RAM. There are only 35 clients on the network and it is rare that more than 25-30 of those are being used at the same time.

                • #9
                  Re: Random Network Outages

                  Hi, I've seen it happen with Comodo Internet Security and earlier versions of Sophos on some XP machines. The Comodo case was recent; the Sophos one was a few years back.

                  • #10
                    Re: Random Network Outages

                    Thanks a lot for that. We use Sophos here so I will ring them next week and ask if they are aware of any particular configurations that may result in this behaviour.

                    • #11
                      I've spoken to Sophos about this and they have looked at diagnostic logs collected during normal operation and immediately after a freeze. Unfortunately, they did not see anything Sophos related, nor any other network related issues in the logs.

                      This is really frustrating. Some machines on the network remain unaffected.

                      • #12
                        Ok, a new development. There is definitely something odd going on here. Our Internet speed appears to have dropped - but it hasn't. I measured our upload and download speeds via a PC connected directly to the gateway router. The download speed was 4Mb and the upload speed measurement timed-out. The router reports a download speed of 18Mb and an upload of 10Mb - this was verified by our ISP at the same time over the phone.

                        There is something on our network that appears to be throttling the traffic and I wonder if it is also responsible for the random outages.

                        As some of you are aware the cabling, patch panels and switches were tested last September. I've looked at the switches and all the lights are blinking - there are no solid lights indicating a massive amount of traffic from any source.

                        I can see plenty of symptoms, but no cause

                        Can anyone recommend a utility that can analyse traffic across the entire network? Our network comprises a single subnet - 4 servers and 35-40 clients.

                        Thanks.

                        • #13
                          Originally posted by Blood:
                          ... Can anyone recommend a utility that can analyse traffic across the entire network? Our network comprises a single subnet - 4 servers and 35-40 clients.
                          I'm installing a Solarwinds product and hope it can help me identify what may be causing this.

                          • #14
                            Do you have anyone connecting to the network at various times who may be causing a broadcast storm? That would make it look like the network is down, but it would magically start working again when they disconnected. :clutching-straws:
                            1 1 was a racehorse.
                            2 2 was 1 2.
                            1 1 1 1 race 1 day,
                            2 2 1 1 2

                            • #15
                              Thanks - I'll keep an eye on that. We do have regular VPN users. Staff also use the wireless network for their smartphones/tablets to browse the Internet (not their work files), and we have laptops that rarely connect wirelessly. The SolarWinds product was a dud - it's meant for networks with Cisco or other high-grade routers and where several networks are connected. No use at all on our humble network.

                              I need something that can monitor all our PC's. Spiceworks has a bandwidth monitor plugin but I need something rather more accomplished - logging etc so that I can look at historical usage after the event.
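
For the historical logging part, one low-tech option is to record each PC's own traffic counters at a fixed interval and compare the files after an event. The sketch below only shows the rate calculation and the logging loop. Where the byte counters come from is left to a caller-supplied function `read_counters`, which is hypothetical here; the third-party psutil package's `net_io_counters()` is one possible source if it is installed.

```python
import csv
import time
from datetime import datetime


def rate_mbps(bytes_before, bytes_after, seconds):
    """Convert a rise in a byte counter over an interval to megabits/s."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (bytes_after - bytes_before) * 8 / seconds / 1_000_000


def log_rates(read_counters, log_path, interval=10.0, samples=6):
    """Sample read_counters() every interval seconds and append rates.

    read_counters is a caller-supplied function returning a
    (bytes_sent, bytes_received) tuple of running totals.
    """
    sent, received = read_counters()
    last = time.monotonic()
    with open(log_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for _ in range(samples):
            time.sleep(interval)
            new_sent, new_received = read_counters()
            now = time.monotonic()
            writer.writerow([
                datetime.now().isoformat(timespec="seconds"),
                round(rate_mbps(sent, new_sent, now - last), 3),
                round(rate_mbps(received, new_received, now - last), 3),
            ])
            handle.flush()
            sent, received, last = new_sent, new_received, now
```

This measures one machine at a time rather than the whole network, so it would need to run on each PC (or at least on the servers and the usual suspects) to show which host was busy when the throughput dropped.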