
network problem - outbound tcp

  • network problem - outbound tcp

    We are having a very strange issue with a client:
    Every 3-5 days, AD becomes inaccessible from the server. After a reboot it works flawlessly for 3-5 days, and then AD becomes inaccessible again. Shares are flaky at best. Here is what is interesting: inbound TCP/UDP/ICMP works, but outbound TCP doesn't (outbound ICMP and UDP do).

    I believe AD not responding is a symptom, not a cause. When we attempt to run dcdiag, we get error 58 (cannot LDAP bind). When we run netdiag, the LDAP bind fails with an out-of-memory error. This has to be some sort of TCP stack memory problem. There are a few errors in the event viewer relating to Netlogon: "not enough storage to process". The server has plenty of RAM available, plenty of HD space, etc. It is running R2 SP2 x86-32.

    We have tried increasing the user ports (MaxUserPort) in the registry per Microsoft, with no change in behavior. It seems that ANY outbound TCP connection fails immediately, while all inbound connections work, and outbound UDP/ICMP work as well. I have updated the NIC driver, and I do not believe it is a TOE (TCP offload engine) issue, since inbound TCP connections work and outbound connections are good for a couple of days. Any advice would be helpful.

  • #2
    Re: network problem - outbound tcp

    This might be overkill, but you could try resetting TCP/IP with: netsh int ip reset c:\resetlog.txt
    We had a client that couldn't ping or establish a VPN, but everything else worked fine. We tried updating the NIC drivers, uninstalling/reinstalling the NIC, and uninstalling third-party software, but it wasn't until we did the reset that ping and VPN worked again.


    • #3
      Re: network problem - outbound tcp

      Also try a different cable, a different switch port, and a different physical NIC if possible.

      And when it stops responding, check the ARP caches on your switch to see whether something else has taken over its ARP entry.


      • #4
        Re: network problem - outbound tcp

        Well, we have narrowed it down. Restarting the print spooler seems to fix the issue, but I am 99% sure that the problem will return within 2-4 days. I do not get why the print spooler would hose outbound TCP connections.

        MS is looking into it to see if there is a corrupt driver.


        • #5
          Re: network problem - outbound tcp

          That seems pretty strange.

          We had a server that kept crashing, updated all of the printer drivers and it hasn't done it since...

          Might be worth a try. I also seem to remember deleting the spool files in windows/system32.


          • #6
            Re: network problem - outbound tcp

            How exactly did you narrow it down to the print spooler? Are you using TCP/IP ports for any of the printers?


            • #7
              Re: network problem - outbound tcp

              Hello rpcblast!

              Did MS find something? Do you have a solution?

              This problem looks like the one described here (no answer, no luck):
              "No buffer space available" Socket Errors on Windows Server 2003 SP2

              We have the same problem with a customer, on 6 servers. Most of them run SQL-based applications.
              We intend to disable TCP Chimney offload using netsh in order to delay the reboot, then apply the MS SNP Hotfix Rollup Package.

              OS : Windows server 2003 R2 Standard Edition SP2
              Task Offload Engine enabled (default registry keys/values).
              NICs : GigaEthernet - 2 NICs/server each connected to separate VLAN (no NLB)
              Broadcom BCM5708C NetXtreme II and Intel PRO/1000 MT
              NIC driver Offload capabilities enabled (checksum and Tx/Tx)

              The problem and symptoms described below occur about 4 weeks after a reboot.
              They all disappear immediately after a reboot, then reappear about 4 weeks later.

              Symptoms when the problem occurs (I hope the details provided here may help somebody sometime):

              Outbound TCP attempts fail immediately (a sniffer shows that no TCP SYN is sent).
              Examples:
              Outbound ftp open fails: ftp: bind: no buffer space is supported.
              Outbound http immediately fails: cannot display page
              Outbound telnet immediately fails, on any address/port
              Outbound Terminal Services (TSE) connection immediately fails
              Netdiag DC list test fails :
              [WARNING] Cannot call DsBind to FQDN (x.x.x.x). [ERROR_OUTOFMEMORY]
              portqry -n <local-server> -e 135 -i fails with error opening socket: 10055

              Inbound/outbound UDP and ICMP are OK.
              Inbound TCP is OK.

              Many applications that rely on TCP are broken, as is communication with the DC.
              The error messages vary depending on the application.
              Example: "The RPC server is unavailable"

              Event viewer logs Application Userenv ID 1053 - "not enough storage is available to complete this operation" in description
              Event viewer logs System Netlogon ID 5719 - "Not enough storage is available to process this command" in description.

              Non-paged pool usage increases steadily until the problem occurs and a reboot is needed, but there is no obvious memory leak.
              There is enough free memory (at least 25% of total RAM), and the pagefile is 70% free.

              My feeling is that some queue or buffer space used for TCP transmission becomes full.
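The outbound TCP symptom above can be checked with a small probe. This is only a minimal sketch in Python, not a tool used in this thread: the function name `probe_outbound_tcp` is hypothetical, and it assumes that the Winsock error 10055 seen with portqry is what a failing connect would raise on an affected server.

```python
import errno
import socket

WSAENOBUFS = 10055  # Winsock "no buffer space available", as reported by portqry above


def probe_outbound_tcp(host, port, timeout=3.0):
    """Try one outbound TCP connection and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # On Windows the Winsock code is in exc.winerror; elsewhere use exc.errno.
        code = getattr(exc, "winerror", None) or exc.errno
        if code in (WSAENOBUFS, errno.ENOBUFS):
            return False, f"no buffer space (error {code})"
        return False, f"failed: {exc}"
```

Running the probe against a DC on port 135 or 389 at regular intervals would show when outbound connections start to fail, and whether the failure is the buffer-space error rather than an ordinary timeout or refusal.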


              • #8
                Re: network problem - outbound tcp

                Hello !

                The previous issue is fixed.

                The culprit seems to be F-Secure Antivirus for Windows Server V7.00 rev 213.
                Under certain circumstances, the F-Secure Alert and Management Extension Handler (FAMEH32.EXE) process opens handles but never closes them.

                Here's the scenario :

                The six servers showing the issue are configured with F-Secure AV SMTP alerts. The specified SMTP server address is incorrect, so when the product tries to open an SMTP connection, it gets no answer. The default values for F-Secure mail alerts are: retry interval = 10 minutes, and the mail is retained for a maximum of 30 days.
                The other servers run the same version of F-Secure AV, but they are configured without SMTP alerting.

                When a server reboots with F-Secure SMTP alerting enabled (with the incorrect SMTP server IP address), FAMEH32.EXE begins to consume an increasing number of handles. The count keeps growing, so after a few days FAMEH32.EXE is the top handle consumer, with thousands of open handles. This can be seen using taskmgr.
                Process Explorer (a Microsoft Sysinternals utility) shows that most of those handles are \Device\Afd and \Device\Tcp handles. Sample below.

                29E0: File (---) \Device\Afd
                29E4: File (---) \Device\Tcp
                29E8: File (---) \Device\Afd
                29EC: File (---) \Device\Afd
                29F0: File (---) \Device\Tcp
                29F4: File (---) \Device\Tcp
                29F8: File (---) \Device\Afd
                29FC: File (---) \Device\Tcp
                2A00: File (---) \Device\Afd
                2A04: File (---) \Device\Afd
                2A08: File (---) \Device\Tcp
                2A0C: File (---) \Device\Tcp
                2A10: File (---) \Device\Afd
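A listing like the one above can be tallied by target to see which handle types dominate. The following is a rough sketch in Python, assuming the listing has been copied out of Process Explorer as plain text in the same layout as the sample; the function name `count_handle_targets` is hypothetical.

```python
from collections import Counter

# First lines of the sample above, as plain text.
SAMPLE = r"""29E0: File (---) \Device\Afd
29E4: File (---) \Device\Tcp
29E8: File (---) \Device\Afd
29EC: File (---) \Device\Afd"""


def count_handle_targets(listing):
    """Count handles per target name (the last column of each line)."""
    counts = Counter()
    for line in listing.splitlines():
        parts = line.split()
        # Expected layout: "<hex id>: <type> <flags> <target>"
        if len(parts) >= 4 and parts[0].endswith(":"):
            counts[parts[-1]] += 1
    return counts


for target, n in count_handle_targets(SAMPLE).most_common():
    print(target, n)
```

Comparing the tallies from two listings taken a few hours apart would show whether the \Device\Afd and \Device\Tcp counts are the ones growing.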

                A network traffic capture done with Wireshark shows that each time the number of handles opened by FAMEH32.EXE increases, the server sends three TCP SYN packets to port 25 at the IP address of the "rogue" SMTP server. These TCP SYN packets get no answer.
                netstat -a shows that the number of TCP connections (in any state) is pretty low and doesn't increase.

                In the end, no more outgoing TCP connections can be made.
                The network applications are "broken".
                => reboot the server.

                The final solution = disable F-Secure SMTP alerting, or use a correct IP address for the SMTP server and ensure that this SMTP server is working properly.

                More details :

                This issue has been submitted to F-Secure Support, but version 7 of F-Secure AntiVirus is not the current version, and F-Secure Support didn't say whether this issue exists in current versions of the product.
                F-Secure Ticket ref : SR ID:1-153539593 "Handle leak in FAMEH32.EXE ?"

                I think the pool of ephemeral TCP ports is exhausted because FAMEH32.EXE keeps opening "tcp" handles without closing them.
                When the ephemeral TCP port pool is exhausted on a Windows server, no new outgoing TCP connection can be established.
                The various network applications show specific errors/behaviours.
                For example: ftp: bind: no buffer space is supported

                This can easily be reproduced by pre-allocating most or all outgoing TCP ports. This can be done with the following registry value:
                \HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\ReservedPorts

                For Windows Server 2003, the default ephemeral TCP port pool is the set of ports in the interval [1024-5000]. So, by default, fewer than 4000 concurrent outgoing TCP connections can be made.
                With the 'ReservedPorts' value [1024-4990], only 10 concurrent outgoing TCP connections can be made. This registry value can easily be set to reproduce/simulate such outgoing TCP connection issues.
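The arithmetic above can be checked with a short calculation. This is a minimal sketch in Python that takes the default pool as [1024-5000] inclusive, as stated in this post; the helper names are hypothetical, and the "low-high" string format is an assumption based on the range quoted above.

```python
def parse_reserved_ranges(values):
    """Parse ReservedPorts-style strings such as "1024-4990" into (low, high) tuples."""
    ranges = []
    for value in values:
        low, high = (int(part) for part in value.split("-"))
        if low > high:
            raise ValueError(f"invalid range: {value}")
        ranges.append((low, high))
    return ranges


def free_ephemeral_ports(reserved, first=1024, last=5000):
    """Count the ports in [first, last] that no reserved range covers."""
    blocked = set()
    for low, high in reserved:
        # Clip each reserved range to the ephemeral pool before counting it.
        blocked.update(range(max(low, first), min(high, last) + 1))
    return (last - first + 1) - len(blocked)


print(free_ephemeral_ports([]))                                     # default pool size
print(free_ephemeral_ports(parse_reserved_ranges(["1024-4990"])))  # ports left after reserving
```

With no reservation the pool holds 3977 ports, and reserving 1024-4990 leaves the 10 ports from 4991 to 5000, which matches the figures given above.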

                Some References :

                Microsoft Article ID: 812873 "How to reserve a range of ephemeral ports on a computer that is running Windows Server 2003 or Windows 2000 Server"

                Article ID: 937535 "Event ID 1053 is logged when you use the "Gpupdate /force" command, or you restart a Windows Server 2003-based domain controller"

                MaxUserPort and the ephemeral TCP port range:

                Article ID: 196271 "When you try to connect from TCP ports greater than 5000 you receive the error 'WSAENOBUFS (10055)'"