LSASS.EXE + 30K users + CPU/RAM help

  • LSASS.EXE + 30K users + CPU/RAM help

    Hello all. Wondering if anyone can provide any direction on LSASS.EXE eating up 2 GB of memory and using anywhere from 10% to 50% CPU.

    This is a rather large enterprise setup with >6 DC/GC's, 30,000 user accounts, and >50 Exchange 2007 servers.

    Obviously, the requirements are a bit high here, but in the past month we have also started receiving new MS SCOM 2007 alerts for CPU usage above the configured threshold across a range of SCOM samples.

    All the SCOM stuff aside, manually checking these DC/GC's I can verify that LSASS.EXE usage is higher than I have seen in almost any other environment.

    Can anyone relate to this situation, and is there a recommendation for reining LSASS back in to a usable level?

    Any feedback is appreciated.
    Thanks in advance,

  • #2
    Re: LSASS.EXE + 30K users + CPU/RAM help

    2gigs is the maximum RAM LSASS will use. It would be nice if it could use more. What's the problem with LSASS using 2gigs? The more it uses, the faster it can respond. With 30k users it's totally normal.

    As for high CPU usage, well you have 5000 users per GC?

    I mean, you do have 50 Exchange servers for 6 global catalogs. My guess is that they are simply overloaded; I usually run way more domain controllers than that for 30k users. Considering LSASS will only use 2gigs of ram, it's always better to have more "medium-sized" DCs than a few huge and powerful ones.

    Are you even sure the domain can withstand a DC or two going down? The load on the remaining ones might become so high that latency becomes a major problem.
    VCP on vSphere (4), MCITP:EA/DBA, MCTS:Blahblah


    • #3
      Re: LSASS.EXE + 30K users + CPU/RAM help

      A lot of the time, high LSASS CPU is caused by inefficient LDAP queries
      that do not use indexed attributes.
      Is the high CPU on all DCs?
      Is it at 100% CPU all the time, or only part of the time?
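
As a rough illustration of the point about indexed attributes, here is a minimal Python sketch that pulls the attribute names out of an LDAP filter string and flags the ones missing from a set of indexed attributes. The `ASSUMED_INDEXED` set and the function name are hypothetical; the real list of indexed attributes depends on the schema of the directory in question, and the regular expression only handles simple `attr=value` style items, not extensible-match rules.

```python
import re

# Hypothetical set of indexed attributes (lowercase). This is an assumption for
# illustration; check the schema of the actual directory for the real list.
ASSUMED_INDEXED = {"samaccountname", "objectguid", "objectcategory", "mail", "cn"}


def unindexed_attributes(ldap_filter, indexed=ASSUMED_INDEXED):
    """Return attribute names used in the filter that are not in `indexed`."""
    # Capture the attribute name at the start of each simple filter item:
    # "(attr=", "(attr>=", "(attr<=" or "(attr~=".
    names = re.findall(r"\(\s*([A-Za-z][A-Za-z0-9.;-]*)\s*[~<>]?=", ldap_filter)
    flagged = []
    for name in names:
        key = name.lower()
        if key not in indexed and key not in flagged:
            flagged.append(key)
    return flagged


print(unindexed_attributes(
    "(&(objectCategory=person)(description=*temp*)(mail=a@b.c))"
))  # -> ['description']
```

Running the filters that the busiest clients send through a check like this can help narrow down which queries are worth a closer look.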


      • #4
        Re: LSASS.EXE + 30K users + CPU/RAM help

        What is the size of your DIT?
        Review the following KB about memory usage by LSASS on DCs:

        Depending on the size of your DIT, you might need to use the /3GB switch to be able to cache the DIT in RAM without paging (on W2K3 with the /3GB switch you can cache around 2.6GB of DIT in RAM).
        Guy Teverovsky
        "Smith & Wesson - the original point and click interface"


        • #5
          Re: LSASS.EXE + 30K users + CPU/RAM help

          Originally posted by gepeto
          Considering LSASS will only use 2gigs of ram, it's always better to have more "medium-sized" DCs than a few huge and powerful ones.
          The limitation exists only on a 32-bit OS. Instead of adding more 32-bit DCs, you can switch to 64-bit OS/hardware, which scales much better and does not have the limitations imposed by the 32-bit architecture.
          Guy Teverovsky
          "Smith & Wesson - the original point and click interface"


          • #6
            Re: LSASS.EXE + 30K users + CPU/RAM help

            Yes, a 64bit switch is definitely a good idea too. Maybe add a 64bit DC or two and then migrate the rest.
            VCP on vSphere (4), MCITP:EA/DBA, MCTS:Blahblah


            • #7
              Re: LSASS.EXE + 30K users + CPU/RAM help

              * LSASS is the process that makes a DC a DC. Although it runs on all Windows operating systems since Windows NT 4.0, on a DC it is responsible for almost all domain-related functionality. This includes the following:
              * Netlogon
              * Kerberos / NTLM / SSL
              * SAM Access (Servers and DCs)
              * IPSEC
              * Knowledge Consistency Checker (KCC)
              * AD Replication
              * LDAP
              * Windows NT 4.0 Replication
              * Auditing / Access Checks / Creates Session Tokens
              * Domain Policies (Lockout / Password Age)

              LSASS is a process just like any other process. It runs in user mode, not kernel mode; the belief that it runs in kernel mode is a common misconception. Its threads are scheduled like any other thread. LSASS consists of only one executable, LSASS.exe, while everything else is done with DLLs. Additionally, LSASS can use up to 2 GB of virtual address space and can cache up to 2.5 GB of directory data with the /3GB switch enabled on Windows 2003 DCs.
              For the best scalability of your domain controller as a domain controller, the LSASS process should consume the majority of the processor usage: total %Processor Time should be within 5 to 10 percent of %Processor Time for the LSASS process.
              In environments where the Active Directory database is large enough to warrant it, LSASS.exe should be the biggest consumer of VA space. For more information about enabling /3GB.
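
The processor rule of thumb above can be written down as a small check. This is only a sketch in Python: the function names are made up, and it assumes both readings have already been normalized to the same 0-100 scale, since a per-process counter on a multi-core box may be reported as a sum across cores.

```python
def non_lsass_cpu(total_pct, lsass_pct):
    """CPU percentage consumed by everything other than LSASS."""
    return max(total_pct - lsass_pct, 0.0)


def lsass_dominates(total_pct, lsass_pct, tolerance_pct=10.0):
    """True if total CPU is within `tolerance_pct` points of the LSASS CPU."""
    return non_lsass_cpu(total_pct, lsass_pct) <= tolerance_pct


# Total CPU at 55% with LSASS at 50%: other processes use 5 points, within tolerance.
print(lsass_dominates(55.0, 50.0))
# Total CPU at 80% with LSASS at 40%: something else is using 40 points.
print(lsass_dominates(80.0, 40.0))
```

If the second case shows up on a DC, the extra load is coming from some process other than LSASS and is worth tracking down first.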

              * LSASS will try to cache as much of the AD database in RAM as possible. The more of the database that is cached in RAM, the less frequently LSASS must go to disk to retrieve the data. Accessing data in RAM is always much faster than reading it from disk. Thus, any application or configuration option that forces LSASS to give up RAM can have a negative effect on the performance of the application.
              * Configuration options that will improve the ability of LSASS to cache the database include using the /3GB switch in the boot.ini, upgrading to Windows 2003, and removing unnecessary data from the database (for example, DLT objects, stale accounts, leverage SIS in 2003). Also, on Windows 2003, the in-memory cache can grow significantly larger than on Windows 2000.
              * To service incoming requests, LSASS generally requires about 200 to 300 MB of overhead, depending on how busy the DC is. Thus, the Virtual Memory consumed by the LSASS process should be approximately the size of the database, plus LSASS overhead, up to the limits identified in the Knowledge Base article 308356.
              * LSASS VM sizes top out at approximately 2.8 GB on x86 platforms running Windows 2003 with large memory model (/3GB) enabled. On 64-bit systems, the ability of LSASS to cache the AD database is effectively limited by the amount of RAM that can be installed on the hardware.
              * LSASS will release cache if other applications require the memory. The worst culprits are applications that have memory leaks or that store large data structures in memory (that is, other database applications). Servers should be constantly monitored for any indications of increased memory consumption that do not also have any corresponding decrease. Also, significant changes in memory consumption during the operation of the server can indicate that other applications are running, such as backups. These should be investigated to ensure that they are not occurring during operating hours.
              * Free RAM on the system as well as page file usage should be monitored to determine whether there is enough RAM in the box. The actual amount of RAM needed in the box is specific to the environment. However, on the x86 platforms, any investment in excess of 4 GB of RAM will not provide any return.
              * Kernel address spaces should also be monitored for early warning of any risks to system stability. Paged pool and Non-paged pool should be monitored for steady increases in usage, with no corresponding decreases in usage. Again, this would indicate that there are applications (often drivers) running in these memory locations that are leaking memory. Exhausting these memory locations will cause the server to stop responding to user requests and, most likely, will crash.
              * Paged pool can grow to 450 MB on systems up to and including Windows XP, and grow to 650 MB on Windows 2003 systems. Non-paged pool can grow to 256 MB, except when the large memory model (/3GB) is used. In this case, it is limited to 128 MB.
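
The sizing rule in the list above (database size plus roughly 200 to 300 MB of overhead, up to a platform limit) can be sketched as a quick calculation. This is a minimal Python sketch, not an official formula: the function name and the `cap_mb` parameter are hypothetical, and the 2.5 GB DIT and the 2 GB cap are just example values.

```python
def expected_lsass_vm_mb(dit_mb, overhead_mb=(200, 300), cap_mb=None):
    """Estimate the (low, high) LSASS virtual memory range in MB.

    dit_mb      -- size of the AD database (DIT) in MB
    overhead_mb -- (low, high) request-servicing overhead in MB
    cap_mb      -- optional address-space limit in MB; None means no cap
    """
    low = dit_mb + overhead_mb[0]
    high = dit_mb + overhead_mb[1]
    if cap_mb is not None:
        low, high = min(low, cap_mb), min(high, cap_mb)
    return low, high


# Example: a 2.5 GB (2560 MB) DIT with no address-space cap.
print(expected_lsass_vm_mb(2560))               # -> (2760, 2860)
# Same DIT with an assumed 2 GB (2048 MB) cap: the estimate hits the cap.
print(expected_lsass_vm_mb(2560, cap_mb=2048))  # -> (2048, 2048)
```

If the observed LSASS size sits well above the uncapped estimate, the DIT cache alone does not explain it.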
              Last edited by Akila; 19th October 2008, 17:23.


              • #8
                Re: LSASS.EXE + 30K users + CPU/RAM help

                Hello again all, and thanks for the replies. I should have done a bit more research over the past 2 months since I started this thread, and I should have noted a few additional things. This entire enterprise is all 64-bit servers, and almost all of them are clustered. We have 7 AD DC/GC servers, a few dozen Exch 2007 CAS servers, a few dozen Exch 2007 HUB servers, and about 50 Exch 2007 OAB servers (since they each hold only 1000 OAB's without hacking the servers).

                To answer a few of the above questions:
                LSASS.EXE is using 4GB of RAM on all AD GC servers, at all times.
                The DIT is only at 2.5GB in size.
                ADAM servers are on a separate workgroup handling all traffic inbound from external sources.

                The persistent issue:
                We still cannot pin down exactly why and where this LSASS.EXE spike is coming from. We see it for about 4-5 minutes at a time, spiking to 100%, and then dropping back to about 3% for a while. We have not seen a regular interval at which this recurs, so we cannot assume that AD replication, a forced scheduled task to sync AD every 15 minutes, or specific LDAP queries are the culprit. Is there anything I can use to see where the incoming load on LSASS is actually coming from? Also, is there any documentation on why/when/how LSASS handles incoming queries/security/policies and when/why/where it unloads the data it is holding? What I mean by this last question is: does LSASS accept and accumulate all the queries, security, and policies over a specific time interval, and then unload or write this data out to AD so that it can be replicated to the other GC's? This is just one of the theories I am working on at the moment. Any advice or insight is appreciated.
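
One way to test the "no regular interval" observation is to group logged CPU samples into spike episodes and look at the time between episode starts. The Python sketch below assumes the samples are already available as `(timestamp, cpu_percent)` pairs, for example exported from a performance counter log; the function names, the 90% threshold, and the one-minute grouping gap are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def spike_episodes(samples, threshold=90.0, max_gap=timedelta(minutes=1)):
    """Group above-threshold samples into (start, end) spike episodes.

    samples -- iterable of (datetime, cpu_percent) pairs
    max_gap -- samples further apart than this start a new episode
    """
    episodes = []
    start = last = None
    for ts, cpu in sorted(samples):
        if cpu < threshold:
            continue
        if start is not None and ts - last <= max_gap:
            last = ts  # still inside the current episode
        else:
            if start is not None:
                episodes.append((start, last))
            start = last = ts  # begin a new episode
    if start is not None:
        episodes.append((start, last))
    return episodes


def gaps_between(episodes):
    """Time between the starts of consecutive episodes."""
    return [b[0] - a[0] for a, b in zip(episodes, episodes[1:])]


# Synthetic example: one sample per minute, spikes at minutes 0-4 and 20-24.
t0 = datetime(2008, 10, 19, 9, 0)
demo = [(t0 + timedelta(minutes=m), 100.0 if m % 20 < 5 else 3.0)
        for m in range(30)]
print(gaps_between(spike_episodes(demo)))
```

If the gaps cluster around a fixed value such as 15 minutes, a scheduled job becomes a stronger suspect; if they look random, client-driven load is more likely.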