
How to develop a proactive maintenance schedule?


  • How to develop a proactive maintenance schedule?

    I am actually kinda new to this and need a little help. I have 3 Domain Controllers in our Branch office and need a little input. What type of things should I be paying attention to on a regular basis and how often? What tools are available for getting this job done?

    Please help


  • #2
    Re: How to develop a proactive maintenance schedule?

    all the tools you will need are already included with Windows server whatever version...

    that being said, your setup may not be supported by micro$oft, so you may need manufacturer-specific tools/applications to back up data on proprietary storage devices like SAN arrays and some SCSI controllers.

    you should become very familiar with the BPA tool for all the different services offered on a windows server. read the results and follow the links and make sure you understand the errors and information you get back from the tool.

    many people will use 3rd party applications for many of the tasks that can be handled by windows natively. this is fine when the app saves time or helps automate the process, but you should know how to do it yourself without any apps. you may operate on several different networks with different standards (as i have, being a contract sys admin for so many years), and some places cannot afford the 3rd party apps or don't trust them for whatever reason, so you must make do with what you have at hand.

    be vigilant with backups. if you have an exchange server, back it up! if you have an SQL server, back it up! and back up the configs as well as the databases. it's great to have a backup of a database, but if it's encrypted and your AD crashes, how are you gonna get that data back without the user account?

    there are plenty of caveats and tips, but they will all require you to study and know your products forward and backwards...

    imo, this site is probably the most powerful tool you will have at your disposal. there is a solid group of highly trained professionals here who are willing to provide very accurate information and instruction that would otherwise cost an arm and a leg...

    but most importantly, there is about 1,000 years of combined IT experience right at your fingertips. if you're just starting out, i can guarantee that there is nothing that could happen to you that we have not experienced firsthand in our own networks. this is where the real power of this site comes in if you ask me.

    best of luck and keep your chin in a manual and you're gonna do just fine.

    but if you get stuck, create me a VPN account and i will fix it all remotely for a small paypal gift!
    its easier to beg forgiveness than ask permission.
    Give karma where karma is due...


    • #3
      Re: How to develop a proactive maintenance schedule?

      thanks, sounds great my friend, and who knows, paypal may become our best friend?


      • #4
        Re: How to develop a proactive maintenance schedule?

        Originally posted by vesterlee2008
        I am actually kinda new to this and need a little help. I have 3 Domain Controllers in our Branch office and need a little input. What type of things should I be paying attention to on a regular basis and how often?
        These are questions most new/junior admins ask, and there's no simple, one-size-fits-all answer. What you need to monitor and maintain will vary a great deal, depending on the size and complexity of the network. (Un)fortunately for you, I'm in the mood to compose a really long post.

        In general, you have to keep an eye on hardware, software, services, data and load/capacity. You want things to run smoothly, and you don't want any surprises.

        Hardware usually means servers, infrastructure equipment and storage:
        • Most server manufacturers supply monitoring software that can keep an eye on temperature, fans, memory ECC errors, hard drive status (S.M.A.R.T.), RAID controllers/arrays and unexpected reboots. Make sure the software is installed and configured to send alerts and notifications.
        • All critical equipment should be connected to a UPS. Again, make sure the UPS management software is installed and configured to power down the servers after a reasonable delay. Some software is also capable of performing regular battery tests.
        • Consider what the effects would be if networking equipment should fail. Decide whether a fault-tolerant setup, in-house spare equipment or some sort of service agreement could best handle such a scenario. Consider installing monitoring software to keep an eye on switches and routers; sudden reboots are a sign of impending failure, but could go undetected for a long time if you don't monitor uptime.
        • Consider replacing wear parts before they fail. Enterprise hard drives usually last around 5 years. UPS manufacturers usually recommend replacing the battery packs every 3 years. Check the manufacturers' specs and recommendations for your equipment.
        • Remember that environmental factors are likely to conspire to ruin your day. Hang a temperature probe on the wall in the server room, and put a water detector on the floor. Connect both to a system that will alert you if something goes wrong, via mail, SMS, pager, phone, klaxon or foghorn if necessary. Getting there in time is the difference between a) venting excess heat/calling the cleaners and a plumber, and b) spending the next 3-4 days and nights at work rebuilding everything from scratch.
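To make the "replace wear parts before they fail" point a bit more concrete, here is a minimal sketch of a replacement calendar. The service lives are just the rules of thumb from the list above (5 years for drives, 3 for UPS batteries), and the inventory names are made up for illustration; substitute your manufacturers' figures.

```python
from datetime import date

# Assumed service lives in years, taken from the rules of thumb above.
# Check the manufacturers' recommendations for your own equipment.
SERVICE_LIFE_YEARS = {"hard_drive": 5, "ups_battery": 3}


def replacement_due(kind, installed):
    """Return the date by which a part of this kind should be replaced."""
    years = SERVICE_LIFE_YEARS[kind]
    try:
        return installed.replace(year=installed.year + years)
    except ValueError:
        # Installed on 29 February and the target year is not a leap year.
        return installed.replace(year=installed.year + years, day=28)


def overdue_parts(inventory, today):
    """Return (name, due_date) for every part whose due date has passed.

    inventory is a list of (name, kind, installed_date) tuples.
    """
    overdue = []
    for name, kind, installed in inventory:
        due = replacement_due(kind, installed)
        if due <= today:
            overdue.append((name, due))
    return overdue


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = [
        ("dc1-disk0", "hard_drive", date(2015, 1, 1)),
        ("ups1-battery", "ups_battery", date(2024, 1, 1)),
    ]
    for name, due in overdue_parts(inventory, date.today()):
        print(f"{name} was due for replacement on {due}")
```

Run something like this from a scheduled task once a month and you get a reminder before the part fails instead of after.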
        Software needs to be kept up to date:
        • Windows operating systems receive regular updates, and WSUS (or SCCM, which uses WSUS) makes it easy to deploy updates and generate reports. You need to know that the updates actually install successfully on both servers and workstations. WSUS also takes care of most other Microsoft software, like Exchange, SQL Server and Office.
        • Keep a close eye on the antivirus software and any other security software you may be running. Non-functional AV software or missed updates could be a sign of malware or malicious activity. Make sure you renew the licenses in time!
        • Application software may need its updates applied manually. Consider deploying updates with GPOs if possible, or use scripting. Many vendors provide detailed instructions on how best to deploy their software and any subsequent updates.
        • Remember that firmware is also software, and that routers, switches, firewalls, access points and even network printers may need updates to patch security vulnerabilities.
        • Make sure you have a backout strategy for failed updates! Occasionally an update will fail, sometimes catastrophically, so don't update all systems at once. Make sure application data is backed up before applying an update to an application. Keep installation files for older versions around until you're sure you don't need them.
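One way to avoid updating everything at once is to patch in waves: a small pilot group first, then the rest in batches once the pilot looks healthy. The sketch below only plans the waves; the host names, `pilot_count` and `batch_size` are illustrative assumptions, and actually approving updates per group is something you would do in WSUS or whatever deployment tool you use.

```python
def rollout_waves(hosts, pilot_count=1, batch_size=5):
    """Split hosts into a pilot wave followed by fixed-size batches."""
    if pilot_count < 0 or batch_size < 1:
        raise ValueError("pilot_count must be >= 0 and batch_size >= 1")
    ordered = list(hosts)
    waves = []
    pilot = ordered[:pilot_count]
    if pilot:
        waves.append(pilot)
    rest = ordered[pilot_count:]
    # Cut the remaining hosts into batches of at most batch_size.
    for start in range(0, len(rest), batch_size):
        waves.append(rest[start:start + batch_size])
    return waves


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for number, wave in enumerate(rollout_waves(["dc1", "dc2", "dc3"], 1, 1), 1):
        print(f"wave {number}: {', '.join(wave)}")
```

With three domain controllers, one per wave means the other two keep serving logons if an update goes badly on the first.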
        Services and data need to be kept available. Unscheduled downtime should be avoided:
        • Eliminate single-points-of-failure wherever possible. In particular, make sure you don't rely on a cheap component that could easily fail and just as easily be made fault-tolerant.
        • Use RAID. Hard drives die often and suddenly.
        • Use monitoring software to keep an eye on important services. You want to know if a service keeps crashing before the situation deteriorates to the point where the service won't start at all.
        • Keep an eye on memory consumption and CPU utilization on your servers. Keep a record of old statistics, so you'll notice (and be able to document) if something is off.
        • Are you responsible for Internet domain names or SSL certificates? Make sure the bills are paid and the certificates are renewed in time.
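Certificate renewals are easy to automate a reminder for. The following is a rough sketch using only the Python standard library: `fetch_not_after` reads the expiry string from a live TLS endpoint, and `days_until_expiry` turns it into a day count. The 30-day warning threshold and the function names are my own choices, not anything standard.

```python
import socket
import ssl
from datetime import datetime, timezone

WARN_DAYS = 30  # assumed threshold; pick whatever lead time suits you


def days_until_expiry(not_after, now):
    """Days between now and a certificate 'notAfter' string.

    not_after uses the format returned by getpeercert(),
    for example 'Jan  5 09:34:43 2018 GMT'. now must be timezone-aware.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host:port and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def needs_renewal(not_after, now, warn_days=WARN_DAYS):
    """True if the certificate expires within warn_days (or already has)."""
    return days_until_expiry(not_after, now) <= warn_days
```

Typical use would be `needs_renewal(fetch_not_after("www.example.com"), datetime.now(timezone.utc))` for each site you look after. Note that the default context refuses a certificate that has already expired, so a connection error is itself worth alerting on.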
        Data needs to be backed up:
        • Backups need to be stored offsite, period. No exceptions.
        • Don't rely on too much manual intervention when it comes to backup procedures. People make mistakes.
        • Make sure the backup software sends regular reports. You want to be able to notice a silent failure.
        • Make sure data is backed up as often as needed, and that old backups are kept for a reasonable period of time. Consider both practical and legal issues.
        • Make sure you know the backups actually work. Perform test restores regularly, and consider doing a mock disaster recovery.
        • Make sure a restore can be performed quickly enough to get data and systems back on line within a reasonable timeframe.
        • The backup and restore operations must be documented properly. Always keep in mind that disaster may (and probably will) strike when you're on holiday or out sick.
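To catch the silent failures mentioned above, it helps to check backup results independently of the backup software's own reports. This is a minimal sketch that assumes you can collect, per job, a list of (finish time, size in bytes) records; where those records come from (a report file, a database, a directory listing) depends entirely on your backup product. The 26-hour limit is an assumption for a nightly job with some slack.

```python
from datetime import datetime, timedelta


def backup_problems(jobs, now, max_age=timedelta(hours=26)):
    """Return human-readable problems found in the latest run of each job.

    jobs maps a job name to a list of (finished_at, size_bytes) tuples.
    """
    problems = []
    for name in sorted(jobs):
        runs = jobs[name]
        if not runs:
            problems.append(f"{name}: no backup recorded")
            continue
        # Only the most recent run matters for freshness.
        finished_at, size_bytes = max(runs, key=lambda run: run[0])
        if now - finished_at > max_age:
            problems.append(f"{name}: last backup finished {finished_at}, too old")
        elif size_bytes == 0:
            problems.append(f"{name}: last backup is empty")
    return problems


if __name__ == "__main__":
    # Hypothetical job history, for illustration only.
    history = {
        "sql": [(datetime(2024, 1, 10, 1, 0), 5_000_000)],
        "files": [(datetime(2024, 1, 7, 1, 0), 900_000)],
    }
    for line in backup_problems(history, datetime(2024, 1, 10, 8, 0)):
        print(line)
```

A check like this only tells you a backup exists and is recent. It is no substitute for the test restores in the list above.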
        And then you'll need to keep an eye on the network in general, to make sure there are no problems or bottlenecks (or impending disasters):
        • Keep an eye on available disk space everywhere. A full volume can mean anything from a minor annoyance to a stopped (or even crashed) database.
        • Consider collecting statistics from servers, routers and other networking equipment to find out if you're likely to run out of capacity in the near future. Also, look for spikes immediately following an application update or upgrade, or when a new system or application is introduced.
        • Monitor the event logs (and any other logs) for warnings and errors. Use an automated tool for this; no-one can be expected to read hundreds of pages of logs every week.
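For the disk space and capacity points, here is a small sketch of the idea: read the free space on a volume, and estimate from a few historical samples how long until it fills up. The estimate is a plain least-squares straight line through the samples, which is a simplifying assumption of mine; real growth is rarely that tidy, so treat the result as a rough warning rather than a forecast.

```python
import shutil


def percent_free(path):
    """Percentage of free space on the volume that contains path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def days_until_full(samples, capacity_bytes):
    """Estimate days until a volume is full from (day, used_bytes) samples.

    Fits a least-squares line to the samples and projects forward from
    the last sample. Returns None if there are fewer than two distinct
    days or if usage is not growing.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_day = sum(day for day, _ in samples) / n
    mean_used = sum(used for _, used in samples) / n
    spread = sum((day - mean_day) ** 2 for day, _ in samples)
    if spread == 0:
        return None
    slope = sum(
        (day - mean_day) * (used - mean_used) for day, used in samples
    ) / spread  # bytes of growth per day
    if slope <= 0:
        return None
    _, last_used = samples[-1]
    return (capacity_bytes - last_used) / slope


if __name__ == "__main__":
    print(f"{percent_free('.'):.1f}% free on the current volume")
```

Feed `days_until_full` one sample per day from whatever collects your statistics, and alert when the answer drops below the time it takes you to get new disks approved and delivered.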
        Originally posted by vesterlee2008
        What tools are available for getting this job done?
        Again, that's a difficult question to answer without knowing anything about the network in question.

        If you have three Windows domain controllers with some file data, no 3rd party server applications and perhaps an MSSQL database, then WSUS + storage reports + Performance Monitor + some decent backup software + Event subscriptions (and perhaps attached tasks) + management software from the server manufacturer should cover the basics. Add something like Icinga/Nagios and Cacti (all free), and you'll be able to monitor everything from the servers right down to the office copier.