
Need help with recovery from USN roll back


  • Need help with recovery from USN roll back

    3 servers in the domain:
    Location A:
    Windows 2000 SP4 DC (old), name: server2000
    Windows 2003 SP2 DC (server01)

    Location B:
    Windows 2003 SP2 DC (server02), with a VPN between locations A and B

    SQL 2005 on server01 and server02
    All servers replicate.
    All servers hold the global catalog.
    The production software uses SQL and replicates SQL every 20 minutes between locations.
    I can’t take the servers down during the day; I can only work when the locations are closed (from 10 PM to 2 AM).

    SQL on server01 started to report I/O errors and hang for 10-200 seconds.
    When I checked the RAID controller (3ware, RAID 1), it reported that drive 1 had a bad sector last month and that it was repaired.
    Just to rule out a drive issue, I replaced the drive with another one and rebuilt the RAID.
    I kept the old drive; let’s call it Drive-Wednesday.
    The I/O errors still hang the server for 10-200 seconds every 20-30 minutes.
    I ordered 2 x Raptor 10,000 RPM drives.
    On Friday night I broke the RAID and took 1 drive out to keep for emergencies.
    Let’s call it Drive-Friday.
    Instead of rebuilding the RAID with the new drives, I decided to image the old drive to the new drives and increase the size of the C partition (I only had 10% free on the C drive).
    Using Acronis True Image, I “ghosted” the original drive to the new drive while increasing the C partition.
    I connected the new drive to the motherboard (MB) and tried booting from it.
    The C partition was fine (50 GB), but the Data partition was empty.
    I formatted the new drive, left the original drive connected to the MB, and booted from it, so that users could work and to test whether a bad controller was causing the I/O delays.
    On Saturday (with 1 original drive connected to the MB and users on the system), they still got the I/O errors.
    This rules out the 3ware RAID controller as the cause of the I/O errors.
    On Saturday night I ghosted twice more from the original drive to the Raptor drive.
    Every attempt failed when I tried to increase the C drive while ghosting.
    I only succeeded when ghosting 1 to 1 without increasing the C partition size.
    Just for test purposes, I booted from the newly ghosted drive and converted it to a dynamic disk, trying to increase the size of C.
    I could increase the data partition but not C; it reported that C was the boot partition and cannot be increased.
    I formatted the Raptor again and left the original drive on the MB.
    On Sunday, the users reported the same problems.
    On Sunday night, I created a new RAID 1 with the Raptors and used True Image to ghost the original drive (connected to the MB) to the RAID 1.
    Ghosting 1 to 1 went OK, and I now have the new drives on the RAID.
    Since the new RAID is bigger, I created a new partition and moved “old junk data” from the C drive to the new partition.
    Now I have 40% free on the C drive.
    All looked fine, but when I try to replicate the DC, I get an error on server2000 and server02 saying that server01 refuses to accept the connection.
    Server02 and server2000 still replicate fine with each other.
    I checked the USNs, and it looks like server01 has the wrong USN.
    I know that I can demote server01, clean up AD on server02 and server2000, and promote it back.
    I am concerned about special shares on the Data drive that the production software uses; I might not have the correct permissions on them after the demote/promote.
    I checked my backup: NTBackup started on Friday night, so I don’t have an older system state to restore for server01.
    I have shadow copies, but it doesn’t look like I can open them.
    Event Viewer shows that the USNs do not match.
    Server02 and server2000 have matching USNs.
    Server01 has different USNs for the 2 other servers and for itself.
    I’ll post the errors in the next post.
    Since I am concerned about the demote, AD cleanup on 2 servers, and promote back, can I boot from one of the old drives (Drive-Wednesday or Drive-Friday), create a system state backup, and restore from it to the new drive?
    Can I copy other information that would help with the USN rollback?
    Or should I boot with Drive-Wednesday, check whether replication works and, if it does, demote server01 and promote it back?
    This would help clean up AD without using NTDSUTIL.

    Thank you,

    Roy Wagner

  • #2
    Re: Need help with recovery from USN roll back

    Here are the Event Viewer errors:

    Event Type: Error
    Event Source: NTDS General
    Event Category: Service Control
    Event ID: 2103
    Date: 8/21/2009
    Time: 11:26:24 PM
    Computer: SERVER01
    The Active Directory database has been restored using an unsupported restoration procedure.

    Active Directory will be unable to log on users while this condition persists. As a result, the Net Logon service has paused.

    Event Type: Error
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 2095
    Date: 8/21/2009
    Time: 11:26:24 PM
    Computer: SERVER01
    During an Active Directory replication request, the local domain controller (DC) identified a remote DC which has received replication data from the local DC using already-acknowledged USN tracking numbers.

    Because the remote DC believes it is has a more up-to-date Active Directory database than the local DC, the remote DC will not apply future changes to its copy of the Active Directory database or replicate them to its direct and transitive replication partners that originate from this local DC.

    If not resolved immediately, this scenario will result in inconsistencies in the Active Directory databases of this source DC and one or more direct and transitive replication partners. Specifically the consistency of users, computers and trust relationships, their passwords, security groups, security group memberships and other Active Directory configuration data may vary, affecting the ability to log on, find objects of interest and perform other critical operations.

    The most probable cause of this situation is the improper restore of Active Directory on the local domain controller.

    User Actions:
    If this situation occurred because of an improper or unintended restore, forcibly demote the DC.

    Remote DC:
    USN reported by Remote DC:
    USN reported by Local DC:
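
The condition that event 2095 describes can be sketched in a few lines. This is only an illustration, assuming the check reduces to comparing the highest USN a remote DC says it has already received from the local DC against the local DC's current highest USN. The function name and the sample numbers are hypothetical; the event as posted left the USN fields blank.

```python
def is_usn_rollback(usn_reported_by_remote: int, usn_reported_by_local: int) -> bool:
    """Return True when the remote DC has already acknowledged a higher USN
    than the local DC currently holds.

    Illustrative assumption: a local DC whose database was put back by a
    disk image (not a supported restore) ends up with a lower USN than the
    one its partners have already recorded for it.
    """
    return usn_reported_by_remote > usn_reported_by_local


# Hypothetical values, not taken from the event log in this thread.
remote_view_of_server01 = 15000   # what a partner says it received from server01
server01_current_usn = 12000      # what server01 holds after being re-imaged

if is_usn_rollback(remote_view_of_server01, server01_current_usn):
    print("server01 looks rolled back: partners will ignore its reused USNs")
```

With healthy replication the remote value is never ahead of the local one, so the function returns False.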

    Event Type: Error
    Event Source: NTDS Replication
    Event Category: Replication
    Event ID: 1863
    Date: 8/23/2009
    Time: 12:39:11 AM
    Computer: SERVER01
    This is the replication status for the following directory partition on the local domain controller.

    Directory partition:

    The local domain controller has not received replication information from a number of domain controllers within the configured latency interval.

    Latency Interval (Hours):
    Number of domain controllers in all sites:
    Number of domain controllers in this site:

    The latency interval can be modified with the following registry key.

    Registry Key:
    HKLM\System\CurrentControlSet\Services\NTDS\Parameters\Replicator latency error interval (hours)

    To identify the domain controllers by name, install the support tools included on the installation CD and run dcdiag.exe.
    You can also use the support tool repadmin.exe to display the replication latencies of the domain controllers in the forest. The command is "repadmin /showvector /latency <partition-dn>".
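
To compare what each DC reports, the repadmin output can be saved to a text file on every server and reduced to a simple DC-to-USN table. The sketch below assumes each vector line has the shape `<site>\<dc> @ USN <number> @ Time <timestamp>`; that layout, the site names, and the USN values in the sample are assumptions for illustration, so adjust the pattern to whatever your version of repadmin actually prints.

```python
import re
from typing import Dict

# Assumed line shape: <site>\<dc> @ USN <number> @ Time <timestamp>
VECTOR_LINE = re.compile(
    r"^\s*(?P<dc>\S+)\s+@\s+USN\s+(?P<usn>\d+)\s+@\s+Time\s+(?P<time>.+?)\s*$"
)


def parse_vector(output: str) -> Dict[str, int]:
    """Map each DC named in the saved output to the USN listed for it."""
    usns: Dict[str, int] = {}
    for line in output.splitlines():
        match = VECTOR_LINE.match(line)
        if match:  # lines that do not fit the assumed shape are skipped
            usns[match.group("dc")] = int(match.group("usn"))
    return usns


# Hypothetical sample text, not real output from the servers in this thread.
sample = r"""
SiteA\SERVER01    @ USN     12000 @ Time 2009-08-21 23:26:24
SiteA\SERVER2000  @ USN     48211 @ Time 2009-08-22 22:10:03
SiteB\SERVER02    @ USN     30977 @ Time 2009-08-22 22:12:41
"""

for dc, usn in sorted(parse_vector(sample).items()):
    print(dc, usn)
```

Running the same parse on the output captured from server02 and server2000 and comparing their entries for server01 against server01's own entry shows which side is ahead.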