Active Directory replication failures are easy to miss and leave domain controllers holding inconsistent copies of the directory: user password changes don't propagate, GPO updates stall, and object creation conflicts arise. This lab builds hands-on familiarity with the core tools a Tier 2 sysadmin uses to monitor and recover replication: baseline health checks with dcdiag and repadmin, a break/fix scenario, Event Log analysis, and documented incident closure using an After-Action Report (ITIL format).
Ran a full dcdiag /v on both domain controllers to establish a healthy baseline. Documented all passing tests. Key tests reviewed: Advertising, KccEvent, KnowsOfRoleHolders, MachineAccount, NCSecDesc, NetLogons, ObjectsReplicated, Replications, RidManager, SystemLog.
Directory Server Diagnosis
Performing initial setup:
Trying to find home server...
Home Server = DC01
* Identified AD Forest.
Starting test: Replications
* Replications Check
......................... DC01 passed test Replications
Starting test: Advertising
......................... DC01 passed test Advertising
Starting test: KccEvent
......................... DC01 passed test KccEvent
REM Baseline: all tests PASSED on both DC01 and DC02
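A baseline like this is easier to compare against later runs when the pass/fail lines are reduced to data. The following is a minimal sketch, assuming the dcdiag output has been saved to text and that result lines follow the "SERVER passed test NAME" shape shown above; the function names `summarize_dcdiag` and `failed_tests` are hypothetical helpers, not part of dcdiag.

```python
import re

# Matches dcdiag result lines such as
# "......................... DC01 passed test Replications"
RESULT_RE = re.compile(r"^\.+\s+(\S+)\s+(passed|failed)\s+test\s+(\S+)\s*$")


def summarize_dcdiag(text):
    """Return {(server, test): 'passed' or 'failed'} for every result line."""
    results = {}
    for line in text.splitlines():
        match = RESULT_RE.match(line.strip())
        if match:
            server, outcome, test = match.groups()
            results[(server, test)] = outcome
    return results


def failed_tests(results):
    """List the (server, test) pairs that did not pass, sorted for stable output."""
    return sorted(key for key, outcome in results.items() if outcome == "failed")


# Excerpt of the baseline output captured in this lab.
dcdiag_sample = """\
Starting test: Replications
* Replications Check
......................... DC01 passed test Replications
Starting test: Advertising
......................... DC01 passed test Advertising
Starting test: KccEvent
......................... DC01 passed test KccEvent
"""

baseline = summarize_dcdiag(dcdiag_sample)
print(len(baseline), "results parsed,", len(failed_tests(baseline)), "failed")
```

Saving the parsed baseline from each DC gives a simple reference point: any later run whose `failed_tests` list is non-empty differs from the known-good state.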
Used repadmin /showrepl to view per-partition replication status and repadmin /replsummary for a high-level pass/fail view across all DCs. Forced manual replication with repadmin /syncall and verified changes propagated within seconds.
C:\> repadmin /replsummary
Replication Summary Start Time: 2026-03-10 14:22:01
Beginning data collection for replication summary, this may take awhile:
.........................
Source DSA largest delta fails/total %% error
DC01 00h:03m:12s 0 / 3 0%
DC02 00h:03m:08s 0 / 3 0%
C:\> repadmin /syncall /AdeP
Syncing all NC's held on DC01.
Syncing partition: DC=corp,DC=local
CALLBACK MESSAGE: The following replication completed successfully:
From: DC02 To: DC01 NC: DC=corp,DC=local
SyncAll Finished Successfully.
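The replication summary rows can also be checked programmatically. This is a rough sketch, assuming rows in the exact "DSA hh:mm:ss fails / total pct" shape shown above (deltas that include a day component are not handled); `parse_replsummary`, `unhealthy`, and the 15-minute threshold are illustrative assumptions, not repadmin features or recommended values.

```python
import re
from datetime import timedelta

# One summary row, e.g. "DC01 00h:03m:12s 0 / 3 0%"
ROW_RE = re.compile(
    r"^\s*(?P<dsa>\S+)\s+"
    r"(?P<h>\d+)h:(?P<m>\d+)m:(?P<s>\d+)s\s+"
    r"(?P<fails>\d+)\s*/\s*(?P<total>\d+)\s+"
    r"(?P<pct>\d+)%?"
)


def parse_replsummary(text):
    """Extract one dict per DSA row; header and progress lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue
        rows.append({
            "dsa": match.group("dsa"),
            "largest_delta": timedelta(
                hours=int(match.group("h")),
                minutes=int(match.group("m")),
                seconds=int(match.group("s")),
            ),
            "fails": int(match.group("fails")),
            "total": int(match.group("total")),
        })
    return rows


def unhealthy(rows, max_delta=timedelta(minutes=15)):
    """Names of DSAs with any failed link or a delta above the chosen threshold."""
    return [r["dsa"] for r in rows if r["fails"] > 0 or r["largest_delta"] > max_delta]


replsummary_sample = """\
Replication Summary Start Time: 2026-03-10 14:22:01
Source DSA largest delta fails/total %% error
DC01 00h:03m:12s 0 / 3 0%
DC02 00h:03m:08s 0 / 3 0%
"""

summary_rows = parse_replsummary(replsummary_sample)
print(unhealthy(summary_rows))
```

For the baseline captured here both DCs report 0 failures out of 3 links, so nothing is flagged at the default threshold.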
Intentionally broke replication between DC01 and DC02 by blocking inbound TCP port 135 (the RPC endpoint mapper) via Windows Firewall on DC02. Observed the failure surface in repadmin /showrepl and in Event Viewer (Event IDs 1311, 1864, 1925). Confirmed the failure, then restored connectivity by removing the firewall rule.
C:\> netsh advfirewall firewall add rule name="BREAK-RPC" protocol=TCP dir=in localport=135 action=block
Ok.
REM -- On DC01: force sync attempt — should now fail
C:\> repadmin /syncall /AdeP
CALLBACK MESSAGE: The following replication FAILED:
From: DC02 To: DC01
Naming Context: DC=corp,DC=local
LDAP Error 58 (0x3a): The specified server cannot perform the requested operation.
C:\> repadmin /showrepl DC01
Last attempt @ 2026-03-10 15:17:32 was successful.
Last attempt @ 2026-03-10 15:44:01 FAILED, result 1722 (0x6ba):
The RPC server is unavailable.
REM -- Fix: remove blocking rule on DC02
C:\> netsh advfirewall firewall delete rule name="BREAK-RPC"
Deleted 1 rule(s).
C:\> repadmin /syncall /AdeP
SyncAll Finished Successfully.
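The "Last attempt" lines from repadmin /showrepl carry the two facts an incident timeline needs: when replication last worked and when it first failed. A minimal sketch of extracting them, assuming the line format shown above; `parse_attempts` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Matches both outcomes:
#   "Last attempt @ 2026-03-10 15:17:32 was successful."
#   "Last attempt @ 2026-03-10 15:44:01 FAILED, result 1722 (0x6ba):"
ATTEMPT_RE = re.compile(
    r"Last attempt @ (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?:was successful|FAILED, result (\d+) \((0x[0-9a-fA-F]+)\))"
)


def parse_attempts(text):
    """Return one dict per 'Last attempt' line, in the order they appear."""
    attempts = []
    for match in ATTEMPT_RE.finditer(text):
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        code = int(match.group(2)) if match.group(2) else None
        attempts.append({"when": when, "ok": code is None, "code": code})
    return attempts


showrepl_sample = """\
Last attempt @ 2026-03-10 15:17:32 was successful.
Last attempt @ 2026-03-10 15:44:01 FAILED, result 1722 (0x6ba):
The RPC server is unavailable.
"""

attempts = parse_attempts(showrepl_sample)
last_ok = max(a["when"] for a in attempts if a["ok"])
first_fail = min(a["when"] for a in attempts if not a["ok"])
# Time between the last good sync and the first recorded failure.
print("Gap between last success and first failure:", first_fail - last_ok)
```

In this lab the gap works out to 26 minutes 29 seconds, and the failure carries result code 1722 (0x6ba), the "RPC server is unavailable" error expected when port 135 is blocked.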
Reviewed the Directory Service event log in Event Viewer during and after the failure. Documented the key Event IDs that surface during replication failures and what each one indicates. Confirmed all errors cleared after the firewall rule was removed and replication recovered.
TimeWritten EventID Message
----------- ------- -------
3/10/2026 3:44:01 PM 1311 The Knowledge Consistency Checker (KCC) has detected that successive attempts to replicate...
3/10/2026 3:44:01 PM 1864 This is the replication status for the naming context DC=corp,DC=local from the source DC...
3/10/2026 3:44:01 PM 1925 The attempt to establish a replication link for the following writable directory partition failed.
# After fix — clean health check
PS> Get-EventLog -LogName "Directory Service" -EntryType Error -Newest 5
No events found matching the specified criteria.
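For the write-up, the exported event rows can be matched against the Event IDs of interest. This is a small sketch, assuming the three-column "TimeWritten EventID Message" text layout shown above with US-style timestamps; `parse_events` and `WATCHED_IDS` are hypothetical names, and the watch list simply repeats the IDs observed in this lab.

```python
import re
from datetime import datetime

# Event IDs observed in the Directory Service log during this lab's failure.
WATCHED_IDS = {1311, 1864, 1925}

# e.g. "3/10/2026 3:44:01 PM 1311 The Knowledge Consistency Checker (KCC) ..."
EVENT_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\s+(\d+)\s+(.*)$"
)


def parse_events(text):
    """Return one dict per event row; header and separator lines are skipped."""
    events = []
    for line in text.splitlines():
        match = EVENT_RE.match(line.strip())
        if match:
            events.append({
                "when": datetime.strptime(match.group(1), "%m/%d/%Y %I:%M:%S %p"),
                "id": int(match.group(2)),
                "message": match.group(3),
            })
    return events


events_sample = """\
TimeWritten EventID Message
----------- ------- -------
3/10/2026 3:44:01 PM 1311 The Knowledge Consistency Checker (KCC) has detected that successive attempts to replicate...
3/10/2026 3:44:01 PM 1864 This is the replication status for the naming context DC=corp,DC=local from the source DC...
3/10/2026 3:44:01 PM 1925 The attempt to establish a replication link for the following writable directory partition failed.
"""

events = parse_events(events_sample)
seen = {e["id"] for e in events}
print("Watched IDs present:", sorted(seen & WATCHED_IDS))
print("Watched IDs absent:", sorted(WATCHED_IDS - seen))
```

Running the same check on a post-fix export should leave all watched IDs absent, which matches the clean result recorded above.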
Documented the incident end-to-end in a structured After-Action Report following ITIL incident management principles. The AAR captures timeline, root cause analysis, impact, and corrective actions — the same format used in enterprise environments to close incidents and prevent recurrence.
- Established baseline AD health using dcdiag across all test categories on a 2-DC lab
- Monitored replication topology and forced manual sync using repadmin /showrepl, /replsummary, /syncall
- Intentionally introduced and confirmed a replication failure (RPC error 1722) via firewall block
- Diagnosed root cause using Event IDs 1311, 1864, 1925 in the Directory Service log
- Restored full replication and verified clean health with no errors in Event Viewer
- Produced a complete ITIL-format After-Action Report covering timeline, root cause, and corrective actions