New to DFS Replication, running into some issues

  • New to DFS Replication, running into some issues

    Hello. Let me start out by saying that I am the server specialist with the Kodiak Island Borough School District here in Alaska. We use the Jamf Casper Suite in our environment to manage our Mac computers. We have 7 remote sites that are only accessible by boat or plane, are connected via satellite links, and have RODCs at each location. AD replication is, for the most part, working well. In the past I have used a third-party program, AllWaySync, to schedule syncing of the Jamf file repository to each remote site. Syncing this repo out to each remote site saves time and their limited bandwidth when a client machine goes to install an application or run an update. Our domain controllers are primarily 2008 R2, with a single remote site having a 2012 R2 server since that school site recently restarted. I am currently trying to get DFS to work between our Jamf management server (2012 R2 member server) and a single remote site's RODC (2008 R2). If I can get this working well, I would like to roll it out to the remaining 6 remote sites.

    Both servers currently have the DFS Replication and DFS Namespaces role services installed. I created a new namespace from the Jamf management server, and then added both the local folder and the remote folder as targets in that namespace. I then went to the replication settings and created a new group, with the Jamf management server as the primary and the RODC in the remote site as read-only, as well as a schedule that respects bandwidth and time constraints. I have set up a 40GB staging cache and confirmed that the topology looks good, with connections both ways between the management server and the remote site. Last night was the first night that my schedule would have been applicable, and sure enough some files got replicated out. Only about 1.3GB of files have replicated so far, but I expected quite a bit more using 1MB of bandwidth all night long. Looking at the event viewer on the remote site DC, there are lots of DFS errors, and I'm not entirely sure why it would constantly be disconnecting and reconnecting. None of the IT resources that I work with have ever tried anything with DFS before, so I was hoping to reach out to the forums here and see if anyone could chime in with some insight as to what my errors mean or whether I may have missed part of the initial configuration. This is all new to me. Thanks!
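As a rough sanity check on the "expected quite a bit more" remark above, the ceiling for one night's transfer can be worked out by hand. This is only a sketch under stated assumptions: the post does not say how long the replication window is, so the 12-hour figure is hypothetical, and "1MB of bandwidth" is evaluated both as 1 Mbit/s and as 1 MByte/s because the post does not say which was meant. Protocol overhead, satellite latency, and DFSR compression are ignored.

```python
def expected_transfer_gb(rate_mbit_per_s: float, hours: float) -> float:
    """Upper bound on data moved in the window, in decimal gigabytes."""
    bits = rate_mbit_per_s * 1_000_000 * hours * 3600
    return bits / 8 / 1_000_000_000


# Hypothetical window length; the original post does not state it.
window_hours = 12

# Reading "1MB" as 1 megabit per second.
as_megabit = expected_transfer_gb(1, window_hours)

# Reading "1MB" as 1 megabyte per second (8 megabits per second).
as_megabyte = expected_transfer_gb(8, window_hours)

print(f"1 Mbit/s for {window_hours} h:  about {as_megabit:.1f} GB")
print(f"1 MByte/s for {window_hours} h: about {as_megabyte:.1f} GB")
```

Under either reading, a full 12-hour window would allow several times the 1.3GB that actually arrived, which is consistent with the connection dropping rather than the throttle being the limit.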

    The error that I am getting is that DFS stopped communicating with the partner due to an error: Event ID 5014, Error ID 1818.

    Photos of configuration and errors attached as well.



  • #2
    Probably a networking issue rather than a server issue

    https://support.microsoft.com/en-us/kb/976442


    • #3
      Thanks for the response Wullieb1, although it's not really what I wanted to hear.

      Our ISP manages our WAN links to our remote school sites, and it's very hard to troubleshoot intermittent connectivity issues with them. I've already confirmed that their business managed services team has no experience with DFSR at all. I did run the health check, and sure enough it shows all the files in the backlog and 0% as being replicated so far, even though a handful of files did replicate out. There was no further progress last night; it is still just the same small subset of files that replicated out the first night.

      Would you have any other tips that I could try as far as troubleshooting this? If I could identify an issue, I can work with our ISP to have it resolved. If not, then I appreciate your response!


      • #4
        What I would do in this instance is run a continuous ping over the line to see if it drops at all during that time.

        You might also want to check that your AD replication is working as expected, as problems there could potentially cause issues as well.


        • #5
          Thanks again for your reply Wullieb1. While I will occasionally run into a replication issue that needs attention because a site has been offline for a period of time, for the most part it's working well. I will start an ongoing ping and see how that behaves overnight.


          • #6
            So I've had an ongoing ping running 24x7 since last week from the target server to the source server, and I counted about 50 pings that failed throughout the entire logged time. Latency seems to be pretty consistent, with ping times ranging from 500-700 ms, which is pretty standard for the satellite links. Would you expect 50 failed pings over 3 days to essentially tank the DFS replication service? I'm seeing no issues with standard AD replication as of this morning either.
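To put the 50 failed pings in proportion, the loss rate can be estimated with simple arithmetic. This is a sketch, not a measurement: it assumes roughly one ping per second, which the post does not state, and `loss_percent` is a name made up for illustration.

```python
def loss_percent(failed: int, days: float, interval_s: float = 1.0) -> float:
    """Percentage of pings lost, given the failure count and logging period.

    interval_s is the assumed time between pings (an assumption, not a
    value taken from the thread).
    """
    total_pings = days * 86_400 / interval_s
    return 100 * failed / total_pings


# Figures from the post: about 50 failures over 3 days.
rate = loss_percent(failed=50, days=3)
print(f"Estimated packet loss: {rate:.3f}%")
```

At one ping per second, 3 days is about 259,200 pings, so 50 failures works out to roughly 0.02% loss. A slower ping interval would raise that percentage proportionally.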


            • #7
              Attached is a snippet of the error log on the source server. I have the schedule set to only replicate after school is out, overnight, and just before school starts. It seems to log error 516 roughly every hour, which states that it's doing a complete metadata refresh for namespace JSSDP. Also, I seem to be getting DFSR ID 5014 every night at 9PM stating that communication is stopping with the partner, with specific error 9033 (the request was cancelled by a shutdown). The actual server is not shutting down at 9PM, but I do have a local server backup task that runs at 9PM, which I have disabled to see if that clears out the specific DFSR error every night. Are there any concerns with deleting the namespace and replication group and starting over? Since this is all new to me, before I completely write off DFS as not a viable solution in our environment, I would like to try deleting the configuration entirely and setting it up from scratch.


              • #8
                Apparently the 516 error is a bug that can be safely ignored. Pain in the butt, however.

                Your pings seem fine, unless the failures come in batches, in which case that could indicate an outage.

                What does a health report in DFS say? Has it started syncing or finished syncing yet? You can do this via the DFS Management MMC.

                If it helps we have multiple replications happening with servers all over the globe with latency ranging from 50ms to 500ms and everything works fine for us.
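The "in batches" check mentioned above can be made concrete by looking for runs of consecutive failures in the ping log. This is a minimal sketch that assumes the log has already been reduced to a list of booleans (`True` for a reply, `False` for a timeout); the `failure_runs` helper and the sample data are hypothetical, not taken from the attached logs.

```python
from itertools import groupby


def failure_runs(results):
    """Return the length of each run of consecutive failed pings.

    results is an iterable of booleans: True = reply, False = timeout.
    """
    return [sum(1 for _ in group) for ok, group in groupby(results) if not ok]


# Hypothetical sample: one isolated drop, then three drops in a row.
sample = [True, True, False, True, False, False, False, True]

runs = failure_runs(sample)
longest = max(runs, default=0)
print(f"Failure runs: {runs}, longest: {longest}")
```

Mostly single-ping runs would suggest sporadic loss, while long runs would point to short outages that are worth raising with the ISP.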


                • #9
                  Well, that's a pretty persistent bug. Oh well. The missed pings seemed to be completely sporadic and not in any block or at any specific time. I'm thinking they are just due to the nature of satellite in Alaska: high winds and rain now and then, but nothing to actually address. I ran a health report from the source server, and while there aren't any critical errors, it shows a lot of backlog, which is pretty much what I would expect with DFS replication not happening. There is still a subset of files on the target server that did initially replicate out, a little over a gig of data, but I haven't seen that grow when checking each night. The target server has a DFS uptime of over 100 days. It looks like I never restarted it after the initial configuration, so I will try that tonight before going home. Alternatively, I have 6 other sites all connected via satellite, and I can try configuring replication from the same source server to a different site to see whether the problem is site-specific. I chose the first site as my initial test because it's the only one actually connected to the road system and has a slightly more reliable connection than the farthest remote sites.
                  Thanks again for your help!


                  • #10
                    The server is still looking for its initial replication to occur.

                    Maybe this will help

                    https://blogs.technet.microsoft.com/...ive-directory/
