Friday, February 02, 2018

Detect ArcGIS Feature Changes When Archiving is Not Enabled


Notes:
Use the information below at your own discretion and risk. Feel free to adjust, modify and leverage to meet your own requirements.
The method proposed in this post aims to be simple for easy adoption though it is highly recommended that it is performed by an experienced IT professional.
The approach here described is only viable if Archiving is not enabled for your Enterprise Geodatabase e.g. if you have Geometric Networks enabled which at this point cannot be combined with Archiving functionality. 
This was tested in MS SQL Server 2014.

Corporate geospatial environments typically include an array of Esri Enterprise Geodatabases:
Public-facing
  • PUBLICATION (read-only)
  • MOBILE       (read-write)
Internal
  • BUSINESS
  • WORKFLOW
  • NETWORK
  • RASTER
  • OTHERS
Complex ETL processes run periodically to keep data in sync across all these repositories. In order to move data around, these processes track deltas (data changes), that occurred in the database. These deltas are then synchronized with the relevant databases.

These ETL processes are heavily dependent on the ArcGIS Archiving functionality since it provides details about each delta including a unique identifier and dates when the change occurred. A delta can also have one of three possible states:
  • CREATE
  • UPDATE
  • DELETE
This information is critical for the synchronization processes.

The purpose of this article is to discuss what to do when Archiving is not available e.g. you are dealing with an Electrical Network that is part of a Geometric Network and archiving is not an option as it cannot be combined with datasets that already participate in a geometric network topology.   

There are 4 ways of addressing this issue:
  1. ArcGIS Replication
    • This can work in very specific use cases, but typically does not support complex scenarios where the databases don't share the same logic and schema
    • It is possible to write custom code to compare the database state with a replica generated in a previous run but this can become complex to implement and to maintain over time
  2. Inspecting the ArcGIS SDE adds 'A' and deletes 'D' versioning states
    • Complex to implement and maintain over time. Changes to the Esri gdb schema can break the implementation logic and become difficult to manage over time
  3. SQL Server Native Replication
    • SQL Server Replication including peer to peer (in two-ways) can be enabled in certain scenarios. Live changes can only be implemented in non-versioned databases. Versioned databases can be sync once reconcile, post and compress have been executed so that changes are moved to the base tables. I'll leave this discussion to a future article, but be aware that this sort of implementation also creates tight coupling between the schemas of all participating tables which can become difficult to manage over time
  4.  SQL Server Change Tracking
    • You can leverage the native SQL Server tracking capabilities to provide information about what rows have changed for a particular base table and the type of change done i.e. whether it was an INSERT, an UPDATE or a DELETE 
This article focuses in point 4 above and provides a basic example of how it can be implemented for software developers.

Steps
  1. Enable Change Tracking on the database e.g. 'EGDB2018'


  2. ALTER DATABASE EGDB2018  SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 7 DAYS,  AUTO_CLEANUP = ON)


  3. Enable 'Change Tracking' for an enterprise geodatabase base table e.g. "POLES"

  4. ALTER TABLE MASTER.POLES  ENABLE CHANGE_TRACKING WITH  (TRACK_COLUMNS_UPDATED = ON)


  5. Develop a script that runs the following logic (note that this process can include more logic for logging, backup, and conflict detection)
    • Initial run to detect deltas to copy
    • Keep track of the current database tracking state
    • For any subsequent run keep track of the current tracking state and detect changes since the last track state

  6. Initial Run (SQL Logic)

  7. USE EGDB2018

    -- 1 - Initial Run to detect deltas to copy --
    DECLARE @synchronization_version BigInt;
    SET @synchronization_version = CHANGE_TRACKING_CURRENT_VERSION();             

    -- 2 - Copy all deltas to new repository (optional) --SELECT
               *
    FROM MASTER.POLE

    -- 3 - Keep track of the initial change track state number --
    PRINT @synchronization_version  -- e.g. 4 


    GO


  8. Detect Deltas Going Forward

  9. USE EGDB2018

    -- 1 - Detect changes going forward. First keep track of current state --
    DECLARE @synchronization_version BigInt;
    SET @synchronization_version = CHANGE_TRACKING_CURRENT_VERSION();


    -- 2 - Obtain incremental changes since last run via the track state number e.g. 4 --
    DECLARE @last_synchronization_version BigInt;
    SET @last_synchronization_version = 4;
    SELECT
               P.OBJECTID,
               P.SYS_CHANGE_OPERATION,
               P.SYS_CHANGE_COLUMNS,
               P.SYS_CHANGE_CONTEXT
    FROM
               MASTER.POLE AS P
    RIGHT OUTER JOIN
               CHANGETABLE (CHANGES MASTER.POLE, @last_synchronization_version) AS PJ
                P.OBJECTID = PJ.OBJECTID
    GO


  10. The figure below, shows an example of an output for step 5 above, when only one record is updated since the last run.
    Possible values for SYS_CHANGE_OPERATIONS are:
    U - UPDATE
    I - INSERT
    D - DELETE
I hope this article helps understanding how MS SQL Server CHANGE TRACKING capabilities can be used for detecting deltas in the database in a consistent and efficient manner.