Merge Docs

Syncing Data Between You and Merge

Make sure you are efficiently pulling data from Merge's servers into your own.

How to Efficiently Sync Data Between Your Database and Merge

There are two main ways to pull data from Merge efficiently:

  • Webhooks
  • Polling with timestamp filters

To learn how to set up webhooks, check out the Webhooks guide in the API Basics section. However, we generally recommend that you do not rely solely on webhooks. Instead, also poll in the background on a less frequent basis, using timestamp filters, to ensure a consistent stream of data.

To learn how to poll Merge’s servers efficiently, follow the guide below on how to use the modified_after timestamp filter.


How to Sync Your Data with Timestamp Filters

Merge recommends syncing data in the following way:

  1. Create a set of functions in your backend that are responsible for syncing data

    These functions should use the modified_after timestamp filter, which is available on all Merge list endpoints. Using it is critical to ensure you aren’t pulling redundant data. The approach is simple, and it keeps the amount of data you pull small as you scale.

    The filter lets you pull only data that has been changed or created since your last sync. For example, you can request modified_after=2021-03-30T20:44:18.662942Z and pull only the items that are actually new or different.

    To do this, you need to store the time you last began a fetch to Merge’s API for the given linked account and endpoint.

    Note that it’s important to use the time the fetch began, as data can change during a fetch, and you want to be sure you pick up those changes at the next sync.

  2. Call your sync functions whenever Merge issues a sync webhook

    We recommend using our “first sync” and “any sync” notification webhooks rather than being notified about every model change through our “select data types” webhook. This will also help you scale more effectively.

    Whenever you receive a sync complete webhook, simply call the functions you created in step 1.

  3. Call your sync functions on a periodic cadence

    Engineering best practices dictate not relying entirely on webhooks. They can fail for a variety of reasons, such as downtime on your end or failed processing.

    Merge does attempt to redeliver webhooks multiple times using exponential backoff. However, we still recommend calling your sync functions on a periodic cadence of around once every 24 hours.
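
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Merge's reference implementation: `fetch_records` stands in for whatever HTTP client code you use to call a Merge list endpoint, the in-memory `store` dictionary stands in for a table in your database, and all function names are hypothetical. Only the modified_after filter and the "store the time the fetch began" rule come from the guide above.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical store: (linked_account_id, endpoint) -> time the last fetch BEGAN.
# In production this would be a table in your own database.
LastSyncStore = Dict[Tuple[str, str], datetime]

# Hypothetical fetcher: (linked_account_id, endpoint, query_params) -> records.
# It is expected to call the Merge list endpoint and follow pagination.
Fetcher = Callable[[str, str, Dict[str, str]], List[dict]]


def to_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime like 2021-03-30T20:44:18.662942Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def sync_endpoint(
    account_id: str,
    endpoint: str,
    fetch_records: Fetcher,
    store: LastSyncStore,
    now: Callable[[], datetime] = utc_now,
) -> List[dict]:
    """Step 1: pull only records changed since the previous sync began."""
    # Capture the start time BEFORE fetching, so that changes made while
    # the fetch is running are picked up by the next sync.
    started_at = now()

    params: Dict[str, str] = {}
    last_started = store.get((account_id, endpoint))
    if last_started is not None:
        params["modified_after"] = to_timestamp(last_started)

    records = fetch_records(account_id, endpoint, params)
    # ... upsert `records` into your own database here ...

    # Record the start time only after the fetch succeeded; if the fetch
    # raised, the old timestamp stays and the next run retries the window.
    store[(account_id, endpoint)] = started_at
    return records


def sync_all(
    account_id: str,
    endpoints: Iterable[str],
    fetch_records: Fetcher,
    store: LastSyncStore,
) -> None:
    """Steps 2 and 3: call this from both your sync webhook handler and
    your periodic (for example, daily) background job."""
    for endpoint in endpoints:
        sync_endpoint(account_id, endpoint, fetch_records, store)
```

Because the webhook handler and the scheduled job call the same function, a missed webhook only delays a sync until the next scheduled run, and a duplicate trigger simply pulls a small or empty set of changes.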


Other Important Notes


expand Query Parameter
Use expand to pull multiple related models in a single request. By default, Merge returns only the IDs of related objects. For example, if you are querying for candidates and also want details about their associated applications, you can pass expand=applications, and Merge will return the actual application objects instead of just the application IDs. This way, you don’t make multiple requests for what is ultimately related information.
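
As a small illustration, the query string for such a request could be built like this in Python. The helper name `build_list_path` is hypothetical, and joining several expanded fields with commas is an assumption to verify against the API reference; only the candidates example and the expand=applications parameter come from the text above.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_list_path(resource: str, expand: Iterable[str] = ()) -> str:
    """Build a relative list path such as /candidates?expand=applications.

    `resource` is the list endpoint name; `expand` names the related
    models to return as full objects instead of IDs.
    """
    fields = list(expand)
    if not fields:
        return f"/{resource}"
    # Assumption: multiple fields are passed as one comma-separated value.
    # safe="," keeps the commas readable instead of encoding them as %2C.
    query = urlencode({"expand": ",".join(fields)}, safe=",")
    return f"/{resource}?{query}"


path = build_list_path("candidates", expand=["applications"])
```

The resulting path can be appended to your API base URL and combined with the modified_after filter so that each sync pulls both the changed records and their related objects in one pass.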

GET /sync-status Endpoint
When you set up the linking flow, use the sync-status endpoint as a trigger so that you don't make any API calls for data before the sync is finished.
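
One way to use the endpoint as a trigger is to check it until every model reports that its sync is complete, and only then start pulling data. The sketch below is hypothetical: `fetch_statuses` stands in for your own code that calls GET /sync-status and returns the list of per-model entries, and the `status` field with a `"DONE"` value is an assumption about the response shape that you should confirm in the API reference.

```python
import time
from typing import Callable, Dict, List


def sync_finished(statuses: List[Dict[str, str]]) -> bool:
    """Return True when there is at least one entry and all are done.

    Assumption: each entry has a "status" field whose value is "DONE"
    once that model has finished syncing.
    """
    return bool(statuses) and all(s.get("status") == "DONE" for s in statuses)


def wait_until_synced(
    fetch_statuses: Callable[[], List[Dict[str, str]]],
    attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check sync status up to `attempts` times; True once syncing is done."""
    for attempt in range(attempts):
        if sync_finished(fetch_statuses()):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before checking again
    return False
```

If `wait_until_synced` returns False, the sync has not finished within the allotted checks, and it is safer to wait for the sync webhook or the next scheduled run than to start pulling partial data.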