Syncing Data Between You and Merge
How to Efficiently Sync Data Between Your Database and Merge
There are a few different ways to make sure you're efficiently pulling data from Merge:
- Webhooks
- Polling with timestamp filters
To learn how to set up webhooks, check out the Webhooks guide in the API Basics section. However, we generally recommend that you do not rely solely on webhooks. Instead, also poll in the background on a less frequent basis, using timestamp filters, to ensure a consistent stream of data.
To learn how to poll Merge’s servers efficiently, follow the guide below on how to use the modified_after timestamp filter.
How to Sync Your Data with Timestamp Filters
Merge recommends syncing data in the following way:
- Create a set of functions in your backend that are responsible for syncing data
These functions should use the modified_after timestamp filter that is available on all Merge list endpoints. It is critical that you use this filter to ensure you aren’t pulling redundant data. It’s a simple approach, and at scale it will make your life much easier.
This filter lets you pull only the data that has been changed (or created) since your last sync. For example, you can ask for modified_after=2021-03-30T20:44:18.662942Z and pull only the items that are actually new or different.
To do this, you need to store the time you last began a fetch to Merge’s API for the given linked account and endpoint.
Note that it’s important to use the time the fetch began, as data can change during a fetch, and you want to be sure you pick up those changes at the next sync.
- Call your sync functions whenever Merge issues a sync webhook
We recommend using our “first sync” and “any sync” notification webhooks rather than being notified about every model change through our “select data types” webhook. This will also help you scale more effectively.
Whenever you receive a sync complete webhook, simply call the functions you created in step 1.
- Call your sync functions on a periodic cadence
Engineering best practices dictate not relying entirely on webhooks. They can fail for a variety of reasons, such as downtime on your end or failed processing.
Merge does attempt to redeliver multiple times using exponential backoff. However, we still recommend calling your sync functions on a periodic cadence of around once every 24 hours.
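The three steps above can be sketched roughly as follows. This is a minimal illustration, not Merge's own client code: `sync_endpoint`, `iter_records`, `last_sync_started`, and the injected `fetch_page` and `save` callables are hypothetical names, the in-memory dictionary stands in for a database table, and the sketch assumes list responses carry a `results` array and a `next` cursor that is passed back as `cursor`.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Iterator, Optional, Tuple

# Hypothetical types for this sketch: a page fetcher takes query parameters
# and returns one decoded JSON page from a Merge list endpoint.
Params = Dict[str, str]
FetchPage = Callable[[Params], dict]

# In-memory stand-in for a database table keyed by (linked account, endpoint).
last_sync_started: Dict[Tuple[str, str], str] = {}


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


def iter_records(fetch_page: FetchPage, params: Params) -> Iterator[dict]:
    """Yield every record across pages (assumes `results` and `next` keys)."""
    cursor: Optional[str] = None
    while True:
        page_params = dict(params)
        if cursor:
            page_params["cursor"] = cursor
        page = fetch_page(page_params)
        yield from page.get("results", [])
        cursor = page.get("next")
        if not cursor:
            return


def sync_endpoint(
    account_id: str,
    endpoint: str,
    fetch_page: FetchPage,
    save: Callable[[dict], None],
    now: Callable[[], datetime] = utc_now,
) -> int:
    """Pull records changed since the last sync began; return how many."""
    key = (account_id, endpoint)
    # Record the time BEFORE fetching, so changes made during the fetch
    # are picked up by the next sync.
    started_at = now().strftime("%Y-%m-%dT%H:%M:%S.%fZ")
    params: Params = {}
    if key in last_sync_started:
        params["modified_after"] = last_sync_started[key]
    count = 0
    for record in iter_records(fetch_page, params):
        save(record)
        count += 1
    # Advance the stored time only after the whole fetch succeeded.
    last_sync_started[key] = started_at
    return count
```

Both your sync webhook handler and your periodic job (around once every 24 hours) would call the same `sync_endpoint` function for each linked account and endpoint. Because the stored time advances only after a successful fetch, a failed run is retried from the same starting point.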
Other Important Notes
Use the expand query parameter to retrieve related objects in full rather than just their id. For example, if you are querying for candidates and also want details about associated applications, you can pass expand=applications, and Merge will return the actual application objects instead of just the application_ids. This way, you don’t make multiple pulls for what is ultimately related information.
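As a small illustration of the expand parameter, the sketch below builds a list URL that requests candidates with their applications expanded. The helper `build_list_url` and the placeholder base URL are hypothetical, and joining several expand values with commas is an assumption you should confirm against the API reference.

```python
from typing import Dict, Optional, Sequence
from urllib.parse import urlencode


def build_list_url(
    base_url: str,
    endpoint: str,
    expand: Sequence[str] = (),
    modified_after: Optional[str] = None,
) -> str:
    """Build a list-endpoint URL with optional expand and modified_after."""
    query: Dict[str, str] = {}
    if expand:
        # Assumption: multiple related fields are passed comma-separated.
        query["expand"] = ",".join(expand)
    if modified_after:
        query["modified_after"] = modified_after
    url = f"{base_url.rstrip('/')}/{endpoint.strip('/')}"
    return f"{url}?{urlencode(query)}" if query else url


# Placeholder base URL for illustration only.
url = build_list_url(
    "https://api.example.com/v1", "candidates", expand=["applications"]
)
```

Combining expand with modified_after in the same request keeps each sync to a single pass over only the changed candidates and their related applications.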
Use sync-status to set up a trigger so that you don't make any API calls before the sync is finished.
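One way to gate your API calls on sync-status is sketched below. The function `sync_finished` is a hypothetical helper, and the sketch assumes each sync-status entry exposes a `model_name` and a `status` field, with `DONE` marking a completed sync. Check those field names and values against the API reference before relying on them.

```python
from typing import Iterable, List


def sync_finished(status_results: List[dict], required_models: Iterable[str]) -> bool:
    """Return True only if every required model reports a finished sync.

    Assumes entries look like {"model_name": "Candidate", "status": "DONE"}.
    """
    status_by_model = {
        entry.get("model_name"): entry.get("status") for entry in status_results
    }
    # A model missing from the results is treated as not yet synced.
    return all(status_by_model.get(model) == "DONE" for model in required_models)


# Example data in the assumed shape, not a real API response.
example_results = [
    {"model_name": "Candidate", "status": "DONE"},
    {"model_name": "Application", "status": "SYNCING"},
]
ready = sync_finished(example_results, ["Candidate", "Application"])
```

In this example `ready` is False because one required model is still syncing, so the sync functions would wait rather than call the list endpoints.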