Downloading the data and keeping it fresh

We update our data regularly to ensure up-to-date sanctions and PEP lists are available.

You can download bulk data extracts of the database directly from this website, without any login or API key. While the bulk data files are free to use for non-commercial users, commercial use of the data requires a data license.

To perform regular data updates, you can use one of the following methods:

  1. Re-fetching on a Schedule:

    • Re-fetch the latest version of the desired bulk data snapshot at scheduled intervals (e.g., every six hours or daily).
    • Use the URL format: https://data.opensanctions.org/datasets/latest/<dataset>/<format>
      • Replace <dataset> with the name of the dataset or collection.
      • Replace <format> with the file name of the format (e.g., entities.ftm.json or targets.simple.csv).
    • Note: We cannot predict the specific time at which new exports are published. Export cadences are available on the dataset overview page.
    • Recommended frequency: No more often than every six hours, to avoid excessive data transfers. If you require timely updates, use the metadata index checking mechanism to detect whether a new export was published.
  2. Metadata Index Checking:

    • Fetch the dataset metadata index to check for new dataset versions.
    • Access metadata at: https://data.opensanctions.org/datasets/latest/<dataset>/index.json
    • Use the version ID or last_export timestamp to determine if an update is needed.
    • Use the SHA1 checksums in the dataset.resources section to detect whether an export differs from the file that was previously published.
    • Recommended frequency: Every 30 minutes if you need to pick up new exports quickly.
  3. Delta Update Mechanism:

    • Use the delta update mechanism to retrieve incremental update files.
    • These files describe additions, modifications, and removals of entities between data export versions.
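
The scheduled re-fetch and the metadata index check can be combined: poll the small index file, and download the large export only when its checksum has changed. The following is a minimal sketch of that idea, not an official client. It assumes the index is a JSON object with a "resources" list whose entries carry "name" and "checksum" keys; those key names, and all function names, are illustrative assumptions, so inspect the actual index.json before relying on them.

```python
import hashlib
import json
import urllib.request
from typing import Optional


def fetch_index(url: str) -> dict:
    """Download and parse a metadata index from the given URL."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def find_resource(index: dict, file_name: str) -> Optional[dict]:
    """Return the resource entry for file_name, or None if it is not listed.

    Assumption: each resource is a dict with a "name" key.
    """
    for resource in index.get("resources", []):
        if resource.get("name") == file_name:
            return resource
    return None


def needs_update(index: dict, file_name: str, known_checksum: Optional[str]) -> bool:
    """True when the published checksum differs from the one stored locally.

    Assumption: the SHA1 checksum is stored under a "checksum" key.
    """
    resource = find_resource(index, file_name)
    if resource is None:
        return False  # nothing is published under that file name
    return resource.get("checksum") != known_checksum


def sha1_of_file(path: str) -> str:
    """Compute the SHA1 hex digest of a local file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A scheduled job would call fetch_index with the metadata index URL, pass the result to needs_update together with the checksum saved from the last download, and re-fetch the bulk file only when it returns True. After the download, sha1_of_file can confirm that the local copy matches the published checksum.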

Accessing Historical Data

If you require historical data, you can access past versions of each dataset or collection by specifying the desired date (in YYYYMMDD format) in the download URL. For example:

https://data.opensanctions.org/datasets/20231001/default/entities.ftm.json

This URL fetches the version of the dataset published on October 1, 2023. Historical data is available for most core sanctions lists from around July 2021 onward. Check the "Date Added" on the dataset profile page for specific availability details.

A shortcut is available to download the latest published version of a dataset:

https://data.opensanctions.org/datasets/latest/default/entities.ftm.json
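
Both URL forms follow the same pattern, with either a YYYYMMDD date or the word "latest" in the version position. As a small illustration, the hypothetical helper below (export_url is not part of any official library) builds either form from a dataset name, a file name, and an optional date.

```python
from datetime import date
from typing import Optional

BASE_URL = "https://data.opensanctions.org/datasets"


def export_url(dataset: str, file_name: str, day: Optional[date] = None) -> str:
    """Build a bulk data URL for a dated export, or for the latest one.

    When day is None, the "latest" shortcut is used instead of a date.
    """
    version = day.strftime("%Y%m%d") if day is not None else "latest"
    return f"{BASE_URL}/{version}/{dataset}/{file_name}"


historical = export_url("default", "entities.ftm.json", date(2023, 10, 1))
latest = export_url("default", "entities.ftm.json")
```

Here historical and latest reproduce the two example URLs shown above. Whether an export actually exists for a given date still has to be checked against the dataset's availability.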

Handling Entity Data Deletions and ID Changes

When an entity is no longer present in the source data, it will not appear in subsequent exports, and the total count of entities in entities.ftm.json will decrease accordingly. The full export files do not contain explicit markers for deleted entities; removals are described only in the delta update files.
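
Because the full exports carry no deletion markers, one way to detect removals is to compare the set of entity IDs in the previous export with the set in the current one. The sketch below assumes that entities.ftm.json contains one JSON object per line and that each object has an "id" key; both are assumptions to verify against the bulk data documentation, and the function names are illustrative.

```python
import json
from typing import Iterable, Set


def read_entity_ids(lines: Iterable[str]) -> Set[str]:
    """Collect entity IDs from line-delimited JSON, skipping blank lines."""
    ids = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        ids.add(json.loads(line)["id"])
    return ids


def removed_ids(previous: Set[str], current: Set[str]) -> Set[str]:
    """IDs that were present in the previous export but not in the current one."""
    return previous - current
```

For real files, pass an open file handle for each export to read_entity_ids, since iterating over a text file yields its lines. An ID that disappears has not necessarily been deleted at the source: it may have changed, for example through a merge, so treat the result as a list of candidates to review.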

Entity IDs can change for several reasons:

  • Merging Duplicates: Entities from multiple sources may be merged, resulting in a new cluster ID.
  • Source Updates: Changes in source data or processing methods may alter entity IDs.

To manage these changes, it's advisable to track both the entity's primary ID and the referents list to maintain consistency. This approach helps you avoid duplicate alerts and ensures you're referencing the correct entities. For detailed guidance, refer to our Identifiers Documentation.
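
One way to apply this advice is to build a lookup table from every known identifier to the entity's current primary ID, and to resolve the IDs stored in your own system through it. The sketch below is an illustration under assumptions: it expects each entity to be a dict with an "id" key and an optional "referents" list of other identifiers for the same entity. The function names are hypothetical; consult the Identifiers Documentation for the authoritative semantics.

```python
from typing import Dict, Iterable, Optional


def build_id_lookup(entities: Iterable[dict]) -> Dict[str, str]:
    """Map each primary ID and each referent to the current primary ID."""
    lookup: Dict[str, str] = {}
    for entity in entities:
        primary = entity["id"]
        # A primary ID always resolves to itself.
        lookup[primary] = primary
        for referent in entity.get("referents", []):
            # Keep an existing mapping so a primary ID is never overwritten.
            lookup.setdefault(referent, primary)
    return lookup


def resolve(stored_id: str, lookup: Dict[str, str]) -> Optional[str]:
    """Return the current primary ID for a stored ID, or None if unknown."""
    return lookup.get(stored_id)
```

Resolving stored IDs this way before raising an alert helps avoid treating a merged or re-keyed entity as a new one. A result of None means the stored ID appears neither as a primary ID nor as a referent in the current export.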

Additional Information

  • Data Completeness: Even if no changes have occurred at the source, data files are re-exported to ensure you have the most recent confirmed state.
  • Data Deletion Policy: We reflect deletions from source data, typically within one week of the change.
  • Data Formats and Documentation: For more details on data formats and how to work with them, refer to the bulk data documentation.