The datasets published by the project are made available in multiple formats, suitable for different purposes.
If you would like to see another file format or slice of the data included in this project, please get in touch to discuss your idea.
Unfortunately, the structure of persons of interest data does not easily lend itself to a simple tabular form. For example, a person might have multiple nationalities, or have been a member of several political parties in their career.
The "Simplified CSV" format addresses this by presenting a highly limited view of the data, in which only a select set of key columns is provided. These include:
id: the unique identifier of the given entity.
schema: the entity type.
name: the display name of the given entity.
aliases: any alias names (e.g. other scripts, nom de guerre) provided by the data sources.
birth_date: for people, their birth date.
countries: a list of countries linked to this entity. Includes countries of residence, nationalities and corporate jurisdictions.
addresses: a list of known addresses for the entity.
identifiers: identifiers such as corporate registrations, passport numbers or tax identifiers linked to the entity.
sanctions: details regarding the sanctions designation, if any.
phones: a list of phone numbers in E.164 format.
emails: a list of email addresses linked to the entity.
dataset: the dataset this entity is in.
last_seen: the last time this entity was observed in source data.
first_seen: the earliest date this entity has been noticed by OpenSanctions.
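As a rough illustration of how the columns above might be consumed, the following sketch reads a simplified CSV with Python's standard csv module. The sample row is invented, and the assumption that multi-valued columns (aliases, countries and so on) are joined with a semicolon inside one cell is not stated in this document; check a real export before relying on it. The helper names split_multi and read_simplified_csv are hypothetical.

```python
import csv
import io

# Invented sample data for illustration; a real export has many more rows.
SAMPLE = '''id,schema,name,aliases,birth_date,countries,addresses,identifiers,sanctions,phones,emails,dataset,last_seen,first_seen
ex-1,Person,Jane Doe,J. Doe;Jane D.,1970-01-01,de;fr,,,,,,example,2024-01-02,2023-01-01
'''

# ASSUMPTION: multi-valued columns are joined with ";" inside a single cell.
MULTI_VALUED = ("aliases", "countries", "addresses", "identifiers", "phones", "emails")


def split_multi(cell, sep=";"):
    """Split one cell into a list of non-empty, stripped values."""
    return [part.strip() for part in cell.split(sep) if part.strip()]


def read_simplified_csv(handle):
    """Yield one dict per row, with multi-valued columns turned into lists."""
    for row in csv.DictReader(handle):
        for column in MULTI_VALUED:
            row[column] = split_multi(row.get(column) or "")
        yield row


rows = list(read_simplified_csv(io.StringIO(SAMPLE)))
print(rows[0]["name"], rows[0]["countries"])
```

For a file on disk, the same function could be called with a handle opened as open(path, newline="", encoding="utf-8"), assuming the export is UTF-8 encoded.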
Further technical notes:
,(comma) as a delimiter, encoded as
The simplest format we publish is a plain text file listing the names of all persons and companies targeted in each dataset, one name per line. The format can be used for:
The plain text files are encoded in utf-8. If non-Latin names don't show up correctly in your application, make sure you've opened the file with the correct encoding.
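A minimal sketch of reading such a names file in Python with the encoding stated explicitly, so the result does not depend on the platform default. The file name names.txt and the helper read_names are hypothetical, and the demo writes its own throwaway file rather than using a real export.

```python
import os
import tempfile


def read_names(path):
    """Return the non-empty lines of a names file, one name per entry."""
    # Passing the encoding explicitly avoids relying on the platform default,
    # which is not UTF-8 on every system.
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Demo with a temporary file standing in for a downloaded export.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "names.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Иван Иванов\nJane Doe\n")
    names = read_names(path)

print(names)
```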
We offer two JSON-based export formats, both based on FollowTheMoney (FtM). They are a close representation of the internal data structure of OpenSanctions. The nested JSON format is the recommended import method for software-based data consumers.
Both formats use line-delimited JSON: each line of the exported files is a separate entity. While the FollowTheMoney entities (entities.ftm.json) export contains one entity per line, the nested JSON (targets.nested.json) format contains one line per target, with adjacent entities (e.g. addresses, sanctions) nested inside the properties section of the data structure.
The nested format and some of the provided metadata (last_seen) are not part of FtM, but extensions developed for OpenSanctions.
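The line-delimited layout described above can be read one line at a time with the standard json module. The sketch below is illustrative only: the two sample lines are invented, the property names used in them (name, addressEntity, full) are placeholders for whatever a real export contains, and the helpers iter_entities and nested_entities are hypothetical. It assumes each entity is an object with id, schema and a properties mapping whose values are lists, and that nested entities appear as objects inside those lists.

```python
import io
import json

# Invented sample lines standing in for a nested export; not real data.
SAMPLE = "\n".join([
    json.dumps({
        "id": "ex-1",
        "schema": "Person",
        "properties": {
            "name": ["Jane Doe"],
            "addressEntity": [
                {
                    "id": "ex-addr-1",
                    "schema": "Address",
                    "properties": {"full": ["1 Example Street"]},
                }
            ],
        },
    }),
    json.dumps({
        "id": "ex-2",
        "schema": "Company",
        "properties": {"name": ["Example Ltd"]},
    }),
])


def iter_entities(handle):
    """Parse line-delimited JSON: one entity per non-empty line."""
    for line in handle:
        line = line.strip()
        if line:
            yield json.loads(line)


def nested_entities(entity):
    """Yield entities nested directly inside the properties of a target."""
    for values in entity.get("properties", {}).values():
        for value in values:
            if isinstance(value, dict):
                yield value


targets = list(iter_entities(io.StringIO(SAMPLE)))
for target in targets:
    nested = [item["schema"] for item in nested_entities(target)]
    print(target["id"], target["schema"], nested)
```

Reading line by line keeps memory use flat for large exports, since only one entity is parsed at a time when iter_entities is consumed lazily instead of being collected into a list.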
Some further documentation regarding FtM tooling: