Changelog

See updates across the data model, metadata structure, and API of our service. Breaking changes that require updates to data consumer applications are announced prior to their implementation.


#22 Case-Insensitive Name/Alias Collapsing

Effective date: took effect on
Components affected: Export formats
Announcement:

We are starting to deduplicate name variants that differ only in letter case. If an entity lists several names whose only variation is capitalization (e.g. VLADIMIR PUTIN, Vladimir Putin, vladimir putin), we will keep just the variant closest to title case (Vladimir Putin) and omit the others.
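A minimal sketch of this collapsing rule, assuming that "closest to title case" means the variant with the fewest characters differing from the output of Python's `str.title()`. The function names are hypothetical, and the actual export pipeline may rank variants differently:

```python
def title_case_distance(name: str) -> int:
    """Count the characters that differ from the title-cased form of the name."""
    return sum(1 for a, b in zip(name, name.title()) if a != b)


def collapse_case_variants(names: list[str]) -> list[str]:
    """Keep one name per group of variants that differ only in letter case."""
    groups: dict[str, list[str]] = {}
    for name in names:
        # casefold() provides a case-insensitive grouping key
        groups.setdefault(name.casefold(), []).append(name)
    # Assumption: the variant nearest to title case wins; ties keep the first seen
    return [min(variants, key=title_case_distance) for variants in groups.values()]


names = ["VLADIMIR PUTIN", "Vladimir Putin", "vladimir putin", "Владимир Путин"]
print(collapse_case_variants(names))
```

Names in other scripts, or names that differ by more than capitalization, fall into separate groups and are left untouched.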

Why? Reduces noise and file size, and makes downstream matching more intuitive. Scope affected: all exports (CSV, JSON, Senzing) and API responses; you will see fewer aliases. Compatibility: no schema changes, only redundant values are removed. If you relied on the presence of case-only variants, review your unit tests.

We are also removing some invalid name values from the dataset, including names that consist of a single non-letter character.
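As a rough illustration of one such cleanup rule, the sketch below drops names that are a single character which is not a letter. This is only one reading of the rule; the helper name is hypothetical and the real validation may cover more cases:

```python
def is_single_nonletter(name: str) -> bool:
    """True for names made of exactly one character that is not a letter."""
    stripped = name.strip()
    return len(stripped) == 1 and not stripped.isalpha()


names = ["-", "7", "Q", "Li", "Vladimir Putin"]
# Single letters such as "Q" are kept under this reading of the rule
cleaned = [name for name in names if not is_single_nonletter(name)]
print(cleaned)
```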

#18 Notice: do not use the `all` dataset

Effective date: took effect on
Components affected: Datasets, Export formats
Announcement:

We strongly advise all customers who are using the `all` dataset to use `default` instead.

The `all` dataset is an internal artifact of our data infrastructure. Other than having a pleasing name, it is a strictly inferior data product. As of February 2025, we've reduced the update frequency of `all` to monthly, which means that users of the `all` scope will be working with outdated data in their systems.

`all` also includes data meant for internal verification and testing purposes, which should not be included in production systems (unless you have a regulatory requirement to screen for the 1990 "Die Hard" movie villain, John Gruber).

#13 Generating target files from relevant topics

Effective date: took effect on
Components affected: Data model, Export formats
Announcement:

We're phasing out the use of the target flag throughout the system, and switching the export formats that are based on target to use a defined list of topics as their source of truth.

A binary flag (target) is insufficient to describe which entities are associated with risk. For the past few months, we've been recommending the use of topics to decide if a match is relevant (e.g. as a PEP or sanctioned entity). However, some export formats, such as targets.nested.json and targets.simple.csv, still use the target flag to decide which entities to include.

On January 15, 2025, we will switch these two export formats (targets.nested.json and targets.simple.csv) to include any entity tagged with one of the topics listed below. This is guaranteed to include all current targets, but will also bring in additional entities that have topics assigned but are not marked as targets. In short: the new exports will be more correct, and a bit larger.

As a result, the targets.nested.json export of the default dataset will become equivalent to the topics.nested.json export of the same collection. The topics.nested.json export can be used for testing until the change becomes effective on January 15, 2025. We will remove the topics.nested.json export format on February 15, 2025, and only generate the file named targets.nested.json going forward.

Topics included in new target definition:

  • corp.disqual
  • crime.boss
  • crime.fin
  • crime.fraud
  • crime.terror
  • crime.theft
  • crime.traffick
  • crime.war
  • crime
  • debarment
  • export.control
  • export.risk
  • poi
  • reg.action
  • reg.warn
  • role.oligarch
  • role.pep
  • role.rca
  • sanction.counter
  • sanction.linked
  • sanction
  • wanted
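
For consumers who want to reproduce this selection in their own code, the sketch below checks an entity against the topic list above. It assumes entities are dictionaries with a `properties.topics` list, as in the nested JSON exports; the function name and the sample records are hypothetical:

```python
# Topics from the list above that qualify an entity for the target exports
TARGET_TOPICS = frozenset({
    "corp.disqual", "crime.boss", "crime.fin", "crime.fraud", "crime.terror",
    "crime.theft", "crime.traffick", "crime.war", "crime", "debarment",
    "export.control", "export.risk", "poi", "reg.action", "reg.warn",
    "role.oligarch", "role.pep", "role.rca", "sanction.counter",
    "sanction.linked", "sanction", "wanted",
})


def is_target(entity: dict) -> bool:
    """True if the entity carries at least one of the listed topics."""
    # Assumption: topics live under entity["properties"]["topics"]
    topics = entity.get("properties", {}).get("topics", [])
    return not TARGET_TOPICS.isdisjoint(topics)


# Hypothetical sample records for illustration only
entities = [
    {"id": "a", "properties": {"topics": ["role.pep"]}},
    {"id": "b", "properties": {"topics": ["fin.bank"]}},  # topic not in the list
    {"id": "c", "properties": {}},
]
print([entity["id"] for entity in entities if is_target(entity)])
```

Matching on exact topic names, rather than on prefixes such as `crime.`, keeps the check aligned with the published list if new sub-topics are introduced later.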