Deploying yente in your infrastructure

yente is an open source data match-making API. It provides functions to search, retrieve or match FollowTheMoney entities, including people, companies or vessels that are subject to international sanctions.


Running yente requires a server that can host the main screening application (a lightweight Python application) and the ElasticSearch backend used to store and query entity information. In total, we anticipate 500 MB of memory per Python service, plus 2-4 GB of memory and 8-10 GB of disk volume for the ElasticSearch index. Running ElasticSearch on SSD-backed storage will produce a significant performance gain.
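
To make the sizing figures concrete, the following is a rough sketch that adds them up for a given number of Python services. The function name estimate_memory_mb is hypothetical, and the calculation assumes a single ElasticSearch node and 1 GB = 1024 MB; treat the result as a planning range, not a guarantee.

```shell
#!/bin/sh
# Rough memory estimate (in MB) for N Python API services plus one
# ElasticSearch node, using the figures quoted in the text above.
# Assumption: 1 GB is counted as 1024 MB.
estimate_memory_mb() {
    services="${1:-1}"
    api_mb=$((services * 500))    # 500 MB per Python service
    low=$((api_mb + 2 * 1024))    # ElasticSearch lower bound: 2 GB
    high=$((api_mb + 4 * 1024))   # ElasticSearch upper bound: 4 GB
    echo "${low}-${high} MB"
}

estimate_memory_mb 2   # prints "3048-5096 MB"
```

Disk is not included here: the 8-10 GB volume for the ElasticSearch index has to be provisioned separately.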

While it is possible to operate yente outside of Docker, we strongly encourage the use of containers as a simple means of dependency management and deployment.

Using docker-compose

In order to deploy yente on your own servers, we recommend you use docker-compose (or another Docker orchestration tool) to pull and run the pre-built containers. For example, you can download the docker-compose.yml in the repository and use it to boot an instance of the system:

mkdir -p yente && cd yente
# Place the docker-compose.yml from the yente repository in this directory, then:
docker-compose up

This will make the service available on port 8000 of the local machine. When the service is first started, you may have to wait five to ten minutes until it has finished indexing the data.
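
Because the first start can take several minutes, it can help to poll the service before sending real requests. The sketch below defines a hypothetical helper, wait_for_url, that retries a URL with curl until it returns a successful response. The /healthz path shown in the usage comment is an assumption; substitute whichever endpoint your yente version exposes. Note also that a successful HTTP response shows the API is up, not necessarily that indexing has completed.

```shell
#!/bin/sh
# Poll a URL until it answers successfully or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# Example (path is an assumption): wait_for_url http://localhost:8000/healthz 60 10
wait_for_url() {
    url="$1"
    attempts="${2:-60}"   # default: 60 tries
    delay="${3:-10}"      # default: 10 seconds between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors as failure; -s: silent; -o /dev/null: discard body
        if curl -fs -o /dev/null "$url"; then
            echo "ready after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "not ready after $attempts attempt(s)" >&2
    return 1
}
```

With the defaults of 60 attempts and a 10-second delay, the helper waits up to about ten minutes, which matches the upper end of the indexing time mentioned above.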

Using Kubernetes

If you run the containers in a cluster management system like Kubernetes, you will need to run both of the services defined in the compose file (the API and the ElasticSearch instance). You may also need to grant the API container network policy permissions to fetch data from its upstream source once every hour so that it can update itself.