aleph: find the people and companies you look for

Aleph

Aleph is a tool for indexing large amounts of both documents (PDF, Word, HTML) and structured (CSV, XLS, SQL) data for easy browsing and search. It is built with investigative reporting as a primary use case. Aleph allows cross-referencing mentions of well-known entities (such as people and companies) against watchlists, e.g. from prior research or public datasets.
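
The idea of cross-referencing can be illustrated with a minimal sketch. This is not Aleph's implementation or API: the function names `normalize` and `cross_reference`, the sample documents, and the watchlist below are all hypothetical, and simple substring matching on normalized text stands in for real named entity extraction.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse other characters to single spaces.

    Assumption: only ASCII letters and digits survive, which is enough for a sketch.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", without_accents.lower()).strip()


def cross_reference(documents: dict[str, str], watchlist: list[str]) -> dict[str, list[str]]:
    """Map each watchlist name to the sorted IDs of documents that mention it."""
    # Pad with spaces so that names only match on whole-word boundaries.
    padded_docs = {doc_id: f" {normalize(body)} " for doc_id, body in documents.items()}
    matches: dict[str, list[str]] = {}
    for name in watchlist:
        key = normalize(name)
        if not key:
            continue  # skip names that are empty after normalization
        needle = f" {key} "
        hits = sorted(doc_id for doc_id, body in padded_docs.items() if needle in body)
        if hits:
            matches[name] = hits
    return matches


if __name__ == "__main__":
    docs = {
        "memo.txt": "Payment approved by José Pérez of Acme Holdings Ltd.",
        "note.txt": "Nothing of interest here.",
    }
    print(cross_reference(docs, ["Jose Perez", "ACME Holdings", "Globex"]))
```

A production system would also need to handle name variants, transliteration, and fuzzy matching, which this sketch leaves out.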

Here are some key features:

  • Web-based search across large documents and data sets.
  • Imports many file formats, including popular office formats, spreadsheets, email, and zipped archives. Processing includes optical character recognition, language and encoding detection, and named entity extraction.
  • Load structured entity graph data from databases and CSV files. This allows the navigation of complex datasets like company registries, sanctions lists, or procurement data. Import tools for OpenSanctions are included.
  • Receive notifications for new search matches with a personal watchlist.
  • OAuth authorization and access control on a per-source and per-watchlist basis.

Download && Tutorial