The Apache Software Foundation announces that Apache Griffin has graduated to a Top-Level Project

The Apache Software Foundation officially announced on December 12 that Apache Griffin has graduated as an Apache Top-Level Project.

Apache Griffin is a powerful open source big data quality solution for distributed data systems of any size. It provides a unified process for defining and measuring the quality of data sets from different perspectives, as well as for building and validating trusted data assets in both streaming and batch contexts. Griffin originated at eBay China and entered the Apache Incubator in December 2016.


William Guo, Vice President of Apache Griffin, said: “We are very proud of Griffin reaching this important milestone. By actively improving Big Data quality, Griffin helps build trusted data assets, therefore boosting your confidence in your business.”

Apache Griffin enables data scientists and analysts to handle data quality issues by:
  • Defining – specifying data quality requirements such as accuracy, completeness, timeliness, and profiling;
  • Measuring – applying data quality measurements, based on the user-defined requirements, to source data ingested into the Griffin computing cluster (a conceptual sketch of this step follows the list); and
  • Applying Metrics – exporting the resulting data quality reports as metrics to a designated destination.
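To make the "Measuring" step more concrete, below is a minimal sketch, written against Apache Spark (on which Griffin's measure engine runs), of what an accuracy measure conceptually computes: the share of source records that find an exact match in a target data set. This is not Griffin's actual API or configuration format, and all data, table, and column names are hypothetical.

// Conceptual sketch of an "accuracy" data quality measure on Spark.
// Not Griffin's API; names and data are illustrative only.
import org.apache.spark.sql.SparkSession

object AccuracySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("accuracy-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Illustrative source and target data sets
    val source = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")
    val target = Seq((1, "a"), (2, "b")).toDF("id", "value")

    // Source records that have an exact match in the target
    val matched = source.join(target, Seq("id", "value"), "left_semi").count()
    val total   = source.count()

    // Accuracy metric: matched / total, reported as a percentage
    val accuracy = if (total == 0) 0.0 else matched.toDouble / total * 100
    println(f"accuracy = $accuracy%.2f%%")

    spark.stop()
  }
}

In Griffin itself, such measures are defined declaratively and executed by the platform; the snippet only illustrates the metric being computed.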

In addition, Griffin allows users to easily incorporate new requirements into the platform and write more comprehensive logic to further define their data quality.

Apache Griffin is currently used in high-volume, high-demand environments at companies such as NetEase (163.com), eBay, Expedia, Huawei, JD, Meituan, PayPal, Ping An Bank, PPDAI, VIP.com, and VMware.