Name
Proactive Data Quality Management: Implementing Automated Checks with Deequ and ML
Date & Time
Tuesday, May 21, 2024, 3:15 PM - 3:45 PM
Description

This talk covers how large organizations automate data quality and reliability checks on big datasets in real time, using data profiling and machine learning techniques. The demo will use the open source library Deequ, the Apache Spark framework, and reporting and notification tools to flag data issues proactively. Akshay will walk through an example of a framework developed at Amazon and Visa to validate customer-facing data, along with its integration with notification tools based on statistical methods.

Akshay Jain