2 Apr 2025, Wed

Data Catalog & Governance

Data Catalogs

  • Apache Atlas: Data governance and metadata framework
  • DataHub: Metadata platform for the modern data stack
  • Amundsen: Data discovery and metadata engine
  • Alation: Data intelligence platform
  • Collibra: Enterprise data governance and catalog platform
  • Microsoft Purview (formerly Azure Purview): Unified data governance service
  • AWS Glue Data Catalog: Central metadata repository used by AWS Glue, Athena, and other AWS analytics services
  • Google Cloud Data Catalog: Fully managed, scalable metadata management service
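
What the catalogs above have in common is a searchable registry of dataset metadata. The sketch below illustrates that idea in plain Python. It does not use the API of any listed product; DatasetEntry, Catalog, and the sample datasets are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Hypothetical catalog record: metadata describing one dataset."""
    name: str
    owner: str
    description: str = ""
    tags: set = field(default_factory=set)
    columns: dict = field(default_factory=dict)  # column name -> type


class Catalog:
    """Toy in-memory registry; real catalogs persist and index this metadata."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Later registrations under the same name overwrite earlier ones.
        self._entries[entry.name] = entry

    def search(self, term):
        """Return sorted names of datasets whose metadata mentions `term`."""
        term = term.lower()
        hits = [
            e for e in self._entries.values()
            if term in e.name.lower()
            or term in e.description.lower()
            or any(term == t.lower() for t in e.tags)
            or any(term in c.lower() for c in e.columns)
        ]
        return sorted(e.name for e in hits)


catalog = Catalog()
catalog.register(DatasetEntry(
    "sales.orders", "finance-team", "One row per customer order",
    {"pii"}, {"order_id": "bigint", "customer_email": "string"},
))
catalog.register(DatasetEntry(
    "web.clicks", "growth-team", "Raw clickstream events",
    {"raw"}, {"session_id": "string", "url": "string"},
))

print(catalog.search("pii"))
```

Searching by tag, name, description, or column name is the discovery use case; governance features such as ownership and access policies would hang off the same records.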

Data Quality Tools

  • Great Expectations: Data validation and documentation framework
  • Deequ: Library built on Apache Spark for data quality validation on large datasets
  • Soda Core (formerly Soda SQL): Data quality framework for SQL data
  • Monte Carlo: Data observability platform
  • Databand (now IBM Databand): Data pipeline monitoring platform
  • Apache Griffin: Big data quality solution
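
The validation-style tools above share one pattern: declare checks against a dataset, run them, and collect a pass/fail report. A minimal sketch of that pattern in plain Python follows. It is not the API of Great Expectations, Deequ, or any other listed tool; the check functions and the sample rows are hypothetical.

```python
def check_not_null(rows, column):
    """Fail every row where `column` is missing or None."""
    failures = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {"check": f"not_null({column})", "passed": not failures, "failed_rows": failures}


def check_unique(rows, column):
    """Fail every row whose `column` value was already seen in an earlier row."""
    seen = set()
    failures = []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value in seen:
            failures.append(i)
        seen.add(value)
    return {"check": f"unique({column})", "passed": not failures, "failed_rows": failures}


def check_between(rows, column, low, high):
    """Fail rows outside [low, high]; None is skipped and left to check_not_null."""
    failures = [
        i for i, row in enumerate(rows)
        if row.get(column) is not None and not (low <= row[column] <= high)
    ]
    return {"check": f"between({column}, {low}, {high})", "passed": not failures, "failed_rows": failures}


# Hypothetical sample data with one problem per check.
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": -5.0},
    {"order_id": 2, "amount": None},
]

report = [
    check_not_null(rows, "amount"),
    check_unique(rows, "order_id"),
    check_between(rows, "amount", 0, 1000),
]

for result in report:
    status = "PASS" if result["passed"] else "FAIL"
    print(f"{status} {result['check']} rows={result['failed_rows']}")
```

Returning structured results instead of raising on the first failure lets one run report every problem, which is how such reports are usually consumed in a pipeline.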

Data Lineage

  • OpenLineage: Open standard for data lineage metadata collection
  • Marquez: Open-source metadata service for data lineage; reference implementation of OpenLineage
  • Spline: Data lineage tracking for Apache Spark
  • Atlan: Data governance platform with lineage capabilities
  • Informatica Enterprise Data Catalog: Enterprise metadata management
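
Lineage tools like those above record which jobs read and write which datasets, then answer questions such as "what does this table depend on?". The sketch below shows that traversal in plain Python. It does not follow the OpenLineage schema or any listed tool's API; the JOBS mapping, the dataset names, and the upstream function are hypothetical.

```python
from collections import deque

# Hypothetical lineage records: each job lists the datasets it reads and writes.
JOBS = {
    "ingest_orders": {"inputs": ["s3.raw_orders"], "outputs": ["staging.orders"]},
    "ingest_customers": {"inputs": ["s3.raw_customers"], "outputs": ["staging.customers"]},
    "build_revenue": {
        "inputs": ["staging.orders", "staging.customers"],
        "outputs": ["mart.revenue"],
    },
}


def upstream(dataset, jobs):
    """Return the sorted names of all datasets that `dataset` depends on, directly or transitively."""
    # Map each output dataset to the inputs of the jobs that produce it.
    parents = {}
    for job in jobs.values():
        for out in job["outputs"]:
            parents.setdefault(out, []).extend(job["inputs"])

    # Breadth-first walk from the target back toward the sources.
    seen = set()
    queue = deque([dataset])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


print(upstream("mart.revenue", JOBS))
```

The same graph walked in the other direction gives impact analysis: which downstream datasets are affected when a source changes.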