- Apache Hadoop - Foundational framework for Big Data ecosystems; provides HDFS for distributed storage, YARN for resource management, and MapReduce for batch processing.
- Apache HBase - Distributed, scalable, NoSQL database atop HDFS.
- Apache Cassandra - Highly scalable NoSQL database for large data workloads.
- Apache Accumulo - Sorted, distributed key-value store with cell-level access control.
- Apache Kudu - Columnar storage for analytics.
- Apache Parquet - Columnar storage file format optimized for Big Data.
- Apache ORC - Optimized row-columnar file format for Big Data.
- Apache Arrow - Language-independent in-memory columnar data format for analytics.
- Apache Spark - Unified analytics engine for large-scale data processing.
- Apache Flink - Stream and batch processing engine.
- Apache Beam - Unified model for stream and batch processing.
- Apache Hive - SQL-like querying on Big Data.
- Apache Pig - High-level platform for processing large datasets.
- Apache Tez - Framework for running DAGs of data processing tasks on YARN.
- Apache Storm - Distributed real-time computation system.
- Apache NiFi - Data integration and processing automation.
- Apache Ranger - Security and access control for Big Data.
- Apache Atlas - Data governance and metadata management.
- Apache Ambari - Management platform for Hadoop clusters.
- Apache ZooKeeper - Coordination service for distributed systems.
- Apache Drill - SQL query engine for heterogeneous data.
- Apache Impala - Interactive SQL for Hadoop.
- Apache Druid - Real-time analytics database.
- Apache Kylin - OLAP engine for multi-dimensional analytics.
- Apache Phoenix - SQL on HBase.
- Apache Kafka - Distributed event streaming platform.
- Apache Pulsar - Cloud-native distributed messaging and streaming.
- Apache Samza - Stream processing framework.
- Apache Mahout - Scalable machine learning.
- Apache MADlib - In-database machine learning library for SQL.
- Apache SystemDS - Distributed machine learning.
- Apache MXNet (in the Attic) - Deep learning framework.
- Apache Airflow - Workflow automation and scheduling.
- Apache DolphinScheduler - Distributed workflow orchestration.
- Apache Oozie - Workflow scheduler for Hadoop.
- Apache Bigtop - Packaging and integration of Big Data components.
- Apache TinkerPop - Graph computing framework.
- Apache Giraph (in the Attic) - Large-scale graph processing.
- Apache Sedona - Spatial data processing engine.
- Apache Hudi - Data lake platform with upserts, deletes, and incremental processing.