Rohith Reddy Kota (rohithreddykota)
🥷 building | analysing | fixing

The Role of Data Compression in Modern Data Storage & Processing

Why Compression Matters in Big Data Systems

Modern data storage and processing systems handle massive volumes of data, making efficient storage and access crucial. Data compression addresses this by reducing dataset sizes, which in turn lowers storage costs and speeds up data reads. In columnar data formats (like Parquet or ORC), similar values are stored together, enabling very high compression ratios. For example, Parquet’s columnar storage often yields higher compression than row-based formats (All About Parquet Part 05 - Compression Techniques in Parquet - DEV Community). Compressed data means less I/O (disk and network) for the same information, which can dramatically improve query performance. In fact, using Parquet or ORC can shrink a dataset to a fraction of its raw size.
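
To make this concrete, here is a minimal sketch (assuming the `arrow` and `parquet` Rust crates; the file name and column are illustrative) of writing a compressed Parquet file. The codec is chosen via writer properties, and each column chunk is compressed independently:

```rust
use std::{fs::File, sync::Arc};

use arrow::array::Int64Array;
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::arrow::ArrowWriter;
use parquet::basic::Compression;
use parquet::file::properties::WriterProperties;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let schema = Arc::new(Schema::new(vec![Field::new("id", DataType::Int64, false)]));

    // Sorted, low-cardinality data like this is the best case for columnar
    // compression: long runs of repeated values compress extremely well.
    let ids = Int64Array::from_iter_values((0..1_000_000).map(|i| i / 1000));
    let batch = RecordBatch::try_new(schema.clone(), vec![Arc::new(ids)])?;

    // Pick the compression codec for the whole file (or per column).
    let props = WriterProperties::builder()
        .set_compression(Compression::SNAPPY)
        .build();

    let file = File::create("ids.snappy.parquet")?;
    let mut writer = ArrowWriter::try_new(file, schema, Some(props))?;
    writer.write(&batch)?;
    writer.close()?; // flushes row groups and writes the footer
    Ok(())
}
```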

GPU Acceleration in Analytical Databases: Promise and Reality

Introduction

Graphics Processing Units (GPUs) offer massive parallelism and high memory bandwidth, theoretically enabling order-of-magnitude speedups for data analytics. However, mainstream analytical databases like ClickHouse, DuckDB, and BigQuery remain largely CPU-based. This report examines the technical reasons behind the limited GPU usage in traditional OLAP databases. We compare CPU vs GPU performance for real-world analytical queries, analyzing how factors like data volume, memory access patterns, and architecture affect outcomes. We then contrast this with GPU-native analytic engines – OmniSci (MapD/Heavy.AI), BlazingSQL (RAPIDS), and related technologies – highlighting their architecture, performance benchmarks, and practical challenges. Engineering trade-offs, scalability considerations, and the suitability of GPU acceleration for startups vs. enterprises are discussed, supported by findings from academic research and industry benchmarks.

Recursion in the decorrelate_predicate_subquery Optimizer Pass

Recursion Mechanics of DecorrelatePredicateSubquery

The DecorrelatePredicateSubquery rule in Apache DataFusion is responsible for rewriting correlated subqueries in WHERE/HAVING clauses (specifically IN and EXISTS predicates, including their negations) into semijoin or antijoin operations. This transforms a nested query into a flat join-based plan for execution. To achieve this, the rule employs a carefully orchestrated recursion strategy that handles subqueries within subqueries (nested subqueries) and coordinates with DataFusion’s optimizer driver to avoid duplicate traversals.

Top-Down Invocation: The rule is registered to run in a top-down manner. In its implementation of OptimizerRule, it overrides apply_order to return Some(ApplyOrder::TopDown) (decorrelate_predicate_subquery.rs - source), so the optimizer driver applies the rule to a parent plan node before descending into its children; nested subqueries encountered during a rewrite are handled by the rule’s own recursion rather than by a second driver pass.
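
A minimal sketch (assuming the `datafusion` and `tokio` crates; the table and column names are made up) that makes the rewrite visible: after optimization, an EXISTS predicate shows up in the plan as a LeftSemi join instead of a correlated subquery:

```rust
use datafusion::error::Result;
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();

    // Small in-memory tables so the example is self-contained.
    ctx.sql("CREATE TABLE customers(id INT) AS VALUES (1), (2), (3)").await?;
    ctx.sql("CREATE TABLE orders(customer_id INT) AS VALUES (1), (3)").await?;

    // A correlated EXISTS subquery in the WHERE clause.
    let df = ctx
        .sql(
            "SELECT c.id FROM customers c \
             WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)",
        )
        .await?;

    // After DecorrelatePredicateSubquery runs, the optimized plan contains a
    // LeftSemi join between customers and orders, not a nested subquery.
    println!("{}", df.into_optimized_plan()?.display_indent());
    Ok(())
}
```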

Apache Arrow Physical Data Types and Layouts (v17.0.0)

Apache Arrow defines a standardized in-memory columnar format composed of primitive and nested data types. Each Arrow array is backed by one or more contiguous buffers (blocks of memory) and optional metadata such as length and null count (Internal structure of Arrow objects • Arrow R Package) (Physical memory layout — Apache Arrow v0.12.1.dev425+g828b4377f.d20190316). This report provides a deep dive into all Arrow physical layouts in version 17.0.0, covering their memory structure, Rust implementation, code examples, and performance trade-offs. We will explore primitives, variable-length types, list types, struct, union, dictionary encoding, run-end encoding, and the null layout.
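
As a taste of what the report covers, here is a minimal sketch (assuming the `arrow` Rust crate) that inspects the two buffers behind a primitive array: the validity bitmap and the contiguous data buffer:

```rust
use arrow::array::{Array, Int32Array};

fn main() {
    let arr = Int32Array::from(vec![Some(1), None, Some(3)]);

    // Buffer 0: validity bitmap, one bit per slot (0 marks a null).
    let validity = arr.nulls().expect("the array contains a null, so a bitmap exists");
    let bits: Vec<bool> = (0..arr.len()).map(|i| validity.is_valid(i)).collect();
    println!("validity: {bits:?}"); // [true, false, true]

    // Buffer 1: contiguous i32 values; the slot under a null is
    // present in memory but its contents are undefined.
    println!("values:   {:?}", arr.values());
    println!("len = {}, null_count = {}", arr.len(), arr.null_count());
}
```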

Apache Arrow Physical Memory Layout (Rust Implementation Focus)

Apache Arrow defines a standardized columnar in-memory format to enable high-performance analytics across languages. It emphasizes contiguous, aligned memory and minimal metadata, allowing zero-copy sharing and efficient vectorized processing (Arrow Columnar Format — Apache Arrow v19.0.1). This answer breaks down Arrow’s physical memory layout fields (validity bitmap, offsets, data buffers, type IDs, etc.), and explains how these map to the Rust implementation (using the arrow and arrow-array crates). We’ll cover buffer structures (Buffer 0/1/2), validity bitmaps, alignment/padding rules, nested types (List, Struct, Union), offset buffers for variable-length types, dictionary encoding, and run-end encoding. Finally, we highlight how Arrow’s Rust crates represent these concepts with structs, enums, and safe memory management. Diagrams and code examples are included.
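
For the offsets piece specifically, a minimal sketch (again assuming the `arrow` crate) shows how a StringArray lays out its values: element i occupies bytes offsets[i]..offsets[i+1] of a single data buffer, and a null consumes no data bytes:

```rust
use arrow::array::{Array, StringArray};

fn main() {
    let arr = StringArray::from(vec![Some("arrow"), None, Some("rs")]);

    // Offsets buffer: n + 1 monotonically non-decreasing i32s.
    // The null at index 1 repeats the previous offset: [0, 5, 5, 7].
    println!("offsets: {:?}", arr.value_offsets());

    // Data buffer: the concatenated UTF-8 bytes of all values ("arrowrs").
    println!("data:    {:?}", String::from_utf8_lossy(arr.value_data()));
    println!("null_count = {}", arr.null_count());
}
```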

Pointer Swizzling, Optimized Versioned Latching, and Adaptive Compilation

Advanced Database Engine Techniques in Rust

Modern high-performance database systems like Umbra use a combination of low-level techniques to maximize performance. We’ll explore three such concepts – pointer swizzling, optimized versioned latching, and adaptive compilation – with clear explanations and Rust code examples. Each section includes a standalone demo and then shows how these techniques integrate into a mini database component (like a B+-tree and query executor). We’ll discuss design considerations (performance, concurrency, correctness) and use Rust’s low-level features (unsafe, atomics, custom memory layouts) where appropriate.
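
As a taste of the latching part, here is a minimal standalone sketch (the names are illustrative, not Umbra’s actual API) of an optimistic versioned latch: readers validate a version counter instead of taking a lock, while writers make the version odd for the duration of their exclusive access:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Optimistic versioned latch: even version = unlocked, odd = write-locked.
struct VersionLatch {
    version: AtomicU64,
}

impl VersionLatch {
    fn new() -> Self {
        Self { version: AtomicU64::new(0) }
    }

    /// Start an optimistic read. Returns a version snapshot, or None if a
    /// writer currently holds the latch (odd version).
    fn read_begin(&self) -> Option<u64> {
        let v = self.version.load(Ordering::Acquire);
        (v & 1 == 0).then_some(v)
    }

    /// After reading, check that no writer intervened; on failure the caller
    /// must discard what it read and retry.
    fn read_validate(&self, snapshot: u64) -> bool {
        self.version.load(Ordering::Acquire) == snapshot
    }

    /// Acquire exclusive access by bumping an even version to odd.
    fn lock_exclusive(&self) {
        loop {
            let v = self.version.load(Ordering::Relaxed);
            if v & 1 == 0
                && self
                    .version
                    .compare_exchange_weak(v, v + 1, Ordering::Acquire, Ordering::Relaxed)
                    .is_ok()
            {
                return;
            }
            std::hint::spin_loop();
        }
    }

    /// Release, leaving a new even version so pending reads fail validation.
    fn unlock_exclusive(&self) {
        self.version.fetch_add(1, Ordering::Release);
    }
}

fn main() {
    let latch = VersionLatch::new();

    // Optimistic reader: snapshot, read shared data, then validate.
    let snap = latch.read_begin().expect("unlocked");
    assert!(latch.read_validate(snap));

    // A writer invalidates concurrent optimistic readers.
    latch.lock_exclusive();
    latch.unlock_exclusive();
    assert!(!latch.read_validate(snap)); // old snapshot is now stale
}
```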

Serverless ETL: Load CSV from S3 into Aurora Using DuckDB in AWS Lambda

Introduction

Building a serverless ETL pipeline that efficiently loads structured data into Amazon Aurora is a common requirement. AWS Lambda, combined with DuckDB, provides a powerful way to perform in-memory analytics and bulk insert data into Aurora.

This guide demonstrates how to:

  • Read a CSV file from Amazon S3
  • Process the data in-memory using DuckDB (sketched below)
  • Bulk insert the results into Amazon Aurora
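
The DuckDB step can be sketched as follows (a minimal sketch assuming the `duckdb` Rust crate; the bucket, key, and table names are hypothetical, and the Lambda’s execution role is assumed to supply S3 credentials through the standard AWS environment variables):

```rust
use duckdb::{Connection, Result};

fn main() -> Result<()> {
    let conn = Connection::open_in_memory()?;

    // httpfs teaches DuckDB to read s3:// URLs directly, so the CSV is
    // streamed from S3 without staging it on the Lambda's limited disk.
    conn.execute_batch(
        "INSTALL httpfs; LOAD httpfs;
         CREATE TABLE staged AS
         SELECT * FROM read_csv_auto('s3://my-bucket/input/data.csv');",
    )?;

    // Any in-memory transformation happens here as plain SQL; the rows in
    // `staged` are then batched into Aurora (e.g. via the RDS Data API).
    let count: i64 =
        conn.query_row("SELECT count(*) FROM staged", [], |row| row.get(0))?;
    println!("staged {count} rows for bulk insert into Aurora");
    Ok(())
}
```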

Waitgroup Implementation in Rust

While Rust provides tokio::join! and std::sync::Barrier for synchronizing tasks, a WaitGroup (available in the crossbeam crate, and easy to build on top of tokio primitives) offers a lightweight, efficient way to wait for multiple tasks to complete, similar to Go’s sync.WaitGroup.


Example: Synchronizing Multiple Async Tasks

Using tokio::sync::Notify to implement a simple WaitGroup:
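
A minimal sketch of such a WaitGroup (assuming tokio with the rt-multi-thread, macros, and sync features; the counter semantics mirror Go’s Add/Done/Wait):

```rust
use std::sync::{
    atomic::{AtomicUsize, Ordering},
    Arc,
};
use tokio::sync::Notify;

/// Waits for a fixed number of tasks to call `done`, like Go's sync.WaitGroup.
#[derive(Clone)]
struct WaitGroup {
    inner: Arc<Inner>,
}

struct Inner {
    count: AtomicUsize,
    notify: Notify,
}

impl WaitGroup {
    fn new(count: usize) -> Self {
        Self {
            inner: Arc::new(Inner {
                count: AtomicUsize::new(count),
                notify: Notify::new(),
            }),
        }
    }

    /// Mark one task as finished; the last one wakes all waiters.
    fn done(&self) {
        if self.inner.count.fetch_sub(1, Ordering::AcqRel) == 1 {
            self.inner.notify.notify_waiters();
        }
    }

    /// Resolve once every registered task has called `done`.
    async fn wait(&self) {
        loop {
            // Create the Notified future *before* re-checking the count so a
            // wakeup between the check and the await cannot be lost.
            let notified = self.inner.notify.notified();
            if self.inner.count.load(Ordering::Acquire) == 0 {
                return;
            }
            notified.await;
        }
    }
}

#[tokio::main]
async fn main() {
    let wg = WaitGroup::new(3);
    for i in 0..3 {
        let wg = wg.clone();
        tokio::spawn(async move {
            println!("task {i} finished");
            wg.done();
        });
    }
    wg.wait().await; // returns only after all three tasks call done()
    println!("all tasks complete");
}
```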