Abelardo Jara-Berrocal abelardojarab

@abelardojarab
abelardojarab / generic_app_using_multiprocessing_map_reduce.py
Last active February 10, 2025 01:04
Generic parallel application using multiprocessing in Python
import numpy as np
import multiprocessing.shared_memory as shm
from concurrent.futures import ProcessPoolExecutor, as_completed
import ray
import contextlib
class ParallelArrayProcessor:
    def __init__(self, data, num_chunks=8, use_shared_memory=False):
        """
        Initialize processor.
abelardojarab / parallel_flame_graph_generation_process.py
Last active February 9, 2025 19:49
Parallel Flame Graph Generation using Multiprocessing
import collections
import json
import re
import concurrent.futures
def _create_default_tree():
    return [0, collections.defaultdict(dict)]

def process_trace_chunk(trace_chunk):
    """Processes a subset of kernel traces and returns a local stack tree."""
abelardojarab / parallel_flame_graph_generation_ray.py
Last active February 9, 2025 17:13
Parallel flame graph generation using Ray
import collections
import json
import re
import ray
import graphviz
ray.init(ignore_reinit_error=True)
@ray.remote
class StackProcessor:
abelardojarab / flame_graphs_generation.py
Created February 9, 2025 16:34
Flame graph generation with Python
import collections
import json
import re
import graphviz
class KernelFlameGraph:
    def __init__(self):
        # Nested structure for stack hierarchy (FlameGraph format)
        self.stack_tree = collections.defaultdict(lambda: [0, collections.defaultdict(dict)])
        # Flat dictionary to track accumulated occurrences of each function
abelardojarab / count_words.py
Created February 9, 2025 16:33
Count words in Python in parallel
from concurrent.futures import ProcessPoolExecutor
from collections import Counter
import os
#import psutil
import re
def map_word_count(text_chunk):
    """Map function: Counts words in a chunk of text."""
    words = re.findall(r'\w+', text_chunk.lower())  # Tokenize words
    return Counter(words)  # Return word frequencies
abelardojarab / general_split_and_reduce.md
Last active February 9, 2025 04:50
General method to split and reduce in Python

Data Splitting and Reduction Techniques in Python and NumPy

1. Introduction

Splitting data structures efficiently and reducing the pieces back into a single result are core techniques for optimizing computations in Python, and they come up often in interviews. This document covers:

  • Splitting lists, dictionaries, and matrices using Python and NumPy
  • Common reduce operations for 1D and 2D data
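As a rough illustration of the two topics listed above, the sketch below splits a list, a dictionary, and a matrix (a list of rows) into chunks, then reduces them. It uses only the standard library; the helper names `split_list`, `split_dict`, `reduce_sum`, and `column_sums` are hypothetical and are not taken from the gist. The chunk sizing follows the same rule as `numpy.array_split`: the first `len(items) % num_chunks` chunks receive one extra element.

```python
from functools import reduce
import operator


def split_list(items, num_chunks):
    """Split a sequence into num_chunks contiguous pieces whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)  # earlier chunks absorb the remainder
        chunks.append(items[start:start + size])
        start += size
    return chunks


def split_dict(mapping, num_chunks):
    """Split a dictionary into smaller dictionaries by chunking its keys."""
    key_chunks = split_list(list(mapping), num_chunks)
    return [{key: mapping[key] for key in keys} for keys in key_chunks]


def reduce_sum(chunks):
    """1D reduce: sum each chunk, then combine the partial sums."""
    partial_sums = [sum(chunk) for chunk in chunks]
    return reduce(operator.add, partial_sums, 0)


def column_sums(matrix):
    """2D reduce along rows: one total per column."""
    return [sum(column) for column in zip(*matrix)]
```

A matrix stored as a list of rows can be split row-wise with `split_list(matrix, num_chunks)`; with NumPy arrays, `np.array_split(matrix, num_chunks, axis=0)` plays the same role.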

2. Splitting Data Structures

abelardojarab / general_method_parallel.md
Created February 9, 2025 04:31
General method to solve problems

Parallel Computation: Task Execution, ProcessPoolExecutor, and Ray for Matrix Multiplication

Introduction

Parallel execution is essential for optimizing computational workloads such as matrix multiplication, large-scale data processing, and distributed computing. In this document, we explore:

  1. Creating a Generalized Task Execution Framework
  2. Using ProcessPoolExecutor and Ray for 1D and 2D Task Execution
  3. Parallel Matrix Multiplication (Stripe-Based vs. Block-Based Approaches)
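To make the stripe-based approach concrete, here is a minimal sketch under a few assumptions: matrices are plain lists of rows, the left matrix is cut into horizontal stripes, and each stripe is multiplied by the full right matrix in a separate worker. The names `multiply_stripe` and `stripe_matmul`, and the `executor_cls` parameter, are hypothetical and do not come from the gist; Ray and the block-based variant are not shown.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def multiply_stripe(args):
    """Multiply one horizontal stripe of A by the whole of B."""
    a_rows, b = args
    b_columns = list(zip(*b))  # transpose B once so columns are easy to iterate
    return [
        [sum(x * y for x, y in zip(row, column)) for column in b_columns]
        for row in a_rows
    ]


def stripe_matmul(a, b, num_stripes=2, executor_cls=ProcessPoolExecutor):
    """Compute A @ B by distributing row stripes of A across workers."""
    num_stripes = max(1, min(num_stripes, len(a)))  # never more stripes than rows
    base, extra = divmod(len(a), num_stripes)
    stripes, start = [], 0
    for i in range(num_stripes):
        size = base + (1 if i < extra else 0)
        stripes.append(a[start:start + size])
        start += size

    with executor_cls(max_workers=num_stripes) as pool:
        # map() preserves input order, so the stripes come back top to bottom
        partials = pool.map(multiply_stripe, [(stripe, b) for stripe in stripes])
        return [row for partial in partials for row in partial]
```

With the default `ProcessPoolExecutor`, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes. Passing `executor_cls=ThreadPoolExecutor` runs the same code with threads, which is convenient for quick checks, for example `stripe_matmul([[1, 2], [3, 4], [5, 6]], [[7, 8], [9, 10]], executor_cls=ThreadPoolExecutor)`.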

abelardojarab / merge_function_events.py
Last active February 8, 2025 02:36
Merge function events in Python
import numpy as np
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
def process_chunk(chunk, chunk_start_idx):
    """Processes a chunk of samples and tracks function lifetimes."""
    active_functions = {}  # {function_name: start_time}
    events = []
    for offset, functions in enumerate(chunk):
abelardojarab / dfs_bfs.cpp
Last active February 8, 2025 02:38
DFS and BFS Graph Traversal in C++
#include <iostream>
#include <unordered_map>
#include <vector>
#include <queue>
#include <stack>
#include <unordered_set>
class Graph {
private:
    std::unordered_map<int, std::vector<int>> adjList;