Alexander Zagniotov azagniotov

/*
alexPrintPrep($("textarea#phrase").val(), null, 210);
alexPrintPrep(address, privkey, 100);
*/
<div id="printable-view-for-alex" class="alex-qr-container"></div>
@media print {
@azagniotov
azagniotov / JettySolrRunnerExampleTest.java
Created December 8, 2024 16:44
Solr unit test example showing how to index documents using JettySolrRunner and SolrTestCaseJ4
/*
* Copyright (c) 2023-2024 Alexander Zagniotov
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
@azagniotov
azagniotov / SparkConfig.scala
Last active May 5, 2025 21:56
This Spark / PySpark on YARN configuration is optimized for large-scale workloads: high executor memory (64g), efficient Kryo serialization, and aggressive adaptive query execution to handle skewed data and optimize partitioning, with shuffle and network settings tuned for stability under load.
/**
* The configuration parameters outlined below are specifically optimized for execution across
 * four (4) distributed workers, each provisioned with the n2d-highmem-48 machine type, offering
* 48 virtual CPUs and 384 GB of RAM per node. These settings have been carefully fine-tuned to
* support Spark workloads that process data on a monthly cadence, where each job ingests and
* computes over an entire month's worth of data in a single run on DataProc.
*
* The cluster total resources:
* - 4 workers * 48 vCPUs/worker = 192 total vCPUs
* - 4 workers * 384 GB RAM/worker = 1,536 GB total RAM
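 *
 * As a sketch only (the 64g executor memory, Kryo serializer, and adaptive query
 * execution come from the gist description above; the remaining keys and values are
 * illustrative assumptions, not necessarily this gist's actual settings):
 *
 * {{{
 * import org.apache.spark.SparkConf
 *
 * val conf = new SparkConf()
 *   .set("spark.executor.memory", "64g")                                    // high executor memory (stated)
 *   .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer") // Kryo serialization (stated)
 *   .set("spark.sql.adaptive.enabled", "true")                             // adaptive query execution (stated)
 *   .set("spark.sql.adaptive.skewJoin.enabled", "true")                    // mitigate skewed data (stated)
 *   .set("spark.sql.adaptive.coalescePartitions.enabled", "true")          // optimize partitioning (assumed)
 *   .set("spark.network.timeout", "600s")                                  // illustrative stability tuning (assumed)
 * }}}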