@rajvermacas
rajvermacas / scalaSparkJson.scala
Last active November 13, 2024 07:42
Scala spark json file
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
object ScalaListEquivalents {
  def main(args: Array[String]): Unit = {
    // Initialize Spark session
    val spark = SparkSession.builder
      .appName("CSV Loader")
      .config("spark.master", "local")
import pandas as pd
import numpy as np
def calculate_combined_score(df):
    """
    Calculate a combined score based on normalized 6M return and sum of ranks.
    Returns the DataFrame with additional columns for analysis.
    Parameters:
    df (pandas.DataFrame): DataFrame with '6M Return' and 'Sum of Ranks' columns
import java.time.{LocalDate, DayOfWeek}
import java.time.temporal.ChronoUnit
object WeekdayCounter {
  def countWeekdays(start: LocalDate, end: LocalDate): Long = {
    var weekdayCount = 0L
    var currentDate = start
    while (!currentDate.isAfter(end)) {
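The preview above is cut off inside the loop, so the rest of the original is not shown. As a rough sketch of the same idea in Python, assuming "weekday" means Monday to Friday and that the end date is inclusive (which the `!currentDate.isAfter(end)` condition suggests), the count could look like this. The function name `count_weekdays` is made up for illustration.

```python
from datetime import date, timedelta


def count_weekdays(start: date, end: date) -> int:
    """Count Monday-Friday dates in the inclusive range [start, end]."""
    count = 0
    current = start
    # Walk one day at a time, mirroring the while loop in the Scala preview.
    while current <= end:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        if current.weekday() < 5:
            count += 1
        current += timedelta(days=1)
    return count
```

If `start` is later than `end`, the loop body never runs and the function returns 0.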
rajvermacas / python_logger.py
Created October 28, 2024 06:45
python_logger
import logging
# Create a custom logger for your project
my_logger = logging.getLogger('my_project')
my_logger.setLevel(logging.DEBUG)
# Create a handler
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
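The preview stops before the handler is attached. One possible way to finish such a setup, as a minimal sketch only: the helper name `build_logger` and the format string are assumptions, not taken from the gist.

```python
import logging


def build_logger(name: str = "my_project") -> logging.Logger:
    """Return a logger with a single console handler attached."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Attach a handler only once, so repeated calls do not duplicate output.
    if not logger.handlers:
        console_handler = logging.StreamHandler()
        console_handler.setLevel(logging.DEBUG)
        # The format string is an illustrative choice.
        console_handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(console_handler)
    # Keep records from also being printed by the root logger's handlers.
    logger.propagate = False
    return logger
```

Calling `build_logger().debug("hello")` would then write one formatted line to stderr.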
rajvermacas / debugPinescript.pinescript
Last active October 27, 2024 18:15
debugPinescript
//@version=5
indicator("Var Understanding", overlay=true)
myFunction() =>
    var int counter = 0 // Initialized only once, not on every bar; the value persists across bars
    counter := counter + 1
    counter
// Scenario 1
// On the same bar, each call site of myFunction() keeps its own `var` counter, so a second call does not continue the first call's count
You are an AI model specialized in software programming, data engineering, and Apache Spark. Your role is to assist users in solving technical questions related to programming in languages such as Java, Scala, Python, JavaScript, and Spark, and to address general coding issues, dependency challenges, and data processing with Spark. You provide precise explanations, troubleshooting guidance, and solutions to errors or debugging requests.
Your responsibilities include:
1. **Answering Programming Questions**: Deliver detailed solutions and explanations for coding problems in Java, Scala, Python, JavaScript, and Spark.
2. **Resolving Data Engineering and Spark Queries**: Address issues regarding data pipelines, ETL processes, Spark configurations, transformations, and optimizations within distributed data systems.
3. **Debugging Errors**: Analyze and diagnose code errors or performance issues, especially in Spark-related code, providing corrective steps, recommendations, and insights to ensure optimal performance
rajvermacas / stock_export.js
Last active October 20, 2024 11:34
stock export
const scrapeData = () => {
  // 1. Find the table
  const sections = document.querySelectorAll("#screener-table > table");
  /*
    - document.querySelectorAll finds all elements matching the CSS selector
    - "#screener-table > table" means:
      - Find element with ID "screener-table"
      - ">" means direct child
      - "table" means find table elements
    - Returns a NodeList of matching tables
rajvermacas / get_stock_financial_ratios.py
Created October 7, 2024 19:56
Get stock financial ratios
import yfinance as yf
def get_stock_metrics(ticker_symbol):
    # Create a Ticker object
    stock = yf.Ticker(ticker_symbol)
    # Fetch the data
    info = stock.info
    # Extract the requested metrics
rajvermacas / monitor_threads.py
Last active October 3, 2024 17:14
monitor threads in python
import threading
import time
import gc
import logging as logger
import traceback
import sys
import os
# Set up logger
logger.basicConfig(level=logger.INFO, format='%(asctime)s - %(message)s', datefmt='%H:%M:%S')
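Only the imports and logger setup are visible in this preview. As a hedged illustration of what thread monitoring with these modules might involve, the sketch below collects the current stack of every live thread. The function name `snapshot_threads` is hypothetical, and `sys._current_frames()` is a CPython-specific call, so this assumes CPython.

```python
import sys
import threading
import traceback


def snapshot_threads() -> dict:
    """Return {thread name: formatted stack} for every live thread."""
    # Maps thread identifiers to their topmost stack frame (CPython-specific).
    frames = sys._current_frames()
    report = {}
    for thread in threading.enumerate():
        frame = frames.get(thread.ident)
        if frame is None:
            report[thread.name] = "<no frame available>"
        else:
            report[thread.name] = "".join(traceback.format_stack(frame))
    return report
```

Logging the result periodically from a background thread is one way to see where each thread is spending its time; threads that share a name overwrite each other in this simple dictionary.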
1.
I have a list of objects of type Document, which is a user-defined class. The list holds 100,000 documents. After the business logic for one document finishes, the for loop takes almost 100 ms to fetch the next element from the list, which is unexpected.
I need to find out what makes iterating over the elements so slow. How can I dig deeper into debugging the for loop to get some insight into where the time goes? My for loop looks like this:
for document in documents:
    logger.info("Start processing of document")
    # I have business logic here
    logger.info("Processing of document completed")
After the log
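One way to narrow down a gap like the one described in this question is to drive the iterator by hand and time the fetch of each element separately from the work done on it. The following is only a sketch: `timed_iteration` and its `process` callback are hypothetical names standing in for the loop and the business logic.

```python
import time


def timed_iteration(items, process):
    """Time fetching each element separately from processing it.

    Returns (total seconds spent in next(), total seconds spent in process()).
    """
    fetch_total = 0.0
    work_total = 0.0
    iterator = iter(items)
    while True:
        t0 = time.perf_counter()
        try:
            item = next(iterator)  # the step a for loop performs implicitly
        except StopIteration:
            break
        t1 = time.perf_counter()
        process(item)
        t2 = time.perf_counter()
        fetch_total += t1 - t0
        work_total += t2 - t1
    return fetch_total, work_total
```

If the fetch total stays small for a plain list, the pause probably comes from somewhere other than the iteration itself. Garbage collection and slow log handlers are two candidates worth testing, for example by temporarily calling `gc.disable()` around the loop, but those are guesses to verify rather than a diagnosis.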