Matheus Rossi matheus-rossi


Data Engineer

Data Engineer | AI Engineer #python #sql #spark #cloud #aws #ai #llm #vectordatabases

matheus-rossi / gist:d6141900d524d48d6741f9c0c538adde

Created April 20, 2023 20:28

gist

	apiVersion: v2
	name: datahub-prerequisites
	description: A Helm chart for packages that Datahub depends on
	type: application
	# This is the chart version. This version number should be incremented each time you make changes
	# to the chart and its templates, including the app version.
	version: 0.0.14
	dependencies:
	  - name: elasticsearch
	    version: 7.17.3

matheus-rossi / spark_tips_01.py

Created February 21, 2024 18:26

spark_tips_01

	from pyspark.sql import SparkSession

	spark = (
	    SparkSession
	    .builder
	    .appName("spark_parameterized_queries")
	    .getOrCreate()
	)

	##### Creating two test datasets #####

matheus-rossi / generators.py

Created April 8, 2024 18:31

	import sys

	# List comprehension
	list_comprehension = [i for i in range(10_000_000)]
	print(f"List comprehension memory: {sys.getsizeof(list_comprehension) / (1024 * 1024)} MB")

	# Yield generator
	def generator():
	    for i in range(10_000_000):
	        yield i
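
The preview above is cut off before the generator is measured. A minimal sketch of how such a comparison could look, assuming the intent is to contrast container sizes with `sys.getsizeof`; the name `count_up` and the smaller `n` are hypothetical, chosen only to keep the example fast:

```python
import sys


def count_up(n):
    # Generator function (hypothetical name): yields values lazily, one at a time
    for i in range(n):
        yield i


n = 100_000  # assumption: smaller than the gist's 10_000_000 so the sketch runs quickly

as_list = [i for i in range(n)]
as_gen = count_up(n)

# sys.getsizeof reports the container only (for a list, its pointer array),
# not the int objects that the list references
list_bytes = sys.getsizeof(as_list)
gen_bytes = sys.getsizeof(as_gen)

print(f"List comprehension: {list_bytes} bytes")
print(f"Generator object:   {gen_bytes} bytes")

# Both yield the same values once consumed; the generator is exhausted afterwards
total_from_gen = sum(as_gen)
total_from_list = sum(as_list)
```

The generator object stays small regardless of `n` because it holds only its execution state, while the list grows with the number of elements.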

matheus-rossi / yml_replace.py

Created April 17, 2024 13:48

	import string, yaml

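	def load_yaml(file_path: str, context: dict = None):
	    # Constructor that expands $placeholders in every YAML string scalar
	    def string_constructor(loader, node):
	        t = string.Template(node.value)
	        # Fall back to an empty mapping when no context is given
	        value = t.substitute(context or {})
	        return value

	    # Note: this registers the constructor on yaml.SafeLoader itself
	    l = yaml.SafeLoader
	    l.add_constructor('tag:yaml.org,2002:str', string_constructor)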
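
The substitution step in this snippet relies on `string.Template` from the standard library. A small sketch of how that step behaves on its own, without PyYAML; the helpers `render` and `render_lenient` and the placeholder names `bucket` and `env` are hypothetical and not part of the gist:

```python
import string


def render(text, context=None):
    # Strict variant (hypothetical helper): raises KeyError on unknown placeholders
    return string.Template(text).substitute(context or {})


def render_lenient(text, context=None):
    # Lenient variant (hypothetical helper): leaves unknown placeholders untouched
    return string.Template(text).safe_substitute(context or {})


# Both $name and ${name} forms are recognized by string.Template
rendered = render("s3://$bucket/${env}/data", {"bucket": "my-bucket", "env": "dev"})
print(rendered)

try:
    render("$missing")
    strict_failed = False
except KeyError:
    strict_failed = True

lenient = render_lenient("$missing stays", {})
print(strict_failed, lenient)
```

Whether strict or lenient substitution is preferable depends on whether an unresolved placeholder in the YAML file should be treated as a configuration error.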