Boot up with a Fedora Live USB stick.
- Run `vgs` to check if there's any space:
$ sudo vgs
  VG     #PV #LV #SN Attr   VSize    VFree
  fedora   1   3   0 wz--n- <237.28g     0
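
VFree at 0 means the logical volumes already claim the whole group; listing them shows where that space went (an extra check, output omitted here):

$ sudo lvs fedora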
# %%
import httpx
import pandas as pd

# %% Read CSV and rename headers
websites = pd.read_csv("resources/popular_websites.csv", index_col=0)
print(websites)

# %% Define function to check connection
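The connection-check cell is cut off here; a minimal sketch of what it might look like with httpx (the check_connection name, the timeout value, and the name/url column names are assumptions):

def check_connection(name: str, url: str) -> bool:
    """Return True if the URL answers at all, False on a connection error."""
    try:
        response = httpx.get(url, timeout=10.0)
        # Any HTTP response, even 4xx/5xx, counts as "reachable"
        print(f"{name}: {response.status_code}")
        return True
    except httpx.ConnectError:
        print(f"{name}: connection failed")
        return False

# %% Check each website listed in the CSV (column names assumed)
for name, url in zip(websites["name"], websites["url"]):
    check_connection(name, url)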
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.*;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;
import org.apache.iceberg.data.GenericRecord;
import org.apache.iceberg.data.IcebergGenerics;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.data.parquet.GenericParquetWriter;
import org.apache.iceberg.hadoop.HadoopCatalog;
import org.apache.iceberg.io.CloseableIterable;
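The import list suggests the snippet went on to create a Hadoop-backed catalog and scan a table of generic records. A minimal sketch of that flow, under assumed names (the /tmp/warehouse path, the db.events identifier, and the two-column schema are illustrative; org.apache.iceberg.types.Types is also needed):

Configuration conf = new Configuration();
// Warehouse path is an assumption; point it at any writable directory.
Catalog catalog = new HadoopCatalog(conf, "/tmp/warehouse");

Schema schema = new Schema(
    Types.NestedField.required(1, "id", Types.LongType.get()),
    Types.NestedField.optional(2, "name", Types.StringType.get()));

TableIdentifier id = TableIdentifier.of("db", "events");
Table table = catalog.createTable(id, schema);

// Scan the table back as generic records (empty until data files are appended).
// close() may throw IOException; handle or declare it in the enclosing method.
try (CloseableIterable<Record> rows = IcebergGenerics.read(table).build()) {
    rows.forEach(System.out::println);
}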
import org.apache.spark.sql.expressions.UserDefinedAggregateFunction
import org.apache.spark.sql.types.{LongType, StructType}

// UserDefinedAggregateFunction is the contract to define
// user-defined aggregate functions (UDAFs)
class MyCountUDAF extends UserDefinedAggregateFunction {
  // The method defined below can be indexed like this: inputSchema(0)
  // Spark invokes it itself (inversion of control)
  // inputSchema(0) returns a StructField object like this:
  // StructField("id", LongType, true, {})
  // StructField lives in the org.apache.spark.sql.types package
  override def inputSchema: StructType = {
    new StructType().add("id", LongType, nullable = true)
  }
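
  // The original class is cut off above; what follows is a hedged sketch of the
  // remaining UserDefinedAggregateFunction contract methods for a simple row
  // counter (the "count" buffer field name is illustrative). It additionally
  // needs org.apache.spark.sql.Row, org.apache.spark.sql.types.DataType and
  // org.apache.spark.sql.expressions.MutableAggregationBuffer.
  override def bufferSchema: StructType =
    new StructType().add("count", LongType, nullable = false)

  override def dataType: DataType = LongType

  override def deterministic: Boolean = true

  override def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = 0L
  }

  override def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    if (!input.isNullAt(0)) buffer(0) = buffer.getLong(0) + 1L
  }

  override def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
  }

  override def evaluate(buffer: Row): Any = buffer.getLong(0)
}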
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SparkSession.Builder
import org.apache.spark.SparkContext
import org.apache.log4j.{Level, Logger}

// The sparkSession is provided by the Spark Shell itself
// The log level is also already configured by the Spark Shell
def boolean_udf_wrapper(a: String, b: String, t: Any): Boolean = { true }
def string_udf_wrapper(a: String, b: String, t: Any): String = { "••••" }

import org.apache.spark.sql.functions.expr
import org.apache.spark.sql.functions.sum
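Outside the shell those two things have to be set up by hand; a minimal sketch using the imports above (the app name and the local master are assumptions):

// Quiet the default logging, then build the session the shell would otherwise provide
Logger.getLogger("org").setLevel(Level.ERROR)

val spark: SparkSession = SparkSession.builder()
  .appName("spark-shell-notes")   // name is an assumption
  .master("local[*]")
  .getOrCreate()
val sc: SparkContext = spark.sparkContext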
def happyEmployees(salary: Int): Boolean = salary > 2200
def smartTextCase(name: String): String = name.toUpperCase()
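These read like UDF candidates; a hedged sketch of registering them so they can be called from expr (the employees DataFrame and its salary/name columns are assumptions):

spark.udf.register("happyEmployees", happyEmployees _)
spark.udf.register("smartTextCase", smartTextCase _)

// Usage against an assumed employees DataFrame:
// employees
//   .withColumn("is_happy", expr("happyEmployees(salary)"))
//   .withColumn("name_upper", expr("smartTextCase(name)"))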
# Using Kubernetes with Docker in Docker (DIND)
sudo mkdir -p /usr/local
cd /usr/local
sudo mkdir dind-cluster
cd dind-cluster/
sudo chmod o+w .
ls -lat .. | head
# wget https://cdn.rawgit.com/kubernetes-sigs/kubeadm-dind-cluster/master/fixed/dind-cluster-v1.10.sh
curl -O https://cdn.rawgit.com/kubernetes-sigs/kubeadm-dind-cluster/master/fixed/dind-cluster-v1.10.sh
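
The downloaded script still has to be made executable and run; the kubeadm-dind-cluster fixed scripts are normally driven with up/down subcommands (worth double-checking against the script's own usage output for this version):

chmod +x dind-cluster-v1.10.sh
./dind-cluster-v1.10.sh up     # bring the DIND cluster up ('down' tears it down)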