Skip to content

Instantly share code, notes, and snippets.

@rg3915
Last active December 15, 2021 18:51
Show Gist options
  • Save rg3915/5fb3a2e7338115bc92e82b7a9a2b372b to your computer and use it in GitHub Desktop.
Save rg3915/5fb3a2e7338115bc92e82b7a9a2b372b to your computer and use it in GitHub Desktop.
Annotations of Pandas DataFrame
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Display the source blob
Display the rendered blob
Raw
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@rg3915
Copy link
Author

rg3915 commented Dec 5, 2018

Suponha que você tenha Cliente e Obra.

Pegando o ID do Cliente que está em Obra e trocando pelo nome do Cliente que está no outro DataFrame.

# JSON do Cliente com IDCliente e Cliente
dict_cliente = df_cliente[['IDCliente', 'Cliente']].T.apply(dict).tolist()

[
 {'IDCliente': 288, 'Cliente': 'Cliente Um'},
 {'IDCliente': 1, 'Cliente': 'Cliente Dois'},
 {'IDCliente': 959, 'Cliente': 'Cliente Três'},
]

Montando o dicionário que será usado como busca de cada Cliente a partir do seu ID.

_dict_cliente = {}
for item in dict_cliente:
    _dict_cliente[item['IDCliente']] = item['Cliente']

_dict_cliente

{
 288: 'Cliente Um',
 1: 'Cliente Dois',
 959: 'Cliente Três',
}

A partir desse dicionário fazemos a busca no outro DataFrame.

for row in df.itertuples():
    nome_cliente = _dict_cliente.get(row.IDCliente)
    print(row.IDCliente, nome_cliente)

288 Cliente Um
1   Cliente Dois
959 Cliente Três

@rg3915
Copy link
Author

rg3915 commented Dec 9, 2018

Retornando o valor máximo agrupado por ano.

df.groupby(['Ano'])['NumeroOrcamento'].max()
df.reset_index()

@rg3915
Copy link
Author

rg3915 commented Dec 10, 2018

Verificando data vazia:

for row in df.itertuples():
    if row.DataOrcamento is pd.NaT:
        print('Vazio')
    else:
        print(row.DataOrcamento)

@rg3915
Copy link
Author

rg3915 commented Dec 16, 2018

Definindo vários fillna diferentes por coluna:

values = {'last_name': '', 'occupation': '', 'age': 0}
df = df.fillna(value=values)
df.head()

@rg3915
Copy link
Author

rg3915 commented Aug 6, 2019

Se tiver problema com liblzma, faça um downgrade do Pandas para pandas==0.24.2.
https://stackoverflow.com/a/57115325

@rg3915
Copy link
Author

rg3915 commented Aug 6, 2019

Retorna o tamanho do maior objeto de cada coluna.

dict_sizes = {}
for col in df.columns:
    try:
        print(f'{col} max length: {df[col].map(len).max()}\n')
        dict_sizes[col] = df[col].map(len).max()
    except Exception as e:
        raise e
dict_sizes

@rg3915
Copy link
Author

rg3915 commented Sep 12, 2019

@rg3915
Copy link
Author

rg3915 commented Nov 6, 2019

dtype example
df['estoque'] =df['estoque'].fillna(0).astype(int)

@rg3915
Copy link
Author

rg3915 commented Nov 6, 2019

Pandas Dataframe df to Django
https://www.laurivan.com/save-pandas-dataframe-as-django-model/

Produto.objects.bulk_create(
    Produto(**item) for item in df.to_dict('records')
)

@rg3915
Copy link
Author

rg3915 commented Nov 13, 2019

Definindo os tipos das colunas com dtype

dict_types_annot = {
    'produto': str,
    'ncm': str,
    'preco': float,
    'estoque': 'Int64',
}

# Define os tipos das colunas
dff = df.astype(dict_types_annot, errors='ignore')

# Troca 'nan' por None e float por None.
dff = dff.replace({'nan': None, float('nan'): None})

dff.to_dict('records')

Produto.objects.bulk_create(
    Produto(**item) for item in dff.to_dict('records')
)

@rg3915
Copy link
Author

rg3915 commented Jan 22, 2020

Intersecção de dataframes

import pandas as pd
import numpy as np
import datetime
from random import randint

df1 = pd.DataFrame({
    'letters': ['A', 'B', 'C', 'D', 'E', 'J', 'K', 'M'],
    'B': np.random.randint(0, 10, 8),
})
df1

df2 = pd.DataFrame({
    'letters': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'U', 'Z'],
    'B': np.random.randint(0, 10, 26),
})
df2

# Retorna o que tem de comum nos dois dataframes.
pd.merge(df1, df2, how='inner', on='fruits')

# Retorna o que tem de comum, considerando o df1.
pd.merge(df1, df2, how='left', on='fruits')

# Retorna o que tem de comum, considerando o df2.
pd.merge(df1, df2, how='right', on='fruits')

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment