@fenago
Created February 4, 2024 20:27
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Analyze the Cars data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cars = pd.read_pickle('cars.pkl')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# display the first five rows"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Melt the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# melt the enginesize and curbweight columns"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# create a scatterplot for the melted data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Rank the data by price"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# add a priceRank column that ranks each row by the price value"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# display the ten rows with the lowest price, sorted from lowest to highest"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Bin the data with quantiles"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use the pd.qcut() function to create three price bins for the data: low, medium, and high\n",
"# store these bins in a new column named priceGrade"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# display the number of values for each bin in the priceGrade column"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Group and aggregate the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# group the cars data by the priceGrade column and display the min and max price for each group"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# group the data by the carbody and aspiration columns and get the average price for each group"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# unstack the aspiration column of the index"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use the pivot_table() method to accomplish the same task as the previous cell"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# use the pandas plot() method to create a bar chart for the DataFrame created in the previous cell"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
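The melt, rank, and quantile-binning cells in the notebook are left blank as exercises. The following is a minimal sketch of one way they might be filled in, not the author's solution. Because cars.pkl is not reproduced here, it uses a small made-up DataFrame; only the column names (`price`, `enginesize`, `curbweight`, `priceRank`, `priceGrade`) come from the notebook comments, while `feature`, `featureValue`, and all values are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in for cars.pkl; only the column names come from the notebook.
cars = pd.DataFrame({
    "price": [5000, 7500, 9000, 12000, 16000, 30000],
    "enginesize": [90, 98, 110, 130, 150, 210],
    "curbweight": [1900, 2000, 2300, 2500, 2900, 3500],
})

# Melt: stack enginesize and curbweight into (feature, featureValue) pairs,
# repeating price for each pair. "feature" and "featureValue" are hypothetical names.
cars_melted = cars.melt(
    id_vars="price",
    value_vars=["enginesize", "curbweight"],
    var_name="feature",
    value_name="featureValue",
)

# Rank: 1 is the cheapest car; method="min" gives tied prices the same rank.
cars["priceRank"] = cars["price"].rank(method="min").astype(int)

# The ten lowest-priced rows, from lowest price to highest.
cheapest = cars.sort_values("priceRank").head(10)

# Quantile bins: pd.qcut() splits the prices into three roughly equal-sized groups.
cars["priceGrade"] = pd.qcut(cars["price"], q=3, labels=["low", "medium", "high"])

# Number of rows that fall in each bin.
grade_counts = cars["priceGrade"].value_counts()

# Min and max price per bin; observed=True limits the result to bins that occur.
grade_range = cars.groupby("priceGrade", observed=True)["price"].agg(["min", "max"])
```

For the scatterplot cell, something like `sns.relplot(data=cars_melted, x="price", y="featureValue", col="feature")` could work, assuming seaborn is installed; splitting by `feature` keeps the two very different scales from sharing one panel.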
The cars.pkl and fires_by_month.pkl files are available at https://github.com/fenago/datasets
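The notebook asks for the same table twice, once with `groupby()` plus `unstack()` and once with `pivot_table()`. Here is a rough sketch of why the two are interchangeable, using a tiny made-up DataFrame in place of cars.pkl. The column names `carbody`, `aspiration`, and `price` come from the notebook comments; the values are assumptions for illustration only.

```python
import pandas as pd

# Made-up stand-in for cars.pkl; only the column names come from the notebook.
cars = pd.DataFrame({
    "carbody": ["sedan", "sedan", "sedan", "hatchback", "hatchback", "wagon"],
    "aspiration": ["std", "std", "turbo", "std", "turbo", "std"],
    "price": [10000, 14000, 18000, 8000, 11000, 12000],
})

# Average price per (carbody, aspiration) pair; the result has a two-level index.
avg_price = cars.groupby(["carbody", "aspiration"])["price"].mean()

# Move the aspiration level of the index into the columns.
by_unstack = avg_price.unstack("aspiration")

# pivot_table() groups, aggregates, and reshapes in a single call.
by_pivot = cars.pivot_table(
    index="carbody",
    columns="aspiration",
    values="price",
    aggfunc="mean",
)

# Combinations with no rows (here wagon/turbo) show up as NaN in both tables.
same_result = by_unstack.equals(by_pivot)
```

Either table can then be passed to the bar chart step, for example `by_pivot.plot(kind="bar")`, assuming matplotlib is installed.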