@hritik5102
Last active October 19, 2020 13:29

Microsoft Azure Machine Learning Studio experiment (dataset: Automobile price data)

Problem statement

A Chinese automobile company, Geely Auto, aspires to enter the US market by setting up a manufacturing unit there and producing cars locally to compete with its US and European counterparts. It has contracted an automobile consulting company to understand the factors on which the pricing of cars depends. Specifically, it wants to understand the factors affecting the pricing of cars in the American market, since those may be very different from the Chinese market. The company wants to know:

  • Which variables are significant in predicting the price of a car
  • How well those variables describe the price of a car

Based on various market surveys, the consulting firm has gathered a large dataset of different types of cars across the American market.

Business Goal

You are required to model the price of cars with the available independent variables. It will be used by the management to understand how exactly the prices vary with the independent variables. They can accordingly manipulate the design of the cars, the business strategy etc. to meet certain price levels. Further, the model will be a good way for management to understand the pricing dynamics of a new market.

Aim: To predict the price of a car.

Columns used:

  1. make
  2. body-style
  3. wheel-base
  4. engine-size
  5. horsepower
  6. peak-rpm
  7. highway-mpg
  8. price (the label, not an input feature)

Label: price

Algorithm: Linear Regression

Step 1 – Go to https://studio.azureml.net/

We can either use our own dataset or one provided by Azure. Here we are going to use the sample dataset provided by Azure, but you can use your own.

Step 2 – Upload the dataset

Step 3 – Create a new experiment by clicking +NEW at the bottom of the Machine Learning Studio (classic) window. Select EXPERIMENT > Blank Experiment.

Drag and drop the dataset onto the experiment canvas.

In this dataset, each row represents an automobile, and the variables associated with each automobile appear as columns. We'll predict the price in the far-right column (column 26, titled "price") using the variables for a specific automobile.

Add a Select Columns in Dataset module.

Prepare the data

A dataset usually requires some preprocessing before it can be analyzed. You might have noticed the missing values present in the columns of various rows. These missing values need to be cleaned so the model can analyze the data correctly. We'll remove any rows that have missing values. Also, the normalized-losses column has a large proportion of missing values, so we'll exclude that column from the model altogether.
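
Outside of Studio, the same two cleaning steps can be sketched in plain Python. This is only an illustration, not what the Studio modules do internally: the three rows are made-up values, and the "?" marker for a missing value is an assumption about how a raw file might encode it.

```python
# Toy rows with made-up values; "?" marks a missing value (an assumption
# about the encoding, not something taken from the Studio dataset).
rows = [
    {"make": "audi", "normalized-losses": "164", "horsepower": "102", "price": "13950"},
    {"make": "bmw", "normalized-losses": "?", "horsepower": "101", "price": "16430"},
    {"make": "mazda", "normalized-losses": "?", "horsepower": "?", "price": "10945"},
]

MISSING = "?"


def drop_column(rows, name):
    """Return copies of the rows without the named column."""
    return [{k: v for k, v in row.items() if k != name} for row in rows]


def drop_rows_with_missing(rows):
    """Keep only the rows in which no value is missing."""
    return [row for row in rows if MISSING not in row.values()]


# Exclude the sparse column first, then remove the incomplete rows.
without_losses = drop_column(rows, "normalized-losses")
cleaned = drop_rows_with_missing(without_losses)

print(len(cleaned))  # 2
```

The order matters here: dropping incomplete rows first would also discard the bmw row, which is missing only its normalized-losses value.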

Add a Clean Missing Data module to remove the rows that have missing values.

Add another Select Columns in Dataset module to keep only the columns used by the model.

Add a Split Data module to divide the data into a training set and a testing set.
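
As a rough picture of what a train/test split does, here is a minimal Python sketch. The function name `split_rows`, the 0.75 training fraction, and the seed are assumptions chosen for illustration; the text above does not say which fraction is used in the experiment.

```python
import random


def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(20))  # stand-ins for 20 cleaned automobile rows
train, test = split_rows(rows)
print(len(train), len(test))  # 15 5
```

Every row ends up in exactly one of the two sets, so the model is later evaluated on rows it never saw during training.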

This is a regression problem, so we choose the Linear Regression model.

Now select the target (label) column, price.
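
To make the idea of fitting a linear model to a label concrete, here is a minimal sketch of ordinary least squares with a single input feature. It is a simplification: the experiment uses several features, and this is not a description of how the Studio module is implemented. The helper names and the (engine-size, price) pairs are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Apply a fitted (intercept, slope) pair to a new input."""
    intercept, slope = model
    return intercept + slope * x


# Made-up pairs that lie exactly on price = 100 * engine_size + 1000.
engine_size = [100, 120, 150, 200]
price = [11000, 13000, 16000, 21000]

model = fit_line(engine_size, price)
print(predict(model, 130))  # 14000.0
```

Because the toy points lie exactly on a line, the fit recovers that line; real price data would leave residual errors.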

Now add the Score Model module.

Now evaluate the model using the testing dataset.

Now click on the RUN button

Check the evaluation results, i.e. the metric scores.
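
For regression models, the metrics reported typically include the mean absolute error, the root mean squared error, and the coefficient of determination. The sketch below shows how these three are computed from actual and predicted prices; the function name and the numbers are made up for illustration and are not results from this experiment.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up prices, not output from the experiment.
actual = [10000, 12000, 14000, 16000]
predicted = [11000, 11000, 15000, 15000]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Lower MAE and RMSE mean smaller errors in price units, while a coefficient of determination closer to 1 means the model explains more of the variation in price.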
