How to Normalise a Pandas DataFrame Column?

This recipe helps you Normalise a Pandas DataFrame Column
Last Updated: 22 Dec 2022


Recipe Objective

In many datasets, some features span a very wide range of values while others do not. When training a model, features with a large range can affect the model more strongly and bias it towards those features. To avoid this we normalize the dataset, i.e. rescale the values to a common range while keeping their relative differences the same.

Here we use a min-max normalizer, which rescales the data to the range 0 to 1 so that the minimum value in the dataset becomes 0 and the maximum becomes 1.
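As a minimal sketch of what min-max normalization computes, the formula (x - min) / (max - min) can be applied by hand with pandas alone. The sample values are the ones this recipe uses in Step 2; the names s and normalized are only illustrative.

```python
import pandas as pd

# Same sample values the recipe uses in Step 2.
s = pd.Series([23, 243, 17, 30, -79, 40, 173, -20, 69, 170], name="values")

# Min-max formula: subtract the minimum, then divide by the range (max - min).
normalized = (s - s.min()) / (s.max() - s.min())

# The smallest value maps to 0 and the largest maps to 1.
print(normalized.min(), normalized.max())
```

This should agree with what MinMaxScaler produces with its default settings, which is the approach the rest of the recipe follows.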

So this recipe is a short example of how we can normalise a Pandas DataFrame column.


Step 1 - Import the library

import pandas as pd
from sklearn import preprocessing

We have imported pandas and the preprocessing module from the sklearn library.

Step 2 - Setup the Data

Here we have created a dictionary named data and passed it to pd.DataFrame to create a DataFrame with a column named values. We have also used a print statement to print the dataframe.

data = {'values': [23,243,17,30,-79,40,173,-20,69,170]}
df = pd.DataFrame(data)
print(df)

Step 3 - Using MinMaxScaler and transforming the Dataframe

Now that the dataframe is ready, it's time to call MinMaxScaler and learn about its parameters. Two of them are:

feature_range : This parameter sets the minimum and maximum value of the normalized data, passed as a tuple (min, max). By default it is (0, 1).
copy : A bool parameter, True by default. When it is True, the scaler works on a copy of the input and leaves the original data unchanged; setting it to False asks the scaler to normalize in place where possible.
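To illustrate feature_range, here is a short sketch that scales the same sample values into the range -1 to 1 instead of the default 0 to 1. The target range (-1, 1) is chosen purely as an example and is not part of the original recipe.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Same sample values the recipe uses in Step 2.
df = pd.DataFrame({"values": [23, 243, 17, 30, -79, 40, 173, -20, 69, 170]})

# Ask for output in [-1, 1] rather than the default [0, 1].
scaler = MinMaxScaler(feature_range=(-1, 1))
scaled = scaler.fit_transform(df)  # returns a NumPy array, one column per feature

# The smallest input maps to -1 and the largest to 1 (up to floating-point error).
print(scaled.min(), scaled.max())

# With the default copy=True, the original dataframe is left untouched.
print(df["values"].tolist())
```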

We are calling MinMaxScaler with default parameters.

min_max_scaler = preprocessing.MinMaxScaler()

Now we normalize the dataframe (df) using the fit_transform function of MinMaxScaler and build a new dataframe from the normalized array.

x_scaled = min_max_scaler.fit_transform(df)
df_normalized = pd.DataFrame(x_scaled)
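When the dataframe has more than one column and only one of them should be normalized, a possible variation is to scale that column alone and store the result under a named column. This is a sketch under assumptions: the label column and the values_normalized name are hypothetical and do not appear in the original recipe.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# "label" is a hypothetical extra column added only for illustration.
df = pd.DataFrame({
    "values": [23, 243, 17, 30, -79, 40, 173, -20, 69, 170],
    "label": list("abcdefghij"),
})

scaler = MinMaxScaler()

# Double brackets select a one-column DataFrame (2-D), which the scaler expects.
# ravel() flattens the (n, 1) result so it can be stored as a single column.
df["values_normalized"] = scaler.fit_transform(df[["values"]]).ravel()

print(df)
```

Writing the result into a named column keeps the original index and the other columns intact, whereas pd.DataFrame(x_scaled) produces a new dataframe whose column is labelled 0.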


Step 4 - Viewing the DataFrame

Finally we print the normalized dataframe and observe that the values now lie in the range 0 to 1.

print(df_normalized)

The output shows the original dataframe printed in Step 2, followed by the normalized one:

   values
0      23
1     243
2      17
3      30
4     -79
5      40
6     173
7     -20
8      69
9     170

          0
0  0.316770
1  1.000000
2  0.298137
3  0.338509
4  0.000000
5  0.369565
6  0.782609
7  0.183230
8  0.459627
9  0.773292
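The output can be checked against the min-max formula: the minimum is -79 and the maximum is 243, so 23 maps to (23 + 79) / 322, which is approximately 0.316770. As a further sketch, the fitted scaler exposes the minimum and maximum it learned through its data_min_ and data_max_ attributes, and inverse_transform maps the normalized values back to the original scale. The variable name restored is only illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Same sample values the recipe uses in Step 2.
df = pd.DataFrame({"values": [23, 243, 17, 30, -79, 40, 173, -20, 69, 170]})

scaler = MinMaxScaler()
x_scaled = scaler.fit_transform(df)

# Per-column minimum and maximum seen during fitting.
print(scaler.data_min_, scaler.data_max_)

# Undo the scaling to recover the original values.
restored = scaler.inverse_transform(x_scaled)
print(np.allclose(restored, df.to_numpy()))
```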
