Vijona

4 Feb at 16:36

Using StandardScaler() Function to Standardize Python Data

In this article, we focus on one of the most important pre-processing techniques in Python: standardization using the StandardScaler() function.

So, let us begin!

Need for Standardization

Before getting into Standardization, let us first understand the concept of Scaling.

Feature scaling is an essential step when fitting models to datasets. The data used for modeling is usually gathered through various means, such as:

Questionnaires
Surveys
Research
Scraping, etc.

As a result, the data contains features with very different units and ranges. These differences can adversely affect modeling, because features with large numeric ranges tend to dominate distance-based and gradient-based algorithms.

This can bias the predictions, which shows up as a higher misclassification error and a lower accuracy. It is therefore necessary to scale the data before modeling.
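To make the effect of mismatched scales concrete, here is a minimal sketch. The income and age values are made up for illustration and are not from any real dataset. It compares Euclidean distances before and after rescaling each column to zero mean and unit standard deviation.

```python
import numpy as np

# Hypothetical samples: column 0 = annual income, column 1 = age.
samples = np.array([
    [50_000.0, 25.0],
    [51_000.0, 60.0],  # similar income, very different age
    [58_000.0, 26.0],  # different income, similar age
])

def euclidean(a, b):
    """Plain Euclidean distance between two feature vectors."""
    return float(np.sqrt(np.sum((a - b) ** 2)))

# On the raw data the income column dominates the distance.
raw_01 = euclidean(samples[0], samples[1])
raw_02 = euclidean(samples[0], samples[2])

# Rescale every column to zero mean and unit standard deviation.
scaled = (samples - samples.mean(axis=0)) / samples.std(axis=0)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])

print("raw distances:   ", raw_01, raw_02)
print("scaled distances:", scaled_01, scaled_02)
```

On the raw values, the 35-year age gap barely changes the distance, while after rescaling both features contribute on a comparable footing.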

This is where standardization comes into the picture.

Standardization is a scaling technique that makes the data scale-free by transforming each feature so that its distribution has the following properties:

mean – 0 (zero)
standard deviation – 1

After this transformation, every feature in the dataset has zero mean and unit variance.
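The transformation behind this is usually written as z = (x - mean) / standard deviation, applied to each feature separately. The sketch below works through it by hand with NumPy on a small made-up array. It uses the population standard deviation (ddof=0), which is also what StandardScaler uses.

```python
import numpy as np

# Illustrative values only; any numeric column would do.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = values.mean()   # arithmetic mean of the column
std = values.std()     # population standard deviation (ddof=0)

# z-score: subtract the mean, then divide by the standard deviation
z = (values - mean) / std

print("mean:", mean, "std:", std)
print("standardized values:", z)
print("new mean:", z.mean(), "new std:", z.std())
```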

Let us now try to implement the concept of Standardization in the upcoming sections.

Python sklearn StandardScaler() function

The Python sklearn library (scikit-learn) provides StandardScaler() in its sklearn.preprocessing module to standardize data values.

Syntax:

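from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()  # avoid naming it "object", which shadows a Python built-in
scaled_data = scaler.fit_transform(data)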

According to the above syntax, we first create a StandardScaler object. We then call fit_transform() on that object, which learns the mean and standard deviation of each feature from the data and returns the standardized values.
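fit_transform() combines two steps that can also be run separately: fit() learns the statistics and transform() applies them. Here is a minimal sketch of that split on made-up arrays (train and new are hypothetical names). It is useful when data that arrives later must be scaled with the statistics learned earlier.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: three "training" rows and one row seen later.
train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
new = np.array([[2.0, 400.0]])

scaler = StandardScaler()
scaler.fit(train)                       # learn per-column mean and std
train_scaled = scaler.transform(train)  # apply them to the same data
new_scaled = scaler.transform(new)      # reuse them on unseen data

print("learned means:", scaler.mean_)
print("learned standard deviations:", scaler.scale_)
print("scaled new row:", new_scaled)
```

Calling fit() only on the training portion keeps information from later data out of the learned mean and standard deviation.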

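Note: Standardization does not require the data to follow a normal distribution, but it is most meaningful when the features are roughly bell-shaped. It is also sensitive to outliers, because extreme values shift the mean and inflate the standard deviation.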

Standardizing data with StandardScaler() function

Have a look at the below example!

from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

dataset = load_iris()
scaler = StandardScaler()  # avoid naming it "object", which shadows a Python built-in

# Splitting the independent and dependent variables
i_data = dataset.data
response = dataset.target

# Standardization: each feature column gets zero mean and unit variance
scale = scaler.fit_transform(i_data)
print(scale)

Explanation:

Import the required libraries. We import StandardScaler from sklearn.preprocessing.
Load the dataset. Here we use the Iris dataset from sklearn.datasets.
Create a StandardScaler object.
Separate the independent (feature) variables from the target variable, as shown above.
Standardize the features by calling fit_transform() on them.
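As a quick sanity check on the steps above, the following sketch confirms that each standardized Iris column has a mean close to 0 and a standard deviation close to 1. It also shows that inverse_transform() recovers the original values. The variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

features = load_iris().data          # 150 rows, 4 feature columns

scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# Per-column statistics of the standardized data
col_means = scaled.mean(axis=0)
col_stds = scaled.std(axis=0)
print("column means:", np.round(col_means, 6))
print("column standard deviations:", np.round(col_stds, 6))

# Undo the scaling to get back the original measurements
restored = scaler.inverse_transform(scaled)
print("round trip matches original:", np.allclose(restored, features))
```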

Source: digitalocean.com


