Image Pre-processing Part 5: Image Segmentation


In previous posts, we discussed different pre-processing techniques using OpenCV.

In this post, I give a theoretical overview of an important and widely used concept: image segmentation.

Before moving ahead, we first need to understand the difference between object recognition, object detection, and image segmentation.

These three terms seem similar, but they differ considerably.

Object Recognition:

As the name suggests, object recognition identifies whether any object (of a given class) is present in the image and, if so, what that object is.

Input: Image

Output: A class label to which the object in the image belongs

Object detection:

It includes localization along with object recognition. When the task is to figure out both what the object is and where it is located, it is called object detection.

Input: Image

Output: Class label + bounding box information (x, y, w, h)

Image segmentation:

Along with object detection, it identifies the shape of each object present in the image.

Let’s understand it in detail:

Segmentation provides the exact outline of the objects present in the image.
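To make the difference between the three outputs concrete, here is a minimal sketch that starts from a segmentation mask and derives the coarser answers from it. The 6x6 mask and the helper name `mask_to_bbox` are made up for illustration; NumPy is assumed to be installed.

```python
import numpy as np

# Hypothetical 6x6 segmentation mask: 1 marks object pixels, 0 marks background.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[2:5, 1:4] = 1
mask[2, 1] = 0  # remove one corner so the shape is not a plain rectangle

def mask_to_bbox(mask):
    """Return (x, y, w, h) of the tightest box around the nonzero pixels."""
    ys, xs = np.nonzero(mask)
    x, y = xs.min(), ys.min()
    w = xs.max() - x + 1
    h = ys.max() - y + 1
    return int(x), int(y), int(w), int(h)

object_present = bool(mask.any())  # recognition-style answer: is an object there?
bbox = mask_to_bbox(mask)          # detection-style answer: where is it?
pixel_count = int(mask.sum())      # segmentation keeps the exact shape, pixel by pixel
```

The bounding box covers nine pixels, while the mask records that only eight of them belong to the object. That extra detail is what segmentation adds on top of detection.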

How image segmentation works

Image segmentation works at the level of individual pixels: every pixel is examined, and a mask is created by drawing contours around the shape of each object. This lets it capture the granular details of the image.

Where do we use image segmentation

  • Background removal: In some cases, only the foreground is needed, so that features are extracted from the object alone.
  • Shape detection: In some use cases, the shape of an object plays an important role in determining its type.

For example, in medical imaging, when detecting cancer cells, shape is a main identifier of what kind of cancer it is and how serious the situation is.
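The background removal use case can be sketched in a few lines, assuming a mask is already available. The function name `remove_background` and the tiny test image are hypothetical; only NumPy is used.

```python
import numpy as np

def remove_background(image, mask, fill=0):
    """Keep pixels where mask is nonzero; replace the rest with `fill`."""
    mask = mask.astype(bool)
    if image.ndim == 3:
        # Colour image: add a channel axis so the mask broadcasts over channels.
        mask = mask[..., None]
    return np.where(mask, image, fill).astype(image.dtype)

# Made-up 2x3 colour image (values 10..27) and a matching 2x3 mask.
image = (np.arange(2 * 3 * 3).reshape(2, 3, 3) + 10).astype(np.uint8)
mask = np.array([[1, 0, 1],
                 [0, 1, 0]])

foreground = remove_background(image, mask)
```

Pixels outside the mask become black here; passing a different `fill` value would give a different background.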

Types of Image Segmentation

  • Instance segmentation
  • Semantic segmentation

Semantic Segmentation

In any image, every pixel belongs to a particular class (for example, foreground or background).

So all pixels that belong to the same class are displayed using the same color.
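As a small illustration of "same class, same color", the sketch below turns a label map into a color image with a lookup palette. The class ids (0 = background, 1 = person, 2 = car) and the colors are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical semantic label map: one class id per pixel.
label_map = np.array([[0, 0, 1, 1],
                      [0, 2, 2, 1],
                      [0, 2, 2, 0]])

# One RGB color per class id (row index = class id).
palette = np.array([[0, 0, 0],      # 0: background -> black
                    [255, 0, 0],    # 1: person     -> red
                    [0, 0, 255]],   # 2: car        -> blue
                   dtype=np.uint8)

# Indexing the palette with the label map colors every pixel by its class.
color_image = palette[label_map]
```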


Instance Segmentation

Instance segmentation differentiates between different objects of the same class. If multiple objects of the same class are present, instance segmentation assigns a different color to each object.

e.g. In the above image, different colors are used to mark each person.
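One simple way to get from a single-class mask to separate instances is connected-component labelling, sketched below. This is only a stand-in under a strong assumption: the objects do not touch each other. Real instance segmentation models can also separate touching or overlapping objects. The function name `label_instances` and the mask are hypothetical.

```python
from collections import deque
import numpy as np

def label_instances(mask):
    """Give each 4-connected blob of nonzero pixels its own id (1, 2, ...)."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if not mask[sy, sx] or labels[sy, sx]:
                continue  # background, or already part of an instance
            next_id += 1
            labels[sy, sx] = next_id
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill of this blob
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny, nx] and not labels[ny, nx]):
                        labels[ny, nx] = next_id
                        queue.append((ny, nx))
    return labels, next_id

# Made-up "person" mask with two separate blobs.
person_mask = np.array([[1, 1, 0, 0, 1],
                        [1, 1, 0, 0, 1],
                        [0, 0, 0, 0, 1]])

instances, count = label_instances(person_mask)
```

Each id in `instances` can then be mapped to its own color, exactly as class ids are mapped in semantic segmentation.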

Methods for image segmentation

Multiple methods are available for image segmentation, some of which are as follows:

  • Threshold based segmentation
  • Edge based segmentation
  • Clustering based segmentation
  • Using Neural Network (Mask R-CNN)
  • Using Watershed algorithm
  • Region based segmentation

Threshold based segmentation

Threshold based segmentation assumes that the foreground and background differ in pixel intensity. That difference can be used as a basis to segment the objects using different types of thresholding.
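Here is a minimal sketch of the idea on an 8-bit grayscale image, using NumPy only. It shows a fixed global threshold and Otsu's method, which picks the threshold automatically; both are my choice of examples, not necessarily the types this series has in mind. The function names and the test image are hypothetical. In OpenCV, `cv2.threshold` covers the same ground.

```python
import numpy as np

def global_threshold(gray, t):
    """Pixels brighter than t become foreground (1), the rest background (0)."""
    return (gray > t).astype(np.uint8)

def otsu_threshold(gray):
    """Pick the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)           # weight of the class "intensity <= t"
    mu = np.cumsum(prob * levels)  # cumulative mean up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0                  # weight of the class "intensity > t"
    valid = (w0 > 0) & (w1 > 0)    # skip thresholds that leave a class empty
    between = np.zeros(256)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Made-up image: dark background (40) with a bright 2x3 object (200).
gray = np.full((6, 6), 40, dtype=np.uint8)
gray[2:4, 2:5] = 200

t = otsu_threshold(gray)
mask = global_threshold(gray, t)
```

With only two intensity values in the image, any threshold between them separates the object, so the resulting mask marks exactly the six bright pixels.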

Usually, three types of thresholding are used for image segmentation:

Edge based Segmentation

Edges are local discontinuities in the image, i.e. places where intensity changes sharply. These features are used to differentiate the foreground from the background.

Using this method, edges are first detected through different filters (e.g. the Canny edge detector) and are then connected together to build the boundaries of the object.
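The detection step can be sketched with a Sobel gradient, which is a simpler stand-in for Canny (it has no non-maximum suppression or hysteresis) and does not link the edges into closed boundaries. The function name `sobel_edges`, the threshold value, and the test image are assumptions for illustration.

```python
import numpy as np

def sobel_edges(gray, threshold):
    """Mark pixels whose Sobel gradient magnitude exceeds `threshold`."""
    g = gray.astype(np.float64)
    p = np.pad(g, 1, mode="edge")  # repeat border pixels so output keeps its size
    # Horizontal gradient: right column minus left column, centre row weighted 2x.
    gx = ((p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2]))
    # Vertical gradient: bottom row minus top row, centre column weighted 2x.
    gy = ((p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:])
          - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:]))
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# Made-up image: dark left half, bright right half, so one vertical edge.
gray = np.zeros((5, 6), dtype=np.uint8)
gray[:, 3:] = 100

edges = sobel_edges(gray, threshold=100)
```

The response is two pixels wide (columns 2 and 3) because the Sobel kernel looks at both neighbours of each pixel; thinning this to a one-pixel line is one of the things Canny adds.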

Clustering based Segmentation

Using clustering, data points are divided into a number of groups in such a way that similar data points end up in the same cluster or group.

Clustering based segmentation is of two types:

  • Soft Clustering
  • Hard Clustering

Hard Clustering

If the pixels of an image are divided in such a way that each pixel can be assigned to only one cluster, it is hard clustering.

e.g. K-Means algorithm

In the K-Means algorithm, cluster centers are initialized first; each pixel is then assigned to its nearest center, and the centers are recomputed from their assigned pixels until the assignments stop changing.
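The loop described above can be sketched on pixel intensities as follows. To keep the example deterministic, the initial centers are spread evenly over the intensity range, which is an assumption of this sketch (K-Means is more commonly initialized at random). The names `assign` and `kmeans_segment` are hypothetical.

```python
import numpy as np

def assign(pixels, centers):
    """Index of the nearest center for every pixel."""
    return np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)

def kmeans_segment(gray, k=2, iters=20):
    pixels = gray.reshape(-1).astype(np.float64)
    # Deterministic initialization: centers spread over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        labels = assign(pixels, centers)
        # Move each center to the mean of its pixels (keep it if it has none).
        new_centers = np.array([
            pixels[labels == c].mean() if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break
    return assign(pixels, centers).reshape(gray.shape), centers

# Made-up image with a dark group and a bright group of pixels.
gray = np.array([[10, 12, 200],
                 [11, 198, 202]], dtype=np.uint8)

labels, centers = kmeans_segment(gray, k=2)
```

Every pixel ends up with exactly one label, which is what makes this hard clustering.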

Soft Clustering

Soft clustering is more natural compared to hard clustering, as an exact division is often not possible in real-life scenarios. In this type of clustering, one pixel can be assigned to more than one cluster.

e.g. Fuzzy K-Means
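Below is a minimal sketch of this idea using the standard fuzzy c-means update rules (fuzzy k-means is also known as fuzzy c-means). Each pixel gets a membership value per cluster, and the memberships of a pixel sum to 1. The fuzzifier `m=2`, the even initialization, and the function names are assumptions of the sketch.

```python
import numpy as np

def fuzzy_memberships(pixels, centers, m=2.0, eps=1e-9):
    """Membership of each pixel in each cluster; every row sums to 1."""
    dist = np.abs(pixels[:, None] - centers[None, :]) + eps  # eps avoids 1/0
    inv = dist ** (-2.0 / (m - 1.0))  # closer centers get larger weights
    return inv / inv.sum(axis=1, keepdims=True)

def fuzzy_cmeans(pixels, k=2, m=2.0, iters=50):
    pixels = pixels.astype(np.float64)
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        u = fuzzy_memberships(pixels, centers, m)
        weights = u ** m
        # Each center is the membership-weighted mean of all pixels.
        centers = (weights * pixels[:, None]).sum(axis=0) / weights.sum(axis=0)
    return fuzzy_memberships(pixels, centers, m), centers

# Made-up intensities: two clear groups plus one in-between pixel (100).
pixels = np.array([10, 12, 100, 198, 200])

u, centers = fuzzy_cmeans(pixels, k=2)
```

The pixel with intensity 100 sits between the two groups, so it receives a substantial membership in both clusters instead of being forced into one.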

Segmentation Using Neural Network (Mask R-CNN)

A neural network is often a more effective way to solve the image segmentation problem, as it learns from images rather than relying on fixed rules. Through a neural network, a pixel-wise mask is created for each object present in the image.

E.g. Mask R-CNN

In Mask R-CNN, the output takes the form of a class id, a bounding box, and a mask for each detected object.
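To show what working with that output can look like, here is a sketch that post-processes a prediction without running any model. The `prediction` dictionary is made up; its layout is modelled on the per-image result of torchvision's Mask R-CNN (boxes as x1, y1, x2, y2 and masks as per-pixel probabilities), which is an assumption about the library you might use. The helper `select_instances` and both thresholds are hypothetical.

```python
import numpy as np

# Made-up prediction for one 5x5 image with two candidate objects.
prediction = {
    "labels": np.array([1, 3]),                  # class ids
    "scores": np.array([0.92, 0.30]),            # detection confidence
    "boxes": np.array([[1, 1, 4, 4],
                       [0, 0, 2, 2]], dtype=np.float32),
    "masks": np.zeros((2, 1, 5, 5), dtype=np.float32),  # soft masks in [0, 1]
}
prediction["masks"][0, 0, 1:4, 1:4] = 0.9
prediction["masks"][1, 0, 0:2, 0:2] = 0.6

def select_instances(pred, score_thresh=0.5, mask_thresh=0.5):
    """Drop low-confidence detections and binarize the remaining soft masks."""
    keep = pred["scores"] >= score_thresh
    soft_masks = pred["masks"][keep][:, 0]   # remove the channel axis
    binary_masks = soft_masks > mask_thresh
    return pred["labels"][keep], pred["boxes"][keep], binary_masks

labels, boxes, masks = select_instances(prediction)
```

Only the first detection survives the score filter, and its mask becomes a boolean array that can be used directly for background removal or coloring.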

Segmentation using Watershed algorithm

The watershed algorithm treats image intensity as a landscape of water basins. Each intensity minimum is like a hole at the bottom of a basin through which water rises. As the water reaches the border of a basin, neighbouring basins would merge. To maintain the separation between basins, dams are required. The same applies to images, where a dam can be considered the border of a region. These dams, i.e. the borders of objects, can be created using the dilation method.
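The flooding idea can be sketched with a priority queue: water starts at labelled marker pixels and always spreads to the lowest unvisited neighbour first. Note that this sketch is a simplified marker-based flood, not the dilation approach mentioned above, and it marks dams afterwards as the places where two labels meet. The function name `flood_from_markers` and the test landscape are hypothetical. OpenCV provides its own implementation as `cv2.watershed`.

```python
import heapq
import numpy as np

def flood_from_markers(gray, markers):
    """Grow each nonzero marker label outward, lowest intensities first."""
    h, w = gray.shape
    labels = markers.copy()
    heap = []
    counter = 0  # tie-breaker so equal intensities pop in insertion order
    for y, x in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (int(gray[y, x]), counter, int(y), int(x)))
        counter += 1
    while heap:
        _, _, y, x = heapq.heappop(heap)
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] == 0:
                labels[ny, nx] = labels[y, x]  # water from this basin arrives first
                heapq.heappush(heap, (int(gray[ny, nx]), counter, ny, nx))
                counter += 1
    return labels

# Made-up landscape: two basins (low values) separated by a ridge (9).
row = np.array([0, 1, 5, 9, 4, 1, 0], dtype=np.uint8)
gray = np.tile(row, (3, 1))

markers = np.zeros(gray.shape, dtype=np.int32)
markers[1, 0] = 1   # seed inside the left basin
markers[1, 6] = 2   # seed inside the right basin

labels = flood_from_markers(gray, markers)
# A dam lies wherever horizontally adjacent pixels carry different labels.
dam = labels[:, :-1] != labels[:, 1:]
```

The two floods meet at the ridge, and that meeting line is the border between the two segmented regions.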

Region based Segmentation

Region based segmentation can be divided into two parts:

  • Merging
  • Splitting


Merging

As the name suggests, merging first selects a small region and then combines it with adjacent similar regions.
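A minimal sketch of merging is region growing from a single seed pixel: neighbours are absorbed while their intensity stays close to the mean of the region so far. The similarity rule (absolute difference to the region mean within `tol`), the function name `grow_region`, and the test image are assumptions for illustration.

```python
from collections import deque
import numpy as np

def grow_region(gray, seed, tol=10):
    """Grow a region from `seed`, absorbing 4-connected similar pixels."""
    h, w = gray.shape
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    total, count = float(gray[seed]), 1  # running sum and size for the mean
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if not (0 <= ny < h and 0 <= nx < w) or region[ny, nx]:
                continue
            value = float(gray[ny, nx])
            if abs(value - total / count) <= tol:
                region[ny, nx] = True
                total += value
                count += 1
                queue.append((ny, nx))
    return region

# Made-up image: a dark 2x2 patch in the top-left corner, bright elsewhere.
gray = np.array([[10, 12, 90],
                 [11, 13, 95],
                 [88, 92, 91]], dtype=np.uint8)

region = grow_region(gray, seed=(0, 0), tol=10)
```

Starting from the top-left pixel, the region absorbs the three similar dark neighbours and stops at the bright pixels.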


Splitting

In this method, the whole image is considered first and is then divided into regions that have similar characteristics.
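Splitting can be sketched as a quadtree: start with the whole image and keep cutting a block into four quadrants until every block is homogeneous. Here "homogeneous" means the intensity range inside the block is at most `max_range`, which is an assumption of this sketch, as are the function name `split_regions` and the test image.

```python
import numpy as np

def split_regions(gray, max_range=10):
    """Return homogeneous blocks as (y, x, height, width) tuples."""
    blocks = []

    def split(y, x, h, w):
        block = gray[y:y + h, x:x + w]
        # A block is homogeneous if its brightest and darkest pixels are close.
        if int(block.max()) - int(block.min()) <= max_range:
            blocks.append((y, x, h, w))
            return
        top, left = (h + 1) // 2, (w + 1) // 2
        for yy, hh in ((y, top), (y + top, h - top)):
            for xx, ww in ((x, left), (x + left, w - left)):
                if hh > 0 and ww > 0:  # skip empty quadrants of thin blocks
                    split(yy, xx, hh, ww)

    split(0, 0, gray.shape[0], gray.shape[1])
    return blocks

# Made-up 4x4 image: uniform except for a bright top-right quadrant
# that itself contains one even brighter pixel.
gray = np.zeros((4, 4), dtype=np.uint8)
gray[0:2, 2:4] = 100
gray[0, 3] = 200

blocks = split_regions(gray, max_range=10)
```

Three quadrants are accepted as they are, while the top-right one is split again into single pixels. In practice, splitting is often followed by a merging pass that joins adjacent blocks with similar intensity.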