This module introduces clustering, one of the most popular datamining techniques. First we discuss what clustering is and when it is needed. Next come some general topics, such as how to calculate distance or dissimilarity functions and what to do with categorical attributes. The discussion then splits into the two major approaches to clustering, the hierarchical method and the partitioning method, and the different methods under each category are discussed in detail. Working code for these methods will be available shortly on the author's personal web site, http://emailsaptarshi.googlepages.com . The module ends with a discussion of real-life problems.

Clustering keywords: distance function, dissimilarity matrix, hierarchical clustering, partitioning clustering method, K-means, PAM, single linkage, average linkage.

Clustering, as the name suggests, is a technique for forming groups, or clusters, from a set of objects. When and where to use clustering is the biggest question for anyone learning the subject for the first time. Before taking the subject further, let's look at a scenario where clustering can be used to reach a solution.

An example: a bank, xyz, wants to study its credit card customers. The bank wants to study its customers' payment records, and it is also going to offer some facilities. It is almost impossible for the bank to study every customer individually, so the xyz managers want to group their customers and study each group or its representatives.
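
To make the bank scenario concrete, here is a minimal sketch of how customers might be grouped with K-means, one of the partitioning methods listed in the keywords. Everything specific in it is an assumption made for illustration: the two attributes (average days late and fraction of the balance paid), the six customer records, the choice of two groups, and the starting centers are all hypothetical, and the helper names `euclidean` and `k_means` are not taken from the author's code.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, centers, max_iter=100):
    """Plain K-means: assign each point to its nearest center, then move
    each center to the mean of its group, until the centers stop changing."""
    centers = list(centers)
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: euclidean(p, centers[i]))
            groups[nearest].append(p)
        # An empty group keeps its old center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups

# Hypothetical customers: (average days late, fraction of balance paid)
customers = [(1, 0.95), (2, 0.90), (0, 1.00),
             (25, 0.20), (30, 0.10), (28, 0.15)]

# Assumed starting centers: one prompt payer and one late payer.
centers, groups = k_means(customers, centers=[customers[0], customers[3]])

for center, group in zip(centers, groups):
    print("representative:", center, "members:", group)
```

Each returned center plays the role of a group representative that the managers could study instead of every individual customer. In this sketch the two attributes are on very different scales, so the days-late value dominates the distance; a real study would likely rescale the attributes first.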

Source:  OpenStax, Datamining. OpenStax CNX. Aug 15, 2006 Download for free at http://cnx.org/content/col10356/1.3