Customer Segmentation

Customer segmentation, also referred to as market segmentation, is the process of finding homogeneous sub-groups within a heterogeneous aggregate market. Typically this approach is used in direct marketing to target and focus on increasingly well-defined and profitable market segments. The process of segmentation begins with observing customer actions and continues with learning about the demographic and psychographic characteristics of these customers. This approach is also used in economics.

Detecting these sub-groups within the market enables an organization to better understand its customers. Learning about clusters within the customer base allows for customized marketing plans that cater specifically to the needs of a particular group. Market segments can be used to find the most profitable groups of customers, allowing the company to focus on retaining these valuable customers. Other segments may consist of customers the company is at high risk of losing. A cost versus benefit study would help determine how aggressively those at-risk customers should be pursued.

Generally, a customer database for a marketing study is quite large, possibly containing millions of records and hundreds if not thousands of variables. Because of the size of the data and the complexities found within it, data mining tools are often the most appropriate means of uncovering information from the data. The following are descriptions of data mining techniques commonly used for customer and market segmentation.

Cluster analysis is a tool commonly used for customer segmentation. In cluster analysis, the goal is to organize observed data into a meaningful structure. This type of analysis differs from traditional statistical approaches such as linear regression in that cluster analysis does not have a dependent variable. Both continuous and categorical variables are used to find sub-groups, or clusters. These clusters should consist of observations that are similar to other members of the same cluster and different from members of other clusters. Once clusters are found, their characteristics can be explored, providing insight into their members, and new observations can be assigned to clusters. The accompanying graph, a bivariate histogram of cluster by age, shows frequencies for age groups across the 3 clusters found in a marketing study.
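To make the idea of grouping without a dependent variable concrete, the following is a minimal sketch using k-means, which is only one of many clustering methods and is not necessarily the one used in the study described here. It assumes purely numeric variables (k-means does not handle categorical variables directly), and the customer records, the starting centroids, and the function names `kmeans` and `assign` are all made up for illustration.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on numeric tuples.

    points    -- list of equal-length numeric tuples
    centroids -- initial cluster centers (hypothetical starting values)
    Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def assign(point, centroids):
    """Assign a new observation to the cluster with the nearest centroid."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

# Made-up customers: (age, annual spend in hundreds)
customers = [(22, 5), (25, 6), (24, 4),
             (41, 30), (45, 28), (43, 33),
             (66, 12), (70, 10), (68, 14)]

# Assumption: start from one hand-picked customer per suspected group.
start = [customers[0], customers[3], customers[6]]
labels, centers = kmeans(customers, start)

# A new observation is placed in an existing cluster.
new_label = assign((44, 29), centers)
```

Here `centers` summarizes each cluster (average age and spend), which is one simple way to explore cluster characteristics, and `assign` shows how a new customer could be placed into an existing cluster. In practice the variables would usually be standardized first so that no single variable dominates the distance calculation.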

The tree-building approach CHAID (Chi-squared Automatic Interaction Detection) is also used for determining customer segments in a market. A CHAID decision tree uses multi-way splits, in which a node may be divided into more than two child nodes, to segment the data. Members of a node tend to be very similar to one another and different from members of other nodes. Because it often effectively yields many multi-way frequency tables when classifying a categorical response variable, this tool is popular in marketing research.
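The sketch below illustrates only the core idea behind a CHAID-style split: measure the association between each categorical predictor and the categorical response with a Pearson chi-square statistic, pick the most strongly associated predictor, and split the node once per category. It is a simplification under stated assumptions, not a full CHAID implementation; actual CHAID also merges similar categories and compares adjusted p-values rather than raw statistics. The data, the column names, and the helpers `chi_square` and `split` are hypothetical.

```python
from collections import Counter

def chi_square(rows, predictor, response):
    """Pearson chi-square statistic and degrees of freedom
    for the predictor-by-response frequency table."""
    observed = Counter((r[predictor], r[response]) for r in rows)
    pred_totals = Counter(r[predictor] for r in rows)
    resp_totals = Counter(r[response] for r in rows)
    n = len(rows)
    stat = 0.0
    for a in pred_totals:
        for b in resp_totals:
            expected = pred_totals[a] * resp_totals[b] / n
            stat += (observed[(a, b)] - expected) ** 2 / expected
    dof = (len(pred_totals) - 1) * (len(resp_totals) - 1)
    return stat, dof

def split(rows, predictor):
    """Multi-way split: one child node per category of the predictor."""
    nodes = {}
    for r in rows:
        nodes.setdefault(r[predictor], []).append(r)
    return nodes

# Made-up campaign data: (age_group, region, responded)
data = [
    ("young", "north", "yes"), ("young", "south", "yes"),
    ("young", "west", "yes"), ("young", "north", "yes"),
    ("middle", "south", "yes"), ("middle", "west", "yes"),
    ("middle", "north", "no"), ("middle", "south", "no"),
    ("senior", "west", "no"), ("senior", "north", "no"),
    ("senior", "south", "no"), ("senior", "west", "no"),
]
rows = [dict(zip(("age_group", "region", "responded"), t)) for t in data]

# Both predictors have three categories, so their statistics share the
# same degrees of freedom and can be compared directly in this toy case.
predictors = ["age_group", "region"]
best = max(predictors, key=lambda p: chi_square(rows, p, "responded")[0])
nodes = split(rows, best)
```

In this toy data the response depends on age group and not on region, so the split is made on `age_group` and produces three child nodes. When predictors have different numbers of categories, comparing raw statistics is not appropriate, which is why p-values are used in practice.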