www.statsoft.com

STATISTICA Sequence, Association, and Link Analysis

 


STATISTICA Sequence, Association, and Link Analysis (SAL) is designed to address the needs of clients in industries such as healthcare, retailing, banking, and insurance. It can be used for model building and deployment. SAL implements several state-of-the-art techniques designed for extracting rules from datasets (databases) that can be generally characterized as "market baskets."

"Market-Basket" Metaphor

[Image: fruit basket, Creative Commons license, http://www.flickr.com/photos/smudie/15233393/]

The market-basket problem assumes that there are a large number of products that can be purchased by the customer, either in a single transaction, or over time in a sequence of transactions. Such products can be goods displayed in a supermarket, spanning a wide range of items from groceries to electrical appliances, or they can be insurance packages which customers might be willing to purchase, etc. Customers fill their basket with only a fraction of what is on display or on offer.

Association Rules

[Figure: Association Rule graph]

Association rules can be extracted from a database of transactions to determine which products are frequently purchased together. For example, one might find that purchases of flashlights typically coincide with purchases of batteries in the same basket. When transactions are time-stamped, analysts can also track purchases over time.
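As a minimal sketch of what an association rule measures, the following Python example computes support and confidence for item pairs. It is an illustration only and does not reproduce the algorithm used by STATISTICA SAL; the `baskets` data, the `pair_rules` function, and the `min_support` threshold are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each set is the content of one basket.
baskets = [
    {"flashlight", "batteries"},
    {"flashlight", "batteries", "tape"},
    {"batteries", "tape"},
    {"flashlight", "batteries"},
    {"tape"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # P(rhs in basket | lhs in basket)
            rules.append((lhs, rhs, support, confidence))
    return rules


rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with a flashlight also contains batteries, so the rule "flashlight -> batteries" has a confidence of 1.0, while the reverse rule is weaker because batteries are also bought without a flashlight.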

Sequence Analysis

[Figure: Sequence Association Graph]

Sequence analysis is concerned with the subsequent purchase of a product or products given a previous purchase. For instance, buying an extended warranty is more likely to follow (in that specific sequential order) the purchase of a TV or other electrical appliance. Sequence rules, however, are not always that obvious, and sequence analysis helps you extract such rules no matter how hidden they may be in your market-basket data. Sequence analysis has a wide range of applications in many areas of industry and science, including customer shopping patterns, phone call patterns, stock market fluctuations, DNA sequences, and web-log streams.
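The order-dependence of a sequence rule can be illustrated with a short Python sketch. It assumes a hypothetical purchase log of `(customer, time, item)` tuples and a hypothetical helper `sequence_confidence`; it is not the method implemented in STATISTICA SAL.

```python
from collections import defaultdict

# Hypothetical time-stamped purchase log: (customer_id, time, item).
purchases = [
    ("c1", 1, "tv"), ("c1", 2, "warranty"),
    ("c2", 1, "tv"), ("c2", 3, "warranty"),
    ("c3", 1, "warranty"), ("c3", 2, "tv"),
    ("c4", 1, "tv"),
]


def sequence_confidence(purchases, first, then):
    """Share of customers who bought `first` and bought `then` strictly later."""
    history = defaultdict(list)
    for customer, time, item in purchases:
        history[customer].append((time, item))

    bought_first = 0
    followed = 0
    for events in history.values():
        first_times = [t for t, item in events if item == first]
        if not first_times:
            continue
        bought_first += 1
        earliest = min(first_times)
        # The rule holds only if `then` occurs after the earliest `first`.
        if any(item == then and t > earliest for t, item in events):
            followed += 1
    return followed / bought_first if bought_first else 0.0


print(sequence_confidence(purchases, "tv", "warranty"))
print(sequence_confidence(purchases, "warranty", "tv"))
```

Swapping the two arguments gives a different value, which is what distinguishes a sequence rule from a plain association rule on the same items.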

Link Analysis

Once extracted, rules about associations or the sequences of items as they occur in a transaction database can be extremely useful for numerous applications. Obviously, in retailing or marketing, knowledge of purchase "patterns" can help with the direct marketing of special offers to the "right" or "ready" customers (i.e., those that, according to the rules, are most likely to purchase some specific items given their observed past consumption patterns).

However, transaction databases also occur in many other areas of business, such as banking and general customer "intelligence." In fact, the term "link analysis" is often used when these techniques for extracting sequential or non-sequential association rules are applied to organize complex "evidence."

It is easy to see how the "transactions" or "market-basket" metaphor can be applied to situations where individuals engage in certain actions, open accounts, contact other specific individuals, and so on. Applying the technologies described here to such databases may quickly extract patterns and associations between individuals and actions, and hence, reveal the patterns and structure in datasets.
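One simple way to picture this is to treat each recorded event as a "basket" of the entities involved and to count how often pairs of entities appear together. The Python sketch below does this with made-up records; the names `events`, `link_weights`, and `neighbors` are hypothetical and are not part of STATISTICA SAL.

```python
from collections import Counter
from itertools import combinations

# Hypothetical event records: each set lists the entities involved in one event.
events = [
    {"alice", "account_17"},
    {"alice", "bob"},
    {"bob", "account_17"},
    {"alice", "bob", "account_17"},
    {"carol", "account_42"},
]


def link_weights(events):
    """Count how often each pair of entities appears in the same event."""
    links = Counter()
    for event in events:
        for pair in combinations(sorted(event), 2):
            links[pair] += 1
    return links


def neighbors(links, entity, min_weight=1):
    """Return the entities linked to `entity`, with their link weights."""
    result = {}
    for (a, b), weight in links.items():
        if weight < min_weight:
            continue
        if a == entity:
            result[b] = weight
        elif b == entity:
            result[a] = weight
    return result


links = link_weights(events)
print(neighbors(links, "alice"))
print(neighbors(links, "carol"))
```

The resulting weighted links form a graph in which closely connected individuals, accounts, and actions stand out from isolated ones.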

Functional Overview

The STATISTICA SAL module is designed to carry out such tasks through an intuitive, user-friendly interface. Behind the scenes, it employs state-of-the-art techniques and computationally efficient, multi-threaded, highly scalable algorithms capable of providing solutions within a short period of time. The tool also has the unique capability of handling continuous variables as well as categorical variables or items, and it allows the user to run both sequence and (non-sequence) association analyses on selected variables in a single analysis. These facilities are fully integrated into the STATISTICA platform, supporting a results interface specifically designed to provide the user with a wealth of tools for in-depth analysis. In fact, all the tools available in STATISTICA Data Miner can be quickly and effortlessly leveraged to analyze and "drill into" results generated via STATISTICA SAL.

Last but not least, STATISTICA SAL provides options for deployment, enabling you to quickly apply the rules extracted from historical data to make predictions (or "recommendations") about events (purchases) that are likely to happen next. Such models can conveniently be modified or deployed (e.g., in the STATISTICA Enterprise client-server platform) later with only a few clicks.
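Conceptually, deploying extracted rules means matching them against what a customer has already bought and ranking the predicted next items. The Python sketch below illustrates the idea; the rule list, its confidence values, and the `recommend` function are invented for this example and do not represent STATISTICA output or its deployment interface.

```python
# Hypothetical rules: (antecedent items, predicted item, confidence).
# The confidence values are made up for illustration.
rules = [
    (frozenset({"tv"}), "warranty", 0.60),
    (frozenset({"flashlight"}), "batteries", 0.90),
    (frozenset({"tv", "console"}), "hdmi_cable", 0.75),
]


def recommend(basket, rules, top_n=3):
    """Return up to `top_n` predicted items, ranked by rule confidence."""
    basket = set(basket)
    best = {}
    for antecedent, consequent, confidence in rules:
        # A rule fires when all of its antecedent items are in the basket
        # and the predicted item has not been purchased yet.
        if antecedent <= basket and consequent not in basket:
            best[consequent] = max(confidence, best.get(consequent, 0.0))
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]


print(recommend({"tv", "console"}, rules))
print(recommend({"flashlight", "batteries"}, rules))
```

A customer who already holds the predicted item receives no recommendation from that rule, which is why the second call returns an empty list.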

Highlights of Advanced and Unique Features

  • The Novel Algorithm: Instead of the apriori algorithm, the program uses a Tree-Building technique to extract Sequence and Association rules from data.
  • Database Technology: Uses efficient and thread-safe local relational Database technology to store Association and Sequence models.
  • Variable Handling: Can handle multiple response, multiple dichotomy, and continuous variables in one analysis.
  • Multi-Tasking: Can perform Sequence analysis while also mining for Association rules in a single analysis.
  • Multidimensional Analysis: Simultaneously extracts Association and Sequence rules for more than one dimension.
  • Quantitative Attributes: Given the ability to perform multidimensional Association and Sequence mining and the capacity to extract rules only for specific items, the program can be used for Predictive Data Mining.
  • Clustering Analysis: The module can perform Hierarchical Single-Linkage Cluster analysis, which can detect the clusters of items that are most likely to occur together. This has useful practical applications, for example in Retailing.
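To illustrate the clustering feature in the list above, here is a minimal Python sketch of hierarchical single-linkage clustering applied to items. It assumes a Jaccard distance computed from basket co-occurrence and a `max_distance` cutoff; the distance measure, the data, and the function names are assumptions for this example, and the source does not state how STATISTICA SAL measures distance between items.

```python
from itertools import combinations

# Hypothetical toy baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"flashlight", "batteries"},
    {"flashlight", "batteries"},
    {"jam", "batteries"},
]


def jaccard_distance(item_a, item_b, baskets):
    """1 minus the share of baskets containing both items among those containing either."""
    with_a = {i for i, basket in enumerate(baskets) if item_a in basket}
    with_b = {i for i, basket in enumerate(baskets) if item_b in basket}
    union = with_a | with_b
    return 1.0 - len(with_a & with_b) / len(union) if union else 1.0


def single_linkage(items, distance, max_distance):
    """Merge the two closest clusters until the closest pair exceeds max_distance."""
    clusters = [frozenset([item]) for item in items]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: cluster distance is the minimum pairwise distance.
            d = min(distance(a, b) for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        if d > max_distance:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((d, c1 | c2))
    return clusters, merges


items = sorted({item for basket in baskets for item in basket})
clusters, merges = single_linkage(
    items, lambda a, b: jaccard_distance(a, b, baskets), max_distance=0.5
)
for cluster in clusters:
    print(sorted(cluster))
```

With this data, bread and butter merge first because they always appear together, flashlight and batteries merge next, and jam stays on its own because it is not close enough to any other item under the chosen cutoff.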

 


Contact Us

Statistica
2300 East 14th Street
Tulsa, Oklahoma, 74104
(918) 749-1119