Response Optimization for Data Mining Models

[Figure: Iterations for Response Optimization]

Response Optimization for Data Mining Models is a method of analysis designed for optimizing and exploring deployed predictive data mining models. You can think of a predictive model as a black-box representation of the relationship between a set of independent variables (also known as measurement variables) and one or more dependent variables, known as responses. Both the independent and the response variables can be either continuous or categorical. A continuous response implies a regression task, while a categorical response implies a classification task. Thus, when presented with a set of independent values, the task of a predictive model is to produce a response. This is known as prediction: a set of values for the independent variables is fed into the model, and an estimate of the response is returned.
 
There are situations, however, where the desired value of the response variable is known, and the aim is to find a set of values for the independent variables for which the predictive model yields that response. This reverse-engineering technique is known as Response Optimization. Such situations frequently occur in industrial production and product development, where the response variable is a product quality variable and the measurement (independent) variables are the conditions under which the product is developed. One way to achieve this is to conduct a discrete (guided or unguided) search in the space of the independent variables. For each selected set of independent values, the model prediction is evaluated and compared with the desired response. This process is repeated until a set of independent values is found for which the model prediction is equal, or as close as possible, to the desired response value.
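
The search loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's implementation: the `model` function, its two inputs (`temperature`, `pressure`), the bounds, and the target value of 40.0 are all hypothetical stand-ins for a deployed model and its operating ranges. The sketch uses an unguided random search, sampling candidate inputs and keeping the one whose prediction is closest to the desired response.

```python
import random

def model(x):
    """Hypothetical stand-in for a deployed predictive model (black box)."""
    temperature, pressure = x
    return 0.5 * temperature + 2.0 * pressure

def random_search(predict, bounds, target, n_iter=5000, seed=0):
    """Sample inputs uniformly within the bounds and keep the candidate
    whose prediction is closest to the target response."""
    rng = random.Random(seed)
    best_x, best_err = None, float("inf")
    for _ in range(n_iter):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        err = abs(predict(x) - target)  # distance from the desired response
        if err < best_err:
            best_x, best_err = x, err
    return best_x, best_err

# Assumed operating ranges for the two hypothetical inputs.
bounds = [(0.0, 100.0), (0.0, 10.0)]
best_x, best_err = random_search(model, bounds, target=40.0)
print(best_x, best_err)
```

Note that many different input settings can yield the same response, so the search returns one acceptable setting rather than a unique answer.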
 
Several techniques are available for performing the above optimization task (for one or more predictive models). These include the Simplex, Grid, and Random search algorithms. The Simplex method is a guided optimization algorithm that finds a set of independent values yielding the desired response in a finite number of steps. The Grid and Random algorithms are unguided search techniques that rely on brute-force computing power.
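
To make the contrast between guided and unguided search concrete, the sketch below applies both to the same toy problem. It rests on several assumptions: the "Simplex method" is taken to be the Nelder-Mead downhill simplex (the text does not say which variant is meant), the simplified version shown here is unconstrained and uses only reflection, expansion, inside contraction, and shrink steps, and the `model`, bounds, and target are hypothetical.

```python
from itertools import product

def model(x):
    """Hypothetical stand-in for a deployed predictive model (black box)."""
    temperature, pressure = x
    return 0.5 * temperature + 2.0 * pressure

def squared_error(x, target=40.0):
    """Objective: squared distance between prediction and desired response."""
    return (model(x) - target) ** 2

def grid_search(f, bounds, points_per_axis=41):
    """Unguided: evaluate f on every node of a regular grid, keep the best."""
    axes = [[lo + i * (hi - lo) / (points_per_axis - 1)
             for i in range(points_per_axis)] for lo, hi in bounds]
    return min(product(*axes), key=f)

def nelder_mead(f, x0, step=1.0, max_iter=500, tol=1e-10):
    """Guided: simplified Nelder-Mead simplex search (unconstrained, so the
    result may fall outside any operating bounds)."""
    n = len(x0)
    simplex = [list(x0)]
    for i in range(n):  # initial simplex: x0 plus one offset vertex per axis
        vertex = list(x0)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if f(best) < tol:  # desired response reached closely enough
            break
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line from the worst vertex through the centroid.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:  # shrink every vertex halfway toward the best one
                simplex = [best] + [[b + 0.5 * (vi - b) for b, vi in zip(best, v)]
                                    for v in simplex[1:]]
    return min(simplex, key=f)

bounds = [(0.0, 100.0), (0.0, 10.0)]
x_grid = grid_search(squared_error, bounds)
x_simplex = nelder_mead(squared_error, x0=[50.0, 5.0])
print(x_grid, model(x_grid))
print(x_simplex, model(x_simplex))
```

The grid search evaluates the model at every grid node (41 × 41 = 1681 evaluations here), and that count grows exponentially with the number of independent variables. The simplex search instead uses each evaluation to decide where to look next, which is why it typically needs far fewer model calls.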
 
The models can be optimized either on a standalone basis (i.e., as separate models) or combined to form an ensemble. The latter option enables the user to treat several predictive models as a single model during optimization. Ensembles of predictive models are known to have better generalization ability than their standalone members.
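
The difference between standalone and ensemble optimization can be illustrated with a short sketch. It assumes a regression setting in which the ensemble prediction is the plain average of its members' predictions; the text does not specify the combination rule, and the three one-variable models and the target of 11.0 are hypothetical.

```python
def model_a(x):
    return 2.0 * x + 1.0

def model_b(x):
    return 2.2 * x + 0.6

def model_c(x):
    return 1.8 * x + 1.4

def ensemble_predict(models, x):
    """Assumed combination rule: average the member predictions."""
    return sum(m(x) for m in models) / len(models)

def closest_input(predict, candidates, target):
    """Return the candidate input whose prediction is nearest the target."""
    return min(candidates, key=lambda x: abs(predict(x) - target))

models = [model_a, model_b, model_c]
candidates = [i * 0.01 for i in range(1001)]  # inputs from 0.00 to 10.00
target = 11.0

# Standalone: each model is optimized separately and gives its own answer.
x_each = [closest_input(m, candidates, target) for m in models]

# Ensemble: the models are combined and optimized as a single predictor.
x_ens = closest_input(lambda x: ensemble_predict(models, x), candidates, target)
print(x_each, x_ens)
```

Optimized separately, the three models disagree on the best input; optimized as an ensemble, they yield a single setting based on their combined prediction.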