### Glossary Index


Unequal N HSD. This post hoc test can be used to determine the significant differences between group means in an analysis of variance setting. The Unequal N HSD test is a modification of the Tukey HSD test, and it provides a reasonable test of differences in group means if the group n's are not too discrepant (for a detailed discussion of different post hoc tests, see Winer, Michels, & Brown, 1991). For more details, see General Linear Models. See also Post Hoc Comparisons. For a discussion of statistical significance, see Elementary Concepts.

Uniform Distribution. The discrete Uniform distribution (the term first used by Uspensky, 1937) has probability function:

f(x) = 1/N          x = 1, 2, ..., N

The continuous Uniform distribution has density function:

f(x) = 1/(b-a)        a < x < b

where

a     is the lower limit of the interval from which points will be selected
b     is the upper limit of the interval from which points will be selected
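
As a minimal sketch of the two formulas above, the functions below evaluate the discrete and continuous Uniform distributions. The function names are hypothetical, and returning 0 outside the stated range is an assumption added here for completeness.

```python
from fractions import Fraction


def discrete_uniform_pmf(x, n):
    """f(x) = 1/N for x = 1, 2, ..., N; 0 elsewhere (assumed)."""
    if isinstance(x, int) and 1 <= x <= n:
        return Fraction(1, n)
    return Fraction(0)


def continuous_uniform_pdf(x, a, b):
    """f(x) = 1/(b-a) for a < x < b; 0 elsewhere (assumed)."""
    if not a < b:
        raise ValueError("the lower limit a must be smaller than the upper limit b")
    return 1.0 / (b - a) if a < x < b else 0.0


# A fair six-sided die: every face has probability 1/6.
die_probabilities = [discrete_uniform_pmf(face, 6) for face in range(1, 7)]

# Points selected from the interval (2, 4) have constant density 1/(4-2).
density_inside = continuous_uniform_pdf(2.5, 2, 4)
density_outside = continuous_uniform_pdf(5.0, 2, 4)
```

The discrete probabilities sum to 1 over x = 1, ..., N, and the continuous density integrates to 1 over the interval from a to b.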

Unimodal Distribution. A distribution that has only one mode. A typical example is the normal distribution, which also happens to be symmetrical; many unimodal distributions, however, are not symmetrical (e.g., the distribution of income is typically not symmetrical but "right-skewed," with a long tail toward high values; see skewness). See also bimodal distribution, multimodal distribution.

Unit Penalty. In several search algorithms, a penalty factor which is multiplied by the number of units in the network and added to the error of the network, when comparing the performance of the network with others. This has the effect of selecting smaller networks at the expense of larger ones. See also, Penalty Function.
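
The effect of the penalty can be illustrated with a short sketch. The error values, unit counts, and penalty factor below are made up for illustration, and `penalized_error` is a hypothetical helper, not a function of any particular package.

```python
def penalized_error(error, n_units, unit_penalty):
    """Network error plus the unit penalty multiplied by the number of units."""
    return error + unit_penalty * n_units


# Hypothetical candidates: (name, error, number of units).
candidates = [
    ("small network", 0.120, 4),
    ("large network", 0.115, 12),
]
unit_penalty = 0.001  # assumed penalty factor

# Compare networks on the penalized error rather than the raw error.
best = min(candidates, key=lambda c: penalized_error(c[1], c[2], unit_penalty))
best_without_penalty = min(candidates, key=lambda c: c[1])
```

With these numbers the large network has the lower raw error, but the small network wins once the penalty is added (0.124 against 0.127).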

Unit Types (in Neural Networks). Units in the input layer are extremely simple: they hold an output value, which they pass on to units in the second layer. Input units do no processing. By default, input units have their synaptic function set to Dot Product and their activation function set to Identity; in practice, these functions are ignored in input units.

Each hidden or output unit has a number of incoming connections from units in the preceding layer (the fan-in): one for each unit in the preceding layer. Each unit also has a threshold value.

The outputs of the units in the preceding layer, the weights on the associated connections, and the threshold value are fed through the unit's synaptic function (post-synaptic potential function) to produce a single value (the unit's input value).

The input value is passed through the unit's activation function to produce a single output value, also known as the activation level of the unit.
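
The two-stage computation described above can be sketched as follows. This is an illustration under stated assumptions: the Dot Product synaptic function is taken to be the weighted sum of the incoming outputs minus the threshold, and a logistic activation function is used as one common choice. Neither detail is specified in the text above, and the function names are hypothetical.

```python
import math


def dot_product_psp(inputs, weights, threshold):
    """Synaptic function (assumed form): weighted sum of inputs minus threshold."""
    return sum(w * x for w, x in zip(weights, inputs)) - threshold


def logistic(value):
    """One possible activation function; maps any input value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def unit_output(inputs, weights, threshold, activation=logistic):
    """Input value from the synaptic function, then the activation level."""
    input_value = dot_product_psp(inputs, weights, threshold)
    return activation(input_value)


# A unit with a fan-in of two connections from the preceding layer.
activation_level = unit_output([1.0, 2.0], [0.5, -0.25], 0.0)

# With an Identity activation the output equals the input value.
identity_level = unit_output([1.0, 2.0], [1.0, 1.0], 1.0, activation=lambda v: v)
```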

Unsupervised and Supervised Learning. An important distinction in machine learning, and also applicable to data mining, is that between supervised and unsupervised learning algorithms. The term "supervised" learning is usually applied to cases in which a particular classification is already observed and recorded in a training sample, and you want to build a model to predict those classifications (in a new testing sample). For example, you may have a data set that contains information about who from among a list of customers targeted for a special promotion responded to that offer. The purpose of the classification analysis would be to build a model to predict who (from a different list of new potential customers) is likely to respond to the same (or a similar) offer in the future. You may want to review the methods discussed in General Classification and Regression Trees (GC&RT), General CHAID Models (GCHAID), Discriminant Function Analysis and General Discriminant Analysis (GDA), MARSplines (Multivariate Adaptive Regression Splines), and neural networks to learn about different techniques that can be used to build or fit models to data where the outcome variable of interest (e.g., customer did or did not respond to an offer) was observed. These methods are called supervised learning algorithms because the learning (fitting of models) is "guided" or "supervised" by the observed classifications recorded in the data file.
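
To make the idea of "guided" fitting concrete, here is a minimal sketch that uses a 1-nearest-neighbor rule. That rule is not one of the methods listed above; it is chosen here only because it is short. The customer data and variable names are hypothetical.

```python
def nearest_neighbor_predict(train_x, train_y, new_x):
    """Predict the recorded class of the closest training observation."""
    def squared_distance(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    closest = min(range(len(train_x)),
                  key=lambda i: squared_distance(train_x[i], new_x))
    return train_y[closest]


# Training sample: (age, number of prior purchases) for targeted customers,
# together with the observed outcome (did the customer respond to the offer?).
train_x = [(25, 1), (30, 2), (52, 8), (60, 9)]
train_y = ["no", "no", "yes", "yes"]

# A new potential customer whose response has not been observed.
prediction = nearest_neighbor_predict(train_x, train_y, (55, 7))
```

The observed responses in `train_y` are what make this supervised: without them there would be nothing to predict from.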

In unsupervised learning, the situation is different. Here the outcome variable of interest is not (and perhaps cannot be) directly observed. Instead, we want to detect some "structure" or clusters in the data that may not be trivially observable. For example, you may have a database of customers with various demographic indicators and variables potentially relevant to future purchasing behavior. Your goal would be to find market segments, i.e., groups of observations that are relatively similar to each other on certain variables; once identified, you could then determine how best to reach one or more clusters by providing certain goods or services you think may have some special utility or appeal to individuals in that segment (cluster). This type of task calls for an unsupervised learning algorithm, because learning (fitting of models) in this case cannot be guided by previously known classifications. Only after identifying certain clusters can you begin to assign labels, for example, based on subsequent research (e.g., after identifying one group of customers as "young risk takers").

There are several methods available for unsupervised learning, including Principal Components and Classification Analysis, Factor Analysis, Multidimensional Scaling, Correspondence Analysis, Neural Networks, and Self-Organizing Feature Maps (SOFM, Kohonen networks). Particularly powerful algorithms for pattern recognition and clustering are the EM and k-Means clustering algorithms.
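
As an illustration of the market-segment example, the sketch below runs a basic k-Means procedure (Lloyd's algorithm) on two-dimensional data. The customer values and the initial cluster centers are hypothetical, and a production implementation would normally choose starting centers more carefully.

```python
def k_means(points, centers, n_iter=20):
    """Basic k-Means on 2-D points; `centers` holds the initial guesses."""
    for _ in range(n_iter):
        # Assignment step: attach each point to its closest center.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each center to the mean of its members.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(center)  # keep an empty cluster's center

        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return centers, clusters


# Hypothetical customers: (age, monthly spending). No outcome labels exist.
customers = [(25, 20), (27, 22), (24, 19), (55, 80), (58, 85), (60, 78)]
final_centers, segments = k_means(customers, [(25, 20), (60, 78)])
```

The algorithm returns two segments without ever seeing a classification; any labels for the segments would have to be assigned afterwards.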

Unsupervised Learning in Neural Networks. Training algorithms that adjust the weights in a neural network by reference to a training data set including input variables only. Unsupervised learning algorithms attempt to locate clusters in the input data.

Unweighted Means. If the cell frequencies in a multi-factor ANOVA design are unequal, then the unweighted means (for levels of a factor) are calculated from the means of sub-groups without weighting, that is, without adjusting for the differences between the sub-group frequencies.
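
The difference between unweighted and weighted means can be seen in a small sketch. The 2 x 2 design, cell means, and cell frequencies below are made up for illustration, and the helper names are hypothetical.

```python
# Hypothetical 2 x 2 design: (level of A, level of B) -> (cell mean, cell n).
cells = {
    ("A1", "B1"): (10.0, 5),
    ("A1", "B2"): (20.0, 15),
    ("A2", "B1"): (12.0, 10),
    ("A2", "B2"): (18.0, 10),
}


def unweighted_mean(cells, level):
    """Simple average of the cell means for one level of factor A."""
    means = [mean for (a, _b), (mean, _n) in cells.items() if a == level]
    return sum(means) / len(means)


def weighted_mean(cells, level):
    """Average of the cell means weighted by the cell frequencies."""
    pairs = [(mean, n) for (a, _b), (mean, n) in cells.items() if a == level]
    return sum(mean * n for mean, n in pairs) / sum(n for _mean, n in pairs)


a1_unweighted = unweighted_mean(cells, "A1")  # (10 + 20) / 2
a1_weighted = weighted_mean(cells, "A1")      # (10*5 + 20*15) / 20
```

For level A1 the two means differ (15.0 unweighted, 17.5 weighted) because the cell frequencies are unequal; for level A2, where both cells have n = 10, they coincide.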
