### Glossary Index

###### 2

- 2D Bar/Column Plots
- 2D Box Plots
- 2D Box Plots - Box Whiskers
- 2D Box Plots - Boxes
- 2D Box Plots - Columns
- 2D Box Plots - Error Bars
- 2D Box Plots - Whiskers
- 2D Categorized Detrended Probability Plots
- 2D Categorized Half-Norm. Probability Plots
- 2D Categorized Normal Probability Plots
- 2D Detrended Probability Plots
- 2D Histograms
- 2D Histograms - Double-Y
- 2D Histograms - Hanging Bars
- 2D Line Plots
- 2D Line Plots - Aggregated
- 2D Line Plots - Double-Y
- 2D Line Plots - Multiple
- 2D Line Plots - Regular
- 2D Line Plots - XY Trace
- 2D Matrix Plots
- 2D Matrix Plots - Columns
- 2D Matrix Plots - Lines
- 2D Matrix Plots - Scatterplot
- 2D Normal Probability Plots
- 2D Probability-Probability Plots
- 2D Probability-Probability Plots - Categorized
- 2D Quantile-Quantile Plots
- 2D Quantile-Quantile Plots - Categorized
- 2D Range Plots - Error Bars
- 2D Scatterplot
- 2D Scatterplot - Categorized Ternary Graph
- 2D Scatterplot - Double-Y
- 2D Scatterplot - Frequency
- 2D Scatterplot - Multiple
- 2D Scatterplot - Regular
- 2D Scatterplot - Voronoi
- 2D Sequential/Stacked Plots
- 2D Sequential/Stacked Plots - Area
- 2D Sequential/Stacked Plots - Column
- 2D Sequential/Stacked Plots - Lines
- 2D Sequential/Stacked Plots - Mixed Line
- 2D Sequential/Stacked Plots - Mixed Step
- 2D Sequential/Stacked Plots - Step
- 2D Sequential/Stacked Plots - Step Area
- 2D Ternary Plots - Scatterplot

###### 3

- 3D Bivariate Histogram
- 3D Box Plots
- 3D Box Plots - Border-style Ranges
- 3D Box Plots - Double Ribbon Ranges
- 3D Box Plots - Error Bars
- 3D Box Plots - Flying Blocks
- 3D Box Plots - Flying Boxes
- 3D Box Plots - Points
- 3D Categorized Plots - Contour Plot
- 3D Categorized Plots - Deviation Plot
- 3D Categorized Plots - Scatterplot
- 3D Categorized Plots - Space Plot
- 3D Categorized Plots - Spectral Plot
- 3D Categorized Plots - Surface Plot
- 3D Deviation Plots
- 3D Range Plot - Error Bars
- 3D Raw Data Plots - Contour/Discrete
- 3D Scatterplots
- 3D Scatterplots - Ternary Graph
- 3D Space Plots
- 3D Ternary Plots
- 3D Ternary Plots - Categorized Scatterplot
- 3D Ternary Plots - Categorized Space
- 3D Ternary Plots - Categorized Surface
- 3D Ternary Plots - Categorized Trace
- 3D Ternary Plots - Contour/Areas
- 3D Ternary Plots - Contour/Lines
- 3D Ternary Plots - Deviation
- 3D Ternary Plots - Space
- 3D Trace Plots

###### A

- Aberration, Minimum
- Abrupt Permanent Impact
- Abrupt Temporary Impact
- Accept-Support Testing
- Accept Threshold
- Activation Function (in Neural Networks)
- Additive Models
- Additive Season, Damped Trend
- Additive Season, Exponential Trend
- Additive Season, Linear Trend
- Additive Season, No Trend
- Adjusted means
- Aggregation
- AID
- Akaike Information Criterion (AIC)
- Algorithm
- Alpha
- Anderson-Darling Test
- ANOVA
- Append a Network
- Append Cases and/or Variables
- Application Programming Interface (API)
- Arrow
- Assignable Causes and Actions
- Association Rules
- Asymmetrical Distribution
- AT&T Runs Rules
- Attribute (attribute variable)
- Augmented Product Moment Matrix
- Autoassociative Network
- Automatic Network Designer

###### B

- B Coefficients
- Back Propagation
- Bagging (Voting, Averaging)
- Balanced ANOVA Design
- Banner Tables
- Bar/Column Plots, 2D
- Bar Dev Plot
- Bar Left Y Plot
- Bar Right Y Plot
- Bar Top Plot
- Bar X Plot
- Bartlett Window
- Basis Functions
- Batch Algorithms in *STATISTICA Neural Net*
- Bayesian Information Criterion (BIC)
- Bayesian Networks
- Bayesian Statistics
- Bernoulli Distribution
- Best Network Retention
- Best Subset Regression
- Beta Coefficients
- Beta Distribution
- Bimodal Distribution
- Binomial Distribution
- Bivariate Normal Distribution
- Blocking
- Bonferroni Adjustment
- Bonferroni Test
- Boosting
- Boundary Case
- Box Plot/Medians (Block Stats Graphs)
- Box Plot/Means (Block Stats Graphs)
- Box Plots, 2D
- Box Plots, 2D - Box Whiskers
- Box Plots, 2D - Boxes
- Box Plots, 2D - Whiskers
- Box Plots, 3D
- Box Plots, 3D - Border-Style Ranges
- Box Plots, 3D - Double Ribbon Ranges
- Box Plots, 3D - Error Bars
- Box Plots, 3D - Flying Blocks
- Box Plots, 3D - Flying Boxes
- Box Plots, 3D - Points
- Box-Ljung Q Statistic
- Breakdowns
- Breaking Down (Categorizing)
- Brown-Forsythe Homogeneity of Variances
- Brushing
- Burt Table

###### C

- Canonical Correlation
- Cartesian Coordinates
- Casewise Missing Data Deletion
- Categorical Dependent Variable
- Categorical Predictor
- Categorized Graphs
- Categorized Plots, 2D-Detrended Prob. Plots
- Categorized Plots, 2D-Half-Normal Prob. Plots
- Categorized Plots, 2D - Normal Prob. Plots
- Categorized Plots, 2D - Prob.-Prob. Plots
- Categorized Plots, 2D - Quantile Plots
- Categorized Plots, 3D - Contour Plot
- Categorized Plots, 3D - Deviation Plot
- Categorized Plots, 3D - Scatterplot
- Categorized Plots, 3D - Space Plot
- Categorized Plots, 3D - Spectral Plot
- Categorized Plots, 3D - Surface Plot
- Categorized 3D Scatterplot (Ternary graph)
- Categorized Contour/Areas (Ternary graph)
- Categorized Contour/Lines (Ternary graph)
- Categorizing
- Cauchy Distribution
- Cause-and-Effect Diagram
- Censoring (Censored Observations)
- Censoring, Left
- Censoring, Multiple
- Censoring, Right
- Censoring, Single
- Censoring, Type I
- Censoring, Type II
- CHAID
- Characteristic Life
- Chernoff Faces (Icon Plots)
- *Chi*-square Distribution
- Circumplex
- City-Block (Manhattan) Distance
- Classification
- Classification (in Neural Networks)
- Classification and Regression Trees
- Classification by Labeled Exemplars (in NN)
- Classification Statistics (in Neural Networks)
- Classification Thresholds (in Neural Networks)
- Classification Trees
- Class Labeling (in Neural Networks)
- Cluster Analysis
- Cluster Diagram (in Neural Networks)
- Cluster Networks (in Neural Networks)
- Coarse Coding
- Codes
- Coding Variable
- Coefficient of Determination
- Coefficient of Variation
- Column Sequential/Stacked Plot
- Columns (Box Plot)
- Columns (Icon Plot)
- Common Causes
- Communality
- Complex Numbers
- Conditional Probability
- Conditioning (Categorizing)
- Confidence Interval
- Confidence Interval for the Mean
- Confidence Interval vs. Prediction Interval
- Confidence Limits
- Confidence Value (Association Rules)
- Confusion Matrix (in Neural Networks)
- Conjugate Gradient Descent (in Neural Net)
- Continuous Dependent Variable
- Contour/Discrete Raw Data Plot
- Contour Plot
- Control, Quality
- Cook's Distance
- Correlation
- Correlation, Intraclass
- Correlation (Pearson r)
- Correlation Value (Association Rules)
- Correspondence Analysis
- Cox-Snell Gen. Coefficient Determination
- Cpk, Cp, Cr
- CRISP
- Cross Entropy (in Neural Networks)
- Cross Verification (in Neural Networks)
- Cross-Validation
- Crossed Factors
- Crosstabulations
- C-SVM Classification
- Cubic Spline Smoother
- "Curse" of Dimensionality

###### D

- Daniell (or Equal Weight) Window
- Data Mining
- Data Preparation Phase
- Data Reduction
- Data Rotation (in 3D space)
- Data Warehousing
- Decision Trees
- Degrees of Freedom
- Deleted Residual
- Denominator Synthesis
- Dependent t-test
- Dependent vs. Independent Variables
- Deployment
- Derivative-Free Funct. Min. Algorithms
- Design, Experimental
- Design Matrix
- Desirability Profiles
- Detrended Probability Plots
- Deviance
- Deviance Residuals
- Deviation
- Deviation Assign. Algorithms (in Neural Net)
- Deviation Plot (Ternary Graph)
- Deviation Plots, 3D
- DFFITS
- DIEHARD Suite of Tests & Randm. Num. Gen.
- Differencing (in Time Series)
- Dimensionality Reduction
- Discrepancy Function
- Discriminant Function Analysis
- Distribution Function
- DOE
- Document Frequency
- Double-Y Histograms
- Double-Y Line Plots
- Double-Y Scatterplot
- Drill-Down Analysis
- Drilling-down (Categorizing)
- Duncan's test
- Dunnett's test
- DV

###### E

- Effective Hypothesis Decomposition
- Efficient Score Statistic
- Eigenvalues
- Ellipse, Prediction Area and Range
- EM Clustering
- Endogenous Variable
- Ensembles (in Neural Networks)
- Enterprise Resource Planning (ERP)
- Enterprise SPC
- Enterprise-Wide Software Systems
- Entropy
- Epoch in (Neural Networks)
- Eps
- EPSEM Samples
- ERP
- Error Bars (2D Box Plots)
- Error Bars (2D Range Plots)
- Error Bars (3D Box Plots)
- Error Bars (3D Range Plots)
- Error Function (in Neural Networks)
- Estimable Functions
- Euclidean Distance
- Euler's e
- Exogenous Variable
- Experimental Design
- Explained Variance
- Exploratory Data Analysis
- Exponential Distribution
- Exponential Family of Distributions
- Exponential Function
- Exponentially Weighted Moving Avg. Line
- Extrapolation
- Extreme Values (in Box Plots)
- Extreme Value Distribution

###### F

- F Distribution
- FACT
- Factor Analysis
- Fast Analysis Shared Multidimensional Info. (FASMI)
- Feature Extraction (vs. Feature Selection)
- Feature Selection
- Feedforward Networks
- Fisher LSD
- Fixed Effects (in ANOVA)
- Free Parameter
- Frequencies, Marginal
- Frequency Scatterplot
- Frequency Tables
- Function Minimization Algorithms

###### G

- g2 Inverse
- Gains Chart
- Gamma Coefficient
- Gamma Distribution
- Gaussian Distribution
- Gauss-Newton Method
- General ANOVA/MANOVA
- General Linear Model
- Generalization (in Neural Networks)
- Generalized Additive Models
- Generalized Inverse
- Generalized Linear Model
- Genetic Algorithm
- Genetic Algorithm Input Selection
- Geometric Distribution
- Geometric Mean
- Gibbs Sampler
- Gini Measure of Node Impurity
- Gompertz Distribution
- Goodness of Fit
- Gradient
- Gradient Descent
- Gradual Permanent Impact
- Group Charts
- Grouping (Categorizing)
- Grouping Variable
- Groupware

###### H

- Half-Normal Probability Plots
- Half-Normal Probability Plots - Categorized
- Hamming Window
- Hanging Bars Histogram
- Harmonic Mean
- Hazard
- Hazard Rate
- Heuristic
- Heywood Case
- Hidden Layers (in Neural Networks)
- High-Low Close
- Histograms, 2D
- Histograms, 2D - Double-Y
- Histograms, 2D - Hanging Bars
- Histograms, 2D - Multiple
- Histograms, 2D - Regular
- Histograms, 3D Bivariate
- Histograms, 3D - Box Plots
- Histograms, 3D - Contour/Discrete
- Histograms, 3D - Contour Plot
- Histograms, 3D - Spikes
- Histograms, 3D - Surface Plot
- Hollander-Proschan Test
- Hooke-Jeeves Pattern Moves
- Hosmer-Lemeshow Test
- HTM
- HTML
- Hyperbolic Tangent (tanh)
- Hyperplane
- Hypersphere

###### I

- Icon Plots
- Icon Plots - Chernoff Faces
- Icon Plots - Columns
- Icon Plots - Lines
- Icon Plots - Pies
- Icon Plots - Polygons
- Icon Plots - Profiles
- Icon Plots - Stars
- Icon Plots - Sun Rays
- Increment vs Non-Increment Learning Algr.
- Independent Events
- Independent t-test
- Independent vs. Dependent Variables
- Industrial Experimental Design
- Inertia
- Inlier
- In-Place Database Processing (IDP)
- Interactions
- Interpolation
- Interval Scale
- Intraclass Correlation Coefficient
- Invariance Const. Scale Factor (ICSF)
- Invariance Under Change of Scale (ICS)
- Inverse Document Frequency
- Ishikawa Chart
- Isotropic Deviation Assignment
- Item and Reliability Analysis
- IV

###### J

###### K

###### L

- Lack of Fit
- Lambda Prime
- Laplace Distribution
- Latent Semantic Indexing
- Latent Variable
- Layered Compression
- Learned Vector Quantization (in Neural Net)
- Learning Rate (in Neural Networks)
- Least Squares (2D graphs)
- Least Squares (3D graphs)
- Least Squares Estimator
- Least Squares Means
- Left and Right Censoring
- Levenberg-Marquardt Algorithm (in Neural Net)
- Levene's Test for Homogeneity of Variances
- Leverage values
- Life Table
- Life, Characteristic
- Lift Charts
- Likelihood
- Lilliefors test
- Line Plots, 2D
- Line Plots, 2D - Aggregated
- Line Plots, 2D (Case Profiles)
- Line Plots, 2D - Double-Y
- Line Plots, 2D - Multiple
- Line Plots, 2D - Regular
- Line Plots, 2D - XY Trace
- Linear (2D graphs)
- Linear (3D graphs)
- Linear Activation function
- Linear Modeling
- Linear Units
- Lines (Icon Plot)
- Lines (Matrix Plot)
- Lines Sequential/Stacked Plot
- Link Function
- Local Minima
- Locally Weighted (Robust) Regression
- Logarithmic Function
- Logistic Distribution
- Logistic Function
- Logit Regression and Transformation
- Log-Linear Analysis
- Log-Normal Distribution
- Lookahead (in Neural Networks)
- Loss Function
- LOWESS Smoothing

###### M

- Machine Learning
- Mahalanobis Distance
- Mallow's CP
- Manifest Variable
- Mann-Scheuer-Fertig Test
- MANOVA
- Marginal Frequencies
- Markov Chain Monte Carlo (MCMC)
- Mass
- Matching Moments Method
- Matrix Collinearity
- Matrix Ill-Conditioning
- Matrix Inverse
- Matrix Plots
- Matrix Plots - Columns
- Matrix Plots - Lines
- Matrix Plots - Scatterplot
- Matrix Rank
- Matrix Singularity
- Maximum Likelihood Loss Function
- Maximum Likelihood Method
- Maximum Unconfounding
- MD (Missing data)
- Mean
- Mean/S.D. Algorithm (in Neural Networks)
- Mean, Geometric
- Mean, Harmonic
- Mean Substitution of Missing Data
- Means, Adjusted
- Means, Unweighted
- Median
- Meta-Learning
- Method of Matching Moments
- Minimax
- Minimum Aberration
- Mining, Data
- Missing values
- Mixed Line Sequential/Stacked Plot
- Mixed Step Sequential/Stacked Plot
- Mode
- Model Profiles (in Neural Networks)
- Models for Data Mining
- Monte Carlo
- Multi-Pattern Bar
- Multicollinearity
- Multidimensional Scaling
- Multilayer Perceptrons
- Multimodal Distribution
- Multinomial Distribution
- Multinomial Logit and Probit Regression
- Multiple Axes in Graphs
- Multiple Censoring
- Multiple Dichotomies
- Multiple Histogram
- Multiple Line Plots
- Multiple Scatterplot
- Multiple R
- Multiple Regression
- Multiple Response Variables
- Multiple-Response Tables
- Multiple Stream Group Charts
- Multiplicative Season, Damped Trend
- Multiplicative Season, Exponential Trend
- Multiplicative Season, Linear Trend
- Multiplicative Season, No Trend
- Multivar. Adapt. Regres. Splines (MARSplines)
- Multi-way Tables

###### N

- Nagelkerke Gen. Coefficient Determination
- Naive Bayes
- Neat Scaling of Intervals
- Negative Correlation
- Negative Exponential (2D graphs)
- Negative Exponential (3D graphs)
- Neighborhood (in Neural Networks)
- Nested Factors
- Nested Sequence of Models
- Neural Networks
- Neuron
- Newman-Keuls Test
- N-in-One Encoding
- Noise Addition (in Neural Networks)
- Nominal Scale
- Nominal Variables
- Nonlinear Estimation
- Nonparametrics
- Non-Outlier Range
- Nonseasonal, Damped Trend
- Nonseasonal, Exponential Trend
- Nonseasonal, Linear Trend
- Nonseasonal, No Trend
- Normal Distribution
- Normal Distribution, Bivariate
- Normal Fit
- Normality Tests
- Normalization
- Normal Probability Plots
- Normal Probability Plots (Computation Note)
- n Point Moving Average Line

###### O

- ODBC
- Odds Ratio
- OLE DB
- On-Line Analytic Processing (OLAP)
- One-Off (in Neural Networks)
- One-of-N Encoding (in Neural Networks)
- One-Sample t-Test
- One-Sided Ranges Error Bars Range Plots
- One-Way Tables
- Operating Characteristic Curves
- Ordinal Multinomial Distribution
- Ordinal Scale
- Outer Arrays
- Outliers
- Outliers (in Box Plots)
- Overdispersion
- Overfitting
- Overlearning (in Neural Networks)
- Overparameterized Model

###### P

- Pairwise Del. Missing Data vs Mean Subst.
- Pairwise MD Deletion
- Parametric Curve
- Pareto Chart Analysis
- Pareto Distribution
- Part Correlation
- Partial Correlation
- Partial Least Squares Regression
- Partial Residuals
- Parzen Window
- Pearson Correlation
- Pearson Curves
- Pearson Residuals
- Penalty Functions
- Percentiles
- Perceptrons (in Neural Networks)
- Pie Chart
- Pie Chart - Counts
- Pie Chart - Multi-Pattern Bar
- Pie Chart - Values
- Pies (Icon Plots)
- PMML (Predictive Model Markup Language)
- PNG Files
- Poisson Distribution
- Polar Coordinates
- Polygons (Icon Plots)
- Polynomial
- Population Stability Report
- Portable Network Graphics Files
- Positive Correlation
- Post hoc Comparisons
- Post Synaptic Potential (PSP) Function
- Posterior Probability
- Power (Statistical)
- Power Goal
- Ppk, Pp, Pr
- Prediction Interval Ellipse
- Prediction Profiles
- Predictive Data Mining
- Predictive Mapping
- Predictive Model Markup Language (PMML)
- Predictors
- PRESS Statistic
- Principal Components Analysis
- Prior Probabilities
- Probability
- Probability Plots - Detrended
- Probability Plots - Half-Normal
- Probability Plots - Normal
- Probability-Probability Plots
- Probability-Probability Plots - Categorized
- Probability Sampling
- Probit Regression and Transformation
- PROCEED
- Process Analysis
- Process Capability Indices
- Process Performance Indices
- Profiles, Desirability
- Profiles, Prediction
- Profiles (Icon Plots)
- Pruning (in Classification Trees)
- Pseudo-Components
- Pseudo-Inverse Algorithm
- Pseudo-Inverse-Singular Val. Decomp. NN
- PSP (Post Synaptic Potential) Function
- Pure Error
- p-Value (Statistical Significance)

###### Q

###### R

- R Programming Language
- Radial Basis Functions
- Radial Sampling (in Neural Networks)
- Random Effects (in Mixed Model ANOVA)
- Random Forests
- Random Num. from Arbitrary Distributions
- Random Numbers (Uniform)
- Random Sub-Sampling in Data Mining
- Range Ellipse
- Range Plots - Boxes
- Range Plots - Columns
- Range Plots - Whiskers
- Rank
- Rank Correlation
- Ratio Scale
- Raw Data, 3D Scatterplot
- Raw Data Plots, 3D - Contour/Discrete
- Raw Data Plots, 3D - Spikes
- Raw Data Plots, 3D - Surface Plot
- Rayleigh Distribution
- Receiver Oper. Characteristic Curve
- Receiver Oper. Characteristic (in Neural Net)
- Rectangular Distribution
- Regression
- Regression (in Neural Networks)
- Regression, Multiple
- Regression Summary Statistics (in Neural Net)
- Regular Histogram
- Regular Line Plots
- Regular Scatterplot
- Regularization (in Neural Networks)
- Reject Inference
- Reject Threshold
- Relative Function Change Criterion
- Reliability
- Reliability and Item Analysis
- Representative Sample
- Resampling (in Neural Networks)
- Residual
- Resolution
- Response Surface
- Right Censoring
- RMS (Root Mean Squared) Error
- Robust Locally Weighted Regression
- ROC Curve
- ROC Curve (in Neural Networks)
- Root Cause Analysis
- Root Mean Square Stand. Effect (RMSSE)
- Rosenbrock Pattern Search
- Rotating Coordinates, Method of
- r (Pearson Correlation Coefficient)
- Runs Tests (in Quality Control)

###### S

- Sampling Fraction
- Scalable Software Systems
- Scaling
- Scatterplot, 2D
- Scatterplot, 2D-Categorized Ternary Graph
- Scatterplot, 2D - Double-Y
- Scatterplot, 2D - Frequency
- Scatterplot, 2D - Multiple
- Scatterplot, 2D - Regular
- Scatterplot, 2D - Voronoi
- Scatterplot, 3D
- Scatterplot, 3D - Raw Data
- Scatterplot, 3D - Ternary Graph
- Scatterplot Smoothers
- Scheffe's Test
- Score Statistic
- Scree Plot, Scree Test
- S.D. Ratio
- Semi-Partial Correlation
- SEMMA
- Sensitivity Analysis (in Neural Networks)
- Sequential Contour Plot, 3D
- Sequential/Stacked Plots, 2D
- Sequential/Stacked Plots, 2D - Area
- Sequential/Stacked Plots, 2D - Column
- Sequential/Stacked Plots, 2D - Lines
- Sequential/Stacked Plots, 2D - Mixed Line
- Sequential/Stacked Plots, 2D - Mixed Step
- Sequential/Stacked Plots, 2D - Step
- Sequential/Stacked Plots, 2D - Step Area
- Sequential Surface Plot, 3D
- Sets of Samples in Quality Control Charts
- Shapiro-Wilks' W test
- Shewhart Control Charts
- Short Run Control Charts
- Shuffle, Back Propagation (in Neural Net)
- Shuffle Data (in Neural Networks)
- Sigma Restricted Model
- Sigmoid Function
- Signal Detection Theory
- Simple Random Sampling (SRS)
- Simplex Algorithm
- Single and Multiple Censoring
- Singular Value Decomposition
- Six Sigma (DMAIC)
- Six Sigma Process
- Skewness
- Slicing (Categorizing)
- Smoothing
- SOFMs (Self-Organizing Maps, Kohonen Net)
- Softmax
- Space Plots 3D
- SPC
- Spearman R
- Special Causes
- Spectral Plot
- Spikes (3D graphs)
- Spinning Data (in 3D space)
- Spline (2D graphs)
- Spline (3D graphs)
- Split Selection (for Classification Trees)
- Splitting (Categorizing)
- Spurious Correlations
- SQL
- Square Root of the Signal to Noise Ratio (f)
- Stacked Generalization
- Stacking (Stacked Generalization)
- Standard Deviation
- Standard Error
- Standard Error of the Mean
- Standard Error of the Proportion
- Standardization
- Standardized DFFITS
- Standardized Effect (Es)
- Standard Residual Value
- Stars (Icon Plots)
- Stationary Series (in Time Series)
- STATISTICA Advanced Linear/Nonlinear
- STATISTICA Automated Neural Networks
- STATISTICA Base
- STATISTICA Data Miner
- STATISTICA Data Warehouse
- STATISTICA Document Management System
- STATISTICA Enterprise
- STATISTICA Enterprise/QC
- STATISTICA Enterprise Server
- STATISTICA Enterprise SPC
- STATISTICA Monitoring and Alerting Server
- STATISTICA MultiStream
- STATISTICA Multivariate Stat. Process Ctrl
- STATISTICA PI Connector
- STATISTICA PowerSolutions
- STATISTICA Process Optimization
- STATISTICA Quality Control Charts
- STATISTICA Sequence Assoc. Link Analysis
- STATISTICA Text Miner
- STATISTICA Variance Estimation Precision
- Statistical Power
- Statistical Process Control (SPC)
- Statistical Significance (p-value)
- Steepest Descent Iterations
- Stemming
- Steps
- Stepwise Regression
- Stiffness Parameter (in Fitting Options)
- Stopping Conditions
- Stopping Conditions (in Neural Networks)
- Stopping Rule (in Classification Trees)
- Stratified Random Sampling
- Stub and Banner Tables
- Studentized Deleted Residuals
- Studentized Residuals
- Student's t Distribution
- Sum-Squared Error Function
- Sums of Squares (Type I, II, III (IV, V, VI))
- Sun Rays (Icon Plots)
- Supervised Learning (in Neural Networks)
- Support Value (Association Rules)
- Support Vector
- Support Vector Machine (SVM)
- Suppressor Variable
- Surface Plot (from Raw Data)
- Survival Analysis
- Survivorship Function
- Sweeping
- Symmetrical Distribution
- Symmetric Matrix
- Synaptic Functions (in Neural Networks)

###### T

- Tables
- Tapering
- t Distribution (Student's)
- Tau, Kendall
- Ternary Plots, 2D - Scatterplot
- Ternary Plots, 3D
- Ternary Plots, 3D - Categorized Scatterplot
- Ternary Plots, 3D - Categorized Space
- Ternary Plots, 3D - Categorized Surface
- Ternary Plots, 3D - Categorized Trace
- Ternary Plots, 3D - Contour/Areas
- Ternary Plots, 3D - Contour/Lines
- Ternary Plots, 3D - Deviation
- Ternary Plots, 3D - Space
- Text Mining
- THAID
- Threshold
- Time Series
- Time Series (in Neural Networks)
- Time-Dependent Covariates
- Tolerance (in Multiple Regression)
- Topological Map
- Trace Plots, 3D
- Trace Plot, Categorized (Ternary Graph)
- Training/Test Error/Classification Accuracy
- Transformation (Probit Regression)
- Trellis Graphs
- Trimmed Means
- t-Test (independent & dependent samples)
- Tukey HSD
- Tukey Window
- Two-State (in Neural Networks)
- Type I, II, III (IV, V, VI) Sums of Squares
- Type I Censoring
- Type II Censoring
- Type I Error Rate

###### U

###### V

###### W

###### X

###### Y

###### Z

Tapering. The process of split-cosine-bell tapering in Time Series is a recommended transformation of the series prior to spectrum analysis; it usually reduces leakage in the periodogram. The rationale for this transformation is explained in detail in Bloomfield (1976, pp. 80-94). In essence, a proportion (*p*) of the data at the beginning and at the end of the series is transformed via multiplication by the weights:

w_{t} = 0.5*{1 - cos[π(t - 0.5)/m]} (for t = 1, ..., m)

w_{t} = 0.5*{1 - cos[π(N - t + 0.5)/m]} (for t = N-m+1, ..., N)

where *m* is chosen so that 2*m*/*N* is equal to the proportion of data to be tapered (*p*).
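
These weights can be sketched in a few lines (illustrative only; the function name is invented, and the 0-based array indexing shifts the 0.5 offsets of the formula accordingly):

```python
import numpy as np

def split_cosine_bell_taper(x, p=0.1):
    """Taper a series with split-cosine-bell weights: a proportion p of
    the data (p/2 at each end) is smoothly shrunk toward zero."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    m = int(round(p * N / 2))            # chosen so that 2m/N == p
    if m == 0:
        return x.copy()
    w = np.ones(N)
    t = np.arange(m)
    ramp = 0.5 * (1.0 - np.cos(np.pi * (t + 0.5) / m))
    w[:m] = ramp                         # rising half-cosine at the start
    w[N - m:] = ramp[::-1]               # falling half-cosine at the end
    return x * w
```

The middle of the series is left untouched (weight 1); only the first and last *m* points are shrunk toward zero before the periodogram is computed.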

Terabyte. 1 terabyte = 1,000 gigabytes. Current distributed file system technology such as Hadoop allows for the storage and management of multiple terabytes of data in a single repository.

Ternary Plots, 2D - Scatterplot. In this type of ternary graph, a triangular coordinate system is used to plot three (or more) variables [the components *X, Y*, and *Z*] in two dimensions. Here, the points representing the proportions of the component variables (*X, Y*, and *Z*) are plotted.

See also, Data Reduction.

Ternary Plots, 3D. A ternary plot can be used to examine relations between four or more dimensions where three of those dimensions represent components of a mixture (i.e., the relation between them is constrained such that the values of the three variables add up to the same constant). One typical application of this graph is when the measured response(s) from an experiment depend on the relative proportions of three components (e.g., three different chemicals) that are varied in order to determine an optimal combination of those components (e.g., in mixture designs).

Ternary Plots, 3D - Categorized Scatterplot. The responses associated with the proportions of the component variables (*X, Y*, and *Z*) in a ternary graph are plotted in a 3-dimensional display for each level of the grouping variable (or user-defined subset of data). One component graph is produced for each level of the grouping variable (or user-defined subset of data) and all the component graphs are arranged in one display to allow for comparisons between the subsets of data (categories).

See also, Data Reduction.

Ternary Plots, 3D - Categorized Space. In this type of ternary graph, 3D scatterplot data are represented through the use of an *X-Y-Z* plane (defined via a triangular coordinate system) positioned at a user-selectable level of the vertical *V-axis* (which "sticks up" through the middle of the plane) and categorized by each level of the grouping variable (or user-defined subset of data). One component graph is produced for each level of the grouping variable (or user-defined subset of data) and all the component graphs are arranged in one display to allow for comparisons between the subsets of data (categories).

The level of the *X-Y-Z* plane can be adjusted in order to divide the *X-Y-Z-V* space into meaningful parts (e.g., featuring different patterns of the relation between the three variables).

Ternary Plots, 3D - Categorized Surface. A surface is fit to a four-coordinate data set in this 3-dimensional ternary graph categorized by each level of the grouping variable (or user-defined subset of data). One component graph is produced for each level of the grouping variable (or user-defined subset of data) and all the component graphs are arranged in one display to allow for comparisons between the subsets of data (categories).

Ternary Plots, 3D - Categorized Trace. In this type of ternary graph, we can examine the relations between four or more dimensions (*X, Y, Z*, and *V1, V2*, etc.) as a 3D trace plot categorized by each level of the grouping variable (or user-defined subset of data). One component graph is produced for each level of the grouping variable (or user-defined subset of data) and all the component graphs are arranged in one display to allow for comparisons between the subsets of data (categories).

Ternary Plots, 3D - Contour/Areas. In this type of ternary graph, the 3-dimensional surface (fitted to a four-coordinate data set) is projected onto a 2-dimensional plane as an area contour.

Ternary Plots, 3D - Contour/Lines. In this type of ternary graph, the 3-dimensional surface (fitted to a four-coordinate data set) is projected onto a 2-dimensional plane as a line contour (see graph below).

Ternary Plots, 3D - Deviation. Use this type of ternary graph to examine the relations between four or more dimensions (*X, Y, Z*, and *V1, V2*, etc.) as "deviations" from a specified base-level of the *V-axis*, where three of those dimensions (*X, Y*, and *Z*) represent components of a mixture (i.e., the relation between them is constrained such that the values of the three variables add up to the same constant for each case).

Ternary Plots, 3D - Space. This type of ternary graph offers a distinctive method of representing 3D scatterplot data through the use of an *X-Y-Z* plane (defined via a triangular coordinate system) positioned at a user-selectable level of the vertical *V-axis* (which "sticks up" through the middle of the plane). The level of the *X-Y-Z* plane can be adjusted in order to divide the *X-Y-Z*-space into meaningful parts (e.g., featuring different patterns of the relation between the three variables).

Text Mining. While data mining is typically concerned with the detection of patterns in numeric data, very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data, text is often amorphous, and difficult to deal with (e.g., email messages, open-ended comments on a questionnaire or suggestion form, patients' descriptions of their symptoms, searches of written historical records, etc.). Text mining generally consists of the analysis of (multiple) text documents by extracting key phrases, concepts, etc. and the preparation of the text processed in that manner for further analyses with numeric data mining techniques (e.g., to determine co-occurrences of concepts, key phrases, names, addresses, product names, etc.).

A typical (first) goal in data mining is feature extraction, i.e., the identification of the terms and concepts most frequently used in the input documents; a second goal typically is to discover any associations between features (e.g., associations between symptoms as described by patients). Hence, a first step to text mining usually consists of "coding" the information in the input text; as a second step various methods such as Association Rules algorithms may be applied to determine relations between features.
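
The "coding" step described above can be sketched as a simple term-frequency count (a minimal illustration; the function name, tokenization rule, and `min_count` filter are invented for this sketch and do not represent a specific text-mining facility):

```python
import re
from collections import Counter

def code_documents(docs, min_count=1):
    """Code free text numerically: build a vocabulary of terms occurring
    at least min_count times in the corpus, then return one row of term
    counts per document -- ready for numeric data mining techniques."""
    tokenized = [re.findall(r"[a-z']+", d.lower()) for d in docs]
    corpus = Counter(tok for doc in tokenized for tok in doc)
    vocab = sorted(t for t, c in corpus.items() if c >= min_count)
    rows = [[doc.count(t) for t in vocab] for doc in tokenized]
    return vocab, rows

# Two "patient description" documents coded as term counts:
vocab, rows = code_documents(["chest pain and fever", "fever and chills"],
                             min_count=2)
```

Co-occurrence counts over such rows are the raw material for the second step, e.g., Association Rules algorithms that relate features to each other.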

THAID. *THAID* is a classification trees program developed by Morgan & Messenger (1973) that performs multi-level splits when computing classification trees. For discussion of the differences of *THAID* from other classification tree programs, see A Brief Comparison of Classification Tree Programs.

Threshold. A criterion value (sometimes arbitrarily established) that is used to determine whether particular conditions are met, or a point separating conditions. (In neural networks, the threshold is a value subtracted from the weighted sum in a linear PSP unit to produce the activation level; in radial units, the threshold is actually treated as a deviation.)
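
For the neural-network sense, the computation is simply a weighted sum minus the threshold (a minimal sketch; the function name is invented for illustration):

```python
def linear_psp(inputs, weights, threshold):
    """Activation level of a linear PSP unit: the weighted sum of the
    inputs minus the unit's threshold."""
    return sum(w * x for w, x in zip(weights, inputs)) - threshold

# A unit with weights (0.5, 0.25), threshold 0.1, and inputs (1, 2):
level = linear_psp([1, 2], [0.5, 0.25], 0.1)   # 0.5 + 0.5 - 0.1 = 0.9
```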

Time Series. A *Time series* is a sequence of measurements, typically taken at successive points in time. *Time series* analysis includes a broad spectrum of exploratory and hypothesis testing methods that have two main goals: (a) identifying the nature of the phenomenon represented by the sequence of observations, and (b) forecasting (predicting future values of the time series variable). Both of these goals require that the pattern of observed time series data is identified and more or less formally described. Once the pattern is established, we can interpret and integrate it with other data (i.e., use it in our theory of the investigated phenomenon, e.g., seasonal commodity prices). Regardless of the depth of our understanding and the validity of our interpretation (theory) of the phenomenon, we can extrapolate the identified pattern to predict future events.

For more information, see Time Series.

Time Series (in Neural Networks). Many important problems can be classified as time series problems; the objective is to predict the value of some (typically continuous) variable, given previous values of that and/or other variables (Bishop, 1995).

Time-Dependent Covariates. Time-dependent covariates occur when the effect of the covariate on survival is dependent on time (i.e., the conditional hazard at each point in time is a function of the covariate and time).

Tolerance (in Multiple Regression). The *tolerance* of a variable is defined as 1 minus the squared multiple correlation of this variable with all other independent variables in the regression equation. Therefore, the smaller the *tolerance* of a variable, the more redundant is its contribution to the regression (i.e., it is redundant with the contribution of other independent variables). If the *tolerance* of any of the variables in the regression equation is equal to zero (or very close to zero), then the regression equation cannot be evaluated (the matrix is said to be ill-conditioned, and it cannot be inverted).
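
The definition lends itself to a direct sketch (illustrative only; `tolerance` is an invented helper, with an ordinary least-squares fit standing in for the auxiliary multiple regression):

```python
import numpy as np

def tolerance(X, j):
    """Tolerance of predictor j: 1 minus the squared multiple correlation
    (R-squared) of column j regressed on all other predictor columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])   # intercept + others
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r_squared = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
    return 1.0 - r_squared
```

A tolerance near zero signals that the column is (almost) a linear combination of the other predictors, i.e., the ill-conditioned case in which the regression equation cannot be evaluated.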

Topological Map. The radial layer of a Kohonen network, with units laid out in two dimensions and trained so that inter-related clusters tend to be situated close together in the layer. Used for cluster analysis (Kohonen, 1982; Fausett, 1994; Haykin, 1994; Patterson, 1996). See, Neural Networks.

Trace Plots 3D. As in *3D Scatterplots*, each data point in *Trace Plots* is represented by its location in 3D space as determined by the values of the variables selected as *X*, *Y*, and *Z* (and interpreted as the *X*, *Y*, and *Z* axis coordinates). The data points are then connected sequentially (in the order encountered in the data file) with a line to form a "trace" of a sequential process (e.g., movement, change of a phenomenon over time, etc.).

A good metaphor of the information that is best represented in a trace plot is that of the trajectory of an object in three-dimensional space.

Trace Plot, Categorized (Ternary Graph). Use this type of ternary graph to examine the relations between four or more dimensions (*X, Y, Z*, and *V1, V2*, etc.) as a 3D trace plot where three of those dimensions (*X, Y*, and *Z*) represent components of a mixture (i.e., the relations between them are constrained such that the values of the three variables add up to the same constant for each case). Data points in this graph are positioned as in regular 3D scatterplots; however, individual data points are connected with a line (in the order in which they were read from the data file), visualizing a "trace" of sequential values.

Training/Test Error/Classification Accuracy. A measure of how well a trained model predicts the data in the training/test set.

Trimmed Means. For certain graphs (e.g., *2D Box Plots*, *3D Box Plots*, *Categorized Box Plots*), an option is available to trim the extreme values from the distribution of values of a variable. For example, we can trim (i.e., remove) the lowest 5% and the highest 5% from the distribution of values. The mean of the trimmed distribution of values is referred to as a "trimmed mean" (this term was first used by Tukey, 1962).
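The trimming described above can be sketched directly: sort the values, drop a fixed proportion from each tail, and average the rest. This is a minimal illustration (the data and 5% trim proportion are made up for the example); libraries such as SciPy provide an equivalent `trim_mean` function.

```python
import numpy as np

def trimmed_mean(values, proportion=0.05):
    """Mean after trimming `proportion` of cases from each tail."""
    x = np.sort(np.asarray(values, dtype=float))
    cut = int(len(x) * proportion)        # cases dropped per tail
    return x[cut:len(x) - cut].mean()

data = list(range(1, 20)) + [1000]        # one extreme outlier
print(round(float(np.mean(data)), 1))     # 59.5 -- ordinary mean, pulled up by the outlier
print(trimmed_mean(data))                 # 10.5 -- 5% trim removes the outlier
```

With 20 cases, a 5% trim removes exactly one value from each tail, so the single outlier (1000) no longer distorts the mean.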

t-Test (for Independent and Dependent Samples). The *t-test* is the most commonly used method to evaluate the differences in means between two groups. The groups can be independent (e.g., blood pressure of patients who were given a drug vs. a control group who received a placebo) or dependent (e.g., blood pressure of patients "before" vs. "after" they received a drug, see below). Theoretically, the *t-test* can be used even if the sample sizes are very small (e.g., as small as 10; some researchers claim that even smaller n's are possible), as long as the variables are approximately normally distributed and the variation of scores in the two groups is not reliably different (see also Elementary Concepts).

**Dependent samples test.** The *t-test for dependent samples* can be used to analyze designs in which the within-group variation (normally contributing to the error of the measurement) can be easily identified and excluded from the analysis. Specifically, if the two groups of measurements (that are to be compared) are based on the same sample of observation units (e.g., subjects) that were tested twice (e.g., before and after a treatment), then a considerable part of the within-group variation in both groups of scores can be attributed to the initial individual differences between the observations and thus accounted for (i.e., subtracted from the error). This, in turn, increases the sensitivity of the design.

**One-sample test.** In the so-called *one-sample t-test*, the observed mean (from a single sample) is compared to an expected (or reference) mean of the population (e.g., some theoretical mean), and the variation in the population is estimated based on the variation in the observed sample.

See Hays, 1988. See also the Basic Statistics Introductory Overviews: t-test for Independent Samples and t-test for Dependent Samples.
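The independent-samples case above can be sketched by computing the pooled-variance t statistic by hand. The drug/placebo numbers below are invented for illustration; the resulting statistic would be compared to a t distribution with n1+n2-2 degrees of freedom.

```python
import math

def t_independent(a, b):
    """Pooled two-sample t statistic for the difference in means."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Sums of squared deviations within each group.
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    pooled_var = (ssa + ssb) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (ma - mb) / se   # refer to t distribution with na+nb-2 df

drug    = [120, 118, 115, 122, 117]   # hypothetical systolic blood pressure
placebo = [130, 128, 131, 127, 133]
print(t_independent(drug, placebo))   # about -7.07: a large group difference
```

A statistic this far from zero would be highly significant for 8 degrees of freedom; packages such as SciPy (`scipy.stats.ttest_ind`, `ttest_rel`, `ttest_1samp`) also return the corresponding p-value.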

Tukey HSD. This post hoc test (or multiple comparison test) can be used to determine the significant differences between group means in an analysis of variance setting. The *Tukey HSD* is generally more conservative than the Fisher LSD test but less conservative than Scheffe's test (for a detailed discussion of different post hoc tests, see Winer, Brown, and Michels, 1991). For more details, see General Linear Models. See also, Post Hoc Comparisons. For a discussion of statistical significance, see Elementary Concepts.

Tukey Window. In Time Series, the Tukey window is a weighted moving average transformation used to smooth the periodogram values. In the Tukey (Blackman and Tukey, 1958) or Tukey-Hanning window (named after Julius von Hann), for each frequency, the weights for the weighted moving average of the periodogram values are computed as:

w_{j} = 0.5 + 0.5*cos(π*j/p)    (for j = 0 to p)

w_{-j} = w_{j}    (for j ≠ 0)

where *p* = (*m*-1)/2.

This weight function will assign the greatest weight to the observation being smoothed in the center of the window, and increasingly smaller weights to values that are further away from the center.

See also, Spectrum Analysis - Basic Notations and Principles.
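The weight formula above can be sketched in a few lines and applied as a moving average over a periodogram. The window width m and the periodogram values are illustrative assumptions; normalizing the weights to sum to 1 (so the smoothed values stay on the same scale) is an implementation choice, not part of the glossary's formula.

```python
import numpy as np

def tukey_weights(m):
    """Tukey-Hanning weights w_{-p}..w_{p}, with p = (m-1)/2."""
    p = (m - 1) // 2
    j = np.arange(-p, p + 1)
    w = 0.5 + 0.5 * np.cos(np.pi * j / p)   # w_j = 0.5 + 0.5*cos(pi*j/p)
    return w / w.sum()                      # normalize (implementation choice)

m = 5
w = tukey_weights(m)
print(w)  # greatest weight in the center, tapering to 0 at the edges

# Smooth some made-up periodogram values with the window.
periodogram = np.array([1.0, 8.0, 2.0, 9.0, 3.0, 7.0, 2.0])
smoothed = np.convolve(periodogram, w, mode="same")
print(smoothed)
```

With m = 5 the normalized weights are (0, 0.25, 0.5, 0.25, 0): the center observation gets the greatest weight, matching the description above.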

Two-State (in Neural Networks). An encoding technique for nominal variables with only two values, where the nominal variable is represented by a single input or output unit, either set or cleared. See, Neural Networks.
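The encoding above amounts to mapping one of the two nominal values to a set unit (1) and the other to a cleared unit (0). A minimal sketch, with an invented variable for illustration:

```python
def two_state(value, set_value):
    """Encode a two-valued nominal variable as a single 0/1 unit."""
    return 1 if value == set_value else 0

# Hypothetical two-valued nominal variable.
genders = ["male", "female", "female", "male"]
encoded = [two_state(g, "female") for g in genders]
print(encoded)  # [0, 1, 1, 0]
```

Because the variable has exactly two states, one unit suffices; nominal variables with more than two values typically require one-of-N encoding instead.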

Type I, II, III (IV, V) Sums of Squares. When a factorial ANOVA design contains missing cells, there is ambiguity regarding the specific comparisons between the (population, or least-squares) cell means that constitute the main effects and interactions of interest. General Linear Models discusses the methods commonly labeled *Type I*, *II*, *III*, and *IV* sums of squares, as well as methods for testing effects in incomplete designs that are widely used in other areas (and traditions) of research.

**Type V sums of squares.** We propose the term *Type V sums of squares* to denote the approach that is widely used in industrial experimentation to analyze fractional factorial designs; these types of designs are discussed in detail in the *2^(k-p) Fractional Factorial Designs* section of Experimental Design. In effect, for those effects for which tests are performed, all population marginal means (least squares means) are estimable.

**Type VI sums of squares.** We propose the term *Type VI sums of squares* to denote the approach that is often used in programs that only implement the sigma-restricted model (as opposed to programs like *STATISTICA*'s *VGLM*, which offers the user a choice between the sigma-restricted and overparameterized models). This approach is identical to what is described as the *effective hypothesis* method in Hocking (1996).

For additional details, see the *Six types of sums of squares* topic in General Linear Models.

Type I and II Censoring. So-called *Type I censoring* describes the situation when a test is terminated at a particular point in time, so that the remaining items are only known not to have failed up to that time (e.g., we start with 100 light bulbs, and terminate the experiment after a certain amount of time). In this case, the censoring time is often fixed, and the number of items failing is a random variable. In *Type II censoring* the experiment would be continued until a fixed proportion of items have failed (e.g., we stop the experiment after exactly 50 light bulbs have failed). In this case, the number of items failing is fixed, and time is the random variable.

Data sets with censored observations can be analyzed via Survival Analysis or via Weibull and Reliability/Failure Time Analysis. See also, Single and Multiple Censoring and Left and Right Censoring.
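The contrast between the two schemes can be made concrete with a small simulation of the light-bulb example. The exponential lifetimes, sample size, cutoff time, and failure count below are all illustrative assumptions.

```python
import random

random.seed(1)
# Simulate 100 light-bulb lifetimes (mean 1000 hours), sorted by failure time.
lifetimes = sorted(random.expovariate(1 / 1000) for _ in range(100))

# Type I censoring: stop at a FIXED time; the number of failures is random.
t_stop = 800
type1_failures = sum(t <= t_stop for t in lifetimes)

# Type II censoring: stop after a FIXED number of failures; the time is random.
r = 50
type2_stop_time = lifetimes[r - 1]   # moment the 50th bulb fails

print(type1_failures)                # how many failed by hour 800 (random)
print(round(type2_stop_time, 1))     # when the experiment ends (random)
```

Rerunning with a different seed changes `type1_failures` (the random quantity under Type I) and `type2_stop_time` (the random quantity under Type II), while `t_stop` and `r` stay fixed by design.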

Type I Error Rate (Alpha). The probability of incorrectly rejecting a true statistical null hypothesis.