| Moving from STATISTICA Version 7 Neural Networks (SNN) to STATISTICA Version 8 Automated Neural Networks |
The popularity of neural network methodology is growing rapidly in a wide variety of areas, from basic research to data mining applications, business forecasting and risk management, engineering, and others (see Example Applications). STATISTICA Automated Neural Networks (SANN) is one of the most advanced and best-performing neural network applications on the market. It offers numerous unique advantages and will appeal not only to neural network experts (by offering them a wide selection of network types and training algorithms), but also to new users in the field of neural computing (via the unique Automated Network Search, a tool that guides the user through the necessary procedures for creating neural networks).
STATISTICA Automated Neural Networks is a comprehensive, state-of-the-art, powerful, and extremely fast neural network data analysis package, featuring:
STATISTICA Automated Neural Networks - Tackling the Real Issues in Neural Computing
Using neural networks involves more than simply feeding data to a neural network.
STATISTICA Automated Neural Networks (SANN) has the functionality to assist you through the critical design stages, including not only state-of-the-art Neural Network Architectures and Training Algorithms, but also innovative new approaches to network architecture design by using specific and meaningful error functions that allow the interpretation of the output results. Moreover, software developers and users who experiment with customized applications will appreciate that, once prototyping experiments are completed in STATISTICA Automated Neural Networks' simple and intuitive user interface, neural network analyses can be incorporated into custom applications, either through the STATISTICA library of COM functions, which fully expose all functionality of the program, or through the C/C++ code that the program generates to aid in the deployment of fully trained networks.
STATISTICA Automated Neural Networks is fully integrated with the STATISTICA system, so a large selection of tools for editing (preparing) data for analyses is available (transformations, case selection conditions, data verification tools, etc.). Like all STATISTICA analyses, the program can be "connected" to remote databases via the tools for in-place-database processing, or it can be linked to active data so that models are retrained or applied (e.g., to compute predicted values or classifications) automatically every time the data change.
Data Scaling and Nominal Value Preparation
In general, data must be specifically prepared for input into neural networks, and it is equally important that the network output can be interpreted correctly. STATISTICA Automated Neural Networks (SANN) includes automatic data scaling for both inputs and outputs; there is also automatic recoding of nominal-valued variables (e.g., Sex = {Male, Female}), including one-of-N encoding. SANN also has facilities to handle missing data. There are special data preparation and interpretation facilities for use with Time Series. A large number of relevant tools are also included in STATISTICA.
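To make these preparation steps concrete, the following is a minimal Python sketch of linear (min-max) scaling and one-of-N encoding. It illustrates the general techniques only and is not SANN's implementation; the function names `minmax_scale` and `one_of_n` are hypothetical, and the choice of min-max scaling is an assumption.

```python
def minmax_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale numeric values into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant column carries no information; map it to the lower bound.
        return [lo for _ in values]
    span = vmax - vmin
    return [lo + (v - vmin) * (hi - lo) / span for v in values]


def one_of_n(labels):
    """Encode nominal labels as one-of-N vectors.

    Returns the sorted category list and one 0/1 vector per label,
    with a single 1 in the position of that label's category.
    """
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(categories))]
        for label in labels
    ]
    return categories, vectors
```

For example, `one_of_n(["Male", "Female", "Male"])` yields the categories `["Female", "Male"]` and one two-element vector per case, so a nominal variable with N levels becomes N network inputs.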
For classification problems, SANN assigns cases to class memberships and interprets network outputs as true probabilities. In combination with SANN's specialized Softmax activation function and cross-entropy error functions, this supports a principled, probabilistic approach to classification.
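The reason softmax outputs can be read as class probabilities, and how the cross-entropy error scores them, can be seen in a short Python sketch. This is a generic illustration of the two functions, not code taken from SANN; the function names are hypothetical.

```python
import math


def softmax(logits):
    """Map raw output activations to positive values that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Cross-entropy error for one case whose true class is target_index."""
    # Guard against log(0) if a probability underflows to zero.
    return -math.log(max(probs[target_index], 1e-12))
```

Because the outputs are positive and sum to 1, the largest output identifies the predicted class, and the error is small only when the network assigns high probability to the true class.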
The range of neural network models and the number of parameters that must be decided upon (including network size and training algorithm control parameters) can seem bewildering [the Automated Network Search (ANS) is available to automatically search through numerous network architectures of varying complexities; see below]. STATISTICA Automated Neural Networks (SANN) supports the most important classes of neural networks for real world problem solving, including:
The above architecture can be used for regression, classification, regression time series, classification time series, and cluster analysis.
In addition, ANS supports ensembles: networks formed from arbitrary (when meaningful) combinations of the network types listed above. Combining networks to form ensemble predictions is particularly easy in SANN, and is especially useful for noisy or small datasets.
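One common way to combine networks into an ensemble is to average their outputs. The text does not state how SANN combines ensemble members, so the Python sketch below is an assumption that shows simple averaging for regression and for class probabilities; the function names are hypothetical.

```python
def ensemble_regression(predictions):
    """Average per-case predictions from several models.

    `predictions` holds one list of predicted values per model,
    all of the same length and in the same case order.
    """
    n_models = len(predictions)
    return [sum(case) / n_models for case in zip(*predictions)]


def ensemble_classification(prob_vectors):
    """Average the class probabilities that several models give one case.

    Returns the index of the winning class and the averaged probabilities.
    """
    n_models = len(prob_vectors)
    averaged = [sum(column) / n_models for column in zip(*prob_vectors)]
    winner = max(range(len(averaged)), key=averaged.__getitem__)
    return winner, averaged
```

Averaging tends to cancel the individual errors of the member networks, which is why ensembles help most when the data are noisy or scarce.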
SANN contains numerous facilities to aid in selecting an appropriate network architecture. SANN's statistical and graphical feedback includes histograms, matrices and graphs of individual and overall case errors, summaries of classification/misclassification performance, and vital statistics such as regression correlation - all automatically calculated.
For data visualization, SANN can also display scatterplots and 3D response surfaces to help the user understand the network's "behavior."
Naturally, you can use information from any of these sources for further analyses with other STATISTICA tools, for inclusion in your reports, or for customization.
SANN automatically retains copies of the best networks found as you experiment on a problem, which can be retrieved at any time. The usefulness and predictive validity of the network can automatically be assessed by including test and validation samples and by evaluating the size and efficiency of the network as well as the cost of misclassification.
For enhanced performance, STATISTICA Automated Neural Networks supports a number of network customization options. You can specify a linear output layer for networks used in (but not restricted to) regression problems or softmax activation functions for probability estimation in classification problems. Cross-entropy error functions, based on information-theory models, are also included, and there is a range of specialized activation functions, including Exponential, Hyperbolic Tangent, Logistic Sigmoid, and Sine functions for both hidden and output neurons.
The Automated Network Search (Automated evaluation and selection of multiple network architectures)
Included with STATISTICA Automated Neural Networks (SANN) is a tool that can automatically evaluate a large number of different neural network architectures of varying complexities, and select the best set of specific architectures for the problem at hand. It is known as the Automated Network Search (ANS).
A significant amount of time during the design of a neural network is spent on selecting appropriate variables and then optimizing the network architecture by heuristic search. SANN takes the pain out of the process by automatically conducting a heuristic search for you. This search includes network types, network sizes and architectures, activation functions, and even error functions when appropriate.
The Automated Network Search is an extremely effective tool that uses sophisticated techniques to search automatically for optimal network architectures. Why labor over a terminal for hours, when you can let STATISTICA Automated Neural Networks do the work for you?
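The general shape of an architecture search can be sketched in a few lines of Python. The sketch below is not the ANS algorithm, whose heuristics are not described here: an exhaustive grid over two settings stands in for the heuristic search, and `search_architectures`, `toy_error`, and the candidate values are all hypothetical.

```python
import itertools


def search_architectures(evaluate, hidden_units, activations, keep=3):
    """Score every (hidden_units, activation) pair and keep the best few.

    `evaluate` is a caller-supplied function that returns an error
    (lower is better) for one candidate configuration.
    """
    scored = []
    for n_hidden, activation in itertools.product(hidden_units, activations):
        error = evaluate(n_hidden, activation)
        scored.append((error, n_hidden, activation))
    scored.sort(key=lambda item: item[0])
    return scored[:keep]


# Stand-in scoring function for demonstration only: a real search would
# train a network with these settings and return its test-sample error.
def toy_error(n_hidden, activation):
    penalty = {"tanh": 0.0, "logistic": 0.05}[activation]
    return abs(n_hidden - 8) * 0.01 + penalty


best = search_architectures(toy_error, [2, 4, 8, 16], ["tanh", "logistic"], keep=2)
```

Retaining several top candidates rather than a single winner mirrors the idea of selecting "the best set" of architectures, which can then be compared or combined.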
As you experiment with architectures and network types, you rely critically on the quality and speed of the network training algorithms. STATISTICA Automated Neural Networks (SANN) supports the best known state-of-the-art training algorithms.
SANN naturally includes fast, second-order training algorithms: Conjugate Gradient Descent and BFGS. There is also a memory-less version of BFGS to which SANN automatically switches whenever the amount of available memory on your computer reaches critical levels. These algorithms typically converge far more quickly than first-order algorithms such as Gradient Descent.
STATISTICA Automated Neural Networks' iterative training procedures are complemented by automatic tracking of both the training error and an independent testing error as training progresses. Training can be aborted at any point by the click of a button, and you can also specify Stopping Conditions under which training should be ended early, for example, when a target error level is reached, or when the selection error deteriorates over a given number of epochs. If over-learning occurs, you needn't worry: SANN automatically retains a copy of the best network discovered, which is automatically retrieved and used as the best solution. When training has finished, you can finally check performance against train, test, and validation samples.
STATISTICA Automated Neural Networks also includes a range of training algorithms for Cluster analysis, which are based on the well-known Kohonen algorithm for self-organizing feature maps.
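The stopping conditions and best-network retention described above can be sketched as a simple training loop in Python. This is a generic early-stopping illustration, not SANN's code; `train_with_early_stopping`, the `step` callback, and the `patience` parameter are hypothetical names, and the error curve is simulated.

```python
def train_with_early_stopping(step, max_epochs, patience, target_error=None):
    """Run `step(epoch)` once per epoch; it returns (network_state, test_error).

    Stops when the test error has not improved for `patience` epochs or
    when `target_error` is reached, and returns the best state seen.
    """
    best_state, best_error, best_epoch = None, float("inf"), -1
    for epoch in range(max_epochs):
        state, error = step(epoch)
        if error < best_error:
            # Retain a copy of the best network discovered so far.
            best_state, best_error, best_epoch = state, error, epoch
        if target_error is not None and best_error <= target_error:
            break
        if epoch - best_epoch >= patience:
            break
    return best_state, best_error, best_epoch


# Simulated test-error curve: it improves, then over-learning sets in.
errors = [0.9, 0.5, 0.3, 0.35, 0.4, 0.45, 0.5]
state, error, epoch = train_with_early_stopping(
    lambda e: (f"weights@{e}", errors[e]), max_epochs=len(errors), patience=3
)
```

In this simulated run the loop stops once three epochs pass without improvement and returns the state from the epoch with the lowest test error, not the last one trained.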
Once you have trained a network, you'll want to test its performance and explore its characteristics. STATISTICA Automated Neural Networks (SANN) uses a range of statistics and graphical facilities.
You may select multiple models (and ensembles), in which case, wherever possible, SANN will display any results generated in a comparative fashion (e.g., by plotting the response curves for several models on a single graph, or presenting the predictions of several models in a single spreadsheet). This feature is particularly useful for comparing various models trained on the same data set.
All statistics are generated independently for the training, test, and validation samples or combinations of your choice.
Overall statistics calculated include mean network error, the so-called confusion matrix for classification problems (which summarizes correct and incorrect classification across all classes), and the correlation for regression problems - all automatically calculated. Kohonen networks include a Topological Map window, which enables you to visually inspect unit activations during data analysis.
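As a concrete illustration of the confusion matrix mentioned above, the following Python sketch tallies actual against predicted classes and derives the overall classification rate. It is a generic example with hypothetical function names, not output or code from SANN.

```python
def confusion_matrix(actual, predicted, classes):
    """Count cases by (actual class, predicted class).

    Rows correspond to actual classes and columns to predicted classes,
    both in the order given by `classes`.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of cases on the diagonal, i.e., classified correctly."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0
```

Off-diagonal cells show which classes are confused with which, information that a single error figure hides.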
Embedded Solutions (custom applications that use the STATISTICA Automated Neural Networks engines)
STATISTICA Automated Neural Networks' simple and efficient user-interface enables you to rapidly prototype neural network solutions to your problems.
In some applications, you may want to embed these solutions in your own systems and, for example, build them into larger computing environments (such as pre-designed procedures built into enterprise-wide computing systems).
Trained neural networks can be applied to new data (for prediction) in several ways. You can save the trained neural networks and later retrieve them to be applied to new data (for prediction, predicted classification, or forecasting). You can also use the optional code generator to save fully trained neural network models in the C programming language; the generated code is ready to compile and can be called from external applications and environments, such as Visual Basic, for deployment and for predicting new data. Finally, all functionality of STATISTICA, including STATISTICA Automated Neural Networks, can be accessed as COM (Component Object Model) functions from other applications (e.g., from Java, Microsoft Excel, C#, VB.NET, etc.). For example, you could embed automated analyses via STATISTICA Automated Neural Networks into your Microsoft Excel spreadsheets.
STATISTICA System Requirements
Minimum:
The networks can be of practically unlimited size (that is, they can be much larger than what would ever be practical or reasonable). For all practical purposes, the program is effectively limited only by the hardware of the computer.
STATISTICA Neural Networks includes a well-illustrated manual, with a comprehensive, conceptual introduction to Neural Networks (and tutorials), and extensive context-sensitive Help accessible from every dialog.
Examples of Real-life Applications
Neural networks can be used in virtually any situation where the objective is to determine an unknown variable or attribute from known observations or registered measurements (i.e., various forms of regression, classification, and time series), where there is a sufficient amount of historical data, and where there actually exists a tractable underlying relationship or a set of relationships (networks are relatively noise tolerant). In addition, neural networks can be used for exploratory analysis by looking for data clustering (Kohonen networks).
A comprehensive discussion of theoretical considerations related to the issue of when neural network applications are most likely to be successful can be found in the chapter on neural networks in the StatSoft Electronic Statistics Textbook (available on the StatSoft web site). The following list includes a selection of representative examples that by no means exhaust all areas where neural networks can be used.
Optional Source Code Generator Add-on
The C source code generator is an optional add-on that allows users the flexibility to build custom applications based on solutions found with STATISTICA Automated Neural Networks (SANN). This add-on generates a source code version of a neural network in C (also available in PMML), which can then be compiled and integrated into your own programs. The add-on product is designed for corporate system developers and other users who need to convert the highly optimized solutions generated by SANN procedures into fixed, predefined applications that will solve routine analytic problems. Note: The SANN C-code generator is an add-on feature that requires a separate license.
2300 East 14th Street, Tulsa, OK 74104
Phone: (918) 749-1119; Fax: (918) 749-2217
e-mail: info@statsoft.com
©Copyright StatSoft, Inc., 1984-2008.
StatSoft, StatSoft logo, and STATISTICA, are trademarks of StatSoft, Inc.