STATISTICA Automated Neural Networks


“Your work has made a huge difference in my ability to create medical devices with expert-level accuracy.”

Scott B. Wilson
President
Persyst Development Corporation

The popularity of neural network methodology is rapidly growing in a wide variety of areas from basic research to data mining applications, business forecasting and risk management, engineering, and others.

STATISTICA Automated Neural Networks (SANN) is one of the most advanced and best-performing neural network applications on the market. It offers numerous unique advantages and will appeal to both neural network experts and new users.

Experts have a wide selection of network types and training algorithms. New users can be guided, via the Automated Network Search tool, through the necessary procedures for creating neural networks.

STATISTICA Automated Neural Networks is a comprehensive, state-of-the-art, powerful, and extremely fast neural network data analysis package, featuring:

  • Integrated pre- and post-processing, including data selection, nominal-value encoding, scaling, normalization, and missing value substitution, with interpretation for classification, regression, and time series problems.
  • Exceptional ease of use coupled with unsurpassed analytic power; for example, a unique wizard-style Automated Network Search (ANS) guides the user step-by-step through the procedure of creating a variety of different networks and choosing the network with the best performance (a task that would otherwise require a lengthy "trial-and-error" process and a solid background in the underlying theory).
  • State-of-the-art, highly optimized training algorithms (including Conjugate Gradient Descent and BFGS).
  • Support for combinations of networks and network architectures of practically unlimited sizes organized in network sets for forming ensembles.
  • Comprehensive graphical and statistical feedback that facilitates interactive exploratory analyses.
  • Full integration with the STATISTICA system; all results, graphs, reports, etc. can be further modified with STATISTICA's powerful graphics and analytic tools (e.g., to perform further analyses of prediction residuals, generate annotated summary reports, and so on).
  • Full integration with STATISTICA's powerful tools for automation; record complete macros for all analyses; program custom neural network analyses and applications in the STATISTICA Visual Basic environment, or call STATISTICA Automated Neural Networks from any application that supports the Component Object Model (COM; e.g., automatically perform neural network analyses in Microsoft Excel spreadsheets, or incorporate neural network procedures in your own custom applications developed in C, C++, C#, Java, etc.).

Additional SANN Information

  • Tackling the Real Issues in Neural Computing
  • Input Data
  • Data Scaling and Nominal Value Preparation
  • Selecting a Neural Network Model, Neural Network Ensembles
  • The Automated Network Search (Automated evaluation and selection of multiple network architectures)
  • Training a Neural Network
  • Probing and Testing a Neural Network
  • Embedded Solutions
  • Training Algorithm Summary
  • Size Limitations
  • Electronic Manual
  • Optional Source Code Generator Add-on
  • Examples of Real-life Applications

Tackling the Real Issues in Neural Computing

Using neural networks involves more than simply feeding data to a neural network.

STATISTICA Automated Neural Networks (SANN) has the functionality to assist you through the critical design stages. This includes not only state-of-the-art neural network architectures and training algorithms, but also innovative approaches to network architecture design that use specific and meaningful error functions, which allow the output results to be interpreted. Moreover, software developers and users who experiment with customized applications will appreciate that, once prototyping experiments are completed using SANN's simple and intuitive user interface, neural network analyses can be incorporated in custom applications in either of two ways: by using the STATISTICA library of COM functions, which fully expose all functionality of the program, or by using the C/C++ code generated by the program to aid in the deployment of fully trained networks.


Input Data

STATISTICA Automated Neural Networks is fully integrated with the STATISTICA system, so a large selection of tools for editing (preparing) data for analyses is available (transformations, case selection conditions, data verification tools, etc.). Like all STATISTICA analyses, the program can be "connected" to remote databases via the tools for in-place-database processing, or it can be linked to active data so that models are retrained or applied (e.g., to compute predicted values or classifications) automatically every time the data change.


Data Scaling and Nominal Value Preparation

In general, data must be specifically prepared for input into neural networks, and it is equally important that the network output can be interpreted correctly. STATISTICA Automated Neural Networks (SANN) includes automated data scaling for both inputs and outputs, as well as automated recoding of nominal-valued variables (e.g., Sex = {Male, Female}), including one-of-N encoding. SANN also has facilities to handle missing data, and there are special data preparation and interpretation facilities for use with time series. A large number of relevant tools are also included in STATISTICA.
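The two preparation steps named above, scaling and one-of-N encoding, can be illustrated with a minimal sketch. This is not SANN's implementation; the function names `minmax_scale` and `one_of_n` are hypothetical, and the choice of a linear rescaling to [0, 1] is an assumption made for illustration.

```python
def minmax_scale(values, lo=0.0, hi=1.0):
    """Linearly rescale a list of numbers to the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant column carries no information; map it to the midpoint.
        return [(lo + hi) / 2.0 for _ in values]
    span = vmax - vmin
    return [lo + (v - vmin) * (hi - lo) / span for v in values]


def one_of_n(labels):
    """Encode nominal labels as one-of-N vectors (one 0/1 column per category).

    Returns the sorted category list and one vector per input label.
    """
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for label in labels:
        row = [0] * len(categories)
        row[index[label]] = 1
        encoded.append(row)
    return categories, encoded


# Illustrative data (not from the source).
ages = [20, 35, 50]
sex = ["Male", "Female", "Female"]

scaled_ages = minmax_scale(ages)
categories, encoded_sex = one_of_n(sex)

print(scaled_ages)   # [0.0, 0.5, 1.0]
print(categories)    # ['Female', 'Male']
print(encoded_sex)   # [[0, 1], [1, 0], [1, 0]]
```

Each nominal variable becomes as many network inputs as it has categories, which is why the encoding step affects the size of the input layer.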

For classification problems, SANN assigns cases to class memberships and interprets network outputs as true probabilities. In combination with SANN's specialized Softmax activation function and cross-entropy error functions, this supports a principled, probabilistic approach to classification.
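As a rough illustration of why softmax outputs paired with a cross-entropy error can be read as class probabilities, the sketch below computes both for a single case. The function names and the example values are assumptions made for illustration; they do not reflect SANN's internal code.

```python
import math


def softmax(z):
    """Map raw output activations to positive values that sum to 1."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Cross-entropy error for one case whose true class is target_index."""
    return -math.log(probs[target_index])


# Illustrative raw outputs for a three-class problem (not from the source).
raw_outputs = [2.0, 1.0, 0.1]
probs = softmax(raw_outputs)

# The case is assigned to the class with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
loss = cross_entropy(probs, target_index=0)

print(probs, predicted_class, loss)
```

The error shrinks toward zero as the probability assigned to the true class approaches 1, so minimizing it pushes the outputs toward well-calibrated class probabilities.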


Selecting a Neural Network Model, Neural Network Ensembles

The range of neural network models and the number of parameters that must be decided upon (including network size and training algorithm control parameters) can seem bewildering. The Automated Network Search (ANS) is available to search automatically through numerous network architectures of varying complexities (see below). STATISTICA Automated Neural Networks (SANN) supports the most important classes of neural networks for real-world problem solving, including:

  • Multilayer Perceptrons
  • Radial Basis Function networks
  • Self-Organizing Feature Maps

The above architectures can be used for regression, classification, regression time series, classification time series, and cluster analysis.

In addition, ANS supports ensemble networks formed from arbitrary (when meaningful) combinations of the network types listed above. Combining networks to form ensemble predictions is particularly easy in SANN and is especially useful for noisy or small datasets. SANN contains numerous facilities to aid in selecting an appropriate network architecture. For data visualization, SANN can also display scatterplots and 3D response surfaces to help the user understand the network's "behavior". Naturally, you can use information from any of these sources for further analysis with other STATISTICA tools, for inclusion in your reports, or for customization.
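The idea of an ensemble prediction can be sketched as follows. The source does not state how SANN combines member networks, so simple averaging is assumed here purely for illustration, and the function names are hypothetical.

```python
def ensemble_regression(predictions):
    """Average the outputs of several networks for one case."""
    return sum(predictions) / len(predictions)


def ensemble_classification(prob_vectors):
    """Average class-probability vectors from several networks.

    Returns the index of the winning class and the averaged probabilities.
    """
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    avg = [sum(p[k] for p in prob_vectors) / n_models for k in range(n_classes)]
    winner = max(range(n_classes), key=avg.__getitem__)
    return winner, avg


# Illustrative outputs of three member networks for a single case.
regression_outputs = [10.0, 12.0, 14.0]
class_outputs = [[0.6, 0.4], [0.3, 0.7], [0.7, 0.3]]

combined_value = ensemble_regression(regression_outputs)
winner, avg_probs = ensemble_classification(class_outputs)

print(combined_value)      # 12.0
print(winner, avg_probs)
```

Averaging tends to cancel the individual errors of member networks, which is one common reason ensembles are favored when data are noisy or scarce.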

SANN automatically retains copies of the best networks found as you experiment on a problem, which can be retrieved at any time. The usefulness and predictive validity of the network can automatically be assessed by including test and validation samples and by evaluating the size and efficiency of the network as well as the cost of misclassification.

For enhanced performance, STATISTICA Automated Neural Networks supports a number of network customization options. You can specify a linear output layer for networks used in (but not restricted to) regression problems or softmax activation functions for probability-estimation in classification problems. Cross-entropy error functions, based on information-theory models, are also included, and there is a range of specialized activation functions, including Exponential, Tangent Hyperbolic, Logistic Sigmoid, and Sine functions for both hidden and output neurons.


The Automated Network Search (Automated evaluation and selection of multiple network architectures)

Included with STATISTICA Automated Neural Networks (SANN) is a tool that can automatically evaluate a large number of different neural network architectures of varying complexities, and select the best set of specific architectures for the problem at hand. It is known as the Automated Network Search (ANS).

A significant amount of time during the design of a neural network is spent on the selection of appropriate variables, and then optimizing the network architecture by heuristic search. SANN takes the pain out of the process by automatically conducting a heuristic search for you. This search includes network types, network sizes and architectures, activation functions, and even error functions when appropriate.
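The general shape of such a search can be sketched as a loop that samples candidate configurations, scores each one, and retains the best few. The source does not describe the heuristics ANS actually uses, so random sampling stands in for them here; the search space, the function names, and the `toy_error` scoring function are all hypothetical. In practice the scoring function would train a network and return its error on an independent sample.

```python
import itertools
import random


def candidate_configs():
    """Enumerate a small, hypothetical search space of network settings."""
    net_types = ["MLP", "RBF"]
    hidden_units = [2, 4, 8, 16]
    activations = ["logistic", "tanh", "exponential", "sine"]
    for net_type, n_hidden, act in itertools.product(
        net_types, hidden_units, activations
    ):
        yield {"type": net_type, "hidden_units": n_hidden, "activation": act}


def automated_search(evaluate, n_trials, n_retain, seed=0):
    """Score randomly sampled configurations and keep the n_retain best.

    evaluate(config) must return an error value (lower is better).
    """
    rng = random.Random(seed)
    space = list(candidate_configs())
    trials = rng.sample(space, min(n_trials, len(space)))
    scored = sorted(
        ((evaluate(cfg), cfg) for cfg in trials), key=lambda pair: pair[0]
    )
    return scored[:n_retain]


def toy_error(cfg):
    """Stand-in for 'train the network and measure its test error'."""
    error = abs(cfg["hidden_units"] - 8) / 8.0
    if cfg["activation"] != "tanh":
        error += 0.5
    if cfg["type"] != "MLP":
        error += 0.25
    return error


retained = automated_search(toy_error, n_trials=32, n_retain=3)
for error, cfg in retained:
    print(error, cfg)
```

Retaining several networks rather than a single winner matches the description above of selecting "the best set" of architectures, and it leaves room to combine them later.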

The Automated Network Search is an extremely effective tool that uses sophisticated techniques to search automatically for optimal network architectures. Why labor over a terminal for hours, when you can let STATISTICA Automated Neural Networks do the work for you?


Training a Neural Network

As you experiment with architectures and network types, you rely critically on the quality and speed of the network training algorithms. STATISTICA Automated Neural Networks (SANN) supports the best known state-of-the-art training algorithms.

SANN naturally includes fast, second-order training algorithms: Conjugate Gradient Descent and BFGS. There is also a memory-less version of BFGS to which SANN automatically switches whenever the amount of memory on your computer is at critical levels. These algorithms typically converge far more quickly than first-order algorithms such as Gradient Descent.

STATISTICA Automated Neural Networks' iterative training procedures are complemented by automated tracking of both the training error and an independent testing error as training progresses. Training can be aborted at any point by the click of a button, and you can also specify stopping conditions under which training is ended early, for example, when a target error level is reached or when the testing error deteriorates over a given number of epochs. If over-learning occurs, you needn't worry: SANN automatically retains a copy of the best network discovered, which is automatically retrieved and used as the best solution. When training has finished, you can check performance against the train, test, and validation samples.
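The stopping conditions and best-network retention described above follow a pattern commonly called early stopping. The sketch below is a generic version of that pattern, not SANN's code; the function name, its parameters (`patience`, `target_error`), and the simulated error sequence are assumptions made for illustration.

```python
def train_with_early_stopping(step, test_error, max_epochs, patience,
                              target_error=0.0):
    """Run training epochs until a stopping condition is met.

    step() runs one epoch and returns a snapshot (copy) of the weights.
    test_error(weights) returns the error on an independent test sample.
    Training stops when the best error reaches target_error, or when the
    test error has not improved for `patience` consecutive epochs.
    Returns the best weights seen, their error, and the last epoch run.
    """
    best_error = float("inf")
    best_weights = None
    epochs_without_improvement = 0
    epoch = 0
    for epoch in range(1, max_epochs + 1):
        weights = step()
        err = test_error(weights)
        if err < best_error:
            # Retain a copy of the best network found so far.
            best_error, best_weights = err, weights
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if best_error <= target_error or epochs_without_improvement >= patience:
            break
    return best_weights, best_error, epoch


# Simulated test errors per epoch; the "weights" are just the epoch index.
errors = [0.90, 0.60, 0.40, 0.35, 0.37, 0.41, 0.50, 0.30]
epoch_counter = iter(range(len(errors)))

best_weights, best_error, stopped_at = train_with_early_stopping(
    step=lambda: next(epoch_counter),
    test_error=lambda w: errors[w],
    max_epochs=len(errors),
    patience=3,
)
print(best_weights, best_error, stopped_at)  # 3 0.35 7
```

In this run the test error rises for three epochs after its minimum, so training stops at epoch 7 and the network from the fourth epoch (index 3) is returned rather than the over-trained final one.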


Probing and Testing a Neural Network

Once you have trained a network, you'll want to test its performance and explore its characteristics. STATISTICA Automated Neural Networks (SANN) offers a wide selection of statistical and graphical output.

You may select multiple models (and ensembles), in which case, wherever possible, SANN will display any results generated in a comparative fashion (e.g. by plotting the response curves for several models on a single graph, or presenting the predictions of several models in a single spreadsheet). This feature is particularly useful for comparing various models trained on the same data set.

All statistics are generated independently for the training, test, and validation samples or combinations of your choice.

Overall statistics calculated include mean network error, the so-called confusion matrix for classification problems (which summarizes correct and incorrect classification across all classes), and the correlation for regression problems - all automatically calculated. Kohonen networks include a Topological Map window, which enables you to visually inspect unit activations during data analysis.
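For readers unfamiliar with these summary statistics, a minimal sketch of a confusion matrix and a correlation coefficient follows. The function names and the example data are illustrative assumptions, and Pearson correlation is assumed as the correlation measure because the source does not specify one.

```python
import math


def confusion_matrix(actual, predicted, classes):
    """Count cases by (actual class, predicted class).

    Rows correspond to actual classes, columns to predicted classes.
    """
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def pearson_correlation(x, y):
    """Pearson correlation between observed and predicted values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Illustrative classification results (not from the source).
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "bad", "bad", "good"]
cm = confusion_matrix(actual, predicted, classes=["bad", "good"])
accuracy = sum(cm[i][i] for i in range(len(cm))) / len(actual)

# Illustrative regression results.
r = pearson_correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

print(cm)        # [[2, 0], [1, 2]]
print(accuracy)  # 0.8
print(r)
```

The diagonal of the matrix holds the correctly classified cases, so off-diagonal cells show exactly which classes are being confused with which.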


Embedded Solutions (custom applications that use the STATISTICA Automated Neural Networks engines)

STATISTICA Automated Neural Networks' simple and efficient user interface enables you to rapidly prototype neural network solutions to your problems.

In some applications, you may want to embed these solutions in your own systems and, for example, build them into larger computing environments (such as pre-designed procedures built into enterprise-wide computing systems).

Trained neural networks can be applied to new data (for prediction) in several ways. You can save the trained neural networks and later retrieve them to be applied to new data (for prediction, predicted classification, or forecasting). You can also use the optional code generator to save fully trained neural network models in the C programming language; the generated code is ready to compile and can be called from external applications and environments, such as Visual Basic, for deployment and for predicting new data. Finally, all functionality of STATISTICA, including STATISTICA Automated Neural Networks, can be accessed as COM (Component Object Model) functions from other applications (e.g., from Java, Microsoft Excel, C#, VB.NET, etc.). For example, you could embed automated analyses via STATISTICA Automated Neural Networks into your Microsoft Excel spreadsheets.


Training Algorithm Summary

  • Gradient Descent
  • Conjugate Gradient Descent
  • BFGS
  • Kohonen training
  • k-Means Center Assignment for Radial Basis networks
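Of the algorithms listed, k-means center assignment is the easiest to show compactly: the hidden-unit centers of a radial basis function network are placed at cluster means of the training inputs. The sketch below is a generic k-means loop under that assumption, not SANN's implementation; the function name, its parameters, and the sample points are hypothetical.

```python
import random


def kmeans_centers(points, k, n_iter=20, seed=0):
    """Place k centers at cluster means of the input points (plain k-means)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(n_iter):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for j, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members)
                          for d in range(dim))
                )
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers


# Two well-separated groups of 2D inputs (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centers = kmeans_centers(points, k=2)
print(sorted(centers))  # [(0.5, 0.5), (10.5, 10.5)]
```

Each resulting center would then serve as the location of one radial basis unit, with the output layer trained separately.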

Size Limitations

The networks can be of practically unlimited size (that is, they can be much larger than what would ever be practical or reasonable). For all practical purposes, the program is effectively limited only by the hardware of the computer.


Electronic Manual

STATISTICA Automated Neural Networks includes a well-illustrated manual, with a comprehensive conceptual introduction to neural networks (and tutorials), and extensive context-sensitive Help accessible from every dialog.


Optional Source Code Generator Add-on

The STATISTICA Automated Neural Networks (SANN) Code Generator add-on allows you to create a source code version of a neural network in C or Java (also available in PMML), which can then be compiled and integrated into your own programs. This add-on feature requires a separate license.


Examples of Real-life Applications

Neural networks can be used in virtually any situation where the objective is to determine an unknown variable or attribute from known observations or registered measurements (i.e., various forms of regression, classification, and time series), where there is a sufficient amount of historical data, and where there actually exists a tractable underlying relationship or a set of relationships (networks are relatively noise tolerant). In addition, neural networks can be used for exploratory analysis by looking for data clustering (Kohonen networks).

A comprehensive discussion of theoretical considerations related to the issue of when neural network applications are most likely to be successful can be found in the chapter on neural networks in the StatSoft Electronic Statistics Textbook (available on the StatSoft web site). The following list includes a selection of representative examples that by no means exhaust all areas where neural networks can be used.

  • Optical Character Recognition, including Signature Recognition (e.g., a company has developed a device which identifies signatures, using not just appearance but also pen-velocity while signing, which makes it more difficult to perpetrate fraud).
  • Image Processing (e.g., a system was developed which scanned images of London subway stations, and could tell if the station was Full, Empty, Half-Full etc. and was invariant across light conditions and presence/absence of trains).
  • Financial Time Series Prediction (e.g., one trading company has claimed to have significantly improved trading performance using Multilayer Perceptrons to predict stock prices).
  • Credit Worthiness (credit scoring; a classic problem - decide whether someone is a good credit risk, based on questionnaire information).
  • Bulk Mail Targeting (i.e., identify customers who are more likely to respond positively to a mail-out, based on database information).
  • Detection and Evaluation of Medical Phenomena (e.g., detection of epileptic attacks, estimation of prostate tumor size).
  • Condition Monitoring of Machinery (e.g., detecting when something has gone wrong with a machine based on vibration or acoustic signatures, so that preventative maintenance can be scheduled).
  • Speech Synthesis from Text (e.g., a famous early experiment was NETtalk, which learned to produce phonemes from written text).
  • Chaotic Time Series Prediction (a number of researchers have demonstrated good prediction capability on chaotic time series data).
  • Process Control (e.g., monitoring industrial process machinery and continuously adjusting control parameters).
  • Engine Management Systems (estimating fuel consumption from sensor measurements and adjusting - a form of process control).
  • Language Analysis (e.g., using unsupervised techniques to identify key phrases, words, etc. in native South American languages).
  • Real-time Triggering Systems of High Energy Physics Detectors. Neural networks are noise tolerant and allow for robust pattern recognition of particle physics data with large statistical noise.

STATISTICA Automated Neural Networks is compatible with Windows XP, Windows Vista, and Windows 7.

Minimum System Requirements

  • Operating System: Windows XP or above
  • RAM: 1 GB
  • Processor Speed: 2.0 GHz

Recommended System Requirements

  • Operating System: Windows 7
  • RAM: 4 GB or more 
  • Processor Speed: 2.0 GHz, 64 bit, dual core 

Native 64-bit versions and highly optimized multiprocessor versions are available.

 

Copyright © 2013 by StatSoft Inc.