Purpose and Advantages of In-Place Database Processing (IDP)

Frequently asked questions on IDP

In-Place Database Processing (IDP) is a database access technology developed at StatSoft to provide a high-performance, direct interface between external data sets residing on remote servers and the analytic functionality of STATISTICA products. It gives access to data in large databases in a single step, without creating local copies of the data set. IDP significantly increases the performance of STATISTICA and is particularly well suited to large data mining and exploratory data analysis tasks. It also provides a security advantage: the data remain in the secure database at all times.

The speed gains of IDP over traditional data access have two sources. First, IDP lets STATISTICA access data directly in the database, skipping the otherwise necessary step of importing the data and creating a local data file. Second, its architecture is "multitasking" (technically, asynchronous and distributed processing): IDP uses the processing resources (multiple CPUs) of the database server computers to execute the query operations, extract the requested records, and send them to the STATISTICA computer, while STATISTICA simultaneously processes these records as they arrive.
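The overlap between extraction and analysis can be illustrated with a small producer/consumer sketch. It is not IDP code: the function names, the bounded queue, and the running-mean "analysis" are assumptions chosen only to show one side emitting records while the other processes them as they arrive.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the record stream


def server_extract(records, out_queue):
    """Stand-in for the database server: emit records one at a time."""
    for rec in records:
        out_queue.put(rec)
    out_queue.put(_DONE)


def client_running_mean(in_queue):
    """Stand-in for the analysis: update a sum and count as records arrive."""
    count, total = 0, 0.0
    while True:
        rec = in_queue.get()
        if rec is _DONE:
            break
        count += 1
        total += rec
    return count, (total / count if count else float("nan"))


def run_pipeline(records, buffer_size=4):
    # Bounded queue: the producer waits whenever the consumer falls behind.
    q = queue.Queue(maxsize=buffer_size)
    producer = threading.Thread(target=server_extract, args=(records, q))
    producer.start()
    result = client_running_mean(q)  # consumes while the producer is still running
    producer.join()
    return result


if __name__ == "__main__":
    print(run_pipeline([2.0, 4.0, 6.0, 8.0]))
```

No complete local copy of the data is ever assembled in this sketch; only the small buffer between the two sides holds records at any moment.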

Compatibility with STATISTICA products

IDP can be used with both desktop and enterprise versions of STATISTICA products, and it is fully compatible with the client-server architecture of WebSTATISTICA: requests can be made over the Web, and the data are processed asynchronously by WebSTATISTICA Server computers connected to the next-tier database server computers that execute the queries. IDP is also optimized to integrate with STATISTICA Data Miner, which supports multiple IDP data input channels.

Architecture and Programmability

The IDP technology is implemented around a COM object that wraps an instance of a Microsoft ActiveX Data Objects (ADO) Recordset object and implements a subset of the Spreadsheet COM interface in the STATISTICA Object Library. This works because all STATISTICA Analyses access the source Spreadsheet data via the Spreadsheet interface (strictly speaking, via the InputSpreadsheet interface, which exposes a subset of the Spreadsheet interface methods; it is normally hidden in the Object Browser but can be seen by right-clicking in the Object Browser and selecting "Show Hidden Members"). To a STATISTICA Analysis, therefore, the IDP looks just like a Spreadsheet. Indeed, advanced users of STATISTICA could wrap an InputSpreadsheet interface around any data source at all and perform STATISTICA Analyses on it programmatically via the STATISTICA Object Model.
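The wrapping idea is the familiar adapter pattern, and a minimal sketch may make it concrete. Everything here is hypothetical: `InputSheet`, `ListBackedSheet`, and `column_means` are illustrative names, not members of the STATISTICA Object Library, and the three-method interface is an assumed simplification of what an analysis needs from its input.

```python
from typing import Protocol, Sequence


class InputSheet(Protocol):
    """Hypothetical minimal read-only interface an analysis depends on."""

    def n_cases(self) -> int: ...
    def n_vars(self) -> int: ...
    def value(self, case: int, var: int) -> float: ...


class ListBackedSheet:
    """Adapter exposing a plain list of rows through the InputSheet interface."""

    def __init__(self, rows: Sequence[Sequence[float]]):
        self._rows = rows

    def n_cases(self) -> int:
        return len(self._rows)

    def n_vars(self) -> int:
        return len(self._rows[0]) if self._rows else 0

    def value(self, case: int, var: int) -> float:
        return self._rows[case][var]


def column_means(sheet: InputSheet) -> list[float]:
    """An 'analysis' that uses only the interface, never the storage behind it."""
    n = sheet.n_cases()
    return [
        sum(sheet.value(c, v) for c in range(n)) / n
        for v in range(sheet.n_vars())
    ]


if __name__ == "__main__":
    print(column_means(ListBackedSheet([[1.0, 10.0], [3.0, 20.0]])))
```

Because `column_means` sees only the interface, the list-backed adapter could be swapped for one backed by a database cursor without changing the analysis.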

Behind the scenes, the spreadsheet wrapper object must take certain steps to make Analyses work seamlessly. For instance, if an Analysis requires the number of cases in a Recordset before that information is known, then one of two things happens: either a separate "count" query is executed synchronously (i.e., the Analysis must wait until the count query returns before continuing) and its result is returned to the Analysis, or some arbitrary upper bound on the case count is returned immediately. This behavior is configurable on the IDP page of the STATISTICA options dialog. Also, if a forward-only cursor is used (see below) and the Analysis must make multiple passes through the data or access the data in random order, then any request for a previous case (row) forces the IDP to requery the database and advance the cursor forward to the requested case, since the cursor cannot be scrolled backwards. The Analysis simply waits until this process is completed and the requested data are provided to it.
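Both behaviors can be sketched with Python's built-in `sqlite3` module, whose cursors also move forward only. This is an illustration under assumptions, not the IDP implementation: the class `ForwardOnlyRows`, its `count_first` and `upper_bound` parameters, and the `requeries` counter are all hypothetical names.

```python
import sqlite3


class ForwardOnlyRows:
    """Indexed row access over a cursor that can only move forward.

    Asking for a row before the current position re-runs the query and
    skips forward again to the requested row.
    """

    def __init__(self, conn, query, count_first=True, upper_bound=10**9):
        self._conn = conn
        self._query = query
        self._count_first = count_first
        self._upper_bound = upper_bound
        self.requeries = 0
        self._open()

    def _open(self):
        self._cursor = self._conn.execute(self._query)
        self._next_index = 0  # index of the row the next fetch will return

    def case_count(self):
        if self._count_first:
            # Synchronous count query: the caller blocks until it returns.
            sql = f"SELECT COUNT(*) FROM ({self._query})"
            return self._conn.execute(sql).fetchone()[0]
        # Otherwise answer immediately with an arbitrary upper bound.
        return self._upper_bound

    def row(self, index):
        if index < self._next_index:
            # Cannot scroll back: requery and advance from the start.
            self._open()
            self.requeries += 1
        while self._next_index < index:
            if self._cursor.fetchone() is None:
                raise IndexError(index)
            self._next_index += 1
        rec = self._cursor.fetchone()
        if rec is None:
            raise IndexError(index)
        self._next_index += 1
        return rec


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
    rows = ForwardOnlyRows(conn, "SELECT x FROM t ORDER BY x")
    first_pass = [rows.row(i)[0] for i in range(5)]  # sequential: no requery
    again = rows.row(1)[0]                           # backward: one requery
    return rows.case_count(), first_pass, again, rows.requeries
```

A sequential pass costs nothing extra, while each backward request pays for a full requery plus a skip. An analysis that makes several passes or accesses cases in random order therefore benefits from a scrollable cursor where one is available.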

IDP Type Library - Two Main Interfaces

The first interface, DBTable, provides programmatic access to the IDP Document, much as the Macro, Graph, and Spreadsheet interfaces provide access to STATISTICA Macros, Graphs, and Spreadsheets. In addition to the standard document methods and properties (Visible, Activate, Close, etc.), it provides access to all IDP-specific options (cursor type, location, query string, etc.). Its read-only property "Spreadsheet" returns the Spreadsheet wrapper around the ADO Recordset.

The second interface is DBSpreadsheet. This interface is used internally by the IDP to create the Spreadsheet wrapper object, and it could also be used by users writing their own macros or programs, although in most cases the DBTable interface is sufficient and will itself use a DBSpreadsheet object. This interface has two methods, Open and CreateNew. Open executes the supplied query, opens an ADO Recordset, creates a Spreadsheet wrapper object, attaches the Recordset to it, and returns this Spreadsheet object. CreateNew creates a Spreadsheet wrapper object that is not attached to any Recordset and is therefore not usable until you call its "SetRecordset" method to attach an ADO Recordset object of your own creation.
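The difference between the two construction paths can be mimicked in a few lines of Python. This is only an analogy: `SheetWrapper`, `SheetFactory`, and their methods are hypothetical stand-ins, a `sqlite3` cursor plays the role of the ADO Recordset, and none of the signatures below are taken from the IDP type library.

```python
import sqlite3


class SheetWrapper:
    """Hypothetical stand-in for the Spreadsheet wrapper object."""

    def __init__(self):
        self._recordset = None

    def set_recordset(self, recordset):
        self._recordset = recordset

    @property
    def usable(self):
        return self._recordset is not None

    def rows(self):
        if self._recordset is None:
            raise RuntimeError("no recordset attached; call set_recordset first")
        return self._recordset.fetchall()


class SheetFactory:
    """Hypothetical analogue of the two construction paths."""

    @staticmethod
    def open(conn, query):
        # Path 1: run the query, then hand back a wrapper that is ready to use.
        sheet = SheetWrapper()
        sheet.set_recordset(conn.execute(query))
        return sheet

    @staticmethod
    def create_new():
        # Path 2: hand back an empty wrapper; the caller attaches a recordset.
        return SheetWrapper()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,)])

    ready = SheetFactory.open(conn, "SELECT x FROM t ORDER BY x")
    empty = SheetFactory.create_new()
    usable_before = empty.usable
    empty.set_recordset(conn.execute("SELECT x FROM t WHERE x > 1"))
    return ready.rows(), usable_before, empty.rows()
```

The second path is the one to choose when the recordset needs settings that the one-step path does not expose, since the caller builds it before attaching it.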

Copyright (c) 2012 www.statsoft.com Privacy Statement   |  Terms Of Use