STATISTICA






StatSoft Home Page

STATISTICA QUERY

How Do I Retrieve External Data via STATISTICA Query?

STATISTICA Query is used to easily access data from a wide variety of databases (including many large system databases such as Oracle, Sybase, etc.) using OLE DB conventions. The steps necessary to retrieve external data via STATISTICA Query are outlined below:

What Is SQL?

SQL (Structured Query Language) is a specialized programming language designed to specify criteria for selecting records for databases. It enables you to query an outside data source about the data it contains and select desired subsets. Use an SQL statement to specify the desired tables, fields, rows, etc. to return as data to STATISTICA via STATISTICA Query. For detailed information on SQL syntax, please consult an SQL manual.

What Is the Difference between ODBC and OLE DB?

The differences between these two data access protocols are technical, but in the most general terms, OLE DB is newer, more advanced, and compatible with a wider selection of database technologies. In addition, OLE DB is more general, in that it includes the ODBC functionality.

Technically speaking, ODBC (Open Database Connectivity) is designed to provide access primarily to SQL data in a multi-platform environment. OLE DB (Object Linking and Embedding Database) is designed to provide access to all types of data in an OLE Component Object Model (COM) environment. OLE DB includes the SQL functionality defined in ODBC but also defines interfaces suitable for gaining access to data other than SQL data.

A query generated via ODBC is functionally similar to a query generated by OLE DB. However, be aware that if you generate a query (via STATISTICA Query) using an ODBC connection, STATISTICA uses OLE DB to connect to the ODBC connection. Because there is an intermediate connection, ODBC queries may be slower than OLE DB queries.

How Do I Set Up a Database Connection Using the OLE DB Provider for ODBC?

The necessary steps for establishing an OLE DB database connection vary depending on the type of data and database you are using. To create a database connection using the Microsoft OLE DB Provider for ODBC Drivers, select Create Query from the Data - Get External Data menu (or the File - Get External Data menu) to display the Database Connection dialog.

Next, click the New button on the Database Connection dialog to display the Data Link Properties dialog. Select Microsoft OLE DB Provider for ODBC Drivers on the Provider tab of the Data Link Properties dialog and click Next to display the Connection tab.

On the Connection tab of the Data Link Properties dialog, select the appropriate ODBC data source name (DSN) in the Use data source name drop-down combo box. Note that in order to use this option, the ODBC drivers (appropriate for the data format to be accessed) need to be installed on your system and a DSN must already have been created. A DSN tells the computer what ODBC driver to use, where the database is located, and any logon information (if applicable). You create your DSN in Windows (i.e., outside of STATISTICA). Rather than using a previously existing DSN, you can select the Use connection string option button and enter or build an ODBC connection string. At this point, you can also enter information to log on to the server, enter the initial catalog to use, specify advanced network settings via the Advanced tab, or edit the initialization properties via the All tab if your specific data source requires this additional information.



Finally, click the Test Connection button on the Connection tab to see if you successfully connected to the ODBC data. If the test connection has succeeded, click the OK button on the Data Link Properties dialog to display the Add a Database Connection dialog.



Here, type the Name of the database connection and click the OK button. Your new database connection will now be added to the Connections list on the Database Connection dialog.

Note that for detailed documentation on the additional OLE DB Providers and the other advanced options that are available on the Data Link Properties dialog, press the F1 key on your keyboard to display the Microsoft Data Link Help®.

How Do I Query a Database Using ODBC?

To use this example, you must have a database connection already set up to the Northwind.mdb example database. (You can choose to include this database when you install Microsoft Access on your computer.)

Select Create Query from the Data - Get External Data menu (or the File - Get External Data menu) to display the Database Connection dialog. Select the database connection that you have set up to Northwind.mdb and click the OK button. At this point, STATISTICA Query will be available for you to actually create your query.

Use the Tree pane on the left side of the STATISTICA Query window to expand the Customers table and view its fields (to do this, either double-click on the word Customers or use the plus sign next to it). Double-click the fields CompanyName and Address to select those fields for your query (the fields are then selected and displayed in the Graphic pane on the right side of the dialog).

You can then use the various menu options and/or the Text Mode to further customize your query. When you are finished specifying your query, select Return Data to STATISTICA from the File menu or click the Play button to display the Returning External Data to Spreadsheet dialog.

On the Returning External Data to Spreadsheet dialog, you can name your query. You can also send your query to a new spreadsheet or to the active spreadsheet by selecting the appropriate option button. When you are finished, click the OK button and the data your query returns will be available on the specified spreadsheet.

How Do I Query a Database Using OLE DB?

Within STATISTICA, select Create Query from the Data - Get External Data menu (or the File - Get External Data menu) and click the New button on the Database Connection dialog to display the Data Link Properties dialog, which is used to create a new database connection. Note that for detailed documentation on the Data Link Properties dialog, press the F1 key on your keyboard while you are on the Data Link Properties dialog to display the Microsoft Data Link Help®.

On the Provider tab, select the driver for your OLE DB connection. To create an OLE DB connection, select the appropriate Microsoft Jet OLE DB Provider for your database. Click the Next button to continue to the Connection tab, or simply click on the Connection tab itself.

Select the database you want to connect to in the Select or enter a database name field. Then, click the Test Connection button to test your connection to the database. After the test connection has succeeded, click the OK button on the Data Link Properties dialog to display the Add a Database Connection dialog.

On the Add a Database Connection dialog, enter a name for your OLE DB connection in the Name field. When you are finished, click the OK button. Your new database connection is now listed in the Database Connection dialog. Select the connection that you just created and click the OK button.

At this point, STATISTICA Query will be available for you to actually create your query. Use the Tree pane on the left side of the STATISTICA Query window to expand the tables and view their fields (to do this, either double-click on the table names or use the plus sign next to them). Double-click the desired fields to select those fields for your query (the fields are then selected and displayed in the Graphic pane on the right side of the dialog).

You can then use the various menu options and/or the Text Mode to further customize your query. When you are finished specifying your query, select Return Data to STATISTICA from the File menu or click the Play button to display the Returning External Data to Spreadsheet dialog.

On the Returning External Data to Spreadsheet dialog, you can name your query. You can also send your query to a new spreadsheet or to the active spreadsheet by selecting the appropriate option button. When you are finished, click the OK button and the data your query returns will be available on the specified spreadsheet.

How Do I Add Joins or Criteria to My Query?

After you have specified tables and fields to be included in your query (by selecting Add from the Table menu and then selecting fields), you can add joins and/or criteria to your query. The simplest way to do this is to select Add from either the Join or Criteria menu.

A join shows how data is related between two tables and determines which cases STATISTICA Query will return as data. On the Add Join dialog,



select the tables and fields that you want to join, and the operator that determines the type of comparison that STATISTICA Query will perform on the values in the joined fields.

Criteria establishes the condition(s) that records from an external data source must meet in order to be included in the cases that are returned by the query. For example, you can specify criteria to select only records that have a certain value (such as customers in Louisville). On the Add Criteria dialog,



specify the Table, Field, Operator, and Value for your criteria.

Can I Save a Query and Use It Later on Another Dataset?

Yes, you can. When you are creating a query via STATISTICA Query, you can select Save As from the File menu to save the query as a .sqy file. When you open a different spreadsheet you can use the previously saved query to return data to the new spreadsheet by selecting Open Query from File from the Data - Get External Data menu (or from the File - Get External Data menu). In this manner, you will have access to all queries that you have written in STATISTICA Query even if you are working in a different application.

Can I Use More Than One Query on a Datafile?

Yes you can. Multiple queries based on one or many different databases can be created to return data to an individual spreadsheet. After you create a query via STATISTICA Query and select Return Data to STATISTICA from the File menu, the Returning External Data to Spreadsheet dialog is displayed.



On this dialog, you can specify exactly where you want the returned data to be put on the spreadsheet. Therefore, you can specify that the first query returns data to Variable 1 and Case 1 of the current spreadsheet. Then you are able to create another query and specify that that data is returned to Variable 5, Case 1 of the same spreadsheet. In this manner, you can maintain connections to multiple external databases simultaneously.

What Is the Difference between Graphic Mode and Text Mode in STATISTICA Query?

STATISTICA Query consists of two modes of operation, Graphic Mode and Text Mode. The default mode of operation is Graphic Mode, and for most users is the only mode of STATISTICA Query ever needed to do advanced querying.

The Graphic Mode of STATISTICA Query provides you with a very intuitive, graphical means to create your query. Using this simple user interface, you can create very intricate SQL statements. STATISTICA Query automatically builds the SQL statement for you as you select the components of your query with only a few mouse clicks and/or intuitive menu options and dialogs. Users with limited knowledge of SQL will therefore be able to create advanced and powerful queries of their data in a quick and straightforward manner.

For users who currently have a detailed knowledge of the more complex commands of SQL language, the Text Mode is available for further editing and/or customization of the SQL statement that has been created in the Graphic Mode. To manually enter or remove text in the SQL statement with the keyboard, switch to the Text Mode (by toggling the Graphic Mode command on the View menu). Once you are in Text Mode, you can build SQL statements including options that are not available in Graphic Mode (such as outer joins, using mathematical operators to create calculations in criteria, sorting rows by one or more columns, etc.). Therefore, by using both the Graphic Mode and the Text Mode of STATISTICA Query, you can create advanced queries of your data allowing you to optimize the valuable information that is contained in multiple external data sources.

Back to Top
Request Quote
StatSoft Home Page



If your question is not answered above, please e-mail us, or call (918) 749-1119 Monday through Friday, 9:00AM to 5:00PM Central Time. Please include your serial number (select About STATISTICA from the Help menu to view your serial number) and information about your hardware (the type of processor (CPU), the amount of memory (RAM), and disk space) and the version of the operating system that you are using. If you live outside North America, please contact your local StatSoft office. (Please note: Technical Support is only for users of the latest version of the program.)


[StatSoft]
2300 East 14th Street, Tulsa, OK 74104
Phone: (918) 749-1119; Fax: (918) 749-2217

[StatSoft]e-mail: info@statsoft.com

©Copyright StatSoft, Inc., 1984-2006.
StatSoft, StatSoft logo, STATISTICA, SEWSS, SEDAS, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.