StatSoft Home Page
Can I Open More Than One Input Datafile Simultaneously?
Can I Use the Results of One Analysis to Perform Another Analysis?
Can I Create a Custom List to Use for Extrapolation?
What Is a Matrix Spreadsheet?
Does STATISTICA Support Drag-and-Drop Opening of Spreadsheets and Graphs?
How Much Disk Space Is Necessary to Perform Database Management Operations?
Can I Undo Editing Operations?
What Is the Difference between the File Header and a Customized Header?
What Graphs Are Available from Spreadsheets?
How Do I Insert an Object into the Spreadsheet?
What Are Cases and Variables?
Why Do You Need Different Variable Types?
How Do I Add/Delete Variables (Columns of Data)?
How Do I Add/Delete Cases (Rows of Data)?
Can I Select (Highlight) the Entire Spreadsheet?
Can I Edit the Contents of a Cell (and Not Overwrite It)?
How Are Dates Represented in STATISTICA Datafiles?
How Is Time Represented in STATISTICA Datafiles?
How Do I Convert Date-Values into Text Labels and Vice-Versa?
How Do I Review and Edit Variable Specifications (Names, Formats, Notes, Formulas, etc.)?
How Are Missing Data Handled in STATISTICA?
How Do I Change the Missing Data Code for Individual Variables?
How Do I Edit Case Names?
How Do I Create a New Cell/Block Format?
How Do I Apply a Format?
How Do I Create a New Spreadsheet Layout?
How Do I Apply a Spreadsheet Layout?
Can I Label Numeric Values?
How Do I Enter/Edit the Assignments between Numeric Values and Text Labels?
Can I Switch between Displaying Text Labels and Numeric Values in the Data Spreadsheet
Can I Copy a Set of Text Labels, Numeric Values, and Descriptions to Other Variables and Files?
Can Clipboard and Drag-and-Drop Operations in the Spreadsheet Affect Values Outside the Range that is BEing Pasted or Dropped?
What Are the Drag-and-Drop Facilities?
Can I Rearrange Blocks of Data or Ranges of Cases and Variables in a Datafile?
What Is the Difference Between Copy and Copy with Headers?
Do STATISTICA Spreadsheets Support Slipt Display?
How Can I Expand a Block in the Spreadsheet Outside the Current Screen?
Can I Expand the Datafile Using Drag-and-Drop?
How Do I Split a STATISTICA Datafile into Smaller Files?
How Do i Merge Two STATISTICA Datafiles?
Can I Merge the Text Labels/Numeric Values from Two Files?
How Do I Calculate Values of a Variable?
Can STATISTICA Automatically Recalculate All Spreadsheet Formulas When the Data Change?
What Is the Simplest Way to Recode Values of a Variable (e.g., Split a Continuous Variable into Categories)?
How Do I Create Values of a New Variable Based on Conditions Met by Other Variables?
What Syntax Can Be Used to Create Case Selection/Verification/Recode Conditions (or to Calculate Variable Values Using Spreadsheet Formulas)?
How Can I Verify and "Clean" Data?
How Do I Perform a Multiple Sort?
How Do I Rank-Order Values of a Variable (Replace Values with Their Ranks)?
How Do I Transpose Data (Convert Cases Into Variables)?
How Do I Automatically Fill Ranges of Data in the Spreadsheet?
How Does the Extrapolation of Blocks (AutoFill) Work?
How Do I Standardize Values in a Block?
How Do I Select Variables for an Analysis?
Can Variables Be Selected for Analyses by Highlighting Them in the Spreadsheet?
How Do I Select a Subset of Cases (Observations) to Be Included in an Analysis?
How Are Case Selection Conditions Stored and Saved?
Can I Select Random Subsets of Data?
What Is the Quickest Way to Review Basic Descriptive Statistics for a Variable?
What Is the Quickest Way to Review a Sorted List of All Unique Values of a Variable?
What Are Statistics of Block Data?
Can Matrix Data Be Used For Input Instead of Raw Data?
How Can I Access Data from Excel and Other Foreign Datafiles?
How Do I Export Data from STATISTICA to Excel and Other Foreign Datafiles?
Can STATISTICA Access Data from Enterprise Databases?
Can STATISTICA Process Files That Are Larger Than the Local Hard Drive?
Does STATISTICA Support ODBC?
How Do I Set Up OLE Links between STATISTICA and Other Windows Applications?
How Do I Set Up DDE Links between STATISTICA and Other Windows Applications?
Can I Open More Than One Input Datafile Simultaneously?
Yes. Moreover, you can simultaneously run different analyses on the same or different datafiles. To open (and use) more than one input datafile simultaneously, open multiple datafiles and make sure each is specified as an input datafile (by selecting Input Spreadsheet on the Data menu) as you start the respective analyses.
For example, open the file Adstudy.sta and select Basic Statistics/Tables from the Statistics menu. When the Basic Statistics and Tables (Startup Panel) is displayed, double-click Descriptive Statistics on the Quick tab and calculate summary statistics for the variable Measure01.
Next, open the file Aggressn.sta and once again select Basic Statistics/Tables from the Statistics menu. This time, double-click t-test, independent, by groups and perform an independent t-test with Gender as the grouping variable and Aggressn as the dependent variable.
You will notice that the analyses stay associated with the different datafiles when you switch between them (using the buttons next to the STATISTICA Start button, above the status bar). Note that you can also perform different analyses on the same or different datafiles. For example, you could also perform analysis of variance on Adstudy.sta or on some other datafile.
While you can specify any number of spreadsheets as input datafiles, only one file (per workbook) can be designated as the active input datafile.
This means that if all your spreadsheets are opened within one workbook, only one of them can be used for input data at a time. The active input data spreadsheet is highlighted in the workbook tree with a red square, as shown above.
Can I Use the Results of One Analysis to Perform Another Analysis?
Yes. To designate a spreadsheet (i.e., the results from one
analysis) as an input spreadsheet, select the spreadsheet (e.g., ensure the
spreadsheet has the focus) and select Input spreadsheet from the Data menu. Now you can specify an analysis, and STATISTICA uses the selected spreadsheet as the input
spreadsheet. Note that if you switch back to another spreadsheet that has
previously been designated as an input spreadsheet, it can still be used for
analyses as well.
While you can
specify any number of spreadsheets as input datafiles, only one file (per
workbook) can be designated as the active input datafile. This means that if all
your spreadsheets are opened within one workbook, only one of them can be used
for input data at a time. The active input data spreadsheet is highlighted in
the workbook tree with a red square as shown in Can I
Open More Than One Datafile Simultaneously? (above).
Can I Create a Custom List to Use for Extrapolation?
Yes. Select the Custom Lists
tab of the Options dialog
(accessible via Tools -
Options) to edit or create custom lists for STATISTICA to use.
The Custom
lists box reports all the custom lists currently available in STATISTICA. To see the entries included in a custom
list, select it here and the entries are shown in the List entries box. To
create a new list, select NEW
LIST and enter the list items for it in the List entries box. Once you have added all the items
for the new list, click the Add button and the list is added to those in the Custom lists box (see Mon., Wed., Fri. in the above
illustration). When entering data in spreadsheets, you can enter the first few
items in the list, and then use extrapolation to fill in the remaining cases in
the spreadsheet (as shown below).
What Is a Matrix Spreadsheet?
A STATISTICA Matrix Spreadsheet (*.smx) is a spreadsheet file with a strictly defined
format that can be used as input in several modules of STATISTICA (e.g., Cluster Analysis, Multidimensional Scaling, SEPATH).
Matrix spreadsheets have to meet the following conditions:
1. The number of cases (rows) = the number of variables
(columns) + 4.
2. The matrix
must be a square matrix and the case names should be the same as the variable
names.
3. The last four cases
contain the following case names and information:
Means:
The mean of each variable is given in this row; this case can be left empty
(i.e., do not enter anything in this row) for Similarities and Dissimilarities
matrices.
Std.Dev.: The
standard deviation of each variable is given in this row; this case can be left
empty (i.e., do not enter anything in this row) for Similarities and
Dissimilarities matrices.
No.Cases: This required entry is the number of cases
from which the matrix was produced, not the number of cases (rows of data) in
this matrix file.
Matrix: This
required entry represents the type of matrix file; 1 = Correlation, 2 =
Similarities, 3 = Dissimilarities, and 4 = Covariance.
When entering these last four cases
into the matrix file manually, be sure to spell the case names exactly as they
appear above (i.e., Means, Std.Dev., No.Cases, and Matrix).
While a matrix spreadsheet looks identical to a regular
spreadsheet, analyses that require matrix input (such as Multidimensional
Scaling) only consider input spreadsheets that are also matrix spreadsheets.
You can create a matrix spreadsheet
from any place that allows you to create a matrix of results [e.g., Basic Statistics - Correlations,
Multiple Regression, Cluster Analysis, Multidimensional Scaling, and Structural
Equation Modeling (SEPATH)]. To create the matrix, click the Matrix button. Note that
clicking this button does not cause STATISTICA to
save the matrix file to disk, rather it causes STATISTICA to create the special matrix spreadsheet.
This matrix spreadsheet is always displayed in an individual window (regardless
of the settings on the Options -
Output Manager tab or the Analysis/Graph Output Manager dialog). You can,
however, add the spreadsheet to a workbook or report using the Add to Workbook or Add to Report button,
respectively.
When you save a
matrix spreadsheet (using Save or Save As from the File menu), STATISTICA
detects that it is in this special matrix form and sets the default file
extension to *.smx.
Does STATISTICA Support Drag-and-Drop Opening of Spreadsheets and Graphs?
Yes. In Windows Explorer, you can simply drag any STATISTICA Spreadsheet (file name extension *.sta) or graphics file (file name extension *.stg) onto the STATISTICA
icon to automatically open the file in STATISTICA.
How Much Disk Space Is Necessary to Perform Database Management Operations?
In order to allow you to revert to the original datafile
after file editing, STATISTICA creates temporary and
backup files. Thus, in order to edit a datafile, STATISTICA needs at least twice as much free disk space
as the size of the file to be edited. More disk space is necessary if you
perform large-scale editing operations (e.g., changing large blocks) on very
large datafiles. To allow for the Undo function, STATISTICA
needs to store multiple copies (up to 32) of the modified sections of data. Some
operations (e.g., import/export) use intermediate scratch files, so they may
need additional space.
Can I Undo Editing Operations?
You can undo most operations such as editing, moving, or copying blocks; random fill; recoding or
ranking variables; etc. by selecting Undo from the spreadsheet Edit menu or pressing CTRL+Z. Multi-level undo is
supported (with 32 buffers), so you can undo up to 32 of the most recent
spreadsheet operations.
What Is the Difference between the File Header and a Customized Header?
File Header. The file header is located immediately above the variable headers at the top of the spreadsheet
window. You can use this header to include a summary title or other
identification of the datafile (or results). Double-click in the header area (or
press F2) to edit/enter information.
To select the header only (for formatting), click once in
the upper-left corner (the mouse pointer will be an outlined plus sign). Press
CTRL+ENTER or ALT+ENTER to enter a new line (note that you need to extend the
height of the field to see new lines that you are adding).
In addition to adding new lines to
the file header, STATISTICA Spreadsheets (and all
other STATISTICA documents) can easily be annotated
by entering notes into the Comments area of the Document Properties (on the Summary tab) dialog, which is
accessed by selecting Properties from the File menu. The Comments area can hold any type of comments or
supplementary information.
Customized header. A
customized header is used when printing spreadsheets (or other STATISTICA
documents, e.g., reports, graphs, etc.). This header can include a title, page
numbers, time of printing, date, the name of the file, and any custom text you
might want to include. To create a header, select Header/Footer from the View menu. For more information, see the Modify Header/Footer dialog
in the Electronic Manual. Note that while the file
header is associated only with spreadsheets, any printed STATISTICA document can have a customized header.
What Graphs Are Available from Spreadsheets?
A variety of graphs is available from the spreadsheet shortcut menu (accessed by
right-clicking in any cell or selected block of the spreadsheet). These graphs
belong to two broad categories, Graphs of Block Data and Graphs of Input Data, and they produce either
summary graphs from rows or columns of the currently highlighted block, or
graphs as the original input data identified by the currently selected row
and/or column of the spreadsheet (respectively).
How Do I Insert an Object into the Spreadsheet?
You can insert either a new or an existing object into the spreadsheet. To do this, select Object from the Insert menu to display the Insert Object dialog. To
insert a new object, select the Create New option button, select the type of object
to be created from the Object
Type box, and click the OK button. In the following example, a Bitmap Image will be inserted
into the spreadsheet.
Note that STATISTICA will now
emulate Microsoft Paint, and you can edit the new bitmap image using Paint
utilities (as shown below).
To return to the standard STATISTICA user interface, click a cell in the
spreadsheet. (Note that if the Display As Icon check box is selected on the Insert Object dialog, then
the object is displayed as an icon in the spreadsheet, and when you double-click
the icon, the application in which it was created will be opened.)
To insert an existing object into
the spreadsheet, select the Create
from File option button on the Insert Object dialog. Next, click the Browse button to display the
Browse dialog and select
the file to be displayed in the spreadsheet. You can also select the Display As Icon check box to
display the selected file as the icon that is pictured below the Display As Icon check box.
Click the Change Icon
button to specify a new icon. To link the object to the original file (so that
changes in that file will automatically be reflected in the object that is
displayed in STATISTICA), select the Link check box.
Finally, click the OK button.
Double-click on the object to edit
it. It will be opened in the application in which it was created (unless both
the Link and the Display As Icon check boxes
are cleared, in that case, STATISTICA will emulate
the application in which the object was created, as shown in the insertion of
the Bitmap Image
above).
What Are Cases and Variables?
STATISTICA datafiles are organized into cases and variables. If you are not familiar with this notation, you can think of cases as the equivalent of records in a database management program (or rows of a spreadsheet) and variables as the equivalent of fields (columns of a spreadsheet). Each case consists of a set of values of
variables.
Why Do You Need Different Variable Types?
The difference between the text and the numeric types is straightforward; however, the main reason for
having three types of numeric values is the storage efficiency. For most
datafiles, that is not important and, thus, using the default (Double) data type is
recommended. However, for very large datafiles, being able to switch to a 2 (or
even 8) times more efficient storage (by using different data types) could make
a difference between being able to perform the necessary analysis on a specific
computer system or not.
How Do I Add/Delete Variables
(Columns of Data)?
Perhaps
the easiest way to add variables to an existing datafile is to expand the
spreadsheet beyond the existing columns and double-click on the gray area of the
spreadsheet. This will display the Add Cases and/or Variables dialog.
Specify the number of variables to
add and click the OK
button. The new variables are added to the far right of the datafile.
Variables can also be added to
and/or deleted from a datafile either by (1) using the drag-and-drop operation
(insert to add columns), or (2) selecting the appropriate option from the
spreadsheet menu of global operations on variables, accessed by clicking the Vars button on the Spreadsheet toolbar.
How to add variables before variable 1. The quickest
way to add variables before variable 1 is to select Add Variables from the spreadsheet Insert menu. In the After box of the Add Variables dialog, enter
0, which references the
variable number. If the variable number is 0, it effectively means that you are inserting
before the first variable.
Once you click the OK button, a new variable is inserted before the
first variable of your spreadsheet.
Note that the name of the new variable is the name specified
in the Name box of the Add Variables dialog (in this example, NewVar).
How Do I Add/Delete Cases (Rows of
Data)?
Perhaps the easiest
way to add cases to an existing datafile is to double-click on the gray area of
the spreadsheet. This displays the Add Cases and/or Variables dialog.
Specify the number of cases you want
to add and click the OK
button. The new cases are added to the bottom of the datafile.
Cases can also be added to and/or
deleted from a datafile either by (1) using the drag-and-drop operation (insert
to add cases), or (2) selecting the appropriate option from the spreadsheet menu
of global operations on cases, accessed by clicking the Cases button on the Spreadsheet toolbar.
How to add
cases before case 1. The quickest way to add cases before case 1 is to
select Add Cases from the
spreadsheet Insert menu.
In the Insert after case
box of the Add Cases
dialog, enter 0, which
references the case number. If the case number is 0, it effectively means that you are inserting
before the first case.
After you click the OK button, a new case is inserted before the first
case of your spreadsheet.
Note that the name of the new case is 1. To change this,
double-click on the 1 and
enter a new name.
Can I Select (Highlight) the
Entire Spreadsheet?
Following
standard spreadsheet conventions, click in the upper-left corner of the
spreadsheet (in the Info
Box). This shortcut is useful, for example, when you intend to copy the
entire file to the Clipboard or reformat the entire spreadsheet. You can also
press CTRL+A or select Select
All from the spreadsheet Edit menu or shortcut menu (accessed by
right-clicking in the Info
Box).
Note that by clicking twice in the Info Box, you deselect (i.e., select and then
deselect) the entire spreadsheet.
Can I Edit the Contents of a Cell
(and Not Overwrite It)?
Follow the standard spreadsheet
(e.g., Microsoft Excel) conventions, specifically:Double-click on the cell. To avoid clearing the contents
of the cell at the point when you start entering a correction, double-click on
the cell before you start typing. This action enters the edit mode and positions
the cursor within the cell.
Press F2. Alternatively, you can follow the "old"
spreadsheet convention and press the F2 key, which also enters the edit mode for
the currently highlighted cell.
How Are Dates Represented in STATISTICA Datafiles?
Date values of variables are stored internally in Julian
format, that is, as a single integer value that represents the number of days
that have passed since January 1, 1900. For example, a date entered and
displayed as 1/21/1968 is
stored as the Julian date 24858; the (optional) decimals are interpreted as
time. Date values stored in this manner can be used in subsequent analyses
(e.g., in Survival
Analysis in order to calculate survival times, see below) and
transformed using arithmetic operations; at the same time, they can be displayed
as dates in reports or graphs (e.g., used to label scale values).
Julian date values can be displayed
in the spreadsheet in numeric (Julian) format or in one of several predefined
date display formats (e.g., 1/6/64, 6-Jan-64, Jan-1964, 01/06/64, 01/06/1964,
6-Jan).
To change the date display format, select Date in the Display format group of the
Variable specifications
dialog (accessible by double-clicking on the variable name in the spreadsheet or
selecting Variable Specs
from the Tools menu) and
choose one of the predefined display formats. When you enter data into a new
variable using a format that is recognized as valid in the Display format group of the
Variable specifications
dialog (e.g., Time, Date,
Currency, etc.), STATISTICA displays the Auto Format Cells dialog. In
this dialog, you can either apply the format to the individual cell or to the
whole column.
You can create a single date variable from two (month, year
or day, month) or three (day, month, year) variables as well as split a single
date variable into two or three variables via the Date Operations dialog (accessed from the Data - Date Operations
menu).
Refer to the on-line Electronic
Manual (press the F1 key or click the Help button) for examples of creating date variables
from numeric variables and splitting date variables into numeric variables.
How Is Time Represented in
STATISTICA Datafiles?
Time
values of variables are stored as (optional) decimal values representing the
fraction of the day since midnight; for example, 6:00 A.M. is stored as 0.25.
Time values stored in this manner can be used in subsequent analyses and
transformed using arithmetic operations; at the same time, they can be displayed
as times in reports or graphs (e.g., used to label scale values).
STATISTICA formats the display of time values according
to the current settings of the Windows Control
Panel.
See the previous topic (How Are Dates
Represented in STATISTICA Datafiles) for a
discussion of the Auto Format
Cells and the Date
Operations dialogs. Refer to the on-line Electronic Manual (press the F1 key or click the Help button in the Date Operations dialog) for
examples of creating time variables from numeric variables and splitting time
variables into numeric variables.
How Do I Convert Date-Values into
Text Labels and Vice-Versa?
In some circumstances, it can be useful to create text
labels with date information [e.g., when using a date variable as a coding
variable with codes greater than 32,000 (e.g., As Values of Grouping Variables)?
above]. In this case, you can transform the date variable into a variable
containing date text labels with numeric equivalents in a range that allows them
to be used as codes by all procedures of STATISTICA
(i.e., numeric values less than 32,000). Use the Dates Operations - Text Dates tab to convert dates
into codes.
This dialog is accessed by selecting Date Operations from the Data menu.
How Do I Review and Edit Variable
Specifications (Names, Formats, Notes, Formulas, etc.)?
Specifications of a single variable. Double-click on a
variable name in a spreadsheet to display the Variable specifications dialog where you can change the
variable name, font format, display format, missing data value, long label, or
formula for the current variable. (As in most other facilities commonly used in
STATISTICA, this dialog can also be accessed from the
toolbar, shortcut menus, and menus.)
This dialog can also be used to access descriptive
statistics and a listing of sorted numeric values and text labels for the
current variable (click the Values/Stats button). From this dialog, you can also
access the Text Labels
Editor (click the Text
Labels button) to review and change the assignments between the text and
numeric values for the current variable. Note that you can click the Scroll left or Scroll right button to scroll
through each variable in your datafile.
Specifications of all variables.
You can also click the All
Specs button in the Variable specifications dialog to display an
editable, combined table of specifications of all variables in the current
datafile. Note that selecting All
Variable Specs from the Data menu or All Specs from the Vars button on the Spreadsheet toolbar can also access this dialog.
The table format is convenient when
you need to compare or edit specifications of several variables, especially when
you need to paste and copy between variables (e.g., comments, formulas, or
links), or extend a format definition or missing data code from one variable to
subsequent variables.
The shortcut menu contains commands to Add Vars, Delete Vars, Cut, Copy,
Paste, and Fill/Copy
Down.
How Are Missing Data Handled in
STATISTICA?
Specifying missing data
codes. A missing data code, that is, a value that signifies that there
is no data for a particular case and variable (displayed as a blank cell in the
spreadsheet), can be specified separately for each variable. To change the code,
double-click on a variable name in the spreadsheet to access the Variable specifications
dialog for that variable. Alternatively, click the Vars toolbar button and select All Specs from the resulting
menu to display the Variable
Specifications Editor dialog for all variables. The default Missing Data code in STATISTICA (used when new files are created, new
variables are added, or data are imported) depends on the variable type being
used.
Processing cases with missing data. The way in which
missing data are handled when processing data can be adjusted individually for
each analysis (see the MD
deletion group in most analysis definition dialogs). Whenever
applicable, you have the choice to eliminate them from calculations in a
casewise or pairwise manner, substitute them with means, or reconstruct or
interpolate them (e.g., in the Time Series module).
Press the F1 key or click the help button in the respective
analysis-definition dialog to learn about the specific missing data handling
options available for the procedures of interest.
How Do I Change the Missing Data
Code for Individual Variables?
The value used to designate missing data values for
individual variables can be changed in all Variable specifications dialogs.
How Do I Edit Case Names?
In order to enter or edit the
current case names, double-click on any case name in the spreadsheet. To manage
(e.g., change the width, copy from a variable, etc.) case names, right-click on
any case name and select Case
Names Manager from the resulting shortcut menu to display the Case Names Manager dialog.
The Case Names Manager
command is also available from the Cases toolbar button and the Tools menu.
If the current datafile contains no
case names, you can create them by double-clicking on the case number column or
resizing the case number column with the mouse pointer.
How Do I Ceate a New Cell/Block
Format?
A format is a group
of formatting options that you can save with an assigned name for future use.
Formats can be assigned to numbers as well as text. To create a new format in a
spreadsheet, follow these instructions:
Select Format
Manager from the Format -
Block menu to display the Format Manager dialog.
In this dialog, click the New button to display the Format Name dialog.
Enter a name in the Name box, and then click the OK button to display the Edit Format dialog.
This dialog contains four tabs
(Number, Alignment, Font, and Border) that are used to specify the formatting
components for the new format. Select the desired formatting options for your
new format. In the illustration above, the Bold check box has been selected in
the Style/Effect group and the Text Color box has been changed to navy.
Click the OK button to close the Edit Format dialog. The format that you just created
is added to the list of formats available on the Format Manager dialog and the Format - Block menu.
You can now apply this format to new
selections (see the next section).
How Do I Apply a Format?
Formats can be applied quickly and
consistently to segments of data. Once a format has been created, it can be
applied to any part (or all) of the data in a spreadsheet.
To apply an existing format, first
select the cell(s) that you want to format in the spreadsheet. Then select Block from the Format menu to display the
list of available formats. (Optionally, right-click on the selected cell(s) and
choose Format from the
shortcut menu, as displayed below.) Finally, select the name of the format to
apply to the selected cells.
How Do I Create a New Spreadsheet
Layout?
A spreadsheet layout
is a convenient way to combine sets of formatting options into one collection.
This feature can be used to specify and apply different formats for different
components of a spreadsheet. Once you have created a spreadsheet layout, it can
be applied to any spreadsheet for a consistently formatted appearance. Follow
these instructions to create a new spreadsheet layout:
Select Layout Manager from the Format - Spreadsheet menu to display the Spreadsheet Layouts
dialog.
In this dialog, click the New button to display the Spreadsheet Layout Name dialog.
Enter a name in the Name box, and then click the
OK button to display the
Edit Spreadsheet Layout
dialog.
In the Formats group, a default Format 1 will be displayed. Select the area of the
spreadsheet to be affected by Format 1 in the Areas of Spreadsheet affected by format group. In
the above illustration, only the Variable Header check box is selected. Note that you
can control the range of variable headers to be affected by Format 1 by using the Range of variables group.
Next, click the Edit button in the Format group to display the
Edit Format: Format 1
dialog.
Select any formatting options to be associated with Format 1. Above, Comic Sans MS has been chosen
as the Font, 10 for the Size, white for the Text Color, and blue for the Background color. Note that
the Preview group is used
to view your specified format.
Click the OK button to accept the settings and return to the
Edit Spreadsheet Layout
dialog.
To define additional formats for other areas of the
spreadsheet, click the New
button to display the New Format:
Format 2 dialog and select any formatting options to be associated with
Format 2. In this manner, you can add additional formats to the spreadsheet
layout as necessary. In the illustration above, Format 2 (Font - Comic Sans MS, Size - 10, Text Color -
white, and Background color - blue) has been applied to the Case
Headers.
When you are satisfied
with the appearance of your spreadsheet layout, which is displayed in the Preview group, click the OK button. The spreadsheet
layout that you just created is added to the list of formats available on the Spreadsheet Layouts dialog
and the Format -
Spreadsheet command.
You can now apply this spreadsheet layout to new selections
(see the next section).
How Do I Apply a Spreadsheet
Layout?
You can apply any
existing spreadsheet layout to the active spreadsheet. To do this, select Spreadsheet from the Format menu to display the
list of available spreadsheet layouts. Then, select the name of the spreadsheet
layout to apply to the spreadsheet.
Can I Label Numeric Values?
Yes, in STATISTICA, each numeric value can have two identities:
the actual numeric value (e.g., 1) and a text label (e.g., Male) assigned to it.
Note that in addition to text labels for numeric values, STATISTICA also supports text variables.
How Do I Enter/Edit the
Assignments between Numeric Values and Text Labels?
Normally, the assignments between
text labels and numeric values are handled automatically as you enter or edit
data in the spreadsheet. However, in some circumstances, you may want to review,
edit, or restructure all assignments of values for a particular variable or copy
text/numeric assignments from one variable to another. These operations can be
performed in the Text Labels
Editor dialog, accessible by selecting Text Labels from the Vars toolbar button menu or by clicking the Text Labels button in the Variable specifications
dialog for a particular variable.
Use the Text
Labels Editor dialog to sort the assignments by text labels or numeric
values, perform the Fill
Down operation (to automatically reassign numeric values to text
labels), Copy numeric/text
assignments from other variables or from case names, and perform other
operations.
Can I Switch between Displaying
Text Labels and Numeric Values in the Data Spreadsheet?
Each value of a particular variable
can simultaneously have a numeric and text identity. You can toggle the display
of text labels or numeric values by selecting Display Text Labels from the spreadsheet View menu.
Can I Copy a Set of Text Labels,
Numeric Values, and Descriptions to Other Variables and Files?
Yes, you can. Click the Copy From button in the Text Labels Editor (see
above) to copy to the current variable the text labels, numerical values, and
descriptions from other variables or from the case names.
When you need to copy the text
labels, numerical values, and descriptions from one file to another, select Text Labels from the Data - Merge menu. This
allows you to concatenate or replace text labels with labels from another file.
Can Clipboard and Drag-and-Drop
Operations in the Spreadsheet Affect Values Outside the Range that is Being
Pasted or Dropped?
Text labels. When you copy or move a block in the
spreadsheet, the values copied to the Clipboard depend on the display mode of
the spreadsheet. If the spreadsheet displays numeric values when the block is
copied, then only those numeric values are copied to the Clipboard. If the
spreadsheet displays text labels when the block is copied, then not only are the
text labels copied to the Clipboard, but also the corresponding numeric
values.
This can result in the assignment of text labels to numeric
values that did not previously have text label equivalents. Note that in the
illustration above, even though only the (highlighted) block was moved, other
numeric values in the target variable (in this case 1's and 2's), acquired new text identities.
Display
formats. When you copy or move a block in a spreadsheet, the display format
for that block is also copied. This means that the display format of the copied
or moved block overwrites the display format for the block into which it is
copied. For example, if you copy a block of values that are formatted as
currency and paste them into a variable that is formatted as percentage, the
block remains formatted as currency even though the rest of the values in that
variable are formatted as percentage.
In the example above, the four currency values copied from
Cost are still formatted
as currency even though they were pasted into the variable Percent.
What Are the Drag-and-Drop Facilities?
STATISTICA Spreadsheets support all standard spreadsheet
(Excel-style) block drag-and-drop operations (copy, move, insert, etc.) and
block extension operations, as well as mouse pointer-controlled column width,
variable-speed scrolling, split scrolling, etc.
Moving a block. You can move a
block by pointing to the border of the selection (the mouse pointer changes to
an arrow) and dragging it to a new location.
Copying a block. To copy a block,
point to the border of the selection (the mouse pointer changes to an arrow),
then drag the selection to a new location while pressing the CTRL key. Note that
when you are dragging the selection, a plus sign is displayed next to the cursor
to indicate you are copying the text rather than moving it (see the cursor in
the illustration below).
Inserting a block. To insert a
block between columns or rows, point to the border of the selection (the mouse
pointer changes to an arrow), then drag the selection while pressing the SHIFT
key. If you point between rows, an insertion bar appears between the rows, and
when you release the mouse button, the block is inserted between those two rows
[creating new case(s)]. If you point the cursor between columns, an insertion
bar appears between the columns, and when you release the mouse button, the
block is inserted between those two columns [creating new variable(s)]. Note
that if you also press the CTRL key while you are dragging the selection, the
block will be copied and inserted instead of moved and inserted; a plus will
appear next to the arrow-cursor, as shown in the illustration below.

Note that all SHIFT + drag-and-drop operations change the
size of the datafile; the same is true of simply dragging blocks outside the
boundaries of the current datafile. However, dragging (without SHIFT) within the
current datafile boundaries produces results identical to using the Clipboard
(except that the Clipboard is not used).
Clearing a block. To clear the
contents of a block of cells, drag the block Fill Handle (a small solid square
located on the lower-right corner of the block) within the selected block. After
selecting the block, point at the lower-right corner of the block displaying the
Fill Handle. When the mouse pointer changes to a plus sign, drag within the
selected block. As you are dragging, the values within the block are dimmed.
When you release the left mouse button, the dimmed values are deleted.

Extrapolating a block (AutoFill).
Values within a block can be extrapolated by using intelligent, Excel-style,
series extension facilities. After selecting the block, point at the lower-right
corner of the block displaying the Fill Handle. When the mouse pointer changes
to a plus sign, drag to the last cell(s) to which you want to extrapolate the
data.

Numeric values are extrapolated using linear regression;
text values are extrapolated following meaningful sequences (e.g., Q1, Q2, Q3,
..., or JAN, FEB, MAR, ...).
Can I Rearrange Blocks of Data or
Ranges of Cases and Variables in a Datafile?
There are three types of operations that you can use to
rearrange blocks of data or ranges of cases and variables in a datafile:
Clipboard-based, global, and drag-and-drop-based; they operate differently and
can produce different effects.
Clipboard-based operations. The standard Clipboard-based
operations of cut, copy, and paste (implemented via the standard Clipboard
keyboard, toolbar, or menu commands) affect only the contents of blocks of data,
rows, or columns. They do not influence the overall size of the datafile (e.g.,
they can empty a column but will not remove the column from the spreadsheet).
Global
operations. Global operations (accessible from the Vars and Cases Spreadsheet toolbar buttons and the spreadsheet Insert menu) are performed on
entire rows or columns as "units"; for example, they move or delete entire
columns, not only their contents. Hence, the size of the datafile increases or
decreases when you use these options.
Drag-and-drop operations. STATISTICA supports standard (Microsoft Excel-style)
drag-and-drop facilities, allowing you to easily perform both Clipboard-style
(but without using the Clipboard) and global (if the insert mode is implemented
by pressing the SHIFT key) operations.
What Is the Difference Between
Copy and Copy with Headers?
The example below shows how the selected portion of the
spreadsheet will be copied to the Clipboard when using Copy with Headers or Copy (both available from the Edit menu).
Copy with Headers:
Copy:
Do STATISTICA Spreadsheets Support
Split Display?
Yes, it does.
Spreadsheets can be split into up to four sections (panes) by dragging the split
box (the black rectangle at the top of the vertical scrollbar or to the left of
the horizontal scrollbar). This is useful if you have a large amount of
information and you want to review results from different parts of the
spreadsheet. When you guide the mouse pointer to the split box, the pointer
changes to Vertical Resize
(or Horizontal). Now, to
position the split, drag to the desired position.
You can change the position of the split by dragging the
split box (now located between panes) to a new position. Note that vertically
split panes scroll together when you scroll vertically; horizontally split panes
scroll together when you scroll horizontally.
How Can I Expand a Block in the
Spreadsheet Outside the Current Screen? Blocks can be selected by (1)
selecting a block with the mouse pointer, (2) clicking in one corner of the
block to be selected, scrolling to the desired opposite corner (the original
cell remains selected), and SHIFT+click in that cell, or (3) pressing the SHIFT
key while using the cursor keys on the keyboard. To expand a previously selected
block, you can use the SHIFT+cursor key, or scroll the display with the mouse
pointer and SHIFT+click in the desired corner of the block. To select a large
block in "split-pane" mode (as shown below), click in a cell in one pane, and
then scroll (in the same pane) to display the diagonally opposite corner of
another pane and SHIFT+click to select the block.
Variable speed of block
highlighting. Note that you can control the speed at which you scroll when
you extend a block outside the current display window. By moving the cursor a
short distance away from the spreadsheet, you can scroll one line at a time when
a block is selected; you can scroll one page at a time by moving the cursor
farther away from the spreadsheet.
Can I Expand the Datafile Using
Drag-and-Drop?
You can use
drag-and-drop operations to increase the size of the spreadsheet by moving or
copying a block [either all or some of the variable(s) or case(s)] beyond the
current spreadsheet boundaries.
You can also expand the spreadsheet using the Insert operation. This
operation inserts the block between cases or variables, thereby increasing the
size of the spreadsheet. Note that whether you move, copy, or insert new cases
or variables, when the block contains only part of a variable or case, then STATISTICA fills the remaining values in the
variable/case with missing data.
How Do I Split a STATISTICA Datafile into Smaller Files?
Select Subset/Random Sampling from the Data menu to display the Create a Subset/Random
Sampling dialog, where you can create a subset of your datafile based on
the specified variables and either case selection conditions or random
sampling.
Once your selections have been made, a new spreadsheet with
the requested subset is created, and you can continue the analyses using the
subset of data. See Subset/Random
Sampling for further details.
Saving
subsets. When you intend to save the subset, select either Save As or Save from the File menu to save the new
spreadsheet. The original file is not in danger of being overwritten unless you
use the same name.
Permanent vs. temporary
subsets. Subset creation is designed mostly to create permanent subset
files, because in order to temporarily select a subset of data to be analyzed,
the on-line Case Selection
Conditions are more convenient to use.
How Do I Merge Two STATISTICA
Datafiles?
Select Merge from the Data menu to display a
submenu of merge options.
Either cases (rows of data) or variables (columns of data)
can be merged. In other words, either the second file is appended to the
"bottom" of the first one, or it is appended to the "right side" of the first
one.
If you select Merge - Variables & then
additional options are available.
For example, you can select Relational or Relational hierarchical option buttons, for which a
key variable (in each file) is used to "match" cases based on the corresponding
values of the key.
Can I Merge the Text
Labels/Numeric Values from Two Files?
Yes. Select Text Labels from the Data - Merge menu to display the Merge Files - Text Labels
dialog.
In this dialog, you can select the way in which the text
values from the two files are to be merged.
How Do I Calculate (Transform) Values of a Variable?
Spreadsheet formulas. To perform data transformation and
recoding operations on single variables using one (as opposed to a set of
transformation formulas), you can use the data spreadsheet formulas.
Double-click on the variable name in a spreadsheet variable that you want to
transform to display the Variable specifications dialog (see below) where a
data transformation or recoding formula can be typed directly into the Long name (label or formula with
Functions) box.
Following
the Windows spreadsheet formula conventions (e.g., Microsoft Excel), start the
formula with an "=" sign (otherwise STATISTICA will not recognize that the text
is to be interpreted as a formula). For example, enter =(v1+v2)/2. Variables can be referenced by their
names (e.g., Income, profit,
TEST1) or numbers (e.g., v1, v2, v3, ...); v0 is the case number. Logical operators can be used
to define conditional transformation expressions (e.g., = ScoreA + ScoreB). Note that
<, <= , >, > =, and <> can be used to specify logical
conditions.
Variable names can
include special characters (e.g., spaces, plus or minus signs). Variable names
that include special characters must be enclosed in quotation marks (e.g., = 'Score A' + 'Score B'). If a
quotation mark itself occurs in the variable name, use double quotation marks
instead (e.g., = "A's Score" +
"B's Score"). Note that if double quotation marks are used in the name,
then the variable name must be enclosed in single quotation marks.
Following the logic of the concept of "missing data," the formula will evaluate to missing data if any variable
used in the formula is missing. However, several functions are supported that
calculate specific statistics for ranges and/or lists of data "adjusting" for
missing values. For example, the formula = mean (v1:v5, v7, sqrt(v9), TIME) will calculate the
mean of variables v1, v2, v3, v4,
v5, v7, square root of v9, and variable TIME, even if some of them may have missing
values.
Click the Functions button on the Variable specifications dialog to access the Functions Browser facility
that can be used to select functions and other elements of the syntax (for the
formulas) and insert them into the formula editor.
The Function
Browser also is used to quickly access information on any of the
functions and the complete on-line syntax documentation. After entering a
formula and clicking the OK button on the Variable specifications dialog, you are given the
option to recalculate the variable now. If you choose not to recalculate at this
time, you can do so later by clicking the Spreadsheet toolbar Recalculate button, selecting Recalculate from the Data menu, or pressing F9.
The STATISTICA Visual Basic programming language. If you
need to write more complex data transformation programs than those that can be
entered via spreadsheet formulas, the integrated programming language STATISTICA Visual Basic can be used. The industry
standard STATISTICA Visual Basic language offers
incomparably more than just a "supplementary application programming language"
that can be used to write custom extensions. STATISTICA Visual Basic takes full advantage of the
object model architecture of STATISTICA and allows
you to access programmatically every aspect and virtually every detail of the
functionality of the program. STATISTICA Visual Basic
adds an arsenal of more than 10,000 new functions to the standard comprehensive
syntax of Microsoft Visual Basic, thus comprising one of the largest and richest
development environments available.
STATISTICA Visual Basic programs
can be executed from within STATISTICA, but because
of the industry standard compatibility of STATISTICA
Visual Basic, you can also execute its programs from any other compatible
environment (e.g., MS Excel, MS Word, or a stand-alone Visual Basic
language).
The STATISTICA Visual Basic environment includes a flexible
program editor and powerful debugging tools. To display the STATISTICA Visual Basic editor, select New from the File menu to display the Create New Document - Macro (SVB)
Program tab. When editing macro programs by typing in general Visual
Basic commands or program commands specific to STATISTICA Visual Basic, the editor displays type-ahead
help to illustrate the appropriate syntax.
Help on the members and functions for each class (object) is
also provided in-line.
You can access examples and concise syntax summaries by
pressing the F1 key or clicking the help button to display the Electronic Manual. Note that to learn more about a
particular Visual Basic function, select the function in the STATISTICA Visual Basic editor and then press F1 on your
keyboard. This will call the Interactive Syntax Help Engine and display the help
for that keyword.
When executing
a program, you can set breakpoints in the program, step through line by line,
and observe and change the values of variables in the macro program as it is
running.
To summarize, STATISTICA Visual
Basic is not only a powerful programming language, but it represents a very
powerful professional programming environment for developing simple macros as
well as complex custom applications.
Can STATISTICA Automatically Recalculate All Spreadsheet
Formulas When the Data Change?
Yes. Select the Auto-recalculate when the data change check box in
the Recalculate dialog
(accessed by clicking the Recalculate button on the toolbar or selecting Recalculate from the Data menu).
STATISTICA will automatically recalculate all
spreadsheet formulas when data are changed in the spreadsheet. Alternatively,
you can press F9 to manually recalculate the formulas in the spreadsheet.
What Is the Simplest Way to Recode
Values of a Variable (e.g., Split a Continuous Variable into Categories)?
Recoding functions of practically
unlimited complexity can be custom-defined in STATISTICA Visual Basic and used repeatedly in your data
transformation programs.
However,
a quick on-line recoding facility can also be accessed directly from the
spreadsheet at any point. Select Recode from the Spreadsheet Vars toolbar button to display the Recode Values dialog. You can
scroll the Recode Values
dialog to define new values (text or numeric) of the current variable (see the
groups New Value 1, New Value 2,
New Value 3, etc., below) depending on the specific conditions that you
define (see the Category 1,
Category 2, Category 3, etc., groups below).
For example, the recoding conditions specified above would
"translate" the values less than or equal to 3 of the current variable (Quality) into Low, the values greater than
3 and less than or equal to 6 into Medium, and the values greater than 6 and less than
or equal to 10 into High.
Note that any other values that occur in the Quality variable would be "translated" into the
missing data code because the MD
Code option button is selected in the Other group. Note also that STATISTICA first finds all cases that meet Category 1 and recodes them;
then STATISTICA only searches through the remaining
cases that have not be recoded to see if they meet Category 2, etc. Hence, you do not have to specify
Quality > 3 AND Quality < =
6 as Category
2.
When specifying
conditions, follow the standard syntax conventions common in STATISTICA to all those procedures that involve any
operation of "selecting cases" based on their values. You can also save the
current recoding conditions to a text file or open a file with previously saved
conditions.
Note that recoding
conditions can be much more complex (see the Electronic Manual by pressing the
F1 key or clicking the Help button in this dialog), and they can be defined
such that the new values of the current variable do not depend on the old values
of that variable, but only on values of some other variables in the datafile.
Thus, this facility can be used not only to recode existing data, but also to
create values of a new variable based on conditions met by other variables.
How Do I Create Values of a New
Variable Based on Conditions Met by Other Variables?
You can use either of the data
transformation facilities: spreadsheet formulas or STATISTICA Visual Basic. However, often the quickest way
to create values of a new variable based on conditions met by other variables is
to use the on-line data recoding facility, which is accessible at any point from
the spreadsheet by clicking the Vars button and selecting Recode or selecting Recode from the Data menu. Note that the currently selected variable
does not even have to be included in the text of the recoding conditions. Thus,
you can use this facility to create values of a variable based on conditions met
by other variables.
For example, you can add a new (empty) variable to the
datafile, and then use this facility to create the new values. For instance, the
recoding conditions could be used to assign 1's to the new variable for all
"male subjects 18 to 25 years old with cholesterol levels below 200," 2's to
"male subjects 18 to 25 years old with cholesterol levels above 200," and assign
the missing data value to all other subjects.
What Syntax Can Be Used to Create
Case Selection/Verification/Recode Conditions (or to Calculate Variable Values
Using Spreadsheet Formulas)?
Recoding variable values, performing case selection for
analyses, or verifying quality of data involve a comparison of conditions or
values encountered in the spreadsheet with conditions specified by an expression
to determine a further action for each case (i.e., respectively assigning a new
variable value, including the case in an analysis, or verifying that the data
associated with a case are correct). Assigning variable values via a spreadsheet
formula involves case-by-case calculation of values for a variable based
directly on the values of other spreadsheet variables and/or parameters not
included in the spreadsheet. For data transformations more extensive or more
complex than can be accomplished using the variable recode and spreadsheet
formula facilities provided, use the STATISTICA Visual Basic language.
Distribution functions and their
integrals. STATISTICA provides a predefined broad
selection of distribution functions, their integrals and inverse distribution
functions that can be used in spreadsheet formulas and in recoding, case
selection, and verification expressions like all other functions.
Below is a listing of all available
distributions (parameters are given in parenthesis):
|
Distribution |
Density/Probability Function |
Distribution Function |
Inverse Distribution Function |
|
Beta Binomial Cauchy Chi-square Exponential Extreme F Gamma Geometric Laplace Logistic Logonormal Normal Pareto Poisso Rayleigh Student's t Weibull |
beta(x,n,w) binom(x,p,n) cauchy(x,h,q) chi2(x,n) expon(x,l) extreme(x,a,b) F(x,n,w) gamma(x,c) geom(x,p) laplace(x,a,b) logis(x,a,b) lognorm(x,m,s) normal(x,m,s) pareto(x,c) poisson(x,l) rayleigh(x,b) student(x,df) weibull(x,b,c,q) |
ibeta(x,n,w) ibinom(x,p,n) icauchy(x,h,q) ichi2(x,n) iexpon(x,l) iextreme(x,a,b) iF(x,n,w) igamma(x,c) igeom(x,p) ilaplace(x,a,b) ilobis(x,a,b) ilognorm(x,m,s) inormal(x,m,s) ipareto(x,c) ipoisson(x,l) irayleigh(x,b) istudent(x,df) iweibull(x,b,c,q) |
vbeta(x,n,w) vcauchy(x,h,q) vchi2(x,n) vexpon(x,l) vextreme(x,a,b) vF(x,n,w) vgamma(x,c) vlaplace(x,a,b) vlogis(x,a,b) vlognorm(x,m,s) vnormal(x,m,s) vpareto(x,c) vrayleigh(x,b) vstudent(x,df) vweibull(x,b,c,q) |
How Can I Verify and "Clean" Data?
Select Verify from the Data menu to access an interactive data-verification and cleaning facility. Use the Verify Data dialog to enter the conditions to be met by the data.

Follow the standard syntax conventions common in STATISTICA to all those procedures that involve any operation of "selecting cases" based on their values. You can also save the current verification condition to a text file or open a file with previously saved conditions.
The verification can be as simple as checking whether values in a variable are "legal" (e.g., only 1 and 2 might be allowed for Gender) or whether they fall within allowed ranges of values (e.g., Age must be more than 0 and less than 200). It can also be as complex as checking multiple logical conditions that some values must meet in relation to other values.
Consider the following example of conditional verification:
If a person is a male or less than 10 years old, then the number of pregnancies for that person cannot be more than zero.
In order to apply these conditions, you would specify (for example):
Invalid if: (v1='MALE' or AGE<10) and PREGN>0
Once you have entered your verification condition(s), click either the Find First button to select the first invalid case in your datafile (after this first case has been selected, you can find the next case by selecting Find Next Invalid Case from the Data - Verify Data menu) or click the Mark All button to mark all of the invalid ycases in the datafile according to the Marked Cells spreadsheet layout.
How Do I Perform a Multiple Sort?
Select Sort from the Data menu, Sort Cases from the Cases button, or click the toolbar button. All of these actions display the Sort Options dialog where you can specify the key variables and the type of sort.

If you need to sort based on more than three keys, click the More Keys button to switch to a larger dialog.
How Do I Rank-Order Values of a Variable (Replace Values with Their Ranks)?
Select Rank from the Vars button on the Spreadsheet toolbar or from the Data menu to display the Rank Order Values dialog.

Then click the Variables button to select the variables to be ranked. Note that this operation replaces the original values with their respective ranks based on the sorted series. Optionally, you can also specify a subset of cases to be affected by the operation (click the Cases button), use case weights (click the Weight button), and set a number of options to perform particular (non-default) types of ranking.
How Do I Transpose Data (Convert Cases Into Variables)?
Select Transpose from the Data menu, then select either Block or File.

Transposing a block of data. The block transposing command affects only the contents of cells in the block currently selected in the spreadsheet (the block must be square); the variable names and case names will not be affected. For example, the Transpose - Block command executed on the following square block of data:

produces the following result:

Transposing a datafile. Datafile transposing restructures the entire file. For example, transposing the datafile shown above (before the block was transposed), changes the number of cases and variables in the datafile:

Note that when you transpose a file, case names become variable names and variable names become case names.
How Do I Automatically Fill Ranges of Data in the Spreadsheet?
In addition to the custom-defined operations of filling specific ranges of data with values available via the data transformation options, STATISTICA can fill selected blocks with random values. STATISTICA also supports standard (Microsoft Excel-style) block extrapolation operations, which can be implemented by dragging a border of the block onto the adjacent area of the spreadsheet that is to be filled following the extrapolation rules.
Random value fill. Select a block, then right-click and select Fill/Standardize Block - Fill Random Values from the spreadsheet shortcut menu. The currently selected block will be filled with random values (following a uniform distribution) in the range of 0 to 1. An error message is displayed if a block is not selected.
Extrapolation of values (AutoFill). A selected series of values (consisting of at least two values) will be extended if a meaningful pattern is detected. For example, the following numbers: 1.00, 2.00, and 3.00 would be extended to 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 7.00, and 8.00 as illustrated in the following spreadsheets:
If no meaningful pattern can be detected, predicted values from a linear trend regression will be used for the extrapolation. If a single value (or a repeated value) is selected, the value will be copied into the extrapolated block; also, sequences consisting of names of months, days, or quarters can be automatically extended.
Fill down or right. Finally, simple spreadsheet-style "fill block" (Down or Right) options are available that fill the currently selected block by duplicating the first row or column (respectively) of the block. To use this facility, first select a block to be filled in the spreadsheet. Then select Fill/Standardize Block from the spreadsheet Edit menu (or the spreadsheet shortcut menu, shown below). Use the hierarchical menu to select either Fill/Copy Down or Fill/Copy Right.
The former copies (duplicates) the first row of the block to all remaining rows of the block; the latter copies the first column of the block to the remaining columns of the block. Both options work in a manner similar to Microsoft Excel's Fill Right and Fill Down facility (also available in Excel from the Edit menu).
How Does the Extrapolation of Blocks (AutoFill) Work?
A horizontal or vertical series in a block can be extrapolated by dragging the block Fill Handle (a small solid square located on the lower-right corner of the block). After selecting the block, point at the Fill Handle. When the mouse pointer changes to a plus sign , drag to the last cell(s) to which you want to extrapolate the data. STATISTICA can create series of values such as sequential numbers, linear extrapolations, and dates (e.g., you can extend a series such as 1, 2, 3 to include 4, 5, 6).
You can extrapolate (AutoFill) a block in the following ways:
How Do I Standardize Values in a Block?
Select a block of values in the spreadsheet, right-click, and select either Fill/Standardize Block - Standardize Columns or Fill/Standardize Block - Standardize Rows from the shortcut menu. The standardized values are computed as follows:
Standard Value = (raw value - mean of highlighted row/column)/standard deviation
In the following illustration, the columns of the selected block have been standardized by columns:
You can also standardize selected variables by selecting Standardize from the Data menu (this procedure works independent of the currently selected block, but takes into account the current case selection conditions and weights).
How Do I Select Variables for an Analysis?
Every analysis definition dialog in STATISTICA contains at least one Variables button that is used to specify variables to be analyzed. You can click the button (or press V). If you forget to specify variables and click the OK button to start the analysis, STATISTICA prompts you to specify the variables to be analyzed. The variable selection dialog that will be displayed supports various ways of selecting variables, and it offers various shortcuts and options to review the contents of the datafile.
Can Variables Be Selected for Analyses by Highlighting Them in the Spreadsheet?
Yes. If you select a block in the spreadsheet, the variables included in the block are automatically preselected for the next analysis. This shortcut is designed to limit the chance of producing unintended results in the following ways:
How Do I Select a Subset of Cases (Observations) to Be Included in an Analysis?
Before an analysis begins (i.e., before the data are processed), you can instruct STATISTICA to select only cases (i.e., rows in the spreadsheet) that meet some specific selection criteria. A facility to define and manage case selection conditions can be accessed from the Tools - Selection Conditions - Edit menu (see page 282) or by clicking on the status bar field Sel, which shows the current status of the Case Selection Conditions. Note that when case selection conditions are disabled, this field reads Sel: OFF, when they are enabled, it reads Sel: ON).
Note that there are two types of case selection conditions in STATISTICA - permanent (saved with the datafile) and temporary (associated with the current analysis only).
How Are Case Selection Conditions Stored and Saved?
Global case selection conditions (definitions of subsets of data) are attached to a spreadsheet and are called Spreadsheet Case Selection Conditions. They are accessed from the Tools - Selection Conditions menu or by clicking the status bar field Sel. Analysis/Graph Case Selection Conditions can also be locally connected to an analysis or graph. They are accessed by clicking the Select Cases button that is available on all analysis/graph definition dialogs. You can save and open either type of selection conditions (and maintain libraries of case selection conditions) using the Open or Save As buttons in the appropriate Case Selection Conditions dialog. Note that case weights are specified in a similar manner.
Can I Select Random Subsets of Data?
Yes, you can. Subsets of data can be created via both simple random sampling and systematic random sampling. Select Subset/Random Sampling from the Data menu to display the Create a Subset/Random Sampling dialog.

Select the Simple random sampling option button in the Subset Selection Rules group to obtain your subset via random sampling. You have two choices in regards to how the subset is created: either via the Percentage of cases within the original spreadsheet or an Approximate number of cases. If you select the With replacement check box, then once a case is selected to be included into the subset, that case will be placed back into the pool of available choices for the remaining cases in the subset (hence an individual case can appear more than once in the resulting subset). Additionally, if a weighting variable has been selected for the original spreadsheet, select the Use spreadsheet case weights check box to apply that case weight when the subset is being obtained.
Select the Systematic random sampling option button to obtain your subset via systematic random sampling. For instance, if you enter a 5 in the K= box, STATISTICA will randomly select a case within the first five cases and then finish obtaining the subset by selecting each fifth case in the spreadsheet after the originally selected case.
What Is the Quickest Way to Review Basic Descriptive Statistics for a Variable?
You can either select Descriptive Statistics from the Basic Statistics and Tables (Startup Panel) (accessible from the Statistics menu) or click the Values/Stats button in the Variable specifications dialog, as illustrated below. To display the Variable specifications dialog, double-click on the variable name in the spreadsheet. Alternatively, right-click on the variable name and select Variable Specs from the shortcut menu.

After you click the Values/Stats button, the resulting dialog displays information about the selected variable, a sorted list of its values, and descriptive statistics. The descriptive statistics can be copied to the Clipboard by clicking the Copy button (see the lower part of the window in the illustration below).
What Is the Quickest Way to Review a Sorted List of All Unique Values of a Variable?
Click the Values/Stats button in the Variable specifications dialog to display a sorted list of all unique numeric values and corresponding text labels (if there are any) for that variable. The Variable specifications dialog is accessed by double-clicking the variable name in the spreadsheet.

Note that the assignments between the numeric values and text labels for the variable can be edited in the Text Labels Editor.
What Are Statistics of Block Data?
Statistics for each row or column in a selected block can be computed and added to the spreadsheet (i.e., appended at the end) by selecting the desired Statistics of Block Data from the shortcut menu.

The selected statistics are added to the end of the spreadsheet as additional rows or columns. For example, the selection shown above computes the sums for each row in the selected block and appends them to the datafile as a new variable Sum. Note that in the illustration below the new variable Sum has been moved next to Measure02.

If you select Block Columns, the results of the selected statistic are appended as a new case, and the case name contains the name of the selected statistic and the range description, as shown below.

In addition, a (customizable) list of predefined graphs is also available from this shortcut menu.

These Graphs of Block Data allow you to visualize the data in the selected block either by row or by column.
Can Matrix Data Be Used for Input Instead of Raw Data?
Yes. STATISTICA supports a variety of matrix file types as input data (e.g., for Multiple Regression, Factor Analysis, Reliability/Item Analysis, and other modules). Matrix files can be edited in the spreadsheet (as if they were raw datafiles); however, in order to be properly interpreted as matrices, they need to meet specific content and format conditions (depending on the matrix type). STATISTICA also supports the multiple matrix datafile format for input in procedures (such as Structural Equation Modeling) that supports matrix data for groups. Note that matrix data that meet the specific content and format conditions are automatically saved in STATISTICA Matrix Spreadsheet (*.smx) format.
How Can I Access Data from Excel and Other Foreign Datafiles?
Clipboard. The quickest, and in many cases easiest, way to access datafiles from other Windows applications (e.g., spreadsheets) is to use the Clipboard, which in STATISTICA supports special Clipboard data formats generated by applications such as Microsoft Excel. For example, STATISTICA properly interprets formatted cells (such as 4/17/1999 or $10) and text values (including extensive in-cell formatting, e.g., RVStower120.3MHz).
File import facilities. Datafiles from a wide variety of Windows and non-Windows applications can be accessed and translated into the STATISTICA format using the file import facilities (available by opening the file using the File - Open command). You can even access formatted and free format text (ASCII) files.
The main advantages of using the file import facilities (over the Clipboard) are that:
How Do I Export Data from STATISTICA to Excel and Other Foreign Datafiles?
STATISTICA can export datafiles to a wide variety of Windows and non-Windows applications via the Save As command accessible from the File menu. You can even save spreadsheets as formatted and free format text (ASCII) files. These facilities allow you to specify the exact way in which the translation is to be performed (e.g., decide whether or not to export variable names, text labels, and case names). Also, STATISTICA Spreadsheets are OLE servers and clients, thus other applications can link to them.
Can STATISTICA Access Data from Enterprise Databases?
Yes. STATISTICA Query (including options to combine fields from multiple tables) is provided via both the File - Get External Data and the BI menus. As long as the database format supports a 32-bit ODBC driver or an OLE DB provider, accessing such data should present no problem.
Note that you can either take advantage of STATISTICA Query's built-in graphical user interface for querying:

or write SQL statements in a text editor.

For further details about setting up connections to external databases, consult the database vendor and/or your database administrator.
Can STATISTICA Process Files That Are Larger Than the Local Hard Drive?
Yes, the enterprise versions of STATISTICA offer options to query and access large remote datafiles in-place (i.e., without having to import the data and create a local copy).
Does STATISTICA Support ODBC?
Yes, via STATISTICA Query, an external data querying facility available from the Data - Get External Data menu and the File - Get External Data menu. The STATISTICA Query interface includes options to combine fields from multiple tables and provides flexible access to a wide variety of database management files, including the major enterprise database formats such as Microsoft SQL Servers and Oracle. Specific records (rows of tables) can be selected by entering SQL statements. STATISTICA Query automatically builds the SQL statement for you as you select the components of the query via a simple graphical interface and/or intuitive menu options and dialogs. Hence, an extensive knowledge of SQL is not necessary in order for you to create advanced and powerful queries of data in a quick and straightforward manner. Multiple queries based on one or many different databases can also be created to return data to an individual spreadsheet. Therefore, you can maintain connections to multiple external databases simultaneously.
How Do I Set Up OLE Links between STATISTICA and Other Windows Applications?
You can establish OLE (Object Linking and Embedding) links between a "source" (or server) file (e.g., a Microsoft Word document) and a STATISTICA document (the "client" file) so that when changes are made to the data in the source file, the data are automatically updated in the respective part of the STATISTICA document (client file). Additionally, STATISTICA can serve as a "source" (or server) file for other OLE compatible documents. In this way, you could link the values in a STATISTICA Spreadsheet to a table in a Microsoft Word document so that the Word document updates when data in the spreadsheet are changed.
OLE links are established using the quick, "paste-like" Paste Link option in the Paste Special dialog. To create an OLE Link within a STATISTICA document, first copy to the Clipboard the desired text (or values) from another Windows application. Then (in STATISTICA), select Paste Special from the Edit menu to display the Paste Special dialog. Select the Paste Link option button, specify the correct format in the As group, and click the OK button to establish the link using the source given in the Source field.
When a link is established, it can be managed using the Links dialog (select Links from the Edit menu). Use this dialog to update or break a link, open the source file or change to a different source file, and determine whether updates will be automatic or manual.
How Do I Set Up DDE Links between STATISTICA and Other Windows Applications?
You can establish DDE (Dynamic Data Exchange) links between a "source" (or server) file (e.g., a Microsoft Excel spreadsheet) and a STATISTICA datafile (the "client" file) so that when changes are made to the data in the source file, the data are automatically updated in the respective part of the STATISTICA Spreadsheet (client file).
A common application for dynamically linking two files would be in industrial settings, where the STATISTICA datafile would be dynamically linked with a measurement device connected to the serial port (e.g., in order to automatically update specific measurements hourly).
DDE links are established in the Manage DDE Links dialog (select DDE Links from the Edit menu). To create a new link, click the New Link button in this dialog. The dialog also is used to edit, update, delete, or disable any or all links.
| Back to Top |
| Request Quote |
| StatSoft Home Page |
![[StatSoft]](../../images/sssmall.gif)
2300 East 14th Street, Tulsa, OK 74104
Phone: (918) 749-1119; Fax: (918) 749-2217
e-mail: info@statsoft.com
©Copyright StatSoft, Inc., 1984-2005.
StatSoft, StatSoft logo, STATISTICA, SEWSS, SEDAS, Data Miner, SEPATH and GTrees are trademarks of StatSoft, Inc.