CHAPTER 8: StatView

StatView is a fairly powerful statistical analysis package that allows easy data entry, basic and complex statistical analysis, and graphing of the data. To begin using the program, simply click on the icon within the StatView folder. When the StatView logo appears, go to the "File" menu and select "New" to create a new data table or "Open" to work with a previously saved one.
A. Creating a Data Table
1. Creating Columns:
If "New" is selected the "New Data Column Information" window will appear as seen in figure 11.
This allows the column names to be entered along with the type of data that will be in the column. If real numbers are to be used, the user can select how many decimals are to be used while the "string" type will allow words to be entered. The "category" type of column allows you to group your data by categories such as "male" and "female" or "treated" and "untreated".
Once you have named the column and selected its type, click on "More" to add another column or "Done" if that is your last column.
If you select the "category" type, which is used for ANOVAs and other statistical procedures, you will be asked to choose the new column's category as shown in figure 12. If the category you need was created previously, click on it and then on "Select".
If a new one is to be created, click on "New" and the "Create a New Category" window will appear (Figure 13).
The "Category Name" should be selected with the column name in mind. Clicking on the "Element Name" box will allow the names of the individual categories to be entered. For example, if the "Category Name" was "Gender", then Element 1 would be "Female" and, after clicking on "Add", "Male" would be added as Element 2. When all your elements have been added, click on "Done".
2. Editing Columns:
Deleting columns: Any column can be deleted by highlighting it and hitting the "delete" key.
Adding columns: New columns can be added by holding the "command" key and moving the cursor between the column names where you wish to add a column. The cursor will look like that shown in figure 14. Clicking the mouse will take you to the "New Data Column Information" window.
Resizing columns: The width of a column can be changed by moving the cursor to the right side of the column near the column's name. By clicking and dragging on the line the column width can be increased or decreased. This may be needed if you have numbers that are larger than the current width of the column.
3. Entering Data:
The data table will initially be a small table with the column names across the top and only one row of data boxes as shown in figure 14 above. By clicking on any box, data can be entered into that box. The arrow keys can be used to move up, down, left or right through the table, entering data. The "tab" key will move the cursor from left to right and, at the end of the columns, it will move it down one row.
Any errors can be corrected by simply clicking on that box and retyping the data.
Data from one column can be moved to another by highlighting it and using the "Cut", "Copy" or "Paste" options under the "Edit" menu item.
4. Saving Data:
Periodically, and at the end of the session, the data table and a back-up copy should be saved using "Save" and "Save as" under the "File" menu item. Only the data is saved; any statistical results or graphs you have created are not. Therefore, be sure to copy any tables or graphs to your Word document before closing the data table or program.
B. Basic Statistics
All statistical functions require that variables be set first. To do so, simply highlight the column and, under "Variables" in the menu, select either "X" or "Y". Multiple X's and Y's can be selected and they will be distinguished by subscripts. To remove a variable, simply highlight it and select "Clear" under the "Variable" menu option.
If only basic statistics on any column are desired, such as mean, standard deviation, variance, etc., the "Mean, Std. Dev., etc." option under the "Describe" menu shown below can be selected after a column has been selected as an "X". The data will be displayed in a table as illustrated in Figure 15.
If multiple X's exist, basic stats will be calculated for all of them and each set can be viewed by using the scroll bar to the right of the window.
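The figures StatView reports can be checked by hand or outside the program. The following is a minimal Python sketch, not StatView's own code: the function name `describe` and the sample values are invented for illustration, and it assumes the sample (n - 1) form of the variance and standard deviation.

```python
import math
import statistics

def describe(values):
    """Return basic descriptive statistics for one column of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # sample variance, divides by n - 1
    std_dev = math.sqrt(variance)
    return {
        "count": n,
        "mean": mean,
        "variance": variance,
        "std_dev": std_dev,
        "std_error": std_dev / math.sqrt(n),
        "minimum": min(values),
        "maximum": max(values),
    }

# Hypothetical column of real numbers that has been set as an "X".
column_x = [4.0, 5.0, 6.0, 7.0, 8.0]
stats = describe(column_x)
for name, value in stats.items():
    print(f"{name}: {value}")
```

If several columns were set as X's, the same function would simply be called once per column.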
C. Basic Graphs
Once the data has been entered in the table it can also be used to create simple graphs. A single column can be plotted alone by making it an "X", a group of columns can be plotted together by making them all X's, and two columns can be plotted against each other by making one an "X" and the other a "Y".
Once the variables have been selected, the "View" menu item, shown to the left, will allow a variety of graphs to be created from that data. While the "Table", "Scattergram", "Bar Chart" and "Line Chart" can be created for most variables, the "Pie Chart" and "Box Plot" can only be created from percentage data. The three graphs below in figure 16 illustrate the basic types of graphs that are available from StatView. They also show different ways of presenting the same data. When the graph appears, there will be a toolbox located along the left side that will allow the graph to be labeled or modified. This is also possible using the "Text" and "Graph" options in the menu.
Figure 16: a. scattergram graph, b. bar graph, c. line graph
The major tool in the toolbox that you will find useful is the "I-Beam" located just below the arrow. Selecting this tool allows text to be added to the graph at virtually any point. Once the tool is selected, place the cursor where the text is desired and click. A text box will appear that can be resized by clicking and dragging on the corner box. The font type and size can be changed by use of the "Text" menu option.
Any graph or table created in StatView can be printed by selecting "Print" under the "File" option. They can also be exported to your Word document by following the instructions in section VII. B.
D. Conducting an ANOVA
The ANOVA, or Analysis of Variance, procedure is one of the best ways to determine where any variation seen in your experimental data came from. Whenever you examine the means of your data, differences will exist. The question is whether that variation is due to experimental error (the way the experiment was conducted or some inherent characteristic) or to the treatments applied in the experiment. Scientists usually want the latter to be responsible for the variation, as it shows that what they did affected the experimental organisms.
In StatView, the "Y" variable needs to be a "category" type column while the "X" variable does not. Once the variables have been set, "Anova..." is selected from the "Compare" menu shown to the left. The first ANOVA window allows the confidence level to be set. This should usually be at either 95% or 99% in science experiments. The next window will contain several "pages" which can be reached by use of the scroll bar along the right side of the window.
The first "page" in the window will be the ANOVA table shown in figure 17. It gives the calculations for the possible experimental treatment variation (Between groups) and for the possible experimental error variation (Within groups). The important number here is the first one listed under the "F-test" column. This is the "calculated F value" and it needs to be greater than the "tabular F value" found in a statistics book. The tabular value can be looked up in such a book's "F values table" using the DF numbers provided in the "DF" column of the table and the confidence level originally set for the ANOVA.
If the "calculated F" is greater than the "tabular F", then the variation seen is attributed to the experimental treatments applied. However, if the "calculated F" is less than the "tabular F", then the variation seen is attributed to the experimental error inherent in the experiment. Although you will need to provide the "tabular F value" in your write-up, StatView simplifies this process somewhat by providing the p value under the "calculated F value". This p value is the probability, expressed as a decimal, of obtaining an F value at least this large if the treatments had no effect. Therefore, at the 95% confidence level, if the p value is below .0500 then the variation is significant and attributed to the treatments. If it is above that, then we must assume that the variation is due to experimental error.
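To see where the "calculated F value" comes from, the arithmetic behind a one-factor ANOVA table can be sketched in a few lines of Python. This is an illustration only, not StatView's code: the function name `one_way_anova` and the three small treatment groups are invented, and the tabular F of 5.14 is the standard table entry for 2 and 6 DF at the 95% confidence level.

```python
def one_way_anova(groups):
    """Compute the entries of a one-factor ANOVA table from a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)

    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Within groups: how far each value lies from its own group mean.
    ss_within = sum(
        (v - sum(group) / len(group)) ** 2
        for group in groups
        for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_value": ms_between / ms_within,
    }

# Hypothetical data: three treatments with three replicates each.
treatments = [[4.0, 5.0, 6.0], [5.0, 6.0, 7.0], [9.0, 10.0, 11.0]]
result = one_way_anova(treatments)

TABULAR_F = 5.14  # F table entry for DF 2 and 6 at the 95% confidence level
significant = result["f_value"] > TABULAR_F
print(result, "significant:", significant)
```

For these made-up numbers the between-groups mean square is 21 and the within-groups mean square is 1, so the calculated F of 21 exceeds the tabular F of 5.14.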
The second "page" provides basic descriptive statistics for the data used in the ANOVA.
The last "page" or "pages", shown in figure 19, will contain the results of various statistical procedures designed to determine which treatments are different from other treatments. While this information is invalid if the ANOVA indicates the variation is due to experimental error, it is very useful if the variation is shown to be due to the treatments. At that point the question becomes "Which treatment is significantly better, or worse, than another?" This is shown by the presence of any asterisks in the "Fisher PLSD", "Scheffe F-test" or "Dunnett t" columns. An asterisk indicates that the particular test found that comparison to be significant. By comparing the treatment means with the information provided in this view, a hierarchy of treatments can be created.
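As an illustration of what one of these comparisons involves, the sketch below works through Fisher's protected least significant difference (PLSD) in Python. It is a hedged example rather than StatView's implementation: the function name `fisher_plsd` and the data are invented, the textbook formula PLSD = t x sqrt(MS within x (1/n1 + 1/n2)) is assumed, and 2.447 is the two-tailed t table value for 6 DF at the 95% confidence level. The Scheffe and Dunnett tests use different critical values and are not shown.

```python
import math
from itertools import combinations

T_CRITICAL = 2.447  # two-tailed t table value, 6 DF, 95% confidence level

def fisher_plsd(groups, t_critical):
    """Compare every pair of treatment means against Fisher's PLSD."""
    total_n = sum(len(values) for values in groups.values())
    df_within = total_n - len(groups)

    means = {}
    ss_within = 0.0
    for name, values in groups.items():
        means[name] = sum(values) / len(values)
        ss_within += sum((v - means[name]) ** 2 for v in values)
    ms_within = ss_within / df_within  # error mean square from the ANOVA

    results = []
    for a, b in combinations(groups, 2):
        difference = abs(means[a] - means[b])
        plsd = t_critical * math.sqrt(
            ms_within * (1 / len(groups[a]) + 1 / len(groups[b]))
        )
        results.append((a, b, difference, plsd, difference > plsd))
    return results

# Hypothetical treatments; the names are for illustration only.
treatments = {
    "control": [4.0, 5.0, 6.0],
    "low": [5.0, 6.0, 7.0],
    "high": [9.0, 10.0, 11.0],
}
comparisons = fisher_plsd(treatments, T_CRITICAL)
for a, b, difference, plsd, is_significant in comparisons:
    flag = "*" if is_significant else ""
    print(f"{a} vs. {b}: difference {difference:.3f}, PLSD {plsd:.3f} {flag}")
```

A pair is flagged with an asterisk when the difference between its means is larger than the PLSD, mirroring the asterisks in the StatView table.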


E. Conducting a Regression or Correlation
Other frequently used comparisons, again selected under the "Compare" menu item, are "Correlation...", "Regression..." and "Stepwise Regression...". These allow you to determine if two variables are related or correlated in any way and the degree to which they are. While they vary significantly, the calculations are fairly similar so the "Regression..." procedure will be used here. The others are more powerful and provide additional choices to increase the sophistication of the analysis.
Once the two variables have been selected, one "X" and one "Y", select "Regression..." under the "Compare" menu and the "Regression..." window will appear. At this point a number of options are available. Once any desired options have been set, click on "Done" and the regression table shown below in figure 20 will appear. This presents a number of results, the most important one being the "R-squared" number, which indicates the degree of correlation. The closer this is to 1.0, the greater the correlation between the two variables. In this case, .942 indicates a high degree of correlation.
This relationship can be visualized by going under "View" and selecting "Scattergram". This creates a graph, as shown in figure 21, that shows the relationship and plots a "best fit" line based upon the equation shown at the top of the graph. The equation is also useful for predicting values of "Y": plugging in any desired "X" gives an estimate of a "Y" that was not measured in the original experiment.
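The "best fit" equation and the R-squared value can be reproduced with ordinary least squares. The Python sketch below is an illustration under stated assumptions, not StatView's own routine: the function name `fit_line` and the five data points are invented, and a simple straight-line model Y = intercept + slope * X is assumed.

```python
def fit_line(xs, ys):
    """Least-squares line through (x, y) pairs; returns slope, intercept, R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # square of the correlation coefficient
    return slope, intercept, r_squared

# Hypothetical "X" and "Y" columns.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r_squared = fit_line(x_values, y_values)
print(f"Y = {intercept:.3f} + {slope:.3f} * X, R-squared = {r_squared:.3f}")

# Predict a "Y" for an "X" that was not part of the original data.
new_x = 6.0
predicted_y = intercept + slope * new_x
print(f"Predicted Y at X = {new_x}: {predicted_y:.3f}")
```

Predictions are most trustworthy for X values inside or close to the range of the original data.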