DR. K's WRITING MANUAL
CHAPTER 8: StatView
StatView is a fairly powerful statistical analysis package that
allows easy data entry, basic and complex statistical analysis, and
graphing of the data.
To begin using the program simply click on the icon within the
StatView folder. When the StatView logo appears, go to the "File"
menu and select "New" to create a new data table or "Open" to work
with a previously saved one.
A. Creating a Data Table
1. Creating Columns:
If "New" is selected, the "New Data Column Information" window will
appear as seen in figure 11.
This allows the column
names to be entered along with the type of data that will be in the
column. If real numbers are to be used, the user can select how many
decimals are to be used while the "string" type will allow words to
be entered. The "category" type of column allows you to group your
data by categories such as "male" and "female" or "treated" and
"untreated".
Once you have named the column and selected its type, click on "More"
to add columns or "Done" if that is your last column.
If you select the "category" type, which is used for ANOVAs and other
statistical procedures, you will be asked to choose the new column's
category as shown in figure 12. If one was created previously, click
on the desired one and then on "Select".
If a new one is to be
created, click on "New" and the "Create a New Category" window will
appear (Figure 13).
The "Category Name"
should be selected with the column name in mind. Clicking on the
"Element Name" box will allow the name for the individual categories
to be entered. For example, if the "Category Name" was "Gender", then
Element 1 would be "Female" and, after clicking on "Add", "Male"
would be added as Element 2. When all your elements have been added,
click on "Done".
2. Editing Columns:
Deleting columns: Any
column can be deleted by highlighting it and hitting the "delete"
key.
Adding columns: New
columns can be added by holding the "command" key and moving the
cursor between the column names where you wish to add a column. The
cursor will look like that shown in figure 14. Clicking the mouse
will take you to the "New Data Column Information"
window.
Resizing columns: The
width of a column can be changed by moving the cursor to the right
side of the column near the column's name. By clicking and dragging
on the line the column width can be increased or decreased. This may
be needed if you have numbers that are larger than the current width
of the column.
3. Entering Data:
The data table will
initially be a small table with the column names across the top and
only one row of data boxes as shown in figure 14 above. By clicking
on any box, data can be entered into that box. The arrow keys can be
used to move up, down, left or right through the table, entering
data. The "tab" key will move the cursor from left to right and, at
the end of the columns, it will move it down one row.
Any errors can be corrected by simply clicking on that box and
retyping the data.
Data from one column can
be moved to another by highlighting it and using the "Cut", "Copy" or
"Paste" options under the "Edit" menu item.
4. Saving Data:
Periodically, and at the end of the session, the data table and a
back-up copy should be saved using "Save" and "Save as" under the
"File" menu item. If you have performed any statistical procedures or
created graphs, these will not be saved as only the data is saved.
Therefore, be sure to copy any other tables or graphs to your Word
document before closing the data table or program.
B. Basic Statistics
All statistical
functions require that variables be set first. To do so, simply
highlight the column and, under "Variables" in the menu, select
either "X" or "Y". Multiple X's and Y's can be selected and they will
be distinguished by subscripts. To remove a variable, simply
highlight it and select "Clear" under the "Variable" menu option.
If only basic statistics
on any column are desired, such as mean, standard deviation,
variance, etc., the "Mean, Std. Dev., etc." option under the
"Describe" menu shown below can be selected after a column has been
selected as an "X". The data will be displayed in a table as
illustrated in Figure 15.
If multiple X's exist,
basic stats will be calculated for all of them and each set can be
viewed by using the scroll bar to the right of the window.
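For readers who want to check these figures by hand, the following is a minimal Python sketch of the same basic statistics. The data values and the name `heights` are hypothetical, and it assumes the sample (n - 1) form of the variance and standard deviation; this manual does not state which form StatView reports.

```python
import statistics

# Hypothetical column of measurements (not taken from this manual).
heights = [12.1, 13.4, 11.8, 14.0, 12.7]

mean = statistics.mean(heights)
# Assumption: sample statistics, which divide by n - 1.
variance = statistics.variance(heights)
std_dev = statistics.stdev(heights)

print(f"Mean      = {mean:.3f}")
print(f"Std. Dev. = {std_dev:.3f}")
print(f"Variance  = {variance:.3f}")
```

If the values differ slightly from a StatView table, the usual cause is the population (n) form of the calculation, available as `statistics.pvariance` and `statistics.pstdev`.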
C. Basic Graphs
Once the data has been
entered in the table it can also be used to create simple graphs.
This can be done for any column alone by making it an "X", by
plotting any group of columns by making them all X's or by plotting
columns against each other by making one an "X" and one a "Y".
Once the variables have
been selected, the "View" menu item, shown to the left, will allow a
variety of graphs to be created from that data. While the "Table",
"Scattergram", "Bar Chart" and "Line Chart" can be created for most
variables, the "Pie Chart" and "Box Plot" can only be created from
percentage data. The three graphs below in figure 16 illustrate the
basic types of graphs that are available from StatView. They also
show different ways of presenting the same data. When the graph
appears, there will be a toolbox located along the left side that
will allow the graph to be labeled or modified. This is also possible
using the "Text" and "Graph" options in the menu.
[Figure 16: a. scattergram graph; b. bar graph; c. line graph]
The major tool in the
toolbox that you will find useful is the "I-Beam" located just below
the arrow. Selecting this tool allows text to be added to the graph
at virtually any point. Once selected, the cursor is placed where the
text is desired and clicked. A text box will appear that can be
resized by clicking and dragging on the corner box. The font type and
size can be changed by use of the "Text" menu option.
Any graph or table
created in StatView can be printed by selecting "Print" under the
"File" option. They can also be exported to your Word document by
following the instructions in section VII. B.
D. Conducting an ANOVA
The ANOVA, or Analysis of Variance, procedure is one of the best ways
to determine where any variation seen in your experimental data came
from. Whenever you examine the means of your data, differences will
exist. The question is whether that variation is due to experimental
error, to the way the experiment was conducted or some inherent
characteristic, or to the treatments applied in the experiment.
Obviously, scientists usually want the latter to be responsible for
the variation, as it shows that what they did at some point affected
the experimental organisms.
In StatView, the "Y"
variable needs to be a "category" type column while the "X" variable
does not. Once the variables have been set, "Anova..." is selected
from the "Compare" menu shown to the left. The first ANOVA window
allows the confidence level to be set. This should usually be at
either 95% or 99% in science experiments. The next window will
contain several "pages" which can be reached by use of the scroll bar
along the right side of the window.
The first "page" in the window will be the ANOVA table shown in
figure 17, which gives the calculations for the possible experimental
treatment variation (Between groups) and for the possible experimental
error variation (Within groups). The important number here is the
first one listed under the "F-test" column. This is the "calculated F
value" and it needs to be greater than the "tabular F value" found in
a statistics book. The tabular value can be found using such a book's
"F values table", the DF numbers provided in the "DF" column of the
table and the confidence level originally set for the ANOVA.
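The arithmetic behind a one-way ANOVA table can be sketched in a few lines of Python. This is an illustration of the textbook calculation, not StatView's own code; the function name `one_way_anova`, the group names and the data are all hypothetical. The between-groups row measures how far each group mean sits from the grand mean, and the within-groups row measures the scatter of observations around their own group mean.

```python
from statistics import mean

# Hypothetical data: three treatment groups, three observations each.
groups = {
    "control": [4.0, 5.0, 6.0],
    "low dose": [6.0, 7.0, 8.0],
    "high dose": [9.0, 10.0, 11.0],
}

def one_way_anova(groups):
    """Return (df_between, df_within, ms_between, ms_within, f_value)."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = mean(all_values)

    # Between groups: spread of the group means around the grand mean.
    ss_between = sum(
        len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
    )
    # Within groups: scatter of each observation around its own group mean.
    ss_within = sum(
        (v - mean(vals)) ** 2 for vals in groups.values() for v in vals
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return df_between, df_within, ms_between, ms_within, ms_between / ms_within

df_b, df_w, ms_b, ms_w, f_value = one_way_anova(groups)
print(f"DF = ({df_b}, {df_w}), calculated F = {f_value:.2f}")
```

The two DF values printed here are the ones used to look up the tabular F value in a statistics book.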
If the "calculated F" is greater than the "tabular F" then the
variation seen is due to the experimental treatments applied. However,
if the "calculated F" is less than the "tabular F" then the variation
seen is due to the experimental error inherent in the experiment.
Although you will need to provide the "tabular F value" in your
write-up, StatView simplifies this process somewhat by providing the
p value under the "calculated F value". This p value is the
probability, expressed as a decimal, of obtaining an F value at least
this large if the treatments had no effect. Therefore, if the p value
is below .0500, then the variation is significant at the 95%
confidence level and due to the treatments. If it is above that, then
we must assume that the variation is due to experimental error.
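The comparison of the two F values can be written as a very small Python sketch. The function name `f_test_is_significant` and the calculated F of 19.00 are hypothetical; 5.14 is the 95% tabular F for DF of 2 and 6 as read from a standard F table.

```python
def f_test_is_significant(calculated_f, tabular_f):
    """True when the calculated F exceeds the tabular (critical) F."""
    return calculated_f > tabular_f

# Hypothetical ANOVA result with DF of 2 (between) and 6 (within).
calculated_f = 19.00
# 95% tabular F for DF (2, 6), read from a standard F table.
tabular_f = 5.14

if f_test_is_significant(calculated_f, tabular_f):
    print("Variation is attributed to the treatments.")
else:
    print("Variation is attributed to experimental error.")
```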
The second "page"
provides a useful way to gather basic statistical data on the data
used in the ANOVA.
The last "page" or
"pages", shown in figure 19, will contain the results of various
statistical procedures designed to determine which treatments are
different from other treatments. While this information is invalid if
the ANOVA indicates the variation is due to experimental error, it is
very useful if the variation is shown to be due to the treatments. At
that point the question becomes "Which treatment is significantly
better, or worse, than another?" This is shown by the presence of any
asterisks under the "Fisher PLSD", "Scheffe F-test" or "Dunnett t"
columns. An asterisk indicates that the particular test shows that
comparison to be significant. By comparing the treatment
means with the information provided in this view, a hierarchy of
treatments can be created.
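As a rough illustration of how such a pairwise comparison works, here is a Python sketch of the textbook least significant difference calculation that underlies Fisher's PLSD. It is an assumption that StatView computes its column the same way; the treatment names, means and mean square are hypothetical, and the t value is read from a t table rather than calculated.

```python
from itertools import combinations
from math import sqrt

# Hypothetical inputs: treatment means, observations per treatment, and the
# within-groups mean square from an ANOVA table with 6 within-groups DF.
means = {"control": 5.0, "low dose": 6.0, "high dose": 10.0}
n_per_group = 3
ms_within = 1.0
# Two-tailed 95% t value for 6 degrees of freedom, read from a t table.
t_critical = 2.447

# Least significant difference for two groups of equal size.
lsd = t_critical * sqrt(ms_within * (2 / n_per_group))

significant = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    significant[(a, b)] = diff > lsd
    flag = "*" if significant[(a, b)] else ""
    print(f"{a} vs. {b}: difference = {diff:.2f} {flag}")
```

In this made-up case the high dose differs from both other treatments, while the control and low dose cannot be separated, which gives the hierarchy of treatments.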
E. Conducting a Regression or Correlation
Other frequently used
comparisons, again selected under the "Compare" menu item, are
"Correlation...", "Regression..." and "Stepwise Regression...". These
allow you to determine if two variables are related or correlated in
any way and the degree to which they are. While the procedures differ,
the calculations are fairly similar, so the "Regression..." procedure
will be used here. The others are more
powerful and provide additional choices to increase the
sophistication of the analysis.
Once the two variables
have been selected, one "X" and one "Y", select "Regression..." under
the "Compare" menu and the "Regression..." window will appear. At
this point a number of options are available. Once any options have
been changed, click on "Done" and the regression table shown below in
figure 20 will appear.
This presents a number of results, the most important one being the
"R-squared" number which indicates the degree of correlation. The
closer this is to 1.0 the greater the correlation between the two
variables. In this case, .942 indicates a high degree of correlation.
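To see where an "R-squared" number comes from, the following Python sketch computes it for a simple one-X, one-Y regression. The data and the function name `r_squared` are hypothetical and are not the values behind figure 20.

```python
from statistics import mean

def r_squared(xs, ys):
    """R-squared for a straight-line fit of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # Square of the correlation coefficient between the two columns.
    return (s_xy ** 2) / (s_xx * s_yy)

# Hypothetical X and Y columns.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

result = r_squared(x_values, y_values)
print(f"R-squared = {result:.3f}")
```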
This relationship can be
visualized by going under "View" and selecting "Scattergram" which
will create a graph as shown in figure 21 showing the relationship
and plotting a "best fit" line that is based upon the equation shown
at the top of the graph. This equation is also useful for determining
the predicted values of "Y" by plugging in any "X" desired. This
allows a "Y" not in the original experiment to be determined.
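The prediction step can be sketched in Python as follows. The sketch fits a least-squares line and then plugs in a new "X"; the data and the names `fit_line` and `predict` are hypothetical, and the method is the ordinary textbook fit, which is assumed (not confirmed by this manual) to match the line StatView draws.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted Y for a given X using the fitted equation."""
    return intercept + slope * x

# Hypothetical X and Y columns.
x_values = [1.0, 2.0, 3.0, 4.0]
y_values = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(x_values, y_values)
print(f"Y = {intercept:.2f} + {slope:.2f} * X")
print(f"Predicted Y at X = 6: {predict(6.0, slope, intercept):.2f}")
```

Predictions for an "X" far outside the range of the original data should be treated with caution, since the fitted line is only supported by the values actually measured.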