Statistical Hypothesis Testing with STATISTICA
Elementary Concepts in Statistics
Probability Distribution Calculator – Overview
Alternative Arrangement of Data
More Complex Group Comparisons
T-Test for Independent Samples by Groups Dialog
T-Test for Independent Samples by Variables Dialog
More Complex Group Comparisons
T-Test for Dependent Samples Dialog
T-Test for Single Means Dialog
Example: t-Tests, Descriptive Statistics
Elementary Concepts in Statistics
Overview of Elementary Concepts in Statistics. In this introduction, we will briefly discuss those elementary statistical concepts that provide the necessary foundations for more specialized expertise in any area of statistical data analysis. The selected topics illustrate the basic assumptions of most statistical methods and/or have been demonstrated in research to be necessary components of one’s general understanding of the “quantitative nature” of reality (Nisbett, et al., 1987).
Because of space limitations, we will focus mostly on the functional aspects of the concepts discussed and the presentation will be very short. Further information on each of those concepts can be found in the Introductory Overviews and Examples of this Electronic Manual and in statistical textbooks. Recommended introductory textbooks are: Kachigan (1986), and Runyon and Haber (1976); for a more advanced discussion of elementary theory and assumptions of statistics, see the classic books by Hays (1988), and Kendall and Stuart (1979).
What are variables? Variables are things that we measure, control, or manipulate in research. They differ in many respects, most notably in the role they are given in our research and in the type of measures that can be applied to them.
Correlational vs. experimental research. Most empirical research belongs clearly to one of these two general categories. In correlational research we do not (or at least try not to) influence any variables but only measure them and look for relations (correlations) between some set of variables, such as blood pressure and cholesterol level. In experimental research, we manipulate some variables and then measure the effects of this manipulation on other variables; for example, a researcher might artificially increase blood pressure and then record cholesterol level. Data analysis in experimental research also comes down to calculating “correlations” between variables, specifically, those manipulated and those affected by the manipulation. However, experimental data may potentially provide qualitatively better information. Only experimental data can conclusively demonstrate causal relations between variables. For example, if we found that whenever we change variable A then variable B changes, we can conclude that “A influences B.” Data from correlational research can only be “interpreted” in causal terms based on some theories that we have, but correlational data cannot conclusively prove causality.
Dependent vs. independent variables. Independent variables are those that are manipulated whereas dependent variables are only measured or registered. This distinction appears terminologically confusing to many because, as some students say, “all variables depend on something.” However, once you get used to this distinction, it becomes indispensable. The terms dependent and independent variable apply mostly to experimental research where some variables are manipulated, and in this sense they are “independent” from the initial reaction patterns, features, intentions, etc. of the subjects. Some other variables are expected to be “dependent” on the manipulation or experimental conditions. That is to say, they depend on “what the subject will do” in response. Somewhat contrary to the nature of this distinction, these terms are also used in studies where we do not literally manipulate independent variables, but only assign subjects to “experimental groups” based on some pre-existing properties of the subjects. For example, if in an experiment, males are compared with females regarding their white cell count (WCC), Gender could be called the independent variable and WCC the dependent variable.
Measurement scales. Variables differ in “how well” they can be measured, i.e., in how much measurable information their measurement scale can provide. There is obviously some measurement error involved in every measurement, which determines the “amount of information” that we can obtain. Another factor that determines the amount of information that can be provided by a variable is its “type of measurement scale.” Specifically, variables are classified as a) nominal, b) ordinal, c) interval, or d) ratio.
a. Nominal variables allow for only qualitative classification. That is, they can be measured only in terms of whether the individual items belong to some distinctively different categories, but we cannot quantify or even rank order those categories. For example, all we can say is that 2 individuals are different in terms of variable A (e.g., they are of different race), but we cannot say which one “has more” of the quality represented by the variable. Typical examples of nominal variables are gender, race, color, and city.
b. With ordinal variables, we can rank order the items we measure in terms of which has less and which has more of the quality represented by the variable, but still we cannot say “how much more.” A typical example of an ordinal variable is the socioeconomic status of families. For example, we know that upper-middle is higher than middle but we cannot say that it is, for example, 18% higher. Also, this very distinction between nominal, ordinal, and interval scales itself represents a good example of an ordinal variable. For example, we can say that nominal measurement provides less information than ordinal measurement, but we cannot say “how much less” or how this difference compares to the difference between ordinal and interval scales.
c. With interval variables, we can not only rank order the items that are measured, but also quantify and compare the sizes of differences between them. For example, temperature, as measured in degrees Fahrenheit or Celsius, constitutes an interval scale. We can say that a temperature of 40 degrees is higher than a temperature of 30 degrees, and that an increase from 20 to 40 degrees is twice as much as an increase from 30 to 40 degrees.
d. Ratio variables are very similar to interval variables; in addition to all the properties of interval variables, they feature an identifiable absolute zero point, thus they allow for statements such as x is two times more than y. Typical examples of ratio scales are measures of time or space. For example, as the Kelvin temperature scale is a ratio scale, not only can we say that a temperature of 200 degrees is higher than one of 100 degrees, we can correctly state that it is twice as high. Interval scales do not have the ratio property. Most statistical data analysis procedures do not distinguish between the interval and ratio properties of the measurement scales.
Relations between variables. Regardless of their type, two or more variables are related if in a sample of observations, the values of those variables are distributed in a consistent manner. In other words, variables are related if their values systematically correspond to each other for these observations. For example, Gender and WCC would be considered to be related if most males had high WCC and most females low WCC, or vice versa; Height is related to Weight because typically tall individuals are heavier than short ones; IQ is related to the number of errors in a test, if people with higher IQs make fewer errors.
Why relations between variables are important. Generally speaking, the ultimate goal of every research or scientific analysis is finding relations between variables. The philosophy of science teaches us that there is no other way of representing “meaning” except in terms of relations between some quantities or qualities; either way involves relations between variables. Thus, the advancement of science must always involve finding new relations between variables. Correlational research involves measuring such relations in the most straightforward manner. However, experimental research is not any different in this respect. For example, the above mentioned experiment comparing WCC in males and females can be described as looking for a correlation between two variables – Gender and WCC. Statistics does nothing else but help us evaluate relations between variables. Actually, all of the hundreds of procedures that are described in this manual can be interpreted in terms of evaluating various kinds of inter-variable relations.
Two basic features of every relation between variables. The two most elementary formal properties of every relation between variables are the relation’s a) magnitude (or “size”) and b) its reliability (or “truthfulness”).
a. Magnitude (or “size”). The magnitude is much easier to understand and measure than reliability. For example, if every male in our sample was found to have a higher WCC than any female in the sample, we could say that the magnitude of the relation between the two variables (Gender and WCC) is very high in our sample. In other words, we could predict one based on the other (at least among the members of our sample).
b. Reliability (or “truthfulness”). The reliability of a relation is a much less intuitive concept, but still extremely important. It pertains to the “representativeness” of the result found in our specific sample for the entire population. In other words, it says how probable it is that a similar relation would be found if the experiment was replicated with other samples drawn from the same population. Remember that we are almost never “ultimately” interested only in what is going on in our sample; we are interested in the sample only to the extent it can provide information about the population. If our study meets some specific criteria (to be mentioned later), then the reliability of a relation between variables observed in our sample can be quantitatively estimated and represented using a standard measure (technically called p-level or statistical significance level, see the next paragraph).
What is “statistical significance” (p-level). The statistical significance of a result is an estimated measure of the degree to which it is “true” (in the sense of “representative of the population”). More technically, the value of the p-level (the term first used by Brownlee, 1960) represents a decreasing index of the reliability of a result. The higher the p-level, the less we can believe that the observed relation between variables in the sample is a reliable indicator of the relation between the respective variables in the population. Specifically, the p-level represents the probability of error that is involved in accepting our observed result as valid, that is, as “representative of the population.” For example, a p-level of .05 (i.e., 1/20) indicates that there is a 5% probability that the relation between the variables found in our sample is a “fluke.” In other words, assuming that in the population there was no relation between those variables whatsoever, and we were repeating experiments like ours one after another, we could expect that approximately in every 20 replications of the experiment there would be one in which the relation between the variables in question would be equal or stronger than in ours. In many areas of research, the p-level of .05 is customarily treated as a “border-line acceptable” error level.
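For illustration, the correspondence between a test statistic and its p-level can be sketched with Python's standard library. The z statistic, the two-tailed convention, and the 1.96 value used here are assumptions for the example, not something the paragraph above prescribes:

```python
from statistics import NormalDist

def p_value_two_tailed(z):
    """Probability, under the null hypothesis, of a result at least as
    extreme as |z| in either tail of the standard normal distribution."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# z = 1.96 corresponds to the customary "border-line" .05 level:
print(round(p_value_two_tailed(1.96), 3))  # 0.05
```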
How to determine that a result is “really” significant. There is no way to avoid arbitrariness in the final decision as to what level of significance will be treated as really “significant.” That is, the selection of some level of significance, up to which the results will be rejected as invalid, is arbitrary. In practice, the final decision usually depends on whether the outcome was predicted a priori or only found post hoc in the course of many analyses and comparisons performed on the data set, on the total amount of consistent supportive evidence in the entire data set, and on “traditions” existing in the particular area of research. Typically, in many sciences, results that yield p ≤ .05 are considered borderline statistically significant but remember that this level of significance still involves a pretty high probability of error (5%). Results that are significant at the p ≤ .01 level are commonly considered statistically significant, and p ≤ .005 or p ≤ .001 levels are often called “highly” significant. But remember that those classifications represent nothing else but arbitrary conventions that are only informally based on general research experience.
Statistical significance and the number of analyses performed. Needless to say, the more analyses you perform on a data set, the more results will meet “by chance” the conventional significance level. For example, if you calculate correlations between ten variables (i.e., 45 different correlation coefficients), then you should expect to find by chance that about two (i.e., one in every 20) correlation coefficients are significant at the p ≤ .05 level, even if the values of the variables were totally random and those variables do not correlate in the population. Some statistical methods that involve many comparisons, and thus a good chance for such errors, include some “correction” or adjustment for the total number of comparisons. However, many statistical methods (especially simple exploratory data analyses) do not offer any straightforward remedies to this problem. Therefore, it is up to the researcher to carefully evaluate the reliability of unexpected findings. Many examples in this Electronic Manual offer specific advice on how to do this; relevant information can also be found in most research methods textbooks.
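The arithmetic behind this expectation is straightforward; a minimal sketch follows (the Bonferroni-style adjustment at the end is one common correction, added here for illustration, not prescribed by the text above):

```python
from math import comb

n_vars, alpha = 10, 0.05
n_tests = comb(n_vars, 2)           # 45 distinct correlation coefficients
expected_flukes = n_tests * alpha   # expected "by chance" significances under no relation
print(n_tests, round(expected_flukes, 2))  # 45 2.25

# One common adjustment (Bonferroni): require p <= alpha / n_tests
# for each individual test to keep the overall error level near alpha.
print(round(alpha / n_tests, 4))    # 0.0011
```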
Strength vs. reliability of a relation between variables. We said before that strength and reliability are two different features of relationships between variables. However, they are not totally independent. In general, in a sample of a particular size, the larger the magnitude of the relation between variables, the more reliable the relation (see the next paragraph).
Why stronger relations between variables are more significant. Assuming that there is no relation between the respective variables in the population, the most likely outcome would be also finding no relation between those variables in the research sample. Thus, the stronger the relation found in the sample, the less likely it is that there is no corresponding relation in the population. As you see, the magnitude and significance of a relation appear to be closely related, and we could calculate the significance from the magnitude and vice-versa; however, this is true only if the sample size is kept constant, because the relation of a given strength could be either highly significant or not significant at all, depending on the sample size (see the next paragraph).
Why significance of a relation between variables depends on the size of the sample. If there are very few observations, then there are also respectively few possible combinations of the values of the variables, and thus the probability of obtaining by chance a combination of those values indicative of a strong relation is relatively high. Consider the following illustration. If we are interested in two variables (Gender: male/female and WCC: high/low) and there are only four subjects in our sample (two males and two females), then the probability that we will find, purely by chance, a 100% relation between the two variables can be as high as one-eighth. Specifically, there is a one-in-eight chance that both males will have a high WCC and both females a low WCC, or vice versa. Now consider the probability of obtaining such a perfect match by chance if our sample consisted of 100 subjects; the probability of obtaining such an outcome by chance would be practically zero. Let’s look at a more general example. Imagine a theoretical population in which the average value of WCC in males and females is exactly the same. Needless to say, if we start replicating a simple experiment by drawing pairs of samples (of males and females) of a particular size from this population and calculating the difference between the average WCC in each pair of samples, most of the experiments will yield results close to 0. However, from time to time, a pair of samples will be drawn where the difference between males and females will be quite different from 0. How often will it happen? The smaller the sample size in each experiment, the more likely it is that we will obtain such erroneous results, which in this case would be results indicative of the existence of a relation between gender and WCC obtained from a population in which such a relation does not exist.
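The one-in-eight figure can be checked by brute-force enumeration; this sketch assumes each subject is independently “high” or “low” WCC with probability 1/2:

```python
from itertools import product
from fractions import Fraction

# Each of the 4 subjects (2 males, 2 females) is independently
# "high" (1) or "low" (0) on WCC with probability 1/2.
perfect = total = 0
for m1, m2, f1, f2 in product([0, 1], repeat=4):
    total += 1
    # a "100% relation": both males on one level, both females on the other
    if m1 == m2 and f1 == f2 and m1 != f1:
        perfect += 1

print(perfect, total, Fraction(perfect, total))  # 2 16 1/8
```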
Example: “Baby boys to baby girls ratio.” Consider the following example from research on statistical reasoning (Nisbett, et al., 1987). There are two hospitals; in the first one, 120 babies are born every day, in the other, only 12. On average, the ratio of baby boys to baby girls born every day in each hospital is 50/50. However, one day, in one of those hospitals twice as many baby girls were born as baby boys. In which hospital was it more likely to happen? The answer is obvious for a statistician, but as research shows, not so obvious for a lay person. It is much more likely to happen in the small hospital. The reason for this is that technically speaking, the probability of a random deviation of a particular size (from the population mean) decreases with the increase in the sample size.
Why small relations can be proven significant only in large samples. The examples in the previous paragraphs indicate that if a relationship between variables in question is “objectively” (i.e., in the population) small, then there is no way to identify such a relation in a study unless the research sample is correspondingly large. Even if our sample is in fact “perfectly representative” the effect will not be statistically significant if the sample is small. Analogously, if a relation in question is “objectively” very large (i.e., in the population), then it can be found to be highly significant even in a study based on a very small sample. Consider the following additional illustration. If a coin is slightly asymmetrical, and when tossed is somewhat more likely to produce heads than tails (e.g., 60% vs. 40%), then 10 tosses would not be sufficient to convince anyone that the coin is asymmetrical, even if the outcome obtained (six heads and four tails) was perfectly representative of the bias of the coin. However, is it so that 10 tosses is not enough to prove anything? No, if the effect in question were large enough, then 10 tosses could be quite enough. For instance, imagine now that the coin is so asymmetrical that no matter how you toss it, the outcome will be heads. If you tossed such a coin ten times and each toss produced heads, most people would consider it sufficient evidence that something is “wrong” with the coin. In other words, it would be considered convincing evidence that in the theoretical population of an infinite number of tosses of this coin there would be more heads than tails. Thus, if a relation is large, then it can be found to be significant even in a small sample.
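The coin-tossing argument can be made exact with binomial probabilities; a minimal sketch (the tail-probability function is an added illustration, not taken from the text):

```python
from math import comb

def binom_tail(n, k, p):
    """P(at least k heads in n tosses when P(heads) = p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 6 heads in 10 tosses of a FAIR coin is unremarkable...
print(round(binom_tail(10, 6, 0.5), 3))   # 0.377
# ...but 10 heads out of 10 is very unlikely if the coin is fair.
print(round(binom_tail(10, 10, 0.5), 4))  # 0.001
```

So the 60/40-biased coin's "perfectly representative" outcome (6 of 10) is fully compatible with fairness, while the all-heads outcome is not.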
Can “no relation” be a significant result? The smaller the relation between variables, the larger the sample size that is necessary to prove it significant. For example, imagine how many tosses would be necessary to prove that a coin is asymmetrical if its bias were only .000001%. Thus, the necessary minimum sample size increases as the magnitude of the effect to be demonstrated decreases. When the magnitude of the effect approaches 0, the necessary sample size to conclusively prove it approaches infinity. That is to say, if there is almost no relation between two variables, then the sample size must be almost equal to the population size, which is assumed to be infinitely large. Statistical significance represents the probability that a similar outcome would be obtained if we tested the entire population. Thus, everything that would be found after testing the entire population would be, by definition, significant at the highest possible level, and this also includes all “no relation” results.
How to measure the magnitude (strength) of relations between variables. Statisticians have developed very many measures of the magnitude of relationships between variables; the choice of a specific measure in given circumstances depends on the number of variables involved, measurement scales used, nature of the relations, etc. Almost all of them, however, follow one general principle: they attempt to somehow evaluate the observed relation by comparing it to the “maximum imaginable relation” between those specific variables. Technically speaking, a common way to perform such evaluations is to look at how differentiated the values of the variables are, and then calculate what part of this “overall available differentiation” is accounted for by instances when that differentiation is “common” in the two (or more) variables in question. Speaking less technically, we compare “what is common in those variables” to “what potentially could have been common if the variables were perfectly related.” Let us consider a simple illustration. Let us say that in our sample, the average index of WCC is 100 in males and 102 in females. Thus, we could say that on average, the deviation of each individual score from the grand mean (101) contains a component due to the gender of the subject; the size of this component is 1. That value, in a sense, represents some measure of relation between Gender and WCC. However, this value is a very poor measure, because it does not tell us how relatively large this component is, given the “overall differentiation” of WCC scores. Consider two extreme possibilities:
a. If all WCC scores of males were equal exactly to 100, and those of females equal to 102, then all deviations from the grand mean in our sample would be entirely accounted for by Gender. We would say that in our sample, Gender is perfectly correlated with WCC, that is, 100% of the observed differences between subjects regarding their WCC is accounted for by their Gender.
b. If WCC scores were in the range of 0-1000, the same difference (of 2) between the average WCC of males and females found in the study would account for such a small part of the overall differentiation of scores that most likely it would be considered negligible. For example, one more subject taken into account could change, or even reverse the direction of the difference. Therefore, every good measure of relations between variables must take into account the overall differentiation of individual scores in the sample and evaluate the relation in terms of (relatively) how much of this differentiation is accounted for by the relation in question.
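The “explained vs. total differentiation” idea in (a) and (b) can be sketched numerically; the second data set below is hypothetical, chosen only to mimic the wide 0-1000 spread described in (b):

```python
from statistics import mean

def explained_ratio(groups):
    """Share of the total variation (squared deviations from the grand
    mean) that is accounted for by group membership."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    total = sum((x - grand) ** 2 for x in scores)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    return between / total

# Case (a): no within-group variation -> Gender accounts for 100%.
print(explained_ratio([[100, 100], [102, 102]]))  # 1.0
# Case (b): the same 2-point group difference amid a huge spread of
# individual scores accounts for almost nothing.
print(explained_ratio([[60, 140], [10, 194]]) < 0.001)  # True
```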
Common “general format” of most statistical tests. Because the ultimate goal of most statistical tests is to evaluate relations between variables, most statistical tests follow the general format that was explained in the previous paragraph. Technically speaking, they represent a ratio of some measure of the differentiation common in the variables in question to the overall differentiation of those variables. For example, they represent a ratio of the part of the overall differentiation of the WCC scores that can be accounted for by gender to the overall differentiation of the WCC scores. This ratio is usually called a ratio of explained variation to total variation. In statistics, the term explained variation does not necessarily imply that we “conceptually understand” it. It is used only to denote the common variation in the variables in question, that is, the part of variation in one variable that is “explained” by the specific values of the other variable, and vice versa.
How the “level of statistical significance” is calculated. Let us assume that we have already calculated a measure of a relation between two variables (as explained above). The next question is “how significant is this relation?” For example, is 40% of the explained variance between the two variables enough to consider the relation significant? The answer is “it depends.” Specifically, the significance depends mostly on the sample size. As explained before, in very large samples, even very small relations between variables will be significant, whereas in very small samples even very large relations cannot be considered reliable (significant). Thus, in order to determine the level of statistical significance, we need a function that represents the relationship between “magnitude” and “significance” of relations between two variables, depending on the sample size. The function we need would tell us exactly “how likely it is to obtain a relation of a given magnitude (or larger) from a sample of a given size, assuming that there is no such relation between those variables in the population.” In other words, that function would give us the significance (p) level, and it would tell us the probability of error involved in rejecting the idea that the relation in question does not exist in the population. This “alternative” hypothesis (that there is no relation in the population) is usually called the null hypothesis. It would be ideal if the probability function was linear, and for example, only had different slopes for different sample sizes. Unfortunately, the function is more complex, and is not always exactly the same; however, in most cases we know its shape and can use it to determine the significance levels for our findings in samples of a particular size. Most of those functions are related to a general type of function which is called normal.
Why the “normal distribution” is important. The “normal distribution” is important because in most cases, it well approximates the function that was introduced in the previous paragraph. The distribution of many test statistics is normal or follows some form that can be derived from the normal distribution. In this sense, philosophically speaking, the normal distribution represents one of the empirically verified elementary “truths about the general nature of reality,” and its status can be compared to the one of fundamental laws of natural sciences. The exact shape of the normal distribution (the characteristic “bell curve”) is defined by a function which has only two parameters: mean and standard deviation.
A characteristic property of the normal distribution is that 68% of all of its observations fall within a range of ±1 standard deviation from the mean, and a range of ±2 standard deviations includes 95% of the scores. In other words, in a normal distribution, observations that have a standardized value of less than -2 or more than +2 have a relative frequency of 5% or less. (Standardized value means that a value is expressed in terms of its difference from the mean, divided by the standard deviation). You can explore the exact values of probability associated with different values in the normal distribution using the Probability calculator in Basic Statistics; for example, if you enter the Z value (i.e., standardized value) of 4, the associated probability computed by STATISTICA will be less than .0001, because in the normal distribution almost all observations (i.e., more than 99.99%) fall within the range of ±4 standard deviations. The animation below shows the tail area associated with other Z values.

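The 68%/95% figures and the Z = 4 example above can be reproduced with Python's standard library (shown here purely as an illustration; the Probability calculator in STATISTICA performs the same computation interactively):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal distribution (mean 0, SD 1)

def within(z):
    """Probability that an observation falls within +/- z standard deviations."""
    return nd.cdf(z) - nd.cdf(-z)

print(round(within(1), 4))  # 0.6827 -> the "68%" property
print(round(within(2), 4))  # 0.9545 -> the "95%" property
print(within(4) > 0.9999)   # True: a Z of 4 leaves less than .0001 outside
```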
Illustration of how the normal distribution is used in statistical reasoning (induction). Recall the example discussed above, where pairs of samples of males and females were drawn from a population in which the average value of WCC in males and females was exactly the same. Although the most likely outcome of such experiments (one pair of samples per experiment) was that the difference between the average WCC in males and females in each pair is close to zero, from time to time, a pair of samples will be drawn where the difference between males and females is quite different from 0. How often does it happen? If the sample size is large enough, the results of such replications are “normally distributed,” and thus knowing the shape of the normal curve, we can precisely calculate the probability of obtaining “by chance” outcomes representing various levels of deviation from the hypothetical population mean of 0. If such a calculated probability is so low that it meets the previously accepted criterion of statistical significance, then we have only one choice: conclude that our result gives a better approximation of what is going on in the population than the “null hypothesis.” Remember that the null hypothesis was considered only for “technical reasons” as a benchmark against which our empirical result was evaluated.
Are all test statistics normally distributed? Not all, but most of them are either based on the normal distribution directly or on distributions that are related to, and can be derived from normal, such as t, F, or Chi-square. Typically, those tests require that the variables analyzed are themselves normally distributed in the population, that is, they meet the so-called “normality assumption.” Many observed variables actually are normally distributed, which is another reason why the normal distribution represents a “general feature” of empirical reality. The problem may occur when one tries to use a normal distribution-based test to analyze data from variables that are themselves not normally distributed (see tests of normality in Nonparametrics or Basic Statistics). In such cases we have two general choices. First, we can use some alternative “nonparametric” test (or so-called “distribution-free test”); but this is often inconvenient because such tests are typically less powerful and less flexible in terms of types of conclusions that they can provide. Alternatively, in many cases we can still use the normal distribution-based test if we only make sure that the size of our samples is large enough. The latter option is based on an extremely important principle which is largely responsible for the popularity of tests that are based on the normal function. Namely, as the sample size increases, the shape of the sampling distribution (i.e., distribution of a statistic from the sample; this term was first used by Fisher, 1928a) approaches normal shape, even if the distribution of the variable in question is not normal. This principle is illustrated in the following animation showing a series of sampling distributions (created with gradually increasing sample sizes of: 2, 5, 10, 15, and 30) using a variable that is clearly non-normal in the population, that is, the distribution of its values is clearly skewed.

However, as the sample size (of samples used to create the sampling distribution of the mean) increases, the shape of the sampling distribution becomes normal. Note that for n=30, the shape of that distribution is “almost” perfectly normal (see the close match of the fit). This principle is called the central limit theorem (this term was first used by Pólya, 1920; German, “Zentraler Grenzwertsatz”).
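The behavior shown in the animation can be reproduced with a small simulation; the exponential population and the replication count below are assumptions chosen only for illustration of the central limit theorem:

```python
import random
from statistics import mean, stdev

random.seed(0)  # for reproducibility

def sampling_means(n, reps=5000):
    """Means of `reps` samples of size n drawn from a clearly skewed
    population (exponential, population mean 1, population SD 1)."""
    return [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

for n in (2, 5, 10, 15, 30):
    means = sampling_means(n)
    # The center stays at the population mean while the spread shrinks
    # (roughly as 1/sqrt(n)) and the shape grows more symmetric/normal.
    print(n, round(mean(means), 2), round(stdev(means), 2))
```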
How do we know the consequences of violating the normality assumption? Although many of the statements made in the preceding paragraphs can be proven mathematically, some of them do not have theoretical proofs and can be demonstrated only empirically, via so-called Monte Carlo experiments. In these experiments, large numbers of samples are generated by a computer following pre-designed specifications and the results from such samples are analyzed using a variety of tests. This way we can empirically evaluate the type and magnitude of errors or biases to which we are exposed when certain theoretical assumptions of the tests we are using are not met by our data. Specifically, Monte Carlo studies were used extensively with normal distribution-based tests to determine how sensitive they are to violations of the assumption of normal distribution of the analyzed variables in the population. The general conclusion from these studies is that the consequences of such violations are less severe than previously thought. Although these conclusions should not entirely discourage anyone from being concerned about the normality assumption, they have increased the overall popularity of the distribution-dependent statistical tests in all areas of research.
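A minimal Monte Carlo experiment of the kind described might look like the following; the skewed population, the z-test, and the replication count are all assumptions made for this sketch, not a reconstruction of the studies cited above:

```python
import random
from statistics import NormalDist, mean

random.seed(1)
CRIT = NormalDist().inv_cdf(0.975)  # two-tailed 5% critical value (~1.96)

def false_positive_rate(n, reps=4000):
    """Monte Carlo: draw samples from a skewed population whose true mean
    is 1, apply a normal-theory z-test of H0: mean = 1 (known SD = 1),
    and record how often the test rejects this true null hypothesis."""
    rejections = 0
    for _ in range(reps):
        xs = [random.expovariate(1.0) for _ in range(n)]
        z = (mean(xs) - 1.0) * n ** 0.5  # (mean - mu) / (sigma / sqrt(n)), sigma = 1
        if abs(z) > CRIT:
            rejections += 1
    return rejections / reps

# With a true null, the empirical rejection rate estimates the test's
# actual error level; mild skewness at n = 30 keeps it roughly near .05.
print(round(false_positive_rate(30), 3))
```

Comparing the empirical rate to the nominal 5% is precisely the kind of evidence such studies use to judge how robust a test is to the normality violation.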
Probability Distribution Calculator – Overview
The Probability Distribution Calculator is a facility that allows you to compute critical values for a specified distribution based on user-specified parameters or degrees of freedom, or to compute significance levels for the distribution. One of the unique features of this calculator is the interactive graph icons, which display the density function and cumulative distribution function for the specified distribution based on the respective parameters. You can display those graphs in the standard (customizable) graph window and print them via the Create Graph option. This calculator is accessible via the Probability Calculator option on the Statistics menu and via the Basic Statistics and Tables Startup Panel.
Note that you can change the values of the parameters by either manually editing them in the respective edit field (you will need to click on the Compute button to complete the calculations) or by using the microscrolls to incrementally change the values in the edit fields. When you use the microscrolls to change a value, STATISTICA will automatically recompute the other values. See also the Probability Distribution Calculator dialog.
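For readers who want to verify the calculator's results independently, equivalent computations are available in scipy's distribution objects (shown here purely for illustration; the specific degrees of freedom are arbitrary):

```python
from scipy import stats

# Critical value: the t value that cuts off 5% in the upper tail, df = 20
t_crit = stats.t.ppf(0.95, df=20)

# Significance level (p-value): upper-tail probability of t = 2.086, df = 20
p = stats.t.sf(2.086, df=20)

# The same round trip for the F and Chi-square distributions
f_crit = stats.f.ppf(0.95, dfn=1, dfd=20)
chi2_crit = stats.chi2.ppf(0.95, df=1)
print(t_crit, p, f_crit, chi2_crit)
```

Note that the F critical value with 1 and 20 degrees of freedom equals the square of the two-tailed t critical value for 20 degrees of freedom, reflecting the relation between the two distributions.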
Independent t-tests
Introductory Overview
The t-test is the most commonly used method to evaluate the differences in means between two groups. For example, the t-test can be used to test for a difference in test scores between a group of patients who were given a drug and a control group who received a placebo. Theoretically, the t-test can be used even if the sample sizes are very small (e.g., as small as 10; some researchers claim that even smaller n’s are possible), as long as the variables are normally distributed within each group and the variation of scores in the two groups is not reliably different (see also Elementary concepts). As mentioned before, the normality assumption can be evaluated by looking at the distribution of the data (via histograms) or by performing a normality test (via the descriptive statistics option). The equality of variances assumption can be verified with the F-test (which is included in the t-test output), or you can use the more robust Levene test option (as well as the Brown-Forsythe modification of this test). If these conditions are not met, then you can evaluate the differences in means between two groups using one of the nonparametric alternatives to the t-test (see Nonparametric Statistics).
The p-level reported with a t-test represents the probability of error involved in accepting our research hypothesis about the existence of a difference. Technically speaking, this is the probability of error associated with rejecting the hypothesis of no difference between the two categories of observations (corresponding to the groups) in the population when, in fact, the hypothesis is true. Some researchers suggest that if the difference is in the predicted direction, you can consider only one half (one “tail”) of the probability distribution and thus divide the standard p-level reported with a t-test (a “two-tailed” probability) by two. Others, however, suggest that you should always report the standard, two-tailed t-test probability.
Arrangement of Data
In order to perform the t-test for independent samples, one independent (grouping) variable (e.g., Gender) and at least one dependent variable (e.g., a test score) are required. The means of the dependent variable will be compared between selected groups based on the specified values (grouping codes, e.g., male and female) of the independent variable. The following data set can be analyzed with a t-test comparing the average WCC score in males and females:
           GENDER    WCC
case 1     male      111
case 2     male      110
case 3     male      109
case 4     female    102
case 5     female    104

mean WCC in males = 110
mean WCC in females = 103
If you specified a list of dependent variables, then a series of t-tests will be performed (one for each dependent variable).
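For comparison, the same analysis of the example data above can be sketched in Python with scipy (an outside illustration, not a STATISTICA facility; scipy's ttest_ind computes the standard pooled-variance t-test by default):

```python
from scipy import stats

# The example data above, arranged by group
males = [111, 110, 109]
females = [102, 104]

# Pooled-variance t-test for independent samples (df = 3)
t, p = stats.ttest_ind(males, females)
print(f"t = {t:.3f}, two-tailed p = {p:.4f}")
```

Despite the tiny samples, the 7-point mean difference is large relative to the within-group variation, so the test comes out significant.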
Alternative Arrangement of Data
Sometimes, the data are already arranged (e.g., as in a spreadsheet) such that each column or variable in the file represents one group:
           Male    Female
case 1     111     102
case 2     110     104
case 3     109
Note that the Independent t-test option of the Basic Statistics and Tables module can also compute t-tests for data arranged in this manner. However, we should stress that this arrangement is atypical and generally not recommended when creating large data files. Practically all data analysis software, including STATISTICA, assumes the data to be arranged as shown in Arrangement of Data. The earlier arrangement allows us to identify each individual respondent or subject in the data file; thus, when there are multiple dependent variables of interest, various multivariate methods can be applied that rely on the (within-group) correlation matrices of the variables.
More Complex Group Comparisons
It often happens in research practice that you need to compare more than two groups (e.g., drug 1, drug 2, and placebo), or compare groups created by more than one independent variable while controlling for the separate influence of each of them (e.g., Gender, type of Drug, and size of Dose). In these cases, you need to analyze the data using Analysis of Variance, which can be considered a generalization of the t-test. In fact, for two-group comparisons, ANOVA will give results identical to a t-test (t²(df) = F(1, df)). However, when the design is more complex, ANOVA offers numerous advantages that t-tests cannot provide (even if you run a series of t-tests comparing various cells of the design).
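The identity t²(df) = F(1, df) for two groups is easy to verify numerically; the following sketch (Python with scipy, made-up data purely for illustration) runs both tests on the same two groups:

```python
from scipy import stats

# Hypothetical scores for a two-group comparison
drug = [5.1, 4.8, 6.2, 5.5, 5.0]
placebo = [4.2, 4.6, 3.9, 4.4, 4.8]

t, p_t = stats.ttest_ind(drug, placebo)   # pooled-variance t-test
F, p_f = stats.f_oneway(drug, placebo)    # one-way ANOVA on the same data

# For two groups, t squared equals F, and the p-values coincide
print(t**2, F)
```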
T-Test for Independent Samples by Groups Dialog
Select t-test, independent, by groups on the Basic Statistics and Tables (Startup Panel) – Quick tab to display the T-Test for Independent Samples by Groups dialog. This dialog contains three tabs: Quick, Advanced, and Options. Use the options on these tabs to compute t-tests for independent samples when the data are entered case by case; that is, to let each row in the data file represent one case (e.g., individual, respondent) and to let each column in the data file represent one variable or measurement (e.g., response to a questionnaire item). In this set-up, a grouping variable (e.g., Gender) should be included to denote to which group each case belongs. Use this T-test for Independent Samples – by Groups dialog if the data are arranged precisely in this manner.
If you have entered the data so that each variable represents the responses of one group, you need to use the t-test, independent, by variables option on the Basic Statistics and Tables (Startup Panel) – Quick tab, instead. A general overview covering the t-test for independent samples is provided in the overview section.
Variables. Click the Variables button to display the standard two variable selection dialog. Specify one grouping variable (e.g., Gender) and a list of dependent variables for the comparison. STATISTICA will compute the t-test for all variables in the dependent variable list, comparing the two groups that are identified by the two selected group codes (see below) in the grouping variable.
Code for Group 1; Code for Group 2. Specify the two codes that identify the two groups in the grouping variable that are to be compared in the Code for Group 1 and Code for Group 2 fields. If you are not sure about the codes that were used in the grouping variables to identify the groups, double-click on the edit field (or press the F2 button on your keyboard), and a Variable Code Window will appear containing all integer codes and their alphanumeric equivalents found in the grouping variable in the current data file. You can select a grouping code in this dialog (i.e., transfer it to the edit field) by double-clicking on it.
Summary. Click the Summary button to compute the t-tests for independent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Options tab.
Cancel. Click the Cancel button to close the dialog without performing any analysis and return to the Basic Statistics and Tables Startup Panel.
Options. Click the Options button to display the Options menu.
Select Cases. Click the Select Cases button to display the Analysis/Graph Case Selection Conditions dialog, which is used to create conditions for which cases will be included (or excluded) in the current analysis. More information is available in the case selection conditions’ overview, syntax summary, and dialog description.
W. Click the W (Weight) button to display the Analysis/Graph Case Weights dialog, which is used to adjust the contribution of individual cases to the outcome of the current analysis by “weighting” those cases in proportion to the values of a selected variable.
Weighted moments. Select the Weighted moments check box to specify that each observation contributes the weighting variable’s value for that observation. The weight values need not be integers. This module can use fractional case weights in most computations. Some other modules use case weights as integer case multipliers or frequency values. This option is available only after you have defined a weight variable via the W option described above.
DF = W-1 and N-1 options. When the Weighted moments check box is selected, some statistics related to the moments (e.g., standard deviations and variances, skewness, kurtosis) can be based on the sum of the weight values for the weighting variable (W-1), or on the number of (unweighted) observations (N-1). The sums (and means), and sums of squares and cross products will always be based on the weighted values of the respective observations. However, in computations requiring the degrees of freedom (e.g., standard deviation, t-test, etc.), the value for the degrees of freedom can either be computed as the sum of the weight values minus one, or as the number of observations minus one. Moment statistics (except for the mean) are based on the sum of the weight values for the weighting variable if the W-1 option button is selected, and are based on the number of (unweighted) observations if the N-1 option button is selected. When the Weighted moments check box is selected, several graphics options will not be available. For more information on options for using integer case weights, see also Selecting a weighting variable.
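The distinction between the two degrees-of-freedom choices can be sketched numerically (Python with numpy, purely for illustration; the data and fractional weights are hypothetical):

```python
import numpy as np

x = np.array([10.0, 12.0, 11.0, 13.0])   # observations
w = np.array([1.5, 2.0, 1.0, 0.5])       # fractional case weights

mean_w = np.sum(w * x) / np.sum(w)       # weighted mean (always weighted)
ss = np.sum(w * (x - mean_w) ** 2)       # weighted sum of squares

var_w1 = ss / (np.sum(w) - 1)   # DF = W-1: sum of the weight values minus one
var_n1 = ss / (len(x) - 1)      # DF = N-1: number of observations minus one
print(mean_w, var_w1, var_n1)
```

Both variants use the same weighted sums; only the denominator (degrees of freedom) differs, so the two variances diverge whenever the sum of the weights differs from the number of cases.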
MD deletion. If Casewise deletion of missing data is selected, then STATISTICA will ignore all cases that have missing data for any of the variables selected in the list. If Pairwise deletion of missing data is selected, then all valid data points will be included in the analyses for the respective variables (resulting possibly in unequal valid N per variable).
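The difference between the two deletion rules can be sketched as follows (Python with numpy, hypothetical data; missing values are represented as NaN):

```python
import numpy as np

# Two variables (columns) with missing data encoded as NaN
data = np.array([[1.0,    2.0],
                 [2.0,    np.nan],
                 [np.nan, 4.0],
                 [4.0,    5.0]])

# Casewise deletion: drop every row that contains any missing value
complete = data[~np.isnan(data).any(axis=1)]
print("casewise N =", len(complete))

# Pairwise deletion: each variable keeps all of its own valid values,
# resulting possibly in unequal valid N per variable
valid_n = (~np.isnan(data)).sum(axis=0)
print("pairwise N per variable =", valid_n)
```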
Note: STATISTICA Power Analysis. Note that the STATISTICA Power Analysis program is designed to allow you to compute statistical power and estimate required sample size while planning experiments, and to evaluate experimental effects in your existing data. You will find many features in this module designed to allow you to perform these calculations quickly and effectively in a wide variety of data analysis situations (including tests for zero correlation and comparing two independent correlations).
Quick Tab
Select the Quick tab of the T-Test for Independent Samples by Groups dialog to access options to quickly review the results of an independent t-test. For more advanced results, use the Advanced tab.
Summary: T-tests. Click the Summary:T-tests button to compute the t-tests for independent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Options tab.
Box & whisker plots. Click the Box & whisker plots button to produce a cascade of box and whisker plots for the dependent variables; one box plot for each variable will be produced. Box and whisker plots summarize the distribution of the dependent variable for each group. Specifically, each group will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central circle to indicate the mean;
2. A box to indicate the mean plus/minus the standard deviation;
3. Whiskers around the box to indicate the mean plus/minus 1.96*standard deviation (hence, if your data follow the normal distribution, 95% of the values should fall within the whiskers).
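The 1.96 multiplier comes from the normal distribution: about 95% of normally distributed values lie within 1.96 standard deviations of the mean. A quick numerical check (Python with numpy, simulated data for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
scores = rng.normal(loc=100, scale=15, size=10_000)  # simulated normal data

mean = scores.mean()
sd = scores.std(ddof=1)

box = (mean - sd, mean + sd)                      # the box
whiskers = (mean - 1.96 * sd, mean + 1.96 * sd)   # the whiskers

# For normal data, about 95% of the values fall inside the whiskers
inside = np.mean((scores > whiskers[0]) & (scores < whiskers[1]))
print(f"fraction inside whiskers: {inside:.3f}")
```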

Advanced Tab
Select the Advanced tab of the T-Test for Independent Samples by Groups dialog to access options to review advanced results of an independent t-test, including a wide variety of categorized plots.
Summary: T-tests. Click the Summary:T-tests button to compute the t-tests for independent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Options tab.
Box & whisker plots. Click the Box & whisker plots button to produce a cascade of box and whisker plots for the dependent variables; one box plot for each variable will be produced. Box and whisker plots summarize the distribution of the dependent variable for each group. Specifically, each group will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central line to indicate central tendency or location;
2. A box to indicate variability around this central tendency;
3. Whiskers around the box to indicate the range of the variable.
Clicking this button will display the Box-Whisker Type dialog, which is used to pick a box and whisker type plot to use.
Categorized histograms. Click the Categorized histograms button to produce a cascade of categorized histograms (one for each dependent variable), summarizing the distribution of the respective variable in the two groups.
Categorized normal probability plots. Click the Categorized normal probability plots button to produce a cascade of categorized normal probability plots for the dependent variables, categorized by the two groups.
Categorized half-normal probability plots. Click the Categorized half-normal probability plots button to produce a cascade of categorized half-normal probability plots for the dependent variables, categorized by the two groups.
Categorized detrended probability plots. Click the Categorized detrended probability plots button to produce a cascade of categorized detrended normal probability plots for the dependent variables, categorized by the two groups.
Categorized scatterplots. Click the Categorized scatterplots button to produce a cascade of categorized scatterplots for selected pairs of variables, one plot per pair. You will be prompted to select two lists of variables (from the list of dependent variables) via the standard variable selection dialog. Scatterplots will be produced for each variable in the first list with each variable in the second list, categorized by the two groups for the t-test.
Options Tab
Select the Options tab of the T-Test for Independent Samples by Groups dialog to access options to determine the detail and formatting of the t-test for independent samples results spreadsheet.
Display long variable names. Select the Display long variable names check box to display the long variable names (if any; see Variable Specs Editor) along with the short names in the first column of the result spreadsheets. If no long variable names have been specified for any of the selected variables, then the setting of this check box will have no effect.
Test /w separate variance estimates. Select the Test /w separate variance estimates check box to add the t-test with separate variance estimates to the result spreadsheet that is displayed when you click the Summary button. In order to compute the t-test for independent samples, STATISTICA has to estimate the variance of the difference for the respective dependent variable. By default, this variance is estimated from the pooled (averaged) within-group variances. If the variances in the two groups are widely different, and the number of observations in each group also differs, then the t-test computed in this manner may not accurately reflect the statistical significance of the difference. In that case one should use this option to compute the t-test with separate variance estimates and approximate degrees of freedom (see Blalock, 1972; this test is also called the Welch t; see Welch, 1938).
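The separate-variance (Welch) test corresponds to scipy's equal_var=False option; the following sketch (an outside illustration with hypothetical data chosen to have clearly unequal variances and unequal N) contrasts it with the default pooled-variance test:

```python
from scipy import stats

# Hypothetical groups: very different variances and unequal sizes
group1 = [10.1, 10.3, 9.9, 10.0, 10.2, 10.1, 9.8, 10.4]
group2 = [12.0, 8.0, 14.5, 6.5, 11.0]

# Pooled within-group variances (the default t-test)
t_pooled, p_pooled = stats.ttest_ind(group1, group2)

# Separate variance estimates with approximate df (the Welch t)
t_welch, p_welch = stats.ttest_ind(group1, group2, equal_var=False)

print(t_pooled, t_welch)   # the two statistics disagree noticeably here
```

When variances and group sizes are this unbalanced, the pooled test borrows precision it does not have, which is why the separate-variance estimate is preferred in this situation.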
Multivariate test (Hotelling T2). Select the Multivariate test (Hotelling T2) check box to add the Hotelling T2 test to the Header of the result spreadsheet that is displayed when you click the Summary button. The Hotelling T2 test is a multivariate test for differences in means between two groups. This test will only be computed if more than one dependent variable was selected. Because this test is based on the within-group variance/covariance matrices for the dependent variables, it will automatically exclude missing data casewise from the computations. That is, this test will be computed only for cases that have complete data for all dependent variables in the selected list.
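The Hotelling T² statistic itself is straightforward to compute from the pooled within-group covariance matrix; the following sketch (Python with numpy/scipy, hypothetical data, not STATISTICA output) shows the standard formula and its conversion to an F statistic:

```python
import numpy as np
from scipy import stats

# Two groups measured on p = 2 dependent variables (hypothetical data)
g1 = np.array([[111, 5.2], [110, 5.0], [109, 4.9], [112, 5.3]])
g2 = np.array([[102, 4.1], [104, 4.3], [103, 4.0], [101, 3.9]])

n1, n2 = len(g1), len(g2)
p = g1.shape[1]
diff = g1.mean(axis=0) - g2.mean(axis=0)

# Pooled within-group covariance matrix
S = ((n1 - 1) * np.cov(g1, rowvar=False) +
     (n2 - 1) * np.cov(g2, rowvar=False)) / (n1 + n2 - 2)

# Hotelling T2 and its exact F transformation
T2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(S, diff)
F = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
p_value = stats.f.sf(F, p, n1 + n2 - p - 1)
print(T2, F, p_value)
```

Note that the covariance matrix requires complete data for every dependent variable, which is why this test excludes missing data casewise.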
p-level for highlighting. The default p-level for highlighting is .05. You may adjust this p-level by entering a new value in the edit box or using the microscroll buttons. For more details on p-level, see Elementary Concepts.
Homogeneity of variances. Two tests for the homogeneity of variance assumption are available in this group box. For more information on the importance of the homogeneity of variance assumption, see Homogeneity of variances in the ANOVA/MANOVA module.
Levene’s test. Click the Levene’s test button to add the Levene test to the result spreadsheet that is displayed when you click the Summary button. The standard t-test for independent samples is based on the assumption that the variances in the two groups are the same (homogeneous). A powerful statistical test of this assumption is Levene’s test (however, see also the description of the Brown-Forsythe modification of this test below). For each dependent variable, an analysis of variance is performed on the absolute deviations of values from the respective group means. If the Levene test is statistically significant, then the hypothesis of homogeneous variances should be rejected. However, note that the t-test for independent samples is a robust test as long as the N per group is greater than 30 (and, in particular, in the case of equal N); thus, a significant Levene test does not necessarily call into question the validity of the t-test (see also the general overview to the t-test for independent samples). Also, in the case of unbalanced designs (i.e., unequal N per group), the Levene test is itself not very robust, as has recently been pointed out in, for example, Glass and Hopkins (1996; see also the next paragraph).
Brown & Forsythe test. Click the Brown & Forsythe test button to add the Brown & Forsythe test to the result spreadsheet that is displayed when you click the Summary button. Recently, some authors (e.g., Glass and Hopkins, 1996) have called into question the power of the Levene test for unequal variances. Specifically, the absolute deviation (from the group means) scores can be expected to be highly skewed; thus, the normality assumption for the ANOVA of those absolute deviation scores is usually violated. This poses a particular problem when there is unequal N in the two (or more) groups that are to be compared. A more robust test that is very similar to the Levene test has been proposed by Brown and Forsythe (1974). Instead of performing the ANOVA on the deviations from the mean, one can perform the analysis on the deviations from the group medians. Olejnik and Algina (1987) have shown that this test will give quite accurate error rates even when the underlying distributions for the raw scores deviate significantly from the normal distribution. However, recently, Glass and Hopkins (1996, p. 436) have pointed out that both the Levene test as well as the Brown-Forsythe modification suffer from what those authors call a “fatal flaw,” namely, that both tests themselves rely on the homogeneity of variances assumption (of the absolute deviations from the means or medians); and hence, it is not clear how robust these tests are themselves in the presence of significant variance heterogeneity and unequal N. In most cases, when one suspects a violation of the homogeneity of variances assumption, it is probably advisable to interpret the Test /w separate variance estimates described above.
T-Test for Independent Samples by Variables Dialog
Select t-test, independent, by variables on the Basic Statistics and Tables (Startup Panel) – Quick tab to display the T-Test for Independent Samples by Variables dialog. This dialog contains two tabs: Quick and Options. For simple data analyses, it is sometimes easier to enter the data so that each column in the data file (variable) represents the responses of one group (e.g., the data for all male respondents were entered into the first column of the file and the data for all female respondents were entered into the second column). If your data are arranged in this manner, use this dialog. Otherwise, use the T-Test for Independent Samples – by Groups dialog. See also the t-Test for Independent Samples Overview.
Variables (groups). Click the Variables (groups) button to display the standard two variable selection dialog. Specify two lists of variables. Since this setting of the input file combo box assumes that each variable represents the data for one group, each variable (group) in the first list will be compared with each variable (group) in the second list.
Summary. Click the Summary button to compute the t-tests for independent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Options tab.
Cancel. Click the Cancel button to close the dialog without performing any analysis and return to the Basic Statistics and Tables Startup Panel.
Options. Click the Options button to display the Options menu.
Select Cases. Click the Select Cases button to display the Analysis/Graph Case Selection Conditions dialog, which is used to create conditions for which cases will be included (or excluded) in the current analysis. More information is available in the case selection conditions’ overview, syntax summary, and dialog description.
W. Click the W (Weight) button to display the Analysis/Graph Case Weights dialog, which is used to adjust the contribution of individual cases to the outcome of the current analysis by “weighting” those cases in proportion to the values of a selected variable.
Weighted moments. Select the Weighted moments check box to specify that each observation contributes the weighting variable’s value for that observation. The weight values need not be integers. This module can use fractional case weights in most computations. Some other modules use case weights as integer case multipliers or frequency values. This option is available only after you have defined a weight variable via the W option described above.
DF = W-1 and N-1 options. When the Weighted moments check box is selected, some statistics related to the moments (e.g., standard deviations and variances, skewness, kurtosis) can be based on the sum of the weight values for the weighting variable (W-1), or on the number of (unweighted) observations (N-1). The sums (and means), and sums of squares and cross products will always be based on the weighted values of the respective observations. However, in computations requiring the degrees of freedom (e.g., standard deviation, t-test, etc.), the value for the degrees of freedom can either be computed as the sum of the weight values minus one, or as the number of observations minus one. Moment statistics (except for the mean) are based on the sum of the weight values for the weighting variable if the W-1 option button is selected, and are based on the number of (unweighted) observations if the N-1 option button is selected. When the Weighted moments check box is selected, several graphics options will not be available. For more information on options for using integer case weights, see also Selecting a Weighting Variable.
Quick Tab
Select the Quick tab of the T-Test for Independent Samples by Variables dialog to access options to quickly review the results of a t-test for independent samples when the data have been organized by variables. For more advanced results, use the Advanced tab.
Summary: T-tests. Click the Summary:T-tests button to compute the t-tests for independent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Options tab.
Box & whisker plots. Click the Box & whisker plots button to produce a cascade of box and whisker plots; one box plot for each variable (group) in the first list with each variable (group) in the second list will be produced. Box and whisker plots summarize the distribution of the dependent variable for each group. Specifically, each group will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central circle to indicate the mean;
2. A box to indicate the mean plus/minus the standard deviation;
3. Whiskers around the box to indicate the mean plus/minus 1.96*standard deviation (hence, if your data follow the normal distribution, 95% of the values should fall within the whiskers).

Options Tab
Select the Options tab of the T-Test for Independent Samples by Variables dialog to access options to determine the detail and formatting of the t-test for independent samples results spreadsheet.
Display long variable names. Select the Display long variable names check box to display the long variable names (if any; see Variable Specs Editor) along with the short names in the first column of the result spreadsheets. If no long variable names have been specified for any of the selected variables, then the setting of this check box will have no effect.
t-test with separate variance estimates. Select the t-test with separate variance estimates check box to add the t-test with separate variance estimates to the result spreadsheet that is displayed when you click the Summary button. In order to compute the t-test for independent samples, STATISTICA has to estimate the variance of the difference for the respective dependent variable. By default, this variance is estimated from the pooled (averaged) within-group variances. If the variances in the two groups are widely different, and the number of observations in each group also differs, then the t-test computed in this manner may not accurately reflect the statistical significance of the difference. In that case one should use this option to compute the t-test with separate variance estimates and approximate degrees of freedom (see Blalock, 1972; this test is also called the Welch t; see Welch, 1938).
Homogeneity of variances. Two tests for the homogeneity of variance assumption are available in this group box. For more information on the importance of the homogeneity of variance assumption, see Homogeneity of Variances in the ANOVA/MANOVA module.
Levene’s test. Click the Levene’s test button to add the Levene test to the result spreadsheet that is displayed when you click the Summary button. The standard t-test for independent samples is based on the assumption that the variances in the two groups are the same (homogeneous). A powerful statistical test of this assumption is Levene’s test (however, see also the description of the Brown-Forsythe modification of this test below). For each dependent variable, an analysis of variance is performed on the absolute deviations of values from the respective group means. If the Levene test is statistically significant, then the hypothesis of homogeneous variances should be rejected. However, note that the t-test for independent samples is a robust test as long as the N per group is greater than 30 (and, in particular, in the case of equal N); thus, a significant Levene test does not necessarily call into question the validity of the t-test (see also the general overview to the t-test for independent samples). Also, in the case of unbalanced designs (i.e., unequal N per group), the Levene test is itself not very robust, as has recently been pointed out in, for example, Glass and Hopkins (1996; see also the next paragraph).
Brown & Forsythe test. Click the Brown & Forsythe test button to add the Brown & Forsythe test to the result spreadsheet that is displayed when you click the Summary button. Recently, some authors (e.g., Glass and Hopkins, 1996) have called into question the power of the Levene test for unequal variances. Specifically, the absolute deviation (from the group means) scores can be expected to be highly skewed; thus, the normality assumption for the ANOVA of those absolute deviation scores is usually violated. This poses a particular problem when there is unequal N in the two (or more) groups that are to be compared. A more robust test that is very similar to the Levene test has been proposed by Brown and Forsythe (1974). Instead of performing the ANOVA on the deviations from the mean, one can perform the analysis on the deviations from the group medians. Olejnik and Algina (1987) have shown that this test will give quite accurate error rates even when the underlying distributions for the raw scores deviate significantly from the normal distribution. However, recently, Glass and Hopkins (1996, p. 436) have pointed out that both the Levene test as well as the Brown-Forsythe modification suffer from what those authors call a “fatal flaw,” namely, that both tests themselves rely on the homogeneity of variances assumption (of the absolute deviations from the means or medians); and hence, it is not clear how robust these tests are themselves in the presence of significant variance heterogeneity and unequal N. In most cases, when one suspects a violation of the homogeneity of variances assumption, it is probably advisable to interpret the t-test with separate variance estimates described above.
p-level for highlighting. The default p-level for highlighting is .05. You can adjust this p-level by entering a new value in the edit box or using the microscroll buttons. For more details on p-level, see Elementary Concepts.
Dependent t-tests
Within-Group Variation
As explained in Elementary concepts, the size of a relation between two variables, such as the one measured by a difference in means between two groups, depends to a large extent on the differentiation of values within the groups. Depending on how differentiated the values are in each group, a given “raw difference” in group means will indicate either a stronger or weaker relationship between the independent (grouping) and dependent variable. For example, if the mean WCC (White Cell Count) was 102 in males and 104 in females, then this difference of “only” 2 points would be extremely important if all values for males fell within a range of 101 to 103, and all scores for females fell within a range of 103 to 105; in that case, we could predict WCC quite well based on gender. However, if the same difference of 2 was obtained from very differentiated scores (e.g., if their range was 0-200), then we would consider the difference entirely negligible. Reduction of the within-group variation increases the sensitivity of our test.
Purpose
The t-test for dependent samples helps us to take advantage of one specific type of design in which an important source of within-group variation (or so-called error) can be easily identified and excluded from the analysis. Specifically, if two groups of observations (that are to be compared) are based on the same sample of subjects who were tested twice (e.g., before and after a treatment), then a considerable part of the within-group variation in both groups of scores can be attributed to the initial individual differences between subjects. Note that, in a sense, this fact is not much different from the case in which the two groups are entirely independent (see t-test for independent samples), where individual differences also contribute to the error variance; but in the case of independent samples, we cannot do anything about it, because we cannot identify (or “subtract”) the variation due to individual differences in subjects. However, if the same sample was tested twice, then we can easily identify (or “subtract”) this variation. Specifically, instead of treating each group separately and analyzing raw scores, we can look only at the differences between the two measures (e.g., “pre-test” and “post-test”) in each subject. By subtracting the first score from the second for each subject and then analyzing only those “pure (paired) differences,” we will exclude the entire part of the variation in our data set that results from unequal base levels of individual subjects. This is precisely what is done in the t-test for dependent samples, and, as compared to the t-test for independent samples, it always produces “better” results (i.e., it is always more sensitive).
STATISTICA Power Analysis. Note that STATISTICA Power Analysis is designed to allow you to compute statistical power and estimate required sample size while planning experiments, and to evaluate experimental effects in your existing data. You will find many features in this module designed to allow you to perform these calculations quickly and effectively in a wide variety of data analysis situations (including tests for zero correlation and comparing two independent correlations).
Assumptions
The theoretical assumptions of the t-test for independent samples also apply to the dependent samples test; that is, the paired differences should be normally distributed. If these assumptions are clearly not met, then one of the nonparametric alternative tests should be used (see the Nonparametric Statistics module).
Arrangement of Data
Technically, we can apply the t-test for dependent samples to any two variables in our data set and the selection of variables is identical to that used for Correlations. However, applying this test will make very little sense if the values of the two variables in the data set are not logically and methodologically comparable. For example, if you compare the average WCC in a sample of patients before and after a treatment but use a different counting method or different units in the second measurement, then a highly significant t-test value could be obtained due to an artifact; that is, to the change of units of measurement. Following is an example of a data set (spreadsheet) that can be analyzed using the t-test for dependent samples.
            WCC before    WCC after
case 1      111.9         113
case 2      109           110
case 3      143           144
case 4      101           102
case 5      80            80.9
…           …             …

average change between WCC “before” and “after” = 1
The average difference between the two conditions is relatively small (d=1) as compared to the differentiation (range) of the raw scores (from 80 to 143, in the first sample). However, the t-test for dependent samples analysis is performed only on the paired differences, “ignoring” the raw scores and their potential differentiation. Thus, the size of this particular difference of 1 will be compared not to the differentiation of raw scores but to the differentiation of the individual difference scores, which is relatively small: 0.2 (from 0.9 to 1.1). Compared to that variability, the difference of 1 is extremely large and can yield a highly significant t value.
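The computation described above can be sketched in Python using the spreadsheet values shown. This is an illustration only, not STATISTICA code:

```python
import math

def t_dependent(before, after):
    """t statistic computed on the paired differences (dependent samples)."""
    d = [b - a for a, b in zip(before, after)]
    n = len(d)
    mean_d = sum(d) / n
    # sample variance of the difference scores (denominator n - 1)
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n)

before = [111.9, 109, 143, 101, 80]  # WCC "before" column of the spreadsheet
after = [113, 110, 144, 102, 80.9]   # WCC "after" column

print(t_dependent(before, after))
```

Because the difference scores (1.1, 1, 1, 1, 0.9) vary so little around their mean of 1, the resulting t is very large (about 31.6), even though the raw scores themselves range widely.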
Matrices of t-tests
t-tests for dependent samples can be calculated for long lists of variables, and reviewed in the form of matrices produced with casewise or pairwise deletion of missing data, much like the correlation matrices option. Thus, the precautions discussed in the context of correlations also apply to t-test matrices; specifically:
a. The issue of artifacts caused by the pairwise deletion of missing data in t-tests and
b. The issue of “randomly” significant test values.
More Complex Group Comparisons
If there are more than two “correlated samples” (e.g., before treatment, after treatment 1, and after treatment 2), then analysis of variance with repeated measures should be used. The repeated measures ANOVA can be considered a generalization of the t-test for dependent samples, and it offers various features that increase the overall sensitivity of the analysis. For example, it can simultaneously control not only for the base level of the dependent variable, but it can control for other factors and/or include in the design more than one interrelated dependent variable (MANOVA; for additional details refer to General Linear Models (GLM) or ANOVA/MANOVA).
T-Test for Dependent Samples Dialog
Select t-test, dependent samples on the Basic Statistics and Tables Startup Panel – Quick tab to display the T-Test for Dependent Samples dialog. This dialog contains two tabs: Quick and Advanced. Use the options on this dialog to perform t-tests for dependent samples. A general discussion of the t-test for dependent samples is provided in the overview section.
Variables. Click the Variables button to display the standard two variable selection dialog in which you can specify two lists of variables to be analyzed. Each variable in the first list will be compared with each variable in the second list.
Summary. Click the Summary button to compute the t-tests for dependent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections on the Advanced tab.
Cancel. Click the Cancel button to close the dialog without performing an analysis and return to the Basic Statistics and Tables Startup Panel.
Options. Click the Options button to display the Options menu.
Select Cases. Click the Select Cases button to display the Analysis/Graph Case Selection Conditions dialog, which is used to create conditions for which cases will be included (or excluded) in the current analysis. More information is available in the case selection conditions overview, syntax summary, and dialog description.
W. Click the W (Weight) button to display the Analysis/Graph Case Weights dialog, which is used to adjust the contribution of individual cases to the outcome of the current analysis by “weighting” those cases in proportion to the values of a selected variable.
Wghtd momnts. Click the Wghtd momnts (Weighted moments) button to specify that each observation contributes the weighting variable’s value for that observation. The weight values need not be integers. This module can use fractional case weights in most computations. Some other modules use case weights as integer case multipliers or frequency values. This option is available only after you have defined a weight variable via the W option above.
DF = W-1 and N-1 options. When the Weighted moments check box is selected, some statistics related to the moments (e.g., standard deviations and variances, skewness, kurtosis) can be based on the sum of the weight values for the weighting variable (W-1), or on the number of (unweighted) observations (N-1). The sums (and means), and sums of squares and cross products will always be based on the weighted values of the respective observations. However, in computations requiring the degrees of freedom (e.g., standard deviation, t-test, etc.), the value for the degrees of freedom can either be computed as the sum of the weight values minus one, or as the number of observations minus one. Moment statistics (except for the mean) are based on the sum of the weight values for the weighting variable if the W-1 option button is selected, and are based on the number of (unweighted) observations if the N-1 option button is selected. When the Wghtd momnts check box is selected, several graphics options will not be available. For more information on options for using integer case weights, see Selecting a Weighting Variable.
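The two degrees-of-freedom choices can be illustrated with a short Python sketch. The data and weights are hypothetical, and this is an illustration of the arithmetic only, not STATISTICA's implementation:

```python
import math

def weighted_sd(values, weights, df="W-1"):
    """Weighted standard deviation with either (sum of weights - 1)
    or (number of observations - 1) as the degrees of freedom."""
    w_sum = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / w_sum
    ss = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))
    dof = (w_sum - 1) if df == "W-1" else (len(values) - 1)
    return math.sqrt(ss / dof)

# Hypothetical data: 3 observations with (possibly fractional) case weights
x = [10.0, 12.0, 14.0]
w = [1.0, 2.0, 1.0]

print(weighted_sd(x, w, df="W-1"))  # denominator = 4 - 1 = 3
print(weighted_sd(x, w, df="N-1"))  # denominator = 3 - 1 = 2
```

The weighted mean and sums of squares are the same in both cases; only the denominator changes, which is exactly the distinction the W-1 and N-1 option buttons control.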
MD deletion. If Casewise deletion of missing data is selected, then STATISTICA will ignore all cases that have missing data for any of the variables selected in the list. If Pairwise deletion of missing data is selected, then all valid data points will be included in the analyses for the respective variables (resulting possibly in unequal valid N per variable).
Note: STATISTICA Power Analysis. The STATISTICA Power Analysis program is designed to allow you to compute statistical power and estimate required sample size while planning experiments, and to evaluate experimental effects in your existing data. You will find many features in this module designed to allow you to perform these calculations quickly and effectively in a wide variety of data analysis situations (including tests for zero correlation and comparing two independent correlations). For more information on purchasing this program, contact your local StatSoft office or distributor or visit the Web site at http://www.statsoft.com
Quick Tab
Select the Quick tab of the T-Test for Dependent Samples dialog to access options to analyze the results of the t-test for dependent samples.
Summary: T-tests. Click the Summary button to compute the t-tests for dependent samples and display the results in a spreadsheet. The detail and formatting of the results depends on your selections in the Advanced tab.
Box & whisker plots. Click the Box & whisker plots button to produce a cascade of box and whisker plots for the Variables; one box plot for each variable in the first list vs. each variable in the second list will be produced. Box and whisker plots summarize the distribution of the dependent variable for each group. Specifically, each group will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central circle to indicate the mean;
2. A box to indicate the mean plus/minus the standard deviation;
3. Whiskers around the box to indicate the mean plus/minus 1.96*standard deviation (hence, if your data follow the normal distribution, 95% of your data should fall within the whiskers).
Advanced Tab

Select the Advanced tab of the T-Test for Dependent Samples dialog to access options to determine the detail and formatting of the t-test for dependent samples results spreadsheet as well as the p-level for highlighting.
Summary: T-tests. Click the Summary button to compute the t-tests for dependent samples and display the results in a spreadsheet. The detail and formatting of the results depends on the selections on this tab.
Box & whisker plots. Click the Box & whisker plots button to produce a cascade of box and whisker plots for the Variables; one box plot for each variable in the first list vs. each variable in the second list will be produced. Box and whisker plots summarize the distribution of the dependent variable for each group. Specifically, each group will be represented by one box and whisker “component” which is made up of three “parts”:
1. A central line to indicate central tendency or location;
2. A box to indicate variability around this central tendency;
3. Whiskers around the box to indicate the range of the variable.
Click this button to display the Box-Whisker Type dialog, which is used to specify a box and whisker type plot to use.
Display. The options in the Display group box determine the detail and formatting of the t-test for dependent samples results spreadsheet that is displayed when you click the Summary button.
Matrix of t-tests. If the Matrix of t-tests option button is selected, then clicking the Summary button will produce three summary results spreadsheets reporting for each pair of variables (from the two lists) the significance level (p-level) for the respective t-value, the mean difference, and the t-values for the respective difference.
Detailed results. If the Detailed results option button is selected, then clicking the Summary button will produce one summary results spreadsheet reporting for each t-test the means, standard deviations, valid N, etc.
Display long variable names. Select the Display long variable names check box to display the long variable names (if any, see Variable Specs Editor) along with the short names in the first column of the result spreadsheets. If no long variable names have been specified for any of the selected variables, then the setting of this check box will have no effect.
p-level for highlighting. The default p-level for highlighting is .05. You can adjust this p-level by entering a new value in the edit box or using the microscrolls.
Single Means t-tests
Introductory Overview
Purpose & arrangement of data. The t-test for a single mean allows us to test hypotheses about the population mean when our sample size is small and/or when we do not know the variance of the sampled population. In so-called one-sample t-tests, the observed mean (from a single sample) is compared to an expected (or reference) mean of the population (e.g., some theoretical mean), and the variation in the population is estimated based on the variation in the observed sample. To compute a one-sample t-test when raw data are not available (i.e., when only the sample mean, sample standard deviation, sample size, and the hypothesized mean are known), use the Other significance tests option on the Basic Statistics and Tables Startup Panel.
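The underlying arithmetic of a one-sample t-test computed from summary statistics alone can be sketched as follows. The summary values are hypothetical, and this is an illustration only, not STATISTICA code:

```python
import math

def t_single_mean(sample_mean, sample_sd, n, hypothesized_mean):
    """One-sample t statistic from summary statistics alone."""
    se = sample_sd / math.sqrt(n)  # estimated standard error of the mean
    return (sample_mean - hypothesized_mean) / se

# Hypothetical summary values: observed mean 104, sd 8, n = 16,
# tested against a reference (hypothesized) mean of 100
t = t_single_mean(104, 8, 16, 100)
print(t, "with df =", 16 - 1)
```

The statistic is then referred to the t-distribution with n - 1 degrees of freedom.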
Assumptions. The theoretical assumption for the one-sample t-test is that the sampled population is normally distributed. You can evaluate the normality of the variable using a variety of graphs (e.g., histograms, probability plots) which are available on the Advanced tab of the T-Test for Single Means dialog. Tests of normality (Lilliefors test, Kolmogorov-Smirnov test, Shapiro-Wilk’s W test) are also available using the Descriptive statistics option from the Basic Statistics and Tables Startup Panel.
Graphical techniques. In addition to the histograms and probability plots which can aid you in determining whether or not the normality assumption is met, a box-and-whisker plot can be used in a one-sample t-test analysis to visualize the mean and variability of a variable.
Other t-tests. In addition to testing hypotheses about a single mean, the t-distribution can be used to test differences between two independent or two dependent samples. For more information, see the overview topics on Independent t-tests and Dependent t-tests.
T-Test for Single Means Dialog
Select t-test, single sample on the Basic Statistics and Tables Startup Panel – Quick tab to display the T-Test for Single Means dialog. This dialog contains three tabs: Quick, Advanced, and Options. Use the options on these tabs to perform t-tests for single means.
Variables. Click the Variables button to display a standard variable selection dialog in which you can specify the variables to be analyzed.
Summary. Click the Summary button to compute the t-tests for single means and display the results in a spreadsheet. The detail and formatting of the results depends on the selections made on the Options tab.
Cancel. Click the Cancel button to close the dialog without performing an analysis and return to the Basic Statistics and Tables Startup Panel.
Options. Click the Options button to display the Options menu.
Select Cases. Click the Select Cases button to display the Analysis/Graph Case Selection Conditions dialog, which is used to create conditions for which cases will be included (or excluded) in the current analysis. More information is available in the case selection conditions overview, syntax summary, and dialog description.
W. Click the W (Weight) button to display the Analysis/Graph Case Weights dialog, which is used to adjust the contribution of individual cases to the outcome of the current analysis by “weighting” those cases in proportion to the values of a selected variable.
Weighted moments. Click the Weighted moments button to specify that each observation contributes the weighting variable’s value for that observation. The weight values need not be integers. This module can use fractional case weights in most computations. Some other modules use case weights as integer case multipliers or frequency values. This option is available only after you have defined a weight variable via the W option above.
DF = W-1 and N-1 options. When the Weighted moments check box is selected, some statistics related to the moments (e.g., standard deviations and variances, skewness, kurtosis) can be based on the sum of the weight values for the weighting variable (W-1), or on the number of (unweighted) observations (N-1). The sums (and means), and sums of squares and cross products will always be based on the weighted values of the respective observations. However, in computations requiring the degrees of freedom (e.g., standard deviation, t-test, etc.), the value for the degrees of freedom can either be computed as the sum of the weight values minus one, or as the number of observations minus one. Moment statistics (except for the mean) are based on the sum of the weight values for the weighting variable if the W-1 option button is selected, and are based on the number of (unweighted) observations if the N-1 option button is selected. When the Weighted moments check box is selected, several graphics options will not be available. For more information on options for using integer case weights, see Selecting a Weighting Variable.
MD deletion. If Casewise deletion of missing data is selected, then STATISTICA will ignore all cases that have missing data for any of the variables selected in the list. If Pairwise deletion of missing data is selected, then all valid data points will be included in the analyses for the respective variables (resulting possibly in unequal valid N per variable).
Quick Tab
Select the Quick tab of the T-Test for Single Means dialog to access options to quickly specify the reference value to use when performing a t-test for single means. For more options, including a variety of plots that enable you to visualize the distribution of the variable, use the Advanced tab.
Summary: T-tests. Click the Summary:T-Tests button to compute the t-tests for single means and display the results in a spreadsheet. The detail and formatting of the results depends on the selections made on the Options tab.
Reference values. When more than one variable has been selected via the Variables button, use the options in the Reference values group box to specify whether to test all variables against the same hypothesized mean, or to test each variable against a different (user-specified) mean.
Test all means against. Select the Test all means against option and specify the reference value in the corresponding edit field to test each variable against the same reference value (or hypothesized mean). Note that this option performs a separate test for each variable. To test the hypothesis that the means for all variables are equal, use ANOVA/MANOVA.
Test means against different user-defined constants. Select the Test means against different user-defined constants option to specify different reference values for each t-test (or variable). Note that you will need to click the Specify button to do this (see below).
Specify. Click the Specify button to display the Select Reference Values dialog, in which you specify a separate reference value for each selected Variable. Note that the Test means against different user-defined constants option button must be selected for this button to be available.
Box & whisker plot. Click the Box & whisker plot button to produce a box and whisker plot for the selected Variable(s). Box and whisker plots summarize the distribution of each variable. Specifically, each variable will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central circle to indicate the mean;
2. A box to indicate the mean plus/minus the standard deviation;
3. Whiskers around the box to indicate the mean plus/minus 1.96*standard deviation (hence, if your data follow the normal distribution, 95% of your data should fall within the whiskers).
Advanced Tab
Select the Advanced tab of the T-Test for Single Means dialog to access options to specify the reference value to use when performing a t-test for single means as well as to visualize the distribution of the variable using a variety of plots.
Summary: T-tests. Click the Summary:T-Tests button to compute the t-tests for single means and display the results in a spreadsheet. The detail and formatting of the results depends on the selections made on the Options tab.
Reference values. When more than one variable has been selected via the Variables button, use the options in the Reference values group box to specify whether to test all variables against the same hypothesized mean, or to test each variable against a different (user-specified) mean.
Test all means against. Select the Test all means against option and specify the reference value in the corresponding edit field to test each variable against the same reference value (or hypothesized mean). Note that this option performs a separate test for each variable. To test the hypothesis that the means for all variables are equal, use ANOVA/MANOVA.

Test means against different user-defined constants. Select the Test means against different user-defined constants option to specify different reference values for each t-test (or variable). Note that you will need to click the Specify button to do this (see below).
Specify. Click the Specify button to display the Select Reference Values dialog in which you specify a separate reference value for each selected Variable. Note that the Test means against different user-defined constants option button must be selected for this button to be available.
Histograms. Click the Histograms button to produce a cascade of histograms (one for each selected Variable), summarizing the distribution of the respective variable.
Box & whisker plot. Click the Box & whisker plot button to produce a box and whisker plot for the selected Variables; one box plot for each variable will be produced. Box and whisker plots summarize the distribution of the dependent variable. Specifically, each variable will be represented by one box and whisker “component,” which is made up of three “parts”:
1. A central line to indicate central tendency or location;
2. A box to indicate variability around this central tendency;
3. Whiskers around the box to indicate the range of the variable.
Clicking this button will display the Box-Whisker Type dialog, which is used to specify a box and whisker type plot to use.
Probability plots. Probability plots are used to determine whether or not a variable can be fit with the normal distribution. Three types of probability plots can be selected in this group box:
Normal. Click the Normal button to produce a cascade of normal probability plots for the selected Variables.
Half-normal. Click the Half-normal button to produce a cascade of half-normal probability plots for the selected Variables.
Detrended. Click the Detrended button to produce a cascade of detrended normal probability plots for the selected Variables.
p-level for highlighting. The default p-level for highlighting is .05. You can adjust this p-level by entering a new value in the edit box or using the microscrolls.
Options Tab
Select the Options tab of the T-Test for Single Means dialog to access options to determine the detail and formatting of the t-test for single means results spreadsheet as well as the p-level for highlighting significant values in that spreadsheet.
Reference values. When more than one variable has been selected via the Variables button, use the options in the Reference values group box to specify whether to test all variables against the same hypothesized mean, or to test each variable against a different (user-specified) mean.
Test all means against. Select the Test all means against option and specify the reference value in the corresponding edit field to test each variable against the same reference value (or hypothesized mean). Note that this option performs a separate test for each variable. To test the hypothesis that the means for all variables are equal, use ANOVA/MANOVA.
Test means against different user-defined constants. Select the Test means against different user-defined constants option button to specify different reference values for each t-test (or variable). Note that you will need to click the Specify button to do this (see below).
Specify. Click the Specify button to display the Select Reference Values dialog in which you specify a separate reference value for each selected Variable. Note that the Test means against different user-defined constants option button must be selected for this button to be available.
Display long variable names. Select the Display long variable names check box to display the long variable names (if any, see Variable Specs Editor) along with the short names in the first column of the result spreadsheet (displayed when you click the Summary button). If no long variable names have been specified for any of the selected variables, then the setting of this check box will have no effect.
Compute conf. limits; Interval. Select the Compute conf. limits check box to report the confidence limits for the mean of each selected Variable in the results spreadsheet (displayed when you click the Summary button). Use the Interval field to specify the confidence level to be reported (by default, 95%).
Multivariate test (Hotelling T2). Select the Multivariate test (Hotelling T2) check box to add the Hotelling T2 test to the Header of the results spreadsheet that is displayed when you click the Summary button. The Hotelling T2 test is a multivariate test for differences in means. This test will only be computed if more than one variable was selected. Because this test is based on the within-group variance/covariance matrices for the variables, it will automatically exclude missing data casewise from the computations. That is, this test will be computed only for cases that have complete data for all selected variables.
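For the two-variable case, the T2 statistic described above can be sketched in Python. The data are hypothetical, and this is a minimal illustration of the formula T2 = n·d′S⁻¹d (with d the vector of mean differences and S the sample covariance matrix), not STATISTICA's implementation:

```python
def hotelling_t2(data, mu0):
    """Hotelling T^2 for exactly two variables: T^2 = n * d' * S^-1 * d."""
    n = len(data)
    means = [sum(row[j] for row in data) / n for j in range(2)]
    d = [means[j] - mu0[j] for j in range(2)]
    # sample variance/covariance matrix (denominator n - 1)
    s = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
          for j in range(2)] for i in range(2)]
    # invert the 2x2 covariance matrix directly
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return n * quad

# Hypothetical complete cases on two variables, tested against means (0, 0);
# like the dialog option, only cases complete on both variables are used.
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
print(hotelling_t2(data, (0.0, 0.0)))
```

Because the statistic depends on the full covariance matrix, any case missing either variable would have to be dropped before the computation, which is why the option excludes missing data casewise.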
p-level for highlighting. The default p-level for highlighting is .05. You can adjust this p-level by entering a new value in the edit box or using the microscrolls.
Select Reference Values
Click the Specify button on any tab of the T-Test for Single Means dialog to display the Select Reference Values dialog. It allows you to specify the reference values to use for the null hypothesis when you do not wish to use the same reference value for each variable.
OK. Click the OK button to accept the specified values and return to the previous dialog.
Cancel. Click the Cancel button to return to the previous dialog without changing any of the default values.
Values box. Specify the reference value for each variable listed in this box. You can type the number in the edit field or use the microscrolls to specify the value.
Common value. Specify here a common value to use for all variables.
Apply. Click the Apply button to apply the value specified in the Common value field to all factors in the Values box.
Example: t-Tests, Descriptive Statistics
This example is based on the Adstudy.sta example data file that is included with STATISTICA. This data file contains 25 variables and 50 cases. These (fictitious) data were collected in an advertising study where male and female respondents evaluated two advertisements. Respondents’ gender was coded in variable 1 (Gender: 1=male, 2=female). Each respondent was randomly assigned to view one of the two ads (Advert: 1=Coke®, 2=Pepsi®). They were then asked to rate the appeal of the respective ad on 23 different scales (Measure01 to Measure23). On each of these scales, the respondents could give answers between 0 and 9.
Starting the Basic Statistics and Tables module. Start STATISTICA and open the data file Adstudy.sta via the File – Open menu; it is installed in the /Examples/Datasets directory of STATISTICA. You can also open data files from most Startup Panels of each statistical module. For example, select Basic Statistics/Tables from the Statistics menu to display the Basic Statistics and Tables Startup Panel.

Click the Open Data button to display the Select Spreadsheet dialog; in that dialog click the Files button to select the datafile.

Differences between Means (t-Test). In the next step of the analysis, the possibility of differences in response patterns between males and females will be examined. Specifically, males may use some rating scales in a different way, resulting in higher or lower ratings on some scales. The t-test for independent samples will be used to identify such potential differences. The sample of males and females will be compared regarding their average ratings on each scale. Return to the Basic Statistics and Tables Startup Panel and select t-test, independent, by groups in order to display the T-Test for Independent Samples by Groups dialog.
Next, click the Variables button to display the standard variable selection dialog. Here, you can select both the independent (grouping) and dependent variables for the analysis. For this example, select (highlight) variables 3 through 25 (the variables containing the responses) as the dependent variables; select variable Gender as the independent variable.

Once you have made the grouping variable selection, STATISTICA will automatically propose the codes used in that variable to identify the groups to be compared (in this case the codes are Male and Female). You can double-click on either the Code for Group 1 or Code for Group 2 boxes to display the Variable Codes dialog in which you can review and select the codes for each group.
Many other procedures are available on the Advanced tab on the T-Test for Independent Samples by Groups dialog. Before performing the analysis, you can graphically view the distribution of the variables via the graphics options on this dialog. For example, click the Box & whisker plot button (and select the Mean/SE/1.96*SE option button in the Box-Whisker Type dialog) to produce box and whisker plots categorized by the grouping variable, one plot for each of the dependent variables. Similarly, click the Categorized Histograms button to produce categorized (by the grouping variable) histograms. If your current output (see Output Manager) is directed to workbooks (default), all graphs can quickly be reviewed.

Categorized normal probability plots, detrended normal probability plots, and scatterplots are also available to review the distribution of the variable within each group.
Now, click the Summary button to display the spreadsheet of t-test results.

Reviewing the t-test output. The quickest way to explore the table is to examine the fifth column (p-levels) and look for p-values that are less than the conventional significance level of .05 (see Elementary Concepts). For the vast majority of dependent variables, the means in the two groups (Males and Females) are very similar. The only variable for which the t-test meets the conventional significance level of .05 is Measure07, for which the p-level is equal to .0087. A look at the columns containing the means (see the first two columns) reveals that males used much higher ratings on that scale (5.46) than females (3.63). The possibility that this difference was obtained by chance cannot be entirely excluded, although assuming that the test is valid (see below), it appears unlikely, because a difference at that significance level is expected to occur by chance (approximately) 9 times per 1,000 (that is, fewer than 1 time per 100). This result will be examined further, but first, look at the box and whisker plot for this variable.
Go back to the box-whisker plot that you previously produced (shown above in the workbook; or produce these graphs once more by clicking the Box & whisker plot button on the dialog). Then select the graph for variable Measure07; double-click on the graph to display the All Options dialog, select the Plot: Box/Whisker tab, and set the Whisker value drop-down box to Std Dev (standard deviations).

Now click the OK button to produce the updated graph:

The graph shows something unexpected: The variation in the group of females appears much larger than in males. If the variation of scores within the two groups is in fact reliably different, then one of the theoretical assumptions of the t-test is not met (see the Introductory Overview), and you should treat the difference between the means with particular caution. Also, differences in variation are typically correlated with the means; that is, variation is usually higher in groups with higher means. However, the opposite appears to be happening in the current case. In situations like this one, experienced researchers would suspect that the distribution of Measure07 might not be normal (in males, females, or both). However, first look at the test of variances to see whether the difference visible on the graph is reliable.
Test of difference between variances. Now, return to the results spreadsheet and scroll to the right until the F-test results are visible. The F-test does in fact meet the conventional significance level of .05, which suggests that the variances of Measure07 in Males and Females are reliably different. However, the difference between the variances is relatively close to the borderline significance level (the obtained p-level is .029). Most researchers would not consider this fact alone to be sufficient to entirely discard the validity of the t-test for the difference between the means, given the relatively high significance level of that difference (p = .0087). Now, look at the distribution of Measure07 as categorized by the independent variable Gender.
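The variance-ratio F-test used in this column can be sketched in Python. The ratings below are invented to mimic the pattern in this example (females more spread out than males); they are not the actual Adstudy data, and this is an illustration only:

```python
def variance_ratio_f(a, b):
    """F statistic for comparing two sample variances (larger over smaller),
    with the matching numerator and denominator degrees of freedom."""
    def var(x):
        m = sum(x) / len(x)
        return sum((v - m) ** 2 for v in x) / (len(x) - 1)
    va, vb = var(a), var(b)
    f = max(va, vb) / min(va, vb)
    dfs = (len(a) - 1, len(b) - 1) if va >= vb else (len(b) - 1, len(a) - 1)
    return f, dfs

# Hypothetical ratings on a 0-9 scale: similar means, very different spreads
males = [5, 6, 5, 6, 5, 6]
females = [1, 7, 2, 6, 3, 5]
f, (df1, df2) = variance_ratio_f(females, males)
print(f, df1, df2)
```

A large F (here about 18.7 on 5 and 5 degrees of freedom) signals that the two group variances are reliably different, which is exactly the warning flag discussed above.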
Categorized histogram. Right-click on the results spreadsheet and select Graphs of Input Data – 2D Histogram by from the shortcut menu.

An intermediate dialog is displayed in which you can select the categorization variable for the histogram (choose variable Gender), and codes used in that variable to denote the different groups or categories (choose All codes). Then click the OK button to produce the graph.

Examining Distributions (Descriptive Statistics). Now, return to the Basic Statistics and Tables Startup Panel and select Descriptive statistics to display the Descriptive Statistics dialog. In this dialog, click the Variables button and select all variables in the datafile.
By default, the Descriptive Statistics spreadsheets will contain the mean, valid N, standard deviation, and minimum and maximum values of the selected variables. Click on the Advanced tab to select the types of statistics to be calculated.

For this example, accept the default selection of statistics and click the Summary button to produce the spreadsheet of results.

Graphics options. The Descriptive Statistics dialog offers many graphics options to visualize the distributions of, or correlations between, variables. While almost all types of graphs available on this dialog can also be produced via the Graphs menu commands, the graphs produced from this dialog will be based on the current case selection conditions and current selections for handling missing data. So, for example, any histograms produced via the options on the Normality tab will only include cases that are selected into the current analysis.