*****WEB SITE CURRENTLY UNDER CONSTRUCTION*****
Welcome to our Glossary website of statistical vocabulary. Here is where you will find definitions of many statistical terms that you
might have come across when involved in studies, testing, or simply your everyday life. Many of the terms defined on this site can be
further explored by visiting suggested web sites. If you choose to learn more about a particular statistical term, use the applicable links provided for you. All links are placed in the "definitions" column, as part of descriptions or definitions, as well as at the
bottom of this page.
Directions for the users of this website:
| Alternative Hypothesis | a statement which must be true if the null hypothesis is false; a hypothesis which questions the truth of status quo; a hypothesis which, in certain cases, could have many different forms, depending on what we want to test; ex. Null Hypothesis (status quo): The average highway speed of cars in the United States is 55 mph...Alternative Hypothesis: (1) The average highway speed of cars in the United States is not 55 mph, or (2) The average highway speed of cars in the United States is greater than 55 mph, or (3) The average highway speed of cars in the United States is less than 55 mph; go to Hypothesis Testing |
| Analysis of Variance | known as ANOVA; a method for comparing several population means at the same time to test the equality of the means; If the variability between the sample means is large in relation to the variability within the samples, then that suggests that not all the population means are equal; ex. given that there are k populations of interest: null hypothesis-all k means of these populations are equal / alternative hypothesis-not all the means are equal; go to Tests |
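| Bar graph | a graph that uses bars to show the frequency (count) of each category; go to Statistical Graphs |
| Biased sample | a sample in which the subjects were not selected randomly; ex. Voluntary Response Sample - subjects volunteer to be part of the sample; go to Sampling Strategies |
| Box plot | a graph of the five-number summary, where a box extends from the first quartile to the third quartile with a line at the median, and lines extend from the box to the minimum and the maximum; go to Statistical Graphs |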
| Categorical variable | a variable which places each individual into a category; ex. male and female; go to Data Types |
| Continuous variable | go to Data Types |
| Control group | a group of subjects that receives a fake treatment or no treatment at all; control groups enable us to control the effects of outside variables on the outcome because we can compare a treatment group with a control group; go to Studies and Experiments |
| Critical region | an area under a density curve which consists of those test statistics which provide strong evidence against the null hypothesis; values in the critical region yield rejection of the null hypothesis; go to Tests |
| Critical value | a value which separates the critical region from the rest of the area under a density curve; a boundary point between the test statistic values that lead to rejection of the null hypothesis and those that do not; a value to which we compare the test statistic to make a decision in a test of hypotheses; go to Tests or Hypothesis Testing |
| Degrees of Freedom | the number of observations in a sample that can vary freely |
| Density curve | describes the overall shape of a distribution; always on or above the horizontal axis; the total area underneath the curve is 1 |
| Discrete variable | go to Data Types |
| Distribution | an arrangement of statistical data that shows how many items, or what parts of the data, go into the various categories or intervals into which the data are grouped; for Normal Distribution, go to Hypothesis Testing; for other distributions, go to Tests |
| Double-Blind Study | a study where neither the subjects nor those evaluating the results know which group is a control group and which is a treatment group |
| Empirical Rule | used with distributions that have the general shape of the cross section of a bell (bell-shaped curves); 1. about 68% of all values in the data set will be located within one standard deviation of the mean, 2. about 95% of the values will be located within two standard deviations of the mean, 3. about 99.7% of the values will be located within three standard deviations of the mean; for Normal Distribution, go to Hypothesis Testing; for T-Distribution, go to Tests |
| Experiment | a statistical method of inquiry where we impose a treatment on the individuals in order to study their responses to the treatment; ex. testing the effects of a new drug on patients, where a drug is a treatment; go to Studies and Experiments |
| Explanatory variables | variables that explain or cause changes in the response variables; ex. the effect of Sun exposure (explanatory variable) on skin tan (response variable); |
| F-distribution | a family of distributions with two parameters called degrees of freedom; formed by the F-values; degrees of freedom (df) - ordered pair (k-1, n-k); df for the numerator, k-1, where k is the number of samples or populations; df for the denominator, n-k, where n is the total number of data values in all samples combined; a right-skewed graph that starts at 0 and extends indefinitely to the right; the peak of the F density curve is near 1; different shapes, depending on the parameters; go to Tests |
| Five-number summary | used with box plots; minimum, 1st quartile, median, 3rd quartile, and maximum; go to Statistical Graphs |
| F-test | a test for the equality of population means, used with the analysis of variance; a right-tailed test based on comparing a calculated F-value with the critical F-value; reject the null hypothesis if the calculated F-value is greater than or equal to the critical F-value; extremely sensitive to non-normal population distributions; go to Tests |
| F-values | variance between samples (the variation among the sample means) divided by variance within samples (the pooled variation inside the samples); used with F-tests; for degrees of freedom, go to F-distribution; go to Tests |
| F-values (Critical) | a boundary point in the F-test, separating the values in the critical region (leading to rejection of null hypothesis) from those outside of the critical region; defined by the degrees of freedom and the significance level; go to Tests |
| Frequency | the number of individuals in a category or interval; ex. data set {1, 2, 2, 3, 3}...frequency of 1 - 1...frequency of 2 - 2...frequency of 3 - 2; go to Statistical Graphs |
| Graphs | For information about bar graphs, pie charts, histograms, stem-and-leaf plots, and box plots, go to Statistical Graphs. |
| Heteroscedasticity | inequality of the population variances |
| Histograms | go to Statistical Graphs |
| Homoscedasticity | equality of the population variances; a necessary assumption for the analysis of variance; go to Tests |
| Hypothesis Testing | testing the reliability of a claim made about a population; choosing between two opposite statements, called hypotheses; go to Hypothesis Testing |
| Individuals | objects (people, animals or things) described by the data |
| Interquartile range | known as IQR; distance between the first quartile (Q1) and the third quartile (Q3); used with box plots; go to Statistical Graphs |
| Leaves | used with stem-and-leaf plots; the rightmost digit of the numerical values in a data set displayed to the right of the vertical line; go to Statistical Graphs |
| Left-tailed test | the critical region under the curve is to the left of the critical value and the mean; the alternative hypothesis contains the symbol < ; ex. Null Hypothesis: µ=10 (µ=mean), Alternative Hypothesis: µ<10...We reject the Null Hypothesis if our test statistic is in the critical region, i.e. on the left side of the curve; go to Tests |
| Lurking variable | a variable which has an important effect on the relationship between the variables in a study, but is not one of the variables studied |
| Mean | an average value; the point at which the density curve would balance if made of solid material; a sum of all values in a set divided by the number of values in a set; ex. data set {1, 2, 3, 4} ... mean = (1+2+3+4)/4=10/4=5/2=2.5 |
| Median | a middle value; ex. data set {2, 3, 4}...median= 3; ex. data set {2, 3, 4, 5}...median=(3+4)/2=3.5; |
| Mode | a value which occurs with the highest frequency; ex. data set {2, 3, 3, 4}...mode=3 because of highest frequency |
| Normal Distribution | the probability distribution of a normal random variable (continuous variable); a bell-shaped curve with its peak at the mean, also called µ, which extends indefinitely in both directions (positive and negative) ; symmetric about the mean where the right side of µ is the mirror image of the left side of µ ; same mean and median; naturally occurring normal, random variables: heights of people, Scholastic Aptitude Test scores, and IQs; go to Hypothesis Testing |
| Null Hypothesis | status quo; a statement (claim) under test; hypothesis with the "equal" sign; go to Hypothesis Testing |
| Observational Study | a statistical method of inquiry about an issue where we observe individuals and measure variables that are of interest, without imposing any changes on the individuals (ex. survey); go to Studies and Experiments |
| One-Sample t-test | go to t-test (one-sample); |
| One-Way ANOVA | a method of comparing several means; see ANOVA; go to Tests |
| Pie chart | go to Statistical Graphs |
| Placebo | a dummy treatment |
| Population | a complete group of individuals to be studied |
| P-value | the probability that repeated sampling will produce statistics at least as extreme as the observed sample statistic, computed under an assumption that the null hypothesis is true; the risk involved in rejecting the null hypothesis; go to P-values |
| Quantitative variable | a variable whose values are numerical; ex. measurements of monthly precipitation; go to Data Types |
| Response variables | outcomes of a study; variables which reflect changes caused by explanatory variables; ex. human growth as an outcome of proper nutrition; |
| Right-tailed test | the critical region under the curve is to the right of the critical value and the mean; the alternative hypothesis contains the symbol > ; ex. Null Hypothesis: µ=10 (µ=mean), Alternative Hypothesis: µ>10...We reject the Null Hypothesis if our test statistic is in the critical region, i.e. on the right side of the curve; go to Tests |
| Sample | a part of a population of interest selected to represent all the individuals in a population; |
| Significance Level | also called alpha; a fixed probability value that we regard as decisive; a maximum risk we are willing to take when we choose to reject the null hypothesis, which might in fact be true; if a P-value is smaller than a pre-determined significance level, alpha, the results enable us to reject the null hypothesis, otherwise not enough evidence to reject; go to Hypothesis Testing |
| Simple Random Sample | a sample chosen in such a way which gives every individual in a population an equal chance of being selected; go to Sampling Strategies |
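| Standard deviation | the square root of the variance; tells us how far the numbers on a list typically are from their average; most numbers will be somewhere around one standard deviation from the average |
| Standard error | an estimate, computed from sample data, of the standard deviation of a statistic; used when the true population standard deviation is unknown; ex. standard error of the sample mean = s/√n, where s is the sample standard deviation and n is the sample size |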
| Stem-and-leaf plots | go to Statistical Graphs |
| Stems | used with stem-and-leaf plots; the leftmost digits of the numerical values in a data set displayed to the left of the vertical line; go to Statistical Graphs |
| Subjects | human beings (individuals) described by the data; |
| T-distribution | a family of distributions with one parameter (degrees of freedom); formed by t-values; different t distribution for each sample size; degrees of freedom = n - 1, where n is the sample size; used with t-tests; density curve similar in shape to the standard normal curve; symmetry of the curve; more spread than the normal distribution and so more probability in the tails; the more degrees of freedom (bigger sample) - the more the shape of t distribution resembles the normal curve; go to Tests |
| Test statistics | an observed value; used with hypothesis testing to compare with the critical value; test statistics in the critical region - sufficient evidence to reject null hypothesis; go to Tests |
| Treatment | a property under study; ex. effect of Tylenol (treatment) on headaches; |
| Treatment group | a group receiving a treatment; ex. patients receiving Tylenol (treatment) for headaches; |
| T-test (One-Sample) | test of a population mean when the population standard deviation is unknown, and thus must be estimated from a sample; comparing t test statistics with t critical values; go to Tests |
| T-value (critical value) | a boundary point in the t-test, separating the values in the critical region (leading to rejection of null hypothesis) from those outside of the critical region; defined by the degrees of freedom and the significance level; go to Tests |
| T-value (test statistics) | observed value; generated in different ways, depending on the kind of test (one-sample or two-sample); if in the critical region, then sufficient evidence to reject null hypothesis; go to Tests |
| Two-Sample T-test | comparing two population means; independent samples; both populations normally distributed; true means and standard deviations of the populations unknown; comparing two-sample t statistics with the critical value, obtained using the smaller of the two degrees of freedom and the significance level; go to Tests |
| Two-tailed test | the critical region under the curve is on two tails of the density curve; there are two critical values; the alternative hypothesis contains the "not equal to" symbol; significance level, alpha, divided into two equal parts, one for each tail of the curve; ex. Null Hypothesis: µ=10 (µ=mean), Alternative Hypothesis: µ≠10...We reject the Null Hypothesis if our test statistic is in either one of the two critical regions; go to Tests |
| Two-way ANOVA | a method of comparing several means with two variables involved; go to Tests |
| Variable | any characteristic of an individual; ex. height or shoe size; |
| Variance | the measurement of the spread of observations; the square of standard deviation; the average of the squares of the differences between the observations and their mean (the average of the squares of the deviations of the observations from their mean); |
| Variance between-samples | used with F-tests; measures the variation among the sample means (for samples of equal size, the sample size times the variance of the sample means); numerator of the F-value; degrees of freedom - (k-1) where k is the number of samples or populations; go to Tests |
| Variance within-samples | used with F-tests; mean (average) of the sample variances when the samples are of equal size, otherwise a weighted average of them; denominator of the F-value; degrees of freedom - (n-k) where k is the number of samples or populations and n is the total number of data values in all samples combined; go to Tests |
| Z-distribution | the standard normal distribution (mean 0 and standard deviation 1); see Normal Distribution |
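
Several of the definitions above describe calculations, so a short worked sketch may help. The Python below is only an illustration, not part of the glossary's linked pages. It reuses the data sets from the Mean, Median, and Mode entries, checks the Empirical Rule percentages against the standard normal curve, and computes an F-value as variance between samples divided by variance within samples. The helper names `area_within` and `f_value` and the three small samples are made up for this example.

```python
from statistics import NormalDist, mean, median, mode, variance

# Data sets taken from the Mean, Median, and Mode entries.
print(mean([1, 2, 3, 4]))     # 2.5
print(median([2, 3, 4, 5]))   # 3.5
print(mode([2, 3, 3, 4]))     # 3

# Empirical Rule: area under the standard normal curve
# within k standard deviations of the mean.
standard_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Probability that a normal value lies within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))  # roughly 68%, 95%, 99.7%


def f_value(samples):
    """One-way ANOVA F-value and its degrees of freedom (k-1, n-k).

    samples is a list of lists, one inner list per sample.
    """
    k = len(samples)                      # number of samples
    n = sum(len(s) for s in samples)      # total number of data values
    grand_mean = sum(sum(s) for s in samples) / n

    # Variation among the sample means, weighted by sample size.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation inside each sample, pooled over all samples.
    ss_within = sum((len(s) - 1) * variance(s) for s in samples)

    variance_between = ss_between / (k - 1)   # numerator, df = k-1
    variance_within = ss_within / (n - k)     # denominator, df = n-k
    return variance_between / variance_within, (k - 1, n - k)


# Hypothetical data: three samples of three values each.
f, df = f_value([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f, df)  # F is about 13 with degrees of freedom (2, 6)
```

To finish an F-test, the computed F-value would be compared with the critical F-value for the same degrees of freedom and the chosen significance level. Looking up that critical value needs an F table or a statistics library, which this sketch leaves out.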