Applied statistics: Difference between revisions
imported>Nick Gardner |
Revision as of 15:09, 27 June 2009
Applied statistics provide both a familiar source of information and a notorious source of error and misinformation. Errors commonly arise from misplaced confidence in intuitive interpretations, but some of the most serious have arisen from misuse by mathematicians and other professionals. Deliberate misinterpretation of statistics by politicians and marketing professionals is so much a popular commonplace that their genuine use is often treated with suspicion. To those unfamiliar with it, statistics can seem impenetrably arcane, but its pitfalls can be avoided given a grasp of a few readily understood concepts.
(Terms shown in italics are defined in the glossary on the related articles subpage.)
Overview: the basics
Statistics are observations that are recorded in numerical form. It is essential to their successful handling to accept that statistics are not facts and therefore incontrovertible, but observations about facts and therefore fallible. The reliability of the information that they provide depends not only upon their successful interpretation, but also upon the accuracy with which the facts are observed and the extent to which they truly represent the subject matter of that information. An appreciation of the means by which statistics are collected is thus an essential part of the understanding of statistics and is at least as important as a familiarity with the tools that are used in their interpretation.
Although the derivation of those tools involved advanced mathematics, the laws of chance on which much of statistical theory is based are no more than a formalisation of intuitive concepts, and the use of the resulting algorithms and computer software requires only a grasp of basic mathematical principles.
The collection of statistics
The methodology adopted for the collection of observations has a profound influence upon the problem of extracting useful information from the resulting statistics. That problem is at its easiest when the collecting authority can minimise disturbing influences by conducting a "controlled experiment"[1]. A range of more complex methodologies (and associated software packages), referred to as "the design of experiments"[2], is available for use when the collecting authority has lesser degrees of control. The object of the design in each case is to facilitate the testing of an hypothesis by helping to remove the influence of factors that the hypothesis does not take into account. At the furthest extreme from the controlled experiment, no such help can be provided through the physical elimination of extraneous influences. If those influences are to be eliminated, it must be done after they have been identified by a purely analytical technique termed the "analysis of variance"[3]. For example, the rôle of the authorities that collect economic statistics is necessarily passive, and the testing of economic hypotheses involves the use of a version of the analysis of variance termed "econometrics"[4] (sometimes confused with economic modelling, which is a purely deterministic technique).
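The idea behind the analysis of variance can be illustrated with a minimal sketch of its simplest form, a one-way comparison of a control group with a treated group. The function name `one_way_anova_f` and the figures are hypothetical and chosen only for illustration; the sketch computes the F statistic alone and leaves out the significance test that a statistics package would normally report.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic of a one-way analysis of variance.

    `groups` is a list of lists of observations, one list per group.
    F compares the variation between group means with the variation
    of the observations inside each group.
    """
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Variation explained by group membership.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation left unexplained inside the groups.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)     # mean square between groups
    ms_within = ss_within / (n - k)       # mean square within groups
    return ms_between / ms_within


# Hypothetical measurements, invented for this example.
control = [4, 5, 6]
treated = [7, 8, 9]
print(one_way_anova_f([control, treated]))  # 13.5
```

A large F indicates that the difference between the group means is large relative to the scatter within the groups; whether it is large enough to reject the hypothesis of no effect depends on the F distribution for the relevant degrees of freedom, which this sketch does not evaluate.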
Statistical inference
The laws of chance
Probability distributions
Risks and faults
Correlation and association
Popular errors
An eminent authority has claimed that most published research findings are false, in part because of statistical misinterpretation[5].
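The cited paper frames that claim in terms of the positive predictive value of a finding, that is, the probability that a statistically significant result reflects a true relationship. The following is a simplified sketch of that calculation which ignores the bias and multiple-testing terms that the paper also considers; the function name, parameter names and input figures are hypothetical and chosen only for illustration.

```python
def positive_predictive_value(prior_odds, alpha=0.05, power=0.8):
    """Probability that a significant finding is true (simplified).

    prior_odds: odds, before the study, that a tested relationship is true
    alpha:      probability of a false positive (type I error rate)
    power:      probability of detecting a true relationship (1 - beta)
    """
    true_positives = power * prior_odds   # per unit of false hypotheses
    false_positives = alpha               # per unit of false hypotheses
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs: one true relationship for every ten false ones.
well_powered = positive_predictive_value(prior_odds=0.1, power=0.8)
under_powered = positive_predictive_value(prior_odds=0.1, power=0.2)
print(round(well_powered, 3), round(under_powered, 3))
```

Under these assumed inputs, lowering the power of the study lowers the share of significant findings that are true, which is one of the mechanisms by which a conventionally significant result can be more likely false than true.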
Accuracy and reliability
Applications
Surveys
Quality control
Econometrics
Forecasting
Risk management
References
- ↑ In a controlled experiment, a "control group", that is in all relevant respects similar to the experimental group, receive a "placebo", while the experimental group receive the treatment that is on trial
- ↑ Valerie Easton and John McCall: The Design of Experiments and ANOVA. STEPS 1997
- ↑ Anova Manova
- ↑ Econometrics 2005
- ↑ John P. A. Ioannidis: Why Most Published Research Findings Are False. PLoS Med 2(8): e124, August 2005. doi:10.1371/journal.pmed.0020124