Talk:Statistics theory: Difference between revisions
imported>Ragnar Schroder No edit summary |
imported>Hendra I. Nurdin (comment) |
Revision as of 18:56, 8 December 2007
Definition of a statistic
The modified sentence:
"More generally, a statistic can be any measure within a data sample. This would be some quantification of a random variable, or variables, of interest, such as a height, weight, polling results, test performance, and so on"
does not have the same meaning as the original
"More generally, a statistic can be any measurable function of the data samples, the latter being realizations of the random variables which are of interest such as the height of people, polling results, students' performance on a test, and so on."
In particular, a measure and a measurable function are not the same thing, and the new sentence obscures the definition of a statistic. The point is that mathematical statistics has a precise definition of a statistic, based on measure-theoretic probability theory, and I have added a reference for that definition. An intuitive definition such as the one in the second paragraph of the article is fine as a gentle introduction, but it should be complemented by a more rigorous mathematical definition.
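For concreteness, the distinction can be sketched as follows. The symbols are illustrative notation chosen here, not taken from the article or the cited reference.

```latex
Let $(\mathcal{X}, \mathcal{A})$ be the measurable space in which the data sample takes values,
and let $(\mathcal{T}, \mathcal{B})$ be a second measurable space.

A \emph{statistic} is a measurable function of the sample that does not depend on unknown parameters:
\[
  T : \mathcal{X} \to \mathcal{T},
  \qquad
  T^{-1}(B) \in \mathcal{A} \quad \text{for every } B \in \mathcal{B}.
\]

A \emph{measure}, by contrast, is a countably additive set function on the $\sigma$-algebra:
\[
  \mu : \mathcal{A} \to [0, \infty],
  \qquad
  \mu(\emptyset) = 0 .
\]
```

A measure assigns a size to sets of outcomes, whereas a statistic assigns a value to each observed sample.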
I agree that my original sentence may not have been very readable, so to strike a compromise I combined the good parts of both sentences and produced what now appears in the article. Cheers, --Hendra I. Nurdin 17:25, 10 November 2007 (CST)
- Outstanding edit! --Michael J. Formica 19:17, 10 November 2007 (CST)
- "A data sample is regarded as instances of a random variable of interest..."
- I think referring to "random variable" here narrows the focus a little too much.
- Statistics is largely about extracting concise information from large piles of data. Sometimes the data set is best described without reference to a numerical random variable. For instance, the fact that the most common first name in this or that town is "Billy" is a perfectly good statistic, as is the fact that "I" is the most commonly used word in English.
- Ragnar Schroder 18:09, 8 December 2007 (CST)
- It should be noted that a random variable need not be numerical, although numerical values are of course important for quantitative analysis. For example, a random variable X can take values in the discrete set {'Billy', 'James', 'Agnes', 'Jill'}, endowed with the discrete topology, with the Borel σ-algebra taken to be the one generated by the open sets of that topology (here, the collection of all subsets). But ultimately the elements of this set can be mapped to numerical values, e.g., by the one-to-one assignment 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
- I really have no idea how you would manage to extricate statistics from random variables and, more generally, from probability theory. What would then be the theoretical basis (if any) for explaining your data and justifying your methods? Are there examples of notions in statistics that cannot be given a firm footing in mathematical statistics?
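As a rough illustration of the exchange above, the following Python sketch treats a first name as a non-numerical random variable, computes "most common name" as a statistic of the sample, and checks that the one-to-one numerical encoding from the discussion gives the same answer. The sampling weights, the sample size, and the helper names `draw_sample` and `most_common_value` are assumptions made up for this example; they do not come from the article or from any participant.

```python
import random
from collections import Counter

# Value set of the categorical random variable, as in the discussion.
NAMES = ["Billy", "James", "Agnes", "Jill"]

# One-to-one assignment 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
ENCODING = {name: code for code, name in enumerate(NAMES)}
DECODING = {code: name for name, code in ENCODING.items()}


def draw_sample(n, weights, seed=0):
    """Draw n realizations of the name-valued random variable (hypothetical weights)."""
    rng = random.Random(seed)
    return rng.choices(NAMES, weights=weights, k=n)


def most_common_value(sample):
    """A statistic: a function of the sample returning its most frequent value."""
    return Counter(sample).most_common(1)[0][0]


# Assumed distribution over names, chosen only for illustration.
sample = draw_sample(1000, weights=[0.4, 0.3, 0.2, 0.1])
encoded = [ENCODING[name] for name in sample]

mode_name = most_common_value(sample)   # statistic on the raw names
mode_code = most_common_value(encoded)  # same statistic on the numerical codes

# The encoding is a bijection, so both views identify the same name.
assert DECODING[mode_code] == mode_name
print(mode_name, mode_code)
```

The same function serves as the statistic in both cases; only the labels of the outcomes change, which is one way to read the claim that the numerical and non-numerical descriptions are interchangeable here.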