Talk:Statistics theory: Difference between revisions

From Citizendium

Revision as of 16:35, 10 December 2007

This article is developing and not approved.
Definition: A branch of mathematics that specializes in enumeration, or counted, data and their relation to measured data.
Workgroup category: Mathematics
Talk archive: none
English language variant: American English

Definition of a statistic

The modified sentence:

"More generally, a statistic can be any measure within a data sample. This would be some quantification of a random variable, or variables, of interest, such as a height, weight, polling results, test performance, and so on"

does not have the same meaning as the original

"More generally, a statistic can be any measurable function of the data samples, the latter being realizations of the random variables which are of interest such as the height of people, polling results, students' performance on a test, and so on."

In particular, a measure and a measurable function are not the same thing, and the new sentence obscures the definition of a statistic. The point is that mathematical statistics has a precise definition of a statistic, based on measure-theoretic probability theory, and I have provided a reference for this definition. An intuitive definition, as given in the second paragraph of the article, is fine as a gentle introduction, but it should be complemented by a more rigorous mathematical definition.

I agree that my original sentence may not have been very readable, so to strike a compromise I combined the good parts of both sentences and produced what now appears in the article. Cheers, --Hendra I. Nurdin 17:25, 10 November 2007 (CST)

Outstanding edit! --Michael J. Formica 19:17, 10 November 2007 (CST)
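
For reference, the distinction drawn above between a measure and a measurable function can be sketched as follows. The notation is assumed here for illustration and is not taken from the article or from the cited reference.

```latex
% Assumed setup: a probability space (\Omega, \mathcal{F}, P) and random
% variables X_1, \dots, X_n taking values in a measurable space
% (\mathcal{X}, \mathcal{A}).
%
% A statistic is a measurable function of the sample:
\[
  T : (\mathcal{X}^n, \mathcal{A}^{\otimes n}) \to (\mathcal{T}, \mathcal{B}),
  \qquad
  T^{-1}(B) \in \mathcal{A}^{\otimes n} \quad \text{for all } B \in \mathcal{B},
\]
% evaluated at the sample, i.e. T(X_1, \dots, X_n). Example: the sample mean
\[
  T(x_1, \dots, x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i .
\]
% A measure, by contrast, is a set function on the sigma-algebra:
\[
  \mu : \mathcal{A} \to [0, \infty],
  \qquad
  \mu(\emptyset) = 0,
  \qquad
  \mu\Bigl(\bigcup_{i=1}^{\infty} A_i\Bigr) = \sum_{i=1}^{\infty} \mu(A_i)
  \quad \text{for pairwise disjoint } A_i \in \mathcal{A}.
\]
```

So a statistic maps data points to values, while a measure assigns sizes to sets of data points, which is why the two terms are not interchangeable.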


"A data sample is regarded as instances of a random variable of interest..."
I think referring to "random variable" here narrows the focus a little too much.
Statistics is largely about extracting concise information from large piles of data. Sometimes the data set is best described without reference to a numerical random variable. For instance, the fact that the most common first name in this or that town is "Billy" is a perfectly good statistic, as is the fact that "I" is the most commonly used word in English.
Ragnar Schroder 18:09, 8 December 2007 (CST)
It should be noted that a random variable need not be numerical, although numerical values are of course important for quantitative analysis. For example, one can have a random variable X take values in the discrete set {'Billy', 'James', 'Agnes', 'Jill'} endowed with the discrete topology, and then take the Borel σ-algebra to be the one generated by the open sets of that topology. But ultimately this set can be mapped to numerical values, e.g., by the 1-to-1 assignment 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
I really have no idea how you would manage to extricate statistics from random variables and, more generally, probability theory, for what would then be the theoretical basis (if any) for explaining your data and justifying your methods? Are there examples of notions in statistics that cannot be given a firm footing with mathematical statistics? Hendra I. Nurdin 00:56, 9 December 2007
Not 'extricate', but rather 'deemphasize'. Random variables are just an ad hoc artifact of the mathematical model of the situation at hand; after all, not even coin-flipping has a unique, a priori given random variable associated with it.
Like in your example above, there's an infinity of functions to choose from, with no formal reason to prefer one to the other.
Sometimes, like when the statistic in question is the population mode, they're not really called upon.
Of course, your point that one ultimately can't live without them is well taken.
By the way, thanks for informing me that random variables need not be numbers; I didn't realize that. I appreciate the enlightenment.
Ragnar Schroder 19:57, 8 December 2007 (CST)
Though a random variable may not be explicitly called upon, that does not mean it is not implicitly used in a given problem. It's only that these details are usually swept under the rug in applications. Hendra I. Nurdin 17:59, 9 December 2007 (CST)
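
The exchange above can be illustrated with a minimal sketch. It assumes a made-up sample of first names and two arbitrary 1-to-1 numerical encodings; the function names and the data are hypothetical. It computes the sample mode directly on the names and again through each encoding, showing that the statistic does not depend on which encoding is chosen.

```python
from collections import Counter


def sample_mode(sample):
    """Return the most frequent value in a sample (on ties, the first seen wins)."""
    return Counter(sample).most_common(1)[0][0]


def mode_via(encoding, sample):
    """Encode the sample numerically, take the mode, and decode it back."""
    decode = {code: name for name, code in encoding.items()}
    coded = [encoding[name] for name in sample]
    return decode[sample_mode(coded)]


# Hypothetical data sample: realizations of a non-numerical random variable.
names = ["Billy", "James", "Billy", "Agnes", "Jill", "Billy", "James"]

# Two different 1-to-1 encodings; the first is the one given in the discussion,
# the second is arbitrary and exists only to show the choice does not matter.
encoding = {"Billy": 0, "James": 1, "Agnes": 2, "Jill": 3}
other_encoding = {"Billy": 7, "James": -2, "Agnes": 40, "Jill": 3}

mode_direct = sample_mode(names)
mode_a = mode_via(encoding, names)
mode_b = mode_via(other_encoding, names)

print(mode_direct, mode_a, mode_b)
```

All three values agree, which is consistent with both points made above: the mode can be computed without ever choosing a numerical encoding, and any 1-to-1 encoding gives the same answer once decoded.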

Readability

Ragnar, Hendra: I am reading your discussion about random variables with much interest. I have a concern about the readability of the article, and I am wondering if we could address it. I have a Master's degree in Stats, and yet I am struggling with the language that we are using to present the initial concepts here. Both the NY and the London Times are written at a 5th grade (by American standards) reading level. Do you think we could tone the article down to be more readable? Blessings... --Michael J. Formica 06:20, 9 December 2007 (CST)

IMHO it seems to read just fine. There is a gentle general introduction to the subject that is worded to be suitable for lay people, and then there is also a more technical definition for those who have a math background. I think including a few examples or nice applications in the article will help clarify things... but then again, you'd need to explain more clearly what you mean when you say "readable" ...
I have to stress, as I have also mentioned to Ragnar on other occasions, that this is *foremost* a math article sitting in the Mathematics Workgroup. Thus one should not expect such articles to be written to cater exclusively to lay people or the general public -- there should be some balance in the presentation. At least if I need to look up a math-related topic, I would expect to get some mathematics, though I may then realize that I do not yet have all the background necessary to completely understand an article (which simply means I have some catching up to do if I really want or need to understand).
Good math articles on sites like CZ or WP can potentially also be nice quick or initial references for active math-inclined grad students (not necessarily studying math) or researchers who need to look up a certain definition or get some feeling for a topic they are not yet familiar with (for all its woes, WP does have a lot of good math articles, though I think their statistics article wasn't up to scratch the last time I saw it) -- but how can they do that if the article is written to be devoid of "serious" math content?
Sorry if I have misunderstood your concerns... Hendra I. Nurdin 18:33, 9 December 2007 (CST)
You did, so we agree, then, to disagree. I was not referring to the content, but the manner in which it is (now, was) presented. I have revised the article for readability and interior definition, without compromising the content. --Michael J. Formica 09:25, 10 December 2007 (CST)
Thanks for the edit. We should try to insert an illustrative example of how statistics work -- I'll see if I can do that in the near future. BTW, is there something we are disagreeing on? Cheers, Hendra I. Nurdin 16:35, 10 December 2007 (CST)