Talk:Statistics theory: Difference between revisions

From Citizendium
{{subpages}}
==Definition of a statistic==
The modified sentence:


I agree that my original sentence may not have been very readable, so to strike a compromise I combined the good parts of both sentences and produced what now appears in the article. Cheers, --[[User:Hendra I. Nurdin|Hendra I. Nurdin]] 17:25, 10 November 2007 (CST)
:Outstanding edit! --[[User:Michael J. Formica|Michael J. Formica]] 19:17, 10 November 2007 (CST)
:"A data sample is regarded as instances of a random variable of interest..."
:I think referring to "random variable" here narrows the focus a little too much.
:Statistics is largely about extracting concise info from large piles of data.  Sometimes,  the data set is best described without reference to a numerical random variable,  f.i. the fact that the most common 1st name in this or that town is "Billy" is a perfectly good statistic,  ditto that "I" is the most commonly used word in English.
:[[User:Ragnar Schroder|Ragnar Schroder]] 18:09, 8 December 2007 (CST)
::It should be noted that a [[random variable]] need not be numerical, but of course numerics is important for quantitative analysis. For example, one can have a random variable X take values in the discrete set {'Billy', 'James', 'Agnes', 'Jill'} endowed with the discrete topology and then take the Borel σ-algebra to be that generated by the open sets of that discrete topology. But ultimately this set can be mapped to numerical values, e.g., by the 1-to-1 assignment 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
::I really have no idea how you would manage to extricate statistics from random variables and, more generally, probability theory, for what would then be the theoretical basis (if any) for explaining your data and justifying your methods? Are there examples of notions in statistics that cannot be given a firm footing with mathematical statistics? [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 00:56, 9 December 2007
:::Not 'extricate',  but rather 'deemphasize'.  Rvs are just an ad hoc artifact of the mathematical model of the situation at hand - after all,  not even coin-flipping has a ''unique'' a priori given random variable associated with it.
:::Like in your example above,  there's an infinity of functions to choose from, with no formal reason to prefer one to the other.
:::Sometimes, like when the statistic in question is the population [[mode]], they're not really called upon.
:::Of course,  your point that one ultimately can't live without them is well taken.
:::Btw. thanks for informing me that rvs need not be numbers,  I didn't realize that.  I appreciate the enlightenment.
:::[[User:Ragnar Schroder|Ragnar Schroder]] 19:57, 8 December 2007 (CST)
::::Though a random variable may not be explicitly called upon, it does not mean that it is not implicitly used in a certain problem. It's only that these details are usually just swept under the rug in applications. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 17:59, 9 December 2007 (CST)
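The point made in this thread can be illustrated with a minimal sketch. It assumes a small invented sample of first names (the data are hypothetical) and uses the 1-to-1 assignment suggested above; it shows that the mode can be computed directly on the names, and that encoding the names as numbers gives the same answer.

```python
from collections import Counter

# Hypothetical sample of first names; the values are invented for illustration.
sample = ["Billy", "James", "Billy", "Agnes", "Jill", "Billy", "James"]

# The mode is a statistic that can be computed without any numerical encoding.
mode_name, mode_count = Counter(sample).most_common(1)[0]

# One arbitrary 1-to-1 encoding (the assignment suggested in the thread);
# any other bijection would serve equally well.
encoding = {"Billy": 0, "James": 1, "Agnes": 2, "Jill": 3}
encoded = [encoding[name] for name in sample]
mode_code = Counter(encoded).most_common(1)[0][0]

# Decoding the numerical mode recovers the same name.
decoding = {code: name for name, code in encoding.items()}
assert decoding[mode_code] == mode_name

print(mode_name, mode_count)
```

The choice of encoding does not affect the result, which is consistent with both remarks above: the numerical labels are arbitrary, yet a random variable is still implicitly present.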
== Readability ==
Ragnar, Hendra:  I am reading your discussion about random variables with much interest.  I have a concern about the readability of the article, and I am wondering if we could address it.  I have a Masters degree in Stats, and, yet, I am struggling with the language that we are using to present the initial concepts here.  Both the NY and the London Times are written on a 5th grade (by American standards) reading level.  Do you think we could tone the article down to be more readable?  Blessings... --[[User:Michael J. Formica|Michael J. Formica]] 06:20, 9 December 2007 (CST)
:IMHO it seems to read just fine, I mean there is a gentle general introduction about the subject that is worded to be suitable for lay people, and then there's also a more technical definition as well for those who have a math background. I think inclusion of a few examples or nice applications in the article will help clarify things... but then again you'd need to explain more clearly by what you mean when you say "readable" ...
:I have to stress, as I have also mentioned to Ragnar on other occasions, that this is *foremost* a math article sitting in the Mathematics Workgroup. Thus one should not expect such articles to be written to cater exclusively to lay people/general public -- there should be some balance in the presentation. At least if I need to look up a math related topic, I would expect to get some mathematics though I may then realize that I do not yet have all background necessary to completely understand an article (which simply means I have some catching up to do if I really want/need to understand).
:Good math articles on sites like CZ or WP can potentially also be nice quick/initial refs for active math-inclined grad students (not necessarily studying math) or researchers  who need to look up a certain definition or get some feeling for a topic they are not yet familiar with (for all its woes, WP does have a lot of good math articles, though I think their statistics article isn't up to scratch the last time I saw it) -- but how can they do that if the article is written to be devoid of "serious" math content?
:Sorry, if perhaps I had misunderstood your concerns...[[User:Hendra I. Nurdin|Hendra I. Nurdin]] 18:33, 9 December 2007 (CST)
::You did, so we agree, then, to disagree.  I was not referring to the content, but the manner in which it is (now, was) presented.  I have revised the article for readability and interior definition, without compromising the content.  --[[User:Michael J. Formica|Michael J. Formica]] 09:25, 10 December 2007 (CST)
:::Thanks for the edit. We should try to insert an illustrative example of how statistics work -- I'll see if I can do that in the near future. BTW, is there something we are disagreeing on? Cheers, [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 16:35, 10 December 2007 (CST)
:Readability is my main concern:  In order for intellectual progress to occur, it's imperative that the present state of knowledge is absorbed as fast as possible by as many fertile minds as possible.
:However,  different people have different models and ways of gaining "understanding".  Some people learn from being presented a step-by-step reasoning chain,  others learn best from being presented with an overview they can intuit on.
:Writing an article for one group seems to make it rather unreadable for the other.  I really have no good solution for that conundrum,  other than to try to keep things as down to earth simple as possible.
:I think there's even developed a formal theory about this: Antoni KĘPIŃSKI's theory of informational metabolism, explained [http://www.indopedia.org/Information_metabolism.html here], [http://cat.inist.fr/?aModele=afficheN&cpsidt=1119366 here] and [http://babelfish.altavista.com/babelfish/trurl_pagecontent?lp=ru_en&trurl=http://www.socioniko.net/ru/1.begin/infomet.html here].
:[http://www.socioniko.net/en/1.begin/index.html socioniko.net] expounds a union of Jung's and KĘPIŃSKI's model.
:I'm not a psychologist, so I may be wrong in much of the above.  Your insights are welcome.
:[[User:Ragnar Schroder|Ragnar Schroder]] 22:39, 10 December 2007 (CST)
::Sure, I know where you're coming from, as I'm sure you also know where I'm coming from. I also can't see how to get out of the conundrum. So, let's just keep trying to add some improvements and see how it goes... [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 00:43, 11 December 2007 (CST)
Hi all.  I took a look at this page and also wanted to help with the readability.  I also hope that CZ can be very readable, while still covering complex topics.  I made a few changes, like a new introduction, that talks about statistics in very general terms.  I also think that formulas are okay, as long as there is good explanation text close to it.  Hope I can help with this. [[User:Gene Shackman|Gene Shackman]] 04:35, 28 February 2009 (UTC)
== Statistics or ''statistical mathematics''? ==
I am attempting to draft an article on [[economic statistics]] and it would be useful to add, and draw upon, a link to this article.
Unfortunately, I find that the article is confined to the mathematics of statistics with no reference to the fact that their usefulness depends upon the methods by which they are categorised, collected and aggregated by professional statisticians. If it is not considered appropriate to include such material in this article, may I suggest that its title should be changed to "statistical theory" or "statistical mathematics" in order to make room for a new article on statistics  that would relate to their relevance  to disciplines other than mathematics, and to their other users. [[User:Nick Gardner|Nick Gardner]] 12:04, 28 January 2009 (UTC)
I have altered the opening sentence of this article as a reminder of the existence of professional statisticians - without whose achievements in the production of statistics, there would be nothing for academics to manipulate. Their work is interesting, demanding and important, and I am prepared to contest any attempts to define them out of existence. [[User:Nick Gardner|Nick Gardner]] 22:42, 10 February 2009 (UTC)
The revised opening statement avoids the absurdity of saying that statistics is a branch of mathematics, but it is still an invitation to confused thinking. It should be stated clearly that a statistic is an item of information that can exist independently of mathematics - and  that is also true of statistics (singular). Mathematics often assists the interpretation of statistics, but it is not necessary for that purpose. Everyone can interpret the statement that more people live in America than in France without the benefit of help from mathematicians - even if that statement is expressed numerically. That is perhaps a rather silly reductio ad absurdum,  but the serious point is that there are many problems of statistical inference that can be solved without the use of mathematics - and that failure to reach the correct solution to such problems is a common source of error (ask your banking friends).
[[User:Nick Gardner|Nick Gardner]] 10:59, 28 February 2009 (UTC)
: I also revised the opening of this article to put in more everyday examples or understanding.  I kept 'mathematics' in the start because statistics does rely on mathematics, and because as a relative newbie, I didn't want to completely remove it.  I also moved up the statement about the importance of "usefulness depends upon the methods by which they are categorised, collected and aggregated "  That would be good to mention in the introduction.
: I'm also going through and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. Is that okay with folks? [[User:Gene Shackman|Gene Shackman]] 15:55, 28 February 2009 (UTC)
==external links==
I had a couple of external links to on line statistical books.  These would be very useful if people wanted more detail.  Any particular reason why these were removed? [[User:Gene Shackman|Gene Shackman]] 15:48, 28 February 2009 (UTC)
:Gene, they were not removed. Someone correctly relocated them to the external links subpage which is one of the ways that Citizendium differs from Wikipedia. [[User:Milton Beychok|Milton Beychok]] 17:41, 28 February 2009 (UTC)
::Yes, that was me. See the "external links" tab at the top of the page.  Click the [[CZ:Subpages|"?"]] mark tab next to it to learn more about our use of subpages.  Hope this helps. [[User:Chris Day|Chris Day]] 20:47, 28 February 2009 (UTC)
==The workgroup categories in the Metadata template==
I am curious as to why the Metadata template lists only the Mathematics and Psychology Workgroups as categories. Statistics are utilized in many other disciplines as well ... for example, Healing Arts (Medicine), Economics, Engineering and Politics.  I agree that Mathematics is certainly appropriate but should not the two others allowed by the Metadata template perhaps be reconsidered? [[User:Milton Beychok|Milton Beychok]] 17:41, 28 February 2009 (UTC)
: Hi Milton.  I added this to the introduction "Statistics is used in a very wide variety of fields. For example, statistics is used to develop and analyze psychological tests and public opinion surveys, in program evaluation to determine whether a program works or how it can be improved, in medicine with clinical trials to test the safety and effectiveness of new drugs, and in many other areas."  Could you add something about how statistics is used in engineering, and any other discipline? I'm not sure how to add the workgroups, but adding others sounds good to me. [[User:Gene Shackman|Gene Shackman]] 20:16, 28 February 2009 (UTC)
::The assignment of workgroups is done in the metadata.  The metadata can be seen by clicking the orange "M" at the top right of the talk page header. More information on that can be found by clicking the link to a description of the "[[CZ:The_Article_Checklist|article checklist]]". [[User:Chris Day|Chris Day]] 20:50, 28 February 2009 (UTC)
==Okay to make major revisions?==
I'm going through this article and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. I'd probably delete much of what is already here and rewrite.  Is that okay with folks?  Would folks prefer if I put a rewrite in some kind of sandbox first?  Can someone let me know how to do that? [[User:Gene Shackman|Gene Shackman]] 03:51, 1 March 2009 (UTC)
:Check out this subpage, [[Statistics/Advanced]].  I forget where the discussion is on the forum but this was also done for [[Quantum mechanics]]. [[User:Chris Day|Chris Day]] 04:18, 1 March 2009 (UTC)
:: The advanced statistics page looks like what the statistics page used to look like.  Which one is the "statistics" page? I guess people would first go to the statistics page and then there would be a link to the advanced statistics page, right?  So is it okay if I change this statistics page? [[User:Gene Shackman|Gene Shackman]] 05:33, 1 March 2009 (UTC)
:::I'd say go ahead. The advanced page is a subpage, you can see it as a tab at the top of the statistics page. It is the exact same content as the article before you started editing, I just copy and pasted it from the history.  It does not exist as an article in its own right since it is part of the Statistics [http://en.citizendium.org/wiki?title=Special%3APrefixIndex&from=Statistics%2F&namespace=0 cluster] of pages. They all share the same metadata (see [[Template:Statistics/Metadata]]). [[User:Chris Day|Chris Day]] 13:57, 1 March 2009 (UTC)
::::Somewhere, there should be a place for:
::::*Do not use statistics as a drunkard uses a lamp-post: for support rather than illumination
::::*Statistics are like a bikini. What they reveal is suggestive; what they conceal is vital.
::::There is the lovely operations research parable of where to armor a B-17. Trying to remember if I did write that up somewhere...
::::Apropos of quantum mechanics, I have had it suggested that my attempts to set the timing and points on an engine followed the Uncertainty Principle. Electronic ignition is better.
::::Further, Schrodinger's Cat is totally unnecessary and inhumane. Only brief feline exposure is quite adequate that the velocity and position of a moving cat cannot be predicted. Confirming the legend, we have a small cutout next to one of the house doors, so, indeed, it is routine to see Cats Walking Through Walls. [[User:Howard C. Berkowitz|Howard C. Berkowitz]] 19:11, 1 March 2009 (UTC)
===Good start===
On rereading the introduction, I wonder if it's wise to be introducing "model" this early in the discussion. It's one thing to talk about descriptive statistics, statistical experiments in the sense of hypothesis testing/Type I/II error, etc., but the economy?  I think of statistics as inputs into econometric, meteorological, or other models of complex systems, but let's not confuse those with an introduction to statistics.
Indeed, [[computer simulation]] mixes simulation and analytic modeling more than I find ideal. While I have some moderate experience with simulation software, I'd much rather stay discipline specific with modeling. In military work, the [[Lanchester equations]] are only a starting point, essentially a historical one. Of course, [[Murphy's Laws of Combat]] can never be ignored. [[User:Howard C. Berkowitz|Howard C. Berkowitz]] 20:30, 1 March 2009 (UTC)
: I agree with Howard and I recommend deleting the idiosyncratic reference to economic models. Although they make use of the results of statistical investigations (as do pharmacy and bridge design) they do not involve any statistics and are certainly not an example of its application. [[User:Nick Gardner|Nick Gardner]] 15:01, 28 June 2009 (UTC)
==No consultation?==
I think there should have been some effort to contact the original contributors of this article to ask what they think of these major changes. CZ provides a method to contact authors via email as necessary. Everything that I and some others have contributed to the article has been happily deleted. I think the best approach would have been to reach a compromise rather than deleting contents outright.
 
This article is sitting in a mathematics section of CZ, yet all of the maths and the discussions of some of the abstract concepts that lie at the heart of statistics (and frankly provide justification for it) have been removed. I think foundations are important and deserve some discussion in this article. Perhaps this article should be moved to another workgroup or, since "Statistics is not mathematics" as some would claim (perhaps it isn't just mathematics, but see also my remark below), maybe a new CZ Workgroup called Statistics would be in order then?
I don't think statistics has quite the same relationship as physics to mathematics. Physics does not rely on mathematics to justify its laws (e.g., Newton's law or the laws of quantum mechanics), at least not at the outset, but my feeling is statistics does. It relies on probability theory to give foundations to its methods. Without these justifications, the methods of statistics would all just be ad hoc and could be dangerous when applied in practice (due to unjustified conclusions). Take bootstrapping for example and check some of Brad Efron's early papers on this topic. Using any statistical method without some mathematical theory supporting it would be akin to selling snake oil.
The issue with statistics is that it is used in a variety of fields, including engineering (my field), so people of different backgrounds have different ideas about what "statistics" is. I had always appreciated the foundations of statistics because it gives me a sense of confidence in the statistical tools that I use and about conclusions that I can draw with them. Is there any other way one can achieve this? [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 21:33, 1 March 2009 (UTC)
:Good point that you should have been e-mailed. But do note that your version is still in the [[Statistics/Advanced|advanced subpage]]; tab at the top.  At this point I'm not sure what to do here, that was my quick reaction to what I saw.  The other option is to have the original article as the main one with a [[Statistics/Student Level|Student Level]] subpage.  In many ways this will probably be a test case with regard to the level of the CZ audience. Possibly it is time to push this article towards approval and see where that takes us? [[User:Chris Day|Chris Day]] 21:59, 1 March 2009 (UTC)
::Thanks, Chris. I did not know that. Yeah, I guess this is an important article and maybe it is time to get it polished. It would be great to have some statistician on board though, but I don't see any around. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 00:17, 2 March 2009 (UTC)
:Okay, I suppose I was rather quick to make changes. Next time I'll search for the original authors and send emails. Just to note, however, I asked a couple of times about making changes.  See my note above:
:"I'm also going through and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. Is that okay with folks? Gene Shackman 15:55, 28 February 2009 (UTC) "
: and another one earlier today or yesterday.
:My point of view was trying to make this article more concept driven, less formula driven, at least in the beginning. If you all want to revert it to the original, that's fine by me. [[User:Gene Shackman|Gene Shackman]] 22:59, 1 March 2009 (UTC)
::Gene, I don't subscribe to deletions (unless it's something cranky) and obviously you've already spent some time on this. I think it's best to just leave it as it is for now, and maybe I can add some stuff to the article in the next few days -- it's just hard to find time to do stuff on CZ these days. Of course, no one is stopping you from doing further work on the article. I did note your remarks, but because I do not always go to CZ these days I did not see them in time. But if there was an email about it, I would have been able to get involved in the discussions.
::One point is that in the current article there's no discussion of why statistics is called statistics, and what is meant by a "statistic" as in the older versions, although examples are given such as mean, median, standard deviations. Thanks. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 00:17, 2 March 2009 (UTC)
Suggestion?  There is a feature in "my preferences" that lets CZ email a user if someone makes a change to an article on your watchlist.  I'm concerned that requiring users to contact past editors before making changes to an article would not encourage wiki collaboration.  Users should feel free to edit any article that they feel they can improve.  Of course large deletions such as this need to be explained on the talk page, as Gene did. [[User:D. Matt Innis|D. Matt Innis]] 00:55, 2 March 2009 (UTC)
:I put back in the stuff I took out (with a little changed headers). [[User:Gene Shackman|Gene Shackman]] 02:22, 2 March 2009 (UTC)
::Incidentally, the discussions above prompted me to do a bit of research into whether statistics is considered a branch of mathematics or a separate discipline. It seems the consensus by professional organizations of statisticians is that statistics is considered as a separate discipline, like physics and chemistry. So, there should probably really be a CZ Workgroup set up called "Statistics" perhaps under the applied sciences category (like engineering). Well, I guess it would be the first in an online encyclopedia like this, I don't see them doing that in Wikipedia. But until then, I guess it should be left under mathematics. A discussion of this could go into the article as well, it would make an interesting point. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 11:04, 2 March 2009 (UTC)
:::Math and Stats are different departments at UW-Madison. (https://www.math.wisc.edu/ and  http://www.stat.wisc.edu/)[[User:Chris Day|Chris Day]] 15:29, 2 March 2009 (UTC)
==Advanced subpage==
This is the first article I have seen with an advanced subpage. I was concerned that [[statistics]] has a five paragraph introduction with one citation. I now see that the former citations are on the advanced page. Two questions: 1) should we put an intro line on the main page to alert users that the advanced page exists? 2) How do we decide what content goes on what page and prevent two parallel pages from developing? I have seen the basic versus advanced pages done well at Wikipedia with quantum mechanics, but I suspect this requires much work for someone to curate. - [[User:Robert Badgett|Robert Badgett]] 02:35, 2 March 2009 (UTC)
I was thinking this could be a basic page about statistics, with a line near the top linking it to one or more advanced statistics pages, perhaps something like "many advanced methods are available, such as factor analysis, time series analysis, regression (others....)" with each of those linked to pages about those topics.  That work? [[User:Gene Shackman|Gene Shackman]] 04:22, 2 March 2009 (UTC)
:I think it will work if 1) the intro is clear so an author-in-a-hurry doesn't write to the wrong page and 2) someone monitors to occasionally relocate content. - [[User:Robert Badgett|Robert Badgett]] 15:34, 2 March 2009 (UTC)
==Organization of basic page==
Gene, I think I see where you are heading and the direction seems good. Seems you could greatly redo the organization into basics, description, and inferential.  The sections more introduction and more illustration seem to need more specific labeling and improved placement into the structure of the page. Ok by me to make big changes. - [[User:Robert Badgett|Robert Badgett]]
== Intellectual confusion ==
To say that statistics is a branch of mathematics is like saying that words are a branch of literary criticism. Unless a clear distinction is maintained between what statistics are, and how they may be interpreted, logical confusion is inevitable. Can we please have a clear recognition that the task of collecting reliable numerical information - which has nothing whatever to do with mathematics - is the foundation of statistics, without which all else is futile - and something about how it is done? (Incidentally, the idea that economic modelling is a branch of statistical mathematics pushes intellectual confusion towards its outer limits!)  [[User:Nick Gardner|Nick Gardner]] 21:48, 15 May 2009 (UTC)
Since nobody has come forward to defend the paragraph on economic modelling, I propose to delete it in 2 days time. [[User:Nick Gardner|Nick Gardner]] 16:23, 2 July 2009 (UTC)
== New title ==
Rather than altering the existing text to include some of the more important non-mathematical aspects of statistics, I have altered its title and started a new article entitled [[Applied statistics]]. I did consider consulting those interested before making the change, but realised that my reasons for doing so could be effectively explained only by setting up an outline of the new article. The move can, of course, be reversed if it is generally considered to be objectionable, and I should then revert to the option of amending the existing text. I hope that this will not be necessary, however, and that some of the mathematics workgroup's statistics users will contribute to the new article. Some overlap is probable, but I don't consider that to be harmful. [[User:Nick Gardner|Nick Gardner]] 10:38, 27 June 2009 (UTC)
The draft article on [[Applied statistics]] is now complete, leaving this article free of any need to go beyond the abstract aspects of the subject. A more comprehensive treatment of those aspects would seem necessary, however.[[User:Nick Gardner|Nick Gardner]] 05:27, 11 July 2009 (UTC)
== The lead: just describe something? ==
The first phrase says: "Statistics theory is a mathematical approach to describe something, predict events, or analyze the relationship between things". Then, is there any application of mathematics beyond statistical theory? Most of physics, in particular, appears to belong to statistical theory according to this "definition". Really? Just because it describes (or analyzes) ''something'' mathematically?! [[User:Boris Tsirelson|Boris Tsirelson]] 17:01, 4 November 2009 (UTC)
: Hi Boris, that's a good point. And, also, I'm not a big fan of the name "Statistics theory", if you have some ideas as to how to improve the definition or even the name of this particular article, please do feel free to make improvements on this article. I am not aware of the name "statistics theory" being used widely for such a topic. Thanks. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 20:51, 5 November 2009 (UTC)
:: No, I do not want to change articles that are far from my competence. I have some idea of ''mathematical'' statistics, but no idea of the rest of statistics. I see that this article is not just about mathematical statistics.
:: "Everyone can interpret the statement that more people live in America than in France without the benefit of help from mathematicians - even if that statement is expressed numerically" (Nick Gardner 10:59, 28 February 2009). If this is indeed an example of a statistical fact, then I really do not know, whether there is a non-statistical fact, at all.  Before changing the article we could reach here a consensus about several examples of facts within and beyond statistics. I mean, facts that are close to the boundary of statistics, from both sides of the boundary. [[User:Boris Tsirelson|Boris Tsirelson]] 12:40, 6 November 2009 (UTC)
::: What's the problem?  If a statistic is not any  fact that is expressed numerically, how else would you define the term? [[User:Nick Gardner|Nick Gardner]] 22:50, 6 November 2009 (UTC)
:::: Well, below I propose a list of several facts; and I wonder, which of them are statistical, in your opinion. [[User:Boris Tsirelson|Boris Tsirelson]] 16:05, 7 November 2009 (UTC)
* The mean distance between the Sun and the Earth is 149.6 x 10<sup>9</sup> m.
* The Solar System contains 8 planets.
* The mass of the proton is 1836 times the mass of the electron.
* An oxygen atom contains 2 electron shells, of 2 and 6 electrons.
* There are 4 fundamental interactions (electromagnetic, strong, weak, and gravitational).
* There are 6 flavors of quarks (up, down, charm, strange, top, and bottom).
* The proterozoic eon contains 3 eras (paleoproterozoic, mesoproterozoic, and neoproterozoic).
* A man has 1 head, 2 arms, and 2 legs.
* There are 5 regular convex polyhedra (Platonic solids: tetrahedron, cube, octahedron, dodecahedron, icosahedron).
* π = 3.14159...
* 2 + 2 = 4.
:(if I may interject a response to the question that Boris has addressed to me)<br> The last 3 are abstractions (and I suppose that a quark might be considered an abstraction too). The others are statements of observed facts, so fall in my opinion in the statistics category. I accept that their inclusion is counterintuitive, but I cannot think of a satisfactory definition that would exclude them. I don't see how to handle the exclusion of any on the grounds of their triviality (how would you define it?). Can you suggest any other criterion that would exclude some of your examples? [[User:Nick Gardner|Nick Gardner]] 19:15, 7 November 2009 (UTC)
::(also interjecting)<br>
::Being not a statistician, I'll try, still. Maybe statistics, just like mathematics, does not have its "ontological" domain in the Nature; rather, its definition should be "epistemological". No matter which numbers it takes, the matter is, what does it do with them. It has (probably) some specific ways of dealing with numbers (not excluding their gathering), and these ways are (somewhat) universal; that is, applicable in different situations. (Application-specific methods should belong to this application, not to statistics. The same for math...) Does it make sense? [[User:Boris Tsirelson|Boris Tsirelson]] 19:57, 7 November 2009 (UTC)
::And if so, then the phrase "statistics is a scientific discipline that is distinct from mathematics, just like physics and chemistry" should be rather "statistics is a scientific discipline that is distinct from physics and chemistry, just like mathematics"! [[User:Boris Tsirelson|Boris Tsirelson]] 20:07, 7 November 2009 (UTC)
::The fact that "A man has 1 head, 2 arms, and 2 legs" is of no interest to statistics just because it has nothing to do with it. Indeed, everyone can interpret the statement without the benefit of help from statisticians! [[User:Boris Tsirelson|Boris Tsirelson]] 20:10, 7 November 2009 (UTC)
::: (continuing interjection)
::: I agree with Boris. None of these statements is a statistical statement. Statistics is not equivalent to counting or measuring. Statistics is concerned with '''sets''' of data which may or may not be all known or observable.
::: "Our solar system has (at least) eight planets." is simply an observation.
::: "A typical solar system has eight planets." would be a statistical statement. Statistics has to provide means to decide whether such a statement is justified, and how it has to be interpreted.
::: [[User:Peter Schmitt|Peter Schmitt]] 02:15, 8 November 2009 (UTC)
:Who said again that there are lies, damned lies, and statistics? This saying shows that there is a kind of statistics that collects numbers (usually this is done by government agencies). The numbers can be recognized by  their tendency to change suddenly at election time. And there is a branch of mathematics that nobody would accuse of telling lies. The connection between the two kinds of statistics is weak and consists of a few concepts like "mean value" and "standard deviation".--[[User:Paul Wormer|Paul Wormer]] 16:55, 7 November 2009 (UTC)
::The source cited for the lede (Ref.1) says: "What is statistics? Statistics is the study of data, and how it can be collected, analyzed, and presented in order to answer questions pertaining to the world around us." This is something quite different from the first definition in the lede! (Whether it is the best definition may still be discussed.) But what is "statistics theory" meant to be? (I don't think that it exists.) The source of this problem is the move from [[Statistics]]. It is probably best to move it back, and to transfer sections which do not belong or fit into the core article to adequately named pages (or subpages?). [[User:Peter Schmitt|Peter Schmitt]] 18:09, 7 November 2009 (UTC)
:::To justify changing the title back to statistics it would be necessary to repair the omission of the vital aspects of the subject that are set out in the article on [[applied statistics]]. I should have no objection if anyone wishes to do that [[User:Nick Gardner|Nick Gardner]] 19:27, 7 November 2009 (UTC)
:::: I agree that this article -- in its current form -- is not a satisfactory core article. But I think that it is easier to start from this one -- adding material and removing (transferring) material -- than starting a new one, and having this as "Statistics theory" whose topic is unclear and which, if at all (because of "theory"), would have to be more theoretical than "Statistics". [[User:Peter Schmitt|Peter Schmitt]] 02:03, 8 November 2009 (UTC)
::::: I'm all for the idea of reverting the name of this article back to "Statistics". The current name "Statistics theory" feels too much like an ad hoc term not familiar to anyone in statistics and related fields. Well, ok, statistics is not mathematics, and probably should be placed in its own workgroup rather than being lumped in with mathematics (see earlier discussions above), but until someone initiates such a new workgroup (the article "Applied statistics" would probably make a home there as well), where it is now located seems to be the most appropriate. [[User:Hendra I. Nurdin|Hendra I. Nurdin]] 23:15, 9 November 2009 (UTC)
:::::: This or that name and/or workgroup, anyway, the lead could contain something like this: "Statistics gives us devices helpful in various researches when interpreting and using arrays of data that are too large for direct, unaided interpretation". [[User:Boris Tsirelson|Boris Tsirelson]] 10:56, 10 November 2009 (UTC)

Latest revision as of 04:56, 10 November 2009

This article is developing and not approved.
 Definition A branch of mathematics that specializes in enumeration, or counted, data and their relation to measured data. [d] [e]
Checklist and Archives
 Workgroup category mathematics [Please add or review categories]
 Talk Archive none  English language variant American English

Definition of a statistic

The modified sentence:

"More generally, a statistic can be any measure within a data sample. This would be some quantification of a random variable, or variables, of interest, such as a height, weight, polling results, test performance, and so on"

does not have the same meaning as the original

"More generally, a statistic can be any measurable function of the data samples, the latter being realizations of the random variables which are of interest such as the height of people, polling results, students' performance on a test, and so on."

In particular, a measure and a measurable function are not the same thing and the new sentence obfuscates the definition of a statistic. The point is that there is a precise definition of a statistic in mathematical statistics which is based on measure theoretic probability theory. For this purpose I provide a reference for this definition. An intuitive definition as given in the second paragraph of the article is fine as a gentle introduction, but it should also be complemented by a more rigorous mathematical definition.

I agree that my original sentence may not have been very readable, so to strike a compromise I combined the good parts of both sentences and produced what now appears in the article. Cheers, --Hendra I. Nurdin 17:25, 10 November 2007 (CST)

Outstanding edit! --Michael J. Formica 19:17, 10 November 2007 (CST)


"A data sample is regarded as instances of a random variable of interest..."
I think referring to "random variable" here narrows the focus a little too much.
Statistics is largely about extracting concise info from large piles of data. Sometimes, the data set is best described without reference to a numerical random variable; for instance, the fact that the most common first name in this or that town is "Billy" is a perfectly good statistic, as is the fact that "I" is the most commonly used word in English.
Ragnar Schroder 18:09, 8 December 2007 (CST)
It should be noted that a random variable need not be numerical, but of course numerics is important for quantitative analysis. For example, one can have a random variable X take values on the discrete set {'Billy', 'James', 'Agnes', 'Jill'} endowed with the discrete topology and then take the Borel σ-algebra to be that generated by the open sets of that discrete topology. But ultimately this set can be mapped to numerical values, e.g., by the 1-to-1 assignment 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
I really have no idea how you would manage to extricate statistics from random variables and, more generally, probability theory, for what would then be the theoretical basis (if any) for explaining your data and justifying your methods? Are there examples of notions in statistics that cannot be given a firm footing with mathematical statistics? Hendra I. Nurdin 00:56, 9 December 2007
Not 'extricate', but rather 'deemphasize'. Rvs are just an ad hoc artifact of the mathematical model of the situation at hand - after all, not even coin-flipping has a unique a priori given random variable associated with it.
Like in your example above, there's an infinity of functions to choose from, with no formal reason to prefer one to the other.
Sometimes, like when the statistic in question is the population mode, they're not really called upon.
Of course, your point that one ultimately can't live without them is well taken.
Btw, thanks for informing me that rvs need not be numbers; I didn't realize that. I appreciate the enlightenment.
Ragnar Schroder 19:57, 8 December 2007 (CST)
Though a random variable may not be explicitly called upon, that does not mean it is not implicitly used in a certain problem. It's only that these details are usually just swept under the rug in applications. Hendra I. Nurdin 17:59, 9 December 2007 (CST)
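(Illustration of the thread above.) A minimal Python sketch, assuming a made-up sample of first names, of the point that the mode can be computed directly on non-numerical values and that the 1-to-1 numerical assignment quoted above gives the same answer. The sample data and the names `sample`, `encoding`, `decoding` and `mode` are hypothetical and introduced only for this example.

```python
from collections import Counter

# Hypothetical sample of first names (illustrative data, not from the discussion).
sample = ["Billy", "James", "Billy", "Agnes", "Jill", "Billy", "James"]

# The 1-to-1 assignment quoted in the thread: 'Billy'->0, 'James'->1, 'Agnes'->2, 'Jill'->3.
encoding = {"Billy": 0, "James": 1, "Agnes": 2, "Jill": 3}
decoding = {code: name for name, code in encoding.items()}


def mode(values):
    """Return the most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


# The same statistic computed on the raw names and on their numerical codes.
encoded_sample = [encoding[name] for name in sample]
mode_of_names = mode(sample)
mode_of_codes = mode(encoded_sample)

print(mode_of_names, decoding[mode_of_codes])
```

Any other 1-to-1 encoding would lead to the same decoded mode, which is consistent with the remark that there is no formal reason to prefer one such function over another.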

Readability

Ragnar, Hendra: I am reading your discussion about random variables with much interest. I have a concern about the readability of the article, and I am wondering if we could address it. I have a Masters degree in Stats, and, yet, I am struggling with the language that we are using to present the initial concepts here. Both the NY and the London Times are written on a 5th grade (by American standards) reading level. Do you think we could tone the article down to be more readable? Blessings... --Michael J. Formica 06:20, 9 December 2007 (CST)

IMHO it seems to read just fine, I mean there is a gentle general introduction about the subject that is worded to be suitable for lay people, and then there's also a more technical definition as well for those who have a math background. I think inclusion of a few examples or nice applications in the article will help clarify things... but then again you'd need to explain more clearly by what you mean when you say "readable" ...
I have to stress, as I have also mentioned to Ragnar on other occasions, that this is *foremost* a math article sitting in the Mathematics Workgroup. Thus one should not expect such articles to be written to cater exclusively to lay people/general public -- there should be some balance in the presentation. At least if I need to look up a math related topic, I would expect to get some mathematics though I may then realize that I do not yet have all the background necessary to completely understand an article (which simply means I have some catching up to do if I really want/need to understand).
Good math articles on sites like CZ or WP can potentially also be nice quick/initial refs for active math-inclined grad students (not necessarily studying math) or researchers who need to look up a certain definition or get some feeling for a topic they are not yet familiar with (for all its woes, WP does have a lot of good math articles, though I think their statistics article isn't up to scratch the last time I saw it) -- but how can they do that if the article is written to be devoid of "serious" math content?
Sorry, if perhaps I had misunderstood your concerns...Hendra I. Nurdin 18:33, 9 December 2007 (CST)
You did, so we agree, then, to disagree. I was not referring to the content, but the manner in which it is (now, was) presented. I have revised the article for readability and interior definition, without compromising the content. --Michael J. Formica 09:25, 10 December 2007 (CST)
Thanks for the edit. We should try to insert an illustrative example of how statistics work -- I'll see if I can do that in the near future. BTW, is there something we are disagreeing on? Cheers, Hendra I. Nurdin 16:35, 10 December 2007 (CST)


Readability is my main concern: In order for intellectual progress to occur, it's imperative that the present state of knowledge is absorbed as fast as possible by as many fertile minds as possible.
However, different people have different models and ways of gaining "understanding". Some people learn from being presented a step-by-step reasoning chain, others learn best from being presented with an overview they can intuit on.
Writing an article for one group seems to make it rather unreadable for the other. I really have no good solution for that conundrum, other than to try to keep things as down to earth simple as possible.
I think there's even developed a formal theory about this: Antoni KĘPIŃSKI's theory of informational metabolism, explained here, here and here.
socioniko.net expounds a union of Jung's and KĘPIŃSKI's model.
I'm not a psychologist, so I may be wrong in much of the above. Your insights are welcome.
Ragnar Schroder 22:39, 10 December 2007 (CST)
Sure, I know where you're coming from, as I'm sure you also know where I'm coming from. I also can't see how to get out of the conundrum. So, let's just keep trying to add some improvements and see how it goes... Hendra I. Nurdin 00:43, 11 December 2007 (CST)

Hi all. I took a look at this page and also wanted to help with the readability. I also hope that CZ can be very readable, while still covering complex topics. I made a few changes, like a new introduction, that talks about statistics in very general terms. I also think that formulas are okay, as long as there is good explanation text close to it. Hope I can help with this. Gene Shackman 04:35, 28 February 2009 (UTC)

Statistics or statistical mathematics?

I am attempting to draft an article on economic statistics and it would be useful to add, and draw upon, a link to this article.

Unfortunately, I find that the article is confined to the mathematics of statistics with no reference to the fact that their usefulness depends upon the methods by which they are categorised, collected and aggregated by professional statisticians. If it is not considered appropriate to include such material in this article, may I suggest that its title should be changed to "statistical theory" or "statistical mathematics" in order to make room for a new article on statistics that would relate to their relevance to disciplines other than mathematics, and to their other users. Nick Gardner 12:04, 28 January 2009 (UTC)

I have altered the opening sentence of this article as a reminder of the existence of professional statisticians - without whose achievements in the production of statistics, there would be nothing for academics to manipulate. Their work is interesting, demanding and important, and I am prepared to contest any attempts to define them out of existence. Nick Gardner 22:42, 10 February 2009 (UTC)

The revised opening statement avoids the absurdity of saying that statistics is a branch of mathematics, but it is still an invitation to confused thinking. It should be stated clearly that a statistic is an item of information that can exist independently of mathematics - and that is also true of statistics (singular). Mathematics often assists the interpretation of statistics, but it is not necessary for that purpose. Everyone can interpret the statement that more people live in America than in France without the benefit of help from mathematicians - even if that statement is expressed numerically. That is perhaps a rather silly reductio ad absurdum, but the serious point is that there are many problems of statistical inference that can be solved without the use of mathematics - and that failure to reach the correct solution to such problems is a common source of error (ask your banking friends). Nick Gardner 10:59, 28 February 2009 (UTC)

I also revised the opening of this article to put in more everyday examples or understanding. I kept 'mathematics' in the start because statistics does rely on mathematics, and because as a relative newbie, I didn't want to completely remove it. I also moved up the statement about the importance of "usefulness depends upon the methods by which they are categorised, collected and aggregated". That would be good to mention in the introduction.
I'm also going through and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. Is that okay with folks? Gene Shackman 15:55, 28 February 2009 (UTC)

external links

I had a couple of external links to on line statistical books. These would be very useful if people wanted more detail. Any particular reason why these were removed? Gene Shackman 15:48, 28 February 2009 (UTC)

Gene, they were not removed. Someone correctly relocated them to the external links subpage which is one of the ways that Citizendium differs from Wikipedia. Milton Beychok 17:41, 28 February 2009 (UTC)
Yes, that was me. See the "external links" tab at the top of the page. Click the "?" mark next to it to learn more about our use of subpages. Hope this helps. Chris Day 20:47, 28 February 2009 (UTC)

The workgroup categories in the Metadata template

I am curious as to why the Metadata template lists only the Mathematics and Psychology Workgroups as categories. Statistics are utilized in many other disciplines as well ... for example, Healing Arts (Medicine), Economics, Engineering and Politics. I agree that Mathematics is certainly appropriate but should not the two others allowed by the Metadata template perhaps be reconsidered? Milton Beychok 17:41, 28 February 2009 (UTC)

Hi Milton. I added this to the introduction "Statistics is used in a very wide variety of fields. For example, statistics is used to develop and analyze psychological tests and public opinion surveys, in program evaluation to determine whether a program works or how it can be improved, in medicine with clinical trials to test the safety and effectiveness of new drugs, and in many other areas." Could you add something about how statistics is used in engineering, and any other discipline? I'm not sure how to add the workgroups, but adding others sounds good to me. Gene Shackman 20:16, 28 February 2009 (UTC)
The assignment of workgroups is done in the metadata. The metadata can be seen by clicking the orange "M" at the top right of the talk page header. More information on that can be found by clicking the link to a description of the "article checklist". Chris Day 20:50, 28 February 2009 (UTC)

Okay to make major revisions?

I'm going through this article and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. I'd probably delete much of what is already here and rewrite. Is that okay with folks? Would folks prefer if I put a rewrite in some kind of sandbox first? Can someone let me know how to do that? Gene Shackman 03:51, 1 March 2009 (UTC)

Check out this subpage, Statistics/Advanced. I forget where the discussion is on the forum but this was also done for Quantum mechanics. Chris Day 04:18, 1 March 2009 (UTC)
The advanced statistics page looks like what the statistics page used to look like. Which one is the "statistics" page? I guess people would first go to the statistics page and then there would be a link to the advanced statistics page, right? So is it okay if I change this statistics page? Gene Shackman 05:33, 1 March 2009 (UTC)
I'd say go ahead. The advanced page is a subpage; you can see it as a tab at the top of the statistics page. It is the exact same content as the article before you started editing; I just copied and pasted it from the history. It does not exist as an article in its own right since it is part of the Statistics cluster of pages. They all share the same metadata (see Template:Statistics/Metadata). Chris Day 13:57, 1 March 2009 (UTC)
Somewhere, there should be a place for:
  • Do not use statistics as a drunkard uses a lamp-post: for support rather than illumination
  • Statistics are like a bikini. What they reveal is suggestive; what they conceal is vital.
There is the lovely operations research parable of where to armor a B-17. Trying to remember if I did write that up somewhere...
Apropos of quantum mechanics, I have had it suggested that my attempts to set the timing and points on an engine followed the Uncertainty Principle. Electronic ignition is better.
Further, Schrodinger's Cat is totally unnecessary and inhumane. Only brief feline exposure is quite adequate to show that the velocity and position of a moving cat cannot be predicted. Confirming the legend, we have a small cutout next to one of the house doors, so, indeed, it is routine to see Cats Walking Through Walls. Howard C. Berkowitz 19:11, 1 March 2009 (UTC)

Good start

On rereading the introduction, I wonder if it's wise to be introducing "model" this early in the discussion. It's one thing to talk about descriptive statistics, statistical experiments in the sense of hypothesis testing/Type I/II error, etc., but the economy? I think of statistics as inputs into econometric, meteorological, or other models of complex systems, but let's not confuse those with an introduction to statistics.

Indeed, computer simulation mixes simulation and analytic modeling more than I find ideal. While I have some moderate experience with simulation software, I'd much rather stay discipline specific with modeling. In military work, the Lanchester equations are only a starting point, essentially a historical one. Of course, Murphy's Laws of Combat can never be ignored. Howard C. Berkowitz 20:30, 1 March 2009 (UTC)

I agree with Howard and I recommend deleting the idiosyncratic reference to economic models. Although they make use of the results of statistical investigations (as do pharmacy and bridge design) they do not involve any statistics and are certainly not an example of its application. Nick Gardner 15:01, 28 June 2009 (UTC)

No consultation?

I think there should have been some effort to contact the original contributors of this article to ask what they think of these major changes. CZ provides a method to contact authors via email as necessary. Everything I contributed to the article, along with some others' contributions, has been happily deleted. I think the best approach would have been to reach a compromise rather than deleting contents outright.

This article is sitting in a mathematics section of CZ, yet all of the maths and the discussions of some of the abstract concepts that lie at the heart of statistics (and frankly provide justification for it) have been removed. I think foundations are important and deserve some discussion in this article. Perhaps this article should be moved to another workgroup or, since "Statistics is not mathematics" as some would claim (perhaps it isn't just mathematics, but see also my remark below), maybe a new CZ Workgroup called Statistics would be in order then?

I don't think statistics has quite the same relationship as physics to mathematics. Physics does not rely on mathematics to justify its laws (e.g., Newton's law or the laws of quantum mechanics), at least not at the outset, but my feeling is statistics does. It relies on probability theory to give foundations to its methods. Without these justifications, the methods of statistics would all just be ad hoc and could be dangerous when applied in practice (due to unjustified conclusions). Take bootstrapping for example and check some of Brad Efron's early papers on this topic. Using any statistical method without some mathematical theory supporting it would be akin to selling snake oil.

The issue with statistics is that it is used in a variety of fields, including engineering (my field), so people of different backgrounds have different ideas about what "statistics" is. I had always appreciated the foundations of statistics because it gives me a sense of confidence in the statistical tools that I use and about conclusions that I can draw with them. Is there any other way one can achieve this? Hendra I. Nurdin 21:33, 1 March 2009 (UTC)

Good point that you should have been e-mailed. But do note that your version is still in the advanced subpage; tab at the top. At this point I'm not sure what to do here, that was my quick reaction to what i saw. The other option is to have the original article as the main one with a Student Level subpage. In many ways this will probably be a test case with regard to the level of the CZ audience. Possibly it is time to push this article towards approval and see where that takes us? Chris Day 21:59, 1 March 2009 (UTC)
Thanks, Chris. I did not know that. Yeah, I guess this is an important article and maybe time to get it polished. It would be great to have some statistician on board though, but don't see there are any around. Hendra I. Nurdin 00:17, 2 March 2009 (UTC)
Okay, I suppose I was rather quick to make changes. Next time I'll search for the original authors and send emails. Just to note, however, I asked a couple of times about making changes. See my note above:
"I'm also going through and thinking of revising this article quite a bit, to make it much more everyday, showing how it relates to everyday life, and make it more accessible to lay people, but still keep in some of the formulas and theory stuff too. Is that okay with folks? Gene Shackman 15:55, 28 February 2009 (UTC) "
and another one earlier today or yesterday.
My point of view was trying to make this article more concept driven, less formula driven, at least in the beginning. If you all want to revert it to the original, that's fine by me. Gene Shackman 22:59, 1 March 2009 (UTC)
Gene, I don't subscribe to deletions (unless it's something cranky) and obviously you've already spent some time on this. I think we should just leave it as it is for now, and maybe I can add some stuff to the article in the next few days -- it's just hard to find time to do stuff on CZ these days. Of course, no one is stopping you from doing further work on the article. I have now noted your remarks, but because I do not always go to CZ these days I did not see them at the time. If there had been an email about it, I would have been able to get involved in the discussions.
One point is that in the current article there's no discussion of why statistics is called statistics, and of what is meant by a "statistic", as there was in the older versions, although examples are given such as mean, median, standard deviations. Thanks. Hendra I. Nurdin 00:17, 2 March 2009 (UTC)

Suggestion? There is a feature in "my preferences" that lets CZ email a user if someone makes a change to an article on your watchlist. I'm concerned that requiring users to contact past editors before making changes to an article would not encourage wiki collaboration. Users should feel free to edit any article that they feel they can improve. Of course large deletions such as this need to be explained on the talk page, as Gene did. D. Matt Innis 00:55, 2 March 2009 (UTC)

I put back in the stuff I took out (with a little changed headers). Gene Shackman 02:22, 2 March 2009 (UTC)
Incidentally, the discussions above prompted me to do a bit of research into whether statistics is considered a branch of mathematics or a separate discipline. It seems the consensus among professional organizations of statisticians is that statistics is considered a separate discipline, like physics and chemistry. So, there should probably really be a CZ Workgroup set up called "Statistics", perhaps under the applied sciences category (like engineering). Well, I guess it would be the first in an online encyclopedia like this; I don't see them doing that in Wikipedia. But until then, I guess it should be left under mathematics. A discussion of this could go into the article as well; it would make an interesting point. Hendra I. Nurdin 11:04, 2 March 2009 (UTC)
Math and Stats are different departments at UW-Madison. (https://www.math.wisc.edu/ and http://www.stat.wisc.edu/) Chris Day 15:29, 2 March 2009 (UTC)

Advanced subpage

This is the first article I have seen with an advanced subpage. I was concerned that statistics has a five paragraph introduction with one citation. I now see that the former citations are on the advanced page. Two questions: 1) Should we put an intro line on the main page to alert users that the advanced page exists? 2) How do we decide what content goes on what page and prevent two parallel pages from developing? I have seen the basic versus advanced pages done well at Wikipedia with quantum mechanics, but I suspect this requires much work for someone to curate. - Robert Badgett 02:35, 2 March 2009 (UTC)

I was thinking this could be a basic page about statistics, with a line near the top linking it to one or more advanced statistics pages, perhaps something like "many advanced methods are available, such as factor analysis, time series analysis, regression (others....)" with each of those linked to pages about those topics. That work? Gene Shackman 04:22, 2 March 2009 (UTC)

I think it will work if 1) the intro is clear so an author-in-a-hurry doesn't write to the wrong page and 2) someone monitors to occasionally relocate content. - Robert Badgett 15:34, 2 March 2009 (UTC)

Organization of basic page

Gene, I think I see where you are heading and the direction seems good. Seems you could greatly redo the organization into basics, description, and inferential. The sections more introduction and more illustration seem to need more specific labeling and improved placement into the structure of the page. Ok by me to make big changes. - Robert Badgett

Intellectual confusion

To say that statistics is a branch of mathematics is like saying that words are a branch of literary criticism. Unless a clear distinction is maintained between what statistics are, and how they may be interpreted, logical confusion is inevitable. Can we please have a clear recognition that the task of collecting reliable numerical information - which has nothing whatever to do with mathematics - is the foundation of statistics, without which all else is futile - and something about how it is done? (Incidentally, the idea that economic modelling is a branch of statistical mathematics pushes intellectual confusion towards its outer limits!) Nick Gardner 21:48, 15 May 2009 (UTC)

Since nobody has come forward to defend the paragraph on economic modelling, I propose to delete it in 2 days time. Nick Gardner 16:23, 2 July 2009 (UTC)

New title

Rather than altering the existing text to include some of the more important non-mathematical aspects of statistics, I have altered its title and started a new article entitled Applied statistics. I did consider consulting those interested before making the change, but realised that my reasons for doing so could be effectively explained only by setting up an outline of the new article. The move can, of course, be reversed if it is generally considered to be objectionable, and I should then revert to the option of amending the existing text. I hope that this will not be necessary, however, and that some of the mathematics workgroup's statistics users will contribute to the new article. Some overlap is probable, but I don't consider that to be harmful. Nick Gardner 10:38, 27 June 2009 (UTC)

The draft article on Applied statistics is now complete, leaving this article free of any need to go beyond the abstract aspects of the subject. A more comprehensive treatment of those aspects would seem necessary, however. Nick Gardner 05:27, 11 July 2009 (UTC)

The lead: just describe something?

The first phrase says: "Statistics theory is a mathematical approach to describe something, predict events, or analyze the relationship between things". Then, is there any application of mathematics beyond statistical theory? Most of physics, in particular, appears to belong to statistical theory according to this "definition". Really? Just because it describes (or analyzes) something mathematically?! Boris Tsirelson 17:01, 4 November 2009 (UTC)

Hi Boris, that's a good point. And, also, I'm not a big fan of the name "Statistics theory", if you have some ideas as to how to improve the definition or even the name of this particular article, please do feel free to make improvements on this article. I am not aware of the name "statistics theory" being used widely for such a topic. Thanks. Hendra I. Nurdin 20:51, 5 November 2009 (UTC)
No, I do not want to change articles that are far from my competence. I have some idea of mathematical statistics, but no idea of the rest of statistics. I see that this article is not just about mathematical statistics.
"Everyone can interpret the statement that more people live in America than in France without the benefit of help from mathematicians - even if that statement is expressed numerically" (Nick Gardner 10:59, 28 February 2009). If this is indeed an example of a statistical fact, then I really do not know whether there is a non-statistical fact at all. Before changing the article, we could reach a consensus here about several examples of facts within and beyond statistics. I mean facts that are close to the boundary of statistics, on both sides of the boundary. Boris Tsirelson 12:40, 6 November 2009 (UTC)
What's the problem? If a statistic is not any fact that is expressed numerically, how else would you define the term? Nick Gardner 22:50, 6 November 2009 (UTC)
Well, below I propose a list of several facts; and I wonder, which of them are statistical, in your opinion. Boris Tsirelson 16:05, 7 November 2009 (UTC)
  • The mean distance between the Sun and the Earth is 149.6 × 10^9 m.
  • The Solar System contains 8 planets.
  • The mass of the proton is 1836 times the mass of the electron.
  • An oxygen atom contains 2 electron shells, of 2 and 6 electrons.
  • There are 4 fundamental interactions (electromagnetic, strong, weak, and gravitational).
  • There are 6 flavors of quarks (up, down, charm, strange, top, and bottom).
  • The proterozoic eon contains 3 eras (paleoproterozoic, mesoproterozoic, and neoproterozoic).
  • A man has 1 head, 2 arms, and 2 legs.
  • There are 5 regular convex polyhedra (Platonic solids: tetrahedron, cube, octahedron, dodecahedron, icosahedron).
  • π = 3.14159...
  • 2 + 2 = 4.
(if I may interject a response to the question that Boris has addressed to me)
The last 3 are abstractions (and I suppose that a quark might be considered an abstraction too). The others are statements of observed facts, so fall in my opinion in the statistics category. I accept that their inclusion is counterintuitive, but I cannot think of a satisfactory definition that would exclude them. I don't see how to handle the exclusion of any on the grounds of their triviality (how would you define it?). Can you suggest any other criterion that would exclude some of your examples? Nick Gardner 19:15, 7 November 2009 (UTC)
(also interjecting)
Not being a statistician, I'll still try. Maybe statistics, just like mathematics, does not have its "ontological" domain in Nature; rather, its definition should be "epistemological". It does not matter which numbers it takes; what matters is what it does with them. It has (probably) some specific ways of dealing with numbers (not excluding their gathering), and these ways are (somewhat) universal; that is, applicable in different situations. (Application-specific methods should belong to the application, not to statistics. The same goes for math...) Does it make sense? Boris Tsirelson 19:57, 7 November 2009 (UTC)
And if so, then the phrase "statistics is a scientific discipline that is distinct from mathematics, just like physics and chemistry" should be rather "statistics is a scientific discipline that is distinct from physics and chemistry, just like mathematics"! Boris Tsirelson 20:07, 7 November 2009 (UTC)
The fact that "A man has 1 head, 2 arms, and 2 legs" is of no interest to statistics just because it has nothing to do with it. Indeed, everyone can interpret the statement without the benefit of help from statisticians! Boris Tsirelson 20:10, 7 November 2009 (UTC)
(continuing interjection)
I agree with Boris. None of these statements is a statistical statement. Statistics is not equivalent to counting or measuring. Statistics is concerned with sets of data which may or may not be all known or observable.
"Our solar system has (at least) eight planets." is simply an observation.
"A typical solar system has eight planets." would be a statistical statement. Statistics has to provide means to decide whether such a statement is justified, and how it has to be interpreted.
Peter Schmitt 02:15, 8 November 2009 (UTC)
Who said again that there are lies, damned lies, and statistics? This saying shows that there is a kind of statistics that collects numbers (usually this is done by government agencies). The numbers can be recognized by their tendency to change suddenly at election time. And there is a branch of mathematics that nobody would accuse of telling lies. The connection between the two kinds of statistics is weak and consists of a few concepts like "mean value" and "standard deviation".--Paul Wormer 16:55, 7 November 2009 (UTC)
The source cited for the lede (Ref.1) says: "What is statistics? Statistics is the study of data, and how it can be collected, analyzed, and presented in order to answer questions pertaining to the world around us." This is something quite different from the first definition in the lede! (Whether it is the best definition may still be discussed.) But what is "statistics theory" meant to be? (I don't think that it exists.) The source of this problem is the move from Statistics. It is probably best to move it back, and to transfer sections which do not belong or fit into the core article to adequately named pages (or subpages?). Peter Schmitt 18:09, 7 November 2009 (UTC)
To justify changing the title back to statistics it would be necessary to repair the omission of the vital aspects of the subject that are set out in the article on applied statistics. I should have no objection if anyone wishes to do that. Nick Gardner 19:27, 7 November 2009 (UTC)
I agree that this article -- in its current form -- is not a satisfactory core article. But I think that it is easier to start from this one -- adding material and removing (transferring) material -- than starting a new one, and having this as "Statistics theory" whose topic is unclear and which, if at all (because of "theory"), would have to be more theoretical than "Statistics". Peter Schmitt 02:03, 8 November 2009 (UTC)
I'm all for the idea of reverting the name of this article back to "Statistics". The current name "Statistics theory" feels too much like an ad hoc terminology not familiar to anyone in statistics and related fields. Well, ok, statistics is not mathematics, and probably should be placed in its own workgroup called rather than being lumped in mathematics (see earlier discussions above), but until someone initiates such a new workgroup (the article "Applied statistics" would probably make a home there as well), where it is now located seems to be the most appropriate. Hendra I. Nurdin 23:15, 9 November 2009 (UTC)
This or that name and/or workgroup, anyway, the lead could contain something like this: "Statistics gives us devices helpful in various researches when interpreting and using arrays of data that are too large for direct, unaided interpretation". Boris Tsirelson 10:56, 10 November 2009 (UTC)