CZ Talk:How to convert Wikipedia articles to Citizendium articles

From Citizendium
Revision as of 08:53, 28 April 2008 by imported>George Swan (→‎More on that pesky WP checkbox)
Shortcut to this page: CZ:WP2CZ

What about encouraging writing from scratch in this article? Being WP-inspired, WP-suggested, but independent and not WP-tied-to? I think that this may have some far reaching consequences. I put a longer post about it on the forum. --Alex Halicz (hello) 08:32, 15 February 2007 (CST) Ps. WP stands for Wikipedia

You're right! What do you think of the first section now? --Larry Sanger 09:04, 15 February 2007 (CST)

Exciting. That's it.--Alex Halicz (hello) 09:08, 15 February 2007 (CST)


I brought a couple of infobox templates over from WP. Now that I've read this article, I think I did so in error, especially considering the number of dead branches they contain. What can be done about this?

They may be useful if you tweak them and remove the need for dead branches. We will need info-like templates for the future and some have been brought over. Chris Day (Talk) 18:10, 15 February 2007 (CST)

On importing

There are a few articles on Wikipedia I wrote from scratch. Should these be treated the same as other imported Wikipedia articles if I import them? Marielle Fields Newsome 21:54, 27 March 2007 (CDT)

Here's what I'd do: either simply mark the articles as being from Wikipedia, or (if you don't) then make quite sure that the versions you import were developed by you and you alone. So, omit any changes/additions made by others. But in any case, we would still ask you to improve the article, please, over what is on Wikipedia, because we do not simply want to be a mirror of WP. Thanks, and welcome! --Larry Sanger 22:43, 27 March 2007 (CDT)

Slightly related: If I created/edited an article on Wikipedia up to where I thought it sufficient; and any changes since then were either administrative (adding category), silly (changing "tire" to "tyre"), or downright bad, and I import the version of which I was the only author: do I then use the "imported from wikipedia" template, or do I use the "I released this article to the Wikipedia. In particular, the identical text that appears there is of my sole authorship. Therefore, no credit for Wikipedia content on Citizendium applies" template? Thanks. Gerald Zuckier 09:51, 23 August 2007 (CDT)
You use the "I released" one.  —Stephen Ewen (Talk) 10:57, 23 August 2007 (CDT)

Wikipedia article links

I'm planning to import a couple of articles that I've worked on and resume working on them here, having given up on it at Wikipedia some time ago. These articles, of course, are full of Wikipedia cross-links. Are there any general guidelines for handling these; and should they be included in this article?

The three obvious choices are to delete them, preserve them (which I presume means editing each one into an external link to Wikipedia), or make them CZ links (which will be mostly to nonexistent articles). I can see advantages to all three. The last is a bit sticky, though, because a whole lot of unimplemented links make an article that simply is worse than one with links to something half-decent or better, like a WP article. Perhaps one should be encouraged to go around finding WP links in articles and creating the proper CZ article?

[signed, belatedly, Daniel Drake 00:55, 30 March 2007 (CDT)]

Please do not make links to WP articles. See CZ:How_to_convert_Wikipedia_articles_to_Citizendium_articles#Interwiki_links. Afterward, you might discriminatingly add some. Stephen Ewen 01:07, 30 March 2007 (CDT)
I'm in the same boat. I've contributed many articles to Wikipedia, and (in my own humble opinion, of course) they were well written compared to what could generally be found on that project. I plan to import a few of the most complete articles wholesale (I've done all the tweaking I would be inclined to do on the Wikipedia version). Cheers! Brian 22:43, 30 March 2007 (CDT)

There isn't any point to our hosting copies of Wikipedia articles that you aren't planning on improving further. CZ isn't a mere mirror of Wikipedia articles. Since CZ hasn't been set up to be an approval service for Wikipedia articles, and since approval in any case requires that we have many more active editors than we have now, it does not make sense (right now, anyway) to upload articles here for purposes of getting them approved. All this said, I have considerable skepticism about anyone who says he cannot substantially improve any piece of prose he's written, particularly something as free-form and expandable as an encyclopedia article. Can't you virtually endlessly improve your text? I find I can mine, particularly longer texts. --Larry Sanger 23:21, 30 March 2007 (CDT)

I've found I can endlessly tweak any piece of prose I've written, but that does not necessarily add up to "improving" it. Some of the articles I've done on Wikipedia were stubs (and some remain stubs). Others were labors of love for subjects dear to my heart, into which I've already invested substantial sums of time. There are only a handful (maybe a half dozen) for which I would claim the subject is covered completely and no further work need be done, but I see no logic in avoiding the importation of that handful simply because they are already as complete as I feel I am capable of making them. Cheers! Brian 00:56, 31 March 2007 (CDT)
Brian - and do see the important email I sent you earlier - see CZ:Introduction_to_CZ_for_Wikipedians#Citizendium_is_NOT_a_mirror_of_Wikipedia for a solution to your situation over the "maybe a half dozen". Make sure you at least follow all of the mechanical stipulations mentioned at CZ:How_to_convert_Wikipedia_articles_to_Citizendium_articles. Still, to the matter of improvability, see Get ready to rethink how to write encyclopedia articles. You might consider importing one at a time and rethinking it as that page describes, and see what kind of improvements you then feel free to make. You might be pleasantly surprised at what you can do given how CZ differs from WP. --- Regards, Stephen Ewen 03:26, 31 March 2007 (CDT)

Keep them

I'm new to this project, but I have contributed to Wikipedia before. I really like the idea of this project. However, I was astonished to see that interwiki links have the rule: "Delete all of these". I think they are one of the strongest characteristics of Wikipedia, for the following reasons:

  • They enable a quick glance of terminology you don't understand in an article, in order to understand what you are reading. Readers have various knowledge about the subject, and having interwiki links enables even the dumbest people to read an article. For really complicated subjects, that is also an advantage in Citizendium.
  • They enable associating around in the encyclopaedia: you read an article until you find an interesting link. Then you keep reading that article until you find the next interesting link, and so on. Connections between articles make the encyclopaedia similar to a brain, with its connections between neurons. Thus, associating around feels like a more natural way to establish brain connections. In other words, it feels like a more natural way to learn.
  • They give an overview of what more the encyclopaedia offers, if the reader wants to know more about related topics.

In short, I say we keep them, at least those that point to subjects of any significance.

I'm aware, however, that keeping links requires a lot of effort. As stated, Citizendium articles will have different titles from Wikipedia articles. Nevertheless, finding the corresponding Citizendium article is worth the effort, because it gives a good overview of the subject. And if there isn't any such Citizendium article yet, then I don't see how red links do any harm. Mikael Häggström 11:44, 25 July 2007 (CDT)

The reason that we decided (months ago) to delete these is quite simple: they will be completely useless for a very long time, and we cannot predict what the names of articles in other languages will be, either. The point is simply that keeping them is not worth the effort, given the harm of the misleading red links. --Larry Sanger 11:53, 25 July 2007 (CDT)

Also, I wonder if perhaps it is unclear that by "interwiki links" we mean links to articles in other languages, not internal links to other articles in English. The latter we certainly want to keep. --Larry Sanger 11:55, 25 July 2007 (CDT)

Oh, then I'm sorry I've misunderstood - I thought the latter was meant too. Perhaps that difference should be noted, in case somebody else misunderstands, and deletes more links than is necessary? Mikael Häggström 12:00, 25 July 2007 (CDT)
Good catch, Mikael. You want to do the honors? --Matt Innis (Talk) 12:02, 25 July 2007 (CDT)

Wikipedia credit checkbox

Of course when we import an article from WP, we check the box with the submission. Then we start the major improvements that we (for certain values of we) never made on WP because it looked to be not worth the trouble in that environment.

And we insert the Wikipedia template at the end, giving proper credit in a standard, clean form. Or do we? I see that in Tycho Brahe it was inserted and then reverted on grounds that it didn't seem to work; but it looks good enough to me, perhaps because I don't know what it ought to do.

At that point, as we make further massive changes, is everyone expected to check the Wikipedia box when submitting? Or is that to be done only when importing more from WP? My instincts say the latter; is the box checked by default just as a matter of excess of caution, or is there some other reason?

(Oh, and is this the place to pester with newbie questions about the details, or is there a better one, like Forums?) Daniel Drake 11:16, 1 April 2007 (CDT)

If you check the box once it should stay checked until unchecked by someone. If the final article retains one sentence from the WP article, the box needs to stay checked. Stephen Ewen 12:23, 1 April 2007 (CDT)
Actually, (IP Lawyer hat on) I'd say it would have to stay checked unless all semblance of the Wikipedia article has been erased, including cadence and structure. Unless such a complete change is made, at least some part of the new article may still be considered a derivative work of the old, which would still require attribution under Wikipedia's GFDL. (IP Lawyer hat off) Cheers! Brian Dean Abramson 19:50, 1 April 2007 (CDT)
Right, the permanent need for attribution is clear enough. What's not clear to a newbie is exactly what constitutes attribution for the purpose. For instance, in borrowing some GFDL text for my own use, I think (not having read the GFDL in a couple of years) that I'd be content, and feel safe, to put a highly visible notice in the product -- including the requisite link to the original. My notice would probably look much like what's produced by the existing Wikipedia template. My question was essentially, "Is that notice adequate for CZ's purposes?" and the answer appears to be "No; the checkbox is also required on every submission forever." I'm mildly interested in why not: what extra purpose the checkbox serves. I'm mildly concerned that CZ might find itself in legal jeopardy because some author, adding something of his own to an article, carelessly or ignorantly or accidentally or maliciously happened to turn off the checkbox when submitting. ...said Daniel Drake (talk) (Please sign your talk page posts by simply adding four tildes, ~~~~.)
The WP template was redundant and has actually been deleted. The checkbox suffices in so far as notice is concerned, we think, at this point. As for legal jeopardy, I think your core question is good. I think any jeopardy is very much contingent upon CZ's response should a CZ user fail to check it when it should have been, after clear and multiple and repeated notices, including each time the edit box is used where no case could be made that it was missed. In that case, I think CZ itself becomes a very poor and very unlikely target. But innocent mistakes may happen. Stephen Ewen 03:49, 3 April 2007 (CDT)
Let's not worry about legal jeopardy. There has never been a successful lawsuit attempted in this line of work. Moral issues should bother us more. On the specific issue, a little box somewhere saying parts of some articles are derived from Wiki should solve the legal and moral problems. Richard Jensen 01:54, 19 April 2007 (CDT)
If Brian is still wearing his lawyer's hat, can he please give some legal citations for his opinion? Richard Jensen 03:41, 17 April 2007 (CDT)
Umm, speaking of innocent mistakes, I'm assuming that the deletion of a bunch of text at the (former) end of the article was inadvertent. Not really important text, but restored here for faithfulness of archive. Remembering my signature this time, Daniel Drake 01:47, 19 April 2007 (CDT)
Oh, the hat never really comes off. I'll go with one citation - 17 U.S.C. sec. 106 (2), granting the owner of copyright the exclusive right "to prepare derivative works based on the copyrighted work". The degree to which works are derivative of their predecessors is something of a "know it when you see it" judgment, but the point to keep in mind is that even in a 50 kb article, a single paragraph clearly derived from a single paragraph of the Wikipedia work requires attribution under the GFDL. Cheers! Brian Dean Abramson 02:12, 19 April 2007 (CDT)
CZ is employing both GFDL and Fair use provisions. Fair use allows us to make derivative works -- without fair use CZ would be quite impossible. We are not limited to the GFDL, of course -- CZ never signed any agreement to that effect. Richard Jensen 22:17, 19 April 2007 (CDT)
True, so long as the four-factor test for fair use is properly applied - which should cut strongly in our favor, as 1) we're a non-profit educational endeavor; 2) Wikipedia is also a non-profit (so they will not lose any appreciable revenue due to our copying); and 3) the work is heavy on fact and light on creativity. This leaves factor 4, the amount of work used. Based on what I've read above, it seems we're going to lean away from copying wholesale and as far as possible in the direction of generating our own unique content, and editing borrowed content to different standards, so our Wikipedia pickings should be light enough to call them fair use even outside the GFDL. Cheers again! Brian Dean Abramson 02:44, 20 April 2007 (CDT)
Invoking fair use still does not negate the need for attribution. Stephen Ewen 03:46, 20 April 2007 (CDT)
But it does negate the need for permission. Of course, with everything from Wikipedia being under the GFDL, that's not an issue really. Brian Dean Abramson 05:24, 20 April 2007 (CDT)
Need for attribution: there is no need under fair use rules to attribute facts or well known statements, nor small snippets. Wiki for example usually does not attribute its sources, which are usually a mystery. Richard Jensen 23:45, 20 April 2007 (CDT)
Facts are in the public domain, but prose describing facts is subject to copyright, although that protection is narrow - what constitutes a derivative work of a description of facts will naturally be more narrowly construed than what constitutes a derivative work of a purely creative product. To the extent that one work may be deemed derivative of another, providing a proper attribution is not required to support a fair use claim, but it tends to disprove intent to pass off the derived work as wholly original. And yes, Wikipedia often fails to cite sources - not a good practice, and one they're working on. Cheers! Brian Dean Abramson 00:00, 21 April 2007 (CDT)

WP author template

The text regarding WPauthor template is somewhat confusing. We learn WPauthor applies whenever

  1. the (version of the) article imported from WP was created on WP solely by the importing author.
  2. the (version of the) article imported from WP was created mainly by the importing author.

This is somewhat contrary to the original purpose of introducing the template. I suggest creating different templates for the different situations.

  • Ad.1. In case of full authorship, the WPauthor template applies. It should be considered as "obligatory" ("do not forget"). It assures that no WP credit is needed. Further, it allows some easy/semi-automated test whether we correctly tag WP-imported articles.
  • Ad.2. ("10-99% authorship") we put a "WPimported" template (to be created) that would declare the "main author" on WP who wants to maintain the article on CZ. In this case WP credit applies (until the text is entirely reworked, i.e. there are no more edits by other Wikipedians, say). The message just identifies the interested author and personal motives for import ("I want to maintain it here"). Actually, this is not that "obligatory" but "helpful" -- we have policies concerning external articles. It might be useful in case the article is considered for deletion (if still external).

I (hope it is not too bold of me to) modify the article accordingly. Aleksander Stos 12:51, 13 August 2007 (CDT)

More on that pesky WP checkbox

In this section

Are you the main author of that Wikipedia article?

we read this: "Second, do make sure that the "Content is from Wikipedia?" box is checked if you or anyone else added any other Wikipedia content to the version you've uploaded—even one comma necessitates checking this box."

Really? Are we sure? Grammatical/typographical/common knowledge/common phraseology impossible to paraphrase means an author has to tick the "from WP" box? Surely not?

Aleta Curry 16:21, 13 February 2008 (CST)

Most definitely not. It's de minimis. Stephen Ewen 20:29, 13 February 2008 (CST)
Stephen is definitely right here. The WP content in question has to be copyrightable. You cannot copyright the spelling of a word, a fact, or a punctuation convention, for example. James F. Perry 20:39, 13 February 2008 (CST)

You may feel free to change this, but I would not be so certain as Steve is here. The point is that unless we set a clear standard, we will find ex-Wikipedians saying, "I was the main author. Other people added a lot here and there, but I'm the main author--they don't really matter." Well, that's a matter of opinion. In fact, we already have seen ex-Wikipedians say that about their CZ contributions. If we simply say, "Any smidge of contribution from another person means we must give credit," then the matter is perfectly clear and not debatable. If we leave room for "wiggle room," people will take it, even when they shouldn't. I think it's better to err on the side of giving credit where it's due.

In any case, the wording in terms of "commas" could be changed. --Larry Sanger 20:45, 13 February 2008 (CST)

De minimis. That's what I was going for. Couldn't think of the phrase to save my life.
I understand the point you're trying to make, Larry, but surely the addition of "excepting grammatical and typographical errors, and terms and phrases which are clearly de minimis" (or some such) would cover it to all reasonable intents and purposes for all reasonable people?
Aleta Curry 21:50, 13 February 2008 (CST)
Go to town, Aleta! Just try to use a wording that will cover unreasonable people too.  :-) --Larry Sanger 12:20, 14 February 2008 (CST)
I just re-read this policy. And I was alarmed by the "even a comma" phrase. I thought I had been sufficiently careful. But I haven't complied with even a comma.
I am relieved that this passage of the policy is going to be rewritten. Can I suggest an alternate wording? Instead of saying:
"Second, do make sure that the "Content is from Wikipedia?" box is checked if you or anyone else added any other Wikipedia content to the version you've uploaded—even one comma necessitates checking this box."
perhaps we could have something like:
"Second, do make sure that the "Content is from Wikipedia?" box is checked if anyone other than yourself added any intellectual content from the Wikipedia version to the Citizendium version you're working on."
My understanding is that spelling corrections don't count as intellectual content.
My understanding is that correction or refinement of wikilinks doesn't count as intellectual content.
My understanding is that if someone rewords a single sentence from a version we were the sole author of, that rewording would count as "intellectual content". My reasoning -- we avoid violating others' copyright by paraphrasing what they wrote. In order to paraphrase what they wrote we have to first understand it. Our paraphrase becomes new intellectual content. This protects our paraphrases. But it bites us here. If 90% of the sentences in a version on Wikipedia are ours, and someone else reworded 10% of the sentences, we would have to restore our wording of those 10% of the sentences.
About commas -- there is a book with an interesting title, "Eats, Shoots & Leaves". The title concerns how the addition of commas can completely alter the meaning of a sentence -- that the two following sentences have completely different meanings:
  • "Eats, shoots, and leaves."
  • "Eats shoots, and leaves."
It is interesting, but I suspect that instances like this are going to be so rare they should be ignored.
I think Feist Publications v. Rural Telephone Service is worth reviewing in this discussion.
Cheers! George Swan 09:53, 28 April 2008 (CDT)