This is an old revision of this page, as edited by Hesperian (talk | contribs) at 01:43, 23 June 2009 (→Anybot's algae articles: reply KP). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
Anybot's algae articles
AfDs for this article:- Anybot's algae articles - see User:Anybot/AfD for a list
Anybot created 4092 algae articles by scraping information out of the AlgaeBase database, and formatting it into articles. In doing so, it introduced numerous serious errors into more-or-less every article. Common errors include:
- basic taxonomic errors, such as calling a cyanobacterium an alga (a bit like calling a plant an animal, only much much wronger);
- writing articles about extinct taxa as though they were extant;
- descriptions that don't distinguish between the many different phases of the algal life-cycle, falsely implying a single generation or alternation of generations;
- misuse of descriptive terminology; for example, 69.226.103.13, who appears to have expertise in this area, refers to
"Phormidium, a cyanobacterium is described as having a crustose thallus. The term filamentous used in articles about cyanobacteria should be carefully distinguished as a bacterial colony's sheath. However, since our cyanobacteria articles make them eukaryotes the reader may not understand this is a bacterial colony not a multi-cellular organism with undifferentiated tissue (a thallus)."
- incorrect species counts, partly caused by the false assumption that the number of species names recorded in AlgaeBase equals the actual number of accepted species, but partly inexplicable;
- incorrect and contradictory taxonomies, partly due to AlgaeBase itself being outdated, but partly inexplicable;
- creation of articles on names listed in the database as synonyms for other taxa;
The task of checking and fixing these articles has mainly fallen to 69.226.103.13, who has stated "Every one I investigated contained serious misinformation, except for those that had been later edited by other writers... There are so many errors and so many different types of errors that it is impossible to address each one other than by individually editing each article. I don't write science articles without checking sources. It would take me hours to verify each one." As with all of our articles, these articles are coming up at the top of Google searches; we are misinforming people who don't know better, and putting our incompetence on display to those who do. Our reputation is on the line here, folks.
There seems to be consensus at Wikipedia talk:WikiProject Plants#Algae articles AnyBot writing nonsense that these articles are unsalvageable unless we can find a phycologist willing to donate tens of thousands of hours of time; and even then it would be quicker to delete the articles and start again from scratch. The coder of the bot is presently very busy, and his/her response to this has been lukewarm at best. Specific errors that were pointed out a few months ago still have not been fixed.
The full list of articles nominated for deletion is at User:Anybot/AfD. If you are willing and able to rescue any, by all means do so, and then remove them from the list. Those that remain on the list at the end of this AfD should be deleted. I will personally commit to restoring any articles that should not have been deleted because they had already been corrected or verified as correct. Hesperian 00:28, 22 June 2009 (UTC)
- Delete all. (nominator) Hesperian 00:28, 22 June 2009 (UTC)
- Note: This debate has been included in the list of Organisms-related deletion discussions. —TexasAndroid (talk) 03:35, 22 June 2009 (UTC)
- Note: This debate has been included in the list of Science-related deletion discussions. —TexasAndroid (talk) 03:35, 22 June 2009 (UTC)
Question I see it mentioned at Wikipedia talk:WikiProject Plants#Algae articles AnyBot writing nonsense that some articles were pre-existing, and were then over-written by the bot. Are all of those articles included on this list too? -RunningOnBrains 06:21, 22 June 2009 (UTC)
- No; these are articles that the Anybot created (i.e. made the very first edit). Hesperian 06:24, 22 June 2009 (UTC)
- What about redirects? These are not listed on the page of anybot's articles. Some of the redirects are not to the correct article, because anybot did not distinguish synonyms-sorry, can't find an example. --69.226.103.13 (talk) 16:53, 22 June 2009 (UTC)
- You want some examples? I noticed that there are quite a few inappropriate redirects to Ulva. See here and look towards the bottom of the page. --Kurt Shaped Box (talk) 20:47, 22 June 2009 (UTC)
- Also, what links here for Palmaria (a disambiguation page). --Kurt Shaped Box (talk) 20:55, 22 June 2009 (UTC)
- No, I'm not asking for examples. Anybot's redirects, and there may be thousands, are not on the list Hesperian created. What will be done with these, will they be deleted also?
- For example, and this may be one of the single worst articles anybot created, Leptophyllus was a redirect anybot created to Abedinium. The taxonomy box lists Abedinium as belonging to a diatom order (Brachysiraceae, an easy to recognize diatom order) in an obvious and familiar red macroalgae class (Gigartinales). However, in spite of our highly unusual taxonomy in the wikipedia article, Abedinium is a dinoflagellate in the order Noctilucales. I'm concerned that leaving the redirects will keep pages like this in search engine caches.
- PS I deleted the taxobox so you have to look in the history to see it. Also, this is why we cannot just keep articles that have been edited by humans, each one has to be checked. Like the IP edited articles these were edited by two competent human editors but not for the most egregious errors, only for wikipedia style matters. --69.226.103.13 (talk) 21:58, 22 June 2009 (UTC)
- Let's let this AfD run its course. When we're done here, I'll produce a list of redirects and we can go around again. Hesperian 23:42, 22 June 2009 (UTC)
- Delete all. Any mass creation of material that puts Wikipedia into disrepute as an inaccurate source of information should be reverted/deleted/removed SatuSuro 06:34, 22 June 2009 (UTC)
- Delete all. We need to get these off of Google and the WP mirrors ASAP. As 69.226. states here, some of these are coming up as the top/sole Google hit. During the course of the discussion of this someone (I can't seem to find the exact comment now) stated that they were a teacher of some denomination and had discovered this issue after one (several?) of his/her students had handed in an assignment containing a howling WP-sourced factual error. This shouldn't be happening. I'd also suggest that in future, anybody considering running a bot creating or editing articles in highly-specialized and arcane fields such as this one should endeavour to get an expert onboard to consult with, before unleashing the bot full-throttle. As I understand it, the bot operator in this case is somewhat knowledgeable in the area but missed blatant errors early on that if spotted, would've avoided this entire situation. --Kurt Shaped Box (talk) 10:57, 22 June 2009 (UTC)
- Question for nominator - have you removed from the list any bot-generated article since edited by User:213.214.136.54? As far as I know, these are now correct. It might also be an idea to get someone with a bot to AfD tag all the affected articles. There may be someone with one or more of these on their watchlist who could help with fixes, if made aware of the problem. --Kurt Shaped Box (talk) 11:05, 22 June 2009 (UTC)
- No, I haven't. I'll ask 213.214.136.54 if s/he is willing to vouch for the ones s/he has edited. Hesperian 11:16, 22 June 2009 (UTC)
- The IP only edited higher level taxonomies in the boxes. If you can generate a list of their edits I can edit the articles, maybe other writers could help. With the Chromalveolates I may have to stubify. I did edit one article, but undid my edit, because it would be a lot of work to edit these articles to a vouchable point, a couple of hours per article at least. --69.226.103.13 (talk) 17:21, 22 June 2009 (UTC)
- Okay, here's a list of 213.214.136.54's edits, as generated by ContributionSurveyor - if that's of any use to you... --Kurt Shaped Box (talk) 21:45, 22 June 2009 (UTC)
- Okay, this list is useful. It shows some underlying problems with wikipedia algae articles that need fixed first. This list could be used to stubify its articles with a bot under some guidelines: pick diatoms off by division/phylum (or both in some unfortunate cases, or class also). Have plant and protist editors pick the current higher level taxonomy for the Chromalveolata, and for the rest of these organisms (single-celled photosynthetic algae and their closely related non-photosynthetic taxon-mates), then run a bot (prefer an existing bot than anybot) to pull the class from the taxobox and rewrite the single sentence to "Thisgenus is a diatom." Leave that sentence, the taxonomy box, the link to algae base, and, to make it easier for other editors, categorize by family, order, class in that order of preference, as a stub. Check that a taxonomy box does not contain both a division and a phylum. A problem that was not evident earlier is that older higher level taxonomies from 2003/04 are different from later taxonomies. It seems a phycologist comes in every two years and uses their own taxonomy. One has to be picked for an encyclopedia. --69.226.103.13 (talk) 22:34, 22 June 2009 (UTC)
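The stubifying pass proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of Anybot or of any existing bot: it assumes taxoboxes use one `| key = value` line per field with the Latin parameter names `divisio`, `phylum` and `classis`, and the `CLASS_TO_GROUP` table and both function names are hypothetical.

```python
# Hypothetical lookup from the taxobox class to a plain-language group name.
# Editors would have to agree on the real table first.
CLASS_TO_GROUP = {"Bacillariophyceae": "diatom"}


def parse_taxobox(wikitext):
    """Collect '| key = value' lines into a dict (simplified parsing)."""
    fields = {}
    for line in wikitext.splitlines():
        line = line.strip()
        if line.startswith("|") and "=" in line:
            key, _, value = line[1:].partition("=")
            # Drop surrounding wiki-link brackets and italic quote marks.
            fields[key.strip().lower()] = value.strip().strip("[]'")
    return fields


def stub_sentence(genus, fields):
    """Return the one-sentence stub, or None if a human must look first."""
    # A taxobox should not carry both a division and a phylum.
    if fields.get("divisio") and fields.get("phylum"):
        return None
    group = CLASS_TO_GROUP.get(fields.get("classis", ""))
    if group is None:
        return None  # class missing or not in the agreed table
    return f"''{genus}'' is a {group}."
```

Returning `None` rather than guessing keeps any article with a contradictory or unrecognised taxonomy out of the automated pass, so it is left for a human editor.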
- Another question (sorry - last one!) - what should be done with the pages currently stored in Anybot's userspace? --Kurt Shaped Box (talk) 11:14, 22 June 2009 (UTC)
- Google isn't scraping them, so they are low priority. I would be inclined to let the bot owner do whatever s/he wants with them. Hesperian 11:16, 22 June 2009 (UTC)
- Delete all. We have a duty to remove misinformation when it comes to our attention. In a case of this scale, deletion is required to achieve this in the most thorough and timely manner available. Melburnian (talk) 12:25, 22 June 2009 (UTC)
- Delete, and de-approve Anybot. Clearly, this is a mess. Stifle (talk) 13:16, 22 June 2009 (UTC)
- Delete all, for the reasons given above.--Curtis Clark (talk) 13:22, 22 June 2009 (UTC)
- Delete all that contain errors, improve bot code in consultation with 'experts', and run bot again. I had been unaware of the lengthy discussion at WikiProject Algae; first I should note that the original version of the bot contained some errors, which a later version of the bot corrected as soon as they were pointed out. The original version seems to have been run since April, replicating some of the errors, which has inflamed the discussion.
Now, in my opinion, articles that contain small errors (e.g. the wrong tense) but cite a reliable source are better than no article at all - and if all such pages were deleted from WP the encyclopaedia would probably shrink by a factor of two. As evidenced by the work of some dedicated IP editors, the existence of a skeleton article is often the seed from which a useful and correct article is developed. And as all of the articles use information attributed to a reliable source, it is possible for people to check the data against the facts (no-body should ever use WP as a reliable source in itself). Again, this makes the articles more useful than many other unsourced articles on WP.
However, I am embarrassed that wide-spread errors do exist. Systematic errors - such as the use of 'alga' instead of 'cyanobacterium' - are very easy to fix automatically. If I had a list of the errors that have been spotted, so that I could easily understand what is said that is wrong, and what should be said, I could re-code the bot until it got everything right, and then put it up for retesting (hopefully it is now notorious enough that people will be willing to check its output). At that point it would be possible to run the bot again and create error-free articles. In the meantime, perhaps it is a good idea to delete articles which contain factual errors. (I will never support the deletion of any article which details a notable subject, and contains factually correct information attributed to a reliable source.)
I think that the worst case scenario would be to delete articles willy-nilly and thereby deplete WP. We have the potential to use the Algaebase material to generate useful information - if it's not entirely up to date, then neither are most text books; and if the classification needs systematically updating, the bot can do that as taxonomy is updated. If this is done regularly, WP can keep up to date and become as useful a resource as Algaebase is today. Let's be careful to produce the best quality output we can before the deadline. Martin (Smith609 – Talk) 13:39, 22 June 2009 (UTC)
- I responded to this post on this page's discussion page, in length, repeating much I said on the WP:Plants page, the bot owner's page, and the bot's error reporting page. --69.226.103.13 (talk) 16:49, 22 June 2009 (UTC)
- Delete all and scrap the bot. per nom. Niteshift36 (talk) 13:41, 22 June 2009 (UTC)
- Keep the ones where the bot is not the only editor. Revert to the last non-bot edit for those created by humans, and delete or stubify the rest. This "kill-em-all" approach is not appropriate for an academic community. People have spent hundreds of hours creating some of these articles, improving others, and fixing the mistakes of the bot. We can't just get rid of all this good, solid content just because it's simpler to delete everything rather than be a bit more selective. Owen× ☎ 13:42, 22 June 2009 (UTC)
- Those articles are not being nominated, see above. cygnis insignis 13:55, 22 June 2009 (UTC)
- You are assuming that any subsequent edit is a fix. The truth is, people (and other bots) edit articles for all sorts of maintenance and cosmetic reasons. http://en.wikipedia.org/search/?title=Zygosphaera&action=history ? Hesperian 23:51, 22 June 2009 (UTC)
- Comment I noticed this bot created subpages, linked from the posts talk pages, when it found existing articles. The pages that aren't useful, such as the one announced on Talk:Amphibolis (a plant in this example), are potentially distracting and should be unlinked. I also can't see a reason for maintaining erroneous information in user space, the bot could restore improved versions as easily as it created them, the community should agree to their deletion too. Anyhow ... delete all those nominated above, for the reasons given above. cygnis insignis 13:55, 22 June 2009 (UTC)
- If possible, delete only articles edited by the bot alone. If this task can't be automated, I am willing to offer my admin services at the conclusion of this AfD. Otherwise, delete all un-vouched-for on the list. Also, make sure that the bot's over-writing of articles is reverted. -RunningOnBrains 14:37, 22 June 2009 (UTC)
- I am willing to edit articles edited by other writers if a list can be made. I may not be able to edit the Chromalveolates and there were some protozoa that I probably cannot touch. --69.226.103.13 (talk) 16:49, 22 June 2009 (UTC)
- @Runningonbrains: It isn't all that difficult to generate a list of articles edited by Anybot alone. But the problem with this proposal is that a great many edits get made to articles for purely cosmetic purposes. Therefore one cannot assume that an article has been fixed and/or verified merely because someone else has edited it. See, for example, http://en.wikipedia.org/search/?title=Zygosphaera&action=history. You would keep this article? Hesperian 23:47, 22 June 2009 (UTC)
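To make the distinction concrete, here is a minimal sketch of how such a list might be produced, assuming the revision histories have already been fetched into a dict. The data shape and the function name `split_by_editors` are hypothetical. Note that the second list means only "someone else has touched this", not "this has been fixed".

```python
def split_by_editors(histories, bot="Anybot"):
    """Split titles into (bot_only, needs_review) based on who edited them.

    histories maps an article title to its list of (user, edit_summary)
    revisions, oldest first. This data shape is an assumption.
    """
    bot_only, needs_review = [], []
    for title, revisions in sorted(histories.items()):
        editors = {user for user, _summary in revisions}
        if editors == {bot}:
            bot_only.append(title)
        else:
            # A non-bot edit is not proof of a fix; it may be purely cosmetic,
            # so these titles still need a human check before being kept.
            needs_review.append(title)
    return bot_only, needs_review
```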
- Gasp! BAG approved the creation of crap? Delete all. I will edit the Chromalveolates that are salvageable--the IP list. If it's decided to delete them can they be posted to my user space in some way so I do not have to retype the taxoboxes? It's a shame to have an IP do a lot of work correcting wikiGarbage, then have the corrections deleted. I assume the list will be the photosynthetic Heterokonts and dinoflagellates, and I have no problem with editing these articles. No panic, Curtis, I'll use Lee, not Cavalier-Smith. I'll do it over the summer and start in a couple of weeks. I've been ill and had a family emergency that is slowly resolving. Also, the listed problems with the bot were discovered in its trial phase by an editor, who alerted me, and I ignored him/her based on an extraneous issue, then never got back around to looking at these articles. However, BAG told me to shut up, and I have been rather busy. My bad, but, bots do not need to be creating this many articles without specific approval and monitoring throughout. This is what comes of self-elected closed user groups: they decided to create these articles. --KP Botany (talk) 01:33, 23 June 2009 (UTC)
- Re: "It's a shame to have an IP do a lot of work correcting wikiGarbage, then have the corrections deleted." I agree. As soon as the IP editor tells us that they consider the articles they have edited to be fixed, rather than merely fiddled, I'll remove them from the list.
- I suggest you proceed as follows:
- identify the articles you want to work on;
- remove them from User:Anybot/AfD, so that they are not deleted as a result of this discussion.
- if it is not appropriate to leave them where they are whilst you are working on them, move them into your userspace with an edit summary that cites this discussion (it won't take long for someone to detect and delete the cross-namespace redirects that you leave behind.)
- Hesperian 01:43, 23 June 2009 (UTC)