Misplaced Pages

talk:Naming conventions (standard letters with diacritics): Difference between revisions - Misplaced Pages

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license.
Revision as of 19:40, 14 February 2007 by Rarelibra (Better slightly late than never); revision as of 06:39, 15 February 2007 by Pmanderson (Better slightly late than never).


:::I am with ]. I personally think that to use "Stanislaw Ulam" instead of "Stanisław Ulam" (with a redirect of "Stanislaw Ulam" to the proper name) is both a disservice to the reader/researcher and is misleading, at best. When a reader is properly redirected to the name that includes proper diacritics, the user is further educated as to the real name without argument of form, type, usage, or any other argument that "popularity contests" present within wikipedia. Yes, most readers will not know what diacritic to use (if any); that is what a redirect is so useful for. As a geographer, I would rather have reference to the proper name in order to 'get it right the first time' than to wonder what the name really is (there is ongoing debate/clarification with regard to Mongolian names right now). Yes, one could argue that "Stanislaw Ulam" is correct in English - but actually it isn't. The proper translation, if we are proposing such usage, is "Stanley Ulam" - this, of course, takes away from the actual name and is the disservice that is being mentioned. Therefore, I think the best approach is to use such diacritics. ] 19:40, 14 February 2007 (UTC)
::::I chose Ulam as an example, because he himself, his wife, his co-worker ], and his colleagues ''all'' spell the name as ''Stanislaw'' in English, as his autobiography attests. Rarelibra proposes, in such cases, to lie to the reader, because it's somehow politically correct, and will be "educational"; I do not. ] <small>]</small> 06:39, 15 February 2007 (UTC)

Revision as of 06:39, 15 February 2007

So what actually is this saying?

Noticeable by its absence is any indication of what the author actually wants to do with diacritics. Are we intending to encourage or discourage their use? FWIW my opinion is that articles should have titles which are correct, and alternate spellings should be catered for using REDIRECTs labelled with {{R from title without diacritics}}. HTH HAND —Phil | Talk 11:16, 30 January 2006 (UTC)

I concur with your proposal. —Nightstallion (?) 12:34, 30 January 2006 (UTC)
I think that it says that all diacritics used in modern languages should be used, while some diacritics typical only to old, inactive languages that have new form of spelling now should be avoided. It also excludes languages using other than Latin alphabet. The proposed label might be added to the policy.--SylwiaS | talk 19:35, 30 January 2006 (UTC)
I fully agree. For example, "Jaromir Jagr" is an incorrect spelling; when it is used, it is only for technical reasons. Misplaced Pages is supposed to be an encyclopedia, so it should give correct information. - Mike Rosoft 01:35, 8 February 2006 (UTC)
Ditto. Priority in choice of spelling for a proper name (e.g., of a ruler or other person, or of a geographic entity) should be given to that regarded as correct by the people with whom the name originates. That is the practice in serious scholarship, and we should not patronize any parties involved by talking down to them and homogenizing to the least common denominator. logologist|Talk 07:22, 8 February 2006 (UTC)
I know I'm repeating myself here, but -- 100% support from my side. —Nightstallion (?) 11:53, 8 February 2006 (UTC)
Totally false. Jaromir Jagr, the spelling routinely used in thousands upon thousands of newspaper, television, and magazine reports about this person, is most definitely not a misspelling. It is never a misspelling to use the English alphabet when writing in the English language. We have every bit as much right to use our own alphabet as anybody else does. We can certainly choose to include diacritics in his name; to do so because not to do so would be a misspelling is patently false. Gene Nygaard 10:13, 5 February 2007 (UTC)
Gene, I have read that argument from you many times. Wards kan bee speled rong youing tha English alphabet wen riting in tha Inglish langwage two. Bendono 12:29, 5 February 2007 (UTC)

From here after the first version of the guideline formulation had been written down --Francis Schonken 22:23, 8 February 2006 (UTC)


With provision 4 this seems like a reasonable proposal. I see no harm it can do. But perhaps it would be useful if the author (or anybody else interested in this) could show us a few examples where this policy would actually force us to rename an article? I see little point in creating a policy that would do nothing. This has to 1) solve some current disputes 2) make us rename some articles.--Piotr Konieczny aka Prokonsul Piotrus 17:21, 8 February 2006 (UTC)

Note that according to the proposal, anybody who wants to write an article on a subject which should include diacritics, let's say the Łobzów district of Kraków would have to do BOTH of the following.
  1. Find 10 reliable publications about the district written in English which only use the diacritics version when referring to the place.
    Please bear in mind here that less than 1% of wikipedia articles provide 10 references.
  2. Do a google search (or something equivalent) and from that make a convincing argument that 20% of all English webpages referring to Łobzów outside of wikipedia use the diacritics. Here is a test but note that even if I tried to exclude wikipedia, still at least two of the first 10 google pages are copies of wikipedia articles.

But don't take my word for it. Let's ask the author of the proposal (or anybody else) whether any of the following pages would have to be moved if this was accepted:

Stefán Ingi 19:52, 8 February 2006 (UTC)

@Piotr: The only question is if it would make a positive difference for wikipedia: if it would bring down the number & length of discussions about diacritics that would already be something; if it would bring down the number of WP:RM requests (either by additionally defining some types of moves as "non-controversial", or even better, because nobody would even see the need any more to move or to WP:RM certain diacritics-related pagenames) that would be a good point in itself, wouldn't it? Even today a new Village pump topic was started with emotions going high: Misplaced Pages:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article (quote: "Man, I feel like the bottom man in a dogpile.", etc). If that could be tackled more easily by a guideline many people feel easy with, then that's enough for me. Even if not a page needs to be changed.

@Stefan:

  • "ß" and "ð" are presently outside the scope of this guideline, see Misplaced Pages:Naming conventions (standard letters with diacritics)#Scope. The present guideline wouldn't change a thing about Straße des 17. Juni and Davíð Oddsson, neither, for example, about Weißenhof or Weissenhof...
  • "Łobzów" wouldn't be so difficult. Appears Wladyslaw IV Vasa was born there. It appears to have a famous garden. It appears to have a lot of restaurants presently. So I could find at least 10 references in English for it having a lot of restaurants. If you'd reject these (wikipedia is not a tourist guide), even finding 10 non-wikipedia references in English for the "Wladyslaw IV birthplace" + "the famous garden" seems not too difficult. Anyhow, I had a higher Google result on "Łobzów" than on "Lobzow". Further, note that at the Misplaced Pages:WikiProject Geography of Poland naming conventions are being built presently (there's a vote going on at the moment): so in the future there might come a guideline that takes over (unless that guideline says nothing about diacriticals).
  • Antonín Dvořák might have to move back to Antonin Dvorak (at first sight less than 20% of google results, but would need further checking), but then he's without diacritics at the Radio Prague website (the same website spells of course Antonín Dvořák on their pages in the Czech language, so this is not about "technical limitations"); further, he lived a few years in America (and many Americans would of course throw the diacritics overboard when they had to write his name); and at Misplaced Pages his compositions are disambiguated with "(Dvorak)" and not by "(Dvořák)", except his 7th symphony (see Category:compositions by Antonin Dvorak) - so the prevailing spelling is even at Misplaced Pages "Antonin Dvorak" - wouldn't hurt to see all these a bit on the same line would it? Note also that there is a wikipedia:WikiProject Composers - if they decide to make a naming conventions guideline the naming could be settled on the Czech name too - I wouldn't have a problem with that! Even Misplaced Pages:Naming policy (Czech) could be policy within a week or so...
  • Alliance française has far above 20% of google hits, and as it is an organisation that publishes a lot (and about which a lot is published) 10 reference works wouldn't be a problem, would it? So this would stay where it is.
  • Well, I did three examples now, maybe you do some?

--Francis Schonken 22:23, 8 February 2006 (UTC)

Hmm, there seem even more conventions around, limiting the scope of this proposal, than I had realised. So it seems very difficult to foresee which moves this proposal will mandate. That certainly makes me worry. Another problem I have, aside from the instruction creep of asking people to dig up 10 references and combing through google results before they can write an article, is that I just don't agree that we need to have a fallback convention severely limiting the use of diacritics. In fact I don't think the use of them needs to be limited at all in articles on Irish, Norwegian, Polish, Croatian or German subjects and others. And it seems to me that this is a feeling shared by most editors who are actively working on these articles and for the most part we should just let them get on with it. So finally, rather than give examples, I would simply suggest that we drop this proposal. Stefán Ingi 23:01, 8 February 2006 (UTC)
Concur heartily. I propose one overarching convention: that the presumption will favor use of authentic spellings, with diacritics, in all cases. logologist|Talk 23:18, 8 February 2006 (UTC)
I strongly oppose that. It is exactly the same type of arguments used by people to argue that an 80-year long American citizen, who never used diacritics in his own name at least as an adult from anything found after extensive arguing, an American educated college physics professor who devised a chess rating scale known throughout the world as the Elo rating system (though the words for "rating system" differ of course, but always "Elo" without any diacritics, in pretty much any language, at least in anything published before the name was butchered and corrupted by Misplaced Pages, though sometimes capitalized as ELO by people who mistake it for an acronym), should be at Árpád Élő. Furthermore, while the ship's manifest when he left to come to America as a child did include diacritics in his name, it was different diacritics, in addition to two l's instead of one in his surname, in that manifest. Nobody has ever provided any contemporary to that usage evidence whatsoever as to the spelling of his name before he came to America, other than that ship's manifest information which I provided. Gene Nygaard 10:44, 5 February 2007 (UTC)
Furthermore, that 10 publications bit is ludicrous when it comes to something that has thousands upon thousands of publications in English which use only the form without diacritics. Plus, that very same corruption due to Misplaced Pages's past misusage can play a role in this usage in other "reliable sources" (which in Wikijargon usage is a property of the publication and not of the accuracy of the contents of that publication) as well. Gene Nygaard 10:44, 5 February 2007 (UTC)

Using diacritics (or national alphabet) in the name of the article

The discussion below has been copied from Misplaced Pages:Village pump (policy)#Using diacritics (or national alphabet) in the name of the article - 07:41, 2 March 2006 (UTC)

I came to the problem with national alphabet letters in article name. They are commonly used but I have found no mention about them in naming conventions (WP:NAME). The only convention related is to use English name, but it probably does not apply to the names of people. National alphabet is widely used in wikipedia. Examples are Luís de Camões, Auguste and Louis Lumière or Karel Čapek. There are redirects from English spelling (Camoes, Lumiere, Capek).

On the other hand, wikiproject ice hockey WP:HOCKEY states rule for ice hockey players that their names should be written in English spelling. Currently some articles are being moved from Czech spelling to the English spelling (for example Patrik Eliáš to Patrick Elias). I object to this as I do not see general consensus and it will only lead to moving back and forth. WP:HOCKEY is not wikipedia policy nor guideline. In addition I do not see any reason why ice hockey players should be treated differently than other people.

There is a mention about using the most recognized name in the naming conventions policy. But this does not help in the case of many ice hockey players. It is very likely that for American and Canadian NHL fans the most recognised versions are Jagr, Hasek or Patrick Elias. But these people also played for the Czech Republic in the Olympics and there they are known like Jágr, Hašek or Patrik Eliáš.

I would like to find out what is the current consensus about this. -- Jan Smolik 18:53, 7 February 2006 (UTC)

The only convention related is to use English name, but it probably does not apply to the names of people - incorrect. "Use the most common name of a person or thing that does not conflict with the names of other people or things" - Misplaced Pages:Naming conventions (common names). Raul654 18:54, 7 February 2006 (UTC)
I mentioned this in the third article but it does not solve the problem. Americans are familiar with different spelling than Czechs. --Jan Smolik 19:11, 7 February 2006 (UTC)
Well, since this is the English Misplaced Pages, really we should use the name most familiar to English speakers. The policy doesn't say this explicitly, but I believe this is how it's usually interpreted. This is the form that English speakers will recognize most easily. Deco 19:02, 7 February 2006 (UTC)
Well it is wikipedia in English but it is read and edited by people from the whole world. --Jan Smolik 19:11, 7 February 2006 (UTC)

There was a straw poll about this with regard to place names: Misplaced Pages talk:Naming conventions (use English)/Archive 3#Proposal and straw poll regarding place names with diacritical marks. The proposal was that "whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used". It was close, but those who supported the proposal had more votes. Since, articles like Yaoundé have remained in place with no uproar. I would support a similar convention with regard to personal names. — BrianSmithson 19:17, 7 February 2006 (UTC)

I'm the user who initiated the WP:HOCKEY-based renaming with Alf. The project Player Pages Format Talk page has the discussion we had along with my reasoning, pasted below:

OK, team, it's simple. This is en-wiki. We don't have non-English characters on our keyboards, and people likely to come to en-wiki are mostly going to have ISO-EN keyboards, whether they're US, UK, or Aussie (to name a few) it doesn't matter. I set up a page at User:RasputinAXP/DMRwT for double move redirects with twist and started in on the Czech players that need to be reanglicized.

Myself and others interpret the policy just the same as Deco and BrianSmithson do: the familiar form in English is Jaromir Jagr, not Jaromír Jágr; we can't even type that. Attempting to avoid redirects is pretty tough as well. Is there a better way to build consensus regarding this? RasputinAXP talk contribs 19:36, 7 February 2006 (UTC)

I think you misread my statement above. My stance is that if the native spelling of the name varies from the English spelling only in the use of diacritics, use the native spelling. Thus, the article title should be Yaoundé and not Yaounde. Likewise, use Jōchō, not Jocho. Redirection makes any arguments about accessibility moot, and not using the diacritics makes us look lazy or ignorant. — BrianSmithson 16:34, 8 February 2006 (UTC)
Tentative overview (no cut-and-paste solutions, however):
  • Article names for names of people: wikipedia:naming conventions (people) - there's nothing specific about diacritics there (just mentioning this guideline because it is a naming conventions guideline, while there are no "hockey" naming conventions mentioned at wikipedia:naming conventions).
  • wikipedia:naming conventions (names and titles) is about royal & noble people: this is guideline, and *explicitly* mentions that wikipedia:naming conventions (common names) does NOT apply for these kind of people. But makes no difference: doesn't mention anything about diacritics.
  • Misplaced Pages talk:naming conventions (Polish rulers): here we're trying to solve the issue for Polish monarchs (some of which have diacritics in their Polish name): but don't expect to find answers there yet, talks are still going on. Anyway we need to come to a conclusion there too, hopefully soon (but not rushing).
  • Misplaced Pages:Naming conventions (standard letters with diacritics), early stages of a guideline proposal, I started this on a "blue monday" about a week ago. No guideline yet: the page contains merely a "scope" definition, and a tentative "rationale" section. What the basic principles of the guideline proposal will become I don't know yet (sort of waiting till after the "Polish rulers" issue gets sorted out I suppose...). But if any of you feel like being able to contribute, ultimately it will answer Jan Smolik's question (but I'd definitely advise not to hold your breath on it yet).
  • Other:
    • Some people articles with and without diacritics are mentioned at wikipedia talk:naming conventions (use English)#Diacritics, South Slavic languages - some of these after undergoing a WP:RM, but note that isolated examples are *not* the same as a guideline... (if I'd know a formulation of a guideline proposal that could be agreeable to the large majority of Wikipedians, I'd have written it down already...)
    • Talking about Lumiere/Lumière: there's a planet with that name: at a certain moment a few months ago it seemed as if the issue was settled to use the name with accent, but I don't know how that ended, see Misplaced Pages:WikiProject Astronomical objects, Andrewa said she was going to take the issue there. Didn't check whether they have a final conclusion yet.
Well, that's all I know about (unless you also want to involve non-standard characters, then there's still the wikipedia:naming conventions (þ) guideline proposal) --Francis Schonken 19:58, 7 February 2006 (UTC)
Note that I do not believe no En article should contain diacritics in its title. There are topics for which most English speakers are used to names containing diacritics, such as El Niño. Then there are topics for which the name without diacritics is widely disseminated throughout the English speaking world, like Celine Dion (most English speakers would be confused or surprised to see the proper "Céline Dion"). (Ironically enough, the articles for these don't support my point very well.) Deco 20:42, 7 February 2006 (UTC)
Sticking diacritics, particularly the Polish Ł is highly annoying, esp. when applied to Polish monarchs. It just gives editors much more work, and unless you're in Poland or know the code, you will be unable to type the name in the article. - Calgacus (ΚΑΛΓΑΚΟΣ) 20:45, 7 February 2006 (UTC)
Redirects make the issue of difficulty in visiting or linking to the article immaterial (I know we like to skip redirects, but as long as you watch out for double redirects you're fine). The limitations of our keyboards are not, by themselves, a good reason to exclude any article title. Deco 20:50, 7 February 2006 (UTC)
Deco, I should rephrase what I said. I agree with you that some English articles do require diacritics, like El Niño. Articles like Jaromir Jagr that are lacking diacritics in their English spellings should remain without diacritics because you're only going to find the name printed in any English-speaking paper without diacritics. RasputinAXP talk contribs 21:20, 7 February 2006 (UTC)
I checked articles about Czech people and in 90 % of cases (rough guess) they are with diacritics in the name of the article. This includes soccer players playing in England (like Vladimír Šmicer, Petr Čech, Milan Baroš). And no one actually complains. So this seems to be a consensus. The only exception are extremely short stubs that did not receive much input. Articles with Czech diacritics are readable in English, you only need a redirect because of problems with typing. This is an international project written in English. It should not fulfill only needs of native English speakers but of all people of the world. --Jan Smolik 22:33, 7 February 2006 (UTC)
Very many names need diacritics to make sense. Petr Cech instead of Petr Čech makes a different impression as a name, does not look half as Czech and is much more likely to be totally mispronounced when you see it. Names with diacritics are also not IMHO such a big problem to use for editors because you can usually go through the redirect in an extra tab and cut and paste the correct title. I also don't see a problem at all in linking through redirects (that's part of what they are there for). Leaving out diacritics only where they are "not particularly useful" would be rather inconsequent. Kusma (討論) 22:48, 7 February 2006 (UTC)
As a matter of fact, "Petr Sykora" and "Jaromir Jagr" are not alternate spellings; they are incorrect ones which are only used for technical reasons. Since all other articles about Czech people use proper Czech diacritics, I don't know of any justification for making an exception in case of hockey players. - Mike Rosoft 01:13, 8 February 2006 (UTC)
Man, I feel like the bottom man in a dogpile. Reviewing Misplaced Pages:Naming conventions (common names), there's "What word would the average user of the Misplaced Pages put into the search engine?" Making the name of the article include diacritics goes against the Use English guideline. The most common input into the search box over here on the left, for en-wiki, is going to be Jaromir Jagr. Yes, we're supposed to avoid redirects. Yes, in Czech it's not correct. In English, it is correct. I guess I'm done with the discussion. There's no consensus in either direction, but it's going to be pushed back to the diacritic version anyhow. Go ahead and switch them back. I'm not dead-set against it, but I was trying to follow guidelines. RasputinAXP talk contribs 15:48, 8 February 2006 (UTC)
There are many names, and even words, in dominant English usage that use diacritics. Whether or not these will ever be typed in a search engine, they're still the proper title. However, if English language media presentations of a topic overwhelmingly omit diacritics, then clearly English speakers would be most familiar with the form without diacritics and it should be used as the title on this Misplaced Pages. This is just common sense, even if it goes against the ad hoc conventions that have arisen. Deco 18:30, 8 February 2006 (UTC)
Czech names: almost all names with diacritics use it also in the title (and all of them have redirect). Adding missing diacritics is automatic behavior of Czech editors when they spot it. So for all practical purposes the policy is set de-facto (for Cz names) and you can't change it. Pavel Vozenilek 03:18, 8 February 2006 (UTC)

See Misplaced Pages:Naming policy (Czech) --Francis Schonken 11:01, 8 February 2006 (UTC)

and: Misplaced Pages:Naming conventions (hockey) --Francis Schonken 17:41, 8 February 2006 (UTC)

There are those among us trying to pull the ignorant North American card. I mentioned the following over at Misplaced Pages talk:WikiProject Ice Hockey/Player pages format...
Here's the Czech hockey team in English compliments of the Torino Italy Olympic Committee. Here they are in Italian: , French: . Here are the rosters from the IIHF (INTERNATIONAL Ice Hockey Federation) based in Switzerland: .
Those examples are straight from 2 international organizations (one based in Italy, one in Switzerland). I'm hard pressed to find any English publication that uses diacritics in hockey player names. I don't see why en.wiki should be setting a precedent otherwise. ccwaters 02:19, 9 February 2006 (UTC)
Over at WP:HOCKEY we have/had 3 forces promoting non-English characters in en.wiki hockey articles: native Finns demanding native spellings of Finnish players, native Czechs demanding native spellings of Czech players, and American stalkers of certain Finnish goaltenders. I did a little research and here are my findings:
Here's a Finnish site profiling NHL players. Here's an "incorrectly" spelt Jagr, but the Finnish and German alphabets both happen to have umlauts so here's a "correct" Olaf Kölzig. Who is Aleksei Jashin?
Here's a Czech article about the recent Montreal-Philadelphia game. Good luck finding any Finnish players' names spelt "correctly"... here's a snippet from the MON-PHI article:
Flyers však do utkání nastoupili značně oslabeni. K zraněným oporám Peteru Forsbergovi, Keithu Primeauovi, Ericu Desjardinsovi a Kimu Johnssonovi totiž po posledním zápase přibyli také Petr Nedvěd a zadák Chris Therrien.
Well...I recognize Petr Nedvěd, he was born in Czechoslovakia. Who did the Flyers have in goal??? Oh, it's the Finnish guy, "Antero Niitymakiho".
My point? Different languages spell names differently. I found those sites just by searching yahoo in the respective languages. I admit I don't speak either and therefore I couldn't search thoroughly. If someone with backgrounds in either language can demonstrate patterns of Finnish publications acknowledging Czech characters and vice versa, then I may change my stance. ccwaters 03:45, 9 February 2006 (UTC)
I support every word ccwaters said, albeit with not as much conviction. There is a reason why we have Misplaced Pages in different languages, and although there are a few instances in which English uses some sort of extra-curricular lettering (i.e. café), most English speaking people do not use those. Croat Canuck 04:25, 9 February 2006 (UTC)
I must make a strong point that seems to be over-looked: this is not the international English language wikipedia. It is the English language wikipedia. It just so happens that the international community contributes. There is a reason that there are other language sections to wikipedia, and this is one of them. The Finnish section of wikipedia should spell names the Finnish way and the English wikipedia should spell names the English way. The vast majority of English publications drop the foreign characters and diacritics. Why? Because they aren't part of the English language, hence the term "foreign characters". Masterhatch 04:32, 9 February 2006 (UTC)
I agree in every particular with Masterhatch. The NHL's own website and publications do not use diacriticals, nor does any other known English-language source. The absurdity of the racist card is breathtaking: in the same fashion as the Finnish and Czech language Wikipedias follow their own national conventions for nomenclature (the name of the country in which I live is called the "United States" on neither ... should I feel insulted?), the English language Misplaced Pages reflects the conventions of the various English-speaking nations. In none are diacriticals commonly used. I imagine the natives of the Finnish or Czech language Wikipedias would go berserk if some peeved Anglos barge in and demand they change their customary linguistic usages. I see no reason to change the English language to suit in a similar situation. RGTraynor 06:46, 9 February 2006 (UTC)
People like Jagr, Rucinsky or Elias are not only NHL players but also members of Czech team for winter olympics. Therefore I do not see any reason why spelling of their name in NHL publications should be prioritized. I intentionally wrote the names without diacritics. I accept the fact that foreigners do that because they cannot write those letters properly and use them correctly. There are also technical restrictions. I also accepted fact that my US social security card bears name Jan Smolik instead of Jan Smolík. I do not have problem with this. I even sign my posts Jan Smolik. But Misplaced Pages does not have technical restrictions. I can even type weird letters as Æ. And it has plenty of editors who are able to write names with diacritics correctly. The name without diacritics is sufficient for normal information but I still think it is wrong. I think that removing diacritics is a step back. Anyway it is true that I am not able to use diacritics in Finnish names. But somebody can fix that for me.
I do not care which version will win. But I just felt there was not a clear consensus for the non-diacritics side and this discussion has proven me to be right. As for the notice of Czechs writing names incorrectly. We use Inflection of names so that makes writing even more difficult (my name is Smolík but when you want to say we gave it to Smolík you will use form we gave it Smolíkovi). One last argument for diacritics, before I retire from this discussion as I think I said all I wanted to say. Without diacritics you cannot distinguish some names. For example Czech surnames Čapek and Cápek are both Capek. Anyway we also have language purists in the Czech Republic. I am not one of them. --Jan Smolik 19:11, 9 February 2006 (UTC)
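The point above, that dropping diacritics makes distinct names collide, can be illustrated with a minimal Python sketch. The helper name `strip_diacritics` is hypothetical and introduced only for this illustration; it assumes that "removing diacritics" means Unicode canonical decomposition (NFD) followed by dropping combining marks, which is one common approach and not necessarily what any particular publication does.

```python
import unicodedata


def strip_diacritics(name: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The two Czech surnames from the comment above fold to the same string.
surnames = ["Čapek", "Cápek"]
folded = {surname: strip_diacritics(surname) for surname in surnames}
print(folded)

# Two different surnames, one folded form: the distinction is lost.
collision = len(set(folded.values())) < len(surnames)
print("collision:", collision)
```

Under this approach, letters that have no canonical decomposition, such as the Polish Ł, pass through unchanged, so a real transliteration scheme would need extra rules for them.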
People like Jagr, Rucinsky or Elias are not only NHL players but also members of Czech team for winter olympics. Therefore I do not see any reason why spelling of their name in NHL publications should be prioritized -Fine we'll use the spellings used by the IIHF, IOC, NHLPA, AHL, OHL, WHL, ESPN, TSN, The Hockey News, Sports Illustrated, etc, etc, etc.
This isn't about laziness. It's about using the alphabet afforded to the respective language. We don't refer to Алексей Яшин because the English language doesn't use the Cyrillic alphabet. So why should we subject language A to the version of the Latin alphabet used by language B? Especially when B modifies proper names from languages C & D.
My main beef here is that the use of such characters in en.wiki is a precedent, and not a common practice. If you think the English hockey world should start spelling Czech names natively, then start a campaign amongst Czech hockey players demanding so. It may work: languages constantly infiltrate and influence each other. Misplaced Pages should take a passive role in such things, and not be an active forum for them. ccwaters 20:09, 9 February 2006 (UTC)
People like Jagr, Rucinsky or Elias are not only NHL players but also members of Czech team for winter olympics. Therefore I do not see any reason why spelling of their name in NHL publications should be prioritized Great, in which case for Czech Olympic pages, especially on the Czech Misplaced Pages, spell them as they are done in the Czech Republic. Meanwhile, in the NHL-related articles, we'll spell them as per customary English-language usage. RGTraynor 08:05, 10 February 2006 (UTC)
I wish I understood why User:ccwaters has to be rude in his posts on this subject. "Stalkers of Finnish goaltenders" isn't the way I'd describe a Misplaced Pages contributor. Also, since you asked, Aleksei Jashin is the Finnish transliteration of Alexei Yashin. Russian transliterates differently into Finnish than into English. Of course you must know this, since you have such a habit of lecturing to us on languages. As for diacritics, I object to the idea of dumbing down Misplaced Pages. There are no technical limitations that stop us from writing Antero Niittymäki instead of Antero Niittymaki. The reason so many hockey publications all over the world don't use Finnish-Scandinavian letters or diacritics is simple laziness, and Misplaced Pages can do much better. Besides, it isn't accepted translation practice to change the spelling of proper names if they can be easily reproduced and understood, so in my opinion it's simply wrong to do so. Since it seems to be obvious there isn't a consensus on this matter, I think a vote would be in order. Elrith 16:40, 14 February 2006 (UTC)
Alas, a Finnish guy lecturing native English speakers on how they have to write Czech names in English (not to mention the lecturing regarding the laziness) is but a variation on the same theme of rudishness.
So, Elrith, or whoever reads this, if the lecturing is finished, could you maybe devote some attention to the Dvořák/Dvorak problem I mentioned below? I mean, whomever one asks this would not be problematic - but nobody volunteered thus far to get it solved. Am I the only one who experiences this as problematic inconsistency? --Francis Schonken 21:05, 14 February 2006 (UTC)
So is "Jagr" the Finnish transliteration of "Jágr"??? On that note, the Finnish "Ä" is not an "A" with "funny things" on top (that's an umlaut), it's a completely separate letter nonexistent in the English language and is translated to "Æ". "Niittymaki" would be the English transliteration. "Nittymeki" or (more traditionally "Nittymæki") would be the English transcription.
In the past I've said our friend's contributions were "thorough." I'll leave it at that. There will be nothing else about it from me unless asked. ccwaters 21:02, 14 February 2006 (UTC)
My opinion on the Dvořák/Dvorak issue is that his name is spelled Dvořák, and that's how the articles should be titled, along with redirects from Dvorak. Similarly, the article on Antero Niittymäki should be called just that, with a redirect from Niittymaki. You're right that it is a problematic inconsistency, and it needs to be fixed.
The only reason I may sound like I'm lecturing is that there are several people contributing to these discussions who don't understand the subject at all. Ccwaters's remarks on transliteration are one example. It isn't customary or even acceptable to transliterate or transcribe Finnish letters into English; the accepted translation practice is to reproduce them, which is perfectly possible, for example, in Misplaced Pages. Niittymaki or anything else that isn't Niittymäki isn't a technically correct "translation". The reason North American, or for that matter, Finnish, hockey publications write Jagr instead of Jágr is ignorance and/or laziness. Misplaced Pages can do better than that.

However, since this discussion has, at least to me, established that there is no consensus on Misplaced Pages on diacritics and national letters, apart from a previous vote on diacritics, I'm going to continue my hockey edits and use Finnish/Scandinavian letters unless the matter is otherwise resolved. Elrith 04:32, 20 February 2006 (UTC)
Hi Elrith, your new batch of patronising declarations simply doesn't work. Your insights into language (and how language works) seem very limited, reducing everything you don't like about a language to "laziness" and "ignorance".
Seems like we might need an RfC on you, if you continue to oracle like this, especially when your technique seems to consist in calling anyone who doesn't agree with you incompetent.
Re. consensus, I think you would be surprised to see how much things have evolved since the archived poll you speak about. --Francis Schonken 23:14, 20 February 2006 (UTC)
My 2 cents:
1) This should NOT be settled as a local consensus for hockey players; this is about how we name persons in the English wikipedia. It is wrong to have a local consensus for hockey players only.
2) I have tried to find out how names are represented. It is wrong to say that since these names are normally spelled like this they should be spelled like this; many wrongs do not make a right. So I did a few checks:
If I look at the online version of Encyclopædia Britannica I get a hit on both Björn Borg and Bjorn Borg, but in the article it is spelled with Swedish characters; the same goes for Selma Lagerlöf and Dag Hammarskjöld. I could not find any more Swedes in EB :-) (I did not check all..)
I also checked as many Swedes as I could think of in wikipedia to see how it is done for non-hockey Swedes. I found the following Swedes by looking at list of swedish ... and adding a few more that I could think of. ALL had their articles spelled with the Swedish characters (I'm sure you can find a few that are spelled without the Swedish characters, but the majority for sure seem to be spelled the same way as in their birth certificates). So IF you are proposing that we should 'rename' the Swedish hockey players I think we must rename all other Swedes also. Do we really think that is correct? I cannot check this as easily for other countries but I would guess that it is the same.
Dag Hammarskjöld, Björn Borg, Annika Sörenstam, Björn Ulvaeus, Agnetha Fältskog, Selma Lagerlöf, Stellan Skarsgård, Gunnar Ekelöf, Gustaf Fröding, Pär Lagerkvist, Håkan Nesser, Bruno K. Öijer, Björn Ranelid, Fredrik Ström, Edith Södergran, Hjalmar Söderberg, Per Wahlöö, Gunnar Ekelöf, Gustaf Fröding, Pär Lagerkvist, Maj Sjöwall, Per Wästberg, Isaac Hirsche Grünewald, Tage Åsén, Gösta Bohman, Göran Persson, Björn von Sydow, Lasse Åberg, Helena Bergström, Victor Sjöström, Gunder Hägg, Sigfrid Edström, Anders Gärderud, Henrik Sjöberg, Patrik Sjöberg, Tore Sjöstrand, Arne Åhman, so there seems to be a consensus for non-hockey-playing Swedes? Stefan 13:33, 21 February 2006 (UTC)
I also checked Encarta for Björn Borg and Dag Hammarskjöld; both have the Swedish characters in the main name of the articles. Selma Lagerlöf is not available unless you pay, so I cannot check. I'm sure you can find examples of the 'wrong' way also, but we cannot say that there is consensus in the encyclopedic area for respelling foreign names the 'correct' English way. Stefan 14:16, 21 February 2006 (UTC)
This seems like a very constructive step to me. So I'll do the same as I did for Czech, i.e.:
  1. start Misplaced Pages:Naming conventions (Swedish) as a proposal, starting off with the content you bring in here.
  2. list that page in Misplaced Pages:Naming conventions#Conventions under consideration
  3. also list it on wikipedia:current surveys#Discussions
  4. list it in the guideline proposal Misplaced Pages:Naming conventions (standard letters with diacritics)#Specifics_according_to_language_of_origin
OK to work from there? --Francis Schonken 15:22, 21 February 2006 (UTC)
Works for me :-) Stefan 00:26, 22 February 2006 (UTC)
Tx for finetuning Misplaced Pages:Naming conventions (Swedish). I also contributed to further finetuning, but add a small note here to clarify what I did: page names in English wikipedia are in English per WP:UE. Making a Swedish name like Björn Borg English means that the ö ("character" in Swedish language) is turned into an "o" character with a precombined diacritic mark (unicode: U+00F6, which is the same character used to write the last name of Johann Friedrich Böttger – note that böttger ware, named after this person, uses the same ö according to Webster's, and in that dictionary is sorted between "bottery tree" and "bottine"). Of course (in English!) the discussion whether it is a separate character or an "o" with a diacritic is rather futile *except* for alphabetical ordering: for alphabetical ordering in English wikipedia the ö is treated as if it were an o, hence the remark about the "category sort key" I added to the intro of the "Swedish NC" guideline proposal. In other words, you can't expect English wikipedians who try to find something in an alphabetic list to know in advance (a) what the language of origin of a word is, and (b) whether any "special rules" for alphabetical ordering are applicable in that language. That would be putting things on their head. "Bö..." will always be sorted in the same way, whatever the language of origin.
What I mean is that "Björn Borg" (in Swedish) is transcribed/translated/transliterated to "Björn Borg" in English, the only (invisible!) difference being that in Swedish ö is a character, and in English ö is a letter o with a diacritic.
Or (still the same in other words): Ö is always treated the same as "O" in alphabetical ordering, whether it's a letter of Ötzi or of Öijer--Francis Schonken 10:56, 22 February 2006 (UTC)
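As a minimal sketch of the "category sort key" idea described above (sorting Ö as if it were O), one could derive a key by decomposing each title and dropping the combining marks. The function name is hypothetical and this is only an illustration under that assumption, not how the wiki software itself computes sort keys.

```python
import unicodedata


def category_sort_key(title: str) -> str:
    """Return a key that orders letters with diacritics like their base letter."""
    # NFD splits U+00F6 (ö) into "o" followed by U+0308 (combining diaeresis).
    decomposed = unicodedata.normalize("NFD", title)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() makes the ordering case-insensitive.
    return base.casefold()


titles = ["Ötzi", "Orr", "Öijer", "Oz"]
print(sorted(titles, key=category_sort_key))
```

Note that this only works for letters that have a canonical decomposition. Letters such as ß, þ and ð have none and pass through unchanged, so they would need separate handling.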

For consistency with the rest of Misplaced Pages, hockey player articles should use non-English alphabet characters if the native spelling uses a Latin-based alphabet (with the exception of naturalized players like Petr Nedved). Why should Dominik Hasek be treated differently than Jaroslav Hašek? Olessi 20:48, 21 February 2006 (UTC)

If we are using other encyclopedias as litmus tests, why don't we look at a few hockey players: Dominik Hasek at Encarta; Dominik Hasek at Britannica; Jaromir Jagr at Encarta; Teemu Selanne in Encarta list of top scorers.

Last argument: We use the names that these players are overwhelmingly known as in the English language. We speak of Bobby Orr, not Robert Orr. Scotty Bowman, not William Scott Bowman. Ken Dryden not Kenneth Dryden. Tony Esposito, not Anthony Esposito. Gordie Howe not Gordon Howe... etc etc, etc. The NHL/NHLPA/media call these players by what they request to be called. Vyacheslav Kozlov used to go by Slava Kozlov. Evgeni Nabokov "americanized" himself for a season as "John Nabokov" but changed his mind again.

ccwaters 22:54, 25 February 2006 (UTC)

Dvořák

Could someone clean this up:

Article/category name without diacritics
Category:Compositions by Antonin Dvorak
Category:Operas by Antonin Dvorak
Cello Concerto (Dvorak)
String Quartet No. 11 (Dvorak)
String Quartet No. 12 (Dvorak)
Symphony No. 6 (Dvorak)
Symphony No. 8 (Dvorak)
Symphony No. 9 (Dvorak)
Violin Concerto (Dvorak)
Page name with diacritics
Antonín Dvořák
List of compositions by Antonín Dvořák
Symphony No. 7 (Dvořák)

I'd do it myself if I only knew which way the wikipedia community wants it... --Francis Schonken 10:53, 10 February 2006 (UTC)

I've been bold and renamed the articles to use diacritics in the title, since they already use them in the text. I've also slapped {{categoryredirect}} tags on the two categories: a bot should be along shortly to complete the job. —Ilmari Karonen (talk) 14:54, 21 February 2006 (UTC)
Tx!!! - I'll remove Dvořák as an exception from Misplaced Pages:Naming policy (Czech)#Exceptions --Francis Schonken 15:22, 21 February 2006 (UTC)

Moving up to guideline

Since the discussion on some related NC proposals (Czech, Swedish,... - see links above) appears to be concluded, I see no further problem to move this general treatment of diacritics up to NC guideline too --Francis Schonken 08:10, 2 March 2006 (UTC)


Francis, I think you need to advertise this widely and run a strawpoll before you change it into a guideline. The page has had only 5 editors and that is hardly enough of a consensus to create a guideline which has an impact on lots of articles across en.wikipedia. --Philip Baird Shearer 09:02, 2 March 2006 (UTC)

It seems clear to me that there is absolutely no consensus for this proposal. Aside from the objections that I have already stated, I might point out that this would be about as inconsistent as can be imagined. E.g. we should use diacritics for Serbian names according to the Cyrillic convention but would have to trawl through the strict conditions before we could use them for Croatian names. This is despite the fact that these two nations use a language which is very closely related and uses the same set of diacritics. Also, with the Czech and Swedish proposal pages, both of which Francis started, we would have two islands where diacritics can be applied regularly (perhaps with a few exceptions) but for every nation which borders these we would have to do the same trawling for any page which wanted to use diacritics.
Finally, to reiterate my main point: This proposal goes against the current practice on Misplaced Pages. Therefore there needs to be demonstrated a lot of support for it before the shift which it dictates is carried out. Stefán Ingi 10:16, 2 March 2006 (UTC)

The five editors of the page (six if you include my reversal). Apart from yourself, none has made more than one edit.

  • 11:08, 28 January 2006 Francis Schonken (start)
  • 11:17, 30 January 2006 Phil Boswell (→Scope - {{{1}}})
  • 15:42, 9 February 2006 CesarB (→See also - Misplaced Pages:Naming conventions (technical restrictions)#Browser support limitations)
  • 08:38, 10 February 2006 TShilo12 (→Other - dab Hebrew, avoid redirects on other languages, changing description of Chinese)
  • 14:33, 20 February 2006 Nikai m (→Rationale - sp)
  • 08:56, 2 March 2006 Philip Baird Shearer (Reverted)

Francis, as an old hand in this controversial area you will be well aware (I certainly am) that there are strong feelings among many editors about using Google or other search engines to decide issues like these. I suspect many people would object to such a suggestion. Further, there are those who argue that blog pages should not carry the same weight as research papers, books and other encyclopaedia entries.

So at the moment I know that you do not have a true consensus for changing this from a proposal into a guideline. Whether you have a Misplaced Pages:consensus is debatable. --Philip Baird Shearer 10:23, 2 March 2006 (UTC)

@Philip: your criticism is too absurd for words. It comes down to: "write a lower quality guideline, so that other wikipedians feel massively compelled to edit it". I did invite others to improve the text:
" could you have a look at Misplaced Pages:Naming conventions (standard letters with diacritics)? I mean, both w.r.t. (ab)use of the English language and content of the thing, --Francis Schonken 13:56, 14 February 2006 (UTC)"
reply: "Looks OK to me, Bill 14:21, 14 February 2006 (UTC)"
So, no, Bill does not appear in the list of minor changes to the guideline proposal. Which proves the absurdity of your newly invented method for assessing guideline proposals. To be remarked also that you're moving way out of consensus by even imagining that such flimsy method would be acceptable to the wikipedia community.
I know that you have trouble accepting wikipedia:google test *is* part of wikipedia consensus. It's a how-to guideline, so I need not defend that I rely on it. Of course that guideline is a lot about caution re. the application of google test. That's one of the reasons why this NC is only on "standard letters with diacritics": Google is unreliable in filtering out non-standard letters like ß, þ and ð (I commented on that at wikipedia talk:naming conventions (thorn)). The same unreliability does not exist for standard letters with diacritics, see for example wikipedia:naming conventions (Swedish)#Rationale (but that's of course not the only testing I did to check reliability of filtering out variants of the same word with and without diacritics).
Further, it's not "the search engine that decides", as you erroneously try to present it. There's only a check required that the version with diacritics is not totally uncommon (20% is not really a high threshold, and takes account of the internet's bias towards diacritic-less variants). A minimum of references that use the variant with diacritics in English, is a requirement of no lesser stature.
@Stefan: Your criticism is inconsistent: first you ask me to prove the guideline changes something (following Piotr who didn't see the need for a guideline if it doesn't change anything), then you reproach me it *would* change some things (on which you exaggerate, but that's another point). Why should I answer to such inconsistencies?
It has been established long ago that the same rule for all words with diacritics wouldn't work (the famous diacritics poll). So the "standard letters with diacritics" NC distinguishes between languages, and is also an invitation to come up with NC's for languages that would be problematic (I invited Haukur not so long ago to come with such proposal for Icelandic at wikipedia talk:naming conventions (thorn), of course, if you wouldn't have seen that invitation, I invite you likewise!). I wrote/copied the Czech NC, taken all time together, including talk page, in less than half an hour. I don't think, for example, that a Croatian NC would take more than that. Maybe Croatian isn't even problematic seen its proximity to Serbian (and Serbo-Croatian that has both Latin and Cyrillic spelling? – I'm not that much of an expert in those languages)? - Anyway, if there would be a deeper problem, in that case the "standard letters with diacritics" NC would probably only offer a temporary solution, until the specific guideline is written, but the diacritics NC may help in writing such guideline (like it helped me while writing the Swedish NC - without knowing Swedish).
You're trying to make a reproach of me writing some of the NC's for specific languages. How absurd can you be? I wrote them (or a part of them), even collaborated to the hockey NC (as if I know anything about ice hockey). What problem could you have with that? They all settled disputes over *differences of opinion* regarding current practice, and did so as straight derivations of the diacritics NC proposal. So, please, don't make problems where there are none. --Francis Schonken 16:01, 2 March 2006 (UTC)
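The 20% check mentioned above can be sketched roughly as follows. The discussion does not spell out how the percentage is computed, so this assumes it is the share of the diacritics spelling among all search hits for both spellings; the function names and the hit counts are hypothetical.

```python
def diacritics_share(hits_with: int, hits_without: int) -> float:
    """Fraction of all hits that use the spelling with diacritics."""
    total = hits_with + hits_without
    if total == 0:
        raise ValueError("no hits for either spelling")
    return hits_with / total


def passes_threshold(hits_with: int, hits_without: int,
                     threshold: float = 0.20) -> bool:
    """True if the diacritics spelling is 'not totally uncommon'."""
    return diacritics_share(hits_with, hits_without) >= threshold


# Hypothetical hit counts, for illustration only.
print(passes_threshold(300, 700))
print(passes_threshold(50, 950))
```

Under this reading the search result is only a filter; the separate requirement of a minimum number of English references that use the diacritics spelling would still have to be checked by hand.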
I have thought and still do think that if this were to become policy then it would change many names, or at least make a lot more effort for those defending them. I disagreed with Piotr when he said that it would not change anything. I asked you to confirm that it would change something but the examples I took weren't very good and nobody offered any other examples so it was inconclusive. I'm also saying that since I think that it changes many things, then it has to be shown to have some sort of Misplaced Pages:Consensus before it can be put up as a guideline. As Philip says, whether that consensus exists is debatable, from looking around on Misplaced Pages I would say that it definitely doesn't, if you disagree then you should offer some evidence for that statement.
Also, I wasn't reproaching you for writing these specific guidelines, I was just pointing it out for the benefit of people who were to come to this discussion. Perhaps there wasn't much point in doing that, I'm not sure. I apologise that I worded it in such a way that you took it to imply that I was reproaching you. Stefán Ingi 16:41, 2 March 2006 (UTC)

Spanish accents

I would like to know if this proposal covers Spanish accents. My interest is in regards to accents in a person's name. Should accents be used in the article name if that person's original name has them? Joelito 14:38, 3 March 2006 (UTC)

As it stands, this proposal would cover Spanish accents so if you wanted to include them and this proposal were a policy or guideline you would have to go through the motions to justify the accents every time. But, this is just a proposal so instead you might just as well look around you, e.g. on the list of Prime Ministers of Spain and see whether the accents are used there. In the example I'm taking they are. Stefán Ingi 14:47, 3 March 2006 (UTC)
Well as the proposal stands it would be very hard to prove points 2 and 3 if the person is not well covered in English publications. For example Eddie Miró, a person known by all the people from Puerto Rico as a television show host has had very little English coverage. It would be very hard to provide references in English for him. Possibly 7-10 relevant English references could be found.Joelito 14:58, 3 March 2006 (UTC)
Yes, I think that in many cases it would mean a lot of unproductive work. That's one of the reasons why I am opposed to this proposal Stefán Ingi 17:52, 3 March 2006 (UTC)
I think that the burden to prove the spelling should be on the non-native name (English), not the other way around. When it comes to articles about the non-English world, much of the work is done by non-native English speakers, who are more familiar with local spellings. Such work usually gets copyedited and such by native English speakers, and if at that point they think a move to a more English-friendly name is useful, they should do the search and if there is a much more widely used English name variant, a move can be done.--Piotr Konieczny aka Prokonsul Piotrus 15:44, 5 March 2006 (UTC)

Speaking of Spanish accents, there is a small group of people who regularly edit Major League Soccer articles who have decided among themselves, without any other notification or documentation, that every MLS soccer player who now has US citizenship should have no accents in their names, even if they were born in countries where their names would normally be accented. 07:26, 29 March 2006 (UTC)

I object to this proposal

Currently there is no vote ongoing to determine the attitude of the community towards this proposal. Stefán Ingi 15:28, 3 August 2006 (UTC)


I object to this proposed guideline on the grounds that it is unnecessary, onerous, and a waste of time. Diacritics should be used throughout English Misplaced Pages, period. Proper names are spelled correctly or incorrectly. Let the arbiter of correctness be the people with whom the proper name originates. This encourages the use of diacritics in all naming conventions, and suggests that the English Misplaced Pages now suffers from a surfeit of unnecessary traffic on the subject, including this proposed guideline and any like it. -- Mareklug 17:33, 3 March 2006 (UTC)

I agree. This proposal is not useful. "10 reliable publications that are fully in English" just to be able to correctly spell the name of some small town in Central Europe or that of some poet who isn't even known in the English-speaking world? Come on, this is silly. If there is an English form of a name with a well-established pronunciation, that should be used (Rome, Venice, Munich etc.). If there isn't, the original spelling should be used, including diacritics. That's all the policy we need on this matter. u p p l a n d 20:06, 3 March 2006 (UTC)
Likewise object to this proposal, on the excellent and comprehensive grounds given above. The proposal is impractical, unnecessary and patronizing. logologist|Talk 10:38, 4 March 2006 (UTC)
One should consider that at the moment Misplaced Pages:Naming conventions, a policy, states:
  • "Convention: Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the English form." (Misplaced Pages:Naming conventions (use English))
  • Convention: Use the most common name of a person or thing that does not conflict with the names of other people or things. (Misplaced Pages:Naming conventions (common names)). Which goes on to say "When choosing a name for a page ask yourself: What word would the average user of the Misplaced Pages put into the search engine."
Neither of these conventions suggest using names with diacritics, unless the name with diacritic is the most common usage in English. This is not usually the case. However there is disagreement on this issue. There is a summary of the disagreement on the Misplaced Pages:Naming conventions (use English). This proposal is an attempt by Francis to try to resolve the disagreement. --Philip Baird Shearer 00:22, 5 March 2006 (UTC)
tx, that's how I intended it. --Francis Schonken 09:16, 5 March 2006 (UTC)
No need to "resolve the disagreement," given Misplaced Pages's capability for redirection. logologist|Talk 09:51, 5 March 2006 (UTC)
True.--Piotr Konieczny aka Prokonsul Piotrus 15:39, 5 March 2006 (UTC)
  • Object. I don't think i should object without noting that i'm wowed by the thoroughness of the proposal. But as one who tends to overwrite, impressing me in that way is probably a bad sign, no matter how admirable the effort is in the abstract. Francis's tone on this page is irrelevant to the merits of the proposal, so it's important to ask whether the impression of the proposal as "patronizing" is really about the proposal or the talk page. My current take is that the proposal is impractical, and probably in itself patronizing. I sense a need for something that better reflects my impression of the outcome of the Zürich debate, but i doubt this is it.
    --Jerzyt 03:31, 28 March 2006 (UTC)
    • Hi Jerzy, thanks for your comment. The only one using "patronizing" against the proposal had been Logologist, to which I hadn't given much attention (for obvious reasons: spelling these out: Logologist is one of the most "patronizing" wikipedians when it comes down to using Polish spelling/translations in all sorts of contentious places, see for example how Logologist "decides" about the Polish version of the Polish Biographical Dictionary here - the Polish notice board talk page -, then overrides discussions on the talk page of the article by simply moving )
      FYI, my own major problem with the present version of the "diacritics" guideline proposal is that it is too complex. Doesn't help to say that language issues can be very complex, wikipedians want simple (or: simplistic) guidelines. Which probably won't happen for diacritics, while IMHO it is not possible to formulate the basic principles in a simplistic way. So, my next best solution is to split off as many particular issues that have "simple" solutions as possible (by language, by character,...). --Francis Schonken 09:02, 28 March 2006 (UTC)
To be effective, the proposal needs a simple "Summary" paragraph at the top with general rules, rather than immediately launching into complex language. --Elonka 16:49, 29 March 2006 (UTC)
  • I support the spirit of this proposal. However, I insist that the procedures listed here must be simplified. Specifically, we need to use some kind of a simple lexical procedure (to consider the criteria for diacritics word by word). I propose to use major English dictionaries to determine the correct usage, where available. Some of my ideas will come from the discussions we had at Japanese-MOS and a subsequent mediation (mentioned by Freshgavin below) and would like to make some suggestions based on such experience. I understand this is not a vote, and so would like to join discussions to improve this proposal.--Endroit 22:30, 20 April 2006 (UTC)
  • I might have already stated this above, if so, sorry for repeating myself, but I object and agree that Misplaced Pages should use diacritics throughout and always. —Nightstallion (?) 19:02, 23 April 2006 (UTC)
  • Object. Many non-English topics have no "common English name" -- in some cases Misplaced Pages is the first extant English-language reference. Mandating that we be forced to mangle names in these cases is ... at best, confusing.  –Aponar Kestrel (talk) 04:24, 29 May 2006 (UTC)
  • Strong support. It seems to me that every opposer is Polish. One ethnic community cannot dictate its terms to Misplaced Pages, hence their votes should be discounted. They are not authorities on proper English usage. I'm going to list the whole vote as a sample of Polonization. --Ghirla 07:11, 5 June 2006 (UTC)
Your argument is just plagued with fallacious statements and prejudice. First, the last time I checked I wasn't Polish. Second, the English Misplaced Pages isn't exclusively for native English speakers. Third, being a native English speaker doesn't make anyone an authority on the usage of the English language. Conversely, not being a native English speaker doesn't disqualify anyone from having a discussion on the proper use of the English language. I think your proposal to "discount" the votes submitted by the Polish people (even if it wouldn't serve any purpose but to insult them as decisions here aren't reached by vote but by consensus) just gives us a view into the type of mindset that doesn't respect foreigners as equals or their culture as valuable...I'm not surprised at all that you "strongly" support this proposal. Rosa 00:34, 2 February 2007 (UTC)
  • Anyway, I tried to follow Elonka's suggestion above and reworked a bit: made the intro section less complicated (just showing the main thrust of what this is about), and moved the technical details to the body of the text. And added an "in a nutshell" formulation. And gave a more elaborate description of the "level" of applicability of this guideline in the "Criteria" section. Hope all this satisfies some of the concerns expressed above. --Francis Schonken 10:21, 5 June 2006 (UTC)
    The problem with this proposal is not the layout of the text, it is the content. That is the concern that most people who have registered opposition in this section have raised. Of course I have no problem with anybody improving the layout but I still object to the content and this cannot be made into a policy until some attempt is made to show that this has community support. Stefán Ingi 13:20, 5 June 2006 (UTC)
    Wow. Are you just ignoring me, or are you claiming (with no evidence whatsoever, I'd add) that I'm Polish? Because, um, I'm not, by any definition of the term. And I'd add the same (given a cursory inspection of their user page and contributions, which admittedly may not suffice) for users Uppland, Jerzy, and Doug Bell (all objectors above): as far as I can tell, not even most of the objectors here are Polish!  –Aponar Kestrel (talk) 07:47, 10 June 2006 (UTC)
  • And I'd like to register an additional objection: as written it seems to imply that the ‘okina should be used in Hawai‘ian article titles but the kahakō should not, which is... er, counterintuitive, to put it mildly. (Macrons are necessary in some languages to write them properly, contrary to the statement in the proposal.)  –Aponar Kestrel (talk) 07:47, 10 June 2006 (UTC)
    Tried to work away the problem you mentioned (macrons are apparently used outside IPA...) --Francis Schonken 09:03, 10 June 2006 (UTC)
  • Support I strongly support this proposal. Sure English uses diacritics sometimes, but in most cases, diacritics are omitted. Misplaced Pages has a policy of using the most common English name "Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the English form." and a well established guideline that says "If you are talking about a person, country, town, movie or book, use the most commonly used English version of the name for the article, as you would find it in other encyclopedias and reference works." Everything points to the avoidance of using diacritics in article titles unless that is the way it is most commonly written in English. I don't even see why this is an issue. It is pretty clear cut if you ask me. Misplaced Pages is pretty clear about using the most common English name for article titles (for a multitude of reasons). It is quite rare to see words in English that are more common with diacritics. I think it is reasonable to avoid diacritics in the title and use diacritics in the first sentence of the first paragraph. That way, the most common English spelling is used for the title and the reader can see the native spelling of the word. This debate has been carried over recently at Jaromir Jagr. Basically, virtually every English internet outlet does not use diacritics in his name. Local media and the vast majority of English reference books don't use diacritics either. Why should wikipedia be different? Masterhatch 04:06, 23 June 2006 (UTC)
Sorry--I overlooked that. The other point I made is almost covered too, but should be reworded. What it says is "This guideline does not apply to redirect pages, which can (and should) use diacritics to ensure that all popular variations of a name's spelling, still redirect to the proper article." This misses the most important case; when the article's name does include diacritics, it is essential to have a redirect (or a link from a disambiguation page) which does not have diacritics. As written, it only discusses redirect pages which do include diacritics. Gene Nygaard 12:18, 3 August 2006 (UTC)
Please proceed – it's not as if this proposal is owned by anyone. Note however that, for example, Vitoria redirects to Vitoria-Gasteiz and not to Vitória. The case is explained at Misplaced Pages:Naming conventions (precision)#Minor spelling variations. --Francis Schonken 13:38, 3 August 2006 (UTC)

Support I support the proposal as an effort to bring English use and Anglicisations into Misplaced Pages. It is English Misplaced Pages, not Czech, Polish, Finnish, French, German, etc. Charles 15:18, 3 August 2006 (UTC)

This encyclopedia is English but foreign terms aren't. Rosa 00:03, 2 February 2007 (UTC)

Strongly Oppose An encyclopaedia's purpose is to enlighten about subjects foreign to us, like foreign cultures and how they manifest themselves differently to our own culture, like through the use of a different language. Therefore, not using the native name constitutes a loss of information for all of us. Rosa 00:03, 2 February 2007 (UTC)


This looks to me like a policy looking for a problem. We don't need more policy. Joe Llywelyn Griffith Blakesley talk contrib 10:37, 7 August 2006 (UTC)

Agreed, the community is too divided on this for now. Rosa 00:03, 2 February 2007 (UTC)
I object because there is nothing here worth objecting to. No one will have trouble finding an article. No one will be unable to read the article. Since it's only about pedantry, this is a case where an ardent and active minority forms a consensus. WikiPedia cannot be as consistent as a 20th century encyclopedia. Since the official policy is (necessarily) nebulous, rightness or wrongness is in the eye of the beholder. As articles are revised, more and more of them are hit by diacritic pedants who change the article names. Someone put a macron on ]; who's going to undo it? I put an acute on ]; who's going to undo it? (I was resolving little 'a' and capital 'P', and sumbuddy tossed in acutes, so I moved it.) I noticed a few days ago that two articles on similarly named Nazis were ] and ]. Umlaut is one thing, but eszett has traditionally been verboten in English. The question of whether to call him Höß or Höss or Hoess or Hoss (Hoß and Hoeß are laughable) has been decided. And why not Heß? He's well enough known that sumbuddy changes him back to Hess. The few where there are disputes are usually resolved by removing the diacritic, but the vast majority of diacritics stay. That's wikiconsensus. — Randall Bart 11:08, 2 February 2007 (UTC)
As long as we're being pedantic, I am using "diacritic" in the typographic sense. In French ö is an o with a diacritic, but in German it's an o-e ligature and in Swedish it's a different letter, though it's derived from the German. The ß is a confluence of ss and sz ligatures. The ð is derived from d with a diacritic but considered a different letter. The þ is a completely distinct letter; akin to theta, but derived (like the other runes) from Italic alphabets. — Randall Bart 11:08, 2 February 2007 (UTC)

Japanese

An independent mediation supported a past change to the Japanese MoS so that now the inclusion of macrons in titles of articles with Japanese content is acceptable. (Don't blame me!) Japanese romanization utilizes (ō), (ū), and (').  freshgavinΓΛĿЌ  03:19, 20 April 2006 (UTC)

Technicality and an alternative proposal

As was raised in the very beginning, there is a problem with the provision "There are at least 10 reliable publications that are fully in English". Besides the question 'what is meant by 'reliable' (and note that on WP:RS there is no specific list or easy 'how to determine' process)' there are many, many cases when there are much fewer than 10 publications. Lots of smaller towns or villages are not mentioned in 10+ English publications, the same problem is with many historical personas who might be notable in their country but are barely (or not at all) known outside. But if the proposal passes, then we will be forced to invent the undiacriticized versions of many names, thus for example Okopy Świętej Trójcy would become Okopy Swietej Trojcy because this former village is apparently almost completely unknown to the English-speaking world. Instead I'd like to draw attention to a similar naming convention, which proposes a different approach and seems to attract mostly positive comments. The Misplaced Pages:Naming conventions (geographic names) proposed policy supports the use of English names, but states that if there is no widely accepted English name (with 'widely' being defined later) then local name should be used. I personally believe that this policy is more realistic, and it can be expanded beyond geographic names to other 'rare' names.--Piotr Konieczny aka Prokonsul Piotrus 21:51, 5 June 2006 (UTC)

FYI,
...if you ask me not the best example of a stable Polish name... Up till now 20% of the total number of edits to that page have been page moves... For me this rings a bell that maybe a good guideline would be better than this move-warring... no?
Then you also make a link to Misplaced Pages:Naming conventions (geographic names) which is at a no-consensus for "proposal F" state, afaik *longer than the diacritics proposal exists* - I wouldn't boast too much on the "near to consensus" state of Misplaced Pages:Naming conventions (geographic names). Anyway, I don't even see "competition" there, its parameters for determining a choice for a name are comparable to those of the more general "diacritics" proposal - and it certainly isn't a guideline that would be less complex to put in practice.
For instance, when applying the recommendations of that proposal (version F) to your example, I'd need to check Britannica, Columbia, Encarta, Google Scholar and Google Books. I pick one of these 5 recommended reference sources (Google Books):
  • "Okopy Świętej Trójcy" - did not match any documents.
  • "Okopy Swietej Trojcy" - did not match any documents.
  • 2 pages on "Ramparts of the Holy Trinity"
  • "Stronghold of the Holy Trinity" - did not match any documents.
  • 1 page on "Okopy Sw. Trojcy"
IMHO, this confirms the present name of the article at Ramparts of the Holy Trinity, and not any version of the Polish name. But yeah, true, if the "geographic names" proposal would be guideline I'd need to check 4 more reference sources in the same fashion. For the "standard letters with diacritics" proposal, the case would already have been settled: translation appears indicated... no need to discuss about Polish versions with or without diacritics. --Francis Schonken 10:08, 6 June 2006 (UTC)
Your search above is actually misleading. The 3 (not 2) pages on Ramparts are actually:
should be discarded because it does not refer to the village but to the ramparts of the castle
This is the same case, note the lower case used in ramparts (surely if it was to be the village's name it would have used an upper case?)
The third source is the only one which capitalizes it.
Finally it should be noted that while translation makes sense in the literary text (like #2 or discussions about it like #3) it makes no sense when we are referring to the geographical place.--Piotr Konieczny aka Prokonsul Piotrus 17:30, 9 June 2006 (UTC)
My feeling is this: If a particular town or structure isn't being written up in any English publications, then by some standards we shouldn't have an article on it at all, because it doesn't pass the "Notability" test. On the other hand, I do see some advantages to having articles about not-necessarily-in-the-press things, such as schools and towns and some obscure bits of history. So I'm willing to accept the idea of having an article at Misplaced Pages about it, as long as the article is titled by whatever the most common English name is. If there's no English name, then okay, I'd say list the title, but without diacritics. Personally, even though I have a passing familiarity with several languages, I find diacritic names jarring to look at in what is supposed to be an English reference work, because it looks like an article has been written solely for the use of the locals in that area, which makes it less accessible to other nationalities (including people on other continents for whom English is a second language anyway). It's unsettling to see a name that is so clearly unpronounceable to most English speakers. So I would rather see the title use the non-diacritic version, which is how it would probably show up anyway in an English-language newspaper if they *did* end up writing an article about the town. And in this case, the problem would be self-correcting. If a town with an odd spelling did become famous in English-language literature, and genuinely notable, then the Misplaced Pages article would have a standard to follow: If the article was showing up in English-language newspapers with diacritics, a move could be requested to the more commonly-used version of the name. But until then, I'd say let's stick with the "no diacritics unless it can be shown that it's common usage in English" guideline. --Elonka 18:06, 9 June 2006 (UTC)

Related poll

Interested editors are invited to participate in: a poll on whether or not to use diacritics in the titles of Polish monarchs. --Elonka 18:13, 13 June 2006 (UTC)

Major rewrite

I took a stab at simplifying and condensing the proposed guideline to make it easier to read and understand. If I removed anybody's favorite section, please feel free to add it back in. :) --Elonka 06:05, 26 June 2006 (UTC)

I think it reads quite well now. Honestly, i think this is all common sense and I would like to see this get moved up one to become a guideline (eventually). Masterhatch 17:51, 26 June 2006 (UTC)

somewhat related

There is a somewhat related poll here Talk:Voss-strasse if anyone is interested in adding their two cents. Masterhatch 17:47, 27 June 2006 (UTC)

Scope addition

As regards the other "letters not included in this guideline", such as þ and Đ and ß, what is the feeling about adding wording such as: "Because of the limited geographic regions in which these letters are used, English-speakers in other parts of the world (especially those for whom English is a second language) often find these symbols incomprehensible and unpronounceable. As a result, this guideline recommends that their use be avoided in article titles." --Elonka 20:46, 27 June 2006 (UTC)

That makes sense. Personally, i don't know the difference between the sounds the diacritics make and i am a native speaker of English. Masterhatch 22:36, 27 June 2006 (UTC)

Other options

I hope that we have finally reached the agreement that linking through redirects is not a relevant issue, and that we can concentrate on English usage in Misplaced Pages articles. There are some important points to be made:

  • Misplaced Pages conventions are just that, conventions of Misplaced Pages. They are not natural laws, and we can choose to implement any naming convention we choose, including numbering articles by timestamp at creation or some other scheme. Use English is a convention of Misplaced Pages because we decided so, not because that's how it must be.
  • There is no central authority on correct usage in English.
  • Popular usage is not always correct - even if most sources refer to Tories, the party's proper name is still the Conservative Party.
  • We choose article titles which we judge as the most appropriate for Misplaced Pages, not those that are the most correct (United States is properly called United States of America) nor always those which are most commonly expected (China is overwhelmingly used to mean People's Republic of China in the real world).
  • There are sources in English which use diacritics regularly, those which use them irregularly, those which use just certain diacritics or use them just in certain languages, and those that don't use them at all.
  • Foreign words used in English text don't automatically become English words.

So, we should approach this question with an open mind. Use English does not require us to not make exceptions for classes of special cases. It also does not require us to use or omit diacritics as neither omitting nor including them is wrong in English per se. We should discuss what the advantages and disadvantages of using diacritics are. One obvious advantage I see is providing the additional information. The one real practical disadvantage mentioned so far is that they make it hard to search for the name inside the browser page.

There is the real possibility that both ways are equally correct and that it's a matter of taste, and tastes are hard to change through debate. In such cases, we should be looking for widely acceptable rather than hypercorrect solutions. IMO, we should aim to avoid the ridiculous situations when the spelling of people's names depends on and changes with where they currently live or work, or when the same first or last name is spelt differently in different article titles without a clear criterion.

The rewritten proposal is much better than previous attempts, but it has two major problems: (1) it's too long, and (2) it goes against the current practice, which has many supporters, and which will be hard to change, even if this is promoted into a guideline. Keep in mind that putting a guideline tag on something doesn't magically make all articles conform with it nor all editors agree with it.

Other solutions include:

  • Use no diacritics at all
This would have the advantage of being short, clear and easy to enforce, but as the current proposal says, it would sometimes force us to use wrong titles even for English names, which makes it unacceptable.
  • Always use the original spelling
This would also be short and clear, but it would be simply wrong for monarchs and many other historic people, which makes it unacceptable.

There are other, more nuanced options, none of which should exclude per-case decisions in special circumstances, nor using English names when they are spelt entirely differently.

  • Use whatever makes more sense, or what the first editor used if no choice is substantially better
This is how BE/AE spelling and CE/AD era notations are handled.
  • Use diacritics if the common English spelling is the same as the original one, but without the diacritics.
This is more or less how place names are handled.
  • Use the original spelling unless the person has legally adopted the spelling without diacritics or regularly uses it even in their native language.
This would cover naturalized citizens or other people who have genuinely changed their name and language identity, while leaving most articles as they are now.

In short, I'd prefer names to be spelt by default like properly translated sources in English from the country of origin spell them (i.e. use the local transliteration), but any of the nuanced solutions are acceptable to me. Zocky | picture popups 01:29, 28 June 2006 (UTC)

What is this "natural laws" thing you keep talking about? are we discussing physics? Well, we decided to use english because this is the English language section of wikipedia. It would be kinda strange to use korean here, now wouldn't it? anyways, that is why there are multiple language sections on wikipedia.
Well, the Brits have their English and the Yanks have theirs. For wikipedia, we blend the two and use the most common form of English.
While popular usage isn't always correct, wikipedia has a policy of using the most common form of English in usage because wikipedia is for the layman and it is the most common form of English that the layman understands the best.
See above, we use the most common form, unless there is a disambig problem, of course.
That is why we go with the most common form. Simple, eh? If a word or name is most commonly written with diacritics, then wikipedia should use diacritics. If it is most commonly written without diacritics, then wikipedia should follow suit. Masterhatch 07:26, 28 June 2006 (UTC)
Look, the only foreseeable solution I can see is that for names, places, etc. where diacritics are most commonly used in English, they keep the diacritics here on wikipedia article titles. For articles where English most commonly drops diacritics, wikipedia should reflect that. Isn't that simple? How can anyone logically argue against that? That way both sides win. People that like diacritics get them on articles where they are most commonly found in English and people who don't like diacritics don't have to have them rammed down their throat for words that they almost never see them on in daily life. Masterhatch 07:26, 28 June 2006 (UTC)
The above was inserted into the middle of my comment which made it hard to read. All I can say is that it has been previously established on numerous occasions that reader ignorance is not a valid concern for editorial decisions, so most of Masterhatch's comment is irrelevant. We also know that common usage is one of the factors used for these decisions, not the overriding deciding factor, which makes the rest irrelevant. Zocky | picture popups 10:30, 28 June 2006 (UTC)


Zurich -- (Talk:Zürich/archive1#Move (Zürich -> Zurich)) --Philip Baird Shearer 07:43, 28 June 2006 (UTC)

I've reverted some of the changes to the proposal:

(1) categories don't have redirects
So it's Category:Compositions by George Frideric Handel (without diacritic) if we have the composer at George Frideric Handel, and Category:Compositions by Camille Saint-Saëns (with diacritic) if we have the composer at Camille Saint-Saëns. I tried to draw a bit more attention to the category aspect of being consistent (while categories don't have redirects). Also the section is important, while it draws attention that being consistent only applies to the name of a topic, not to "copying all diacritics of a language".
(2) this is NC not MoS
refers to "first contributor" rule (copied from MoS) that was removed by me from this NC proposal, while the "first contributor" rule can only be used if it's competition between varieties that are acceptable in English. In other words, one doesn't fight non-English nationalistic POV by inciting to start as many wrongly named articles as possible, to give way to "I was the first" claims.
(3) remove "national varieties" doubling (Irish not nat. var. of Eng)
don't know why the "national varieties of English" link, that was already in the intro of Misplaced Pages:Naming conventions (standard letters with diacritics)#Specifics according to language of origin was doubled in a rephrased format in one of the subsections. Note that Irish is not a "variety of English" (the rephrased intro was a bit more ambiguous on this point).
(4) keep all commented out proposals until further notice
why delete some, and keep some others? Some of the deleted ones were pages in Misplaced Pages "naming conventions" format, some of those that were kept, were merely a link to an encyclopedia article (so not guidance on how such specifics are handled in wikipedia page names) --Francis Schonken 09:41, 28 June 2006 (UTC)
  • If you really mean this "After the choice has been made whether a name is written with or without diacritics in a page name, all other Misplaced Pages pages" then I find it unacceptable. It is a step way beyond WP:NC. The reason for redirects is so that Misplaced Pages can accommodate other names for the same subject.
    • I see the confusion I generated: changed that to content pages. I hope "content page" is clear enough as a concept, or should "except redirect pages and disambiguation pages" be added to that? --Francis Schonken 11:24, 28 June 2006 (UTC)
  • I think that when there is a dispute over the name which can not be resolved then first is a reasonable compromise. If not how does one decide as a page has to have a name? It cuts down on revert wars while an alternative consensus is reached.
    • WP:RM always leads to a resolution (even if "stalling by lack of consensus"). And it has a slight bias towards "keep where it is" (while 60% is the usual threshold for a move). And part of the rules are, as far as I know, the WP:RM should not start from a place where the page has just been moved to (recently there was still a WP:RM vote broken off for that reason). --Francis Schonken 11:24, 28 June 2006 (UTC)
  • Depends on what you mean by "Irish"; let's call it "Irish English" as that reduces the ambiguity.
  • Only those which affect national varieties of English should be mentioned here. The rest should not because there are potentially hundreds of these and there is no reason why this general guideline should be explicitly subservient to other potentially POV laden guidelines like this proposed one: Misplaced Pages:Naming policy (Czech):
Czech names: almost all names with diacritics use it also in the title (and all of them have redirect). Adding missing diacritics is automatic behavior of Czech editors when they spot it. So for all practical purposes the policy is set de-facto (for Cz names) and you can't change it.
"Only those which affect national varieties of English" - the title of the section is "Specific languages using the (extended) Latin alphabet". Neither Irish nor Māori language are a National variety of English. French has a more profound effect on UK English than on US English; it seems also that, for instance, Spanish has a more profound effect on US English than on UK English. But this is not the point of this section (while these US/UK style variants are treated in the MoS). If "böttger ware" turns up in Webster's, with a diacritic as in German, this is not an issue limited to "national varieties of English", but it should be part of Misplaced Pages's diacritic-related guidelines. FYI, German is a language using "extended Latin alphabet". --Francis Schonken 11:24, 28 June 2006 (UTC)
--Philip Baird Shearer 10:16, 28 June 2006 (UTC)

Catering for dumbed-down and lazy usage

A café is a café is a café. It is a French word which we, English speakers, have adopted to mean a particular type of building. The word is not an English word and it would be incorrect to treat it as such--it is a French word used by English speakers. Sure, some people spell it "cafe", some people even pronounce it "caff" - that's fine, local variation is good - language develops and evolves. One day, in a few decades time, the spelling "café" may seem quite alien - at that time, it will have been fully adopted. The word role, once spelt rôle, is an example of a French word which has been fully adopted into English and whose original spelling looks odd to most. This is the distinction between a non-English word in common usage among English speakers, and a fully adopted word. Misspelling café is one thing, but proper names are quite another - "Antonín Dvořák" is spelt one way and there is no alternative. Ultimately, we are trying to write an encyclopedia, doing something somehow authoritative. In a casual e-mail I may miss the diacritics due to laziness and in the understanding that the recipient would understand who I was referring to. We are not writing a casual e-mail, we are not texting our friends and we are not instant messaging our colleagues. As such, we should not treat language as if we were. It would also be wrong to go too far in the opposite direction and become too prescriptive about language, insisting that those who don't use diaereses in words such as the verb meander (thus meänder) are somehow wrong or illiterate. Of course, we are still a dynamic work that is able to stay current and appeal to all but we have, at our disposal, a range of tools which enable us to use the correct characters for a wide range of languages more than anyone has ever had in history. Misplaced Pages is such that if someone doesn't understand, or recognise, a particular character they are able to look it up and educate themselves.
We should make the effort to provide truth.

"Diacritics should only be used in an article's title, if it can be shown that the word is routinely used in that way, with diacritics, in common usage" is entirely flawed as a guideline. Common usage varies from continent to continent, from country to country and from culture to culture. I am sure it is common for most people who are writing about Dvořák to spell his name "Dvorak" because they aren't sure how to get that funny "ř" character to appear (this was particularly the case in the days of the typewriter). "Dvorak" would therefore be considered common usage, but this doesn't detract from "Dvorak" being wrong through-and-through when referring to the man Dvořák. --Oldak Quill 17:17, 28 June 2006 (UTC)

In that case, the article about the man should definitely include the proper spelling of his name, in the body of the text. This guideline is not referring to the main article, but strictly to the titles of articles, and trying to come up with a consistent method which allows for ease of linking, reading, and finding, for the vast majority of English speakers. If I, myself, were looking for an article on the man you mentioned, I would type "Dvorak" into the search box, not something with diacritics. --Elonka 17:36, 28 June 2006 (UTC)
I cannot see your reasoning. Surely the title of an article should be spelt correctly? This is particularly the case in Misplaced Pages as we have redirects which allow us to use correct characters in titles without inconveniencing our visitors. Redirects allow us to both maintain a high standard of spelling and lexical correctness while making the browsing experience easy for the visitor. Potentially, all people who have primarily Cyrillic names could have Cyrillic article titles, redirects would ensure that finding the correct article would be as easy as finding a non-Cyrillic counterpart (Pyotr Ilyich Tchaikovsky, or whichever transliteration we chose to use, would redirect to Пётр Ильич Чайкoвский). Of course, using non-Latin alphabets in titles goes too far for most so it is something we don't do. But diacritics are easy to use and understand, they are part of our Latin alphabet- we should not incorrectly label people and things due to sheer institutional laziness. --Oldak Quill 17:46, 28 June 2006 (UTC)
I have to disagree with you Quill. I am a strong believer that the most common form of English should be used in article titles (except of course in the event of disambig problems). Diacritics, in most cases are foreign to English and if you look around, most people, places, and things that have diacritics in their native language, lose it when mentioned in English. Take Jaromir Jagr for example, his name in Czech includes diacritics. If you look around publications in English, the extreme vast majority of times the diacritics are dropped, even on his hockey sweater. The native spelling, with the diacritics, should be (and is) shown in the first line of the first paragraph of the article. The title should be the most common form found in English, whether the most common form includes diacritics or not. Your basic reasoning is that English is spelling the names wrong. Well, i am telling you, English is not spelling it wrong, it is just spelling it English. That is what, in fact, got me into this. I used to not care either way if there were diacritics in titles until someone called the English spelling wrong. That lit a fire in my arse because that is pure ignorance for someone to call an accepted English spelling as wrong. I don't go and say that "États-Unis d'Amérique" is spelt wrong and they must spell it the English way! I understand that French has its spellings and you must understand that English has its spellings. Masterhatch 18:37, 28 June 2006 (UTC)
I would say that as long as a name is written in a Latin-derived alphabet and there is no commonly accepted English name (such as Spain for España, Rome for Roma etc) then the name should be written with its original diacritics. A name is a name, it is either spelled correctly or incorrectly and we shouldn't start disfiguring it. The majority of non-English names, whether they are of places or people are too little known in English to have commonly accepted English spellings. Simply because the name of a Czech village or a minor figure from Paraguayan history has diacritics that are not normally used in English doesn't mean they should be removed when written in English as there are no commonly accepted English spellings for such names. Booshank 19:21, 28 June 2006 (UTC)
Well, maybe some of those places that aren't well enough known to English speakers aren't notable enough to have their own article. If those places can't be found in an English atlas, then why would wikipedia have an article about it? If there are no English publications in regards to that place (or person), are they really notable enough to have an article? Most small czech villages can be found in a thorough english atlas. If those English atlases have diacritics, then wikipedia should follow suit. If those atlases drop the diacritics, then, again, wikipedia should follow suit. Same with people. If there are no or only a very few English publications about a person, then is he/she really notable enough to have an article? If there are English publications about that person, have a look at them and, again, whichever form is most common (with or without diacritics), then use that form. I have no problem with the use of diacritics if that is the most common way to spell that name in English. I only have a problem with the use of diacritics when the most common way of writing that name in English drops those very diacritics. Masterhatch 20:15, 28 June 2006 (UTC)
I don't believe a single source should ever be used to push an agenda. Just because a single atlas rejects diacritics (for the sake of space, perhaps) does not mean that we should follow suit. Diacritics should only be excluded if a non-diacriticed version has become more popular in English. On a side point, whether there are any publications on something in a particular language (where there might be plenty in another language) is not a measure of notability. Czech villages should be covered as extensively as British villages in Misplaced Pages (the latter of which will be covered far more thoroughly in the English language), for example. Articles on small Czech villages which won't be widely known enough among English speakers to have their own spelling variants (regardless of what a particular atlas states) should always be given the correct Czech name. --Oldak Quill 20:53, 28 June 2006 (UTC)
Single source? You must have misunderstood my comment. I would never say that a single source is good enough. Anyway, to clarify, for notability's sake: if there is no mention of a small Czech village in any English reference book, then I ask, is it really notable enough for an article? But that is a side track and a debate for a different day and a different place, so I won't discuss it further. Back to my point: if most English atlases and reference books aren't using diacritics for a Czech town (city, village, person, whatever), then Wikipedia shouldn't either. If most English atlases and reference books are using diacritics, then Wikipedia should too. Funny thing is, with all this arguing back and forth, no one has told me what is wrong with that. It is fair for everyone and it follows the Wikipedia naming convention policy to a tee. Masterhatch 02:28, 30 June 2006 (UTC)
Of course "États-Unis d'Amérique" isn't wrong; this example is not analogous to what I have been saying. États-Unis d'Amérique and United States of America are both correct; it is "Etats-Unis d'Amerique" which is not. If one is going to use French words then spell them correctly - E is a different letter to É. In many languages with diacritics, forgetting the diacritic can result in an entirely different word with an entirely different meaning. In Afrikaans, for example, if one forgets the diacritic on the ë in the word "hoërskool" (meaning high school) and so produces "hoerskool", one would be expressing "whore school". This is an example which demonstrates the fact that a letter with and a letter without a diacritic are different letters, and to confuse them is to arrogantly thrust the English non-use of diacritics onto loaned words. Jaromir Jagr chooses to spell his name differently in English, to transliterate his name for an English-speaking sport. That is perfectly acceptable and we use his adoptive English name in articles. But we, as an encyclopedia, cannot thrust new names onto people because it suits us, because it is easier for us to type. Proper nouns, and some adopted words, exist outside the rules of the language by which they are adopted - they are words which should still be treated with the spelling rules of the language from which they came until their usage is so common that those rules are dropped. My name is Oldak Quill and I wouldn't expect speakers of a language which doesn't use "Q" or "Qu" to change my name to "Oldak Kwill" to give themselves an easier time - it is just incorrect. Dropping the diacritics of a French or Polish word is nothing short of transliteration and is entirely comparable to changing "Quill" to "Kwill". Transliterations can only ever give rough and fuzzy approximations of a word and should therefore be avoided as much as possible.--Oldak Quill 19:17, 28 June 2006 (UTC)
You are missing my point (and I am probably missing yours) that we here on Wikipedia aren't about changing English, but about following the majority of English publishers to arrive at the "most common form of English". It is simple: if the majority of English publications drop the diacritics, then Wikipedia must follow suit. If the majority of other English encyclopaedias and reputable publishers don't use diacritics for names, why should we on Wikipedia? As I have said before, if the majority of reputable English publishers are using diacritics for a given name, then, by all means, include them in the article's title in Wikipedia. I am not trying to eliminate diacritics from Wikipedia, I am just trying to make sure that Wikipedia doesn't stray too far from "the most commonly used form of English".
See my reply below. Reputable English encyclopaedias do use diacritics in many, many names. --Oldak Quill 20:46, 28 June 2006 (UTC)
Agreeing with Oldak.--Piotr Konieczny aka Prokonsul Piotrus 22:02, 28 June 2006 (UTC)

Not a place for change

Really, I think it is all pretty simple: if the majority of reputable English publications include diacritics, then Wikipedia should too. If the majority of reputable English publications don't use diacritics, then Wikipedia shouldn't either. It's a case by case situation. What is wrong with that? Honestly, if people have problems with English dropping diacritics, don't come to Wikipedia with your grievances. Misplaced Pages isn't here to change the English language or how English does things. Go to Webster's or Oxford or whoever. Misplaced Pages is not the place for change. Masterhatch 20:15, 28 June 2006 (UTC)

I have no desire to make Misplaced Pages radically different from other works of reference in English. As I said, if it has come to be that a word without diacritics is more common than one with, then it should be the one used (such as role instead of rôle). What I do object to is purposefully going out of our way to force change by standardly rejecting diacritics when they play a very valid rôle ;) in many words. Further, no works of reference that I know of force change in people's names to exclude diacritics. Dvořák is Dvořák is Dvořák, there is simply no alternative way to spell this name - the diacritics are not just aesthetic but functional. I am striving to do exactly what you accuse me of doing: all I see in this proposal is an enforced rejection of diacritics for the sake of dumbing-down and laziness. --Oldak Quill 20:46, 28 June 2006 (UTC)
I am not standardly rejecting diacritics!! That would be ignorant of me. I will repeat what I have said all along, because people for some reason keep missing it: if most English works use diacritics, then Wikipedia should. If most works don't, then Wikipedia shouldn't. This is a case by case situation, not a blanket covering! Someone please tell me why that won't work? Masterhatch 02:28, 30 June 2006 (UTC)
Nonsense. There are zillions of reputable English publications discussing Antonin Dvorak and other Dvoraks, and things such as the Dvorak Simplified Keyboard as well as its inventor August Dvorak. And there are probably as many English publications using Antonin Dvořak as Antonín Dvořák, so why is the former a redlink as I write this? Redirects are cheap enough that we could even include August Dvořák for fools who mistakenly think "Dvořák is Dvořák is Dvořák". Gene Nygaard 16:56, 18 August 2006 (UTC)

Dumbed down?

Café/cafe
IMHO, it's rather "dumbed down" to state that in English only the version with diacritic is correct:
Webster's 1981 international printed edition
café cafe
café au kirsch
café au lait
café brûlot
cafe car
café chantant
café concert
café crème
cafe curtain
café noir
café society
OED minidictionary (1994)
café
I was using café as an example - café is, as far as I can tell, the most commonly used form and the form used by most reputable dictionaries (including the current OED). I did not deny that some people did use the word "cafe" (in fact, I stated that they did) nor that derivative words might use that spelling (such as "cafe curtain"). --Oldak Quill 20:46, 28 June 2006 (UTC)
Well, "dumbing down" comes from calling a less used version a misspelling (that's the word you used). even caff (Brit : café) is in the addenda of the 1981 international printed Webster's, and so not a "misspelling". --Francis Schonken 21:31, 28 June 2006 (UTC)
I did not call caff a misspelling, I called cafe a misspelling. I was quite wrong about "cafe" being a misspelling though, but that isn't the discussion at hand. In trying to get my point across I erroneously emphasised something too strongly. All of my points (except calling cafe a misspelling) still stand. --Oldak Quill 22:02, 28 June 2006 (UTC)
Tx for taking the point. My next point is that "Antonin Dvorak" is not a misspelling in English. And that was your main point (at least your main example). I'm prepared to discuss whether the article on this composer should be at Antonin Dvorak or at Antonín Dvořák. In fact I already did, twice, as you can see in #So what actually is this saying? and #Using diacritics (or national alphabet) in the name of the article above on this page. As a result of these discussions, among others Category:Compositions by Antonin Dvorak was moved to Category:Compositions by Antonín Dvořák. I'm prepared to discuss the page naming for this composer again, but not on the basis of the "dumbed down" assumption that Antonin Dvorak is a "misspelling". --Francis Schonken 22:29, 28 June 2006 (UTC)
"ř" is an entirely different letter to "r". "Dvořák" is a different word to "Dvorak" with a different pronunciation. "ř" produces that particular "j" sound where "r" would produce, well... you know. It is not appropriate to replace "ř" with "r" just because they look similar. --Oldak Quill 23:11, 28 June 2006 (UTC)
"Antonin Dvorak" is not a misspelling in English. --Francis Schonken 23:17, 28 June 2006 (UTC)
"Antonin Dvorak" is not a misspelling in English. Gene Nygaard 12:20, 5 February 2007 (UTC)
Antonin Dvorak
As you might know, this composer lived and worked a few years in the USA. I always wondered how they wrote his name in the music school where he was director at that time? How was his name written on the concert programs when his music was performed during his stay in New York? Does anyone have any info on that? --Francis Schonken 20:11, 28 June 2006 (UTC)
That would be a good piece of info, but the spelling at the time is probably not what we want to use as a criterion. It would make us spell a bunch of historic names weirdly, if nothing else. Zocky | picture popups 13:01, 29 June 2006 (UTC)
Neither do I intend to do so! Unless when it would be clear that in English, the composer used the diacritic-less version of his name exclusively. In that case we'd have the same situation as for Arnold Schoenberg, who clearly changed his name to a diacritic-less variant when moving to the USA. For this composer the version of his name with diacritics could be considered a "misspelling" in English. And only in English - all other languages write Arnold Schönberg afaik. --Francis Schonken 13:52, 29 June 2006 (UTC)

Diacretics are not English, full stop

This is the English language Misplaced Pages, so diacretics should not be used, other than to indicate the form used in an original language. We don't use Chinese characters, and there is no more justification for using diacretics, other than that to some degree we can get away with it. Words written with diacretics are not English, and that is the end of the matter. Chicheley 20:51, 28 June 2006 (UTC)

This is simply not true. English does make use of diacritics natively (such as the diaeresis). Further, it makes extensive use of diacritics in loan words because letters with diacritics are not the same as the similar-looking letter which doesn't have one. --Oldak Quill 20:55, 28 June 2006 (UTC)
I disagree, but that counts for little. Of rather more importance are the vast number of reliable sources which use diacritics where conventions dictate that they should be used. Angus McLellan (Talk) 21:43, 28 June 2006 (UTC)
I must agree with the title of this section: "Diacretics are not English, full stop". Indeed "diacretics" is not an English word. "Diacritics" is. And they are used in some English words, all dictionaries agree on that. Sorry for the pun. Couldn't resist :) --Francis Schonken 22:58, 28 June 2006 (UTC)

Foreign words are not English words. Full or partial stop or whatever. Zocky | picture popups 11:43, 29 June 2006 (UTC)

Of course loanwords are English words. Some of these have diacritics. See Misplaced Pages:Naming conventions (standard letters with diacritics)#Rationale--Francis Schonken 13:59, 29 June 2006 (UTC)
Personal names and other foreign words used in English texts are not loanwords, they're cited foreign words. Zocky | picture popups 14:04, 29 June 2006 (UTC)
Wasn't talking about proper nouns but about loanwords. The question was whether there are "English" diacritics. Proper nouns are afaik of no use when trying to prove whether diacritics are part of English or not. Loanwords, on the contrary, are useful in that context. And then the answer is yes, some diacritics are English. --Francis Schonken 15:07, 29 June 2006 (UTC)

Technology, not "it's not English," is why diacritics were historically stripped in English

The question was whether there are "English" diacritics. et al...

The notion that diacritics are "not English" is fundamentally rooted in pre-computer typesetting technology, whose ultimate technological achievement was the Linotype machine. Typesetting consisted of assembling a set of molds for all the letters and spaces in a line of text, then pouring hot metal into the assembled mold to create the line of print. It was simply not feasible (if not mechanically, then certainly not economically) to create Linotype machines which could do what we can do now from any computer keyboard, that is, type any language on the planet (almost). And so, what you had were Linotype machines set up by language, with a few "extras" tossed in which were used often enough that including them in the set of additional "special" characters was not overly burdensome.
   The result, when printing foreign names in English, particularly in the case of Eastern Europe, were all sorts of variations from just the elimination of diacritics to a seemingly endless supply of semi-transliterations.
   References recognized this and began moving away from non-diacriticalized versions as early as the 80's as phototypesetting technology began rolling out in force (having become commercially affordable in the late 1970's), using instead the original language form. However, it's only more recently that typing technology for the masses has caught up, from the RIGHT-ALT "GRE" language character shift standard to all the standard font faces supporting all the extant "code pages" and computer programs accepting alternate font "code pages" besides just Latin and "extended" Latin.
   Just in the tiny little article naming corner of the world I find myself embroiled in, we find the following question (leaving issues of monarchal titles, other languages, etc., out). What, for example, defines common English usage and/or current English usage for the Polish "Władysław"? Let's do the "Wiki" thing (google et al. searches, library searches, etc.):
  • Wladyslaw (drop the diacritics),
  • Vladislav,
  • Ladislau,
  • Ladislas,
  • along with the native Władysław.
Here we find three variants based on some sort of transliteration, one based on stripping the diacritics, and the native "Władysław," quite frequently used in major/popular references (not just obscure academic history journals) and simply indexed for English usage as if there were no diacritics.
   I submit that "Władysław is not English" is a red herring. Enough current English references, and going back 3 decades, simply use the native Polish syntax. So, what are the arguments in favor of the four non-Polish variants above?
  • A person "won't know how to type the ł's"—a moot point because redirects handle all the variants.
  • It looks strange—articles should mention all the historical English usage variants; I think it would be a net benefit for (many) English speakers to have a less parochial view of the Latin script; since it already appears to be generally accepted that one can use the native syntax within articles, there's no need to restrict the title.
  • Typing those ł's is damn inconvenient−shift to Polish keyboard, RIGHT-ALT plus "l", it really couldn't be easier... łłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłłł (repeating key). Took less than 30 seconds to install Polish keyboard support on my PC.
  • A person typing it without the diacriticals won't find the Wiki article from a search engine—perhaps once upon a time, but as far as I can tell, search engines now pretty much ignore diacritics.
  • I can't save the file in Notepad—save as UTF-8.
It seems exceedingly odd to me that an encyclopedia which exists only because of computer technology, which provides a built in editor supporting all the (major) "font pages" so one can insert all manner of characters with the click of a mouse, would insist that article titles remain rooted in a technology that became obsolete thirty years ago.
   English usage no longer means "restricted to the English language character set." Yes, "Władysław is not English"—it's Polish, but "Władysław" is accepted and increasingly preferred English usage. —Pēters J. Vecrumba 03:49, 19 November 2006 (UTC)
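The claim above that redirects and search engines can absorb the diacritic-less spellings can be sketched roughly as follows. This is only an illustration in Python using the standard unicodedata module; the names fold_diacritics, redirect_candidates and EXTRA_FOLDS are invented for this example and do not describe how MediaWiki or any search engine actually works. One detail worth noting: "ł" has no Unicode decomposition, so stripping combining marks alone leaves it untouched and it needs an explicit mapping.

```python
import unicodedata

# Letters that have no canonical decomposition and so survive mark-stripping.
# Hypothetical and deliberately incomplete; assumed for this sketch only.
EXTRA_FOLDS = str.maketrans({"ł": "l", "Ł": "L", "ø": "o", "Ø": "O"})


def fold_diacritics(title: str) -> str:
    """Return the title with diacritics dropped (e.g. for a redirect)."""
    # NFD splits "ř" into "r" plus a combining caron, "á" into "a" plus acute.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keep the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Handle the letters that NFD cannot split.
    return stripped.translate(EXTRA_FOLDS)


def redirect_candidates(title: str) -> list[str]:
    """List the diacritic-less spelling if it differs from the title."""
    folded = fold_diacritics(title)
    return [] if folded == title else [folded]


if __name__ == "__main__":
    for name in ("Władysław", "Antonín Dvořák", "Väinämöinen", "Oder"):
        print(name, "->", redirect_candidates(name))
```

A sketch like this only generates candidate redirect titles; somebody (or some bot) still has to create them, which is the objection raised in the reply below.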
Don't be silly.
It isn't generally a technology limitation. Newspapers and magazines have long been able to include diacritics when they choose to. They do not choose to.
It's a "moot point because redirects handle all the variants"? Utter hogwash. First of all, redirects don't just happen. Just look through my contributions listing, and see all the redlinks mentioned in my edit summaries that are still red even after I have called the problems to the attention of those watching the articles.
Furthermore, there are much broader implications than what happens when you type something into the go box, or when you put a link into an article. The use of diacritics or not also has significant effects on searches in various search engines; and on the major search engines, there are so many different parameters that affect the results that those results are never predictable. I can show you a great many cases in which diacritics do indeed make significant differences, on various search engines.
We English speakers have every damn bit as much right to establish our own identity in the characters we use in our language as the users of some other language who can't think of any better way to establish their identity than to see how cute they can get with the squiggles they put on the letters they use. I get tired of hearing from far too many editors claiming not that we should choose to use the options with diacritics here on Misplaced Pages, but rather that it is an "error" for us not to do so. It is never an error to use the English alphabet when writing in English. We can at times choose to retain some other languages characters for some purposes; that does not mean that we are in any way obligated to do so.
I never used to be in favor of an "English as an official language" law in the United States. After dealing for two years on Misplaced Pages with POV-pushers who insist on sticking diacritics in hundreds of places where they clearly do not belong, I am now urging my representatives in Congress to not only pass such a law, but not to pass some wimpy, empty mollification of constituents clamoring for such laws by passing some meaningless nonsense, but rather to make a real law with real language police with real authority.
It may have taken you less than 30 seconds to install the Polish keyboard on your computer, but big fucking deal. What's the point? Yes, I can do that too. But the caps on my keys do not change. What the hell am I supposed to do? Be like those monkeys they talk about in math classes, who given enough time would duplicate all the works ever written? Do I just hit keys at random, until something shows up that resembles what I'm supposed to be looking for? Does that keyboard include "dead keys" that don't do anything themselves, but rather change what happens with the next key you hit? I may get it installed in 30 seconds, but it is hard to teach an old dog like me new tricks. It would take me 30 years to learn how to use that Polish keyboard, and I likely don't have that much left. By the way, I have for many years had both the German and the Norwegian keyboards installed on my computers. Even there, where I know what the letters are, I have found it more trouble than it is worth to try to learn the keyboard layout. Rather, I find it much, much simpler to remember a few numbers and use the Alt-numeric keypad method to create them as I need them (and in most cases, it is things like Alt-134 that I remember, the old DOS operating system versions which still work with Windows, rather than the Alt-0229 Windows version which I had to look up because I don't know it). Of course, in the case of Alt-0248 and Alt-0216, which weren't available in DOS, I learned the Windows numbers.
Then, what in the world do I do for the 500 or so other languages used here on the English Misplaced Pages. How many keyboards can I install? How much memory does each one of them take up? How am I ever going to remember what I need to do to switch to the one I want to use, let alone remember what the layout of the keys is if I do figure that out?
Your example of the "native" Władysław spelling is the most common late-20th century/early 21st century spelling in the Polish language, of a name that has had dozens of variant spellings in both the Polish language and as well as in other languages throughout history, including the times when many of the people bearing that name lived, and including various other languages actually spoken by the people bearing equivalent names under whatever spelling. Gene Nygaard 03:04, 3 February 2007 (UTC)
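The Alt-keypad numbers mentioned in the comment above can be checked with a short Python sketch. It assumes that a code without a leading zero is looked up in the DOS/OEM code page 437 and a code with a leading zero in Windows-1252; other machines may be configured with different code pages (cp850, for instance), so this is an illustration of the two numbering schemes and not a statement about any particular system. The helper names alt_code and in_dos_code_page are invented for the example.

```python
def alt_code(keys: str) -> str:
    """Character produced by typing the given digits with Alt held down.

    Assumption for this sketch: no leading zero means the OEM code page
    (cp437), a leading zero means the Windows "ANSI" code page (cp1252).
    """
    codec = "cp1252" if keys.startswith("0") else "cp437"
    return bytes([int(keys)]).decode(codec)


def in_dos_code_page(ch: str) -> bool:
    """True if the character exists in cp437 at all."""
    try:
        ch.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for keys in ("134", "0229", "0248", "0216"):
        print(f"Alt-{keys}: {alt_code(keys)}")
    print("ø available in cp437:", in_dos_code_page("ø"))
```

Under these assumptions Alt-134 and Alt-0229 give the same letter, while the letters behind Alt-0248 and Alt-0216 have no cp437 slot, which fits the remark that only the Windows numbers exist for them.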

English Misplaced Pages NHL team pages & diacritics

NHL teams don't use diacritics on their euro-players' jerseys; we should respect that. Furthermore, the euro-players in the NHL have consented to (haven't disputed) their names being anglicized on their NHL jerseys. If the euro-NHLers (past & present) consented, why can't the pro-diacritics editors & supporters do the same at NHL team pages? GoodDay 23:23, 29 December 2006 (UTC)

Another thing

Another thing: We indeed don't use Chinese characters, but we do use the transliteration which the Chinese use. Zocky | picture popups 13:04, 29 June 2006 (UTC)

For that reason romanization systems (like pinyin) are defined as outside the scope of this proposal at Misplaced Pages:Naming conventions (standard letters with diacritics)#Scope --Francis Schonken 13:59, 29 June 2006 (UTC)
So, the Chinese and Serbs get to use their own transliteration, but Swedes and Czechs don't? Zocky | picture popups 14:03, 29 June 2006 (UTC)

OK, forget Serbian and take Macedonian, which uses a Latin spelling very similar to Serbian's, but only as transliteration. Keeping a person with the same last name at Buckovski or Bučkovski (which both would spell Bučkovski themselves), depending on whether they're from Serbia or Macedonia, sounds unworkable.

I really have no clue what you're trying to get at. How to romanize the Macedonian Cyrillic script is described at Misplaced Pages:Naming conventions (Cyrillic)#Macedonian. Indeed, there, it is described as "may be written as Serbian" (with a few specifics/variants). If you have a problem with that, please direct your concerns to Misplaced Pages talk:Naming conventions (Cyrillic). Misplaced Pages:Naming conventions (standard letters with diacritics) is not about romanizing Cyrillic scripts. If you want the people involved in the romanization of Cyrillic script languages to read your suggestions, then this talk page is not the right place, Misplaced Pages talk:Naming conventions (Cyrillic) is. --Francis Schonken 16:19, 29 June 2006 (UTC)

My idea is that all languages should be treated the same - use the same spelling as used in English texts produced in the country of the language's origin. Zocky | picture popups 15:50, 29 June 2006 (UTC)

Don't see what that would solve. These English texts produced in the country of the language's origin don't all use the same spelling. And a side-effect would be that you'd make the current *agreement* on National varieties of English (as described at WP:MoS) explode: "the country of the language's origin" would be the UK in that case I suppose, so you'd get all the USA people against you. --Francis Schonken 16:19, 29 June 2006 (UTC)

Read that as "the country of the original language's origin", of course. In other words, spell Slovenian names as English texts produced in Slovenia do, spell Chinese names as English texts produced in China do, and spell American names as English texts produced in the US do.

UK would still be "the country of the original language's origin" when speaking about English (the original language) --Francis Schonken 16:49, 29 June 2006 (UTC)

The problem with this proposal excluding romanization is that it would e.g. force Serbian and Croatian names to drop diacritics while the same names used in Macedonia would keep them. Imagine a situation where the presidents of Serbia and Macedonia had the same first or last name, which includes a diacritic both in Serbian Latin spelling and in the Macedonian romanization. A sentence saying "Sasa Cacic visited Saša Čačovski in Skopje" would look ridiculous. Zocky | picture popups 16:34, 29 June 2006 (UTC)

Giving an example of English texts produced in the country of the language's origin don't all use the same spelling:
Note that all the mentioned websites are Polish (.pl), and that for the Polish pages of each of these websites always the version with diacritics is used... (I mean: the differences in the English spelling don't result from the often gratuitously assumed "laziness" in this case).
So, no, I don't think Zocky's alternate proposal would solve much.
Neither for Chinese for that matter, Lao Tzu as well as Laozi (and some other variants) can be encountered in English texts produced in China. --Francis Schonken 16:47, 29 June 2006 (UTC)
Of course they don't all use the same spelling, but that's in no way different from English texts produced in English speaking countries, but it would still be the same rule for all languages. Zocky | picture popups 17:19, 29 June 2006 (UTC)

There are two aspects to your proposal (apart from the US/UK English thing, but that could be worked away with a diligent way of formulating the principle):

1. Use local sources in English for determining spelling in English Misplaced Pages
This has several problems, for one that it would be less compatible with the current provisions of wikipedia:naming conflict. For example for Lech Walesa/Wałęsa, using the table provided by that guideline:
Criterion                                                   Lech Walesa   Lech Wałęsa
1. Most commonly used name in English                       1             0
2. Current undisputed official name of entity               0             1
3. Current self-identifying name of entity (in English!)    1             0
1 point = yes, 0 points = no. Add totals to get final scores.
This is a weighted result: it doesn't give precedence to a single principle, and it is compatible with the present "diacritics" proposal. What you propose is that a single principle gets precedence, a principle that doesn't apply likewise to all countries/languages (not all countries/languages produce readily available "reliable sources" in English covering everything that is notable about the country, for instance - for several countries the majority of reliable sources in English are produced outside the country).
So, as far as your "published in English in home country single principle" proposal is concerned: this might seem a good idea on first sight, but I foresee too many problems, and won't support it.
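The point-counting in the table above can be written out as a small Python sketch. The scores are the ones given in the table; the names SCORES, totals and preferred_title are invented for this illustration, and picking the highest total is an assumption about how the result is read, since the naming conflict guideline's handling of ties or disputes is not described here.

```python
# One point per criterion met, in the order the table lists them:
# (common name in English, official name, self-identifying name in English).
SCORES = {
    "Lech Walesa": (1, 0, 1),
    "Lech Wałęsa": (0, 1, 0),
}


def totals(scores: dict[str, tuple[int, ...]]) -> dict[str, int]:
    """Add up the points for each candidate title."""
    return {title: sum(points) for title, points in scores.items()}


def preferred_title(scores: dict[str, tuple[int, ...]]) -> str:
    """Pick the title with the most points (assumed reading of the table)."""
    summed = totals(scores)
    return max(summed, key=summed.get)


if __name__ == "__main__":
    print(totals(SCORES))
    print(preferred_title(SCORES))
```

With the three criteria counted equally, each contributes a third of the outcome, which is the sense in which no single principle takes precedence.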
"self-identifying name of entity (in English)" is roughly what I'm talking about, and in this case would probably more commonly be the one with diacritics. In fact, almost for all foreign names, items 2 and 3 gives points for the name with diacritics and trumps the English common usage. I guess that's why most articles are at the names with diacritics now. Also, Misplaced Pages:Use English says that for languages which use the latin alphabet, no transliteration is necessary, which I interpret as "use the original spelling". Zocky | picture popups 02:43, 30 June 2006 (UTC)
I've no idea what, in sum, you're trying to say:
  • Currently "self-identifying name of entity" should determine for 33% (the other two thirds being "official name" and "common name in English"), per the naming conflict guideline;
  • Then you say: no, "self-identifying name of entity" should determine for 100%, it is an appropriate formulation of the "published in English in home country single principle";
  • Then you say: no, "self-identifying name of entity" should determine for 0% while it trumps English common usage (which, furthermore, it obviously didn't in the example given above).
...all in all a quite confusing comment.
Also, your quote of Misplaced Pages:Use English is very questionable. The sentence where you quote from has quite clearly "If there is no commonly used English name". Arguably, for example, "Andre" is the common English format of the French name "André". Seems also as if you never read the guideline till the end. It has very clearly: "There is disagreement over what article title to use when a native name uses the Latin alphabet with diacritics", in Misplaced Pages:Use English#Disputed issues. It is (a part of) that dispute we're trying to solve with the present "diacritics" NC proposal. Your comments above seem so confused to me, that I still don't know what grounds you have to either support the thing getting solved, or not. --Francis Schonken 07:38, 30 June 2006 (UTC)
The problem here is the idea that everything has an "English name", which is simply not true. Some things are named in other languages and English uses them as citations. With substantial usage some of these become English words, and sometimes the spelling changes (that's how "Andrew", the real English equivalent of "André" came about). But in most cases where diacritics are used there are no English words, just cited foreign ones.
"Self-identifying name" to my mind is simple - what the person or entity uses themself. I have never said that anything trumps common English usage automatically or common English names at all (in fact, I supported titles like Oder, Drave, Save, Styria, etc.). I just meant to comment that if the above template is applied, diacritics would win in most cases, even if versions without diacritics were really "English".

How citations are rendered, is a matter of choice, but there's no magic formula that says that dropping the funny dots makes a foreign name or word English. Zocky | picture popups 03:33, 15 July 2006 (UTC)

2. treat the Latin alphabet languages and those with other native scripts with the same rules.
In fact I agree with you there. The "caveat" for the non-Latin alphabet languages is a practical one. Wikipedians have elaborated guidelines for Japanese, Chinese, etc... I think they did a good job. I'm not remotely experienced enough in these languages to doubt their assertions that on some level a more "formal" linguistic romanization system should be used, like pinyin, which results in some diacritics being used. Anyway, that's a different problem, and is, for those languages, covered by active guidelines. I don't think it would be a good idea to undermine that work. Of course, in the short term the guidance for the natively "dual script" languages (how many are there: 2 or 3?) should be clear. Which for Serbian means that, unless the "Cyrillic" naming conventions page is updated in view of the impending diacritics NC guideline, things will be as said if and when this diacritics guideline goes live (change "Latin spelling is used" to "Latin spelling is used including native diacritics" on the Cyrillic NC page, and the thing would be settled too, without Serbian names needing to be changed).
Whether at a later stage the Japanese, Cyrillic, Chinese, etc. guidelines are to be brought in line with the "Latin alphabets" diacritics guideline is not a problem to be solved now. Maybe it never happens. If it happens, and it's a language I'm remotely acquainted with (Greek might fall in that category), I'd support a diacritic-free romanization. --Francis Schonken 19:08, 29 June 2006 (UTC)

ä, ö, å

What to do with the Finnish ä, ö and the Swedish å in article titles? Aren't they allowed? My keyboard has them, but I know there are many whose keyboards don't. Finlandais 14:24, 3 July 2006 (UTC)

I would follow the lead of English-language newspapers that were writing about whatever word had those letters. If the English-language newspapers routinely included the special diacritics, then I would title the Misplaced Pages article in the same way. If the newspapers didn't usually include the diacritics, then I would leave them off of the Misplaced Pages article title. If the subject just never gets written about in anything English-language, I would probably leave the diacritics off of a Misplaced Pages article title, but I would include the proper spelling with diacritics in the body of the article. If you can provide a more specific example, I'd be happy to take a look at it. --Elonka 17:15, 3 July 2006 (UTC)
There is no rule here on wikipedia on whether to use the diacritics in titles or not and there have been, and still are, long arguments about this. However, diacritics are allowed in the sense that they are used for many article titles, e.g. the ones you are asking about are used in Norrköping, Jämtland and Fucking Åmål, but they are nowhere explicitly allowed. One way of deciding whether to use them in a particular case or not is to look at similar articles and see how the question has been resolved there. If you are interested in seeing an argument develop, you can look for outside help, e.g. here or on Requested moves where contested moves should be put up for discussion. Stefán Ingi 17:30, 3 July 2006 (UTC)

Actually, I am asking for those FINNISH alphabets, not Swedish. All examples given above by Stefan seem to be Swedish. Finlandais 15:16, 9 July 2006 (UTC)

Try Väinämöinen, which is at the location with the diacritics - which is also clearly the most common way to refer to him in English so that should be uncontroversial. Generally you should probably use diacritics if you want to be consistent with the current situation, the exception is if a Finnish person is mostly known for something like playing football abroad where his name may be consistently spelled without diacritics. But even then some people will prefer to use diacritics on Misplaced Pages (see Jaromir Jagr, though he's not Finnish). Haukur 16:04, 9 July 2006 (UTC)
I've created a proposal for Finnish proper names at Misplaced Pages:Naming conventions (Finnish), as Finnish letters are different from diacritics. Elrith 14:44, 8 August 2006 (UTC)

Umlaut and ß sources

At Misplaced Pages:German-speaking Wikipedians' notice board/Umlaut and ß I've been putting together some examples of how English language publications deal with ß and umlauts. Would anyone like to contribute? Discussions using reason and argument have so far only ended in stalemates, and I am hoping that if we can agree on how the matter is usually dealt with in printed English it might give us some clues on how to do so at Misplaced Pages. Saint|swithin 11:12, 7 July 2006 (UTC)

another related poll

There is currently a survey about moving article page names here Talk:Marián Gáborík and here Talk:Teemu Selänne. Feel free to come voice your opinions. Masterhatch 19:17, 25 July 2006 (UTC) (polls closed)

A suggested move and related debate about whether to name an article "Meissen" or "Meißen" is ongoing at Talk:Meissen. Interested editors are invited to participate. --Elonka 00:09, 22 September 2006 (UTC)

Ælfric and other Old English names

Section (and subsections) moved to Misplaced Pages talk:Naming conventions (use English)#Ælfric and other Old English names by Francis Schonken 23:39, 8 January 2007 (UTC)

Change article name and scope?

Subsection moved to Misplaced Pages talk:Naming conventions (use English)#Change article name and scope? by Francis Schonken 23:39, 8 January 2007 (UTC)

Æ/æ/Œ/œ - rules proposal

Subsection moved to Misplaced Pages talk:Naming conventions (use English)#Æ/æ/Œ/œ - rules proposal by Francis Schonken 23:39, 8 January 2007 (UTC)

Most accurate form.

Why does this have to be a big deal? Use the most accurate form of a name unless there's an overwhelming reason not to, and use redirects to make sure users that can't easily input diacritics can find the article. Everybody wins. - Stephanie Daugherty (Triona) - Talk - Comment - 06:15, 4 October 2006 (UTC)

  1. Redirects don't just happen.
  2. You seem to be making a false assumption that when versions with diacritics and without diacritics exist, and with varying number of letters with diacritics on them, that the one which is the most cluttered up with diacritics is somehow "most correct". That simply is not true. Gene Nygaard 06:02, 14 October 2006 (UTC)
Furthermore, redirects do not solve most of the problems with hiding information from searches of various kinds (including find on page), and they do not solve problems of category sorting being all messed up.
Redirects also do not solve problems of all the squiggles being eyesores. They are of some help in solving problems such as distinguishing between Ð and Đ, where the difficulty is that our eyes either cannot make the distinction, or that we don't know which is which even if our eyes do see a difference.
To most English speakers, all a bunch of diacritics means is "I guess I'm supposed to pronounce this funny". So unless it is something whose pronunciation is familiar, the squiggles are of no help. In fact, I'll just not even pronounce it to myself, and it gets into my memory as a blur of letters. Then when I come across some different incomprehensible gibberish, which actually might not even remotely resemble the first one other than being about the same length of word, my brain just lumps the new one together with the old one, and pretty soon I don't remember which was which and don't retain any information about either unpronounced word. Gene Nygaard 06:21, 14 October 2006 (UTC)
I responded elsewhere on the dropping of diacritics reflecting a limit of technology, not a preference translating to proper English usage. To Gene's points...
  • redirects--totally agree, if someone is using diacritics in the title, then they are obligated to ensure all the appropriate non-diacriticalized redirects are created
  • find on page, etc.--it's a bit of an inconvenience, but installing keyboard support for the language in question goes a long way; also, since the proper spelling appears in the title, one can just cut and paste that into a "find"--since the article most likely will have the diacriticalized syntax, anyway, it's only the title where they are being dropped, making the search argument invalid in the first place
  • gibberish and confusion--if we all stuck to the proper spelling in the first place it would be a lot less confusing for everyone; the real confusion is titles according to one convention and then articles written in keeping with another convention; that makes no sense; confusion would be minimized if the diacriticalized version were always used since that's the one that is accurate. Everything else (transliterations from Latin script into other Latin script in particular) just reduces comprehensibility.
  • readability--Firefox, for example, allows enlarging the text regardless of what the web page says about size (fixed or not)
  • if pronunciation is an issue, then someone should record a sound-bite so people can hear it, the other alternative is to cite the pronunciation using the correct symbols for that purpose (which, trust me, is a whole lot more confusing for the average person than just some diacritics!)
Pēters J. Vecrumba 21:10, 21 November 2006 (UTC)
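As a rough illustration of the redirect point above, the sketch below derives a diacritic-free variant of a title that could serve as a redirect source. It is only a sketch under stated assumptions: the function names and the `SPECIAL` mapping are hypothetical, not part of any Misplaced Pages tool, and the choice of replacement for each letter is itself a judgment call. Letters such as ł, ß, Đ and þ have no Unicode decomposition, so plain normalization leaves them untouched and they need an explicit table.

```python
import unicodedata

# Hypothetical mapping for letters that Unicode does not decompose into
# "base letter + combining mark"; the chosen replacements are assumptions.
SPECIAL = {
    "ł": "l", "Ł": "L", "ß": "ss", "đ": "d", "Đ": "D",
    "ð": "d", "Ð": "D", "þ": "th", "Þ": "Th",
    "æ": "ae", "Æ": "Ae", "œ": "oe", "Œ": "Oe", "ø": "o", "Ø": "O",
}


def strip_diacritics(title: str) -> str:
    """Return the title with combining marks removed and SPECIAL letters replaced."""
    # NFD splits e.g. "ä" into "a" followed by a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks, keeping the base letters.
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace the letters that normalization cannot handle.
    return "".join(SPECIAL.get(ch, ch) for ch in without_marks)


def redirect_titles(title: str) -> list[str]:
    """Return the plain-letter title to redirect from, or nothing if it is identical."""
    plain = strip_diacritics(title)
    return [plain] if plain != title else []


if __name__ == "__main__":
    for name in ["Väinämöinen", "Stanisław Ulam", "Meißen", "Teemu Selänne"]:
        print(name, "<-", redirect_titles(name))
```

A single stripped form will not cover every spelling readers try (for example "Chakste" for "Čakste"), so a list produced this way would only be a starting point for the redirects an editor still has to create by hand.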
*sigh* You are assuming that the "proper" spelling is the one with diacritics and you are assuming that the removal of diacritics is the improper spelling. You are also assuming that the reason for the removal of diacritics in English is because of technology. Sorry, but the "proper" spelling in English is the most common spelling in English, whether that includes diacritics or not. Whatever the causes, whether it be due to technology or some other cause, the fact remains that in a large number of cases, most, I would say, English drops the diacritics in both type and writ. Misplaced Pages has a policy of following the most common spelling in English--the spelling that is most recognisable to English speakers--and if we were all to follow that, there wouldn't be this discussion. The solution is simple in every case. If the most common spelling in English uses diacritics, then wikipedia should too. If the most common spelling does not include diacritics, then wikipedia shouldn't either. It is a case by case situation and every article should be looked at separately. This is a really simple solution that follows common sense and wikipedia guidelines and policies already in place. Masterhatch 19:49, 22 November 2006 (UTC)
Good points. The claim about "limits of technology" is hogwash, a minor factor in a limited number of cases. The fact of the matter is that it is a quite legitimate and proper option to choose to use the English alphabet when writing in English, and English-language books, newspapers, magazines, television, pamphlets, brochures, and whatever often choose to do so, even though they are quite capable of including diacritics when they choose to do that.
Many people such as Vecrumba misstate the usefulness of diacritics in determining pronunciation. For one thing, all they mean to many English-language users is that this is a signal that we are supposed to pronounce this in some strange way, with no clue as to what that is, and then they are often surprised to learn that they should pronounce it the same way they'd learned to pronounce it when it is written without those diacritics. Furthermore, the same character doesn't necessarily have the same effect on pronunciation in different languages—and even within a single language such as Norwegian, there are quite significant regional variations in pronunciation in words spelled exactly the same way in any of those places. Gene Nygaard 12:36, 5 February 2007 (UTC)
Page names and pronunciation

I'd like to elaborate on the "page names should reflect pronunciation" issue. This argument is used very often. I think it is irrelevant. English speakers, whether American, British, Australian or whatever, can't pronounce "Clijsters" (as in Kim Clijsters). "Clijsters" is both the correct spelling in Dutch and the spelling used in international tennis tournaments. If the spelling were to indicate how it's pronounced, we wouldn't hear something that in Dutch would be written Claaistejs or Cleestes (or whatever) over and over again when the name is pronounced at tennis tournaments in English-speaking countries, or in whatever country where Dutch pronunciation rules are not common knowledge and/or where the sounds needed to pronounce "Clijsters" are not among the usual sounds of the local language. English speakers don't know how to pronounce "Clijsters". English speakers have no reference for the "ij" sound in that name, since it is a sound that doesn't exist in English. The rules about when and how to pronounce an "r" are different in English and Dutch. Live with it. The pronunciation info (if any) goes in the body of the Misplaced Pages article, per Misplaced Pages:Pronunciation. The article name should not be based on pronunciation: Misplaced Pages is a written source, so it is based on how names are usually written in English. If a diacritic is usually used in written English, Misplaced Pages should do the same in page names. If it usually isn't, then Misplaced Pages should also do the same. If it's difficult to establish common usage in written English, pronunciation should not interfere in the decision on how to name the page, since that aspect is really irrelevant.

Maybe we should inscribe the principle of "pronunciation info goes in the body of the article, and doesn't influence the page name" in the Misplaced Pages:Naming conventions policy. What would you think about that? --Francis Schonken 09:21, 14 October 2006 (UTC)

It's less clear for Cyrillic, Greek, Chinese, etc.; that said, such a principle would go a long way to eliminating the confusion and conflicts created by "transliterating" Latin-script names to alternate Latin-script which never succeed more than marginally in replicating the "native" sound (or simply stripping diacritics with no regard to the sound). As Misplaced Pages is a reference, there is no reason not to follow other modern references in using the native Latin script name for article naming. It makes no sense to transliterate, for example, Władysław to "Vladislau", then redirect from Władysław, Wladyslaw, Ladislas, Vladislav, Ladislau, etc. and then (stupidly and needlessly) argue about which bastardization of the native syntax (transliteration) is the most "accurate" or most "popular." (And when there is disagreement over what an appropriate transliteration is, it's naïve to believe that pronunciation won't come into the argument.)
I would agree with Masterhatch more if it weren't already completely accepted practice (including Wiki) to write names in their native syntax in articles. Titles using the native Latin script would eliminate, not cause, confusion--and would conform to, not ignore, current reference titling.
The "most common spelling in English" and "proper English usage" are not synonymous, which is the underlying basis for Masterhatch's (understandably) sighful response. —Pēters J. Vecrumba 18:36, 23 November 2006 (UTC)
And the

something funny

I was walking through Costco yesterday and I happened to notice that there was an authentic autographed Markus Naslund sweater up for sale. It was a little too rich for my blood, but I had a boo anyway. I noticed something kinda funny while admiring it; there were no diacritics in his own personal signature (or anywhere else in the literature for that matter). My point? Many wikipedians who seem determined to ram diacritics down the throats of English speakers often say, "Misplaced Pages should write the names the way the actual people write them." These same wikipedians also claim that dropping the use of diacritics means the name is spelt wrong. So, did Mr. Naslund spell his own name wrong on a $500 sweater? I still maintain that the most common spelling in English be used for article titles. Masterhatch 17:46, 8 November 2006 (UTC)

AIEEE!! Costco as a reputable source! :-) Really, in the end, one can't argue that "Jānis Čakste" (first president of Latvia) is "wrong" because "Chakste" was a popular transliteration at one time. (This is actually a significant problem with Latvian surnames which were still often transliterated according to German orthography for "English consumption" into the second half of the 20th century.) As I've stated, "anglicization" and "English usage" are two completely different animals, and the Wiki preference should favor current English reference usage over historical anglicizations (plural intentional, as there is no consistency/consensus in that arena). —Pēters J. Vecrumba 18:46, 23 November 2006 (UTC)
P.S. I will say for the record now that should I ever be famous enough to merit my own article, I request "Pēters Jānis Vecrumba" as the title, not "Peters J. Vecrumba" (which is how I sign documents—in the U.S.—and is not my "real" name). I know, someone will insist on seeing the "diacriticalized spelling" on my (Brooklyn, New York) "birth certificate," and so it starts... :-) —Pēters J. Vecrumba 19:10, 23 November 2006 (UTC)
It's usually the POV-pushing nationalistic or in other ways chauvinistic editors, or those just plain anti-English or anti-American or whatever, who argue that what you use is irrelevant, as they did in the Arpad Elo discussion and many others, and presumably would argue (but haven't even done so, because they haven't discussed it at all) in hundreds of still-misnamed articles which have had diacritics slapped on with a totally unreferenced and undiscussed move, and that whatever we use should be determined by the "original spelling", which in this case is your self-admitted birth certificate spelling. OTOH, your personal choice and feelings about the matter are not and should not be determinative, either. What matters is most common use in English. As you pointed out, there likely exist many cases which could prove your use of the proper English spelling without diacritics; where there is some evidence of usage with diacritics as well, it then becomes a matter of choosing from among the legitimate alternatives in picking the one to occupy the one slot available for the article's name. Gene Nygaard 12:51, 5 February 2007 (UTC)

Turkmen spelling

It's hard to keep the correct spelling of all those Turkmen names you see in the media since the death of Saparmyrat Nyýazow. All the sources used by journalists are in Russian, so Turkmen names have to be transliterated from the Latin alphabet used by the Turkmen language to the Cyrillic used by Russian, and then retransliterated into the Latin alphabet. You can see the result in the first footnote of the article Gurbanguly Berdimuhammedow. Official Turkmen internet sites are themselves written in Russian, and sometimes translated into English, but they are not written in Turkmen, since the Turkmen people have no right to access the internet. But I think we have succeeded in keeping to the Turkmen spelling. The result can be seen here, for example: Turkmen presidential election, 2007.

There's still one black hole: Saparmurat Niyazov's article itself. It should be renamed to Saparmyrat Nyýazow at once. There is no reason to keep the English transliteration of the Russian transliteration of his name. I'm already hearing the usual argument I hear really too often on Misplaced Pages: "The custom is Saparmurat Niyazov. We can't go against the custom." That's a very weak argument. Errors can be customs. Once a misspelling is made in one source, every media outlet repeats the error again and again, and one day you discover on Google there are 2,290,000 articles containing "niyazov", 721,000 containing "niazov", 33,600 containing "nyazov" and only 20,400 containing "nyýazow"!! Well, those 20,400 are right, and the others are wrong. We can change the custom if we change the article's name.

Another bad reason to keep the "Niyazov" spelling is that "Nyýazow" is too difficult to spell because of the accent on the Y. This is not true. Every name of foreign origin with accents is spelled with its accents on Misplaced Pages. See John C. Frémont or Charlotte Brontë for simple examples. Or even Rudolf Slánský to find one with exactly the same letter. And don't tell me it's not English, I know it is not. Foreign names are by nature not English names. And if you write the name with a misspelling, the redirects are here to conduct you without any effort to the correct spelling. There is absolutely no drawback for the reader in renaming the article.

That's why I propose to rename the article to Saparmyrat Nyýazow.

Švitrigaila 00:21, 30 December 2006 (UTC)

FOR THE RECORD, because of its significance to these discussions: The proposed move failed. (Note also that it wasn't because of my arguments, since I did not participate in that discussion. Should it come up again and I know about it, I will likely join the considerable existing opposition.) Gene Nygaard 13:00, 5 February 2007 (UTC)

One more voice . . .

I must say, I'm a bit stunned by the amount of discussion on this matter. Like most (regardless of which side they are on in this debate) it seems to me to be a matter of common sense. I agree with those who say that the most common English spelling should be used, regardless. And just now, as I was typing this, a thought occurred to me--an argument, if you will.

I think it's safe to say that, in matters of translating names into English from Chinese or Arabic, that there's probably no one who proposes that we should use the "native spelling" of the name, simply because it's impossible (and of course, please don't tell me that "spelling" doesn't exist in Chinese—you get my point). Well, you know, if every language in the world came with its own alphabet or grapheme-system or whatever you want to call it, we wouldn't even be having this argument. We would simply recognize that, this is the English-language Misplaced Pages, and it's silly to try to use the spelling conventions in the English Wik, because they have no meaning here.

Well, I think that's true now, in our current situation as well. "Õ" and "ɮ" and "þ" simply have no meaning in English. Oh, supporters will argue, "But that's the correct spelling", to which I say, "you are correct . . . and irrelevant". If this was the Misplaced Pages Internationale, then I would, by all means, support the use of characters as they are portrayed in native languages. But people, this is the English Misplaced Pages. Yes, there are some "foreign words" in English that use diacritics (though none, I am sure, that use "Þ" or "ß"). But that is because, for whatever reasons, that particular convention has arisen. Who are we to change that? Is Misplaced Pages going to be the fountainhead of a revolution in English spelling? Is that the goal here? You know, though it's not exactly the same thing, in a way, starting a new spelling convention on Misplaced Pages would seem to violate the prohibition on Original Research, n'est-ce pas?

One other thing. Frankly, I think that, from what I have seen on article discussion pages, the majority (though not all) of persons supporting "foreign" spellings are non-native speakers of English. I would not be so rude as to go into the Spanish or French Wikipedias and tell them how to spell their articles, and if I did, I'd expect to be shown the door. You know, one of the problems in reaching consensus on this issue is that the majority of Wikipedians are average, ordinary folks (I think the breadth of articles is quite demonstrative of this). They have little or no interest in these matters of policy. But non-English speakers, almost by definition, probably have a higher level of education—and interest in such esoteric matters as language naming conventions—than the average English speaking Misplaced Pages users (Please do not start telling me about your degrees. Yes, thousands of English Misplaced Pages users are highly educated. My point is merely that, the average Joe and all kinds of English speakers come together here, whereas the average Ivan or Guido is less likely to visit the English Wik; therefore the average educational level of English speakers is probably lower). But if the English speaking masses were aware of these debates, the consensus would be overwhelming, I dare say, easily 90% (at least, of American users) would squash the practice of employing non-English characters.

I just want for us (the English-speaking Wikipedians) to be left alone. To be sure, there are some native English speakers who will still continue to favor using diacritics and non-English ligatures, just as there are a few isolates that favor eliminating private property or establishing Christianity as an official state religion (I purposely selected wackos from opposite ends of the spectrum). You know, to those non-English speakers who mock the backwardness (they speak of "dumbing down" Misplaced Pages by eliminating diacritics) of us Anglophiles, let me point them in the direction of the speakers of German, and the fact that they can't even decide amongst themselves how to make use of something that they consider (on some of their discussion pages) to be essential to an accurate article: the "ß". I mean, the rules for using "ß" are different today than they were 15 years ago. So, if we decide that we are going to use German or Swedish graphemes, but then, in ten years, the Germans or Swedes change their rules, does that mean that we will have to change all the affected English-language articles? I'll tell you what: why don't we just write them in English today and for always.

Whether we realize it or not, though many of us claim to use the "Latin" alphabet, the fact is, the alphabets of the European countries that are based upon Latin have long since diverged, albeit only slightly. The use of "Ł" in Polish and the use of "J" in English and the use of "ß" in German indicates that none of these languages are actually using the original Latin alphabet, and none of them are using the same alphabet. It's only because our different alphabets share a common origin and still look more alike than different that we think that they're the same. But they're not. We should quit fooling ourselves and just spell things written in our respective native languages the way that comes naturally to each of us. Anything else, quite frankly, is little more than an affectation. Unschool 08:14, 31 December 2006 (UTC)

"English spelling" of foreign proper names doesn't exist. See here for my argumentation about the subject. If I clearly understand you, you would prefer to rename, for example Lech Wałęsa into Lekh Vawensa since Ł and Ę are non English letters? But the Polish pronunciation is different too from English pronunciation. So, if we drop the original spelling, why not dropping the original pronunciation too? Why not renaming Lech Wałęsa into That guy with moustache, as it is done usually in sign language? Švitrigaila 12:09, 31 December 2006 (UTC)
Just one more thing about what you say about the average educational level of English speakers. An encyclopedia doesn't have to put itself at the same level as the average reader. It must be understood by the average reader, yes of course. But it mustn't limit itself to what the average reader can read. It must pull the reader upward. When the reader finds an article on Leoš Janáček, for example, he must find the only correct spelling of his name. After that he's free to use it or not, like every correct spelling of any English word. If the reader wants to write this name as Leos Janacek, or even Leon-Yann a Czech, it's his problem. An encyclopedia won't force anyone to use the good spelling, but that's no reason not to give it. When the article Eris (dwarf planet) says that its perihelion is 37.77 AU (5.65 Tm), I find it normal, even if personally I have absolutely no idea what it means. I won't write "Let us alone! Anything else, quite frankly, is little more than an affectation." Knowledge is not relative. If we have the information, we put it in the article. We don't vote about what the reader will think about the information. Švitrigaila 12:28, 31 December 2006 (UTC)
First of all, no, I'm not proposing to change the spelling of Wałęsa to Vawensa. I apologize for being unclear (and, upon review of my comments, I was a bit verbose.) No, no, with Walesa, I propose simply maintaining the current convention. When writing the name of the leader of Libya, we do not write "معمر القذافي", we write Gadaffi or Khadaffi or Qadafi or whatever such transliteration. But no matter the difficulty in putting the name into English, we do not get too bent out of shape worrying about whether or not it is perfect. Why is that? Because we recognize that we can't get it perfect, and that all we have to do is to make sure that we are talking about the correct person. "A rose by any other name smells as sweet", in other words. All that matters is that we all follow the same conventions. And in English, the convention is to spell it "Walesa", despite the fact that it is pronounced something more like "Vuh WEN suh". I read your comments on the subject, as you asked, and wonder if you read mine. You say that our conventional English spellings are "not correct". In this, you are technically correct. Would you deny that spelling the Libyan leader's name "Gadaffi" is also not correct? If you don't agree, then tell me, what is "correct"? If you do agree, then would you have us change the title of the article, and the usages within the article, to "معمر القذافي"? If not, why not? My central point is that it is simply a mistake to think that we can employ foreign spellings of names into the English Misplaced Pages just because they happen to utilize an alphabet which looks like the same one we use, but in fact, is not the same. Most people recognize this with Arabic, Chinese, Cambodian, Korean, or Russian, but they don't with Polish or Spanish or German because their initial instinct, upon viewing these letters, is to believe that we can, somehow, spell things "correctly", because we appear to use the same alphabet. But we do not.
By the way, where does it end? Must we change the spelling of "Rome" to "Roma" and "Cologne" to "Köln"? Do you favor changing all the placenames with Anglicized spellings to their "correct" spellings in their native tongues? I'm quite curious. Unschool 19:31, 31 December 2006 (UTC)
These just gave me a clear glimpse of your attitude towards foreign cultures, "I would not be so rude as to go into the Spanish or French Wikipedias and tell them how to spell their articles, and if I did, I'd expect to be shown the door...I just want for us (the English-speaking Wikipedians) to be left alone." How dare you call us rude for expressing our opinions on Misplaced Pages's proposed guidelines? The English Misplaced Pages IS NOT exclusively for native English speakers, foreigners aren't second rate users here and my opinion is as valuable as yours whether you like it or not.
About your question, yes, I'm in favor of changing spellings of Anglicised terms to their correct native tongues just as I'm in favor of changing spellings of Hispanised terms to their correct native tongues. Using local terms to refer to foreign names creates confusion, entails a loss of information and shows a lack of respect for foreign cultures. Rosa 01:18, 2 February 2007 (UTC)
In various other discussions, you have shown no respect for the English language, or for the culture of places which use it. You do not admit that we have every bit as much right to determine on our own how to spell things in English as anyone else has. In various other discussions, you have clearly demonstrated a truly distorted sense of the role of the Spanish language police, the Royal Spanish Academy, and the scope of its authority, and have claimed that it should have some control over our usage here on the English Misplaced Pages.
For example, our Misplaced Pages article on La Coruna, Spain, is not at that English spelling, nor at Corunna, the other common English spelling. But despite the existence of the Royal Spanish Academy, this Spanish town is also not at the Spanish spelling (which is, in fact, used in the Spanish language version of the town's own website, as well as for the name of its article on Spanish Misplaced Pages) of La Coruña. Rather it is at A Coruña. There is a serious short-circuit in your brain if you can accept the fact that this place can be spelled "A Coruña" in the Galician language and "La Coruña" in the Spanish language, yet for some reason the users of the English language do not have any rights at all, so it would be improper for this place to be called "Corunna" or "La Coruna" in English.
Do users of the Spanish language have a right to spell it "La Coruña"? Sure; every bit as much right, and only as much right, as we have to spell it "La Coruna" in English. Gene Nygaard 03:52, 3 February 2007 (UTC)
First, the Royal Spanish Academy doesn't have any authority akin to those given to a "police" type of force. I would rather compare it to a council of elders. Please, don't lecture me on the role of the Academy, as until last week you had no idea what it was yourself, whereas I've studied its organization, rules and discussions since childhood. Second, please do not give ill-intentioned interpretations of my arguments regarding the naming conventions. Third, it's fine if we have a different point of view in this issue, but it's not fine for you to attack me personally. Please, refrain from using the Argumentum Ad Hominem rather than addressing the inherent strength of the argument itself; that's to say, don't use phrases like "distorted sense", "serious short-circuit in your brain", "Rosa's reasoning (using the term loosely)", or "sheer lunacy" when referring to me or my thought process.
The renaming of proper names according to their native languages is a process, it doesn't happen overnight and as I've said a couple of times, the Spanish language still has a long way to go on this matter. The Academy certainly doesn't run the Spanish Misplaced Pages, nor is it a branch of the Spanish government. Rosa 15:55, 3 February 2007 (UTC)
It isn't a process that anybody mandates us, or the editors of any other publication, to participate in.
It would be rather easy to disprove your claim that "until last week you had no idea what it was yourself".
I'm talking from relevant experiences on Misplaced Pages, from hundreds of arguments about whether we should use the Ukrainian spelling of something or the Russian spelling, whether we should use the Polish spelling or the Lithuanian spelling, whether we should use the Galician spelling or the Spanish spelling, with a very significant number of the participants unwilling to accept either the fact that the English spelling can legitimately differ from both of them, or even that the English spelling can be the same as one of them and different from others. They are quite willing to admit that spellings can and do vary among other languages, to give any pair of other languages the right to determine their own spelling—yet they insist English should not do the same. Go add up the number of participants in some of those discussions who refuse to accept that the English spelling is even an alternative that could be considered, let alone the proper choice under our naming conventions.
And that doesn't even get into the fact that most things don't have any sort of "official" name, that in other cases there is no entity that has ever been granted plenary authority in determining the name of something, and that there are many objects, ideas, people, places, or whatever that have different and distinct "official" names either created by different organizations, or used in different languages, or used in different fields of activity with different professional organizations coming up with conflicting rules, for example. Note that this particular guideline under discussion here is not limited to people, not limited to places, not limited to a combination of the two, but it also includes, for example, things such as the proper place for the Misplaced Pages articles on units of measure such as the ampère or ampere, the ångström or ångstrom or angstrom, and the mètre or metre or meter. Can you guess where they are now? Your odds of guessing right are improved on one of them: not even a redirect or a disambiguation page from one of those original spellings with diacritics, I see! Gene Nygaard 17:24, 3 February 2007 (UTC)

Two issues

I see two distinct issues here. One, which seems to be the topic of lots of heated debate on this page, is the choice between a plain letter and a letter with a diacritic. The other is the use of diacritics in titles which are transliterations from some other writing system.

Plain vs Diacritic

Right now, this is the lesser issue. My personal feeling is that we English-speakers need to get used to the idea that our language is like the Borg: we assimilate just about every single thing we come into contact with, and we add its uniqueness to our own. We eagerly soak up words with spelling that makes little or no sense in our own twisted orthography, such as Czech, cappuccino or bon appétit. Sometimes the new loanword comes with baggage (an accent) such as café. Sometimes the absence of an accent does not hamper intelligibility, as in Guantánamo, and sometimes it creates a problem, as with passé. Maybe the printed page looks more "pretty" without any accents at all, but we're reading English, here, not Latin. On accents, then, my advice is Get used to it.

Printability

This is an embarrassing problem. We have fancy Unicode-enabled browsers, and we have installed at least one Unicode font on our super-duper 21st-century computers. Still, though, some characters refuse to display properly. What characters are they? The ones I keep running into are the diacritic-adorned letters used for transliterating foreign alphabets such as Arabic and the Indic scripts. See Kharosthi, or more specifically, the page it redirects you to. The problem with Arabic is the ArabDIN character-encoding. My browser just doesn't support it. I get stupid little boxes all over the page. Having them in the article text section is bad enough, but the article title ...? noooooooooooooooo. WHEN someone updates the browsers, we can think about the nice little choices between accented characters and their unadorned counterparts. Until then, we REALLY need a solid, citable guideline about putting unprintable characters in article titles.

--Cbdorsett 12:37, 8 January 2007 (UTC)

That's a problem that needs to be addressed with computers, not one that should be allowed to restrict Misplaced Pages. Kharoṣṭhī looks fine to me, no boxes there. Perhaps it's time to update the ol' computer. --LakeHMM 04:25, 17 January 2007 (UTC)

That's fine for you; it's not true on this IE computer, which shows Kharo□□hī (I use □ for the annoying little square box). WP should be widely accessible and informative; that's what it's for. Septentrionalis PMAnderson 01:25, 2 February 2007 (UTC)
Renders fine on my Windows 2000, IE6 and XP IE7 boxes, as well as with Firefox 2. Windows 2000 is essentially dead now and is currently in extended support only. A few months back I confirmed that macrons (yes, a slightly different issue) displayed fine on Windows 98, IE5. That OS is fully unsupported now. I do not have access to that system any more to check this example though. Vista has all of the Uniscribe and foreign-language features enabled by default now. There are many free operating systems as well that often run well on very old systems. In the past, I used to view websites over telnet. Images would not render, but I could hardly expect them to be replaced by ASCII art. Missing glyphs can be solved by installing new fonts. Diacritics are often necessary. Bendono 03:15, 2 February 2007 (UTC)
We should not have to install fonts for all possible glyphs to use the English Misplaced Pages. People who want to use one of the other Wikipedias will expect to have to have support for the characters in its language; that should not be a prerequisite to the use of the English Misplaced Pages.
Furthermore, the other characters used in the other languages most closely related to English have been supported by all browsers, all computers set up for use in English, since back in the DOS days and even before the Internet existed. It is, of course, quite natural and to be expected that English is more likely to borrow from, and more likely to use without modification, the various characters used in the languages which are most closely related to it, in the same family of languages. Gene Nygaard 03:34, 3 February 2007 (UTC)
My Windows 2000 and Windows XP systems were both rebuilt within the last month. To the best of my knowledge, I have not installed any special fonts. Everything just displays fine, so I have not had the need to. When I tested Windows 98 for macrons (only) a few months ago, it was most definitely the default install. I do not know how some people are configuring their systems to exclude some of these fonts. However, there are many freely available to fix that. Bendono 03:52, 3 February 2007 (UTC)
But not necessarily the font face those people like to use. There is a whole lot more to the aesthetics of choosing a font than the ability to display some obscure character that the users would never use themselves. Gene Nygaard 16:57, 3 February 2007 (UTC)
Not sure about Linux, but on Windows systems that is irrelevant. Even if the currently selected font does not contain the required glyph to display a character, font linking--a feature of MLang--will automatically search other fonts for an appropriate glyph. Square boxes mean that the glyph, including from other fonts via font linking, cannot be found. Bendono 17:27, 3 February 2007 (UTC)
And all that notwithstanding, over 99.99% of Misplaced Pages users will still get square boxes in some Misplaced Pages articles, and my rough guess is that a majority will get square boxes (or is it some of that %hex number gibberish instead?) in some article names (not counting redirects).
I see, however, that for some of them which show up as boxes on my computer when I look at the Special:All pages page, they show up as glyphs when I follow the link to the pages. So I dispute your claims about this font substitution always taking place. It clearly does not, as I have seen with my own eyes (I'm using Windows). Gene Nygaard 19:06, 3 February 2007 (UTC)

My Opinion: Be Specific

I say be as specific as possible. Somebody else criticised the idea that names with more diacritics would be more correct, but, unless they're being added for no reason, they usually are. For example, with pinyin, it specifies which word exactly, as the tone is there. Leaving the tone marks out is like leaving letters out of English words. For example, Norwegian doesn't use the letter c, but they don't call their article on Fidel Castro "Fidel astro", and they don't replace the letter with another one that looks similar, like making it "Fidel Oastro". They put it in there because it's missing a vital part of the word if you leave it out or change it. Same with other diacritics and "non-standard letters", that make letters make different sounds. Examples are Lech Wałęsa and Anders Jonas Ångström. When mentioned in English media, they usually leave the diacritics out, but we leave them in because that's how the names are spelled.

I know this has been discussed many times elsewhere in the page, but I'm just adding my thoughts. Don't feel like you need to repeat things you've said elsewhere on this page or on Misplaced Pages. I've heard the other arguments, but I think it's best to be specific, accurate, and proper, and redirect the less accurate pages to the ones with diacritics. --LakeHMM 04:23, 17 January 2007 (UTC)

One difference is that the Norwegian alphabet does still officially include C, Q, W, X, and Z among its 29 letters, even though they are no longer used in native words that are not proper names (some personal names, and a few place names, and other proper names do still include some of them). Gene Nygaard 03:26, 3 February 2007 (UTC)
And one reason for that, of course, is that they were used in the past in Norwegian words.
Note also that neither of the two Norwegian Wikipedias has any squiggles on their articles about România: They are at no:Romania and nn:Romania, just as our English one is at Romania (the earlier link is a redirect), even though the Romanian Misplaced Pages article is at ro:România. Of course, the Norwegian Wikipedias don't have an article at "Norway" either; but you may not realize that they also do not have it under the same name in the two Wikipedias, but rather each has it under separate spellings. Gene Nygaard 07:20, 3 February 2007 (UTC)

The United States of America's Misplaced Pages

I'd like to address a topic that comes time and again in this discussion, some native English speakers referring to themselves as "us" versus non-native English speakers as "them". The thing is, this is the "English Misplaced Pages", not the "United States of America Misplaced Pages" nor the "(British) Commonwealth of Nations Misplaced Pages" and certainly not the "Native English Misplaced Pages". As far as I know, the English Misplaced Pages doesn't distinguish non-native English speakers from native English speakers when it comes to editing and discussing here, all of us are foreign to each other but we are all editors period. That means that users such as myself aren't foreigners sticking our noses where we aren't invited but that we are editors to the English Misplaced Pages discussing the issues which pertain to the English Misplaced Pages.


In this discussion there have been references made to "foreign" editors coming to the English Misplaced Pages via any of its sister Wikipedias. This isn't accurate as the English Misplaced Pages is by far the most popular of the Wikipedias and many of the non-native English speakers, like myself, knew of Misplaced Pages through its English version and we aren't even active editors at our native language's Misplaced Pages. This doesn't show a lack of allegiance towards our native culture as Misplaced Pages isn't a bastion for any specific culture and no one owes any kind of allegiance towards any of the Wikipedias. Just as the English Misplaced Pages isn't of the native English users, neither the French, Deutsch, Spanish, etc. Wikipedias are of or for the native speakers of those languages exclusively. For historical reasons the English language is spoken by millions of people other than British, Americans and Aussies so maybe that's why this discussion concerning foreign names arises more frequently here than in other Wikipedias where the population is more homogeneous.


Finally, those of you who want the native English speakers to be "left alone" to decide on Misplaced Pages's policies, guidelines etc. fail to measure the consequences of what you're asking for. The reason why I and many non-native English speakers became editors to this Misplaced Pages is because this one totally rocks when compared to the other ones, it has thousands more articles and the topics vary immensely... why? It's not because the native English speakers are more prolific writers than those of other languages, it's in large part due to the fact that thousands of editors from Asia, Latin America, the Middle East, Africa and the non-English speaking European countries contribute to this Misplaced Pages. Look at today's featured article for instance, a biography on Sir Syed Ahmed Khan Bahadurسید احمد خان بہا در , do you think the main contributors to that article were John Johnson and Jack Jackson? Rosa 23:20, 3 February 2007 (UTC)

So, what do you have against John Johnson and Jack Jackson? Is there some particular reason why you think they are especially stupid and incompetent, and incapable of contributing to this article? Besides, who knows. In fact, it may well have been them; a great many of the contributors to that article are completely anonymous IP addresses; most of the others just have the relative anonymity of their Misplaced Pages user names, plus whatever you might want to believe about whatever information they might choose to present about their identities on their talk pages. Gene Nygaard 01:26, 4 February 2007 (UTC)
Whoa, whoa, Gene, c'mon. I am totally on your side in this debate, but if you thought that her point was that Americans or English speakers can't write or are ignorant, then you need to cool off a moment. All I think she was saying is that people in foreign lands are a lot more likely to write knowledgeably about their native countries than the majority of people who haven't been there. I write on articles dealing with places I've lived, I think it goes for all of us. Unschool 02:39, 4 February 2007 (UTC)
Having said that, I was much grieved today to have written a lengthy response to Rosa's comments, only to have my browser freeze, losing my comments. I really must learn to write my stuff with Word before posting. I'll post an abbreviated version later. Unschool 02:41, 4 February 2007 (UTC)
Rosa wrote a nice little essay here; a little weak on some of its factual basis, but expressing some good sentiments most of us could agree with. Then she threw it all out the window with a bit of improper stereotyping on her own part. Gene Nygaard 04:02, 4 February 2007 (UTC)
wow...this is as close as you've ever come to giving me a compliment on any of the issues we have discussed these couple of weeks Gene...lol...and Unschool gave that sentence its proper interpretation, sans the acrid twist you season your comments to me with. I have nothing against John or Jack; they could be Dr. Johnson or His Honorable Judge Jackson for all I know. All I meant was that it's more likely that someone like Atif nazir created this article on Sir Syed Ahmed Khan Bahadurسید احمد خان بہا در; just as it's more likely for a Jack Jackson to write an excellent article on George Patton or Benjamin Franklin or for a Juan García to write the bio of Simón Bolívar (or for a Rosa Martínez to write an article on Armando Manzanero for that matter lol).Rosa 20:14, 5 February 2007 (UTC)

Redirects don't just happen! Part I, the easy test

Several editors have expressed sentiments along these lines. But what is this, anyway? Some ivory tower speculation by people who have never actually gone out and even looked at what we have in Misplaced Pages?

"Redirects make the issue of difficulty in visiting or linking to the article immaterial" Deco, 7 Feb 2006
"Articles with Czech diacritics are readable in English, you only need a redirect becouse of problems with typing." --Jan Smolik 7 Feb 2006
"Czech names: almost all names with diacritics use it also in the title (and all of them have redirect)." Pavel Vozenilek, 8 Feb 2006
"I hope that we have finally reached the agreement that linking through redirects is not a relevant issue," Zocky, 28 Jun 2006
"This is particularly the case in Misplaced Pages as we have redirects which allow us to use correct characters in titles without inconveniencing our visitors." Oldak Quill, 28 Jun 2006
"A person "won't know how to type the ł's"—a moot point because redirects handle all the variants." Vecrumba, 19 November 2006 (UTC)

Let's just assume you are somebody who had just read something about the following in an English-language newspaper or magazine. So you put one of these into the box on the Misplaced Pages page and hit "Go", ending up at the page creation notice.

But unlike the typical user, who just assumes that Misplaced Pages doesn't have anything about it, you are "smarter than the average bear!". You realize that it might be there, but under a different spelling. Maybe you've heard me when I have pointed out that redirects don't just happen!

So you use all the tools in your little bag of tricks. Then you create the missing redirects, so that those who follow after you don't have to go through the same thing. Some of the following are missing redirects which include diacritics (usually in conjunction with missing redirects without diacritics), and there are even a few that don't deal with diacritics at all.

  1. Josip Kras
  2. Basil Stoica
  3. Florent Prevost
  4. George Mihaita
  5. Paco Leon
  6. Nurgul Yesilcay
  7. Dezso Foldes
  8. Moment of force
  9. Ciftelia
  10. Vasterleden
  11. Tresovice
  12. Ante Radonic
  13. Francisco Ibanez Talavera
  14. Francisco Ibanez
  15. Stephane Matteuzzi
  16. Karen Oppegard
  17. paski sir
  18. Sutlu Nuriye
  19. Duved
  20. Daniel Tchur
  21. Konrad Fialkowski
  22. Entremes
  23. Paidi O Se
  24. Paidi Ó Sé
  25. Paidi O'Shea
  26. Erwin Proll
  27. Erwin Proell
  28. Miroslav Rozic
  29. Robert Strak
  30. Hannu Juhani Nurmio
  31. Hannu Nurmio
  32. Visa Makinen
  33. Bermejo Pass
  34. Julieta Colas
  35. Julieta Colás
  36. Julieta Colás Márquez
  37. Julieta Colas Marquez
  38. Jindrich Backovsky
  39. Edvaldo Valerio
  40. Chatenay, Ain
  41. Chatillon-en-Michaille
  42. Chatillon-la-Palud
  43. Marian Tomasz Golinski
  44. Glenn Caron
  45. Sater Municipality
  46. Torshalla
  47. Gunnar Lindstrom
  48. Manuel Menendez
  49. Ricardo Pio Perez Godoy
  50. Jean-Pierre Clement
  51. Ruben Pellanda
  52. Umut Guzelses
  53. Osman Kursat Duman
  54. Janusz Kolodziej
  55. Jacek Koscielniak
  56. Miroslaw Kozlakiewicz
  57. Gordan Kozulj
  58. Ahmet Koc
  59. Grzegorz Kolacz
  60. Lech Kolakowski
  61. Robert Kolakowski
  62. Viktor Kovacs
  63. Thomas Floegel
  64. Aoua Keita
  65. Luis Marin Munoz
  66. Yvon Cote
  67. Yvon Coté
  68. Edin Curic
  69. Gustaf Soderstrom
  70. Besiktas Cola Turka

It will be interesting to see how long these might sit here without turning blue instead of red. All of the above are ones where I have already called it missing redirects to the attention of editors in my edit summaries, but nonetheless they still haven't been fixed. So you can always find the article to which any of them belongs by looking through my contributions list for the past two months. Gene Nygaard 13:24, 5 February 2007 (UTC)

The problem is that the "Go" button is much less user-friendly than it should be. Search engines like Google or Yahoo handle special characters automatically, are case insensitive and even make suggestions for misspelled words. The Go button does nothing of that, but many users seem to expect it to do so and therefore forget to create necessary redirects when writing articles. -- memset 18:08, 5 February 2007 (UTC)
There are a great number of ways in which characters with diacritics give different results from those without in Google and Yahoo, and even more in most other search engines.
And I, for one, would not want the "Go" button to work any differently than it does now. The "Search" button, of course (and still defaulting to a search if the Go button doesn't find what you put in) is a different story, but I'd bet you about 100:1 that what you describe never crosses the minds of most of the people who fail to create redirects. Gene Nygaard 22:16, 5 February 2007 (UTC)
Here is a specific example of how diacritics do affect searches, memset. For example: Google <Ngobe-Bugle site:en.wikipedia.org> finds the article at Ngöbe Buglé redirected to from Ngobe Bugle (note, no hyphens in either) , but it does not find the separate article about a different subject at Ngöbe-Buglé (which has been on Misplaced Pages for 11 months, so Google certainly must have indexed it by now). It might have, if the two articles had been properly crosslinked and both redirects existed--but of course, they are not and do not.
Note also that Googling for <Ngöbe-Buglé site:en.wikipedia.org> (differing from the above search only in the inclusion of the diacritics) does, of course, find both articles, with and without the hyphen.
That is just one of the ways diacritics affect search engine results. Another harder to determine factor is the effect on the weighting of the results; it often makes some difference on how high up in the search results a particular hit will appear.
Furthermore, a search on Google including a word with diacritics, and excluding the same without diacritics (or vice versa) is almost never empty, as it would be if they were treated as identical for its search purposes. And the two directions give markedly different results.
Now put the same Ngobe-Bugle into the "Go" box on the Misplaced Pages page. It doesn't go to an article, but switches to a search. That search does find both Ngöbe-Buglé and Ngöbe Buglé, unlike the Google search which only finds the one without the hyphen.
Not what you expected, is it, Memset? So before you continue bad-mouthing the Misplaced Pages search engine, keep in mind that all the different search engines have their own little advantages and disadvantages; there is no magic bullet that makes everything work best for everyone, every time. Gene Nygaard 17:05, 7 February 2007 (UTC)
But search engines don't just stick to the exact search string, they also search for different diacritic variants: google searches for Władysław, Dvořák, "Okopy Świętej Trójcy", or Düsseldorf give roughly the same results with and without diacritics. That the results aren't exactly the same and are differently weighted is intended and depends on the user's location and interface language, according to this. It looks like this doesn't work well for Ngöbe-Buglé.
Misplaced Pages's full-text search isn't doing much better either, just compare its results for México and Mexico. It has just some advantages over Google because it sees all redirects (the redirect Ngobe Bugle isn't in Google's cache for some reason) and the wikicode of all pages (in Ngöbe-Buglé, "Ngobe-Bugle" appears only as a category sort key that is not part of the generated HTML page that Google sees).
I'm not badmouthing anything, I'm just saying that the "redirects don't just happen"-problem would not exist with an improved "go" search. -- memset 00:31, 8 February 2007 (UTC)
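To make the "improved Go" idea concrete, here is a minimal Python sketch of what such a lookup might do. It is an illustration only, not how the MediaWiki "Go" button works: the function names (fold, build_index, go_lookup) and the sample titles are assumptions made for this example. An exact title match still wins; otherwise the query is compared against titles with case and combining marks folded away. Letters such as ł that do not decompose under Unicode normalization are not handled here.

```python
import unicodedata
from collections import defaultdict


def fold(title):
    """Return a case-insensitive, accent-insensitive key for a title.

    Only marks that Unicode can decompose (é, ö, š ...) are removed;
    letters such as ł or ø would need an extra mapping table.
    """
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def build_index(titles):
    """Map each folded key to every real title that folds to it."""
    index = defaultdict(list)
    for title in titles:
        index[fold(title)].append(title)
    return index


def go_lookup(query, titles, index):
    """Hypothetical 'Go': exact match first, then the folded match(es)."""
    if query in titles:
        return [query]
    return list(index.get(fold(query), []))


# Sample titles taken from the discussion above.
TITLES = ["Ngöbe-Buglé", "Ngöbe Buglé", "México", "Mexico"]
INDEX = build_index(TITLES)
```

When several titles share one folded key (as with "México" and "Mexico"), the lookup returns all of them, which would have to be shown as a choice rather than a silent jump; that ambiguity is the part a real implementation would need to settle.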
Gene, the issue with redirects works both ways. If you create an article without diacritics, then the diacritic versions will redlink without proper redirects. No matter what side of the issue you are on, both sides need redirects.
As you have said, you have known about the above redlinks for at least two months. In all of that time you continue to complain but why have you not created redirects? In any case, they were all fixed within hours of your posting. Also, a significant number of your links have nothing to do with diacritics. Bendono 23:45, 5 February 2007 (UTC)
You are to be congratulated for showing that there is indeed one Wikipedian who will fix these problems when they are called to his attention. I don't know if you had any help with some of these, but I think (at least hope) that others would also have done so if you hadn't beaten them to it.
However, it is not exactly a two-way street. When we don't have the English alphabet spelling in the English Misplaced Pages, that is indeed something that is missing. But the converse is not true; foreign spellings may be a nice-to-have alternative, but they aren't a required element that is missing.
There is an even more significant reason why it is not a two-way street, however. In most cases, each article name will map into one specific English-alphabet version, some into two alternatives. Since multiple occurrences of the same character will normally all be mapped the same way, almost all article names will map into no more than four alternatives.
So if somebody tries to use a version with diacritics and it doesn't work, they know what they need to try next.
But the same is not true for going in the other direction. Each letter of the English alphabet can be mapped into by many different possible versions with diacritics. For some article names without any diacritics, there may be hundreds or thousands, even millions, of possible ways that it could be written with diacritics.
So if somebody tries the version without diacritics and it doesn't work, the situation is entirely different. They usually do not know what they need to try next. Or, in other cases, they maybe do know what they want to try next; problem is that they don't know how to make the squiggly character they want to make.
Note that if you are just reading an article and want to put something into the "Go" box, you don't even have that somewhat helpful little edit-screen crutch of having a half-page of squiggly letters you can wade through looking for the one you want to use, after you have gone to the additional trouble of increasing your font size to get them big enough so that you can tell one little squiggle from another one that looks a lot like it. (In a few cases even that doesn't help, of course, and they remain indistinguishable no matter how large you make them.) Gene Nygaard 04:57, 6 February 2007 (UTC)
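The asymmetry described in this comment can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the ALTERNATIVES table is a small hand-picked sample of letters with two common English-alphabet renderings, and the per-letter counts passed to count_reverse_candidates are made-up numbers, not a survey of real orthographies. Going from a title with diacritics to plain letters yields a handful of variants, because each distinct character is mapped one way throughout; going the other way, the candidates multiply with every letter.

```python
import unicodedata
from itertools import product

# Illustrative sample only: letters that commonly have two plain renderings.
ALTERNATIVES = {
    "ö": ("o", "oe"),
    "ü": ("u", "ue"),
    "ä": ("a", "ae"),
    "å": ("a", "aa"),
    "ø": ("o", "oe"),
}


def ascii_variants(title):
    """Plain-letter variants of a title, mapping each character consistently."""
    special = sorted({ch for ch in title if ch in ALTERNATIVES})
    variants = set()
    for choice in product(*(ALTERNATIVES[ch] for ch in special)):
        mapping = dict(zip(special, choice))
        text = "".join(mapping.get(ch, ch) for ch in title)
        # Remaining accents (é, ñ ...) have a single rendering: drop the mark.
        decomposed = unicodedata.normalize("NFKD", text)
        variants.add("".join(c for c in decomposed if not unicodedata.combining(c)))
    return variants


def count_reverse_candidates(plain, accented_forms_per_letter):
    """Count spellings a plain title could stand for, given a hypothetical
    number of accented forms for each plain letter."""
    total = 1
    for ch in plain.lower():
        total *= 1 + accented_forms_per_letter.get(ch, 0)
    return total
```

For example, ascii_variants("Erwin Pröll") gives the two forms "Erwin Proll" and "Erwin Proell" from the list above, while even modest per-letter counts make count_reverse_candidates grow multiplicatively with the length of the name.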

Part 1b of the test

But now let's get into exactly what you have accomplished, or not accomplished, in jumping right in and fixing those redlinks.

First of all, it was mostly invisible work.

Even on this talk page, where people can see that the links are no longer red, they don't see who fixed them. But that isn't the real problem.

More significantly, this is invisible because absolutely nothing about it shows up on the watchlists of anybody following these articles. The people previously involved with the article will not have any way of knowing that the problem has been fixed, let alone that a problem even existed in the first place. There is nothing on the talk page to tell them about the existence of the problem. In this case there is only one thing that can make them aware of the existence of the problem, and that is something that most articles where this problem has been fixed do not have: my earlier edit summaries pointing out the fact that redirects were indeed missing.

Nobody is going to stop creating these problem articles without redirects, just because you jumped in and fixed those. They are going to remain blissful in their ignorance. The ones who remain active on Misplaced Pages will keep creating the same problems over and over again, as will a new crop of editors who don't ever stumble across anything telling them of the need to fix this problem.

Unfortunately, another problem is that far too many of the creators of these problems appear to have been on kamikaze missions, and have died or otherwise disappeared since the blitzkrieg attack in which they created a whole bunch of unreferenced stubs, often with many other problems in addition to these missing redirects and the fact that they are not properly sorted in categories.

For example, we can see what happens when we go to look at the other entries in the same categories (ignoring, for now, stub categories and birth/death/dead/living categories) as the ones you fixed above. We still find missing redirects to other articles in those same categories, some likely created by the same individuals who created the articles above without redirects, who remain ignorant of the problem and will likely continue doing so in the future. For example, in categories corresponding to the entries of the numbers above:

1. Otokar Kersovani
2. Sandor Kanyadi
3. Noel-Antoine Pluche
4. Stefan Banica, Sr.
5. Eduardo Gomez
6. Ugur Yucel
7. Imre Konig
8. not dealing with diacritics
9. Scoil na gClairseach
10. Noykkio
11. Vyseherad
12. Matija Divkovic
13-14. Miguel Angel Martin is a bluelink. But it is a different article, about a golfer b. 1962, from the one at Miguel Ángel Martín (b. 1960). Neither of these two articles have a disambiguation link to the other.
13-14. Jose Escobar Saliente
15. the only other one which didn't involve diacritics
16. . . . Enough! You should have the idea by now.

My calling attention to these problems in edit summaries wasn't a resounding success, but it did result in a significant number of redirects being added by others before I listed the ones that hadn't been fixed here. So if you have any bright ideas on other ways to call the existence of this problem to the attention of those editors largely responsible for creating it, and to get them to stop doing so in the future, have at it and report back to us on what you tried and on how successful it was. Gene Nygaard 05:26, 6 February 2007 (UTC)

Unless someone else takes care of them first, I'll fix these tonight. I'm at work now, so it will not be as fast as last time. Gene, why not help out a little? You know what needs to be fixed and you continue to complain about it. You seem to care about Misplaced Pages, and yet these tests seem so counterproductive. Bendono 05:41, 6 February 2007 (UTC)

By the way, it wouldn't be that difficult to create all those missing redirects automatically using a bot. Just go through pages with non-ASCII titles, check if there is a redirect (or another article that contains a link to the current article, e.g. a disambiguation) at the "non-squiggly" version of the title, and create the redirect if it is missing. If there already is a different article (like that Miguel Ángel Martín example), add the article to some list so it can be fixed manually. -- memset 10:46, 6 February 2007 (UTC)
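The procedure outlined in this comment can be sketched in a few lines of Python. This is a dry-run planner under stated assumptions, not a working bot: the pages dictionary (title mapped to a redirect target, or to None for a real article) is a hypothetical stand-in for the wiki database, the EXTRA_FOLDS table is illustrative and far from complete, and the "another article that links to the current article" check mentioned in the comment is not modeled. Nothing is written anywhere; the function only reports which redirects it would create and which title clashes need a human.

```python
import unicodedata

# Letters that do not decompose under Unicode normalization need an explicit
# plain-letter fallback. Illustrative table, not exhaustive.
EXTRA_FOLDS = str.maketrans({
    "ł": "l", "Ł": "L", "đ": "d", "Đ": "D",
    "ø": "o", "Ø": "O", "ı": "i", "ß": "ss",
})


def strip_diacritics(title):
    """Return the 'non-squiggly' version of a title."""
    decomposed = unicodedata.normalize("NFKD", title.translate(EXTRA_FOLDS))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def plan_redirects(pages):
    """pages maps each title to its redirect target, or None for an article.

    Returns (to_create, to_review) as lists of (plain_title, article_title).
    """
    to_create, to_review = [], []
    for title, target in pages.items():
        if target is not None or title.isascii():
            continue  # only real articles with non-ASCII titles are checked
        plain = strip_diacritics(title)
        if plain == title:
            continue  # nothing could be folded (e.g. a non-Latin script)
        if plain not in pages:
            to_create.append((plain, title))
        elif pages[plain] != title:
            # A different article or redirect already sits at the plain title.
            to_review.append((plain, title))
    return to_create, to_review
```

Run against the examples from this thread, the Miguel Ángel Martín pair would land in the review list, since a different article already occupies the plain title, while a title with no plain counterpart would be queued for a new redirect.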

Done. Please double check Vyseherad. Vyšehrad is the real article. And Vysehrad already exists. If you think that Vyseherad should exist too, please add it yourself. Now enough with the games. I try to make Misplaced Pages a better place, but it is not a one-person job. Next time you notice some needed redirects, please add them. Bendono 13:26, 6 February 2007 (UTC)
I suppose I could claim that was part of the test. But it was my typo that caused me to think it was a redlink. So substitute the redlink at Velka Chuchle for another article in the same category.
I have, of course, created many redirects myself. But as I have pointed out, that doesn't solve the problem. I'll still do it myself on occasion. But I'm now more likely to try to call it to the attention of those who care if the article can be found or not, the ones who should be most interested in creating these redirects. Most of the time it is something that can remain hidden away for all I care.
Of course, those articles with diacritics which do not have these redirect/disambiguation page/disambiguation line links also have a much higher probability of being missorted in their categories than the ones which do have redirects. So as you are out adding redirects, could you check for proper sort keys as well? Gene Nygaard 15:06, 6 February 2007 (UTC)

Bot proposal

After some of my personal experiences with the de-diacriticized User:Piotrus/List of Poles, I am strongly in favour of a bot that would automatically create a de-diacriticized redirect for any article using diacritics in its name. What do you think about requesting such a bot?-- Piotr Konieczny aka Prokonsul Piotrus | talk  03:10, 11 February 2007 (UTC)

A good idea. The only real problem I can see is that it might make compromise solutions on diacriticless forms harder; for example, it would have created Wladyslaw II Jagiello, and Jogaila could then not have been moved to that title.
Three specifications:
Redirects with no edit history (i.e. just created and never edited) can be deleted during moves by non-admins (I think...). WP:RM has an easy way to deal with non-controversial moves requiring a deletion, so I don't think many problems would be generated by this.-- Piotr Konieczny aka Prokonsul Piotrus | talk  07:45, 11 February 2007 (UTC)
About the stop list/Wiktionary lookup: I think a better way to prevent red links from being turned into wrong blue links is to have the bot check the what-links-here list before creating the redirect. If there are any red links, they should be checked by the operator before creating the redirect. -- memset 09:18, 11 February 2007 (UTC)
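The what-links-here check could look something like this. It is a sketch under stated assumptions: the `links` dictionary (page mapped to the set of titles it links to) stands in for whatever the wiki actually reports, and the function names and the sample pages are hypothetical.

```python
def incoming_links(links, title):
    """Pages whose link set contains `title` (a stand-in for what-links-here)."""
    return sorted(src for src, targets in links.items() if title in targets)

def triage(candidates, links):
    """Split candidate redirects into those safe to create and those to review.

    candidates: list of (plain_title, article_title) for redirects that do not
    exist yet, so any incoming link to plain_title is currently a red link.
    """
    safe, review = [], {}
    for plain, target in candidates:
        sources = incoming_links(links, plain)
        if sources:
            review[plain] = sources      # operator checks these first
        else:
            safe.append((plain, target))
    return safe, review

links = {
    "List of Prague districts": {"Velka Chuchle", "Vyšehrad"},
    "Some unrelated page": {"Vyšehrad"},
}
candidates = [("Velka Chuchle", "Velká Chuchle"), ("Agri", "Ağrı")]
safe, review = triage(candidates, links)
```

The point of holding back anything with incoming red links is that those links may have been meant for a different subject, which only a human can judge.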

Turkish dotted and dotless I

I've removed the following:

Turkish distinguishes between dotted and dotless I:

  • dotted: İ/i
  • dotless: I/ı

Where different from the I/i normally used in English, the specific Turkish characters can be used subject to what is explained in this guideline (although this is no "diacritic" issue strictly speaking), leading to, for example:

  • Istanbul (Turkish: İstanbul) - For this city the version of the name starting with dotted capital İ is quite uncommon in English.
  • Diyarbakır (with Diyarbakir as a redirect page) - in this case the version of the name with dotless lower case ı in the last syllable is fairly widespread in English too.

The reason is that, out of all articles about Turkish place names, Istanbul is the only one I can think of that doesn't use the correct spelling. If you see İzmir, Muğla, Eskişehir, Ağrı, Gümüşhane, Karadeniz Ereğli, etc. you will notice that they all use the diacritics. The way it read made it seem like some Turkish articles use the English spelling and some don't, but this is not the case. Khoikhoi 02:53, 6 February 2007 (UTC)
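For anyone matching titles by software, the dotted/dotless pairing above behaves differently from ordinary accents. A small sketch in Python, using only the standard library; the two helper functions are hypothetical names for the Turkish pairing İ/i and I/ı, not part of any library.

```python
import unicodedata

DOTTED_CAPITAL = "\u0130"   # İ
DOTLESS_SMALL = "\u0131"    # ı

# The dotted capital decomposes into I plus a combining dot, so generic
# accent stripping handles it; the dotless small letter has no decomposition
# and needs an explicit mapping.
dotted_decomposition = unicodedata.decomposition(DOTTED_CAPITAL)
dotless_decomposition = unicodedata.decomposition(DOTLESS_SMALL)

def turkish_lower(text):
    """Lowercase using the Turkish pairing (I -> ı, İ -> i)."""
    return text.replace("I", DOTLESS_SMALL).replace(DOTTED_CAPITAL, "i").lower()

def turkish_upper(text):
    """Uppercase using the Turkish pairing (i -> İ, ı -> I)."""
    return text.replace("i", DOTTED_CAPITAL).replace(DOTLESS_SMALL, "I").upper()
```

Python's default `lower()` and `upper()` follow the English pairing instead, which is one reason a case-insensitive lookup can miss a Turkish title unless a redirect exists.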

The only change that really needs to be made is to remove the nonsense quibbling about what a "diacritic" is. For the purposes of this convention, it is no different from any other diacritic. It causes the very same problems of inaccessibility of information as any of the others. I reverted your move pending the outcome of this discussion.
Furthermore, just because some articles violate the existing conventions is not a reason to grant them exemption from those conventions. Gene Nygaard 04:53, 6 February 2007 (UTC)
What inaccessibility? And what do you mean by "some articles violate the existing conventions"? Are you suggesting that we should rename the articles to "Izmir", "Mugla", "Eskisehir", "Agri", "Gumushane", and "Karadeniz Eregli"? Are there certain conventions that say we should? Khoikhoi 04:57, 6 February 2007 (UTC)
I agree to respect the Turkish spelling of the Turkish names. But why only Turkish and not Azeri? Why are İsmet İnönü's dots respected and not İlham Əliyev's (Ilham Aliyev)? Švitrigaila 16:51, 6 February 2007 (UTC)
Probably the person who created the article just spelled it without the diacritic in the first place. The community is divided right now between those of us who want native spellings to be used in articles and those who want to use the Anglicised spellings. In fact, on this very page the second section, titled "Using diacritics (or national alphabet) in the name of the article", tells of a discussion on this subject which took place about one year ago at the village pump. It's not clear to me what exactly the outcome of that discussion was, but if the trends still hold true then the answer probably is that the community couldn't reach consensus. Rosa 17:30, 6 February 2007 (UTC)
Not exactly. There can be a consensus of what can't be done, and no consensus of what must be. There was a long discussion about Azeri names. The Ilham Aliyev article was normally titled İlham Əliyev until several months ago, and there was a discussion about the place of the ə in Azeri names. A clear majority decided that the ə must be excluded... and decided on nothing to replace it. And you know that if it is decided to forbid the only correct spelling without deciding (or even discussing) which one of the wrong spellings must be used instead, it all ends in a total mess. Article titles with Azeri names have constantly and chaotically changed since this "decision". That's why I'm clearly in favor of keeping the Turkish spelling of the Turkish names... and the Azeri spelling of the Azeri names. Švitrigaila 00:00, 7 February 2007 (UTC)
Just remember that it is not a misspelling to use the English alphabet when writing in English, and you shouldn't have a whole lot of problems. The point I'm making is that if you feel some substitution would be a misspelling in a foreign language, so what. You can put the article under the English spelling, and redirect to it from various other possibilities. Gene Nygaard 00:33, 7 February 2007 (UTC)
To be more specific, in the particular case under discussion, this has been done by having it at Ilham Aliyev, which is not a misspelling, and with redirects from both the İlham Əliyev spelling you used and the İlham Aliyev you claim to be a misspelling. Gene Nygaard 01:01, 7 February 2007 (UTC)
Exactly Khoikhoi...this project tries to specify Misplaced Pages's naming conventions on this very issue. The proposal is that article titles should be spelled as the names are routinely used in English sources, and when in doubt of what the English "common usage" (with or without diacritics) is, then diacritics should be avoided. So, one of the consequences of implementing this project would be that İzmir, Muğla, Eskişehir, Ağrı, Gümüşhane, Karadeniz Ereğli will have to be renamed "Izmir", "Mugla", "Eskisehir", "Agri", "Gumushane", and "Karadeniz Eregli" unless you come up with reliable English sources that use the names with diacritics. Rosa 17:30, 6 February 2007 (UTC)
For several of these names, such sources probably exist, however. (And the Turkish wikipedia should, as it does, follow Turkish usage; even tr:Londra.) Septentrionalis PMAnderson 17:53, 6 February 2007 (UTC)
Khoikhoi, Turkish names get the same consideration as anybody else's. No more. Our primary consideration remains best known in English.
What do I mean by accessibility? Are you totally oblivious to the world around you? What do you suppose has constituted a large portion of the discussions on this page? I'm talking about missing redirects, the associated redlinks and inability to find things with the "Go" button. I'm talking about articles that can't be found in their categories, because they aren't alphabetized the way they should be. And these screwball dotless i's and dotted I's are not one silly iota different from any of the other diacritics; there's no reason whatsoever for them not to be included within the scope of this project page. Don't be trying to remove them from there. Gene Nygaard 20:41, 6 February 2007 (UTC)

Better slightly late than never

Happy Birthday, Naming conventions (standard letters with diacritics), from me! Have a great day! Stefán 03:19, 6 February 2007 (UTC)
LOL...this project is surely past its prime. If there isn't consensus from the community for more than a year, shouldn't this proposal be dismissed or something? Rosa 16:57, 6 February 2007 (UTC)
Well, as long as the discussions are ongoing, proposals are considered active. WP:NCGN was discussed for over a year before being 'promoted'. We need a policy on diacritics, although honestly I don't see why we cannot replace the entire page with 'they are allowed, use them' :) -- Piotr Konieczny aka Prokonsul Piotrus | talk  03:08, 11 February 2007 (UTC)
As a reader and researcher, I find diacritics to be very helpful in my quest to at least have an idea how to pronounce a word that is from a language that is not my own. I recommend that we use diacritics in titles that are names of foreign origin and have the English variant of the name, if different in more ways than just the diacritics, in parentheses after the original name. In my opinion that simple task can be Misplaced Pages's contribution to the broader and subtle education of people who are not used to seeing anything other than un-accented English. Lemuela 06:46, 14 February 2007 (UTC)
This is an argument for including the foreign spelling in the first line of the article, as we do. I don't on the whole think it's a very good argument: most readers will not have any idea what pronunciation Stanisław Ulam or Paul Erdős represents, and if they did, it would be misleading; experience shows that the letters in question are pronounced in practice as if they had no diacritics. I prefer to present the valuable information that Stanislaw Ulam's name was adapted to English, but Paul Erdős's was not. Septentrionalis PMAnderson 19:30, 14 February 2007 (UTC)
I am with Lemuela. I personally think that to use "Stanislaw Ulam" instead of "Stanisław Ulam" (with a redirect of "Stanislaw Ulam" to the proper name) is both a disservice to the reader/researcher and is misleading, at best. When a reader is properly redirected to the name that includes proper diacritics, the user is further educated as to the real name without argument of form, type, usage, or any other argument that "popularity contests" present within wikipedia. Yes, most readers will not know what diacritic to use (if any); that is what a redirect is so useful for. As a geographer, I would rather have reference to the proper name in order to 'get it right the first time' than to wonder what the name really is (there is ongoing debate/clarification with regard to Mongolian names right now). Yes, one could argue that "Stanislaw Ulam" is correct in English - but actually it isn't. The proper translation, if we are proposing such usage, is "Stanley Ulam" - this, of course, takes away from the actual name and is the disservice that is being mentioned. Therefore, I think the best approach is to use such diacritics. Rarelibra 19:40, 14 February 2007 (UTC)
I chose Ulam as an example, because he himself, his wife, his co-worker Jan Mycielski, and his colleagues all spell the name as Stanislaw in English, as his autobiography Adventures of a Mathematician attests. Rarelibra proposes, in such cases, to lie to the reader, because it's somehow politically correct, and will be "educational"; I do not. Septentrionalis PMAnderson 06:39, 15 February 2007 (UTC)