This is an old revision of this page, as edited by Tamfang (talk | contribs) at 19:45, 20 September 2012 (→Embedding foreign names (like names with diacritics) in English Wikipedia). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
kadı vs. kadi
I've requested that the article kadı be moved to kadi, per WP:ENGLISH ("kadı" is the spelling in Turkish). If anyone has an opinion on this debate, please visit Talk:Kadı#Requested move. Kaldari (talk) 04:28, 19 May 2012 (UTC)
See Wikipedia talk:Manual of Style#MOS on Zoë Baird, advance notice of proposal here
I have mentioned before my view that poor visibility on existing guidance on Latin-alphabet orthography (aka "accents" "diacritics") is encouraging time-wasting disruption on en.wp. I have placed a note on WT:MOS that I would like to make a proposal to add Zoë Baird as an English-language example to WP:DIACRITICS here. I have pre-notified WT:MOS there in case anyone objects that WP:DIACRITICS is not the best place to make a proposal to add an English-language diacritics example. In ictu oculi (talk) 00:07, 30 May 2012 (UTC)
- There is no poor visibility: WP:UE is well known and explains that usage should follow common usage in reliable English-language sources. The MOS covers usage in articles where a name is not used for the article title; this can be found at MOS:FOREIGN (which basically says to follow usage in the reliable sources used in the article). Where the confusion arises is where people try to bring other, more obscure guidelines to the party which are either not relevant or are worded in a way that contradicts WP:UE. At the moment, for example, there is an attempt to get Wikipedia:Manual of Style/Proper names#Diacritics to reflect clearly the guidance in MOS:FOREIGN, because some people are confused by the wording and think it contradicts the guidance given at MOS:FOREIGN.
- To address your specific request for an alteration to this guideline: I do not think that Zoë Baird is a good example. I think Charlotte Brontë would be a better example. But as this is about "Use English" and both those names are English-language names, I am not sure what relevance it has to this guideline to add an English-language name to it. Whatever is added to this guideline must not contradict WP:UE, so before anything is added to the guideline please place on this talk page a block quote of the change you wish to make to the guideline. -- PBS (talk) 12:23, 30 May 2012 (UTC)
- Perhaps then under WP:DIACRITICS, "for example personal names Charlotte Brontë, Lech Wałęsa, Mustafa Kemal Atatürk, but Franz Josef Strauss, and place name Saint-Étienne, brand Häagen-Dazs" In ictu oculi (talk) 22:55, 20 June 2012 (UTC)
- No because names like Lech Wałęsa have not been agreed upon using reliable English language sources and so are not good examples. -- PBS (talk) 11:28, 21 June 2012 (UTC)
- PBS, are you saying that Lech Wałęsa is at the wrong title? It had 2 extensive RMs leading to consensus among en.wp editors that is reflected in Category:Polish people. Most editors will recognise that article titles like Lech Wałęsa have in fact been agreed upon using reliable English language sources; it is just that WP:IRS has been followed to use reliable-for-context sources, such as Frommers, Lonely-Planet, Ascherson, Sczerbiak, Biskupski, rather than unreliable-for-context sources like the Daily Express. But, okay, if you do not like Lech Wałęsa then what about Václav Havel? Or do you consider Václav Havel is at the wrong title too? Is there any accented European-name article on en.wp that you do consider to be at the right title? In ictu oculi (talk) 11:52, 21 June 2012 (UTC)
- In the case of Lech Wałęsa the last requested move resulted in about 20 editors expressing an opinion and the result was split about 50/50. This is not as you say "consensus among en.wp editors". As to my opinions on that article title see my comments on that talk page's archive, this is not the place to debate it. As for the other names like Václav Havel, I have not taken part in a debate over that name so I do not know what the common usage in reliable English language sources is, but from a look at the English language sources used in the article usage would appear to be divided, so further investigation would be needed and as far as I can tell it has not been subject to a requested move, so it does not make a good example to use here. -- PBS (talk) 12:48, 21 June 2012 (UTC)
- Then as above please indicate an accented European-name bio on en.wp that you do consider is at the right title. Thanks. In ictu oculi (talk) 12:56, 21 June 2012 (UTC)
- If we're trying to improve the wording or interpretation of our rules, then constructing arguments based on compatibility with our existing rules may not be helpful. At best we get a new set of rules whose internal consistency is beyond reproach but which have a weakened connection to reality and to our encyclopædic goals. We already have disputes where both sides cite our existing rules yet cannot agree... So, if we're going to improve our rules, I would rather base it on accuracy, verifiability, and readability. bobrayner (talk) 13:11, 22 June 2012 (UTC)
Intro for diacritics & non-diacritics versions of a name.
In an article's intro, we should be allowed to use both diacritics & non-diacritics versions of a name, when both are sourced. GoodDay (talk) 21:22, 2 June 2012 (UTC)
- I thought we allow it already? I mean there are a few extremists who would censor either the diacritic or non-diacritic version depending on their own pov, but sourced multiple spellings or pseudonyms that are common are supposed to be present in the lead for our many readers. Fyunck(click) (talk) 23:24, 2 June 2012 (UTC)
- I believe it is allowed, although it would sometimes seem silly if the diacritics are visually insignificant to the typical English-speaking reader, e.g., "La bohème or La boheme is an opera in four acts by Giacomo Puccini..." WhatamIdoing (talk) 00:55, 3 June 2012 (UTC)
- I attempted to add a doubly-sourced non-diacritics name version at Zoë Baird. It got a bad reception, however. GoodDay (talk) 03:00, 3 June 2012 (UTC)
- @WhatamIdoing. This seems to have become a very common misconception on wp. It is not silly or unnecessary to mention an alternative spelling that is "visually insignificant". If an article is titled La bohème, then for an English reader it is obvious that it "might" be written La boheme as well. What the reader doesn't know is whether "La boheme" is effectively used in English reliable sources or not. If it is used (in several reliable sources) then we add it as an alternative spelling, and this tells the reader that this rendering is also commonly used. If "La boheme" is not used in our sources then it makes no sense to add it as an alternative spelling in the lede. That's why our article Café starts with: "A café, also spelled cafe...". It would not be right (or complete) if we omitted "cafe" because it is "visually insignificant". How is a reader supposed to know whether an alternative rendering (no matter how "obvious") is commonly used or not? MakeSense64 (talk) 05:37, 3 June 2012 (UTC)
- The generic cafe is a totally different issue from e.g. Café La Monde. Agathoclea (talk) 06:35, 3 June 2012 (UTC)
- No, it's just the same. If Cafe La Monde appears as an alternative spelling in a good chunk of our reliable sources, then we mention it as a common alternative spelling in the lede. If that spelling does not appear in our reliable sources, then there is no reason (and no source) to mention it as an alternative spelling. We are diacritics neutral and we report on what we find. On what basis are you going to refuse a piece of information that is backed by several sources for the article? MakeSense64 (talk) 06:42, 3 June 2012 (UTC)
- Have you actually been reading Cafe? I mean more than just the lead sentence. Agathoclea (talk) 06:50, 3 June 2012 (UTC)
- The thing which really irritates me is the persistent wikislangish usage of the term "diacritic" for any non-ASCII Latin-derived letters (see e.g. a non-diacriticized example with "ı" above on this page). Could somebody propose a correct term for this to deprecate the usage of "diacritic" at least in official guidelines? Incnis Mrsi (talk) 08:12, 3 June 2012 (UTC)
- I believe that the examples MakeSense64 has in mind are not La bohème but tennis BLPs like "Saša Hiršzon or English Sasha Hirszon" where a tennis editor has been adding ITF names (diacritic disabled names) as if they are exonyms. WP:OPENPARA already makes it clear we don't do that. Perhaps this page should also make that clear? In ictu oculi (talk) 23:52, 20 June 2012 (UTC)
- And another editor has been systematically censoring all English names from article titles and lead sentences throughout all wikipedias (English and non-English versions alike), even where well sourced. Policy says we do otherwise for the best of our readers. Fyunck(click) (talk) 01:46, 21 June 2012 (UTC)
- Hello Fyunck, yes, you would be the editor I am referring to, and I would be one of the "censors" you are referring to. Yes, I would like to be able to "systematically censor" all "English names" from WP:TENNIS ledes like "Saša Hiršzon or English Sasa Hirszon." Though note that you, and perhaps MakeSense64 above, are the only WP:TENNIS editors who are for this style of lede. We don't see other tennis editors doing this:
- Błażej Koniusz or Blazej Koniusz (born February 22, 1988 in Świętochłowice) is a tennis player from Poland.
- Sergio Gutiérrez Ferrol (b. Alicante March 5, 1989) and known professionally as Sergio Gutierrez-Ferrol, is a tennis player from Spain ..
- Facundo Argüello (tennis) (born August 4, 1992), known professionally as Facundo Arguello, is a tennis player from Argentina.
- Manuel Sánchez (tennis) (born January 5, 1991) and known professionally as Manuel Sanchez, is a tennis player from Mexico.
- Frédéric Vitoux (born Versailles, 30 October 1970) and known professionally as Frederic Vitoux, is a former professional tennis player from France.
- César Ramírez (born January 25, 1990) and known professionally as Cesar Ramirez, nicknamed "el Tiburón" ("the Shark"), is a tennis player from Mexico.
- Filip Horanský (born January 7, 1993) and known professionally as Filip Horansky, is a tennis player from Slovakia.
- György Balázs (Hungarian: Balázs György) (born Budapest, July 24, 1985) and known professionally as Gyorgy Balazs, is a tennis player from Hungary.
- Sophie Lefèvre (born February 23, 1981 in Toulouse) and known professionally as Sophie Lefevre, is a professional French tennis player.
- Tomislav Brkić (born Ljubuški, March 9, 1990) known professionally as Tomislav Brkic, is a tennis player from Bosnia and Herzegovina.
And so on, about 120x of these article ledes...
This is completely contrary to the lede style shown in WP:OPENPARA and WP:FULLNAME
François Maurice Adrien Marie Mitterrand (26 October 1916 – 8 January 1996) was the fourth President of France ...
As I say, I and other editors would like to remove those pointless duplications of accent-stripped names. But, as with Saša Hiršzon and Błażej Koniusz, whenever someone removes the accent-stripped version you immediately revert it. Anyway, since we're here at WT:Naming conventions (use English), if you consider this guideline to be one of those which support these ledes, then please could you indicate the wording in WP:Naming conventions (use English) that justifies this sort of lede? Cheers. In ictu oculi (talk) 10:33, 21 June 2012 (UTC)
- btw note that other tennis editors are also removing your "English names" Rüdiger Haas (b Eberbach, 15 December 1969), also known as Rudiger Haas. Cheers. In ictu oculi (talk) 11:36, 21 June 2012 (UTC)
- Iio, the name used in the article title should reflect common English language usage. The first line of the article should also contain that name either in the text or a footnote, because otherwise there is a danger that the article may not show up in internet searches, and we do no reader a favour if the article is not available to the reader in the first page returned in a search. As it happens Google has a policy of placing Wikipedia high up its search lists, but that is not guaranteed to remain and not all other search engines may give that weighting. Late last year I moved a page back to its common name (Popski's Private Army), because under its official name, without Popski's Private Army in the text, the page did not show up in Google searches even though the redirect ("Popski's Private Army") still existed.
- In cases such as Saša Hiršzon the obvious solution is to place the article under whatever is the most common usage in reliable English language sources and if that is "Sasha Hirszon" then use the format
Sasha Hirszon (Croatian Saša Hiršzon) ...
- -- PBS (talk) 12:17, 21 June 2012 (UTC)
- Also, Iio, you should have a look at the restrictions placed on the initiator of this section and ask yourself if your crusade for removing accent-stripped names from articles (whether or not supported by usage in reliable sources) could be considered by disinterested parties such as the members of the arbitration committee to be the mirror image of his/her behaviour (see also Wilfried Böse's denial before you make the obvious retort). -- PBS (talk) 12:17, 21 June 2012 (UTC)
- PBS,
- The "Sasha" above is a typo. "Sasa Hirszon" shows up perfectly well in Google searches, so your example of moving No. 1 Demolition Squadron to Popski's Private Army is not needed. Does your suggestion for Sasha Hirszon (Croatian Saša Hiršzon) still hold if it is Sasa Hirszon (Croatian Saša Hiršzon), which it is?
- If it does still hold, then could you please provide examples from en.wikipedia for the style Sasa Hirszon (Croatian Saša Hiršzon) since this is not what is shown in WP:OPENPARA
- I will ignore personal attacks such as "crusade," and threats; it would be helpful if you would engage in the discussion by providing an example for the style above. Best regards. In ictu oculi (talk) 12:29, 21 June 2012 (UTC)
- PS - I have looked at Wilfried Böse as instructed and seen that he is reported to have told a Jewish passenger who had shown Böse his Auschwitz tattoo, "I'm no Nazi! ... I am an idealist."
- Perhaps you will explain the relevance of that to this discussion later, but first can you please provide an en.wp bio example for the style Sasa Hirszon (Croatian Saša Hiršzon), since this is counter to the lede style shown in WP:OPENPARA and WP:FULLNAME:
François Maurice Adrien Marie Mitterrand (26 October 1916 – 8 January 1996) was the fourth President of France ...
- Thank you In ictu oculi (talk) 12:34, 21 June 2012 (UTC)
An aside
I agree with Incnis Mrsi's point about the term "diacritic". For instance, one of the recent requested moves involved Kadı, and ı certainly isn't a diacritic. (It's a dotless i, commonly used to represent a particular vowel sound when writing Turkic languages in a Latin alphabet.) However, I feel that this is a side issue; even if "diacritic" isn't strictly accurate, we all know what we're talking about when we see it, the term is already overwhelmingly dominant in discussions around here, and the inaccuracy doesn't really affect the substance of the dispute. So, although I would support a more accurate label (i.e. "characters which are in ISO/IEC 6937 but not in US-ASCII"), I suggest that we keep this question out of the main thread, because the main discussion is big enough already. bobrayner (talk) 13:33, 21 June 2012 (UTC)
- no argument from me. Agathoclea (talk) 14:50, 21 June 2012 (UTC)
- nor from me. I also think a Turkish name would be a better example than "Tomás Ó Fiaich, not Tomas O'Fiaich" - it's possible that some Users don't see "Tomás Ó Fiaich, not Tomas O'Fiaich" as foreign in the same way that "Mustafa Kemal Atatürk not Mustafa Kemal Ataturk" would be foreign. And WP:USEENGLISH might actually get read for what it says rather than what the shortcut currently suggests. In ictu oculi (talk) 00:06, 22 June 2012 (UTC)
- I'm not sure that it is necessary or even beneficial that people see the example as "foreign". --OpenFuture (talk) 06:55, 22 June 2012 (UTC)
- I'm not suggesting that "Tomás Ó Fiaich, not Tomas O'Fiaich" be removed, just that "Mustafa Kemal Atatürk not Mustafa Kemal Ataturk" be added. Experience suggests that some Users will object to accents on French, Spanish, Czech, whatever names and cite WP:DIACRITICS without realising that the only example is "Tomás Ó Fiaich, not Tomas O'Fiaich." It doesn't have to be Turkish; "Karel Čapek not Karel Capek" will achieve the same effect. Even "François Mitterrand not Francois Mitterand" would achieve the effect. My concern is that "Tomás Ó Fiaich, not Tomas O'Fiaich" doesn't poke editors in the eye and say FOREIGNER! And yet 99.9% of diacritic names will be foreign people/places. Ireland on its own isn't foreign enough. In ictu oculi (talk) 06:35, 28 June 2012 (UTC)
- Yes, you said. I still don't see why it should poke people in the eye as foreign. It's hardly relevant how foreign it is. --OpenFuture (talk) 08:33, 28 June 2012 (UTC)
- Hi OpenFuture,
- Thanks for the comment. Perhaps, but theoretically would you recognise that there may be 1 or 2 English-speaking editors who exist who consider an Irish person less foreign than a French or Polish person, and that some people would find it helpful for the guideline to feature a truly "foreign" name? Or, if you might not have a view on that, would you think that an Irish name is a frequent and typical example that easily transfers to names like e.g. Lech Wałęsa?
- But back to the proposal, whether Tomás Ó Fiaich and François Mitterrand are equally foreign or unforeign or not then there's another argument for having François Mitterrand alongside Tomás Ó Fiaich. Who is better known? Who features in WP:OPENPARA?
- Any view on this? In ictu oculi (talk) 12:32, 29 June 2012 (UTC)
If an example of diacritics being used is added, it should be balanced with an example of diacritics not being used: Ho Chi Minh, not Hồ Chí Minh. Kauffner (talk) 12:42, 28 June 2012 (UTC)
- I actually don't disagree with this, but to complement it there should be a European example (because Asian languages are in a state of flux on en.wp - Hawaiian terms use okinas, Ho Chi Minh himself is anglicized, but 100s of Vietnamese geo articles are not). It should be a personal name (since geographical names can be exonyms, whereas exonyms for people are less common). Can you propose an example of a European name where diacritics are not used? In ictu oculi (talk) 22:52, 28 June 2012 (UTC)
- Perhaps we should start where existing WP consensus already is strongest, French names:
The policy on using common names and on foreign names does not prohibit the use of modified letters, if they are used in the common name as verified by reliable sources. Example: François Mitterrand
- François Mitterrand is already used twice at WP:FULLNAME and WP:OPENPARA so is an uncontroversial example. As above if anyone can think of a modern French person bio which isn't titled with diacritics as a counter example, please say so. In ictu oculi (talk) 02:02, 29 June 2012 (UTC)
- The consensus is that the article title is under the name commonly used in reliable English language sources. Whether or not that is true for all French names cannot be ascertained without looking at all French biography articles and comparing the article title with the common name in English reliable sources. Iio, I think you are confused about something: the title of an article like Tony Blair is not under the subject's full name. As to the name that is used at the start of the text, it should reflect usage in reliable English language sources; whether "François Maurice Adrien Marie Mitterrand" is a good example depends on whether reliable English language sources agree that this is his full name. But whatever his full name is, it is not directly relevant to the article title. -- PBS (talk) 10:55, 29 June 2012 (UTC)
- Hi PBS
- As I have noted before when you gave Tony Blair as an example, Tony Blair does not have accents to remove. Back to the accents question, can you please link to me where you have answered the request to provide 1x a modern European accented name bio (non-monarch, non-stagename, non-ß) where you agree with the current en.wikipedia article title?
- Thanks. In ictu oculi (talk) 12:32, 29 June 2012 (UTC)
- Every Vietnamese biography is currently at an ASCII title. As for geography, WP:NCGN says to follow Encyclopedia Britannica, Columbia Encyclopedia, and Encarta, none of which use Vietnamese diacritics. Kauffner (talk) 03:04, 29 June 2012 (UTC)
- Hi Kauffner, yes I have seen that most of the VN bios have been moved. I have also seen the moves to towns you made or requested admins to make for you after Talk:Cần Thơ/Archive 1. This kind of confusion is one reason why I don't think a VN example would be uncontroversial and think an uncontroversial example like François Mitterrand which is already enshrined in WP:OPENPARA etc. would be better. Which is why I ask "Can you propose an example of a European name where diacritics are not used?" - I'll ask again, can you? Because if you can it would be a sensible balance to François Mitterrand. Can you provide one? In ictu oculi (talk) 03:46, 29 June 2012 (UTC)
- Oh, just go for it: RM Napoleon to Napoléon I, Napoleon III to Napoléon III, and Francis I to François I. N'est-ce pas chaque mot de plus beau en Français? Kauffner (talk) 07:25, 29 June 2012 (UTC)
- Hi Kauffner
- I said modern. Can you please link to me where you have answered the request to provide 1x a modern European accented name bio (non-monarch, non-stagename, non-ß) where you agree with the current en.wikipedia article title?
- Thanks. In ictu oculi (talk) 12:32, 29 June 2012 (UTC)
- If modified letters are not commonly used in reliable English language sources they should not be used in the article titles. As to pointing out article titles, Iio: given your recent behaviour of moving such article titles without putting in an RM request (or, it seems to me, going into much detail about English language usage), providing such names to you reminds me of Beans. -- PBS (talk) 10:55, 29 June 2012 (UTC)
- Please see request above, Thanks In ictu oculi (talk) 12:32, 29 June 2012 (UTC)
- The guideline already contains "The use of modified letters ..." so the proposed addition of "The policy on using common names and on foreign names does not prohibit the use of modified letters." is unnecessary and could lead to confusion with the sentence already in the article. -- PBS (talk) 10:55, 29 June 2012 (UTC)
- Hopefully it will lead to non-confusion, admitting that en.wikipedia uses diacritics in modern person bio titles and bringing peace to the subject. In ictu oculi (talk) 15:20, 29 June 2012 (UTC)
- To those who are wondering why we talking about Vietnamese, it relates to this RM: (Discuss) – Dan Tinh → Đàn tính. The dan tinh is a really obscure musical instrument. But from iio comments here, I gather that he is trying leverage it into something bigger. Kauffner (talk) 11:21, 29 June 2012 (UTC)
- Kauffner. You introduced Vietnamese, I did not.
- I read that again. How exactly is my not mentioning a Vietnamese instrument "trying [to] leverage it into something bigger"? Are you 100% certain you're not raising it to avoid disagreeing with François Mitterrand having a French name?
- I proposed Tomás Ó Fiaich + François Mitterrand, not Tomás Ó Fiaich + a Vietnamese musical instrument. Do you have a contrary example to François Mitterrand? Cheers. In ictu oculi (talk) 12:32, 29 June 2012 (UTC)
- Now, rather than discussing what no one has proposed, is there anyone who can provide a contrary example to François Mitterrand: a modern European bio with an accented name (non-monarch, non-stagename, non-ß)? One example will do. In ictu oculi (talk) 12:46, 29 June 2012 (UTC)
- I suggested adding "Ho Chi Minh" as an example of a name without diacritics. You responded by saying that the Vietnamese titles are "in a state of flux", as if that was something that was happening independently of your own activity. As for Napoleon, the reason he doesn't have diacritics is not related to his being a monarch. It's because the name is extremely well known, so there are editors who know that it's not supposed to have a diacritic. Kauffner (talk) 04:06, 30 June 2012 (UTC)
- Hi Kauffner
- Yes, I do have reservations about some of your moves to Vietnamese articles, particularly the moves after Talk:Cần Thơ/Archive 1, but more so about the music, culture, and cuisine articles, whose titles are actual Vietnamese words and cannot possibly be exonyms. Be that as it may, I don't see any way you can claim that Vietnamese titles on en.wp are in an equivalent state of stability to French titles. This is why I proposed a French example. However, maybe Ho Chi Minh is a good idea as a counterbalance to François Mitterrand. It would effectively recognise that to find examples of accent-removed names we have to move outside of Europe to Asia. The only problem is that Ho Chi Minh would have to take into account other Latin-alphabet Asian names which retain diacritics, WP:HAWAII for example. So okay, provisionally, if others support Ho Chi Minh being added, then fine. That dealt with, do you also have a European example (non-monarch, non-stagename, non-ß)? In ictu oculi (talk) 06:28, 30 June 2012 (UTC)
below vs after
I changed the word "after" to "below" because "after" had been misread as an instruction to add the categorization templates in a separate edit in order to lock the redirect, a behaviour that recently got someone topic-banned under ARBCOM sanctions (and later blocked long-term). Agathoclea (talk) 16:12, 11 July 2012 (UTC)
- Eh? Please clarify. S a g a C i t y (talk) 16:52, 11 July 2012 (UTC)
- ANI thread for the one and this for the misunderstanding. Agathoclea (talk) 17:55, 11 July 2012 (UTC)
Embedding foreign names (like names with diacritics) in English Misplaced Pages
- Not many editors seem to be aware of the Misplaced Pages guidelines for embedding foreign words and names in English Misplaced Pages; these guidelines are for web accessibility reasons. For an explanation, please see my essay here.
- I propose that a caution, and a link to these guidelines, be added to WP:DIACRITICS. LittleBen (talk) 11:14, 12 September 2012 (UTC)
- As has been pointed out to you in the past, it's not that people aren't aware, it's that people don't agree with you that those words are Non-English. -DJSasso (talk) 11:58, 12 September 2012 (UTC)
- Did you take the time to read the Misplaced Pages guidelines, and look at Lang-XXX templates and Template:Lang/doc?
- Are you saying that German, French, Vietnamese and all the other Latin extended characters here—including letters with diacritics—are not foreign characters, they are English? Has Misplaced Pages got it all wrong? Has W3C got it all wrong? Please provide some reliable references to support your apparent claim that all German, French, and Vietnamese names are in fact varieties of English. LittleBen (talk) 12:18, 12 September 2012 (UTC)
- Yes, there have been many debates on wiki about this, and there have been many links to many references indicating that English takes in letters with diacritics in names etc. It is a situation similar to loan words. The words aren't originally English but are considered English. -DJSasso (talk) 12:36, 12 September 2012 (UTC)
- I haven't seen any debates on Misplaced Pages that decided that these Misplaced Pages guidelines are wrong, and that W3C recommendations are rubbish. Would you please provide links, if you have any that support this POV. It is surely not true that there have been RfCs with such conclusions. LittleBen (talk) 12:41, 12 September 2012 (UTC)
- What guideline are you pointing to? There is no guideline that says they aren't English. The guideline you link to above just says that for Non-English words you should use those templates. But nowhere in that guideline does it indicate that diacritic names are Non-English. And the W3C also doesn't make that declaration anywhere. There have been entire RfCs on this debate. You have been given the links to them numerous times in the past. The wiki is split about 50/50 on the topic: about half the wiki thinks they are usable English and about half thinks they aren't. I think you severely misunderstand what the debate about diacritics has been based on. -DJSasso (talk) 12:45, 12 September 2012 (UTC)
- Surely the Unicode standards cited above define which parts of Unicode are reserved for which languages. Are you saying that an RfC can vote such an international standard to be wrong? LittleBen (talk) 12:55, 12 September 2012 (UTC)
- The Unicode standards don't reserve letters only to ever be used by a given language. They only indicate what the letters are and what languages usually use them. But numerous characters are used in numerous languages. That being said everything on Misplaced Pages is determined by consensus so technically yes, the wide community could decide they didn't wish to adopt a given set of standards. But as I say the Unicode standards don't say these characters must only be used for a given language. -DJSasso (talk) 13:01, 12 September 2012 (UTC)
- Unicode characters are mapped to contiguous blocks; some blocks are reserved for certain languages, and some are shared. The Unicode standards are quite clear about which characters are reserved and which characters are shared. The great majority of fonts do not—of course—support all the Unicode characters. As explained in Misplaced Pages and other web accessibility guidelines, if non-English text is embedded then screen readers are unlikely to be able to read it if the language is not properly marked up. Some Japanese and Chinese characters even share the same character codes, and look quite different if they are displayed in the wrong font—the font is determined by the "lang" markup. The Misplaced Pages community has surely not decided to ignore the Misplaced Pages guidelines or the Unicode standard, right?
- There's a parallel discussion about language templates here. LittleBen (talk) 13:32, 12 September 2012 (UTC)
- Except we aren't talking about Japanese or Chinese characters. We are talking about Latin-based diacritics, which are recognized by screen readers and supported by most (though obviously not all) fonts. You keep throwing Japanese or Chinese characters into your debate, which is a complete red herring because nobody believes we should use non-Latin based text. -DJSasso (talk) 13:35, 12 September 2012 (UTC)
- Is it really necessary to start so many parallel discussions? This seems to be your personal style. It would be more helpful if you had the discussion in the most relevant place and linked to it. --Boson (talk) 13:46, 12 September 2012 (UTC)
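For readers who want to check the Unicode point raised in this exchange, the following is a minimal sketch using Python's standard `unicodedata` module. It only prints the official character names; the helper name `describe` and the sample letters are illustrative assumptions, not anything taken from the guidelines under discussion. The names identify these letters as Latin letters with a modifier and do not mention any particular language.

```python
import unicodedata

def describe(text):
    """Return (character, official Unicode name) pairs for each character."""
    return [(ch, unicodedata.name(ch)) for ch in text]

# Sample letters used in Czech, French and Vietnamese respectively,
# written as escapes so the source does not depend on file encoding.
for ch, name in describe("\u010d\u00e9\u0111"):
    print(ch, name)
# č LATIN SMALL LETTER C WITH CARON
# é LATIN SMALL LETTER E WITH ACUTE
# đ LATIN SMALL LETTER D WITH STROKE
```

This says nothing about whether a name counts as English, which is the actual point of disagreement; it only shows how the standard labels the characters themselves.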
- It might be appropriate to add a comment to the appropriate template documentation that it is permissible to use the template for language-marking text for accessibility or other reasons, regardless of the actual status of the word. For instance ''{{lang|fr|''déjà vu''|nocat=true}}'' would indicate to accessibility aids that the phrase should be pronounced with a French 'u' in this particular context. Any note there or elsewhere should, of course, expressly state that this is without prejudice to the English status of the word or text. --Boson (talk) 13:46, 12 September 2012 (UTC)
- The most popular reference on screen readers that I have found so far says that only the most popular screen reader can switch between English and French—and then only if the text is properly marked up. But if the language is properly marked up, then screen readers can read the language code but skip the text that they can't read. LittleBen (talk) 13:54, 12 September 2012 (UTC)
- And of course whole quotes should be wrapped in a template which is what that blog is suggesting. Proper names are a different situation, because they won't be in dictionaries anyways. -DJSasso (talk) 14:07, 12 September 2012 (UTC)
- There are surely naming conventions (p.24) (that apply to birth certificates, passports and the like) for US nationals. Anyway, why should names be different? Speech synthesizers are surely rule-based rather than dictionary-based. LittleBen (talk) 14:12, 12 September 2012 (UTC)
- Seriously? You are now trying to use passport guidelines for a single country to justify your dislike of diacritics and desire to strip them out of the wiki? -DJSasso (talk) 14:19, 12 September 2012 (UTC)
- Did you bother to read my brief essay? Am I advocating stripping diacritics out of Misplaced Pages? LittleBen (talk) 14:31, 12 September 2012 (UTC)
- Surely passport and birth certificate naming guidelines apply to Americans with BIOs on Misplaced Pages? Of course other countries have similar naming guidelines for passports. LittleBen (talk) 14:22, 12 September 2012 (UTC)
- No because a government is a separate entity from wikipedia who matches information to the way they like to see it or need it. We are an encyclopaedia whose purpose is to give correct and full information to the reader. Their needs and our needs are different. We don't go by internal style guides of various organizations, though we might take them along with others into account when creating ours. -DJSasso (talk) 15:24, 12 September 2012 (UTC)
- Are you now saying that language-marking the text will cause most screen readers to ignore it completely?--Boson (talk) 14:46, 12 September 2012 (UTC)
- I said that "if the language(s) is/are properly marked up, then screen readers can read the language code but (will) skip text that (is in a language that) they can't read". Surely that's quite clear? The screen reader knows from the markup that it's Japanese text, for example, but will not try to read the text if the language is not supported. In this case, if Japanese text is embedded in English but not properly marked up, the screen reader would probably spew out garbage or crash. LittleBen (talk) 15:03, 12 September 2012 (UTC)
- Since the source you cited stated that most screen readers would not switch from English to French, your comment that they could "skip the text that they can't read" suggested that you might be inviting the inference that such screen readers would then ignore anything marked as French. So I asked. What do you think they do when confronted with a text marked as French? Do they treat it as English (as if it were not marked) or do they treat it as an unsupported language (meaning that they ignore it)? --Boson (talk) 18:00, 12 September 2012 (UTC)
- The review cited above, as I read it, says that the most popular software (75% of the market) could switch between English and French if the text was properly marked up. I think the article said that, at that time, such software could handle only five languages (but the ability to switch between other languages was not mentioned). How the screen reader would handle unsupported languages depends on the software, but I would presume that it would see the language markup code (Japanese, for example) and probably just announce "Japanese text", but not try to read it, if the reader did not support it. LittleBen (talk) 05:07, 13 September 2012 (UTC)
- Out of curiosity, what is the next argument going to be when this one fails to gain traction? Resolute 14:16, 12 September 2012 (UTC)
- What are you trying to say? Surely responsible editors are aware of Misplaced Pages guidelines? LittleBen (talk) 14:22, 12 September 2012 (UTC)
- Yup. And you have never come close to establishing a consensus that supports your personal interpretation of said guidelines. My point is to wonder how long, and in how many forums, you intend to fight this battle before accepting your position is not widely supported? Your behaviour in this regard is rather predictable, as you aren't the first person who has tried to force Wikipedia to suit their own viewpoints by trying to wear editors out. It never works. Resolute 14:30, 12 September 2012 (UTC)
- So please provide a link to the RfC (if any) or guideline that decides that diacritics must be used virtually universally in titles of articles relating to Europe (or Vietnam, for that matter), and to an RfC (if any) or guideline that decides that the most widely used English name will be stripped out of the lede, and only the less frequently used version with diacritics used in the article. Isn't this ignoring established guidelines and trying to impose one's own POV on Misplaced Pages by intimidating admins who favor a neutral POV, and trying to wear other editors out? LittleBen (talk) 14:39, 12 September 2012 (UTC)
- You mean like WP:MOSBIO which says "the subject's full name should be given in the lead paragraph, if known (including middle names, if known, or middle initials)." The full name of the individual includes diacritics. We don't use the common name in the lead paragraph we use the full name. As for wearing editors out, it is you that is posting the same conversations over and over in multiple forums, not the ones who disagree with you. So if anyone is trying to wear people out it is the one creating discussions in a million different areas of the wiki instead of one centralized place. -DJSasso (talk) 15:22, 12 September 2012 (UTC)
- Your question assumes I agree with you that an "English name" excludes diacritical marks. Since I do not, it is invalid. As to this silly tug-of-war over Vietnamese article titles especially, you will note that I generally have not participated in the move requests and counter-requests, other than to say that your arguments have been consistently unconvincing. I do know that the usage of diacritics in general has dramatically increased in English media of late, and would suggest that if the horse you are beating is not already dead, it is very close. Resolute 15:28, 12 September 2012 (UTC)
- I'm a member of the vast majority of English-speakers who don't understand (for example) Czech; but I do know the difference between ‹c› and ‹č› and I want to know which it is so that I have a chance of pronouncing it more correctly. There presumably exist readers who can say the same of Vietnamese, and I want them to have the same benefit even if the diacritics are meaningless to me; I am not harmed by seeing the funny squiggles, now that Unicode fonts are generally available. If Misplaced Pages excluded everything that a substantial number of English-speakers don't understand, I wouldn't bother with it. —Tamfang (talk) 17:58, 12 September 2012 (UTC)
- I agree with presenting important terms in both languages in the lede, using language templates that properly tag foreign words; that is essentially what I say in my brief essay here. The key issue is that the majority of Misplaced Pages users—who cannot read, write, or remember difficult diacritics or other foreign languages—should be given a choice. (Foreign-language article titles do not give the reader a choice.) The same template-based approach (showing both versions of the word) can be used in the article body. With very simple and familiar diacritics like in Pelé it is surely unnecessary to show the non-diacritic version of the name, but this should not be a general rule for all names with diacritics.
- However, there are people who insist on adding diacritics universally to article titles, and then removing the version without diacritics from the article body (even when the version without diacritics is much more widely used in English sources). Two problems with the "simply adding diacritics makes an article title superior" mantra are as follows:
- Article titles with diacritics cannot be disambiguated in a way that an English speaker can understand—without a mishmash of foreign languages and English. So adding diacritics to the titles "Buon Ma Thuat city" or "Dong Hoi city" gives "Buôn Ma Thuột city" and "Đồng Hới city". Such a mishmash looks unprofessional if not ridiculous.
- Another problem with the "where possible, add diacritics to titles of BIO articles" mantra is that the perpetrators do not bother to check the correct name in the corresponding foreign version of Misplaced Pages. "Simply adding diacritics" to the title "Manuel Sanchez (tennis)" gives "Manuel Sánchez (tennis)", another mishmash title. The problem with this is that the article title in Spanish Misplaced Pages is "Manuel Sánchez Montemayor". So simply adding diacritics to the English version of a name does not necessarily make it an acceptable article title name in Spanish, for example, and the full Spanish name would probably not be recognized in English.
- Be aware that foreign languages in English article titles should be properly tagged. (There aren't any markup tags for mishmash article titles.) The language here suggests that foreign-language page titles (including redirects) on English Misplaced Pages are a heavy load on the servers. So much for "redirects are free". Foreign languages within the page are handled by the browser. LittleBen (talk) 04:29, 13 September 2012 (UTC)
- I don't see that "Buôn Ma Thuột city" is any more ridiculous than "Buôn Ma Thuột is a city in Vietnam". —Tamfang (talk) 07:35, 13 September 2012 (UTC)
- Do you really think that "Buôn Ma Thuột is a city in Vietnam" is an acceptable article title? LittleBen (talk) 08:30, 13 September 2012 (UTC)
- Do you really think that "Buon Ma Thuot is a city in Vietnam" is an acceptable article title? —Tamfang (talk) 19:45, 20 September 2012 (UTC)
- I see the Manuel Sánchez Montemayor example still gets used. Has no-one explained the reason why in Spanish names are longer than in English? Perfect case of WP:UE unlike the fallacy of dropping diacritics making something English. Agathoclea (talk) 08:09, 13 September 2012 (UTC)
- I was suggesting that surely Manuel Sánchez is neither the most widely-used version of the name in English nor an acceptable Spanish name. Simply adding diacritics to a widely-used English version of a name doesn't necessarily "improve" it. Names deserve a certain minimum of research, like checking the corresponding foreign-language Misplaced Pages (where they will surely be found if they are considered notable). LittleBen (talk) 08:30, 13 September 2012 (UTC)
- Note that there's an article on Tokyo, but another article on Tōkyō Station. Isn't that ridiculous? The Japan-related MoS doesn't seem to suggest that consistency is preferable (macrons in the lede is surely the "standard"?) No respectable English source would write it as Tōkyō Station. More mishmash. LittleBen (talk) 08:43, 13 September 2012 (UTC)
- I don't understand "users should be given a choice". Should there be a user preference switch, "don't ever show me any damn foreign squiggles"? Or are you saying that when an article mentions Đồng Hới it ought to say, "(if diacritics hurt your eyes, look here instead: Dong Hoi)"?
- Obviously an article with funny characters in its title ought to have a "plain" redirect so that the reader has the choice of putting the easier-to-type version in the search box. —Tamfang (talk) 08:49, 13 September 2012 (UTC)
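As a rough illustration of what a "plain" redirect title looks like, the sketch below derives one mechanically with Python's standard `unicodedata` module. It is an assumption-laden example, not a description of how redirects are created on the wiki: the function name `plain_title` and the small `EXTRA` table are hypothetical, and the table is needed because letters with a stroke such as Đ have no canonical decomposition and are not covered by normalization alone.

```python
import unicodedata

# Letters with a stroke do not decompose under NFD, so they need an
# explicit mapping. This table is illustrative, not exhaustive.
EXTRA = str.maketrans({"\u0110": "D", "\u0111": "d"})

def plain_title(title):
    """Return the title with combining diacritical marks removed."""
    decomposed = unicodedata.normalize("NFD", title.translate(EXTRA))
    # Keep base characters; drop combining marks (accents, horns, dots).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(plain_title("Đồng Hới"))        # Dong Hoi
print(plain_title("Manuel Sánchez"))  # Manuel Sanchez
```

A mechanical mapping like this only produces a typeable search key. It does not settle which form is in common English usage, and it would not recover a different established name such as the longer Spanish form mentioned above.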
- Quote: I don't understand "users should be given a choice". The standard for foreign languages is to put both the foreign language term, how it is commonly written in English and how it is pronounced—and optionally also the literal meaning in English where appropriate—in the lede (first sentence of the article). See the Chinese MoS or most Vietnamese articles for examples. LittleBen (talk) 10:09, 13 September 2012 (UTC)
- How is the user expected to guess what the (redirected) plain English version of a title with complex diacritics is? I don't think it's obvious to most users: "Redirected from" is displayed in very tiny letters. LittleBen (talk) 09:04, 13 September 2012 (UTC)
- Obviously the community has decided against a larger version when they deleted the relevant templates. Agathoclea (talk) 14:01, 13 September 2012 (UTC)
- The statement above was amended with a misreading of the template referred to. The heavy server load is the result of altering any high-use template, not of using the template on the page. Agathoclea (talk) 15:17, 13 September 2012 (UTC)
- One of the main criteria for article titles is surely recognizability. For foreign words, the recognizability criterion is easy: foreign words like résumé that have been around a while (and also words of foreign origin that are recognized as being shorter, simpler, easier to remember, and more appropriate than their English equivalent, e.g. tsunami vs. tidal wave) are adopted into English, and begin to appear in dictionaries. The "recognizability" criterion for names of people and places is surely that quite famous people or places whose names or nicknames have simple diacritics, and would be recognized by large numbers of English-speaking people, are considered "recognizable": Pelé is an example. Names with complex diacritics of relatively unknown people (and places) are not going to be recognized, and the majority of English Misplaced Pages users, supporters, and editors surely won't be able to read, write, or remember them. For such unrecognizable names with diacritics, the version of the name without diacritics should be used in the article title, so that the majority of users can read and write it. Making Misplaced Pages less accessible, and adversely affecting its usability, is not an "improvement". LittleBen (talk) 02:56, 16 September 2012 (UTC)
- Ignorance is curable - stupidity lasts a lifetime. Most people coming to Misplaced Pages are looking for the cure. The question of being "recognizable" also has to be seen in the context of the article. An article on a mathematical concept will be just as lost on a chav as the squiggles on a Vietnamese name, but that chav will hardly ever bother looking it up. The people looking to better themselves, on the other hand, should not be deprived based on the fact that there are a few people around who can't be bothered. Agathoclea (talk) 10:34, 17 September 2012 (UTC)
- How many major publishers are so stupid, or so ignorant, that they choose titles containing Vietnamese diacritics for books intended for English native speakers, then? (Excepting "Learn Vietnamese!" textbooks, of course). LittleBen (talk) 11:59, 17 September 2012 (UTC)
- Time will tell the difference between stupid and ignorant. Anyway, we are straying into "verifiability, not truth" territory. Is that what you want? Agathoclea (talk) 13:11, 17 September 2012 (UTC)