Revision as of 00:36, 9 May 2009
Manual of Style
En dashes vs. hyphens
Following the section here on en dashes, I moved Ural-Altaic languages to Ural–Altaic languages. However, I've gotten complaints saying that a hyphen is used in the literature, and that takes precedence over the MOS. Since punctuation varies from source to source, it doesn't seem that clear-cut to me. So I'd like your input:
- Does the punctuation of academic literature take precedence over wikipedia's MOS? (per complaint, "they are inappropriate in established linguistic names", as in #2 below)
- When a language family is named after two languages (Yuki–Wappo, named after the Yuki and Wappo languages) or geographic areas (Niger–Congo, spoken along the Niger and Congo rivers), and neither is a prefix, should we use the en dash? A hyphen is always used in the lit, but is this an orthographic issue, or a punctuation issue?
- What if one of the names is shortened to its root form? "Uralic–Altaic languages" I think should be en-dashed, and "Uralo-Altaic languages" clearly should be hyphenated, but what about "Ural-Altaic/Ural–Altaic languages", which is Ural Mountains plus Altai Mountains plus the suffix -ic?
I've also gotten more general complaints:
- "En-dashes are also ridiculous since they are not easy to type. Use them in mathematical formulas, but not in connected English text or in hyphenated vocabulary items."
kwami (talk) 00:50, 8 March 2009 (UTC)
This seems to me a case where a hyphen is correct. The use is for conjunction, not disjunction. There is not a from–to, versus, or opposition sense between the two terms that would indicate en-dash usage. But I'm not an expert; wait for other opinions. -- Tcncv (talk) 06:52, 8 March 2009 (UTC)
- See the "Usage guidelines" subsection under Dash#En dash. --Wulf (talk) 09:10, 8 March 2009 (UTC)
- Yes, that is what we are discussing, but what is your interpretation? -- Tcncv (talk) 18:51, 8 March 2009 (UTC)
- To me these seem very much like Bose–Einstein condensate, which the CMS would not require an en dash for, but which has an en dash in the title here on wikipedia. I've never seen that phrase with an en dash in the academic lit either, so it does seem to be a good illustration for my question.
- Another objection I've heard (see my talk page) is that readers looking things up will be constantly redirected from hyphenated search strings, wasting their time and concentration wondering how what they entered was wrong. Is this a concern for anyone else? If it is, should we set up a bot to fix links to en-dashed article titles? kwami (talk) 10:04, 8 March 2009 (UTC)
- I am retracting my earlier opinion. It appears that the Wikipedia MoS also prefers en-dash usage for "and" relationships, which (I believe) include both the Bose–Einstein condensate and Niger–Congo languages cases. It appears that other style guides are mixed on this issue, with some (such as Chicago) preferring dashes. Again, I am not an expert. (As a side note, is it just me, or is it confusing to have the "and" condition covered under "disjunction"?)
- Other opinions are requested. -- Tcncv (talk) 18:51, 8 March 2009 (UTC)
- This is another case in which WP:MOS has been written to "reform" English, rather than to record what it does. This should be fixed. Septentrionalis PMAnderson 23:14, 8 March 2009 (UTC)
- This is also an archaic preservation of typesetting style which is inappropriate for computer usage. Hyphens are all that are required. The distinction between an en-dash and a hyphen is strictly a holdover from the printing industry. In linguistics, we never use en-dashes in formulations like Niger-Congo and Ural-Altaic and Amto-Musan, etc. An additional objection to the silly wording of this MOS concerning en-dashes is their use in "and" constructions. That would require them in "hyphenated" names, then, as well, such as Meredith Whitney-Bowes, for example. I am looking at the January 2008 issue of International Journal of American Linguistics right now. Page 2 "Patla-Chicontla Totonac" , page 59 "Uto-Aztecan" , page 89 "Proto-Cholan" . These are constructions of language names. However, in the formulation on page 141, an en-dash is correctly used in the formulation "Cherokee—English Dictionary". "Cherokee-English" is not an accepted linguistic formulation and the dictionary is clearly "from" Cherokee "to" English. On page 142, we also see "Eastern Ojibwa—Chippewa—Ottawa Dictionary" with en-dashes. Thus, the formulation "from—to" is a correct usage of the en-dash, while the "and" construction is not. ("Niger-Congo" is not a "from—to" construction, but is an "and" construction--"the languages of the Niger and Congo Basins"). (Taivo (talk) 22:26, 9 March 2009 (UTC))
- Playing devil's advocate, "proto-" and "Uto-" would always be hyphenated, because they're prefixes.
- I'm not so sure "Niger-Congo" is an "and" formulation: language families are frequently named for their geographic extremes, in effect 'the languages from A to B'.
- What about cases where one or both of the joined names contain more than one word? Or hyphenating an already hyphenated name, as often happens with "proto-"? kwami (talk) 23:03, 9 March 2009 (UTC)
- Linguistic usage always prevails--hyphens all the way. And we all know that the "Niger-Congo" family extends far beyond the Niger and Congo Rivers. Indeed, Atlantic-Zambezi would be a more accurate "extension" description. The argument for "geographical" extension works (in archaic typesetting terms if necessary at all) only for terms that are not accepted linguistic names. Accepted linguistic names should always be hyphenated because they are proper names. In examining linguistic usage, the only time one finds en-dashes (or, better, em-dashes) is in forms (as cited above) that are dictionary names, "From Cherokee to English". These are not geographical ranges, but "translation ranges" only. But, in the end, en-dashes are silly retentions from typesetting and have no real function in the modern, computerized world. (Taivo (talk) 23:58, 9 March 2009 (UTC))
- Your "proper names" argument may be the way to go. That would take care of hyphenated surnames as well. (Except that dictionary titles are also proper names. "Unitary terms", perhaps?) However, em dashes are not appropriate for dictionaries. An em dash would give a bizarre reading, rather like a colon, such as the title is "Cherokee" and that it's an English dictionary. kwami (talk) 00:19, 10 March 2009 (UTC)
- When combining hyphenated forms into larger units, older sources that were typeset sometimes used en-dashes to combine hyphenated forms. Thus, in the Handbook of Native American languages, Volume 10, Southwest (1983, typeset by linotype), on page 115 we find "Proto-" added to "Uto-Aztecan" with an en-dash. But in Mithun's The Languages of Native North America (1999, computer typeset) we find "Athapaskan-Eyak-Tlingit" with all hyphens (page 346 ff) and on page 123 "Proto-Uto-Aztecan" with all hyphens. In linguistics books, all the linotype-typeset (precomputer) books I checked have occasional (although not universal) en-dashes in places and all the computer-typeset books I checked have hyphens all the way. This WP:MOS appears to be a misguided attempt to turn back the clock to a precomputerized typesetting era. (Taivo (talk) 00:12, 10 March 2009 (UTC))
- What we'll end up with then is differing punctuation standards depending on the topic of the article. That doesn't seem to be a tenable situation. (I don't see the point of the AET example, but pUA captures the diff.)
- There's also the issue of precision. Within linguistics, the meaning of these names is obvious. However, they're not always so obvious to the non-linguist. Granted, hyphens are not wrong, but en dashes help disambiguate. This reminds me of punctuation in quotations. A final period or comma may come before or after the quotation mark, depending on the style guide we're following. However, here on WP we've decided to follow logical order, as being an encyclopedia warrants precision in such matters. kwami (talk) 00:19, 10 March 2009 (UTC)
- Sorry for not being more specific--Athapaskan-Eyak-Tlingit is a combination of Athapaskan-Eyak with Tlingit. What you are implying about the precision comment is that books published by linguists are imprecise and that Wikipedia is somehow more precise. Ahem. Within linguistics, hyphens are now standard usage for all proper names of languages. That's the Manual of Style which should be followed for all language and linguistics articles. Within our field, we get to establish what is standard usage and what is not. These are proper names in the same way that Meredith Baxter-Birney is a proper name. When you start using an en-dash in her name, then you have a valid argument for using them in linguistics proper names. Otherwise, there is no valid contemporary reason for using en-dashes in linguistic names when the specialists within that field don't use en-dashes. (Taivo (talk) 00:33, 10 March 2009 (UTC))
- Oh, yeah, I got that much. It's just not clear to me that AET isn't just a list of the three branches of the family, without trying to subclassify them. Yes, I agree with your surname analogy, as I've said above. kwami (talk) 00:38, 10 March 2009 (UTC)
- One further point about en-dashes within linguistic proper names. If linguists are using hyphens in all proper names of languages and language groups, then who will be deciding which names get en-dashes and which ones don't? Non-linguists? I hardly think that they have the authority to decide such matters. Out in the world of linguistics, there aren't any en-dashes, so adding them into Wikipedia articles is actually a falsification of the data. (Taivo (talk) 00:36, 10 March 2009 (UTC))
- It's a punctuation standard, not decided on a case-by-case basis. So there is no "decision". Illustrations from some of the Papuan families: Trans–New Guinea, East Bird's Head–Sentani, Left May–Kwomtari, Ramu–Lower Sepik, Yele–West New Britain, Reefs–Santa Cruz, but a hyphen in Eastern Trans-Fly. (Per the MOS, most of these should actually have spaces: East Bird's Head – Sentani.) I don't know about the spaces, but even if we stick with hyphenating Niger-Congo, I think we should follow Linotype-level precision for protolanguages (which does not contradict linguistic custom), and keep the en dashes in these Papuan families, which otherwise are ambiguous. I think precision is valuable for its own sake, even if specialists don't bother with it. kwami (talk) 00:50, 10 March 2009 (UTC)
- But you are applying two different things in your examples. First, "proto-" and "trans-" are prefixes and prefixes should always be attached with hyphens and not with en-dashes. "co-operate" should never have an en-dash any more than "proto-" or "trans-". Thus, your example of Trans-New Guinea is in a different category than an example such as Athapaskan-Eyak-Tlingit. None of these examples from New Guinea are complex, all are simple: A + B. The only ambiguous cases are where you have formulations such as: A+B + C+D. But again I ask, who is going to make the decision where to put en-dashes and where to put hyphens? If you put en-dashes throughout in Athapaskan-Eyak-Tlingit, then you've violated your argument about precision. I'm not willing to trust these decisions to anyone except the original linguist author, but they all have used hyphens. So using en-dashes here and there where the original authors did not is a falsification of the data. (Taivo (talk) 00:59, 10 March 2009 (UTC))
- You're conflating different phenomena. The MOS description isn't very clear: Indo-European and Proto-Indic both take hyphens, because they involve prefixes. Proto–Indo-European, however, takes an en dash, because the prefix docks to an already hyphenated form. This is established usage in linguistics, as you yourself showed. (It being obsolete is a different argument entirely.) Besides using en dashes when conjoining already hyphenated terms, it's also standard to use them when conjoining multi-word terms, as in the Papuan examples. That has nothing to do with something like Niger-Congo, where your surname argument is convincing. kwami (talk) 01:16, 10 March 2009 (UTC)
- Just for a control, I looked all the way back at Voegelin and Voegelin's Classification and Index of the World's Languages (1977, long before computerized typesetting) and they used hyphens. As just a sample, on page 243, I found "Athapascan-Eyak", "Na-Dene", "Sino-Tibetan-Na-Dene", "Kuki-Chin", "Naga-Kuki-Chin", and "non-Indo-European". All with nothing but hyphens and some of them constructed themselves from other hyphenated forms. (Taivo (talk) 00:40, 10 March 2009 (UTC))
- The quality of the linguistics has little to do with the quality of the printing or typesetting. For all we know, V&V wrote the book on a manual typewriter and expected the typesetters to take care of such issues, and the typesetters didn't know the difference with these unfamiliar names. And even if they chose to be imprecise, I don't think we should go with the lowest common denominator. I can go along with "Na-Dene", as that is consistent with broader English usage, but "Sino-Tibetan-Na-Dene" is just stupid. Within the linguistic community, okay, everyone knows what they mean. But for a broader audience it definitely needs an en dash: "Sino-Tibetan–Na-Dene". kwami (talk) 00:50, 10 March 2009 (UTC)
- This sounds like support for Do what reliable sources in English do, leaving the present elaborate distinction as a rule of thumb when sources conflict, or taking them out altogether.
- Which, if either, should we do?
- Other comments?
- I should think the difference between the two junctions in non-Indo-European worth marking, myself; but if sources don't....00:44, 10 March 2009 (UTC)
(outdent) Actually, using en-dashes does violate linguistic custom. All linguists are currently using hyphens for "proto-" and most have used hyphens in the past. The problem is that any use of en-dashes violates contemporary linguistic usage and much of past usage. En-dashes are extinct in linguistic literature and were never very common even in the past. (Taivo (talk) 00:43, 10 March 2009 (UTC))
- Yes, "Proto-Indo-European" is almost universally hyphenated. However, this is not restricted to linguistics: prefixes on already-hyphenated forms are generally hyphenated themselves, regardless of the field. Therefore I think this is an argument for amending the MOS, not for making linguistics an exception.
- I don't have a problem with hyphenating "Proto-Indo-European", "Niger-Congo", and "Amto-Musan". However, I do object to hyphenating "Sino-Tibetan–Na-Dene", "East Bird's Head–Sentani", and "Yele–West New Britain", as the results are difficult to parse.
- I'm finding that TNG is often not conjoined at all: "Trans New Guinea", but that when it is, it is often en-dashed: "Trans–New Guinea". It seems that the current trend is to write it as three separate words, despite the fact that one is a prefix. This is a clear indication that people find hyphenation problematic in this case. kwami (talk) 01:41, 10 March 2009 (UTC)
Looking through the MoS Talk archives, it is apparent that dash usage is either the most recurring topic or is close to it. I think that points to a systemic problem either in the way Wikipedia defines its dash guidelines or in its expectations of its editors, and the repeated debates are a distraction from other MoS issues. Short of abandoning dashes (which I'm sure has no chance of happening), I think the dash guidelines should start with a clear and concise summary on which the rest of the guideline can build. The best, clearest, and most concise summary I've come across in recent discussion was by The Duke of Waltham (talk · contribs) in this earlier discussion.
“Simple, isn't it?”
Of course, hyphens have many other uses such as prefixes (Proto-Indo-European) and phrasal adjectives (hard-boiled egg), but I think there is general agreement on these uses. Unfortunately (IMHO), the Wikipedia guideline does not distinguish the conjunction ("and") cases from the disjunction ("to" and "versus") cases and specifies that en-dashes be used for both. Thus the conjunction in "Michelson-Morley experiment" uses an en-dash, and as I interpret the current guideline, Ural-Altaic languages should also use an en-dash. However, for those familiar with other styles such as The Chicago Manual of Style, this seems unnecessary.
I would propose relaxing the dash guideline to recognize (and even encourage) the use of hyphens as an alternative to en-dashes for conjunctions. Disjunctions would continue to use en-dashes. There is precedent for this in the allowance of spaced en-dashes as an alternative to unspaced em-dashes for interruption. -- Tcncv (talk) 03:23, 10 March 2009 (UTC)
- Since the linguistic usage of hyphenated forms falls within the "conjunction" guideline, this follows current linguistic usage of hyphens as found in the journals. (Taivo (talk) 03:38, 10 March 2009 (UTC))
"Michelson-Morley experiment" sounds like an experiment by some guy named Michelson Morley, whereas "Michelson–Morley experiment" makes it clear that there were two people, Michelson and Morley. This is I believe an important distinction to maintain.
VikSol's comments from my talk page:
- The requirement for n-dashes is an instance of prescriptivism, as in prescriptive grammar. I think practical utility is a much more important consideration than formulaic correctness. I doubt very many readers ever notice when an n-dash is used rather than a hyphen or are even aware of the existence of both. I think the requirement for n-dashes in certain formations is a harmless conceit as long as it doesn't interfere with the use of the encyclopedia. In this case, it does. Practically no one has an n-dash key on their keyboard and only a few know where to find one. This is proved by the fact that the editors who replace hyphens almost always use the "& n d a s h ;" command > "–", which makes text harder to read for editors, rather than a "physical" n-dash "–". Evidently, it is not widely known that the physical n-dash even exists. Because the n-dash character is a hangover from the print industry, and no one has it on their keyboards, everyone who types in a search is going to use the hyphen, e.g. "Eskimo-Aleut languages" rather than "Eskimo–Aleut languages". The result is that every single search of this kind is going to bring up the "Redirected from Eskimo-Aleut languages" message or the like. The reader feels this as a slap in the face, wondering "what did I do wrong?" S/he may then scrutinize the typed message to see if it was mistyped and lose time figuring out an n-dash was required or, more likely, give up in frustration. By this time precious seconds have been wasted, the reader's chain of concentration is likely broken, and their reaction to Wikipedia begins to turn from positive to negative, because we are not anticipating their predictable reactions.
If it was possible to devise a fix whereby searches using hyphens automatically produced the article with n-dash without the "Redirected" message, we would again have a harmless conceit, especially if this fix was automatic and did not require a further effort by the editor, but it would, IMHO, be a waste of time, which is what is scarcest on this planet. In sum, the MOS guideline, if it requires n-dashes in titles, should be changed. Most Wikipedia guidelines are not rigid, recommending that common sense be used and the particular situation considered. If this isn't so here (Kwami, could we have a link to the guideline?), one would want to know why not. Why would this particular bit of typographical traditionalism be allowed to run roughshod over common sense and practicality?
and,
- Wikipedia does not generally follow typesetting principles derived from the print industry but remains close to what people type on their computer screens. For example, it puts a line between paragraphs rather than indenting the first line. The purpose of this I think is to maintain editability, i.e. to make it easy for the average user to edit Wikipedia without special knowledge. This is really one of the last holdouts in the computer world to the era of DOS and other user-editable operating systems, which has since been totally eclipsed by systems that freeze out the user and keep him dependent on a handful of corporate monopolies. It's the last glimpse of a world as it might have been. The mere fact that people don't naturally type an n-dash under any circumstances is a sufficient argument against using it, in my opinion. Why spend all this time and effort putting in n-dashes when almost nobody is ever going to notice whether an n-dash or a hyphen was used or not? As I see it, it would be better to remove the use of n-dashes altogether, thereby making the use of dashes/hyphens consistent with the general principle of Wikipedia typography, namely that it's not an imitation of print typesetting but sacrifices some of its refinements in order to maintain direct contact with the average user. All this in a totally non-dogmatic spirit, I hope it's clear. User:VikSol
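The concern above about hyphenated searches missing en-dashed titles can be illustrated with a minimal sketch. It assumes a hypothetical title index held in a plain dictionary; the names `search_key`, `lookup`, and `TITLES` are invented for this example and do not describe how the site's search or redirects actually work.

```python
import html

# Map en dash, em dash, and minus sign to a plain hyphen (illustrative choice).
DASH_TO_HYPHEN = str.maketrans({"\u2013": "-", "\u2014": "-", "\u2212": "-"})

def search_key(text: str) -> str:
    """Return a dash-insensitive, case-insensitive lookup key."""
    text = html.unescape(text)             # turns "&ndash;" into the real character
    text = text.translate(DASH_TO_HYPHEN)  # fold every dash variant to a hyphen
    return text.casefold()

# Hypothetical article titles; the first uses an en dash (U+2013).
TITLES = ["Eskimo\u2013Aleut languages", "Niger-Congo languages"]
INDEX = {search_key(title): title for title in TITLES}

def lookup(query: str):
    """Find the stored title for a query, whichever dash the user typed."""
    return INDEX.get(search_key(query))

print(lookup("Eskimo-Aleut languages"))
```

With a key like this, a query typed with a hyphen and a title stored with an en dash resolve to the same entry, so the choice of dash in the title would not affect what a reader finds.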
Ah, here's another example which I think cries out for an en dash: Trans-Fly–Bulaka River, as opposed to Eastern Trans-Fly. kwami (talk) 07:52, 10 March 2009 (UTC) kwami (talk) 07:42, 10 March 2009 (UTC)
- If you want to write new text, kwami, with en-dashes, ok, but don't go changing existing text or existing article titles. There are tons more useful things that you could be doing in the linguistics articles other than turning hyphens into en-dashes. That's just a waste of time, IMHO. (Taivo (talk) 02:23, 11 March 2009 (UTC))
- Actually, I've reverted all the changes along the lines of Niger-Congo. kwami (talk) 02:29, 11 March 2009 (UTC)
- These should be counter-reverted; we should not invent usage. Septentrionalis PMAnderson 16:48, 11 March 2009 (UTC)
- No, Pmanderson, these should not be "counter-reverted". Wikipedia is the inventor of usage here, not linguists. Linguists nearly universally use hyphens here now and have generally abandoned en-dashes. Wikipedia should follow the field, not the other way round. And, Kwami, you have a point about readability, but at what point do we end the hyphen/en-dash madness? How about South Bird's Head-Timor-Alor-Pantar, which is composed of South Bird's Head + (Timor + (Alor-Pantar))? Should we then use an em-dash to add another layer of detailed understanding: South Bird's Head—Timor–Alor-Pantar (I don't know if I got the right symbols inserted since their appearance here on the edit page is not the same as their appearance on the article page--another argument for just using hyphens). (Taivo (talk) 19:53, 11 March 2009 (UTC))
- I've reverted to just using en dashes to join multi-word terms. So "Bird's Head–Timor-Alor-Pantar". One en dash to join Bird's Head, with a space in it, to Timor-Alor-Pantar, with hyphens in it. From what I've seen in print, you generally use en dashes for the highest level in the taxonomy, and reduce everything else to hyphens.
- Em dashes would never be used. If we want to be sticklers, the way to join phrases which contain spaces would be with en dashes with spaces. So theoretically we could have "Bird's Head – Timor–Alor-Pantar". However, I don't see any point in doing that, as it doesn't improve legibility, and have reverted the few families where I had spaced en dashes following the MOS. kwami (talk) 21:18, 11 March 2009 (UTC)
- Actually, Taivo, in the classifications I'm familiar with, TAP is not a family with two branches, Timor and Alor-Pantar, but rather one with multiple branches spread over three islands, Timor, Alor, and Pantar. Therefore simple hyphens are all that is needed: Timor-Alor-Pantar. Where the en dash would come in is in "West Timor–Alor-Pantar", as it specifies that "West" applies only to Timor, not to the whole of *Timor-Alor-Pantar. Omitting the en dash would imply that it might contrast with *East Timor-Alor-Pantar, rather than with East Timor, as it actually does. kwami (talk) 21:26, 11 March 2009 (UTC)
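The joining rule kwami describes in the comments above (a hyphen between simple names, an en dash once any component already contains a space or a hyphen) can be sketched as follows. This is only an illustration of that one proposal, not MOS wording, and the function names are invented for the example.

```python
EN_DASH = "\u2013"

def is_compound(term: str) -> bool:
    """A component counts as compound if it is multi-word or already hyphenated."""
    return " " in term or "-" in term

def join_terms(*terms: str) -> str:
    """Join name components with a hyphen, or an en dash if any is compound."""
    joiner = EN_DASH if any(is_compound(t) for t in terms) else "-"
    return joiner.join(terms)

print(join_terms("Niger", "Congo"))             # hyphen: both parts are simple
print(join_terms("Timor", "Alor", "Pantar"))    # hyphens: flat list of islands
print(join_terms("West Timor", "Alor-Pantar"))  # en dash: "West" binds to Timor only
print(join_terms("Trans-Fly", "Bulaka River"))  # en dash: both parts are compound
```

The sketch deliberately uses a single joiner per name, matching the remark that the en dash marks the highest level and everything below it is reduced to hyphens; it does not attempt the nested A+(B+(C+D)) case raised by Taivo.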
- (Makasai-)Alor-Pantar is contrasted with ungrouped languages of Timor in Ethnologue, Ruhlen, and International Encyclopedia of Linguistics (which all follow Wurm's classification). In fact, none of my references have Alor-Pantar ungrouped--all group them together against the languages of Timor. I'd be curious as to who isn't following Wurm's lead in this particular grouping. But the point still remains--if you want to use en-dashes for clarity, then you must use an em-dash when you have A+(B+(C+D)). The journals, however, are still in favor of hyphens all the way. OK, I just decided to look at a journal that I don't subscribe to and found a real mess--Oceanic Linguistics. I love OL and read it on-line regularly, but it's a real mess. In the first article of the Dec 2008 issue I found "Proto-Oceanic" (hyphen), "Central Malayo-Polynesian" (hyphen), "Central-Eastern Malayo-Polynesian" (hyphens), but "South Halmahera–West New Guinea" (en-dash), "Pre–Proto-Oceanic" (en-dash and hyphen), "Proto–Central-Eastern Malayo-Polynesian" (en-dash and hyphens), "Proto–Central Malayo-Polynesian" (en-dash and hyphen), "Proto–Western Malayo-Polynesian" (en-dash and hyphen), and, perversely, "Proto-Eastern-Malayo-Polynesian" (all hyphens-not a typo, but consistently throughout the article). So some "proto-"s have hyphens and some have en-dashes. It also has "Proto–Trans–New Guinea" (en-dashes). In the third article of that issue, I found "Proto-North–Central Vanuatu" (hyphen and en-dash, not a typo, but consistently). Contrast this with the first Squib of that issue which has "Proto–North-Central Vanuatu" (en-dash and hyphen, not a typo, but consistently). In the fourth article, there was "Timor-Alor-Pantar" (all hyphens). My point is that there is no real consistency even when editors of articles subject to typography try to distinguish between en-dashes and hyphens. Wikipedia is not a typeset article and I find it pretentious to think that it is.
And when the same editor in a prestigious journal like Oceanic Linguistics mixes en-dashes and hyphens in the same form in two different articles just proves what a confusion they really are and not the enlightenment you would like them to be. And to think that we are encouraging non-linguists to use en-dashes...... (Taivo (talk) 23:38, 11 March 2009 (UTC))
- Actually, Taivo, you pretty much prove my point.
- In none of your sources is Timor a genetic node apart from Alor-Pantar. Therefore Timor-Alor-Pantar is not an (A+(B+C)) cladistic description, but an (A+B+C) geographic description like Niger-Congo, and hyphens are all that is needed. Your OL citation supports this.
- Good examples from OL, which show that en dashes still are used in the linguistic lit. It's actually not bad at all. There are only a few inconsistencies. "Proto-" vs "Proto–" is a case in point: en dash when prefixed to a hyphenated term, hyphen otherwise. Perfectly consistent except for "Proto-Eastern-Malayo-Polynesian", which we'd expect to be like Proto–Central Malayo-Polynesian. Evidently an author who doesn't use en dashes, but IMO that's still a pretty good batting average. And as noted above, hyphens may be universally substituted for en dashes, so really it's just a stylistic difference, just as placing punctuation inside or outside quotation marks is a matter of style. Same consistency with the other prefixes, Pre– and Trans–.
- "Proto-North–Central Vanuatu" is clearly an error. Perhaps a typo in the custom spell checker? Are you really claiming we must abandon en dashes because you found a typo?
- No, we never use em dashes for compounds. I don't know where you get the idea that we "must" do this. Any sources to support your claim?
- As far as professionals getting it wrong, so what? I bet they misspell words too. Should we abandon standard spelling because the professionals sometimes get it wrong? But the professionals very rarely got it wrong: One abandoned en dashes for all hyphens, which is a stylistic difference, while only one name was actually incorrect, and that in only one of the articles you found it in. So no, I would not agree that this is "a mess", but rather an excellent guide to what we should be doing. kwami (talk)
- Okay, another few cents' worth: (1) What the discussion above shows is the Byzantine complexity of the rules for using n-dashes, along with their variability, not only between English and French etc. usage but within English usage itself (and we haven't even really factored in here the differences between American and British usage, and various schools thereof). If people dripping with graduate degrees, specialized knowledge, and long versed in Wikipedia can't figure it out, who can? Obviously, there is no chance that the average editor coming to Wikipedia for the first time can. (2) It is impossible to devise a practically applicable standard for the use of n-dashes, because there are so many different ways to think about an expression like Proto-Uto-Aztecan and the like. (3) Linguistics provides some ways to think about these problems, and indeed it is the science best placed to do so. (a) What we have here is arguably a conflict between the two poles that govern language change, ease of expression and ease of understanding. E.g. it's easier to assimilate sounds to nearby sounds but after a certain point harder to understand the result. Languages generally arrive at a compromise. (b) The fundamental problem is not of our making and has no solution: it's that English is conflicted about the process of compounding. In German for instance there is never any doubt about whether a word should be compounded or not: Urindogermanisch is one word, unlike its English equivalent, 'Proto-Indo-European' or more literally 'Proto-Indo-Germanic', but on the other hand it's disarticulatable into its component elements, rather like the inflections of an agglutinating language, whereas in English once elements are joined in a word they stay joined, if the compounding has passed the point of hyphenation (or n-dashing). Thus English, unlike German, has all sorts of different levels of compounding, sometimes linked to accent and type of word, sometimes not: e.g.
some manuals of style will tell you to write "the twentieth century" (nominal) but "twentieth-century events" (adjectival). But this principle is not consistently observed, even in principle, i.e. the language is unsettled on its principles of composition. (4) We could use hyphens only in article titles, as in "Na-Dene languages", and n-dashes inside the articles, as in "Na–Dene languages". But the result would be that using a hyphen in the search box and hitting "Go" would find the article title, but clicking on the "Search" button would fail to find the instances with n-dashes inside articles. I haven't actually checked this, and perhaps there is some fix. (5) But - and I strongly agree with Taivo on this - why go to all this effort? Every time someone edits one of these articles, an army of bots must crawl into place, we editors concerned with linguistics issues must drop what we are doing and spring into action, and the poor sap who has used a hyphen sees his edits squashed by some know-it-all (as he is likely to see it). (6) No one can ever hope to master the Byzantine complexities governing the use of hyphens versus n-dashes. To do so would require an army of lawyers, who would be just as productive as the English real estate lawyers of the 19th century who liked to keep lawsuits going for generations (a steady income, don't you know). It is as much as we can do to keep up with the two-way distinction between hyphens and m-dashes, as illustrated by the fact that some people prefer an m-dash as a mark of punctuation, others an n-dash with two spaces (potentially micro-spaces - the Byzantine complexities multiply). (7) It's true that an n-dash can help to differentiate expressions of the A-B + B-C form. But, referring to point (3a) above on the struggle between ease of expression and ease of understanding, is it worth it? (a) Most people are unaware of the distinction and don't notice it, so the benefits are reserved to an elite. 
We could try to educate people about this, but the fact is that spontaneous understanding is not there, and we are not writing for typesetters alone. (b) If we use n-dashes, we must define when they are to be used, and as this entire discussion shows, there is no realistic way to do so. Every attempt to do so founders on the lack of any clear standards in either American or British / Commonwealth usage, and this lack of clear standards is ultimately related to the fluctuating status of compounding in the English language and English orthography. (8) In conclusion, given the practical and real impossibility of defining any standards that are (a) consistent and (b) simple enough to be generally used, I suggest that Misplaced Pages should abandon the use of n-dashes altogether, even in date ranges (where, again, nobody types an n-dash spontaneously), but at a minimum and most definitively in all linguistic names of languages and language families. That is a way to end the confusion, to ensure that searches find what they are looking for, and to simplify the tasks of editors, and it will work. Regards to all, VikSol (talk) 23:53, 11 March 2009 (UTC) PS- I think that Taivo's point above that Misplaced Pages is not typeset deserves to be taken very seriously. For example, footnote numbers and other superscripts bump up the line separation unevenly - hardly esthetic, but tolerated. This is a much more serious defect than any wandering between n-dashes and hyphens, but so far there's no practical fix. We are a long way from the refined standards of the print industry (when it managed to apply these), but then again, democratic culture has its advantages. Why imitate something that is different by nature? VikSol (talk) 00:06, 12 March 2009 (UTC)
- IMO, it's all quite clear. We can discuss which style guidelines are most appropriate to follow, but if we're going to throw up our hands and say "we're confused!", we might as well abandon spellings, pronunciations, calendars, technical terms, and units of measurement we find confusing. The use of the en dash as Taivo illustrated in OL serves a valuable disambiguating function, and IMO it should be preserved. kwami (talk) 00:27, 12 March 2009 (UTC)
- But we are confused! If you are saying, "we now use hyphens, but use n-dashes only to disambiguate", as in A-B–C-D, fine, then we have a simple principle, but one that innovates. If this does not inhibit searches, then it does no great harm. It remains true that no one will notice it, except for a few ultra-professionals. So my point above about "is it worth it?" stands. If you want to try to spell out some clear principles we should follow, then fine, I'll look at them with interest. But at the moment no coherent set of principles is in view. Also, I am not advocating orthographic anarchy, but a clear and simple principle: hyphens in all compounds, n-dashes not at all (preferably) or only in date-range expressions like 1900–1910 (because the usage is relatively well established). A principle simple enough for everyone to follow. The idea is that we sacrifice a little ease in comprehension (the possibility of disambiguating A-B–C-D from A-B-C-D) for a lot of ease of expression. No doubt, there is a real advantage either way. But the relative balance of advantages seems clear to me. VikSol (talk) 01:15, 12 March 2009 (UTC)
En dashes when compounding words which contain spaces or already contain hyphens. En dashes when compounding two people's names, vs. hyphens for compounding the name of a single person. Both pretty standard. Comprehension on the part of the reader trumps ease of input for the editor. Of course, there are situations where we may disagree on usage, but that's no different than differences on capitalization. Just came across an example: An editor abbreviated "East Fijian-Polynesian" as "East" in a table, evidently thinking it meant East (Fijian-Polynesian), when actually it meant (East Fijian)-Polynesian. With an en dash, "East Fijian–Polynesian", the structure of the compound is clear. kwami (talk) 03:01, 12 March 2009 (UTC)
- You ignored one of my main points, Kwami. The same editor used both Proto-North–Central Vanuatu and Proto–North-Central Vanuatu. It was not a typo since each was 100% consistent within the article in which it occurred. The rules are silly and confusing to the point that the same person used two different combinations for the same term on two different days. And don't go all warm and bubbly because one of the journals I regularly read uses en-dashes. The other half-dozen or so that I regularly read do not. Indeed, an increasing amount of linguistics is being published from camera-ready copy and not being typeset at all. It should go without saying that virtually all camera-ready copy is being done with hyphens and not en-dashes. I second everything that VikSol is saying about the needlessness of en-dashes in an on-line, user-edited format such as Misplaced Pages. And your suggestion that this can be done with bots is absolutely ludicrous. Bots are not thinking machines, they are stupid computer programs that don't know anything beyond 1 or 0. In the 1960s, the Air Force and NASA agreed that solid rocket propulsion would be called a "motor" and liquid rocket propulsion would be called an "engine". The next proposal that came out of Thiokol included the mechanical replacement of "engine" with "motor" for consistency. Thus, the buildings were protected by an independent fire department and its "fire motors". I can't tell you how many times I've seen bots do silly things here. You think I'm going to trust a computer program to correctly place en-dashes when I don't trust anyone who doesn't have a linguistics degree? Get real. En-dashes are a relic from an age of typesetting and have no place in Misplaced Pages. (Taivo (talk) 04:33, 12 March 2009 (UTC))
- Ah, I didn't catch that it was the same editor. Still, the fact that all but one term was correct, and even that term was correct in one of the articles, tells me it's not all that difficult for other people. You make it sound as if I'm the only one who understands this. And if someone makes a mistake, so what? This is a wiki, and someone else will come along and correct it, just as they do with capitalization, quotations, and other formatting issues. Most people will continue to write with hyphens, and that's fine. In the infrequent cases where a hyphenated name is ambiguous, we can make it more precise. I never said bots should make the decision, but once an article is moved to a name with an en dash, bots can fix the redirects and other mentions of the name. I don't see what the problem is: precision vs. ease of data entry, and meanwhile it's okay to use the easier form of data entry.
- Anyway, you've convinced me to abandon the more extreme interpretation of when to use en dashes, and there are few linguistic articles which compound multi-word terms. kwami (talk) 09:43, 12 March 2009 (UTC)
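To make the redirect point above concrete, here is a minimal Python sketch. It is an illustration only, not a description of how any actual bot or the wiki software works. The helper names `hyphen_variant` and `redirect_pairs` are hypothetical: the first derives the hyphen-only spelling of an en-dashed title, and the second lists which hyphenated spellings would need to redirect to which en-dashed titles so that someone typing plain hyphens still reaches the article.

```python
EN_DASH = "\u2013"  # the en dash character


def hyphen_variant(title: str) -> str:
    """Return the title with every en dash replaced by a plain hyphen."""
    return title.replace(EN_DASH, "-")


def redirect_pairs(titles):
    """Return (redirect_from, target) pairs for titles that contain an en dash.

    Titles that already use only hyphens need no redirect and are skipped.
    """
    return [(hyphen_variant(t), t) for t in titles if EN_DASH in t]


if __name__ == "__main__":
    for source, target in redirect_pairs(["Indo-European", "Trans\u2013New Guinea"]):
        print(f"{source} -> {target}")
```

The sketch only covers the mechanical part (keeping hyphen spellings reachable); deciding which titles should carry an en dash in the first place remains an editorial judgment.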
- Kwami, I am concerned that you have been replacing hyphens with n-dashes in articles and article titles such as "Proto-Chukotko-Kamchatkan", now changed to "Proto–Chukotko-Kamchatkan". (1) I had the impression that the discussion here was moving toward a consensus, but the principles to be followed have not been spelled out comprehensively. Please have a little more patience with Taivo and me and the other persons concerned until the positions are clearly defined. If some parties then don't get their way, fine, but we should have the principles spelled out clearly as well as the grounds for decision. (2) Adding an n-dash after "Proto" raises two real concerns: (a) As we all agree, it has the drawback of complicating searches, by bringing up a "Redirected" message. (b) A further concern is that, if n-dashes are used after "Proto", this conflicts with the use of n-dashes to disambiguate expressions like Uralo-Indo-European by transforming them into Uralo–Indo-European, since we then have to sometimes speak of Proto–Uralo–Indo-European - two uses of an n-dash that follow conflicting rules, indicating a contradictory and therefore confusing system. I think we should try to achieve consensus here before changing any more article titles. Regards, VikSol (talk) 02:25, 13 March 2009 (UTC)
- Concern noted, VikSol. Not very many articles are affected, so it won't be hard to undo, and I'm quite willing to compromise on the protolanguages.
- An alternative to your "Uralo–Indo-European" example would be "Uralo-Indoeuropean". You're right, with only two levels of conjunction (hyphen and en dash), you can get two en dashes in terms like "Proto–Uralo–Indo-European". I've seen this in print, actually, with "Proto–Trans–New Guinea". As for other hyphenated protolanguages, I can't see that there would actually be much chance of miscomprehension, so a hyphen on proto- wouldn't be problematic. On the other hand, besides being typographically correct, we've seen that en dashes are still used in the linguistic literature for such protolanguages.
- Given that the world's most cited protolanguage, pIE, is (nearly?) always doubly hyphenated, I don't see a problem with deciding to hyphenate prefixes (proto-, macro-, pre-, post-) on all hyphenated family names. There are so few en-dashed names that we can take them on a case-by-case basis. Where I really think that in the interest of clarity we should have en dashes is in families which join two multi-word terms. "Uralo-Indo-European" looks like a tripartite name composed of Uralic, Indic, and European. Since most of these will be extremely obscure names (otherwise someone would have come up with something shorter!), we can't expect people to understand them just through familiarity. kwami (talk) 07:05, 14 March 2009 (UTC)
Taivo: You said "this WP:MOS appears to be a misguided attempt to turn back the clock to a precomputerized typesetting era." However, using that argument would suggest that proportional fonts are unnecessary and we should all just use Courier (with two spaces after periods, no less). Hyphens are sometimes used on computers not because the underlying thinking has changed, but merely due to simple technical limitations. Therefore, typographically correct characters should be used whenever possible. I'm in favor of an en dash in this case. "The is not a typewriter". --Wulf (talk) 05:45, 13 March 2009 (UTC)
- There is a fundamental difference between proportional fonts and en-dashes. One is automatic and the other is not. We don't need to insert any special commands in order to use proportional fonts--the kerning is built into the font. It also does not require the use of any special characters. It takes the characters typed on the keyboard and mechanically spaces them proportionally. An en-dash is a fundamentally different thing--it is a character that is not found on anyone's keyboard. It is a highly specialized creature that (as I illustrated above with the same editor using en-dashes in two different places in the same word) has no real rules of usage outside the world of typography (and even then the rules are arcane and not-well-known). It is not an ASCII character, it is not a character in any phonetic font, it is just a leftover from another era. It is not "the typographically correct" character, it is just an archaic option. (And, BTW, I do use two spaces after periods.) (Taivo (talk) 06:11, 13 March 2009 (UTC))
- Okay, now you're making sense. You just accidentally proposed that MediaWiki convert double hyphens to en dashes. Also, the "leftover from another era" is when we had to cram as many characters as we could into 7-8 bits. Misplaced Pages does not use ASCII, nor does anybody else these days. By your reasoning, we should use the asterisk as a multiplication sign because the true multiplication sign "is not an ASCII character". There was never a sea change in typography, just a comparatively very brief period of technical limitation which we have now passed... --Wulf (talk) 03:52, 14 March 2009 (UTC)
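For illustration, the double-hyphen conversion mentioned above could look roughly like the following Python sketch. This is purely hypothetical: nothing in this discussion indicates that the wiki software does this, and the function name `double_hyphen_to_en_dash` is invented for the example. It assumes that exactly two consecutive hyphens stand for an en dash, and that single hyphens and longer runs are left alone.

```python
import re

EN_DASH = "\u2013"  # the en dash character

# Match exactly two hyphens: not preceded and not followed by another hyphen.
_DOUBLE_HYPHEN = re.compile(r"(?<!-)--(?!-)")


def double_hyphen_to_en_dash(text: str) -> str:
    """Replace each isolated '--' with an en dash; leave '-' and '---' untouched."""
    return _DOUBLE_HYPHEN.sub(EN_DASH, text)


if __name__ == "__main__":
    print(double_hyphen_to_en_dash("the 1914--1918 war"))
    print(double_hyphen_to_en_dash("Indo-European"))
```

A rule like this makes the en dash typeable from an ordinary keyboard, but it does nothing to settle where an en dash belongs, which is the substance of the disagreement here.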
- Fortunately, without your "help", kwami decided that current linguistic usage superseded Misplaced Pages's misplaced efforts at making editing more difficult rather than less difficult. (Taivo (talk) 04:57, 14 March 2009 (UTC))
- Maybe I'm missing something here... Remind me why you believe Ural-Altaic should have a hyphen, yet Bose–Einstein condensate gets to keep its en dash? --Wulf (talk) 08:33, 14 March 2009 (UTC)
- For me, it's simply a matter of disambiguating. "Ural-Altaic" means basically the same thing, whether you read it as one family spread from the Urals to the Altai, or the combined Uralic and Altaic families. Bose-Einstein, however, could be misunderstood as somebody named "Bose Einstein". (Not likely with that name, perhaps, but much more ambiguous with other names.) kwami (talk) 09:57, 14 March 2009 (UTC)
- Because Ural-Altaic (hyphen) is standard usage among linguists and has never had an en-dash in it. Linguistic usage favors hyphens over en-dashes. And, as Kwami says, there's no ambiguity, but ambiguity is not so much a factor in contemporary linguistic usage. Our field uses hyphens generally now and it isn't Misplaced Pages's place to try to impose its will on it. Ural-Altaic is conjunctive, not distributional in nature and most linguists will interpret it as conjunctive. (Taivo (talk) 11:02, 14 March 2009 (UTC))
- Hmm, it took me a while to figure out that Gay-Lussac is one person but Boyle–Mariotte are two, when I was in high school. --80.104.235.34 (talk) 12:32, 14 March 2009 (UTC)
- I'm just not sure how what most people in a particular field happen to use has to do with Misplaced Pages's standardized style manual and the consistent application thereof. Should we now have separate style manuals for each WikiProject? (And, speaking of WikiProjects, shouldn't WikiProject Typography be consulted on this?) --Wulf (talk) 21:37, 14 March 2009 (UTC)
- Yes, that's a good example for this project page. kwami (talk) 21:18, 14 March 2009 (UTC)
en dashes vs hyphens (cont.)
(1) For years, everyone has been happily naming articles "Proto-Indo-European language" and the like and finding them in searches without any difficulty. Thus, established and settled usage on Misplaced Pages is to use hyphens in all names of languages. Kwami has been innovating in changing this established and settled usage. But this usage has never posed the slightest practical problem. Changing it will not increase the encyclopedia's ease of use. It will, on the contrary, decrease it by afflicting users with constant "Redirected from ..." messages, among other problems, including but not limited to increased difficulty of editing and the need to constantly update edits.
It's true the current MOS guidelines can be interpreted to require n-dashes in article titles when they are used in language names. But the more fundamental question is: is it a good idea to do so?
When an n-dash is used in a range of numbers, such as 1914–1918, it is an ideogram, read in practice as “to” in most instances. According to Tcnv above, it disjoins the numbers. When an n-dash is used to write a compound, it is used to conjoin, the opposite usage. Thus, the use of an n-dash in these cases follows a different rule in each case and the two rules are directly contradictory. This is our first warning that we are entering arcane territory here, with no safe footing for the average user of language.
(2) Kwami has flagged the complicated instance of Uralic-Altaic versus Uralo–Altaic and Ural–Altaic. I believe this illustrates the impossibility of arriving at a system simple and logical enough for the average person to utilize.
For Uralic-Altaic, there is no problem. Uralic-Altaic means “Uralic and Altaic”. It is similar to Sanskrit dvandva compounds, a well-known form in linguistics.
For Uralo-Altaic, the issue is more complicated. At first glance, Uralo- looks like a prefix, like proto-, neo-, geo-, turbo-, as well as trans-, pre-, etc. But Indo-European developed the use of -o / -ō as a combination form, followed in this by several of its daughter languages, in some cases by inheritance (e.g. Greek), in others by drift (e.g. Avestan). This is what is going on here. Uralo-Altaic means “Uralic and Altaic”, but the primary suffix -ic has been replaced by the secondary suffix -o. It appears to be a combination of prefix and nominal, but in fact it is a combination of two nominals.
Similarly, in Ural-Altaic, the -ic suffix has been elided in the first element, a procedure well known in languages, rather like gapping in syntax.
Uralic-Altaic, Uralo-Altaic, and Ural-Altaic, then, are all identical in meaning, in spite of first appearances.
As these examples show, the distinction between coordinate forms (as Uralic-Altaic obviously is) and prefixed forms (as Uralo-Altaic appears to be at first glance) is not always easy to draw.
Furthermore, there is no way to tell from the form of the first element what its function is. For example, Indo-European is the language family from which many of the languages of India and Europe are derived — in this case the elements are coordinate — but Indo-Aryan is those forms of Aryan spoken in India — in this case “Indo-” is a prefix qualifying “Aryan”.
As I understand it, because of such issues Kwami has now abandoned the “complicated” version of using n-dashes in favor of a somewhat simpler system, detailed below.
(3) Let me try to sum up the evolving positions. I think there has been and will continue to be some movement in everybody’s position and this is the purpose of the discussion.
Taivo and I, along with various other people (see discussion of “curly quotes” in the section just archived), would ideally like to see n-dashes eliminated from Misplaced Pages and entirely replaced with hyphens both in compound words (including language names, such as Proto-Uralic) and in number ranges (such as 1914–1918). But, above all, we would like to see the existing de facto custom of hyphenating all language names continued.
I am beginning to grasp what Kwami has been trying to get across about the advantages of n-dashes in disambiguation, e.g. Uralo–Indo-European versus Uralo-Indo-European. I think these advantages are real and must be weighed in the balance.
Kwami is taking the view that:
- In names of languages that are compounds, the hyphen is the basic form – the default. Example: Indo-European.
- The hyphen is replaced with the n-dash in several different circumstances:
- When an element is added to a name separated by a space. Example: Trans–New Guinea.
- When an element is added to a name that is already hyphenated, in several specific circumstances:
- When a simplex name is added to a name that is hyphenated. E.g.: Uralo–Indo-European.
- When two hyphenated language names are conjoined. Example: Indo-European–Hamito-Semitic.
- When a prefix is added to a hyphenated name. Example: Proto–Indo-European.
The most important point here is that, as I understand it, Kwami is now advocating a system in which a first compounding is indicated with a hyphen, a second with an n-dash. Thus we get Indo-European, but Uralo–Indo-European and Proto–Indo-European.
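The two-level system summarized above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this summary (hyphen for a first compounding, n-dash once an element already contains a space, a hyphen, or an n-dash); the function name `join_elements` is hypothetical, and the sketch ignores every consideration other than the form of the elements being joined.

```python
EN_DASH = "\u2013"  # the n-dash (en dash) character


def is_compound(element: str) -> bool:
    """An element counts as compound if it contains a space, hyphen, or n-dash."""
    return any(mark in element for mark in (" ", "-", EN_DASH))


def join_elements(left: str, right: str) -> str:
    """Join two name elements: hyphen by default, n-dash if either is compound."""
    joiner = EN_DASH if is_compound(left) or is_compound(right) else "-"
    return f"{left}{joiner}{right}"


if __name__ == "__main__":
    print(join_elements("Indo", "European"))                 # first compounding
    print(join_elements("Uralo", "Indo-European"))           # simplex + hyphenated
    print(join_elements("Trans", "New Guinea"))              # element with a space
    print(join_elements("Indo-European", "Hamito-Semitic"))  # two hyphenated names
```

Because the rule looks only at the form of the elements, it also yields an n-dash in "Proto" plus "Chukotko-Kamchatkan", which is exactly the fluctuation between "Proto-Uralic" and an n-dashed "Proto" on hyphenated family names.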
(4) There are several problems with this.
(a) A fairly serious problem is the fluctuation that results from these principles in prefixing “Proto-”. For example, we get “Proto-Uralic” (hyphen), but “Proto–Chukotko-Kamchatkan” (n-dash). Here there is no advantage whatsoever in disambiguation, since the expressions are totally unambiguous: a proto-language being a single language by definition, there is no possibility of misunderstanding Proto-Chukotko-Kamchatkan as “Proto-Chukotko plus Kamchatkan”.
(b) Another problem is: what do you make of expressions like Pre-Proto-Indo-European, actually fairly frequent in some works? What about Proto-Uralo-Indo-European or Pre-Proto-Uralo-Indo-European? Obviously, we have long since run out of different forms of hyphens and dashes.
There is a simple solution to these arcana and inconsistencies: eliminate — or more precisely continue to avoid — n-dashes and keep using hyphens, as everyone has been doing on Misplaced Pages for years.
VikSol (talk) 22:54, 14 March 2009 (UTC)
- Hyphens are almost always an acceptable substitute for en dashes. As you've pointed out, en dashes are sometimes useful for disambiguation. For me, that's the relevant issue, not legalistic adherence to the guideline. So, for example, per the MOS, pIE should be en-dashed, and with several other protolanguages, en dashes are found in the literature. In the IE lit, however, it's always hyphenated, or nearly always so. Per your point in (4a), there is no ambiguity, so on the balance I'd say we should probably go with hyphens. The MOS after all is a guideline, and we need to take other considerations into account.
- However, with something like Trans–New Guinea or Indo-European–Hamito-Semitic, en dashes are found in the lit, or sometimes there is no established usage, and there is potential ambiguity. Here I think the advantage of en dashes is the overriding factor. Also, there are relatively few such language families, and even fewer have dedicated articles (most are branches intermediate between better-established families which do have articles), so they're not disruptive.
- As for your question in (4b), there are only two levels, hyphen and en dash. Once you reach an en dash, everything from there on out is also an en dash: Proto–Indo-European–Hamito-Semitic, Proto–Trans–New Guinea. (The latter at least attested in the ling lit.) Theoretically you might think we'd need further dab'ing. However, human language is not infinitely recursive. We quickly reach a cognitive processing limit, which IMO is why we don't see many terms where a third level would be useful. In the very few cases where we come across such terms, we could go with the en dashes, or take advantage of acronyms, which is what is generally found in the lit anyways: proto-TNG, pre-proto-TNG, etc.
- I occasionally see hyphens replaced with spaces, as in "Trans New Guinea" and "Meso Philippines". I don't see any advantage to such usage, but maybe someone else here does? kwami (talk) 23:52, 14 March 2009 (UTC)
- I think that the linguists here--kwami, vik-sol, and myself--seem to have come to a workable solution for 99% of all cases--hyphens all the way. Now, the "problems" and "ambiguous cases" that kwami cites are mostly smoke and mirrors since no one talks in the literature about Proto-Uralic-Indo-European in any realistic sense since there are virtually no contexts in which such an artificial formulation would be used. There are probably only a dozen truly ambiguous cases that are actually likely to be used in Misplaced Pages and they are nearly all in New Guinea. We can arm wrestle over each of them if they start to cause problems of interpretation, but since the number of people who are actually ever going to write or edit (or even read) an article on a language of the Trans-Fly-Bulaka River family can be counted on the fingers of one hand (with a few fingers left over), the problem is probably moot. I don't give a hoot about the non-linguistic uses of en-dashes versus hyphens, so the non-linguists who have been involved in this discussion can argue about the Barnes–Noble Paradigm versus the Barnes-Noble Paradigm and I don't really care. (Taivo (talk) 00:59, 15 March 2009 (UTC))
- The only top-level families are Trans–New Guinea, East Bird's Head–Sentani, Ramu–Lower Sepik, and Yele–West New Britain (assuming that's valid), all in New Guinea. There are also some branches of Austronesian such as South Halmahera–West New Guinea, also mostly in NG, or at least in Melanesia. kwami (talk) 01:19, 15 March 2009 (UTC)
- But I honestly don't think there's any real ambiguity in any of these. I find the use of an en-dash after a prefix especially inappropriate (trans-, proto-, pre-). But, in actuality, these are very minor issues since the use of any of these is so rare (except, perhaps, for Trans-New Guinea). (Taivo (talk) 01:44, 15 March 2009 (UTC))
- I think it's telling that the trend for TNG seems to be writing it as three words, "Trans New Guinea", despite trans being a prefix. Maybe people object to treating "trans-new" as if it were a unit? And South Halmahera–West New Guinea is difficult to parse with just a hyphen. kwami (talk) 01:58, 15 March 2009 (UTC)
- Do you have any published sources for this trend? (Taivo (talk) 03:43, 15 March 2009 (UTC))
- Again, what does this have to do with linguistics/linguists? I see this as a simple typography issue... You'll notice that the punctuation, hyphen and dash articles all belong to Category:Typography -- not Category:Linguistics or anything related. --Wulf (talk) 03:59, 15 March 2009 (UTC)
- Wulf, Thanks for the links, they are most useful. I quote from the article Dash:
- ====Usage guidelines====
- The en dash is used instead of a hyphen in compound adjectives for which neither part of the adjective modifies the other. That is, when each is modifying the noun. This is common in science, when names compose an adjective as in Bose–Einstein condensate. Compare this with "award-winning novel" in which "award" modifies "winning" and together they modify "novel". Contrast "Franco-Prussian War", "Anglo-Saxon", etc., in which the first element does not strictly modify the second, but a hyphen is still normally used. The Chicago Manual of Style recognizes but does not mandate this usage and uses a hyphen in Bose-Einstein condensate.
- Thus, "Bose–Einstein", taken as a supposedly unshakable example of the use of an n-dash, is contradicted by the most prestigious manual of all, The Chicago Manual of Style. There could be no better example of the confusion that reigns in this area, which we must not inflict on Misplaced Pages users. VikSol (talk) 04:59, 15 March 2009 (UTC)
Let me try to characterize the discussion to this point:
(1) There is consensus that the prefix “Proto-” does not need to be n-dashed, since there is no ambiguity. I will hazard that the same principle would apply to “Pre-”, as in the title of Winfrid P. Lehmann’s book “Pre-Indo-European”.
(2) The next question to consider, I think, is whether this principle applies to all prefixes, or to these prefixes only? I suggest it should apply to all prefixes, on these grounds:
- The treatment of prefixes should be consistent, as much as practical.
- Prefixes, by their nature, do not give rise to ambiguities of the type “Indo-Germanic-Semitic” (which I recently had to use to translate Hermann Möller’s indogermanisch-semitisch).
- The confusion of a prefix with a language name is nonexistent or so rare as to be unimportant. As far as I know there are no languages named Pre, Trans, Macro, or any other letter combination identical to an English prefix. (I did once wonder whether Macro-Ge involved a language called Macro, but one gets beyond such things.)
In consequence, all prefixes should be hyphenated, since they do not involve ambiguity.
(3) But what of the case where the prefixed expression involves two separate words, as in Trans–New Guinea? Here there does appear to be a frequent usage of an n-dash. However, with regard to Trans–New Guinea, it seems to me that, “Trans-” being simply a prefix like “Proto-” and “Pre-”, and no more ambiguous than them, there is no reason to n-dash it simply because the following words are not hyphenated.
(4) This leaves the case of disambiguation, but let’s leave that for later.
My suggestion, then, is that we adopt the principle that a hyphen should follow all prefixes in language names. Examples: Proto-Indo-European, Pre-Indo-European, Pre-Proto-Indo-European, Macro-Ge, Trans-Eurasian, Trans-New Guinea.
VikSol (talk) 04:37, 15 March 2009 (UTC)
Concluding remarks (?)
(1) If I am not mistaken, there is now consensus that hyphens should be used after all prefixes. (Assuming my argument above about forms like Trans-New Guinea is accepted.)
We could provide a more linguistically precise definition of “prefixes” here, but this does not seem to be of immediate relevance.
(2) The remaining issue on the table is disambiguation.
Taivo and I have signaled that we will not fight this one to the bitter end.
There is general agreement that the decision depends on balancing competing considerations. I will try to sum these up.
(3) There is a genuine advantage to the use of an n-dash to disambiguate terms. The forms concerned are primarily:
A + B-C. Example: Uralo–Indo-European.
A-B + C. Example: Indo-Germanic–Semitic.
A-B + C-D. Example: Indo-European–Hamito-Semitic.
Also theoretically possible and sometimes really occurring are such forms as:
A + B-C + D. Example: Korean–Japanese-Ryukyuan–Ainu. (Made-up term, discussed below.)
A B + C. Example: East Fijian–Polynesian.
A (B) + C D. Example: North–Central Vanuatu.
A (B-C) + D E-F. Example: Central–Eastern Malayo-Polynesian.
etc.
(4) Let me point out that many of the so-called ambiguous forms are not that ambiguous when closely considered. The language is pretty smart and already has built-in ways to avoid ambiguity. In particular, most of the compound junctures are disambiguated – in the spoken language itself – by the combination form -o or the use of English terms like "West" which could never (as a practical matter) constitute a language name. So actually such forms as South Halmahera-West New Guinea, North-Central Vanuatu, and Central-Eastern Malayo-Polynesian are not ambiguous at all.
What happens in such cases is that we get again into the hair-splitting we encountered in such series as Uralic-Altaic, Uralo-Altaic, and Ural-Altaic. The grounds for deciding whether a hyphen or an n-dash is needed are so obscure, subject to individual interpretation, and hypertechnical that no non-linguist can reasonably be expected to grasp them all, and no two linguists may agree on all interpretations.
In other words, a disambiguation that produces ambiguity is no progress.
(5) In other cases, solutions may be possible short of the use of an n-dash. For example, some forms can be disambiguated by combining them, a possibility Kwami has raised above. For example, we could write Indogermanic-Semitic rather than Indo-Germanic-Semitic. This is justified by usage fairly often. For example, the terms Afroasiatic and Afro-Asiatic are both in current use.
Other forms can be avoided in practice. For example, some linguists prefer Uralo-Indo-European to Indo-Uralic, but the shorter term is much more prevalent. Joseph Greenberg spoke of Japanese-Ryukyuan, but Korean-Japanese-Ainu, presumably to avoid a lengthy and ambiguous term. Indo-European–Hamito-Semitic could be abbreviated to Indo-Semitic.
This raises the reflection that the very complex names tend to be reserved for new proposals and controversial groupings. When a language family is well established it tends to get a simple name, for obvious practical reasons: it’s simpler to work with and linguists already know what languages it groups. I note in Kwami’s list of language families (Template:Language families) that none of the established upper-level families have very complex names – at most something like Yele–West New Britain.
Usually, a new or controversial proposal will not get its own article but will be explained in some other context, e.g. Korean-Japanese-Ainu is explained under “Altaic languages” and “Classification of Japanese”. The famous but controversial proposals already have short names, e.g. Nostratic and Amerind.
What I am trying to get at is that the ambiguity problem is one of very limited scope, so limited that the occasional useful n-dash is likely to puzzle people, since they will have encountered it so rarely.
(6) Other objections may be catalogued as follows.
- Most people do not know that such a character as an n-dash exists. I discussed this whole set of issues with one of the top legal draftsmen in the country, a Harvard JD, who had never heard of n-dashes. This is after twenty years of work in a field that demands extreme precision of language. A character that is not recognized by probably over 99% of readers does not disambiguate anything. It just gives the impression the typography is inconsistent (even when it’s not). And the few who recognize it, such as Kwami, already know perfectly well what the expressions mean.
- The latest edition of The Chicago Manual of Style gives increased preference to hyphens over n-dashes and is also concerned to adjust typography to the computer era. I think these things are not an accident and that close scrutiny of the manual would reveal that it is because of the computer era that the n-dash is falling from favor.
- The issue of searches is of great importance. When n-dashes were adopted, there was no way to search a text electronically. It did not matter to a reader glancing across pages, flipping through a book, or reading down the columns of an index whether the typesetter had used hyphens or n-dashes. Today it does. Yes, our computers are dumb and can’t even do accents properly. But this is the way things are. I am not sure that the cognitive dissonance provoked by adding an n-dash key to the computer keyboard would be worth it. Compare Martinet’s Economie des changements phonétiques on the disadvantages of having too many phonemes that are too similar.
Obviously, we cannot have forms with hyphens in titles and forms with n-dashes in the text. It’s all or nothing.
(7) My sense is that, given the minimal advantages of disambiguation in practice, the rarity of the character that would result, the practical impossibility of defining usable criteria, and most crucially the issue of searches, on balance the n-dash should be avoided in language names. Let the physicists sort out whether they want to use “Bose–Einstein” or, per the new Chicago Manual of Style, “Bose-Einstein” (see Dash).
I too like the advantages of being able to disambiguate Indo-Germano-Semitic and similar expressions. But there are workarounds and these may be preferable to adopting a character of rare application, obscure usage, and diminishing currency.
VikSol (talk) 21:24, 15 March 2009 (UTC)
- I came late to the game. For technical language such as linguistics or physics, I would think the Misplaced Pages MoS should defer to the technical language, so it should be Bose–Einstein condensate but if linguists really don't care about the en-dash–hyphen distinction, then technical linguistic terms should appear as they do in the linguistics literature.
- I have to agree with Wulf that CMS15 has much more to do with the dark ages of computer typography than it does about what formal published work should contain. I think it's notable that the fields that are particular about their dashes—math, physics, and computer science—are the fields that have had access to powerful typesetting software (TeX/LaTeX) the longest. I would argue that Misplaced Pages should do what it can to make typographically beautiful articles even if that means typography geeks like ourselves are running around putting in directional quote marks and en-dashes.
- As for searches, I think that's the job of the search engine and of redirect pages. Google doesn't have any problem with a search for "Bose-Einstein condensate" (although Google does find the page with the hyphen that redirects to the page with the en-dash). —Ben FrantzDale (talk) 23:35, 15 March 2009 (UTC)
- Just a comment--Misplaced Pages is not typeset and never will be because it would never pass peer-review. As careful as we specialists are with individual articles, this is still a user-edited document, and, as such is a computer-only thing. Therefore, following Viksol's admonishment that this should be easy for computer searches is paramount. Leave out the en-dashes and directional quotation marks because they are just arrogance. (Taivo (talk) 00:52, 16 March 2009 (UTC))
- Where do you keep getting this idea that typesetting == letterpress or something? As the relevant Misplaced Pages article opens, “typesetting involves the presentation of textual material in graphic form…”. As far as Misplaced Pages being a “user-edited… computer-only thing”, must I mention Misplaced Pages:Books – or the article in Nature which compared Misplaced Pages with the Encyclopedia Britannica? (Although what does peer review or being in print have to do with good typography anyway?) You also complained about searching, but – as Ben had already said – Google already handles it fine. There is also a bug filed with Mozilla regarding the find bar, and other browsers will assuredly follow shortly. MediaWiki already utilizes Unicode normalization, and it would be fairly straightforward to normalize searches as well (which would mean that Unicode characters and their equivalents would be properly treated as just that – equivalent).
- Oh, and you’ll notice that this post is written using only proper, semantic, typographically-correct characters – all of which were entered using the buttons below every Misplaced Pages edit box. Proper punctuation and typography are no more arrogant than proper spelling and grammar. —Wulf (talk) 03:38, 17 March 2009 (UTC)
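The idea of normalizing searches so that a hyphen and an en dash match each other can be sketched as follows. This is a minimal illustration, not MediaWiki's or any browser's actual code. The names `DASH_FOLD`, `fold_for_search` and `matches` are hypothetical, and the mapping is written out explicitly on the assumption that the standard Unicode normalization forms do not by themselves turn an en dash into a hyphen.

```python
import unicodedata

# Hypothetical folding table: dash-like characters mapped to ASCII hyphen-minus.
DASH_FOLD = str.maketrans({
    "\u2010": "-",  # hyphen
    "\u2011": "-",  # non-breaking hyphen
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
})


def fold_for_search(text: str) -> str:
    """Return a comparison key: NFC-normalized, dashes folded, case-folded."""
    return unicodedata.normalize("NFC", text).translate(DASH_FOLD).casefold()


def matches(query: str, title: str) -> bool:
    """True if the folded query occurs inside the folded title."""
    return fold_for_search(query) in fold_for_search(title)


# A query typed with a plain hyphen finds a title written with an en dash.
print(matches("Bose-Einstein condensate", "Bose\u2013Einstein condensate"))
```

Folding only at comparison time leaves the stored text untouched, so the typography on the page and the behaviour of the search box can be decided separately.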
- The issue has been resolved for linguists. You can believe the conceit, Wulf, that Misplaced Pages is on a par with EB, but there's not a college professor that I know who will accept it as a legitimate source for a term paper. (Taivo (talk) 04:56, 17 March 2009 (UTC))
- Misplaced Pages is not “a legitimate source for a term paper” because it’s a tertiary source, not so much because it’s unreliable (not that it is). But that’s why we have a thorough citation system. I have to ask: if you think so little of Misplaced Pages, why bother contributing? —Wulf (talk) 08:58, 17 March 2009 (UTC)
- Great post, and thanks for pointing out the connection between availability of powerful typesetting software and being particular about dashes and such. I’d never noticed that. —Wulf (talk) 03:38, 17 March 2009 (UTC)
If it makes no practical difference which form we use, then it doesn't really matter. But when using a hyphen is misleading, I don't think that we should dumb down an article because we think our readers won't understand what a dash is. That's the same argument people make for abandoning the IPA: spelling pronunciations are precise enough for our purposes, my dictionary doesn't use the IPA, it's too much to ask people to learn just to use Misplaced Pages, etc. If linguists expect others to learn the IPA in order to figure how to pronounce the name of a moon, a literary character, or a chemical element, then I don't see why others can't expect linguists to learn basic punctuation. kwami (talk) 09:24, 17 March 2009 (UTC)
- Usage, my friend--description and not prescription. That's the key to linguistics and why using hyphens to describe our field is much more important than trying to impose an en-dash where all our colleagues use hyphens. That's been my point all along--linguistic usage is hyphens all the way. VikSol has more subtle (and just as valid) arguments, but my principal point has always been usage takes precedence over all other factors. And Wulf's point, that Misplaced Pages is a tertiary source, makes usage in the primary and secondary sources all the more important since a tertiary source should never impose its will upon the more important sources. So, since usage is most important, and since the vast majority of primary and secondary sources use only hyphens.... (Taivo (talk) 12:10, 17 March 2009 (UTC))
- One thing I still don't understand is how using an n-dash in an expression like Trans–New Guinea instead of Trans-New Guinea disambiguates anything.
- If the purpose of an n-dash is to express a higher level of separation, then Trans–New Guinea means a language called "Trans" plus a language called "New Guinea".
- I guess the idea is that the space in New Guinea represents a higher level of separation than the hyphen in Mixe-Zoque, so the space needs to be trumped by a still higher level of separation, that of an n-dash. But an n-dash, being a conjoining symbol here, like the hyphen, logically indicates a lower level of separation than a space. The poor reader has nowhere to turn:
- If he guesses the n-dash is a disjoiner, as in "the evolution–creation debate", then he reads the expression as meaning "the Trans language plus the New Guinea language".
- If he guesses the n-dash is a conjoiner, and therefore weaker than a space, then he reads "the Trans-New language of Guinea" or "the Trans-New form of the Guinea language".
- If he is aware of the principle that prefixes receive hyphens, not n-dashes (on which Taivo, Kwami, and I have reached consensus for expressions where no space occurs), he expects Trans-New Guinea, and wonders why an n-dash was used instead of a hyphen.
- I think this usage just adds a layer of confusion and should be dropped. I am not speaking here to the merits of cases where the n-dash really disambiguates. VikSol (talk) 23:25, 17 March 2009 (UTC)
- TNG is not named for a language, but for a geographic area, like Niger-Congo. It is trans-(New Guinea). With a hyphen, it would imply (trans-new) Guinea, which is on the wrong continent. The use of an en dash when joining hyphenated or interspaced terms is a basic rule of punctuation. True, we can substitute a hyphen without much loss of comprehension. But then we could also drop capitalization without much loss of comprehension: trans-newguinea. That doesn't mean we should.
- As for Taivo's point, you're proposing that we use a different system of punctuation for each field of knowledge in an attempt to remain authentic to the lit, which would be a complete mess. This is just punctuation. True, the literature should be considered, but we should come up with one standard for wikipedia. Most linguistics sources use seriffed fonts too. Should we force all linguistics articles to display with seriffed fonts in an attempt to be authentic? kwami (talk) 00:06, 18 March 2009 (UTC)
Actually, the WP:MOS already specifies that very thing:
- An overriding principle on Misplaced Pages is that style and formatting should be applied consistently within articles, though not necessarily throughout the encyclopedia as a whole. One way of presenting information may be as good as another is, but consistency within articles promotes clarity and cohesion.
- The Arbitration Committee has ruled that the Manual of Style is not binding, that editors should not change an article from one guideline-defined style to another without a substantial reason unrelated to mere choice of style, and that revert-warring over optional styles is unacceptable.
- Where there is disagreement over which style to use in an article, defer to the style used by the first major contributor.
Just quoting what the WP:MOS already says. (Taivo (talk) 03:46, 18 March 2009 (UTC))
- …How does that quote back you up here? Nowhere does it advocate a different style for each field of knowledge. It just says we shouldn’t modify existing articles back and forth as the guide changes – nothing about it not being ideal to have consistent styling across the entire site. If we were to have different styling for each field, then there would be no central Manual of Style (its duties being performed by a myriad of WikiProject subpages). There’s a bit of flawed logic in your mere inclusion of that quote in the first place, but the circular nature of it would require a rather long explanation… Suffice it to say I think that quote has no relevance in this discussion. —Wulf (talk) 00:20, 23 March 2009 (UTC)
- Good riddance to a "central Manual of Style". Yes, each field should be allowed to practice its own habits within Misplaced Pages. Otherwise, it is falsification of data to change a style just because some dilettante in Misplaced Pages with no experience in the field desires it. The quote clearly states that the style of the original editor has priority. (Taivo (talk) 04:28, 23 March 2009 (UTC))
- Bit of a problem with that… There is a manual of style, and I seriously doubt it’ll ever go away. It may give preference to original article styles, but page titles are far more important and should be uniform. Besides, it seems the quote you are using regarding consistency within articles is more about American vs. British English than mandated proper typography. For example, see the most recent reference given for the relevant section in the MoS:
Misplaced Pages does not mandate styles in many different areas; these include (but are not limited to) American vs. British spelling, date formats, and citation style. Where Misplaced Pages does not mandate a specific style, editors should not attempt to convert Misplaced Pages to their own preferred style, nor should they edit articles for the sole purpose of converting them to their preferred style, or removing examples of, or references to, styles which they dislike.
- Misplaced Pages:Naming conventions#Special characters says “for the use of hyphens and dashes in page names, see Manual of Style (dashes)”, which says “when naming an article, a hyphen is not used as a substitute for an en dash that properly belongs in the title…”. Furthermore, Dash says “the en dash is used instead of a hyphen in compound adjectives for which neither part of the adjective modifies the other.”. —Wulf (talk) 14:05, 23 March 2009 (UTC)
Re "This discussion is in need of attention from an expert on the subject", please bear with an intrusion from a newcomer working in a different research community.
From the foregoing discussion it seems plain that there can be no such expert, in any universal sense, because it seems plain that different communities have different en-dash conventions.
For instance, in my own community (mainly classical physics, mathematical physics, and climate research) there is an obsolescent convention of the "Bose--Einstein" sort.
The trend represented by recent editions of the Chicago Style Manual is also exhibited by some of the recently founded online journals in my field, specifically, those of the European Geosciences Union. They all follow the suggestion that in-text en dashes should all be replaced by hyphens. Surely that's the way of the future.
The tiny minority of readers who care about the difference can easily think of some of the hyphens as "really" being en dashes.
I agree that the issue of searches is of great importance... EdgeworthMcIntyre (talk) 18:08, 28 March 2009 (UTC)
PS: Misplaced Pages could do a great service to humanity by slightly redesigning its typography such that en dashes look exactly the same as hyphens. It would be easy to make hyphens a touch longer and thinner, and en dashes a touch shorter and thicker. In the computer code they could all be hyphens, helping toward bug-free searches.
It would be wonderful if everyone's opinion as to how to arrange hyphens and en dashes were equally well served by what's on the screen. Indeed, perception psychology ("categorical perception" etc) tells us that those with the strongest opinions might well, in fact, see the particular arrangement they like. EdgeworthMcIntyre (talk) 11:33, 29 March 2009 (UTC)
- Well, another solution would be a major military conflict over the issue of dashes....... Michael Hardy (talk) 13:30, 29 March 2009 (UTC)
- A Hyphen War, perhaps? ^^ —Wulf (talk) 15:36, 29 March 2009 (UTC)
- I'm fine with tweaking MOS to be more tolerant of hyphens. - Dan Dank55 (push to talk) 04:09, 31 March 2009 (UTC)
- Ditto. — SMcCandlish ‹(-¿-)› 05:31, 14 April 2009 (UTC)
- Who on earth is even going to notice the difference between a hyphen and an en-dash? This really is just such a waste of time. Much more offensive, to me, is the statement in the MOS that an EM dash should not have a space either side of it. I think it should, and then it doesn't matter whether what is written between the spaces is an em dash, an en dash, or a hyphen. Alarics (talk) 21:28, 12 April 2009 (UTC)
- Sometimes it does, sometimes it doesn't, depending upon usage. — SMcCandlish ‹(-¿-)› 05:31, 14 April 2009 (UTC)
- Check a style guide. Em dashes should technically have hair spaces, which is impractical on computers (more so than dashes). With proportional (i.e. not monospaced) fonts, the hair spaces are (at least in theory) taken care of by the font. However, I seem to recall one style guide recommending en dashes over em dashes just to avoid all the confusion over it, as nobody argues for a lack of spaces around en dashes. —Wulf (talk) 18:06, 22 April 2009 (UTC)
Moving forward again
This discussion is getting longwinded and off on tangents. For my part I am (surprisingly?) in agreement with Septentrionalis/PMAnderson that MOS is being overly prescriptive rather than descriptive on some aspects of this issue, and also agree with many related points raised by Tcncv and Taivo. I also agree with much of what MOS says about use of en-dashes in the sense of "to" or "through", as in "1990–1998", as well as the juxtapositional use as in "Canada–UK relations", but feel as many do here that "Ural–Altaic" is taking it too far. I think such a usage is a misconstruance of the purpose of en-dashes, to the extent there is any (including off-Misplaced Pages) consensus on their use to begin with. So, the question before us is what should MOS say on the matter? I think we need to refocus on what we can come to consensus on that en-dashes are actually useful for (with reference to an overall sense of what off-WP style guides say), and reformulate from there. I would suggest that deference is generally given to the hyphen, based on a preponderance of external evidence, from post-Internet communication styles, to current academic journals, and so on. — SMcCandlish ‹(-¿-)› 05:31, 14 April 2009 (UTC)
- Coming late to this debate, that sounds good to me. (I am currently engaged in a discussion over Weaire-Phelan structure vs. Weaire–Phelan structure). The Chicago Manual of Style and most search engine hits are on my side, but WP:ENDASH is against me. The Dash article says that "A 'simple' compound used as an adjective is written with a hyphen; at least one authority considers name pairs, as in the Taft-Hartley Act to be 'simple', while most consider an en dash appropriate there." That "most" seems highly suspect in the light of this discussion, and I really do wonder whether that missing citation actually exists. If it doesn't, then WP:MOS should surely be revised to follow the real world. -- Cheers, Steelpillow (Talk) 20:35, 2 May 2009 (UTC)
- Also coming lamentably late to the discussion, but wanted to harp a bit on the notion that establishing a (good) style guide, unlike doing (good) linguistics, is always a fundamentally prescriptive endeavor. That said, style is not prescriptive in the sense that using a given feature (hyphen, en dash, whatever) is the ultimately correct way of using a language, but insofar as consistency in presentation is desirable for a given publication (in this case, Misplaced Pages as a whole). While it is certainly in Misplaced Pages's best interest to reflect the informed usage of experts, it is a confusion of logical levels to refer to such as "descriptivism", at least when it comes to establishing style.
- While I have extreme respect for linguists and linguistic expertise, linguists are not the experts to which we should ultimately defer in matters of typographical style. Scientists' work involves technical distinctions within their fields, not those of typography, and (more to the point) scholars publish in journals, which have their own styles and their own economy of production. In my editorial experience, periodical literature (including many, but not all, scientific and medical journals) tends away from hyphen/en-dash distinctions, whereas you'll find them quite closely observed in other contexts and fields.
- To cite the typographical tendencies of experts in a specific field, especially when using their technical vocabulary, as evidence to call for a specific stylistic usage in a broad context (WP) is a little mixed up. To say "linguistic usage always prevails" strikes me as a creeping prescriptivism of the wrong kind – although a linguist's expertise regarding the geographical extent of a language family (and thus whether the terms Niger-Congo and Ural-Altaic are disjunctive or otherwise) is most welcome in deciding on how to implement the local typographical style. But to that end, the linguist is the expert on the language family and the substantive aspects of its naming, not necessarily on how the name is punctuated. And to call a style that differs from the specialist's usage an attempt to "reform" the language misses the mark.
- Of course, the point of style is to give coherence and consistency, deviations from which can detract from the publication's voice (in this case, an encyclopedic voice). I do think that en dashes provide useful distinctions in formatted text and shouldn't be tossed away as some archaism based on a subset of formatting conventions. When it comes to specifics, I pretty much agree with SMcCandlish's position above, but wanted to point out my reservations about some of the discussion that led to it.
- Just a quick additional comment on the discussion above: hyphen usage with prefixes is only relevant in contexts where the prefix is attached to a capitalized item (Trans-Siberian, but transcontinental), or where hyphenless treatment results in phonological ambiguity (re-elect). Affixes in English are generally run in without space except where circumstances demand otherwise. Sorry if that was already obvious to the group here. /Ninly (talk) 06:38, 8 May 2009 (UTC)
Proposal
I have just checked through some maths books for conjunctive name pairs. Cambridge University Press (e.g. Cromwell's Polyhedra) use en dashes. Allen Lane (e.g. Mlodinow's Euclid's window) use hyphens. The only Dover books I have to hand are older works (e.g. Coxeter's Regular polytopes, 2nd Edn), also using hyphens. I find the same name-pair with hyphen in one book and en dash in another.
En dashes are a pain to maintain. The practical approach for Misplaced Pages is to use hyphens unless there is a clear, referenced usage of en dashes in any particular field - irrespective of the publisher.
I propose to amend WP:MOS accordingly. However I do not know the etiquette - should I just do so, or is there a protocol to work through first? Certainly, the discussion aspect has been done to death. Should we vote on it? -- Cheers, Steelpillow (Talk) 09:28, 4 May 2009 (UTC)
- My vote is for. -- Cheers, Steelpillow (Talk) 09:28, 4 May 2009 (UTC)
- Opposed in the interest of stylistic consistency (comments in above subsection) and because I don't think en dashes are a pain to maintain. /Ninly (talk) 06:42, 8 May 2009 (UTC)
- support. ndashes provide no visible benefits to users, while mdashes w/o spaces look too much like hyphenated terms. Both ndashes and mdashes are a pain to use - especially in tools like refTools, where the edit box "Insert" facility doesn't work. As for the 7-char HTML entities that you have to get absolutely right first time otherwise the text becomes a dog's breakfast ...! --Philcha (talk) 09:41, 8 May 2009 (UTC)
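As a side note on the 7-character HTML entities mentioned above, a minimal sketch of converting them to literal characters might look like this. The helper name `expand_dash_entities` is hypothetical and this is not an existing Misplaced Pages tool; it only handles the two named dash entities and leaves everything else alone.

```python
# Hypothetical helper: replace the named dash entities with literal characters.
DASH_ENTITIES = {
    "&ndash;": "\u2013",  # en dash
    "&mdash;": "\u2014",  # em dash
}


def expand_dash_entities(wikitext: str) -> str:
    """Replace &ndash; and &mdash; with the corresponding Unicode characters."""
    for entity, char in DASH_ENTITIES.items():
        wikitext = wikitext.replace(entity, char)
    return wikitext


print(len("&ndash;"))                          # each entity is 7 characters long
print(expand_dash_entities("1990&ndash;1998"))
```

A misspelled entity such as `&ndahs;` would simply be left untouched, which is the failure mode described in the comment above.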
Question on alphabetizing
When alphabetizing a list of names by surname, how are names beginning with "Mc" listed . . . at the beginning of the "M"s or after "Ma"? I have seen it both ways but cannot seem to find the Misplaced Pages convention. Thanks, Alanraywiki (talk) 22:13, 22 April 2009 (UTC)
- There apparently is not a convention for that at the present time, although I discussed the idea of developing a set of conventions with other Wikipedians a few months ago. See the following.
- Interest just might be undergoing a re-awakening, so if you wait awhile, you just might see a convention for your question.
- -- Wavelength (talk) 22:26, 22 April 2009 (UTC)
Before computers, "Mc" was correctly alphabetized by placing it as if it were spelled "Mac." With the advent of computers and their ability to alphabetize, "Mc" started being alphabetized as "Mc," i.e. as if the second letter was “c” and not “a.” Yanq (talk) 07:48, 25 April 2009 (UTC)
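The two orderings described above can be compared with a short sketch. The sort key `mac_key` is a hypothetical name, and the sample surnames are made up for illustration; neither ordering is claimed here to be the Misplaced Pages convention.

```python
def mac_key(surname: str) -> str:
    """Sort key that files 'Mc' as if it were spelled 'Mac'."""
    s = surname.casefold()
    if s.startswith("mc"):
        s = "mac" + s[2:]
    return s


names = ["Mabry", "McDonald", "Madison", "MacArthur", "Mead"]

# Traditional filing: "Mc" treated as "Mac", so McDonald sorts before Madison.
print(sorted(names, key=mac_key))

# Plain letter-by-letter ordering: McDonald sorts after Madison.
print(sorted(names, key=str.casefold))
```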
Why is the “search” function for article titles case sensitive, except for the first word? I don’t see why you would require all other words (beyond the first word) to be case sensitive in a search. If one doesn’t input the correct case, and there is no redirect, then the article won’t be found. Why would you require knowledge of the correct case in a search function? Yanq (talk) 07:57, 25 April 2009 (UTC)
- I thought this "bug" had been fixed some time ago. Can you give an example where it doesn't work?--Kotniski (talk) 08:23, 25 April 2009 (UTC)
I stand corrected. I was relying on Misplaced Pages's documentation, which I believe still states that titles are case sensitive except for the first word. When I have actually tried intentionally using the incorrect case letter of a word, I have been redirected to the correct page (with the page containing a note that I have been redirected there). I must falsely be assuming that someone manually put that particular redirect in place. Perhaps these redirects were created when you "fixed the bug." Yanq (talk) 18:05, 29 April 2009 (UTC)
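The behaviour described in this exchange can be sketched as a two-step lookup. This is only an illustration and not the site's actual search code. It assumes that titles are stored with the first character capitalized and are otherwise case-sensitive, and the function name `find_title` and the sample titles are hypothetical.

```python
from typing import Iterable, Optional


def find_title(query: str, titles: Iterable[str]) -> Optional[str]:
    """Look up a title, tolerating wrong letter case in the query.

    Step 1: exact match after capitalizing the first character.
    Step 2: fall back to a case-insensitive match if it is unambiguous.
    """
    titles = list(titles)
    q = query.strip()
    wanted = q[:1].upper() + q[1:]
    if wanted in titles:
        return wanted
    folded = wanted.casefold()
    hits = [t for t in titles if t.casefold() == folded]
    return hits[0] if len(hits) == 1 else None


titles = ["Manual of Style", "Hyphen War"]
print(find_title("manual of Style", titles))  # found by step 1
print(find_title("manual of style", titles))  # found by the fallback in step 2
```

A fallback of this kind makes manually created case-variant redirects unnecessary for unambiguous titles, which would be consistent with what was observed above.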
Italic titles
I originally started this discussion at the Village Pump, but was referred here. Basically I noticed the article Puijila has an italic title (caused from {{Taxobox name}}). This seems to agree with the MoS. However, the majority of articles eligible (newspapers, films, computer games etc.) don't have an italic title. I think we need to clarify if the actual title (not the prose) should be italicised. This potentially affects thousands of articles and IMO the italicised titles look strange. However I thought I should garner more input before attempting to remove this formatting. Rambo's Revenge (talk) 18:00, 27 April 2009 (UTC)
- There have been a few other discussions at the Village Pump about this. If you haven't already seen them, see Misplaced Pages:Village_pump_(technical)/Archive_56#Italic_titles_for_names and Misplaced Pages:Village_pump_(technical)/Archive_58#Italics_in_article_name. Regards. PC78 (talk) 22:42, 27 April 2009 (UTC)
Letter X vs. multiplication sign ×, part II
After sparse reactions the first time, I have to bring that topic back up, because User:Yankees10 keeps reverting my edits. Is there any consensus here that we use the × symbol for things like "three-time MVP" and "five-time All-Star"? In my mind, using letter x (like "3x MVP") is wrong typography, like separating a parenthetical thought with a hyphen instead of an em dash. --bender235 (talk) 16:15, 28 April 2009 (UTC)
- It says only for a multiplication sign and this is not that. So you are not interpreting it right--Yankees10 16:18, 28 April 2009 (UTC)
- "Times" is multiplication. The symbol should be used rather than the letter ex. The symbol looks better. The article needs an audit for en dashes (I've fixed some). Tony (talk) 16:52, 28 April 2009 (UTC)
- For decisions between using x and using ×, see Misplaced Pages:Manual of Style (dates and numbers)#Common mathematical symbols, point 2. -- Wavelength (talk) 18:22, 28 April 2009 (UTC)
- And this means in this particular case … ? --bender235 (talk) 19:12, 28 April 2009 (UTC)
- If this is prose, not an infobox or caption, the word "time/s" would be better than either. dramatic (talk) 19:21, 28 April 2009 (UTC)
- The reason x is used is to save space--Yankees10 19:24, 28 April 2009 (UTC)
- However, we are talking about infoboxes, so what would you prefer? "3x MVP" or "3× MVP"? --bender235 (talk) 22:38, 28 April 2009 (UTC)
- Right. And the symbol for times is "×". I could see an "x" being used if it were actually pronounced "ex" as in "3x increase in sales". The only exception I could see in this case (being pronounced "times") that would make me think "x" is sensible is if the baseball literature reliably used the letter not the symbol; otherwise I see no reason to use anything but the times symbol. —Ben FrantzDale (talk) 17:06, 30 April 2009 (UTC)
(Unindent) When gauging what constitutes normal typography, one must recognize that some publications lack sophisticated typographic capability, or lacked such sophistication until recently. So when looking to see if the baseball literature uses "x" or "×" one must discount publications that lack the ability to use "×" and writers who have become accustomed to using "x" because it was all that was available until recently. Failure to take these factors into account would be like saying "UM" is a correct abbreviation for micrometer just because computer systems of the 1980's often used printers that lacked lowercase letters, and so many computer printouts from that era used "UM" for micrometer. --Jc3s5h (talk) 17:48, 30 April 2009 (UTC)
- The symbol for micro- is the Greek letter mu (μ). -- Wavelength (talk) 14:27, 2 May 2009 (UTC)
There comes a point when abbreviation obscures, and this reduction in clarity becomes a more major factor than, say, saving space (even though saving space is an important factor). I think perhaps that point has been reached in the current case, whichever symbol is used, and the word itself should be used. PL290 (talk) 09:23, 8 May 2009 (UTC)
RFC on the reform of ArbCom hearings
The attention of all editors is drawn to a Request for Comment on a major issue for the English Misplaced Pages: a package of six proposals to move the ArbCom hearings process away from the loose, expansionary model that has characterised it until now, to a tighter organisational model. The RFC started Tuesday 29 April. Your considered feedback would be appreciated. Tony (talk) 16:20, 28 April 2009 (UTC)
Loft of Guitars
The term relating to a collection of many guitars, attributed to Richard "Dickman" Cripps of the rock group "Nervous", having had a "Loft" full of guitars that they had displayed and photographed for a photo shoot. As far as I am aware there is no name for such a collection of guitars. Although this may sound somewhat inane, many musicians now use this for a collection of many guitars kept in one place. Generally more than three would constitute a "Loft"
Italic text
Many guitarists have extraordinarily large collections of guitars, these would be considered to be a "Loft" of Guitars
--BrianIndian (talk) 22:39, 28 April 2009 (UTC)Brian Indian
RFC 1924 links in headers
Currently MOS:HEAD says "Section names should not normally contain links, especially ones that link only part of the heading". I couldn't figure out how to un-link the automatically-generated link in the section Ascii85#RFC 1924 version. Does MOS:HEAD apply to such RFC links? If so, how do I un-link it? If I overlooked the answer in the archives, my apologies -- please link to the relevant archive. --68.0.124.33 (talk) 05:40, 29 April 2009 (UTC)
- If you need to suppress wiki markup, use the nowiki tag. I've done this for the Ascii85 article. Mindmatrix 13:12, 29 April 2009 (UTC)
- Does this behaviour of MediaWiki seem arbitrary and undesirable to anyone else? Oliphaunt (talk) 21:24, 4 May 2009 (UTC)
- Well yeah. How on earth did it get in? I'm taking it up at WP:VPT. --Kotniski (talk) 06:26, 5 May 2009 (UTC)
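A minimal sketch of the nowiki approach mentioned above, for anyone hitting the same problem. The heading text is taken from the Ascii85 section name; the heading level and the exact markup used in the article are assumptions, not a record of the actual edit.

```wikitext
<!-- Without nowiki, MediaWiki turns "RFC 1924" into an automatic link -->
=== RFC 1924 version ===

<!-- Wrapping the text in nowiki is intended to suppress the automatic link -->
=== <nowiki>RFC 1924</nowiki> version ===
```

The rendered heading should read the same in both cases; only the automatic link differs.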
Question about en dashes
Reading over WP:NDASH, it says that when naming an article, an en dash should be used when it properly belongs in the title. Does this rule apply to categories as well? — Σxplicit 05:55, 2 May 2009 (UTC)
- Yes, or at least it ought to, though there has been argument about it in the past (since there ought to be a redirect from the hyphen form for ease of typing, and category redirects don't work as beautifully as they might). When the devs finally get round to fixing the bug in the category redirect functionality, I assume there won't be any problem with using dashes. --Kotniski (talk) 09:50, 4 May 2009 (UTC)
Use of metaphor and simile
At Quantitative easing, Vexorg (talk · contribs) insists on using metaphors to describe the subject of the article rather than using literally accurate language. He claims that he is just trying to make the article easier to understand (for people with less education in economics). Is the use of metaphor or simile condoned by the Manual of Style? Is there another standard which bears on this topic? Thank you for your help. JRSpriggs (talk) 11:06, 2 May 2009 (UTC)
- Misplaced Pages:Make technical articles accessible is relevant. "out of thin air" is standard UK English, but I don't know its status in other countries. How about "out of nothing"?
- BTW I'd avoid gratuitous Latin like ex nihilo. --Philcha (talk) 15:13, 2 May 2009 (UTC)
- To Philcha: Thank you for the link to WP:MTAA, but I do not see any mention in it of metaphors.
- Please notice that while Suicup (talk · contribs) and Vexorg are only fighting about that one phrase, Vexorg and I are fighting about the whole paragraph which is the lead. My version is:
- Central banks engage in quantitative easing when they increase the monetary base by a pre-determined quantity via open market operations. This new money is injected into the private banking system when the accounts of the vendors of the securities purchased by the central bank through the open market operations are credited. This begins a process to increase the money supply. Quantitative easing is a monetary policy different from the more usual monetary policy of setting a target for a specific interest rate (such as the federal funds rate) and continuously adjusting the amount of money to achieve that target. Central banks switch from interest rate targeting to quantitative easing when the interest rate is zero and they want to ease further.
- Vexorg's version is:
- The term quantitative easing refers to the creation by a central bank of a pre-determined quantity of new money out of 'thin air' as the start of a process to increase the country's money supply. This new money is injected into the private banking system by using it to purchase government securities and crediting the bank accounts of the vendors of those securities (a process called open market operations). Quantitative easing can basically be understood as a method of 'printing money' although today the new money is generally created electronically rather than physically printed. The usual method of increasing the money supply is by decreasing the interest rates at which the central bank lends to private banks and the monetary policy of quantitative easing is usually only applied when the interest is at or close to zero and there's still not enough money in circulation to stimulate the economy.
- As you can see, these are quite different. JRSpriggs (talk) 19:31, 4 May 2009 (UTC)
- IMO the tone of the latter version is somewhat too informal for an encyclopedia, but the former is almost incomprehensible for lay readers, which is against WP:NOT PAPERS. "A theory that you can't explain to a bartender is probably no damn good." I'd go with the second version, but removing the phrase "out of 'thin air'" which is essentially redundant with "create". (If it isn't, I agree that "thin air" is too informal, but how is "ex nihilo" any better than "out of nothing"? They are verbatim translations of each other and this is the English Misplaced Pages, not the Latin one.) --A. di M. (formerly Army1987) — Deeds, not words. 23:06, 4 May 2009 (UTC)
- To A. di M.: Thank you for your comments (although I am disappointed that you did not take my side). Perhaps you could convince Vexorg that "out of 'thin air'" is redundant. I tried but, he neither agreed nor was able to explain (to my understanding) why it was not redundant with "create". JRSpriggs (talk) 00:08, 5 May 2009 (UTC)
- Well, I know very little about economics, and I had never heard the phrase "quantitative easing" before reading this thread (I had once read a rough explanation of how money is created, but I don't remember where it was.) So I won't go discuss the article itself; I just wanted to point out that the first sentence of an article should be understandable by as many people as possible even if it's simplified, whereas more detailed explanations using technical jargon belong to other sections. And using phrases such as "ex nihilo" is a lose-lose situation: it is neither more understandable nor more accurate than "out of nothing": I don't see the point of using them (unless they have a technical meaning which is more specific than the literal meaning of the phrase in Latin). --A. di M. (formerly Army1987) — Deeds, not words. 10:23, 5 May 2009 (UTC)
- WP:MTAA says "Use analogies ...".
- I suggested "out of nothing" for the reasons mentioned by A. di M..
- For what it's worth, I think Vexorg's version is much easier for non-specialists to understand (PS economics was one of the subjects I studied at university). Even so, I'd add one further explanation: the purpose of all this is to mitigate a recession and avoid a deflationary spiral. --Philcha (talk) 02:17, 5 May 2009 (UTC)
Italicising software and site names
Is it true that it is forbidden to italicise software and site names in Wikipedia? It seems strange to me because I think it is usual in books or newspapers (this question originates from a discussion I had about metamath here: http://en.wikipedia.org/User_talk:CRGreathouse#Metamath) -- fl —Preceding unsigned comment added by 88.175.209.213 (talk) 18:39, 4 May 2009 (UTC)
Capitals for scientific theories? Why?
Currently the MoS says:
Physical and natural laws and parodies of them are capitalized (the Second Law of Thermodynamics, the Theory of Special Relativity, Murphy's Law; but an expert on gravity and relativity, thermodynamic properties, Murphy's famous mock-law).
But in actual use in modern English, only person names and adjectives derived from them are capitalized in such names (the second law of thermodynamics, the theory of special relativity, Murphy's law). Taking a glance at the table of contents of Feynman's Lectures on Physics, it uses "Newton's Third Law" once, but otherwise "The uncertainty principle", "Kepler's laws", "First principles of quantum mechanics", "Maxwell's equations", etc. So, is it OK if I remove that sentence? --A. di M. (formerly Army1987) — Deeds, not words. 20:02, 4 May 2009 (UTC)
- I'd support removing that statement. What is a "name of scientific theory" and what isn't is pretty arbitrary. Writing big bang or Big Bang are both fine (although again, consistency and yadda yadda yadda). Headbomb {κοντριβς – WP Physics} 20:19, 4 May 2009 (UTC)
- "big bang" and "Big Bang" mean different things. The first is a loud, sudden, explosive noise, and the second is the expansion of the Universe from a dense state. --83.253.250.239 (talk) 10:39, 5 May 2009 (UTC)
- My quick and rough, non-natively-English-speaking layman's take on this:
- If it is a name, it should be capitalized. That is: If a phrase is grammatically and contextually treated as being a name, then it is a name (especially if it is well-established usage, and especially if the phrase/name has been specifically coined), and it should be capitalized. But if a phrase is used merely as a description, then it shouldn't be capitalized, perhaps even if that same phrase sometimes functions as a name for what is discussed. But there obviously is a fuzzy zone in-between.
- The quoted part of the MoS could be improved (The names of physical and natural laws ...), but I think it is correct and shouldn't be removed (unless, perhaps, it is redundant). --83.253.250.239 (talk) 10:14, 5 May 2009 (UTC)
- Maybe I agree that "Big Bang" should be capitalized (and, in fact, it often is) to avoid confusion with the literal meaning of "big bang"; but why "Second Law of Thermodynamics" or "Theory of General Relativity"? They're unambiguous, and nobody actually capitalizes them (except in titles and the like). Note that our featured articles General relativity and Introduction to general relativity consistently use a lowercase r, and also lowercase g if it's not at the beginning of a sentence, except in book/chapter titles and the like. --A. di M. (formerly Army1987) — Deeds, not words. 11:21, 5 May 2009 (UTC)
- (Note that I'm just suggesting to remove the sentence, not to replace it with anything else. Without any explicit guidance, I expect that most editors would just copy whichever capitalization is used in the source they're using, which is usually the right thing to do. --A. di M. (formerly Army1987) — Deeds, not words. 11:31, 5 May 2009 (UTC))
- It seems the rule is generally not followed, so it's wrong. E.g. Newton's law of universal gravitation, Murphy's law, Muphry's law (I didn't know this one, nice idea), but Murphy's Law (disambiguation). Big Bang seems to be an exception.
- Yes, just removing the guidance might be the right thing to do. --Hans Adler (talk) 11:34, 5 May 2009 (UTC)
- Newton's Third Law is a proper name; First principles of quantum mechanics isn't (quantum mechanics is debateable, but it is abbreviated QM). Removing the sentence may restore the situation, or may let loose the unthinking editors who will decapitalize everything as "Misplaced Pages policy". Let's see what happens. Septentrionalis PMAnderson 20:17, 5 May 2009 (UTC)
- Huh? Almost all acronyms are spelt in capitals regardless of the case of the full spelling, so "quantum mechanics" isn't any more debateable than "uniform resource locator" (URL) or "digital signal processing" (DSP) for that reason alone. (Or did I misunderstand you?) As for "unthinking editors", is there any other guideline which would otherwise suggest lowercasing "big bang"? As I said above, really unthinking editors would just copy whichever spelling is used in the sources, which is usually the right thing to do. --A. di M. (formerly Army1987) — Deeds, not words. 22:16, 5 May 2009 (UTC)
- Which is why much of this page should be replaced by "do what the sources do". The uncustomary "big bang" will be derived from WP:MOSCAP: Misplaced Pages's house style avoids unnecessary capitalization, frequently repeated there and here; I do not say this is rational, or the intended meaning of MOSCAP, but MOS does not tend to attract rationality. Septentrionalis PMAnderson 00:40, 6 May 2009 (UTC)
- I have ventured to add a paragraph on "do what your sources do". This, even more than the rest of MoS, is a rule of thumb, not to be followed blindly; but I hope this is now clear. Septentrionalis PMAnderson 16:02, 7 May 2009 (UTC)
Capitalisation—use of "The" mid-sentence
I noticed a couple of articles on groups (e.g., The Beatles) where "The" is always capitalised even mid-sentence. This surprised me and I wanted to check MOS but couldn't find a definitive answer. I've previously been accustomed to the same rule as for WP:Manual_of_Style#Institutions, viz:
- "Names of institutions (the University of Sydney, George Brown College) are proper nouns and require capitals. The at the start of a title is not normally capitalized (a degree from the University of Sydney), except where it begins a sentence."
The use of this convention for group names too is supported by a quick search in the wider world, for example, this New York Times article and this one. The Beatles at the start of a sentence, but the Beatles otherwise. I propose we add MOS guidance to this effect, i.e., "The at the start of a title is not normally capitalized (an album by the Beatles), except where it begins a sentence." PL290 (talk) 19:56, 5 May 2009 (UTC)
- Capitalising the definite article inside a sentence is of course totally wrong, but in the Beatles case there has been a long war about this, which was won by The Groupies. --Hans Adler (talk) 20:12, 5 May 2009 (UTC)
- The Groupies (whoever it is you mean, do tell!) don't appear to have cemented winning a war if there's no MOS guidance to that effect... and in any case, does not this idea—that an opinion can prevail and the matter henceforth be forever closed—make a mockery of the whole basis of Misplaced Pages? PL290 (talk) 21:08, 5 May 2009 (UTC)
- By "The Groupies" I mean those editors who identify so much with the Beatles that they insist on following their, or their representatives', absurd capitalisation guidelines instead of common sense and general practice. No, of course this can, and should be, revised at some time, but you may not be aware of how big a thing you are trying to tackle. You should really browse the 20 Talk:The Beatles archives to get an idea of the hostility that you are likely to face. Not quite as big as the date formatting thing. Not quite. --Hans Adler (talk) 21:55, 5 May 2009 (UTC)
As I've stated before, I don't think you can make a uniform rule on this. For some bands the The is genuinely part of the name; for others it isn't. A rule of thumb is that if the noun is plural, the the is probably not part of the name, but if singular (or not a noun at all), then it probably is (contrasting the Beatles with The Who). But I wouldn't be comfortable setting even that in stone. I think this is an issue that's best discussed case-by-case, at the individual-article level. --Trovatore (talk) 21:15, 5 May 2009 (UTC)
- A quick search turned up this site which indicates that the trade name for the group, registered in 1964, is "The Beatles" rather than just "Beatles", so it would appear that The Beatles article is properly named and "The" would be appropriately capitalized elsewhere as part of the name. -- Tcncv (talk) 22:56, 5 May 2009 (UTC)
- Yes. That was the main argument by The Groupies. But see Britannica on "the Beatles". In fact almost nobody capitalises the definite article in this case, and it's even translated into other languages: "les Beatles", "die Beatles" etc. But of course it is absolutely necessary that our Beatles article follows the presumed wishes of The Beatles. --Hans Adler (talk) 23:28, 5 May 2009 (UTC)
- PS: Britannica also write "the Who". --Hans Adler (talk) 23:32, 5 May 2009 (UTC)
- Well, that clearly tends to undermine them as a source; it's obviously The Who. You can't ordinarily use Who without The (though I suppose I can imagine a phrase like Who frontman Roger Daltrey, it's a little forced).
- I would argue that even the current guideline is a little too negative towards capitalizing The. It's certainly correct for the University of California. But it really ought to be The Ohio State University because that's the official proper name. Here I hope I cannot be accused of fannishness — my PhD is from the University of California, and I have no connection with OSU. --Trovatore (talk) 00:09, 6 May 2009 (UTC)
- Misplaced Pages seems to be fairly consistent in preserving title case for leading articles that are part of a proper name. A cursory look at movies such as The Silence of the Lambs and newspaper articles such as The Times shows consistent usage both within the articles and other articles which reference them. In contrast, Financial Times does not have "the" in its title, and this is reflected in the article body and articles that refer to it. (I'm sure there are exceptions – I just fixed a reference to "the The Financial Times".) The specific case of (T/t)he Beatles is listed as an example in Misplaced Pages:Naming conventions (definite and indefinite articles at beginning of name)#Names of bands and groups, which acknowledges the use of leading articles in some proper names. Title case in general seems to have a variety of standards used by numerous publications as discussed in Letter case#Headings and publication titles. There even seems to be some national variety in common usage.
- I agree with Trovatore in that the MoS wording should be fine tuned to give preference to official or registered proper name. -- Tcncv (talk) 00:17, 6 May 2009 (UTC)
- The Ohio State University is exactly the sort of silliness WP:MOSTRADE says we should avoid. If it catches on outside the University itself, fine; let it join mob as a Sturdy Indefensible; but until then we should not encourage it. Septentrionalis PMAnderson 00:47, 6 May 2009 (UTC)
(outdent!) Jumping back in here, as there seems to be confusion about what point is being argued! Yes, "The" is genuinely part of the name of many things ("The University of California", "The New York Times", "The Beatles", and so on). The point that I'm making is: there's an established general practice, which can easily be seen by looking anywhere in the wider world, that the definite article is not capitalised inside a sentence even when it is genuinely part of a name, for example, this New York Times article and this one. I would ask those who doubt this to research it now and produce compelling evidence here to the contrary, but otherwise I propose Misplaced Pages should now set this in stone and should make a uniform rule on it which matches the rule long established in the wider world. PL290 (talk) 01:51, 6 May 2009 (UTC)
- This seems to be the Times style, even for the Who. Septentrionalis PMAnderson 02:02, 6 May 2009 (UTC)
- (ec)It is not part of the name of the University of California. Proof is that the phrase University of California often appears by itself.
- I'm afraid I think you're just wrong on the normative claim. We should indeed use lower case the for the University of California, but we should use upper case for The Who. --Trovatore (talk) 02:03, 6 May 2009 (UTC)
- Citation please: the Times says Pete Townshend still cares about the Who's songs. Septentrionalis PMAnderson 02:05, 6 May 2009 (UTC)
- Asking for a "citation" in this sort of circumstance is kind of silly; this is a discussion about desired style. I'm sure I can find lots of sources that capitalize the The. In general journalistic style has lots of features I would not like to see in Misplaced Pages (for example newspapers almost never use the serial comma). --Trovatore (talk) 02:09, 6 May 2009 (UTC)
- So is the argument for common usage to prevail on a case by case basis? "Beatles" when used as a proper noun in context is commonly recognized as "The Beatles". However a NY Times search for "Hague" yields many articles where "The" in "The Hague" is always capitalized, even in the middle of the sentence. - Tcncv (talk) 02:13, 6 May 2009 (UTC)
- That's my feeling, yes. The Hague is an excellent example; lowercasing the the here would be just wrong, and everyone understands it. The same principle applies to some other proper names, but there is no clear rule for demarcating which ones. --Trovatore (talk) 02:19, 6 May 2009 (UTC)
- Sept's comment above gives an interesting example, writing the Times for the New York Times. I agree with that capitalization for the NYT. However the one from London is The Times — that's its actual name, whereas for the NYT or LAT it isn't; their names are simply New York Times and Los Angeles Times, with the the added when needed for grammatical purposes. --Trovatore (talk) 02:25, 6 May 2009 (UTC)
- Ah, actually maybe I was wrong about the NYT. I was definitely right about the Los Angeles Times though. --Trovatore (talk) 03:02, 6 May 2009 (UTC)
I'd normally not capitalise the the, the The The the being the exception. JIMp talk·cont 08:23, 6 May 2009 (UTC)
- Is there consensus on the general principle of "common usage should prevail on a case by case basis"; if so, we can be quite simple. Septentrionalis PMAnderson 02:15, 7 May 2009 (UTC)
- As we've reached an indication of consensus I'll update the article accordingly. As a side-issue, some merging appears to be due because the current article refers editors to the main capitalization article but in each article there's a lot of pertinent detail not present in the other. I think it may be best to move most or all of the detail to the main capitalization article eventually. Based on current content I judge the current article to be the more appropriate location for the addition pro tem. PL290 (talk) 11:53, 8 May 2009 (UTC)
- Done. PL290 (talk) 11:56, 8 May 2009 (UTC)
- I note this discussion has not been notified to the Beatles Wikiproject, or on Talk:The Beatles or even WP:MUSTARD. An appalling breach of courtesy and protocol. Well, you've sown the breeze.... Rodhullandemu 14:09, 8 May 2009 (UTC)
- Indeed, the consensus at The Beatles is to use T, so you might want to choose another example for now - at least until the fun at Talk:The Beatles settles down. (John User:Jwy talk) 18:27, 8 May 2009 (UTC)
- Replaced with The United Kingdom. Septentrionalis PMAnderson 18:43, 8 May 2009 (UTC)
Capitalization of words within section headings.
One of the rules for section headings refers to how to capitalize words:
Capitalize the first letter of the first word and any proper nouns in headings, but leave the rest in lower case. Thus Rules and regulations, not Rules and Regulations.
This doesn't make sense to me - surely the first sentence means that the capitalization should be "Rules and Regulations". But the second sentence directly contradicts this.
Am I just being dumb? Or was the second sentence meant to read
Thus Rules and Regulations, not Rules And Regulations.
GeoffMacartney (talk) 16:32, 6 May 2009 (UTC)
- No, regulations is not a proper noun here, unless the section header is about a Handbook of Rules and Regulations or some such. The rules and regulations that apply to the subject of the article should be headed as shown. Septentrionalis PMAnderson 17:44, 6 May 2009 (UTC)
Current consensus for adding wikilinks within quotations
I've read the archived discussions and MOS:QUOTE currently states, ".. unless there is good reason to do so". Now, I realize it's not a major issue, and hopefully this isn't too creepy, but I think it would be more helpful if the MoS gave at least one or two examples of what would be an appropriate use of a wikilink inside a direct quotation. Specifically, for future reference, I'd like to know if this particular edit would be an example that is discouraged by the MoS? Or would this be something that is 'reasonable'? In my view this wikilink would be appropriate because it's within a short quotation not attributed to any specific person, and is the only mention of the topic "Pulitzer Prize" anywhere in the article... -- OlEnglish 20:02, 6 May 2009 (UTC)
- Three alternatives here:
- Don't link Pulitzer Prize, which is unhelpful to non-American readers.
- Link as indicated. A Pulitzer Prize winner for commentary,....
- Some construction like A Pulitzer Prize winner for commentary,.... This would be the dead-tree method, but I don't see any great advantage for us over #2. Some would argue that #2 distorts the emphasis and literal accuracy of the quote; but so does #3. Septentrionalis PMAnderson 22:29, 6 May 2009 (UTC)
- Here is a fourth alternative:
- A Pulitzer Prize winner for commentary,.... -- Wavelength (talk) 00:41, 7 May 2009 (UTC)
- Which many will criticize as using WP as a source (and as a bald link); we should be clear when we are using what are in effect internal links. Septentrionalis PMAnderson 02:13, 7 May 2009 (UTC)
If there is a need to link Pulitzer Prize, then (2) is best. If there is no good reason to link it, then no link at all is best.--Toddy1 (talk) 04:24, 7 May 2009 (UTC)
Another option would be to omit "Pulitzer Prize" from the quotation, and structure it around that. WP:QUOTE tells us that quotation should only be used for controversial statements or for a "unique phrase or term". The type of terms we tend to want to link to are, as in this instance, proper names, which are neither unique nor controversial. Instead of 'Johnny said he "walked Bob to the Fourth District Park in a brisk and gentlemanly fashion".', one can write 'Johnny described his walking Bob to the Fourth District Park as "brisk and gentlemanly".' This strategy of paraphrasing can be deployed in many cases. Skomorokh 04:32, 7 May 2009 (UTC)
- I agree with the approaches suggested by Skomorokh and Toddy1, that is, try and rephrase so that the term sought to be linked is not within the quote, but if that proves impractical then go ahead and link it according to option (2). Personally, I don't think adding wikilinks to quotations is confusing to readers because it is clear that such links are highly unlikely to have been in the original source (I can't really think of a situation where they might be). — Cheers, JackLee 05:08, 7 May 2009 (UTC)
intro for lists?
Can someone point me to the standard for the intro sentence for articles of the form "list of ..."? (I'm wondering if something was supposed to be bolded or not). RJFJR (talk) 17:11, 7 May 2009 (UTC)
- It looks like I was looking for Misplaced Pages:Stand-alone lists. RJFJR (talk) 21:23, 7 May 2009 (UTC)
Follow the sources
Since it has been reverted, I discuss "follow the sources" here. It has been suggested above, and seems generally a good idea; rather than say it at the several sections where it applies, it seems simpler to say it once:
- Many points of usage can be decided by seeing what other writers do about the problem. Unless there is some clear reason to do otherwise, it is generally a good idea to follow the usage of reliable sources on a subject; the sources for the article itself should be reliable. If the sources can be shown to be unrepresentative of current English usage as a whole (because, for example, they were published in the nineteenth century, or in one of the states involved in a territorial dispute), follow current English usage instead, and consult more sources.
Comments? Septentrionalis PMAnderson 17:37, 7 May 2009 (UTC)
- I'm OK with the idea. I think the second example is weird, though. Are you alluding to something like the Republic of Macedonia naming controversy? If so, we might try to be a little clearer. (Most readers would just wonder: "Huh? Why, if a state is involved in a dispute, stuff written there can't be used as a model for grammar and typography?") A source having those problems wouldn't be normally considered "reliable" anyway, per WP:RS and WP:NPOV. --A. di M. (formerly Army1987) — Deeds, not words. 21:11, 7 May 2009 (UTC)
- While we follow sources with regard to facts, I don't see much reason to follow them in matters of style (at least, not in matters where WP has its own established style). We have our own style guidelines for perhaps three good reasons: (1) to provide stylistic consistency in WP; (2) to encourage the use of styles appropriate to what WP is; (3) to enable lame disputes on multiple article pages to be settled quickly. Saying "follow what sources do" thwarts all of these aims: (1) because different sources (even within the scope of a single article) use different styles; (2) because our sources are mostly not online encyclopedias; (3) because arguments about which sources do what and whether they are reliable are liable to drag on interminably. So if we feel a need to add this as a principle, then it should be made clear that WP's own style guidance (where it exists) still takes precedence.--Kotniski (talk) 08:18, 8 May 2009 (UTC)
- Well, to make an example with the last such discussion (see above), if most (if not all) sources write "Big Bang" with capitals and "quantum mechanics" with lowercase, there is very little point in doing otherwise. Also, our own guidelines can be applied inconsistently, leading to very strange things. For example, the title of the article about the album Y34RZ3R0R3M1X3D is "Year Zero Remixed" "per WP:MOSTM". But, considering that almost all sources refer to the album as Y34RZ3R0R3M1X3D, referring to it as Year Zero Remixed is borderline original research. Indeed, one guy once proposed to move "(pronounced 'lĕh-'nérd 'skin-'nérd)" to "Pronounced Leh-Nerd Skin-Nerd", but another guy refused to do that because "that would probably be considered synthesis." (On the other hand, the article about the song "Rock N Roll Train" is at "Rock 'n' Roll Train".) If we could agree that we should at the very least spell proper names the way they are usually spelt by reliable secondary sources in English (the "secondary" is to avoid stuff such as lowercase "adidas", and the "in English" is to avoid having to spell Akira Toriyama as "鳥山 明"), that would at least be a good start to prevent arbitrary name mangling. (I think there are people eagerly waiting for the last person who pronounces "NASA" as en-ay-es-ay to pass on, so that they can move the article to "Nasa".) --A. di M. (formerly Army1987) — Deeds, not words. 10:42, 8 May 2009 (UTC)
- Yes, that makes sense to me in relation to the specific question of proper names.--Kotniski (talk) 11:08, 8 May 2009 (UTC)
Thank you for your comments; I think they've improved the section, and I have restored it. Septentrionalis PMAnderson 18:30, 8 May 2009 (UTC)
- I believe it should NOT be a general principle - too broad a context for this. If my source is a 16th century document (extreme for emphasis), do I capitalize all the nouns? Use f for s? "Use the source" is an exception for particular situations. And 24 hours is not enough time to collect consensus for a "general principle" on the most critical style page in Wikipedia. But if I'm alone. . . (John User:Jwy talk) 18:49, 8 May 2009 (UTC)
- Most 16th century documents are primary sources (and the problem with them is that they are unrepresentative of current usage); so that should be twice covered by the present wording:
- Many points of usage, such as the treatment of proper names, can be decided by seeing what other writers do about the problem. Unless there is some clear reason to do otherwise, it is generally a good idea to follow the usage of reliable secondary sources in English on the subject; the sources for the article itself should be reliable. If the sources for the article can be shown to be unrepresentative of current English usage as a whole, follow current English usage instead — and consult more sources.
- The reason to add this is that "follow the sources" has come up several times in various contexts here over a long period; it seems simpler to add it as a general principle and then refer to the principle than to add it to a dozen sections and not to others. Septentrionalis PMAnderson 19:00, 8 May 2009 (UTC)
- On closer examination, my example is very poor! My main concern is the appropriate balance with the first general principle - I think we want to avoid having WP be a confusing mix of styles (I'm far from saying it MUST be absolutely consistent!), I see the new "principle" applying in cases where there is no guidance here or for "special" exceptions. Does it make sense to add wording to that effect - or does that fall into "break the rules"? (John User:Jwy talk) 19:49, 8 May 2009 (UTC)
- I can't see how the principles conflict: if different reliable secondary sources use different styles, you can pick one and use it consistently in an article. --A. di M. (formerly Army1987) — Deeds, not words. 20:05, 8 May 2009 (UTC)
- I think, using this general principle, we can simplify a lot of the existing text. For example, we could add a link to it from the section on mid-sentence The - and I will do so if it is not reverted again. Septentrionalis PMAnderson 20:08, 8 May 2009 (UTC)
- mea culpa - fast/lazy reading on my part. I was thinking it was wikipedia-wide consistency. Sorry to take your time to deal with my poor reading skills (and for the initial revert). (John User:Jwy talk) 00:35, 9 May 2009 (UTC)
Navboxes
Is there any specific MOS on navboxes? I've seen a few (Template:The Simpsons, Template:Red Dwarf) that use apparently non standard colours. If there isn't a MOS related to these, perhaps there should be to make everything consistent? Rehevkor ✉ 21:58, 7 May 2009 (UTC)
- Wikipedia:Template messages#See also has a link to Wikipedia:Colours. -- Wavelength (talk) 14:27, 8 May 2009 (UTC)
Capitalize Holy Spirit?
Wikipedia's MoS should specify capitalization rules for "Holy Spirit" and "holy spirit". My thought is that when "Holy Spirit" is a figure or person, the term must be capitalized; when "holy spirit" is a mindset or impersonal force, then it need not be capitalized, but may be when quoting references which capitalize it.
Please post most reasoning on the capitalization page, where the matter is more thoroughly presented:
MOS:CAPS#Holy Spirit? ...Soc8675309 (talk) 17:45, 8 May 2009 (UTC)
- Can you give examples of where holy spirit refers to a mindset or impersonal force? JIMp talk·cont 18:34, 8 May 2009 (UTC)
- The matter is more thoroughly presented, with examples, at "MOS:CAPS":
- MOS:CAPS#Holy Spirit?
- It might be best to concentrate discussion there. ...--Soc8675309 (talk) 19:47, 8 May 2009 (UTC)
MoS Doesn't specify Wiki markup
The style guides for Wikipedia articles do not specify what WikiMarkup should be used to conform to a style.
It seems to me self-evident that the MoS is implicitly saying this, so similarly, consensus is implicit. So I've put the above sentence as the first sentence under the General principles heading; if it was going to go anywhere, that seems the obvious place. Where else? HarryAlffa (talk) 20:06, 8 May 2009 (UTC)
- Why say it?
I was reverted elsewhere (an encyclopaedia article) for changing wiki markup for bold-face from the usual explicit '''Bold text'''
to different but still perfectly valid & safe markup, and the MoS was cited as the reason. On the bold face of it, that seems a stupid thing for someone to do, but I can see where the other editor's coming from, as it's a special case of markup behaviour, and it took a bit of thought for me to explain my reasoning. So please visit the WP:Link talk page for a discussion on this, and what the alternative markup for bold face might be! I hope to convince you there of that usage, which will then cascade back to keep the edit on this page! :) HarryAlffa (talk) 20:06, 8 May 2009 (UTC)