These rules are overridden by the demands of computer languages which support legacy character sets. Also, these rules are written with English in mind, and so do not override typesetting conventions commonly adopted when setting other natural languages.
If you are using two spaces, it is because you are used to the old standards associated with typewriters. More spaces are suitable in a computer language listing in order to achieve consistent indentation or tabulation. A notable violation in the widely published and respected literature is the IETF RFCs; such a violation is consistent with these documents having been intended to be set in a monospaced font.
Typesetting engines should remove all double spaces, except in computer language listings. HTML user agents will normally perform such distillation, with the exception of those using a monospaced font, such as Lynx. These will use the old "typewritten letter" convention of using two spaces between sentences.
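The double-space removal described above might be sketched as follows. This is a minimal illustration, not a reference implementation: the function name collapse_spaces and the is_listing flag are hypothetical, and the sketch assumes that leading indentation should be left alone even outside listings.

```python
import re

# Two or more spaces that follow a non-space character. Leading
# indentation is not matched, because nothing non-space precedes it.
_SPACE_RUN = re.compile(r"(?<=\S) {2,}")


def collapse_spaces(text: str, is_listing: bool = False) -> str:
    """Collapse runs of spaces to one, except in code listings.

    is_listing is a hypothetical flag: the caller is assumed to know
    whether the text is a computer language listing.
    """
    if is_listing:
        return text  # listings keep their indentation and tabulation
    return _SPACE_RUN.sub(" ", text)
```

For instance, collapse_spaces("One.  Two.") would yield "One. Two.", while the same call with is_listing=True returns the text unchanged.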
If you are forced to use a character set which does not have em-dash, then a hyphen and a space can be used as a surrogate. Do not use two hyphens unless you are writing a computer language comment intended to be used in a subsequent typeset form. An apparent violation in the widely published and respected literature is the IETF RFCs—however these documents are using ASCII as their character set, and this does not have an em-dash.
A typesetting engine should turn all hyphen-space pairs into an em-dash in any document which is using the surrogate. Any double-hyphens in a computer language listing could be turned into an em-dash, but the engine must be engineered so as not to disturb any code where syntactic meaning is given to such a combination (e.g. in C and Java.) Also, there is no point using em-dash in a computer language listing if, as per guideline 5, that listing is set in a monospaced font. In this case, double-hyphen is the preferred setting choice.
Any automated replacement of hyphens with em-dashes as described above should also beware of the use of hyphen-space pairs at the start of lines: this is more often an indication of a bulleted item in a list, and should be set as such. The final check on such replacements should be performed by a human.
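One way such a replacement could work is sketched below. The function name replace_surrogates is hypothetical, and the sketch assumes the em-dash is set closed (with no surrounding spaces). Lines that begin with a hyphen-space pair are left untouched and reported as possible bullet items, so that a human can make the final decision. Double-hyphens in code are deliberately not handled.

```python
import re

EM_DASH = "\u2014"

# A hyphen followed by at least one space, between two non-space
# characters, with optional spaces before the hyphen. Ordinary
# hyphenated words ("well-known") have no space and are not matched.
_SURROGATE = re.compile(r"(?<=\S) *- +(?=\S)")


def replace_surrogates(line: str) -> tuple[str, bool]:
    """Return (new_line, is_bullet_candidate) for one line of text."""
    if line.lstrip().startswith("- "):
        # Probably a bulleted list item: leave it for human review.
        return line, True
    return _SURROGATE.sub(EM_DASH, line), False
```

Returning the flag alongside the text, rather than silently skipping the line, keeps the human check in the loop as the paragraph above recommends.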
Remember, javadoc comments support the insertion of an em-dash because they are written in HTML. This choice is better, but the instructions and entities implied by HTML should be interpreted before setting a listing containing them. Another alternative is to remove the documentation comments before setting and include suitable narrative text around the listing instead.
Hyphen must only be used if the character set has no en-dash, or if a monospaced font is used. An apparent violation in the widely published and respected literature is the IETF RFCs—however:
This translation between HYPHEN or HYPHEN MINUS and EN DASH can sometimes be automated by an engine which recognises hyphens between numbers. But the final check should be performed by a human, as there may be rare exceptions. In particular, careful consideration should be given to replacing a Unicode HYPHEN with an EN DASH or anything else, as it is likely the author or her writing tool was using it deliberately.
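A rough sketch of the automated step, under the assumption that only HYPHEN MINUS (U+002D) between two digits is a candidate; a Unicode HYPHEN (U+2010) is left alone, since it was probably typed deliberately. The function name is hypothetical, and the count it returns is only there to tell a human reviewer how many replacements need checking (dates and telephone numbers are obvious exceptions).

```python
import re

EN_DASH = "\u2013"

# HYPHEN MINUS with a digit on both sides, e.g. "10-20".
_NUMBER_RANGE = re.compile(r"(?<=\d)-(?=\d)")


def hyphens_to_en_dashes(text: str) -> tuple[str, int]:
    """Return (new_text, number_of_replacements_to_review)."""
    return _NUMBER_RANGE.subn(EN_DASH, text)
```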
HYPHEN MINUS must only be used if the character set has no minus sign. An apparent violation in the widely published and respected literature is the IETF RFCs—however these are not documents written with the luxury of Unicode's MINUS SIGN.
This translation between HYPHEN MINUS and MINUS SIGN can sometimes be automated by an engine which recognises space-hyphen pairs before numbers. However, the final check should be performed by a human, as there may be rare exceptions.
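The space-hyphen recognition could be sketched like this. The function name is hypothetical, and the sketch assumes that a hyphen at the very start of the text, directly before a digit, should be treated the same way as one preceded by a space. As above, the result still needs a human check.

```python
import re

MINUS_SIGN = "\u2212"

# HYPHEN MINUS directly before a digit, and preceded by whitespace or
# by the start of the text. "10-20" is not matched, because a digit
# precedes the hyphen.
_NEGATIVE_NUMBER = re.compile(r"(?<!\S)-(?=\d)")


def hyphens_to_minus_signs(text: str) -> str:
    """Replace HYPHEN MINUS with MINUS SIGN before numbers."""
    return _NEGATIVE_NUMBER.sub(MINUS_SIGN, text)
```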
(Corollary to rules 2, 3, and 4.)
An en-dash may be used when word phrases are nested to form larger word phrases, in order to indicate the higher level nesting.
Unicode HYPHEN may be substituted for HYPHEN MINUS by a typesetting engine without exception, but such an engine should have also ensured compliance with rules 2, 3 and 4.
The correct way to nest adjacent quotation marks is therefore visually equivalent to a triple-quote. An opening triple is read as a single followed by a double and a closing triple is read as a double followed by a single. It is highly irregular to see (and therefore discouraged to cause to be seen) a single quotation where double and single quotes are used to both open and close it.
Ellipsis leading is the responsibility of the rendering engine—do not attempt to enforce any rule in the characters you write. Indeed, the typical rendering employed will be to display ellipses with ordinary period leading, though this often depends on the font being used when the Unicode character HORIZONTAL ELLIPSIS is employed.
It is highly irregular (and therefore discouraged) to use consecutive bracketed phrases or sentences, or consecutive quotations, within a single paragraph without some semantic salt.
In order to avoid ambiguity, where the text is literal, a different font or style should be used for the intended literal characters so the punctuation is not interpreted as belonging to the quoted text. If this is not possible, it is acceptable to drop the punctuation entirely. In plain ASCII text, reversal of the punctuation is also permitted.
Colons are preferred outside the bracket where the following phrase or list is set to the right. When the phrase or list is set below, set the introducing colon inside like all other marks.
A search and replace of affected text is a quick way to enforce this convention.
This does not exclude double quotes being used where only one level of nesting is ever used. However, the use of single quotes in this case is encouraged.
It is never correct to use many exclamation or query marks in a row, or a number of consecutive periods other than one or three.
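A checker for this rule might look like the sketch below. The function name is hypothetical, and it assumes that any run of two or more exclamation or query marks counts as "many"; runs of periods are reported unless they are exactly three long. It works on ASCII periods only and does not look at the Unicode HORIZONTAL ELLIPSIS character.

```python
import re

_MARK_RUN = re.compile(r"[!?]{2,}")   # e.g. "!!", "??", "?!"
_PERIOD_RUN = re.compile(r"\.{2,}")   # two or more periods in a row


def find_punctuation_runs(text: str) -> list[str]:
    """Return the offending runs: mark runs first, then period runs."""
    problems = [m.group() for m in _MARK_RUN.finditer(text)]
    for match in _PERIOD_RUN.finditer(text):
        if len(match.group()) != 3:  # exactly three periods is allowed
            problems.append(match.group())
    return problems
```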
Where letters are missing from a word, an ellipsis is appropriate to indicate the missing letters. Consider using an apostrophe or the full and complete word where possible, however. Two en-dashes can also be used to represent missing letters from a word. Three en-dashes can represent a completely missing word. No more en-dashes in a row can be tolerated.
It is not considered correct to follow an ellipsis with an extra period and space to complete a sentence. In this way an ellipsis is equivalent in power to an exclamation or query mark when followed by a space.
Do not use apostrophe or double quote unless the character set has no prime.
But consider using decimal notation for angles, colon-separated notation for elapsed times, and SI units for all other measurements instead.
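As a worked illustration of the two conversions suggested above, here is a small sketch. Both function names are hypothetical; the first assumes the sign of an angle is carried by the degrees value, and the second assumes a non-negative whole number of seconds.

```python
def dms_to_decimal(degrees: int, minutes: int = 0, seconds: float = 0.0) -> float:
    """Convert degrees, minutes and seconds of arc to decimal degrees."""
    sign = -1 if degrees < 0 else 1
    return sign * (abs(degrees) + minutes / 60 + seconds / 3600)


def format_elapsed(total_seconds: int) -> str:
    """Format an elapsed time as colon-separated h:mm:ss."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"
```

So an angle of 12 degrees 30 minutes becomes 12.5, and an elapsed time of 3725 seconds is set as 1:02:05, with no prime marks needed in either case.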
If the owner or producer uses a leading lower case letter, it is preferable to make this title case unless it leads to ambiguity or confusion. It is required to use a leading title case letter where the trademark or product name begins a sentence, or where the name may be confused with a natural language word.
The use of lower case letters only in UNIX utility names is optional if the author's position is not known or using title casing would cause no ambiguity. It is important to be consistent within documents and sets of documents, however. Use the same font for computer listing literals if you feel compelled for reasons of clarity or technical accuracy (say in user documentation) to include the lower case form in some places, but not in others (where you would use the ordinary body text font.) For example, it is technically inaccurate for a user to invoke the Lame MPEG audio encoder by issuing the command Lame, because UNIX uses a case-sensitive filing system. Therefore it is better to say "then run lame against the appropriate input file" rather than "then run Lame against..."
If a character has no Unicode symbol in the chosen body text font, or is a stylized representation chosen for rendering the mark, then a surrogate may be used or the character dropped entirely, provided this would cause no confusion or ambiguity.
The use of the exclamation mark to begin RISC OS application names is optional if the author's position is not known or its omission would cause no ambiguity. It is important to be consistent within documents and sets of documents, however. Use the same font for computer listing literals if you feel compelled for reasons of clarity (say in user documentation) to include the exclamation mark in some places, but not in others (where you would use the ordinary body text font.)
For example, if you are referring to Frisbee's frisbee, use "Frisbee", but if you are referring to flying plates in general, use "frisbee."
Words which are historically natural language words, but have trademark meanings, are also covered by this rule.
(Corollary to rules 14, 15 and 16, which allow trademarks to be adequately distinguished without these cumbersome and unsightly aberrations.)
Copyright symbols should only be used at the beginning or end of a document, in headers, title pages, footers and bibliographies, and never in the normal flow of text. Consider reducing the size of the copyright symbol to 80% of its normal size if it appears to be too large in your chosen font—unfortunately this is often the case.
This is the only time the SI unit prefix k may be capitalized.
Any other usage should be noted.
Set the superscript index (e.g. -1 or -2) to the right of the divisor units instead.
See also: Typesetting Recommendations and Guidelines for In-house Material.
Also available: All rules, recommendations and guidelines are available as plain text.
Last updated: Sunday 8th May 2005