Proofreading Guidelines

Version 1.9.c, generated January 1, 2006       (Revision History)

Proofreading Guidelines in French / Directives de Formatage en français
Proofreading Guidelines in Portuguese / Regras de Revisão em Português

Check out the Proofreading Quiz and Tutorial


The Primary Rule

"Don't change what the author wrote!"

The final electronic book seen by a reader, possibly many years in the future, should accurately convey the intent of the author. If the author spelled words oddly, we leave them spelled that way. If the author wrote outrageous racist or biased statements, we leave them that way. If the author puts italics, bold text or a footnote every third word, we mark them italicized, bolded or footnoted. We are proofreaders, not editors. (See Printer's Errors for proper handling of obvious misprints.)

We do change minor typographical conventions that don't affect the sense of what the author wrote. For example, we rejoin words that were broken at the end of a line (End-of-line Hyphenation). Changes such as these help us produce a consistently formatted version of the book. The proofreading rules we follow are designed to achieve this result. Please carefully read the rest of the Proofreading Guidelines with this concept in mind. There is a separate set of Formatting Guidelines. These guidelines are intended for proofreading only. A second group of volunteers will be working on the formatting of the text.

To assist the next proofreader, the formatter, and the post-processor, we also preserve line breaks. This allows them to easily compare the lines in the text to the lines in the image.

 

About This Document

This document is written to explain the proofreading rules we use to maintain consistency when proofreading a single book that is distributed among many proofreaders, each of whom is working on different pages. This helps us all do proofreading the same way, which in turn makes it easier for the formatter and for the post-processor who will complete the work on this e-book.

It is not intended as any kind of a general editorial or typesetting rulebook.

We've included in this document all the items that new users have asked about while proofreading. If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.

This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum in this thread.

Project Comments

On the proofreading interface page (Project Page) where you start proofreading pages, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start proofreading pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.

Please also read the Project Thread (Forum). The Project Manager may clarify project-specific guidelines there, and proofreaders often use it to alert one another to recurring issues within the project and how best to address them. (See below.)

On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other proofreaders have made changes. This Forum thread discusses different ways to use this information.

Forum/Discuss this Project

On the proofreading interface page (Project Page) where you start proofreading pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other proofreaders who are working on this book.

Fixing errors on Previous Pages

When you select a project for proofreading, the Project Comments page is loaded. This page contains links to pages from this project that you have recently proofread. (If you haven't proofread any pages yet, there will be no links shown.)

Pages listed under either "DONE" or "IN PROGRESS" are available to make proofreading corrections or to finish proofreading. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and re-open it to fix the error.

You may also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Comments page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.

For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you are using.

How to proof...

Line Breaks

Leave all line breaks in so that later in the process other volunteers can easily compare the lines in the text to the lines in the image. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.

Double Quotes

Proofread these as plain ASCII " double quotes. Do not change double quotes to single quotes. Leave them as the Author wrote them.

For quotes from non-English languages, use the quotation marks appropriate to that language if they are available in the Latin-1 character set. The French equivalent, guillemets, «like this», are available from the pulldown menus in the proofreading interface, since they are part of Latin-1. The quotation marks used in some German texts, „like this“ are not available in the pulldown menus, as they are not in Latin-1. The Project Manager may instruct you in the Project Comments to proofread non-English language quotation marks differently for a particular book.

Single Quotes

Proofread these as the plain ASCII ' single quote (apostrophe). Do not change single quotes to double quotes. Leave them as the Author wrote them.

Quote Marks on each line

Proofread quotation marks at the beginning of each line of a quotation by removing all of them except for the one at the start of the first line of the quotation.

If the quotation goes on for multiple paragraphs, each paragraph should have an opening quote mark on the first line of the paragraph.

Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are proofreading. Leave it that way—do not add closing quotation marks that are not in the page image.
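
As a rough illustration of the rule above, here is a minimal Python sketch. The function name is hypothetical, and it assumes it is handed one quoted paragraph at a time in which every printed line begins with a quote mark; it is not part of any DP tool.

```python
def strip_repeated_line_quotes(lines, quote='"'):
    """Keep the opening quote on the first line of a paragraph only.

    Assumption: `lines` is a single quoted paragraph printed with a quote
    mark at the start of every line. Anything else is returned unchanged.
    """
    if not lines or not all(ln.startswith(quote) for ln in lines):
        return list(lines)
    # No closing quote is added: we only remove the repeated opening marks.
    return [lines[0]] + [ln[len(quote):] for ln in lines[1:]]
```

Each paragraph of a long quotation would be passed separately, so that every paragraph keeps its own opening quote mark.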

End of Sentence Periods

Proofread periods that end sentences with a single space after them.

You do not need to remove extra spaces after periods if they're already in the scanned text—we can do that automatically during post-processing. See the Sidenotes image and text for an example.

Punctuation

In general, there should be no space before punctuation characters except opening quotation marks. If scanned text has a space before punctuation, remove it.

Spaces before punctuation sometimes appear because books typeset in the 1700s and 1800s often used partial spaces before punctuation such as a semicolon or comma.

Scanned Text:
and so it goes ; ever and ever.
Correctly Proofread Text:
and so it goes; ever and ever.
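
The same correction can be sketched in Python. This is only an illustration under stated assumptions: the text is English, the function name is ours, and a space before an ellipsis is deliberately left alone because the ellipsis rule keeps it.

```python
import re

# Spaces or tabs before ; : , ! ? or before a single period.
# A period that starts a run of dots ("...") is skipped on purpose.
_SPACE_BEFORE_PUNCT = re.compile(r"[ \t]+(?=[;:,!?]|\.(?!\.))")

def tighten_punctuation(line):
    """Remove the space that separates a word from the punctuation after it."""
    return _SPACE_BEFORE_PUNCT.sub("", line)
```

Opening quotation marks are not in the pattern, so a space before them is preserved.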

Period Pause "..." (Ellipsis)

The guidelines are different for English and Languages Other Than English (LOTE).

ENGLISH: Leave a space before the three dots, and a space after. The exception is at the end of a sentence, when there would be no space, four dots, and a space after. This is also the case for any other ending punctuation mark: the 3 dots follow immediately, without any space.

For example:

     That I know ... is true.
     This is the end....
     Wherefore art thou Romeo?...

Sometimes you will see it with the punctuation at the end, so proofread it that way:

     Wherefore art thou Romeo...?

Remove extra dots, if any, or add new ones, if necessary, to bring the number to three (or four) as appropriate.
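
Deciding between three and four dots needs a human eye, but spotting runs of the wrong length can be sketched in Python. The helper below is hypothetical and applies to English text only; it reports runs of dots that are neither three nor four long.

```python
import re

def suspicious_dot_runs(line):
    """Return (column, run) pairs for runs of dots that are not 3 or 4 long."""
    return [(m.start(), m.group())
            for m in re.finditer(r"\.{2,}", line)
            if len(m.group()) not in (3, 4)]
```

An empty result does not prove the line is right (the spacing still has to be checked against the rule above); it only means no run has an impossible length.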

LOTE: (Languages Other Than English) Use the general rule "Follow closely the style used in the printed page." In particular, insert spaces, if there are spaces before or between the periods, and use the same number of periods as appear in the image. Sometimes the printed page is unclear: in that case, insert a [**unclear] to draw the attention of the post-processor. (Note: Post-Processors should replace those regular spaces with non-breaking spaces.)

Contractions

Remove any extra space in contractions: for example, "would n't" should be proofread as "wouldn't".

This was often an early printers' convention, where the space was retained to indicate that 'would' and 'not' were originally separate words. It is also sometimes an artifact of the OCR. Remove the extra space in either case.

Some Project Managers may specify in the Project Comments not to remove extra spaces in contractions, particularly in the case of texts that contain slang, dialect, or are written in languages other than English.
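
For the common "n't" case, the fix can be sketched in Python as below. The function name is an assumption of ours, only the "n't" ending is handled, and the sketch should not be used where the Project Comments ask for the spaces to be kept.

```python
import re

# One or more spaces between a word character and a following "n't".
_SPLIT_NT = re.compile(r"(?<=\w) +(?=n't\b)")

def join_nt_contractions(line):
    """Rejoin contractions such as "would n't" into "wouldn't"."""
    return _SPLIT_NT.sub("", line)
```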

Extra Spaces or Tabs Between Words

Extra spaces and tab characters between words are common in OCR output. You don't need to bother removing these—that can be done automatically during post-processing.

However, extra spaces around punctuation, em-dashes, quote marks, etc. do need to be removed when they separate the symbol from the word.

For example, in "A horse ;  my kingdom for a horse." the space between the word "horse" and the semicolon should be removed. But the 2 spaces after the semicolon are fine; you don't have to delete one of them.

Trailing Space at End-of-line

Do not bother inserting spaces at the ends of lines of text. It is a waste of your time for something that we can take care of automatically later. Similarly do not waste your time removing extra spaces at the ends of lines.

Line Numbers

Keep line numbers. Use a few spaces to separate them from the other text on the line so that the formatters can easily find them.

Line numbers are numbers in the margin for each line, or sometimes every fifth or tenth line, and are common in books of poetry. Since poetry will not be reformatted in the e-book version, the line numbers will be useful to readers.

Italic and Bold Text

Italicized text may occasionally appear with <i> inserted at the start and </i> inserted at the end of the italics. Bold text (text printed in a heavier typeface) may occasionally appear with <b> inserted before the bold text and </b> after it. Do not remove this formatting information, unless it surrounds junk that does not appear on the page. Do not add it where it does not appear. The formatters will do that later in the process.

Superscripts

Older books often abbreviated words as contractions, and printed them as superscripts: for example,
     Genʳˡ Washington defeated Lᵈ Cornwall's army.
Proofread these by inserting an up-arrow followed by the superscripted text, like this:
     Gen^rl Washington defeated L^d Cornwall's army.

Subscripts

Subscripted text is often found in scientific works, but is not common in other material. Proofread subscripted text by inserting an underline character _.
For example:
        H₂O.
would be proofread as
        H_2O.

Font size changes

Do not mark changes in font size. The formatters will take care of this later in the process.

Words in Small Capitals

Small caps (capital letters which are smaller than the standard capitals) may occasionally appear with <sc> inserted before the small caps and </sc> after the small caps. Once again, do not remove this formatting information, unless it surrounds junk that does not appear on the page. Do not add it where it does not appear. The formatters will do that later in the process. Please proof only the characters in small caps. Do not worry about case changes. If they are already ALL-CAPPED, Mixed-Cased, or lower-cased, leave them ALL-CAPPED, Mixed-Cased, or lower-cased.

Large, Ornate opening Capital letter (Drop Cap)

Proofread large and ornate graphic first letters of a chapter, section, or paragraph as just the letter.

Accented/Non-ASCII Characters

Please proofread these using the proper accented Latin-1 characters, where possible. See Diacritical marks for ways to proof some non-Latin-1 characters.

There are several ways of inputting accented characters:

  • The pull-down menus in the proofreading interface.
  • Applets included with your operating system.
    • Windows: "Character Map"
      Access it through:
      Start: Run: charmap, or
      Start: Accessories: System Tools: Character Map.
    • Macintosh: Key Caps or "Keyboard Viewer"
      For OS 9 and lower, this is on the Apple Menu.
      For OS X through 10.2, this is located in the Applications, Utilities folder.
      For OS X 10.3 and higher, this is in the Input Menu as "Keyboard Viewer."
    • Linux: Various, depending on your desktop environment.
      For KDE, try KCharSelect (in the Utilities submenu of the start menu).
  • An on-line program, such as Edicode.
  • Keyboard shortcuts.
    Tables for Windows and Macintosh which list these shortcuts are in the Proofreading Guidelines.
  • Switching to a keyboard layout or locale which supports "deadkey" accents.
    • Windows: Control Panel (Keyboard, Input Locales)
    • Macintosh: Input Menu (on Menu Bar)
    • Linux: Change the keyboard in your X configuration.

Project Gutenberg will post, as a minimum, 7-bit ASCII versions of texts, but versions using other character encodings which can preserve more of the information from the original text are accepted. Currently for Distributed Proofreaders this means using Latin-1 or ISO 8859-1 and -15, and in the future will include Unicode.
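
To see whether a character can be entered directly, it helps to know that Latin-1 covers exactly the code points 0 through 255. The Python sketch below (a hypothetical helper, not part of the proofreading interface) lists the characters in a piece of text that fall outside that range.

```python
def outside_latin1(text):
    """Return the distinct characters of `text` that Latin-1 cannot represent."""
    found = []
    for ch in text:
        # Latin-1 (ISO 8859-1) is the range U+0000 to U+00FF.
        if ord(ch) > 0xFF and ch not in found:
            found.append(ch)
    return found
```

Guillemets pass this check, while the German low opening quote and the oe ligature do not, which matches the advice given in these Guidelines.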

For Windows:

  • You can use the Character Map program (Start: Run: charmap) to select an individual letter, and then cut & paste.
  • If you are using the enhanced proofreading interface, the more tag opens a pop-up window containing these characters, which you can then cut & paste.
  • Or you can type the Alt+NumberPad shortcut codes for these characters.
    This is faster than using cut & paste, once you get used to the codes.
    Hold the Alt key and type the four digits on the Number Pad—the number row over the letters won't work.
    You must type all 4 digits, including the leading 0 (zero). Note that the capital version of a letter is 32 less than the lower case.
    These instructions are for the US-English keyboard layout. It may not work for other keyboard layouts.
    The table below shows the codes we use. (Print-friendly version of this table)
    Do not use other special characters unless the Project Manager tells you to in the Project Comments.

Windows Shortcuts for Latin-1 symbols

Accented letters (one column per accent):
  ` grave      ´ acute (aigu)   ^ circumflex   ~ tilde      ¨ umlaut
  à Alt-0224   á Alt-0225       â Alt-0226     ã Alt-0227   ä Alt-0228
  À Alt-0192   Á Alt-0193       Â Alt-0194     Ã Alt-0195   Ä Alt-0196
  è Alt-0232   é Alt-0233       ê Alt-0234                  ë Alt-0235
  È Alt-0200   É Alt-0201       Ê Alt-0202                  Ë Alt-0203
  ì Alt-0236   í Alt-0237       î Alt-0238                  ï Alt-0239
  Ì Alt-0204   Í Alt-0205       Î Alt-0206                  Ï Alt-0207
  ò Alt-0242   ó Alt-0243       ô Alt-0244     õ Alt-0245   ö Alt-0246
  Ò Alt-0210   Ó Alt-0211       Ô Alt-0212     Õ Alt-0213   Ö Alt-0214
  ù Alt-0249   ú Alt-0250       û Alt-0251                  ü Alt-0252
  Ù Alt-0217   Ú Alt-0218       Û Alt-0219                  Ü Alt-0220
                                               ñ Alt-0241   ÿ Alt-0255
                                               Ñ Alt-0209   Ÿ Alt-0159

Other letters:
  ° ring:        å Alt-0229   Å Alt-0197
  Æ ligature:    æ Alt-0230   Æ Alt-0198
  / slash:       ø Alt-0248   Ø Alt-0216
  Œ ligature:    œ Use [oe]   Œ Use [OE]
  cedilla:       ç Alt-0231   Ç Alt-0199
  Icelandic:     Þ Alt-0222   þ Alt-0254   Ð Alt-0208   ð Alt-0240
  sz ligature:   ß Alt-0223

Symbols:
  marks:         © Alt-0169   ® Alt-0174   ™ Alt-0153   ¶ Alt-0182   § Alt-0167   ¦ Alt-0166
  accents:       ´ Alt-0180   ¨ Alt-0168   ¯ Alt-0175   ¸ Alt-0184
  punctuation:   ¿ Alt-0191   ¡ Alt-0161   « Alt-0171   » Alt-0187   · Alt-0183   * Alt-0042
  currency:      ¢ Alt-0162   £ Alt-0163   ¥ Alt-0165   $ Alt-0036   ¤ Alt-0164
  ordinals:      º Alt-0186   ª Alt-0170
  superscripts:  ¹ Alt-0185   ² Alt-0178   ³ Alt-0179
  mathematics:   ± Alt-0177   × Alt-0215   ÷ Alt-0247   ¬ Alt-0172   ° Alt-0176   µ Alt-0181
  fractions (see Note 1):   ¼ Alt-0188   ½ Alt-0189   ¾ Alt-0190

Note 1: Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)
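
The four-digit codes are not arbitrary: they appear to be the character's byte value in the Windows-1252 code page, which agrees with Latin-1 for every value from 160 to 255. Under that assumption, a small Python sketch (the function name is ours) can look up the code for a character and confirm the "32 less" rule for capitals.

```python
def alt_code(ch):
    """Return the Alt+NumberPad code for a single character.

    Assumption: the code is the character's Windows-1252 byte value,
    written as four digits with a leading zero.
    """
    value = ch.encode("cp1252")[0]
    return "Alt-" + str(value).zfill(4)

# The capital of an accented Latin-1 letter sits 32 places lower.
assert ord("\u00e0") - ord("\u00c0") == 32  # à is 224, À is 192
```

The pair ÿ (0255) and Ÿ (0159) is the one exception in the table to the "32 less" rule.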

For Apple Macintosh:

  • You can use the "Key Caps" program as a reference.
    In OS 9 & earlier, this is located in the Apple Menu; in OS X through 10.2, it is located in the Applications, Utilities folder.
    This brings up a picture of the keyboard, and pressing shift, opt, command, or combinations of those keys shows how to produce each character. Use this reference to see how to type that character, or you can cut & paste it from here into the text in the proofreading interface.
  • In OS X 10.3 and higher, the same function is now a palette available from the Input menu (the drop-down menu attached to your locale's flag icon in the menu bar). It's labeled "Show Keyboard Viewer." If this isn't in your Input menu, or if you don't have that menu, you can activate it by opening System Preferences, the "International" panel, and selecting the "Input Menu" pane. Ensure that "Show input menu in menu bar" is checked. In the spreadsheet view, check the box for "Keyboard Viewer" in addition to any input locales you use.
  • If you are using the enhanced proofreading interface, the more tag creates a pop-up window containing these characters, which you can then cut & paste.
  • Or you can type the Apple Opt- shortcut codes for these characters.
    This is a lot faster than using cut & paste, once you get used to the codes.
    Hold the Opt key and type the accent symbol, then type the letter to be accented (or, for some codes, only hold the Opt key and type the symbol).
    These instructions are for the US-English keyboard layout. It may not work for other keyboard layouts.
    The table below shows the codes we use. (Print-friendly version of this table)
    Do not use other special characters unless the Project Manager tells you to in the Project Comments.

Apple Mac Shortcuts for Latin-1 symbols

Accented letters (one column per accent):
  ` grave      ´ acute (aigu)   ^ circumflex   ~ tilde      ¨ umlaut
  à Opt-`, a   á Opt-e, a       â Opt-i, a     ã Opt-n, a   ä Opt-u, a
  À Opt-`, A   Á Opt-e, A       Â Opt-i, A     Ã Opt-n, A   Ä Opt-u, A
  è Opt-`, e   é Opt-e, e       ê Opt-i, e                  ë Opt-u, e
  È Opt-`, E   É Opt-e, E       Ê Opt-i, E                  Ë Opt-u, E
  ì Opt-`, i   í Opt-e, i       î Opt-i, i                  ï Opt-u, i
  Ì Opt-`, I   Í Opt-e, I       Î Opt-i, I                  Ï Opt-u, I
  ò Opt-`, o   ó Opt-e, o       ô Opt-i, o     õ Opt-n, o   ö Opt-u, o
  Ò Opt-`, O   Ó Opt-e, O       Ô Opt-i, O     Õ Opt-n, O   Ö Opt-u, O
  ù Opt-`, u   ú Opt-e, u       û Opt-i, u                  ü Opt-u, u
  Ù Opt-`, U   Ú Opt-e, U       Û Opt-i, U                  Ü Opt-u, U
                                               ñ Opt-n, n   ÿ Opt-u, y
                                               Ñ Opt-n, N   Ÿ Opt-u, Y

Other letters:
  ° ring:        å Opt-a   Å Opt-A
  Æ ligature:    æ Opt-'   Æ Opt-"
  / slash:       ø Opt-o   Ø Opt-O
  Œ ligature:    œ Use [oe]   Œ Use [OE]
  cedilla:       ç Opt-c   Ç Opt-C
  Icelandic:     Þ (none) ‡   þ Shift-Opt-6   Ð (none) ‡   ð (none) ‡
  sz ligature:   ß Opt-s

Symbols:
  marks:         © Opt-g   ® Opt-r   ™ Opt-2   ¶ Opt-7   § Opt-6   ¦ (none) ‡
  accents:       ´ Opt-E   ¨ Opt-U   ¯ Shift-Opt-,   ¸ Opt-Z
  punctuation:   ¿ Opt-?   ¡ Opt-1   « Opt-\   » Shift-Opt-\   · Opt-8   * (none) ‡
  currency:      ¢ Opt-4   £ Opt-3   ¥ Opt-y   $ Shift-4   ¤ Shift-Opt-2
  ordinals:      º Opt-0   ª Opt-9
  superscripts:  ¹ (none) ‡   ² (none) ‡   ³ (none) ‡
  mathematics:   ± Opt-+   × (none) ‡   ÷ Opt-/   ¬ Opt-l   ° Opt-*   µ Opt-m
  fractions (see Note 1):   ¼ (none) ‡   ½ (none) ‡   ¾ (none) ‡

‡ No equivalent shortcut; use the drop-down menus.

Note 1: Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)

Characters with Diacritical marks

In some projects, you will find characters with special marks either above or below the normal Latin A..Z character. These are called diacritical marks and indicate a special pronunciation for this character. For proofreading, we indicate them in our normal ASCII text by using a specific coding, such as: ă becomes [)a] for a breve (the u-shaped accent) above an a, or [a)] for a breve below.

Be sure to include the square brackets ([ ]) around these, so the post-processor knows to which letter it applies. He or she will eventually replace these with whatever symbol works in each version of the text they produce, like 7-bit ASCII, 8-bit, Unicode, html, etc.

Note that when some of these marks appear on some characters (mainly vowels) our standard Latin-1 character set already includes that character with the diacritical mark. In those cases, use the Latin-1 character (see here), available from the drop-down lists in the proofreading interface.

The table below lists the special codings currently used:
The "x" represents a character with a diacritical mark.
When proofreading, use the actual character from the text, not the x shown in the examples.

Proofreading Symbols for Diacritical Marks
diacritical mark             sample   above           below
macron (straight line)       ¯        [=x]            [x=]
2 dots (diaeresis, umlaut)   ¨        [:x]            [x:]
1 dot                        ·        [.x]            [x.]
grave accent                 `        [`x] or [\x]    [x`] or [x\]
acute accent (aigu)          ´        ['x] or [/x]    [x'] or [x/]
circumflex                   ˆ        [^x]            [x^]
caron (v-shaped symbol)      ˇ        [vx]            [xv]
breve (u-shaped symbol)      ˘        [)x]            [x)]
tilde                        ˜        [~x]            [x~]
cedilla                      ¸        [,x]            [x,]
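
To show how the table is applied, here is a minimal Python sketch that turns a single accented character into its bracket code. The function and dictionary names are hypothetical; the sketch relies on Unicode decomposition (NFD), handles only characters carrying exactly one mark, uses the first of the two spellings where the table offers a choice, and leaves Latin-1 characters untouched as instructed above.

```python
import unicodedata

# Unicode combining marks mapped to the symbols used in the table.
ABOVE = {"\u0304": "=", "\u0308": ":", "\u0307": ".", "\u0300": "`",
         "\u0301": "'", "\u0302": "^", "\u030c": "v", "\u0306": ")",
         "\u0303": "~"}
BELOW = {"\u0331": "=", "\u0324": ":", "\u0323": ".", "\u032d": "^",
         "\u032c": "v", "\u032e": ")", "\u0330": "~", "\u0327": ","}

def bracket_code(ch):
    """Return the bracket coding for one character, or the character itself."""
    if ord(ch) <= 0xFF:
        return ch  # already in Latin-1: use the real character
    parts = unicodedata.normalize("NFD", ch)
    if len(parts) == 2:
        base, mark = parts
        if mark in ABOVE:
            return "[" + ABOVE[mark] + base + "]"
        if mark in BELOW:
            return "[" + base + BELOW[mark] + "]"
    return ch  # no single known mark: leave for a human to decide
```

For instance, an a with a breve above comes out as [)a] and a d with a dot below as [d.], while é is returned unchanged because it is part of Latin-1.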

Non-Latin Characters

Some projects contain text printed in non-Latin characters (that is, characters other than the Latin A...Z), for example Greek, Cyrillic, Hebrew, or Arabic.

These characters should be entered in the text just as Latin characters are. (WITHOUT transliteration!)

If a document is written entirely in a non-Latin script, it is best to install a keyboard driver which supports the language. Consult your operating system manual for instructions on how to do that.

If the script appears only occasionally, you may use a separate program to enter it. See above for some of the programs.

If you are uncertain about a character or an accent, mark it with an * to bring it to the attention of the second round proofreader or the post-processor.

For scripts which cannot be so easily entered, such as Arabic, surround the text with appropriate markers: [Arabic: **] and leave it as scanned. Include the ** so the post-processor can address it later.

Fractions

Proofread fractions as follows: 2½ becomes 2-1/2. The hyphen prevents the whole and fractional parts from becoming separated when the lines are rewrapped during post-processing.
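
For the three fraction symbols found in Latin-1, the conversion can be sketched in Python as follows. The function name is hypothetical, and the sketch covers only ¼, ½ and ¾.

```python
import re

_FRACTIONS = {"\u00bc": "1/4", "\u00bd": "1/2", "\u00be": "3/4"}

def proof_fractions(line):
    """Spell out fraction symbols, joining a whole number with a hyphen."""
    def repl(match):
        whole, frac = match.group(1), _FRACTIONS[match.group(2)]
        return whole + "-" + frac if whole else frac
    return re.sub(r"(\d*)([\u00bc\u00bd\u00be])", repl, line)
```

A fraction standing alone gets no hyphen, since there is no whole number to keep it attached to.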

Dashes, Hyphens, and Minus Signs

There are generally four such marks you will see in books:

  1. Hyphens. These are used to join words together, or sometimes to join prefixes or suffixes to a word.
    Leave these as a single hyphen, with no spaces on either side.
    Note that there is a common exception to this shown in the second example below.
  2. En-dashes. These are just a little longer, and are used for a range of numbers, or for a mathematical minus sign.
    Proofread these as a single hyphen, too. Spaces before or after are determined by the way it was done in the book; usually no spaces in number ranges, usually spaces around mathematical minus signs, sometimes both sides, sometimes just before.
  3. Em-dashes & long dashes. These serve as separators between words—sometimes for emphasis like this—or when a speaker gets a word caught in his throat——!
    Proofread these as two hyphens if the em-dash is short and four hyphens if the em-dash is long. Don't leave a space before or after, even if it looks like there was a space in the original book image.
  4. Deliberately Omitted or Censored Words or Names.
    Proofread these as 4 hyphens. When it represents a word, we leave appropriate space around it like it's really a word. If it's only part of a word, then no spaces—join it with the rest of the word. If the em-dash looks as if it is the size of the rest of the smaller em-dashes, then proof it as a single em-dash, i.e. two dashes.

Note: If an em-dash appears at the start or end of a line of your OCR'd text, join it with the other line so that there are no spaces or line breaks around it. Only if the author used an em-dash to start or end the paragraph or line of poetry or dialog should you leave it at the start or end of a line. See the examples below.

Examples—Dashes, Hyphens, and Minus Signs:

Original Image:
semi-detached
Correctly Proofread Text:
semi-detached
Type: Hyphen

Original Image:
three- and four-part harmony
Correctly Proofread Text:
three- and four-part harmony
Type: Hyphens

Original Image:
discoveries which the Crus-
aders made and brought home with
Correctly Proofread Text:
discoveries which the Crusaders
made and brought home with
Type: Hyphen

Original Image:
factors which mold char-
acter—environment, training and heritage,
Correctly Proofread Text:
factors which mold character--environment,
training and heritage,
Type: Hyphen

Original Image:
See pages 21–25
Correctly Proofread Text:
See pages 21-25
Type: En-dash

Original Image:
–14° below zero
Correctly Proofread Text:
-14° below zero
Type: En-dash

Original Image:
X – Y = Z
Correctly Proofread Text:
X - Y = Z
Type: En-dash

Original Image:
2–1/2
Correctly Proofread Text:
2-1/2
Type: En-dash

Original Image:
I am hurt;—A plague
on both your houses!—I am dead.
Correctly Proofread Text:
I am hurt;--A plague
on both your houses!--I am dead.
Type: Em-dash

Original Image:
sensations—sweet, bitter, salt, and sour
—if even all of these are simple tastes. What
Correctly Proofread Text:
sensations--sweet, bitter, salt, and sour--if
even all of these are simple tastes. What
Type: Em-dash

Original Image:
senses—touch, smell, hearing, and sight—
with which we are here concerned,
Correctly Proofread Text:
senses--touch, smell, hearing, and sight--with
which we are here concerned,
Type: Em-dash

Original Image:
It is the east, and Juliet is the sun!—
Correctly Proofread Text:
It is the east, and Juliet is the sun!--
Type: Em-dash

Original Image:
"Three hundred——" "years," she was going to say, but the left-hand cat interrupted her.
Correctly Proofread Text:
"Three hundred----" "years," she was going to say, but the left-hand cat interrupted her.
Type: Longer Em-dash

Original Image:
As the witness Mr. —— testified,
Correctly Proofread Text:
As the witness Mr. ---- testified,
Type: long dash

Original Image:
As the witness Mr. S—— testified,
Correctly Proofread Text:
As the witness Mr. S---- testified,
Type: long dash

Original Image:
the famous detective of ——B Baker St.
Correctly Proofread Text:
the famous detective of ----B Baker St.
Type: long dash

Original Image:
“You —— Yankee”, she yelled.
Correctly Proofread Text:
"You ---- Yankee", she yelled.
Type: long dash

Original Image:
“I am not a d—d Yankee”, he replied.
Correctly Proofread Text:
"I am not a d--d Yankee", he replied.
Type: Em-dash
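
The character substitutions in these examples can be sketched in Python. This is an illustration only: the function name is ours, it assumes the text contains real en-dash and em-dash characters and that a long dash is exactly two em-dashes, and it does not join an em-dash at the start or end of a line to the neighbouring line.

```python
import re

EM, EN = "\u2014", "\u2013"

def proof_dashes(line):
    """Replace typographic dashes with the hyphen forms used in proofreading."""
    # Long dash first, so its two em-dashes are not treated separately.
    # Spaces around it are kept, because it may stand for a whole word.
    line = line.replace(EM + EM, "----")
    # A remaining em-dash becomes two hyphens with no space on either side.
    line = re.sub(r"[ \t]*" + EM + r"[ \t]*", "--", line)
    # An en-dash becomes a single hyphen; surrounding spaces follow the book.
    return line.replace(EN, "-")
```

Whether a dash in the image is really long or merely looks long still has to be judged against the rest of the page.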

End-of-line Hyphenation

Where a hyphen appears at the end of a line, join the two halves of the hyphenated word back together. If it is really a hyphenated word like well-meaning, join the two halves leaving the hyphen in between. But if it was just hyphenated because it wouldn't fit on the line, and is not a word that is usually hyphenated, then join the two halves and remove the hyphen. Keep the joined word on the top line, and put a line break after it to preserve the line formatting—this makes it easier for volunteers in later rounds. See the Dashes, Hyphens, and Minus Signs section of the Proofreading Guidelines for examples of each kind (nar-row turns into narrow, but low-lying keeps the hyphen). If the word is followed by punctuation, then carry that punctuation onto the top line, too.

Words like to-day and to-morrow that we don't commonly hyphenate now were often hyphenated in the old books we are working on. Leave them hyphenated the way the author did. If you're not sure if the author hyphenated it or not, leave the hyphen, put an * after it, and join the word together. Like this: to-*day. The asterisk will bring it to the attention of the post processor, who has access to all the pages, and can determine how the author typically wrote this word.
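
The rejoining step can be sketched in Python as below. The function name and the `hyphenated_words` parameter are hypothetical: the caller supplies the words known to keep their hyphen, and everything else is joined without one. The doubtful case (marking with -*) is left to the proofreader.

```python
def rejoin_hyphenated(lines, hyphenated_words=frozenset()):
    """Join words broken at line ends, keeping one line break per line."""
    out = list(lines)
    for i in range(len(out) - 1):
        top = out[i]
        # Only a single trailing hyphen counts; "--" is an em-dash.
        if not top.endswith("-") or top.endswith("--"):
            continue
        # The second half, with any attached punctuation, moves up.
        pieces = out[i + 1].split(" ", 1)
        tail = pieces[0]
        rest = pieces[1] if len(pieces) > 1 else ""
        head = top[:-1].rsplit(" ", 1)[-1]
        candidate = (head + "-" + tail).strip(".,;:!?\"'").lower()
        if candidate in hyphenated_words:
            out[i] = top + tail        # a real hyphenated word
        else:
            out[i] = top[:-1] + tail   # hyphen was only a line break
        out[i + 1] = rest
    return out
```

The joined word stays on the top line and the remainder of the lower line is kept as its own line, so the text still lines up with the image.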

End-of-page Hyphenation

Proofread end-of-page hyphens or em-dashes by leaving the hyphen or em-dash at the end of the last line and marking it with a * after it.
For example, proofread:
 
       something Pat had already become accus-
as:
       something Pat had already become accus-*

On pages that start with part of a word from the previous page or an em-dash, place a * before the partial word or em-dash.
To continue the above example, proofread:
 
       tomed to from having to do his own family
as:
       *tomed to from having to do his own family

These markings indicate to the post-processor that the word must be rejoined when the pages are combined to produce the final e-book.
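
For the hyphen case, the two markings can be sketched together in Python. In practice each page is proofread separately, so this hypothetical helper only illustrates how the two asterisks correspond; the em-dash case is not covered.

```python
def mark_page_boundary(last_line, first_line_next_page):
    """Mark a word split across two pages with asterisks on both halves."""
    if last_line.endswith("-") and not last_line.endswith("--"):
        return last_line + "*", "*" + first_line_next_page
    return last_line, first_line_next_page
```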

Paragraph Spacing/Indenting

Put a blank line to separate paragraphs. You should not indent the start of paragraphs, but if all paragraphs are already indented, don't bother removing those spaces—that can be done automatically during post-processing.

See the Sidenotes image/text for an example.

Multiple Columns

Proofread ordinary text that has been printed in two columns as a single column.

Spans of multiple-column text within single column sections should be proofread as a single column by placing the text from the left-most column first, the text from the next one after it, and so on. You do not need to mark where the columns were split, just re-join them.

See also the Index and Table sections of the Proofreading Guidelines.

Blank Page

Most blank pages, or pages with an illustration but no text, will already be marked with [Blank Page]. Leave this marking as is. If the page is blank, and [Blank Page] does not appear, there is no need to add it.

If there is text in the proofreading text area and a blank image, or if there is an image but no text, follow the directions for a Bad Image or Bad Text.

Page Headers/Page Footers

Remove page headers and page footers, but not footnotes, from the text.

The page headers are normally at the top of the image and have a page number opposite them. Page headers may be the same all through the book (often the title of the book and the author's name), they may be the same for each chapter (often the chapter number), or they may be different on each page (describing the action on that page). Remove them all, regardless, including the page number.

A chapter header will start further down the page and won't have a page number on the same line. See the next section for a specific example.


Sample Image:

Correctly Proofread Text:
In the United States?[*] In a railroad? In a mining company?
In a bank? In a church? In a college?

Write a list of all the corporations that you know or have
ever heard of, grouping them under the heads public and private.

How could a pastor collect his salary if the church should
refuse to pay it?

Could a bank buy a piece of ground "on speculation?" To
build its banking-house on? Could a county lend money if it
had a surplus? State the general powers of a corporation.
Some of the special powers of a bank. Of a city.

A portion of a man's farm is taken for a highway, and he is
paid damages; to whom does said land belong? The road intersects
the farm, and crossing the road is a brook containing
trout, which have been put there and cared for by the farmer;
may a boy sit on the public bridge and catch trout from that
brook? If the road should be abandoned or lifted, to whom
would the use of the land go?


CHAPTER XXXV.

Commercial Paper.

Kinds and Uses.--If a man wishes to buy some commodity
from another but has not the money to pay for
it, he may secure what he wants by giving his written
promise to pay at some future time. This written
promise, or note, the seller prefers to an oral promise
for several reasons, only two of which need be mentioned
here: first, because it is prima facie evidence of
the debt; and, second, because it may be more easily
transferred or handed over to some one else.

If J. M. Johnson, of Saint Paul, owes C. M. Jones,
of Chicago, a hundred dollars, and Nelson Blake, of
Chicago, owes J. M. Johnson a hundred dollars, it is
plain that the risk, expense, time and trouble of sending
the money to and from Chicago may be avoided,

* The United States: "Its charter, the constitution. * * * Its flag the
symbol of its power; its seal, of its authority."--Dole.

Chapter Headers

Proofread chapter headers as they appear in the text.

A chapter header may start a bit farther down the page than the page header and won't have a page number on the same line. Chapter Headers are often printed all caps; if so, keep them as all caps.

Watch out for a missing double quote at the start of the first paragraph, which some publishers did not include or which the OCR missed due to a large capital in the original. If the author started the paragraph with dialog, insert the double quote.

Illustrations

Proofread any caption text as it is printed, preserving the line breaks. If the caption falls in the middle of a paragraph, use blank lines to set it apart from the rest of the text. If there is no caption in the original text, then the mark-up of the illustration is left to the formatters.

Most pages with an illustration but no text will already be marked with [Blank Page]. Leave this marking as is.

Sample Image:

Correctly Proofread Text:

Martha told him that he had always been her ideal and
that she worshipped him.

Frontispiece
Her Weight in Gold


Sample Image: (Illustration in middle of paragraph)

Correctly Proofread Text:

such study are due to Italians. Several of these instruments
have already been described in this journal, and on the present

FIG. 1.--APPARATUS FOR THE STUDY OF HORIZONTAL
SEISMIC MOVEMENTS.

occasion we shall make known a few others that will
serve to give an idea of the methods employed.

For the observation of the vertical and horizontal motions
of the ground, different apparatus are required. The

Footnotes/Endnotes

Footnotes are placed out-of-line; that is, the text of the footnote is left at the bottom of the page and a tag is placed where it is referenced in the text.

The number, letter, or other character that marks a footnote location should be surrounded with square brackets ([ and ]) and placed right next to the word being footnoted[1] or its punctuation mark,[2] as shown in the text and the two examples in this sentence.

When footnotes are marked with a series of special characters (*, †, ‡, §, etc.) we replace them all with [*] in the text, and * next to the footnote itself.
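The marker rules above can be sketched in a few lines of Python. This is only an illustration and not part of any DP tool: the names `footnote_tag` and `footnote_label` are hypothetical, and the set of special characters is an assumption, since the guideline lists *, †, ‡, § and then "etc."

```python
# Hypothetical helpers illustrating the footnote-marker rules; not a DP tool.
# Assumed set of special markers (the guideline names *, †, ‡, § "etc.").
SPECIAL_MARKERS = {"*", "†", "‡", "§", "‖", "¶"}


def footnote_tag(marker):
    """Return the bracketed tag to place in the text for a printed marker."""
    marker = marker.strip()
    if marker in SPECIAL_MARKERS:
        # Every special-character marker becomes [*] in the text.
        return "[*]"
    # Numbers and letters are kept and wrapped in square brackets.
    return "[" + marker + "]"


def footnote_label(marker):
    """Return the label to put next to the footnote text itself."""
    marker = marker.strip()
    return "*" if marker in SPECIAL_MARKERS else marker


print(footnote_tag("2"), footnote_tag("†"), footnote_label("†"))
```

A numbered footnote keeps its number in both places, while a dagger becomes [*] in the text and * next to the footnote.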

Proofread the footnote text as it is printed, preserving the line breaks. Leave the footnote text at the bottom of the page. Be sure to use the same tag in the footnote as you used in the text where the footnote was referenced.

Place each footnote on a separate line in order of appearance. Place a blank line between each footnote if there is more than one.

See the Page Headers/Page Footers image/text for a sample footnote.

If a footnote or endnote is referenced in the text but does not appear on that page, keep the footnote/endnote number or marker and don't be concerned. This is common in scientific and technical books, where footnotes are often grouped at the end of chapters. See "Endnotes" below.

Original Text:
The principal persons involved in this argument were Caesar1, former military
leader and Imperator, and the orator Cicero2. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

1 Gaius Julius Caesar.
2 Marcus Tullius Cicero.
Correctly Proofread Text:
The principal persons involved in this argument were Caesar[1], former military
leader and Imperator, and the orator Cicero[2]. Both were of the aristocratic
(Patrician) class, and were quite wealthy.

1 Gaius Julius Caesar.

2 Marcus Tullius Cicero.

In some books, footnotes are separated from the main text by a horizontal line. We don't keep this line; just leave a blank line between the main text and the footnotes. (See example above.)

Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of on the bottom of each page. These are proofread in the same manner as footnotes. Where you find an endnote reference in the text, retain the number or letter. If you are proofreading one of the ending pages with the endnotes text on it, put a blank line after each endnote so that it is clear where each begins and ends.

Footnotes in Poetry should be treated the same as other footnotes.

Footnotes in Tables should remain where they are in the original text.

Original Footnoted Poetry:
Mary had a little lamb1
   Whose fleece was white as snow
And everywhere that Mary went
   The lamb was sure to go!

1 This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.
Correctly Proofread Text:
Mary had a little lamb[1]
Whose fleece was white as snow
And everywhere that Mary went
The lamb was sure to go!

1 This lamb was obviously of the Hampshire breed,
well known for the pure whiteness of their wool.

Poetry/Epigrams

Insert a blank line at the start of the poetry or epigram and another blank line at the end, so that the formatters can clearly see the beginning and end.

Leave each line left-justified and maintain the line breaks. Do not try to center or indent the poetry. The formatters will do that part. Do insert a blank line between stanzas.

Footnotes in poetry should be treated the same as regular footnotes during proofreading. See footnotes for details.

Line Numbers in poetry should be kept. Separate them from the main text with a few spaces. See instructions on Line Numbers.

Check the Project Comments for the specific text you are proofreading.
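As a rough illustration of the left-justification rule, the Python sketch below strips leading indentation from each line while keeping every line break and the blank lines between stanzas. The function name `left_justify_poetry` is hypothetical, and the sketch assumes the page text is already split into lines.

```python
# Hypothetical sketch: left-justify poetry lines without touching line breaks.
def left_justify_poetry(lines):
    """Remove leading whitespace from each line; blank stanza breaks stay blank."""
    return [line.lstrip() for line in lines]


stanza = [
    "Mary had a little lamb[1]",
    "   Whose fleece was white as snow",
    "",
    "And everywhere that Mary went",
]
for line in left_justify_poetry(stanza):
    print(line)
```

Only leading whitespace is removed, so a line number printed at the end of a line, along with the spaces that separate it from the verse, is left alone.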


Sample Image:

Correctly Proofread Text:
to the scenery of his own country:

Oh, to be in England
Now that April's there,
And whoever wakes in England
Sees, some morning, unaware,
That the lowest boughs and the brushwood sheaf
Round the elm-tree hole are in tiny leaf,
While the chaffinch sings on the orchard bough
In England--now!

And after April, when May follows,
And the whitethroat builds, and all the swallows!
Hark! where my blossomed pear-tree in the hedge
Leans to the field and scatters on the clover
Blossoms and dewdrops--at the bent spray's edge--
That's the wise thrush; he sings each song twice over,
Lest you should think he never could recapture
The first fine careless rapture!
And though the fields look rough with hoary dew,
All will be gay, when noontide wakes anew
The buttercups, the little children's dower;
--Far brighter than this gaudy melon-flower!

So it runs; but it is only a momentary memory;
and he knew, when he had done it, and to his

Paragraph Side-Descriptions (Sidenotes)

Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. Proofread the sidenote text as it is printed, preserving the line breaks. Leave a blank line before and after the sidenote, so that it can be distinguished from the text around it. The OCR may place the sidenotes anywhere on the page, and may even intermingle the sidenote text with the rest of the text. Separate them so that the sidenote text is all together, but don't worry about the position of the sidenotes on the page. The formatters will move them to the correct locations.
Sample Image:

Correctly Proofread Text:

Burning
discs
thrown into
the air.

that such as looked at the fire holding a bit of larkspur
before their face would be troubled by no malady of the
eyes throughout the year.[1] Further, it was customary at
Würzburg, in the sixteenth century, for the bishop's followers
to throw burning discs of wood into the air from a mountain
which overhangs the town. The discs were discharged by
means of flexible rods, and in their flight through the darkness
presented the appearance of fiery dragons.[2]

The Midsummer
fires in
Swabia.

In the valley of the Lech, which divides Upper Bavaria
from Swabia, the midsummer customs and beliefs are, or
used to be, very similar. Bonfires are kindled on the
mountains on Midsummer Day; and besides the bonfire
a tall beam, thickly wrapt in straw and surmounted by a
cross-piece, is burned in many places. Round this cross as
it burns the lads dance with loud shouts; and when the
flames have subsided, the young people leap over the fire in
pairs, a young man and a young woman together. If they
escape unsmirched, the man will not suffer from fever, and
the girl will not become a mother within the year. Further,
it is believed that the flax will grow that year as high as
they leap over the fire; and that if a charred billet be taken
from the fire and stuck in a flax-field it will promote the
growth of the flax.[3] Similarly in Swabia, lads and lasses,
hand in hand, leap over the midsummer bonfire, praying
that the hemp may grow three ells high, and they set fire
to wheels of straw and send them rolling down the hill.
Among the places where burning wheels were thus bowled
down hill at Midsummer were the Hohenstaufen mountains
in Wurtemberg and the Frauenberg near Gerhausen.[4]
At Deffingen, in Swabia, as the people sprang over the mid-*

Omens
drawn from
the leaps
over the
fires.

Burning
wheels
rolled
down hill.

1 Op. cit. iv. 1. p. 242. We have
seen (p. 163) that in the sixteenth
century these customs and beliefs were
common in Germany. It is also a
German superstition that a house which
contains a brand from the midsummer
bonfire will not be struck by lightning
(J. W. Wolf, Beiträge zur deutschen
Mythologie, i. p. 217, § 185).

2 J. Boemus, Mores, leges et ritus
omnium gentium (Lyons, 1541), p.
226.

3 Karl Freiherr von Leoprechting,
Aus dem Lechrain (Munich, 1855),
pp. 181 sqq.; W. Mannhardt, Der
Baumkultus, p. 510.

4 A. Birlinger, Volksthümliches aus
Schwaben (Freiburg im Breisgau, 1861-1862),
ii. pp. 96 sqq., § 128, pp. 103
sq., § 129; id., Aus Schwaben (Wiesbaden,
1874), ii. 116-120; E. Meier,
Deutsche Sagen, Sitten und Gebräuche
aus Schwaben (Stuttgart, 1852), pp.
423 sqq.; W. Mannhardt, Der Baumkultus,
p. 510.

Tables

A proofreader's job is to be sure that all the information in a table is correctly proofed. Details of formatting will be handled later in the process. Provide enough space between entries on a line to clearly indicate where each item ends and begins. Retain line breaks.

Footnotes in tables should remain where they are in the original. See footnotes for details.

Sample Image:

Correctly Proofread Text:
Deg. C.  Millimeters of Mercury. Gasolene.
Pure Benzene.

-10°  13.4  43.5
 0°  26.6  81.0
+10°  46.6  132.0
20°  76.3  203.0
40°  182.0  301.8

Sample Image:

Correctly Proofread Text:
TABLE II.

Flat strips compared   Copper   Copper
with round wire 30 cm.  Iron. Parallel wires 30 cm. in  Iron.
in length.             length.

Wire 1 mm. diameter   20  100  Wire 1 mm. diameter   20 100

        STRIPS.      SINGLE WIRE.
0.25 mm. thick, 2 mm.
  wide  15  35  0.25 mm. diameter   16   48
Same, 5 mm. wide       13  20  Two similar wires    12  30
 "   10  "    "   11   15  Four    "    "     9   18
 "   20  "    "    10  14  Eight  "    "   8   10
 "   40  "    "    9   13  Sixteen "    "     7    6
Same strip rolled up in  Same 16 wires bound
  the form of a wire  17   15    close together   18    12

Front/Back Title Page

Proofread all the text, just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.

Older books often show the first letter as a large ornate graphic—proofread this as just the letter.

Sample Image:
Correctly Proofread Text:

GREEN FANCY

BY

GEORGE BARR McCUTCHEON

AUTHOR OF "GRAUSTARK," "THE HOLLOW OF HER HAND,"
"THE PRINCE OF GRAUSTARK," ETC.

WITH FRONTISPIECE BY
C. ALLAN GILBERT

NEW YORK
DODD, MEAD AND COMPANY.

1917

Table of Contents

Proofread the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. Page numbers should be retained.

Ignore any periods or asterisks (leaders) used to align the page numbers. These will be removed later in the process.

Sample Image:

Correctly Proofread Text:

CONTENTS


CHAPTER                                         PAGE

I. THE FIRST WAYFARER AND THE SECOND WAYFARER
MEET AND PART ON THE HIGHWAY  ..... 1

II. THE FIRST WAYFARER LAYS HIS PACK ASIDE AND
FALLS IN WITH FRIENDS  .... ... 15

III. MR. RUSHCROFT DISSOLVES, MR. JONES INTERVENES,
AND TWO MEN RIDE AWAY      35

IV. AN EXTRAORDINARY CHAMBERMAID, A MIDNIGHT
TRAGEDY, AND A MAN WHO SAID "THANK YOU"   50

V. THE FARM-BOY TELLS A GHASTLY STORY, AND AN
IRISHMAN ENTERS  ..  .. 67

VI. CHARITY BEGINS FAR FROM HOME, AND A STROLL IN
THE WILDWOOD FOLLOWS      85

VII. SPUN-GOLD HAIR, BLUE EYES, AND VARIOUS ENCOUNTERS  ...   103

VIII. A NOTE, SOME FANCIES, AND AN EXPEDITION IN
QUEST OF FACTS  .. ,, 120

IX. THE FIRST WAYFARER, THE SECOND WAYFARER, AND
THE SPIRIT OF CHIVALRY ASCENDANT   ,  134

X. THE PRISONER OF GREEN FANCY, AND THE LAMENT OF
PETER THE CHAUFFEUR ...   ....148

XI. MR. SPROUSE ABANDONS LITERATURE AT AN EARLY
HOUR IN THE MORNING ..  ...  , 167

XII. THE FIRST WAYFARER ACCEPTS AN INVITATION, AND
MR. DILLINGFORD BELABORS A PROXY      183

XIII. THE SECOND WAYFARER RECEIVES TWO VISITORS AT
MIDNIGHT  ,,,..  .... 199

XIV. A FLIGHT, A STONE-CUTTER'S SHED, AND A VOICE
OUTSIDE   ,,,..  ...., 221

Indexes

Please retain page numbers in index pages. You don't need to align the numbers as they appear in the scan; just make sure that the numbers and punctuation match the scan and retain the line breaks.

Specific formatting of indexes will occur later in the process. The proofreader's job is to be sure that all the text and numbers are correct.

Plays: Actor Names/Stage Directions

For all plays:

  • In dialogue, treat a change in speaker as a new paragraph, with one blank line between.
  • Stage directions are formatted as they are in the original text.
    If the stage direction is on a line by itself, proofread it that way; if it is at the end of a line of dialogue, leave it there.
    Stage directions often begin with an opening bracket and omit the closing bracket. This convention is retained; do not close the brackets.
  • Sometimes, especially in metrical plays, a word is split due to page-size constraints and placed above or below following a (, rather than having a line of its own. Please treat this as a normal end-of-line reattachment.
    See the example.

Please check the Project Comments, as the Project Manager may specify different formatting.

Sample Image:
Correctly Proofread Text:

Has not his name for nought, he will be trode upon:
What says my Printer now?

Clow. Here's your last Proof, Sir.
You shall have perfect Books now in a twinkling.

Lap. These marks are ugly.

Clow. He says, Sir, they're proper:
Blows should have marks, or else they are nothing worth.

Lap. But why a Peel-crow here?

Clow. I told 'em so Sir:
A scare-crow had been better.

Lap. How slave? look you, Sir,
Did not I say, this Whirrit, and this Bob,
Should be both Pica Roman.

Clow. So said I, Sir, both Picked Romans,
And he has made 'em Welch Bills,
Indeed I know not what to make on 'em.

Lap. Hay-day; a Souse, Italica?

Clow. Yes, that may hold, Sir,
Souse is a bona roba, so is Flops too.


Sample Image:
Correctly Proofread Text:

Am. Sure you are fasting;
Or not slept well to night; some dream (Ismena?)

Ism. My dreams are like my thoughts, honest and innocent,
Yours are unhappy; who are these that coast us?
You told me the walk was private.

Anything else that needs special handling or that you're unsure of

While proofreading, if you encounter something that isn't covered in these guidelines that you think needs special handling or that you are not sure how to handle, post your question, noting the png (page) number, in the Project Discussion thread (a link to the project-specific forum is in the Project Comments), and put a note in the proofread text explaining the problem. Your note will explain to the next proofreader, formatter or post-processor what the problem or question is.

Start your note with an opening square bracket and two asterisks [** and end it with a closing square bracket ]. This clearly separates it from the Author's text and signals the Post-Processor to stop and carefully examine this part of the text and the matching image to address any issues. Agreement or disagreement can be added, but even if you know the answer, you absolutely must not remove the comment. If you have found a source which clarifies the problem, please cite it so the post-processor can also refer to it.

If you are proofreading in a later round and come across a note from a proofreader in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.
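The note convention described above is simple enough to sketch in Python. The helper names `make_note` and `find_notes` are hypothetical and are not taken from the DP software. The pattern assumes a note contains no nested closing bracket, which will not always hold in practice.

```python
import re

# A proofreader's note: "[**", then any text, up to the next "]".
# Assumption: the note text itself contains no "]" character.
NOTE_PATTERN = re.compile(r"\[\*\*([^\]]*)\]")


def make_note(message):
    """Wrap a message in the [** ... ] note markers."""
    return "[**" + message.strip() + "]"


def find_notes(page_text):
    """Return the text of every [** ... ] note found on a page."""
    return [note.strip() for note in NOTE_PATTERN.findall(page_text)]


page = "place a note in the txet [**typo for text?] and ask"
print(make_note("unclear word, see image"))
print(find_notes(page))
```

The sketch only builds and lists notes; in keeping with the rule above, nothing in it removes a note from the text.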

Previous Proofreaders' Notes/Comments

Any notes or comments put in by a previous volunteer must be left in place. You may add agreement or disagreement to the existing note but even if you know the answer, you absolutely must not remove the comment. If you have found a source which clarifies the problem, please cite it so the post-processor can also refer to it.

If you are proofreading in a later round and come across a note from a volunteer in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.

 

Common Problems

OCR Problems: 1-l-I

OCR commonly has trouble distinguishing between the digit '1' (one), the lowercase letter 'l' (ell), and the uppercase letter 'I'. This is especially true for books where the pages may be in poor condition.

Watch out for these. Read the context of the sentence to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DPCustomMono or Courier.

OCR Problems: 0-O

OCR commonly has trouble distinguishing between the digit '0' (zero) and the uppercase letter 'O'. This is especially true for books where the pages may be in poor condition.

Watch out for these. Normally the context of the sentence is sufficient to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.

Noticing these is much easier if you use a mono-spaced font such as DPCustomMono or Courier.
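As a rough aid for the 1-l-I and 0-O confusions described above, the Python sketch below flags tokens that mix letters and digits and contain one of the confusable characters. It is an illustration only and not a DP tool; the name `suspicious_tokens` and the choice of heuristic are assumptions. It will over-flag legitimate text such as ordinals, and it cannot catch a lone "I" printed as "1", so every hit still has to be compared with the page image.

```python
import re

# Characters OCR tends to confuse: digit 1, letter l, letter I, digit 0, letter O.
CONFUSABLE = set("1lI0O")
TOKEN = re.compile(r"[A-Za-z0-9]+")


def suspicious_tokens(line):
    """Return tokens that mix letters and digits and contain a confusable character."""
    found = []
    for token in TOKEN.findall(line):
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        if has_digit and has_alpha and CONFUSABLE & set(token):
            found.append(token)
    return found


print(suspicious_tokens("In l9l7 he paid 1O dollars to Tom."))
```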

OCR Problems: Scannos

Another common OCR issue is misrecognition of characters. We call these errors "scannos" (like "typos"). This misrecognition can create errors in the text:

  • A word that appears to be correct at first glance, but is actually misspelled.
    This can usually be caught by running the spellcheck from the proofreading interface.
  • A word that is changed to a different but otherwise valid word that does not match what is in the page image.
    These are subtle because they can only be caught by someone actually reading the text.

Possibly the most common example of the second type is "and" being OCR'd as "arid." Other examples: "eve" for "eye", "Torn" for "Tom", "train" for "tram". This type is harder to spot and we have a special term for them: "Stealth Scannos." We collect examples of Stealth Scannos in this thread.

Spotting scannos is much easier if you use a mono-spaced font such as DPCustomMono or Courier.
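Because stealth scannos are valid words, a spellcheck will not catch them, but a short word list can at least point a proofreader to places worth a second look. The Python sketch below is a hypothetical illustration that uses only the four example pairs given above; the name `flag_stealth_scannos` is an assumption, matching is case-sensitive, and every hit must still be checked against the page image, since words like "train" are usually correct.

```python
import re

# Pairs from the examples above: word as OCR'd -> word it may have replaced.
STEALTH_SCANNOS = {
    "arid": "and",
    "eve": "eye",
    "Torn": "Tom",
    "train": "tram",
}
WORD = re.compile(r"[A-Za-z]+")


def flag_stealth_scannos(text):
    """Return (line_number, word, possible_original) tuples for manual checking."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in WORD.findall(line):
            if word in STEALTH_SCANNOS:
                hits.append((lineno, word, STEALTH_SCANNOS[word]))
    return hits


sample = "He was cold arid tired.\nTorn went home."
for hit in flag_stealth_scannos(sample):
    print(hit)
```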

Handwritten Notes in Book

Do not include handwritten notes in a book, unless the handwriting overwrites faded printed text to make it more visible. Do not include handwritten marginal notes made by readers, etc.

Bad Images

If an image is bad (not loading, chopped off, unable to be read), please put a post about this bad image in the Project Comments forum. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Note that some page images are quite large, and it is common for your browser to have difficulty displaying them, especially if you have several windows open or are using an older computer. Before reporting this as a bad page, try clicking on the "Image" line on the bottom of the page to bring up just the image in a new window. If that brings up a good image, then the problem is probably in your browser or system.

It's fairly common for the image to be good, but the OCR scan is missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the scan, then either type in the whole page (if you are willing to do that), or just click on the "Return Page to Round" button and the page will be reissued to someone else. If there are several pages like this, you might post a note in the Project Comments forum to notify the Project Manager.

Wrong Image for Text

If there is a wrong image for the text given, please put a post about this bad image in the Project Comments forum. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.

Previous Proofreader Mistakes

If the previous proofreader made a lot of mistakes or missed a lot of things, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future.

Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.

If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page.

Printer Errors/Misspellings

Correct all of the words that the OCR has misread (scannos), but do not correct what may appear to you to be misspellings or printer errors that occur on the scanned image. Many of the older texts have words spelled differently from modern usage and we retain these older spellings, including any accented characters.

If you are unsure, place a note in the txet [**typo for text?] and ask in the Project Discussion thread. If you do make a change, include a note describing what you changed: [**Transcriber's Note: typo fixed, changed from "txet" to "text"]. Include the two asterisks ** so the post-processor will notice it.

Factual Errors in Texts

In general, don't correct factual errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them.

A possible exception is in technical or scientific books, where a known formula or equation may be given incorrectly, especially if it is shown correctly on other pages of the book. Notify the Project Manager about these, either by sending them a message via the Forum, or by inserting [**note sic explain-your-concern] at that point in the text.

Uncertain Items

      [...to be completed...]

Copyright Distributed Proofreaders