Proofreading Guidelines
Version 1.9.c, generated January 1, 2006 (Revision History)
Proofreading Guidelines in French / Directives de Formatage en français
This document is written to explain the proofreading rules we use to maintain consistency when proofreading a single book that is distributed among many proofreaders, each of whom is working on different pages. This helps us all do proofreading the same way, which in turn makes it easier for the formatter and for the post-processor who will complete the work on this e-book.
It is not intended as any kind of a general editorial or typesetting rulebook.
We've included in this document all the items that new users have asked about while proofreading. If there are any items missing, or items that you consider should be done differently, or if something is vague, please let us know.
This document is a work in progress. Help us to progress by posting your suggested changes in the Documentation Forum in this thread.
On the proofreading interface page (Project Page) where you start proofreading pages, there is a section called "Project Comments" containing information specific to that project (book). Read these before you start proofreading pages! If the Project Manager wants you to do something in this book differently from the way specified in these Guidelines, that will be noted here. Instructions in the Project Comments override the rules in these Guidelines, so follow them. There may also be instructions in the project comments that apply to the formatting phase, which do not apply during proofing. Finally, this is also where the Project Manager may give you interesting tidbits of information about the author or the project.
Please also read the Project Thread (Forum): the Project Manager may clarify project-specific guidelines here, and proofreaders often use it to alert one another to recurring issues within the project and how best to address them. (See below.)
On the Project Page, the link 'Images, Pages Proofread, & Differences' allows you to see how other proofreaders have made changes. This Forum thread discusses different ways to use this information.
On the proofreading interface page (Project Page) where you start proofreading pages, on the line "Forum", there is a link titled "Discuss this Project" (if the discussion has already started), or "Start a discussion on this Project" (if it hasn't). Clicking on that link will take you to a thread in the projects forum dedicated to this specific project. That is the place to ask questions about this book, inform the Project Manager about problems, etc. Using this project forum thread is the recommended way to communicate with the Project Manager and other proofreaders who are working on this book.
When you select a project for proofreading, the Project Comments page is loaded. This page contains links to pages from this project that you have recently proofread. (If you haven't proofread any pages yet, there will be no links shown.)
Pages listed under either "DONE" or "IN PROGRESS" are available to make proofreading corrections or to finish proofreading. Just click on the link to the page. So if you discover that you made a mistake on a page, or marked something incorrectly, you can click on that page here and re-open it to fix the error.
You may also use the "Images, Pages Proofread, & Differences" or "Just My Pages" links on the Project Comments page. These pages will display an "Edit" link next to the pages you have worked on in the current round that can still be corrected.
For more detailed information, refer to either the Standard Proofreading Interface Help or the Enhanced Proofreading Interface Help, depending on which interface you are using.
How to proof... |
Leave all line breaks in so that later in the process other volunteers can easily compare the lines in the text to the lines in the image. If the previous proofreader removed the line breaks, please replace them so that they once again match the image.
Proofread these as plain ASCII " double quotes. Do not change double quotes to single quotes. Leave them as the Author wrote them.
For quotes from non-English languages, use the quotation marks appropriate to that language if they are available in the Latin-1 character set. The French equivalent, guillemets, «like this», are available from the pulldown menus in the proofreading interface, since they are part of Latin-1. The quotation marks used in some German texts, „like this” are not available in the pulldown menus, as they are not in Latin-1. The Project Manager may instruct you in the Project Comments to proofread non-English language quotation marks differently for a particular book.
Proofread these as the plain ASCII ' single quote (apostrophe). Do not change single quotes to double quotes. Leave them as the Author wrote them.
Proofread quotation marks at the beginning of each line of a quotation by removing all of them except for the one at the start of the first line of the quotation.
If the quotation goes on for multiple paragraphs, each paragraph should have an opening quote mark on the first line of the paragraph.
Often there is no closing quotation mark until the very end of the quoted section of text, which may not be on the same page you are proofreading. Leave it that way—do not add closing quotation marks that are not in the page image.
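The quotation-mark rules above can be sketched in Python. This is only an illustration, not a tool used by the site: the name `straighten_quotes` is hypothetical, and it assumes the OCR text contains typographic (curly) quotes. Guillemets are left untouched because they are part of Latin-1.

```python
# Map typographic quotes to the plain ASCII marks the guidelines ask for.
# Guillemets are deliberately left alone because they exist in Latin-1.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / typographic apostrophe
})


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by ASCII quotes."""
    return text.translate(QUOTE_MAP)


print(straighten_quotes("\u201cDon\u2019t go,\u201d she said."))
```

The sketch never adds or removes a quotation mark, so a missing closing quote stays missing, as the guidelines require.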
Proofread periods that end sentences with a single space after them.
You do not need to remove extra spaces after periods if they're already in the scanned text—we can do that automatically during post-processing. See the Sidenotes image and text for an example.
In general, there should be no space before punctuation characters except opening quotation marks. If scanned text has a space before punctuation, remove it.
Spaces before punctuation sometimes appear because books typeset in the 1700s and 1800s often used partial spaces before punctuation such as a semicolon or comma.
Scanned Text: | Correctly Proofread Text: |
---|---|
and so it goes ; ever and ever. | and so it goes; ever and ever. |
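As a rough illustration of this rule, the following Python sketch removes spaces that sit directly before a few punctuation marks. The function name is hypothetical, and the set of marks is an assumption: periods are left out on purpose so that the space before an ellipsis is not disturbed.

```python
import re


def remove_space_before_punctuation(line: str) -> str:
    """Delete spaces or tabs that directly precede ; : , ! or ?"""
    # Periods are excluded so that " ..." (space before an ellipsis) survives.
    return re.sub(r"[ \t]+([;:,!?])", r"\1", line)


print(remove_space_before_punctuation("and so it goes ; ever and ever."))
```

Spaces after the punctuation mark are not touched, and instructions in the Project Comments may call for different handling in a particular book.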
The guidelines are different for English and Languages Other Than English (LOTE).
ENGLISH: Leave a space before the three dots, and a space after. The exception is at the end of a sentence, when there would be no space, four dots, and a space after. This is also the case for any other ending punctuation mark: the 3 dots follow immediately, without any space.
For example:
That I know ... is true.
This is the end....
Wherefore art thou Romeo?...
Sometimes you will see it with the punctuation at the end, so proofread it that way:
Wherefore art thou Romeo...?
Remove extra dots, if any, or add new ones, if necessary, to bring the number to three (or four) as appropriate.
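For the English rule, the dot count could be normalized along these lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, it treats a run of dots that directly follows a word as the end of a sentence, and it does not insert missing spaces. A proofreader still has to check each case against the page image.

```python
import re


def normalize_ellipsis(line: str) -> str:
    """Bring each run of two or more dots to three or four dots (English rule)."""
    def fix(match):
        start, end = match.span()
        before = line[start - 1] if start > 0 else " "
        after = line[end] if end < len(line) else ""
        # Four dots only when the run directly follows a word and is not
        # followed by other ending punctuation such as "?" or "!".
        if before not in (" ", "?", "!") and after not in ("?", "!"):
            return "...."
        return "..."

    return re.sub(r"\.{2,}", fix, line)


print(normalize_ellipsis("This is the end....."))
```

Because the sketch cannot tell an OCR slip such as `know...is` from a real sentence end, its output is a suggestion to review rather than a final answer.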
LOTE: (Languages Other Than English) Use the general rule "Follow closely the style used in the printed page." In particular, insert spaces, if there are spaces before or between the periods, and use the same number of periods as appear in the image. Sometimes the printed page is unclear: in that case, insert a [**unclear] to draw the attention of the post-processor. (Note: Post-Processors should replace those regular spaces with non-breaking spaces.)
Remove any extra space in contractions: for example, would n't should be proofread as wouldn't.
This was often an early printers' convention, where the space was retained to indicate that 'would' and 'not' were originally separate words. It is also sometimes an artifact of the OCR. Remove the extra space in either case.
Some Project Managers may specify in the Project Comments not to remove extra spaces in contractions, particularly in the case of texts that contain slang, dialect, or are written in languages other than English.
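A small Python sketch of the contraction rule follows. It only handles the `n't` pattern given in the example above; the function name is hypothetical, and it should not be applied where the Project Comments ask for the spaces to be kept.

```python
import re


def close_up_contractions(line: str) -> str:
    """Remove the space in contractions such as "would n't"."""
    # Only spaces between a letter and a following "n't" are removed.
    return re.sub(r"(?<=[A-Za-z]) +(?=n't\b)", "", line)


print(close_up_contractions("I would n't go."))
```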
Extra spaces and tab characters between words are common in OCR output. You don't need to bother removing these—that can be done automatically during post-processing.
However, extra spaces around punctuation, em-dashes, quote marks, etc. do need to be removed when they separate the symbol from the word.
For example, in A horse ; my kingdom for a horse. the space between the word "horse" and the semicolon should be removed. But the 2 spaces after the semicolon are fine—you don't have to delete one of them.
Do not bother inserting spaces at the ends of lines of text. It is a waste of your time for something that we can take care of automatically later. Similarly do not waste your time removing extra spaces at the ends of lines.
Keep line numbers. Use a few spaces to separate them from the other text on the line so that the formatters can easily find them.
Line numbers are numbers in the margin for each line, or sometimes every fifth or tenth line, and are common in books of poetry. Since poetry will not be reformatted in the e-book version, the line numbers will be useful to readers.
Italicized text may occasionally appear with <i> inserted at the start and </i> inserted at the end of the italics. Bold text (text printed in a heavier typeface) may occasionally appear with <b> inserted before the bold text and </b> after it. Do not remove this formatting information, unless it surrounds junk that does not appear on the page. Do not add it where it does not appear. The formatters will do that later in the process.
Older books often abbreviated words as contractions, and printed them as superscripts: for example,
Genrl Washington defeated Ld Cornwall's army.
Proofread these by inserting an up-arrow followed by the superscripted text, like this:
Gen^rl Washington defeated L^d Cornwall's army.
Subscripted text is often found in scientific works, but is not common in other material. Proofread subscripted text by inserting an underline character _.
For example:
H2O.
would be proofread as
H_2O.
Do not mark changes in font size. The formatters will take care of this later in the process.
Small caps (capital letters which are smaller than the standard capitals) may occasionally appear with <sc> inserted before the small caps and </sc> after the small caps. Once again, do not remove this formatting information, unless it surrounds junk that does not appear on the page. Do not add it where it does not appear. The formatters will do that later in the process. Please proof only the characters in small caps. Do not worry about case changes. If they are already ALL-CAPPED, Mixed-Cased, or lower-cased, leave them ALL-CAPPED, Mixed-Cased, or lower-cased.
Proofread large and ornate graphic first letters of a chapter, section, or paragraph as just the letter.
Please proofread these using the proper accented Latin-1 characters, where possible. See Diacritical marks for ways to proof some non-Latin-1 characters.
There are several ways of inputting accented characters:
Project Gutenberg will post as a minimum, 7-bit ASCII versions of texts, but versions using other character encodings which can preserve more of the information from the original text are accepted. Currently for Distributed Proofreaders this means using Latin-1 or ISO 8859-1 and -15, and in the future will include Unicode.
For Windows:
Windows Shortcuts for Latin-1 symbols

Letter | ` grave | ´ acute (aigu) | ^ circumflex | ~ tilde | ¨ umlaut |
---|---|---|---|---|---|
a | à Alt-0224 | á Alt-0225 | â Alt-0226 | ã Alt-0227 | ä Alt-0228 |
A | À Alt-0192 | Á Alt-0193 | Â Alt-0194 | Ã Alt-0195 | Ä Alt-0196 |
e | è Alt-0232 | é Alt-0233 | ê Alt-0234 | | ë Alt-0235 |
E | È Alt-0200 | É Alt-0201 | Ê Alt-0202 | | Ë Alt-0203 |
i | ì Alt-0236 | í Alt-0237 | î Alt-0238 | | ï Alt-0239 |
I | Ì Alt-0204 | Í Alt-0205 | Î Alt-0206 | | Ï Alt-0207 |
o | ò Alt-0242 | ó Alt-0243 | ô Alt-0244 | õ Alt-0245 | ö Alt-0246 |
O | Ò Alt-0210 | Ó Alt-0211 | Ô Alt-0212 | Õ Alt-0213 | Ö Alt-0214 |
u | ù Alt-0249 | ú Alt-0250 | û Alt-0251 | | ü Alt-0252 |
U | Ù Alt-0217 | Ú Alt-0218 | Û Alt-0219 | | Ü Alt-0220 |
n | | | | ñ Alt-0241 | |
N | | | | Ñ Alt-0209 | |
y | | | | | ÿ Alt-0255 |
Y | | | | | Ÿ Alt-0159 |

Group | Symbols and shortcuts |
---|---|
° ring | å (Alt-0229), Å (Alt-0197) |
Æ ligature | æ (Alt-0230), Æ (Alt-0198) |
/ slash | ø (Alt-0248), Ø (Alt-0216) |
Œ ligature | œ (use [oe]), Œ (use [OE]) |
cedilla | ç (Alt-0231), Ç (Alt-0199) |
Icelandic | Þ (Alt-0222), þ (Alt-0254), Ð (Alt-0208), ð (Alt-0240) |
sz ligature | ß (Alt-0223) |
superscripts | ¹ (Alt-0185), ² (Alt-0178), ³ (Alt-0179) |
marks | © (Alt-0169), ® (Alt-0174), ™ (Alt-0153), ¶ (Alt-0182), § (Alt-0167), ¦ (Alt-0166) |
accents | ´ (Alt-0180), ¨ (Alt-0168), ¯ (Alt-0175), ¸ (Alt-0184) |
punctuation | ¿ (Alt-0191), ¡ (Alt-0161), « (Alt-0171), » (Alt-0187), · (Alt-0183), * (Alt-0042) |
currency | ¢ (Alt-0162), £ (Alt-0163), ¥ (Alt-0165), $ (Alt-0036), ¤ (Alt-0164) |
mathematics | ± (Alt-0177), × (Alt-0215), ÷ (Alt-0247), ¬ (Alt-0172), ° (Alt-0176), µ (Alt-0181), ¼ (Alt-0188) [1], ½ (Alt-0189) [1], ¾ (Alt-0190) [1] |
ordinals | º (Alt-0186), ª (Alt-0170) |

[1] Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)
For Apple Macintosh:
Apple Mac Shortcuts for Latin-1 symbols

Letter | ` grave | ´ acute (aigu) | ^ circumflex | ~ tilde | ¨ umlaut |
---|---|---|---|---|---|
a | à Opt-`, a | á Opt-e, a | â Opt-i, a | ã Opt-n, a | ä Opt-u, a |
A | À Opt-`, A | Á Opt-e, A | Â Opt-i, A | Ã Opt-n, A | Ä Opt-u, A |
e | è Opt-`, e | é Opt-e, e | ê Opt-i, e | | ë Opt-u, e |
E | È Opt-`, E | É Opt-e, E | Ê Opt-i, E | | Ë Opt-u, E |
i | ì Opt-`, i | í Opt-e, i | î Opt-i, i | | ï Opt-u, i |
I | Ì Opt-`, I | Í Opt-e, I | Î Opt-i, I | | Ï Opt-u, I |
o | ò Opt-`, o | ó Opt-e, o | ô Opt-i, o | õ Opt-n, o | ö Opt-u, o |
O | Ò Opt-`, O | Ó Opt-e, O | Ô Opt-i, O | Õ Opt-n, O | Ö Opt-u, O |
u | ù Opt-`, u | ú Opt-e, u | û Opt-i, u | | ü Opt-u, u |
U | Ù Opt-`, U | Ú Opt-e, U | Û Opt-i, U | | Ü Opt-u, U |
n | | | | ñ Opt-n, n | |
N | | | | Ñ Opt-n, N | |
y | | | | | ÿ Opt-u, y |
Y | | | | | Ÿ Opt-u, Y |

Group | Symbols and shortcuts |
---|---|
° ring | å (Opt-a), Å (Opt-A) |
Æ ligature | æ (Opt-'), Æ (Opt-") |
/ slash | ø (Opt-o), Ø (Opt-O) |
Œ ligature | œ (use [oe]), Œ (use [OE]) |
cedilla | ç (Opt-c), Ç (Opt-C) |
Icelandic | Þ (none ‡), þ (Shift-Opt-6), Ð (none ‡), ð (none ‡) |
sz ligature | ß (Opt-s) |
superscripts | ¹ (none ‡), ² (none ‡), ³ (none ‡) |
marks | © (Opt-g), ® (Opt-r), ™ (Opt-2), ¶ (Opt-7), § (Opt-6), ¦ (none ‡) |
accents | ´ (Opt-E), ¨ (Opt-U), ¯ (Shift-Opt-,), ¸ (Opt-Z) |
punctuation | ¿ (Opt-?), ¡ (Opt-1), « (Opt-\), » (Shift-Opt-\), · (Opt-8), * (none ‡) |
currency | ¢ (Opt-4), £ (Opt-3), ¥ (Opt-y), $ (Shift-4), ¤ (Shift-Opt-2) |
mathematics | ± (Opt-+), × (none ‡), ÷ (Opt-/), ¬ (Opt-l), ° (Opt-*), µ (Opt-m), ¼ (none ‡) [1], ½ (none ‡) [1], ¾ (none ‡) [1] |
ordinals | º (Opt-0), ª (Opt-9) |

‡ Note: No equivalent shortcut; use the drop-down menus.
[1] Unless specifically requested by the Project Comments, please do not use the fraction symbols, but instead use the guidelines for Fractions. (1/2, 1/4, 3/4, etc.)
In some projects, you will find characters with special marks either above or below the normal Latin A..Z character. These are called diacritical marks and indicate a special pronunciation for this character. For proofreading, we indicate them in our normal ASCII text by using a specific coding, such as: ă becomes [)a] for a breve (the u-shaped accent) above an a, or [a)] for a breve below.
Be sure to include the square brackets ([ ]) around these, so the post-processor knows to which letter it applies. He or she will eventually replace these with whatever symbol works in each version of the text they produce, like 7-bit ASCII, 8-bit, Unicode, html, etc.
Note that when some of these marks appear on some characters (mainly vowels) our standard Latin-1 character set already includes that character with the diacritical mark. In those cases, use the Latin-1 character (see here), available from the drop-down lists in the proofreading interface.
The table below lists the special codings currently used:
The "x" represents a character with a diacritical mark.
When proofreading, use the actual character from the text, not the x shown in the examples.
Proofreading Symbols for Diacritical Marks | |||
---|---|---|---|
diacritical mark | sample | above | below |
macron (straight line) | ¯ | [=x] | [x=] |
2 dots (diaeresis, umlaut) | ¨ | [:x] | [x:] |
1 dot | · | [.x] | [x.] |
grave accent | ` | [`x] or [\x] | [x`] or [x\] |
acute accent (aigu) | ´ | ['x] or [/x] | [x'] or [x/] |
circumflex | ˆ | [^x] | [x^] |
caron (v-shaped symbol) | ∨ | [vx] | [xv] |
breve (u-shaped symbol) | ∪ | [)x] | [x)] |
tilde | ˜ | [~x] | [x~] |
cedilla | ¸ | [,x] | [x,] |
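The codings in this table can be illustrated with a short Python sketch. It is not part of the proofreading interface: the function name is hypothetical, and the mapping from Unicode combining marks to the table's symbols is an assumption made for this example. Characters that already exist in Latin-1 are kept as they are, and the œ ligature is written as [oe] or [OE].

```python
import unicodedata

# Combining marks placed above the letter, mapped to the table's symbols.
ABOVE = {"\u0304": "=", "\u0308": ":", "\u0307": ".", "\u0300": "`",
         "\u0301": "'", "\u0302": "^", "\u030c": "v", "\u0306": ")",
         "\u0303": "~"}
# Combining marks placed below the letter.
BELOW = {"\u0331": "=", "\u0324": ":", "\u0323": ".", "\u032d": "^",
         "\u032c": "v", "\u032e": ")", "\u0330": "~", "\u0327": ","}
LIGATURES = {"\u0153": "[oe]", "\u0152": "[OE]"}


def encode_diacritics(text: str) -> str:
    """Replace non-Latin-1 accented letters with bracket codings."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in LIGATURES:
            out.append(LIGATURES[ch])
            continue
        try:
            ch.encode("latin-1")  # Latin-1 characters are used directly
            out.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[1] in ABOVE:
            out.append("[" + ABOVE[parts[1]] + parts[0] + "]")
        elif len(parts) == 2 and parts[1] in BELOW:
            out.append("[" + parts[0] + BELOW[parts[1]] + "]")
        else:
            out.append(ch)  # anything else is left for a human to decide
    return "".join(out)


print(encode_diacritics("\u0103 and \u1e0d"))
```

Letters carrying more than one mark, and marks with no precomposed form, pass through unchanged in this sketch and would still need manual coding.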
Some projects contain text printed in non-Latin characters (that is, characters other than the Latin A...Z), for example Greek, Cyrillic, Hebrew, or Arabic.
These characters should be entered in the text just as Latin characters are. (WITHOUT transliteration!)
If a document is written entirely in a non-Latin script, it is best to install a keyboard driver that supports the language. Consult your operating system manual for instructions on how to do that.
If the script appears only occasionally, you may use a separate program to enter it. See above for some of the programs.
If you are uncertain about a character or an accent, mark it with an * to bring it to the attention of the second round proofreader or the post-processor.
For scripts which cannot be so easily entered, such as Arabic, surround the text with appropriate markers: [Arabic: **] and leave it as scanned. Include the ** so the post-processor can address it later.
Proofread fractions as follows: 2½ becomes 2-1/2. The hyphen prevents the whole and fractional part from becoming separated when the lines are rewrapped during post-processing.
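The fraction rule can be sketched in Python as follows. The function name is hypothetical, and only the three Latin-1 fraction symbols are handled.

```python
import re

# The three fraction symbols available in Latin-1.
FRACTIONS = {"\u00bc": "1/4", "\u00bd": "1/2", "\u00be": "3/4"}


def expand_fractions(line: str) -> str:
    """Write fraction symbols as text, hyphenating them to a whole number."""
    def fix(match):
        whole, fraction = match.group(1), FRACTIONS[match.group(2)]
        # The hyphen keeps the whole and fractional parts together.
        return f"{whole}-{fraction}" if whole else fraction

    return re.sub(r"(\d+)?([\u00bc\u00bd\u00be])", fix, line)


print(expand_fractions("2\u00bd"))
```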
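There are generally four such marks you will see in books: hyphens, en-dashes, em-dashes, and long dashes. In the examples that follow, hyphens and en-dashes are proofread as a single hyphen (-), em-dashes as two hyphens (--), and long dashes as four hyphens (----).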
Note: If an em-dash appears at the start or end of a line of your OCR'd text, join it with the other line so that there are no spaces or line breaks around it. Only if the author used an em-dash to start or end the paragraph or line of poetry or dialog should you leave it at the start or end of a line. See the examples below.
Examples—Dashes, Hyphens, and Minus Signs:
Original Image: | Correctly Proofread Text: | Type |
---|---|---|
semi-detached | semi-detached | Hyphen |
three- and four-part harmony | three- and four-part harmony | Hyphens |
discoveries which the Crus- aders made and brought home with | discoveries which the Crusaders made and brought home with | Hyphen |
factors which mold char- acter—environment, training and heritage, | factors which mold character--environment, training and heritage, | Hyphen |
See pages 21–25 | See pages 21-25 | En-dash |
–14° below zero | -14° below zero | En-dash |
X – Y = Z | X - Y = Z | En-dash |
2–1/2 | 2-1/2 | En-dash |
I am hurt;—A plague on both your houses!—I am dead. | I am hurt;--A plague on both your houses!--I am dead. | Em-dash |
sensations—sweet, bitter, salt, and sour —if even all of these are simple tastes. What | sensations--sweet, bitter, salt, and sour--if even all of these are simple tastes. What | Em-dash |
senses—touch, smell, hearing, and sight— with which we are here concerned, | senses--touch, smell, hearing, and sight--with which we are here concerned, | Em-dash |
It is the east, and Juliet is the sun!— | It is the east, and Juliet is the sun!-- | Em-dash |
"Three hundred——" "years," she was going to say, but the left-hand cat interrupted her. | "Three hundred----" "years," she was going to say, but the left-hand cat interrupted her. | Longer Em-dash |
As the witness Mr. —— testified, | As the witness Mr. ---- testified, | Long dash |
As the witness Mr. S—— testified, | As the witness Mr. S---- testified, | Long dash |
the famous detective of ——B Baker St. | the famous detective of ----B Baker St. | Long dash |
“You —— Yankee”, she yelled. | "You ---- Yankee", she yelled. | Long dash |
“I am not a d—d Yankee”, he replied. | "I am not a d--d Yankee", he replied. | Em-dash |
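The character substitutions in the examples above can be sketched in Python. The function name is hypothetical. The sketch only swaps characters: closing up spaces around an em-dash and joining lines are left to the proofreader.

```python
# En-dash becomes one hyphen, em-dash becomes two. A long dash printed as
# two em-dashes therefore comes out as four hyphens.
DASH_MAP = str.maketrans({"\u2013": "-", "\u2014": "--"})


def ascii_dashes(line: str) -> str:
    """Replace en-dashes and em-dashes with ASCII hyphens."""
    return line.translate(DASH_MAP)


print(ascii_dashes("See pages 21\u201325"))
```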
Where a hyphen appears at the end of a line, join the two halves of the hyphenated word back together. If it is really a hyphenated word like well-meaning, join the two halves leaving the hyphen in between. But if it was just hyphenated because it wouldn't fit on the line, and is not a word that is usually hyphenated, then join the two halves and remove the hyphen. Keep the joined word on the top line, and put a line break after it to preserve the line formatting—this makes it easier for volunteers in later rounds. See the Dashes, Hyphens, and Minus Signs section of the Proofreading Guidelines for examples of each kind (nar-row turns into narrow, but low-lying keeps the hyphen). If the word is followed by punctuation, then carry that punctuation onto the top line, too.
Words like to-day and to-morrow that we don't commonly hyphenate now were often hyphenated in the old books we are working on. Leave them hyphenated the way the author did. If you're not sure if the author hyphenated it or not, leave the hyphen, put an * after it, and join the word together. Like this: to-*day. The asterisk will bring it to the attention of the post processor, who has access to all the pages, and can determine how the author typically wrote this word.
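The end-of-line procedure can be sketched as below. This is an illustration under stated assumptions: the function and both word lists (`hyphenated_words`, `plain_words`) are hypothetical stand-ins for the proofreader's own judgment. When neither list settles the question, the sketch keeps the hyphen and adds an asterisk, as described above.

```python
import string


def rejoin_line_end_hyphens(lines, hyphenated_words, plain_words):
    """Join words split by a hyphen at the end of a line."""
    out = list(lines)
    for i in range(len(out) - 1):
        line = out[i]
        # Skip lines that do not end in a single hyphen ("--" is an em-dash).
        if not line.endswith("-") or line.endswith("--") or not out[i + 1].strip():
            continue
        stem = line[:-1]
        first_half = stem.split(" ")[-1].lstrip(string.punctuation)
        second_half, _, rest = out[i + 1].lstrip().partition(" ")
        core = second_half.rstrip(string.punctuation)
        if f"{first_half}-{core}".lower() in hyphenated_words:
            joiner = "-"   # a real hyphenated word keeps its hyphen
        elif f"{first_half}{core}".lower() in plain_words:
            joiner = ""    # hyphen was only there to break the line
        else:
            joiner = "-*"  # unsure: keep the hyphen and flag it
        # The joined word, with any trailing punctuation, stays on the top line.
        out[i] = stem + joiner + second_half
        out[i + 1] = rest  # may be empty if the next line held only one word
    return out


print(rejoin_line_end_hyphens(
    ["discoveries which the Crus-", "aders made and brought home with"],
    {"low-lying"}, {"crusaders"}))
```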
Proofread end-of-page hyphens or em-dashes by leaving the hyphen at the end of the last line, and mark it with a * after the hyphen.
For example, proofread:
something Pat had already become accus-
as:
something Pat had already become accus-*
On pages that start with part of a word from the previous page or an em-dash, place a * before the partial word or em-dash.
To continue the above example, proofread:
tomed to from having to do his own family
as:
*tomed to from having to do his own family
These markings indicate to the post-processor that the word must be rejoined when the pages are combined to produce the final e-book.
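The two markings can be illustrated together in Python. Proofreaders normally see one page at a time, so the two-page function below is a hypothetical helper used only to show where the asterisks go. It assumes em-dashes have already been proofread as `--`.

```python
def mark_page_break(prev_page: str, next_page: str):
    """Add the * markers around a word or dash split across two pages."""
    prev = prev_page.rstrip()
    nxt = next_page.lstrip()
    if prev.endswith("-"):
        split_word = not prev.endswith("--")
        prev += "*"
        # A single hyphen means a word was split, so the next page
        # begins with the rest of that word.
        if split_word:
            nxt = "*" + nxt
    if nxt.startswith("--"):
        nxt = "*" + nxt  # page starts with an em-dash
    return prev, nxt


print(mark_page_break("something Pat had already become accus-",
                      "tomed to from having to do his own family"))
```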
Put a blank line to separate paragraphs. You should not indent the start of paragraphs, but if all paragraphs are already indented, don't bother removing those spaces—that can be done automatically during post-processing.
See the Sidenotes image/text for an example.
Proofread ordinary text that has been printed in two columns as a single column.
Spans of multiple-column text within single column sections should be proofread as a single column by placing the text from the left-most column first, the text from the next one after it, and so on. You do not need to mark where the columns were split, just re-join them.
See also the Index and Table sections of the Proofreading Guidelines.
Most blank pages, or pages with an illustration but no text, will already be marked with [Blank Page]. Leave this marking as is. If the page is blank, and [Blank Page] does not appear, there is no need to add it.
If there is text in the proofreading text area and a blank image, or if there is an image but no text, follow the directions for a Bad Image or Bad Text.
Remove page headers and page footers, but not footnotes, from the text.
The page headers are normally at the top of the image and have a page number opposite them. Page headers may be the same all through the book (often the title of the book and the author's name), they may be the same for each chapter (often the chapter number), or they may be different on each page (describing the action on that page). Remove them all, regardless, including the page number.
A chapter header will start further down the page and won't have a page number on the same line. See the next section for a specific example.
Proofread chapter headers as they appear in the text.
A chapter header may start a bit farther down the page than the page header and won't have a page number on the same line. Chapter Headers are often printed all caps; if so, keep them as all caps.
Watch out for a missing double quote at the start of the first paragraph, which some publishers did not include or which the OCR missed due to a large capital in the original. If the author started the paragraph with dialog, insert the double quote.
Proofread any caption text as it is printed, preserving the line breaks. If the caption falls in the middle of a paragraph, use blank lines to set it apart from the rest of the text. If there is no caption in the original text, then the mark-up of the illustration is left to the formatters.
Most pages with an illustration but no text will already be marked with [Blank Page]. Leave this marking as is.
Footnotes are placed out-of-line; that is, the text of the footnote is left at the bottom of the page and a tag placed where it is referenced in the text.
The number, letter, or other character that marks a footnote location should be surrounded with square brackets ([ and ]) and placed right next to the word being footnoted[1] or its punctuation mark,[2] as shown in the text and the two examples in this sentence.
When footnotes are marked with a series of special characters (*, †, ‡, §, etc.) we replace them all with [*] in the text, and * next to the footnote itself.
Proofread the footnote text as it is printed, preserving the line breaks. Leave the footnote text at the bottom of the page. Be sure to use the same tag in the footnote as you used in the text where the footnote was referenced.
Place each footnote on a separate line in order of appearance. Place a blank line between each footnote if there is more than one.
See the Page Headers/Page Footers image/text for a sample footnote.
If a footnote or endnote is referenced in the text but does not appear on that page, keep the footnote/endnote number or marker and don't be concerned. This is common in scientific and technical books, where footnotes are often grouped at the end of chapters. See "Endnotes" below.
In some books, footnotes are separated from the main text by a horizontal line. We don't keep this so please just leave a blank line between the main text and the footnotes. (See example above.)
Endnotes are just footnotes that have been located together at the end of a chapter or at the end of the book, instead of on the bottom of each page. These are proofread in the same manner as footnotes. Where you find an endnote reference in the text, retain the number or letter. If you are proofreading one of the ending pages with the endnotes text on it, put a blank line after each endnote so that it is clear where each begins and ends.
Footnotes in Poetry should be treated the same as other footnotes.
Footnotes in Tables should remain where they are in the original text.
Insert a blank line at the start of the poetry or epigram and another blank line at the end, so that the formatters can clearly see the beginning and end.
Leave each line left-justified and maintain the line breaks. Do not try to center or indent the poetry. The formatters will do that part. Do insert a blank line between stanzas.
Footnotes in poetry should be treated the same as regular footnotes during proofreading. See footnotes for details.
Line Numbers in poetry should be kept. Separate them from the main text with a few spaces. See instructions on Line Numbers.
Check the Project Comments for the specific text you are proofreading.
Some books will have short descriptions of the paragraph along the side of the text. These are called sidenotes. Proofread the sidenote text as it is printed, preserving the line breaks. Leave a blank line before and after the sidenote, so that it can be distinguished from the text around it. The OCR may place the sidenotes anywhere on the page, and may even intermingle the sidenote text with the rest of the text. Separate them so that the sidenote text is all together, but don't worry about the position of the sidenotes on the page. The formatters will move them to the correct locations.
A proofreader's job is to be sure that all the information in a table is correctly proofed. Details of formatting will be handled later in the process. Provide enough space between entries on a line to clearly indicate where each item ends and begins. Retain line breaks.
Footnotes in tables should remain where they are in the original. See footnotes for details.
Proofread all the text, just as it was printed on the page, whether all capitals, upper and lower case, etc., including the years of publication or copyright.
Older books often show the first letter as a large ornate graphic—proofread this as just the letter.
Proofread the Table of Contents just as it is printed in the book, whether all capitals, upper and lower case, etc. Page numbers should be retained.
Ignore any periods or asterisks (leaders) used to align the page numbers. These will be removed later in the process.
Please retain page numbers in index pages. You don't need to align the numbers as they appear in the scan; just make sure that the numbers and punctuation match the scan and retain the line breaks.
Specific formatting of indexes will occur later in the process. The proofreader's job is to be sure that all the text and numbers are correct.
For all plays:
Please check the Project Comments, as the Project Manager may specify different formatting.
While proofreading, if you encounter something that isn't covered in these guidelines that you think needs special handling or that you are not sure how to handle, post your question, noting the png (page) number, in the Project Discussion thread (a link to the project-specific forum is in the Project Comments), and put a note in the proofread text explaining the problem. Your note will explain to the next proofreader, formatter or post-processor what the problem or question is.
Start your note with a square bracket and two asterisks [** and end it with another square bracket ]. This clearly separates it from the Author's text and signals the Post-Processor to stop and carefully examine this part of the text and the matching image to address any issues. Agreement or disagreement can be added, but even if you know the answer, you absolutely must not remove the comment. If you have found a source which clarifies the problem, please cite it so the post-processor can also refer to it.
If you are proofreading in a later round and come across a note from a proofreader in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.
Any notes or comments put in by a previous volunteer must be left in place. You may add agreement or disagreement to the existing note but even if you know the answer, you absolutely must not remove the comment. If you have found a source which clarifies the problem, please cite it so the post-processor can also refer to it.
If you are formatting in a later round and come across a note from a volunteer in a previous round that you know the answer to, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future. Please, as already stated, do not remove the note.
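Because every note starts with `[**` and ends with `]`, notes are easy to locate mechanically. The Python sketch below shows one way a later volunteer might list them. The function name is hypothetical, and notes that themselves contain a closing square bracket are not handled.

```python
import re

# A note opens with "[**" and runs to the next closing square bracket.
NOTE_PATTERN = re.compile(r"\[\*\*[^\]]*\]")


def find_notes(page_text: str):
    """Return every proofreader note found in the page text."""
    return NOTE_PATTERN.findall(page_text)


print(find_notes("He came from Lond[**unclear] in May."))
```

The sketch only reports notes and never deletes them, in keeping with the rule that notes must be left in place.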
OCR commonly has trouble distinguishing between the digit '1' (one), the lowercase letter 'l' (ell), and the uppercase letter 'I'. This is especially true for books where the pages may be in poor condition.
Watch out for these. Read the context of the sentence to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.
Noticing these is much easier if you use a mono-spaced font such as DPCustomMono or Courier.
OCR commonly has trouble distinguishing between the digit '0' (zero) and the uppercase letter 'O'. This is especially true for books where the pages may be in poor condition.
Watch out for these. Normally the context of the sentence is sufficient to determine which is the correct character, but be careful—often your mind will automatically 'correct' these as you are reading.
Noticing these is much easier if you use a mono-spaced font such as DPCustomMono or Courier.
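The digit and letter confusions described above can also be hunted for mechanically. The following Python sketch is an assumption-laden illustration, not a DP tool: it flags any token that mixes letters and digits and contains one of the look-alike characters 1, l, I, 0 or O. The names LOOKALIKES and suspicious_tokens are hypothetical.

```python
import re

# Characters the guidelines name as easily confused by OCR.
LOOKALIKES = set("1lI0O")

# A token here is any unbroken run of ASCII letters and digits.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+")


def suspicious_tokens(text):
    """Return tokens that mix letters and digits and contain a look-alike."""
    flagged = []
    for token in TOKEN_PATTERN.findall(text):
        has_digit = any(ch.isdigit() for ch in token)
        has_letter = any(ch.isalpha() for ch in token)
        if has_digit and has_letter and LOOKALIKES & set(token):
            flagged.append(token)
    return flagged


print(suspicious_tokens("In l875 the t0wn had 1O0 houses and 3rd Street."))
# ['l875', 't0wn', '1O0']
```

A flagged token is only a prompt to compare the text against the page image. Legitimate tokens such as "1st" are flagged too, and a lone 'l' standing in for '1' is not, so context still decides.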
Another common OCR issue is misrecognition of characters. We call these errors "scannos" (like "typos"). This misrecognition can create more than one kind of error in the text.
Possibly the most common example of the second type, in which the OCR output is itself a valid word, is "and" being OCR'd as "arid". Other examples: "eve" for "eye", "Torn" for "Tom", "train" for "tram". This type is harder to spot and we have a special term for them: "Stealth Scannos." We collect examples of Stealth Scannos in this thread.
Spotting scannos is much easier if you use a mono-spaced font such as DPCustomMono or Courier.
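Because a Stealth Scanno is a valid word, a spell-checker will not catch it, but a list of known offenders can. The Python sketch below uses only the four examples given above; the names STEALTH_SCANNOS and flag_stealth_scannos are hypothetical, and the matching is deliberately case-sensitive so that "Torn" is flagged while the ordinary word "torn" is not.

```python
import re

# Stealth Scannos named in the guidelines: OCR output -> word it often replaces.
STEALTH_SCANNOS = {"arid": "and", "eve": "eye", "Torn": "Tom", "train": "tram"}

WORD_PATTERN = re.compile(r"[A-Za-z]+")


def flag_stealth_scannos(text):
    """Return (line_number, word, possible_intended_word) for each match."""
    results = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in WORD_PATTERN.findall(line):
            if word in STEALTH_SCANNOS:
                results.append((line_number, word, STEALTH_SCANNOS[word]))
    return results


sample = "Torn arid Jerry ran for the train.\nShe shut one eve."
print(flag_stealth_scannos(sample))
# [(1, 'Torn', 'Tom'), (1, 'arid', 'and'), (1, 'train', 'tram'), (2, 'eve', 'eye')]
```

Every hit is only a candidate: "train" and "arid" are often exactly what the author wrote, so each one has to be checked against the page image.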
Do not include handwritten notes in a book, unless the handwriting overwrites faded printed text to make it more visible. Do not include handwritten marginal notes made by readers, etc.
If an image is bad (not loading, chopped off, unable to be read), please put a post about this bad image in the Project Comments forum. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.
Note that some page images are quite large, and it is common for your browser to have difficulty displaying them, especially if you have several windows open or are using an older computer. Before reporting this as a bad page, try clicking on the "Image" line on the bottom of the page to bring up just the image in a new window. If that brings up a good image, then the problem is probably in your browser or system.
It's fairly common for the image to be good, but the OCR scan is missing the first line or two of the text. Please just type in the missing line(s). If nearly all of the lines are missing in the scan, then either type in the whole page (if you are willing to do that), or just click on the "Return Page to Round" button and the page will be reissued to someone else. If there are several pages like this, you might post a note in the Project Comments forum to notify the Project Manager.
If there is a wrong image for the text given, please put a post about this bad image in the Project Comments forum. Do not click on "Return Page to Round"; if you do, the page will be reissued to the next proofreader. Instead, click on the "Report Bad Page" button so this page is 'quarantined'.
If the previous proofreader made a lot of mistakes or missed a lot of things, please take a moment and provide Feedback to them by clicking on their name in the proofreading interface and posting a private message to them explaining how to handle the situation in the future.
Please be nice! Everyone here is a volunteer and presumably trying their best. The point of your feedback message should be to inform them of the correct way to proofread, rather than to criticize them. Give a specific example from their work showing what they did, and what they should have done.
If the previous proofreader did an outstanding job, you can also send them a message about that—especially if they were working on a particularly difficult page.
Correct all of the words that the OCR has misread (scannos), but do not correct what may appear to you to be misspellings or printer errors that occur on the scanned image. Many of the older texts have words spelled differently from modern usage and we retain these older spellings, including any accented characters.
If you are unsure, place a note in the txet [**typo for text?] and ask in the Project Discussion thread. If you do make a change, include a note describing what you changed: [**Transcriber's Note: typo fixed, changed from "txet" to "text"]. Include the two asterisks ** so the post-processor will notice it.
In general, don't correct factual errors in the author's book. Many of the books we are proofreading have statements of fact in them that we no longer accept as accurate. Leave them as the author wrote them.
A possible exception is in technical or scientific books, where a known formula or equation may be given incorrectly, especially if it is shown correctly on other pages of the book. Notify the Project Manager about these, either by sending them a message via the Forum, or by inserting [**note sic explain-your-concern] at that point in the text.
[...to be completed...]