Post-Processing Verification Guidelines

From DPTWiki
DP Official Documentation - Post-Processing and Post-Processing Verification

A) General Procedures

  1. Make all the checks you would make on a project you are post-processing.
  2. Keep track of all errors and inconsistencies remaining in the project as well as any corrections made.
  3. Send the PPer a feedback note describing the required changes, suggesting amendments, and providing helpful hints for future work.
  4. When the agreed-upon changes have been made and checked, upload the project to PG.
  5. Submit a PPV Summary. Generally, PPers should be copied on the PPV Summary.

B) Requests to Revise and Resubmit

Please strongly encourage PPers to make corrections and amendments themselves. (Of course, if PPers specifically request that you make the changes for them, you may abide by their wishes.)

  1. Use the option on the project page to return the project to the PPer's queue.
  2. Send them an email or private message letting them know that the project will show up in their queue again. The message should note the items to be corrected, and/or make suggestions that may improve the project (specifying that these are not errors if that is the case).
  3. When complete, the PPer should reupload the project for PPV and notify the PPVer that the revised files are now available.

C) Specific Checks to Make

Please check for the following types of errors, using tools you usually use for PPing:

Errors

Errors such as failure to grasp the italics guidelines are counted as one error, not one error each time italics are wrongly handled. Errors such as he/be errors are each counted as individual errors (i.e., 3 "he" instead of "be" count as 3 errors).

If the PPer is asked to resubmit a corrected file, then any errors not corrected or new errors introduced are added to the total number of errors for rating purposes.

Level 1 (minor errors) - All Versions

  • Spellcheck/scanno errors
  • Gutcheck-type errors, e.g., punctuation, hyphen/emdash, missing/extra space, line length, illegal characters, etc.
  • Jeebies errors (English only)
  • Paragraph breaks missing or incorrectly added
  • A few occurrences of hyphenated/non-hyphenated, spelling and punctuation variants and other inconsistencies not addressed (may be addressed by note in the TN)
  • Chapter and other headings inconsistently spaced, aligned, capitalized or punctuated
  • Formatting inconsistencies (e.g., in margins, blank lines, etc.)
  • Other minor errors (such as a minor rewrap error, misplaced entry in the TN, or minor inconsistency between the text and HTML versions)

Level 1 Errors - HTML Version Only

Images

  • Unused files in images folder (other than Thumbs.db)
  • Appropriate image size not used for thumbnail, inline and linked-to images. Image sizes should not normally exceed the limits described here, but exceptions may be made if warranted by the type of image or book (provided the PPer explains the exception).
  • Images with major blemishes, uncorrected rotation/distortion or without appropriate cropping.
  • Failure to enter image size appropriately via HTML attribute or CSS such that the image is distorted in HTML, epub or mobi.
  • Failure to use appropriate "alt" attributes for images that have no caption, or to include empty "alt" attributes where captions exist.

HTML Code

  • Use of px sizing units for items other than images and borders
  • <title> missing or incorrectly worded (should be <title>The Project Gutenberg eBook of Alice's Adventures in Wonderland, by Lewis Carroll</title> or <title>Alice's Adventures in Wonderland, by Lewis Carroll&mdash;A Project Gutenberg eBook</title>)
  • Use of <pre> tags instead of their CSS equivalents
  • Failure to place the <html>, <body>, <head>, </head>, </body>, and </html> tags each on its own line and to use them correctly. (This is required by the WWers.)
  • Use of tables for things that are not tables
  • Used CSS other than CSS 2.1 or below (except for the dropcap "transparent" element)
  • Used HTML version other than XHTML 1.0 Strict or 1.1
  • Failure to use <div class="chapter"> or <div class="section"> at chapter breaks to enable proper page breaks for e-readers (Please see here for more details). It is also acceptable to use <div class="chapter"></div> or <div class="section"></div>
  • Minor HTML errors in code that do not generate an HTML validation alert (such as misspelling a language code)
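
The <title> wording check above can be partly automated. The following is a minimal Python sketch, not an official DP tool: the function names and patterns are illustrative assumptions, and the separator before "A Project Gutenberg eBook" is assumed to be an em dash (written literally or as &mdash;). A human check is still needed for titles without an author or with unusual punctuation.

```python
import re

# Assumed patterns for the two title wordings quoted in the checklist.
# The dash before "A Project Gutenberg eBook" is taken to be an em dash,
# either as the literal character (U+2014) or as the &mdash; entity.
TITLE_PATTERNS = [
    re.compile(r"^The Project Gutenberg eBook of .+, by .+$"),
    re.compile(r"^.+, by .+(?:\u2014|&mdash;)A Project Gutenberg eBook$"),
]


def find_title(html):
    """Return the text inside the first <title> element, or None if absent."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def title_is_acceptable(html):
    """True if a <title> exists and matches one of the assumed wordings."""
    title = find_title(html)
    if title is None:
        return False
    title = " ".join(title.split())  # collapse line breaks and extra spaces
    return any(pattern.match(title) for pattern in TITLE_PATTERNS)
```

A missing or differently worded title returns False, which corresponds to one Level 1 error in the list above.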

Level 2 (major errors) - All Versions

  • Markup not handled (e.g. blockquotes, poetry indentation, or widespread failure to mark italics)
  • Poetry indentation does not match original
  • Footnotes/footnote markers missing or incorrectly placed
  • Printers' errors not addressed
  • Missing page(s) or substantial sections of missing text
  • Substantial rewrapping errors, e.g., poetry has been rewrapped, or the text version has not generally been rewrapped so that lines do not exceed 75 characters or fall below 55 characters (though the aim should be 72 characters), except where unavoidable, e.g., some tables
  • Widespread/general occurrences of hyphenated/non-hyphenated, spelling and punctuation variants and other inconsistencies not addressed (may be addressed by note in the TN)
  • Other major errors that could seriously impact the readability of the book or that represent major inconsistencies between the text and the HTML versions
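
The line-length part of the rewrapping check can be sketched in Python as follows. This assumes the intent is that lines of the text version stay between 55 and 75 characters; the function name and the "short line inside a paragraph" heuristic are illustrative assumptions, not part of any DP tool. Poetry, tables, headings and lines that could not take the next word will also be reported, so every hit needs human judgement.

```python
def line_length_problems(text, min_len=55, max_len=75):
    """Return (too_long, too_short) lists of 1-based line numbers.

    too_long:  lines longer than max_len characters.
    too_short: non-blank lines shorter than min_len that are followed by
               another non-blank line (i.e. the paragraph continues), which
               is a crude sign that the paragraph was not rewrapped.
    """
    lines = text.splitlines()
    too_long, too_short = [], []
    for index, line in enumerate(lines):
        length = len(line.rstrip())
        if length > max_len:
            too_long.append(index + 1)
        # The last line of a paragraph is allowed to be short.
        next_is_text = index + 1 < len(lines) and lines[index + 1].strip() != ""
        if 0 < length < min_len and next_is_text:
            too_short.append(index + 1)
    return too_long, too_short
```

A handful of hits is usually a minor rewrap error (Level 1); widespread hits across the file suggest the Level 2 error described above.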

Level 2 Errors - HTML version only

  • The W3C Markup Validation Service generates errors or warning messages (Please enter number of errors)
  • The W3C CSS Validation Service generates errors or warning messages other than for the dropcap "transparent" element (Please enter number of errors). Certain errors can generate other errors that will be automatically corrected when the original errors are fixed. Therefore, to count the number of real errors, simply run the Validator and count the errors that follow the message that includes "start tag was here". That will give you the real errors to enter into the PPV Form.
  • Non-working links within HTML or to images. (Either broken or link to wrong place/file)
  • File and folder names not in lowercase or contain spaces, images not in "images" folder, etc.
  • Cover image has not been included and/or has not been coded for e-reader use. (For example, the cover should be 600x800px or at least 500px wide and no more than 800px high and should be called cover.jpg. Also, if the cover is newly created, it must meet current DP guidelines.)
  • Project not presentable/useable when put through epubmaker (Please see the section below on Checking E-reader Versions)
  • Heading elements used for things that are not headings and failure to use hierarchical headings for book, chapter and section headings (single h1, appropriate h2s and h3s etc.)

Strongly Recommended (not counted toward rating but please mentor the PPers to comply with these recommendations)

  • Enclose entire multi-part headings within the related heading tag
  • Avoid using empty tags (with &nbsp; entities) or <br /> elements for vertical spacing, e.g., <p><br /><br /></p> (or with nbsps); <td> </td> is still acceptable, though
  • List Tags should be used for lists (e.g. a normal Index). For further information please read W3's List Use section
  • Include all text as text, not just as images
  • Keep your code line lengths reasonable
  • Tables display left, right, and center justification and top and bottom align appropriately
  • Tables contain <th> elements for headings
  • Remove thumbs.db file from the images folder
  • E-reader version, although without major flaws, should also look as good as possible

Mildly Recommended

  • Distinguish between purely decorative italics/bold/gesperrt and semantic uses of them
  • Include space before the slash in self-closing tags <br />
  • Ensure that there are no unused elements in the css (other than the base HTML headings)

D) Printers' Errors and Transcriber's Note

Obvious printers' errors should be addressed in one, or a combination, of the following ways:

  • Correct silently (in other words, change things without flagging each change within the text) and state in the Transcriber's Note that all such errors have been corrected silently. If there are changes made this way, there should be at least one Transcriber's Note to say that this has been done in the text (Level 1 Error).
  • Correct all such errors and note them in Transcriber's Note
  • Leave uncorrected and state in the Transcriber's Note that all such errors were left uncorrected.

"Not addressing printers' errors" means that all, or a large percentage, of printers' errors have been left uncorrected and not noted. If just one or two have been missed, and the rest addressed, then those missed would instead be counted as the relevant type of error (spellcheck, gutcheck, etc.).

Anything that could make a reader think an error has been made in the transcription should be mentioned in the Transcriber's Note.

E) Checking E-reader versions

With the clear expectation from PG and the recommendation of the DPF Board that our projects should look good in e-reader versions, the PPV community have agreed that PPVers should support PPers (and check their work) in producing HTML versions that convert successfully to epub and mobi (Kindle) formats.

The Easy Epub wiki pages are intended to help with that process of checking e-reader versions and making some simple changes to improve them if necessary.

The wiki pages include Viewing - How to use epubmaker to view your project in e-reader format, even if you don't have an e-reader.

Also, understanding the Best Practices helps us understand the coding practices that can ensure the project converts seamlessly to e-reader format.

Please use a suggested viewer to test the epub and mobi versions of the book.

It doesn't take long to look through the pages of the epub and mobi versions. Here are some problem areas to look for:

Front and End of Book

  • TOC
  • Title page layout

Body of Book

  • Horizontal rules
  • Obscured sections within the book such that text covers other text or blank areas occur where text should be
  • Poetry
  • Dropcaps
  • If hovers were used in the HTML (which isn't recommended), all important "hovered" information should be present and readable in a non-hovered way within the e-reader version. Also Transcriber's Notes referring to hovers should be hidden in the e-reader version.
  • Headings
  • Blockquotes
  • Page numbers (if present)
  • Sidenotes
  • Margins
  • Tables
  • Illustrations

F) Submitting a PPV Summary

Using the link on the project page, submit a PPV Summary for the project, using the following criteria for determining project difficulty and rating of PPer's work:

Determining whether a project is Easy, Average, or Difficult

The PPV Summary Form will calculate whether a project is Easy, Average, or Difficult based on the information you provide.

Easy Project

  • Straight fiction with little or no poetry
  • Straight poetry
  • Other straight text, e.g., essay
  • Can include some illustration markup
  • Can include a few simple footnotes, blockquotes, etc.
  • No index
  • 4 or fewer illustrations (not counting minor decorations or logos)

Average Project

Can include 3 or more of the following kinds of content:

  • poetry other than straight poetry covered under "Easy Project"
  • blockquotes
  • footnotes
  • sidenotes
  • index
  • advertisements
  • tables
  • 5 or more illustrations (not counting minor decorations or logos)

If a project has 0 or more "average" elements and 1 to 3 "difficult" elements, it qualifies as an "Average" project.

Difficult Project

Includes significant amounts of 4 or more of the following kinds of content:

  • poetry other than straight poetry covered under "Easy Project"
  • blockquotes
  • footnotes
  • sidenotes
  • index
  • advertisements
  • tables
  • drama
  • illustrations requiring advanced preparation and/or difficult placement
  • 20 or more illustrations
  • multiple languages
  • Extensive Spellcheck/Gutcheck
  • Engliſh
  • musical notation and files
  • extensive mathematical or chemical notation

How to define multiple languages:

  • If the book is English on one page and Latin on the facing page, it counts as multiple languages.
  • If the author is traveling and repeatedly reports conversations in the foreign language of the country, it counts as multiple languages.
  • If extensive (several long paragraphs or more) quotations in a language other than the base language are present, it counts as multiple languages.
  • If the Frenchman in the novel says "Zut!" a lot, it does NOT count as multiple languages.

Determining Allowable Errors for Various Ratings

The PPV Summary Form will calculate the rating based on the difficulty of the project, file size, and the number and type of errors you record. If you like, you can also print a copy of the form and jot down notes on it before entering the information into the online form.

File size referred to below is based on the plain text version. It is easy to check the size of the text file in kilobytes by looking at the files using your file manager. We use only the text file, not the HTML file, for this purpose.

Errors such as failure to grasp the italics guidelines are counted as one error, not one error each time italics are wrongly handled. Errors such as he/be errors are each counted as individual errors (e.g., 3 "he" for "be" count as three errors).

If the PPer is asked to resubmit a corrected file, then any errors not corrected or new errors introduced are added to the total number of errors for rating purposes.

Excellent

There should be no Level 2 errors

Allowable Level 1 errors, by project difficulty:

  • Easy: maximum of one Level 1 error per 300kb (or no more than 1 error if file size < 300kb)
  • Average: maximum of one Level 1 error per 200kb (or no more than 1 error if file size < 200kb)
  • Difficult: maximum of one Level 1 error per 100kb (or no more than 1 error if file size < 100kb)

Very Good

There should be no Level 2 errors

Allowable Level 1 errors, by project difficulty:

  • Easy: maximum of one Level 1 error per 150kb (or no more than 1 error if file size < 150kb)
  • Average: maximum of one Level 1 error per 100kb (or no more than 1 error if file size < 100kb)
  • Difficult: maximum of one Level 1 error per 50kb (or no more than 1 error if file size < 50kb)

Good

May contain no more than 5 Level 2 errors

Allowable Level 1 errors, by project difficulty:

  • Easy: maximum of 6 Level 1 errors per 150kb (or no more than 6 Level 1 errors if file size < 150kb)
  • Average: maximum of 6 Level 1 errors per 100kb (or no more than 6 Level 1 errors if file size < 100kb)
  • Difficult: maximum of 6 Level 1 errors per 50kb (or no more than 6 Level 1 errors if file size < 50kb)

Fair

A Fair rating is assigned if the project contains too many errors to be assigned a rating of Good. Essentially, if a project does not qualify as Excellent, Very Good, or Good, it is Fair.
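
To show how the thresholds above combine, here is a minimal Python sketch of the rating arithmetic. It is not the PPV Summary Form's actual code, and the form's result is the one that counts. Two points are assumptions: "one error per 300kb" is read as one allowance for each complete 300kb of plain text (rounding down), with a minimum of one allowance for smaller files, and all names in the code are illustrative.

```python
# Kilobytes of plain text per allowance step, by rating and difficulty.
KB_PER_STEP = {
    "Excellent": {"Easy": 300, "Average": 200, "Difficult": 100},
    "Very Good": {"Easy": 150, "Average": 100, "Difficult": 50},
    "Good": {"Easy": 150, "Average": 100, "Difficult": 50},
}

# (Level 1 errors allowed per step, maximum Level 2 errors) for each rating.
LIMITS = {"Excellent": (1, 0), "Very Good": (1, 0), "Good": (6, 5)}


def allowed_level1(rating, difficulty, size_kb):
    """Level 1 errors allowed for a rating (assumes rounding down)."""
    per_step, _ = LIMITS[rating]
    steps = max(1, int(size_kb // KB_PER_STEP[rating][difficulty]))
    return per_step * steps


def rate(difficulty, size_kb, level1, level2):
    """Return the best rating whose limits the error counts satisfy."""
    for rating in ("Excellent", "Very Good", "Good"):
        _, max_level2 = LIMITS[rating]
        if level2 <= max_level2 and level1 <= allowed_level1(rating, difficulty, size_kb):
            return rating
    return "Fair"
```

Under these assumptions, an Easy project with a 450kb text file and 3 Level 1 errors is allowed only 1 error for Excellent but 3 for Very Good, so rate("Easy", 450, 3, 0) gives "Very Good". Any Level 2 error rules out both Excellent and Very Good.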

Late PPV Summaries

Occasionally a project is marked as posted and the link to the PPV form disappears from the Project Page before the PPVer can complete the form. In this situation, a PPVer can also access the form by entering the Project ID into a blank PPV Report. There is also a link to this blank form on the PPV page.

Last edited: 4/04/2016

To comment or request edits to this page, please contact lhamilton or wfarrell.
