

Digitization: Florida Digital Newspaper Library Specifications

The Florida Digital Newspaper Library accepts contributions of Florida and Florida-related newspapers and ensures their searchability and preservation. Contributions to the Library are attributed to the contributing institution, with links to the institution's web pages. All contributions to the Library are digitally archived with the state-of-the-art Florida Digital Archive to ensure long-term preservation.

The following information is provided to help libraries, newspaper publishers, and others contribute. We welcome contributions from others, so please contact us directly to help make yesterday's news readable again and to preserve all of Florida's news. In addition to the information below, please see the Florida LSTA Grant Program website, this sample LSTA grant from the Hendry County Library Cooperative to digitize the 1928-1945 issues of the Clewiston News, and information on grants that helped establish the Florida Digital Newspaper Library.

Detailed information for contributing materials to the Florida Digital Newspaper Library:

  1. Contact Us
  2. Copyright
  3. Transfer of Digital Objects to the Florida Digital Newspaper Library
  4. Attribution Requirements
  5. General Bibliographic Metadata Requirements
  6. Imaging, General Requirements
  7. Microfilm Digitization
  8. Derivatives, Part 1 – Web display images
  9. Derivatives, Part 2 – Searchable text
  10. OCR
  11. Mark-up and METS
  12. Information for State University Libraries

Contact Us

Will Canova, Coordinator for the Florida Digital Newspaper Library
E-Mail: ufdc@uflib.ufl.edu
Telephone: (352) 273-2900
FAX: (352) 846-3702
Mailing Address:
Florida Digital Newspaper Library
c/o Digital Library Center
University of Florida Libraries
P.O. Box 117007
Gainesville, FL 32611
U.S.A.
Shipping Address:
Florida Digital Newspaper Library
c/o Digital Library Center - Stop 117003
University of Florida Libraries
200 Smathers
Gainesville, FL 32611
U.S.A.

Copyright

Transfer of Digital Objects to the Florida Digital Newspaper Library

Attribution Requirements

The Florida Digital Newspaper Library (FDNL) would like to attribute your contribution to you.  We do that by providing a link to your institution on the side-bar of individual issue/page display and in citation information.  Please provide the following:

General Bibliographic Metadata Requirements

Imaging, General Requirements

Microfilm Digitization

N.B.  and Other Notes

Derivatives, Part 1 – Web display images

All derivatives should share or be built upon the file name of the master TIFF as specified below.  Derivative sets will be needed for each TIFF.

Note that FDNL does not use PDFs.  A PDF of any given page is large, and a PDF of bundled pages is larger still.  We hope to build a PDF-on-the-fly service later, when bandwidth to the average user increases or PDF otherwise becomes more efficient.

Derivatives, Part 2 – Searchable text

All derivatives should share or be built upon the file name of the master TIFF as specified below.

OCR

OCR, or Optical Character Recognition, is computer software that converts digital images of text, such as the picture of a page, to searchable text, such as that created with word-processing software.  The software sees the image of an “e”, for example, and recognizes it as the letter “e”.  OCR software is frequently combined with spell-check software to ensure greater accuracy, checking formed words against a dictionary and suggesting changes.

Prime Recognition, the OCR software used by the Florida Digital Newspaper Library (FDNL), simultaneously runs six different OCR software “engines”.  The results of recognition are voted on, such that if four of the engines recognize the letter “e”, one recognizes the letter “c”, and the other recognizes the letter “o”, the image is converted to an “e”.  The more OCR engines running, the more accurate the conversion.  This is especially important when processing old newspapers, particularly those converted from old microfilm.  Discoloration of newsprint due to aging reduces the contrast between print and background paper and can reduce OCR’s accuracy.  Poor and uneven inking or worn type used in printing newspapers of a certain age also contributes to inaccuracies.  Older microfilms, likely produced without compliance to standards, may exacerbate these effects.  Poor or uneven lighting of the page when microfilmed fades some text and darkens other text – both can reduce accuracy, visually changing the letter “e”, for example, to a “c” through fading or an “o” through darkening.  Additionally, older microfilms tend to suffer aging effects, such as splotching and scratching, that further limit accuracy.
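The voting step described above can be illustrated with a minimal Python sketch.  It is not Prime Recognition's actual implementation, which is not documented here; the function names vote_on_character and vote_on_text are hypothetical, and the sketch assumes each engine's output is already aligned character by character, which real OCR engines do not guarantee.

```python
from collections import Counter


def vote_on_character(candidates):
    """Return the character that the most engines agreed on.

    `candidates` holds one recognized character per engine
    (hypothetical input format, assumed for illustration).
    Ties go to the candidate that appears first in the list.
    """
    if not candidates:
        raise ValueError("need at least one engine result")
    winner, _count = Counter(candidates).most_common(1)[0]
    return winner


def vote_on_text(engine_outputs):
    """Vote position by position across several engines' outputs.

    Assumes every engine returned a string of the same length,
    so that position i refers to the same glyph in each output.
    """
    if len({len(output) for output in engine_outputs}) != 1:
        raise ValueError("engine outputs must be aligned and equal in length")
    # zip(*...) groups the i-th character from every engine together.
    return "".join(vote_on_character(chars) for chars in zip(*engine_outputs))


# Six engines read the same glyph: four say "e", one "c", one "o".
print(vote_on_character(["e", "e", "e", "e", "c", "o"]))

# The same vote applied across a whole word.
print(vote_on_text(["news", "ncws", "news", "news", "nows", "news"]))
```

With the six readings from the example in the text, the vote settles on “e”.  A faded or darkened glyph misleads only the engines that misread it, so the majority can still recover the correct letter.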

FDNL currently runs two instances of Prime Recognition. In addition to converting images to searchable text, Prime Recognition also provides the location, or bitmap reference, of each letter in the image to enable text highlighting in a future release of the FDNL site. See the Wikipedia entry on Optical Character Recognition for more information.

Mark-up and METS

This will be detailed online. In the meantime, contact Will Canova at wcanova@uflib.ufl.edu or 352.273.2900 for more information.

Information for State University Libraries

State University Libraries should contact the Florida Digital Newspaper Library to see which file transfer method best matches their needs. UF's Digital Library Center offers several tools for easily transferring files. In order for the Florida Digital Newspaper Library to process files from SULs, the SUL needs to provide:


Acceptable Use, Copyright, and Disclaimer Statement  
© All rights reserved