Digitising your hard copy archive



 
 
Machtel has formed a partnership with Oakdene International, who have developed "Osprey", in our view the most flexible digitisation product on the market.

Why Digitise?

  1. Easier access to information, for both internal and external use. External use can generate revenue through paid subscription access over the internet or distribution on static media (CDs, DVDs etc.).
    Internal use gives your organisation easy electronic access to information that is otherwise daunting to search and retrieve, given the sheer volume of paper in most printed archives.

    Either way, it means quick, effective search and retrieval of valuable information, anywhere, anytime and by anybody.

  2. Extension of conventional markets for printed material. The market for most printed publications is localised, limited and fast saturating. Web-publishing valuable archive material can open up a worldwide and lasting readership, limited only by access to the internet.

  3. An impending necessity? Digitisation is the logical way ahead for any organisation with a significant amount of intellectual property locked in paper. Oakdene strongly believes that now is the time to digitise. In a blue-ocean scenario, early adopters benefit from competitive costs and ample time. Once the wider market catches up, demand may far outweigh supply, service costs can sky-rocket, and time itself becomes a luxury.

Oakdene and your Organisation


Oakdene is a pure end-to-end services provider for digitisation, consulting and the related enabling technology, all under one roof. We make no claim on your intellectual property rights (IPR).
We combine operational capability, consulting and a technology backbone into an end-to-end digitisation solution. The material you start with and the end product both belong to you; we handle only the conversion in between, with watertight security for your valuable material.


If your archives are already scanned, we ingest them directly into our process, provided the scans meet our quality requirements. If not, Oakdene also provides scanning services, either in the UK through our partners or in Bangalore at our in-house facility. Digital photography of paper is acceptable provided the capture is done to a professional standard.


Bare-bones Digitisation versus Digitisation with Classification


Digitisation is the first step, in which the page image is converted into its constituent text and sub-images (adverts, photographs etc.) in digital format, predominantly by an OCR engine, with human intervention to correct the resulting errors.

An agreed accuracy can be maintained, and the final ‘read’ digital text can be provided in many formats, PDF image-over-text being a popular and logical choice. In this format the page image is displayed with the OCRed text held as an invisible layer beneath it.

This method allows free-text search across the entire page, but without any context. For instance, searching for “Tony Blair” would bring up every page containing that phrase, wherever it occurs. It lets the user very quickly narrow a usually large repository down to a hopefully limited number of pages, which the user can then read to find the required context.
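As a rough illustration of what context-free searching means, here is a minimal sketch in Python. It is not how Osprey works internally; the page identifiers, the sample text and the `free_text_search` function are all invented for this example.

```python
# Hypothetical sample pages: page identifier -> OCRed text (illustrative only).
pages = {
    "1997-05-02/p1": "Tony Blair became Prime Minister after the general election.",
    "1997-05-02/p7": "Advertisement: garden furniture sale this weekend.",
    "2003-03-19/p4": "Letters to the editor mention tony blair and the budget.",
}


def free_text_search(pages, phrase):
    """Return the ids of all pages whose text contains the phrase, ignoring case."""
    needle = phrase.casefold()
    return sorted(pid for pid, text in pages.items() if needle in text.casefold())


# Every page containing the phrase is returned, with no indication of
# whether it appears in a headline, a caption or a letter.
matches = free_text_search(pages, "Tony Blair")
print(matches)
```

The search narrows the archive to the pages that mention the phrase, but the reader still has to open each page to find out in what role it appears.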


This is a first and very significant step towards electronic archiving of printed material: millions of pages can be stored on a few hard disk drives or DVDs and made free-text searchable. More importantly, this is the scale of job that early adopters of digitisation might consider, owing to its relatively quick turnaround time (TOT) and low cost.


Digitisation with Classification sits higher up the value ladder and is a more complete proposition. It builds on the previous step and goes further, associating tags or meaning with the various elements of a page. Elements such as articles, chapters, sections, titles, sub-titles, authors and captions are first identified within the OCRed and corrected text, then tagged as such so that they can be searched with context.

For instance, it is now possible to search and retrieve all articles, chapters etc. where “Tony Blair” appears as an author, in an image caption or merely as a subject in any section, in volumes within a specified period. This enables powerful context-specific searching and information retrieval, far more precise than digitisation alone offers.
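The difference that tagging makes can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Element` record, its field names and the sample data are hypothetical and do not describe Osprey's actual output format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Element:
    """One tagged element of a page (hypothetical structure for illustration)."""
    volume_date: date
    kind: str    # e.g. "article", "chapter"
    field: str   # e.g. "author", "caption", "body"
    text: str


# Invented sample data.
elements = [
    Element(date(1997, 5, 2), "article", "author", "Tony Blair"),
    Element(date(1997, 5, 2), "article", "body", "An interview about the election."),
    Element(date(2003, 3, 19), "article", "caption", "Tony Blair at the conference"),
    Element(date(2003, 3, 19), "article", "body", "Tony Blair spoke on the budget."),
]


def search(elements, phrase, field=None, start=None, end=None):
    """Find elements containing the phrase, optionally limited by tag and date range."""
    needle = phrase.casefold()
    hits = []
    for e in elements:
        if field is not None and e.field != field:
            continue
        if start is not None and e.volume_date < start:
            continue
        if end is not None and e.volume_date > end:
            continue
        if needle in e.text.casefold():
            hits.append(e)
    return hits


# Only elements where the phrase is tagged as the author.
as_author = search(elements, "Tony Blair", field="author")

# Only captions from volumes dated within 2003.
captions_2003 = search(
    elements, "Tony Blair", field="caption",
    start=date(2003, 1, 1), end=date(2003, 12, 31),
)
print(len(as_author), len(captions_2003))
```

Leaving `field`, `start` and `end` unset reduces this to the plain free-text search of bare-bones digitisation; the tags are what make the narrower queries possible.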

Digitisation with classification normally requires an intellectual element in the manual part of Osprey, which Oakdene provides through journalistically trained staff. The TOT is longer and the process is relatively more expensive than digitisation alone, but the result is a complete solution that can be leveraged for revenue generation now and in the future through the advanced search capabilities it supports.


The Time-Cost-Quality Connection


Customisable digitisation is not just a nice-to-have; there is no other way!



Every component of digitisation has a Time (and consequently Cost) factor, and this is the main consideration when measuring the efficacy of the process against Quality.

It is usually possible to achieve above 80%, and sometimes up to 95%, quality purely on the strength of the OCR engine and a favourable combination of print, paper and complexity, as described below. From an industrial engineering standpoint, every incremental increase in quality beyond this inflection point results in a disproportionate increase in time and hence cost.


Factors that affect the cost of Digitisation


    1. Paper quality of the source material

    2. Print Quality and typeface

    3. Number of Characters/Images in a page

    4. Text Layout Complexity

    5. Quality of Scanning

    6. Complexity and granularity of classification (if required)

    7. Volume Binding

    8. Size of the Order

It is readily apparent that the combination of all the above parameters determines the effort, time and type of human intervention needed for a digitisation job, which in turn determines the cost.


We therefore don’t believe in quoting off-hand for a project, given the number of variables involved for each prospective client. We would rather carry out a proof of concept on sample material spread across the period that requires digitisation, establish rough values for all these parameters and then arrive at a cost that we believe reflects the nature of the job.


Oakdene takes small orders as well, but every order, irrespective of size, incurs a fixed consultancy and setup cost, which can make the price per unit relatively higher for a small order. A large order helps amortise the fixed costs over a high volume of work. If the customer appreciates this principle, we are more than happy to take on small orders too.
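The amortisation principle can be shown with a short calculation. The figures below are invented purely for illustration and are not Oakdene prices; `cost_per_page` is a hypothetical helper, and the model (one fixed cost plus a constant variable cost per page) is a simplifying assumption.

```python
def cost_per_page(fixed_cost, variable_cost_per_page, pages):
    """Effective price per page once the fixed cost is spread over the order."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return fixed_cost / pages + variable_cost_per_page


# Invented figures: a fixed setup cost of 1000 and a variable cost of 0.10 per page.
FIXED = 1000.0
PER_PAGE = 0.10

small_order = cost_per_page(FIXED, PER_PAGE, pages=1_000)
large_order = cost_per_page(FIXED, PER_PAGE, pages=100_000)

print(f"Small order: {small_order:.2f} per page")
print(f"Large order: {large_order:.2f} per page")
```

With these made-up numbers the fixed cost dominates the small order, while on the large order the effective price approaches the variable cost per page.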


Oakdene’s Digitisation Services Portfolio


  • Consultancy

  • Scanning

  • Digitisation

  • Digitisation with Classification

  • Interfacing with your database and/or web-publishing system

  • Customised design and development of Information Storage, Access and Retrieval Systems for your Digitised Data.





    For more information please use our contact page