Researchers who cannot dispense with the original manuscript are few. They are usually top experts who know how to handle the original and how to avoid damaging it. Once their purpose is achieved, fewer topics remain for new research, and the need to use the documents in this way decreases.
A process that aims to capture by digitization more
than the eye can see is very expensive. The cost of technology aimed
at higher image resolution rises steeply. Nevertheless, the copy cannot
replace the original. In view of this, it seems to us efficient to invest
the available money exclusively for the benefit of the first group.
We started our activity in 1992. In response to UNESCO and in co-operation with the National Library, we created, within the framework of the Memory of the World project, the first CD-ROM disc with the rarest manuscripts and old prints deposited there. That disc was followed by others containing entire manuscripts, complete with descriptive data. At the same time we were able to test several digital cameras.
The first discs we created were in fact experiments through which we tried to find a way to solve the problem of digitization. These experiments allowed us to refine our opinions and discuss them, and they also sped up the penetration of digital technology into the conservative sphere that is so characteristic of work with old documents.
Thanks to our results we were able to become familiar with similar activities all over the world. We respect the work done in this field, but at the same time we feel a certain unease, because every product is different and uses different software.
The essential and generally valid significance of the digitization process lies in the creation of digital data. All products in which technology and software play an important role age very quickly.
Therefore we aimed at finding a way to make the data accessible
independently of any program, platform, or producer of hardware or software.
We supplied the National Library with one of the first KODAK DCS 460 cameras with a filter wheel. We aimed at obtaining full-value information. We had to solve the problem of vibration, because our workplace was close to a busy street with tram lines, as well as the problem of filters. On the other hand, this camera enabled us to apply exact methods for the optimal adjustment and calibration of the image. The system ensures faithful reproduction independently of changes in illumination. Together with every digitized manuscript we store the corresponding digitized form of the calibration table.
The KODAK DCS 460 camera produces an image of 2000 x 3000 dots at 36 bits per pixel (colour version). We scan at this resolution, and after the first processing of the image we work with 24 bits (18 MB RGB). The maximum resolution we achieve is thus usually between 200 and 350 dpi. The resolution is adapted to the size and type of the manuscript and does not change within a manuscript (with the exception of details digitized as separate images). Manuscripts for which the achievable resolution is not sufficient are not digitized for the time being. In well-founded cases we photograph large originals and digitize them from slides. We then convert the images to the lower quality levels used for well-defined purposes.
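The arithmetic behind these figures can be checked with a short sketch. It assumes an uncompressed RGB raster and that the 3000-dot side of the frame covers the full height of the page; the page heights in the loop are hypothetical values chosen only for illustration, not measurements from the project.

```python
def uncompressed_size_bytes(width_px, height_px, bits_per_pixel):
    """Size of an uncompressed raster image in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def resolution_dpi(pixels, length_mm):
    """Dots per inch when `pixels` dots cover `length_mm` of the original."""
    return pixels / (length_mm / 25.4)


# 2000 x 3000 dots at 24 bits per pixel, as stated in the text.
size = uncompressed_size_bytes(2000, 3000, 24)
print(size / 1_000_000)  # 18.0 (decimal megabytes)

# Hypothetical page heights in millimetres (illustrative assumption).
for height_mm in (220, 300, 380):
    print(height_mm, "mm ->", round(resolution_dpi(3000, height_mm)), "dpi")
```

Under these assumptions, pages roughly 22 to 38 cm high fall into the 200 - 350 dpi range quoted above; larger originals would drop below it, which is consistent with the remark that some manuscripts are not yet digitized.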
For scanning and conversions we use Adobe Photoshop, for which we have written extensive macros. The conversions can thus run automatically, usually at night. The description of the digitized documents is prepared at another workplace.
The final data are stored on CD-R
discs and at the same time in digital archives. They can be viewed with any WWW browser or
made accessible over the Internet. However, the same
data can be processed and viewed more comfortably with our program ManuFreT.
The digitized image is intended for people in the role of readers, who receive a message from some writer across centuries and long distances. The properties of the human eye have not changed so far, and it is reasonable to assume that they will remain the same after a thousand years (let us hope). We start from a modest but well-founded assumption that the essential aim of our activity is access to the image of documents, i.e. the mediation of the most authentic visual perception possible.
On the basis of this assumption we define the group of researchers to whom we should give access to rare documents by means of digitization.
The technology of digitization is developing very quickly, while the prices of the equipment are decreasing. If the development of our civilization does not stop, this trend will not stop either. Nevertheless, this does not mean that it is better to wait until the quality is even higher and the prices lower. There is an analogy with technical development in the field of sound. The development of recording was for a long time marked by an increasing frequency range and decreasing distortion. The chase after recording quality stopped when the limits of the human ear were exceeded.
The human eye also has its limits. It matters whether we look at the manuscript as its writer once did, or whether we use a magnifying glass or a microscope. Using digitization in place of a microscope is uneconomical and hard to justify.
Nevertheless, it is possible and useful to provide access to any manuscript even today.
All this reasoning formed the basis of the routine digitization of manuscripts in the National Library of the Czech Republic.
From the physical point of view, the image of any manuscript is endless, concentrated, and simplified information for our eye (not eyes, because stereoscopic information is absent).
It is necessary to take the following fact into account:
It has to do with the special image of a document.
It is not necessary to analyse the effect of the exposure angle in relation
to the smoothness of the surface, the effect of the width of the objective and of
reflections on its surface (in the best objectives this effect is measurable), its ability to
image colours, the uniformity of exposure, the spectral sensitivity of the recording medium
(either CCD or film), or the effect of the loading, storage, and development of film.
The conclusion is clear: the digitized image is intended for our eye only, not for the
researcher who is interested in more than his eyes can give.
When a researcher approaches an object, for example a manuscript, a flow of information arises between him and the object, mediated by the senses, especially by sight. The researcher can change this flow of information by handling the object, i.e. when he touches the pages, looks at them while turning the folios, looks through them, or changes the lighting. He can also devise and use various means that create new flows of information. Nevertheless, in the course of his work he has to be in contact with the manuscript, which he - in point of fact - wears out.
If contact with the original is to be eliminated, such a flow of information can be conserved in order to be used later without the original. This conservation has to be ensured during photography, digitization, the making of a facsimile or of a special image (for example in an unusual spectral zone or in transmitted light), etc.
The image on the monitor, like the copy or the facsimile, creates
for our eyes a flow of information similar to the flow from the original under
the conditions in which it was recorded. Nevertheless, it is modified
by the properties of the recording and reproducing equipment.
The aim is to register some specific physical property by applying
specific methods, for example photography in unusual spectral zones or in transmitted light. It
goes without saying that it is useful to describe the scanning conditions as
objectively as possible. Nevertheless, these images serve the researcher's eyes only and represent
information that should be trusted, not used for further physical research of
the original. Even such an image is the result of research that has already been
carried out.
Read in the professional literature how our eye sees and perceives colours. You will discover that it is a very complicated matter, on which the various points of view are still developing.
This research influences the development of reproduction technologies
based on printing, where the goal is maximum adaptation to how our eye
perceives. The whole of reproduction technology is adapted to the properties of our eyes. The aim
of this technical system is to store and re-evoke the subjective perception experienced
when looking at the image. Often the goal is to provide us with an image that we would
subjectively consider better, though it differs from the original.
The digitized image provides relatively accurate information about the energy of light at certain wavelengths that struck the sensor. This energy depends on many conditions, especially the illumination of the original, the exposure time, the properties of the filters, the spectral distribution of the light, etc. To describe and store this complete information we use the following principle: together with the documents we photograph a calibration table. It differs from similar calibration tables in that an exact spectral analysis of reflectivity under the given conditions is carried out for every calibration patch. These properties are recorded as text and, together with the digitized calibration table, are added to the data of the digitized documents. This means that at any time in the future it will be possible to create a table with the same properties, or to make corrections according to another table measured by the same method. If the reproduction equipment (printer, monitor) is adjusted to reproduce this calibration table adequately, the image of the manuscript will be reproduced adequately as well.
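The idea of correcting an image according to a stored calibration table can be illustrated with a deliberately simplified sketch. It is not the procedure used in the project, which relies on spectral reflectivity data: here we merely assume that, for one colour channel, the values measured on the digitized calibration patches and their reference values are related linearly, and we fit that relation by least squares. All patch values below are hypothetical.

```python
def fit_gain_offset(measured, reference):
    """Least-squares fit of reference ~ gain * measured + offset."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_r = sum(reference) / n
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_m == 0:
        raise ValueError("measured patch values must not all be equal")
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(measured, reference))
    gain = cov / var_m
    offset = mean_r - gain * mean_m
    return gain, offset


def correct(value, gain, offset, max_value=255):
    """Apply the fitted correction and clamp the result to 0..max_value."""
    return min(max_value, max(0, round(gain * value + offset)))


# Hypothetical 8-bit values of five grey patches in one channel.
measured = [20, 70, 130, 190, 240]   # as found in the digitized table
reference = [28, 73, 127, 181, 226]  # as the patches should be rendered

gain, offset = fit_gain_offset(measured, reference)
corrected = [correct(v, gain, offset) for v in measured]
print(corrected)
```

Because the reference data travel with the images, the same fit could be repeated later for a different printer or monitor without access to the original table.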
We are aware of what "adequately" means. Depending on the character of the illumination (angle, volume), various images of the same document can be obtained. Every document needs its own illumination, and all opinions on which is optimal are subjective.
Even the calibration we have carried out cannot remove this subjectivity. Nevertheless, it makes it possible to eliminate the distortion of the image caused by the characteristics of the technology used.
Naturally, we had to develop a system of access to the digitized documents. On the basis of our initial experience and a study of other solutions, we developed a system designed to be as independent as possible of any other system or hardware. All data added to the image are recorded in extended HTML. The reason was the requirement of long-term usability of the data, as mentioned above in the section on the purposes of digitization. In practice, it is questionable to ask historians to generate HTML documents. We have solved this problem by dividing the preparation of the description of manuscripts into three parts.
First, on the basis of the known properties of the manuscript, we generate a text file without concrete information but with the structure needed for this manuscript. At the same time, we assign to every future image the name by which the description of the page will be connected with its image. For this purpose we use the program GENTEMP.
Then the experts add to this text the concrete information about the manuscript and, if need be, further information about individual pages.
Finally, the text is processed by the program GENHTLM. This process
results in a group of interconnected HTML files.
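The three steps can be sketched in a few lines. This is only an illustration of the workflow, not a reconstruction of the project's programs: the functions `make_template` and `generate_html`, the record fields, and the file-naming scheme (including the `.jpg` extension) are all assumptions introduced here.

```python
from html import escape


def make_template(manuscript_id, page_count):
    """Step 1: empty page records, each tied to the name of its future image."""
    return [
        {
            "page": n,
            "image": f"{manuscript_id}_{n:04d}.jpg",  # hypothetical naming scheme
            "description": "",  # step 2: filled in later by the experts
        }
        for n in range(1, page_count + 1)
    ]


def page_file(manuscript_id, page):
    """File name of the HTML file describing one manuscript page."""
    return f"{manuscript_id}_{page:04d}.html"


def generate_html(manuscript_id, title, records):
    """Step 3: return {file name: HTML text} for an index and linked pages."""
    files = {}
    index_items = []
    for i, rec in enumerate(records):
        name = page_file(manuscript_id, rec["page"])
        nav = []
        if i > 0:
            prev_name = page_file(manuscript_id, records[i - 1]["page"])
            nav.append(f'<a href="{prev_name}">previous</a>')
        nav.append('<a href="index.html">index</a>')
        if i < len(records) - 1:
            next_name = page_file(manuscript_id, records[i + 1]["page"])
            nav.append(f'<a href="{next_name}">next</a>')
        page, image = rec["page"], rec["image"]
        text = escape(rec["description"])  # keep expert text from breaking markup
        nav_html = " | ".join(nav)
        files[name] = (
            f"<html><head><title>{escape(title)}, page {page}</title></head>\n"
            f"<body><p>{nav_html}</p>\n"
            f'<img src="{image}" alt="page {page}">\n'
            f"<p>{text}</p></body></html>\n"
        )
        index_items.append(f'<li><a href="{name}">page {page}</a></li>')
    items = "\n".join(index_items)
    files["index.html"] = (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul></body></html>\n"
    )
    return files


records = make_template("ms01", 3)
records[1]["description"] = "Decorated initial <A>"  # expert input (step 2)
files = generate_html("ms01", "Test codex", records)
print(sorted(files))
```

Keeping the expert's input as plain structured text and generating the markup mechanically means that historians never have to write HTML themselves, which is the point of the three-part division.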
The digitized images are stored on CD-R discs together with the HTML files. Other information is stored on the disc, too: a medium-dependent identification file and an SGML map of the digital document. The final disc can be viewed like any WWW page with any Internet browser. The data are independent of software, as intended.
Evidently, the CD-R discs with manuscripts can be used immediately to provide access, in the appropriate extent and quality, on the Internet. Internet browsers can thus be the means of access to our discs. This means of access has many advantages: it is developing continuously and it is widespread all over the world.
Nevertheless, access to individual manuscripts requires more;
therefore, we have developed a system that registers digitized manuscripts and
makes it possible to share this information on the Internet. We are now putting this automated system
into operation.
Although browsers are an easily accessible means of viewing our discs,
they still lack many functions that researchers would appreciate. Therefore we
have developed another program named ManuFreT. This program is able to read the HTML
description of a manuscript as well as to index it and display it. The manuscript
is presented in the form of a virtual book, in which it is possible to browse as well as
to add marks and notes. In addition, it is possible to manipulate the image. The program makes it possible to view
several pages together, to change the luminance and contrast, to use an
orientation view while manipulating a magnified image, and to measure
sizes and distances just as on the original.
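Measuring on the screen "just as on the original" is possible because the resolution of a digitized manuscript is known and constant. A minimal sketch of the conversion follows; it does not describe how ManuFreT is implemented, and the function name and sample coordinates are hypothetical.

```python
import math


def distance_mm(p1, p2, dpi):
    """Distance on the original, in millimetres, between two pixel positions.

    p1 and p2 are (x, y) pixel coordinates in the full-resolution image;
    dpi is the scanning resolution, assumed equal in both directions.
    """
    pixels = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    return pixels / dpi * 25.4


# Hypothetical example: two points 300 dots apart horizontally at 300 dpi.
print(distance_mm((100, 50), (400, 50), 300))
```

If the viewer shows a reduced or magnified image, screen coordinates would first have to be scaled back to full-resolution pixel coordinates before this conversion is applied.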
The digital data in the Memory of the World programme will grow
continuously, and they must be usable even in a hundred or more years. For the first time
there is a record of information which does not degrade in the course of time and which
can cease to exist only by our decision or together with the end of our
civilization. The limited service life of recording media is often mentioned.
This relatively short service life (100 - 1000 years) is comparable with that of
manuscripts, which can be - under favourable conditions - even longer. However, there is a
misunderstanding. Presumably, after 100 years the CD-ROM disc will be an
anachronism. A CD-ROM is no more than a medium. The same data can be stored on any other
one. They can be rewritten and reconstructed, and not a single bit of information will
be lost. Thus they have an unlimited service life and we have to generate them with all
responsibility. They must be usable on the basis of the needs of our unchangeable senses
and not of the technical possibilities, which - on the contrary - are changing very
quickly.
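The claim that data can be moved from one medium to another without losing a single bit can be verified mechanically. The sketch below shows one common way to do so, by comparing cryptographic digests of the original and the copy; the choice of SHA-256 and the function names are assumptions made for this illustration and are not part of the system described here.

```python
import hashlib
import io


def stream_digest(stream, chunk_size=1 << 20):
    """SHA-256 digest of a binary stream, read in chunks to limit memory use."""
    h = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
    return h.hexdigest()


def is_faithful_copy(original_stream, copy_stream):
    """True if both streams contain exactly the same bytes."""
    return stream_digest(original_stream) == stream_digest(copy_stream)


# In-memory stand-ins for a file on the old medium and its copy on a new one.
data = bytes(range(256)) * 1000
print(is_faithful_copy(io.BytesIO(data), io.BytesIO(data)))           # True
print(is_faithful_copy(io.BytesIO(data), io.BytesIO(data[:-1] + b"x")))  # False
```

In practice the streams would be files opened in binary mode, and the digest of each file could be stored alongside the data so that later migrations can be checked against it.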