MULTILINGUAL ENVIRONMENT IN THE CYBERSPACE
V.I. Gritsenko, A.V. Anisimov
International Scientific and Training Center of Information Technologies and Systems, Kiev, Ukraine
Some peculiarities of the multilingualism paradigm in the Cyberspace are discussed: humanitarian and technological multilingualism, internal multilingualism inside one country, and the state of the art of the problem. Achievements of the ISTCITS in the field of language engineering are briefly introduced.
The information revolution is a global phenomenon that has an impact on all nations. Nobody can stay out of the Information Society. That is why ethical, cultural and social problems are of increasing importance today. Only by resolving these problems can the Information Society achieve the declared goal of globalization: to promote quality of life and sustain cohesive development. With the advent of the Information Society, UNESCO faces new problems that require it to define its role in the new world communication processes. That is why the UNESCO strategy should embrace a wide range of activities:
These problems demand thorough consideration and constant attention. Textual information remains the main form of information in demand on the majority of interstate or global international networks such as the Internet. Owing to the rapid progress of communication technologies, active use of spoken and visual information is foreseen in the near future. Today there are about 3000 spoken languages in the world, and only 100 of them are written. Such a variety of languages, combined with the growing importance of international contacts and the high intensity of economic, political, scientific, technological and social information flows, makes the development of human language technologies a priority task. Multilingualism is an inevitable historical attribute of the international communication process. The notion of a "multilingual information society" is firmly established among the problems discussed by computer science specialists and has shaped the strategic directions of many international organizations and companies dealing with information distribution and processing.
Human language technologies have gained real political understanding and support. At the second UNESCO Congress on Informatics and Education, held in Moscow in July 1996, representatives of more than 40 countries confirmed their willingness to participate in establishing multilingual telematics support for educational needs. The relevance of linguistic and cultural aspects of the Information Society in Europe was stressed at the European Council meetings in Corfu, 24-25 June 1994, and in Cannes, 26-27 June 1995, and at the G7 Conference of Ministers in Brussels, 25-26 February 1995. In May 1995 the European Commission adopted the upcoming Fifth RTD Framework Programme, which includes points devoted to developing human language technologies.
The multilingualism problem has one important factor that is usually overlooked in discussions of this issue. The high technological level of modern civilization broadens the notion of multilingualism in the information society to include artificially intelligent electronic devices. This newly arisen communication structure forms the Cyberspace, which is marked by extremely complex processes of control and self-development. In the Cyberspace, humanitarian multilingualism interacts with the technological multilingualism of communication protocols, computer interfaces, programming languages and data protection methods. The communication multilingualism of computing devices is to some extent being resolved by introducing agreed technological standards that regulate data exchange and transformation. Different computers in networks can transmit information using popular standard protocols such as TCP/IP (Transmission Control Protocol / Internet Protocol), IPX/SPX (Internetwork Packet Exchange / Sequenced Packet Exchange) or SNA (Systems Network Architecture). Protocols such as FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol), the URL (Uniform Resource Locator) address standard and the standard gateway interface CGI (Common Gateway Interface) are in active use on the Internet. The rich expressive possibilities of Web pages rely on implementations of the HTML and Java languages. To protect information from different forms of eavesdropping, the cryptographic protocols SET (Secure Electronic Transaction), SSL (Secure Sockets Layer) and some others have been developed. This "hardware multilingualism" aggravates the problem of humanitarian multilingualism in the Cyberspace.
It is interesting to note some resemblance between humanitarian and technical multilingualism. Communication multilingualism is developing rapidly and to some extent repeats the dramatic path of human multilingualism. Here, too, numerous languages and dialects continually appear and die, and conflicts of standards, programming languages and interfaces are a usual event. Any technological innovation can lead to a chain reaction, or perhaps better a "domino" reaction, among expressive and communication means.
The evolution of the Cyberspace is marked by constant attempts to resolve the problems of humanitarian and technological multilingualism. Advances in these areas rest on a common base: the developing theory of semiotic systems, artificial intelligence, translation methods, algorithmic analysis, the evolution of discourse, and the technologies that sustain language engineering.
The multilingualism challenge
The historically formed present situation suggests two general approaches to partially resolving the multinational communication problem. The first approach relies on adopting a single universal, well-understood language as a common medium for multilingual communication. The second approach relies on developing powerful techniques for rendering the expressive means of one language by the expressive forms of another.
Three main variants of implementing the first approach are known.
The second approach, rendering the expressive means of one language by the expressive forms of another, is traditionally reduced to the translation process. Before the computer era, a bilingual or multilingual interpreter was a human being. With the spread of computers the language translation problem has acquired a new content: automated language translation. In its full scope this problem is bound up with the very complex problems of formalizing natural language understanding and cognitive context, which are still far from a final solution. Nevertheless, there have been some evident recent successes in this area. The number of commercial machine translation systems, as well as other tools such as spell-checkers, multilingual mailers and dictionaries, is growing rapidly.
During the last two years a few multilingual services have appeared on the Internet. Multilingual Communication Corporation (USA) offers its powerful WEB.TRANS service for global business. Two main providers of WWW browsers, the companies Alis and Globalink, include multilingual support in their products. Tango, a multilingual browser using the Alis technologies, is a high-performance Web tool that allows users to display Web pages authored in any of over 90 languages, select the language of its interface, automatically retrieve pages in the language version the user prefers, and input text in a wide variety of languages. Alis Technologies has brought together several leading companies in the language industry to create Columbus, a suite of communication and translation tools and services designed to offer corporations the most comprehensive, convenient and cost-effective solutions on the market. Other well-known companies offering a variety of multilingual solutions include ACCENT (Israel), Ajax Software Corp. (Russia), IBM, Intersol Brea and SYSTRAN (USA). We can also point to the activity of BIT Software, Inc. (Russia), as well as ABBYY Company, formerly BIT Ukraine. These software companies are known for multilingual products such as Stylus Gigant (a powerful heuristic translator from five European languages into Russian and vice versa), Lingvo (a large interactive hypertext dictionary) and WebTranSite (an online "transparent" Internet translator).
Multilingualism inside one country
The multilingualism paradigm has one interesting and frequently encountered feature. For various historical reasons, two or more natural languages may share the same information space within one country (nation, population). Such a situation exists in many NIS countries, Ireland, Canada, India, the Benelux countries and many others. Ukraine is a bilingual country in which the Ukrainian language has state status. In such cases specific domestic features can complicate the international multilingualism problem. The problems can become quite serious: for example, after seven years of official independence Ukraine still does not have a fully defined character code page (380), because four or five existing standards are already widespread and it is difficult to change any of them now. Sometimes (as in Ireland) politics can influence decisions on multilingualism.
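The practical effect of several coexisting code pages can be shown with a small sketch. It is only an illustration under stated assumptions: the three codec names below are commonly used Cyrillic encodings chosen for the example, not necessarily the standards meant above, and the helper name `cross_decode` is hypothetical.

```python
# Illustrative only: these three codecs are assumed stand-ins for the
# "four or five existing standards"; the source does not name them.
CANDIDATE_CODECS = ["koi8_u", "cp1251", "cp866"]


def cross_decode(text, codecs=CANDIDATE_CODECS):
    """Encode text with each codec and decode the bytes with every codec.

    Returns a dict mapping (encoder, decoder) to the decoded string.
    """
    table = {}
    for enc in codecs:
        raw = text.encode(enc)
        for dec in codecs:
            # errors="replace" keeps the demo running on undefined bytes
            table[(enc, dec)] = raw.decode(dec, errors="replace")
    return table


if __name__ == "__main__":
    sample = "Україна"
    for (enc, dec), shown in cross_decode(sample).items():
        status = "ok" if shown == sample else "garbled"
        print(f"written as {enc:7s} read as {dec:7s} -> {shown!r} ({status})")
```

Only the pairs in which the writer and the reader use the same code page reproduce the word; every mixed pair yields unreadable text, which is why a single agreed standard matters for a shared information space.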
Informatization in Ukraine
Much attention is paid to creating a computer-supported information infrastructure in Ukraine. The State program of informatization in Ukraine has been approved by the government and is backed by a legislative basis. Three laws concern the informatization program. The first law defines the concept of informatization in Ukraine. The second defines the objectives of the program. The third determines the content of the program. We would like to stress that our Center is the leading organization in forming the concept and the governmental program of informatization in Ukraine.
International Research and Training Center of Information Technologies and Systems
Intensive research and technological work in language engineering is carried out in the organization we present here. Let us mention briefly some of it.
All these developments are based on the modeling of human intellectual activity and on the constructive analysis of digital information by means of its synthesis. The method used is recognized around the world.
A neural network methodology for implementing this method has been developed. The methodology was tested on the automatic detection of users' interests.
These principles formed the basis for creating neural networks that support decision making and pattern recognition. In particular, work on neural recognition of handwritten symbols has been started with the Japanese company WACOM and is at present giving good results. The same results could be used in the multilingual Internet.
Ukraine (and some CIS countries) can be considered a typical "country in transition" with respect to the production, dissemination and use of multilingual information on the global information highways. There is an urgent need to develop an effective infrastructure in the field and to adjust to international standards. On the other hand, Ukraine is ready for international co-operative projects, having know-how and qualified specialists in many fields, including computer science, linguistics, programming and education. Currently the following directions are considered first priorities in the National Informatization Programme: basics in telematics (Internet), networking of information and educational resources, multilingual computer-based support in science, education and communication, and creation of a distance education centre.
Our preliminary analysis shows that the CIS countries have serious needs in the field of multilingual telematics support. Russian- and French-speaking users are very conscious that they waste too much time and lose essential context information when they read or write texts in another language, even if they are competent in it. Keeping one language, such as English, as the sole language of communication is not cost-effective. There is a strong desire to stay within one's own native language, which of course does not exclude trying to learn a few others for personal communication and cultural purposes.
The specific needs of multilingual users are the following:
The problems of supporting user and/or supplier groups and of standardization are becoming more and more pressing both for users in the East and for suppliers in the West. To make all relevant information available to its potential users and to provide for information exchange between Internet users from various professional fields and different countries, the relevant national sites must be equipped with tools that form a friendly and comfortable environment, making communication attractive for users world-wide, developing their information needs, widening their horizons and motivating their creativity. To facilitate world-wide exchange via the Internet, the specific needs of users from a wide range of cultural settings should be analyzed and generalized to find an effective solution.
The general goals of the multilingual telematic support for education and training are the following:
To achieve these goals, sites interested in multilingual applications must be equipped with tools that form a user-friendly and comfortable multilingual environment. Communication within the Network must be made attractive for users world-wide, developing their information needs, widening their horizons and motivating their creativity and intercommunication. Linguistic support for multilingual communication, information search and acquisition requires the use of modern linguistic tools such as spelling and grammar checkers, dictionaries, thesauri, vocabularies, elements of automatic translation and, eventually, multimedia voice encoders and decoders. To achieve the main goals for the CIS countries, tools for translation from English into Russian and Ukrainian will be used.
These problems must be solved at local, regional, national and international levels.
The global aims of the IVU "Ukraine" are the following:
The multilingual telematics support for education and training should be based on innovative ways of flexibly integrating modern technologies into new solutions aimed at developing a multimedia learning environment. This in turn will contribute to the introduction of innovative solutions and skills for telematics-based training in world society, promoting new opportunities for life-long distance education.
To facilitate world-wide exchange within the Network, its local and national sites should analyse the specific needs of their users in order to find an effective solution within the Network. One of these specific needs is language support for multilingual communication and exchange of materials. To facilitate the use of samples and examples from different sources, the Network should support communication between teachers as well as translation of materials into the native language. This language service within local sites would not only promote the use of English-based materials outside English-speaking countries but also open the gates to teachers' experience in other cultures and languages.
Using current machine translation tools, spelling and grammar checkers and vocabularies would also help to improve them, making them as necessary an office tool as a text editor or word processor.
Multilingual Internet support development. The main research direction is the development of methods for teaching multilingual Internet technologies, to ensure intensive use of modern information resources by different groups of people.
Working in the multilingual Internet environment creates a large number of problems for non-English speakers, and such users cannot always solve these problems on their own. Activity in this direction was initiated at the Second International UNESCO Congress and was supported by 42 countries around the world. Such developments are absent in Ukraine. However, there is a strong need to overcome language barriers and to provide translation or explanation of terms selected by the user. Besides that, there are no materials on overcoming language barriers, although such experience exists for the English, Dutch and French languages. For the Russian and Ukrainian languages we have only American off-line translation centers. This problem is therefore pressing for Ukraine, which is only beginning to take its place among the leading countries in telecommunications.
The object of investigation is ways of accessing modern Internet information resources with on-line and off-line electronic dictionaries and translators, ensuring flexible distance learning, and providing measures that will facilitate active use of the multilingual Internet environment.
Within the Ukrainian site of the Network the following tools are available:
A carefully structured multilingual environment for Ukrainian education infrastructure will ensure that the medium is the message and that higher education's investments and experience in national networking and telecommunications result in an educational medium in which Ukrainian Information programs can be as accessible and effective as educational leaders dream and as affordable as public and personal fiscal realities demand.