Monday, 27 September 2010

Fight the machine? (2)

You want to travel from A to B. You get in your car, tell it where you want to go, relax - and the car does the rest. It recognises the route, keeps to the traffic regulations, detects hazards, avoids collisions with other vehicles, uses its database of road works and congestion to choose the best route and gets you to your destination quickly and safely.
Is this vision realistic? Could a car really be programmed to "see" traffic and detect all hazards? What road markings are necessary for the car to tell the difference between road and non-road? Can vehicle programming be sophisticated enough to anticipate all possible traffic situations? There are already automatic vehicle systems such as reversing cameras to help city parking, collision warning systems and speed control systems. Satellite navigation systems can help the driver to find his/her way. Can these systems be combined and refined to create all-round automatic travel?
The answer to this question is probably rather complex and full of "ifs" and "buts". Just like the answer to the question of whether computerised translation is suitable for professional use.
What for?
Machine translation can be useful for some purposes. If I come across an Internet text in a language that I do not understand at all, and if there is an automatic translation solution available for this language, a machine translation into a language that I understand may give me a general idea of the content. This is known as "gisting", i.e. my aim is to understand the "gist" of the text. Usually this process will give me a reasonable indication of the subject matter, but it is notoriously unreliable on the details, and in places I must expect the text to contain serious mistranslations. If I need a reliable translation, the only solution is to consult a competent human translator who understands the message of the text and can express it in the desired target language.
What subject matter?
It is generally agreed that MT is not suitable for literary texts. But there are many other domains and types of text that are completely unsuitable. I frequently translate contracts and other legal texts from German to English (and occasionally vice versa), and most of the sentences that I face are completely unsuited to automatic machine translation. This is partly because of the potential for terminology mismatch between the two legal systems, even where specialist dictionaries suggest equivalent terms or phrases. But it is also due to the sentence structure. Legal writing is often very complicated, with intricate clause structures and multiple layers of meaning within most sentences. The typical English word order "subject-verb-object" is sometimes reflected in German, but the alternative patterns "object-verb-subject" or "object-subject-verb" are also common. And complex adjectives (which frequently occur in legal language) are handled completely differently. A competent specialist translator must first take time to grasp the structure and interconnections of the elements in the German sentence, and then spend more time working out how these elements should be transferred into meaningful English, a process which often involves trial and error and, in an age of computers, shuffling of the elements by "drag and drop".
There are pitfalls in many other subject areas, too. Topics such as investment banking, business management, accounting and many others have their own conventions in each language. Even in technical disciplines there can be terminology and syntax mismatch which can lead to problems. I recently translated the technical specifications for the construction of a facade for a building. This text contained many terms which were not found in any specialist dictionaries and did not occur on a number of bilingual sites that I sometimes use for research on the Internet. In one or two cases, even the leading search engines had never heard of the concepts. I had to solve these problems by a multi-staged process which involved breaking these compound German terms down into their parts, investigating the meaning of the parts on their own, finding and checking other combinations of these parts, compiling a short list of possible English equivalents, then using search engines to check how plausible these equivalents were.
In cases like this, machine translation is out of its depth. Rules-based translation systems are liable to fail when the author of the source text bends or breaks the rules. Statistical machine translation systems, which depend on a corpus of previous material, are lost when there is no corpus.
How sophisticated?
The most sophisticated MT projects are projects with a restricted subject area and a well-defined procedural structure. They usually deal with mass-produced technical products, especially in areas where manufacturers produce a range of products which are similar in many respects and where the documentation has various recurring patterns. The procedural structure involves various stages. The first stage is editing the source text before it goes through the translation process to remove factual mistakes, language mistakes and non-standard wording. Then comes the machine translation itself, but it is then followed by post-editing by a competent editor. This editor usually needs to understand the source language (and the specific technological discipline) in order to spot and correct any mistranslations. And the editor also needs to give feedback to the system, thus enabling the MT system to expand or correct its data and "learn" from the work of the editor.
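For readers who think in code, the stages described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any real MT product: the class name `ControlledMTWorkflow`, the three stand-in functions and the two-word glossary are all invented for the example, and the "learning" is reduced to remembering the post-editor's corrections.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ControlledMTWorkflow:
    """Hypothetical sketch: pre-edit, machine-translate, post-edit, feed back."""
    pre_edit: Callable[[str], str]
    machine_translate: Callable[[str], str]
    post_edit: Callable[[str, str], str]  # (source, draft) -> approved text
    memory: Dict[str, str] = field(default_factory=dict)  # stored corrections

    def run(self, source: str) -> str:
        cleaned = self.pre_edit(source)
        # Reuse an earlier human correction if there is one for this sentence.
        draft = self.memory.get(cleaned) or self.machine_translate(cleaned)
        approved = self.post_edit(cleaned, draft)
        if approved != draft:
            # Feedback step: the system "learns" from the editor's change.
            self.memory[cleaned] = approved
        return approved


# Invented toy data and stand-ins, purely for illustration.
GLOSSARY = {"schraube": "screw", "anziehen": "tighten"}


def toy_pre_edit(text: str) -> str:
    # Stands in for source editing: here it only normalises spacing.
    return " ".join(text.split())


def toy_mt(text: str) -> str:
    # Naive word-for-word lookup, which keeps the German word order.
    return " ".join(GLOSSARY.get(word.lower(), word) for word in text.split())


def toy_post_edit(source: str, draft: str) -> str:
    # Stands in for the human editor, who repairs the word order.
    return "Tighten the screw" if draft == "screw tighten" else draft


workflow = ControlledMTWorkflow(toy_pre_edit, toy_mt, toy_post_edit)
first = workflow.run("Schraube   anziehen")
second = workflow.run("Schraube anziehen")
```

On the first run the editor has to repair the raw output; on the second run the stored correction is offered straight away and the editor has nothing left to change. Everything that makes the real process difficult, namely the judgement of the pre-editor and the post-editor, is hidden inside the stand-in functions.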
There is still controversy about the use of such systems. Proponents point to the savings in cost and the increase in efficiency. Others speak of the risk of liability if the quality control system is unable to eradicate the inherent errors and if the "translated" documentation therefore contains mistakes which lead to damage or injury. Another controversial issue is the role of the post-editor. What qualifications does the post-editor need, what are the potential earnings and how satisfying is the work likely to be?
Just another dictionary?
Professional translators have plenty of reference works. First of all, there are shelf-loads of dictionaries. I have about 70 dictionaries in various subject areas (some bilingual, some monolingual) and a good range of background reading. I also have several bilingual dictionaries in digital form. I have developed various strategies to extend my terminology searching on the Internet, and I also use a "translation memory" software program which gives me an easy way to look up all of the work I have done over the last 11 years.
This is not in any way special - the majority of the really experienced and competent professional translators probably have similar resources. It is therefore perfectly logical to add some form of access to an MT program. Some time ago I invested in such a program ("Personal Translator" from linguatec), and I occasionally use it for a "second opinion" on individual sentences. But I do not use it very often. From time to time it may provide a good suggestion which I can incorporate into my work, but in most cases the rendering is simply not useful enough, so it is usually more effective to work without it. And when I do use MT, the guiding principle is the same as when I use paper-based or digital dictionaries or Internet resources: the help that I find is just a suggestion. I am the one who must judge whether it is really useful, and I am always free to adapt it to the requirements of the text that I am working on.
The title of this article is "Fight the machine?". The short answer is: No. I don't wish to fight against the machine, and I am open to using the resources provided by computer programs and the Internet. But these resources need to be used carefully and critically. They are a resource for our work, not a source of higher wisdom.

Thursday, 23 September 2010

Fight the machine? (1)


It was during the 1970s in a school staff room in England. A teacher was preparing audio material for a language lesson. The original was on an old-fashioned tape spool, so he connected a tape recorder to his cassette recorder to transfer the material from one to the other. He turned the volume right down because he didn't want to disturb anyone.
A colleague saw what he was doing and quipped: "Two machines talking to each other, and we can't hear what they are saying. Now that is scary!"

This evokes a whole range of scenarios that are familiar from science fiction. But machines today can do much more than they could then. Now, many people have smartphones with more computing power than a whole roomful of equipment back in the 1970s.

This march of technology has also reached the translation business. There is much research and industrial development in a new discipline which is known as "MT" or "Machine Translation". There are romantic dreams about achieving technical inventions which would overcome the language barrier for all forms of communication, rather like the "Babel fish" in Douglas Adams' novel "The Hitchhiker's Guide to the Galaxy" (although Adams himself suggested in the novel that this does not lead to global peace, but to even more bloodshed).

Out of curiosity, I fed my first article (Blogging the miracle) into Google Translate and asked it to translate it into German. The result is, of course, full of grammatical mistakes and questionable written style, and definitely not fit for publication. But much of it is more or less comprehensible.

Of course it is easy to make fun of Google Translate by quoting some of its more blatant mistranslations. One way to do this is by the "translation party". Here, you enter an English sentence and it is then translated back and forth between English and Japanese until it reaches "equilibrium", i.e. the English version remains the same on every "round trip". The sentence "Once upon a time there were three bears, Daddy bear, Mummy bear and baby bear" reaches equilibrium as: "Bears 3, Dadikuma, he was a mummy bear and baby bear". Great fun, and only one of the many ways to poke fun at machine translation. But does this really do justice to the subject?
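The "round trip until equilibrium" idea is simple enough to sketch in a few lines of code. What follows is only a sketch: it does not call Google Translate or any real translation service. The two toy "translators" are invented stand-ins that lose a little information on each pass, and the function name `round_trip_until_stable` is my own.

```python
from typing import Callable, List


def round_trip_until_stable(
    sentence: str,
    to_other: Callable[[str], str],
    to_english: Callable[[str], str],
    max_rounds: int = 20,
) -> List[str]:
    """Return every English version seen, stopping at equilibrium."""
    history = [sentence]
    for _ in range(max_rounds):
        next_version = to_english(to_other(history[-1]))
        if next_version == history[-1]:
            break  # a full round trip changed nothing: equilibrium
        history.append(next_version)
    return history


# Invented stand-ins for a real translation engine.
def toy_to_other(text: str) -> str:
    # Pretend the other language has no articles and no capital letters.
    return " ".join(w for w in text.lower().split() if w not in {"the", "a"})


def toy_to_english(text: str) -> str:
    # Translating back cannot restore the lost articles.
    return text.capitalize()


history = round_trip_until_stable(
    "The bear saw a house", toy_to_other, toy_to_english
)
```

With these stand-ins the sentence settles after a single round trip. The `max_rounds` limit is there because a real pair of translators might swing between two versions for ever and never reach equilibrium.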

There are a number of serious ventures using machine translation for real translation work. One advocate is Kirti Vashee, who blogs at "eMpTy Pages". He believes that the volume of machine translation will increase to cope with the enormous volume of material needing translation, especially from multinational technology manufacturers, and that many translators will need to move into a new field: post-editing of machine translated output. In his blog he covers many aspects of this topic, which I can't do justice to in a short paragraph here. His blog provides an interesting and thought-provoking (and sometimes controversial) perspective on the subject.

Another advocate of MT is Jeff Allen. He mainly works on French to English machine translation and post-editing systems. He has also done much work on the Haitian Creole language, and this is currently being developed to support disaster relief work after the earthquake in Haiti. I haven't yet seen any reports on how efficiently this works out "on the ground" (does anyone have any recent news?). A good place to start looking at Jeff's contributions is his profile at Proz.com, which offers a number of links for further reading.

There are other serious users of MT, including large organisations such as Microsoft and the European Union. But MT also meets with much opposition from professional translators. This may be partly due to a fear of losing the market for human translation. But in many cases there are also doubts about how much MT can actually achieve, and whether it can really handle the subtleties of language.

So how do I, as a professional translator, regard MT? That is a long story, so I think I will have to make this entry into a series, and continue in part 2. Some day soon (I hope).

Wednesday, 15 September 2010

Hands to the keyboard

Help at last! Two small grandchildren volunteered to assist me on my computer keyboard. You can see the result at the top of the page.
The young lady on the left is nearly five weeks old. How much can she help me? Her communication skills are still rudimentary. She can let us know when she is upset for any reason, but we need a special genius to interpret whether this is due to hunger, constipation, wind, discomfort, boredom or fear of loud noises. Fortunately, we have someone in the family who has this special genius. That is the power of mothers.
The young man on the right is two and a half, and he can communicate a number of things very clearly. When he says "Opa bauen", I know that my moment has come. He wants me to build something, perhaps a wooden railway track, or possibly a house or car of Duplo bricks. And he can communicate this with words alone, even if the building materials are not in the same room.
Neither of them can really use the keyboard. Neither of them can tell an imaginary story. But in spite of their limitations, all human language is there. They may grow up to be real language experts. They will certainly be skilled users of language.
Human language really is a miracle!

Sunday, 5 September 2010

Blogging the miracle

Language fascinates me. The more I think about language, the more I realise that it is a miracle.
Language is my bread and butter. My native language is English, but I live in Germany and am perfectly at home in German. My work is translating from German to English (and occasionally the other way) in subjects such as law, architecture, building, industry and commerce.
In this blog I will look at some of the practical issues which arise in translation, but I will also explore the mystery and the miracle of language itself.
Think about it: by making a series of noises with my mouth or pressing a number of keys on a computer keyboard, I can take you into a completely new realm. Imagine, for example, that you are sitting on an elephant, stroking its coarse skin and looking out at the scenery around. Can you see the prairie dotted with trees? Can you see the elephant's ears and feel the wind in your face as it flaps them? Can you smell those elephant smells? But what would happen if the elephant suddenly started to run? Could you hold on, or would you fall off? Look at the ground. It is an awfully long way down.
Can you see all of this? Of course you can't! Look up from your computer or the printed page ‑ how many elephants can you really see? None, of course.
But a moment ago you "saw" an elephant. The words in my description put a picture in your mind, and "in your mind's eye" you saw an elephant. Why? That is what language does. However skillful or clumsy my words may be ‑ when I mentioned the elephant, you "saw" the elephant. That is the power of words.
Words are the raw material of literature. They are the building blocks of contracts, police reports, tourist guides and school textbooks. They make up great stage dialogues, they enable you to buy a train ticket. They are the stuff of business meetings, church sermons, news broadcasts and blinding arguments in the kitchen. You can use them to inspire others to noble deeds and high ideals. Or you can use them to tell lies and deceive others.
But sometimes words are completely useless. Put me in the middle of the Amazon Forest or the Russian Tundra, and my words in English or German are probably useless. In spite of my linguistic training, I am lost. There are thousands of languages in the world, and in most of those languages I am speechless and illiterate.
Do you share my fascination with language? Perhaps you would like to watch this space. Let's see what we can discover.