
Wednesday, 8 May 2013

Humpty Dumpty and the TAUS quality concept



The “Translation Automation User Society” (TAUS) is a think tank which promotes the use of machine translation and technology within the translation industry. It organises events and offers services such as data sharing and language technology training. A recent article on the TAUS blog focused on the problem of quality evaluation in automated translation. It proposes a model called “dynamic quality evaluation”. This model has also been discussed on the LinkedIn group “Translation Automation”, and Rahzeb Choudhury of Leeds University kindly sent me a link to a longer report in PDF format, the Dynamic Quality Framework Report.
When I look at these materials, the underlying logic seems rather suspect, like a circular argument. It is worth considering the reasons for this.
The TAUS demographics
The Dynamic Quality Evaluation Framework report is based on a study conducted with a number of major multinational organisations (“reviewers”) which have a high volume of text that needs translation. Most of these organisations are large businesses with high-volume technical products such as Dell, Google, Microsoft, Philips and Siemens. The organisations also include the EU, which has a high volume of translations between the national languages in the European Community.
In other words, the work of TAUS, at least in this particular instance, is based on a very limited sample, i.e. major international organisations with an extremely high volume of multilingual text requirements, most of which service a limited range of subject areas. There is no consideration given to highly complex and confidential legal texts which will be read in different jurisdictions, no mention of complicated architectural texts, of urban planning, high-powered business management documents and much more. Given this highly selective demographic situation, it is not surprising that TAUS claims broad agreement on certain priorities in its reports and other documents. I would suggest, however, that the translation industry is much broader than the demographic group represented by TAUS.
The part and the whole
This limited demographic sample would not in itself be a problem if TAUS freely admitted that the study deliberately focuses on a certain scenario and certain types of translation work. But the actual usage in the report exacerbates the problem and is often misleading. For example, there are frequent references to “the translation industry”, although the descriptions and conclusions actually apply to clients (and perhaps selected suppliers) in the translation technology industry working on high-volume automated translation in specified subject domains.
If the work of TAUS claimed to be impartial academic research, it would take a far more self-critical approach to its own sampling procedures and would openly point out the limitations of its material. Instead, it acts like a political pressure group, presenting its results in the way that most suits its own agenda. In some of the TAUS material that I have read, I have wondered whether this confusion is deliberate, or whether it reflects a genuine inability to perceive that there are different perspectives on the issues.
Dynamic quality evaluation – a definition of convenience?
The report on “dynamic quality evaluation” uses this very problem as its starting point. It states, for example, “Quality evaluation (QE) in the translation industry is problematic”. The blog post claims “The industry needs common measurable definitions”. Both of these statements pose more questions than they answer. Which sector(s) of the translation industry is TAUS referring to? What quality is referred to, who wants to evaluate this quality, for what purpose and in what kinds of text? What measurements could be used to define something as flowing and variable as language? To what extent would industrial-scale evaluation and defined measurements miss the essential characteristics of the material they are used on?
Instead of dealing with these fundamental issues, TAUS posits a quality evaluation system with three main elements, which it calls utility, time and sentiment. We are told that utility refers to the functionality of the content, time refers to how quickly the translation is needed and sentiment denotes the effect of the resulting text on the brand image. You may notice that the actual quality of a text is not one of the three elements. So where does it come in? As far as I can gather, it seems to be relegated to a sub-category of “Utility” and to be marginally touched on in the category “Sentiment”. At the stroke of the categoriser's computer keyboard, the quality of the text itself is relegated to a mere sub-category.
The pinnacle of the “dynamic quality” logic is reached in the blog post. At the conference which is reported on the blog, there were apparently some participants who did not agree with the majority opinion – they advocated absolute rather than relative quality, and they felt that universal measurable standards did not do justice to the phenomenon of translation. Then comes the classic conclusion: most participants at the conference felt that “unless we maintain the simplicity of the model we get lost in endless details and personal requirements, and we end up … having no generalizable reference …”
Get yourself a cup of coffee and sit down and consider this sentence for a few moments. I would paraphrase it like this: some people argue that the world of language and translation is complicated, but we can’t handle a complex world because we could then not create the simple and measurable system that we want. We must have simplicity, so let there be simplicity. Simplicity rules, simply because we want it to rule.
This is rather like the semantic principles expressed by Humpty Dumpty in Lewis Carroll's novel “Through the Looking-Glass”: “When I use a word, it means just what I choose it to mean – neither more nor less.” It would be a wonderfully simple way to use language: I say what I want, and it means what I want. The only problem is the puzzled expression on the faces of my listeners.
The toxic disclaimer
The final section of the blog is where TAUS dances on the borderline of imperialism. In the title of this section, and three times in the paragraphs that follow, it mentions the possibility of applying for the “dynamic quality” system to be certified as a standard. Each time, the possibility is retracted, at least partially, rather like the song of the Mock Turtle in Carroll's novel: “Will you, won't you, will you, won't you, will you join the dance?” In a TAUS context, this translates as “we would not be so sure that we would want to apply for official standardisation” and “Whether we go for standard certification is a decision we can take together when we get to this crossroads”.
Together? Dear TAUS, does this mean that you will gather all of the translators in the world and involve us in deciding whether to apply for certification of a standard? I think not. Your agenda seems to be domination of the translation industry rather than cooperation with real life translators. You do not look kindly on people like me who have differing opinions, far less do you take us seriously. For you, we are unwelcome “quality gatekeepers” who are “blinkered by prior assumptions”. Ho hum, I suppose Humpty would be proud of these sweeping allegations.
Unintended consequences
The occupation of Gaul by the Roman Empire gave rise to the insurrection by Asterix and Obelix in the wonderful French comics and films. Many other literary parallels come to mind, such as Luke Skywalker and the Empire, Thursday Next and Goliath Corporation, etc. If you continue to play Humpty with the values which translators hold dear, please do not be surprised when you meet opposition. Every group which aspires to global domination must expect resistance. The rhetoric adopted by TAUS and others will bring forth a myriad Luke Skywalkers, and your glorious automated future will be lit up by the flash of lightsabres all over the globe.

Wednesday, 25 April 2012

Computer language mystery solved by humans


Computers have languages, too. According to an article in the American Scientist, even the experts do not agree on how many programming languages there are – estimates range from 2,500 to over 8,500.

One recent example which highlighted this variety was the mystery of the programming language used in the creation of “Duqu”, a computer Trojan which has been studied by heavyweight anti-virus companies like Symantec, Kaspersky Lab and F-Secure. These IT giants were able to see the code of which this Trojan consisted, but they were not able to identify which programming language had been used to write that code.

Why didn’t they ask a computer?
To me, as a mere computer user without a programming background, the solution appears simple. It is a computer language, and a computer is obviously able to follow the instructions in the code (otherwise the Trojan would be of no use to the crooks who created it). So a computer should be able to identify what language it is. This seems to be an obvious logical conclusion.
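Out of curiosity, here is how such a guessing computer might be sketched in Python: score a code sample against hand-made keyword “fingerprints” of known languages. This is purely my own illustration – the fingerprints and the whole approach are invented for the example, not anything the anti-virus companies actually used (and the Duqu case shows just how far such naive matching falls short).

```python
# Toy language identifier: scores a code sample against hand-made
# keyword "fingerprints" and returns the best-scoring language.
# Illustrative only -- real identification is far harder.
FINGERPRINTS = {
    "C": {"#include", "printf", "->", "struct"},
    "Python": {"def ", "import ", "self", "elif"},
    "Pascal": {"begin", "end;", ":=", "procedure"},
}

def guess_language(sample: str) -> str:
    scores = {
        lang: sum(1 for token in tokens if token in sample)
        for lang, tokens in FINGERPRINTS.items()
    }
    return max(scores, key=scores.get)

print(guess_language('#include <stdio.h>\nint main() { printf("hi"); }'))
```

The catch, of course, is that compiled malware like Duqu offers no source code at all – only binary constructions that no simple fingerprint table covers.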

But it is not so. Igor Soumenkov, a Kaspersky Lab expert, wrote a blog article “The Mystery of the Duqu Framework”. The article outlines the history of the study of Duqu and the structure of the threat which it poses, and it ends with an appeal which amazed me: “We would like to make an appeal to the programming community and ask anyone who recognizes the framework, toolkit or the programming language that can generate similar code constructions, to contact us or drop us a comment in this blogpost.”

Digital guesswork?
Soumenkov received a flood of blog comments and e-mail responses, and the mystery of the programming language has now been solved. But it is interesting to check out the wording of the 159 comments on the original blog article. They are peppered with phrases like:
That code looks familiar
It may be a tool developed by ...
I think it's a ...
What about ...?
Just a guess ... the first thing that pops to my mind is ...
Sounds a lot like ...
I am not a specialist but I would say it could be ...
One more guess ...
This does smell to me a little bit like ...
I'm gonna take a wild guess ...
Plus a generous sprinkling of words like might, perhaps, maybe, probably, similar, clue, feel, remember, possibility and similar vague terms.
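Ironically, even this hedging vocabulary is something a computer can count. A trivial Python sketch (the sample comments are paraphrased from the phrases above, and the count is purely illustrative):

```python
import re

# Count hedging vocabulary in a batch of comments.
HEDGES = re.compile(
    r"\b(might|perhaps|maybe|probably|guess|think|"
    r"looks?|sounds?|similar|could)\b",
    re.IGNORECASE,
)

comments = [
    "That code looks familiar",
    "I think it's a custom framework",
    "Just a guess ... probably an OO C dialect",
]

total = sum(len(HEDGES.findall(c)) for c in comments)
print(total)  # one hedge in each of the first two, two in the third
```

Counting the hedges is easy; producing them – the fuzzy, tentative, associative guessing itself – is what the computer could not do.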

Data or brains?
For me, this throws an interesting light on the use of computers in natural language processing. The human guesswork in the comments on Duqu included many ideas that turned out to be wrong, but the brainstorming process was helpful to the computer experts involved, and the fuzzy process of human thinking led to a solution which evidently was not possible with the computer alone. And all of this for a language which is only useful in computers and has no meaning for human communication (when did you last _class_2.setup_class13)[esi]?).

The situation in translation between human languages is comparable. Automatic translation programs from Google, Microsoft, IBM and others can achieve a certain amount of pattern recognition and sometimes come up with plausible solutions. But only a competent human being can evaluate whether this solution is really accurate or appropriate. So these programs can be a useful tool in the hands of an expert, but there is a distinct risk that they may get the wrong end of the stick.

Friday, 2 March 2012

Would I advise my grandchildren to translate?

Bang, bang, bang.
Is this another nail in the coffin of freelance translation as a career?
A recent article on the blog of the Translation Automation User Society (TAUS) does not hold out much hope for specialist translators. The title of the article is “Who gets paid for translation in 2020?”. I would love to quote the author of this article by name, but no name is given. Perhaps this is a model article, generated by a computer, untouched by human hand. This would graphically illustrate the creed which underlies the article:
“In 2020 words are ‘free’. Almost every word has already been translated before. Our words will be stored somewhere and used again, legitimately in the eyes of the law or not. .... Even today ‘robots’ are crawling websites to retrieve billions of words that help to train machine translation engines. The latent demand for translation created by unprecedented globalization is making piracy an act of common sense.”
The TAUS vision paints a glowing picture of a completely automated future, with instant computerised translation in every hand-held device, every computer application and on every website, without any need for specialist intervention. To achieve this, TAUS aims to build up a database of all the translation work done in the world. It seems to envisage three methods to do this:
BEG, SCAVENGE and STEAL
BEG: In conference lectures, blog articles and other publications, TAUS calls on translators to donate their translations to its central database. The reward for doing this is to know that we are contributing to the BRAVE NEW WORLD of global computerised translation. There may be some payback in the form of access to databases provided by others, but the rhetoric of the begging prose is that we should contribute for free to the ideal of a humanity without language barriers.
SCAVENGE: The above quote speaks of the “robots” which are retrieving billions of translated words to train machine translation engines. But a scavenger takes everything that it can find. A scavenger cannot afford to be fussy about quality. Two experts in the industry have important things to say about this. The first is Kirti Vashee, in his blog eMpTy Pages. Kirti is an ardent advocate of machine translation, but he insists that the data used to train the translation engines must be of extremely high quality. The danger of the TAUS vision of innumerable robots scavenging for more and more data is that this can include lots of low-quality data, so the resulting translations will be inherently problematical. The other expert is Miguel Llorens, a highly insightful freelance translator who ridicules many of the assumptions of the machine translation gurus and elegantly criticises buzzwords such as the “content tsunami” and “crowdsourcing”.
As an aside: Kirti and Miguel disagree on many things - I suppose it is not often that they are recommended as two leading experts in the debate on machine translation.
STEAL: It has often been suggested that Internet giants such as Google and Facebook are in fact data-gobbling monsters which think nothing of violating data protection standards. But at least in their public statements, they usually claim to respect the privacy of their users and to comply with data protection laws. Not so TAUS. In the above quotation, TAUS explicitly suggests that piracy is “an act of common sense”. I wonder if the similarity to the confiscation of private assets in the ideology of Marx, Stalin and others is merely accidental. Brave new world indeed!
Translation and my grandchildren
By the time the brave new world predicted by TAUS comes to pass (2020), my own translation career will be drawing to a close, or perhaps already ended. But what about my wonderful grandchildren? They will be on the threshold of their working lives (and some will be still in primary school). What should I tell them if they ask about translation as a career?
I will say: “Why not - if that is what you are really good at.” Of course I will point out the general principles of working in a career like translation: real language expertise in two languages, realistic self-appraisal and self-management, translating skills, the need for solid specialisation, how to use the tools of the trade (including computer-aided translation and various forms of machine translation), how to advertise and find customers and much more.
This is because essentially I do not accept the TAUS creed that “Almost every word has already been translated before.” Even at the word level, in my work I regularly come across newly created terms or compound words (German legal and architectural prose has an amazing level of inventiveness in this respect). And at the sentence level, every language on earth has an incredible potential for creative new combinations of ideas and even new linguistic structures - after all, I believe that we are still building the tower and city of Babel.

Friday, 11 November 2011

DVX2 screenshot gallery

At first sight, the screen of the Translation Memory program Déjà Vu X2 (DVX2) is just a mass of boxes, a chaotic pattern of vertical and horizontal lines. What are they all for? Where in this enormous jigsaw puzzle can I find the text I want to translate? What other information is provided on the screen, and how is it helpful? The best way to explore this is with screenshots.

The classic layout
When you start working on a project with DVX2, the screen will probably look something like this. The pane at the top left is the working area. The left column is headed "German" - that is my source language. The right column, English (United Kingdom), is where my translation goes.

At the bottom left and bottom right of the screen I can see my reference material. At the bottom right I have terminology suggestions ("AutoSearch Portions"), and at the bottom left I have similar sentences ("AutoSearch Segments"). The top right ("Project Explorer") shows me the files in the project. When I am working on the translation, I normally hide this pane so that I have the full window height for the terminology.
There are various ways to personalise this layout. I can change the font and type size in the various windows, and I can also change the arrangement of the different panes in the working window.
My personal layout
Modern monitors, laptops and netbooks tend to have a wide screen. There is not much space to display elements above each other, so it is sometimes better to display the elements side by side. Therefore, my normal DVX2 screen looks like this:
In this "tramline" layout, the working area is in the middle of the screen and the reference material is arranged to the right and left. It provides more context (i.e. the text before and after the active sentence). The shorter lines could be a disadvantage for longer sentences, especially on smaller screens. The above screenshot is taken from my 22" monitor. On my 10" netbook, this layout is rather more cramped, although it would be just about workable:
One way to make the lines longer in the working area is to work in a separate text area at the bottom of the screen and to split this text area vertically (Tools>Options>Environment). The active sentence is highlighted in the grid, but the working area is now at the bottom, i.e.:
I often get jobs with very long sentences, and sometimes the reference pane on the left is empty for most segments. In such jobs, I can simply hide this column, which gives me longer text lines even without using the separate text area:
Hide and display
In the last screenshot, note the little tabs on the left and right of the screen. They are "mouse-over" tabs. If I want to have a quick look at "AutoSearch Segments", I simply move the mouse over the tab, and the AS Segments pane opens up, but closes again when I return the mouse to the main grid.
Note also the little drawing pin icon at the top right of the "AS Portions" pane. This is a three-way switch for the display of this pane. It can either be fully displayed, as it is here, folded away like the "AS Segments" pane, or it can hover as in the mouse-over function. The combination of the tabs and the drawing pin icons takes a bit of practice, but it helps me to be flexible in using the screen layout.
Smaller details
There are a number of smaller details in the screen layout which can be useful.
The top of the DVX2 window shows the name and path of the current project. For example, the project I used for these screenshots is on drive D at the location shown.
These six icons are in the middle of the bottom edge of the DVX window. Mousing over them displays what they mean - here I had the mouse over the first icon (AutoWrite). The background colour shows me whether the function is on or off. Here, for example, AutoWrite, AutoAssemble, AutoPropagate and AutoCheck are enabled, but AutoSearch and AutoSend are disabled. These functions can also be switched on or off via Tools>Options>Environment, but the icons are quicker.

This is the area above the working part of the grid, and it contains a few hidden details. The grid language heading boxes (here "German" and "English") switch between alphabetical and chronological view of the project sentences. The language field with the flag has a little arrow to the right, which leads to a list of the target languages in the project (useful for project managers, but not usually for freelancers like me). The box "All segments" also has a little arrow, which opens up a list of types of sentence (all fuzzy matches, all exact matches etc.). The empty box on the left is a row finder. If I know the number of a segment, I can type it here, and DVX2 jumps to that segment (useful if I am proofreading and notice that a segment needs more work when I have finished proofing - I simply jot down the number and jump to the segment afterwards).
The tabs above this row show the names of the files which I have opened, so I can move to another file simply by clicking the tab. That in itself does not sound special. But these tabs can also be used to display files side by side (or one above the other). I can then compare my work on two files in context, for example like this:
This article only looks at the main grid, in other words the screen which I usually see when I work on a project. It does not explore the menu or any of the subsidiary screens, nor does it examine the efficiency of the many functions of the program. But I hope that this visual summary gives a general impression of the working environment.

Wednesday, 19 October 2011

Deep mining with Déjà Vu X2

"Déjà Vu" is a translation memory program created by the company Atril. It stores my previous translation work in databases and uses these databases to help me in every new translation job. There are three types of database. The "translation memory" (TM) contains the sentences in my source language together with my translations of these sentences. My main TM has about 385,000 sentence pairs in my two languages (German and English), so it is effectively an archive of all the work I have done since I started using Déjà Vu in late 1999. The "termbase" (TB) has terminology items which I have entered. My main TB has about 54,500 terminology pairs. In addition, there is a "lexicon" for each project, which is a place to put proper nouns, client-specific terminology etc. When I work on a new translation project, the program calls on these databases to offer as much help as possible. Sometimes this enables me to work much faster on a project. But usually I have projects with long and complicated sentences, especially contracts, so the speed gains are usually more modest. The main benefit of Déjà Vu for me is as a tool for quality which enables me to be more consistent in my work.
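To picture how such a database helps, here is a minimal sketch in Python of a translation memory with exact and fuzzy lookup. This is my own simplification for illustration only – the class, the 75% threshold and the character-based similarity measure are all my assumptions, not Atril's actual implementation:

```python
import difflib

# Minimal translation-memory sketch: exact and fuzzy lookup over
# stored (source, target) sentence pairs. Real TM engines index
# and score far more cleverly than this linear scan.
class TinyTM:
    def __init__(self):
        self.pairs = {}  # source sentence -> target sentence

    def add(self, source, target):
        self.pairs[source] = target

    def lookup(self, sentence, threshold=0.75):
        if sentence in self.pairs:            # exact match
            return self.pairs[sentence], 1.0
        best, score = None, 0.0
        for src in self.pairs:                # fuzzy match
            ratio = difflib.SequenceMatcher(None, sentence, src).ratio()
            if ratio > score:
                best, score = src, ratio
        if best is not None and score >= threshold:
            return self.pairs[best], score
        return None, score

tm = TinyTM()
tm.add("Der Vertrag ist gültig.", "The contract is valid.")
hit, score = tm.lookup("Der Vertrag ist ungültig.")
print(hit, round(score, 2))
```

Even this toy version shows why fuzzy matches need a human eye: the closest stored sentence may say the opposite of the new one ("gültig" versus "ungültig"), yet still score very highly.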

Over the last 12 years I have seen three generations of the program. The first version was known by the abbreviation "DV3". The next generation, DVX, was released in May 2003. The latest version is DVX2, which was released in May 2011.

Each new version has new features. A list of new features in DVX2 can be found here. One new feature which has puzzled many people is "DeepMiner". The theory is that it uses both the TM and the terminology databases to retrieve even more material. But how does it work in practice? There is a training video which uses an extremely simple example to show cross-analysis between the sentences "I have a brown dog" and "I have a black dog" when translating them into French.

So far so good. In practice, however, my sentences are never as simple as this example, and the size of my databases means that DeepMiner has to work much harder. As a result, using DeepMiner on a largish project with big databases can be very slow. And in my experience, DeepMiner is sometimes not helpful because it tries to be too clever and reconstruct the solution from similar sentences in the TM, and in the process it may overlook what I have in my termbase and lexicon. Thankfully, it is easy to switch the DeepMiner function on or off.
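For what it is worth, the cross-analysis principle from the training video example can be sketched roughly like this in Python. This is only my guess at the underlying idea, not Atril's actual algorithm:

```python
# Sketch of the "cross-analysis" idea behind DeepMiner, using the
# brown dog / black dog example from the training video. Purely
# illustrative: if two sentence pairs differ in exactly one word
# on each side, the differing words can be paired up.
def infer_word_pair(src_a, tgt_a, src_b, tgt_b):
    diff_src = set(src_a.split()) ^ set(src_b.split())
    diff_tgt = set(tgt_a.split()) ^ set(tgt_b.split())
    if len(diff_src) == 2 and len(diff_tgt) == 2:
        (sa,) = set(src_a.split()) & diff_src
        (ta,) = set(tgt_a.split()) & diff_tgt
        (sb,) = set(src_b.split()) & diff_src
        (tb,) = set(tgt_b.split()) & diff_tgt
        return {sa: ta, sb: tb}
    return {}

pairs = infer_word_pair(
    "I have a brown dog", "J'ai un chien brun",
    "I have a black dog", "J'ai un chien noir",
)
print(pairs)
```

On two five-word sentences this inference is instant; run the same idea across 385,000 long, complicated sentence pairs and it is easy to see both why DeepMiner can be slow and why its reconstructed guesses sometimes override what is in the termbase.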

So how helpful is this new function? To illustrate this, let's look at one example sentence from a complicated German land purchase and partitioning contract in two alternative versions: with and without DeepMiner:

My translation:
I make the following declarations not in my own name, but as a manager with power of sole representation of ...

Looking at the first half of the sentence, where do the phrases "my own name" and "the following declarations" come from in the example with DeepMiner? They are not in the terminology hits for this segment, and there is no whole sentence match. But the TM has many matches containing "die nachstehenden Erklärungen" and the translation "the following declarations" (although "nachstehend" on its own is only in the TB as "hereinafter"). The first three words "my own name" seem strange at first sight. Somehow, DeepMiner seems to have found a correlation between the words "ich ... im eigenen Namen" and the English "my own name", in spite of the fact that the TB entries which use "im eigenen Namen" only offer the English "its own name" and "his own name".

At least in this example, DeepMiner offers solutions which go beyond the conventional assembly and pretranslation routines in the previous version of DVX. In my experience, it is still a matter of trial and error - sometimes it finds surprisingly good suggestions, but sometimes it is not really helpful. One possible workflow to get the best of both worlds is to "Pretranslate" the whole file with DeepMiner activated and then, if the solution is not helpful, to "Assemble" the individual sentence without DeepMiner. To do this, the settings for Pretranslate are:

And the settings for Assemble (under Tools>Options>General) are:

I am still experimenting to find out how DeepMiner can be used to best advantage, so perhaps I will be able to add more insights at a later date. Before too long (hopefully) I will comment on some of the other features of DVX2 such as AutoWrite, the information design options in the variable grid layout etc.