Language mystery: 2013

Wednesday, 8 May 2013

Humpty Dumpty and the TAUS quality concept

The “Translation Automation User Society” (TAUS) is a think tank which promotes the use of machine translation and technology within the translation industry. It organises events and offers services such as data sharing and language technology training. A recent article on the TAUS blog focused on the problem of quality evaluation in automated translation. It proposes a model called “dynamic quality evaluation”. This model has also been discussed on the LinkedIn group “Translation Automation”, and Rahzeb Choudhury of Leeds University kindly sent me a link to a longer report in PDF format, the Dynamic Quality Framework Report.

Looking at these materials, the underlying logic looks to me rather suspect, like a circular argument. It is worth considering the reasons for this.

The TAUS demographics

The Dynamic Quality Evaluation Framework report is based on a study conducted with a number of major multinational organisations (“reviewers”) which have a high volume of text which needs translation. Most of these organisations are large businesses with high volume technical products such as Dell, Google, Microsoft, Philips and Siemens. The organisations also include the EU, which has a high volume of translations between the national languages in the European Community.

In other words, the work of TAUS, at least in this particular instance, is based on a very limited sample, i.e. major international organisations with an extremely high volume of multilingual text requirements, most of which service a limited range of subject areas. There is no consideration given to highly complex and confidential legal texts which will be read in different jurisdictions, no mention of complicated architectural texts, of urban planning, high-powered business management documents and much more. Given this highly selective demographic situation, it is not surprising that TAUS claims broad agreement on certain priorities in its reports and other documents. I would suggest, however, that the translation industry is much broader than the demographic group represented by TAUS.

The part and the whole

This limited demographic sample would not in itself be a problem if TAUS freely admitted that the study deliberately focuses on a certain scenario and certain types of translation work. But the wording of the report exacerbates the problem and is often misleading. For example, there are frequent references to “the translation industry”, although the descriptions and conclusions actually apply to clients (and perhaps selected suppliers) in the translation technology industry working on high volume automated translation in specified subject domains.

If the work of TAUS claimed to be impartial academic research, it would take a far more self-critical approach to its own sampling procedures and would openly point out the limitations of its material. Instead, it acts like a political pressure group, presenting its results in the way that most suits its own agenda. In some of the TAUS material that I have read, I have wondered whether this confusion is deliberate, or whether it reflects a genuine inability to perceive that there are different perspectives on the issues.

Dynamic quality evaluation – a definition of convenience?

The report on “dynamic quality evaluation” uses this very problem as its starting point. It states, for example, “Quality evaluation (QE) in the translation industry is problematic”. The blog post claims “The industry needs common measurable definitions”. Both of these statements pose more questions than they answer. Which sector(s) of the translation industry is TAUS referring to? What quality is referred to, who wants to evaluate this quality, for what purpose and in what kinds of text? What measurements could be used to define something as flowing and variable as language? To what extent would industrial-scale evaluation and defined measurements miss the essential characteristics of the material they are used on?

Instead of dealing with these fundamental issues, TAUS posits a quality evaluation system with three main elements, which it calls utility, time and sentiment. We are told that utility refers to the functionality of the content, time refers to how quickly the translation is needed and sentiment denotes the effect of the resulting text on the brand image. You may notice that the actual quality of a text is not one of the three elements. So where does it come in? As far as I can gather, it seems to be relegated to a sub-category of “Utility” and to be marginally touched on in the category “Sentiment”. At the stroke of the categoriser's computer keyboard, the quality of the text itself is relegated to a mere sub-category.

The pinnacle of the “dynamic quality” logic is reached in the blog post. At the conference which is reported on the blog, there were apparently some participants who did not agree with the majority opinion – they advocated absolute rather than relative quality, and they felt that universal measurable standards did not do justice to the phenomenon of translation. Then comes the classic conclusion: most participants at the conference felt that “unless we maintain the simplicity of the model we get lost in endless details and personal requirements, and we end up … having no generalizable reference …”

Get yourself a cup of coffee and sit down and consider this sentence for a few moments. I would paraphrase it like this: some people argue that the world of language and translation is complicated, but we can’t handle a complex world because we could then not create the simple and measurable system that we want. We must have simplicity, so let there be simplicity. Simplicity rules, simply because we want it to rule.

This is rather like the semantic principles expressed by Humpty Dumpty in Lewis Carroll's novel “Through the Looking-Glass”: “When I use a word, it means just what I choose it to mean – neither more nor less.” It would be a wonderfully simple way to use language: I say what I want, and it means what I want. The only problem is the puzzled expression on the faces of my listeners.

The toxic disclaimer

The final section of the blog is where TAUS dances on the borderline of imperialism. In the title of this section, and three times in the paragraphs, it mentions the possibility of applying for the “dynamic quality” system to be certified as a standard. Each time, the possibility is retracted, at least partially, rather like the song of the Mock Turtle in Carroll's “Alice in Wonderland”: “Will you, won't you, will you, won't you, will you join the dance?” In a TAUS context, this translates as “we would not be so sure that we would want to apply for official standardisation” and “Whether we go for standard certification is a decision we can take together when we get to this crossroads”.

Together? Dear TAUS, does this mean that you will gather all of the translators in the world and involve us in deciding whether to apply for certification of a standard? I think not. Your agenda seems to be domination of the translation industry rather than cooperation with real life translators. You do not look kindly on people like me who have differing opinions, far less do you take us seriously. For you, we are unwelcome “quality gatekeepers” who are “blinkered by prior assumptions”. Ho hum, I suppose Humpty would be proud of these sweeping allegations.

Unintended consequences

The occupation of Gaul by the Roman Empire gave rise to the insurrection by Asterix and Obelix in the wonderful French comics and films. Many other literary parallels come to mind, such as Luke Skywalker and the Empire, Thursday Next and Goliath Corporation, etc. If you continue to play Humpty with the values which translators hold dear, please do not be surprised when you meet opposition. Every group which aspires to global domination must expect resistance. The rhetoric adopted by TAUS and others will bring forth a myriad Luke Skywalkers, and your glorious automated future will be lit up by the flash of lightsabres all over the globe.

Previous related posts on this blog

Would I advise my grandchildren to translate?

Still building Babel?

Fight the machine? (1)

Fight the machine? (2)

Tuesday, 15 January 2013

Terminology for parts of a city

Texts about towns and cities can be tricky to translate. One thorny problem which arises again and again is how to translate the terms used for parts of the city. Municipalities are often broken down into smaller parts. Sometimes these smaller parts have an administrative function, sometimes they arise from social or historical traditions. The best way to research the terminology of the parts of towns or cities is to look at actual examples. However, the terms used in my two languages (German and English) turn out to be rather confusing and inconsistent.

Terms used in German

The basic terms in German are “Bezirk”, “Stadtteil”, “Stadtbezirk”, “Ortsteil” etc.

I live in Berlin, and here the term “Bezirk” is used with a strictly defined meaning – it denotes an administrative urban district with its own elected parliament and its own administrative structure. There are 12 of these “Bezirke”. My “Bezirk” is called Spandau, which is on the western edge of Berlin and is itself broken down into 9 formally defined sub-districts, known as “Ortsteile”. The best-known “Ortsteile” are probably Kladow, Gatow and Siemensstadt, closely followed by the area where I live, Staaken. But there are also a number of smaller areas with locally familiar names such as Klosterfelde, Altstadt, Neustadt, Wasserstadt, Waldsiedlung, Pichelsdorf. These are referred to by terms such as “Gebiet”, “Ortsteil”, “Ortslage”, “Quartier”, “Kiez”.

What about other towns and cities in Germany? In Mainz there are 15 defined “Stadtteile”, which are referred to as “Ortsbezirke” in administrative texts. The officially defined structure in Stuttgart is rather more complicated, with 23 “Stadtbezirke”, 152 “Stadtteile” and 318 “Stadtviertel”. Munich has 25 official “Stadtbezirke”, but Wikipedia lists many informally used local names for smaller areas, which it refers to as “Stadtteile”, “Quartiere” and “Siedlungen”.

Other German-speaking countries have a similarly broad range of terms. For example, the larger urban districts in Zürich are the 12 “Stadtkreise” or “Kreise”, each of which is made up of 2-4 “Quartiere”. Basel (Basle) has 19 official residential districts called “Quartiere”. Geneva has 4 “Stadtteile”, each of which is sub-divided into “Quartiere”. Vienna has 23 “Bezirke”, which the locals often refer to by number rather than by name, and which are made up of “Bezirksteile” and smaller areas known as “Grätzl”.

The list of terms for parts of cities in German is therefore long: Bezirk, Ortsteil, Gebiet, Ortslage, Quartier, Kiez, Stadtteil, Ortsbezirk, Stadtbezirk, Stadtviertel, Siedlung, Stadtkreis, Kreis, Grätzl – and this list is certainly not exhaustive.

Terms used in English

In my home city of Coventry (UK), the parts of the city are mainly referred to as “suburbs” – even in central parts of the city and without distinction in terms of size. There are also some smaller units called “wards”. However, the suburbs do not appear to play any administrative role in the government of the city.

Just a few miles to the north-west, in Birmingham, the terminology is more varied, including terms such as “metropolitan borough”, “formal district”, “council constituency”, “ward” and “suburb”. In London I found references for terms such as “borough”, “urban district”, “ward”, “suburb”, “neighbourhood”, “local area”, “inner London” and “outer London”.

Other English-speaking countries also present a stunning variety of terms. New York has five formally defined “boroughs” (sometimes spelled “boro”). They are broken up into “neighborhoods”. The term “suburb” is rather emotional, and many New York residents are adamant that suburbs are only found outside the five boroughs. San Francisco has “districts”, “quadrants”, “neighborhoods” and many informally named smaller areas.

The English terms listed here, then, are suburb, ward, borough, boro, metropolitan borough, district, urban district, formal district, neighbourhood, neighborhood, local area, inner, outer, quadrant – and again, this list is far from exhaustive. Further research in other towns and cities and other English-speaking countries is sure to turn up many more examples.

Help! What can I do in my text?

This variety of terms in both languages means first of all that there is no absolute right answer for any terminology question. Perhaps I could suggest a provisional sub-division into primary, secondary and informal parts of the town or city, although some of the terms will overlap, and many distinctions are likely to be relative.

Primary sub-divisions:

German: Bezirk, Stadtbezirk, Ortsbezirk, Stadtteil, Stadtkreis

English: borough, boro, urban district, formal district, inner/outer

Secondary sub-divisions:

German: Ortsteil, Gebiet, Ortslage, Quartier, Kiez

English: district, neighbourhood, neighborhood, local area, suburb

Informal areas:

German: Quartier, Kiez, Siedlung, Viertel, Grätzl

English: quadrant, ward, suburb, local area, residential district, residential estate, housing area

Scratching the surface

I realise that these terms do not cover all that can be said about urban locations. For example, how are the German “City” and “Innenstadt” linked, and how closely do they correlate with the “city centre”, “inner city” or “central business district”? How do we treat terms such as “Stadtrand” and “Randlagen”, and what exactly are “Mittelzentren”? The list of open questions could go on and on, and perhaps I will come back to some of these terms. But hey, I haven’t managed a blog post for about 9 months, and this first venture back into “active service” has to end somewhere, doesn’t it?
