www.witinall.com
Your German language service

Machine translation: why you shouldn't make it too easy for the program.

8/7/2018

#10 Nominalizations – releasing the power of smothered verbs

…It is a simple truth that in most sentences you should express action through verbs, just as you do when you speak. Yet in so many sentences the verbs are smothered, all their vitality trapped beneath heavy noun phrases based on the verbs themselves. There is nothing wrong with nominalization as such – it is a useful part of the language. But overusing it tends to freeze-frame the action…. (Cutts, 1996)

This is the final post in my summer series on machine translation (in September my guest writers will be returning with lots of interesting topics), and I’d like to briefly look at nominalizations. Basically, a nominalization is a verb converted into a noun. The fun usually starts when it needs support from another verb, such as in the phrase “to give consideration to” (where “give” helps out) meaning the same as “to consider something”.

       He is interested in obtaining a quotation for…
       He expressed his interest in obtaining a quotation for…

Business and science tend to have a penchant for indirectness, verblessness and personification:  

       Die Schnelligkeit der Rückantwort und die Bereitstellung der Antworten
       [The speed of reply and the provision of answers]
       Cleaning and user maintenance should not be made [carried out] by children unless they are older than 8 years and supervised.

Psycholinguistic studies have found that nominalizations seem to pose greater processing demands than individual sentences containing verbs, mainly because the results of nominalizations tend to be complex multi-clause sentences.

       My going to the store afterwards led to Sam’s objecting most vigorously.
       The sheriff pursued the cattle-rustling cowboys.

So, let’s release the vitality of verbs trapped beneath heavy noun phrases and see what the free online translation programs think. “No small part of language’s value lies in its flexibility” (Quirk & Greenbaum) after all, so there should be scope for remediation.

       Zur Erfüllung eines Bauvertrags sind in zahlreichen Fällen Mitwirkungshandlungen des Bestellers erforderlich.

A literal translation could be:

       The fulfilling of a construction contract in many cases necessitates participatory action by the customer.

Not very nice, is it? Or (with a different focus):  

       In many cases, the customer’s participation is necessary for the fulfilling of a construction contract.

Or even less literal, less formal translations, depending on the context. We don’t really need to be so literal unless it’s a legal text. But I think the machine also felt that the nominalizations sounded a bit odd in English:












It gives us “in order to fulfil” and “the customer is required to participate”. Interesting. Not a nominalization in sight! And if we hand the translated sentence back and ask it to translate it into German, we get:

Um einen Bauvertrag zu erfüllen, ist der Kunde verpflichtet, sich in zahlreichen Fällen zu beteiligen.

The meaning has changed slightly, but still there are no nominalizations. Thank goodness.

Writing well and writing for an international audience are two sides of the same coin. Get the one right and chances are you (and your machine translation tool) will succeed in getting your message across no matter what language you are writing in. Make sure you avoid the pitfalls I talked about in this series on machine translation (punctuation, the active and the passive voice, literal translations, special vocabularies, homonymy and misspellings), and you’ll capture your audience (and save time and money to boot).

Good luck! And have a great summer!


Machine translation: why you shouldn't make it too easy for the machine.

8/6/2018

#9 Semantics, spelling and vocabulary: creating a new kind of ambiguity?

The study of language in use is a truly rewarding experience. Human discourse is a treasure trove of variation and creativity, and languages like English and German come in a number of social and regional varieties used for a multitude of interaction purposes. Often, their most distinguishing characteristic is their lexicon, their vocabularies. Let’s look at what free online translation tools know about special purpose vocabularies. Here’s an example from the language of science and engineering:

For the purposes of this part of ISO 9117, a single coating or multi-coat system of paint or varnish or related material is considered to be through-dry when a specified gauze under specified pressure, torsion and time does not damage the film.

Für die Zwecke dieses Teils von ISO 9117 wird eine einzelne Beschichtung oder ein Mehrschichtsystem aus Farbe oder Lack oder einem verwandten Material als durchtränkt angesehen, wenn eine spezifische Gaze unter spezifischem Druck, Torsion und Zeit den Film nicht beschädigt.

Even if you don’t know what “through-dry” means, I think you would be able to hazard a guess, wouldn’t you? It’s something to do with being dry. The German translation provided by the machine, however, means “saturated, impregnated” (i.e. quite the opposite of being dry)! The correct term is “durchgetrocknet” (or “durchgehärtet”), a documented and standardised special purpose term (in chemistry and related fields). So, what went wrong here?

Machines need databases of word lists to tap into when converting sentences from one language to another. Glossaries of special purpose vocabularies and annotated lists of technical and scientific terms, for example, are fed into these databases to provide the raw material for the computer to access at high speed during the conversion process.
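The role of such a glossary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular translation program works: the `GLOSSARY` structure and the `check_terminology` function are hypothetical names, and the only glossary entry is the “through-dry” term pair from the example above.

```python
# Hypothetical mini-glossary: English term -> accepted German equivalents.
# The two German terms are the ones named in this post; the data structure
# itself is an assumption made for this sketch.
GLOSSARY = {
    "through-dry": ["durchgetrocknet", "durchgehärtet"],
}


def check_terminology(source, target, glossary=GLOSSARY):
    """Return glossary terms that occur in the source text but whose
    accepted equivalents are all missing from the translation."""
    src = source.lower()
    tgt = target.lower()
    problems = []
    for term, equivalents in glossary.items():
        if term in src and not any(e in tgt for e in equivalents):
            problems.append(term)
    return problems


# Excerpts from the example sentence and its machine translation.
source = "a single coating is considered to be through-dry"
machine_output = "eine einzelne Beschichtung wird als durchtränkt angesehen"

print(check_terminology(source, machine_output))
```

With these excerpts the check reports “through-dry”, because neither accepted German term appears in the machine output. A check like this only tells you that a known term went missing; it cannot tell you what the unexpected word actually means.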

Let’s look at an example of social variation. We had one in an earlier post – the “dings” used by DIY enthusiasts. Here’s another example from a different field, illustrating the power of creativity:
​
       One beef with the BMI is that burgers cannot easily be traded across borders.

BMI in this sentence stands for the “Big Mac Index” produced by The Economist, and the author uses the phrase “to have a beef with someone” (have a complaint) partly for comic effect, as “beef” belongs to the same lexical field as “burger”. Unfortunately, the machine doesn’t understand the context (or the joke) and gives us the meat (“Rindfleisch”) instead:
​
       Ein Rindfleisch mit dem BMI ist, dass Burger nicht einfach über Grenzen hinweg gehandelt werden können.

Well, at least we still have something to laugh about.

Similarly, the slang term “binge”, here used in the sentence starting with “You can binge on…”, presents a problem:

Unfortunately, the machine is way off track with “auf die Beine stellen”, which means “to organise/mount something” (e.g. an exhibition). If you have a clue as to why this is, please write to me. Maybe the machine is short-sighted and needs glasses, which is the only explanation I could find for the following example:

[Er] nutzt die Daten unserer Newsletterempfänger jedoch nicht, um diese selbst anzuschreiben oder an Dritte weiterzugeben.

However, [he] does not use the data of our newsletter recipients to write them down or pass them on to third parties.

The verb “anschreiben” has various meanings (both in standard and colloquial usage, as well as in combination with the verb “lassen”), but is here clearly used to mean “to write to somebody”. The context makes this clear. But do you see how the pronoun “diese” – “them” is misinterpreted, missing the coreference with the newsletter recipients? That is the reason for this semantic error.  

One last example:

       They stand by him when an attempt is made on his life by a gang of rockers who were paid to kill him.

       Sie stehen ihm bei, als eine Rockerbande, die dafür bezahlt wurden, ihn zu töten, versucht, sein Leben zu retten.

The phrase “attempt is made on his life” has a relatively low frequency in the English language (less than 0.01 occurrences per 1 million words, cf. Sketch Engine), but it is not ambiguous in any way. It means to try and kill somebody, not to save somebody’s life, as the machine would have us believe. It seems that machines still have a lot of learning to do; they are “inexperienced”.

Would you like to share some examples or discuss the topic? Write to me below or on Facebook. I look forward to hearing from you!


Machine translation: why you shouldn't make it too easy for the machine.

8/6/2018

#8 Semantics, spelling and vocabulary: creating a new kind of ambiguity?

Hi again, and welcome to part 8 of the series on machine translation and plain English. Let’s look at some “lazy solutions” for simple problems.

Do you know what the English word “snap” means? It can be a verb or a noun and it can have several meanings. It’s what linguists call a homonym, a word that has the same form (and sound) as another word with a different meaning. In the following sentence the word “snap” refers to the audible cracking sound the component part makes when it is slid into place:

       You will hear a firm snap when the earcup cushion is properly in place.

In German the word would be “Klicken”. In the given context (in connection with the verb “hear”) the word “Einrasten” (also “Einschnappen”) is acceptable, as one free online translation program suggests:

       Sie hören ein festes Einrasten, wenn das Hörmuschelpolster richtig sitzt.

(Note: “Polster” is the Austrian German word for German “Kissen”.) Another program, however, suggested this:

“Schnappschuss” is the sense of the word snap used when talking about photography: taking a snapshot. So, this is a case of the computer offering the wrong sense for homonyms in context. Not what I call helpful.

We looked at a similar situation in an earlier post. Remember the “match head” flickering? The computer thought that “match” referred to a game and translated it to the German word “Spiel”:
​
       The flame on the match head danced and jumped in her fingers…
       Die Flamme auf dem Spiel Kopf tanzte und sprang in den Fingern…

This is a very obvious example, and there are many other cases where it’s not immediately apparent that the semantics have changed:

On your initial visit to the chiropractor, he or she will take a thorough case history and give you a full physical examination. Your blood pressure may be tested and blood or urine specimens sent for analysis. The practitioner will frequently feel the spinal column to reveal where movement is restricted or excessive, as well as possibly testing mobility in other joints. X-rays are taken, if necessary.

In this example the reader is being told what to expect from the visit and the examination. The machine translated the sentence beginning “The practitioner will frequently feel…” and the one after it as:

…Der Behandler wird häufig spüren, dass die Wirbelsäule anzeigt, wo die Bewegung eingeschränkt oder übermäßig ist, sowie möglicherweise die Beweglichkeit in anderen Gelenken testen. Röntgenstrahlen werden, wenn nötig, genommen.

This conveys the “perceive by touch” meaning of “feel”, rather than the “examine or search by touch” meaning intended and indicated by the immediate sentence context, the co-text. If the machine had read the context correctly, it would not have made this error. Nor would it have produced the rather confusing (and co-referentially faulty) version of the following sentence, where “should look at it” is rendered as “darauf achten” (take care that…, make sure that…) and the translation of “it” in “If it does fail” does not refer to the mower but to the manufacturing and testing:

The mower has been carefully manufactured and tested. If it does fail, an authorised customer service agent for our products should look at it.

Der Rasenmäher wurde sorgfältig hergestellt und getestet. Sollte dies fehlschlagen, sollte ein autorisierter Kundendienstmitarbeiter für unsere Produkte darauf achten.

Readers (and human translators) rely on the context to help them understand the meaning of a text. Machines do, too, but sometimes they get side-tracked by unfamiliar constructions or constellations and come up with sentences like this one:

Bent under hoods of raincoats or with plastic bags on his head, some with blankets over his shoulders or over his heads, some carried bags, others pulled suitcases, the windscreen wipers beat rhythmically back and forth, like hands who wanted to blur the picture, wiping away, he heard the navigation device: “Please turn around if possible!”.

Confused? (And no, it’s not science fiction. The text is not about people with more than one head! It describes a group of refugees as seen by the driver in the car and should read “…on their heads” etc.) Luckily, if you know German, you would notice this kind of error quite quickly. The situation is slightly different for the following example:

Sind die Daten für die Erfüllung vertraglicher oder gesetzlicher Pflichten nicht mehr erforderlich, werden diese regelmäßig gelöscht.

If the data are no longer required for the fulfilment of contractual or statutory obligations, they are regularly updated.

If your data are being “gelöscht”, they are being deleted. Updating may include deleting, but I do think that there is a difference between these two actions, and seen from a legal perspective, this could certainly be an issue.   

Let’s briefly look at misspellings. Typos. Letters switching position. A wrong spelling or the autofill on overdrive. They happen, don’t they? And they’re not really an issue if your translator is a human being. But what would happen if you were to enter sentences like the following into a free online translation program?

Despite his looks, the man had an aurora of calmness and competence about him.
He couldn’t help his eyes wondering up her shapely legs.

Can you guess? What do you think?
See you next time!


Machine translation: why you shouldn't make it too easy for the machine.

8/4/2018

#7 Semantics, spelling and vocabulary: creating a new kind of ambiguity?

As a rule, if you want to make life easy for your readers (and the machine), follow the recommendation of the plain English movement, which states that you should use words that are appropriate for the reader. This means that word choice depends on the given textual and situational context. If you’re writing for DIY enthusiasts, for example, then you will probably use the terms, the vocabulary, familiar to members of this group of people:

Fill all dings and deep gouges with quality wood putty, if possible. Use a flexible putty knife and fill in all the areas needing attention.

This was translated by the computer as:

Füllen Sie nach Möglichkeit alle Dings und tiefen Hohlmeißel mit hochwertigem Holzkitt. Verwenden Sie ein flexibles Spachtel und füllen Sie alle Bereiche aus, die Aufmerksamkeit erfordern.

There are a number of issues with this translation, the most obvious one perhaps being what in my last post I called the “make-do” and “inexperience” strategies. The American English word “ding” is left untranslated (“Dings”), probably because it is usually categorised as a slang term (meaning “a small dent”) and isn’t in the computer database. Then there is the translation of “deep gouges”: this refers to a groove in the wood, but what the computer selected (“Hohlmeißel”, chisel) is actually the tool (!), which is, of course, totally wrong in the sentence context. So, it’s 0 out of 10 for this sentence. I think it’s safe to say that a human translator would NEVER have made this mistake, simply because the text is about filling in dents in wood, not the tools making the dents. (Even I gathered that much.) A human translator not familiar with a subject area will research the field to find the correct terms plus suitable alternatives depending on the readership.
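An untranslated word like “Dings” is one of the few errors that can be spotted mechanically, because it appears unchanged in both texts. The following Python sketch shows the idea. The function name `untranslated_words` and the minimum word length of four letters are assumptions made for this illustration, and the approach is a rough hint only, since proper names and genuine loan words would be flagged too.

```python
import re


def untranslated_words(source, target, min_length=4):
    """Return the lowercased words that appear unchanged in both texts.

    Very short words are skipped (min_length is an arbitrary choice)
    to reduce accidental matches between the two languages.
    """
    def tokenize(text):
        # Runs of letters only; digits, underscores and punctuation are dropped.
        return set(re.findall(r"[^\W\d_]+", text.lower()))

    shared = tokenize(source) & tokenize(target)
    return sorted(word for word in shared if len(word) >= min_length)


source = "Fill all dings and deep gouges with quality wood putty, if possible."
machine_output = (
    "Füllen Sie nach Möglichkeit alle Dings und tiefen Hohlmeißel "
    "mit hochwertigem Holzkitt."
)

print(untranslated_words(source, machine_output))  # prints ['dings']
```

Note that the second error in this sentence, “Hohlmeißel” for “gouges”, passes such a check unnoticed: the word has been translated, just into the wrong sense.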

Next, we have the phrase “a flexible putty knife”, translated as “ein flexibles Spachtel”. This means that the basic form of “Spachtel” would be “das Spachtel”, a neuter gender noun. Except it’s not. “Spachtel” is masculine in most parts of Germany (“der Spachtel”) and feminine in Austria (“die Spachtel”). Who put the wrong form in the computer database?

Let’s look at another case of “I-don’t-know-so-I’ll-just-use-the-English-term”:

In this example the word “scrim” is left untranslated (“Scrim”) in the first, but translated as “Schreiber” in the second sentence. Both are wrong. However, what saves the day a little (but really only a tiny bit) is the fact that the first sentence of the text actually tells you what the scrim is (“…the black cloth inside the earcup”) and the translation reproduces this. But then the second sentence suddenly talks about a “Schreiber”. What? Why, for goodness’ sake? Are you feeling confused? I don’t blame you. My suggestion to the computer program in this instance would have been to leave both instances of “scrim” untranslated (and wait for scrim to become a loan word, maybe, who knows).

When the computer leaves words untranslated this can have either of two consequences:

1.      The reader will not be able to understand the text. Unfamiliar words can be looked up, but if this is not an option, comprehension will suffer.
2.      The unfamiliar word will be accepted by the reader and interpreted in a way which may or may not conform to the original, thus adding or removing parts of the original meaning.

To illustrate the second point, let’s look at the following sentence:

He turned and moved away swiftly, his slight limp masked almost completely by the use of his hawk-headed stick.

Er drehte sich schnell um und bewegte sich weg, sein leichtes Hinken verdeckt fast vollständig durch den Gebrauch seines Falkenkopfes.

I don’t want to go into detail here and discuss what’s wrong with the whole sentence but concentrate instead on the word “hawk-headed stick”. Did you form an image in your mind when you read the description of this ornamented walking stick? I bet you did. Reading the German sentence, however, entails having to make a mental leap to realise that it is in fact a walking stick that is being talked about. Although there are some cases of figurative language use (metonymy, pars pro toto) in German where an attribute replaces the whole (which would mean that a “hawk head” would stand for “a walking stick with the handle shaped like a hawk head”), this is not the case for “Falkenkopf” (yet).

It’s perhaps laudable that computers are trying to fill the lexical gaps in our languages with coinages of their own to create so-called neologisms, but this is not something you want to happen when you’re trying to reproduce your original text and meaning. The danger is that meaning(s) will be lost or, worse still, incompatible meaning(s) added. I still remember the shock I felt when I received a translation of a story in which someone is cleaning out their room using black plastic bags (bin liners) and these were rendered as black body bags! Not a very nice connotation to be confronted with! (It was around Halloween; maybe someone had gotten distracted.)

So, this post has been about the kind of “make-do” translation free machine translators provide us with. Next time I’ll talk a little bit more about homonyms and also about misspellings and the “lazy solutions” on offer for those. If you have any examples of your own that you would like to present to us, go ahead and use the comment function below. I’d love to hear from you!

Before you go, here is another example (in German this time) of why some things are better left untranslated:

Zunächst hatte die Eurostat brav geliefert. Die Antwort war ausführlich, gespickt mit Zahlen, aber nicht hilfreich.
Statisten! Hatte Bohumil achselzuckend auf Deutsch zu Martin gesagt.
Du meinst: Statistiker!

Ja.

Jokes and word-play are notoriously difficult to render in another language. In the example above, Bohumil uses the word “Statist” (by mistake?) when he means “Statistiker”, presumably because it sounds similar. Like a malapropism it comes across as funny. If you had to translate this correctly into English, the lexemes would be “extra” and “statistician”. You would lose the similar sounds and the humorous effect. An alternative could be, perhaps, to leave those two words untranslated and indicate this using quotation marks. What do you think?


Machine translation: why you shouldn't make it too easy for the machine

8/3/2018

#6 Semantics, spelling and vocabulary: creating a new kind of ambiguity?

Continuing the series on machine translation and plain English, I’d now like to take you on a tour around the wonderful world of word meanings. I’ll talk a little bit about what you need to watch out for when you hand your text to a machine to get a free translation meaning the same thing as the text you put in.

Actually, I had quite a bit of fun with this instalment. Not only because the errors really are legion and easy to come by, but also because some are quite funny. In fact, some are droll, some are infuriating, and the majority seems to be so well hidden in plain sight that it’s really no surprise one tends to overlook them. I had to come up with a selection for this post, so below I’ve tried to reproduce the best examples for you. Here we go…

Broadly speaking, there are four categories of machine output you will come across time and again, which I’ve decided to call

a)     “literal” – a word-for-word translation gone wrong,
b)     “make-dos” – words are left untranslated,
c)     “lazy solutions” – offering the wrong sense for homonyms and misspellings in context, and
d)     “inexperienced” – when words in restricted use are simply not available, e.g. slang words.

Let’s start with the first one and some examples. (In each case the sentence represents a genuine source language sentence and the word with the asterisk is what one (or all) of the more popular free online translation programs had to offer for the underlined word.)

Mobile phone case with precise camera cutouts and button covers… --> *Kameraausschnitte

The scrim is the black cloth inside the earcup and protects the earcup components. --> *Ohrbecher, ohrbecher

Eine Begrenzung aus Maschendraht sieht ja oft ein wenig schlaff und ausgebeult aus, ein Jägerzaun wirkt rasch verblasst nostalgisch und ist deshalb so gut wie ausgestorben. Dafür umziehen nun vielerorts sogenannte Stabmatten die Gärten, die mit ihren strengen Stahlgittern ebenso gut im Gewerbegebiet eingesetzt werden können und zur Einfassung von Sportanlagen bestens geeignet sind. --> *hunting fence… bar mats move around the garden…

Finally, your authority will be weakened. Your old buddies – the ones with weird hair but brilliant ideas and total loyalty – will have to be hidden in the basement. --> *…müssen im Keller versteckt sein.

The “camera cutouts” in the first sentence are undisputedly those little holes in the phone cases that are left for the camera to see through, right? Yes, well, in German these are called “Aussparungen” or similar, not “Kameraausschnitte” or “Kamera-Ausschnitte”, which is a literal, word-for-word translation but actually refers to the detail or frame selected for a picture (for example by zooming in). You will find this translation on some websites, but it simply isn’t German (yet).

Similarly, an “earcup” or “ear cup” on a headphone refers to the part (on so-called earcup or over-the-head earphones) that looks like a cup and goes around or lies on your ears. This is called a “Hörmuschel” (or, more precisely, a Kopfhörermuschel) in German, lit. a “hearing shell”. “Ohrbecher” is a word-for-word translation but wrong.

The next sentence, the one in German, mentions a type of wire mesh fence right at the beginning and then the “Jägerzaun”, which is a very popular type of rustic wooden (!) lattice concertina fence (see the picture below this post). The free translation program probably got sidetracked by the mention of the wire fence developing a slack and looking baggy in the first clause and so decided that a “hunting fence” (i.e. an almost literal translation of “Jägerzaun”, actually a “hunters’/hunter’s fence”) must be the right word to use in this context. (Note: the sentence structure makes it clear that two types of fence are being talked about. A human translator would not have been sidetracked in this way, since he or she would have noticed the additional hint of the “verblasst nostalgisch” (nostalgically faded), which indicates a wooden fence.) “Jägerzäune” were traditionally used to keep hunting quarry out of plots and gardens, but they were never used to hunt with. The word “umziehen” (surround, enclose) in this example, wrongly translated as “move around”, is a bit unusual, I admit. A more common term would be “umgeben”. But what the program did is a trifle too clever: it thought that maybe “umziehen” (which actually has another sense, namely “relocating, moving house”, etc.) is related to “herumziehen” (wander about, move around/about). Yes, well…

Of course, for some of these examples most programs offer alternative lexical items if you hover or mouse over the word that you consider inadequate, so the information is there for you to find. But how do you know that you are supposed to change the word if you don’t speak the language?

Finally, the sentence starting with “Finally,…”: probably a case of misunderstanding. What the English phrase means is that they will have to be hidden in the basement in the future. The German translation says that you can assume that they are hidden in the cellar now. Not wrong, but wrong in the given context.

Fun, isn’t it? Next, I’ll look at words being left untranslated (and why this is actually a good idea sometimes). In the meantime, let’s hear from you!

Exercise: find the errors

8/3/2018

