Wednesday, July 28, 2010

GIGO (Garbage In, Garbage Out)

As with software, if the data fed into the computer program (read source text) is bad, the program's results (read translation) will also be bad. I thought of this when reading Ultan's recent post "Information Quality, MT and UX" on Multilingual Computing's Blogos blog. Ultan notes that quality information not only makes machine translation easier, but is simply better information that is more easily understood by both humans and machines.



So what is quality information? I think quality information is consistent, concise, and well-written text with an audience-appropriate level of technical terminology. In this context, well-written refers to grammatically correct, clear structures free of spelling and punctuation errors. Clearly, the amount and complexity of subject-specific terminology used depends on the text's end users. Installation instructions for consumers will need to be practically jargon-free (and contain explanations of any unavoidable terms), while specifications for computer programmers can contain quite a few acronyms and still be easily understood.



While this last statement is generally true, I have had to deal with source text that was replete with abbreviations specific to a particular company, without having access to an internal list of these acronyms (if such a list even existed). Since the assignment was the usual rush job via a translation agency in another time zone, there was no way to ask for and receive such a list in a timely manner. I did my best guessing the meaning of many of the abbreviations from context and annotated the rest with translator's notes.



I was initially surprised at how frequently source text -- even fairly lengthy whitepapers and similar types of text -- appears not to have been proofread, let alone copy-edited. After reading a couple of books on technical and business matters recently, I am no longer surprised. Even books being printed and sold in bookstores don't seem to undergo much of a quality-assurance process any more. A case in point is Tamar Weinberg's "The New Community Rules: Marketing on the Social Web", which I am in the process of reviewing for an upcoming issue of the Society for Technical Communication's magazine Intercom. It contains quite a few instances where sentences seem to have been hurriedly revised, with fragments of the sentence's previous incarnation left behind or too much taken out. So if books aren't proofread any more, what can we expect from internal industry papers or instructions?



Such poorly written source text not only hampers the flow of reading, it often also adds ambiguity to the text. After all, if there are two conjunctions when only one should be present, which of the two did the author intend to use? If I pick the wrong one, the translation could be completely misleading. But never having seen the machine for which I am translating the instructions, how would I know whether the correct conjunction here was "and" or "or"?



Yes, we do need quality assurance for translations. But we also need quality assurance for the source text -- not only for the translator's sake, but also for the reader's sake. As programmers are fond of saying: Garbage In, Garbage Out -- GIGO.

10 comments:

  1. "A case in point is Tamar Weinberg's "The New Community Rules: Marketing on the Social Web", which I am in the process of reviewing for an upcoming issue of the Society for Technical Communication's magazine Intercom, which contains quite a few instances where sentences seem to have been hurridly revised and fragments of the sentence's previous incarnation left behind or too much taken out."

    Hi Barbara, can you be more specific? I have not yet seen any errata submitted about this, but the submission guidelines for any errors are provided in the introduction of the book.

    FWIW, I'm a wordy writer. I've reviewed the manuscript that I spent countless hours on a billion times. The book was copy edited and I even reviewed the content after the content was reviewed. I'm surprised that you still feel that the writing was lacking, so any insights you can provide would be great.

    You can contact me via the contact form on my website -- I can't find a way to provide you with my email address confidentially in the commenting options below.

    Thanks!

  2. So true, so true.

    I've had the unfortunate chance to be involved in translation projects where the hiring company had a horrible translation memory from a previous translator. They wanted me to use the old TM to bring the costs down. I warned them a number of times that the result would follow GIGO. They still wanted to use it. I warned them again, and in the end I had to use it. In retrospect I should have simply told them "no".
    They didn't like the final result and didn't want to pay. Lesson learned.

  3. Thank you for mentioning me in your blog. I think it's worth considering a far more dangerous situation: one where Garbage In becomes Gospel Out. The consumer of this information is under the impression it will help them, when it won't, and may even do the opposite. Worse still, this toxic information, along with the translation memory or other resource that contains it, is then distributed freely for others to use. The message is clear: Clean Your Data!

    Ultan

  4. Tamar,
    Just one of a number of examples (p. 302): "UK agency Immediate Future performed studied the involvement of big brands in social media." Clearly, the sentence is either "UK agency ... performed studies of the involvement ..." or "UK agency ... studied the involvement of ...", but not both. On the next page, there is the sentence "The reach of each individual sites only extends so far ..." An individual site is singular, not plural, so the sentence should read "The reach of each individual site only extends so far ..." For my other comments about your book please read my upcoming review.
    All the best,
    Barbara

  5. Thank you for a very well told post.

    Just a few corrections, considering this is about good/bad writing. I assume spelling correctly does come into this, too? Because you have a few typos in there:

    "I was initially suprised (SURPRISED) at how frequently source text -- even fairly lengthy whitepapers and similar types of text -- appears not to have been proofread, let alone copy-edited."

    "After reading a couple of books on technical and business matters recently, I am no longer suprised (SURPRISED)."

    "...which contains quite a few instances where sentences seem to have been hurridly (HURRIEDLY) revised and fragments of the sentence's previous incarnation..."

    Nothing major, we all do that when we become passionate about something.. just watch me blogging without the letter "k" which tends to stick on my keyboard :-)

    You should come and join the fraternity on the NING Watercooler http://translationandlanguage.ning.com

  6. Good points, Barbara, and these are ones that haven't actually been put on my radar. I'm not sure if those weren't caught by the copy-editor or if it was another type of editorial oversight, but I do encourage you to submit errata on those issues as they are totally legitimate. If you spot any others, I know I'd appreciate it if you can let me know.

    (I feel that the first issue you pointed out was just someone approving all changes in a heavily-edited Word document and missing the unnecessary word. I feel that this was also replicated on page 303, where a change was approved before the edits were saved properly. I'm so sorry about that.)

    I know I read my book a billion times (or so it seems), and it's hard to proofread one's own work. I understand that I probably missed others too when rereading the copy-editor's copy-edits!

    Thanks for the notes, and apologies for sacrificing readability. I happen to be a stickler for that myself so I totally respect your concern.

  7. Linguanerd,
    Point well taken. Thanks for the proofing. I have corrected the issues you found. I do, however, hold printed books and text that is to be disseminated globally (i.e., that needs to be translated) to a higher standard than a quick blog (or Twitter, etc.) post.
    Barbara

