Rhodri Glyn Thomas AM outlines a new machine translation system for the Welsh Language.

February 21st, 2014

Today we mark UNESCO International Mother Language Day, which aims to promote linguistic and cultural diversity and multilingualism across the world – a fitting occasion for the National Assembly for Wales in collaboration with Microsoft to launch a machine translation system that puts the Welsh language firmly on the world stage.

Welsh now joins the growing number of lesser-used languages, including Catalan, Latvian and Maltese, alongside such major languages as Chinese, Spanish and Russian, to be supported by Microsoft Translator and Bing Translator.

The Welsh language is one of the oldest living languages in Europe and it continues to thrive today. It is a significant part of civic and cultural life in Wales, being spoken by more than half a million people in Wales as well as a significant number across the globe, and so it is important for it to be a part of the evolving technological landscape of communication.

Welsh is, of course, one of the official languages of Wales.

In 2012, the National Assembly passed the Official Languages Act, which placed a statutory duty on the Assembly Commission to treat the Welsh and English languages on the basis of equality.

The Assembly produces a large amount of bilingual material, including Records of all Assembly Plenary and committee meetings, and we have been able to use this data as the basis for Microsoft Translator’s Welsh language model.

Working in partnership with Microsoft has allowed technology experts and Welsh language users to work together to create a machine translation system that will help deliver exemplary bilingual services, which is a key commitment of ours.

It is a significant achievement, and a great step forward in bilingual working, that we have helped with the development of such a powerful translation tool.

It will:

  • provide a self-service translation tool for Assembly staff, Members and Members’ support staff to facilitate communication and working practices in their language of choice;
  • provide a learning aid for Members, support staff and Assembly staff wishing to improve their knowledge and understanding of Welsh in the workplace;
  • allow the Assembly to share our experience of delivering bilingual services with other organisations in Wales including, where appropriate, making available translation products; and
  • provide a worldwide platform for the Welsh language by offering a translation tool that will be available to any Microsoft Office user across the globe.

The quality of machine translation will not be perfect and can never be as good as a human translation. It is a useful tool that enables more people to communicate bilingually and saves time for professional translators; however, it by no means replaces the need for formal communication and documents to be translated by professional translators.

The Assembly is committed to maintaining high standards of translation. We are all too aware of the potential pitfalls of relying solely on machine translation, so as we start to use this powerful tool in the Assembly, we will be raising awareness and providing guidance to make sure that staff make the best use of it.

By working collaboratively with the language community and bilingual organisations, we can feed back corrections and more data to the system, thereby continuously improving the quality of the translations produced, so that people throughout the world can use it with confidence.

I would like to say thank you to Microsoft for working with the National Assembly on developing this, and I also thank the Assembly Commission staff for their hard work in helping to deliver this powerful new tool.

This exciting development would also not have happened without the support of other organisations such as the Welsh Government, S4C, BBC Cymru Wales and Gweiadur by Gwerin who have helped to populate the tool with a variety of bilingual words and phrases. It demonstrates what can be achieved through working in co-operation across the Welsh public sector and beyond.

This is an exciting development in realising the National Assembly’s goal of becoming a truly bilingual law-making body. Beyond that, to see Welsh as one of the Microsoft Translator family of languages is a great leap forward and one that can only help to safeguard its future.


Rhodri Glyn Thomas is the Assembly Member for Carmarthen East and Dinefwr and the Assembly Commissioner for the Welsh language.

23 Responses to: “Assembly adopts cutting-edge translation tool”

  1. Dewi Jones says:

    In addition, other public sector organisations in Wales are adopting the CyfieithuCymru.com/TranslateWales.com cloud-based translation system developed in Wales by the Language Technologies Unit at Bangor University’s Canolfan Bedwyr.

    Bangor’s TranslateWales.com software is a new product for bilingual organizations, translation agencies and translators that’s been especially developed for optimized technological assistance in translating between Welsh and English. It includes a workflow manager, translation memories, glossaries as well as the more advanced components such as Cysill – the spelling and grammar checker – and Bangor’s own machine translation engines. TranslateWales.com enables professional translators to work more efficiently by providing machine translated text for correcting and final publication. This cuts down the work of human translators, whilst avoiding the pitfalls of pure machine translation.

    Yesterday, it was announced that components from TranslateWales.com, in particular its machine translation engines, will feature in the software used to power Wikipedia, thanks to a collaboration between Bangor University and the Wikimedia Foundation.

    See http://www.bangor.ac.uk/news/full.php.en?nid=17792&tnid=17792


  2. comeoffit says:

    This machine translation tool sounds like a wonderful bit of kit. However, I’d wager that it doesn’t come cheap! Any efficient business worth its salt will only make large investments in technology if there are some predicted long-term savings. Now of course few people would think of the Assembly as being efficient but still, there must have been some sort of cost-benefit analysis in purchasing this technology. There is a hugely well staffed Welsh translation department at the Assembly which will now have significantly less primary translation to carry out…. what will happen to them? They can’t all become proof readers for the machine translation.


  3. Emily says:

    Welsh politicians seem to love announcing partnerships with big multinational companies. But who wants to bet that “We’ve got a partnership with Microsoft” just means “we’re using a Microsoft system and providing a load of data for free”?

    I just created a Gmail account. In other words … I proudly announce a new partnership between myself and Google. Henceforth, the whole of Wales (and indeed the world) will benefit from Google incorporating my data into free services. Available to all. I’d like to demonstrate my leadership by encouraging other organizations in Wales to sign up to Gmail too. There, do I get a government-funded champagne reception in the Senedd?


  4. David Meurig Thomas says:

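    Llongyfarchiadau mawr ar ddiwrnod hanesyddol
    ‘Rwyf yn gobeithio fod y system llawer gwell na system y Beeb
    O BYDDED I’R HEN IAITH BARHAU!!!
    [In English: Many congratulations on a historic day. I hope the system is much better than the Beeb’s system. O may the old language endure!!!]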


  5. John Winterson Richards says:

    Emily makes an excellent point that applies to a lot, possibly most, of these sweetheart deals between a public sector organisation and a favoured big business. The small print ought to be scrutinised more carefully. For example, it is typical that the Welsh media printed the press release about the new Pinewood operation in Cardiff apparently without amendment and certainly without critical analysis. Is there not a single journalist left in Wales with the courage and professionalism to ask why money is being taken from taxpayers, some of them very poor, and given to multinational businesses, while indigenous Welsh businesses are ignored and neglected?


  6. John R Walker says:

    OK – let’s cut to the chase – how much taxpayers’ money has this cost so far, across all contributing cost centres, and what is the budgeted overhead for maintaining and updating this limited market system in years to come?

    Has any IP been licensed to Microsoft for reward or has it all been handed to them on a plate?


  7. Jeff Jones says:

    Last week’s Economist has an interesting article showing how many firms worldwide are adopting English as their official language. It was fascinating to read that Lufthansa had adopted English as its official language for senior managers in 2011 even though only a handful of the firm’s top 50 senior managers were not German. As the author of the article points out, “there is no real alternative (to English) as a global business language”.


  8. Ken Richards says:

    The cynicism expressed about this announcement is counterproductive insofar as technological innovation in Wales is concerned. Of course this type of technology costs. In the private sector it’s called investment in emerging technology. This is a step in the right direction for the public sector in Wales, and one hopes that there are spinoffs to the private sector, particularly small business and community organizations.

    As to quality of translation, give it time. Better to start with a decent platform then work from there. Smart investment? I would think so.


  9. Barry Phillips says:

    I would much rather see money spent on this project than on our Health Service, because Welsh is more important than lives; that’s what I have been told. And we can always carry on sending all our premature babies in North Wales to England. Well done to AMs for voting on this important issue.
    Now we need to think of how to put those in jail who speak out against all this wasted money, let them join those 15% of inmates who are veterans and suffer from post traumatic stress.


  10. Daran Hill says:

    “There is a hugely well staffed Welsh translation department at the Assembly which will now have significantly less primary translation to carry out…. what will happen to them? They can’t all become proof readers for the machine translation.”
    Seemingly nothing, as the article implies the same level of staffing.
    Great technology, right purchase, but why no consequential saving to the public purse?


  11. Alasdair MaolChrìosd says:

    There are tens of thousands of Welsh speakers in Wales, and I imagine a good proportion of them are literate in both languages, so why the problem with translation? Just ask the nearest literate Welsh speaker, better still pay them a few bob (but not extravagant fees) for their help. If the text is too obscure for the average bilingual to translate then it’s probably incomprehensible to most people whatever the language. What’s the point of converting English gobbledygook into some made-up Welsh equivalent? “Garbage in, garbage out”, as they used to say.

    And with computers in mind, wouldn’t it have been more prudent and forward-looking to have gone for an open-source solution rather than locking an important section of the nation into a devil’s compact with M$? That’s the approach taken by many small European countries, it spreads funding, innovation and benefits around. Then all the profits not being exported to the US could be diverted to the education system so that when you teach pupils Welsh they actually come out of the system able to use the language. Which of course brings us back to my first point, why the need to translate everything when everyone is supposed to have been taught Welsh?


  12. Peter Hugh Charles Davies says:

    Dear Editor
    The last time I asked for a translation of an article in Welsh in these columns, you pleaded the cost, and the fact that your institute is a charity, as a reason for not providing one. Can I now assume as a result of this article that matters have progressed and you will now meet my request to publish a translation of David Meurig Thomas’ Welsh article above?


  13. John says:

    John R Walker – as someone who attended the event, I can answer that question: nothing. Microsoft has developed it in conjunction with the Assembly’s Parliamentary Translation Unit. This is an interesting and positive development for Welsh, but as usual all some can do is bleat on about how much it costs.


  14. John says:

    Emily – nothing could be further from the truth. Creating the system was a reciprocal process involving different parties. It took the support of the translators themselves and contributions from the BBC, S4C, the Assembly Commission and the Assembly, which provided the huge corpus of already translated material and Welsh text. It also took a number of people based in Wales to evaluate and check the translations throughout the development, and linguistic consultants to assess what progress was being made. Cooperation between Microsoft, based in Seattle, and Welsh linguists, translators and users was essential; it would not have happened without that co-operation.


  15. comeoffit says:

    “This is an interesting and positive development for Welsh, but as usual all some can do is bleat on about how much it costs.”

    @John
    Absolutely not! We’re not interested in the cost! It sounds like a good investment…. if, of course, the investment eventually translates into savings. Don’t you think that sounds fair, John? Or do you not know how business works, or that public money does not grow on trees?


  16. John R Walker says:

    John

    “John R Walker – as someone who attended the event, I can answer that question: nothing. Microsoft has developed it in conjunction with the Assembly’s Parliamentary Translation Unit.”

    Great answer if it’s aimed at somebody who has just fallen off a bus – but I haven’t! Unless you can tell me when the Assembly Translation Unit and the rest of the assembled contributors from the public sector and tax-funded bodies like S4C and universities started working for nothing, using equipment not funded by the taxpayers, in buildings not funded by the taxpayers, using means of communication not funded by the taxpayers…? And so on and so forth…

    I thought not…

    You wouldn’t happen to have attended ‘the event’ using any funding from the taxpayers – would you?

    You’re right about me, and many others, bleating about the costs though ‘cos I can see a multi-million pound non-essential non-productive industry that has been allowed to build up on the back of the Welsh language – the costs of which are mostly top-sliced from the budgets of what most of us consider to be more essential services. The total cost of the Welsh language per annum is not only unknown by any statutory body it is clear that it is actively obfuscated in many cases and has been for many years. As health, education, etc. etc. goes into terminal decline those of us who regard the cost of the Welsh language as a ‘TAX’ on essential services are entitled to ask what else we could have had in exchange for the money the Welsh language industry is consuming?


  17. Emily says:

    John: At the event, they told you Microsoft in Seattle did significant work on this project. But is there any evidence for this? It seems more like the government gave them a whole load of data for free, and they’ve stuck it into their Bing Translate software as a way of selling more copies of Microsoft Word.

    If someone wants to disprove this, all they have to do is explain what significant work the Welsh linguists at Microsoft did on this project.


  18. John says:

    John R Walker – no, I was invited as a researcher on machine translation and the event was free. Microsoft funded it; they own the software and work in partnership with enterprises. You enquired as to the cost of establishing the software, not the cost of the structures that are to use it. So I am in fact right when I say it cost nothing, and you are wrong.

    As for the ‘Welsh costs money’ complaint, it is a fraction of what the Government spends on other things; not that your point was relevant since, as I said, Microsoft created the system off their own backs.


  19. John says:

    @comeoffit – Machine translation and the post-editing of its output to make it a high-quality, publishable translation has been shown to lead to increased production and therefore lower costs for companies that commission translation. Numerous studies have proven this, which is why Welsh is now making greater use of the technology.


  20. Dewi Jones says:

    Machine translation is just another tool for the human translator. Along with translation memories, it allows them to translate more and meet increasing demand. With workflow features such as those in TranslateWales.com, rather than just a translator within Microsoft Word, the organization can achieve greater efficiencies.

    An added benefit is that these technologies allow for more to be authored in Welsh or for previous collections of Welsh texts to become accessible to non-Welsh speakers (e.g. as will soon be evident at cymru1914.org – the National Library of Wales’s website on the Welsh experience of the First World War – where Bangor University’s Language Technologies Unit’s machine translation engines, along with the help of HPCWales, were used to translate 12 million words from the Welsh language newspapers of the time).

    Beyond Wales and the Welsh/English pairing, Bangor’s Language Technologies Unit’s experiences with its machine translation engine and its inclusion in TranslateWales.com are being used to benefit not only Wikipedia’s software but also, via its membership of the BSI and ISO, international efforts to establish best practice for the post-editing of machine translation output (see http://standardsdevelopment.bsigroup.com/Home/Project/201300526).


  21. John says:

    @Dewi Jones – having tried to find the system, I noticed you need to log in, so one first has to be given usage rights; is that correct? How is the system accessed? And is there anything written about its architecture and quality?


  22. Dewi Jones says:

    @John – you mentioned that you research machine translation (?) – you’re welcome to get in touch directly to know more about TranslateWales.com and our work with Welsh language machine translation. Our contact details are at: http://techiaith.bangor.ac.uk/index.php/staff/ . Diolch.


  23. ben says:

    @Emily – As one of the linguists working on the project, my role was to evaluate the machine translation output using evaluation software and to write a series of reports outlining the main grammatical and translation failings or otherwise of that output. You cannot build any statistical machine translation system without a huge corpus of already translated material, which you would know had you even googled it, and so the Assembly Commission gave them for free all the electronic corpuses of the Cofnod. What is wrong with that? There are a lot of misunderstandings here about what is actually a great development. As for the ‘just stuck in’ comment, it’s obvious you understand nothing of the computer science behind it.

