Welsh Linux pioneer launches new online dictionary

The internet is often criticized for spreading a lowest-common-denominator dialect of English throughout the world, to the detriment of local languages. But it also lets speakers of other languages use and develop them, even when the number of speakers in a particular geographical area is quite small. It also enables speakers of related languages to collaborate in ways that would previously have been unrealistic, as in the projects led from a far-flung corner of Wales by Kevin Donnelly.

Donnelly, originally from Northern Ireland, has a doctorate in African languages and now lives in Llanfairpwll, Anglesey, where he works as a part-time web and Linux consultant. He is well known in Wales and abroad for establishing, in February 2003, Kywaith Kyfieithu, a global volunteer effort to translate the KDE desktop for Linux into Welsh.

His latest project, Eurfa, a free online Welsh dictionary, is a solo effort that took nine months to reach its first release. Now that it’s up and running, he hopes people will contribute wordlists and help develop and refine the dictionary further. Donnelly describes it as “a stepping-stone to the holy grail of a free Welsh grammar-checker.”

Why is this important? Although various ‘official’ dictionaries exist, usually with web interfaces, this is the first time that the contents of a Welsh dictionary will be available to download and share under the GNU General Public License (GPL). This is also the first Welsh dictionary to be based on digital rather than printed resources - all 10,000 base words in this initial release of Eurfa have been compiled, captured and edited by Donnelly from scratch, based on the content of texts on the internet.

Most importantly, especially for learners, Eurfa is the first dictionary in any Celtic language to include inflected verb forms, along with mutated forms of both base words and verb inflections. There are over 400,000 of these secondary forms in this initial release, dwarfing the 10,000 base words.
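To give a sense of why secondary forms outnumber base words so heavily, here is a minimal Python sketch that generates the mutated forms of a single lowercase base word. The substitution rules are the standard Welsh initial consonant mutations; the function names and the overall structure are assumptions made for illustration, not a description of how Eurfa itself is built.

```python
# Standard Welsh initial consonant mutations. Digraphs come first so that
# "ll" and "rh" are matched before any single-letter rule.
SOFT = [("ll", "l"), ("rh", "r"), ("p", "b"), ("t", "d"), ("c", "g"),
        ("b", "f"), ("d", "dd"), ("g", ""), ("m", "f")]
NASAL = [("p", "mh"), ("t", "nh"), ("c", "ngh"),
         ("b", "m"), ("d", "n"), ("g", "ng")]
ASPIRATE = [("p", "ph"), ("t", "th"), ("c", "ch")]

# Initial digraphs that do not mutate but begin with a letter that would
# otherwise match a rule above (e.g. "chwarae" must not become "ghwarae").
NON_MUTATING = ("ch", "dd", "ph", "th")


def mutate(word, rules):
    """Apply one mutation type to a lowercase word; return it unchanged if no rule fits."""
    if word.startswith(NON_MUTATING):
        return word
    for initial, replacement in rules:
        if word.startswith(initial):
            return replacement + word[len(initial):]
    return word


def mutated_forms(word):
    """Return the distinct mutated forms of a lowercase base word, sorted."""
    forms = {mutate(word, rules) for rules in (SOFT, NASAL, ASPIRATE)}
    forms.discard(word)  # drop the unmutated form if a mutation type did not apply
    return sorted(forms)


print(mutated_forms("cath"))   # cat
print(mutated_forms("gardd"))  # garden
```

Each base word can yield up to three extra forms in this way, and every inflected verb form can be mutated too, which is how the count of secondary forms grows so quickly.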

Eurfa will form the basis of the Welsh spelling and grammar checker that Donnelly is working on now - a port of Professor Kevin Scannell’s Irish original, Gramadóir. This will be the first-ever GPL grammar-checker for Welsh, and an initial version should be available by late summer.

The Eurfa site also has two little demo apps - Rhymer, which produces a list of rhyming words, and Translist, which does multiple lookups on a piece of English text, and produces a “sort-of” first attempt at a translation. The latter is limited at present, but Donnelly hopes that more work on this front in the future might lead first to a Welsh-English auto-translator, and then an English-Welsh one.
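The idea of doing multiple lookups on a piece of English text can be sketched in a few lines of Python. This is only an illustration of word-by-word lookup in general: the tiny glossary, the function names and the bracket convention for unknown words are all hypothetical, and nothing here reflects how Translist is actually implemented.

```python
import re

# Hypothetical five-word glossary, for illustration only.
GLOSSARY = {"the": "y", "black": "du", "cat": "cath", "dog": "ci", "house": "tŷ"}


def lookup_words(text, glossary=GLOSSARY):
    """Look up each English word in turn; unknown words map to None."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(word, glossary.get(word)) for word in words]


def rough_translation(text, glossary=GLOSSARY):
    """Join the lookups in the original word order, bracketing unknown words."""
    return " ".join(
        welsh if welsh is not None else f"[{english}]"
        for english, welsh in lookup_words(text, glossary)
    )


print(rough_translation("The black cat sleeps"))
```

For "The black cat" this produces "y du cath", whereas correct Welsh would be "y gath ddu": the adjective follows the noun and both words mutate. That gap is why a word-by-word lookup is only a first attempt at translation.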

Source - http://www.pingwales.co.uk/2006/04/18/eurfa.html
