Insight NUI Galway

IRIS: English-Irish Machine Translation System

Authors:

Mihael Arcan, Caoilfhionn Lane, Eoin Ó Droighneáin, Paul Buitelaar

Publication Type:
Refereed Conference Meeting Proceeding
Abstract:
We describe IRIS, a statistical machine translation (SMT) system for translating from English into Irish and vice versa. Since Irish is considered an under-resourced language with a limited amount of machine-readable text, building a machine translation system that produces reasonable translations is rather challenging. As translation is a difficult task, current research in SMT focuses on obtaining statistics from large amounts of parallel, monolingual, or other multilingual resources. Nevertheless, we collected available English-Irish data and developed an SMT system aimed at supporting human translators and enabling cross-lingual language technology tasks.
Conference Name:
Language Resources and Evaluation Conference (LREC’16)
Digital Object Identifier (DOI):
NA
Publication Date:
25/05/2016
Conference Location:
Slovenia
Research Group:
Institution:
NUIG
Project Acknowledges:
Open access repository:
No

Entity Linking with Multiple Knowledge Bases: an Ontology Modularization Approach

Authors:
Publication Type:
Refereed Conference Meeting Proceeding
Abstract:
The recognition of entities in text is the basis for a series of applications. Synonymy and ambiguity are among the biggest challenges in identifying such entities. Both challenges are addressed by Entity Linking, the task of grounding entity mentions in textual documents to Knowledge Base entries. Entity Linking has been based on the use of single cross-domain Knowledge Bases as the source for entities. This PhD research proposes the use of multiple Knowledge Bases for Entity Linking as a way to increase the number of entities recognized in text. The problem of Entity Linking with Multiple Knowledge Bases is addressed by using textual and Knowledge Base features as contexts for Entity Linking, Ontology Modularization to select the most relevant subset of entity entries, and Collective Inference to decide the most suitable entity entry to link with each mention.
Conference Name:
International Semantic Web Conference
Proceedings:
13th International Semantic Web Conference
Digital Object Identifier (DOI):
10.1007/978-3-319-11915-1_33
Publication Date:
19/10/2014
Pages:
513-520
Conference Location:
Italy
Research Group:
Institution:
NUIG
Open access repository:
Yes

Historical Data Preservation and Interpretation Pipeline for Irish Civil Registration Records

Authors:

Oya Beyan, PJ Mealy, Dolores Grant, Rebecca Grant, Natalie Harrower, Ciara Breathnach, Sandra Collins, Stefan Decker

Publication Type:
Refereed Conference Meeting Proceeding
Abstract:
Semantic Web technologies give us the opportunity to understand today’s data-rich society and provide novel means to explore our past. Civil registration records such as birth, death, and marriage registers contain a vast amount of implicit information, which can be revealed by structuring, linking and combining that information with other datasets and bodies of knowledge. In the Irish Record Linkage (IRL) Project 1864-1913, we have developed a data preservation and interpretation pipeline supported by a dedicated semantic architecture. This three-layered pipeline is designed to capture separate concerns from the perspective of multiple disciplines such as archivistics, history and data science. In this study, our aim is to demonstrate best practices in digital archives, while facilitating innovative new methodologies in historical research. The designed pipeline is executed with a dataset of 4090 registered Irish death entries from selected areas of south Dublin City.
Conference Name:
4th International Workshop on Methods, Evaluation, Tools and Applications for the Creation and Consumption of Structured Data for the e-Society (META4eS’15)
Digital Object Identifier (DOI):
NA
Publication Date:
28/10/2015
Conference Location:
Greece
Research Group:
Institution:
NUIG
Open access repository:
No