Publications

2019 (Accepted)

  • B. Pereira, C. Robin, T. Daudert, J. P. McCrae, P. Buitelaar, and P. Mohanty, “Taxonomy Extraction for Customer Service Knowledge Base Construction,” in Proceedings of the semantics 2019, 2019 (Accepted).
    [Bibtex]
    @inproceedings{pereira2019taxonomy,
    title={{Taxonomy Extraction for Customer Service Knowledge Base Construction}},
    author="Bianca Pereira and Cécile Robin and Tobias Daudert and John P. McCrae and Paul Buitelaar and Pranab Mohanty",
    booktitle="Proceedings of the SEMANTicS 2019",
    description="Customer service agents play an important role in bridging the gap between customers' vocabulary and business terms. In a scenario where organisations are moving into semi-automatic customer service, semantic technologies with capacity to bridge this gap become a necessity. In this paper we explore the use of automatic taxonomy extraction from text as a means to reconstruct a customer-agent taxonomic vocabulary. We evaluate our proposed solution in an industry use case scenario in the financial domain and show that our approaches for automated term extraction and using in-domain training for taxonomy construction can improve the quality of automatically constructed taxonomic knowledge bases.",
    year="2019 (Accepted)"
    }

2019

  • B. Klimek, J. P. McCrae, M. Ionov, J. K. Tauber, C. Chiarcos, J. Bosque-Gil, and P. Buitelaar, “Challenges for the Representations for Morphology in Ontology Lexicons,” in Proceedings of sixth biennial conference on electronic lexicography, elex 2019, 2019-10-03 2019.
    [Bibtex]
    @inproceedings{klimek2019challenges,
    title={{Challenges for the Representations for Morphology in Ontology Lexicons}},
    author="Bettina Klimek and John P. McCrae and Maxim Ionov and James K. Tauber and Christian Chiarcos and Julia Bosque-Gil and Paul Buitelaar",
    affiliation="['Leipzig University', 'National University of Ireland Galway', 'Goethe-University Frankfurt', 'Open Greek and Latin Project', 'Goethe-University Frankfurt', 'Universidad Politécnica de Madrid', 'National University of Ireland Galway']",
    date="2019-10-03",
    open="True",
    license="cc-by-sa",
    grants="[{'id': '731015'}, {'id': '825182'}]",
    url="https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_33.pdf",
    booktitle="Proceedings of Sixth Biennial Conference on Electronic Lexicography, eLex 2019",
    description="Recent years have experienced a growing trend in the publication of language resources as Linguistic Linked Data (LLD) to enhance their discovery, reuse and the interoperability of tools that consume language data. To this aim, the OntoLex-lemon model has emerged as a de-facto standard to represent lexical data on the Web. However, traditional dictionaries contain a considerable amount of morphological information which is not straightforwardly representable as LLD within the current model. In order to fill this gap a new Morphology Module of OntoLex-lemon is currently being developed. This paper presents the results of this model as on-going work as well as the underlying challenges that emerged during the module development. Based on the MMoOn Core ontology, it aims to account for a wide range of morphological information, ranging from endings to derive whole paradigms to the decomposition and generation of lexical entries which is in compliance with other OntoLex-lemon modules and facilitates the encoding of complex morphological data in ontology lexicons.",
    year="2019"
    }
  • J. P. McCrae, C. Tiberius, A. F. Khan, I. Kernerman, T. Declerck, S. Krek, M. Monachini, and S. Ahmadi, “The ELEXIS Interface for Interoperable Lexical Resources,” in Proceedings of sixth biennial conference on electronic lexicography, elex 2019, 2019-10-03 2019.
    [Bibtex]
    @inproceedings{mccrae2019elexis,
    title={{The ELEXIS Interface for Interoperable Lexical Resources}},
    author="John P. McCrae and Carole Tiberius and Anas Fahad Khan and Ilan Kernerman and Thierry Declerck and Simon Krek and Monica Monachini and Sina Ahmadi",
    affiliation="['National University of Ireland Galway', 'Instituut voor de Nederlandse Taal', 'CNR- Istituto di Linguistica Computazionale «A. Zampolli»', 'K Dictionaries', 'Austrian Centre for Digital Humanities, Austrian Academy of Sciences', 'Jožef Stefan Institute/University of Ljubljana', 'CNR- Istituto di Linguistica Computazionale «A. Zampolli»', 'National University of Ireland Galway']",
    date="2019-10-03",
    open="True",
    license="cc-by-sa",
    grants="[{'id': '731015'}]",
    url="https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_37.pdf",
    booktitle="Proceedings of Sixth Biennial Conference on Electronic Lexicography, eLex 2019",
    description="ELEXIS is a project that aims to create a European network of lexical resources, and one of the key challenges for this is the development of an interoperable interface for different lexical resources so that further tools may improve the data. This paper describes this interface and in particular describes the five methods of entrance into the infrastructure, through retrodigitization, by conversion to TEI-Lex0, by the TEI-Lex0 format, by the OntoLex format or through the REST interface described in this paper.",
    year="2019"
    }
  • S. Ahmadi, H. Hassani, and J. P. McCrae, “Towards Electronic Lexicography for the Kurdish Language,” in Proceedings of sixth biennial conference on electronic lexicography, elex 2019, 2019-10-03 2019.
    [Bibtex]
    @inproceedings{ahmadi2019towards,
    title={{Towards Electronic Lexicography for the Kurdish Language}},
    author="Sina Ahmadi and Hossein Hassani and John P. McCrae",
    affiliation="['National University of Ireland Galway', 'University of Kurdistan Hewler', 'National University of Ireland Galway']",
    date="2019-10-03",
    open="True",
    license="cc-by-sa",
    grants="[{'id': '731015'}]",
    url="https://elex.link/elex2019/wp-content/uploads/2019/09/eLex_2019_50.pdf",
    booktitle="Proceedings of Sixth Biennial Conference on Electronic Lexicography, eLex 2019",
    description="This paper describes the development of lexicographic resources for Kurdish and provides a lexical model for this language. Kurdish is considered a less-resourced language, and currently lacks machine-readable lexicon resources. The unique potential which Linked Data and the Semantic Web offer to e-lexicography enables interoperability across lexical resources by elevating the traditional linguistic data to machine-processable semantic formats. Therefore, we present our lexicon in Ontolex-Lemon ontology as a standard model for sharing lexical information on the Semantic Web. The research covers Sorani, Kurmanji, and Hawrami dialects of Kurdish. This research suggests that although Kurdish is a less-resourced language, in terms of documented lexicons, it owns a wide range of resources, but because they are not machine-readable, they could not contribute to the language processing. The outcome of this project, which is made publicly available, assists scholars in their efforts towards making Kurdish a resource-rich language.",
    year="2019"
    }
  • [DOI] A. Doyle, J. P. McCrae, and C. Downey, “A Character-Level LSTM Network Model for Tokenizing the Old Irish text of the Würzburg Glosses on the Pauline Epistles,” in Proceedings of the celtic language technology workshop 2019, 2019-08-19 2019.
    [Bibtex]
    @inproceedings{doyle2019character,
    title={{A Character-Level LSTM Network Model for Tokenizing the Old Irish text of the Würzburg Glosses on the Pauline Epistles}},
    author="Adrian Doyle and John P. McCrae and Clodagh Downey",
    booktitle="Proceedings of the Celtic Language Technology Workshop 2019",
    description="This paper examines difficulties inherent in tokenization of Early Irish texts and demonstrates that a neural-network-based approach may provide a viable solution for historical texts which contain unconventional spacing and spelling anomalies. Guidelines for tokenizing Old Irish text are presented and the creation of a character-level LSTM network is detailed, its accuracy assessed, and efforts at optimising its performance are recorded. Based on the results of this research it is expected that a character-level LSTM model may provide a viable solution for tokenization of historical texts where the use of Scriptio Continua, or alternative spacing conventions, makes the automatic separation of tokens difficult.",
    year="2019",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-08-19",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]",
    url="https://www.aclweb.org/anthology/W19-6910.pdf",
    doi="10.18653/v1/w19-6910"
    }
  • [DOI] J. P. McCrae and A. Doyle, “Adapting Term Recognition to an Under-Resourced Language: the Case of Irish,” in Proceedings of the celtic language technology workshop 2019, 2019-08-19 2019.
    [Bibtex]
    @inproceedings{mccrae2019adapting,
    title={{Adapting Term Recognition to an Under-Resourced Language: the Case of Irish}},
    author="John P. McCrae and Adrian Doyle",
    booktitle="Proceedings of the Celtic Language Technology Workshop 2019",
    description="Automatic Term Recognition (ATR) is an important method for the summarization and analysis of large corpora, and normally requires a significant amount of linguistic input, in particular the use of part-of-speech taggers. For an under-resourced language such as Irish, the resources necessary for this may be scarce or entirely absent. We evaluate two methods for the automatic extraction of terms, based on the small part-of-speech-tagged corpora that are available for Irish and on a large terminology list, and show that both methods can produce viable term extractors. We evaluate this with a newly constructed corpus that is the first available corpus for term extraction in Irish. Our results shine some light on the challenge of adapting natural language processing systems to under-resourced scenarios.",
    year="2019",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-08-19",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]",
    url="https://www.aclweb.org/anthology/W19-6907.pdf",
    doi="10.18653/v1/w19-6907"
    }
  • [DOI] B. R. Chakravarthi, M. Arcan, and J. P. McCrae, “WordNet Gloss Translation for Under-resourced Languages using Multilingual Neural Machine Translation,” in Proceedings of the moment workshop, 2019-08-19 2019.
    [Bibtex]
    @inproceedings{chakravarthi2019wordnet,
    title={{WordNet Gloss Translation for Under-resourced Languages using Multilingual Neural Machine Translation}},
    author="Bharathi Raja Chakravarthi and Mihael Arcan and John P. McCrae",
    booktitle="Proceedings of the MomenT Workshop",
    description="In this paper, we translate the glosses in the English WordNet based on the expand approach for improving and generating wordnets with the help of multilingual neural machine translation. Neural Machine Translation (NMT) has recently been applied to many tasks in natural language processing, leading to state-of-the-art performance. However, the performance of NMT often suffers in low resource scenarios where large corpora cannot be obtained. Using training data from closely related languages has proven to be invaluable for improving performance. In this paper, we describe how we trained multilingual NMT from closely related languages utilizing phonetic transcription for Dravidian languages. We report the evaluation result of the generated wordnets sense in terms of precision. By comparing to the recently proposed approach, we show improvement in terms of precision.",
    year="2019",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-08-19",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]",
    url="https://www.aclweb.org/anthology/W19-7101.pdf",
    doi="10.18653/v1/w19-7101"
    }
  • [DOI] B. R. Chakravarthi, R. Priyadharshini, B. Stearns, A. Jayapal, S. Srivedy, M. Arcan, M. Zarrouk, and J. P. McCrae, “Multilingual Multimodal Machine Translation for Dravidian Languages utilizing Phonetic Transcription,” in Proceedings of the 2nd workshop on technologies for mt of low resource languages (loresmt 2019), 2019-08-20 2019.
    [Bibtex]
    @inproceedings{chakravarthi2019multilingual,
    title={{Multilingual Multimodal Machine Translation for Dravidian Languages utilizing Phonetic Transcription}},
    author="Bharathi Raja Chakravarthi and Ruba Priyadharshini and Bernardo Stearns and Arun Jayapal and S Srivedy and Mihael Arcan and Manel Zarrouk and John P. McCrae",
    booktitle="Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages (LoResMT 2019)",
    description="Multimodal machine translation is the task of translating from source language to target language using information from other modalities. Existing multimodal datasets have been restricted to only highly resourced languages. These datasets were collected by manual translation of English descriptions from the Flickr30K dataset. In this work, we introduce MMDravi, a Multilingual Multimodal dataset for under-resourced Dravidian languages. It comprises 30K sentences which were created utilizing several machine translation outputs. Using data from MMDravi and a phonetic transcription of the corpus, we build an MMNMT system for closely related Dravidian languages to take advantage of multilingual corpus and other modalities. We evaluate our MMNMT translations generated by the proposed approach with human annotated evaluation tests in terms of BLEU, METEOR, and TER. Relying on multilingual corpora, phonetic transcription, and image features, our approach improves the translation quality for the under-resourced languages.",
    year="2019",
    affiliation="['National University of Ireland Galway', 'Saraswathi Narayanan College', 'National University of Ireland Galway', 'Smart Insights from Conversations', 'Tamil Nadu Agricultural University', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-08-20",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]",
    url="https://www.aclweb.org/anthology/W19-6809.pdf",
    doi="10.18653/v1/w19-6809"
    }
  • J. P. McCrae, A. Rademaker, F. Bond, E. Rudnicka, and C. Fellbaum, “English WordNet 2019 – An Open-Source WordNet for English,” in Proceedings of the 10th global wordnet conference – gwc 2019, 2019-07-23 2019.
    [Bibtex]
    @inproceedings{mccrae2019english,
    title={{English WordNet 2019 -- An Open-Source WordNet for English}},
    author="John P. McCrae and Alexandre Rademaker and Francis Bond and Ewa Rudnicka and Christiane Fellbaum",
    booktitle="Proceedings of the 10th Global WordNet Conference – GWC 2019",
    description="We describe the release of a new wordnet for English based on the Princeton WordNet, but now developed under an open-source model. In particular, this version of WordNet, which we call English WordNet 2019, which has been developed by multiple people around the world through GitHub, fixes many errors in previous wordnets for English. We give some details of the changes that have been made in this version and give some perspectives about likely future changes that will be made as this project continues to evolve.",
    year="2019",
    affiliation="['National University of Ireland Galway', 'IBM Research and FGV/EMAp', 'Nanyang Technological University', 'Wroclaw University of Technology', 'Princeton University']",
    date="2019-07-23",
    grants="[{'id': '731015'}]",
    open="True",
    license="cc-by"
    }
  • [DOI] J. P. McCrae, “Identification of Adjective-Noun Neologisms using Pretrained Language Models,” in Joint workshop on multiword expressions and wordnet (mwe-wn 2019) at acl 2019, 2019-08-02 2019.
    [Bibtex]
    @inproceedings{mccrae2019identification,
    author="John P. McCrae",
    title={{Identification of Adjective-Noun Neologisms using Pretrained Language Models}},
    booktitle="Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019) at ACL 2019",
    description="Neologism detection is a key task in the construction of lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention. In this paper, we show that we can effectively identify the distinction between compositional and non-compositional adjective-noun pairs by using pretrained language models and comparing this with individual word embeddings. Our results show that the use of these models significantly improves over baseline linguistic features; however, the combination with linguistic features still further improves the results, suggesting the strength of a hybrid approach.",
    year="2019",
    affiliation="['National University of Ireland Galway']",
    date="2019-08-02",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]",
    url="https://www.aclweb.org/anthology/W19-5116.pdf",
    doi="10.18653/v1/w19-5116"
    }
  • M. Arcan, D. Torregrosa, S. Ahmadi, and J. P. McCrae, “Inferring translation candidates for multilingual dictionary generation,” in Proceedings of the 2nd translation inference across dictionaries (tiad) shared task, 2019-05-20 2019.
    [Bibtex]
    @inproceedings{arcan2019inferring,
    title={{Inferring translation candidates for multilingual dictionary generation}},
    author="Mihael Arcan and Daniel Torregrosa and Sina Ahmadi and John P. McCrae",
    booktitle="Proceedings of the 2nd Translation Inference Across Dictionaries (TIAD) Shared Task",
    description="In the widely-connected digital world, multilingual lexical resources are one of the most important resources, for natural language processing applications, including information retrieval, question answering or knowledge management. These applications benefit from the multilingual knowledge as well as from the semantic relation between the words documented in these resources. Since multilingual dictionary creation and curation is a time-consuming task, we explored the use of multi-way neural machine translation trained on corpora of languages from the same family and trained additionally with a relatively small human-validated dictionary to infer new translation candidates. Our results showed not only that new dictionary entries can be identified and extracted from the translation model, but also that the expected precision and recall of the resulting dictionary can be adjusted by using different thresholds.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-zero",
    grants="[{'id': '731015'}]",
    year="2019",
    url="http://ceur-ws.org/Vol-2493/regular1.pdf"
    }
  • D. Torregrosa, M. Arcan, S. Ahmadi, and J. P. McCrae, “TIAD 2019 Shared Task: Leveraging Knowledge Graphs with Neural Machine Translation for Automatic Multilingual Dictionary Generation,” in Proceedings of the 2nd translation inference across dictionaries (tiad) shared task, 2019-05-20 2019.
    [Bibtex]
    @inproceedings{torregrosa2019tiad,
    title={{TIAD 2019 Shared Task: Leveraging Knowledge Graphs with Neural Machine Translation for Automatic Multilingual Dictionary Generation}},
    author="Daniel Torregrosa and Mihael Arcan and Sina Ahmadi and John P. McCrae",
    booktitle="Proceedings of the 2nd Translation Inference Across Dictionaries (TIAD) Shared Task",
    description="This paper describes the different proposed approaches to the TIAD 2019 Shared Task, which consisted in the automatic discovery and generation of dictionaries leveraging multilingual knowledge bases. We present three methods based on graph analysis and neural machine translation and show that we can generate translations without parallel data.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-zero",
    grants="[{'id': '731015'}]",
    year="2019",
    url="http://ceur-ws.org/Vol-2493/regular1.pdf"
    }
  • J. P. McCrae, “TIAD Shared Task 2019: Orthonormal Explicit Topic Analysis for Translation Inference across Dictionaries,” in Proceedings of the 2nd translation inference across dictionaries (tiad) shared task, 2019-05-20 2019.
    [Bibtex]
    @inproceedings{mccrae2019tiad,
    title={{TIAD Shared Task 2019: Orthonormal Explicit Topic Analysis for Translation Inference across Dictionaries}},
    author="John P. McCrae",
    booktitle="Proceedings of the 2nd Translation Inference Across Dictionaries (TIAD) Shared Task",
    description="The task of inferring translations can be achieved by the means of comparable corpora and in this paper we apply explicit topic modelling over comparable corpora to the task of inferring translation candidates. In particular, we use the Orthonormal Explicit Topic Analysis (ONETA) model, which has been shown to be the state-of-the-art explicit topic model through its elimination of correlations between topics. The method proves highly effective at selecting translations with high precision.",
    affiliation="['National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-zero",
    grants="[{'id': '731015'}]",
    year="2019",
    url="http://ceur-ws.org/Vol-2493/system4.pdf"
    }
  • S. Ahmadi, M. Arcan, and J. McCrae, “Lexical Sense Alignment using Weighted Bipartite b-Matching,” in Proceedings of the poster track of ldk 2019, 2019-05-20 2019, pp. 12-16.
    [Bibtex]
    @inproceedings{ahmadi2019lexical,
    title={{Lexical Sense Alignment using Weighted Bipartite b-Matching}},
    author="Sina Ahmadi and Mihael Arcan and John McCrae",
    booktitle="Proceedings of the Poster Track of LDK 2019",
    description="Lexical resources are important components of natural language processing (NLP) applications providing linguistic information about the vocabulary of a language and the semantic relationships between the words. While there is an increasing number of lexical resources, particularly expert-made ones such as WordNet or FrameNet as well as collaboratively-curated ones such as Wikipedia or Wiktionary, manual construction and maintenance of such resources is a cumbersome task. This can be efficiently addressed by NLP techniques. Aligned resources have been shown to improve word, knowledge and domain coverage and increase multilingualism by creating new lexical resources such as Yago, BabelNet and ConceptNet. In addition, they can improve the performance of NLP tasks such as word sense disambiguation, semantic role tagging and semantic relations extraction.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-zero",
    grants="[{'id': '731015'}]",
    year="2019",
    url="http://ceur-ws.org/Vol-2402/paper3.pdf",
    pages="12-16"
    }
  • M. Jarrar, H. Amayreh, and J. McCrae, “Representing Arabic Lexicons in Lemon – a Preliminary Study,” in Proceedings of the poster track of ldk 2019, 2019-05-20 2019, pp. 29-33.
    [Bibtex]
    @inproceedings{jarrar2019representing,
    title={{Representing Arabic Lexicons in Lemon - a Preliminary Study}},
    author="Mustafa Jarrar and Hamzeh Amayreh and John McCrae",
    booktitle="Proceedings of the Poster Track of LDK 2019",
    description="We present our progress in representing 150 Arabic multilingual lexicons using Lemon, which we have been digitizing from scratch. These lexicons are available through a lexicographic search engine (https://ontology.birzeit.edu) that allows searching for translations, synonyms, and definitions. Representing these lexicons in Lemon will enable them to be used by ontologies and NLP applications, as well as to be interlinked with the Open Linguistic Data Cloud.",
    affiliation="['Birzeit University', 'Birzeit University', 'National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-zero",
    grants="[{'id': '731015'}]",
    year="2019",
    url="http://ceur-ws.org/Vol-2402/paper6.pdf",
    pages="29-33"
    }
  • [DOI] O. Zayed, J. P. McCrae, and P. Buitelaar, “Crowd-sourcing A High-Quality Dataset for Metaphor Identification in Tweets,” in 2nd conference on language, data and knowledge (ldk 2019), 2019-05-20 2019.
    [Bibtex]
    @inproceedings{zayed2019crowd,
    title={{Crowd-sourcing A High-Quality Dataset for Metaphor Identification in Tweets}},
    author="Omnia Zayed and John P. McCrae and Paul Buitelaar",
    booktitle="2nd Conference on Language, Data and Knowledge (LDK 2019)",
    year="2019",
    open="True",
    url="http://drops.dagstuhl.de/opus/volltexte/2019/10374/pdf/OASIcs-LDK-2019-10.pdf",
    doi="10.4230/OASIcs.LDK.2019.10",
    description="Metaphor is one of the most important elements of human communication, especially in informal settings such as social media. There have been a number of datasets created for metaphor identification, however, this task has proven difficult due to the nebulous nature of metaphoricity. In this paper, we present a crowd-sourcing approach for the creation of a dataset for metaphor identification, that is able to rapidly achieve large coverage over the different usages of metaphor in a given corpus while maintaining high accuracy. We validate this methodology by creating a set of 2,500 manually annotated tweets in English, for which we achieve inter-annotator agreement scores over 0.8, which is higher than other reported results that did not limit the task. This methodology is based on the use of an existing classifier for metaphor in order to assist in the identification and the selection of the examples for annotation, in a way that reduces the cognitive load for annotators and enables quick and accurate annotation. We selected a corpus of both general language tweets and political tweets relating to Brexit and we compare the resulting corpus on these two domains. As a result of this work, we have published the first dataset of tweets annotated for metaphors, which we believe will be invaluable for the development, training and evaluation of approaches for metaphor identification in tweets.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-05-20",
    license="cc-by",
    grants="[{'id': '731015'}]"
    }
  • [DOI] B. R. Chakravarthi, M. Arcan, and J. P. McCrae, “Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages,” in 2nd conference on language, data and knowledge (ldk 2019), Dagstuhl, Germany, 2019-05-20 2019, p. 6:1–6:14.
    [Bibtex]
    @inproceedings{chakravarthi2019comparison,
    author="Bharathi Raja Chakravarthi and Mihael Arcan and John P. McCrae",
    title={{Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages}},
    booktitle="2nd Conference on Language, Data and Knowledge (LDK 2019)",
    pages="6:1--6:14",
    series="OpenAccess Series in Informatics (OASIcs)",
    ISBN="978-3-95977-105-4",
    ISSN="2190-6807",
    year="2019",
    volume="70",
    editor="Maria Eskevich and Gerard de Melo and Christian Fäth and John P. McCrae and Paul Buitelaar and Christian Chiarcos and Bettina Klimek and Milan Dojchinovski",
    publisher="Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik",
    address="Dagstuhl, Germany",
    url="http://drops.dagstuhl.de/opus/volltexte/2019/10370",
    URN="urn:nbn:de:0030-drops-103700",
    doi="10.4230/OASIcs.LDK.2019.6",
    description="Under-resourced languages are a significant challenge for statistical approaches to machine translation, and recently it has been shown that the usage of training data from closely-related languages can improve machine translation quality of these languages. While languages within the same language family share many properties, many under-resourced languages are written in their own native script, which makes taking advantage of these language similarities difficult. In this paper, we propose to alleviate the problem of different scripts by transcribing the native script into common representation i.e. the Latin script or the International Phonetic Alphabet (IPA). In particular, we compare the difference between coarse-grained transliteration to the Latin script and fine-grained IPA transliteration. We performed experiments on the language pairs English-Tamil, English-Telugu, and English-Kannada translation task. Our results show improvements in terms of the BLEU, METEOR and chrF scores from transliteration and we find that the transliteration into the Latin script outperforms the fine-grained IPA transcription.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2019-05-20",
    open="True",
    license="cc-by",
    grants="[{'id': '731015'}]"
    }
  • [DOI] 2nd Conference on Language, Data and Knowledge (LDK 2019), Dagstuhl, Germany: Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, 2019.
    [Bibtex]
    @proceedings{eskevich2019ldk,
    editor="Maria Eskevich and Gerard de Melo and Christian Fäth and John P. McCrae and Paul Buitelaar and Christian Chiarcos and Bettina Klimek and Milan Dojchinovski",
    title={{2nd Conference on Language, Data and Knowledge (LDK 2019)}},
    series="OpenAccess Series in Informatics (OASIcs)",
    ISBN="978-3-95977-105-4",
    year="2019",
    volume="70",
    publisher="Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik",
    address="Dagstuhl, Germany",
    url="http://drops.dagstuhl.de/portals/oasics/index.php?semnr=16105",
    doi="10.4230/OASIcs.LDK.2019.0"
    }

2018

  • S. Ahmadi, M. Arcan, and J. McCrae, “On Lexicographical Networks,” in Workshop on elexicography: between digital humanities and artificial intelligence, 2018-12-06 2018.
    [Bibtex]
    @inproceedings{ahmadi2018lexicographical,
    title={{On Lexicographical Networks}},
    author="Sina Ahmadi and Mihael Arcan and John McCrae",
    booktitle="Workshop on eLexicography: Between Digital Humanities and Artificial Intelligence",
    url="https://lexdhai.insight-centre.org/Lex_DH__AI_2018_paper_4.pdf",
    open="True",
    year="2018",
    description="Lexical resources are important components of natural language processing (NLP) applications providing machine-readable knowledge for various tasks. One of the most popular examples of lexical resources are lexicons. Lexicons provide linguistic information about the vocabulary of a language and the semantic relationships between the words in a pair of languages. In addition to the lexicons, there are various other types of lexical resources, particularly those which are made by experts such as WordNet, VerbNet and FrameNet and, those which are collaboratively curated such as Wikipedia and Wiktionary.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    date="2018-12-06",
    license="cc-zero",
    grants="[{'id': '731015'}]"
    }
  • 6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science, European Language Resources Association, 2018.
    [Bibtex]
    @proceedings{mccrae2018ldl,
    title={{6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science}},
    editor="John P. McCrae and Christian Chiarcos and Thierry Declerck and Jorge Gracia and Bettina Klimek",
    affiliation="['National University of Ireland, Galway', 'Goethe University Frankfurt', 'DFKI GmbH and ACDH-ÖAW', 'University of Zaragoza', 'University of Leipzig']",
    date="2018-05-12",
    description="Since its establishment in 2012, the Linked Data in Linguistics (LDL) workshop series has become the major forum for presenting, discussing and disseminating technologies, vocabularies, resources and experiences regarding the application of Semantic Web standards and the Linked Open Data paradigm to language resources in order to facilitate their visibility, accessibility, interoperability, reusability, enrichment, combined evaluation and integration. The LDL workshop series is organized by the Open Linguistics Working Group of the Open Knowledge Foundation, and has contributed greatly to the emergence and growth of the Linguistic Linked Open Data (LLOD) cloud. LDL workshops contribute to the discussion, dissemination and establishment of community standards that drive this development, most notably the Lemon/OntoLex model for lexical resources, as well as standards for other types of language resources still under development. Building on our earlier success in creating and linking language resources, LDL-2018 will focus on Linguistic Data Science, i.e., research methodologies and applications building on Linguistic Linked Open Data and the existing technology and resource stack for linguistics, natural language processing and digital humanities. LDL-2018 builds on the success of the workshop series, incl. two appearances at LREC (2014, 2016), where we attracted a large number of interested participants. As of 2016, LDL workshops alternate with our stand-alone conference on Language, Data and Knowledge (LDK). LDK-2017 was held in Galway, Ireland, as a 3-day event with 150 registrants and several satellite workshops. Continuing the LDL workshop series together with LDK is important in order to facilitate dissemination within and to receive input from the language resource community, and LREC is the obvious host conference for this purpose. LDL-2018 will be supported by the ELEXIS project on a European Lexicographic Infrastructure.",
    open="True",
    year="2018",
    publisher="European Language Resources Association",
    url="http://lrec-conf.org/workshops/lrec2018/W23/pdf/book_of_proceedings.pdf",
    series="LREC-2018 Workshop Proceedings"
    }
  • S. Krek, J. McCrae, I. Kosem, T. Wissik, C. Tiberius, R. Navigli, and B. S. Pedersen, “European Lexicographic Infrastructure (ELEXIS),” in Proceedings of the xviii euralex international congress on lexicography in global contexts, 2018-7-21 2018, pp. 881-892.
    [Bibtex]
    @inproceedings{krek2018european,
    title={{European Lexicographic Infrastructure (ELEXIS)}},
    author="Simon Krek and John McCrae and Iztok Kosem and Tanja Wissik and Carole Tiberius and Roberto Navigli and Bolette Sandford Pedersen",
    affiliation="['Jožef Stefan Institute', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Austrian Academy of Sciences', 'Dutch Language Institute', 'Sapienza University of Rome', 'University of Copenhagen']",
    date="2018-7-21",
    description="In the paper we describe a new EU infrastructure project dedicated to lexicography. The project is part of the Horizon 2020 program, with a duration of four years (2018-2022). The result of the project will be an infrastructure which will (1) enable efficient access to high quality lexicographic data, and (2) bridge the gap between more advanced and less-resourced scholarly communities working on lexicographic resources. One of the main issues addressed by the project is the fact that current lexicographic resources have different levels of (incompatible) structuring, and are not equally suitable for application in Natural Language Processing and other fields. The project will therefore develop strategies, tools and standards for extracting, structuring and linking lexicographic resources to enable their inclusion in Linked Open Data and the Semantic Web, as well as their use in the context of digital humanities.",
    open="True",
    booktitle="Proceedings of the XVIII EURALEX International Congress on Lexicography in Global Contexts",
    year="2018",
    pages="881-892",
    url="http://euralex.org/wp-content/themes/euralex/proceedings/Euralex%202018/118-4-2986-1-10-20180820.pdf"
    }
  • [DOI] A. Walsh, C. Bonial, K. Geeraert, J. P. McCrae, N. Schneider, and C. Somers, “Constructing an Annotated Corpus of Verbal MWEs for English,” in Proceedings of joint workshop on linguistic annotation, multiword expressions and constructions (law-mwe-cxg-2018), 2018-08-01 2018.
    [Bibtex]
    @inproceedings{walsh2018constructing,
    title={{Constructing an Annotated Corpus of Verbal MWEs for English}},
    author="Abigail Walsh and Claire Bonial and Kristina Geeraert and John P. McCrae and Nathan Schneider and Clarissa Somers",
    date="2018-08-01",
    description="This paper describes the construction and annotation of a corpus of verbal MWEs for English as part of the PARSEME Shared Task 1.1 on automatic identification of verbal MWEs. The criteria for corpus selection, the categories of MWEs used, and the training process are discussed, along with the particular issues that led to revisions in edition 1.1 of the annotation guidelines. Finally, an overview of the characteristics of the final annotated corpus is presented, as well as some discussion on inter-annotator agreement.",
    affiliation="['ADAPT Centre, Dublin City University', 'U.S. Army Research Laboratory', 'University of Alberta', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Georgetown University', 'Georgetown University']",
    open="True",
    doi="10.18653/v1/W18-4921",
    booktitle="Proceedings of Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)",
    year="2018"
    }
  • [DOI] O. Zayed, J. P. McCrae, and P. Buitelaar, “Phrase-Level Metaphor Identification using Distributed Representations of Word Meaning,” in Proceedings of the workshop on figurative language processing, 2018-06-01 2018.
    [Bibtex]
    @inproceedings{zayed2018phrase,
    title={{Phrase-Level Metaphor Identification using Distributed Representations of Word Meaning}},
    author="Omnia Zayed and John P. McCrae and Paul Buitelaar",
    booktitle="Proceedings of the Workshop on Figurative Language Processing",
    date="2018-06-01",
    description="Metaphor is an essential element of human cognition which is often used to express ideas and emotions that might be difficult to express using literal language. Processing metaphoric language is a challenging task for a wide range of applications ranging from text simplification to psychotherapy. Despite the variety of approaches that are trying to process metaphor, there is still a need for better models that mimic the human cognition while exploiting fewer resources. In this paper, we present an approach based on distributional semantics to identify metaphors on the phrase-level. We investigated the use of different word embeddings models to identify verb-noun pairs where the verb is used metaphorically. Several experiments are conducted to show the performance of the proposed approach on benchmark datasets.",
    affiliation="['Insight Centre for Data Analytics, Data Science Institute, National University of Ireland Galway', 'Insight Centre for Data Analytics, Data Science Institute, National University of Ireland Galway', 'Insight Centre for Data Analytics, Data Science Institute, National University of Ireland Galway']",
    open="True",
    year="2018",
    doi="10.18653/v1/W18-0910"
    }
  • [DOI] J. P. McCrae and P. Buitelaar, “Linking Datasets Using Semantic Textual Similarity,” Cybernetics and information technologies, vol. 18, iss. 1, pp. 109-123, 2018.
    [Bibtex]
    @article{mccrae2018linking,
    journal="Cybernetics and Information Technologies",
    volume="18",
    number="1",
    pages="109-123",
    author="John P. McCrae and Paul Buitelaar",
    description="Linked data has been widely recognized as an important paradigm for representing data and one of the most important aspects of supporting its use is discovery of links between datasets. For many datasets, there is a significant amount of textual information in the form of labels, descriptions and documentation about the elements of the dataset and the fundament of a precise linking is in the application of semantic textual similarity to link these datasets. However, most linking tools so far rely on only simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="False",
    title={{Linking Datasets Using Semantic Textual Similarity}},
    url="http://www.cit.iit.bas.bg/CIT_2018/v-18-1/10_paper.pdf",
    year="2018",
    doi="10.2478/cait-2018-0010"
    }
  • A. Doyle, J. P. McCrae, and C. Downey, “Preservation of Original Orthography in the Construction of an Old Irish Corpus,” in Proceedings of the 3rd workshop for collaboration and computing for under-resourced languages, 2018-05-12 2018.
    [Bibtex]
    @inproceedings{doyle2018preservation,
    title={{Preservation of Original Orthography in the Construction of an Old Irish Corpus}},
    author="Adrian Doyle and John P. McCrae and Clodagh Downey",
    booktitle="Proceedings of the 3rd Workshop for Collaboration and Computing for Under-Resourced Languages",
    date="2018-05-12",
    description="This paper will examine the process of creating a digital corpus based on the Würzburg glosses, the earliest large collection of glosses written in the Irish language. Modern editorial standards applied in publications of these glosses can alter spelling, punctuation, and even the semantic meaning of a sentence where one word is used in place of another. Therefore, an understanding of the original orthography utilised by Old Irish scribes is important in determining the orthography which should be utilised in a modern digital corpus. This paper will outline why the text of the Würzburg glosses as it appears in Thesaurus Palaeohibernicus is the best candidate for digitisation. The automated digitisation and proofing process of the corpus will be outlined, and details will be given of a tag-set utilised within the digital corpus in order to preserve information present in Thesaurus Palaeohibernicus as metadata.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway']",
    open="True",
    year="2018",
    url="http://lrec-conf.org/workshops/lrec2018/W26/pdf/20_W26.pdf"
    }
  • T. Declerck, J. McCrae, R. Navigli, K. Zaytseva, and T. Wissik, “ELEXIS – European Lexicographic Infrastructure: Contributions to and from the Linguistic Linked Open Data,” in Proceedings of the globalex 2018 workshop, 2018-05-12 2018.
    [Bibtex]
    @inproceedings{declerck2018elexis,
    title={{ELEXIS - European Lexicographic Infrastructure: Contributions to and from the Linguistic Linked Open Data}},
    author="Thierry Declerck and John McCrae and Roberto Navigli and Ksenia Zaytseva and Tanja Wissik",
    date="2018-05-12",
    description="In this paper we outline the interoperability aspects of the recently started European project ELEXIS (European Lexicographic Infrastructure). ELEXIS aims to integrate, extend and harmonise national and regional efforts in the field of lexicography, both modern and historical, with the goal of creating a sustainable infrastructure which will enable efficient access to high quality lexical data in the digital age, and bridge the gap between more advanced and lesser-supported lexicographic resources. For this, ELEXIS will make use of or establish common standards and solutions for the development of lexicographic resources and develop strategies and tools for extracting, structuring and linking lexicographic resources.",
    affiliation="['Austrian Centre for Digital Humanities at the Austrian Academy of Sciences and DFKI GmbH, Multilingual Technologies Lab', 'Insight Centre for Data Analytics at the National University of Ireland Galway, Ireland', 'Sapienza University of Rome', 'Austrian Centre for Digital Humanities at the Austrian Academy of Sciences', 'Austrian Centre for Digital Humanities at the Austrian Academy of Sciences']",
    open="True",
    booktitle="Proceedings of the Globalex 2018 Workshop",
    year="2018"
    }
  • R. Sarkar, J. P. McCrae, and P. Buitelaar, “A supervised approach to taxonomy extraction using word embeddings,” in Proceedings of the 11th language resource and evaluation conference (lrec), 2018-05-12 2018.
    [Bibtex]
    @inproceedings{sarkar2018supervised,
    booktitle="Proceedings of the 11th Language Resource and Evaluation Conference (LREC)",
    year="2018",
    author="Rajdeep Sarkar and John P. McCrae and Paul Buitelaar",
    date="2018-05-12",
    description="Large collections of texts are commonly generated by large organizations and making sense of these collections of texts is a significant challenge. One method for handling this is to organize the concepts into a hierarchical structure such that similar concepts can be discovered and easily browsed. This approach was the subject of a recent evaluation campaign, TExEval, however the results of this task showed that none of the systems consistently outperformed a relatively simple baseline. In order to solve this issue, we propose a new method that uses supervised learning to combine multiple features with a support vector machine classifier including the baseline features. We show that this outperforms the baseline and thus provides a stronger method for identifying taxonomic relations than previous methods.",
    affiliation="['Indian Institute of Technology Kharagpur', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="True",
    title={{A supervised approach to taxonomy extraction using word embeddings}},
    url="http://www.lrec-conf.org/proceedings/lrec2018/pdf/601.pdf"
    }
  • I. Wood, J. P. McCrae, V. Andryushechkin, and P. Buitelaar, “A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set,” in Proceedings of the 11th language resource and evaluation conference (lrec), 2018-05-12 2018.
    [Bibtex]
    @inproceedings{wood2018comparison,
    booktitle="Proceedings of the 11th Language Resource and Evaluation Conference (LREC)",
    year="2018",
    author="Ian Wood and John P. McCrae and Vladimir Andryushechkin and Paul Buitelaar",
    title={{A Comparison Of Emotion Annotation Schemes And A New Annotated Data Set}},
    date="2018-05-12",
    description="While the recognition of positive/negative sentiment in text is an established task with many standard data sets and well developed methodologies, the recognition of more nuanced affect has received less attention, and in particular, there are very few publicly available gold standard annotated resources. To address this lack, we present a series of emotion annotation studies on tweets culminating in a publicly available collection of 2,019 tweets with scores on four emotion dimensions: valence, arousal, dominance and surprise, following the emotion representation model identified by Fontaine et al. (Fontaine et al., 2007). Further, we make a comparison of relative vs. absolute annotation schemes. We find improved annotator agreement with a relative annotation scheme (comparisons) on a dimensional emotion model over a categorical annotation scheme on Ekman’s six basic emotions (Ekman et al., 1987), however when we compare inter-annotator agreement for comparisons with agreement for a rating scale annotation scheme (both with the same dimensional emotion model), we find improved inter-annotator agreement with rating scales, challenging a common belief that relative judgements are more reliable.",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="True",
    url="http://www.lrec-conf.org/proceedings/lrec2018/pdf/61.pdf"
    }
  • H. Ziad, J. P. McCrae, and P. Buitelaar, “Teanga: A Linked Data based platform for Natural Language Processing,” in Proceedings of the 11th language resource and evaluation conference (lrec), 2018-05-12 2018.
    [Bibtex]
    @inproceedings{ziad2018linked,
    booktitle="Proceedings of the 11th Language Resource and Evaluation Conference (LREC)",
    year="2018",
    author="Housam Ziad and John P. McCrae and Paul Buitelaar",
    date="2018-05-12",
    description="In this paper, we describe Teanga, a linked data based platform for natural language processing (NLP). Teanga enables the use of many NLP services from a single interface, whether the need was to use a single service or multiple services in a pipeline. Teanga focuses on the problem of NLP services interoperability by using linked data to define the types of services input and output. Teanga’s strengths include being easy to install and run, easy to use, able to run multiple NLP tasks from one interface and helping users to build a pipeline of tasks through a graphical user interface.",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland, Galway', 'Insight Centre for Data Analytics, National University of Ireland, Galway', 'Insight Centre for Data Analytics, National University of Ireland, Galway']",
    open="True",
    title={{Teanga: A Linked Data based platform for Natural Language Processing}},
    url="http://www.lrec-conf.org/proceedings/lrec2018/pdf/106.pdf"
    }
  • M. Arcan, E. Montiel-Ponsoda, J. P. McCrae, and P. Buitelaar, “Automatic Enrichment of Terminological Resources: the IATE RDF Example,” in Proceedings of the 11th language resource and evaluation conference (lrec), 2018-05-12 2018.
    [Bibtex]
    @inproceedings{arcan2018automatic,
    booktitle="Proceedings of the 11th Language Resource and Evaluation Conference (LREC)",
    year="2018",
    author="Mihael Arcan and Elena Montiel-Ponsoda and John P. McCrae and Paul Buitelaar",
    title={{Automatic Enrichment of Terminological Resources: the IATE RDF Example}},
    date="2018-05-12",
    description="Terminological resources have proven necessary in many organizations and institutions to ensure communication between experts. However, the maintenance of these resources is a very time-consuming and expensive process. Therefore, the work described in this contribution aims to automate the maintenance process of such resources. As an example, we demonstrate enriching the RDF version of IATE with new terms in the languages for which no translation was available, as well as with domain-disambiguated sentences and information about usage frequency. This is achieved by relying on machine translation trained on parallel corpora that contain the terms in question and multilingual word sense disambiguation performed on the context provided by the sentences. Our results show that for most languages translating the terms within a disambiguated context significantly outperforms the approach with randomly selected sentences.",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway', 'Ontology Engineering Group, Universidad Politecnica de Madrid', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="True",
    url="http://www.lrec-conf.org/proceedings/lrec2018/summaries/541.html"
    }
  • [DOI] P. Buitelaar, I. D. Wood, S. Negi, M. Arcan, J. P. McCrae, A. Abele, C. Robin, V. Andryushechkin, H. Ziad, H. Sagha, F. J. Sánchez-Rada, C. A. Iglesias, C. Navarro, A. Giefer, N. Heise, V. Masucci, F. A. Danza, C. Caterino, P. Smrž, M. Hradiš, F. Povolný, M. Klimeš, P. Matějka, and G. Tummarello, “MixedEmotions: An Open-Source Toolbox for Multi-Modal Emotion Analysis,” Ieee transactions on multimedia, vol. 20, iss. 9, 2018.
    [Bibtex]
    @article{buitelaar2018mixedemotions,
    journal="IEEE Transactions on Multimedia",
    year="2018",
    author="Paul Buitelaar and Ian D. Wood and Sapna Negi and Mihael Arcan and John P. McCrae and Andrejs Abele and Cécile Robin and Vladimir Andryushechkin and Housam Ziad and Hesam Sagha and J. Fernando Sánchez-Rada and Carlos A. Iglesias and Carlos Navarro and Andreas Giefer and Nicolaus Heise and Vincenzo Masucci and Francesco A. Danza and Ciro Caterino and Pavel Smrž and Michal Hradiš and Filip Povolný and Marek Klimeš and Pavel Matějka and Giovanni Tummarello",
    date="2018-09-01",
    description="Recently, there is an increasing tendency to embed functionalities for recognizing emotions from user-generated media content in automated systems such as call-centre operations, recommendations, and assistive technologies, providing richer and more informative user and content profiles. However, to date, adding these functionalities was a tedious, costly, and time-consuming effort, requiring identification and integration of diverse tools with diverse interfaces as required by the use case at hand. The MixedEmotions Toolbox leverages the need for such functionalities by providing tools for text, audio, video, and linked data processing within an easily integrable plug-and-play platform. These functionalities include: 1) for text processing: emotion and sentiment recognition; 2) for audio processing: emotion, age, and gender recognition; 3) for video processing: face detection and tracking, emotion recognition, facial landmark localization, head pose estimation, face alignment, and body pose estimation; and 4) for linked data: knowledge graph integration. Moreover, the MixedEmotions Toolbox is open-source and free. In this paper, we present this toolbox in the context of the existing landscape, and provide a range of detailed benchmarks on standard test-beds showing its state-of-the-art performance. Furthermore, three real-world use cases show its effectiveness, namely, emotion-driven smart TV, call center monitoring, and brand reputation analysis.",
    affiliation="['National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'National University of Ireland Galway', 'University of Passau', 'GSI Universidad Politécnica de Madrid', 'GSI Universidad Politécnica de Madrid', 'Paradigma Digital', 'Deutsche Welle', 'Deutsche Welle', 'Expert Systems', 'Expert Systems', 'Expert Systems', 'Brno University of Technology', 'Brno University of Technology', 'Phonexia', 'Phonexia', 'Phonexia', 'Siren Solutions']",
    open="False",
    title={{MixedEmotions: An Open-Source Toolbox for Multi-Modal Emotion Analysis}},
    url="http://ieeexplore.ieee.org/document/8269329/",
    volume="20",
    number="9",
    doi="10.1109/TMM.2018.2798287"
    }
  • J. P. McCrae, “Mapping WordNet Instances to Wikipedia,” in Proceedings of the 9th global wordnet conference, 2018-01-12 2018.
    [Bibtex]
    @inproceedings{mccrae2018mapping,
    booktitle="Proceedings of the 9th Global WordNet Conference",
    year="2018",
    author="John P. McCrae",
    date="2018-01-12",
    description="Lexical resources differ from encyclopaedic resources and represent two distinct types of resource covering general language and named entities respectively. However, many lexical resources, including Princeton WordNet, contain many proper nouns, referring to named entities in the world yet it is not possible or desirable for a lexical resource to cover all named entities that may reasonably occur in a text. In this paper, we propose that instead of including synsets for instance concepts PWN should instead provide links to Wikipedia articles describing the concept. In order to enable this we have created a gold-quality mapping between all of the 7,742 instances in PWN and Wikipedia (where such a mapping is possible). As such, this resource aims to provide a gold standard for link discovery, while also allowing PWN to distinguish itself from other resources such as DBpedia or BabelNet. Moreover, this linking connects PWN to the Linguistic Linked Open Data cloud, thus creating a richer, more usable resource for natural language processing.",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="True",
    title={{Mapping WordNet Instances to Wikipedia}}
    }
  • J. P. McCrae, I. Wood, and A. Hicks, “Towards a Crowd-Sourced WordNet for Colloquial English,” in Proceedings of the 9th global wordnet conference, 2018-01-12 2018.
    [Bibtex]
    @inproceedings{mccrae2018towards,
    booktitle="Proceedings of the 9th Global WordNet Conference",
    year="2018",
    author="John P. McCrae and Ian Wood and Amanda Hicks",
    title={{Towards a Crowd-Sourced WordNet for Colloquial English}},
    date="2018-01-12",
    description="Princeton WordNet is one of the most widely-used resources for natural language processing, but is updated only infrequently and cannot keep up with the fast-changing usage of the English language on social media platforms such as Twitter. The Colloquial WordNet aims to provide an open platform whereby anyone can contribute, while still following the structure of WordNet. Many crowdsourced lexical resources often have significant quality issues, and as such care must be taken in the design of the interface to ensure quality. In this paper, we present the development of a platform that can be opened on the Web to any lexicographer who wishes to contribute to this resource and the lexicographic methodology applied by this interface",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Health Outcomes \& Policy, University of Florida']",
    open="True"
    }
  • B. R. Chakravarthi, M. Arcan, and J. P. McCrae, “Improving Wordnets for Under-Resourced Languages Using Machine Translation information,” in Proceedings of the 9th global wordnet conference, 2018-01-12 2018.
    [Bibtex]
    @inproceedings{chakravarthi2018improving,
    booktitle="Proceedings of the 9th Global WordNet Conference",
    year="2018",
    author="Bharathi Raja Chakravarthi and Mihael Arcan and John P. McCrae",
    title={{Improving Wordnets for Under-Resourced Languages Using Machine Translation information}},
    date="2018-01-12",
    description="Wordnets are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involve large research groups for a long period of time, which are typically not available for under-resourced languages. Even if wordnet-like resources are available for under-resourced languages, they are often not easily accessible, which can alter the results of applications using these resources. Our proposed method presents an expand approach for improving and generating wordnets with the help of machine translation. We apply our methods to improve and extend wordnets for the Dravidian languages, i.e., Tamil, Telugu, Kannada, which are severely under-resourced languages. We report evaluation results of the generated wordnet senses in terms of precision for these languages. In addition to that, we carried out a manual evaluation of the translations for the Tamil language, where we demonstrate that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.",
    affiliation="['Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway', 'Insight Centre for Data Analytics, National University of Ireland Galway']",
    open="True"
    }
  • B. Pedersen, J. McCrae, C. Tiberius, and S. Krek, “ELEXIS – a European infrastructure fostering cooperation and information exchange among lexicographical research communities,” in Proceedings of the 9th global wordnet conference, 2018-01-12 2018.
    [Bibtex]
    @inproceedings{pedersen2018elexis,
    booktitle="Proceedings of the 9th Global WordNet Conference",
    year="2018",
    author="Bolette Pedersen and John McCrae and Carole Tiberius and Simon Krek",
    title={{ELEXIS - a European infrastructure fostering cooperation and information exchange among lexicographical research communities}},
    date="2018-01-12",
    description="The paper describes objectives, concept and methodology for ELEXIS, a European infrastructure fostering cooperation and information exchange among lexicographical research communities. The infrastructure is a newly granted project under the Horizon 2020 INFRAIA call, with the topic Integrating Activities for Starting Communities. The project is planned to start in January 2018.",
    affiliation="['University of Copenhagen', 'National University of Ireland Galway', 'Dutch Language Institute', 'Jožef Stefan Institute']",
    open="True"
    }

2017

  • Knowledge Graphs and Language Technology – ISWC 2016 International Workshops: KEKI and NLP&DBpedia. Springer, 2017.
    [Bibtex]
    @proceedings{vanerp2017knowledge,
    editor="Marieke van Erp and Sebastian Hellmann and John P. McCrae and Christian Chiarcos and Key-Sun Choi and Jorge Gracia and Yoshihiko Hayashi and Seiji Koide and Pablo Mendes and Heiko Paulheim and Hideaki Takeda",
    title={{Knowledge Graphs and Language Technology - ISWC 2016 International Workshops: KEKI and NLP\&DBpedia}},
    volume="20",
    number="9",
    year="2017",
    publisher="Springer",
    url="https://www.springer.com/gp/book/9783319687223",
    series="Information Systems and Applications, incl. Internet/Web, and HCI"
    }
  • Knowledge Graphs and Language Technology – ISWC 2016 International Workshops: KEKI and NLP&DBpedia, Springer, 2017.
    [Bibtex]
    @proceedings{vanerp2017knowledge,
    editor="Marieke van Erp and Sebastian Hellmann and John P. McCrae and Christian Chiarcos and Key-Sun Choi and Jorge Gracia and Yoshihiko Hayashi and Seiji Koide and Pablo Mendes and Heiko Paulheim and Hideaki Takeda",
    title={{Knowledge Graphs and Language Technology - ISWC 2016 International Workshops: KEKI and NLP\&DBpedia}},
    year="2017",
    url="http://www.springer.com/gp/book/9783319687223",
    publisher="Springer",
    series="Lecture Notes in Computer Science"
    }
  • Language, Data, and Knowledge, Springer, 2017.
    [Bibtex]
    @proceedings{gracia2017language,
    editor="Jorge Gracia and Francis Bond and John P. McCrae and Paul Buitelaar and Christian Chiarcos and Sebastian Hellmann",
    title={{Language, Data, and Knowledge}},
    year="2017",
    url="http://www.springer.com/gp/book/9783319598871",
    publisher="Springer",
    series="Lecture Notes in Artificial Intelligence"
    }
  • J. P. McCrae, M. Arcan, and P. Buitelaar, “Linking Knowledge Graphs across Languages with Semantic Similarity and Machine Translation,” in Proceedings of the first workshop on multi-language processing in a globalising world (mlp2017), 2017.
    [Bibtex]
    @inproceedings{mccrae2017linking,
    title={{Linking Knowledge Graphs across Languages with Semantic Similarity and Machine Translation}},
    author="John P. McCrae and Mihael Arcan and Paul Buitelaar",
    year="2017",
    booktitle="Proceedings of the First Workshop on Multi-Language Processing in a Globalising World (MLP2017)"
    }
  • J. P. McCrae, J. Bosque-Gil, J. Gracia, P. Buitelaar, and P. Cimiano, “The OntoLex-Lemon Model: development and applications,” in Proceedings of elex 2017, 2017, pp. 587-597.
    [Bibtex]
    @inproceedings{mccrae2017ontolex,
    title={{The OntoLex-Lemon Model: development and applications}},
    author="John P. McCrae and Julia Bosque-Gil and Jorge Gracia and Paul Buitelaar and Philipp Cimiano",
    year="2017",
    booktitle="Proceedings of eLex 2017",
    pages="587-597",
    url="https://elex.link/elex2017/wp-content/uploads/2017/09/paper36.pdf"
    }
  • [DOI] B. Klimek, J. P. McCrae, C. Chiarcos, and S. Hellmann, “OnLiT: An Ontology for Linguistic Terminology,” in Proceedings of the first conference on language, data and knowledge (ldk2017), 2017, pp. 42-57.
    [Bibtex]
    @inproceedings{klimek2017onlit,
    title={{OnLiT: An Ontology for Linguistic Terminology}},
    author="Bettina Klimek and John P. McCrae and Christian Chiarcos and Sebastian Hellmann",
    year="2017",
    booktitle="Proceedings of the First Conference on Language, Data and Knowledge (LDK2017)",
    pages="42-57",
    doi="10.1007/978-3-319-59888-8_4"
    }
  • [DOI] J. P. McCrae, I. Wood, and A. Hicks, “The Colloquial WordNet: Extending Princeton WordNet with Neologisms,” in Proceedings of the first conference on language, data and knowledge (ldk2017), 2017, pp. 194-202.
    [Bibtex]
    @inproceedings{mccrae2017colloquial,
    title={{The Colloquial WordNet: Extending Princeton WordNet with Neologisms}},
    author="John P. McCrae and Ian Wood and Amanda Hicks",
    year="2017",
    booktitle="Proceedings of the First Conference on Language, Data and Knowledge (LDK2017)",
    pages="194-202",
    doi="10.1007/978-3-319-59888-8_17"
    }
  • [DOI] A. Abele, J. P. McCrae, and P. Buitelaar, “An Evaluation Dataset for Linked Data Profiling,” in Proceedings of the first conference on language, data and knowledge (ldk2017), 2017, pp. 1-9.
    [Bibtex]
    @inproceedings{abele2017evaluation,
    title={{An Evaluation Dataset for Linked Data Profiling}},
    author="Andrejs Abele and John P. McCrae and Paul Buitelaar",
    year="2017",
    booktitle="Proceedings of the First Conference on Language, Data and Knowledge (LDK2017)",
    pages="1-9",
    doi="10.1007/978-3-319-59888-8_1"
    }

2016

  • P. Cimiano, J. P. McCrae, and P. Buitelaar, Lexicon Model for Ontologies: Community Report, 2016.
    [Bibtex]
    @misc{cimiano2016lexicon,
    title={{Lexicon Model for Ontologies: Community Report}},
    author="Philipp Cimiano and John P. McCrae and Paul Buitelaar",
    url="https://www.w3.org/2016/05/ontolex/",
    year="2016",
    organization="W3C"
    }
  • M. Arcan, J. P. McCrae, and P. Buitelaar, “Expanding wordnets to new languages with multilingual sense disambiguation,” in Proceedings of the 26th international conference on computational linguistics, 2016.
    [Bibtex]
    @inproceedings{arcan2016expanding,
    author="Mihael Arcan and John P. McCrae and Paul Buitelaar",
    year="2016",
    title={{Expanding wordnets to new languages with multilingual sense disambiguation}},
    booktitle="Proceedings of The 26th International Conference on Computational Linguistics",
    url="https://www.aclweb.org/anthology/C/C16/C16-1010.pdf"
    }
  • [DOI] J. P. McCrae and N. Prangnawarat, “Identifying Poorly-Defined Concepts in WordNet with Graph Metrics,” in Proceedings of the first workshop on knowledge extraction and knowledge integration (keki-2016), 2016.
    [Bibtex]
    @inproceedings{mccrae2016identifying,
    author="John P. McCrae and Narumol Prangnawarat",
    year="2016",
    title={{Identifying Poorly-Defined Concepts in WordNet with Graph Metrics}},
    booktitle="Proceedings of the First Workshop on Knowledge Extraction and Knowledge Integration (KEKI-2016)",
    doi="10.1007/978-3-319-68723-0_6"
    }
  • J. P. McCrae and P. Cimiano, “LIXR: Quick, succinct conversion of XML to RDF,” in Proceedings of the iswc 2016 posters and demo track, 2016.
    [Bibtex]
    @inproceedings{mccrae2016lixr,
    author="John P. McCrae and Philipp Cimiano",
    year="2016",
    title={{LIXR: Quick, succinct conversion of XML to RDF}},
    booktitle="Proceedings of the ISWC 2016 Posters and Demo Track"
    }
  • J. P. McCrae, “Yuzu: Publishing Any Data as Linked Data,” in Proceedings of the iswc 2016 posters and demo track, 2016.
    [Bibtex]
    @inproceedings{mccrae2016yuzu,
    author="John P. McCrae",
    year="2016",
    title={{Yuzu: Publishing Any Data as Linked Data}},
    booktitle="Proceedings of the ISWC 2016 Posters and Demo Track"
    }
  • [DOI] J. P. McCrae, K. Asooja, N. Aggarwal, and P. Buitelaar, “NUIG-UNLP at SemEval-2016 Task 1: Soft Alignment and Deep Learning for Semantic Textual Similarity,” in Semeval-2016, 2016.
    [Bibtex]
    @inproceedings{mccrae2016nuig,
    author="John P. McCrae and Kartik Asooja and Nitish Aggarwal and Paul Buitelaar",
    year="2016",
    title={{NUIG-UNLP at SemEval-2016 Task 1: Soft Alignment and Deep Learning for Semantic Textual Similarity}},
    booktitle="SemEval-2016",
    url="https://aclweb.org/anthology/S/S16/S16-1110.pdf",
    doi="10.18653/v1/s16-1110"
    }
  • J. P. McCrae, G. Bordea, and P. Buitelaar, “Linked Data and Text Mining as an Enabler for Reproducible Research,” in 1st workshop on cross-platform text mining and natural language processing interoperability, 2016.
    [Bibtex]
    @inproceedings{mccrae2016linked,
    author="John P. McCrae and Georgeta Bordea and Paul Buitelaar",
    year="2016",
    title={{Linked Data and Text Mining as an Enabler for Reproducible Research}},
    booktitle="1st Workshop on Cross-Platform Text Mining and Natural Language Processing Interoperability",
    url="http://interop2016.github.io/pdf/INTEROP-7.pdf"
    }
  • [DOI] J. P. McCrae, M. Arcan, K. Asooja, J. Gracia, P. Buitelaar, and P. Cimiano, “Domain adaptation for ontology localization,” Web semantics, vol. 36, pp. 23-31, 2016.
    [Bibtex]
    @article{mccrae2016domain,
    url="http://www.sciencedirect.com/science/article/pii/S1570826815001420",
    title={{Domain adaptation for ontology localization}},
    author="John P. McCrae and Mihael Arcan and Kartik Asooja and Jorge Gracia and Paul Buitelaar and Philipp Cimiano",
    journal="Web Semantics",
    volume="36",
    pages="23-31",
    year="2016",
    doi="10.2139/ssrn.3199218"
    }
  • J. P. McCrae, P. Cimiano, P. Buitelaar, and G. Bordea, “Representing Multiword Expressions on the Web with the OntoLex-Lemon model,” in Parseme/enel workshop on mwe e-lexicons, 2016.
    [Bibtex]
    @inproceedings{mccrae2016representing,
    author="John P. McCrae and Philipp Cimiano and Paul Buitelaar and Georgeta Bordea",
    title={{Representing Multiword Expressions on the Web with the OntoLex-Lemon model}},
    booktitle="PARSEME/ENeL workshop on MWE e-lexicons",
    url="http://typo.uni-konstanz.de/parseme/images/Meeting/2016-04-07-Struga-meeting/WG1-MCCRAE-ETAL-abstract.pdf",
    year="2016"
    }
  • J. P. McCrae, C. Chiarcos, F. Bond, P. Cimiano, T. Declerck, G. de Melo, J. Gracia, S. Hellmann, B. Klimek, S. Moran, P. Osenova, A. Pareja-Lora, and J. Pool, “The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud,” in 10th language resource and evaluation conference (lrec), 2016, pp. 2435-2441.
    [Bibtex]
    @inproceedings{mccrae2016open,
    author="John P. McCrae and Christian Chiarcos and Francis Bond and Philipp Cimiano and Thierry Declerck and Gerard de Melo and Jorge Gracia and Sebastian Hellmann and Bettina Klimek and Steven Moran and Petya Osenova and Antonio Pareja-Lora and Jonathan Pool",
    title={{The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud}},
    booktitle="10th Language Resource and Evaluation Conference (LREC)",
    url="http://www.lrec-conf.org/proceedings/lrec2016/pdf/851_Paper.pdf",
    pages="2435-2441",
    year="2016"
    }
  • F. Bond, P. Vossen, J. P. McCrae, and C. Fellbaum, “CILI: the Collaborative Interlingual Index,” in Proceedings of the global wordnet conference 2016, 2016.
    [Bibtex]
    @inproceedings{bond2016cili,
    url="https://www.overleaf.com/read/rsnvsbdghybg",
    title={{CILI: the Collaborative Interlingual Index}},
    author="Francis Bond and Piek Vossen and John P. McCrae and Christiane Fellbaum",
    booktitle="Proceedings of the Global WordNet Conference 2016",
    year="2016"
    }
  • P. Vossen, F. Bond, and J. P. McCrae, “Toward a truly multilingual Global Wordnet Grid,” in Proceedings of the global wordnet conference 2016, 2016.
    [Bibtex]
    @inproceedings{vossen2016toward,
    url="https://www.overleaf.com/read/fwhzzvrrwrmw",
    title={{Toward a truly multilingual Global Wordnet Grid}},
    author="Piek Vossen and Francis Bond and John P. McCrae",
    booktitle="Proceedings of the Global WordNet Conference 2016",
    year="2016"
    }

2015

  • [DOI] J. P. McCrae, S. Moran, S. Hellmann, and M. Brümmer, “Multilingual Linked Data (editorial),” Semantic web, vol. 6, iss. 4, pp. 315-317, 2015.
    [Bibtex]
    @article{mccrae2015multilingual,
    title={{Multilingual Linked Data (editorial)}},
    author="John P. McCrae and Steven Moran and Sebastian Hellmann and Martin Brümmer",
    journal="Semantic Web",
    volume="6",
    number="4",
    pages="315-317",
    year="2015",
    url="http://semantic-web-journal.net/content/multilingual-linked-data",
    doi="10.3233/sw-150178"
    }
  • [DOI] J. Eckle-Kohler, J. McCrae, and C. Chiarcos, “lemonUby – a large, interlinked, syntactically-rich lexical resource for ontologies,” Semantic web, vol. 6, iss. 4, pp. 371-378, 2015.
    [Bibtex]
    @article{ecklekohler2015,
    title={{lemonUby - a large, interlinked, syntactically-rich lexical resource for ontologies}},
    author="Judith Eckle-Kohler and John McCrae and Christian Chiarcos",
    journal="Semantic Web",
    volume="6",
    number="4",
    pages="371-378",
    year="2015",
    url="http://www.semantic-web-journal.net/content/lemonuby-large-interlinked-syntactically-rich-lexical-resource-ontologies-1",
    doi="10.3233/sw-140159"
    }
  • J. P. McCrae and P. Cimiano, “Linghub: a Linked Data based portal supporting the discovery of language resources,” in Proceedings of the 11th international conference on semantic systems, 2015, pp. 88-91.
    [Bibtex]
    @inproceedings{mccrae2015linghub,
    title={{Linghub: a Linked Data based portal supporting the discovery of language resources}},
    author="John P. McCrae and Philipp Cimiano",
    booktitle="Proceedings of the 11th International Conference on Semantic Systems",
    pages="88-91",
    year="2015"
    }
  • [DOI] B. Siemoneit, J. P. McCrae, and P. Cimiano, “Linking Four Heterogeneous Language Resources as Linked Data,” in Proceedings of the 4th workshop on linked data in linguistics, 2015, pp. 59-63.
    [Bibtex]
    @inproceedings{siemoneit2015linking,
    title={{Linking Four Heterogeneous Language Resources as Linked Data}},
    author="Benjamin Siemoneit and John P. McCrae and Philipp Cimiano",
    booktitle="Proceedings of the 4th Workshop on Linked Data in Linguistics",
    year="2015",
    pages="59-63",
    url="http://www.aclweb.org/anthology/W15-4207",
    doi="10.18653/v1/w15-4207"
    }
  • [DOI] J. P. McCrae, P. Cimiano, V. Rodriguez-Doncel, D. Vila-Suero, J. Gracia, L. Matteis, R. Navigli, A. Abele, G. Vulcu, and P. Buitelaar, “Reconciling Heterogeneous Descriptions of Language Resources,” in Proceedings of the 4th workshop on linked data in linguistics, 2015, pp. 39-48.
    [Bibtex]
    @inproceedings{mccrae2015reconciling,
    title={{Reconciling Heterogeneous Descriptions of Language Resources}},
    author="John P. McCrae and Philipp Cimiano and Victor Rodriguez-Doncel and Daniel Vila-Suero and Jorge Gracia and Luca Matteis and Roberto Navigli and Andrejs Abele and Gabriela Vulcu and Paul Buitelaar",
    booktitle="Proceedings of the 4th Workshop on Linked Data in Linguistics",
    year="2015",
    pages="39-48",
    url="http://www.aclweb.org/anthology/W/W15/W15-4205.pdf",
    doi="10.18653/v1/w15-4205"
    }
  • P. Cimiano, J. P. McCrae, V. Rodriguez-Doncel, T. Gornostaya, A. Gómez-Pérez, B. Siemoneit, and A. Lagzdins, “Linked Terminology: Applying Linked Data Principles to Terminological Resources,” in Proceedings of elex 2015, 2015, pp. 504-517.
    [Bibtex]
    @inproceedings{cimiano2015linked,
    title={{Linked Terminology: Applying Linked Data Principles to Terminological Resources}},
    author="Philipp Cimiano and John P. McCrae and Victor Rodriguez-Doncel and Tatiana Gornostaya and Asunción Gómez-Pérez and Benjamin Siemoneit and Andis Lagzdins",
    booktitle="Proceedings of eLex 2015",
    year="2015",
    pages="504-517",
    url="https://elex.link/elex2015/proceedings/eLex_2015_34_Cimiano+etal.pdf"
    }
  • [DOI] J. P. McCrae, P. Labropoulou, J. Gracia, M. Villegas, V. Rodriguez-Doncel, and P. Cimiano, “One ontology to bind them all: The META-SHARE OWL ontology for the interoperability of linguistic datasets on the Web,” in Proceedings of the 4th workshop on the multilingual semantic web, 2015.
    [Bibtex]
    @inproceedings{mccrae2015metashare,
    title={{One ontology to bind them all: The META-SHARE OWL ontology for the interoperability of linguistic datasets on the Web}},
    author="John P. McCrae and Penny Labropoulou and Jorge Gracia and Marta Villegas and Victor Rodriguez-Doncel and Philipp Cimiano",
    booktitle="Proceedings of the 4th Workshop on the Multilingual Semantic Web",
    year="2015",
    url="http://ceur-ws.org/Vol-1532/paper4.pdf",
    doi="10.1007/978-3-319-25639-9_42"
    }
  • [DOI] M. Fiorelli, A. Stellato, J. P. McCrae, P. Cimiano, and M. T. Pazienza, “LIME: the Metadata Module for OntoLex,” in Proceedings of 12th extended semantic web conference, 2015.
    [Bibtex]
    @inproceedings{fiorelli2015lime,
    title={{LIME: the Metadata Module for OntoLex}},
    author="Manuel Fiorelli and Armando Stellato and John P. McCrae and Philipp Cimiano and Maria Teresa Pazienza",
    booktitle="Proceedings of 12th Extended Semantic Web Conference",
    year="2015",
    url="http://link.springer.com/chapter/10.1007/978-3-319-18818-8_20#page-1",
    doi="10.1007/978-3-319-18818-8_20"
    }
  • J. Gracia, D. Vila-Suero, J. P. McCrae, T. Flati, C. Baron, and M. Dojchinovski, “Language Resources and Linked Data: A Practical Perspective,” in Knowledge engineering and knowledge management, Springer, 2015.
    [Bibtex]
    @incollection{gracia2015language,
    url="http://link.springer.com/chapter/10.1007/978-3-319-17966-7_1",
    title={{Language Resources and Linked Data: A Practical Perspective}},
    author="Jorge Gracia and Daniel Vila-Suero and John P. McCrae and Tiziano Flati and Ciro Baron and Milan Dojchinovski",
    booktitle="Knowledge Engineering and Knowledge Management",
    publisher="Springer",
    year="2015"
    }

2014

  • [DOI] J. P. McCrae and C. Unger, “Design Patterns for Engineering the Ontology-Lexicon Interface,” in Towards the multilingual semantic web, P. Buitelaar and P. Cimiano, Eds., Springer, 2014, pp. 15-30.
    [Bibtex]
    @incollection{mccrae2014design,
    url="http://link.springer.com/chapter/10.1007%2F978-3-662-43585-4_2#page-1",
    title={{Design Patterns for Engineering the Ontology-Lexicon Interface}},
    author="John P. McCrae and Christina Unger",
    booktitle="Towards the Multilingual Semantic Web",
    editor="Paul Buitelaar and Philipp Cimiano",
    pages="15-30",
    publisher="Springer",
    year="2014",
    doi="10.1007/978-3-662-43585-4_2"
    }
  • L. Borin, D. Dannells, M. Forsberg, and J. P. McCrae, “Representing Swedish Lexical Resources in RDF with lemon,” in Proceedings of the iswc 2014 posters & demonstrations track – a track within the 13th international semantic web conference, 2014, pp. 329-332.
    [Bibtex]
    @inproceedings{borin2014representing,
    url="http://ceur-ws.org/Vol-1272/paper_82.pdf",
    title={{Representing Swedish Lexical Resources in RDF with lemon}},
    author="Lars Borin and Dana Dannells and Markus Forsberg and John P. McCrae",
    booktitle="Proceedings of the ISWC 2014 Posters \& Demonstrations Track - a track within the 13th International Semantic Web Conference",
    pages="329-332",
    year="2014"
    }
  • J. P. McCrae, C. Wiljes, and P. Cimiano, “Towards assured data quality and validation by data certification,” in Proceedings of the 1st workshop on linked data quality, 2014.
    [Bibtex]
    @inproceedings{mccrae2014towards,
    url="http://ceur-ws.org/Vol-1215/paper-05.pdf",
    title={{Towards assured data quality and validation by data certification}},
    author="John P. McCrae and Cord Wiljes and Philipp Cimiano",
    booktitle="Proceedings of the 1st Workshop on Linked Data Quality",
    year="2014"
    }
  • J. P. McCrae and P. Cimiano, “Bielefeld SC: Orthonormal Topic Modelling for Grammar Induction,” in Proceedings of the 8th international workshop on semantic evaluation, 2014.
    [Bibtex]
    @inproceedings{mccrae2014bielefeld,
    url="http://alt.qcri.org/semeval2014/cdrom/pdf/SemEval016.pdf",
    title={{Bielefeld SC: Orthonormal Topic Modelling for Grammar Induction}},
    author="John P. McCrae and Philipp Cimiano",
    booktitle="Proceedings of the 8th International Workshop on Semantic Evaluation",
    year="2014"
    }
  • F. Quattri, A. Pease, and J. P. McCrae, “Default Physical Measurements in SUMO,” in Proceedings of 4th workshop on cognitive aspects of the lexicon, 2014.
    [Bibtex]
    @inproceedings{quattri2014default,
    url="http://www.aclweb.org/anthology/W/W14/W14-47.pdf#page=152",
    title={{Default Physical Measurements in SUMO}},
    author="Francesca Quattri and Adam Pease and John P. McCrae",
    booktitle="Proceedings of 4th Workshop on Cognitive Aspects of the Lexicon",
    year="2014"
    }
  • J. P. McCrae, C. Unger, F. Quattri, and P. Cimiano, “Modelling the Semantics of Adjectives in the Ontology-Lexicon Interface,” in Proceedings of 4th workshop on cognitive aspects of the lexicon, 2014.
    [Bibtex]
    @inproceedings{mccrae2014modelling,
    url="http://anthology.aclweb.org/W/W14/W14-47.pdf#page=212",
    title={{Modelling the Semantics of Adjectives in the Ontology-Lexicon Interface}},
    author="John P. McCrae and Christina Unger and Francesca Quattri and Philipp Cimiano",
    booktitle="Proceedings of 4th Workshop on Cognitive Aspects of the Lexicon",
    year="2014"
    }
  • J. P. McCrae, C. Fellbaum, and P. Cimiano, “Publishing and Linking WordNet using lemon and RDF,” in Proceedings of the 3rd workshop on linked data in linguistics, 2014.
    [Bibtex]
    @inproceedings{mccrae2014publishing,
    url="https://github.com/jmccrae/wn-rdf-paper/raw/master/rdf-wordnet.pdf",
    title={{Publishing and Linking WordNet using lemon and RDF}},
    author="John P. McCrae and Christiane Fellbaum and Philipp Cimiano",
    booktitle="Proceedings of the 3rd Workshop on Linked Data in Linguistics",
    year="2014"
    }
  • M. Ehrmann, F. Cecconi, D. Vannella, J. P. McCrae, P. Cimiano, and R. Navigli, “A Multilingual Semantic Network as Linked Data: lemon-BabelNet,” in Proceedings of the 3rd workshop on linked data in linguistics, 2014.
    [Bibtex]
    @inproceedings{ehrmann2014multilingual,
    title={{A Multilingual Semantic Network as Linked Data: lemon-BabelNet}},
    author="Maud Ehrmann and Francesco Cecconi and Daniele Vannella and John P. McCrae and Philipp Cimiano and Roberto Navigli",
    booktitle="Proceedings of the 3rd Workshop on Linked Data in Linguistics",
    year="2014"
    }
  • M. Ehrmann, F. Cecconi, D. Vannella, J. P. McCrae, P. Cimiano, and R. Navigli, “Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0,” in Proceedings of the 9th language resource and evaluation conference, 2014, pp. 401-408.
    [Bibtex]
    @inproceedings{ehrmann2014representing,
    title={{Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0}},
    author="Maud Ehrmann and Francesco Cecconi and Daniele Vannella and John P. McCrae and Philipp Cimiano and Roberto Navigli",
    booktitle="Proceedings of the 9th Language Resource and Evaluation Conference",
    pages="401-408",
    year="2014"
    }
  • D. Vila-Suero, V. Rodriguez-Doncel, A. Gómez-Pérez, P. Cimiano, J. P. McCrae, and G. Aguado-de-Cea, “3LD: Towards high quality, industry-ready Linguistic Linked Licensed Data,” in European data forum 2014, 2014.
    [Bibtex]
    @inproceedings{vilasuero20143ld,
    title={{3LD: Towards high quality, industry-ready Linguistic Linked Licensed Data}},
    author="Daniel Vila-Suero and Victor Rodriguez-Doncel and Asunción Gómez-Pérez and Philipp Cimiano and John P. McCrae and Guadalupe Aguado-de-Cea",
    booktitle="European Data Forum 2014",
    year="2014"
    }
  • [DOI] P. Cimiano, C. Unger, and J. McCrae, Ontology-based interpretation of natural language, Morgan & Claypool, 2014.
    [Bibtex]
    @book{cimiano2014ontology,
    url="http://www.morganclaypool.com/doi/abs/10.2200/S00561ED1V01Y201401HLT024",
    title={{Ontology-based interpretation of natural language}},
    author="Philipp Cimiano and Christina Unger and John McCrae",
    publisher="Morgan \& Claypool",
    year="2014",
    doi="10.2200/S00561ED1V01Y201401HLT024"
    }

2013

  • J. McCrae, P. Cimiano, and R. Klinger, “Orthonormal explicit topic analysis for cross-lingual document matching,” in Proceedings of the 2013 conference on empirical methods in natural language processing, 2013, pp. 1732-1742.
    [Bibtex]
    @inproceedings{mccrae2013orthonormal,
    url="https://www.aclweb.org/anthology/D/D13/D13-1179.pdf",
    title={{Orthonormal explicit topic analysis for cross-lingual document matching}},
    author="John McCrae and Philipp Cimiano and Roman Klinger",
    booktitle="Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    pages="1732-1742",
    year="2013"
    }
  • C. Unger, J. McCrae, S. Walter, S. Winter, and P. Cimiano, “A lemon lexicon for DBpedia,” in Proceedings of 1st international workshop on nlp and dbpedia, 2013.
    [Bibtex]
    @inproceedings{unger2013lemon,
    url="http://ceur-ws.org/Vol-1064/Unger_lemon.pdf",
    title={{A lemon lexicon for DBpedia}},
    author="Christina Unger and John McCrae and Sebastian Walter and Sara Winter and Philipp Cimiano",
    booktitle="Proceedings of 1st International Workshop on NLP and DBpedia",
    year="2013"
    }
  • E. Montiel-Ponsoda, J. McCrae, G. Aguado-de-Cea, and J. Gracia, “Multilingual variation in the context of linked data,” in Proceedings of the 10th international conference on terminology and artificial intelligence, 2013, pp. 19-26.
    [Bibtex]
    @inproceedings{montiel2013multilingual,
    url="https://lipn.univ-paris13.fr/tia2013/Proceedings/actesTIA2013.pdf",
    title={{Multilingual variation in the context of linked data}},
    author="Elena Montiel-Ponsoda and John McCrae and Guadalupe Aguado-de-Cea and Jorge Gracia",
    booktitle="Proceedings of the 10th International Conference on Terminology and Artificial Intelligence",
    pages="19-26",
    year="2013"
    }
  • J. P. McCrae and P. Cimiano, “Mining translations from the web of open linked data,” in Proceedings of the joint workshop on nlp&lod and swaie: semantic web, linked open data and information extraction, 2013, pp. 9-13.
    [Bibtex]
    @inproceedings{mccrae2013mining,
    url="http://www.aclweb.org/anthology/W/W13/W13-52.pdf#page=20",
    title={{Mining translations from the web of open linked data}},
    author="John P. McCrae and Philipp Cimiano",
    booktitle="Proceedings of the Joint Workshop on NLP\&LOD and SWAIE: Semantic Web, Linked Open Data and Information Extraction",
    pages="9-13",
    year="2013"
    }
  • P. Menke, J. P. McCrae, and P. Cimiano, “Releasing multimodal data as Linguistic Linked Open Data: An experience report,” in Proceedings of the 2nd workshop on linked data in linguistics, 2013, pp. 44-52.
    [Bibtex]
    @inproceedings{menke2013releasing,
    url="http://anthology.aclweb.org/W/W13/W13-5507.pdf",
    title={{Releasing multimodal data as Linguistic Linked Open Data: An experience report}},
    author="Peter Menke and John P. McCrae and Philipp Cimiano",
    booktitle="Proceedings of the 2nd Workshop on Linked Data in Linguistics",
    pages="44-52",
    year="2013"
    }
  • [DOI] C. Chiarcos, J. McCrae, P. Cimiano, and C. Fellbaum, “Towards open data for linguistics: Lexical Linked Data,” in New trends of research in ontologies and lexical resources, Springer, 2013, pp. 7-25.
    [Bibtex]
    @incollection{chiarcos2013towards,
    title={{Towards open data for linguistics: Lexical Linked Data}},
    author="Christian Chiarcos and John McCrae and Philipp Cimiano and Christiane Fellbaum",
    booktitle="New Trends of Research in Ontologies and Lexical Resources",
    pages="7-25",
    publisher="Springer",
    year="2013",
    doi="10.1007/978-3-642-31782-8_2"
    }
  • P. Cimiano, J. McCrae, P. Buitelaar, and E. Montiel-Ponsoda, “On the role of senses in the Ontology-Lexicon,” in New trends of research in ontologies and lexical resources, Springer, 2013, pp. 43-62.
    [Bibtex]
    @incollection{cimiano2013role,
    title={{On the role of senses in the Ontology-Lexicon}},
    author="Philipp Cimiano and John McCrae and Paul Buitelaar and Elena Montiel-Ponsoda",
    booktitle="New Trends of Research in Ontologies and Lexical Resources",
    pages="43-62",
    publisher="Springer",
    year="2013"
    }

2012

  • D. Spohr, P. Cimiano, J. McCrae, and S. O’Riain, “Using SPIN to formalize accounting regulation on the Semantic Web,” in First international workshop on finance and economics on the semantic web in conjunction with 9th extended semantic web conference, 2012, pp. 1-15.
    [Bibtex]
    @inproceedings{spohr2012using,
    url="http://nadir.uc3m.es/feosw2012/proceedings/FEOSWp1.pdf",
    title={{Using SPIN to formalize accounting regulation on the Semantic Web}},
    author="Dennis Spohr and Philipp Cimiano and John McCrae and Sean O'Riain",
    booktitle="First International Workshop on Finance and Economics on the Semantic Web in conjunction with 9th Extended Semantic Web Conference",
    pages="1-15",
    year="2012"
    }
  • J. McCrae, E. Montiel-Ponsoda, and P. Cimiano, “Collaborative semantic editing of linked data lexica,” in Proc. of the 2012 international conference on language resource and evaluation, 2012, pp. 2619-2625.
    [Bibtex]
    @inproceedings{mccrae2012collaborative,
    url="http://www.lrec-conf.org/proceedings/lrec2012/pdf/544_Paper.pdf",
    title={{Collaborative semantic editing of linked data lexica}},
    author="John McCrae and Elena Montiel-Ponsoda and Philipp Cimiano",
    booktitle="Proc. of the 2012 International Conference on Language Resource and Evaluation",
    pages="2619-2625",
    year="2012"
    }
  • J. McCrae and P. Cimiano, “Three steps for creating high quality ontology-lexica,” in Proc. of the workshop on collaborative resource development and delivery at the 2012 international conference on language resource and evaluation, 2012.
    [Bibtex]
    @inproceedings{mccrae2012three,
    title={{Three steps for creating high quality ontology-lexica}},
    author="John McCrae and Philipp Cimiano",
    booktitle="Proc. of the Workshop on Collaborative Resource Development and Delivery at the 2012 International Conference on Language Resource and Evaluation",
    year="2012"
    }
  • J. McCrae, P. Cimiano, and E. Montiel-Ponsoda, “Integrating WordNet and Wiktionary with lemon,” in Linked data and linguistics, C. Chiarcos, S. Nordhoff, and S. Hellmann, Eds., Springer, 2012, pp. 25-34.
    [Bibtex]
    @incollection{mccrae2013integrating,
    title={{Integrating WordNet and Wiktionary with lemon}},
    author="John McCrae and Philipp Cimiano and Elena Montiel-Ponsoda",
    booktitle="Linked Data and Linguistics",
    editor="Christian Chiarcos and Sebastian Nordhoff and Sebastian Hellmann",
    pages="25-34",
    publisher="Springer",
    year="2012"
    }
  • J. McCrae, G. Aguado-de-Cea, P. Buitelaar, P. Cimiano, T. Declerck, A. Gómez-Pérez, J. Gracia, L. Hollink, E. Montiel-Ponsoda, D. Spohr, and T. Wunner, “Interchanging lexical resources on the Semantic Web,” Language resources and evaluation, vol. 46, iss. 6, pp. 701-709, 2012.
    [Bibtex]
    @article{mccrae2012interchanging,
    title={{Interchanging lexical resources on the Semantic Web}},
    author="John McCrae and Guadalupe Aguado-de-Cea and Paul Buitelaar and Philipp Cimiano and Thierry Declerck and Asunción Gómez-Pérez and Jorge Gracia and Laura Hollink and Elena Montiel-Ponsoda and Dennis Spohr and Tobias Wunner",
    journal="Language Resources and Evaluation",
    volume="46",
    number="6",
    pages="701-709",
    year="2012"
    }

2011

  • [DOI] P. Cimiano, P. Buitelaar, J. McCrae, and M. Sintek, “LexInfo: A declarative model for the lexicon-ontology interface,” Web semantics: science, services and agents on the world wide web, vol. 9, iss. 1, pp. 29-51, 2011.
    [Bibtex]
    @article{cimiano2011lexinfo,
    title={{LexInfo: A declarative model for the lexicon-ontology interface}},
    author="Philipp Cimiano and Paul Buitelaar and John McCrae and Michael Sintek",
    journal="Web Semantics: Science, Services and Agents on the World Wide Web",
    volume="9",
    number="1",
    pages="29-51",
    year="2011",
    doi="10.1016/j.websem.2010.11.001"
    }
  • [DOI] J. Gracia, E. Montiel-Ponsoda, P. Cimiano, A. Gómez-Pérez, P. Buitelaar, and J. McCrae, “Challenges for the Multilingual Web of Data,” Web semantics: science, services and agents on the world wide web, iss. 11, pp. 63-71, 2011.
    [Bibtex]
    @article{gracia2011challenges,
    url="http://oa.upm.es/8848/1/Multiling.pdf",
    title={{Challenges for the Multilingual Web of Data}},
    author="Jorge Gracia and Elena Montiel-Ponsoda and Philipp Cimiano and Asunción Gómez-Pérez and Paul Buitelaar and John McCrae",
    journal="Web Semantics: Science, Services and Agents on the World Wide Web",
    number="11",
    pages="63-71",
    year="2011",
    doi="10.1016/j.websem.2011.09.001"
    }
  • J. McCrae, M. Espinoza, E. Montiel-Ponsoda, G. Aguado-de-Cea, and P. Cimiano, “Combining statistical and semantic approaches to the translation of ontologies and taxonomies,” in Fifth workshop on syntax, structure and semantics in statistical translation in conjunction with 49th annual meeting of the association for computational linguistics: human language technologies, 2011.
    [Bibtex]
    @inproceedings{mccrae2011combining,
    title={{Combining statistical and semantic approaches to the translation of ontologies and taxonomies}},
    author="John McCrae and Mauricio Espinoza and Elena Montiel-Ponsoda and Guadalupe Aguado-de-Cea and Philipp Cimiano",
    booktitle="Fifth Workshop on Syntax, Structure and Semantics in Statistical Translation in conjunction with 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies",
    year="2011"
    }
  • [DOI] J. McCrae, D. Spohr, and P. Cimiano, “Linking Lexical Resources and Ontologies on the Semantic Web with lemon,” in Proc. of the 8th extended semantic web conference, 2011, pp. 245-249.
    [Bibtex]
    @inproceedings{mccrae2011linking,
    title={{Linking Lexical Resources and Ontologies on the Semantic Web with lemon}},
    author="John McCrae and Dennis Spohr and Philipp Cimiano",
    booktitle="Proc. of the 8th Extended Semantic Web Conference",
    pages="245-249",
    year="2011",
    doi="10.1007/978-3-642-21034-1_17"
    }
  • P. Buitelaar, P. Cimiano, J. McCrae, E. Montiel-Ponsoda, and T. Declerck, “Ontology Lexicalization: The lemon perspective,” in Proc. of 9th international conference on terminology and artificial intelligence, 2011.
    [Bibtex]
    @inproceedings{buitelaar2011ontology,
    title={{Ontology Lexicalization: The lemon perspective}},
    author="Paul Buitelaar and Philipp Cimiano and John McCrae and Elena Montiel-Ponsoda and Thierry Declerck",
    booktitle="Proc. of 9th International Conference on Terminology and Artificial Intelligence",
    year="2011"
    }
  • E. Montiel-Ponsoda, G. Aguado-de-Cea, and J. McCrae, “Representing Term Variation in lemon,” in Proc. of 9th international conference on terminology and artificial intelligence, 2011.
    [Bibtex]
    @inproceedings{montiel2011representing,
    title={{Representing Term Variation in lemon}},
    author="Elena Montiel-Ponsoda and Guadalupe Aguado-de-Cea and John McCrae",
    booktitle="Proc. of 9th International Conference on Terminology and Artificial Intelligence",
    year="2011"
    }

2010

  • J. McCrae, J. R. Campaña, and P. Cimiano, “CLOVA: An architecture for cross-language semantic data querying,” in Proceedings of the 1st workshop on the multilingual semantic web, 2010, pp. 5-12.
    [Bibtex]
    @inproceedings{mccrae2010clova,
    url="http://ceur-ws.org/Vol-571/571-complete.pdf#page=10",
    title={{CLOVA: An architecture for cross-language semantic data querying}},
    author="John McCrae and Jesus R. Campaña and Philipp Cimiano",
    booktitle="Proceedings of the 1st Workshop on the Multilingual Semantic Web",
    pages="5-12",
    year="2010"
    }
  • [DOI] N. Collier, S. Doan, R. M. Goodwin, J. McCrae, M. Conway, M. Shigematsu, and A. Kawazoe, “Navigating the Information Storm: Web-based global health surveillance in BioCaster,” in Biosurveillance: methods and case studies, T. Kass-Hout and X. Zhang, Eds., CRC Press, 2010, pp. 291-312.
    [Bibtex]
    @incollection{collier2010navigating,
    url="https://www.crcpress.com/Biosurveillance-Methods-and-Case-Studies/KassHout-Zhang/9781439800461",
    title={{Navigating the Information Storm: Web-based global health surveillance in BioCaster}},
    author="Nigel Collier and Son Doan and Reiko Matsuda Goodwin and John McCrae and Mike Conway and Mika Shigematsu and Ai Kawazoe",
    booktitle="Biosurveillance: Methods and Case Studies",
    editor="Taha Kass-Hout and Xiaohui Zhang",
    pages="291-312",
    publisher="CRC Press",
    year="2010",
    doi="10.1007/978-3-319-59888-8_17"
    }
  • T. Declerck, H. Krieger, S. Thomas, P. Buitelaar, S. O’Riain, T. Wunner, G. Maguet, J. McCrae, D. Spohr, and E. Montiel-Ponsoda, “Ontology-based multilingual access to financial reports for sharing business knowledge across Europe,” in International financial control assessment applying multilingual ontology frameworks, HVG Press Kft., 2010, pp. 67-76.
    [Bibtex]
    @incollection{declerck2010ontology,
    title={{Ontology-based multilingual access to financial reports for sharing business knowledge across Europe}},
    author="Thierry Declerck and Hans-Ulrich Krieger and Susan-Marie Thomas and Paul Buitelaar and Sean O'Riain and Tobias Wunner and Gilles Maguet and John McCrae and Dennis Spohr and Elena Montiel-Ponsoda",
    booktitle="International Financial Control Assessment applying Multilingual Ontology Frameworks",
    pages="67-76",
    publisher="HVG Press Kft.",
    year="2010"
    }
  • N. Collier, R. M. Goodwin, J. McCrae, S. Doan, A. Kawazoe, M. Conway, A. Kawtrakul, K. Takeuchi, and D. Dien, “An ontology-driven system for detecting global health events,” in Proc. of the 23rd international conference on computational linguistics, 2010, pp. 215-222.
    [Bibtex]
    @inproceedings{collier2010ontology,
    title={{An ontology-driven system for detecting global health events}},
    author="Nigel Collier and Reiko Matsuda Goodwin and John McCrae and Son Doan and Ai Kawazoe and Mike Conway and Asanee Kawtrakul and K. Takeuchi and D. Dien",
    booktitle="Proc. of the 23rd International Conference on Computational Linguistics",
    pages="215-222",
    year="2010"
    }

2009

  • J. McCrae, “Automatic extraction of logically consistent ontologies from text corpora,” PhD Thesis, 2009.
    [Bibtex]
    @phdthesis{mccrae2009automatic,
    title={{Automatic extraction of logically consistent ontologies from text corpora}},
    author="John McCrae",
    school="Graduate University for Advanced Studies (SoKenDai)",
    year="2009"
    }
  • J. McCrae and N. Collier, “SRL Editor: A rule development tool for text mining,” in Proc. of workshop on semantic authoring, annotation and knowledge markup in conjunction with the 5th international conference on knowledge capture, 2009.
    [Bibtex]
    @inproceedings{mccrae2009srl,
    url="http://srl-editor.googlecode.com/files/saakm.pdf",
    title={{SRL Editor: A rule development tool for text mining}},
    author="John McCrae and Nigel Collier",
    booktitle="Proc. of Workshop on Semantic Authoring, Annotation and Knowledge Markup in conjunction with the 5th International Conference on Knowledge Capture",
    year="2009"
    }

2008

  • [DOI] J. McCrae and N. Collier, “Synonym set extraction from the biomedical literature by lexical pattern discovery,” BMC bioinformatics, vol. 9, iss. 159, 2008.
    [Bibtex]
    @article{mccrae2009synonym,
    url="http://www.biomedcentral.com/1471-2105/9/159",
    title={{Synonym set extraction from the biomedical literature by lexical pattern discovery}},
    author="John McCrae and Nigel Collier",
    journal="BMC Bioinformatics",
    volume="9",
    number="159",
    year="2008",
    doi="10.1186/1471-2105-9-159"
    }