In recent years, the use of social networks and blogs has risen sharply, and citizens and consumers now widely express their opinions on topics such as politics, society and media through these channels. However, the development of systems for sentiment analysis of these opinions is hampered by the difficulty of accessing and obtaining the necessary language resources, for several reasons:
- language resource owners' fear of losing competitiveness
- lack of agreed language resource schemas for sentiment analysis, and no normalised metrics for measuring sentiment strength
- high costs of adapting existing language resources for sentiment analysis
- reduced visibility, accessibility and interoperability of the language resources.
The project aims to develop a large shared data pool of language resources for use by sentiment analysis systems, bundling together resources that are currently scattered. One goal is to extend the WordNet Domain to sentiment analysis. The project will also specify a schema for sentiment analysis and normalise the metrics used for sentiment strength. The sharing of resources will be supported by a self-sustainable and profitable framework based on a community governance model, offering contributors the possibility of commercially exploiting the resources they provide.
The project is structured around the following steps:
- definition of a common schema to ensure interoperability;
- acquisition and clean up of language resources;
- deployment of the resources; and
- validation through opinion mining demonstrators in the hotel and electronics domains.
The target users are business-to-business (B2B) actors, including service developers, content providers and language resource (LR) owners.
The data pool will cover six languages: English, Catalan, German, Italian, Portuguese and Spanish.
For more information, please visit our website.