DART 2015 Abstracts


Full Papers
Paper Nr: 1
Title:

Posgram Driven Word Prediction

Authors:

Carmelo Spiccia, Agnese Augello and Giovanni Pilato

Abstract: Several word prediction algorithms have been described in the literature for automatic sentence completion from a finite set of candidate words. However, to the best of our knowledge, very little or no work has been done on reducing the cardinality of this set. To address this issue, we first use posgrams to predict the part of speech of the missing word. Candidate words are then restricted to those matching the predicted part of speech. We show how this additional step can improve the processing speed and the accuracy of word predictors. Experimental results are provided for the Italian language.
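
Illustrative sketch (not taken from the paper): the two-step idea of predicting the part of speech of the gap and then pruning the candidate set can be approximated with POS trigram counts. The tag names, the toy corpus, the lexicon and all function names below are assumptions made for the example; the authors' actual posgram model may differ.

```python
from collections import Counter

def train_posgrams(tagged_sentences):
    """Count POS trigrams (posgrams of order 3) in a tagged corpus."""
    counts = Counter()
    for sentence in tagged_sentences:
        tags = ["<s>"] + [tag for _, tag in sentence] + ["</s>"]
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    return counts

def predict_pos(counts, left_tag, right_tag):
    """Return the tag seen most often between left_tag and right_tag."""
    middle = Counter()
    for (a, b, c), n in counts.items():
        if a == left_tag and c == right_tag:
            middle[b] += n
    return middle.most_common(1)[0][0] if middle else None

def restrict_candidates(candidates, lexicon, pos):
    """Keep only the candidates whose lexicon entry allows the predicted tag."""
    if pos is None:
        return list(candidates)  # no evidence: leave the set untouched
    return [w for w in candidates if pos in lexicon.get(w, set())]

# Hypothetical toy data (Italian words with coarse tags).
corpus = [
    [("il", "DET"), ("gatto", "NOUN"), ("dorme", "VERB")],
    [("la", "DET"), ("casa", "NOUN"), ("brucia", "VERB")],
    [("il", "DET"), ("vecchio", "ADJ"), ("dorme", "VERB")],
]
lexicon = {"gatto": {"NOUN"}, "corre": {"VERB"}, "rosso": {"ADJ"}}

counts = train_posgrams(corpus)
predicted = predict_pos(counts, "DET", "VERB")  # gap in "il ___ dorme"
shortlist = restrict_candidates(["gatto", "corre", "rosso"], lexicon, predicted)
```

Any word predictor then only has to rank the words in `shortlist`, which is where the speed gain claimed in the abstract would come from.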

Paper Nr: 2
Title:

Data-driven Relation Discovery from Unstructured Texts

Authors:

Marilena Ditta, Fabrizio Milazzo, Valentina Ravì, Giovanni Pilato and Agnese Augello

Abstract: This work proposes a data-driven methodology for the extraction of subject-verb-object triplets from a text corpus. Previous work in the field solved the problem by means of complex learning algorithms requiring hand-crafted examples; our proposal completely avoids learning triplets from a dataset and is built on top of a well-known baseline algorithm designed by Delia Rusu et al. The baseline algorithm uses only syntactic information for generating triplets and is characterized by very low precision, i.e., very few triplets are meaningful. Our idea is to integrate the semantics of the words in order to filter out the wrong triplets, thus increasing the overall precision of the system. The algorithm has been tested on the Reuters Corpus and has shown good performance with respect to the baseline algorithm for triplet extraction.

Paper Nr: 3
Title:

Ontology Matching using Multiple Similarity Measures

Authors:

Thi Thuy Anh Nguyen and Stefan Conrad

Abstract: This paper presents an automatic ontology matching approach called LSSOM (Lexical Structural Semantic-based Ontology Matching), which produces a final alignment by combining three different kinds of similarity measures: lexical-based, structure-based, and semantic-based. It uses the information available in the ontologies, including names, labels, comments, relations and positions of concepts in the hierarchy, and integrates the WordNet dictionary. First, the two ontologies are matched sequentially using the lexical-based and structure-based similarity measures to find structural correspondences among the concepts. Second, a semantic similarity based on WordNet is applied to the concepts of the given ontologies. Once the semantic and structural similarities are obtained, they are combined in a parallel phase using a weighted sum to yield the final similarities. Our system is implemented and evaluated on the OAEI 2008 benchmark dataset. The experimental results show that our approach obtains good F-measure values and outperforms other automatic ontology matching systems that do not use instance information.
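
Illustrative sketch (not taken from the paper): the final combination step, a weighted sum of the structural and semantic similarity of each concept pair, might look as follows. The weight of 0.5, the acceptance threshold of 0.7, the concept names and the scores are all assumptions chosen for the example; the abstract does not state the values used by LSSOM.

```python
def combine_similarities(structural, semantic, weight=0.5):
    """Weighted sum of two similarity tables keyed by (concept_a, concept_b).

    A pair missing from one table is treated as having similarity 0.0 there.
    """
    combined = {}
    for pair in set(structural) | set(semantic):
        combined[pair] = (weight * structural.get(pair, 0.0)
                          + (1.0 - weight) * semantic.get(pair, 0.0))
    return combined

def extract_alignment(combined, threshold=0.7):
    """Keep the concept pairs whose combined similarity reaches the threshold."""
    return sorted(pair for pair, score in combined.items() if score >= threshold)

# Hypothetical similarity scores for concepts of two small ontologies.
structural = {("Book", "Volume"): 0.6, ("Author", "Writer"): 0.9}
semantic = {("Book", "Volume"): 0.9, ("Author", "Writer"): 0.8,
            ("Book", "Writer"): 0.2}

combined = combine_similarities(structural, semantic)
alignment = extract_alignment(combined)
```

Raising `weight` favours the structural evidence and lowering it favours the WordNet-based evidence.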

Paper Nr: 4
Title:

Moving Beyond the Twitter Follow Graph

Authors:

Giambattista Amati, Simone Angelini, Marco Bianchi, Gianmarco Fusco, Giorgio Gambosi, Giancarlo Gaudino, Giuseppe Marcone, Gianluca Rossi and Paola Vocca

Abstract: The study of the topological properties of graphs derived from social network platforms is of great importance from both the social and the information point of view; furthermore, it has a big impact on the design of new applications and on the improvement of existing services. Surprisingly, the research community seems to have focused its efforts mainly on studying the most intuitive and explicit graphs, such as the follower graph of the Twitter platform or the Facebook friends graph; consequently, a lot of valuable information is still hidden, waiting to be explored and exploited. In this paper we introduce a new type of graph that models the behavior of Twitter users: the mention graph. We then show how to easily build instances of this graph starting from the Twitter stream, and we report the results of an experiment that compares the proposed graph with other graphs already analyzed in the literature, using some standard social network analysis metrics.
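
Illustrative sketch (not taken from the paper): one plausible way to build a mention graph from a stream of tweets is to add a directed edge from the author to every user mentioned in the text. The input format, the edge weighting by mention count, the removal of self-mentions and the username pattern (up to 15 word characters) are assumptions made for the example.

```python
import re
from collections import Counter

# "@name" not preceded by a word character, so e-mail addresses are skipped.
MENTION = re.compile(r"(?<!\w)@(\w{1,15})")

def build_mention_graph(tweets):
    """Build weighted directed edges (author, mentioned_user) -> mention count.

    `tweets` is an iterable of (author, text) pairs.
    """
    edges = Counter()
    for author, text in tweets:
        source = author.lower()
        for user in MENTION.findall(text):
            target = user.lower()
            if target != source:  # ignore self-mentions
                edges[(source, target)] += 1
    return edges

def in_degree(edges):
    """Number of distinct users mentioning each user."""
    degree = Counter()
    for _, target in edges:
        degree[target] += 1
    return degree

# Hypothetical sample of the stream.
tweets = [
    ("alice", "hi @bob and @carol"),
    ("bob", "@alice thanks"),
    ("carol", "@bob @bob write to a@b.com"),
]
graph = build_mention_graph(tweets)
popularity = in_degree(graph)
```

Standard metrics such as degree distributions can then be computed on `graph` and compared with those of the follower graph.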

Paper Nr: 5
Title:

The Predictor Impact of Web Search Media on Bitcoin Trading Volumes

Authors:

Martina Matta, Ilaria Lunesu and Michele Marchesi

Abstract: In the last decade, Web 2.0 services have been widely used as communication media. Due to the huge amount of available information, searching has become the dominant use of the Internet. Millions of users interact with search engines daily, producing valuable data on several aspects of the world. Search queries prove to be a useful source of information in financial applications, where the frequency of searches for terms related to a digital currency can be a good measure of interest in it. Bitcoin, a decentralized electronic currency, represents a radical change in financial systems, attracting a large number of users and a lot of media attention. In this work we study the relationship between Bitcoin trading volumes and the query volumes of the Google search engine. We obtain significant cross-correlation values, demonstrating the power of search volumes to anticipate the trading volumes of the Bitcoin currency.
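
Illustrative sketch (not taken from the paper): a lagged cross-correlation between a search-volume series and a trading-volume series can be computed as the Pearson correlation of the two series after shifting one against the other. The series below are invented toy numbers, and the function names and lag range are assumptions; the abstract does not describe the authors' exact procedure.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def lagged_correlation(search, volume, lag):
    """Correlate search[t] with volume[t + lag]; lag > 0 means search leads."""
    if lag > 0:
        return pearson(search[:-lag], volume[lag:])
    if lag < 0:
        return pearson(search[-lag:], volume[:lag])
    return pearson(search, volume)

# Toy data: trading volume follows search volume with a one-step delay.
search = [1, 3, 2, 5, 4, 6, 8, 7]
volume = [0, 10, 30, 20, 50, 40, 60, 80]

correlations = {lag: lagged_correlation(search, volume, lag) for lag in range(4)}
best_lag = max(correlations, key=correlations.get)
```

A peak at a positive lag, as in this toy series, is the pattern that would indicate search volumes anticipating trading volumes.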

Paper Nr: 6
Title:

SERP-level Disambiguation from Search Results

Authors:

Mostafa Alli

Abstract: The fast growth in the popularity of search engines shows how strongly users are attracted to Web search. However, ambiguous queries may be misinterpreted. To address this, we propose a more adherent user interface that consists of relevant visual content together with a newly generated search snippet and title. Recent research with this aim has focused on whole-page thumbnails that help users remember a recently visited Web page. However, it has not yet been discussed how specific visual content of a page can allow users to distinguish between a useful and a worthless page in the result page, especially in an ambiguous search task. Our study, which consists of two parts, shows that the improvements to both the textual search snippet and the title, as well as the additional thumbnail, helped users to disambiguate the Search Engine Result Page (SERP) in an ambiguous search task.

Paper Nr: 7
Title:

Author's Paper Similarity Prediction based on the Similarity of Textual References to Visual Features

Authors:

Mostafa Alli

Abstract: In this paper we introduce a mechanism to find similar papers by an author, based on the author's previous publications. Since the authors of a paper are likely to publish further work similar to it, we use this intuition to seek related papers based on the visual similarity of those papers. By visual features we mean the figures and/or tables that authors commonly use to describe the structure of their method and/or the results of their experiments. Since similar works by the same authors focus on solving similar problems and on developing and improving similar techniques, we noticed that comparing these visual features across their publications helps to spot the most similar papers of those authors. We call our method Similarity of Textual References to Visual Features, meaning that we compare the parts of the content of any two papers that refer to figures and/or tables. In our experiment we show how this similarity can be used together with other factors of a paper to form a Boolean function that helps to build an index of papers based on the number of their authors. In this way, we avoid time-consuming content-based analyses of papers, such as textual content analysis or the construction of co-author and citation networks. In addition, our Boolean function has an adjustable level of sensitivity: to achieve higher accuracy in finding similar papers, more conditions need to be enabled in the Boolean function.