Conference
Authors
Cecilia Baggio; Rocio Cecchini; Ana Gabriela Maguitman; Evangelos Milios
Date
2019
Publisher and Place of Publication
ACM Press
ISBN
978-1-4503-6887-2
Abstract
Information provided by the agent in SIGEVA
Genetic Programming techniques have demonstrated great potential for the problem of query generation. This work explores different Multi-Objective Genetic Programming strategies for evolving a collection of topic-based Boolean queries. It compares three approaches to building topical Boolean queries: a term-based approach, a concept-based approach that incorporates Wikipedia semantics (Wikipedia concepts), and a hybrid approach that combines terms and concepts. In addition, different fitness functions are combined, giving rise to seven multi-objective schemes. In particular, we investigate the use of the proposed strategies in conjunction with novel fitness functions aimed at attaining high diversity, based on the information-theoretic notion of entropy and on Jaccard similarity. Experiments were conducted on 25 topics from a dataset of approximately 350,000 webpages classified into 448 topics. The results reveal that the use of Wikipedia concepts does not yield statistically significant improvements in precision, global recall, or diversity compared to the term-based approaches. However, the use of concepts has a positive effect on query interpretability, since the use of terms leads to artificial queries that are hard for humans to interpret. Moreover, concept-based queries contain fewer operands than term-based ones, resulting in better execution times without a loss in retrieval performance.
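The abstract mentions diversity-oriented fitness functions built on entropy and Jaccard similarity but does not give their definitions. As an illustration only, the following minimal sketch shows one plausible way such scores could be computed over the result sets retrieved by a population of queries. The function names (`jaccard_diversity`, `entropy_diversity`) and the exact formulations are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def jaccard_similarity(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two sets (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def jaccard_diversity(result_sets):
    """Assumed formulation: mean pairwise Jaccard distance between result sets.

    Higher values mean the queries retrieve less overlapping documents.
    """
    pairs = list(combinations(result_sets, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - jaccard_similarity(a, b) for a, b in pairs) / len(pairs)


def entropy_diversity(result_sets):
    """Assumed formulation: Shannon entropy (in bits) of the distribution of
    retrieved documents, counting each document once per query that returns it.
    """
    counts = Counter(doc for docs in result_sets for doc in set(docs))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical result sets for three queries (document IDs are made up).
results = [{"d1", "d2"}, {"d2", "d3"}, {"d4"}]
print(jaccard_diversity(results))
print(entropy_diversity(results))
```

In a multi-objective setting, scores like these would typically be optimized alongside precision and recall rather than folded into a single weighted sum, so that the evolved query collection covers a topic from several angles.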
Keywords
TOPICAL RECOMMENDATION; SIMILARITY MEASURES; RECALL MAXIMIZATION; QUERY OPTIMIZATION; WIKIFICATION