P. Kostakos, "Strings and Things: A Semantic Search Engine for news quotes using Named Entity Recognition," 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2020, pp. 835-839, doi: 10.1109/ASONAM49781.2020.9381383
Strings and things : a semantic search engine for news quotes using named entity recognition
Center for Ubiquitous Computing, University of Oulu, Oulu, Finland
|Online Access:||PDF Full Text (PDF, 0.8 MB)|
|Persistent link:|| http://urn.fi/urn:nbn:fi-fe2021051229665
IEEE Computer Society
|Publish Date:|| 2021-05-12
Emerging methods for content delivery, such as quote searching and entity searching, enable users to quickly identify novel and relevant information in unstructured texts, news articles, and media sources. These methods have widespread applications in web surveillance and crime informatics, and can help improve intention disambiguation, character evaluation, threat analysis, and bias detection. Furthermore, quote-based and entity-based searching is an empowering information retrieval tool that enables non-technical users to gauge the quality of public discourse, allowing for more fine-grained analysis of core sociological questions. The paper presents a prototype search engine that allows users to search a news database of quotes using a combination of strings and things. The ingestion pipeline, which forms the backend of the service, comprises the following modules: (i) a crawler that ingests data from the GDELT Global Quotation Graph, (ii) a named entity recognition (NER) filter that labels data on the fly, (iii) an indexing mechanism that serves the data to an Elasticsearch cluster, and (iv) a user interface that allows users to formulate queries. The paper presents the high-level configuration of the pipeline and reports basic metrics and aggregations.
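The pipeline stages named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The record shape (`quote`, `url`), the index name `quotes`, the field names `entities` and `entity_labels`, and the small gazetteer that stands in for a real NER model are all assumptions made for illustration. The sketch only builds the request bodies (an Elasticsearch bulk payload and a bool query) and does not contact a cluster.

```python
import json
import re

# Hypothetical gazetteer standing in for a real NER model; the abstract does
# not say which NER component the prototype uses.
GAZETTEER = {
    "European Commission": "ORG",
    "University of Oulu": "ORG",
    "Finland": "GPE",
}


def label_entities(text):
    """Return sorted (entity, label) pairs found in text by exact phrase match."""
    found = []
    for phrase, label in GAZETTEER.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.append((phrase, label))
    return sorted(found)


def to_document(record):
    """Flatten one quote record (assumed shape) into an indexable document."""
    entities = label_entities(record["quote"])
    return {
        "quote": record["quote"],
        "url": record["url"],
        "entities": [name for name, _ in entities],
        "entity_labels": [label for _, label in entities],
    }


def bulk_payload(docs, index_name="quotes"):
    """Serialize documents as NDJSON in the Elasticsearch bulk format:
    one action line followed by one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


def strings_and_things_query(text, entity):
    """Combine a full-text match (the string) with an exact entity filter
    (the thing). Assumes `entities` is mapped as a keyword field."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"quote": text}}],
                "filter": [{"term": {"entities": entity}}],
            }
        }
    }


if __name__ == "__main__":
    # Invented sample record, not taken from the GDELT data.
    sample = {
        "quote": "The European Commission will fund research in Finland",
        "url": "https://example.org/article",
    }
    print(bulk_payload([to_document(sample)]), end="")
    print(json.dumps(strings_and_things_query("fund research", "Finland")))
```

Putting the entity constraint in the `filter` clause keeps it out of relevance scoring, so the free-text part of the query alone determines ranking. This is one possible design and is not necessarily the one used in the prototype.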
International Conference on Advances in Social Network Analysis and Mining
|Pages:||835 - 839|
12th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2020
|Host publication editor:||
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
|Type of Publication:|| A4 Article in conference proceedings
|Field of Science:|| 113 Computer and information sciences
Partly funded by European Commission grants CUTLER (770469) and PRINCE (815362), and by Academy of Finland 6Genesis Flagship (318927).
|EU Grant Number:|| (770469) CUTLER - Coastal Urban developmenT through the LEnses of Resiliency
|Academy of Finland Grant Number:|| 318927 (Academy of Finland Funding decision)
© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.