A study of event characteristics based on the EventRegistry.org database

Krzysztof Dzienisiuk 

Warsaw University of Technology, Koszykowa 75, Warsaw 00-662, Poland

Abstract

EventRegistry.org is a public database, updated in real time, of news collected from many information services worldwide. The engine of EventRegistry is a large analysis system based on natural language processing and machine-learning methods. Two concepts, 'events' and 'articles', are strictly defined within the system and are crucial to understanding it as a whole. Events are clusters of articles in a multidimensional space whose dimensions include the language, title, entities and keywords of the articles. We analyse EventRegistry from a sociophysics perspective by examining characteristic indices such as the size and lifespan distributions of events and the distribution of articles within events. We also study the correlations between these indices and the influence of keywords on the evolution of an event. The main goal of this research is to find out how news events emerge, evolve and vanish.
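
As an illustration of the indices mentioned above, the following minimal sketch computes the size (number of articles) and lifespan (time between the first and last article) of each event, and then the size distribution. The record layout of `(event_id, timestamp)` pairs, the function names and the sample data are assumptions made for this example only; they are not the EventRegistry data format or API, and the definitions of size and lifespan used in the study may differ.

```python
from collections import defaultdict
from datetime import datetime


def event_indices(articles):
    """Return {event_id: (size, lifespan_hours)}.

    `articles` is an iterable of (event_id, iso_timestamp) pairs.
    This layout is hypothetical, not the EventRegistry format.
    """
    times = defaultdict(list)
    for event_id, stamp in articles:
        times[event_id].append(datetime.fromisoformat(stamp))

    indices = {}
    for event_id, stamps in times.items():
        size = len(stamps)  # number of articles in the event
        # Lifespan: time from the first to the last article, in hours.
        lifespan_hours = (max(stamps) - min(stamps)).total_seconds() / 3600.0
        indices[event_id] = (size, lifespan_hours)
    return indices


def size_distribution(indices):
    """Count how many events have each size."""
    counts = defaultdict(int)
    for size, _lifespan in indices.values():
        counts[size] += 1
    return dict(sorted(counts.items()))


# Made-up sample records, for illustration only.
sample = [
    ("e1", "2015-09-01T08:00:00"),
    ("e1", "2015-09-01T14:00:00"),
    ("e1", "2015-09-02T08:00:00"),
    ("e2", "2015-09-03T10:00:00"),
    ("e3", "2015-09-04T09:00:00"),
    ("e3", "2015-09-04T12:00:00"),
]
indices = event_indices(sample)
distribution = size_distribution(indices)
```

The same per-event pairs could be used to estimate the correlation between size and lifespan, one of the relations the abstract refers to.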

 

Presentation: Poster at 8 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych", by Krzysztof Dzienisiuk
See On-line Journal of 8 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych"

Submitted: 2015-09-15 15:22
Revised:   2015-09-15 15:22