Statistical and network-based analysis of English and Polish literary texts.
Jarosław Kwapień 1, Stanisław Drożdż 1,2, Adam Orczyk 1
1. Polish Academy of Sciences, Institute of Nuclear Physics (IFJ PAN), Radzikowskiego 152, Kraków 31-342, Poland
We analyze the statistical properties of selected English and Polish texts. First, we investigate the scaling relations of the rank-frequency distributions of words. From this point of view, we compare corpora of texts written by a single author with corpora of texts written by different authors, and corpora of native texts with corpora of translated texts. Second, we transform texts into numerical series and apply several standard methods of time series analysis to look for statistical dependencies in texts, including fractal properties. Finally, we construct different network representations of the texts and study their topology.
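As a rough illustration of the three representations mentioned in the abstract, the following Python sketch builds a rank-frequency table, a numerical series, and a network from a tokenized text. The abstract does not say how texts are mapped to numbers or which network representations are used, so the choices here (word lengths as the series, links between adjacent words as the network) and all function names are assumptions made for illustration only, not the authors' method.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep runs of letters; the pattern is Unicode-aware,
    # so Polish diacritics are preserved.
    return re.findall(r"[^\W\d_]+", text.lower())


def rank_frequency(words):
    # Return (rank, word, count) triples, most frequent word first.
    # Ties are broken alphabetically so the ordering is deterministic.
    counts = Counter(words)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, w, c) for rank, (w, c) in enumerate(ordered, start=1)]


def zipf_exponent(table):
    # Least-squares slope of log(count) versus log(rank).
    # Needs at least two distinct words; a slope near -1 is the
    # classical Zipf scaling.
    xs = [math.log(r) for r, _, _ in table]
    ys = [math.log(c) for _, _, c in table]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def word_length_series(words):
    # Assumed mapping of a text to a series of numbers: one word length
    # per word, in reading order.
    return [len(w) for w in words]


def adjacency_network(words):
    # Assumed network representation: an undirected graph that links two
    # distinct words whenever they occur next to each other in the text.
    neighbours = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


words = tokenize("The cat and the dog and the bird")
table = rank_frequency(words)
slope = zipf_exponent(table)
series = word_length_series(words)
network = adjacency_network(words)
```

On a real corpus the slope would normally be fitted over a chosen range of ranks rather than the whole table, and the series would be passed to a dedicated fractal-analysis method; both steps are left out of this sketch.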
Presentation: Oral at 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych", by Jarosław Kwapień
See On-line Journal of 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych"
Submitted: 2009-03-12 17:08 Revised: 2009-06-07 00:48