  Statistical and network-based analysis of English and Polish literary texts.

Jarosław Kwapień 1, Stanisław Drożdż 1,2, Adam Orczyk 1

1. Polish Academy of Sciences, Institute of Nuclear Physics (IFJ PAN), Radzikowskiego 152, Kraków 31-342, Poland
2. University of Rzeszów, Institute of Physics, Department of Complex Systems, Rejtana 16, Rzeszów 35-310, Poland

Abstract

We analyze the statistical properties of selected English and Polish texts. We investigate the scaling relations of the rank-frequency distributions of words. From this point of view, we compare corpora of texts written by a single author with corpora of texts written by different authors, and corpora of native texts with corpora of translated texts. Moreover, we transform texts into numerical series and apply a few standard methods of time series analysis to look for statistical dependencies in the texts, including fractal properties. We also construct different network representations of the texts and study their topology.
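
As a rough illustration of the three steps named in the abstract, the following Python sketch computes a rank-frequency distribution, maps a text to a numerical series, and builds a simple network representation. It is not the authors' procedure. The tokenizer, the least-squares estimate of the scaling exponent, the choice of word length as the numerical mapping, and the word-adjacency network are all assumptions made for the example, and every function name is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and extract words (Unicode letters, optional apostrophe)."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())


def rank_frequency(words):
    """Return (rank, frequency) pairs, with rank 1 for the most frequent word."""
    freqs = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(freqs, start=1))


def scaling_exponent(rank_freq):
    """Estimate alpha in f(r) ~ r**(-alpha) by least squares on the log-log data.

    Assumption: a plain log-log fit is used here only for simplicity.
    """
    xs = [math.log(r) for r, _ in rank_freq]
    ys = [math.log(f) for _, f in rank_freq]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def word_length_series(words):
    """Map the text to a numerical series (assumed mapping: length of each word)."""
    return [len(w) for w in words]


def adjacency_network(words):
    """Undirected network linking words that appear next to each other (assumed)."""
    neighbours = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    return dict(neighbours)


sample = "the cat saw the dog and the dog saw the cat"
words = tokenize(sample)
ranks = rank_frequency(words)
network = adjacency_network(words)
degrees = {word: len(linked) for word, linked in network.items()}
```

On a real corpus, `scaling_exponent(rank_frequency(words))` gives a single number that can be compared between corpora, the output of `word_length_series` can be passed to any time series method, and `degrees` is a starting point for studying the topology of the network.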

 

Presentation: Oral at 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych", by Jarosław Kwapień
See On-line Journal of 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych"

Submitted: 2009-03-12 17:08
Revised:   2009-06-07 00:48