Statistical and network-based analysis of English and Polish literary texts.

Jarosław Kwapień 1, Stanisław Drożdż 1,2, Adam Orczyk 1

1. Polish Academy of Sciences, Institute of Nuclear Physics (IFJ PAN), Radzikowskiego 152, Kraków 31-342, Poland
2. University of Rzeszów, Institute of Physics, Department of Complex Systems, Rejtana 16, Rzeszów 35-310, Poland

Abstract

We analyze the statistical properties of selected English and Polish texts. First, we investigate the scaling relations of the rank-frequency distributions of words. In this respect, we compare corpora of texts written by a single author with corpora of texts written by different authors, and corpora of native texts with corpora of translated texts. Moreover, we transform the texts into numerical series and apply a few standard methods of time series analysis to look for statistical dependencies in the texts, including fractal properties. We also construct different network representations of the texts and study their topology.
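
As a rough illustration of the three kinds of analysis named in the abstract, a minimal Python sketch follows. It is not the authors' implementation. The abstract does not specify the tokenization, the mapping from text to a numerical series, or the network construction, so the choices here (letter-only tokens, word length as the series value, an undirected word-adjacency graph, and a plain least-squares fit on log-log axes) are assumptions, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: a word is a maximal run of Unicode letters, lowercased.
    # This keeps Polish diacritics and drops digits and punctuation.
    return re.findall(r"[^\W\d_]+", text.lower())


def rank_frequency(words):
    # Word counts sorted in descending order; position i holds the
    # frequency of the word with rank i + 1.
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(freqs):
    # Least-squares slope of log(frequency) against log(rank), with the
    # sign flipped so that f(r) ~ r**(-alpha) gives a positive alpha.
    # Needs at least two ranks. A crude estimator, used here only as a sketch.
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx


def word_length_series(words):
    # Assumption: one possible text-to-series mapping is the length of
    # each consecutive word.
    return [len(w) for w in words]


def adjacency_network(words):
    # Assumption: nodes are distinct words; an undirected edge links two
    # different words that occur next to each other in the text.
    graph = {}
    for a, b in zip(words, words[1:]):
        if a != b:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


words = tokenize("The cat and the dog and the bird")
freqs = rank_frequency(words)
series = word_length_series(words)
graph = adjacency_network(words)
```

The resulting series could then be passed to a time series method of one's choice, and node degrees in `graph` give a first look at the network topology; which methods and network representations the authors actually used is not stated in the abstract.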

 

Presentation: Oral at 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych", by Jarosław Kwapień
See On-line Journal of 4 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych"

Submitted: 2009-03-12 17:08
Revised:   2009-06-07 00:48