Zipf distribution related characteristics of punctuation marks in narrative texts

Andrzej Kulig 1, Stanisław Drożdż 1,2

1. Institute of Nuclear Physics, Polish Academy of Sciences (IFJPAN), Radzikowskiego 152, Kraków 31-342, Poland
2. Cracow University of Technology, Institute of Computing Science, Al. Jana Pawła II 37, Kraków 31-864, Poland

Abstract

Owing to their fundamental role in human life, natural languages, probably the most advanced examples of complex systems, are intensively studied by linguists, biologists, physicists, and computer scientists. In the course of their evolution, natural languages have developed remarkable, quantifiable patterns that have already been identified: hierarchical structure in their syntactic organization, a corresponding lack of characteristic scale as evidenced by Zipf's law, small-world properties, and long-range correlations in the use of words. Punctuation marks in narrative texts, apparently considered less relevant, have so far been treated somewhat marginally. In this contribution, we therefore compare these marks with ordinary words and investigate their frequencies and their role in word-adjacency networks. In particular, we show that punctuation marks, irrespective of the language studied, appear to obey Zipfian rank-frequency proportions in line with words.
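
As an illustration of the kind of analysis described in the abstract, the following minimal Python sketch treats punctuation marks as tokens on an equal footing with words, ranks all tokens by frequency, and counts adjacent token pairs as edges of a word-adjacency network. It is not the authors' implementation: the tokenization rule, the set of punctuation marks, and the function names (tokenize, rank_frequency, adjacency_edges) are assumptions made for this example only.

```python
import re
from collections import Counter

# Assumption: a token is either a run of letters (optionally with one
# internal apostrophe) or a single mark from a small, hand-picked set.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?|[.,;:!?]")


def tokenize(text):
    """Split text into lowercase tokens, keeping punctuation marks as tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def rank_frequency(tokens):
    """Return (rank, token, count) triples ordered from most to least frequent."""
    counts = Counter(tokens)
    return [
        (rank, token, count)
        for rank, (token, count) in enumerate(counts.most_common(), start=1)
    ]


def adjacency_edges(tokens):
    """Count directed edges between consecutive tokens (a word-adjacency network)."""
    return Counter(zip(tokens, tokens[1:]))


sample = "The cat sat, and the dog sat. The end."
tokens = tokenize(sample)
ranking = rank_frequency(tokens)
edges = adjacency_edges(tokens)
```

On a full-length narrative text, plotting count against rank on doubly logarithmic axes would show whether the punctuation marks fall on the same approximately straight line as the words, which is the behaviour the abstract reports.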

 

Presentation: Oral at 8 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych", by Andrzej Kulig
See On-line Journal of 8 Ogólnopolskie Sympozjum "Fizyka w Ekonomii i Naukach Społecznych"

Submitted: 2015-09-15 12:28
Revised:   2015-09-15 17:30