Project Title: Quantifying Degrees of Randomness in Word Rhythms of Literary Works

 

Advisor:   John Konvalina

 

Description:  When we read a poem, a novel, or a scientific article, we sense a qualitative difference that is independent of the meaning of the words. Each literary work seems to have its own “rhythm” based on the sequence of words used. Poems tend to be more rhythmic than novels, and novels tend to be more rhythmic than technical scientific articles. What factors shape the rhythm of the words, and can we quantify the observed qualitative differences? Our goal is to isolate some of the factors that determine the rhythm of a work based solely on the structure of its word sequences. The formal measure we will use is word length, that is, the number of characters in each word. Each literary work will then be interpreted as a time series of word lengths. We will apply traditional linear methods, such as Fourier analysis, as well as nonlinear methods, such as techniques from chaos theory and estimates of fractal dimension, to determine the degrees of randomness in the literary works.
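
As an illustration of the word-length time series and the Fourier step described above, here is a minimal sketch. It is written in Python with the standard library only, purely for illustration; the project itself specifies Matlab. The function names word_lengths and power_spectrum, the rule that a word is a run of ASCII letters, and the sample sentence are all assumptions made for this sketch, not part of the project description.

```python
import cmath
import re


def word_lengths(text):
    """Return the number of characters in each word, in reading order."""
    # Assumption: a "word" is a maximal run of ASCII letters, so
    # punctuation and digits are ignored and act as separators.
    return [len(w) for w in re.findall(r"[A-Za-z]+", text)]


def power_spectrum(series):
    """Naive O(n^2) discrete Fourier transform of the mean-removed series.

    Returns |X_k|^2 / n for frequency indices k = 0 .. n // 2.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


# Sample sentence chosen only for illustration.
lengths = word_lengths("The quick brown fox jumps over the lazy dog")
print(lengths)
print(power_spectrum(lengths))
```

A strongly periodic series of word lengths concentrates its power in a few frequency indices, while an irregular series spreads power across many. For real texts a fast Fourier transform routine would replace the naive loop used here.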

 

 

The student will perform the following tasks:

 

A. Become familiar with the research literature on applying linear and nonlinear mathematical methods to computational linguistics and text processing.

 

 

B. Become familiar with Matlab in order to write and run several programs related to time series analysis and nonlinear modeling.

 

C. Analyze and interpret the resulting time series data from various literary works, including poems, novels, and scientific articles. Apply the various entropy measures available in Matlab to determine the degrees of randomness in the literary works.

 

D. Create a final research report to be presented at the MAM Symposium.
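
Task C refers to entropy measures. As one simple example of such a measure, the sketch below computes the Shannon entropy of the distribution of word lengths. It is written in Python for illustration only and does not reproduce any particular Matlab routine; the function name shannon_entropy and the two short series are hypothetical and were invented for this example.

```python
import math
from collections import Counter


def shannon_entropy(series):
    """Shannon entropy, in bits, of the empirical distribution of values."""
    if not series:
        raise ValueError("series must not be empty")
    n = len(series)
    counts = Counter(series)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical word-length series, invented for illustration only.
regular = [2, 4, 2, 4, 2, 4, 2, 4]  # two equally frequent lengths
varied = [1, 2, 3, 4, 5, 6, 7, 8]   # eight equally frequent lengths

print(shannon_entropy(regular))  # 1.0
print(shannon_entropy(varied))   # 3.0
```

This measure depends only on how often each word length occurs, not on the order of the words. Capturing rhythm would likely also require order-sensitive measures, such as approximate entropy or sample entropy.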

 

OTHER REQUIREMENTS: Students interested in this project are expected to have completed a course in time series and a course in chaos theory and fractals. Experience with a computer algebra system, such as Maple or Matlab, is essential.

 

NOTE: The results of this research will form the core of a research paper that will later be submitted for publication to a suitable journal.