The style of a text is an important factor that determines its quality, its legibility and its identity.
The idea of generating new texts in the style of an existing author has been popular since the invention of Markov processes, which have been shown to capture, at least to a first approximation, elements of the style of a corpus. Markov processes faithfully represent local properties of sequences, at varying orders, which makes them well suited to such a task.
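To make the notion of "local properties" concrete, here is a minimal sketch of an order-k word-level Markov generator in Python. It is an illustration only, not the tool described on this page; the function names `train_markov` and `generate` and the toy corpus are assumptions introduced for the example.

```python
import random
from collections import defaultdict

def train_markov(words, order=1):
    """Map each context (a tuple of `order` words, order >= 1) to the words that followed it."""
    transitions = defaultdict(list)
    for i in range(len(words) - order):
        context = tuple(words[i:i + order])
        transitions[context].append(words[i + order])
    return dict(transitions)

def generate(transitions, start, length, rng=None):
    """Extend `start` (a tuple of words) up to `length` words by sampling observed followers."""
    rng = rng or random.Random()
    order = len(start)
    out = list(start)
    while len(out) < length:
        followers = transitions.get(tuple(out[-order:]))
        if not followers:  # dead end: this context never had a continuation
            break
        # Repeated followers are kept in the list, so choice() follows corpus frequencies.
        out.append(rng.choice(followers))
    return out

# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the cat ran".split()
model = train_markov(corpus, order=1)
print(" ".join(generate(model, ("the",), 6, random.Random(0))))
```

Each word depends only on the previous `order` words, which is why such a model reproduces local phrasing but has no way to relate, say, the ends of two different lines.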
However, a text is not just a Markovian sequence, and, notwithstanding the issue of meaning, it has been shown that texts also exhibit statistical long-range correlations.
For instance, poems or song lyrics often have rhyme or metrical constraints, which are not always defined as local properties.
These constraints induce long-range dependencies that violate the short-term memory hypothesis of Markov processes and demand long-distance modeling.
As a consequence, most approaches to the automatic generation of stylistically imitative texts based on Markov models fail to capture higher-level properties of texts, which limits their use for practical applications, such as machine translation or automatic summarization.
The recently proposed framework of constrained Markov processes (CMP), introduced here, addresses precisely the issue of imposing constraints on a Markov process, in order to control it without the need for other tools such as a second, high-order Markov process. CMP is a technique for generating structured Markov sequences by reformulating Markov processes as constraint satisfaction problems.
Lyrics generation can be formulated and solved in the CMP framework by expressing style as a set of Markov constraints, and properties of grammaticality, rhyme, meter and even, to some extent, semantics as constraints in that framework.
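The idea of steering a Markov process with constraints can be sketched in a simplified setting: a first-order chain with unary constraints only (a set of allowed words at chosen positions, for example words that rhyme at the end of a line). The sketch below is an assumption-laden illustration, not the CMP implementation: a backward pass computes, for each position and word, the probability mass of continuations that satisfy all later constraints, and a forward pass samples from the transition probabilities reweighted by that mass. The names `transition_probs` and `sample_constrained`, the suffix-based rhyme test, and the toy corpus are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def transition_probs(words):
    """First-order transition probabilities P(next | current) estimated from a word list."""
    counts = defaultdict(Counter)
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()} for a, c in counts.items()}

def sample_constrained(probs, vocab, start, length, allowed, rng=None):
    """Sample `length` words beginning with `start`; allowed[i] (optional) is the
    set of words permitted at position i. Returns None if no sequence satisfies them."""
    rng = rng or random.Random()

    def ok(i, w):
        return i not in allowed or w in allowed[i]

    # Backward pass: beta[i][w] is the probability that a chain sitting on w at
    # position i satisfies every constraint from position i to the end.
    beta = [dict() for _ in range(length)]
    for i in reversed(range(length)):
        for w in vocab:
            if not ok(i, w):
                beta[i][w] = 0.0
            elif i == length - 1:
                beta[i][w] = 1.0
            else:
                beta[i][w] = sum(p * beta[i + 1][v] for v, p in probs.get(w, {}).items())
    if beta[0].get(start, 0.0) == 0.0:
        return None  # the constraints cannot be met from this start word

    # Forward pass: reweight each transition by the mass of valid continuations.
    seq = [start]
    for i in range(1, length):
        nxt = probs[seq[-1]]
        candidates = [v for v in nxt if beta[i][v] > 0]
        weights = [nxt[v] * beta[i][v] for v in candidates]
        seq.append(rng.choices(candidates, weights=weights)[0])
    return seq

# Hypothetical toy corpus; "rhyme" is approximated by a shared "ight" suffix.
words = "the night is bright and the light is white and the night is long".split()
probs = transition_probs(words)
rhymes = {w for w in words if w.endswith("ight")}
line = sample_constrained(probs, set(words), "the", 4, {3: rhymes}, random.Random(0))
print(" ".join(line))  # the fourth word always ends in "ight"
```

Under these assumptions every transition in the output was observed in the corpus, so the local style is preserved, while the constraint on the last position is guaranteed rather than obtained by trial and error. Constraints that relate several positions to each other, such as two line endings that must rhyme with one another, need more machinery than this unary sketch provides.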
The Style Game
Play the Style Game to test the lyrics generation tool.
PEREC in action
A video illustrating how PEREC works.
Browse generated lyrics in more than 60 authors' styles.