Publications of Leigh M. Smith
This is a list of my recent publications on music cognition and the representation of musical rhythm.
Proceedings of the Tenth International Conference on Music Perception and Cognition, pages 360-365, Sapporo, Japan, 2008
We describe a computational model of rhythmic cognition that predicts expected onset times. A dynamic representation of musical rhythm, a multiresolution analysis using the continuous wavelet transform, is used. This representation decomposes the temporal structure of a musical rhythm into time-varying frequency components in the rhythmic frequency range (sample rate of 200 Hz). Both expressive timing and temporal structure (score times) contribute in an integrated fashion to determine the temporal expectancies. Future expected times are computed using peaks in the accumulation of time-frequency ridges. This accumulation at the edge of the analysed time window forms a dynamic expectancy. We evaluate this model using data sets of expressively timed (performed) and generated musical rhythms, by its ability to produce expectancy profiles which correspond to metrical profiles. The results show that rhythms in two different meters can be distinguished. Such a representation indicates that a bottom-up, data-oriented process (a non-cognitive model) is able to reveal durations which match the metrical structure of realistic musical examples. This then helps to clarify the role of schematic (top-down) expectancy and its contribution to the formation of musical expectation.
Proceedings of the International Conference on Music Communication Science, 4 pages, Sydney, 2007
A dynamic representation of musical rhythm, a multiresolution analysis using the continuous wavelet transform (CWT), is evaluated using a dataset of the interonset intervals of 105 national anthem rhythms. This representation decomposes the temporal structure of a musical rhythm into time-varying frequency components in the rhythmic frequency range (sample rate of 200 Hz). Evidence is presented that the beat (typically the quarter-note or crotchet) and the bar (measure) durations of each rhythm are revealed by this transform. Such evidence suggests that the pattern of time intervals, when analysed with the CWT, functions as a set of features used in the process of forming a metrical interpretation. Since the CWT is an invertible transform of the interonset intervals in each rhythm, this result is interpreted as setting a minimum discrimination capability that any perceptual model of beat or meter should achieve. It indicates that a bottom-up, data-oriented process (a non-cognitive model) is able to reveal durations which match metrical structure in realistic musical examples. This then characterises the data and behaviour of any top-down cognitive model that must interact with the bottom-up process.
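The analysis these two abstracts describe can be sketched in a few lines: build a bank of Morlet wavelets spanning the rhythmic frequency range and correlate each with the sampled rhythm signal, giving a magnitude scalogram. This is a minimal illustration, not the papers' implementation; the function names, the choice of omega0, and the normalisation are my own assumptions.

```python
import numpy as np

def morlet(t, freq, omega0=6.0):
    """Complex Morlet wavelet whose carrier oscillates at `freq` Hz.
    (omega0 and the 1/sqrt(scale) normalisation are conventional choices.)"""
    scale = omega0 / (2 * np.pi * freq)
    u = t / scale
    return np.exp(1j * omega0 * u) * np.exp(-u ** 2 / 2) / np.sqrt(scale)

def cwt_scalogram(signal, sr, freqs):
    """Continuous wavelet transform magnitude of `signal` (sampled at `sr` Hz)
    at each analysis frequency in `freqs`: one row per frequency."""
    n = len(signal)
    t = (np.arange(n) - n // 2) / sr  # wavelet support centred on zero
    mag = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        w = morlet(t, f)
        # correlate the signal with the wavelet (matched filtering)
        mag[i] = np.abs(np.convolve(signal, np.conj(w[::-1]), mode="same"))
    return mag
```

Analysing an isochronous rhythm with an onset every 0.5 s should concentrate energy near the 2 Hz row, the kind of beat duration the transform is described as revealing.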
Proceedings of the 2006 International Computer Music Conference, New Orleans, pages 688-91
What makes a rhythm interesting, or even exciting, to listeners? While a wide range of definitions of syncopation exists in the literature, few allow for a precise formalization. An exception is Longuet-Higgins and Lee (1984), which proposes a formal definition of syncopation. Interestingly, this model has never been challenged or empirically validated. In this paper the predictions made by this model, along with alternative definitions of metric salience, are compared to existing empirical data consisting of listener ratings of rhythmic complexity. While the two are correlated, noticeable outliers suggest that processes in addition to syncopation contribute to listeners' judgements of complexity.
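As a concrete reading of the Longuet-Higgins and Lee definition discussed here, the sketch below scores a cyclic binary rhythm on a duple metrical grid: a syncopation occurs when a silent position is metrically stronger than the preceding sounding note, and its value is the weight difference. This is one common simplification of the 1984 model (which operates on a metrical tree with rests and ties); the weight scheme and function names are my own assumptions.

```python
def duple_weights(levels):
    """Metrical-salience weights for a duple meter subdivided `levels` times:
    0 at the downbeat, -d for positions first introduced at depth d."""
    w = [0]
    for d in range(1, levels + 1):
        w = [x for pos in w for x in (pos, -d)]
    return w

def lhl_syncopation(onsets, weights):
    """Score a cyclic binary rhythm: for each onset, if the strongest silent
    position before the next onset outweighs it, add the weight difference."""
    n = len(onsets)
    total = 0
    for i in range(n):
        if not onsets[i]:
            continue
        j = (i + 1) % n
        rests = []
        while not onsets[j] and j != i:
            rests.append(weights[j])
            j = (j + 1) % n
        if rests and max(rests) > weights[i]:
            total += max(rests) - weights[i]
    return total
```

An on-beat pattern such as [1, 0, 1, 0] scores zero, while the off-beat [0, 1, 0, 0] scores positively, matching the intuition that syncopation arises from accented weak positions followed by silent strong ones.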
UWA PhD Thesis, 191 pages, October 2000, Department of Computer Science, University of Western Australia
Computational approaches to music have considerable problems in representing musical time, in particular in representing structure over time spans longer than short motives. The new approach investigated here is to represent rhythm in terms of the frequencies of events, explicitly representing multiple time scales as spectral components of a rhythmic signal.
Approaches to multiresolution analysis are then reviewed. In comparison to Fourier theory, the theory behind wavelet transform analysis is described. Wavelet analysis can be used to decompose a time-dependent signal onto basis functions which represent time-frequency components. The use of Morlet and Grossmann's wavelets produces the best simultaneous localisation in both time and frequency domains. These have the property of making explicit all characteristic frequency changes over time inherent in the signal.
An approach to considering and representing a musical rhythm in signal-processing terms is then presented. This casts a musician's performance in terms of a conceived rhythmic signal. The rhythm actually performed is then a sampling of that complex signal, which listeners can reconstruct using temporal predictive strategies, aided by familiarity with the music or musical style gained through enculturation. The rhythmic signal is viewed in terms of amplitude and frequency modulation, which can characterise the forms of accent used by a musician.
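The sampling of a conceived rhythmic signal can be pictured as a weighted impulse train on the 200 Hz grid used elsewhere in these papers, each impulse carrying an onset's intensity. A hypothetical helper, for illustration only:

```python
import numpy as np

def rhythm_signal(onset_times, amplitudes, sr=200, duration=None):
    """Render performed onset times (in seconds) as a weighted impulse train
    sampled at `sr` Hz; each amplitude carries that onset's intensity accent."""
    if duration is None:
        duration = max(onset_times)
    n = int(round(duration * sr)) + 1  # grid long enough to hold the last onset
    sig = np.zeros(n)
    for t, a in zip(onset_times, amplitudes):
        sig[int(round(t * sr))] = a
    return sig
```

The resulting signal is what a rhythmic frequency analysis (such as the wavelet decomposition described in the following chapters) would take as input.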
Once the rhythm is recast as a signal, the application of wavelets to analysing example rhythms is then reported. Example rhythms exhibiting duration, agogic and intensity accents, accelerando and rallentando, rubato and grouping are analysed with Morlet wavelets. Wavelet analysis reveals the short-term periodic components that arise within these rhythms. The use of Morlet wavelets produces a "pure" theoretical decomposition. The degree to which this can be related to a human listener's perception of temporal levels is then considered.
The multiresolution analysis results are then applied to the well-known problem of foot-tapping to a performed rhythm. Using a correlation of frequency modulation ridges extracted using stationary phase, modulus maxima, dilation scale derivatives and local phase congruency, the tactus rate of the performed rhythm is identified, and from that, a new foot-tap rhythm is synthesised. This approach accounts for expressive timing and is demonstrated on rhythms exhibiting asymmetrical rubato and grouping. The accuracy of this approach is presented and assessed.
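A drastically simplified stand-in for the ridge extraction and foot-tap synthesis described above: take the analysis frequency with the greatest time-averaged scalogram magnitude as the tactus, then place taps where the unwrapped phase at that ridge crosses successive multiples of 2*pi. The thesis combines stationary phase, modulus maxima, scale derivatives and phase congruency; everything below is an illustrative assumption, not that method.

```python
import numpy as np

def tactus_ridge(scalogram_mag):
    """Row index of the analysis frequency with greatest average magnitude:
    a crude proxy for the ridge identified as the tactus."""
    return int(np.argmax(scalogram_mag.mean(axis=1)))

def foot_taps(ridge_phase, sr):
    """Tap times (seconds) where the unwrapped ridge phase crosses successive
    multiples of 2*pi, i.e. the start of each tactus cycle."""
    ph = np.unwrap(ridge_phase)
    taps = []
    k = np.floor(ph[0] / (2 * np.pi)) + 1  # next cycle boundary to cross
    for i in range(1, len(ph)):
        while ph[i] >= 2 * np.pi * k:
            taps.append(i / sr)
            k += 1
    return taps
```

Because taps are placed from the ridge's phase rather than from a fixed grid, they follow expressive tempo changes, which is the point of the phase-based synthesis the thesis describes.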
From these investigations, I argue the value of representing rhythm in terms of time-frequency components. This makes explicit the notion of temporal levels (strata) and allows analytical tools such as wavelets to produce formal measures of performed rhythms which match concepts from musicology and music cognition. This approach then forms the basis for further research into cognitive models of rhythm based on interpretation of the time-frequency components.
Proceedings of the 2000 International Computer Music Conference, Berlin, pages 503-506
This paper describes the new implementation and port of the NeXT MusicKit, together with a clone of the NeXT SoundKit (the SndKit), on a number of different platforms, old and new. It then outlines some of the strengths and uses of the kits, and demonstrates several applications which have made the transition from NeXTSTEP to Mac OS X and WebObjects/NT.
Proceedings of the Fourth International Conference on Music Perception and Cognition, Montreal 1996, pages 197-202
Existing theories of musical rhythm have argued for a conceptualization of a temporal hierarchy of rhythmic strata. This paper describes a computational approach to representing the formation of rhythmic strata. The use of Gabor transform wavelets (as described by Morlet and co-workers) is demonstrated as an analysis technique capable of explicating elements of rhythm cognition. Transforms over a continuous time-frequency plane (the scalogram) spanning rhythmic frequencies (0.1 to 100 Hz) capture the multiple periodicities implied by beats at different temporal relationships. Gabor wavelets have the property of preserving the phase of the frequency components of the analyzed signal. The use of phase information provides a new approach to the analysis of rhythm. Measures of phase congruence over a range of frequencies are shown to be useful to highlight transient rhythms and temporal accents. The performance of the wavelet transform is demonstrated on an example of generated rhythms.
Proceedings of the 1996 International Computer Music Conference, Hong Kong, pages 392-5
The use of linear phase Gabor transform wavelets is demonstrated as a robust analysis technique capable of making explicit many elements of human rhythm perception behaviour. Transforms over a continuous time-frequency plane (the scalogram) spanning rhythmic frequencies (0.1 to 100 Hz) capture the multiple periodicities implied by beats at different temporal relationships. Wavelets represent well the transient nature of these rhythmic frequencies in performed music, in particular those implied by agogic accent, and at longer time-scales, by rubato.
The use of the scalogram phase information provides a new approach to the analysis of rhythm. Measures of phase congruence over a range of frequencies are shown to be useful in highlighting transient rhythms and temporal accents. The performance of the wavelet transform is demonstrated on examples of performed monophonic percussive rhythms possessing intensity accents and rubato. The transform results indicate the location of such accents and from these, the inducement of phrase structures.
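The phase congruence measure these two papers rely on can be stated compactly: at each instant, compare the magnitude of the sum of the complex wavelet coefficients across scales with the sum of their magnitudes. The ratio is 1 where all scales agree in phase (marking an event or accent) and near 0 where the phases cancel. A minimal sketch of that standard ratio, assuming the coefficients arrive as a scales-by-time complex array; the exact measure used in the papers may differ:

```python
import numpy as np

def phase_congruency(coeffs, eps=1e-12):
    """Local phase congruency over scales at each time point.
    coeffs: complex array of shape (n_scales, n_times)."""
    energy = np.abs(coeffs.sum(axis=0))       # coherent sum across scales
    total = np.abs(coeffs).sum(axis=0) + eps  # incoherent (magnitude) sum
    return energy / total
```

Peaks of this measure mark moments where many rhythmic frequencies align, which is how it highlights transient rhythms and temporal accents.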