Perspectives and approaches to determine measures of similarity for musical performances using data analysis algorithms


Gustavo Adolfo Colmenares Pacheco
Zenaida Natividad Castillo Marrero

Abstract

The automatic characterization of a musical composition, and of its performance, is an active line of research, owing both to its importance and to the technological resources and computational tools now available for detecting voice and sound. Recognizing a performer while listening to a composition is simple for a human but far less so for a machine, and the problem has therefore kept a community of researchers searching for measures that can compare and accurately recognize the characteristics of a composition and of its performer. A general measure has yet to be found, although a variety of statistical and computational techniques have been proposed that deserve to be evaluated, and perhaps combined, to strengthen research on this topic. This work is the product of a comprehensive literature review that collects the main techniques and tools used and proposed by researchers over the last two decades. It will be of help to researchers who decide to undertake studies, evaluations, and implementations of these tools, as well as to those who wish to work on the automatic recognition of performers and of the characteristics of musical compositions, or on music information retrieval more broadly.
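Among the techniques collected in this review, dynamic time warping (DTW) is the classic measure for aligning and comparing two performances of the same piece. The fragment below is a minimal sketch of the idea in Python, assuming hypothetical per-note duration curves for two performers; a real system would extract such features (e.g., inter-onset intervals or loudness values) from the recordings themselves.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cumulative DTW alignment cost between two 1-D feature sequences."""
    n, m = len(a), len(b)
    # cost[i, j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])

# Hypothetical per-note durations (seconds) for the same passage played
# by two performers: similar overall shape, different local rubato.
performer_a = np.array([0.50, 0.52, 0.48, 0.70, 0.49, 0.51])
performer_b = np.array([0.48, 0.50, 0.51, 0.49, 0.72, 0.50, 0.47])

print(dtw_distance(performer_a, performer_b))
```

Because DTW tolerates local stretching and compression of time, it assigns a low cost to performances that share phrasing but differ in rubato; its quadratic cost table, however, makes it practical only for sequences of moderate length, which is one reason the literature also explores statistical alternatives.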


Article Details

How to Cite
Colmenares Pacheco, G. A., & Castillo Marrero, Z. N. (2020). Perspectives and approaches to determine measures of similarity for musical performances using data analysis algorithms. ConcienciaDigital, 3(3.1), 75-87. https://doi.org/10.33262/concienciadigital.v3i3.1.1366
Section
Articles
