ISMIR 2002
[Abstract21]We propose a model for errors in
sung queries, a variant of the Hidden Markov Model (HMM). This is related to
the problem of identifying the degree of similarity between a query and
a potential target in a database of musical works, in the music
retrieval framework. The model comprehensively expresses the types of error or
variation between target and query: cumulative and non-cumulative local errors,
transposition, tempo and tempo changes, insertions, deletions and modulation.
Results of experiments demonstrating the robustness of such a model are
presented.
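The scoring step such a model implies can be illustrated with the standard forward algorithm. The sketch below is not the authors' model: the two error states, the transition probabilities and the semitone-deviation alphabet are invented for illustration; only the general idea of evaluating P(query | target HMM) follows the abstract.

```python
import math

def forward_log_likelihood(obs, states, start_p, trans_p, emit_p):
    """Return log P(obs | HMM) via the standard forward algorithm."""
    # alpha[s] = probability of the observed prefix ending in state s
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][o]
                 for s in states}
    return math.log(sum(alpha.values()))

# Toy model: two states ("on pitch" vs "locally off pitch"); observations
# are the query's pitch deviations from the target, quantized in semitones.
states = ("correct", "error")
start_p = {"correct": 0.8, "error": 0.2}
trans_p = {"correct": {"correct": 0.9, "error": 0.1},
           "error":   {"correct": 0.7, "error": 0.3}}  # errors tend not to accumulate
emit_p = {"correct": {0: 0.8, 1: 0.1, -1: 0.1},
          "error":   {0: 0.2, 1: 0.4, -1: 0.4}}

query_errors = [0, 0, 1, 0, -1, 0]  # deviations of a sung query from a target
print(forward_log_likelihood(query_errors, states, start_p, trans_p, emit_p))
```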
[Abstract22]In this paper, a new system for the
automatic transcription of singing sequences into a sequence of pitch and
duration pairs is presented. Although such a system may have a wider range of
applications, it was mainly developed to become the acoustic module of a
query-by-humming (QBH) system for retrieving pieces of music from a digitized
musical library. The first part of the paper is devoted to the systematic
evaluation of a variety of state-of-the-art transcription systems. The main
result of this evaluation is that there is clearly a need for more accurate
systems. In particular, the segmentation was found to be too error prone
(≈ 20% segmentation errors). In the second part of the paper, a new auditory
model based transcription system is proposed and evaluated. The results of that
evaluation are very promising. Segmentation errors vary between 0 and 7%,
depending on the amount of lyrics used by the singer. In any case, an error rate
of less than 10% is anticipated to be acceptable for QBH. The paper ends with
the description of an experimental study that was conducted to demonstrate that
the accuracy of the newly proposed transcription system is not very sensitive
to the choice of the free parameters, at least as long as they remain in the
vicinity of the values one could predict on the basis of their meaning.
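As a rough illustration of the output format such a transcription system produces, a minimal sketch follows. It assumes a per-frame pitch track is already available (the paper's auditory-model front end is not reproduced) and simply groups frames into (MIDI pitch, duration) pairs; the 10 ms hop and half-semitone tolerance are arbitrary choices.

```python
import math

def hz_to_midi(f):
    return 69.0 + 12.0 * math.log2(f / 440.0)

def segment(f0_track, hop_s=0.01, tol_semitones=0.5):
    """f0_track: per-frame f0 in Hz (0.0 = unvoiced). Returns (pitch, dur) pairs."""
    notes, cur = [], []          # cur collects MIDI values of the running note
    for f in f0_track + [0.0]:   # trailing sentinel flushes the last note
        m = hz_to_midi(f) if f > 0 else None
        if m is not None and cur and abs(m - cur[0]) <= tol_semitones:
            cur.append(m)        # same note continues
        else:
            if cur:              # close the running note
                notes.append((round(sum(cur) / len(cur)), len(cur) * hop_s))
            cur = [m] if m is not None else []
    return notes

# Toy pitch track: A4 for 5 frames, 2 unvoiced frames, C5 for 4 frames.
track = [440.0] * 5 + [0.0] * 2 + [523.25] * 4
print(segment(track))   # -> [(69, 0.05), (72, 0.04)]
```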
[Abstract23]A hidden Markov model approach to
piano music transcription is presented. The main difficulty in applying
traditional HMM techniques is the large number of chord hypotheses that must be
considered. We address this problem by using a trained likelihood model to
generate reasonable hypotheses for each frame and construct the search graph
out of these hypotheses. Results are presented using a recording of a movement
from Mozart's Sonata 18, K. 570.
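The pruning idea can be sketched as follows. Everything concrete here is assumed: the acoustic log-likelihoods are hard-coded stand-ins for a trained likelihood model, and the shared-notes transition prior is invented; only the structure (a per-frame shortlist of chord hypotheses searched with Viterbi) mirrors the abstract.

```python
import math

def viterbi_over_hypotheses(frame_hyps, transition_logp):
    """frame_hyps: list (one per frame) of {chord: acoustic_logp} dicts."""
    prev = dict(frame_hyps[0])                     # log score per hypothesis
    back = [{c: None for c in frame_hyps[0]}]      # backpointers per frame
    for hyps in frame_hyps[1:]:
        cur, ptr = {}, {}
        for c, emit in hyps.items():
            best = max(prev, key=lambda p: prev[p] + transition_logp(p, c))
            cur[c] = prev[best] + transition_logp(best, c) + emit
            ptr[c] = best
        prev, back = cur, back + [ptr]
    path = [max(prev, key=prev.get)]               # backtrace from best end state
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    return list(reversed(path))

def transition_logp(a, b):
    # Toy smoothness prior: chords sharing more notes are likelier to follow.
    return math.log(0.1 + len(set(a) & set(b)))

# Each frame keeps only its top hypotheses (chords as tuples of MIDI pitches).
frames = [
    {(60, 64, 67): -1.0, (60, 65, 69): -2.5},
    {(60, 64, 67): -1.2, (62, 65, 69): -2.0},
    {(62, 65, 69): -0.9, (60, 64, 67): -3.0},
]
print(viterbi_over_hypotheses(frames, transition_logp))
```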
[Abstract24]Voice separation, along with
tempo detection and quantisation, is one of the basic problems of
computer-based transcription of music. An adequate separation of notes into
different voices is crucial for obtaining readable and usable scores from
performances of polyphonic music recorded on keyboard (or other polyphonic)
instruments; for improving quantisation results within a transcription system;
and in the context of music retrieval systems that primarily support monophonic
queries. In this paper we propose a new voice separation algorithm based on a
stochastic local search method. Unlike many previous approaches, our
algorithm allows chords in the individual voices; its behaviour is controlled by
a small number of intuitive and musically motivated parameters; and it is fast
enough to allow interactive optimisation of the result by adjusting the
parameters in real-time. We demonstrate that compared to existing approaches,
our new algorithm generates better solutions for a number of typical voice
separation problems. We also show how, by changing its parameters, it is possible
to create score output for different needs (e.g. piano-style or orchestral
scores).
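A minimal sketch of voice separation as stochastic local search is given below. The cost terms and weights are stand-ins for the paper's musically motivated parameters, and unlike the paper's algorithm this toy penalises simultaneous notes within a voice rather than supporting chords.

```python
import random

def cost(notes, assign, n_voices, w_leap=1.0, w_overlap=10.0):
    """Notes are (onset, offset, pitch); assign[i] is note i's voice id."""
    total = 0.0
    for v in range(n_voices):
        idx = sorted((i for i in range(len(notes)) if assign[i] == v),
                     key=lambda i: notes[i][0])
        for a, b in zip(idx, idx[1:]):
            total += w_leap * abs(notes[b][2] - notes[a][2])   # melodic leap
            if notes[b][0] < notes[a][1]:                      # temporal overlap
                total += w_overlap
    return total

def separate(notes, n_voices=2, iters=5000, seed=0):
    rng = random.Random(seed)
    assign = [rng.randrange(n_voices) for _ in notes]
    best = cost(notes, assign, n_voices)
    for _ in range(iters):
        i, v = rng.randrange(len(notes)), rng.randrange(n_voices)
        old, assign[i] = assign[i], v          # random single-note move
        c = cost(notes, assign, n_voices)
        if c <= best:
            best = c                           # keep non-worsening moves
        else:
            assign[i] = old                    # reject worsening moves
    return assign

notes = [(0, 1, 72), (0, 1, 60), (1, 2, 74), (1, 2, 62)]  # two parallel lines
print(separate(notes))   # e.g. [0, 1, 0, 1]: high and low lines separated
```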
[Abstract25]Music Information Retrieval
methods can be classified into online and offline methods. The main drawback of
most offline algorithms is the space the indexing structure requires. The
amount of data stored in the structure can, however, be reduced by storing
only suitable index terms or phrases instead of the whole contents of the
database. Repetition is widely agreed to be one of the most important factors
of musical meaningfulness; repetitive phrases are therefore suitable for
indexing purposes. The extraction of such phrases can be done by applying an
existing text mining method to musical data. Because of the differences between
text and musical data, the application requires some technical modification of
the method. This paper introduces a text-mining-based music database indexing
method that extracts maximal frequent phrases from musical data and sorts them
by their length, frequency and personality. The implementation of the method
found three different types of phrases in a test corpus consisting of Irish
folk music tunes. The two suitable types of phrases out of the three are easily
recognized and separated from the set of all phrases to form index data for
the database.
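The phrase-extraction idea can be sketched with brute force on a toy corpus. The mining below is a stand-in for the adapted text-mining method, and the paper's "personality" measure is omitted; phrases are interval substrings counted once per tune, filtered for maximality and sorted by length and frequency.

```python
from collections import Counter

def frequent_phrases(tunes, min_len=3, min_count=2):
    counts = Counter()
    for seq in tunes:
        seen = set()                            # count each phrase once per tune
        for n in range(min_len, len(seq) + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        counts.update(seen)
    frequent = {p for p, c in counts.items() if c >= min_count}

    def contained(p, q):                        # p strictly inside q?
        return len(p) < len(q) and any(q[i:i + len(p)] == p
                                       for i in range(len(q) - len(p) + 1))

    # A phrase is maximal if no longer frequent phrase contains it.
    maximal = [p for p in frequent
               if not any(contained(p, q) for q in frequent if q != p)]
    return sorted(maximal, key=lambda p: (-len(p), -counts[p]))

# Tunes as pitch-interval sequences (semitones between consecutive notes).
tunes = [
    [2, 2, 1, 2, 2, 2, 1],
    [2, 2, 1, 2, 0, 2, 1],
    [5, 2, 2, 1, 2, 2, 2],
]
print(frequent_phrases(tunes))   # the shared scale fragment emerges as an index term
```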
[Abstract26]In order to represent musical
content, pitch and timing information is utilized in the majority of existing
work in Symbolic Music Information Retrieval (MIR). Symbolic representations
such as
[Abstract27]The main contribution of this paper
is an investigation of the effects of exploiting melodic features for automatic
melody segmentation aimed at content-based music retrieval. We argue that
segmentation based on melodic features is more effective than random or
n-gram-based segmentation, which ignores any context. We have carried out an
experiment employing experienced subjects. The manual segmentation results have
been processed to detect the most probable boundaries in the melodic surface,
using a probabilistic decision function. The detected boundaries have then been
compared with the boundaries detected by an automatic procedure implementing an
algorithm for melody segmentation, as well as by a random segmenter and an
n-gram-based segmenter. Results showed that automatic segmentation based on
melodic features is closer to manual segmentation than algorithms that do not
use such information.
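The evaluation step lends itself to a small sketch: derive "most probable" boundaries from the subjects' marks with a simple vote-share decision function (an assumption; the paper's probabilistic function is not specified here) and score a segmenter against them with an F-measure, ignoring any matching tolerance.

```python
from collections import Counter

def probable_boundaries(annotations, n_subjects, threshold=0.5):
    """annotations: list of boundary-position sets, one per subject."""
    votes = Counter(pos for marks in annotations for pos in marks)
    return {pos for pos, v in votes.items() if v / n_subjects >= threshold}

def f_measure(detected, reference):
    tp = len(detected & reference)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

subjects = [{4, 8, 12}, {4, 8}, {4, 12}, {4, 8, 13}]    # boundary note indices
manual = probable_boundaries(subjects, n_subjects=4)     # -> {4, 8, 12}
print(f_measure(detected={4, 8, 11}, reference=manual))  # score one segmenter
```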
[Abstract28]Hidden Markov Models (HMMs) have
been suggested as an effective technique to represent music. Given a collection
of musical pieces, each represented by its HMM, and a query, the retrieval
task reduces to finding the HMM most likely to have generated the query. The
musical piece represented by this HMM is frequently the one rendered by the
user, possibly imperfectly. This method can be inefficient for a very
large music database, since each HMM to be tested requires the evaluation of a
dynamic-programming algorithm. In this paper, we propose an indexing mechanism
that can aggressively prune the set of candidate HMMs to be evaluated in
response to a query. Our experiments on a music database showed an average
seven-fold speed-up with no false dismissals.
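One way such pruning can work with no false dismissals is to compare a cheap upper bound on each HMM's likelihood against the best exact score found so far; the bound below (product of per-symbol best-case emissions) is an illustration, not the paper's index.

```python
import math

def upper_bound_logp(emit_p, obs):
    """Upper bound on log P(obs | HMM): sum of per-symbol best emissions."""
    return sum(math.log(max(row[o] for row in emit_p.values())) for o in obs)

def exact_logp(hmm, obs):
    """Full forward algorithm -- the expensive step we try to avoid."""
    states, start_p, trans_p, emit_p = hmm
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans_p[r][s] for r in states) * emit_p[s][o]
                 for s in states}
    return math.log(sum(alpha.values()))

def best_match(hmms, obs):
    best_name, best_score = None, -math.inf
    # Evaluate the most promising candidates first.
    order = sorted(hmms, key=lambda n: -upper_bound_logp(hmms[n][3], obs))
    for name in order:
        if upper_bound_logp(hmms[name][3], obs) <= best_score:
            break      # bound prunes this HMM and all remaining ones
        score = exact_logp(hmms[name], obs)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Two toy one-state candidates over observation symbols 0 and 1.
hmm_a = (("s",), {"s": 1.0}, {"s": {"s": 1.0}}, {"s": {0: 0.9, 1: 0.1}})
hmm_b = (("s",), {"s": 1.0}, {"s": {"s": 1.0}}, {"s": {0: 0.2, 1: 0.8}})
print(best_match({"a": hmm_a, "b": hmm_b}, [0, 0, 1, 0]))
```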
[Abstract29]The CUIDADO Project (Content-based
Unified Interfaces and Descriptors for Audio/music Databases available Online)
aims at developing a new chain of applications through the use of audio/music
content descriptors, in the spirit of the MPEG-7 standard. The project includes
the design of appropriate description structures, the development of extractors
for deriving high-level information from audio signals, and the design and
implementation of two applications: the Sound Palette and the Music Browser.
These applications include new features which systematically exploit high-level
descriptors and provide users with content-based access to large catalogues of
audio/music material. The Sound Palette is focused on audio samples and targets
professional users, whereas the Music Browser addresses a broader audience
through the management of music titles. After a presentation of the project
objectives and methodology, we describe the original features of the two
applications made possible by the use of descriptors and the technical
architecture framework on which they rely.
[Abstract30]"Query by humming" is an
interaction concept in which the identity of a song has to be revealed quickly
and reliably from a given sung input using a large database of known melodies. In
short, it tries to detect the pitches in a sung melody and compares these
pitches with symbolic representations of the known melodies. Melodies that are
similar to the sung pitches are retrieved. Approximate pattern matching in the
melody comparison process compensates for the errors in the sung melody by
using classical dynamic programming. A filtering method is used to save
computation in the dynamic programming framework. This paper presents the
algorithms for pitch detection, note onset detection, quantization, melody
encoding and approximate pattern matching as they have been implemented in the
CubyHum software system. Since human reproduction of melodies is imperfect,
findings from an experimental singing study were a crucial input to the
development of the algorithms. Future research should pay special attention to
the reliable detection of note onsets in any preferred singing style. In
addition, research on indexing methods and fast bit-parallel algorithms for
approximate pattern matching needs to be further pursued to decrease
computational requirements when dealing with large melody databases.
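The melody-comparison core, classical dynamic programming over pitch sequences, can be sketched as follows. The unit edit costs and the interval encoding (which makes the match transposition-invariant) are simplifying assumptions, and CubyHum's filtering step is omitted.

```python
def edit_distance(query, melody):
    """Levenshtein distance with substitution/insertion/deletion cost 1."""
    m, n = len(query), len(melody)
    prev = list(range(n + 1))            # DP row for the empty query prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            sub = prev[j - 1] + (query[i - 1] != melody[j - 1])
            cur[j] = min(sub, prev[j] + 1, cur[j - 1] + 1)
        prev = cur
    return prev[n]

def rank_melodies(query, database):
    """Order database melodies by similarity to the sung query."""
    return sorted(database, key=lambda name: edit_distance(query, database[name]))

database = {                        # melodies as semitone-interval sequences
    "tune_a": [2, 2, 1, 2, 2, 2, 1],
    "tune_b": [0, 0, 5, -5, 4, 1],
}
sung = [2, 2, 1, 2, 3, 2, 1]        # one wrongly sung interval
print(rank_melodies(sung, database))   # tune_a ranks first despite the error
```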
[Abstract31]In this paper, we propose four
peer-to-peer models for content-based music information retrieval (CBMIR) and
carefully evaluate them, qualitatively and quantitatively, in terms of load,
time, refreshment and robustness. We also present an algorithm to accelerate
the retrieval speed of content-based peer-to-peer MIR (CBP2PMIR) and a simple
but effective method to filter replicas from the final results. Finally, we
present the architecture of QUIND, a content-based peer-to-peer music
information retrieval system that implements CBMIR. QUIND combines
content-based music information retrieval technologies with a peer-to-peer
environment, and offers good robustness and scalability. The music stored and
shared on each PC makes up the whole pool of available music. When a user
submits a music query, e.g. a song or a melody, QUIND can quickly and
accurately retrieve similar music according to the content of the query. After
selecting favourites, the user can download and enjoy them.
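As a loose illustration of one detail above, the sketch below filters replicas from a result list; the similarity test (identical interval sequences reported by different peers) is an assumption, since the abstract does not specify the method.

```python
def filter_replicas(results, is_replica):
    """Keep the first of every group of mutually similar results."""
    kept = []
    for r in results:
        if not any(is_replica(r, k) for k in kept):
            kept.append(r)
    return kept

def same_melody(a, b):
    return a["intervals"] == b["intervals"]     # exact-match stand-in

results = [
    {"peer": "p1", "intervals": (2, 2, 1, 2)},
    {"peer": "p2", "intervals": (2, 2, 1, 2)},  # replica from another peer
    {"peer": "p3", "intervals": (0, 5, -5)},
]
print(filter_replicas(results, same_melody))    # two unique melodies remain
```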