3 DEVELOPMENTS
Many improvements are planned for our project, in all phases of its operation.
One improvement is the use of any available XML content information for the search (Bellini, Nesi 2001; Haus, Longari 2002), together with text information gathered from the URL by means of Natural Language Processing techniques, to be added to the content information obtained from the signal: the context of the sound file, its description, annotations and similar metadata may in fact add useful information about it.
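As a rough illustration only, and not the actual implementation, the following Python sketch harvests the most frequent words from the web page hosting a sound file; the example URL and the simple keyword counting are illustrative assumptions standing in for fuller Natural Language Processing of the surrounding text.

import re
import urllib.request
from collections import Counter

def page_keywords(page_url, top=10):
    # Fetch the page that links to the sound file and keep its most frequent words.
    html = urllib.request.urlopen(page_url).read().decode("utf-8", errors="ignore")
    text = re.sub(r"<[^>]+>", " ", html)               # crude tag stripping
    words = re.findall(r"[a-zA-Z]{4,}", text.lower())  # words of four letters or more
    return Counter(words).most_common(top)

# e.g. page_keywords("http://example.org/sound-archive.html") could be indexed
# alongside the signal descriptors of the files linked from that page.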
Other features will be added as means for classification and search, chosen from the large number identified in the literature (Peeters, Rodet 2002); an example is the kind of thumbnail recently introduced by one of the authors (Evangelista & Cavaliere 2005).
A second search modality will also be implemented, based on histogram similarity using the Kullback-Leibler divergence or another measure. In this case the user will provide an example file, or an entire class of files, as the query; the archive is then searched for the files that best fit the statistical distribution of the parameters in the example.
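The following Python sketch shows this kind of ranking under simplified assumptions: the feature values, the archive contents and the fixed value range are placeholders rather than the descriptors actually used by the Sound Browser.

import numpy as np

def feature_histogram(values, bins=32, value_range=(0.0, 1.0)):
    # Normalised histogram of one per-frame feature (e.g. spectral centroid).
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    probs = counts.astype(float) + 1e-9      # small floor avoids log(0) and division by zero
    return probs / probs.sum()

def kl_divergence(p, q):
    # Kullback-Leibler divergence D(p || q) between two normalised histograms.
    return float(np.sum(p * np.log(p / q)))

def rank_by_similarity(query_values, archive):
    # Rank archived files by how closely their histogram fits the query histogram.
    p = feature_histogram(query_values)
    scored = [(name, kl_divergence(p, feature_histogram(values)))
              for name, values in archive.items()]
    return sorted(scored, key=lambda item: item[1])   # smaller divergence = better fit

# Hypothetical archive: file names mapped to per-frame feature values in [0, 1].
archive = {"flute.wav": np.random.rand(500),
           "speech.wav": np.random.beta(2.0, 5.0, 500)}
print(rank_by_similarity(np.random.rand(400), archive))

Since the divergence is not symmetric, ranking by D(query || candidate) as above privileges candidates that cover the query distribution; a symmetrised variant or another measure could be substituted without changing the scheme.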
We are also working on an improvement of the program consisting in a parallel version of it. Parallelism will be achieved by a master computer, which will divide the burden of annotation into chunks and send tasks to slave computers (mostly located in the LAN, but possibly residing anywhere in the network). As soon as its user decides to open the machine to parallel processing, each slave will signal its presence on the network and wait for tasks. The master will then receive the addresses of the slaves that are ready and send each of them a specific task. The granularity of these tasks is naturally given by the analysis of the individual sound files: the master just sends the address of a file on the Internet; the slave downloads the sound file and, in turn, sends back the computed sound parameters to be stored in the archive for further search.
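The following Python sketch illustrates this master/slave scheme under assumptions of ours: XML-RPC as transport, an arbitrary port, placeholder host names, and a dummy analyse() payload standing in for the real feature extraction; the discovery step in which slaves announce themselves is omitted and the ready addresses are taken as known.

import sys
import urllib.request
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def analyse(url):
    # Slave-side task: download one sound file and return its descriptors.
    data = urllib.request.urlopen(url).read()
    # A real slave would compute the signal features here; the byte count is a placeholder.
    return {"url": url, "num_bytes": len(data)}

def run_slave(port=9000):
    # Open this machine to parallel processing and wait for tasks from the master.
    server = SimpleXMLRPCServer(("0.0.0.0", port), allow_none=True)
    server.register_function(analyse, "analyse")
    server.serve_forever()

def run_master(slave_addresses, file_urls):
    # Send each file URL to a ready slave and collect the computed parameters.
    results = []
    for i, url in enumerate(file_urls):
        proxy = xmlrpc.client.ServerProxy(slave_addresses[i % len(slave_addresses)])
        results.append(proxy.analyse(url))            # to be stored in the archive
    return results

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "slave":
        run_slave()
    else:
        print(run_master(["http://192.168.0.10:9000"],
                         ["http://example.org/sounds/test.wav"]))

Sending only the file URL keeps each message small; the bandwidth-heavy download and the CPU-heavy analysis both happen on the slave, which matches the coarse per-file granularity described above.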
Our experience with the project has already yielded encouraging first results, showing that it provides a complete set of tools which, installed in a Local Area Network, in a studio, a classroom or a research laboratory, easily supports the efficient paradigm of a parallel archive with distributed storage and distributed processing.
We also found that, in spite of the use of high-level interpreted languages, the efficiency of the program is quite satisfactory, while the ease of prototyping makes it easy to experiment with new solutions; on the other hand, a compiled version of the Sound Browser speeds up both search and classification.
REFERENCES
Bellini, P., Nesi, P., 2001. WEDELMUSIC format: an XML music notation format for emerging applications. Proceedings of the First International Conference on Web Delivering of Music.
Burred, J. J., Lerch, A., 2004. Hierarchical Automatic Audio Signal Classification. Journal of the Audio Engineering Society, Vol. 52, No. 7/8.
Evangelista, G., Cavaliere, S., 2005. Event Synchronous Wavelet Transform Approach to the Extraction of Musical Thumbnails. Proc. of the DAFX05 International Conference on Digital Audio Effects, Madrid, Spain.
Foote, J. 1999. An overview of audio information
retrieval. ACM Multimedia Systems, 7:2–10.
Haus, G., Longari, M., 2002. Towards a Symbolic/Time-Based Music Language Based on XML. Proc. First International IEEE Conference on Musical Applications Using XML (MAX2002), New York.
Lu, L., Jiang, H., Zhang, H.-J., 2001. A robust audio classification and segmentation method. In Proc. ACM Multimedia, Ottawa, Canada.
Pachet, F., La Burthe, A., Zils, A., Aucouturier, J.-J., 2004. Popular music access: The Sony music browser. Journal of the American Society for Information Science and Technology, Vol. 55, Issue 12, pp. 1037–1044.
Panagiotakis, C., Tziritas, G., 2005. A Speech/Music Discriminator Based on RMS and Zero-Crossings. IEEE Transactions on Multimedia.
Peeters, G., Rodet, X., 2002. Automatically selecting signal descriptors for sound classification. In Proceedings of ICMC 2002, Göteborg, Sweden.
Rossignol, S., Rodet, X., et al., 1998. Features extraction and temporal segmentation of acoustic signals. In Proc. Int. Computer Music Conf. ICMC, pages 199–202. ICMA.
Scheirer E., Slaney M., 1997. Construction and evaluation
of a robust multifeature speech/music discriminator. In
Proc. Int. Conf. on Acoustics, Speech and Signal
Processing ICASSP, pages 1331–1334. IEEE.
Tzanetakis, G., Cook, P., 2000. MARSYAS: a framework for audio analysis. Organised Sound, Cambridge University Press, 4(3), pages 169–177.
Tzanetakis, G., Cook, P., 2002. Musical Genre Classification of Audio Signals. IEEE Transactions on Speech and Audio Processing, Vol. 10, No. 5, July, p. 293.
Vinet, H., Herrera, P., Pachet, F., 2002. The Cuidado Project: New Applications Based on Audio and Music Content Description. Proc. ICMC.
Wold E., Blum T., Keislar D., and Wheaton J., 1996.
Content-based classification, search and retrieval of
audio. IEEE Multimedia, 3(2).
Zhang, T., Kuo, C.-C. J., 2001. Audio Content Analysis for Online Audiovisual Data Segmentation and Classification. IEEE Transactions on Speech and Audio Processing, 9(4):441–457, May.
Zölzer U. (ed.). 2002. DAFX - Digital Audio Effects. John
Wiley & Sons.