monophony (C4) or polyphony, (2) polyphony (C4+C4: unison),
(3) polyphony (C4+C5: octave), (4) polyphony (C4+G5: three-times
tone), where we assume the lower tone is C4. We refer to cases
(2), (3) and (4) as unison, octave and three-times tone,
respectively.
First, we must determine whether the input sound is monophony
or polyphony. If we detect beat components, the sound may be
polyphony. However, some monophonic sounds also contain beat
components, such as the trumpet G5 in Fig.6 (e), so beats alone
are not conclusive. Another way to distinguish monophony from
polyphony is to look for a time difference between the starting
points of the tones. For example, such a difference can be
detected in the case of Fig.6 (f).
Next, we must determine whether the tone is unison, octave or
three-times tone. If almost all components of the sound are
beating, the sound may be a unison, as in the case of Fig.6 (f).
If the even components are beating, as in Fig.6 (g) and (h), the
sound may be an octave. If the third component and its harmonics
are beating, as in Fig.6 (i), the sound may be a three-times tone.
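The decision rule above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name, the 1-based indexing of components as multiples of the lower fundamental, and the use of a Jaccard overlap score to tolerate a few missed or spurious beat detections are not taken from the paper. Because monophonic sounds can also show beats, the result is a hint rather than a decision.

```python
def classify_by_beating(beating, n_components):
    """Guess the interval from which components of the lower tone are beating.

    beating       -- set of 1-based component indices where a beat was detected
    n_components  -- number of components that were inspected
    Returns 'unison', 'octave', 'three-times tone', or None if nothing beats.
    """
    if not beating:
        return None
    inspected = set(range(1, n_components + 1))
    expected = {
        "unison": inspected,                                        # almost all components beat
        "octave": {k for k in inspected if k % 2 == 0},             # even components beat
        "three-times tone": {k for k in inspected if k % 3 == 0},   # 3rd, 6th, 9th, ... beat
    }

    def jaccard(a, b):
        # Overlap between observed and expected beating sets (assumed score).
        return len(a & b) / len(a | b)

    observed = set(beating)
    return max(expected, key=lambda name: jaccard(observed, expected[name]))


# Hypothetical detections on the first nine components of the lower tone.
octave_guess = classify_by_beating({2, 4, 6, 8}, 9)
```

Here `octave_guess` is `"octave"` because the observed set matches the even components exactly.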
Pitch estimation from the beat signals requires a measurement
time of about 100 to 200 ms. This makes it difficult to estimate
the pitches of shorter sounds.
4 COMB FILTER METHOD
In this method, we process the input sound with a comb filter,
using the sample data from sample 2000 (45 ms) to sample 3000
(68 ms) counted from the starting point of the input sound. For
simplicity, we assume that the lower tone of the polyphony is a
C4 tone. In this case, we must discriminate the following four
tones: (1) monophony C4 or polyphony, (2) polyphony (C4+C4:
unison), (3) polyphony (C4+C5: octave), (4) polyphony (C4+G5:
three-times tone).
First, the input sound is passed through the comb filter C4.
The comb filter C4 is the filter
$H_p(z) = 1 - z^{-N_p}$,
where $p = \mathrm{C4}$ and
$N_p = [f_s/f_p] = [44.1\,\mathrm{kHz}/261.62\,\mathrm{Hz}] = 168$.
Ideally, the comb filter C4 eliminates all four of the above
tones, i.e., monophony, unison, octave and three-times tone. In
practice, a small output signal remains because the actual
frequencies deviate slightly from the ideal ones. Next, we
measure the periods of the output signal of the comb filter C4.
These periods give us clues for discriminating the four tones.
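A minimal sketch of this filter may help, reading the definition as $H(z) = 1 - z^{-N_p}$, i.e. $y[n] = x[n] - x[n-N_p]$. The function names are illustrative. Taking the bracket $[\cdot]$ as truncation is an assumption, chosen because 44100/261.62 is about 168.57 and the text gives 168. The gain formula $|H| = 2|\sin(\pi f N_p/f_s)|$ follows directly from the filter definition.

```python
import math

def comb_delay(fs, fp):
    """N_p = [fs / fp]; [.] is taken as truncation (assumption, gives 168 for C4)."""
    return int(fs / fp)

def comb_filter(x, n_delay):
    """y[n] = x[n] - x[n - n_delay]; samples before the start count as 0."""
    return [x[n] - (x[n - n_delay] if n >= n_delay else 0.0)
            for n in range(len(x))]

def comb_gain(f, n_delay, fs):
    """Magnitude response 2*|sin(pi*f*n_delay/fs)| of H(z) = 1 - z^(-n_delay)."""
    return 2.0 * abs(math.sin(math.pi * f * n_delay / fs))

FS = 44100.0
N_C4 = comb_delay(FS, 261.62)

# A synthetic tone whose period is exactly N_C4 samples is cancelled
# once the first N_C4 start-up samples have passed.
ideal = [math.sin(2 * math.pi * k / N_C4) for k in range(3000)]
residual_ideal = max(abs(v) for v in comb_filter(ideal, N_C4)[N_C4:])

# A sinusoid at 261.62 Hz does not fit the integer delay exactly,
# so a small residual gain remains.
gain_c4 = comb_gain(261.62, N_C4, FS)
```

For a pure 261.62 Hz sinusoid the residual gain `gain_c4` is roughly 0.02. Higher harmonics of a real tone deviate more from the comb notches, so their residuals are larger.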
Figure 7 shows the input and output waveforms (sample numbers
n = 2000 to 3000) of the comb filter C4: inputs (a)-(d) and
outputs (e)-(i).
First we use the comb filter C4 with $N_p^{0} = 168$, the sample
number determined from
$[f_s/f_p] = [44100\,\mathrm{Hz}/261.62\,\mathrm{Hz}]$. When the
monophony C4 in Fig.7(a) is filtered by the comb filter C4 with
$N_p^{0} = 168$, we obtain the output signal in Fig.7(e), whose
amplitude is decreased by a factor of 0.04 relative to the input.
Next we measure the period of the comb filter output signal
(Fig.7(e)) and obtain $N_p^{1} = 166$. Then we filter the input
sound again by the comb filter C4 with $N_p^{1} = 166$, and this
time we measure the period to be $N_p^{2} = 167$. When we pass the
input sound through the comb filter C4 with $N_p^{2} = 167$, we
obtain a filter output signal with period $N_p^{3} = 166$. The
waveforms of the output signals of the comb filters with
$N_p^{2} = 167$ and $N_p^{3} = 166$ are almost the same, so
we determine that the input sound is monophony C4.
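The repeated filter-and-measure loop can be sketched as follows. The text does not say how the period of the filter output is measured, so the normalized autocorrelation search used here is an assumption, as are the function names, the search range of ±5 samples around the initial delay, and the early stop when the input is cancelled. The sketch only returns the sequence of delays; comparing the output waveforms, as the text does, is left out.

```python
import math

def comb_filter(x, n_delay):
    """y[n] = x[n] - x[n - n_delay]; samples before the start count as 0."""
    return [x[n] - (x[n - n_delay] if n >= n_delay else 0.0)
            for n in range(len(x))]

def measure_period(y, lo, hi):
    """Return the lag in [lo, hi] with the highest normalized autocorrelation.

    Assumed estimator; the text does not specify how periods are measured.
    """
    head = y[hi:]                          # same reference window for every lag
    e_head = sum(v * v for v in head)
    best_lag, best_score = lo, -float("inf")
    for lag in range(lo, hi + 1):
        shifted = y[hi - lag:len(y) - lag]  # y delayed by `lag`, aligned with head
        e_shift = sum(v * v for v in shifted)
        num = sum(a * b for a, b in zip(head, shifted))
        score = num / math.sqrt(e_head * e_shift)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delay_sequence(x, n0, steps=3, search=5, eps=1e-9):
    """Filter with the current delay, measure the output period, repeat."""
    delays = [n0]
    x_peak = max(abs(v) for v in x)
    for _ in range(steps):
        n = delays[-1]
        y = comb_filter(x, n)[n:]           # drop the start-up transient
        if max(abs(v) for v in y) < eps * x_peak:
            break                           # input is cancelled, nothing to measure
        delays.append(measure_period(y, n0 - search, n0 + search))
    return delays

# Synthetic check: a tone whose period is exactly 170 samples.
x = [math.sin(2 * math.pi * k / 170) + 0.5 * math.sin(4 * math.pi * k / 170)
     for k in range(3000)]
seq = delay_sequence(x, 168)
```

For this synthetic tone the first measurement returns 170, and the comb filter with delay 170 then cancels the input, so `seq` is `[168, 170]`. Real instrument tones are not exactly periodic, which is why the text observes sequences such as 168, 166, 167, 166.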
When the input sound of Fig.7(b) is filtered by the comb filter
C4 with $N_p^{0} = 168$, we obtain the filter output signal in
Fig.7(f), whose amplitude is decreased by a factor of 0.16. From
the output signal in Fig.7(f), we measure the period to be
$N_p^{1} = 170$. Then we filter the input signal in Fig.7(b) by
the comb filter C4 with $N_p^{1} = 170$ and obtain the output
signal in Fig.7(g) with the period $N_p^{2} = 168$ (or 167). The
waveforms of Fig.7(f) and (g) are different, so we determine that
the input sound of Fig.7(b) is polyphony (C4+C4: unison).
The above is an example of discriminating monophony and
polyphony by the comb filter method. However, we think that a
more effective way to use a comb filter is to take two input
sounds from two different time points and measure the amplitude
ratio of the output to the input signal of the comb filter C5 or
G5. If the input sound is monophony, the output/input ratio does
not change between the two input sounds. If the input sound is
polyphony, the waveforms of the two input sounds differ according
to the phase relation of the two tones, so the output/input ratio
also changes. We can therefore determine whether an input sound
is monophony or polyphony by observing the change in the
output/input ratio between the two input sounds.
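The two-segment ratio test can be illustrated on synthetic tones. Everything specific in this sketch is an assumption made for illustration: the peak value as the amplitude measure, the standard equal-tempered C5 frequency of 523.25 Hz for the comb delay, the segment positions, and a test "octave" whose upper tone is detuned by 3 Hz so that its phase relation to the second harmonic of C4 drifts over time.

```python
import math

FS = 44100  # sampling rate used in the text

def comb_filter(x, n_delay):
    """y[n] = x[n] - x[n - n_delay]; samples before the start count as 0."""
    return [x[n] - (x[n - n_delay] if n >= n_delay else 0.0)
            for n in range(len(x))]

def peak(segment):
    """Peak absolute value, used here as the amplitude measure (assumption)."""
    return max(abs(v) for v in segment)

def ratio_change(x, n_delay, start_a, start_b, length=1000):
    """Relative change of the output/input amplitude ratio between two segments."""
    y = comb_filter(x, n_delay)
    ratio_a = peak(y[start_a:start_a + length]) / peak(x[start_a:start_a + length])
    ratio_b = peak(y[start_b:start_b + length]) / peak(x[start_b:start_b + length])
    return abs(ratio_a - ratio_b) / max(ratio_a, ratio_b)

def tone(partials, n_samples):
    """Sum of sinusoids given as (frequency in Hz, amplitude) pairs."""
    return [sum(a * math.sin(2 * math.pi * f * k / FS) for f, a in partials)
            for k in range(n_samples)]

N_C5 = int(FS / 523.25)  # comb filter C5 delay, assuming C5 = 523.25 Hz

# Monophony: C4 with a second harmonic. Polyphony: the same C4 plus a
# slightly detuned C5 that overlaps the second harmonic of C4.
mono = tone([(261.62, 1.0), (523.24, 0.8)], 9000)
poly = tone([(261.62, 1.0), (523.24, 0.8), (526.24, 0.8)], 9000)

mono_change = ratio_change(mono, N_C5, 2000, 7350)
poly_change = ratio_change(poly, N_C5, 2000, 7350)
```

In this sketch `mono_change` stays close to zero, while `poly_change` is clearly larger because the overlapping components reinforce each other in the first segment and nearly cancel in the second. A decision threshold between the two would have to be chosen from real recordings.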
Next we consider the polyphony of octave and three-times tone.
When the input sound of Fig.7(c) is passed through the comb
filter C4 with $N_p^{0} = 168$, we