mirror of https://github.com/drowe67/phasenn.git
Update README.md
parent 0d6129d500
commit 290323bf81
@@ -12,11 +12,10 @@ For high quality speech, sinusoidal codecs require a suitable set of the sinusoi
Building up techniques for modelling phase using NNs and toy speech models (cascades of 2nd order filters) in a series of tests.
-Here is the output from [phase_test11.py](phase_test11.py). The first plot is a series of magnitude spectra of simulated speech frames. The voiced frames have two fairly sharp peaks (formants) beneath Fs/2 with structured phase consisting of linear and dispersive terms. Unvoiced frames have less sharp peaks above Fs/2, and random phases.
+Here is the output from [phasenn_test11.py](phasenn_test11.py). The first plot is a series of magnitude spectra of simulated speech frames. The voiced frames have two fairly sharp peaks (formants) beneath Fs/2 with structured phase consisting of linear and dispersive terms. Unvoiced frames have less sharp peaks above Fs/2, and random phases.


The next plot shows the original phase spectra (green), the phase spectra with an estimate of the linear phase term removed (red), and the NN output estimated phase (blue). For voiced frames, we would like green (original) and blue (NN estimate) to match. In particular we want to accurately model the phase shift across the peaks of the amplitude spectra - this is the dispersive term that shifts the phases of high energy speech harmonics apart and reduces the buzzy/unnatural quality in synthesised speech. For unvoiced speech, we want the NN output (blue) to be random.
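To make the linear and dispersive terms concrete, here is a minimal NumPy sketch of a toy voiced frame. It is not taken from phasenn_test11.py: the function names, the 8 kHz sample rate, the pole radius, the formant frequencies, and the sign convention for the linear term (`-w*n0`) are all assumptions chosen for illustration. The dispersive phase comes from a cascade of two 2nd order resonators sampled at the pitch harmonics, and the linear term is removed using the known time shift rather than an estimate.

```python
import numpy as np

FS = 8000  # sample rate in Hz (assumed for this sketch)

def resonator_response(w, f_hz, r=0.95, fs=FS):
    """Frequency response of a 2nd order all-pole resonator at radian frequencies w."""
    w0 = 2 * np.pi * f_hz / fs
    z1 = np.exp(-1j * w)  # z^-1 evaluated on the unit circle
    return 1.0 / (1 - 2 * r * np.cos(w0) * z1 + (r ** 2) * z1 ** 2)

def wrap(phase):
    """Wrap phase values into (-pi, pi]."""
    return np.angle(np.exp(1j * phase))

def voiced_frame(f0_hz, formants_hz, n0, fs=FS):
    """Toy voiced frame: harmonic amplitudes and phases of a resonator cascade."""
    wo = 2 * np.pi * f0_hz / fs
    n_harm = int(np.pi / wo)           # harmonics up to Fs/2
    w = wo * np.arange(1, n_harm + 1)  # harmonic frequencies in radians
    H = np.ones(n_harm, dtype=complex)
    for f in formants_hz:
        H *= resonator_response(w, f, fs=fs)
    dispersive = np.angle(H)           # phase shift across each formant peak
    linear = -w * n0                   # time shift of n0 samples (assumed sign)
    return w, np.abs(H), wrap(dispersive + linear), dispersive

N0 = 20  # hypothetical time shift in samples
w, amp, phase, dispersive = voiced_frame(100.0, [500.0, 1500.0], N0)

# Remove the linear term (here using the known N0, not an estimate).
phase_no_linear = wrap(phase + w * N0)
```

In this sketch `phase` plays the role of the original phase spectrum and `phase_no_linear` the spectrum with the linear term removed; with a perfect shift estimate only the dispersive term remains, which is the part a network would need to learn from the amplitudes.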
