Update README.md

master
drowe67 2019-12-01 10:27:46 +10:30 committed by GitHub
parent 0d6129d500
commit 290323bf81
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 2 additions and 3 deletions

@@ -12,11 +12,10 @@ For high quality speech, sinusoidal codecs require a suitable set of the sinusoi
Building up techniques for modelling phase using NNs and toy speech models (cascades of 2nd order filters) in a series of tests.
Here is the output from [phase_test11.py](phase_test11.py). The first plot is a series of magnitude spectra of simulated speech frames. The voiced frames have two fairly sharp peaks (formants) beneath Fs/2 with structured phase consisting of linear and dispersive terms. Unvoiced frames have less sharp peaks above Fs/2, and random phases.
Here is the output from [phasenn_test11.py](phasenn_test11.py). The first plot is a series of magnitude spectra of simulated speech frames. The voiced frames have two fairly sharp peaks (formants) beneath Fs/2 with structured phase consisting of linear and dispersive terms. Unvoiced frames have less sharp peaks above Fs/2, and random phases.
![](example_mag.png "Magnitude Spectra")
![](example_phase.png "Phase Spectra")
The next plot shows the original phase spectra (green), the phase spectra with an estimate of the linear phase term removed (red), and the NN output estimated phase (blue). For voiced frames, we would like green (original) and blue (NN estimate) to match. In particular we want to accurately model the phase shift across the peak of the amplitude spectra: this is the dispersive term that shifts the phases of high energy speech harmonics apart, reducing the buzzy/unnatural quality in synthesised speech. For unvoiced speech, we want the NN output (blue) to be random.
![](example_phase.png "Phase Spectra")
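
As a rough illustration of the voiced-frame model described above, the sketch below builds a toy spectrum from a cascade of two 2nd order filters, adds a linear phase term, and then removes it again to leave the dispersive phase. It is not taken from phasenn_test11.py: the helper name `resonator_response`, the formant frequencies and bandwidths, the FFT size and the time shift `n0` are all assumptions chosen for the example, and the linear term is removed using the known `n0` rather than an estimate.

```python
import numpy as np

def resonator_response(w, f0, bw, Fs):
    """Frequency response of a 2nd order all-pole resonator (hypothetical helper).

    w  : frequencies in radians/sample
    f0 : resonance (formant) frequency in Hz
    bw : approximate 3 dB bandwidth in Hz
    """
    r = np.exp(-np.pi * bw / Fs)      # pole radius set by the bandwidth
    theta = 2 * np.pi * f0 / Fs       # pole angle set by the resonance
    z1 = np.exp(-1j * w)              # z^-1 evaluated on the unit circle
    return 1.0 / (1 - 2 * r * np.cos(theta) * z1 + (r ** 2) * z1 ** 2)

Fs = 8000                             # sample rate in Hz (assumed)
Nfft = 256                            # FFT size (assumed)
w = 2 * np.pi * np.arange(Nfft // 2) / Nfft

# Cascade of two 2nd order filters: two formants below Fs/2 (assumed values)
H = resonator_response(w, 500, 100, Fs) * resonator_response(w, 1500, 150, Fs)

# A time shift of n0 samples adds a linear phase term -w*n0
n0 = 10
X = H * np.exp(-1j * w * n0)

mag_dB = 20 * np.log10(np.abs(X))     # magnitude spectrum, unaffected by n0
phase = np.angle(X)                   # original phase: linear + dispersive

# Remove the linear term (known here) to leave the dispersive phase
dispersive = np.angle(X * np.exp(1j * w * n0))

peak_hz = np.argmax(mag_dB) * Fs / Nfft
print("largest spectral peak near %.0f Hz" % peak_hz)
```

Plotting `phase` and `dispersive` against `w` shows the steep wrapped ramp of the linear term in the former, and in the latter a phase shift concentrated around each spectral peak.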