Phase modelling with Neural Nets

David 4f8c7ea4fb added some tests for est_n0, working OK for basic impulse case		2019-12-04 14:26:24 +10:30
README.md	updated README	2019-11-22 12:42:55 +10:30
codec2_model.py	debugging plot_n0.py/est_n0 C code	2019-12-03 07:07:59 +10:30
example_mag.png	swapped example names	2019-12-01 10:23:42 +10:30
example_phase.png	swapped example names	2019-12-01 10:23:42 +10:30
phasenn_test1.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test2.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test3.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test4.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test5.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test6.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test7.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test7a.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
phasenn_test8.py	more realistic range of filters	2019-11-22 17:05:22 +10:30
phasenn_test9.py	correct n0 values	2019-11-22 17:01:03 +10:30
phasenn_test9a.py	broken...	2019-11-22 15:28:14 +10:30
phasenn_test9b.py	still now working	2019-11-30 13:53:18 +10:30
phasenn_test9c.py	docu	2019-11-24 10:06:52 +10:30
phasenn_test10.py	work in progress combination of n0 est and dispersive part	2019-11-22 12:43:59 +10:30
phasenn_test11.py	added some tests for est_n0, working OK for basic impulse case	2019-12-04 14:26:24 +10:30
phasenn_test12.py	attempt to remove linear component of phase, unsucessful	2019-11-23 13:35:24 +10:30
phasenn_train.py	moved out of codec2 repo	2019-11-17 12:05:58 +10:30
plot_n0.py	added some tests for est_n0, working OK for basic impulse case	2019-12-04 14:26:24 +10:30
run_n0_est.sh	added some tests for est_n0, working OK for basic impulse case	2019-12-04 14:26:24 +10:30
test_n0_est.sh	added some tests for est_n0, working OK for basic impulse case	2019-12-04 14:26:24 +10:30

README.md

PhaseNN

A project to model sinusoidal codec phase spectra with neural nets.

Recent breakthroughs in NN speech synthesis (WaveNet, WaveRNN, LPCNet and friends) have brought exciting improvements in the quality of model-based synthesised speech. These algorithms typically use NNs to estimate the PDF of the next speech sample from a history of previous speech samples, then sample from that PDF. Speech is therefore generated sample by sample. Computational complexity is high, although it is steadily being reduced.
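
As a rough illustration of the sample-by-sample loop described above, here is a minimal sketch. The function `toy_next_sample_pdf` is a hypothetical stand-in for a trained network (it just returns a PDF peaked at the previous sample), and the 256 quantisation levels are an assumption for the example, not something taken from this repo.

```python
import numpy as np

def toy_next_sample_pdf(history, n_levels=256):
    """Hypothetical stand-in for a trained NN: returns a PDF over
    n_levels quantised sample values, peaked at the previous sample."""
    levels = np.arange(n_levels)
    logits = -0.5 * ((levels - history[-1]) / 4.0) ** 2
    p = np.exp(logits - logits.max())
    return p / p.sum()

def generate(n_samples, seed=0, n_levels=256):
    """Generate one sample at a time by sampling the predicted PDF."""
    rng = np.random.default_rng(seed)
    samples = [n_levels // 2]  # arbitrary mid-scale starting sample
    for _ in range(n_samples - 1):
        pdf = toy_next_sample_pdf(samples, n_levels)
        samples.append(int(rng.choice(n_levels, p=pdf)))
    return np.array(samples)

speech = generate(160)  # one pass through the model per output sample
```

Each output sample needs its own pass through the predictor, which is where the high computational cost comes from.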

Speech codecs that employ frequency-domain, block-based techniques such as sinusoidal transform coding can deliver high quality speech using block-based synthesis. They typically synthesise speech in blocks of 10-20 ms at a time (e.g. 160-320 samples at Fs = 16 kHz) using efficient overlap-add IDFT techniques. Sinusoidal codecs use a similar parameter set to NN-based synthesis systems (amplitude spectra and pitch information).
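
The frame sizes quoted above, and the idea of building a whole block from harmonic amplitudes, phases and pitch, can be sketched as follows. This is a minimal illustration that sums cosines directly rather than using the overlap-add IDFT method mentioned in the text; the function names are made up for the example.

```python
import numpy as np

FS = 16000  # sample rate in Hz, as in the example in the text

def frame_samples(frame_ms, fs=FS):
    """Number of samples in a frame of frame_ms milliseconds."""
    return int(round(fs * frame_ms / 1000))

def synth_frame(f0, amps, phases, n_samples, fs=FS):
    """Direct harmonic sum: s[n] = sum_m A_m cos(m*w0*n + phi_m)."""
    amps = np.asarray(amps, dtype=float)
    phases = np.asarray(phases, dtype=float)
    w0 = 2 * np.pi * f0 / fs            # fundamental in rad/sample
    n = np.arange(n_samples)
    m = np.arange(1, len(amps) + 1)     # harmonic numbers 1..L
    arg = np.outer(m * w0, n) + phases[:, None]
    return (amps[:, None] * np.cos(arg)).sum(axis=0)

n10 = frame_samples(10)   # 160 samples
n20 = frame_samples(20)   # 320 samples
frame = synth_frame(200.0, [1.0, 0.5], [0.0, 0.0], n10)
```

The whole 10 ms block comes from one set of parameters, in contrast to generating each sample separately.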

However, for high quality speech, sinusoidal codecs require a suitable set of sinusoidal harmonic phases for each synthesised frame. This work aims to generate the sinusoid phases from amplitude information using NNs, in order to develop a block-based NN synthesis engine based on sinusoidal coding.

Status (Nov 2019)

Building up techniques for modelling phase with NNs, using toy speech models (2nd order filters), in a series of tests.
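
To give a feel for the kind of toy model mentioned above, here is a minimal sketch of a 2nd order (two-pole) filter sampled at the pitch harmonics, giving the magnitude and phase pairs a network could be trained on. The filter form, parameter names and default sample rate are assumptions for illustration and are not taken from the test scripts in this repo.

```python
import numpy as np

def second_order_harmonics(f0, fc, r, fs=16000):
    """Sample H(z) = 1 / (1 - 2 r cos(wc) z^-1 + r^2 z^-2) at the
    harmonics m*w0, m = 1..L, of a pitch f0 (Hz).

    fc is the resonance frequency in Hz and r (< 1) the pole radius.
    Returns harmonic frequencies (rad/sample), magnitude (dB), phase (rad).
    """
    w0 = 2 * np.pi * f0 / fs
    L = int(np.floor(np.pi / w0))       # harmonics below Nyquist
    w = w0 * np.arange(1, L + 1)
    wc = 2 * np.pi * fc / fs
    z1 = np.exp(-1j * w)                # z^-1 on the unit circle
    H = 1.0 / (1 - 2 * r * np.cos(wc) * z1 + r ** 2 * z1 ** 2)
    return w, 20 * np.log10(np.abs(H)), np.angle(H)

w, mag_db, phase = second_order_harmonics(f0=150.0, fc=1500.0, r=0.95)
```

Here `mag_db` would play the role of the amplitude input and `phase` the target, with `np.angle` returning phase wrapped to the range -pi to pi.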