Speech synth

neochrome32 · Joined: 09 Jun 2013 Posts: 153

Grr, I need some help. I am trying to create something that sounds like the SpeakJet.

Is there anyone that has successfully written a speech synth?

I have a working PWM system.

I tried to convert the AVR code from

http://www.societyofrobots.com/member_tutorials/node/212

but the result is so dreadfully horrible!

I suspect that the words aren't forming correctly...

It sounds NOTHING like the SpeakJet :( and I can't buy SpeakJets easily where I am! If they were $10 each, I'd buy a whole pack of 'em!

Ttelmah · Joined: 11 Mar 2010 Posts: 19568

I haven't looked at the code, but the AVR offers, as one of its PWM modes, a 'frequency' mode (as part of the phase correct mode), which would be the easiest way to synthesise the tones for speech, but would need a lot of work to translate to the PIC PWM. If this mode is being used, it would explain your problems.

Best Wishes

neochrome32 · Joined: 09 Jun 2013 Posts: 153

The PWM is working OK: it plays back a 22 kHz WAV file (uint8_t samples) nicely enough, so I know the PWM is acting as an audio output.

I am trying the PWM mode; should I try the other modes?

Interestingly, though, it definitely has the "voice" sound and not just noise.

But it's not forming the sounds into words :( hmm

If I recorded a clip, could you take a listen?

-- additional:
Does the AVR C language terminate its const strings automatically?

static const char PROGMEM test[] = "Hello world";

is it the same as

static ROM char test[] = "Hello world\0";
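
On the string question: in standard C the compiler always appends a terminating '\0' to a string literal, so both forms are usable as strings, and the second simply carries one extra zero byte. A minimal sketch in plain C follows. The PROGMEM / ROM qualifiers are dropped so it builds anywhere, and the array and function names are made up for illustration. I would expect avr-gcc and CCS to behave the same way, but checking sizeof on the actual compiler is the safe test.

```c
#include <stddef.h>
#include <string.h>

/* Plain C stand-ins for the two declarations in the question.
   Storage qualifiers (PROGMEM / ROM) are left out on purpose. */
static const char test_plain[]    = "Hello world";   /* compiler appends '\0' */
static const char test_explicit[] = "Hello world\0"; /* explicit '\0' plus the implicit one */

/* Bytes each array occupies, terminators included. */
size_t plain_size(void)    { return sizeof test_plain; }    /* 11 characters + 1 */
size_t explicit_size(void) { return sizeof test_explicit; } /* 11 characters + 2 */

/* strlen stops at the first '\0', so both report the same length. */
size_t plain_len(void)    { return strlen(test_plain); }
size_t explicit_len(void) { return strlen(test_explicit); }
```

So a loop that walks the string until it hits zero sees the same 11 characters either way.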

neochrome32 · Joined: 09 Jun 2013 Posts: 153

OK, I have a reasonable conversion of the speech v1.0 synth.

It's not... what you'd call clear...

but the source code appears to be working!

It seriously needs improving, as you REALLY have to listen out for the words,

but my favorite, "MIND MACHINE", was CLEARLY heard! whoohoo

Ttelmah · Joined: 11 Mar 2010 Posts: 19568

Well done.

Good audio filtering may be a 'key' part to better intelligibility.

Remember that 'speech' only needs quite low frequencies (in the old days things like phones were filtered above 2.4 kHz), and getting rid of high frequency noise, and also low frequency artefacts, may well make things work much better. Try a simple band pass filter from perhaps 240 Hz to 3 kHz.
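
As a rough starting point for picking parts, a first-order RC section has its corner at f = 1 / (2 * pi * R * C). The sketch below just evaluates that formula. The function names are hypothetical, and the 100nF value in the usage note is only an assumed example, not a recommended design.

```c
/* First-order RC corner frequency helpers.
   Assumes ideal components and ignores source and load impedance. */

#define RC_PI 3.14159265358979323846

/* Corner frequency in Hz for a resistance in ohms and a capacitance in farads. */
double rc_cutoff_hz(double r_ohms, double c_farads)
{
    return 1.0 / (2.0 * RC_PI * r_ohms * c_farads);
}

/* Resistance in ohms that gives the wanted corner with a chosen capacitor. */
double rc_resistor_ohms(double cutoff_hz, double c_farads)
{
    return 1.0 / (2.0 * RC_PI * cutoff_hz * c_farads);
}
```

With an assumed 100nF capacitor, rc_resistor_ohms(3000.0, 100e-9) comes out at roughly 530 ohms for the low-pass corner and rc_resistor_ohms(240.0, 100e-9) at roughly 6.6k for the high-pass corner. A single RC stage only rolls off gently, so treat these as first guesses.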

Best Wishes

notbad · Joined: 10 Jan 2013 Posts: 68

Hi. Congrats. Can you upload an audio clip?

neochrome32 · Joined: 09 Jun 2013 Posts: 153

[I'll do more than that: if I can get this tidy, I'll post a zip for people to play with]

I found I had forgotten that the compiler is not intelligent enough to widen the maths by itself. The variables are declared as

int8 set_ont;
int16 set_oft;

so when I did value = set_oft + (set_ont * 40) it screwed up and made the entire result 8 bits!

value = set_oft + ((int16)set_ont * 40) fixed it up!

It's a lot clearer, but I still find it hard to understand. If I upload it, can we all try to improve it? Please?

neochrome32 · Joined: 09 Jun 2013 Posts: 153

http://www.electronscape.co.uk/test.mp3

This is what it is attempting to say, with very pathetic filtering :(

(an LM386 amp with a 100nF cap)

"Hello tony, how are you"

"Hello, mind machine, this is a test"

I'm sure I could get this working better, but the V sounds like a semi-formed noise and I can't work out why...

I am trying to set up the lib for you guys. It will be set up for the PIC18LF46K22, but I'm sure it's easily ported to other chips.

I WILL attempt to make it customisable, but at the moment the code is messy.

Ttelmah · Joined: 11 Mar 2010 Posts: 19568

On the integer size, this is not a case of the compiler 'not being smart enough', but of it following the rules.
The reason you get away with this on many compilers is twofold. First, the default integer size is commonly an int16, so the 8-bit operand is promoted to 16 bits before the multiply; secondly, the hardware maths unit is used, which automatically flags an overflow. The code wouldn't 'make the entire result 8bits'. It would perform the set_ont*40 as 8bit, then convert this to 16bits, and add this to set_oft.
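
The difference can be reproduced on a PC in portable C by forcing the product through an 8-bit type. This is only a sketch: the function names are invented, and the values are treated as unsigned for simplicity, which may not match the #type settings in the real project.

```c
#include <stdint.h>

/* Emulates a compiler that does the multiply in 8 bits: the product is
   truncated to 8 bits, then widened and added to the 16-bit offset. */
uint16_t value_8bit_multiply(uint16_t set_oft, uint8_t set_ont)
{
    uint8_t product = (uint8_t)(set_ont * 40u); /* wraps modulo 256 */
    return (uint16_t)(set_oft + product);
}

/* The fix from the thread: widen the operand first, then multiply in 16 bits. */
uint16_t value_16bit_multiply(uint16_t set_oft, uint8_t set_ont)
{
    return (uint16_t)(set_oft + (uint16_t)set_ont * 40u);
}
```

With set_oft = 1000 and set_ont = 10, the first version gives 1144 (400 wraps to 144) and the second gives 1400. For set_ont of 6 or less the product fits in 8 bits and both agree, which is why a bug like this can hide for small values.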

If porting code from another compiler, probably the first thing to do, is to use a #type statement, and set the variable sizes to match those used on the other compiler....

Now, I'd 'step back' first of all.

The code can be compiled to run either on its target chip or on a PC, outputting the speech result as a serial data stream.

I'd start by using this. Get a stream of data that would go to the 'speech' synthesis component, and save this.

Compile a copy of your 'speech' routine, set up just to play this array of PC-converted data.
I'd suspect the problem is in the 'speech' routine, but prove it. This then allows you to debug this one component, with what should be 'known good' incoming data.

You refer to a 'V', but there isn't one in the example you post?

On first listen, it sounds as if the actual PWM rate being used is lower than it should be. Given that you need to synthesise up to perhaps 3 kHz minimum, you need the PWM to be above 6 kHz. The background tone in the example is way below this. The actual update of data seems right, but given the low tone rate, anything with a reasonably high frequency component is being lost. Hence I suspect the problem is in the speech routine, and particularly the tone synthesis routine, but prove it by ruling out the rest of the program first.
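
A quick way to sanity-check the PWM rate is to work it out from the Timer2 settings. On the PIC CCP module the PWM frequency is Fosc / (4 * (PR2 + 1) * prescaler). The sketch below evaluates that and applies the 'twice the highest audio frequency' test. The function names are hypothetical and the 64 MHz clock in the usage note is an assumption, so substitute the real oscillator and register values.

```c
#include <stdint.h>

/* PIC CCP PWM frequency in Hz.
   fosc_hz  : oscillator frequency
   pr2      : Timer2 period register value
   prescale : Timer2 prescaler (1, 4 or 16; must not be 0) */
uint32_t pwm_freq_hz(uint32_t fosc_hz, uint8_t pr2, uint8_t prescale)
{
    return fosc_hz / (4u * ((uint32_t)pr2 + 1u) * prescale);
}

/* Returns 1 when the PWM rate is above twice the highest audio
   frequency to be reproduced, 0 otherwise. */
int pwm_fast_enough(uint32_t pwm_hz, uint32_t max_audio_hz)
{
    return pwm_hz > 2u * max_audio_hz;
}
```

Assuming a 64 MHz clock and PR2 = 255, a prescaler of 1 gives 62500 Hz, comfortably above 6 kHz, while a prescaler of 16 gives about 3906 Hz, which fails the check for 3 kHz audio.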

Best Wishes

neochrome32 · Joined: 09 Jun 2013 Posts: 153

By V I mean when I do speak("vvvvvvvvvvvv"), for example; it sounds very bad.

http://www.electronscape.co.uk/pic18F46k22speechlib.rar

This is a rough draft of the source code. I've compiled it and it does work...

but it's nothing special; it's only there to show the speaking part.

You can hook it up to a serial terminal at 115k baud, and it seems to accept English still :)

But you may be right about the PWM. Maybe I've not set that up properly :( I thought I had, though...

[edit]

I apologise for the state of the code. I took out the unimportant LCD driver stuff here, for the sake of argument really...

As for the INT and INT16, I did #type the sizes:

#type signed
#type int = 16, short = 8
That cleared up a ton of stuff, but I feel I'm missing something.