TTS


9:47 am
April 25, 2010


dardack

Member

posts 18

So I have TTS working for when a user joins the channel.  It would be easy to add the others.  One question: when a person joins, it plays the TTS, but Mangler doesn't show him joining until the speech is finished, whereas Ventrilo shows the person in the channel before the speech starts (I'm using Mangler next to Vent in Wine to test).  What I did was include my .h file in mangler.cpp and, where the notification for channel enter was played, add a call to the function that says the user has entered the channel.

 

Would there be a better place to put this call?

 

Also, I've never made or submitted a patch to a project before, so I have no clue how to do this.  I had to add 2 lines to the makefile in src, plus my .h file.  The espeak lib would also be required.

2:03 pm
April 25, 2010


dardack

Member

posts 18

Ok, I can add leave/join for channel and server, but Mangler always waits for espeak to finish speaking before it finishes another task.  For example, if I press my PTT button, someone leaves or joins the channel, and I let go, my light stays lit until espeak is done speaking, and I'm not sure whether I'm still transmitting.

 

I know TTS is a low priority.  I just thought I would try something out. It's been like 10 years since I've programmed anything, but it's there, just not perfect.

2:06 pm
April 25, 2010


econnell

Admin

posts 319

Your espeak call is blocking, so nothing else can happen while it's playing audio.  To do this properly, you'd need to get the PCM that espeak generates and play that through the existing sound functions in mangler such as playNotification.  That will play the audio in its own thread.
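
To illustrate the difference between a blocking call and playback on a separate thread, here is a minimal C++11 sketch. `PlaybackWorker` and its `Sink` callback are hypothetical names made up for this example. This is not Mangler's actual `playNotification` implementation, only the general pattern of handing a PCM buffer to a worker thread so the caller returns immediately.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: plays queued PCM buffers on its own thread so the
// caller (for example a GUI event handler) never waits for playback.
class PlaybackWorker {
public:
    using Sink = std::function<void(const std::vector<uint8_t> &)>;

    explicit PlaybackWorker(Sink sink) : sink_(std::move(sink)) {
        assert(static_cast<bool>(sink_));  // a sink is required
        thread_ = std::thread([this] { run(); });
    }

    ~PlaybackWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_one();
        thread_.join();  // the worker drains anything still queued, then exits
    }

    // Returns immediately; the buffer is played later on the worker thread.
    void enqueue(std::vector<uint8_t> pcm) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(pcm));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return done_ || !queue_.empty(); });
            if (queue_.empty()) return;  // only reached once done_ is set
            std::vector<uint8_t> pcm = std::move(queue_.front());
            queue_.pop();
            lock.unlock();
            sink_(pcm);  // the slow, blocking part runs without the lock held
            lock.lock();
        }
    }

    Sink sink_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::vector<uint8_t>> queue_;
    bool done_ = false;
    std::thread thread_;
};
```

In this sketch the sink would be whatever actually writes PCM to the sound device; the event handler only calls `enqueue` and carries on.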

6:47 pm
April 25, 2010


dardack

Member

posts 18

Yeah, unfortunately it seems like espeak returns a short *wav, while the q/PCM/audio functions from Mangler all take uint8_t. I haven't figured out how to convert between the two.

6:54 pm
April 25, 2010


econnell

Admin

posts 319

Post edited 10:57 pm – April 25, 2010 by ekilfoil


In the case of espeak and others, the wav file is probably fairly simple and not going to change.  It probably doesn't do wave chunking or anything complicated like that, so wavefile_ptr + 40 should be raw PCM (which is what those functions are looking for in those uint8_t pointers)… but you'll need to figure out the sample rate, which is stored somewhere in that 40 byte header.  You should be able to save it to a file, load that wav file up in Audacity or something, and it'll show you all the PCM params.

 

This may help:  http://www.ringthis.com/dev/wa…..format.htm

7:40 pm
April 25, 2010


dardack

Member

posts 18

Ok, so the short *wav is a pointer to the data.  So I could pass wav offset by 40?  (I'm gonna have to break out the C books about pointers again.)  In Audacity the wav file has a rate of 22050, but I'm unsure how to convert this to a uint8_t from an int.

7:51 pm
April 25, 2010


econnell

Admin

posts 319

I'm doing this from memory, so I'm not sure if it's 40 or not.  It may be 44?  Check out that link above for a description of the header.

You need to know the following:  byte order, sample rate, sample size, signed/unsigned, and the number of channels.

So you've got 22kHz, little endian.  That would need to be converted to native endian, but don't worry about this: if your patch is good enough, we'll handle that…

You need to figure out bits per sample (we use 16 bit internally) and whether or not it's mono or stereo.  I'd assume mono and I think that's a pretty safe assumption.  Lastly, you need to know whether or not it's signed.

I would guess the following: 22kHz signed 16bit little-endian mono

Take a look at libventrilo3/codec-test/ for some programs that will play raw PCM data

8:19 pm
April 25, 2010


dardack

Member

posts 18

I know it's mono. Audacity also says the sample format is 32bit float.

 

That link shows 44 bytes before the data.  OK, so we have sample rate, sample size, and mono.  According to the link, the header also contains:

 

bytes/sec 4 bytes – DWORD Bytes/Second
Block alignment 2 bytes – WORD Block alignment
Bits/sample 2 bytes – WORD Bits/Sample

I'm not sure which of these is needed.
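
As a rough illustration of where these fields sit, the following C++ sketch reads the channel count, sample rate, and bits per sample from a canonical 44-byte PCM WAV header. `WavInfo` and `parseCanonicalWav` are hypothetical names for this example, and the fixed offsets are an assumption: they only hold for files with a plain 16-byte "fmt " chunk followed directly by the "data" chunk. Files with extra chunks need a real chunk walker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Fields of interest from a canonical 44-byte PCM WAV header.
struct WavInfo {
    uint16_t channels;
    uint32_t sampleRate;
    uint16_t bitsPerSample;
    uint32_t dataBytes;
};

// Decode little-endian integers byte by byte, independent of host byte order.
static uint16_t readLE16(const uint8_t *p) {
    return static_cast<uint16_t>(p[0] | (p[1] << 8));
}

static uint32_t readLE32(const uint8_t *p) {
    return static_cast<uint32_t>(p[0]) | (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) | (static_cast<uint32_t>(p[3]) << 24);
}

// Returns true and fills `info` if `buf` starts with a canonical header.
// On success the raw PCM starts at buf + 44.
bool parseCanonicalWav(const uint8_t *buf, std::size_t len, WavInfo &info) {
    assert(buf != nullptr);
    if (len < 44) return false;
    if (std::memcmp(buf, "RIFF", 4) != 0 || std::memcmp(buf + 8, "WAVE", 4) != 0) return false;
    if (std::memcmp(buf + 12, "fmt ", 4) != 0 || std::memcmp(buf + 36, "data", 4) != 0) return false;
    info.channels = readLE16(buf + 22);       // 1 = mono, 2 = stereo
    info.sampleRate = readLE32(buf + 24);     // e.g. 22050
    info.bitsPerSample = readLE16(buf + 34);  // e.g. 16
    info.dataBytes = readLE32(buf + 40);      // size of the PCM payload
    return true;
}
```

Bytes/second and block alignment can be derived from the other three values, so they are skipped here.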

 

Basically, espeak gives a callback if you're not using it with the espeak synthesizer's own playback (which I was using, and which caused it not to release back to Mangler, so there was always a delay).

 

int synthCallback(short *wav, int numsamples, espeak_EVENT *events)

 

So wav is a pointer to the data, with the 44 byte header, then the raw audio.

 

What I was trying first was:

 

 

ManglerPCM *testspeak_MPCM = new ManglerPCM(?a, ?b);
ManglerAudio *testSpeaknotify = new ManglerAudio(AUDIO_NOTIFY, 22050, 1, 0, 0, false);
testSpeaknotify->queue(?c, ?d);
testSpeaknotify->finish();

 

Figuring out what to put in ?a/b/c/d is where I'm stuck.  Unless I shouldn't be using Mangler's notify system.

8:30 pm
April 25, 2010


econnell

Admin

posts 319

Post edited 12:37 am – April 26, 2010 by ekilfoil


We discussed on IRC and decided that Audacity is reporting the 32bit float wrong (it's an odd format for normal output).

You may want to just join on irc.freenode.net #mangler

Edit:  oh… bits/sample is what you need

10:04 am
April 27, 2010


dardack

Member

posts 18

Ok, I was wrong: espeak gives back raw audio data with no header, 16 bit mono, 22050 Hz.  So I get a short pointer to this data, and numsamples, which is the number of entries in wav.  This number may vary, may be less than the value implied by the buflength parameter given in espeak_Initialize, and may sometimes be zero (which does NOT indicate end of synthesis).

 

So I still need a length to pass to ManglerPCM, correct?

 

So basically, in mangler.cpp, when a user joins a channel I call ttspeak_UserJoinChan(name, phonetic);

In that function, after espeak is initialized and the text is synthesized, it generates a callback (instead of playing), which calls the callback function:

 

static int callback (short *wav, int numsamples, espeak_EVENT *events)

{

 

}

 

events holds items which indicate word and sentence events, and also the occurrence of <mark> and <audio> elements within the text.  The list of events is terminated by an event of type = 0.  I don't care about them right now, because they basically indicate word/sentence/char, and I know I'm passing sentences at this time.

 

But ManglerPCM takes uint32_t length, uint8_t *sample.  So I'm not sure how to send this audio sample to Mangler.

 

10:06 pm
April 27, 2010


Haxar

Moderator

posts 58

A primitive implementation of eSpeak is now in trunk (r781). Usage examples can be found in 'mangler.cpp' under "case V3_EVENT_TEXT_TO_SPEECH_MESSAGE" and "case V3_EVENT_USER_PAGE".

12:24 am
April 28, 2010


dardack

Member

posts 18

Ahh nice, this is the part that was messing me up:

 

numsamples * sizeof(short), (uint8_t *)wav.  For some reason I thought the audio data had header info.  I was informed this was wrong.
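
The length-and-cast pair above can be sketched in isolation as follows. `appendSamples` is a hypothetical helper, not part of Mangler or espeak. It assumes the callback's 16-bit samples are simply reinterpreted as bytes in host byte order and accumulated in a buffer, and that a null `wav` or a zero `numsamples` just means there is nothing to copy on that call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends one callback's worth of 16-bit samples to a byte buffer.
// Returns the number of bytes appended.
std::size_t appendSamples(std::vector<uint8_t> &out, const short *wav, int numsamples) {
    assert(numsamples >= 0);
    if (wav == nullptr || numsamples == 0) return 0;  // nothing to copy from this call

    // Same memory, viewed as bytes: each sample occupies sizeof(short) bytes.
    const uint8_t *bytes = reinterpret_cast<const uint8_t *>(wav);
    const std::size_t length = static_cast<std::size_t>(numsamples) * sizeof(short);

    out.insert(out.end(), bytes, bytes + length);
    return length;
}
```

The resulting buffer's `size()` and `data()` would then be the length and `uint8_t *` pair that a byte-oriented audio function expects.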

 

Oh well. Otherwise the espeak part looks close to mine, except I kept mine separate and didn't put it into mangleraudio.cpp. Very nice.

 

It doesn't seem too primitive. I added this for join/leave channel:

 

 

// they're joining our channel
#ifdef HAVE_ESPEAK
    if (strlen(u->phonetic) == 0) {
        audioControl->playText(c_to_ustring(u->name) + " has joined the channel.");
    } else {
        audioControl->playText(c_to_ustring(u->phonetic) + " has joined the channel.");
    }
#else
    audioControl->playNotification("channelenter");
#endif

Same for leave.  Works OK.  It would just need a GUI option to turn TTS notifications on or off.
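
Since the join and leave branches differ only in the final phrase, the name-or-phonetic choice could be factored into one function. The sketch below is only an illustration: `announcementText` is a hypothetical helper, and it uses `std::string` where the real code uses `c_to_ustring`.

```cpp
#include <cassert>
#include <string>

// Builds the spoken announcement, preferring the phonetic spelling when set.
std::string announcementText(const char *name, const char *phonetic, const std::string &action) {
    assert(name != nullptr);
    const bool hasPhonetic = phonetic != nullptr && phonetic[0] != '\0';
    return std::string(hasPhonetic ? phonetic : name) + " " + action;
}
```

Join and leave would then call it with "has joined the channel." and "has left the channel." respectively, and a GUI toggle would only need to guard a single call site.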

 

