Unoffical empeg BBS

#165266 - 12/06/2003 09:22 gpsapp and voice output
fossi
journeyman

Registered: 12/01/2003
Posts: 64
Loc: Germany
Is there anybody out there with the capability and interest to enrich gpsapp with voice output (e.g. "turn right in 200 m")?

Implementing voice should not be that difficult (see http://empeg.comms.net/php/showflat.php?Cat=&Board=empeg_general&Number=151288&page=&view=&sb=&o=&vc=1 ), but I don't have the coding knowledge myself.

Juergen


#165267 - 12/06/2003 09:48 Re: gpsapp and voice output [Re: fossi]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Well, there are a few "gotchas" involved here. Something like "turn left here" or "turn right here" is trivial, but not very useful. What you really want to hear is "turn left onto Evergreen Terrace", since "here" could mean this left, the little left-ish road near it, etc. TTSclock just plays samples, and that's fine for a clock, since the vocabulary needed to tell the time is small.

Having said that, BBS member TheAmigo did some good work on bringing true semi-realtime text to speech to the empeg. His solution as it currently exists (do a BBS search for "ttsd" to find the relevant threads) was okay, but there was a very noticeable lag from the time a program said "say this" to the time the sound was actually generated. TheAmigo hasn't been seen around these parts for quite some time, so I'm not sure what's going on there.

But there is good news. The text-to-speech engine that he was using (Flite) has just received a pretty substantial upgrade, and it now has APIs which can be called directly. It's also possible other performance improvements were made to the engine itself, but I haven't had a chance to try it. It's about halfway down on my list of empeg-related coding projects, but I just discovered the latest version a few days ago...

So, the easy answer to your question is yes, having empeg programs play a few limited canned sounds ("turn left", "turn right") is very easy. The question is whether "turn left" is going to give you the kind of accurate information you want when you're navigating.
_________________________
- Tony C
my empeg stuff

#165268 - 12/06/2003 10:11 Re: gpsapp and voice output [Re: tonyc]
Yang
addict

Registered: 14/01/2002
Posts: 443
Loc: Raleigh, NC
Since GPSApp routes are pregenerated on a PC, couldn't the TTS program be integrated into the route generation pipeline to precompute the spoken directions? It would mean a longer conversion process, but it would be doable.

Edit: The process would be to use the TTS program to compute the names of roads, separate from 'turn', 'exit', 'continue', 'left', 'right', 'onto', etc., to allow it to determine what the real turn would be at runtime (coming from the wrong direction, etc.).
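
A minimal sketch of how that split might look, written in Python for brevity (gpsapp itself is not Python). The clip file names, the `clips/` layout, and the `road_clip` and `build_playlist` helpers are all hypothetical; the only point is that road names are rendered once on the PC while the maneuver words are picked at runtime.

```python
import re

# Fixed vocabulary, rendered once on the PC (hypothetical file names).
WORD_CLIPS = {
    "turn": "clips/turn.wav",
    "exit": "clips/exit.wav",
    "continue": "clips/continue.wav",
    "left": "clips/left.wav",
    "right": "clips/right.wav",
    "onto": "clips/onto.wav",
}


def road_clip(road_name):
    """Map a road name to the file the PC-side converter would have written."""
    slug = re.sub(r"[^a-z0-9]+", "_", road_name.lower()).strip("_")
    return f"clips/roads/{slug}.wav"


def build_playlist(action, direction, road_name):
    """Assemble the clips for one instruction at runtime.

    direction may be None for actions such as "continue".
    """
    playlist = [WORD_CLIPS[action]]
    if direction is not None:
        playlist.append(WORD_CLIPS[direction])
    playlist += [WORD_CLIPS["onto"], road_clip(road_name)]
    return playlist


print(build_playlist("turn", "left", "Evergreen Terrace"))
```

Because the direction is only chosen when the playlist is built, approaching the same road from the other side just swaps one clip; nothing has to be re-rendered.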


Edited by Yang (12/06/2003 10:14)

#165269 - 12/06/2003 11:52 Re: gpsapp and voice output [Re: fossi]
mcomb
pooh-bah

Registered: 31/08/1999
Posts: 1649
Loc: San Carlos, CA
Is there anybody out there with the capability and interest to enrich gpsapp with voice output (e.g. "turn right in 200 m")?

What we are lacking is a speech engine that is actually understandable. A while back I modified gpsapp to use ttsd to speak waypoint instructions at the same time it showed them on the screen. Unfortunately it was basically impossible to understand what it was saying in a moving vehicle. I never bothered to work all the bugs out and post the changes since it didn't seem to have much value. Maybe the newer version of flite is easier to understand?

-Mike
_________________________
EmpMenuX - ext3 filesystem - Empeg iTunes integration

#165270 - 12/06/2003 11:54 Re: gpsapp and voice output [Re: tonyc]
fossi
journeyman

Registered: 12/01/2003
Posts: 64
Loc: Germany
Many thanks for your response.

For some months now I have been navigating extensively with gpsapp, and whenever I get the chance I compare it with a commercial navigation system.

I have inspected the VDO solution as well as an original BMW navigation system. Neither of them could speak road names (even though that would be great). Both give a first announcement 1000 m ahead ("further on, turn left" or something like that), a second 200 m ahead ("in 200 m, turn left"), and a last one on reaching the turn ("now turn left"). That information would be a great help, even though road names would be the perfect solution.

So the simple solution with samples should be at least as good as most of the commercial products.
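
A minimal sketch of that three-stage scheme, in Python for brevity (not gpsapp code). The 1000 m and 200 m thresholds come from the description above; the 30 m radius for "now", the clip naming, and the `next_announcement` helper are assumptions made for illustration.

```python
# Announcement stages, farthest first: (distance threshold in metres, clip prefix).
# 1000 m and 200 m follow the post; 30 m for "now" is a guess.
STAGES = [(1000, "further_on"), (200, "in_200m"), (30, "now")]


def next_announcement(distance_m, direction, already_played):
    """Return the clip to play for the current distance, or None.

    already_played is a set of stage names spoken for this turn; the
    function adds to it so each stage is announced only once.
    """
    due = None
    for threshold, stage in STAGES:
        if distance_m <= threshold and stage not in already_played:
            due = stage  # keep going: a nearer stage overrides a farther one
    if due is None:
        return None
    # Mark the chosen stage and every farther stage as done, so a late GPS
    # fix does not make us say "further on" after "in 200 m".
    for threshold, stage in STAGES:
        already_played.add(stage)
        if stage == due:
            break
    return f"{due}_turn_{direction}.wav"


played = set()
for d in (1500, 950, 600, 180, 20):
    print(d, next_announcement(d, "left", played))
```

Each clip here is a complete recorded instruction, so only a handful of samples per direction are needed.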

Juergen

#165271 - 12/06/2003 12:06 Re: gpsapp and voice output [Re: mcomb]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
Unfortunately it was basically impossible to understand what it was saying in a moving vehicle.
Do you think it was the quality of the voice itself that was sub-par? The volume? Or something else? All these can be worked out... I haven't tried the new Flite out yet, but according to the docs, it can use higher quality voices, but it looks like a massive project to get voices converted from FestVox to Flite.

If it's just a problem with the volume, that's easy to work with. The speed and frequency of the voices are also easy to change. What exactly did you think the problem was with the ttsd/Flite solution?
_________________________
- Tony C
my empeg stuff

#165272 - 12/06/2003 12:07 Re: gpsapp and voice output [Re: fossi]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
So the simple solution with samples should be at least as good as most of the commercial products.
Okay, then I'll defer to the GPSapp folks for their thoughts. The kind of change you speak of would be very simple though, since I experimented with a similar idea (talking menus) on my empeg trivia game, and it works quite well.
_________________________
- Tony C
my empeg stuff

#165273 - 12/06/2003 12:39 Re: gpsapp and voice output [Re: tonyc]
mcomb
pooh-bah

Registered: 31/08/1999
Posts: 1649
Loc: San Carlos, CA
What exactly did you think the problem was with the ttsd/Flite solution?

The inability to accurately pronounce multi-syllable proper nouns was the biggest thing. It might work OK for things like "Turn Left", but when you throw street names into the mix, flite (at least the version/voice I was using) is in over its head.

To be fair, my car is a very noisy environment (Jeep w/ loud 33-inch tires) and it might have worked better in a more ideal setting. I just found that I couldn't understand most of what was said unless I looked down to see the turn instructions on the screen as well.

-Mike
_________________________
EmpMenuX - ext3 filesystem - Empeg iTunes integration

#165274 - 12/06/2003 13:06 Re: gpsapp and voice output [Re: mcomb]
tfabris
carpal tunnel

Registered: 20/12/1999
Posts: 31578
Loc: Seattle, WA
I just found that I couldn't understand most of what was said unless I looked down to see the turn instructions on the screen as well.
Which brings me to something that I'd consider to be a much more useful modification to GPSapp:

- Give it a pop-up mode, where it stays on the music/player screen most of the time, and only pops up to show you the map screen when you are approaching a turn.

- Have it give an audio notification that there is an approaching turn, so you know to look down at the screen when it happens. The audio notification could be mixed in with the music, as a beep or as a voice saying "turn approaching" or whatever. Heck, you could even supply your own wave file for it.
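
The two ideas above can be sketched as one small state machine, shown here in Python for brevity (not gpsapp code). The 300 m pop-up distance and the `PopupController` class are hypothetical; how the chime would actually be mixed into the music is left out.

```python
POPUP_DISTANCE_M = 300  # hypothetical threshold, not taken from gpsapp


class PopupController:
    """Decide which screen to show and when to sound a one-off chime."""

    def __init__(self, popup_distance_m=POPUP_DISTANCE_M):
        self.popup_distance_m = popup_distance_m
        self.map_visible = False

    def update(self, distance_to_turn_m):
        """Return (screen, play_chime) for the latest distance to the next turn.

        distance_to_turn_m is None when no turn is pending.
        """
        near = (distance_to_turn_m is not None
                and distance_to_turn_m <= self.popup_distance_m)
        # Chime only on the transition from the player screen to the map screen.
        play_chime = near and not self.map_visible
        self.map_visible = near
        return ("map" if near else "player", play_chime)


ctl = PopupController()
for d in (800, 250, 100, None):
    print(d, ctl.update(d))
```

Tying the chime to the screen transition means it sounds once per turn rather than on every position update.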
_________________________
Tony Fabris

#165275 - 16/06/2003 13:39 Re: gpsapp and voice output [Re: tonyc]
fossi
journeyman

Registered: 12/01/2003
Posts: 64
Loc: Germany
For me it would be perfect to play some predefined files (turn right, turn left, ...) which I could also easily translate to any language.

Juergen

#165276 - 16/06/2003 22:53 Re: gpsapp and voice output [Re: tonyc]
TheAmigo
enthusiast

Registered: 14/09/2000
Posts: 363
TheAmigo hasn't been seen around these parts for quite some time, so I'm not sure what's going on there.


Just in the last couple days I've started skimming the boards, but not really keeping up.

It doesn't look like I'm going to have much time to work on ttsd anymore. If you'd like to be the new official maintainer, I'd be happy to know it has a new home.

IIRC 1.0a3 was the last posting I made. The archive includes all the modified source and I may have even put a couple of comments in the code, but no promises. It took surprisingly little work to get flite running on the empeg. For my initial tests, I was able to get flite running without even recompiling! I think it was the iPaq binaries that ran stock.

Anyway, maybe a mix of ttsd and pre-recordings would work even better:
turn in 300 meters to the right on krxoff
So the directions are clear and as a bonus, a proper noun is available.
_________________________
--The Amigo

#165277 - 17/06/2003 10:57 Re: gpsapp and voice output [Re: TheAmigo]
tonyc
carpal tunnel

Registered: 27/06/1999
Posts: 7058
Loc: Pittsburgh, PA
If you'd like to be the new official maintainer, I'd be happy to know it has a new home
I can do that. Yeah, flite is very much suited for the empeg, but the fact that it doesn't handle the empeg's very specific audio format requirements is what caused us to need the ttsd/pcmplay shim initially. The new version apparently has an API that you can call directly from C programs, so that might help us out even more. I'll put the new version through its paces sometime soon.
_________________________
- Tony C
my empeg stuff

#165278 - 17/06/2003 17:36 Re: gpsapp and voice output [Re: tonyc]
TheAmigo
enthusiast

Registered: 14/09/2000
Posts: 363
The next feature I wanted to work on was getting rid of the bash process needed to do the looping. Currently, the ttsd script reads:
while true; do
    flite /usr/local/ttsd stdout
    sleep 1
done

At first I was a bit intimidated thinking about trying to reset all the necessary variables and what a mess it would be trying to figure all that out. Then I found the "-l" option to flite... it does exactly that.

So that makes getting rid of the bash process easier... just running flite -l from hijack would probably be the simplest launcher. That should reduce the delay between saying two phrases since it doesn't have to re-launch the app.

Hmmm... I wonder how much memory a bilingual ttsd would use up?
_________________________
--The Amigo

#165279 - 16/07/2003 05:26 Re: gpsapp and voice output [Re: Yang]
Warp10
member

Registered: 18/02/2002
Posts: 179
Loc: Germany
Hi!
Any updates? It would really be great to have at least prerecorded voice output combined with gpsapp (as fossi proposed). What's going on with the suggested pop-up mode for gpsapp?
Yang's idea is also very interesting. The best case would be that everyone could use their favorite text-to-speech engine (including commercial ones and multiple languages).

cheers,
thorsten
_________________________
---------------------------- MK1: 00314 (4GB) MK2a: 030103104 (30GB) Installed in a BMW 323ti

#165280 - 19/07/2003 22:55 Re: gpsapp and voice output [Re: Warp10]
fossi
journeyman

Registered: 12/01/2003
Posts: 64
Loc: Germany
Yep, I would also be very keen on any news on this feature for gpsapp.

If anybody works on it, please remember that German puts the direction before the instruction, e.g.:
Turn right = rechts abbiegen
Bear left = links halten

So samples should not be constructed word by word ("turn", "right", ...), as this would be difficult to translate (gpsapp would need to swap the order), but instruction by instruction ("turn right", "turn left", ...) so they are easily translatable.
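
A minimal sketch of instruction-level samples, in Python for brevity (not gpsapp code). The instruction IDs, the file names, and the `sample_for` helper are hypothetical; the German phrases are the two given above.

```python
# One clip per complete instruction and language (hypothetical file names).
SAMPLES = {
    "en": {
        "turn_right": "en/turn_right.wav",
        "turn_left": "en/turn_left.wav",
        "bear_left": "en/bear_left.wav",
    },
    "de": {
        "turn_right": "de/rechts_abbiegen.wav",
        "bear_left": "de/links_halten.wav",
        # "turn_left" deliberately missing to show the fallback.
    },
}


def sample_for(instruction, language, fallback="en"):
    """Look up the clip for a whole instruction; fall back if untranslated."""
    table = SAMPLES.get(language, SAMPLES[fallback])
    return table.get(instruction, SAMPLES[fallback][instruction])


print(sample_for("turn_right", "de"))
print(sample_for("turn_left", "de"))
```

Since the program only ever refers to an instruction ID, word order is a matter for whoever records the clip, and a translation is just another set of files.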
