Speech Processing? | General | unofficial empeg BBS


Page 1 of 2

#11175 - 11/07/2000 02:44 Speech Processing?
Mark Miller journeyman Registered: 21/09/1999 Posts: 69 Loc: Southeastern Pennsylvania	I've been away for a while. Trying to catch up here. I can't seem to locate any information regarding the speech processing capabilities of the MK2. Has it been implemented yet?

#11176 - 11/07/2000 02:59 Re: Speech Processing? [Re: Mark Miller]
teemcbee addict Registered: 04/02/2000 Posts: 687	I don't know for sure, but I think it's not ready yet. Rob said it won't be ready until the 1000-batch ships, but that was some time ago. I don't think it's a big problem, though: it'll just be a software upgrade (as I think you know). The jack for the mic is already on the slide bay of the Mk2, so you can already wire your mic (it's not delivered with the Mk2; you can use any mic you like, but preferably the mic empeg suggests). TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11177 - 11/07/2000 03:27 Re: Speech Processing? [Re: teemcbee]
Big John member Registered: 05/10/1999 Posts: 126 Loc: Hants, UK.	Hi, teemcbee wrote: "The jack for the mic is already on the slide bay of the Mk2 so you can already wire your mic (it's not delivered with the Mk2 - You can take any mic you like - but preferably the empeg suggested mic)." Which is? I've missed this one. Regards, _________________________ John, (MK1 #114-20G, MK2 #15-36G).

#11178 - 11/07/2000 03:40 Re: Speech Processing? [Re: Big John]
teemcbee addict Registered: 04/02/2000 Posts: 687	This is the link to the mic. It's a microphone array of 4 mics - for my taste it's a bit too big. There was a discussion about it here. TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11179 - 11/07/2000 14:00 Re: Speech Processing? [Re: teemcbee]
tanstaafl. carpal tunnel Registered: 08/07/1999 Posts: 5561 Loc: Ajijic, Mexico	An ugly thought just occurred to me... If I have multiple cars and multiple docking sleds, will I also have to have multiple microphones as well? I guess it depends on how easy it is to mount/unmount the microphone and gain access to the input on the docking sled. Does anybody have any educated guesses as to how the V/R is going to function? What commands will we be able to give? Will we be able to call up a specific playlist by name? How about a specific album? How about a specific song? How about that neat drum riff in YYZ at 2:21 into the song? How about.... // he pauses, thinks "maybe it's time I took one of my pills..." // tanstaafl. "There Ain't No Such Thing As A Free Lunch" _________________________ "There Ain't No Such Thing As A Free Lunch"

#11180 - 11/07/2000 14:07 Re: Speech Processing? [Re: tanstaafl.]
Dredd enthusiast Registered: 12/11/1999 Posts: 261 Loc: Bay Area, California	The mic input is an 1/8" jack on the back of the sled, attached to about 8" of cable. You may or may not be able to "Easily" hook/un-hook the mic-rig, depending on the design of your car. I would recommend simply getting two mic kits, one per vehicle. I'm lazy enough that hooking up my earphone/mic doesn't get done, let alone having to rig up a hands-free mic every time I changed cars. D

#11181 - 11/07/2000 22:06 Re: Speech Processing? [Re: tanstaafl.]
teemcbee addict Registered: 04/02/2000 Posts: 687	This brings me to another point I've already thought about: what happens with the VR if you turn up the volume (something like "EMPEG - louder")? I think there'll be a point where the music is too loud for the empeg to be able to filter out your words. So I think the car's noise is not the only problem. But I'm sure this is a point which can't be solved. So if you listen to your music a bit louder, you'll have to control the Mk2 with the remote or on the unit itself anyway. TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11182 - 12/07/2000 02:51 Re: Speech Processing? [Re: teemcbee]
influx new poster Registered: 16/08/1999 Posts: 17 Loc: Western Australia	Actually the sound coming out of the empeg would be the easiest noise to deal with. Conceptually it's a simple subtraction of the output from the input. In reality implementing it must be more complex, but not impossible. Of course microphone quality would put a limit on that as well, if it's shonky then the detail of your quiet little voice would be lost.
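
A minimal sketch of the "subtract the output from the input" idea, under strong simplifying assumptions that are not from the thread: the mic hears the music only as a delayed, scaled copy (no speaker or cabin colouring), and the signals are synthetic. The function names, the 8 kHz rate, and the delay and gain values are all made up for illustration.

```python
import math
import random

def best_delay(reference, mic, max_delay):
    """Return the lag (in samples) at which the reference best matches the mic."""
    n = len(reference) - max_delay
    def score(d):
        return sum(reference[i] * mic[i + d] for i in range(n))
    return max(range(max_delay + 1), key=score)

def subtract_playback(reference, mic, max_delay=50):
    """Estimate delay and gain of the playback in the mic signal, then remove it."""
    d = best_delay(reference, mic, max_delay)
    n = len(reference) - max_delay
    num = sum(reference[i] * mic[i + d] for i in range(n))
    den = sum(reference[i] ** 2 for i in range(n))
    gain = num / den  # least-squares estimate of the playback level at the mic
    residual = [mic[i + d] - gain * reference[i] for i in range(n)]
    return d, gain, residual

# Synthetic test signals (assumed values, for illustration only).
random.seed(0)
RATE = 8000
music = [random.uniform(-1, 1) for _ in range(2000)]
voice = [0.05 * math.sin(2 * math.pi * 300 * i / RATE) for i in range(2000)]
TRUE_DELAY, TRUE_GAIN = 17, 0.6
mic = [voice[i] + (TRUE_GAIN * music[i - TRUE_DELAY] if i >= TRUE_DELAY else 0.0)
       for i in range(2000)]

delay, gain, residual = subtract_playback(music, mic)
# How far the recovered signal is from the quiet "voice" that was buried in the music.
error = max(abs(residual[i] - voice[i + delay]) for i in range(len(residual)))
print(delay, round(gain, 2), error)
```

In a real car the path from speakers to mic is not a single delay and gain, which is why the following replies talk about distortion functions and calibration.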

#11183 - 12/07/2000 03:19 Re: Speech Processing? [Re: influx]
teemcbee addict Registered: 04/02/2000 Posts: 687	Good point, I hadn't thought about that. What do the guys@empeg say? Any status update on VR? TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11184 - 12/07/2000 04:21 Re: Speech Processing? [Re: influx]
Nils member Registered: 09/06/1999 Posts: 197 Loc: Germany	Oh ah! The simple subtraction [sound digitized from mike] - [music] = [voice] is a bit more complex, I fear :-)

-1- The first thing is [other noise], which is [car noise] + [rest of the world noise]. So the new formula is:

[sound digitized from mike] - [music] = [voice] + [other noise]

And of course the sound from the mike depends on the mike hardware and the way you use it (holding it close or far, or at an angle; this changes not only the volume but also the frequency response and phase shift), so we have an unknown factor [mike distortion]. The same goes for the music, which is heard as a function of your amplifier, your speakers and your car design, so we have a factor [music distortion]. So the new formula is something like:

[sound digitized from mike] - [mike distortion]([music distortion]([music])) = [mike distortion]([voice]) + [mike distortion]([other noise])

where [a]([b]) is meant to read as "a as a function of b": b is the input, and [a]([b]) is the converted signal, the output.

There are some pretty nasty things involved, and even nastier, the way the mike picks up sound and the way the music is distorted by your system will change with how you hold the mike and how you adjust your system (new subwoofers?? new config on the amp??), so it gets really horrible...

This is nothing to be solved by pure mathematics. The way I see it, the first and very primitive way to approach [mike distortion] and [music distortion] would be to simply measure some values. Maybe in this "calibrating" process the empeg has to play sounds at different frequencies and different volume levels, so maybe play frequencies of 50 Hz, 200 Hz, 500 Hz, 2 kHz, 5 kHz at volume levels of -20 dB, -15 dB, -10 dB, -5 dB, 0 dB (protect ears and loudspeakers!!). So there would be a "grid" of played sounds and corresponding measured values from the mike, to get a hint of those nasty distortion functions, which are without a doubt very complex differential (correct word in English??) functions... If it doesn't work, you could make the grid finer by using smaller frequency steps and smaller volume steps, and don't forget to take the phase shift into account...

HOPEFULLY the margins and thresholds of the voice recognition are wide enough to simply "download and play" with it, but if not, you would have to do at least what I roughly proposed here. It would be very sad if the empeg people just offered a voice recognition that only works at low empeg volume; this would be close to useless, just a marketing slogan, but I trust the empeg people by now :-) Nils

Damn, did I forget something important? That sounds too complicated to me... :-(
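
The calibration grid proposed above can be sketched roughly as follows. This is only an illustration: the "cabin" is a hypothetical stand-in (a short delay plus a one-pole low-pass) for the real amplifier, speakers, car and mike, and the sample rate, tone length and all function names are assumptions. Each grid cell stores the measured gain and phase shift at one frequency and level, obtained by correlating the recording against a sine and cosine at the test frequency.

```python
import cmath
import math

SAMPLE_RATE = 20000   # Hz (assumed; must exceed twice the highest test tone)
N = 2000              # samples per tone: a whole number of cycles for every test frequency
FREQS_HZ = [50, 200, 500, 2000, 5000]
LEVELS_DB = [-20, -15, -10, -5, 0]

def tone(freq, level_db):
    """Sine test tone at the given level relative to full scale."""
    amp = 10 ** (level_db / 20)
    return [amp * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(N)]

def simulated_cabin(signal, delay=5, alpha=0.3):
    """Hypothetical playback path: a delay followed by a one-pole low-pass."""
    out, y = [], 0.0
    for i in range(len(signal)):
        x = signal[i - delay] if i >= delay else 0.0
        y = alpha * x + (1 - alpha) * y
        out.append(y)
    return out

def phasor(signal, freq):
    """Amplitude and phase of the component at freq, as a complex number."""
    s = sum(v * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i, v in enumerate(signal))
    c = sum(v * math.cos(2 * math.pi * freq * i / SAMPLE_RATE) for i, v in enumerate(signal))
    return complex(s, c)

def measure(freq, played, recorded):
    """Return (gain, phase shift in degrees) of the recording relative to what was played."""
    ratio = phasor(recorded, freq) / phasor(played, freq)
    return abs(ratio), math.degrees(cmath.phase(ratio))

# Build the grid: one (gain, phase) measurement per frequency and level.
grid = {}
for f in FREQS_HZ:
    for level in LEVELS_DB:
        played = tone(f, level)
        grid[(f, level)] = measure(f, played, simulated_cabin(played))

for f in FREQS_HZ:
    g, p = grid[(f, 0)]
    print(f"{f:>5} Hz: gain {g:.3f}, phase {p:.1f} deg")
```

Because the simulated path is linear, the level axis changes nothing here; in a real system with speakers and an amplifier near their limits, the gain could well differ between levels, which is the reason for measuring more than one.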

#11185 - 12/07/2000 04:51 Re: Speech Processing? [Re: Nils]
teemcbee addict Registered: 04/02/2000 Posts: 687	Nils wrote: "and of course the sound of the mike depends of the mike hardware and the way you use it ( holding it close or far, or in an angle , this is not only volume, but also a different frequency response and phase shift ), so we have an unknown factor [mike distortion]" Reading this, a question comes to mind: will the guys@empeg provide something like different settings for different mics? Or is the difference between mics so small that it can be ignored? TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11186 - 12/07/2000 04:59 Re: Speech Processing? [Re: teemcbee]
Nils member Registered: 09/06/1999 Posts: 197 Loc: Germany	Like i said, there are 2 Options that do not require at least basic calibrating: -1- The speech recognition is G-REAT , and margins & thresholds are wide enough to deliver good speech recognition even under bad circumstances ( loud empeg volume & other noise ) -2- The speech recognition is crappy and only delivers good results in your living room, with volume set to <-20 db ... ( maybe not that bad, but you get the basic idea :-) Nils empeg people, do you read this, can you give us a kind of forecast on the speech recognition ???

#11187 - 12/07/2000 05:01 Re: Speech Processing? [Re: teemcbee]
Nils member Registered: 09/06/1999 Posts: 197 Loc: Germany	By the way -> does all this mean, that the mike is not included ???? That would be cheap ... Nahhhhh i cant believe it ... :-) Nils

#11188 - 12/07/2000 05:04 Re: Speech Processing? [Re: Nils]
teemcbee addict Registered: 04/02/2000 Posts: 687	Yep, you're right! It's not included. Well - the mic the guys@empeg are testing and developing with is a mic-array. Available here. It costs $150. But any other mic would generally work, too. (But nobody knows how well...) TeeMcBee _________________________ Mk2, # 080000143, 40+30 GB, Tuner, Peugeot stalk hookup

#11189 - 12/07/2000 06:00 Re: Speech Processing? [Re: teemcbee]
Nils member Registered: 09/06/1999 Posts: 197 Loc: Germany	>Yep, you're right! It's not included.
>Well - the mic the guys@empeg are testing and developing with is a mic-array.
>Available here. It costs $150. But any other mic would generally work,
>too. (But nobody knows how well...)
WOW, now that's what I call bad news!!!!!!! Arghh: Nils

#11190 - 12/07/2000 06:38 Re: Speech Processing? [Re: Nils]
Mark Petersen journeyman Registered: 19/09/1999 Posts: 97 Loc: Denmark, Kbh Ø	Yes !!! And the speakers and amplifier are also not included. _________________________ Mark wait for mk III with a USB Host/slave (USB->GPS)(USB->Bluetooth)(USB->You name it)

#11191 - 12/07/2000 07:44 Re: Speech Processing? [Re: Mark Petersen]
Nils member Registered: 09/06/1999 Posts: 197 Loc: Germany	>Yes !!!
>and the speakers and amplifier are also not included
>Mark
DAMN!!! I can't believe it! I counted on that!!!!!!!!!!!!!!! I have to sell my last cow then to afford amps & speakers :-( Good, good, I am lucky they still include the car, don't they??? Nils

P.S. Nah, but having a mike bundled with a speech recognition system is really common; even speech rec. software on the PC includes the mike, and my Pioneer system had it too... Try it with sarcasm, or with whatever, but the mike belongs in this package, if only to ensure that the speech rec. works well...

#11192 - 12/07/2000 10:53 Re: Speech Processing? [Re: Nils]
Dearing addict Registered: 22/07/1999 Posts: 453 Loc: Florida	Nils wrote: "-1- The speech recognition is G-REAT , and margins & thresholds are wide enough to deliver good speech recognition even under bad circumstances ( loud empeg volume & other noise ) -2- The speech recognition is crappy and only delivers good results in your living room, with volume set to <-20 db ... ( maybe not that bad, but you get the basic idea :-)" I think you're close on both counts. The company I work for (we design IVRs for telephone systems) does a LOT of voice reco, and we've had good experiences even in noisy environments, e.g. mobile phones. Most good VR packages can "tune out" the frequencies that do not contain speech, significantly narrowing the band from which they need to pick out the voice. Assuming the software Empeg will be using is of good, commercial quality, it shouldn't be thrown off (too far) by high- or low-frequency road noise, and will only pay attention to frequencies in the speech range. This is aided by the microphone array (with built-in DSP, also for filtering), which may or may not be necessary, depending on just how much noise you're dealing with. Chances are you'll get much better "natural speaking tone" VR with the better mike, but alternatively you could just speak "Loud&Clear" into a cheap condenser or, even better, a unidirectional mike pointed right at your face, and do all right. The problem is, we just can't know until it's released! Jason _~= Dearing =~_ "WAY too happy about having #99." _________________________ _~= Dearing =~_ Gettin' back into it thanks to slimrio!
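
As a rough illustration of "tuning out" frequencies outside the speech range, here is a sketch of a very gentle band-pass built from two one-pole filters. The 300 to 3400 Hz band is an assumption borrowed from traditional telephony, not something stated in the thread, and the sample rate and function names are likewise hypothetical. Commercial VR front ends will use much sharper filtering than this.

```python
import math

SAMPLE_RATE = 16000  # Hz (assumed)

def one_pole_lowpass(signal, cutoff_hz):
    """First-order low-pass: attenuates content above cutoff_hz."""
    a = 1 - math.exp(-2 * math.pi * cutoff_hz / SAMPLE_RATE)
    out, y = [], 0.0
    for x in signal:
        y += a * (x - y)
        out.append(y)
    return out

def one_pole_highpass(signal, cutoff_hz):
    """First-order high-pass: the input minus its low-passed version."""
    low = one_pole_lowpass(signal, cutoff_hz)
    return [x - l for x, l in zip(signal, low)]

def speech_band(signal, low_hz=300.0, high_hz=3400.0):
    """Keep roughly the assumed speech band and attenuate rumble and hiss."""
    return one_pole_lowpass(one_pole_highpass(signal, low_hz), high_hz)

def rms(signal):
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def sine(freq, seconds=1.0):
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]

rumble = sine(40)         # stand-in for low-frequency road noise
speech_like = sine(1000)  # stand-in for energy inside the speech band

rumble_kept = rms(speech_band(rumble)) / rms(rumble)
speech_kept = rms(speech_band(speech_like)) / rms(speech_like)
print(f"rumble kept: {rumble_kept:.2f}, speech-band tone kept: {speech_kept:.2f}")
```

Note that music overlaps the speech band heavily, so band-limiting alone helps with road noise far more than it helps with loud playback.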

#11193 - 12/07/2000 12:07 Re: Speech Processing? [Re: Dearing]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	A long while ago, when they first started talking about VR for the Mark II, they said it was being programmed by a well-known company in the market. It should be pretty good. And I would believe that they suggest that particular mic not only because it has higher quality, but because it does the best job of singling out the voice in the noisy car environment. I don't think it's as complicated an "equation" as that. I know it's hard, but it's mostly a matter of singling out the voice from road noise and the music. This is something they've been counting on from the beginning, though, so they have been thinking about it. DiGNAN _________________________ Matt

#11194 - 12/07/2000 12:12 Re: Speech Processing? [Re: Dignan]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	And consider this. If you use a keyword to start the VR, only then does it really need to work at "hearing" you. The rest of the time, it just needs to be adequate at hearing the keyword, and something like the word "empeg" is unlikely to be confusing to the player, even when blasting music and driving over rumble strips. DiGNAN _________________________ Matt

#11195 - 12/07/2000 12:33 Re: Speech Processing? [Re: teemcbee]
altman carpal tunnel Registered: 19/05/1999 Posts: 3457 Loc: Palo Alto, CA	Not much to report, we've been tied up with other mk2 issues recently. Don't expect miracles though; the VR will be on a par with other consumer VR devices, such as the Sony MD head unit which has VR. Also, don't expect ViaVoice, as this requires about 2x the CPU power we have, an FPU, a quiet room, and about 128MB of RAM. It will work worse (or maybe even not at all) with loud music - the array mic helps here by filtering noise on acoustic depth of field. We're still evaluating solutions. We've seen one amazing one, but I suspect it's about 9 months away from even being ported to the ARM, and it ran about 4x slower than realtime on a P400 laptop :( Hugo

#11196 - 12/07/2000 13:29 Re: Speech Processing? [Re: altman]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	empeg VR __ AutoPC VR: <, >, or = ? DiGNAN _________________________ Matt

#11197 - 12/07/2000 13:38 Re: Speech Processing? [Re: Dignan]
altman carpal tunnel Registered: 19/05/1999 Posts: 3457 Loc: Palo Alto, CA	I've not tried the AutoPC stuff in anger, but seeing as they have (a) more money and (b) a bigger team, I wouldn't be surprised if the AutoPC stuff is better, at least at first. You can make a lot of headway by licensing code and then paying the company involved to do lots of optimisation for you - this is something which we simply can't afford to do :( We're relying on John & the VR suppliers. I think we're in pretty safe hands :) Hugo

#11198 - 12/07/2000 13:57 Re: Speech Processing? [Re: altman]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	Uh-oh. I mean, I'm sure that you're going to do a good job, but I hope it's better than what I've heard about the AutoPC. The installer I talked to has also done an AutoPC, and he said the VR didn't work too well. I understand why you wouldn't want to try it out. If I were in your position, I wouldn't either. If it makes you feel any better, the installers liked the Mark I they installed much better than the AutoPC anyway. DiGNAN _________________________ Matt

#11199 - 12/07/2000 14:54 Re: Speech Processing? [Re: Nils]
muzza Pooh-Bah Registered: 21/07/1999 Posts: 1765 Loc: Brisbane, Queensland, Australi...	Now you're probably going to tell me that I have to buy the car and beautiful woman to go with it too!!! Murray 06000047 ____________________ _________________________ -- Murray I What part of 'no' don't you understand? Is it the 'N', or the 'Zero'?

#11200 - 12/07/2000 16:57 Re: Speech Processing? [Re: muzza]
eternalsun Pooh-Bah Registered: 09/09/1999 Posts: 1721 Loc: San Jose, CA	I suspect the lady empeg buyers will have a better chance at snagging a guy by having one of these in the dash than the other way around. :) Calvin

#11201 - 12/07/2000 17:42 Re: Speech Processing? [Re: Dignan]
tanstaafl. carpal tunnel Registered: 08/07/1999 Posts: 5561 Loc: Ajijic, Mexico	Why wouldn't this work... A button attached to the back of the steering wheel to mute the empeg and enable voice recognition whenever you want to talk to it? This would solve the problem of voice control when the empeg was at high volume, might allow effective use of an inexpensive microphone, and would also eliminate the possibility of unwanted voice commands affecting the unit... "I tell you, Bob, this here empeg is really loud compared to my old stereo ... what the... owww, that hurts, turn it down." tanstaafl. "There Ain't No Such Thing As A Free Lunch" _________________________ "There Ain't No Such Thing As A Free Lunch"

#11202 - 12/07/2000 18:32 Re: Speech Processing? [Re: tanstaafl.]
dionysus veteran Registered: 16/06/1999 Posts: 1222 Loc: San Francisco, CA	In reply to: A button attached to the back of the steering wheel to mute the empeg and enable voice recognition whenever you want to talk to it? _________________________ http://mvgals.net - clublife, revisited.

#11203 - 12/07/2000 18:34 Re: Speech Processing? [Re: dionysus]
dionysus veteran Registered: 16/06/1999 Posts: 1222 Loc: San Francisco, CA	hmm... Double D'oh! -mark ...proud to have owned one of the first Mark I units _________________________ http://mvgals.net - clublife, revisited.

#11204 - 12/07/2000 19:04 Re: Speech Processing? [Re: dionysus]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	Woah! How'd you do that? Anyway, I think that's a fantastic idea. It would also help in terms of how the language worked with the unit. I mean, we've all talked about how the unit would have to recognize a keyword like empeg to get it started, but what about the end of the command? It might not be clear-cut as to when your instructions end. This would also help for 2 other reasons. 1) The rest of the time, the VR wouldn't have to do a thing or even be active. I presume that otherwise the VR program would have to lie in waiting, taking up at least some speed. This way it would be like a regular button command coupled with voice, and you still wouldn't have to take your eyes off the road. 2) This would be great for Mark I owners who wouldn't like the VR software running all the time even though they don't have a mic input, or for people with Mark IIs who don't have mics. FANTASTIC idea tanstaafl! Although, I imagine they're pretty far in the process to do anything about it, and who says they have to take your suggestion anyway. (but think about it...please?) DiGNAN _________________________ Matt

#11205 - 12/07/2000 19:17 Re: Speech Processing? [Re: Dignan]
dionysus veteran Registered: 16/06/1999 Posts: 1222 Loc: San Francisco, CA	In reply to: "FANTASTIC idea tanstaafl! Although, I imagine they're pretty far in the process to do anything about it, and who says they have to take your suggestion anyway. (but think about...please?)" In reply to: Rob said ages ago: "Whether it will be possible to begin a command session by simply saying "empeg" or somesuch word isn't something we'll know for a while yet. It's more likely that it will be necessary to press a button on a steering wheel remote, in line with other in-car voice systems currently on the market." Errr... This is actually the way that it will most likely work, and the way that some users (including myself) are hoping it doesn't work :) -mark ...proud to have owned one of the first Mark I units _________________________ http://mvgals.net - clublife, revisited.

#11206 - 13/07/2000 02:17 Re: Speech Processing? [Re: tanstaafl.]
rob carpal tunnel Registered: 21/05/1999 Posts: 5335 Loc: Cambridge UK	This thread is quite ironic - one of the major stumbling points with the software at the moment is the fact you have to press an attention button! This is quite common in in-car speech recognition, but we're trying hard to move away from this requirement, in line with the customer feedback that we've received up until now. Rob

#11207 - 13/07/2000 05:25 Re: Speech Processing? [Re: dionysus]
Dignan carpal tunnel Registered: 08/03/2000 Posts: 12348 Loc: Sterling, VA	In reply to: "Rob said ages ago" Pardon me, sir, for my insubordination, but: Date of that thread: 11/12/99 12:11 PM. Date I registered: 9/3/00 05:47 AM. I had no idea SR was being discussed that early, and I'm not about to read all ~10,000 posts on this bulletin board. I was just going by what has been discussed since SR was first mentioned in the newsletters, which was much later. In that time period I had not once seen mention of the button idea. So rob, what's the problem with the button? DiGNAN _________________________ Matt

#11208 - 13/07/2000 11:50 Re: Speech Processing? [Re: Dignan]
Dearing addict Registered: 22/07/1999 Posts: 453 Loc: Florida	Dignan wrote: "I mean, we've all talked about how the unit would have to recognize a keyword like empeg to get it started, but what about the end of the command? It might not be clear-cut as to when your instructions end." This shouldn't be as big of a problem as you might think. Most VR nowadays does not listen to your whole sentence and parse it out like we humans do. It might have an "attention" word, 10-20 fixed "commands", and some of those commands may have qualifiers which would also be fixed. Basically, VR typically has a very small vocabulary, stored phonetically, from which to pick the "n-best" commands/words in order of confidence level. It can tell from the pause after each word that the word is complete; then it scans its vocabulary of phonemes (is that the right word? - we call them utterances) and decides which word you most likely said. If it doesn't find one with a high enough confidence level, it either skips that word or reprompts. What I'm getting to is this:

User: Empeg!
Mk2: (beep)
User: ShuffleOn
Mk2: (turns Shuffle ON) (beep)
User: Play...
Mk2: (now waiting for a song/playlist qualifier)
User: ...Barenaked Ladies
Mk2: (plays my BNL master playlist shuffled) (beep)

There should be a window of a few seconds now before the Empeg stops listening for commands, to allow for multiple command requests. Either that, or insert another "User: Empeg!" before the play command. Disclaimer: I have no idea how it's actually going to work on the Mk2, but the VR packages I've used work very similarly to this, because it reduces the amount of CPU cycles/RAM required for a workable interface. _~= Dearing =~_ "WAY too happy about having #99." _________________________ _~= Dearing =~_ Gettin' back into it thanks to slimrio!
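
The dialogue above maps naturally onto a small state machine. The sketch below is not how the Mk2 works (nobody in the thread knows that yet); it only illustrates the attention word, fixed commands, qualifiers, n-best selection by confidence, and listening window described in the post. The vocabulary, the 0.6 threshold, the 5-second window, and the shape of the recognizer output (a list of word and confidence pairs) are all hypothetical.

```python
ATTENTION_WORD = "empeg"
# Command -> whether it needs a qualifier (e.g. a playlist name) afterwards.
COMMANDS = {"shuffle on": False, "shuffle off": False, "pause": False, "play": True}
PLAYLISTS = {"barenaked ladies", "rush"}
MIN_CONFIDENCE = 0.6   # assumed threshold below which a hypothesis is rejected
LISTEN_WINDOW = 5.0    # assumed seconds the unit keeps listening after each step

def pick(nbest, vocabulary):
    """Return the most confident hypothesis that is in the vocabulary, or None."""
    candidates = [(conf, word) for word, conf in nbest
                  if word in vocabulary and conf >= MIN_CONFIDENCE]
    return max(candidates)[1] if candidates else None

class VoiceSession:
    def __init__(self):
        self.state = "idle"      # "idle", "listening" or "qualifier"
        self.deadline = 0.0
        self.pending = None      # command waiting for its qualifier

    def hear(self, nbest, now):
        """Process one utterance (an n-best list) and return the resulting action."""
        if self.state != "idle" and now > self.deadline:
            self.state, self.pending = "idle", None   # listening window expired
        if self.state == "idle":
            if pick(nbest, {ATTENTION_WORD}):
                self.state, self.deadline = "listening", now + LISTEN_WINDOW
                return "beep"
            return None                               # ignore everything else
        vocabulary = PLAYLISTS if self.state == "qualifier" else set(COMMANDS)
        word = pick(nbest, vocabulary)
        if word is None:
            return "reprompt"
        self.deadline = now + LISTEN_WINDOW
        if self.state == "qualifier":
            action = f"{self.pending} {word}"
            self.state, self.pending = "listening", None
            return action
        if COMMANDS[word]:
            self.state, self.pending = "qualifier", word
            return "wait"                             # waiting for the qualifier
        return word

session = VoiceSession()
for t, nbest in enumerate([
    [("empeg", 0.9), ("impact", 0.4)],
    [("shuffle on", 0.8)],
    [("play", 0.85)],
    [("bare naked", 0.3), ("barenaked ladies", 0.7)],
]):
    print(session.hear(nbest, float(t)))
```

Keeping the active vocabulary this small at each step is what keeps the CPU and RAM cost down: the recognizer only ever has to choose among a handful of candidates.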

#11209 - 13/07/2000 19:03 Re: Speech Processing? [Re: Dignan]
Terminator old hand Registered: 12/01/2000 Posts: 1079 Loc: Dallas, TX	"I had no idea SR was being discussed that early, and I'm not about to read all ~10,000 posts on this bulletin board." That's what the search function is for. Sean
