I've been away for a while. Trying to catch up here. I can't seem to locate any information regarding the speech processing capabilities of the MK2. Has it been implemented yet?
I don't know for certain, but I think it isn't ready yet. Rob said that it won't be ready until the 1000-batch ships, but that was some time ago. I don't think it's a big problem, though: it'll just be a software upgrade (as I think you know).
The jack for the mic is already on the slide bay of the Mk2, so you can already wire your mic (it's not delivered with the Mk2; you can use any mic you like, but preferably the one empeg suggests).
If I have multiple cars and multiple docking sleds, will I also have to have multiple microphones as well?
I guess it depends on how easy it is to mount/unmount the microphone and gain access to the input on the docking sled.
Does anybody have any educated guesses as to how the V/R is going to function? What commands will we be able to give? Will we be able to call up a specific playlist by name? How about a specific album? How about a specific song? How about that neat drum riff in YYZ at 2:21 into the song? How about.... // he pauses, thinks "maybe it's time I took one of my pills..." //
tanstaafl.
"There Ain't No Such Thing As A Free Lunch"
Registered: 12/11/1999
Posts: 261
Loc: Bay Area, California
The mic input is an 1/8" jack on the back of the sled, attached to about 8" of cable. You may or may not be able to "Easily" hook/un-hook the mic-rig, depending on the design of your car.
I would recommend simply getting two mic kits, one per vehicle. I'm lazy enough that hooking up my earphone/mic doesn't get done, let alone having to rig up a hands-free mic every time I changed cars.
This brings me to another point I've already thought about:
What happens with the VR if you turn up the volume (with something like "EMPEG - louder")? I think there'll be a point where the music is too loud for the empeg to be able to filter out your words. So I think the car's noise is not the only problem.
But I'm sure this is a problem that can't be solved. So if you listen to your music a bit louder, you'll have to control the Mk2 with the remote or on the unit itself anyway.
Registered: 16/08/1999
Posts: 17
Loc: Western Australia
Actually the sound coming out of the empeg would be the easiest noise to deal with. Conceptually it's a simple subtraction of the output from the input. In reality implementing it must be more complex, but not impossible.
Of course microphone quality would put a limit on that as well, if it's shonky then the detail of your quiet little voice would be lost.
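The "subtract the output from the input" idea is usually approached with an adaptive filter rather than a literal subtraction, because the mic hears the music only after the speakers and the cabin have changed it. A minimal sketch in Python, assuming a normalized LMS filter. This is a swapped-in textbook technique, not anything empeg has described, and `nlms_cancel`, `taps`, `mu` and the simulated signals are all made up for the illustration:

```python
import math
import random

def nlms_cancel(mic, music, taps=8, mu=0.2):
    """Remove the part of `mic` that can be predicted from `music`.

    Normalized LMS: learn a short filter that maps the known music
    signal onto what the mic picks up, and return what is left over.
    """
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent music samples, newest first
    residual = []
    for mic_sample, music_sample in zip(mic, music):
        history = [music_sample] + history[:-1]
        predicted = sum(w * h for w, h in zip(weights, history))
        error = mic_sample - predicted          # ideally voice + other noise
        power = sum(h * h for h in history) + 1e-9
        step = mu * error / power
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual

# Simulated test signals (all assumptions): noise-like "music", a made-up
# three-tap cabin echo, and a quiet 200 Hz tone standing in for the voice.
random.seed(0)
RATE = 8000
N = 4000
music = [random.uniform(-1.0, 1.0) for _ in range(N)]
voice = [0.1 * math.sin(2 * math.pi * 200 * i / RATE) for i in range(N)]
cabin = [0.6, 0.3, 0.1]
echo = [sum(c * music[i - k] for k, c in enumerate(cabin) if i - k >= 0)
        for i in range(N)]
mic = [e + v for e, v in zip(echo, voice)]

cleaned = nlms_cancel(mic, music)

def mean_square(xs):
    return sum(x * x for x in xs) / len(xs)

# Compare how much non-voice signal remains in the second half,
# after the filter has had time to adapt.
half = N // 2
before = mean_square([m - v for m, v in zip(mic[half:], voice[half:])])
after = mean_square([c - v for c, v in zip(cleaned[half:], voice[half:])])
print(f"music left in mic signal: before={before:.4f} after={after:.4f}")
```

The sketch only handles the music path. Road noise and a changing mic position are not modelled at all.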
The simple subtraction [sound digitized from mike] - [music] = [voice]
is a bit more complex, I fear :-)
-1- The first thing is [other noise] which is: [car noise] + [rest of the world noise]
So the new formula is: [sound digitized from mike] - [music] = [voice] + [other noise]
And of course the sound of the mike depends on the mike hardware and the way you use it (holding it close or far, or at an angle; this affects not only volume but also frequency response and phase shift), so we have an unknown factor [mike distortion].
The same goes for the music, which is heard as a function of your amplifier, your speakers and your car design, so we have a factor [music distortion] ...
-------- Where [a]() is meant to read as -> a as a function of b -> b is the input [a]() is the converted signal -> the output ...
There are some pretty nasty things involved, and even nastier, the way you use the mike and the way the music is distorted by your system will change whenever you hold the mike differently or adjust your system (new subwoofers?? new config on the amp??), so it gets really horrible ... This is nothing to be solved by pure mathematics. The way I see it, the *first* and *very primitive* way to approach the [mike distortion] and [music distortion] would be to simply measure some values. Maybe in this "calibrating" process the empeg has to play tones at different frequencies and different volume levels, say 50 Hz, 200 Hz, 500 Hz, 2 kHz and 5 kHz at -20 dB, -15 dB, -10 dB, -5 dB and 0 dB (protect ears and loudspeakers!!) ..
So there would be a "grid" of played sounds and corresponding measured values from the mike to get a hint of those nasty distortion functions, which are without a doubt very complex differential (correct word in English??) functions ...
If that doesn't work, you could make the grid finer by using smaller frequency steps and smaller volume steps, and don't forget to take the phase shift into account ...
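The calibration grid described above can be sketched roughly as follows. This is only an illustration in Python: the frequencies and levels are the ones suggested in the post, while the sample rate, the `fake_cabin` stand-in for speakers, car and mic, and the single-frequency measurement in `response_at` are assumptions, not anything empeg has announced.

```python
import cmath
import math

RATE = 16000                       # samples per second (assumed)
FREQS_HZ = [50, 200, 500, 2000, 5000]
LEVELS_DB = [-20, -15, -10, -5, 0]
N = RATE // 10                     # 0.1 s per test tone

def tone(freq_hz, level_db):
    """Sine test tone at the given level relative to full scale."""
    amplitude = 10 ** (level_db / 20)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / RATE)
            for i in range(N)]

def fake_cabin(signal, gain=0.5, delay=4):
    """Made-up stand-in for speakers + car + mic: attenuate and delay."""
    return [0.0] * delay + [gain * s for s in signal[:len(signal) - delay]]

def response_at(played, recorded, freq_hz):
    """Gain and phase shift of `recorded` relative to `played` at one frequency."""
    def dft_bin(signal):
        return sum(s * cmath.exp(-2j * math.pi * freq_hz * i / RATE)
                   for i, s in enumerate(signal))
    ratio = dft_bin(recorded) / dft_bin(played)
    return abs(ratio), cmath.phase(ratio)

# Build the grid: one (gain, phase) measurement per frequency/level pair.
grid = {}
for freq in FREQS_HZ:
    for level in LEVELS_DB:
        played = tone(freq, level)
        recorded = fake_cabin(played)      # a real unit would record from the mic
        grid[(freq, level)] = response_at(played, recorded, freq)

for (freq, level), (gain, phase) in sorted(grid.items()):
    print(f"{freq:5d} Hz {level:4d} dB  gain={gain:.3f}  phase={phase:+.3f} rad")
```

Because the fake cabin is linear, the measured gain comes out the same at every level. A real amplifier and speaker set would not be that polite, which is the whole reason for measuring at several volumes.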
HOPEFULLY the margins and thresholds of the voice recognition are wide enough to simply "download and play" with it, but if not, you would have to do at *least* what I roughly proposed here. It would be very sad if the empeg people just offered a voice recognition that only works at low empeg volume. That would be close to useless, just a marketing slogan, but I trust the empeg people by now :-)
Nils
Damn did i forget something important, that sounds too complicated to me ... :-(
And of course the sound of the mike depends on the mike hardware and the way you use it (holding it close or far, or at an angle; this affects not only volume but also frequency response and phase shift), so we have an unknown factor [mike distortion].
Reading this, a question comes to my mind: will the guys@empeg provide something like different settings for different mics? Or is the difference between mics so small that it can be ignored?
Like i said, there are 2 Options that do not require at least basic calibrating:
-1- The speech recognition is *G-REAT* , and margins & thresholds are wide enough to deliver good speech recognition even under bad circumstances ( loud empeg volume & other noise )
-2- The speech recognition is crappy and only delivers good results in your living room, with volume set to <-20 db ... ( maybe not that bad, but you get the basic idea :-)
Nils
empeg people, do you read this, can you give us a kind of forecast on the speech recognition ???
Well - the mic the guys@empeg are testing and developing with is a mic array. Available here . Its cost is $150. But every other mic would generally work, too. (But nobody knows with what quality....)
>Well - the mic the guys@empeg are testing and developing with is a mic array. Available here . Its cost is $150. But every other mic would generally work, too. (But nobody knows with what quality....)
>Yes !!! And the speakers and amplifier are also not included
>Mark
DAMN !!!
I can't believe it!
I counted on that !!!!!!!!!!!!!!!
I have to sell my last cow then to afford amps & speakers :-( Good God, I am lucky they still include the car, don't they ???
Nils
P.S. Nah, but having a mike bundled with a speech recognition system is *really* common; even speech rec. software on the PC includes the mike, and my Pioneer system had it too ... Try it with sarcasm, or with whatever, but the mike *belongs* in this package, if only to ensure that the speech rec. works well ...
-1- The speech recognition is *G-REAT* , and margins & thresholds are wide enough to deliver good speech recognition even under bad circumstances ( loud empeg volume & other noise )
-2- The speech recognition is crappy and only delivers good results in your living room, with volume set to <-20 db ... ( maybe not that bad, but you get the basic idea :-)
I think you're close on both counts. The company I work for (we design IVRs for telephone systems) does a LOT of voice reco, and we've had good experiences even in noisy environments, i.e. mobile phones, etc. Most good VR packages can "tune out" the frequencies that do not contain speech, significantly narrowing the band from which they need to pick out the voice. Assuming the software Empeg will be using is of good, commercial quality, it shouldn't be thrown off (too far) by high- or low-frequency road noise, and will only pay attention to frequencies in the speech range.
This is aided by the microphone array (with built-in DSP, also for filtering), which may or may not be necessary, depending on just how much noise you're dealing with. Chances are you'll get much better "natural speaking tone" VR with the better mike, but alternatively you could just speak "Loud&Clear" into a cheap condenser or, even better, a unidirectional mike pointed right at your face, and do all right. The problem is, we just can't know until it's released!
Jason
_~= Dearing =~_ "WAY too happy about having #99."
_________________________
_~= Dearing =~_ Gettin' back into it thanks to slimrio!
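For anyone curious what "tuning out" the non-speech frequencies can look like, here is a very rough Python sketch. It assumes a plain first-order high-pass followed by a first-order low-pass around the telephone speech band (300 Hz to 3400 Hz). Commercial VR packages will do something far more sophisticated, and the cutoffs, sample rate and function names below are assumptions chosen for the illustration.

```python
import math

RATE = 16000            # samples per second (assumed)
LOW_CUT_HZ = 300        # roughly the telephone speech band; an assumption
HIGH_CUT_HZ = 3400

def high_pass(signal, cutoff_hz):
    """First-order high-pass: attenuates content below cutoff_hz."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / RATE
    a = rc / (rc + dt)
    out, prev_in, prev_out = [], 0.0, 0.0
    for x in signal:
        prev_out = a * (prev_out + x - prev_in)
        prev_in = x
        out.append(prev_out)
    return out

def low_pass(signal, cutoff_hz):
    """First-order low-pass: attenuates content above cutoff_hz."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / RATE
    a = dt / (rc + dt)
    out, prev = [], 0.0
    for x in signal:
        prev = prev + a * (x - prev)
        out.append(prev)
    return out

def speech_band(signal):
    """Keep (approximately) only the band where speech lives."""
    return low_pass(high_pass(signal, LOW_CUT_HZ), HIGH_CUT_HZ)

def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def tone(freq_hz, seconds=1.0):
    return [math.sin(2 * math.pi * freq_hz * i / RATE)
            for i in range(int(RATE * seconds))]

for label, freq in [("road rumble", 50), ("voice-range tone", 1000)]:
    raw = tone(freq)
    kept = rms(speech_band(raw)) / rms(raw)
    print(f"{label:17s} {freq:5d} Hz: {kept:.0%} of the level survives")
```

Note that this does nothing about music, which sits squarely in the speech band. That part is where the array mic or some form of cancellation would have to help.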
Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
A long while ago, when they first started talking about VR for the Mark II, they said it was being programmed by a well-known company in the market. It should be pretty good.
And I would believe that they suggest that particular mic because it not only has higher quality, but because it does the best job at singling out the voice in the noisy car environment. I don't think it's as complicated an "equation" as that. I know it's hard but it's mostly a factor of just singling out the voice from road noise and the music. This is something they've been counting on from the beginning, though, so they have been thinking about it.
Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
And consider this. If you use a keyword to start the VR, only then does it really need to work at "hearing" you. The rest of the time, it just needs to be adequate at hearing the keyword, and something like the word "empeg" is unlikely to be confusing to the player, even when blasting music and driving over rumble strips.
Registered: 19/05/1999
Posts: 3457
Loc: Palo Alto, CA
Not much to report, we've been tied up with other mk2 issues recently. Don't expect miracles though, the VR will be on a par with other consumer VR devices, such as the Sony MD head unit which has VR. Also, don't expect ViaVoice, as this requires about 2x the CPU power we have, an FPU, a quiet room, and about 128MB of RAM. It will work worse (or maybe even not at all) with loud music - the array mic helps here by filtering noise on acoustic depth of field.
We're still evaluating solutions. We've seen one amazing one, but I suspect it's about 9 months away from even being ported to the ARM, and it ran about 4x slower than realtime on a P400 laptop :(
Registered: 19/05/1999
Posts: 3457
Loc: Palo Alto, CA
I've not tried the AutoPC stuff in anger, but seeing as they have (a) more money and (b) a bigger team, I wouldn't be surprised if the AutoPC stuff is better, at least at first. You can make a lot of headway by licensing code then paying for the company involved to do lots of optimisation for you - this is something which we simply can't afford to do :(
We're relying on John & the VR suppliers. I think we're in pretty safe hands :)
Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
Uh-oh. I mean, I'm sure that you're going to do a good job, but I hope it's better than what I've heard about the AutoPC. The installer I talked to has also done an AutoPC, and he said the VR didn't work too well.
I understand why you wouldn't want to try it out. If I were in your position, I wouldn't either.
If it makes you feel any better, the installers liked the Mark I they installed much better than the AutoPC anyway.
A button attached to the back of the steering wheel to mute the empeg and enable voice recognition whenever you want to talk to it?
This would solve the problem of voice control when the empeg was at high volume, might allow effective use of an inexpensive microphone, and would also eliminate the possibility of unwanted voice commands affecting the unit... "I tell you, Bob, this here empeg is really loud compared to my old stereo ... what the... owww, that hurts, turn it down."
tanstaafl.
"There Ain't No Such Thing As A Free Lunch"
Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
Woah! How'd you do that?
Anyway, I think that's a fantastic idea. It would also help in terms of how the language worked with the unit. I mean, we've all talked about how the unit would have to recognize a keyword like empeg to get it started, but what about the end of the command? It might not be clear-cut as to when your instructions end.
This would also help for 2 other reasons. 1) The rest of the time, the VR wouldn't have to do a thing or even be active. I presume that otherwise the VR program would have to lie in waiting, taking up at least some speed. This way it would be like a regular button command coupled with voice, and you still wouldn't have to take your eyes off the road.
2) This would be great for Mark I owners who wouldn't like the VR software running all the time even though they don't have a mic, or for people with Mark II's who don't have mics.
FANTASTIC idea tanstaafl! Although, I imagine they're too far along in the process to do anything about it, and who says they have to take your suggestion anyway. (But think about it... please?)
Registered: 16/06/1999
Posts: 1222
Loc: San Francisco, CA
In reply to:
FANTASTIC idea tanstaafl! Although, I imagine they're pretty far in the process to do anything about it, and who says they have to take your suggestion anyway. (but think about...please?)
In reply to:
Rob said ages ago: Whether it will be possible to begin a command session by simply saying "empeg" or somesuch word isn't something we'll know for a while yet. It's more likely that it will be necessary to press a button on a steering wheel remote, in line with other in-car voice systems currently on the market.
Errr.. This is actually the way that it will most likely work, and the way that some users (including myself) are hoping it doesn't work :) -mark
...proud to have owned one of the first Mark I units
Registered: 21/05/1999
Posts: 5335
Loc: Cambridge UK
This thread is quite ironic - one of the major stumbling points with the software at the moment is the fact you have to press an attention button! This is quite common in in-car speech recognition, but we're trying hard to move away from this requirement, in line with the customer feedback that we've received up until now.
Registered: 08/03/2000
Posts: 12338
Loc: Sterling, VA
In reply to:
Rob said ages ago
Pardon me sir for my insubordination, but:
Date of that thread: 11/12/99 12:11 PM
Date I registered: 9/3/00 05:47 AM
I had no idea SR was being discussed that early, and I'm not about to read all ~10,000 posts on this bulletin board. I was just going by what has been discussed since SR was first mentioned in the newsletters, which was much later. In that time period I had not once seen mention of the button idea.
I mean, we've all talked about how the unit would have to recognize a keyword like empeg to get it started, but what about the end of the command? It might not be clear-cut as to when your instructions end.
This shouldn't be as big of a problem as you might think. Most VR nowadays does not listen to your whole sentence and parse it out like we humans do. It might have an "attention" word, 10-20 fixed "commands", and some of those commands may have qualifiers which would also be fixed. Basically, VR typically has a very small vocabulary, stored phonetically, from which to pick the "n-best" commands/words in order of confidence level. It can tell from the pause after each word that the word is complete, then it scans its vocabulary of phonemes (is that the right word? - we call them utterances) and decides which word you most likely said. If it doesn't find one with a high enough confidence level, it either skips that word or reprompts. What I'm getting to is this:
User: Empeg!
Mk2: (beep)
User: ShuffleOn
Mk2: (turns Shuffle ON) (beep)
User: Play...
Mk2: (now waiting for a song/playlist qualifier)
User: ...Barenaked Ladies
Mk2: (plays my BNL master playlist shuffled) (beep)
There should be a time of a few seconds now before the Empeg stops listening for commands, to allow for multiple command requests. Either that, or insert another "User: Empeg!" before the play command.
Disclaimer: I have no idea how it's actually going to work on the Mk2, but the VR packages I've used work very similarly to this, because it reduces the amount of CPU cycles/RAM required for a workable interface.
_~= Dearing =~_ "WAY too happy about having #99."
_________________________
_~= Dearing =~_ Gettin' back into it thanks to slimrio!
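The attention word, fixed commands, qualifiers, confidence threshold and listening window described in the post above can be sketched as a small state machine. This Python sketch is purely illustrative and says nothing about how the Mk2 will actually work: it matches text strings with `difflib` as a stand-in for acoustic confidence scores, and the command list, playlist names, threshold and five-second window are all made-up values.

```python
import difflib

ATTENTION_WORD = "empeg"
COMMANDS = ["shuffle on", "shuffle off", "play", "pause", "louder", "quieter"]
PLAYLISTS = ["barenaked ladies", "rush"]     # hypothetical qualifiers
MIN_CONFIDENCE = 0.75                        # assumed threshold
LISTEN_WINDOW_S = 5.0                        # assumed "few seconds" window

def n_best(heard, vocabulary, n=3):
    """Return up to n (confidence, word) pairs, best first.

    String similarity stands in for a real recognizer's acoustic score.
    """
    heard = heard.lower().strip(" .!")
    scored = [(difflib.SequenceMatcher(None, heard, word).ratio(), word)
              for word in vocabulary]
    return sorted(scored, reverse=True)[:n]

def best_match(heard, vocabulary):
    """Best word if it clears the confidence threshold, else None."""
    confidence, word = n_best(heard, vocabulary, 1)[0]
    return word if confidence >= MIN_CONFIDENCE else None

class CommandSession:
    """idle -> (attention word) -> command -> (optional) qualifier."""

    def __init__(self):
        self.state = "idle"
        self.deadline = 0.0

    def hear(self, utterance, now):
        # Stop listening once the window after the last command has passed.
        if self.state != "idle" and now > self.deadline:
            self.state = "idle"
        if self.state == "idle":
            if best_match(utterance, [ATTENTION_WORD]) is None:
                return None                  # ignore everything else
            self.state = "command"
            self.deadline = now + LISTEN_WINDOW_S
            return "beep"
        vocabulary = COMMANDS if self.state == "command" else PLAYLISTS
        word = best_match(utterance, vocabulary)
        if word is None:
            return "reprompt"
        self.deadline = now + LISTEN_WINDOW_S
        if self.state == "command" and word == "play":
            self.state = "qualifier"         # wait for a song/playlist name
            return "waiting"
        if self.state == "qualifier":
            self.state = "command"
            return "play " + word
        return word

session = CommandSession()
for t, said in enumerate(["Empeg!", "ShuffleOn", "Play...", "...Barenaked Ladies"]):
    print(f"User: {said:22s} Mk2: {session.hear(said, float(t))}")
```

Keeping the vocabulary this small is what makes the matching cheap: at any moment the recognizer only has to choose among a handful of candidates for the current state.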