When Google announced its latest mobile operating system to the world last week, the company asked a casual but extremely confident man named Hugo Barra to take the microphone and celebrate Android 4.1 as the best mobile operating system the world has seen. It couldn't have been easy to sing the praises of an OS code-named "Jelly Bean" with a completely straight face, but Barra, Android's director of product management, was cool and calm as he shared Android's latest killer features.
There was the new graphically enhanced search tool, Google Now. There was the new voice-based search assistant - Google's answer to Apple's Siri. And there was also a new piece of hardware - the Nexus 7 - that would show off Android's full potential. Barra anchored all these announcements, delivering the Google I/O headlines that the world was most interested in hearing.
And now he speaks directly with Wired about Google's mobile future. We sat down with Barra last week at Google I/O to pick his brain about the Nexus 7 and all the other key Android announcements. Here is the edited conversation.
Wired: Jelly Bean really has two major new features - Google Now and voice search. Walk us through the thinking behind these additions.
Hugo Barra: The idea of a card with some information in it [Google Now] isn't actually new. For a long time, we've had the notion of "One Boxes." Whenever Google presents information to you on top of search results - it's sort of formatted in a particular way, and physically separate from the search results - we've called that a "One Box" for a while. So we've taken that idea of a card with information in it just a few steps further by formatting it in a way that's better suited to mobile devices and giving it a significant amount of visual polish. It's not a new concept. It's just an enhancement of an existing idea when it comes to search.
Wired: Is Google Now just making things look prettier, or is this actually a use case-driven enhancement? Can you quantify whether this makes information easier to get or more accessible to the user?
Barra: It definitely is. If you've asked a question for which a specific answer or a small set of specific answers exists, you're likely wanting to see that specific answer, right? So rather than trusting that the user will sort through the web results, even in a very accurately ranked form, we take it one step further and serve that answer up on an information card.
The second thing you talked about - giving Google a voice - is very use case-driven. If you're in a situation where you're asking a question with your voice, there's a significant chance you're in a somewhat constrained environment. You're on the go, you're rushing. You might be in the car. You're carrying something else with your hands. You can't really pause to look at your screen or type.
So speaking it back to you seems pretty natural, right? That's how humans communicate. But we also wanted to do that only when we had a text-to-speech engine that was extremely high quality. And what you hear today, if you ask Google a question on Jelly Bean, is quite impressive. There isn't a text-to-speech engine, as we call them, that has accuracy as high as that.
We didn't talk about this in the keynote, but we have built a text-to-speech engine that's network-based, meaning it uses a very large amount of data to compose a spoken answer. You know, taken purely on its own - don't think about answering questions - it takes a very large amount of data to produce synthesized audio of someone speaking. But we also have a corresponding engine that sits on the device. It's the exact same voice but with a very different computational technique. You'll always hear the same voice whether it's speaking back to you in a connected use case, in which it comes from the server, or a disconnected offline use case, in which it would just be synthesized on the device.
Wired: What makes a great voice? Did you model it after someone?
Barra: I actually come from speech recognition, and I worked in speech in general for a very long time. So don't let me talk about this all day. But it's a very, very complicated process. And it starts with finding a voice talent.
Wired: An actual person?
Barra: Finding a person who has a voice that just nails it. And in this day and age, it's actually a very different voice talent than the voice talents that power most of the voice technology that exists today. A lot of today's voice technology comes from the companies you'd expect - Nuance and Microsoft and others. That technology is built for a telephony world, for a customer service environment where you need this posh, perfect voice - a branding approach to things.
We set out to create the very first conversational voice, and I think we nailed that. I think we have the very first high-quality, natural-sounding, conversational, synthesized voice in the whole world.
Between a bunch of designers, engineers and speech scientists, we sat down and tried to describe the personality of the person, the personality of the voice that we were trying to create. We wrote down "friendly" [as a product goal] and there were literally 15 different ways to describe what friendly means. So that was the brief that we gave to a casting agency, and they came back with 10 candidates. We recorded those 10 candidates, and we did a bunch of blind tests with all sorts of different people, and we voted it down to two people. And then we recorded more of those people, and we did some tests and we decided, "OK, we're going to go with this one person."
I don't actually know her name. In fact, no one knows her name.
Wired: It's a secret?
Barra: It's supposed to be. It's not something that we broadcast, because it needs to be the voice of Google. And then you create the voice, you collect a lot of data. What we did is an industry first.
Wired: While it does sound more human-like, it doesn't have a lot of personality in the sense that it doesn't say funny things back to you. It doesn't crack jokes.
Barra: So nothing to do with the voice itself, but what it says and how it says it?
Wired: Exactly. Is that something you guys were looking to add in the future, or is that something you wanted to leave out?
Barra: It's very intentionally not making jokes with you. Google is a neutral party - it's not your friend, personal assistant or sister. It's not your mom. It's not your girlfriend or boyfriend. It is an information retrieval entity. You ask, it responds. And it's very important that this entity be impartial, and adding jokes and other mannerisms to the voice would take away from that.
It's something that we've talked about, and it's pretty clear. There hasn't been a single person in the company who thinks we should have gone the other direction.