Google Home vs. Amazon Echo: A Battle No AI Can Win

Google Home and Amazon Echo describe a future where virtual assistants powered by your voice take care of all your online interaction.

Currently, this future may be hard for you and me to see, since speaking commands to the Amazon Echo (Alexa) or Google Home (Google Assistant) frequently ends in “I’m afraid I don’t know the answer to that” or “Here’s what I found on the web for ‘book me a flight.’”

Google Home dictating an audible advertisement for a popular Disney movie doesn’t seem like much of an evolution from typing in a web browser. Right now, these tech giants aren’t living up to their promise of a better user experience with Voice-AI.

So, how can they make this future less distant from us?

The Problem with Google Home & Amazon Echo

Voice recognition is the basic building block of natural language processing: it transcribes what you say into text. With the exception of speech impediments, voice recognition has become pretty good. For instance, you can speak your text messages and, for the most part, all the words come out correctly.

But the moment you tell your Voice-AI what to do, it crumbles under the pressure. It can transcribe what you say, but it doesn’t know how to even go about the actions. Go ahead, ask Siri to book you a flight to Tallahassee.

If Voice-AI were a brain, it would be missing the motor cortex – the area of the brain that tells your body what to do. It needs the nerves in the fingertips to know when to pinch or scratch.

We need systems in place that the Voice-AI can work with to fulfill our requests.

Easier said than done, though.

Replicating the different actions and workflows of every website or app you visit requires an immense amount of work. Think about how different it is for you to book a flight versus a hotel…and those are relatively similar actions.

Realistically, Google Home could start by creating these systems for the 100 most common requests on the internet (they have the world’s biggest search engine…they know). They’ve started to do this with selling your stuff on eBay and ordering food through GrubHub. But now you run into the learning curve involved in remembering what those 100 requests are.

Voice-AI needs to be able to serve the billions of functions that websites and apps can do.

Scaling these systems to that level seems impossible for Google Home and Amazon Echo to do on their own.

Let’s Franchise!

If you had told Fred DeLuca in 1968 to build 44,000 Subway restaurants, he would’ve scoffed at you. Until he figured out franchising. Then, it was a piece of cake!

To accelerate growth, franchising makes sense. Franchising creates a framework for success (with a little freedom to make things unique) and then sells that framework to interested parties. Franchising maximizes brand visibility and, in a sense, crowdsources growth.

Imagine Google Home and Amazon Echo applying this basic framework of franchising to scale their Voice-AI efforts. Perhaps they could make a tool or interface that would allow companies to create voice workflows within a specified framework.

Currently, Amazon Echo allows anyone to create skills for Alexa to use…that’s why they have over 10,000 skills. But, a majority of them have 1-star reviews. This shows they are missing the recipe for success which franchising is known for providing – giving people a guaranteed framework to build skills for the Voice-AI.  

The tech giants can’t create general systems that encompass all of the niche questions asked of every app or website. But, the individual brands understand the specific requests that customers ask them. They understand how their visitors navigate their site, so they could create workflow prompts the Voice-AI could then ask a visitor – creating an action-oriented approach.
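To make the idea concrete, here is a rough sketch of what such a franchise-style framework might look like. Everything in it is hypothetical: the names `Slot`, `Workflow`, `WorkflowRegistry`, `run`, and the brand "ExampleRides" are made up for illustration and are not part of any real Alexa or Google Assistant API. The platform supplies the structure (a trigger phrase, the prompts to ask, a fulfillment step), and each brand fills it in with what it already knows about its own customers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# What the assistant says when no brand has registered a matching workflow.
FALLBACK = "I'm afraid I don't know the answer to that"


@dataclass
class Slot:
    name: str    # piece of information the brand needs, e.g. "destination"
    prompt: str  # question the assistant asks the user to get it


@dataclass
class Workflow:
    brand: str
    trigger: str                              # lowercase phrase that starts the workflow
    slots: List[Slot]                         # prompts written by the brand
    fulfill: Callable[[Dict[str, str]], str]  # brand-supplied action


class WorkflowRegistry:
    """Hypothetical platform-side registry that brands publish workflows to."""

    def __init__(self) -> None:
        self._workflows: List[Workflow] = []

    def register(self, workflow: Workflow) -> None:
        self._workflows.append(workflow)

    def match(self, utterance: str) -> Optional[Workflow]:
        # Naive substring matching; a real system would need far more than this.
        text = utterance.lower()
        for workflow in self._workflows:
            if workflow.trigger in text:
                return workflow
        return None


def run(registry: WorkflowRegistry, utterance: str,
        ask: Callable[[str], str]) -> str:
    """Find a workflow, ask each of its prompts, then hand off to the brand."""
    workflow = registry.match(utterance)
    if workflow is None:
        return FALLBACK
    answers = {slot.name: ask(slot.prompt) for slot in workflow.slots}
    return workflow.fulfill(answers)


# A made-up ride brand fills in the framework with its own prompts.
registry = WorkflowRegistry()
registry.register(Workflow(
    brand="ExampleRides",
    trigger="get me a ride",
    slots=[Slot("destination", "Where are you going?")],
    fulfill=lambda a: f"Ride booked to {a['destination']}",
))

# Simulate a user who answers every prompt with "home".
print(run(registry, "Get me a ride home", lambda prompt: "home"))
```

The point of the sketch is the division of labor: the platform only has to build the matching and prompting loop once, while the billions of niche actions live in the workflows each brand writes for itself.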

Naturally, brands are incentivized to create these voice systems because it would give them an advantage over competitors. You can’t tell me that Lyft wouldn’t jump at the opportunity to beat Uber at something.

Their Future Depends On It

Currently, Google Home and Amazon Echo are in a features race. The Amazon Echo just crossed 10,000 skills and Google Home is making moves too. If the future of online interaction depends on voice, it makes sense for both of these companies to dump millions (even billions) of dollars into creating some sort of franchise-esque system.

There will no doubt be hiccups along the way. Users might have to use screens alongside the voice commands for a little while. And it could be a decade or more before these Voice-AI systems are seamless and errorless.

But, taking a little step today always proves fruitful in the long run. That’s a lesson everyone can benefit from.

You don’t need to invest a massive amount of time in trying something new. All it takes is 10 minutes right now to get your toes wet. Don’t ever wait until tomorrow to try something new…because that day will never come.

Time is a gift and we should use it to learn, whether from people, experiences, books, etc… That’s why I created Quick Theories – a weekly newsletter exploring modern technology and its effects on your future – to help you understand and adopt technology in your own creative way.

If you enjoyed this article and would like to read about modern technology from a futurist’s perspective, sign up here: quicktheories.com


10 Comments

  • 1 month ago

    This is a stimulating topic. These ‘big picture’ ideas you discuss help me see beyond the micro business world I live in. Thank you-

  • Don Ware
    1 month ago

    Excellent points. Both Apple and Android devices benefited from letting developers use their source code to create applications for their phones. Good point about the “1 star skills”!

  • 1 month ago

    I think what is missing is the user dictionary to match keyboard commands with phrases. Does Siri let you define what it means when I say “Look up…” or “Find me…”? If Google would reply with, “please show me how to do that and next time I’ll do better”, then we humans could train the computers to understand what we mean by what we say.

    • Syed Muhammad Al Kherid
      4 weeks ago

      That’s a great point and one that I think would definitely enhance the definition of artificial intelligence to a greater height.

  • Patrick
    1 month ago

    This is exactly what AI assistants are missing. I was excited for many months after hearing about Viv (http://viv.ai), a new AI platform that promised to bring this functionality. Unfortunately Samsung bought Viv and thus we may never see it as an open platform. But in the end, AI should be so much more like you say. You should be able to say to your phone “Get me a Lyft ride home” while you’re out at a bar. Or say “order me a takeout pepperoni pizza from Dominos”, and it should just ask you for your credit card #. There is so much potential here, and I like your idea of crowdsourcing actions.

  • Chip
    1 month ago

    Teaching a processor to process will always require teaching. Teaching a processor to change will always require teaching. Artificial intelligence will always require both teaching and change. When teaching and change are viable products of artificial intelligence, people will become obsolete. When people are obsolete, processors will be asking intelligence questions the same way people ask artificial intelligence questions. Searching for artificial intelligence is a great hobby.

  • Chip
    1 month ago

    BTW, franchise is a four letter word because it implies cost.

  • Archit
    1 month ago

    Hey, can you use other methods to emphasize your point in the emails you send us? You use links which sometimes makes us think there are different articles (but they are not). You can use simple formatting like bold, italics, coloring etc.

    Thanks

  • Gary
    1 month ago

    Any AI system that is asked to perform a task will need to have the past experience of performing this exact same task or to know to ask “its human” about the details and learn from the entire transaction. Booking a hotel isn’t about the booking; it’s about knowing (or finding out) what’s wanted and then having the capabilities to follow through. We’ll be training or providing examples for our AI buddies for a while before they almost always get it right. Our present speech to text (using SIRI and OK GOOGLE), while being grammatically correct much of the time, is a good start. Follow the money; that’s where you will see the progress.

  • Jim
    1 month ago

    Interesting comment. I have been watching the software, computer and technology trend since the late 70s.
    A. Software has been limited by hardware. We are getting to the point that 3D and bandwidth being limited but not the software.
    B. New folk growing up with the current platforms have lots of time for the squirrels in their heads to come up with neat ways to address current challenges. Time is a factor. Companies hire the new wizards to move the yardsticks and the next bound of the global economy is going to be interesting.
    C. Crowd sourcing is the new way to invoke a franchise. Corporately this means buying interesting start-up companies.
