Our hoarder’s mentality with messaging, social and sharing apps may finally be cured when The Great Digital Convergence occurs.


Not too long ago, public telephones lined city streets, enabling people to communicate with one another. This eventually converged into a single point of communications, the mobile phone. Over time the ways in which one could connect with others within that device diversified exponentially, and we are now again in need of convergence to simplify this mobile world we live in.

When I watched the Google I/O 2016 keynote presentation launch not one but two new chat apps, I couldn’t help but think “Oh great, yet more chat apps that do pretty much the same as all the other chat apps on my phone.” and “More unnecessary chat apps attempting to make up for the shortcomings of Hangouts.” Those thoughts won’t win friends with the avid attendees at the I/O event, but I believe they’re justified. Three chat apps from a single company would have been peculiar in years past, but unfortunately not now.

First, Allo was trotted out onto the stage. It basically plays catch-up with WhatsApp in such areas as encryption, emoticons and using your phone number as identity. Its claim to fame is apparently letting Google’s servers get in on the action and partake in your chat. Wasn’t that really what Google Now on Tap was supposed to do with your mobile experience? Google analyses the text and pictures of a conversation and offers content such as restaurant recommendations for both parties to view. From the analysis Google also attempts to anticipate responses in the conversation, apparently saving the user from typing, or for that matter, thinking. Less extraordinary features include changing the text size of a chat message to reflect whispering and shouting; somehow the crowd at I/O went hysterical for it. I go back and forth between considering these limited-use gimmicks and intriguing glimpses of the future, at least in the state they’re in now. At any rate Allo is here… ready to take yet another slot on your home screen.

Then Duo was announced, their brand new, decidedly consumer-oriented chat app focused on one-on-one video chats (aka FaceTime from Google). Presenter Erik Kay did his earnest best to put lipstick on the pig, touting its “Knock-Knock” feature to preview a caller’s video stream before answering, like it was some kind of magical, fantastical and wonderful invention to change how we all communicate. Give me a break. It was right out of Apple’s marketing mastery book of taking incredibly minute innovations and trumpeting them as progress for humanity. While Duo’s promises of high reliability and second-by-second scaling to network conditions are laudable (everyone hates sketchy video calls), do they warrant another app?

In both cases, Google seems yet again late to the party (*cough* Currents *cough* many others *cough*). Its direct competitors in this space are already at or close to the billion user mark. Great timing. I was going to drop in a joke about Microsoft sometime soon launching a “me too” chat app answer to Google’s “me too” chat app, but I think they’re still reeling from that ill-performing $8.5bn Skype acquisition.


Let Me Get This Out of My System

So let’s see here… On my Android Nexus phone, Hangouts comes pre-installed, I’ll use Duo for those ‘magical’ one-on-one chats with family, and Allo for those buddy and work conversations or when I want Google dabbling in my conversations? And then let’s throw in Messenger since I’m on Facebook and they’ve crippled doing any real conversing without it. Gotta make sure WhatsApp’s installed, because heck, everyone else has it and I wouldn’t want only one chat app from Facebook. Skype for all those contacts I’ve collected over the years that haven’t moved on to any of the aforesaid platforms, and just because I love apps that seemingly need to update every week to patch their ancient technology. Oh and client sites are using Slack now, so have to make sure that’s there too. And of course there’s my trusty ever-present SMS app for those ol’skool moments someone uses my ‘telephone number’ (do you know yours?) and to receive marketing messages the moment I step into malls.

With little effort, I’ve amassed 8 completely distinct communications channels and their corresponding apps. All churning away in the background of my phone, taking polite little sips of my battery throughout the day, creeping further and further into the empty portions of my storage universe, eagerly drawing data as they maintain themselves with updates and banter back and forth with their respective motherships. It’s all quite consuming, given the end effect is to simply chat with people.

And messaging seems to be in more than just purpose-built apps. Dating apps, for example, often have their own clumsy attempts at in-app messaging, further worsening the situation for those using them. Users often resort to giving out their primary messaging accounts to avoid contending with the limited and sporadic experience. Such secondary messengers may be contributing a handful more to that grand total on your device.

Now add in email (most likely multiple emails), Twitter, Path, Snapchat, Google+ (haha, just kidding), Instagram (because apparently the picture sharing capabilities of all the other platforms aren’t enough) and whatever else you’ve got going on – and it all amounts to a colossal amount of noise and consumption.

Is this an Android lifestyle thing? No! Have an iPhone? Swap out Hangouts with FaceTime and little else changes. This isn’t to bash Android or Google. Android has been on a really masterful evolutionary path to becoming a slick mobile OS. It constantly amazes me how my years-old Nexus phone only seems to get more responsive and refined through the years of upgrades and patches; that’s unprecedented. Try telling a similar story about a Windows notebook with a straight face. So this criticism is focused principally on our total and utter overload of unnecessary communications channels, which was just announced to swell further.

My mobile devices were littered with chat and social apps, all in a desperate attempt to connect with every flavour-of-the-day platform the crowd would migrate to. Eventually I’d culled the herd, but still found myself looking at screens full of apps that were essentially various forms of text, picture and video pushers, all walled off from each other with digital borders.

I’m certainly not alone or an edge case. Informally polling people close to me about which apps they have installed and are actively using revealed that everyone’s predicament largely mirrored my own. There’s obvious bias in the sample set (mostly tech sector and startup people, thus I’d place most of them in the “high usage” category of such apps), but the results still show how consistently such apps get amassed.


Messaging and Social App Mobile Usage


The problem is FOMO (Fear of Missing Out), a term often reserved for investors not wanting to miss out on the next big whale of an opportunity when hyped the right way. But in this case it’s the fear of missing out on conversations, statuses, invites and so forth you could largely umbrella under “potential experiences”. Take any of the chat and sharing channels out of the equation and you’ve diminished your connectivity to whatever was specifically published to that channel and therefore increased your risk of missing out. Voila, you now have a couple dozen unnecessary apps feeding off you and your phone.

How telling this whole communications apps mess is of human nature. Every company is creating its ‘tribes’ and pieces of ‘turf’ in an attempt at monetisation from its slice of the populace. But at some point that has to give; compromises and handshaking need to happen. Dreaming, yes quite possibly, but can we propose a system that keeps in place the business motivators while cleaning up the dumpster fire we’re in?

If someone came to each country around the globe and told them they’re to adopt a new international electrical plug standard, and switch their steering wheels to “the right side”, they’d get a lot of middle fingers for an answer. So for me to sit here thinking everyone’s going to relinquish their respective turfs for the greater good is laughable, I know. But let’s look at how we can open up into a single standard or common infrastructure without impacting the flow of revenue, which is all these platform providers give a shit about anyway.

Just as airline carriers realised decades ago it was in everyone’s favour to shake hands and form alliance groups to work together with codeshare agreements, so too should the major digital players.


We’re on the Cusp of a Big Change

This all led to a greater train of thought around the overdue need for what I’d call The Great Digital Convergence, where all these channels are essentially fed through a single intelligent point of contact, namely your virtual assistant, with most of the heavy lifting happening well before anything arrives on your device. All the messages, photos, videos, statuses and actions are funnelled through a single smart interaction point. Imagine Siri, but capable of handling the flow of all your emails, messaging, social and media sharing app feeds into a single continuously learning and customisable experience specific to your needs. Doesn’t that sound liberating? No? Then don’t think of Siri as it is now; that’s not liberating. Think of what it could be.

The writing’s on the wall: it’s coming… Key technologies are all about to smash together to make convergence possible. These include ever better virtual assistants, chat bots, natural language processing, vastly improved AI, machine learning, greater actual use of big data and better integration of said technologies into devices and operating systems.

Products that are setting the scene or are being seen as foundational pieces for convergence are trickling out. Siri, Cortana and Google Now are all becoming smarter and more integrated, with the exception of the ridiculous absence of Siri on Macs. Amazon’s Echo device has enjoyed a warm response with its interface-less experience of users talking to its Alexa virtual assistant. Google Home, the newly announced answer to Echo, aims at tighter integration with your life and a more natural conversational interaction with users.


As users’ trust in bot and virtual assistant interactions
increases, and the value they get from them grows, so too
will their usage. Google already claims 20% of US search
queries are performed by voice.


Virtual assistants are starting to show signs of becoming more than simple speech-to-text conduits for actioning mundane “What’s the weather today?” type requests and reporting the traffic conditions for your route to work. Talk of natural language processing, bot and AI technologies is hitting the front page in a big way. All the biggies are jumping on board and committing, including the likes of Google and Facebook. Companies are frantically trying to figure out how to make intelligent bots that seamlessly handle customer service requests of all types, which will be the major driver in justifying research and development costs.

Which leads to quite possibly the most telling sign… taking a stroll through the UX jobs in any of the major job portals and seeing how many are for speech and linguistics, bot AI, interface-less experience, and multimodal UX expertise, all wanted pronto! Companies are scrambling the recruiter jets to source what little talent there is out there to bring this next generation into reality.

All of it to me equates to a coming mass migration to a single point of reference for all your communications: your virtual assistant.


How’s This Convergence Supposed to Work?

The simplicity is in the general high-level architecture; the difficulty is in reconfiguring the extremities of the architecture that reach out to the service providers and users. The really tough stuff is getting everyone to play nicely in the sandbox together.


Digital Convergence Architecture


Standardise The Packet Shuffling

All of these packets flying around between apps are largely the same… text segments, media attachments, codes to indicate which animated Hello Kitty sticker to display, etc. Basic, basic, basic technology and deployed redundantly across every platform. I’d say, ditch all of that, there’s no point in maintaining all these discrete systems with their scaling, authentication, security and other needs.

If we’d all just work on a common open platform that facilitates the simple task of authenticating and shuttling around these base forms of communications, then the focus can be on what happens on either end of that communication and how the server intercepts and responds. We’ve right away alleviated all these ridiculous authentication wall issues of different systems.
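To make that a little more concrete, here is a rough sketch of what a shared, authenticated message envelope might look like. Everything in it is an assumption made up for illustration: the `Envelope` fields, the JSON wire format and the shared-key HMAC signature standing in for whatever authentication a real alliance would agree on. No such standard exists today.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Envelope:
    """Hypothetical common message format shared by every provider."""
    sender: str      # universal user id (assumed, not a real standard)
    recipient: str
    provider: str    # e.g. "whatsapp" or "slack"
    kind: str        # "text", "media" or "sticker"
    body: str        # message text, media URL or sticker code


def sign(envelope: Envelope, key: bytes) -> str:
    """Sign the envelope with a shared key (a stand-in for real authentication)."""
    payload = json.dumps(asdict(envelope), sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def encode(envelope: Envelope, key: bytes) -> str:
    """Serialise the envelope and its signature for the wire."""
    return json.dumps({"envelope": asdict(envelope), "signature": sign(envelope, key)})


def decode(wire: str, key: bytes) -> Envelope:
    """Parse a wire message and reject it if the signature does not match."""
    data = json.loads(wire)
    envelope = Envelope(**data["envelope"])
    if not hmac.compare_digest(sign(envelope, key), data["signature"]):
        raise ValueError("signature mismatch")
    return envelope
```

The point of the sketch is that a text, a photo link and a Hello Kitty sticker code all fit the same handful of fields, so one authenticated pipe could carry them for every provider.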

Additional services such as video streaming could piggyback onto the authenticated connection using shared best-of-breed libraries that the alliance of platforms would co-develop and maintain (think how we’ve all agreed on HTML5 or H.264 or other such standards). This would additionally allow things like live broadcasts to potentially be sent in a single broadcast data stream rather than many unique but identical data streams per participant as currently happens.


Awareness and Intelligence

Intelligence and adaptability become critical when such a system has to present many streams of data in a single interface. How it’s sorted, what’s important at a global and user level, how the user’s context impacts these variables… many things to consider, all to be determined by machine learning about the user. Think of a newspaper where the cover stories are constantly changing depending on where you are, what you’re doing, who you’ve just met and so forth.
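As a toy illustration of that ever-changing front page, the sketch below ranks items from different services by how well their tags match the user’s current context. The `Item` fields, the tags and the hand-set weights are all invented for the example; in the system I’m imagining the weights would be learned from the user’s behaviour rather than typed in.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Item:
    """One piece of incoming content (hypothetical structure)."""
    title: str
    source: str
    tags: Set[str] = field(default_factory=set)
    base_importance: float = 0.0


def score(item: Item, context: Set[str], weights: Dict[str, float]) -> float:
    """Base importance plus a weight for every tag matching the current context."""
    matching = item.tags & context
    return item.base_importance + sum(weights.get(tag, 1.0) for tag in matching)


def front_page(items: List[Item], context: Set[str],
               weights: Dict[str, float], limit: int = 3) -> List[Item]:
    """Return the highest scoring items for this context, best first."""
    ranked = sorted(items, key=lambda item: score(item, context, weights), reverse=True)
    return ranked[:limit]
```

Feed it the same three items with a context of “commuting” and then with a context of “work”, and the cover story changes, which is really all the newspaper analogy asks for.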


The ability of the assistant to appropriately respond
to queries will make or break such an approach; this
comes from AI, machine learning and a precise
ability to understand all forms of human interaction.


Apps and operating systems are increasingly becoming contextually aware of where they are and what you’re doing. When I book a car to pick me up from work on the Grab app, it automatically assumes my destination is home because it’s done its homework. It knows my commuting behaviours, where and when I typically go at that time of day… which it correctly determined as home. As awareness and intelligence improve in lock-step with each other, we’ll see our virtual assistants make some pretty informed decisions on our behalf, without any intervention on our part.

This will all become more important as things like autonomous cars seek to streamline how we move about in an almost totally effortless way. We won’t use a car booking app to summon a driver. Your device will know it’s time to get a car ready at the curb to take you home because it knows it’s 2am and just sensed you paid the bill at the restaurant. There’s no need for even your intent to be communicated, let alone interacting with an interface or firing up an app – its awareness of the situation and user behaviours will allow it to decide what actions are appropriate and execute them for you. This is getting a bit off topic, but when pitching your assistant as the centralised point of your digital world, all of this starts coming into the scene.


Always-On Experience Capturing

Our always-on Internet lives mean text messages and audio clips are insufficient for capturing our experiences, as Snapchat, Periscope and the like prove. We’ll live in a multimodal world where we’ll be constantly capturing, measuring, locating and inputting all sorts of formats that best represent that particular moment of our lives. The assistant would be tasked with stitching it all together and bringing it into a compelling real-time story others will consume. We’re not talking about a gallery of static images, some video files and text messages, but a rich and immersive storyboard flowing between events of the person’s life as they go, either broadcast or shared privately. When your assistant has all of this on hand, it will literally become the digital producer of your life’s show.

Just as text messaging pulled us away from phone calls, this richer experience capturing I describe will wean us from text messaging – as we will more or less be participating digitally in each other’s ‘life shows’, alleviating much of the need for messaging, which can be summed up as fragmented, unstructured and often arduous anyway. (Think back to the last time you tried to get a group of people to agree on a restaurant and at what time by messaging alone and you’ll understand what I mean.)


Single Point Interface

When you combine the single point of a virtual assistant such as Siri or Cortana with an ability to pull all your streams of communications into a single audible and visual interface, and intelligently allow you to interact with it, you’ve instantly eliminated the need for everything else. There’s nothing special about WhatsApp’s interface or Twitter’s interface that you’ll miss, if the virtual assistant presents those bits of information in the way you want.

Then those service providers such as WhatsApp and Twitter don’t need something to be installed or plugged in. They’ve done all their rendering in the cloud and output to your virtual assistant, maintaining the ability to sell things, display ads, perform actions specific to those providers and so forth. They haven’t lost anything by relinquishing their app turfs; if anything they’ve streamlined development.

As a user you’re simply subscribing to any of these platforms through a universal user profile and it becomes part of the scope of your virtual assistant. Doesn’t that sound nice and tidy? Think of all the savings across the board. Your time in checking and maintaining a multitude of separate accounts eliminated. Your phone has a single built-into-the-core interface that singularly is updated and can largely check on the status of all your subscribed services off-device back at the server farm. The device power, storage and data savings, particularly when I think of developing countries, would be staggering.
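Here is a bare-bones sketch of that subscription idea, again purely illustrative: the `Assistant` class, its methods and the `Update` fields are hypothetical names I’ve made up, and a real assistant would do the merging back at the server farm rather than in a few lines on the device.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Update:
    """One update pushed by a provider (hypothetical structure)."""
    timestamp: int
    provider: str
    text: str


class Assistant:
    """Single point of contact holding the user's subscribed services."""

    def __init__(self) -> None:
        self._feeds: Dict[str, List[Update]] = {}

    def subscribe(self, provider: str) -> None:
        self._feeds.setdefault(provider, [])

    def unsubscribe(self, provider: str) -> None:
        self._feeds.pop(provider, None)

    def deliver(self, update: Update) -> bool:
        """Accept an update only from a subscribed provider."""
        if update.provider not in self._feeds:
            return False
        self._feeds[update.provider].append(update)
        return True

    def timeline(self) -> List[Update]:
        """Merge every subscribed feed into one stream ordered by time."""
        feeds = (sorted(feed, key=lambda u: u.timestamp) for feed in self._feeds.values())
        return list(heapq.merge(*feeds, key=lambda u: u.timestamp))
```

Subscribing is one call, unsubscribing drops the whole feed, and nothing here needs a separate app, account screen or weekly update per provider.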

Single point interface also means it should not only handle communications and sharing with people, but all the things in the coming IoT (Internet of Things) world. The vehicles, homes, offices, wearables, medical devices, screens and audio systems throughout our lives should all be interconnected and have a free flow dialogue of commands and feedback to and from such objects, centrally via an ever-present assistant. None of them will need to be maintained with a plethora of apps, updates and so forth, as the assistant becomes the pre-processor and conduit to all the things that originally needed apps across so many touch points.


Blissful Digital Harmony is Near

I really love where all this is going. We’re all tiring of having to check and interact with so much stuff to maintain connectivity with the world. My research uncovered so many exciting things happening right now. Owners of Tesla cars are proudly posting videos of walking out to the front of their houses as their car opens the garage door and drives itself out onto the driveway, then drives itself for the vast majority of the ride to work. Or take Google’s video of a near-future family interacting with its Google Home assistant, being informed of flight delays and rebooking restaurant reservations effortlessly. While the family in that video looked eerily way too excited to be interacting with their virtual assistant that early in the morning, it painted a picture of machines working in harmony with humans. It starts pointing to a future where we’re all free from being a slave to the device and become master of the assistant.