Straight Talk Consumer and Entertainment Services

Ovum view

Computer vision will be the most fundamental driver of new technology adoption over the next 10 years. It gives machines the ability to see and interpret images, directly or indirectly, from various sources. Facial recognition used for friend tagging in pictures on social networks and automatic number-plate recognition for toll charging are simple examples already seen in the real world. The underlying technology is not new, but the scale and pace at which it is growing are unprecedented. It is fueled by new camera technology in the smartphone market, where 90% of images are created, and amplified by new artificial intelligence algorithms able to recognize objects and scenes in real time.

The first market indicator is growth in the number of eyeballs that machines will have. The largest contributor will be smartphones by a wide margin (Ovum expects the smartphone installed base to reach 5.5 billion devices by 2022, up from 3.9 billion in 2017). The proportion of smartphones that include augmented reality (AR) capabilities (addressable via ARCore or ARKit platforms) will grow from a third in 2018 to 100% in 2022. Every smartphone has a minimum of two cameras (front and back), and high-end devices such as the latest iPhone and Samsung smartphones have closer to four lenses. On a smaller scale, always-on connected cameras used in the home as part of smart home security products will grow 10-fold in the next five years to 1 billion devices. It won’t be long before there are more artificial eyes than human ones.
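To put the growth figures quoted above on a comparable footing, the short sketch below converts them into implied annual growth rates. It assumes constant compound growth between the two endpoints, which is a simplification for illustration and not part of Ovum's forecast; the function name `implied_annual_growth` is hypothetical.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant compound annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


# Smartphone installed base: 3.9 billion (2017) to 5.5 billion (2022).
smartphone_rate = implied_annual_growth(3.9, 5.5, 2022 - 2017)

# Always-on connected home cameras: 10-fold growth over five years.
# Only the ratio matters here, so the starting value is normalized to 1.
camera_rate = implied_annual_growth(1.0, 10.0, 5)

print(f"Smartphone installed base: {smartphone_rate:.1%} per year")
print(f"Connected home cameras:    {camera_rate:.1%} per year")
```

Under this assumption, the smartphone installed base grows by roughly 7% a year, while connected home cameras grow by nearly 60% a year from a much smaller base.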

One key aspect of machines that is fundamentally different to humans is that what they see and experience can be recorded and shared with other machines almost instantly. The cameras are also accessible by any app or software to which the owner grants access, such as Instagram, Snapchat, Facebook, WhatsApp, and Pinterest, among many others. New services such as Vivint’s Streety take advantage of this aspect of computer vision to extend smart home video monitoring from individual houses to an entire neighborhood. A recent research paper from the University of Cambridge in the UK and Nokia Labs, published in Frontiers in Physics, demonstrates how images posted on social networks can be used to assess the well-being of different city areas and to predict gentrification within five years. An extension to this project is due to look at the health of citizens by analyzing pictures of food posted on the web.

The quantity of data created by these artificial eyes is another key market indicator of impending computer vision-led tech disruption. More than 1 trillion pictures have been taken with smartphones each year since 2015, with more than 350 million photos uploaded to Facebook and 95 million to Instagram daily. This data can be used to train AI algorithms, increasing the accuracy of image recognition and interpretation, which in turn makes applications much more effective and valuable to consumers. This is why long-term futuristic concepts such as self-driving cars and drones are becoming realities in a much shorter timeframe than previously anticipated.

Straight Talk is a weekly briefing from the desk of the Chief Research Officer. To receive this newsletter by email, please contact us.