How Computer Vision is Crucial for Consumer Intelligence
The rise of visual media, specifically the sharing of photos and videos, creates opportunities for organizations that text alone cannot offer. Photos and videos may carry diverse cultural connotations, yet they are also far more universal than language.
Computer vision enables images to be used as a source of actionable insights for brands. Image analysis transforms the objects, scenes, attributes, or emotions detected within a picture into readable data that can easily be exploited for business applications.
Image intelligence can serve as an early warning system for situations that may affect customers. It is also a huge help in terms of enhancing your consumer knowledge: understanding moments of consumption, measuring the impact of a marketing campaign on your target audience, identifying new brand ambassadors, etc.
To stay at the forefront of innovation, we keep working on improving our Computer Vision capabilities using the latest models available. That’s why we are proud to announce a brand-new Computer Vision feature in Radarly, built on the most advanced AI innovation: Image Clustering.
What is image clustering?
Image clustering is the ability to group similar images using AI. In Radarly, we now have a dedicated page in the Posts & Analytics section gathering the image clusters detected by AI. This page provides useful information such as the tonality of these clusters, the split by platform, and a set of terms describing each cluster, derived mainly by AI.
A detailed view of the cluster enables users to run a deep qualitative analysis and better understand who the authors are.
How to use image clustering practically
Image clustering can help you find insight nuggets in your project. Paired with the high-level data structuring capabilities of the platform, it lets you conduct a deep qualitative analysis of the visual content in your data set and leverage visual platforms like Instagram and Pinterest.
Let’s have a look at some examples of use:
Image clustering was created to help with trend detection, allowing you to go beyond textual content by offering a new way to analyze images and easily identify big trends automatically.
With this feature, you’ll also be able to focus on analyzing weak signals and micro-trends. Because the page is not frozen, you can use your workspace filters to fine-tune the analysis and narrow it to a very specific aspect of your project, deriving insights about a particular element of your data structuring (a moment of consumption, a specific brand, a competitor, an event, a target audience, etc.) on a given platform, market, and so on.
Thanks to the data structuring work already done for your project, image clustering helps you find relevant insights in the images of your data set and saves time in qualitative analysis.
Let’s take Nike as an example. Digging into the clusters, the brand might discover a group of people who remix its ads as flat-design posters. Analyzing this audience, it could be interesting for the brand to understand whether this trend could inform its content creation and better engage its audience, or a segment of it.
The brand might also find, for instance, that some people love to pair their Nike TNs with chic clothes for contrast. Why not use this insight to build a collaboration with chic fashion brands to promote its street-style shoes?
Another example of an insight that could be derived from image clustering analysis: if we look specifically at two clusters representing two different products of the brand, we can see that the authors of the images do not use the same platforms to share them: consumers who bought Nike Air Force shoes are mainly on Instagram, while consumers who bought Nike Air Zoom shoes are mostly on Pinterest. An insight nugget when it comes to setting up promotional campaigns!
Let’s say you are in charge of content creation for a global brand. Content ideation is not easy! It’s an ongoing challenge to discover new, relevant topics that will keep your audience engaged.
Looking at the clusters of images detected by Radarly can be a huge help in finding what resonates with your target audience, and on that specific point, we all know the power of images to carry a concept or an idea, or to send a message.
It also helps you understand the codes a specific target audience uses when sharing images on visual platforms, and adopt those codes to better engage these audiences.
A final interesting point here is what the AI brings in addition to clustering images: as explained below, the AI also provides a list of the terms that best describe the images in a cluster, whether or not these terms appear in the text shared with the images. A good source of inspiration for adopting the right hashtags, for instance.
Understanding moments of consumption
In some industry verticals, social media is crucial to understanding moments of consumption. In the Food & Beverage industry, for example, brands continuously try to understand how people consume their products, the experience they have, and the places where they consume them. What better way to understand this than analyzing images shared on social media? Image clustering reinforces this capability to analyze those moments of consumption and achieve the consumer-centricity brands are pursuing.
Of course, this feature can be used for other side cases, like crisis management or spam detection, but its main benefit is the time it saves analysts in conducting qualitative analysis of visual content.
Going deeper: how it works
Corentin Morvan, who is part of our terrific Linkfluence AI team, explains image clustering from his expert perspective:
“Technically, image clustering is the process of grouping images into clusters such that images within the same cluster are similar to each other, while those in different clusters are dissimilar.
This task, quite simple at first glance, is pretty challenging to do automatically on millions of images per day, for two main reasons. On the one hand, it is really hard for a computer to judge whether images are similar or not, since a computer sees nothing but pixels. On the other hand, it is quite difficult to automatically name such clusters; and yet, naming these groups of images is an important feature.
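To make the grouping step concrete, here is a minimal, illustrative sketch in Python using toy embedding vectors and a greedy cosine-similarity rule. This is a simplification for the sake of the example, not Radarly’s actual pipeline.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors (1.0 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def greedy_cluster(embeddings, threshold=0.8):
    """Assign each embedding to the first cluster whose centroid is
    similar enough; otherwise open a new cluster."""
    centroids, labels = [], []
    for emb in embeddings:
        for i, centroid in enumerate(centroids):
            if cosine_similarity(emb, centroid) >= threshold:
                labels.append(i)
                break
        else:
            centroids.append(np.asarray(emb, dtype=float))
            labels.append(len(centroids) - 1)
    return labels

# Two near-identical "images" and one very different one (toy 2-d vectors).
embeddings = [np.array([1.0, 0.0]), np.array([0.99, 0.1]), np.array([0.0, 1.0])]
print(greedy_cluster(embeddings))  # → [0, 0, 1]
```

Production systems typically run a proper clustering algorithm (k-means, density-based methods, etc.) over high-dimensional encoder embeddings; the greedy rule above only illustrates the principle of grouping by similarity.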
To tackle both of these problems at the same time, Linkfluence’s AI team once again relied on Deep Learning and its artificial neural networks. Basically, as we said above, a computer does not understand an image; it only sees arrays of pixels, just as it does not understand a sentence and only sees letters next to each other. So, how can it determine that two images are similar and belong to the same cluster? Or that two sentences have similar or opposite meanings?
Fortunately, advances in artificial intelligence over the last decade have enabled computers to intrinsically understand images and texts, thanks to models called encoders. Fed with billions of images or texts, these models learn to represent inputs such that similar ones are embedded close together in a vector representation space. For instance, the representations of two images portraying people on the beach should be really close, since the two pictures depict much the same scene. In this way, encoders gain a deep understanding of the content of an image: its context, the objects in it, and so on.
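As an illustration of what “close in the representation space” means, here is a tiny Python sketch with made-up 4-dimensional embeddings (a real encoder would produce these from pixels, with hundreds of dimensions):

```python
import numpy as np

# Hypothetical embeddings for three photos (toy values for illustration).
beach_photo_1 = np.array([0.9, 0.1, 0.0, 0.2])
beach_photo_2 = np.array([0.8, 0.2, 0.1, 0.3])
city_photo    = np.array([0.1, 0.9, 0.7, 0.0])

def cos(a, b):
    """Cosine similarity: 1.0 means identical direction, 0.0 unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two beach scenes sit close together in the space...
print(round(cos(beach_photo_1, beach_photo_2), 2))  # → 0.98
# ...while a beach scene and a city scene sit far apart.
print(round(cos(beach_photo_1, city_photo), 2))  # → 0.17
```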
However, for years, models were split between those dealing with images and those dealing with texts, both in their architecture and in their way of learning.
But, as you may easily notice, it does not work that way in the human brain. Indeed, seeing an image of someone skiing down a mountain and reading “a person is skiing down the mountain” are pretty much the same thing for us, as we routinely switch from text to image and vice versa. As proof, you can easily imagine the scene of such a picture just by reading my words!
So, why couldn’t artificial models do the same? Well, one reason is that, for a long time, they had very different architectures. The idea of using multimodal models emerged in the state of the art around the beginning of 2021. Rather recent! Always wanting to be on the cutting edge of technology at Linkfluence, we decided to give it a try and take advantage of its potential power.
Fed with an image and a text depicting the same idea, the model, based on a Transformer architecture, learns to represent both inputs similarly; like a human would!
Using such a multimodal approach, the model is not only able to understand that images representing the same idea should be grouped together, it is also able to use the vector representation of a cluster of images to pick the most appropriate terms to name it! Thus solving both of the image clustering challenges at once. Wonderful!
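Here is a hedged sketch of how that naming step can work in a shared image-text space: embed the cluster’s images, take their centroid, and rank candidate terms by how close their text embeddings fall to it. The vectors and the term list below are made up for illustration; they are not the actual model’s values.

```python
import numpy as np

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    return v / np.linalg.norm(v)

# Toy 3-d embeddings in a shared image-text space (hypothetical values).
cluster_images = [np.array([0.9, 0.1, 0.1]),
                  np.array([0.8, 0.2, 0.0])]
candidate_terms = {
    "sneakers": np.array([0.95, 0.10, 0.05]),
    "mountain": np.array([0.10, 0.90, 0.20]),
    "beach":    np.array([0.00, 0.20, 0.90]),
}

# Centroid of the cluster's image embeddings.
centroid = normalize(np.mean(cluster_images, axis=0))

# Rank terms by cosine similarity between their embedding and the centroid.
ranked = sorted(candidate_terms,
                key=lambda term: float(normalize(candidate_terms[term]) @ centroid),
                reverse=True)
print(ranked[0])  # → sneakers
```

The key property being exploited is that, in a multimodal space, a word and an image depicting it land near each other, so the nearest term to a cluster’s centroid is a reasonable label for it.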
Furthermore, to perfect the labeling of each cluster, we also retrieve the most common keywords present in the corpus made up of the texts associated with the posted images, enabling our customers to see the terms most frequently used in relation to each cluster of images.”
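That keyword-retrieval step can be as simple as counting frequent non-stopword terms across the captions attached to a cluster’s images. A minimal sketch (the stopword list and captions below are invented for the example):

```python
import re
from collections import Counter

# A tiny illustrative stopword list; real systems use much larger ones.
STOPWORDS = {"the", "a", "my", "in", "with", "on", "and", "new"}

def top_keywords(captions, n=3):
    """Most frequent non-stopword terms across a cluster's captions."""
    words = []
    for caption in captions:
        words += [w for w in re.findall(r"[a-z']+", caption.lower())
                  if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]

captions = [
    "my new sneakers on the street",
    "street style with fresh sneakers",
    "sneakers and coffee in the city",
]
print(top_keywords(captions))  # → ['sneakers', 'street', 'style']
```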
Thank you Corentin, it’s crystal clear!
Would you like to see Image Clustering in action for your brand? Request a demo now or contact your account manager!