If you’ve ever tried several different ways of describing an item in a text-based search and still couldn’t find what you were looking for, then visual search might be for you. With visual search, finding exactly what you’re looking for is as easy as taking a picture on your phone. According to a new report by the market research company Global Market Insights, Inc., visual search is forecast to grow at a CAGR of over 45% from 2018 to 2024. This blog post introduces a series of posts that will dive into the technical challenges of visual similarity search that must operate at web scale.

Making Search Easy

Visual search allows you to replace your keyboard with your camera phone by using images instead of text to search for things. For example, as seen in the picture below from Target, if you see a particular vase that you like, all you need to do is take a picture of the vase and visual search will help you find the same vase or a similar vase from Target’s catalogue.

Visual Search in Retail

Visual search is ideal for retail segments such as fashion and home design, because they are largely driven by visual content and style is often difficult to describe in a text search. Wayfair is an example of a home design company that makes it easy for you to search their catalogue by using a picture that you have taken.

Discovering New Products and Ideas

In addition to visually searching their catalogue for similar items, the Wayfair visual search application also allows for discovery of cross-category products. The application looks at your browser history (context) to provide recommendations for items in different categories. For example, it might recommend a nightstand with a similar style to the style of a dresser you recently browsed.

At the Intersection of Cutting Edge Machine Learning, Distributed Computing, and Dedicated Hardware

You may have been wondering how these services work. If you guessed that recent advances in machine learning are driving this, you would be right. But that’s just part of the story. These days, a typical online store the size of Amazon or eBay serves millions of users, offering billions of products. From just a photo, they can search across an enormous inventory and instantly return the top matches. It’s a masterful feat of engineering that lies at the intersection of cutting-edge machine learning, distributed computing, and dedicated hardware.

Many of the players in this space have blogs or publish their research, thus offering a glimpse at how they implement visual search on such a massive scale.

We’ll showcase some of those examples, and, later on, introduce our own tech that enables billion-scale visual similarity search.

This blog series will be composed of the following posts:

  • Machine Learning In Visual Search
  • A Picture Is Worth A Thousand…Numbers
  • Big Data, Make Room For Big Vectors!
  • Embedded Embeddings: New Hardware For Visual Search

Here is a sneak peek at each of the upcoming posts:

Machine Learning In Visual Search

In the blog post, “Machine Learning In Visual Search,” we’ll cover some machine learning background that’s required for the rest of the series. We’ll showcase some successful use cases that combine machine learning with visual similarity search. For example, the combination has enabled Walmart to automate product categorization for its vendors. And at Pinterest, the marriage of machine learning and visual search has significantly improved online recommendations, increasing user engagement by almost 50%.

A Picture Is Worth A Thousand…Numbers

In the blog post “A Picture Is Worth A Thousand…Numbers,” we’ll focus on machine learning algorithms based on deep learning. Deep learning is amazing at extracting meaningful information from images. Through a technique called distance metric learning, deep learning can transform any image into a compact, information-rich vector of numbers. We’ll look at how eBay and Flipkart use this technique for visual similarity search within their massive product catalogues.

Flipkart visual similarity clusters for several shirts based on vector representations of product images.
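
To make the idea of comparing images as vectors a little more concrete, here is a minimal sketch in plain Python. It assumes each image has already been turned into a vector by some embedding model. The four-number vectors, the item names, and the helper functions are all made up for illustration, and cosine similarity is just one common way to compare vectors; this is not a description of how eBay or Flipkart implement their systems.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, catalogue, k=2):
    """Return the names of the k catalogue items whose vectors are closest to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalogue.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical embeddings; real ones typically have hundreds of dimensions.
catalogue = {
    "striped_shirt_a": [0.9, 0.1, 0.0, 0.2],
    "striped_shirt_b": [0.8, 0.2, 0.1, 0.3],
    "plain_vase": [0.0, 0.9, 0.8, 0.1],
}

# Hypothetical embedding of the photo the shopper just took.
query = [0.85, 0.15, 0.05, 0.25]
top_matches = most_similar(query, catalogue, k=2)
```

With these toy numbers, the two shirts score far higher than the vase, which is the behaviour you would hope for when the query photo shows a striped shirt.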

Big Data, Make Room For Big Vectors!

A database of images, even if each one is represented by a compact numeric vector, could easily run up into the billions. How do you efficiently store and quickly search across such a massive dataset? Traditional large-scale storage systems based on SQL or NoSQL aren’t designed for this kind of workload. In the blog post “Big Data, Make Room For Big Vectors!,” we’ll get into the nuts and bolts of Big Vector storage and computing. While some have proposed hybrid systems that augment existing Big Data techniques, others have started from scratch. We’ll highlight how Facebook and Microsoft are tackling this problem. They demonstrate highly customized mathematical algorithms that optimize visual similarity search on GPUs.
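
A quick back-of-the-envelope calculation shows why scale is the hard part. The sketch below assumes 128-dimensional vectors stored as 32-bit floats and a naive search that compares the query against every stored vector. Both are illustrative assumptions rather than figures from any of the companies mentioned.

```python
def index_size_gb(num_vectors, dimensions, bytes_per_value=4):
    """Raw storage for the vectors alone, in gigabytes (no index overhead)."""
    return num_vectors * dimensions * bytes_per_value / 1e9

def multiply_adds_per_query(num_vectors, dimensions):
    """Work for one exhaustive search: one dot product per stored vector."""
    return num_vectors * dimensions

# Assumed catalogue: one billion images, 128 numbers per image.
size_gb = index_size_gb(1_000_000_000, 128)            # 512.0
work = multiply_adds_per_query(1_000_000_000, 128)     # 128,000,000,000
```

Under these assumptions the raw vectors alone take about half a terabyte, and every single query touches all of it, which is why compression, approximate search, and specialized hardware come into the picture.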

Embedded Embeddings: New Hardware For Visual Search

But are GPUs the only answer? Adding more power-hungry GPUs to handle ever-growing product catalogues may not be the best way to go. As a new breed of dedicated AI chips hits the market, the alternatives for low-power and fast computational search are growing. In the blog post “Embedded Embeddings: New Hardware For Visual Search,” we will showcase our own in-memory, bit-processor technology that delivers on key billion-scale similarity search benchmarks, including latency and power.


With visual search, you don’t have to think of the right words to describe an item; you can just use pictures to find what you’re looking for. Many people believe that visual search will change the way we search, as Pinterest co-founder and CEO Ben Silbermann put it in a CNBC interview: “A lot of the future of search is going to be about pictures instead of keywords.”

To tune in to the rest of this blog series, check out https://medium.com/gsi-technology.
