Building Etsy’s Search by Image Feature: A Journey from Image Classification to Multitask Modeling

Introduction:

Etsy has introduced an image-based discovery tool on its mobile apps that lets users search the marketplace with their own photos. By tapping the camera icon in the search bar, buyers can take a picture and quickly find visually similar results from the platform’s inventory. This is particularly useful on Etsy, where sellers’ unique creations are often hard to describe in words. The system represents each image as an embedding produced by a machine learning model and retrieves similar listings with nearest-neighbor search. Etsy fine-tuned a pre-trained convolutional neural network, experimenting with single-task classification and triplet loss before settling on a multitask classification model whose embeddings are both visually cohesive and categorically accurate. The training data also includes user-generated photos from reviews, so the model sees the kind of images buyers actually take.

Full Article: Building Etsy’s Search by Image Feature: A Journey from Image Classification to Multitask Modeling

Etsy Introduces Image-Based Discovery Tool for Mobile Apps

Etsy has recently announced the launch of a new image-based discovery tool on its mobile apps. This feature allows buyers to search the Etsy marketplace using their own photos as a reference. By tapping the camera icon in the search bar, users can take a picture and quickly find visually similar results from the platform’s extensive inventory.

Searching by image is a rapidly growing trend in the e-commerce industry, and Etsy is embracing it to cater to the unique and creative nature of its sellers’ products. Sometimes, the beauty and uniqueness of these handmade creations cannot be easily expressed through words alone.

The Machine-Learning Architecture

To enable image-based search, Etsy converts every image into a searchable representation called an embedding: a dense vector in a relatively low-dimensional space shared by query photos and listing images. Given the embedding of a query image and precomputed embeddings for the listing images, a nearest-neighbor search finds the listings closest to the query.

The visual retrieval system relies on a machine learning model that converts each listing’s image into an embedding. These embeddings are indexed into an approximate nearest-neighbor system, which quickly scores a query image for similarity against Etsy’s image embeddings.
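
The retrieval step can be illustrated with an exact, brute-force search, which is what an approximate nearest-neighbor index tries to reproduce at much lower cost. This is only a sketch: the listing names, the 3-dimensional vectors, and the choice of cosine similarity are assumptions made for illustration, not details taken from Etsy’s system.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_listings(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Score the query against every listing embedding and keep the top k."""
    scored = [
        (listing_id, cosine_similarity(query, embedding))
        for listing_id, embedding in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical precomputed listing embeddings (real ones have many more dimensions).
listing_index = {
    "red_mug": [0.9, 0.1, 0.0],
    "crimson_cup": [0.8, 0.2, 0.1],
    "blue_scarf": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the buyer's photo.
query_embedding = [1.0, 0.1, 0.0]
top_matches = nearest_listings(query_embedding, listing_index, k=2)
```

Scanning every listing costs time proportional to the size of the inventory, which is why a large catalog is served from an approximate index instead.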

The Multitask Vision Model

Etsy utilizes a convolutional neural network (CNN) to convert images into embeddings. Instead of training the entire CNN from scratch, the platform employs transfer learning. This involves leveraging a pre-trained model and fine-tuning it specifically for Etsy data.

Etsy uses a pre-trained model from the EfficientNet family, chosen for its favorable tradeoff between accuracy and efficiency. The early layers of the pre-trained model are reused as they are, while the classification head and the last few layers of the CNN are fine-tuned for the new task.
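
The idea of reusing early layers while training a new head can be shown with a toy example. Everything here is a simplified assumption: the two-row `FROZEN_BACKBONE` stands in for pre-trained layers, the head is a single logistic unit, and only the head is updated (the source says the last few CNN layers are tuned as well). It is not EfficientNet and not Etsy’s training code.

```python
import math

# Stand-in for pre-trained early layers: these weights are never updated.
FROZEN_BACKBONE = [[1.0, -1.0], [0.5, 0.5]]


def backbone(pixels):
    """Frozen feature extractor: a fixed linear map followed by ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, pixels)))
        for row in FROZEN_BACKBONE
    ]


def predict(head, features):
    """Trainable head: logistic regression on the backbone features."""
    z = sum(w * f for w, f in zip(head, features))
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(head, data):
    """Average binary cross-entropy over the dataset."""
    total = 0.0
    for pixels, label in data:
        p = predict(head, backbone(pixels))
        total -= math.log(p) if label == 1 else math.log(1.0 - p)
    return total / len(data)


def fine_tune(head, data, lr=0.5, epochs=50):
    """Run SGD on the head only; the backbone is used but never modified."""
    head = list(head)
    for _ in range(epochs):
        for pixels, label in data:
            features = backbone(pixels)
            error = predict(head, features) - label  # d(loss)/d(logit)
            head = [w - lr * error * f for w, f in zip(head, features)]
    return head


# Hypothetical two-pixel "images" with binary labels.
data = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], 0), ([0.1, 0.9], 0)]
initial_head = [0.0, 0.0]
tuned_head = fine_tune(initial_head, data)
```

Because gradients are applied only to `head`, training is cheap and the general-purpose features in the backbone are preserved.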

Learning Objectives and Approaches

Training a model on a classification task serves as a proven approach to learning useful embeddings. In Etsy’s case, the initial attempt was to categorize product images. However, these embeddings did not always yield visually cohesive results, as the items surfaced did not match well with the query image in terms of color, material, or pattern.

To overcome this challenge, Etsy switched to a deep metric learning approach using triplet loss. The model is trained on triplets of examples, each consisting of an anchor, a positive example, and a negative example. The triplet loss pulls the anchor and the positive example closer together in embedding space while pushing the negative example farther from the anchor.
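
A common formulation of triplet loss is max(0, d(anchor, positive) - d(anchor, negative) + margin). The sketch below implements that formula, assuming Euclidean distance and a margin of 0.2; the source does not state which distance or margin Etsy used, and the 2-dimensional vectors are made up.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(
        0.0,
        euclidean(anchor, positive) - euclidean(anchor, negative) + margin,
    )


anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # distance 1.0 from the anchor
easy_negative = [3.0, 4.0]   # distance 5.0: already far enough away
hard_negative = [0.0, 0.5]   # distance 0.5: closer than the positive

loss_easy = triplet_loss(anchor, positive, easy_negative)  # 1.0 - 5.0 + 0.2 < 0, so 0.0
loss_hard = triplet_loss(anchor, positive, hard_negative)  # 1.0 - 0.5 + 0.2 = 0.7
```

Only triplets that violate the margin contribute a gradient, which is why the choice of negatives matters so much in metric learning.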

Triplet embeddings improved the visual cohesiveness of the results, which now shared colors and patterns with the query image. However, these embeddings were less categorically accurate than those from the classification approach.

Multitask Classification Approach

To ensure visually consistent results and maintain taxonomy accuracy, Etsy implemented a multitask classification approach. Instead of having a single classification head, separate heads were attached for multiple categorization tasks such as item category, fine-grained item category, primary color, and other item attributes.
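
A minimal sketch of the multi-head idea follows: one shared embedding feeds a separate linear head per task, and each task gets its own cross-entropy loss. The task names, weights, and labels are hypothetical, and summing the per-task losses without weights is an assumption; the source does not say how the losses were combined.

```python
import math


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy for one example."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def linear_head(embedding, weights):
    """One classification head: `weights` holds one row per class."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]


def multitask_loss(embedding, heads, labels):
    """Return (total loss, per-task losses) for one example.

    Tasks without a label for this example are skipped, so a missing
    attribute simply contributes nothing.
    """
    per_task = {}
    for task, weights in heads.items():
        if task not in labels:
            continue
        logits = linear_head(embedding, weights)
        per_task[task] = softmax_cross_entropy(logits, labels[task])
    return sum(per_task.values()), per_task


# Hypothetical shared embedding and two heads with different class counts.
shared_embedding = [1.0, 0.0]
heads = {
    "category": [[2.0, 0.0], [0.0, 2.0]],           # 2 classes
    "color": [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # 3 classes, untrained
}
labels = {"category": 0, "color": 2}

total_loss, task_losses = multitask_loss(shared_embedding, heads, labels)
```

Keeping the per-task losses separate makes it possible to track each task’s progress individually while gradients from every head still flow into the same embedding.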

The multitask learning architecture shares the embedding weights across all tasks. A challenge arose because some attributes are optional seller inputs and are therefore sparsely labeled. To address this, Etsy implemented a data sampler that combines examples from disjoint datasets into every minibatch, ensuring equal representation for each task.
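
One way such a sampler could work is to draw the same number of examples from each task’s dataset for every minibatch. The sketch below assumes sampling with replacement so that small datasets can always fill their quota; the function name, dataset names, and sizes are hypothetical and not taken from Etsy’s implementation.

```python
import random
from typing import Dict, List, Tuple


def balanced_minibatches(
    datasets: Dict[str, List[str]],
    per_task: int,
    num_batches: int,
    seed: int = 0,
) -> List[List[Tuple[str, str]]]:
    """Build minibatches that contain `per_task` examples from every task."""
    rng = random.Random(seed)
    batches = []
    for _ in range(num_batches):
        batch = []
        for task, examples in datasets.items():
            # Sample with replacement so sparse tasks still fill their quota.
            batch.extend((task, rng.choice(examples)) for _ in range(per_task))
        rng.shuffle(batch)
        batches.append(batch)
    return batches


# Hypothetical disjoint datasets of very different sizes.
datasets = {
    "category": [f"category_example_{i}" for i in range(1000)],
    "color": [f"color_example_{i}" for i in range(50)],
    "material": [f"material_example_{i}" for i in range(5)],
}

batches = balanced_minibatches(datasets, per_task=4, num_batches=3)
```

Without balancing, a uniformly sampled batch would be dominated by the largest dataset, and heads for sparse attributes would rarely receive a training signal.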

Using Accurate and Diverse Data

With search by image, buyers do not query with polished listing photos; they use pictures taken with their own phone cameras. Unlike seller-provided product images, these photos are often blurry, poorly lit, or cluttered with distracting backgrounds.

Deep learning models are sensitive to biases in the training data distribution. Etsy therefore expanded its dataset to include photos from buyers’ reviews, which are typically taken with phone cameras. This let the model train on data that more closely matches the photos users actually submit.

Conclusion

Etsy’s new image-based discovery tool for its mobile apps offers buyers a unique and efficient way to search the marketplace using their own photos. By leveraging machine-learning architecture and multitask classification, Etsy ensures visually consistent and categorically accurate results.

The platform’s approach to training deep learning models, together with training data that reflects real buyer photos, helps reduce bias between training and query images. With this image-based search feature, Etsy continues to enhance the shopping experience for its customers and support its community of sellers.

Summary: Building Etsy’s Search by Image Feature: A Journey from Image Classification to Multitask Modeling

Etsy has introduced a new image-based discovery tool on its mobile apps that allows buyers to search for items using their own photos as a reference. Users can simply take a picture and the tool will provide visually similar results from Etsy’s inventory of nearly 100 million listings. This image search feature is particularly useful for Etsy, as the uniqueness of sellers’ creations can often be difficult to describe with words.

The tool utilizes a machine learning architecture that converts all listing images into embeddings, or searchable representations, and uses a nearest-neighbor search algorithm to find the closest matching listings. The system employs a multitask vision model, trained on a dataset of Etsy data, to convert images into embeddings. The model architecture utilizes transfer learning and the pre-trained EfficientNet model for optimal tradeoffs between accuracy and efficiency.

The learning objective involves training the model using a deep metric learning approach with triplet loss, where the model is trained on triplets of examples to push similar items closer together while pushing dissimilar items farther apart. However, this approach lacked categorical accuracy compared to classification and had less observability in terms of training metrics. Therefore, the model was modified to use a multitask classification approach, where separate heads were attached for multiple categorization tasks. Loss and evaluation metrics were computed individually for each task, while the embedding weights were shared across all tasks.

The dataset used for training the model was expanded to include review photos uploaded by users, allowing for a wider range of images in the training data. This expansion resulted in significant improvement in the model’s ability to provide visually relevant results.

Overall, this image-based discovery tool enhances the search experience on Etsy’s platform, allowing buyers to easily find unique items based on their own preferences and style.

Frequently Asked Questions:

Q1: What is machine learning and how does it work?

A1: Machine learning is a subset of artificial intelligence that focuses on enabling computers to learn and make decisions without being explicitly programmed. It involves using algorithms and statistical models to analyze and interpret large sets of data, allowing the machine to improve its performance and accuracy over time.

Q2: What are the different types of machine learning?

A2: There are three main types of machine learning: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning involves training a model with labeled data to make predictions or classifications. Unsupervised learning involves finding patterns or relationships in unlabeled data. Reinforcement learning is based on an agent interacting with an environment and learning to maximize rewards.

Q3: What are some real-world applications of machine learning?

A3: Machine learning has found numerous applications in various industries. Some examples include:

– Healthcare: Predicting disease diagnoses, analyzing medical images, and personalized medicine.
– Finance: Credit scoring, fraud detection, algorithmic trading, and risk assessment.
– Marketing: Customer segmentation, recommendation systems, and targeted advertising.
– Transportation: Autonomous vehicles, route optimization, and traffic prediction.
– Manufacturing: Quality control, predictive maintenance, and supply chain optimization.

Q4: What are the main challenges in implementing machine learning?

A4: Some challenges in machine learning implementation include:

– Data quality and availability: Acquiring and preparing high-quality data for training and testing models.
– Model selection and tuning: Choosing the appropriate algorithms and optimizing their hyperparameters for optimal performance.
– Interpretability and explainability: Understanding how and why a model makes a particular prediction or decision.
– Ethical considerations: Ensuring fairness, transparency, and privacy in the use of machine learning systems.
– Deployment and scalability: Integrating machine learning models into existing systems and handling large-scale data processing.

Q5: How can businesses leverage machine learning to gain a competitive advantage?

A5: Machine learning can provide several benefits to businesses, such as:

– Enhanced decision-making: By leveraging data-driven insights, businesses can make more informed and accurate decisions.
– Automation and efficiency: Machine learning can automate repetitive tasks, improve operational efficiency, and reduce costs.
– Personalization: By analyzing customer data, businesses can personalize their products, services, and marketing strategies to cater to individual needs.
– Improved customer experience: Machine learning algorithms can analyze customer behavior to identify patterns and provide personalized recommendations, resulting in a better overall customer experience.
– Predictive capabilities: Machine learning can help businesses predict future trends, customer preferences, and market changes, allowing them to stay ahead of the competition.