An Introduction to Structured Data at Etsy

Etsy recently published a blog post detailing how they store and manage structured data. The Etsy team make extensive use of taxonomies and store the structured data in JSON files.

Etsy is a marketplace allowing sellers to post one-of-a-kind items. Their landing page slogan "If it’s handcrafted, vintage, custom or unique, it’s on Etsy" reveals that uniqueness is a selling point for Etsy. As long as an item falls within the three broad categories of craft supplies, handmade, and vintage, it can be listed in the marketplace.

Etsy’s organisation of structured data defines a taxonomy composed of categories, attributes, values and scales. For example, an item in the taxonomy can have category "boots", attribute "women’s shoe size", with value "7" and scale "United Kingdom". Etsy’s taxonomy contains over 6,000 categories, 400 attributes, 3,500 values and 90 scales. These hierarchies combine to form over 3,500 filters. The taxonomy is represented by JSON files, one per category. Each file records the category's position in the hierarchical tree, along with the attributes, values and scales for items in that category.
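To make the one-file-per-category layout concrete, here is a minimal sketch of what such a file might look like and how it could be loaded. The field names ("name", "parent", "attributes", "values", "scales"), the parent category and the "United States" scale are assumptions made for illustration; the article does not describe Etsy's actual schema.

```python
import json

# Hypothetical contents of one category file. Every field name here is an
# assumption for illustration, not Etsy's actual schema.
BOOTS_JSON = """
{
  "name": "boots",
  "parent": "womens_shoes",
  "attributes": [
    {
      "name": "women's shoe size",
      "values": ["2", "3", "4", "5", "6", "7", "8", "9", "10"],
      "scales": ["United Kingdom", "United States"]
    }
  ]
}
"""


def load_category(raw):
    """Parse one category file and index its attributes by name."""
    category = json.loads(raw)
    category["attributes"] = {a["name"]: a for a in category["attributes"]}
    return category


boots = load_category(BOOTS_JSON)
size = boots["attributes"]["women's shoe size"]
print(boots["parent"], size["values"], size["scales"])
```

The "parent" field stands in for the category's position in the tree; a real file could equally store a full path or a list of children.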

The taxonomy is used in two cases. For the seller, it is used to categorise the listing at the time of creation. When creating a new listing, sellers choose its category from the taxonomy, using an auto-complete text field to select the most appropriate one. Based on the selected category, that category's JSON file provides the attributes, value ranges and scales (e.g. "women’s shoe size", "2…10", "United Kingdom"). This reduces the need for overloaded product descriptions, simplifies the process for sellers and ensures that Etsy collects only the relevant attributes for each category.
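The seller flow described above could be sketched roughly as follows: suggest categories as the seller types, then check the submitted attributes against what the chosen category allows. The in-memory `TAXONOMY` structure, the extra category names and both function names are hypothetical, and the prefix match is only a stand-in for whatever suggestion logic Etsy actually uses.

```python
# Hypothetical in-memory taxonomy keyed by category name. The structure and
# the "bookends"/"bracelets" entries are assumptions, not Etsy's actual data.
TAXONOMY = {
    "boots": {
        "women's shoe size": {
            "values": [str(n) for n in range(2, 11)],  # "2" through "10"
            "scales": ["United Kingdom"],
        }
    },
    "bookends": {},
    "bracelets": {},
}


def suggest_categories(prefix):
    """Return category names starting with the typed prefix, sorted."""
    p = prefix.lower()
    return sorted(name for name in TAXONOMY if name.startswith(p))


def validate_listing(category, attributes):
    """Return a list of problems; an empty list means the listing is valid.

    `attributes` maps an attribute name to a (value, scale) pair.
    """
    allowed = TAXONOMY[category]
    problems = []
    for name, (value, scale) in attributes.items():
        spec = allowed.get(name)
        if spec is None:
            problems.append(f"{name!r} is not an attribute of {category!r}")
        elif value not in spec["values"]:
            problems.append(f"{value!r} is not a valid value for {name!r}")
        elif scale not in spec["scales"]:
            problems.append(f"{scale!r} is not a valid scale for {name!r}")
    return problems


print(suggest_categories("bo"))
print(validate_listing("boots", {"women's shoe size": ("7", "United Kingdom")}))
```

Driving the form from the category definition is what lets the marketplace ask only for attributes that make sense for the selected category.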

For the buyer, the taxonomy enables search by category and subcategory. Every category has its own distinct filters, which are defined by the taxonomy. A big data job classifies each search query to a taxonomy category, which is then displayed to the user along with that category’s filters.
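As a rough illustration of the buyer side, the sketch below assumes the classification job has already produced a lookup from normalised query to category, and that each category's filters are available by name. Both dictionaries, their contents and the `search_context` function are hypothetical; the article does not say how the job's output is stored or served.

```python
# Stand-in for the output of the query-classification job: a precomputed
# mapping from normalised query to category. Hypothetical data.
QUERY_TO_CATEGORY = {"womens boots": "boots", "ankle boots": "boots"}

# Filters per category, as they might be derived from the taxonomy (assumed).
CATEGORY_FILTERS = {"boots": ["women's shoe size"]}


def search_context(query):
    """Return (category, filters) for a query, or (None, []) if unclassified."""
    # Lower-case and collapse whitespace so lookups are insensitive to both.
    normalised = " ".join(query.lower().split())
    category = QUERY_TO_CATEGORY.get(normalised)
    return category, CATEGORY_FILTERS.get(category, [])


print(search_context("  Womens   BOOTS "))
print(search_context("teapot"))
```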

The next steps for Etsy revolve around structuring unstructured data such as listing titles and descriptions. Etsy can also use machine learning to discover more information about a listing, for example by inferring the colour of an item from the submitted image rather than asking the seller. The goal for Etsy is to use structured data to power deeper category-specific experiences, thus creating a better connection between buyers and sellers.
