Comparing Agentic AI for Fashion Search: Daydream, Zalando, Mango
For years, retail tech has promised us a 'personal stylist in your pocket.' Now it’s actually here. Zalando, MANGO, and newer players like Daydream have all launched conversational AI shopping assistants that claim to make shopping feel more human, using LLMs and generative AI to decode messy, natural-language requests like:
'I’m going to a windy beach wedding and want a green v-neck maxi dress with long sleeves, maybe with a belt, so I don’t get sunburnt.'
At first glance, these agents look magical. They return neat carousels, friendly prompts, and moodboards of options. But when you stress-test them, a simple truth becomes clear:
Without deeply structured, enriched product data beyond color and size, even the best AI can’t deliver the magic.
What Agentic Fashion Discovery Promises
These systems promise to:
- Let customers shop 'like talking to a stylist.'
- Replace multi-click journeys with a few natural queries.
- Use event, mood, and aesthetic context ('I need a dress for a beach wedding') to filter products easily.
- Provide styling recommendations in-line to reduce decision fatigue.
And yes, when it works, it feels magical.
- Daydream quickly refines 'navy suit for a wedding' to 'smart-casual with a navy polo.'
- MANGO’s assistant personalises tone and context elegantly.
- Zalando’s push to enrich its catalogue with OpenAI and structured tagging shows it understands the real bottleneck: data.
But that’s exactly where the limitations begin.
Where the Big Players Stand
Zalando: 'We Want to Make Shopping More Inspiring'
Zalando’s Assistant, launched across 25 European markets, shows how serious big retail is about this bet.
'The assistant is now smarter and more personalised than ever. By providing this more personalised and contextually relevant support, we are empowering our customers to make informed choices and find exactly what they're looking for.' — Tian Su, VP Personalisation & Recommendation at Zalando
Backed by OpenAI’s GPT, Zalando’s Content Creation Copilot now enriches around 50,000 product attributes per week during onboarding.
Even Zalando’s engineers openly discuss the limits. In a technical post, Michal Kacper Kubacki and Nikhil Iyer explained that GPT enrichment hits about 75% accuracy — solid for broad attributes like necklines and color, but weaker for 'niche fashion-specific terms' or layered nuances. They noted:
'The risk is that less common or more specific fashion terms may be treated inaccurately or incompletely.'
Some attributes were pulled back entirely when predictions proved unreliable, showing that LLMs alone can’t fix missing upstream data.
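To illustrate what pulling back an unreliable attribute could look like in practice, here is a minimal Python sketch. It is not Zalando's process: the QA sample, the `MIN_ACCURACY` cut-off, and the function name are all hypothetical, chosen only to show the idea of measuring accuracy per attribute against human-verified labels and withdrawing the attributes that fall short.

```python
from collections import defaultdict

# Hypothetical QA sample: (attribute, predicted value, human-verified value).
qa_sample = [
    ("neckline", "v-neck", "v-neck"),
    ("neckline", "scoop", "scoop"),
    ("neckline", "v-neck", "v-neck"),
    ("neckline", "crew", "crew"),
    ("occasion", "party", "wedding"),
    ("occasion", "casual", "casual"),
    ("occasion", "work", "beach"),
    ("occasion", "wedding", "wedding"),
]

MIN_ACCURACY = 0.75  # assumed cut-off, chosen only for illustration


def accuracy_by_attribute(sample):
    """Share of predictions that agree with the human label, per attribute."""
    correct, total = defaultdict(int), defaultdict(int)
    for attribute, predicted, verified in sample:
        total[attribute] += 1
        correct[attribute] += predicted == verified  # True counts as 1
    return {attribute: correct[attribute] / total[attribute] for attribute in total}


scores = accuracy_by_attribute(qa_sample)
# Attributes below the cut-off are withdrawn rather than shown to shoppers.
withdrawn = sorted(a for a, score in scores.items() if score < MIN_ACCURACY)
print(scores)     # {'neckline': 1.0, 'occasion': 0.5}
print(withdrawn)  # ['occasion']
```

In this made-up sample the broad attribute (neckline) passes while the contextual one (occasion) is withdrawn, which mirrors the pattern the engineers describe.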
When it works, it works: In my tests, Zalando handled context-switching well ('show me green… now long-sleeve… now add shoes'). Product suggestions pulled from reasonably tagged PDPs (fabric, lining, care instructions). But for layered, unusual requests like 'beach wedding, heavy wind, sun protection, belted or not', the assistant often fell back to partial matches and 'closest alternatives.'

The initial response seems to provide a broad array of options, including a maternity dress.

The description of how to style the outfit is inspiring, but the visualisation of the outfit is lacking, as each prompt produces only a row of product results.

The assistant brings the actual product into the context of the conversation when it is initiated from the PDP.

Zalando test findings:
- Uses OpenAI for attribute enrichment (~75% accuracy on a limited set of attributes).
- Shows prices clearly, supporting high-intent shoppers. (I wonder if they tested hiding the price for customers on a discovery/inspiration journey.)
- Smooth context-switching across requests.
- Allows bringing a single product into the context of the conversation.
- UI designed for mobile; misses an opportunity for more immersive product discovery on desktop.
- Maternity wear sometimes appears unprompted.
- Lacks styling curation (returns rows of disconnected products).
- 'Beach wedding' refinement often unreliable without upstream tags.
Mango: Style-Led Conversations Meet Data Reality
Spanish giant Mango has bet big on its Mango Stylist, rolling out in nine markets. The company calls it:
'A pioneering virtual fashion assistant powered by generative AI,'
crediting its launch to cross-functional teams spanning IT, Data, Digital Product, and Visual Merchandising.
Mango’s vision:
'Customers can interact with fashion in a new and natural way, making every purchase an inspiring experience.'

'i am looking for an outfit for a beach wedding as a guest'
Mango leads with complete outfits and chooses not to show prices. It's a smart move to guide consumers through an inspiration/discovery journey.

At a glance, it seemed to have nailed the brief. The AI assistant clearly understood the context: 'here's an outfit in green tones that could work for a beach wedding'. However, on inspecting the item, it became apparent that it is not entirely wedding appropriate. The assistant understood that I was looking for a wedding outfit and reflected that in its response, but it did not make the connection to the product attributes required for an item to be wedding appropriate.

It nailed the brief on the follow-up prompt, after I explained that wedding-appropriate attire should not expose my underwear. An issue like this would be easily addressed with enriched product attributes.

In practice, while Mango’s AI holds a polite, iterative conversation ('Maybe something more wedding-appropriate? A higher neckline?'), it hits the same structural wall: if the product catalogue lacks a clear 'occasion' tag or style descriptors, the agent simply can’t find what’s needed.
The best proof? The exact product I asked for — which Mango’s AI said didn’t exist — was immediately findable via the manual onsite search. A basic keyword search succeeded where the 'smart' assistant failed.
'can you find me a green v-neck maxi dress with long sleeves I can wear as a wedding guest'

Onsite search query 'long sleeve green v-neck maxi dress'

Mango test findings:
- Outfit-led discovery with 'shop the look.'
- Personalised conversational tone.
- Avoids price overload to maintain inspiration.
- Weak handoff from conversation to visual search.
- No 'more like this' iterative refinement.
- Inconsistent results when tags are missing ('wedding appropriate' vs. 'see-through').
Daydream: 'Say More.'
Daydream, led by retail veteran Julie Bornstein (Nordstrom, Sephora, founder of The Yes), recently raised $50 million to build the 'ultimate AI shopping experience'.
Daydream’s entire pitch is that traditional e-commerce filters miss the nuance in how people shop. As Bornstein told Vogue Business:
'Searching for a floral dress and having to scroll through endless options without curation is really difficult… Having an iterative dialogue of what the customer is looking for — beyond just a dress — connects with what they’re really emotionally looking for.'
It’s a strong vision, but the underlying tech faces the same hard wall. Daydream enriches its catalogue with subjective tags like 'good for travel' or 'spring appropriate' but still depends on brands supplying decent base data first. Without structured metadata, even the best conversational flows run out of relevant products to retrieve.
The same query I gave Mango, 'can you find me a green v-neck maxi dress with long sleeves I can wear as a wedding guest', returned surprising results: not all first-page results were green or v-neck, possibly indicating some data accuracy challenges.

Daydream also provides an elegant way to bring the individual garments into the conversation to complete the outfit, but there is certainly an opportunity to improve the visual representation of the complete outfit with prompts to style it for different aesthetics.

Daydream certainly stands out in its ability to mix conversational and visual discovery. It has done a great job of blending ChatGPT-style and Pinterest-style inspiration journeys.

Daydream test findings:
- Seamless conversational-to-visual pivot ('more like this,' 'lower price').
- Shows prices while maintaining inspiration.
- Handles refinement ('beach wedding' → 'pleated green dress' → 'puffy sleeves').
- Contextual styling suggestions ('What can I style this with?').
- Dependent on upstream tagging; irrelevant items can slip through.
- Styling advice remains basic.
- 'Green' returns inconsistent shades.
The Core Problem: AI Can’t Retrieve What Isn’t There
LLM-powered agents are language-first, but effective discovery is data-first.
When a customer asks for:
'A long-sleeve green wrap dress that won’t blow up at the beach,'
the system needs to understand:
- Occasion (beach wedding).
- Style (wrap, quiet luxury, boho).
- Practicality (lined, wind-safe, light or airy).
If these tags don’t exist upstream, the assistant will fall back to:
'I couldn’t find exactly what you asked for — would you like to see something similar?'
And while polite, that slowly erodes trust.
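To make the retrieval gap concrete, here is a minimal Python sketch. It assumes a hypothetical catalogue in which each product carries a flat dictionary of tags; the field names, products, and `search` function are invented for illustration and do not describe how Zalando, Mango, or Daydream actually retrieve products. It shows only that a product with missing tags can never be an exact match, however well the request was understood.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    tags: dict[str, str] = field(default_factory=dict)  # hypothetical attribute -> value


def search(catalogue, wanted):
    """Return (exact, partial) matches for the wanted attribute values."""
    exact, partial = [], []
    for product in catalogue:
        # Count how many requested attributes this product's tags satisfy.
        hits = sum(1 for key, value in wanted.items() if product.tags.get(key) == value)
        if hits == len(wanted):
            exact.append(product)
        elif hits > 0:
            partial.append((hits, product))
    # Closest alternatives first: more matched attributes ranks higher.
    partial.sort(key=lambda pair: pair[0], reverse=True)
    return exact, [product for _, product in partial]


catalogue = [
    # Fully enriched: basic and occasion-level tags are present.
    Product("Wrap dress A", {"color": "green", "sleeve": "long", "style": "wrap",
                             "occasion": "beach wedding", "lined": "yes"}),
    # Only basic tags: the dress might suit the occasion, but nothing says so.
    Product("Wrap dress B", {"color": "green", "sleeve": "long", "style": "wrap"}),
]

wanted = {"color": "green", "sleeve": "long", "style": "wrap",
          "occasion": "beach wedding", "lined": "yes"}

exact, partial = search(catalogue, wanted)
print([p.name for p in exact])    # ['Wrap dress A']
print([p.name for p in partial])  # ['Wrap dress B']
```

If 'Wrap dress A' were not in the catalogue, the exact list would be empty and the assistant would be left offering 'Wrap dress B' as a 'closest alternative', even though it may be a perfectly good answer that simply lacks the tags to prove it.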
What Zalando’s 75% Accuracy Tells Us
Zalando’s Content Creation Copilot is a serious step forward, using OpenAI’s GPT-4 Turbo to enrich product attributes. But even with smart prompting, human QA is still required — especially for niche terms like 'deep scoop neck' vs. 'low-cut v-neck'.
75% accuracy sounds impressive until you realize it means 1 in 4 tags is wrong, missing, or too vague, which directly impacts what the assistant can retrieve.
Multiply that by 50,000 attributes per week, and the need for continuous QA becomes clear.
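As a rough back-of-the-envelope check, the two figures quoted above can be multiplied out. The sketch assumes the 75% figure applies uniformly to all 50,000 weekly attributes, which the source material does not state, so treat the result as an order of magnitude rather than a measured number. The variable names are illustrative.

```python
# Figures quoted in this article; applying the accuracy uniformly is an assumption.
attributes_per_week = 50_000
accuracy = 0.75

# Attributes per week that would be wrong, missing, or too vague.
unreliable_per_week = round(attributes_per_week * (1 - accuracy))
print(unreliable_per_week)  # 12500
```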
What This Means for Retailers Betting on Agentic AI
If you want your AI assistant to outperform your search bar, your product data has to do the heavy lifting.
- Bake occasion and aesthetic tagging into onboarding, not later.
- Normalize subjective tags like 'romantic,' 'Gen Z,' 'business casual.'
- Maintain data governance to avoid synonyms splitting results.
- Keep human QA loops to correct hallucinations and errors.
AI can inspire, but it can’t create structured data out of thin air.
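The governance point about synonyms can be sketched in a few lines of Python. The mapping and tag names below are hypothetical and not taken from any retailer's taxonomy; the idea is only that every incoming label is mapped to one canonical tag before indexing, and that unknown labels are routed to human review rather than guessed.

```python
# Hypothetical synonym map: raw supplier label -> canonical tag.
CANONICAL = {
    "v-neck": "v-neck",
    "v neck": "v-neck",
    "vneck": "v-neck",
    "maxi": "maxi",
    "floor-length": "maxi",
    "full length": "maxi",
}


def normalise(raw_labels):
    """Split raw labels into canonical tags and labels needing human review."""
    tags, needs_review = set(), []
    for label in raw_labels:
        key = label.strip().lower()
        if key in CANONICAL:
            tags.add(CANONICAL[key])
        else:
            # Unknown label: keep the original spelling for the QA queue.
            needs_review.append(label)
    return sorted(tags), needs_review


tags, needs_review = normalise(["V Neck", "Floor-Length", "vneck", "bardot"])
print(tags)          # ['maxi', 'v-neck']
print(needs_review)  # ['bardot']
```

Without a step like this, 'V Neck' and 'vneck' would be indexed as different values and a query for either would return only part of the relevant stock.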
What’s Next?
Mango’s Stylist, Zalando’s Copilot, and Daydream’s 'Say More' are signals of what’s coming: a world where you can talk to your closet, and it talks back. But unless retailers fix their product data plumbing, these sleek assistants will keep hitting the same brick wall.
The bet is worth it — but only if the groundwork is solid.
Disclosure: I work for Mapp, a company that provides enriched product attribution for fashion retail. Our technology is designed to solve precisely the upstream data challenges outlined here, ensuring your AI assistants have the structured, detailed, and aesthetic-aligned product data they need to deliver on the 'personal stylist in your pocket' promise.

Mapp Product Attribution
Why this matters: Enhanced product attribution doesn’t just help your AI find the right green v-neck maxi dress; it transforms fragmented stock into a margin opportunity, fuels scalable brand-aligned styling, and helps retailers move from generic recommendations to true inspiration journeys. Most retail tech isn’t built for fashion’s complexity. Ours is.
Tests conducted on 5th July: https://www.zalando.co.uk/ | https://shop.mango.com/ | https://daydream.ing/
Sources: