Choosing AI for Magento Page Builder

Comparing popular AI models to choose the one that can create beautifully designed sections without developer assistance.

Let's battle-test the AI Composer element with different prompts to see which models produce the best results for ecommerce stores.

1. Making a section similar to a provided image

Let's steal some designs first 😈. I'll use the following prompt with a few different images:

Create a section based on provided image

Image 1

A clean, minimalistic, flat layout with sharp rectangular edges:

Claude Opus 4.7 and GPT-5.5 provided accurate and consistent results. They properly understood the key aspects of the source design (flat layout, sharp edges, striped full-width sections) and successfully recreated them. Are they worth their price in this test? No, because cheaper models were able to provide nice results too.

Claude Sonnet 4.6 and GPT-5.3-codex were able to deliver the same level of quality at a lower cost.

It's worth noting the difference between GPT-5.3-chat and GPT-5.3-codex. The chat model was more random and tended to add large border radii and shadows. Codex, on the other hand, used more tokens and was therefore more expensive. In fact, its price was on par with the GPT-5.4 model.

GPT-5.4 and Kimi-K2.6 produced good results, but both struggled with details. GPT-5.4 didn't make the sections full-width, while Kimi-K2.6 didn't notice the striped rows and sometimes coded two rows instead of three.

However, I do recommend trying Kimi-K2.6 because of its price.

Comparison table

| Model | Design | Accuracy | Consistency | Speed | Cost |
| --- | --- | --- | --- | --- | --- |
| Claude-Opus-4.7 | ★★★★★ | ★★★★★ | ★★★★★ | 18s | $0.058 |
| Claude-Sonnet-4.6 | ★★★★★ | ★★★★★ | ★★★★★ | 23s | $0.0276 |
| GPT-5.3-codex | ★★★★★ | ★★★★★ | ★★★★★ | 23s | $0.0252 |
| GPT-5.3-chat | ★★★★☆ | ★★★☆☆ | ★★★★☆ | 15s | $0.0166 |
| GPT-5.4 | ★★★★★ | ★★★★☆ | ★★★★☆ | 19s | $0.0242 |
| GPT-5.5 | ★★★★★ | ★★★★★ | ★★★★★ | 28s | $0.0531 |
| Kimi-K2.6 | ★★★★☆ | ★★★★☆ | ★★★☆☆ | 18s | $0.00579 |

Image 2

Modern, fullscreen split-grid layout:

Claude Opus 4.7 and GPT-5.5 proved their status again with clean, accurate results. However, cheaper models were on par with them again.

GPT-5.3-chat and Kimi-K2.6 were able to deliver the same level of quality at a lower cost. Claude-Sonnet-4.6 wasn't 100% accurate, but it kept the most important detail: the full-width layout.

GPT-5.3-codex and GPT-5.4 had the same issue in this test: neither produced a full-width layout.

Comparison table

| Model | Design | Accuracy | Consistency | Speed | Cost |
| --- | --- | --- | --- | --- | --- |
| Claude-Opus-4.7 | ★★★★☆ | ★★★★★ | ★★★★★ | 23s | $0.0655 |
| Claude-Sonnet-4.6 | ★★★★☆ | ★★★★★ | ★★★★★ | 17s | $0.0281 |
| GPT-5.3-codex | ★★★★★ | ★★★★☆ | ★★★☆☆ | 25s | $0.0269 |
| GPT-5.3-chat | ★★★★☆ | ★★★★★ | ★★★☆☆ | 17s | $0.0152 |
| GPT-5.4 | ★★★★☆ | ★★★★☆ | ★★★★☆ | 16s | $0.0246 |
| GPT-5.5 | ★★★★★ | ★★★★★ | ★★★★★ | 35s | $0.0668 |
| Kimi-K2.6 | ★★★★★ | ★★★★★ | ★★★★☆ | 21s | $0.00716 |

Image 3

This test is the hard one: current models struggle to produce good quality for this type of graphic.

Playful, vintage, promotional retail banner.

GPT-5.5 did a great job, while Claude Opus 4.7 lost the cartoonish style and added too much whitespace.

GPT-5.3-codex and GPT-5.4 did a good job. Not quite as good as GPT-5.5, but very close!

GPT-5.3-chat, Claude-Sonnet-4.6, and Kimi-K2.6 produced the least attractive results in this test.

Comparison table

| Model | Design | Accuracy | Consistency | Speed | Cost |
| --- | --- | --- | --- | --- | --- |
| Claude-Opus-4.7 | ★★★★☆ | ★★★★☆ | ★★★☆☆ | 22s | $0.0694 |
| Claude-Sonnet-4.6 | ★★★☆☆ | ★★★☆☆ | ★★★☆☆ | 16s | $0.0231 |
| GPT-5.3-codex | ★★★★☆ | ★★★★☆ | ★★★☆☆ | 15s | $0.0206 |
| GPT-5.3-chat | ★★★☆☆ | ★★☆☆☆ | ★☆☆☆☆ | 13s | $0.0113 |
| GPT-5.4 | ★★★★☆ | ★★★★☆ | ★★★★☆ | 25s | $0.0441 |
| GPT-5.5 | ★★★★★ | ★★★★★ | ★★★★☆ | 39s | $0.0936 |
| Kimi-K2.6 | ★★★☆☆ | ★★★☆☆ | ★★★☆☆ | 60s | $0.00877 |

Summary

GPT-5.3-codex, Claude-Sonnet-4.6, and Kimi-K2.6 are great choices for getting started with AI Composer.

If a higher budget is acceptable, GPT-5.5 and Claude-Opus-4.7 are also strong options and delivered excellent results.
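
To put the budget question in numbers, here is a rough Python sketch that averages the cost and speed columns from the three comparison tables above. The names RESULTS and summarize are made up for this illustration and are not part of AI Composer. Three runs per model is a very small sample, so treat the output as a ballpark only.

```python
from statistics import mean

# (cost in USD, generation time in seconds) for Image 1, Image 2 and Image 3,
# copied from the comparison tables in this post.
RESULTS = {
    "Claude-Opus-4.7":   [(0.058, 18), (0.0655, 23), (0.0694, 22)],
    "Claude-Sonnet-4.6": [(0.0276, 23), (0.0281, 17), (0.0231, 16)],
    "GPT-5.3-codex":     [(0.0252, 23), (0.0269, 25), (0.0206, 15)],
    "GPT-5.3-chat":      [(0.0166, 15), (0.0152, 17), (0.0113, 13)],
    "GPT-5.4":           [(0.0242, 19), (0.0246, 16), (0.0441, 25)],
    "GPT-5.5":           [(0.0531, 28), (0.0668, 35), (0.0936, 39)],
    "Kimi-K2.6":         [(0.00579, 18), (0.00716, 21), (0.00877, 60)],
}


def summarize(results):
    """Return {model: (mean cost in USD, mean time in seconds)}."""
    return {
        model: (mean(cost for cost, _ in runs), mean(secs for _, secs in runs))
        for model, runs in results.items()
    }


summary = summarize(RESULTS)
cheapest = min(cost for cost, _ in summary.values())

# Print models from cheapest to most expensive, with cost relative to the cheapest.
for model, (cost, secs) in sorted(summary.items(), key=lambda item: item[1][0]):
    print(f"{model:<18} ${cost:.4f}/section  {secs:4.1f}s  {cost / cheapest:4.1f}x")
```

Averaged this way, Kimi-K2.6 comes out at roughly $0.007 per section and GPT-5.5 at roughly $0.07, so the top model costs about ten times as much per generated section.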


More tests and models coming soon!