Some background
Since day one we've relied on OpenAI's gpt-image-1 engine to power LookSee. It's the only model we've found that can reliably replace objects and still produce a photo-realistic composite, without the result looking like an obvious bad Photoshop attempt. It has quirks, sure, but for the type of visualisation we want with LookSee it's been the only option.
That hasn't stopped us test-driving alternatives:
- Google Gemini Flash Image 2.0 - impressive speed, but items looked like stickers in the finished image
- Midjourney - no API 😢
- Kontext Max - couldn't re-imagine items in different positions to match the space
- Stability AI's SDXL Turbo - low-res and not photo-realistic
Most either don't support multi-image editing or lose realism when you transplant an item from one photo to another. So we keep hunting.
Why we need better models
You might be thinking, “But my phone already turns my kids into Ghibli characters …” True, and it does a decent job at lettering these days, but image editing still falls over on non-symmetrical shapes (think L-shaped couches), and it's ultimately pretty expensive to run.
We can mitigate this a bit with clever prompts and pre-processing, yet we're still limited by the core model powering LookSee.
Nano Banana sneaks in
Last week a mystery model—codename “nano-banana”—started topping the charts on LMArena. You'd submit a prompt, vote between two images, and only then learn which engines were used. Nano-banana was generating a wild response, with strikingly realistic generations and consistent multi-step edits being shared across socials.
It was pretty quickly pegged as an upcoming model from Google, with Logan Kilpatrick fuelling the speculation by teasing the new model.
Then Google broke cover with a 🍌🍌🍌 tweet from Sundar Pichai, confirming the model as Gemini 2.5 Flash Image and rolling it out to AI Studio and the API.
We dropped everything, wired up the API, and slotted it into our agent.
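Wiring it up is straightforward if you hit the Gemini REST endpoint directly. Here's a minimal sketch using only the Python standard library; the model name is the preview id at launch, and the `edit_scene`/`build_payload` helpers and the two-image prompt shape are our illustration, not LookSee's actual pipeline.

```python
import base64
import json
import urllib.request

# Preview model id at launch; check the current docs before relying on it.
API_URL = ("https://generativelanguage.googleapis.com/v1beta/models/"
           "gemini-2.5-flash-image-preview:generateContent")

def build_payload(prompt, image_bytes_list, mime_type="image/jpeg"):
    """Assemble the generateContent body: one text part, then one
    inline_data part per input image (base64-encoded)."""
    parts = [{"text": prompt}]
    for img in image_bytes_list:
        parts.append({
            "inline_data": {
                "mime_type": mime_type,
                "data": base64.b64encode(img).decode("ascii"),
            }
        })
    return {"contents": [{"parts": parts}]}

def edit_scene(api_key, prompt, room_jpeg, item_jpeg):
    """POST the room photo plus the product photo; returned image parts
    come back base64-encoded in the JSON response."""
    payload = build_payload(prompt, [room_jpeg, item_jpeg])
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The multi-image part list is the key bit for our use case: the room shot and the product shot go in as separate inline parts, and the prompt asks the model to composite one into the other.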
Early comparison results
Our baseline test scene is the lounge room below. We hammer it with new lights, wall colours, blinds and furniture because it exposes symmetry and perspective issues quickly. Note: Gemini hasn't had any prompt tweaking; the same prompt was used for both models.
Blinds



A pretty straightforward test - although Gemini kept the curtain track.
Blue Couch



This has typically thrown GPT off, as it's a contoured non-symmetrical shape. Gemini handles it more realistically.
Brown Couch



This stylish couch has devolved into a more standard one, with wider armrests and a thicker base. GPT's material finish possibly looks more like the original.
Carpet



GPT is uncharacteristically off here; it's normally pretty good at handling colours and tone, but this is way too warm. Gemini's colour is more accurate, but it's a bit too grey and flat, losing the warmth of the room.
Couch Chaise



GPT's Achilles heel - it really struggles with non-symmetrical shapes. Gemini, in comparison, nails it.
Kramer



GPT gives very much an artist's impression of the original (compared to Gemini), but GPT has it sitting along the wall more accurately.
Bonus Mistake

Paint Color



GPT is possibly a bit bright, but given the warm colour of the room this looks pretty accurate, although the ceiling is affected too much. Gemini got the tone wrong, however the paint prompt has been tailor-made to handle GPT quirks, so that may be working against it.
Rug



Again GPT has failed on the colour of the rug, however its texture is much more like the original. Gemini is venturing into shag-pile territory.
Spiral Lighting



This light has been a favourite of mine to test with, as its spiral layout and gentle illumination have been a struggle for GPT. Gemini does an excellent job here.
Table



In both cases a shelf has been included that shouldn't be there, but otherwise both have done a great job. GPT possibly edged out Gemini, with the legs recessed from the top.
Error - Extra Elements from the table image.

Wall Lights



Wall sconce lighting can be a good test, as it can change the lighting in a room significantly. GPT is arguably better here, with the exception of removing the door! Gemini was better at maintaining the room, but the illumination is incorrect for the lighting.
Bunk Bed

Bedroom (image courtesy of Keoteun.com)



Lastly, swapping a radically different bed into the room. GPT normally does a pretty good job here, but really mucked it up. It did maintain the bedding (you can't really see it in Gemini's output), but that's part of our prompt to retain the bed manchester.
In closing…
So far Nano Banana nails:
- Colours of replaced items - flooring, carpets and rugs were much more representative of the original
- Asymmetric items - these have always been a struggle for GPT, so we're glad to see Gemini making great improvements here
But gpt-image-1 still wins at:
- Applying paint - we'll need to revisit the paint prompt, but GPT seems to be more on the money here
- Lighting - like paint, GPT more accurately changes the lighting/shadows compared to Gemini
We'll keep stress-testing, but the early win rate is strong enough that Nano Banana will be utilised in our rendering. Currently it's GPT or Gemini, but I can see a situation where we use both to render our complex scenes - like the LookSee Editor (which is still gpt-image-1 only).
Want to give it a test? Jump on to the Beta now!