The goal of this test was straightforward: take a photo and convert it into a clean black-and-white version, where the subject (a person) appears in solid black while the rest of the background is completely white. The challenge was not only to make the image black and white but also to remove all distracting elements so the focus remains entirely on the subject.
This type of transformation is useful in web design, branding, and visual content creation, where minimalism and clarity are critical. For this test, I used OpenAI’s ChatGPT image model and Google Gemini’s Nano Banana image model, applying the same task to see which one delivered better results.
Winner
After running the test, the clear winner was Google Gemini Nano Banana. Its outputs were closer to the intended result, generated faster, and required less fine-tuning compared to ChatGPT’s image model.
Step-by-Step Process
Plan and Device: I used the free version of ChatGPT and the free version of Google Gemini (2.5 Flash), both in a desktop browser with “Create images” turned on.
Step 1: (Image Import) I uploaded the same source image to both ChatGPT and Google Gemini platforms.
Step 2: (Initial Prompt) Both models were asked to produce a black-and-white version, highlighting the subject in black and keeping the background entirely white.
“I want to turn this image into a pure black and white image. I just need to make the shape of the woman/person black and the rest completely white.”
Step 3: (Refinement Prompt) Based on the initial results, I gave both models a second, more specific prompt:
“Please remove all elements, shapes, and objects except for the woman/person shape. The background should be a clear white.”
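Both prompts ask for a "pure" black-and-white result, which can also be checked objectively rather than by eye. The following is a minimal sketch, not part of the original test: it assumes the output image has already been decoded into a sequence of (R, G, B) tuples (for example with an imaging library), and the function name `off_palette_fraction` and the `tolerance` parameter are hypothetical names chosen for illustration.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255

BLACK: Pixel = (0, 0, 0)
WHITE: Pixel = (255, 255, 255)


def off_palette_fraction(pixels: Iterable[Pixel], tolerance: int = 0) -> float:
    """Return the share of pixels that are neither (near) black nor (near) white.

    `tolerance` is an assumed allowance for compression noise: a pixel counts
    as black if every channel is <= tolerance, and as white if every channel
    is >= 255 - tolerance.
    """
    total = 0
    off = 0
    for r, g, b in pixels:
        total += 1
        near_black = max(r, g, b) <= tolerance
        near_white = min(r, g, b) >= 255 - tolerance
        if not (near_black or near_white):
            off += 1
    if total == 0:
        raise ValueError("no pixels given")
    return off / total


# Tiny hand-made example: two black pixels, one white, one mid-gray.
sample = [BLACK, BLACK, WHITE, (128, 128, 128)]
print(off_palette_fraction(sample))  # 0.25
```

A result of 0.0 would mean every pixel is black or white within the chosen tolerance. This only checks the color requirement; it says nothing about whether the black shape actually matches the person in the source photo.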
Final Result
Comparison
| | ChatGPT | Gemini Nano Banana |
|---|---|---|
| Processing Time | 2m 55s | 19s |
| File Size | 1.47MB | 721KB |
| Image Resolution | 1536×1024 | 1344×768 |
| Output Quality | High | Medium (same as original) |
| Requirement Fulfillment | No. ChatGPT generated a completely new image instead of editing the source image, ignoring the guidance in the prompt. | Yes. Gemini produced the result I was looking for, as described in the prompt. |
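To put the table's numbers in proportion, the short calculation below derives a few ratios from them. It is only a sketch: the figures are copied from the table, the variable names are illustrative, and the file-size comparison assumes 1 MB = 1000 KB, which the table does not specify.

```python
# Figures copied from the comparison table.
# Assumption: 1.47MB is treated as 1470 KB (1 MB = 1000 KB).
chatgpt = {"seconds": 2 * 60 + 55, "size_kb": 1470, "width": 1536, "height": 1024}
gemini = {"seconds": 19, "size_kb": 721, "width": 1344, "height": 768}

# How many times longer ChatGPT took than Gemini.
speedup = chatgpt["seconds"] / gemini["seconds"]

# How many times larger ChatGPT's file was.
size_ratio = chatgpt["size_kb"] / gemini["size_kb"]

# Total pixel counts and aspect ratios of the two outputs.
chatgpt_pixels = chatgpt["width"] * chatgpt["height"]
gemini_pixels = gemini["width"] * gemini["height"]
chatgpt_aspect = chatgpt["width"] / chatgpt["height"]
gemini_aspect = gemini["width"] / gemini["height"]

print(f"Gemini was about {speedup:.1f}x faster")           # about 9.2x
print(f"ChatGPT's file was about {size_ratio:.1f}x larger")  # about 2.0x
print(f"Pixels: {chatgpt_pixels} vs {gemini_pixels}")        # 1572864 vs 1032192
print(f"Aspect ratios: {chatgpt_aspect} vs {gemini_aspect}")  # 1.5 vs 1.75
```

The two outputs also differ in aspect ratio (3:2 versus 7:4), so they are not simply the same picture at two sizes.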
Conclusion
The results of this test were conclusive. While both models are powerful tools, their performance on this specific task was markedly different.
- OpenAI’s ChatGPT took significantly longer, almost three minutes, and ultimately failed to produce the desired result. Instead of editing the original image as instructed, it generated a completely new, high-resolution image that neglected the core guidance of the prompt. In this test, it behaved as a pure image generator rather than an editor of the existing image.
- Google’s Gemini Nano Banana performed exceptionally well. It took just 19 seconds to deliver the output I requested. By isolating the subject and removing all unnecessary background elements, Gemini showed it could follow precise instructions and edit the original image to meet the stated requirements.
Overall, Gemini proved to be easier to control and produced the desired outcomes quickly and efficiently, while ChatGPT’s performance highlights the distinction between image editing and new image generation.
Final Verdict
Based on these results, my recommendation for anyone needing fast, precise image editing using simple text prompts is clear: Google Gemini is the better choice. Its ability to accurately interpret and execute specific editing instructions, coupled with its speed, makes it a valuable tool for designers and anyone else needing quick, effective, and free image transformations.
















