ChatGPT gets image recognition: 6 wild things people are using it for

When ChatGPT first came out, people were flabbergasted by its remarkably human-like understanding of queries and the way it responded to them.

The AI chatbot became an overnight sensation and was all over social media. In fact, global Google searches for the term ‘artificial intelligence’ reached an all-time high, demonstrating the intense consumer interest in the technology.

But people move on quickly. And yet, just when the hype seemed to be fizzling out, OpenAI dropped a couple of update bombs, introducing the ability to ‘see,’ ‘hear,’ and browse the web. The vision feature is particularly impressive, as ChatGPT can now analyse images with a level of detail that almost seems beyond human capabilities.

Naturally, people started to talk about ChatGPT again, and below we have compiled some of the best examples of how people have used the new image recognition feature.

Understanding complex diagrams

Diagrams are used to better represent complex information, but what happens when the diagrams themselves are too convoluted? ChatGPT’s new image capabilities come to the rescue, breaking them down into language that even a child can easily grasp. For instance, one Twitter user was able to get the AI chatbot to explain an image packed with a flow diagram made up of hundreds of elements.

Helping you learn

It works the other way around too. If you need additional context or notes for a simple diagram/flowchart – or simply want to figure out what it’s even about – ChatGPT does an excellent job of that as well.

Identifying image sources

One Twitter user uploaded a screengrab from the movie Gladiator, asking ChatGPT for its source and what the person in the scene is saying. The chatbot responded as if it had watched the movie itself, not only answering the original query but also adding context.

It remains to be seen if the feature works for random shots from movies as well or if it’s limited to popular scenes. But regardless, the tool can come in super handy for reverse image searches, especially when combined with its ability to browse the web.

Interpreting memes and concepts

You either get it or you don’t. Understanding viral memes is sometimes impossible if you are missing the context. Or maybe the post is just too nonsensical or cliche for you to find the humour. If you can’t for the life of you figure out why a meme has received hundreds of thousands of likes, ChatGPT can help.

ChatGPT can also help you understand ‘deep’ images with hidden meanings.

Translation

Yes, tools like Google Lens and Microsoft’s Visual Lens exist, but things can sometimes get lost in translation. ChatGPT can come in handy as a substitute when an attempt to translate text on a hoarding, road sign, shop board, or anywhere else returns gibberish.

Writing code based on images

But perhaps the most impressive application of the feature is its ability to figure out the code for websites and other projects – from screenshots alone – and replicate it accordingly. For example, a user uploaded a screenshot of a SaaS dashboard and ChatGPT produced the complete code for it. Upon checking whether the code worked, the developer was astonished to see that it indeed got most things right.

Of course, it hasn’t even been a full week since ChatGPT gained the ability to see, hear, and speak, so it’s fair to assume that these use cases only scratch the surface. People are continuing to experiment with different types of inputs, and there’s probably a host of cool new applications waiting to be discovered.

ChatGPT’s new image and voice capabilities are still undergoing rollout and are currently exclusive to Plus and Enterprise users.
