Google to pause Gemini AI image generation of people

Google announced on Thursday that it will temporarily suspend the image generation of people for its new version of a powerful artificial intelligence model, Gemini, after facing criticism and ridicule for producing inaccurate and inappropriate historical depictions.

What is Gemini and what can it do?

Gemini is Google’s flagship suite of generative AI models, apps and services, launched at the end of 2023. It is designed to create realistic and diverse images, text, video, audio and other content from user inputs such as text prompts, sketches, keywords or existing media.

Gemini is based on a deep neural network architecture that can learn from large amounts of data and generate novel outputs that are coherent, consistent and creative. Gemini can also adapt to different domains, styles and tasks, such as creating logos, cartoons, portraits, landscapes, poems, stories, songs, code and more.

Gemini is available as a web-based platform, a mobile app and a subscription service that offers access to a more powerful version of the AI model. Gemini also integrates with other Google products and services, such as Gmail, YouTube, Photos, Drive, Docs and Assistant.

What went wrong with Gemini’s image generation of people?

One of the features of Gemini is the ability to generate images of people from text prompts, such as names, descriptions, occupations or historical contexts. For example, a user can type “a famous painter” and Gemini will produce an image of a person who looks like a painter, along with some metadata, such as the name, nationality and style of the painter.

However, users on social media have been posting examples of images generated by Gemini that showed historical figures, such as the U.S. Founding Fathers, popes, kings and queens, in a variety of ethnicities and genders that did not match their actual identities. Some users called this inaccurate, misleading, disrespectful or even racist, while others mocked Google’s AI for being a “nonsensical DEI parody” (DEI standing for ‘Diversity, Equity and Inclusion’).

Google said in a post on X (formerly Twitter) that it was aware of the issues with Gemini’s image generation of people and was working to improve the historical accuracy of its outputs. It also said it would pause the generation of images of people and re-release an improved version soon.

Why did Gemini generate inaccurate historical images?

Google has not provided a detailed explanation of why Gemini generated inaccurate historical images, but some possible factors are:

  • The quality and quantity of the training data. Gemini learns to generate images of people from a large dataset drawn from sources such as the web, public databases, Google’s own products and user-generated content. That dataset may not be representative, balanced or diverse enough to capture the nuances of historical contexts, such as the ethnicity, gender, culture, clothing, hairstyles and accessories of people in different times and places.
  • The limitations and biases of the AI model. The underlying neural network tries to generate realistic and diverse outputs that match the user’s prompt, but it may not grasp the meaning, relevance or appropriateness of that prompt, especially when it is vague, ambiguous or open-ended. The model may also carry inherent biases or assumptions, favoring certain ethnicities or genders over others, or producing stereotypical or unrealistic images of people.
  • The expectations and interpretations of the users. The feature is intended as a creative, playful tool, but users bring different expectations shaped by their background, knowledge, preferences and values, and they apply different standards when judging the accuracy, quality and suitability of the outputs, especially for historical figures or sensitive topics.

How can Google improve Gemini’s image generation of people?

Google has not revealed its plans or timeline for improving Gemini’s image generation of people, but some possible steps are:

  • Reviewing and refining the training data. Google could audit the dataset of images of people that Gemini learns from, ensure it is representative, balanced and diverse enough to cover the historical contexts users may request, and augment, update or remove data that is inaccurate, outdated, incomplete or inappropriate.
  • Improving and testing the AI model. Google could refine the neural network architecture and algorithms so the model interprets and matches user prompts more accurately and appropriately, and test and validate its outputs more rigorously and systematically to identify and correct errors, inconsistencies and biases.
  • Communicating with and educating users. Google could explain the capabilities and limitations of the feature and the sources and methods behind its outputs, and provide feedback and control mechanisms, such as ratings, comments, suggestions and options, so users can shape and improve their experience with the tool.

What are the implications and impacts of Gemini’s image generation of people?

Gemini’s image generation of people has sparked debate among the public, the media, experts and regulators, as it raises ethical, social, legal and technical challenges, such as:

  • The representation and diversity of people in AI-generated images. The feature aims to create realistic and diverse images of people, but it can also produce inaccurate or inappropriate historical depictions, as well as unrealistic or stereotypical ones. This may affect how people perceive themselves and others, and how they are represented and valued in society.
  • The authenticity and verifiability of AI-generated images. The feature can create convincing images of people who do not exist, or who differ from their actual identities, including historical figures, celebrities, politicians and ordinary people. This poses risks of deception, manipulation, fraud, identity theft, defamation, misinformation and disinformation, especially online.
  • The ownership and rights of AI-generated images. Such images may have artistic, commercial or personal value, but they may also infringe the intellectual property, privacy or personality rights of the people depicted or of those whose images were used in training. This raises questions of who owns, controls, benefits from and is responsible for AI-generated images and the data involved.

The pros and cons of Gemini’s image generation of people can be summarized from three perspectives:

For the user
  Pros:
  • Can create a wide range of people from text prompts
  • Can explore and express creativity and curiosity
  • Offers fun and entertainment
  Cons:
  • May encounter inaccurate or inappropriate historical depictions
  • May be deceived or manipulated by AI-generated images
  • May infringe on the rights of others

For Google
  Pros:
  • Can showcase and leverage its AI capabilities and innovations
  • Can attract and engage more users and customers
  • Can generate more data and revenue
  Cons:
  • May face criticism and ridicule for inaccurate or inappropriate historical depictions
  • May be accused or sued for infringing on the rights of others
  • May have to comply with more regulations and standards

For society
  Pros:
  • Can benefit from AI-generated images in education, art, entertainment, research and other domains
  • Can promote and celebrate the diversity and representation of people
  • Can foster dialogue and awareness of the ethical, social, legal and technical challenges of AI-generated images
  Cons:
  • May suffer from deception, manipulation, fraud, identity theft, defamation, misinformation and disinformation
  • May experience an erosion of the authenticity and verifiability of images
  • May see conflicts and disputes over the ownership and rights of AI-generated images and the data involved

Conclusion

Gemini’s image generation of people is a powerful and innovative tool that can create realistic and diverse images from user inputs, but it has clear limitations: it may produce inaccurate or inappropriate historical depictions, and it raises ethical, social, legal and technical challenges. Google has paused the generation of images of people and plans to re-release an improved version soon, but it remains to be seen how it will address the problems and concerns the feature has raised.
