Elon Musk's xAI has launched Grok-2 and Grok-2 mini in beta. The new AI models bring enhanced reasoning capabilities and a notable new feature: image generation. Currently, this feature is available exclusively to Premium and Premium+ users on the X social network.
The release marks a significant upgrade from the previous Grok-1.5 model, with xAI highlighting Grok-2's advanced abilities in chat, coding, and reasoning. Grok-2 mini, while smaller, shares many of these improvements. Grok-2 has already been tested on the LMSYS leaderboard, where it appeared under the name “sus-column-r.”
In addition to these updates, xAI plans to offer both Grok models to developers via its enterprise API later this month, potentially expanding their use in various applications.
However, the rollout of Grok’s image generation feature has raised concerns. Currently, there are no restrictions on creating images of political figures, a situation some users are exploiting, especially with the U.S. presidential election approaching. The ability to generate politically sensitive content without oversight may put xAI under pressure to implement safeguards.
Early observations suggest that Grok-2's image generation relies on FLUX.1 by Black Forest Labs. Details on Grok-2’s full capabilities are limited, but app researcher Nima Owji claims that Grok-2 excels in code generation, writing, and news summarization. Grok-1.5, by contrast, had issues with summarizing news accurately.
The lack of controls on image generation raises the risk of Grok being used to spread misinformation. It remains unclear whether AI-generated images will include metadata to indicate their origin. We’ve reached out to X for comment on how it plans to address potential misuse, but the company's responses have been sparse since Musk's takeover.
Looking ahead, xAI plans to integrate Grok-2 and Grok-2 mini into X’s AI-driven features, including improved search functions, post analytics, and potentially AI-powered replies. As the company continues to develop these models, the addition of multimodal understanding could enhance user interactions on X, though the exact impact remains to be seen.