Startups are raising alarms over Amazon’s AI chips, claiming they fall significantly short of Nvidia’s performance. An internal Amazon document from July 2023 reveals that AI startup Cohere found Amazon’s Trainium chips to be “underperforming” Nvidia’s H100 GPUs, sparking concerns about their competitiveness in the rapidly evolving AI landscape.

The document, obtained by Business Insider, highlights that access to the Trainium 2 chips has been “extremely limited” and plagued by frequent service disruptions. This raises critical questions about Amazon’s ability to deliver reliable AI solutions to its cloud customers, especially as Nvidia commands a staggering 78% of the AI chip market.

Amazon is investing heavily in its in-house AI chips to fuel its next phase of growth, which makes the challenges outlined by startups all the more pressing. Stability AI, known for its AI image generation, raised similar concerns, noting that Trainium 2 chips lag Nvidia’s offerings in both speed and cost-effectiveness.

Amazon’s efforts to compete in the AI-cloud race hinge on the success of its Trainium chips. Historically, AWS’s profitability has stemmed from designing its own data-center chips, which let it avoid paying hefty fees to Intel. However, the current scenario suggests that AWS might face difficulties if customers prefer Nvidia’s proven technology, potentially hurting Amazon’s future revenues.

“The performance challenges with Cohere are still under investigation, but progress has been limited,” the internal document stated.

Amazon representatives said the company welcomes customer feedback that helps it improve its chips and pointed out that Trainium is already seeing adoption from clients like Ricoh and Datadog. They claim that Trainium chips provide 30% to 40% better price performance than current GPUs. Some customers disagree, however, reporting that Nvidia’s older A100 GPUs are up to three times more cost-efficient than AWS’s Inferentia 2 chips for specific workloads.

Moreover, AWS holds just 2% of the AI chip market, compared with Nvidia’s 78%. The document reveals that AWS faces long-standing complaints about its AI chips, which have created “friction points” and hindered adoption.

The absence of Trainium from the recent $38 billion partnership between AWS and OpenAI highlights Amazon’s struggles. Analysts noted that it was “logical” for OpenAI to start with Nvidia GPUs due to their superior performance and established developer ecosystem.

Looking ahead, Amazon is set to unveil Trainium 3 later this year, aiming to accommodate more customers and address the ongoing feedback. Amazon CEO Andy Jassy recently said that Trainium 2 chips are “fully subscribed” and have become a “multibillion-dollar” business.

As Amazon continues to navigate these challenges, the stakes remain high. The experience of Anthropic, a major client using Trainium chips, may prove crucial. The startup’s ability to use Trainium effectively could either validate Amazon’s investments or further weaken its competitive standing.

Investors are closely watching how AWS will adapt amidst these challenges. As market dynamics shift, Amazon’s response in the coming months will be pivotal in determining its future in the AI cloud sector.
