Google makes sweeping changes to its AI operations to monetize Gemini

Updated June 22, 2024

Several changes at Google over the past week have gone largely unpublicized, including the appointment of a new head of AI developer products, and together they suggest the company is moving in a new direction.

Google has also priced the use of its Gemini API and shut down most of the free access to its APIs. The message is clear: the party for developers looking for AI freebies is over, and Google wants to capitalize on AI tools like Gemini.

Until now, Google had given developers free access to the APIs for both its older and newer LLMs. The free access was an attempt to entice developers to adopt its AI products.

Customers mostly encounter Gemini through the company's own chatbot interface. However, many developers build their own chatbots on top of Gemini: their applications send user questions to Gemini through the API and display the answers in their own user interfaces.

Specifically, Google is closing access to the PaLM API (the older LLM that preceded Gemini) through its AI Studio. Google is also ending free access to the more capable Gemini Pro API by introducing a paid plan that restricts free use. Essentially, all paths now lead to Gemini 1.0 Pro, around which Google is consolidating its developer offerings.

Other moves hint at major changes in Google's artificial intelligence plans.

This week, Google hired Logan Kilpatrick as head of AI Studio and Gemini APIs. Kilpatrick comes from OpenAI, which he joined in December 2022, where he led developer outreach. At OpenAI, he "helped scale the... developer platform to millions of developers," according to his LinkedIn profile.

Kilpatrick will now do the same for Google and its Gemini artificial intelligence platform.

Google is lagging behind OpenAI in chatbots and is playing catch-up in getting developers to use its AI platforms.

Many companies chose OpenAI's APIs because they were first to market. OpenAI now charges customers for API access to its large language models.

For example, the OpenAI API is available as part of Windows PCs running Intel's Core Ultra chips, where the API connects users to OpenAI to answer questions. Security companies are integrating ChatGPT into their software products. Companies such as Glean are integrating OpenAI into their enterprise search offerings.

Google is attracting developers with its cloud and its AI Studio service. For now, developers can get free API keys on Google's website, which provide programmatic access to Google's LLMs. So far, developers and users have enjoyed that access for free, but this too is coming to an end.
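As a rough illustration of what such an API key buys, the sketch below builds a request against the publicly documented `generativelanguage.googleapis.com` REST endpoint. The key is a placeholder, and the exact request shape should be checked against Google's current API reference; treat this as a sketch, not a definitive client.

```python
import json
import urllib.request

API_KEY = "YOUR_AI_STUDIO_KEY"  # placeholder: a free key issued via Google AI Studio
MODEL = "gemini-1.0-pro"

def build_request(prompt: str):
    """Build the URL and JSON body for a generateContent call."""
    url = (
        "https://generativelanguage.googleapis.com/v1"
        f"/models/{MODEL}:generateContent?key={API_KEY}"
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, json.dumps(body).encode("utf-8")

url, payload = build_request("What is the capital of France?")
# Sending it requires a valid key and network access:
# req = urllib.request.Request(url, data=payload,
#                              headers={"Content-Type": "application/json"})
# print(urllib.request.urlopen(req).read())
```

The key travels as a query parameter, which is what makes free keys so easy to hand out — and equally easy to meter and bill once pricing kicks in.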

Google struck a double blow this week, effectively shutting down free access to its APIs via AI Studio.

In an email to developers earlier this week, Google announced that it was shutting down access to the PaLM API (the pre-Gemini model) for developers through AI Studio effective August 15. Developers had free access to the PaLM API, which was used to build custom chatbots.

"You'll be able to prompt, configure, and perform inference using the Google AI PaLM API until August 15, 2024," Google said in a March 29 email to developers.

"We recommend testing prompts, tuning, inference, and other features with the stable version of Gemini 1.0 Pro to avoid interruptions. To access Gemini models through the Google AI SDK, you can use the same API key you used for the PaLM API," Google said in a statement.
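In practice the migration Google describes is mostly a change of endpoint path and request-body shape, since the same API key works for both. The sketch below contrasts the two publicly documented JSON payloads; the model names (`text-bison-001`, `gemini-1.0-pro`) reflect the APIs as documented at the time, so verify them against current docs before relying on this.

```python
import json

def palm_body(prompt: str) -> bytes:
    # Legacy PaLM text API shape (endpoint: models/text-bison-001:generateText,
    # shut down August 15, 2024)
    return json.dumps({"prompt": {"text": prompt}}).encode("utf-8")

def gemini_body(prompt: str) -> bytes:
    # Gemini replacement shape (endpoint: models/gemini-1.0-pro:generateContent);
    # the "contents" list also allows multi-turn chat history, unlike PaLM's
    # single "prompt" object
    return json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
```

For a chatbot built on the PaLM API, porting to Gemini therefore means rewriting the payload builder and response parser, not re-provisioning credentials.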

Google also announced this week that it is limiting access to its Google Gemini model API in an attempt to turn free users into paying customers. Free access to Gemini has allowed many companies to offer LLM-based chatbots for free, but Google's changes will likely lead to the shutdown of many of those chatbots.

"Paid pricing for the Gemini API will be introduced," Google said in a message sent to developers on Monday.

"If you're using the Gemini API from a project that has billing disabled, you can still use the Gemini API for free, but without the benefits available in our paid plan," Google said in a statement.

The free plan includes two queries per minute, 32,000 tokens per minute, and a maximum of 50 queries per day. One downside is that Google will use prompts and responses from free-tier usage to improve its products, which presumably include its LLMs.

The paid plan includes five queries per minute, 10 million tokens per minute, and 2,000 queries per day. The preview price is $7 per 1 million input tokens and $21 per 1 million output tokens. Prompts and responses in the paid plan will not be used by Google to improve its products.
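Those per-million-token prices translate into fractions of a cent per call. A simple estimator using the preview figures quoted above (assumed, not an official calculator):

```python
# Preview prices quoted in the article; per-token cost is price / 1,000,000.
INPUT_PRICE_PER_M = 7.00    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 21.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one Gemini API call on the paid plan."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply
print(round(request_cost(2_000, 500), 4))  # → 0.0245
```

About 2.5 cents per such exchange: trivial for a single user, but real money once a popular chatbot multiplies it by thousands of conversations a day.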

There is one exception: PaLM and Gemini will remain available to customers already paying for Vertex AI on Google Cloud. Developers with smaller budgets typically use AI Studio because they can't afford Vertex AI.

Google's APIs are served from hardware hosted in the company's data centers. Gemini runs on Google's TPUs, which handle both training and inference.

Google has committed billions to building new data centers, most recently a $1 billion data center in the UK.

The hundreds of billions being spent on AI data centers are a gamble because companies do not yet have proven AI revenue models. As the use of LLMs grows, even small revenue streams from offerings like paid APIs can help offset hardware and data center costs.

Other AI companies are spending billions to build new data centers and are looking to AI revenue to pay the bills.

It was recently reported that Amazon will spend $150 billion over 15 years to build new data centers.

OpenAI and Microsoft plan to spend $100 billion on a supercomputer called Stargate, The Information reports.

For those who don't want to pay, Google has released Gemma, a family of open large language models on which users can build their own AI applications. Other open models such as Mixtral are also gaining popularity. Meta CEO Mark Zuckerberg has touted the upcoming Llama 3 as a model that will lower the cost of AI adoption.

As the cost of AI increases, customers are leaning toward open-source LLMs. These models can be downloaded and run on one's own hardware, but most developers cannot afford the hardware required, typically Nvidia GPUs, which are also in short supply.
