The Generative AI landscape has developed rapidly over the last few years, and a huge array of companies now offer different capabilities. The Wall Street Journal recently provided an overview of four key Generative AI companies in this market. While more niche players may offer a specific benefit for your company, these four companies are the best place to start. Let’s dig in!
Making the Right Comparisons
There are many ways, both broad and fine, to compare Generative AI offerings between companies. Along with the many factors that matter for your specific business (to be covered in a future post), the four essential factors of any contract provide a good starting point for comparison:
- Quality – Does the AI’s performance meet your needs? Does it have a sufficient context length for your application? Is 100% correctness required, or is the occasional error manageable? Is it okay to share your data with another company, or is privacy a critical requirement?
- Cost – There are multiple ways to implement AI in your business. Do you want to pay based on consumption on an OpEx basis? Or would you rather stand up your own system on a CapEx basis?
- Timing – How quickly do you need this implemented? When implemented, what’s the response speed you need?
- Quantity – Some services offer bulk processing, while others may limit your transaction rate at certain payment tiers. How much will you be using this service, and what’s the right cost/usage balance for you?
All companies listed offer both chat (human-to-computer) and API (computer-to-computer) options. These options typically use the same models, but are frequently billed differently. Plan accordingly!
Company #1: OpenAI
OpenAI kicked off the most recent Generative AI race with the release of ChatGPT in November 2022. Since then, they’ve continued to lead the world in the quality and ease-of-use of their products.
Their latest model, GPT-4o, supports multimodal input (voice, text, and images) natively, and performs 2x as fast as their next best model. They also offer older or more focused models at lower cost, based on your needs. Here’s how they compare:
- Quality – OpenAI continues to set the standard for AI performance. GPT-4o’s context length is 128k tokens, or the length of an airport book. It does occasionally return erroneous information, although that’s largely determined by your guidance to the model and topic. You are sharing your data with OpenAI in this process. For the Free and Plus chat plans, your data will be used to train future models. All other plans ensure your data won’t be used for future training.
- Cost – For chat-based interaction, they offer four options: Free, Plus ($20/mo), Team ($30/mo per user, two-user minimum), and Enterprise (custom pricing). Each of these has its own trade-offs between price, capability, and privacy – you’ll need to determine what’s right for your business. For API-based interaction, they charge by the token, with lower prices for smaller, less-capable models and higher prices for more powerful ones. Their prices are in line with other providers of similar capability.
- Timing – Response speed is generally based on volume and plan type, and can vary a bit through the day/week. They also offer a specific slower “batch” option, where you get a 50% discount if your response is not time-sensitive.
- Quantity – For chat-based interactions, there are some capacity limits associated with each plan type. For API-based interaction, there are limits on the number of transactions per second based on the model used.
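Because API usage is billed per token, it’s worth estimating costs before committing to a plan. Here’s a minimal sketch of that arithmetic, including the batch-tier discount mentioned above (the $5/$15-per-1M prices below are hypothetical placeholders, not any provider’s actual rates – always check the current price list):

```python
# Rough API cost estimator for per-token pricing.
# Prices are illustrative placeholders, not real published rates.

def api_cost_usd(input_tokens, output_tokens,
                 price_in_per_1m, price_out_per_1m,
                 batch_discount=0.0):
    """Estimate the cost of one API call, with an optional batch
    discount (e.g. 0.5 for a 50%-off non-time-sensitive batch tier)."""
    cost = (input_tokens / 1_000_000) * price_in_per_1m \
         + (output_tokens / 1_000_000) * price_out_per_1m
    return cost * (1 - batch_discount)

# Example: 10k tokens in, 2k tokens out, at hypothetical $5/$15 per 1M.
standard = api_cost_usd(10_000, 2_000, 5.0, 15.0)
batched = api_cost_usd(10_000, 2_000, 5.0, 15.0, batch_discount=0.5)
print(f"standard: ${standard:.4f}, batch: ${batched:.4f}")
```

Multiplying estimates like these by your expected monthly volume is usually the quickest way to compare an OpEx API plan against the alternatives.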
The article notes that “no one ever got fired for choosing ChatGPT” (the OpenAI chat offering). If you’re just starting with AI for your business and data privacy is not a core concern, you will not go wrong here.
Company #2: Google
Despite having invented much of the technology behind today’s AI, Google has been an AI also-ran over the last few years. While they don’t have the best models available, they offer features and integrations not available from other providers, which yield three distinct advantages.
First, they are building AI directly into many of their products. If you use any Google Workspace software (Docs, Sheets, etc.), many AI-powered tools like autocomplete are already built in. They also recently announced AI Teammates, a system that works across your documents and can act as a, well, teammate: searching for information, answering questions, and so on. This will be a unique offering among the major providers.
Second, Google’s flagship model, Gemini 1.5 Pro, now has a 1M token context length, with 2M announced later this year. For reference, this is the equivalent of multiple copies of Moby Dick or War and Peace, with the ability to inquire, summarize, or build on any part of the text inside. This large context window is also unique for a major provider, and a significant advantage if you need to work with lots of documents.
The last, very technical (but important!) feature coming soon is called “context caching”. Historically, when you share data with an AI, the slate gets wiped clean once the interaction (chat or API session) is over. This is fine for small chats, but challenging when you’re querying large sets of documents – each session using those documents would require uploading them again (and getting charged for this again!). With “context caching”, though, you can upload a lot of data, and then reference it in the future over multiple sessions, without uploading again. This will be a huge time and money savings for a lot of data-intensive use-cases.
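To see why context caching matters, here’s a back-of-envelope sketch of the input tokens billed with and without it. The numbers are purely illustrative, and in practice cached tokens typically still incur a reduced storage/reuse fee rather than being free:

```python
# Back-of-envelope comparison: re-uploading a large document set every
# session vs. uploading once and caching it. All numbers illustrative.

def total_input_tokens(doc_tokens, query_tokens, sessions, cached=False):
    """Input tokens billed across all sessions.
    Without caching, the documents are re-sent every session;
    with caching, they are sent once and only the queries follow."""
    if cached:
        return doc_tokens + query_tokens * sessions
    return (doc_tokens + query_tokens) * sessions

docs = 500_000   # a large document set
query = 500      # a short question per session
sessions = 20

print(total_input_tokens(docs, query, sessions))               # no caching
print(total_input_tokens(docs, query, sessions, cached=True))  # with caching
```

At these illustrative numbers, caching cuts billed input tokens by roughly 95% – which is why it’s such a big deal for document-heavy workflows.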
Now, for the comparison:
- Quality – Gemini 1.5 Pro is a very good model, if not quite as performant as OpenAI’s GPT-4o. Like other models, it does occasionally return erroneous information, although that’s largely determined by your guidance to the model and topic. You are sharing your data with Google when using their systems, although using AI through a Workspace account will keep your information out of their AI training data.
- Cost – For chat-based interaction, they offer two options: free and Advanced ($20/month). The main difference is capability – free is a slightly older model with smaller context window, while Advanced is Gemini 1.5 Pro. For API-based interaction, they have a free tier that’s great for experimentation (but your information is in the training data), and a pay-as-you-go plan that’s similar to OpenAI’s.
- Timing – Response speed is generally good, and powered by Google’s immense cloud capabilities. They are also rolling out a new model called “Flash”, specifically designed to answer small queries quickly and cheaply. In independent testing, it has been 2-3x faster than any other first-party model available.
- Quantity – For chat-based interactions, there are some capacity limits associated with each plan type. For API-based interaction, there are limits on the number of transactions per second based on the model used, and these are generally lower than OpenAI’s limits today.
If any of Google’s unique AI features are critical for your business, they’re a great option.
Company #3: Anthropic
You may not have heard of Anthropic, as they are a relatively new player in the AI space. However, they have some of the smartest AI engineers in the business, and their models are very competitive with OpenAI and Google.
Their excellent base model, Claude, is offered in three tiers: Haiku, Sonnet, and Opus. Each tier trades off cost and speed against capability. Here’s how they compare:
- Quality – Most AI comparisons place the Claude models favorably alongside the Google and OpenAI models. They claim very low error rates, and context windows up to 200k tokens. Again, your data is shared with Anthropic, but they offer an API with SOC 2 Type 2 certification and HIPAA compliance options.
- Cost – Like OpenAI, they offer multiple tiers for the chat option. The free option only provides access up to Claude Sonnet, while the Pro and Team versions are priced identically to OpenAI’s, with higher usage limits and access to Claude Opus. For the API, pricing for Haiku is $1.50 per 1M tokens in/out (among the lowest), while Opus is $100 per 1M tokens in/out (the highest by far).
- Timing – Research from Artificial Analysis found Claude Haiku as one of the fastest major models today (113 output tokens/s), and Claude Opus the slowest (29 output tokens/s).
- Quantity – The requests per minute are limited based on payment tier, with the free tier getting 5 requests per minute, and the top tier getting 4000 requests per minute.
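When a plan’s requests-per-minute cap is a binding constraint, the standard workaround is retrying with exponential backoff. A minimal sketch – note that `RateLimitError` here is a stand-in name for whatever exception your actual client library raises, not a real Anthropic class:

```python
# Generic retry-with-exponential-backoff wrapper for rate-limited APIs.
# RateLimitError is a stand-in for your client library's real exception.

import time

class RateLimitError(Exception):
    """Placeholder for the rate-limit exception your client raises."""

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Call `call()`, sleeping 1s, 2s, 4s, ... between rate-limit retries."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))
```

Wrapping your API calls this way lets a lower-tier plan absorb bursts gracefully instead of failing outright.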
If speed, compliance, or low error rates are your most critical needs, then Anthropic may be a great fit for your business.
Company #4: Meta
Meta has taken a much different direction from the other major AI providers. Rather than build and serve their own proprietary AI, they have made their advanced AI models openly available for everyone to use. With their Llama 3 model, you can get GPT-4 levels of performance from any number of providers across the internet, at the price and speed that you choose.
If privacy or system control is a primary concern for you, you can also deploy the model yourself on your own company’s servers. There are even versions of Llama 3 that allow you to download a single file to your computer and run it locally (if slowly). You get to choose your own adventure.
But how does this compare to the others?
- Quality – In most comparisons, Llama 3 is on par with GPT-4 or Gemini. The major downside is the context window, which is only 8k tokens. (Since the model is open, some programmers have reportedly extended the context window significantly, but this has not been officially blessed by Meta at this point). This limits many document intensive workflows while supporting more short-term, interactive workflows.
- Cost – This varies based on your approach. Some providers, such as Replicate, offer the best Llama model at $3.40 per 1M tokens via API. If you decide to serve a comparable model yourself, the server will likely cost a few thousand dollars minimum, not including labor, but with OpEx costs of a few cents per 1M tokens.
- Timing – This, again, varies on your approach. The fastest option, Groq, can serve at speeds of a few hundred tokens per second. Serving yourself is likely on the order of dozens or low hundreds of tokens per second.
- Quantity – Rate limiting will vary considerably among providers. If you manage it yourself, rate is limited only by your network and server hardware.
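One practical consequence of Llama 3’s small 8k-token context window: long documents must be split into chunks before the model can process them. A naive sketch, using the rough rule of thumb of ~4 characters per token (a real pipeline would count tokens with the model’s actual tokenizer):

```python
# Naive text chunker for fitting long documents into a small context
# window (e.g. Llama 3's 8k tokens). Uses a ~4-chars-per-token heuristic;
# production code should use the model's real tokenizer instead.

def chunk_text(text, max_tokens=8_000, chars_per_token=4, overlap_tokens=200):
    """Split text into overlapping chunks that each fit the window."""
    max_chars = max_tokens * chars_per_token
    step = (max_tokens - overlap_tokens) * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), step)]

# Example: a 100k-character document split for a 2k-token window.
chunks = chunk_text("x" * 100_000, max_tokens=2_000)
print(len(chunks), max(len(c) for c in chunks))
```

The small overlap between consecutive chunks helps avoid splitting a key sentence right at a chunk boundary.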
If speed, price, or privacy is your top concern, then Meta’s Llama 3 model – either served yourself or via one of many providers – is your best bet.
What about other Generative AI companies?
As I mentioned before, there’s a large and growing number of AI models and AI services available. While these four options are the best today, this article will likely age like milk within a few years. As always, I’ll continue to provide updated perspectives as the marketplace evolves.
However, this underscores an important point. The AI space is evolving astonishingly quickly, and the AI you use today will be the worst you will ever use. Quality, speed, and performance only go up from here. You want to design your processes and systems so that you are not locked into a single vendor, and can switch easily as better options arise.
The potential for AI in the business world is huge. If you’re just getting started, these generative AI companies will help you make the most of it.
Stay informed on the most important AI news in just 30 minutes a month by signing up for the Executive Summary newsletter by AI For Business.
If you liked this post, have future posts delivered straight to your inbox by subscribing to the AI For Business newsletter. Thank you!