I follow a lot of nerd stuff to bring you the best insights on AI for business. Developments like specific chip technology or hardware generations are not things business leaders should have to worry about. Your focus should be the processes and outcomes, not the hairy details of the underlying technology.
However, there have been a number of recent announcements in the tech space that lead to a clear pattern about the future of AI services – and a clear mantra for business leaders:
Minimize investment, keep switching costs low.
Of course, this isn’t the first time I’ve said it. However, I want to take a minute to really dig into the “what” and “why” of these recommendations. These developments will have big implications for the value generative AI can deliver to your business now and into the future, and implementing the right strategy will be critical. Let’s dig in.
Trends in AI Services
Today, GPUs (graphics processing units) are the core hardware of generative AI. At the recent COMPUTEX event, NVIDIA – today’s undisputed leader in GPUs – showed the following graph, comparing the trend of computational power of their chips to Moore’s Law:
Moore’s Law says computing power will typically double every 18 months. According to the graph, the computing power of NVIDIA’s chips is growing by an order of magnitude every 2 years – an enormous difference!
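To see just how enormous that difference is, it helps to run the two growth rates forward. The sketch below compares a six-year horizon under Moore’s Law (doubling every 18 months) against the 10x-every-2-years trend from NVIDIA’s graph; the horizon length is an illustrative choice, not from the graph itself:

```python
# Compare compute growth over a 6-year horizon (illustrative math only).
years = 6

# Moore's Law: doubling every 18 months (1.5 years).
moores_law = 2 ** (years / 1.5)

# NVIDIA's trend per the COMPUTEX graph: 10x every 2 years.
nvidia_trend = 10 ** (years / 2)

print(f"Moore's Law:  {moores_law:.0f}x")   # 16x
print(f"NVIDIA trend: {nvidia_trend:.0f}x")  # 1000x
```

Over just six years, the same starting point diverges by roughly a factor of 60 – which is why the trend matters more than any single hardware generation.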
But, as I’ve noted before, it won’t be long before other players are on the scene. Microsoft is working with numerous partners to expand availability of AI-specific chips called NPUs. Other companies like Groq and SambaNova have already demonstrated bespoke systems that can serve AI 100x-1000x faster than existing GPU-based systems. The age of AI-focused compute is just around the corner.
And then there’s the fate of all of the existing GPUs already in service. When the big AI players decide to upgrade to the latest NVIDIA (or other) AI hardware, they won’t throw out all of the old GPUs. This last generation of hardware will likely find its way to hobbyists, start-ups, and other companies looking to run their own AI affordably. They’ll get today’s AI at tomorrow’s prices.
Putting all this together – even accounting for major increases in demand – one clear assessment emerges:
Within the next few years, AI is going to get both dramatically faster and dramatically cheaper.
But what does this mean for business?
Rule 1: Minimize Investment
We are still in the early days of the generative AI revolution. While some applications, like Tier 1 customer support, are clear winners, experimentation is still the norm.
With both CAPEX and OPEX options widely available for AI, and trends moving toward dramatically cheaper AI, the rule for business should be simple: only invest where you can see fast payback.
With opportunities like customer support, where savings of 60-80% are common, the payback time for using generative AI is measured in months, if not weeks. If your company already has datacenter competencies, simply buying the AI servers may be your most affordable option – even on very short timescales.
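The payback math itself is simple enough to sketch. The figures below are purely hypothetical assumptions for illustration – your own support spend, implementation cost, and service fees will drive the real answer:

```python
# Hypothetical payback calculation for a generative AI support rollout.
# All dollar figures and the 70% savings rate are illustrative assumptions.

def payback_months(current_monthly_cost, savings_rate,
                   upfront_cost, ai_monthly_cost):
    """Months until cumulative net savings cover the upfront investment."""
    monthly_savings = current_monthly_cost * savings_rate - ai_monthly_cost
    if monthly_savings <= 0:
        return float("inf")  # the project never pays back
    return upfront_cost / monthly_savings

# Example: $100k/month Tier 1 support spend, 70% savings,
# $150k upfront implementation, $10k/month AI service cost.
months = payback_months(100_000, 0.70, 150_000, 10_000)
print(f"Payback in {months:.1f} months")  # 150k / 60k = 2.5 months
```

Even with a meaningful upfront cost, savings rates in the 60-80% range put payback well inside a year.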
However, if you’re still experimenting or the business case hasn’t yet been proven, then pay as you go, and use the least expensive models that work for you. Once the business case is dialed in, cheaper and faster AI will be waiting for you.
Rule 2: Keep Switching Costs Low
OpenAI’s ChatGPT came out in November 2022, kicking off the era of widespread generative AI just over 18 months ago. Since then, not only has OpenAI released multiple new models at various price points and capabilities, but other very competitive options have become available from Google, Microsoft, and Anthropic. The cost of the original ChatGPT model, GPT-3.5, is now 95% lower than when it was first offered – and this has become the rule, not the exception.
Business uses that may not have penciled out at earlier prices will become no-brainers a year from now. Likewise, they may make sense with one service today, and another service tomorrow. If you had locked in your pricing in early 2023, you would have missed out on the 95% price decrease of GPT-3.5, or the increased capabilities of the newer GPT-4 at an 80% lower price.
Using a pay-as-you-go OPEX strategy lets you pay for exactly the benefits you get today and takes advantage of any increases in capability or decreases in price that will come in the near future.
Likewise, you should design your implementation so that switching from one model to the next, or one provider to the next, is fast and seamless. With a few minutes of effort, the savings will go straight to your bottom line.
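In practice, keeping switching costs low usually means putting a thin abstraction between your application and any one vendor’s SDK. Here is a minimal sketch of that idea; the provider names, model names, and prices are illustrative assumptions, and the stub adapters stand in for real vendor SDK calls:

```python
# A minimal sketch of a provider-agnostic completion interface.
# Provider/model names and prices are illustrative assumptions only;
# real deployments would wrap each vendor's SDK behind an adapter.

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ModelConfig:
    provider: str
    model: str
    price_per_1k_tokens: float  # tracking cost makes switches easy to justify

# Registry of provider adapters: each maps (model, prompt) to a completion.
_ADAPTERS: Dict[str, Callable[[str, str], str]] = {}

def register_adapter(provider: str, fn: Callable[[str, str], str]) -> None:
    _ADAPTERS[provider] = fn

def complete(cfg: ModelConfig, prompt: str) -> str:
    """Route a prompt to whichever provider the config names."""
    return _ADAPTERS[cfg.provider](cfg.model, prompt)

# Stub adapters in place of real SDK clients.
register_adapter("openai", lambda model, p: f"[{model}] echo: {p}")
register_adapter("anthropic", lambda model, p: f"[{model}] echo: {p}")

# Switching providers is now a one-line config change, not a code rewrite.
cfg = ModelConfig(provider="openai", model="example-model",
                  price_per_1k_tokens=0.0005)
print(complete(cfg, "Summarize this support ticket."))
```

With this shape, a price drop or a better model from a competitor is captured by editing one config object, and your application code never changes.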
The Cloudy Crystal Ball
Of course, all of this is speculative, not investment advice, etc. You need to make the best decisions for your business based on your own processes and financials.
Likewise, the world of generative AI might not always be this way. Capabilities may plateau, technology might hit some roadblocks, or all of the easy wins may already be implemented. Some pundits have recently said as much.
But, if history – and the companies themselves – are any guide, the next few years of generative AI will be just the opposite, with an explosion of capabilities and a price that’s nearly too cheap to meter.
So, for now, there’s just one mantra to keep in mind: minimize investments, keep switching costs low.
Become your company’s AI expert in under 30 minutes a month by signing up for the Executive Summary newsletter by AI For Business.
If you liked this post, have future posts delivered straight to your inbox by subscribing to the AI For Business newsletter. Thank you!