Whenever you encounter the limitations of a component in your tech stack, it’s the perfect opportunity to look for alternatives. If your code is well decoupled, swapping in a new component should be straightforward and quick.
Let’s look at serverless functions, which are often used for simple backend tasks that complete in under 10 seconds. Because these functions can incur a noticeable cold-start delay on their first invocation, it’s best to call these endpoints in the background so that users don’t notice any added latency.
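The fire-and-forget pattern can be sketched as follows. This is a minimal illustration, not a real integration: `call_serverless_endpoint` is a hypothetical stand-in for an HTTP call (e.g. `requests.post` against your endpoint URL), with the cold start simulated by a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an HTTP call to a serverless endpoint;
# real code would do something like requests.post(URL, json=payload).
def call_serverless_endpoint(payload):
    time.sleep(0.2)  # simulate cold start plus execution time
    return {"status": "done", "input": payload}

executor = ThreadPoolExecutor(max_workers=4)

def handle_user_action(payload):
    # Submit the slow call to a background thread; the user-facing
    # path returns immediately instead of blocking on the cold start.
    future = executor.submit(call_serverless_endpoint, payload)
    return {"accepted": True}, future

start = time.monotonic()
response, future = handle_user_action({"event": "signup"})
elapsed = time.monotonic() - start  # near zero: we did not wait
result = future.result()            # background work still completes
```

The key design choice is that the user-facing response never waits on the serverless call, so cold-start latency stays invisible.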
But what if you want to add an AI feature using one of these serverless (Lambda) functions? You’ll quickly discover that long-running model calls can exceed the platform’s execution limit and result in timeout errors. While you might opt for a "mini" or "nano" AI model that responds faster, you’ll likely sacrifice some accuracy in the generated content.
Alternatively, switching to a dedicated backend server removes the execution limit and opens up the possibility of offering users more advanced content powered by reasoning models. While these responses may take longer, you can often accommodate this by simply switching backend URLs in your code, depending on which model you want to use.
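Routing by backend URL can be as simple as a small lookup table. The tier names and URLs below are illustrative assumptions, not a real API; the point is that the rest of the code never needs to know which infrastructure serves the request.

```python
# Hypothetical mapping of model tiers to backend endpoints.
BACKENDS = {
    "mini":      "https://fast.example.com/api/generate",    # serverless
    "reasoning": "https://server.example.com/api/generate",  # dedicated server
}

def backend_url(model_tier: str) -> str:
    # Centralizing the choice here keeps the swap to a one-line config change.
    try:
        return BACKENDS[model_tier]
    except KeyError:
        raise ValueError(f"unknown model tier: {model_tier!r}") from None

url = backend_url("reasoning")
```

Because the decision lives in one place, trying a new backend later means editing this table rather than touching every call site.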
The key takeaway: whenever you feel limited by your technology stack, take time to benchmark your options. You’ll often find that exploring new solutions doesn’t require a painful refactor—especially if your system is well decoupled and designed for change.