
Video Podcast Series: What’s On Tech? Episode 4

Future-proofing AI

Welcome to SPR’s video podcast series, What’s on Tech?, the show where we ask technology experts what’s going on in tech right now. This episode is hosted by Matt Mead, CTO of SPR, with special guest Steven Devoe, Director of Data at SPR.

Be sure to watch the episode below, or catch up on the key points in our recap.

SPR’s What’s On Tech? Episode 4: “Future-proofing AI”

Video Recap

AI is changing all the time. One day, one model might be considered the best for a certain task; the next, there's a significantly better model for what you're trying to do. You don't want to find yourself locked in: your product has evolved to need something different, but you're unable to move to any model other than the one you're using today.

For example, imagine you're building a complex RAG-based chatbot where multiple data pipelines feed into a vector database. You have one or more large language models (LLMs), and all of these technologies need to be coordinated together. That is the sort of complexity we're talking about when considering how to future-proof AI.

In isolation, none of those pieces is overly complex. But when you combine them, they all depend on each other. You may have data pipelines that behave one way. Combine that with a vector database plus some type of retrieval algorithm. Then throw an LLM on top. All those pieces must fit together at any given time, you may want to change one or more of them as your product evolves, and you want to do that in a way that doesn't slow you down.
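To make that concrete, here's a minimal sketch in Python of how those pieces might be wired together behind narrow interfaces so any one of them can be swapped later. Every name in it (RagChatbot, VectorStore, Embedder, ChatModel) is invented for illustration, not a specific product or library.

```python
from dataclasses import dataclass
from typing import Protocol


class VectorStore(Protocol):
    def upsert(self, doc_id: str, embedding: list[float], text: str) -> None: ...
    def search(self, embedding: list[float], k: int) -> list[str]: ...


class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...


class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...


@dataclass
class RagChatbot:
    embedder: Embedder
    store: VectorStore
    llm: ChatModel

    def ingest(self, doc_id: str, text: str) -> None:
        # Data pipeline step: embed the document and index it in the vector store.
        self.store.upsert(doc_id, self.embedder.embed(text), text)

    def answer(self, question: str) -> str:
        # Retrieval step: find related context, then let the LLM answer with it.
        context = self.store.search(self.embedder.embed(question), k=3)
        prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
        return self.llm.complete(prompt)
```

Because the chatbot only sees three small interfaces, changing the embedding model, the vector database, or the LLM becomes a constructor change rather than a rewrite.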

One of the things SPR discusses with clients is how to get the fundamentals right. Data is at the heart of AI, and data pipelines are what deliver that data, so we need to make sure the pipeline is absolutely rock solid.

While we’re at it, let’s automate it so the initial training and any sort of retraining can be done with the least amount of human intervention.
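As one small illustration, building on the RagChatbot sketch above, here's a validation gate in the ingestion pipeline so bad records never reach the index and re-ingestion can run unattended. The Record shape and the length threshold are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Record:
    doc_id: str
    text: str


def validate(record: Record) -> bool:
    """Reject empty IDs and suspiciously short documents before indexing."""
    return bool(record.doc_id) and len(record.text.strip()) >= 20


def run_ingestion(records: list[Record], bot: "RagChatbot") -> int:
    """Ingest valid records automatically; return the count set aside for review."""
    skipped = 0
    for record in records:
        if validate(record):
            bot.ingest(record.doc_id, record.text)
        else:
            skipped += 1  # surface these to a human instead of halting the pipeline
    return skipped
```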

Compared to how long traditional ML has been around, the LLM and AI world is much newer, so there are far fewer established best practices to draw on. At the core of it, you want to make sure scalability, flexibility, security, and all the traditional best practices are built in from the start. That takes a little more forethought precisely because this is all so new.

The concept of abstraction has been around for decades in software development and systems design. It's the idea of isolating or encapsulating a portion of the implementation so that the specifics of that implementation don't bleed out into the rest of the code base. That way, it's easier to swap out that piece in the future if necessary or desired. People have been doing this for decades in software engineering, whether they wanted to isolate a database or a payment system.
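A hedged example of what that looks like for LLMs: two adapters implementing the same ChatModel interface from the earlier sketch, so only the adapter knows a given provider's API. The provider names and constructor arguments are placeholders, not real SDKs.

```python
class ProviderAChat:
    """Adapter for one hosted model; only this class knows that provider's API."""

    def __init__(self, api_key: str, model: str = "provider-a-large"):
        self._api_key = api_key
        self._model = model

    def complete(self, prompt: str) -> str:
        # Provider-specific request/response handling would live here.
        raise NotImplementedError("wire up provider A's SDK in this one place")


class ProviderBChat:
    """Adapter for a second, interchangeable provider."""

    def __init__(self, endpoint: str):
        self._endpoint = endpoint

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("wire up provider B's SDK in this one place")
```

Swapping models then becomes a one-line change where the chatbot is constructed, rather than a hunt through the code base for provider-specific calls.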

We've been talking about flexibility so far in terms of using different models for performance and for taking on more and more complex tasks. For the lower-level tasks, you want to use the most cost-effective model with the fastest response times to get the best ROI on your dollars spent. You also want to think about things like regulation, which is almost certainly imminent in certain legal jurisdictions. You don't want to find yourself dependent on one specific model and suddenly unable to use it because it doesn't comply with the laws and regulations in those areas.
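One way to express that is a small routing layer that picks the cheapest model that is capable enough for the task and allowed in the caller's jurisdiction. The model names, prices, capability tiers, and region rules below are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelOption:
    name: str
    cost_per_1k_tokens: float
    capability: int                      # 1 = simple tasks, 3 = most complex
    blocked_regions: set[str] = field(default_factory=set)


CATALOG = [
    ModelOption("small-fast", 0.0005, capability=1),
    ModelOption("mid-tier", 0.003, capability=2, blocked_regions={"EU"}),
    ModelOption("frontier", 0.03, capability=3),
]


def pick_model(required_capability: int, region: str) -> ModelOption:
    """Cheapest model that meets the capability bar and is allowed in the region."""
    candidates = [
        m for m in CATALOG
        if m.capability >= required_capability and region not in m.blocked_regions
    ]
    if not candidates:
        raise LookupError("No compliant model available for this task and region.")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


# Example: a simple classification task from an EU user goes to the cheapest
# compliant option rather than the most capable (and most expensive) model.
print(pick_model(required_capability=1, region="EU").name)  # -> "small-fast"
```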

And finally, many of these models are still under managed quotas, where you are limited in the number of requests you can make or the total tokens you can use per minute. That puts a ceiling on how much you can leverage them for your business purposes. Scaling across different models is one great way to mitigate that risk.
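A minimal sketch of that mitigation, assuming the adapters raise a shared error when a provider reports throttling: try the configured models in order and fail over when one hits its quota. RateLimitError and the ChatModel interface are carried over from the earlier illustrative sketches.

```python
from typing import Optional, Sequence


class RateLimitError(Exception):
    """Raised by an adapter when its provider reports a quota or rate-limit hit."""


def complete_with_failover(models: Sequence["ChatModel"], prompt: str) -> str:
    """Try each configured model in order, moving on only when one is throttled."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model.complete(prompt)
        except RateLimitError as exc:
            last_error = exc  # this model is throttled; try the next one
    raise RuntimeError("All configured models are currently rate limited.") from last_error
```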

We see clients change how frequently they call a model, and that alone can change how the system performs. It can also have a huge impact on the cost of the system, and that cost can shift over time. It's no secret that people are concerned about the cost of AI models and how effective they are, so you always want to consider ways to reduce cost or use the best model for the job.