SLM series - SUSE: Balanced realities for AI model use cases
This is a guest post for the Computer Weekly Developer Network written by Ian Quackenbos aka Ian Q.
Quackenbos leads the AI Innovation & Incubation team (technology & product) at SUSE and he writes in full as follows…
On the question of whether SLMs and LLMs should be combined and used in concert, my answer is yes!
Combining SLMs (small language models) and LLMs (large language models) can be beneficial for tasks that require the specialised knowledge of SLMs while also leveraging the broader knowledge and language abilities of LLMs. They can complement each other for optimised performance.
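One common way to combine the two is a cascade: try the cheap, specialised SLM first and escalate to the LLM only when the SLM is not confident. The sketch below is a minimal illustration of that pattern; `slm_answer` and `llm_answer` are hypothetical stand-ins for real model calls, not any particular vendor's API.

```python
# Minimal SLM/LLM cascade sketch: specialised model first, general
# model as fallback. Both functions are illustrative stubs.

def slm_answer(query: str) -> tuple[str, float]:
    """Hypothetical domain SLM: confident only on billing questions."""
    if "invoice" in query.lower():
        return ("Invoices are issued on the 1st of each month.", 0.92)
    return ("", 0.10)  # out of domain, low confidence

def llm_answer(query: str) -> str:
    """Hypothetical general-purpose LLM fallback."""
    return f"[LLM] General answer for: {query}"

def cascade(query: str, threshold: float = 0.7) -> str:
    answer, confidence = slm_answer(query)
    if confidence >= threshold:
        return answer          # specialised knowledge, low cost
    return llm_answer(query)   # broad knowledge, higher cost
```

The threshold is the tuning knob here: raising it sends more traffic to the LLM, trading cost for coverage.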
What role does intelligent routing play inside an AI system when it needs to choose the most appropriate model to direct queries towards? Well, intelligent routing helps ensure that the right AI model is used for a specific task or query. By analysing the input, it can direct queries to the model most suited to the type of problem, improving efficiency and response accuracy.
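In its simplest form, a router inspects the query and returns the name of the model best suited to it. The sketch below uses naive keyword matching purely for illustration; the model names and keyword sets are assumptions, and a production router would more likely use an embedding-based or learned classifier.

```python
# Minimal intelligent-routing sketch: match query words against
# per-model keyword sets and fall back to a general LLM.
# Model names and keywords are illustrative assumptions.

ROUTES = {
    "code-slm":    {"python", "bug", "compile", "traceback"},
    "finance-slm": {"invoice", "fraud", "transaction"},
}

def route(query: str) -> str:
    words = set(query.lower().split())  # naive tokenisation for the sketch
    for model, keywords in ROUTES.items():
        if words & keywords:  # any keyword overlap wins the route
            return model
    return "general-llm"  # no specialised match: use the broad model
```

Swapping the keyword sets for a small classifier keeps the same interface: input query in, model name out.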
Dedicated decisioning
Is a selection of dedicated SLMs better than one LLM – and so, are there use cases where SLMs are not needed and an LLM works more effectively? In some cases, SLMs might be better suited for specific, niche tasks due to their efficiency and reduced computational cost. However, LLMs are better for broader, general tasks that require a wide range of knowledge and adaptability. For many applications, LLMs might be more effective due to their generality and versatility. It’s important to also factor in whether the task involves creative or empirical “matter of fact” output.
By nature of their size, SLMs are typically faster to train, thanks to their reduced complexity. This can speed up the iteration process, allowing AI engineers to experiment with more models in less time and optimise workflows more quickly. Keep in mind that new models are being released all the time, so it's inefficient to go through long and computationally expensive training and retraining cycles.
SLMs can be deployed on-premises or in private cloud environments, depending on the need for data security, privacy and proximity to critical infrastructure. For mission-critical applications, on-premises or private cloud deployments may be preferred to ensure tighter control over sensitive data, especially if we’re talking about air-gapped deployments.
Environmental sustainability
SLMs may be more environmentally sustainable due to their smaller size and lower computational requirements, leading to reduced energy consumption compared to larger models. This can be particularly important in large-scale AI deployments, or for compliance with EMEA regulations around energy efficiency.
SLMs may have limitations in terms of knowledge coverage and performance on complex tasks. They can still harbour biases similar to larger models, depending on their training data. Their smaller scope might reduce their ability to handle more nuanced or difficult tasks effectively, making them more appropriate for "matter of fact" tasks.
Domain-specific LLMs
Domain-specific LLMs can be better than general-purpose SLMs for tasks in specific fields, as they are trained with specialised data to handle complex, industry-specific challenges. However, SLMs might be more efficient for simpler, less specialised tasks.
SLMs are well-suited to applications like chatbots, sentiment analysis and other customer-facing technologies where task-specific knowledge and faster response times are more important than handling complex, open-ended queries.
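For a sense of what a task-specific component in such a pipeline does, here is a toy lexicon-based sentiment scorer standing in for a sentiment SLM. The word lists are illustrative assumptions; a real deployment would call a fine-tuned small model rather than counting keywords.

```python
# Toy sentiment scorer standing in for a task-specific SLM in a
# customer-facing pipeline. Lexicons are illustrative only.

POSITIVE = {"great", "love", "fast", "helpful", "excellent"}
NEGATIVE = {"slow", "broken", "refund", "terrible", "hate"}

def sentiment(text: str) -> str:
    words = text.lower().split()
    # Net score: positive hits minus negative hits.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

The point of the sketch is the shape of the task: narrow input, fixed label set, fast response, exactly where an SLM earns its keep over an LLM.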
Finance and retail are indeed key verticals for SLMs. In finance, SLMs can be used for analysing transaction data, compliance and fraud detection. In retail, they can help with customer service, sentiment analysis and product recommendations. Both industries benefit from faster, more specialised processing of data.