Getting Started with DBRX: First Impressions of Databricks’ Innovative Language Model

Databricks is advancing rapidly with its Generative AI platform, increasingly integrating foundation models and large language models. The most recent addition is DBRX, which I will discuss in this blog.

Model serving

Recently, Databricks released its own LLM, named DBRX, available through the Foundation Model API. DBRX excels at summarization, question answering and coding, and is particularly strong in RAG scenarios where the accuracy of the retrieved documents matters. Another important feature is its large context window of up to 32 thousand tokens. DBRX is competitive with, and in some cases surpasses, leading LLMs such as GPT-3.5, Gemini 1.0 Pro and Llama2-70B on measurements such as inference speed and accuracy across tasks like programming-language understanding and math, and it performs very well when answering questions using RAG (as shown in Table 1).
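To get a feel for how this works in practice, here is a minimal sketch of calling DBRX through the Foundation Model API using the OpenAI-compatible client. The endpoint name databricks-dbrx-instruct and the environment variables for the workspace URL and token are assumptions that may need adjusting for your workspace.

```python
# Minimal sketch: query DBRX through the Foundation Model API (OpenAI-compatible client).
# DATABRICKS_HOST / DATABRICKS_TOKEN and the endpoint name are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url=f"{os.environ['DATABRICKS_HOST']}/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",  # assumed pay-per-token endpoint name
    messages=[{"role": "user", "content": "Summarize what Delta Lake is in two sentences."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```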

First tests

DBRX can easily be tested through the AI Playground available on Databricks. First of all, some specific parameters must be set to define the characteristics of the model:

  • Temperature: controls the randomness; a higher value increases the diversity of the output
  • Top P: the cumulative probability cutoff for token selection; use a lower value to ignore less probable options
  • Top K: the number of token choices the model considers when generating the next token
  • Max tokens: the maximum number of tokens in the generated output response

In general, you will use higher temperature and top P values for creative tasks, combined with a moderate top K value to balance creativity and coherence, and lower temperature, top P and top K values for more deterministic output, as sketched in the request below.
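As a rough sketch of how these parameters travel with a request, the payload sent to the model serving REST endpoint could look like the following. The host, token and endpoint name are assumptions, and top_k may not be accepted by every endpoint, so treat the exact fields as illustrative.

```python
# Rough sketch: pass the Playground parameters in a REST request to a serving endpoint.
# Host, token and endpoint name are assumptions; top_k support may vary per endpoint.
import os
import requests

url = f"{os.environ['DATABRICKS_HOST']}/serving-endpoints/databricks-dbrx-instruct/invocations"
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

payload = {
    "messages": [{"role": "user", "content": "Write a short poem about data lakes."}],
    "temperature": 0.8,  # higher -> more diverse, creative output
    "top_p": 0.95,       # keep most of the probability mass for a creative task
    "top_k": 50,         # moderate cutoff to balance creativity and coherence
    "max_tokens": 200,   # cap the length of the generated answer
}

response = requests.post(url, headers=headers, json=payload)
print(response.json()["choices"][0]["message"]["content"])
```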

The system prompt is a way to provide context, instructions and guidelines to the LLM before presenting it with a question or task. You can include the objective of the chatbot, add rules and guardrails, define customized output formatting and set verification standards and requirements. This way, you set the stage and define a role, personality or tone for the chatbot.

For example, in this case, the system prompt is set to: “You are a helpful assistant that will answer questions about Databricks and things related to it. You should only answer if the question is related to Databricks, otherwise say “I’m sorry, but this is out of scope”. Explain it to me like I’m 5 years old.”. This way, you limit the role of the chatbot to answering only Databricks-related questions and define how it should respond.
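As a concrete sketch, this is roughly how that system prompt is combined with a user question (the one asked in the next paragraph) when calling the model outside the Playground. The endpoint name and credential environment variables are assumptions.

```python
# Sketch: combine the system prompt from the blog with a user question.
# Client setup mirrors the earlier sketch; host/token env vars and endpoint name are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url=f"{os.environ['DATABRICKS_HOST']}/serving-endpoints",
)

messages = [
    {
        "role": "system",
        "content": (
            "You are a helpful assistant that will answer questions about Databricks "
            "and things related to it. You should only answer if the question is "
            "related to Databricks, otherwise say \"I'm sorry, but this is out of scope\". "
            "Explain it to me like I'm 5 years old."
        ),
    },
    {
        "role": "user",
        "content": (
            "Can you explain the difference between a data lake and a data warehouse, "
            "and how the data lakehouse solves their shortcomings?"
        ),
    },
]

response = client.chat.completions.create(
    model="databricks-dbrx-instruct",  # assumed pay-per-token endpoint name
    messages=messages,
    max_tokens=400,
)
print(response.choices[0].message.content)
```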

If I ask the question “Can you explain the difference between a data lake and a data warehouse, and how the data lakehouse solves their shortcomings?”, I get the following answer:

As we have set the system prompt to only answer Databricks-related questions, we get the following answer to the question: “Can you explain to me how the motor of an EV works?”

Databricks AI Playground also lets you compare different models side by side, so you can evaluate key characteristics of LLMs, such as output and tone, latency, speed and the number of tokens produced. Here, you can see that DBRX is extremely fast at generating output tokens and has a very low latency compared to competitors like Llama2 and GPT-3.5-turbo, while still generating a high-quality response.

The system prompt is: “You are a helpful AI assistant. You will answer the questions correctly and try to do this in less than 200 words if possible” and the question is “What is generative AI and how can it help our company, which is in the building industry?”
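If you want to reproduce such a comparison outside the Playground, a rough approach is to send the same system prompt and question to two endpoints and record the latency and token usage yourself. The endpoint names below are assumptions that may differ in your workspace.

```python
# Rough sketch: send the same prompt to two serving endpoints and compare latency and token counts.
# Endpoint names, host and token are assumptions; adjust them for your workspace.
import os
import time
import requests

HOST = os.environ["DATABRICKS_HOST"]
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

system_prompt = ("You are a helpful AI assistant. You will answer the questions "
                 "correctly and try to do this in less than 200 words if possible")
question = ("What is generative AI and how can it help our company, "
            "which is in the building industry?")

def ask(endpoint: str) -> None:
    """Send the prompt to one endpoint and print its latency and output token count."""
    url = f"{HOST}/serving-endpoints/{endpoint}/invocations"
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
        "max_tokens": 300,
    }
    start = time.perf_counter()
    data = requests.post(url, headers=HEADERS, json=payload).json()
    elapsed = time.perf_counter() - start
    usage = data.get("usage", {})
    print(f"{endpoint}: {elapsed:.2f}s, {usage.get('completion_tokens')} output tokens")

for endpoint in ["databricks-dbrx-instruct", "databricks-llama-2-70b-chat"]:
    ask(endpoint)
```

Note that this only measures end-to-end latency for the full response; qualities like tone and correctness of the output still have to be judged by reading the answers, which is where the Playground's side-by-side view shines.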

Pricing

If we look at the pricing, DBRX is the most expensive foundation model under pay-per-token pricing and is among the most expensive ones when you choose provisioned throughput pricing.

If we look at an applied example for provisioned throughput, we can see that the price is slightly higher than for the Llama2 model, but considerably lower than for the Mixtral-8x7B model.

However, if we compare this pricing with some external model providers, DBRX is one of the cheapest options, significantly cheaper than, for example, GPT-4-32k, which has the same context window as DBRX.

Conclusion

DBRX offers a compelling alternative to other state-of-the-art foundation models. It is easy to use and integrate inside your Databricks environment, and it brings some interesting capabilities, such as high speed and low latency, which are really useful in a chat environment where fast responses are expected. Besides, the large context window is a big advantage, as it gives you the opportunity to include more information in the prompt and increase the accuracy of the answer.

Pieter Verfaillie

Analytics consultant @Aivix

Pieter is a motivated data scientist with a Master of Science in business engineering. He’s dedicated to constructing models that address problems and offer insights. Proficient in Python, Pieter has a wealth of experience from diverse projects, reflecting his enthusiasm for learning. His passion lies in leveraging data to innovate solutions and his solid business background adds a unique dimension to his approach.