Lightdash at Ubie Part 3: Toward AI-powered BI

Yu Ishikawa
14 min read · Nov 8, 2023


This is part 3 of the blog post series about Lightdash at Ubie. Ubie automatically generates medical records using an AI-powered patient questionnaire that helps save time and provide better patient care. As you can imagine, data engineering, data management, and data governance are essential to building a high-quality AI-powered system.

Introduction

Welcome back to part 3 of the blog series on Lightdash at Ubie. If you’re joining us for the first time, don’t worry — you can catch up by reading Part 1: Introduction and Part 2: Governance at Scale. In the first part, we introduced Lightdash as a powerful Business Intelligence (BI) tool that leverages dbt for data modeling. We explored how it fits into the broader landscape of data analytics and why it’s a game-changer for businesses like Ubie. In the second part, we delved into the governance aspects, discussing how Lightdash helps manage data at scale, ensuring quality and reliability.

But the world of data analytics is not static; it’s continually evolving. One of the most exciting developments is the integration of Artificial Intelligence (AI) and Large Language Models (LLMs) into BI tools. These advancements are not just bells and whistles; they are fundamentally reshaping how we approach data analytics, making it more dynamic, intuitive, and insightful.

In part 3, we’re going to dive deep into the future of data analytics, particularly examining how Lightdash can potentially leverage the power of AI and LLMs, although Lightdash doesn’t offer AI-powered features yet. We’ll discuss the recent AI-powered features in BI tools, the challenges of integrating AI into data analytics, and how Lightdash can serve as a solution. More importantly, we’ll explore the potential for human-AI collaboration in Lightdash, emphasizing how this synergy can lead to more effective and insightful data analytics.

Recent Movements in AI-powered BI Tools

The field of Business Intelligence is quietly undergoing changes that are worth noting. These changes are primarily driven by the gradual integration of Artificial Intelligence and Large Language Models into BI platforms. While it’s too early to predict the full impact of these developments, they are certainly influencing how we approach data analytics. Let’s take a closer look at some of these movements.

AI’s Growing Role in BI

One of the more subtle yet significant shifts is the increasing presence of AI in BI tools. AI functionalities are being added to BI platforms, not as flashy selling points, but as practical features aimed at enhancing data analytics. These features are making data analytics more interactive and potentially insightful. For instance, natural language querying and some level of task automation are becoming more common in modern BI tools.

Human-AI Collaboration

Another development that’s gaining traction is the idea of human-AI collaboration. The emerging generation of BI tools seems to be designed with the understanding that AI can be most effective when used to complement human skills. While AI can handle data collection and initial analysis, humans are still needed for tasks that require deeper judgment, such as decision-making and ethical considerations.

Accessibility and Data Analytics

The inclusion of AI in BI tools is also making data analytics more accessible to a wider range of people. By simplifying some of the more complex aspects of data analysis, these tools are becoming more user-friendly. This is particularly beneficial for business analysts and decision-makers who may not have advanced data science skills but still need to derive insights from data.

A Shift in Focus for Data Experts

As AI begins to take on more routine tasks, there’s a noticeable shift in the responsibilities of data experts. Instead of focusing solely on implementation — writing code, creating dashboards, etc. — there’s a growing emphasis on review and validation of AI-generated outputs. This shift is subtle but important, as it suggests a future where data experts spend less time on manual tasks and more time on ensuring the reliability and accuracy of AI-generated outputs.

AI-powered BI Tools

While the integration of AI into BI is still evolving, some platforms are already showcasing these capabilities. For example, Google Cloud’s Looker has introduced Duet AI, which aims to simplify the BI experience and bring insights to all users in an organization. Similarly, Amazon’s QuickSight offers Generative BI features, and Hex’s Magic AI is also making strides in this direction. These are just a few examples that indicate the growing influence of AI in the BI landscape.

While it’s still early days, the integration of AI and LLMs into BI tools is showing signs of influencing the field in several ways. These developments are not just passing trends but could be early indicators of a more significant shift in how we approach data analytics. They hint at a future where the collaboration between human intelligence and artificial intelligence could lead to more reliable and potentially insightful outcomes, albeit with the need for careful validation and review.

Challenges in Leveraging LLMs and AI for Data Analytics

The integration of Artificial Intelligence and Large Language Models into Business Intelligence tools is a promising development, offering the potential for more advanced and automated data analytics. However, this integration is not without its challenges, particularly in terms of the reliability and validation of AI-generated outputs. Let’s delve into these two primary challenges.

Challenge 1: Reliability of AI-generated Outputs

The first challenge centers on the reliability of the outputs generated by LLMs and AI. While these technologies can process vast amounts of data and generate insights at an unprecedented scale, their outputs are not always reliable. One issue is the phenomenon of “hallucinations,” where the model generates outputs that are inaccurate or that refer to things that do not exist in the data. This is a significant concern, especially for organizations that have complex data landscapes with thousands of tables and tens of thousands of fields across multiple products. Suppose we have to share an analysis report with a business partner. Can we share an AI-generated report without human review? Absolutely not. In the near future, it may be technically possible to generate a report directly from raw data sources, but humans should remain in the loop.

Moreover, the challenge of reliability is compounded by regulatory considerations. Organizations today must adhere to a myriad of regulations designed to protect customer privacy. This makes the task of automating data analytics even more complex, as not only do the outputs need to be accurate, but they also need to be compliant with these regulations. Even if an LLM returns an output along with its reasoning, people without compliance expertise cannot check whether that reasoning is valid. The challenge, therefore, is to develop AI and LLM capabilities that are both reliable and compliant, a task easier said than done given the complexities of modern data and the ever-changing regulatory landscape.

Challenge 2: Validating AI-Generated Outputs

The second challenge is the difficulty of validating the complex outputs generated by LLMs and AI. For instance, if an LLM generates an output consisting of hundreds of lines of SQL statements, it becomes a Herculean task for data consumers who are not familiar with SQL to validate the output. This is a significant issue as BI tools become more user-friendly and accessible to individuals without technical expertise.

Moreover, the challenge of validation isn’t just limited to code-like outputs. Even when AI generates more user-friendly outputs like charts and dashboards, there’s still a need for a mechanism to validate these outputs. Simply put, while charts and dashboards are excellent for visualization, they don’t offer a detailed look into the data manipulations that led to these visual representations. Therefore, there’s a need for additional tools or methods that allow for a more in-depth investigation into what the AI did to arrive at these outputs.

The integration of AI and LLMs into BI tools, while promising, presents significant challenges in terms of reliability and validation. These challenges are not trivial and require a concerted effort to address. As we move forward in this exciting but complex landscape, it becomes increasingly clear that the successful integration of AI into BI will require a balanced approach. This approach must combine the computational power of AI with rigorous validation mechanisms and a deep understanding of both data complexities and regulatory requirements.

Great Possibilities of Lightdash as AI-powered BI

As we’ve discussed the challenges of integrating Artificial Intelligence and Large Language Models into Business Intelligence tools, it’s equally important to explore potential solutions. One such promising avenue is Lightdash, a BI tool that offers seamless integration with dbt. While Lightdash itself is not yet AI-powered, its capabilities and features make it a strong candidate for future AI integration.

Lightdash and dbt: A Powerful Combination

Lightdash’s integration with dbt offers a robust foundation for reliable data analytics. Dbt allows for the transformation of tables in data warehouses like BigQuery and provides the ability to test data quality rigorously. This ensures that the data marts you’re working with are reliable, which is crucial when you’re looking to integrate AI and LLMs into your BI processes. The more reliable the data, the more reliable the AI-generated outputs are likely to be.

User Experience: The Lightdash Advantage

One of Lightdash’s standout features is its user interface and user experience. The tool is designed to make data exploration quick and intuitive. Users can easily create charts and dashboards without having to delve into complicated SQL queries. This ease of use is not just a luxury but a necessity, especially when considering the challenges of validating complex AI-generated outputs.

Solving the Challenges with Lightdash

Lightdash’s capabilities offer potential solutions to the challenges we’ve discussed earlier. For instance, its seamless integration with dbt models means that you can define dimensions and metrics explicitly. This makes it easier to validate the outputs visually, without having to sift through long, complicated SQL statements. In essence, Lightdash offers a visual validation layer that could be invaluable in an AI-integrated BI environment. Of course, this isn’t limited to AI-generated outputs. We can just as easily understand Lightdash explores created by humans.
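To illustrate why explicit definitions make validation easier, here is a minimal Python sketch. It assumes a hypothetical catalog of dimensions and metrics for one model (the field names such as `orders_total_revenue` are invented for illustration and are not taken from our project). It checks that a query expressed as field names, whether written by a person or drafted by an LLM, only references fields that are actually defined.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Explore:
    """Hypothetical catalog of the fields defined for one dbt model."""
    name: str
    dimensions: frozenset
    metrics: frozenset


@dataclass
class FieldQuery:
    """A query expressed as field names instead of raw SQL."""
    explore: str
    dimensions: list = field(default_factory=list)
    metrics: list = field(default_factory=list)


def find_unknown_fields(query: FieldQuery, explore: Explore) -> list:
    """Return every referenced field that the explore does not define."""
    unknown = [d for d in query.dimensions if d not in explore.dimensions]
    unknown += [m for m in query.metrics if m not in explore.metrics]
    return unknown


# Illustrative names only.
orders = Explore(
    name="orders",
    dimensions=frozenset({"orders_order_date_month", "orders_status"}),
    metrics=frozenset({"orders_total_revenue", "orders_order_count"}),
)

drafted = FieldQuery(
    explore="orders",
    dimensions=["orders_order_date_month"],
    metrics=["orders_total_revenue", "orders_profit_margin"],  # not defined
)

print(find_unknown_fields(drafted, orders))  # ['orders_profit_margin']
```

A hallucinated field shows up as a short list of names to reject, which is far easier to review than hundreds of lines of generated SQL.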

The Role of APIs

While Lightdash doesn’t currently offer AI-powered features, its API capabilities offer a glimpse into future possibilities. As demonstrated in the discussion of BI tool governance at scale in part 2, Lightdash’s APIs could potentially be used to integrate AI functionalities into the platform. This opens up exciting avenues for automating various aspects of data analytics, from data exploration to insight generation. For example, the RunMetricQuery API lets us explore data by specifying dimensions and metrics rather than writing a SQL statement. We don’t have to construct a complicated query to wrangle data. In this way, we can make the most of Lightdash’s APIs to work with data.
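As a rough sketch of what a field-based query could look like from Python, the snippet below builds an HTTP request without sending it. The endpoint path, the body keys, and the `ApiKey` authorization scheme are assumptions about how the API is shaped and should be verified against the Lightdash API reference. The host, project UUID, token, and field names are placeholders.

```python
import json
import urllib.request


def build_run_query_request(base_url, project_uuid, explore, dimensions,
                            metrics, api_token, limit=500):
    """Build (but do not send) an HTTP request for a field-based query.

    Assumption: the path and the body keys mirror Lightdash's metric-query
    shape; verify both against the API reference before relying on them.
    """
    body = {
        "exploreName": explore,
        "dimensions": dimensions,
        "metrics": metrics,
        "filters": {},
        "sorts": [],
        "limit": limit,
        "tableCalculations": [],
    }
    url = f"{base_url}/api/v1/projects/{project_uuid}/explores/{explore}/runQuery"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_token}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_run_query_request(
    base_url="https://lightdash.example.com",  # placeholder host
    project_uuid="00000000-0000-0000-0000-000000000000",  # placeholder
    explore="orders",
    dimensions=["orders_order_date_month"],
    metrics=["orders_total_revenue"],
    api_token="REPLACE_ME",
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

The point of the sketch is the shape of the body: a short list of dimension and metric names that a reviewer can read at a glance, with no SQL involved.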

For instance, OpenAI’s function calling feature allows businesses to integrate AI into their workflows. Imagine you have a customer service chatbot. With OpenAI’s API, you can make your chatbot not just answer FAQs but also perform tasks like checking the weather or sending emails. You send a request to OpenAI’s API that includes the user’s message and descriptions of the functions the model may use, and you get back a response that is either a text answer or a structured request to call one of those functions, which your own code then executes. This is incredibly powerful when combined with other APIs. For example, your chatbot could use a weather service’s API to provide real-time weather updates. This integration of multiple APIs, including OpenAI’s, can make your services more dynamic, responsive, and intelligent. In the same way, we can take advantage of the APIs of BI tools to integrate them with LLMs and AI.
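The pattern can be sketched in Python without calling any external service. The tool description below follows the JSON-schema style used by OpenAI function calling, but the tool name `run_metric_query`, its parameters, and the handler are hypothetical, and the tool call is a hand-written, simplified stand-in for what a model might return.

```python
import json

# Tool description in the JSON-schema style used by OpenAI function calling.
# The tool name and its parameters are hypothetical.
RUN_METRIC_QUERY_TOOL = {
    "type": "function",
    "function": {
        "name": "run_metric_query",
        "description": "Query a BI explore by dimension and metric names.",
        "parameters": {
            "type": "object",
            "properties": {
                "explore": {"type": "string"},
                "dimensions": {"type": "array", "items": {"type": "string"}},
                "metrics": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["explore", "metrics"],
        },
    },
}


def run_metric_query(explore, metrics, dimensions=None):
    """Stand-in for a real BI API call; it only echoes the request."""
    return {"explore": explore, "dimensions": dimensions or [], "metrics": metrics}


HANDLERS = {"run_metric_query": run_metric_query}


def dispatch(tool_call):
    """Run the handler named by the model with its JSON-encoded arguments."""
    name = tool_call["name"]
    if name not in HANDLERS:
        raise ValueError(f"Model requested an unknown tool: {name}")
    return HANDLERS[name](**json.loads(tool_call["arguments"]))


# Hand-written example of what a model's tool call could look like.
example_call = {
    "name": "run_metric_query",
    "arguments": json.dumps({
        "explore": "orders",
        "dimensions": ["orders_order_date_month"],
        "metrics": ["orders_total_revenue"],
    }),
}
print(dispatch(example_call))
```

Because the model only proposes a function name and arguments, our own code stays in control of what is executed, and unknown tools are rejected outright.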

The Future: A Balanced Approach

Will the future of BI tools like Lightdash involve only natural language interfaces? While natural language capabilities offer exciting possibilities for initial data exploration, the robust UI of Lightdash is likely to remain invaluable. For data experts, the ability to explore data through a well-designed UI is often more efficient and productive than using natural language queries alone. Therefore, the future likely involves a balanced approach, combining the strengths of natural language interfaces with the robust, intuitive UIs that tools like Lightdash offer.

While Lightdash is not yet AI-powered, its existing capabilities and potential for future integration make it a strong candidate for an AI-augmented BI tool. Its seamless integration with dbt, robust UI and UX, and potential for API-based AI integration offer promising solutions to the challenges of reliability and validation in AI-powered BI.

Potential Use Cases for Human-AI Collaboration in Lightdash

The integration of AI and Large Language Models into Business Intelligence tools like Lightdash opens up a plethora of opportunities for enhancing data analytics. While the technology is still nascent, the potential for human-AI collaboration is immense. In this section, we will explore some of the most promising use cases where humans and AI can work together to revolutionize the way we approach BI.

Augmented dbt Model and Metadata Generation

The Lightdash CLI lets us generate dimensions for dbt models from their actual schema information. However, we humans still have to come up with the metrics that measure what matters to our business. We could take advantage of LLMs to generate candidate metrics.

One of the foundational elements of any BI tool is the quality of its data models and metadata. Dbt has been instrumental in transforming tables and ensuring data quality, especially when integrated with Lightdash. However, the process of creating these models and metadata can be labor-intensive and prone to human error. This is where AI and LLMs can come into play.

Imagine a scenario where a data analyst is working on creating a new dbt model. The LLM can assist by suggesting the most relevant tables and fields based on the analyst’s natural language query. It can even generate SQL code snippets or entire dbt models that the analyst can review, modify, or approve. This not only speeds up the process but also reduces the likelihood of errors.

Moreover, metadata is often neglected or inconsistently maintained. AI can assist in auto-generating metadata descriptions at both table and field levels. It can analyze the data and provide concise, informative metadata, which the human analyst can then review and refine. This collaborative approach ensures that the metadata is both accurate and informative.
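One way to keep the human in the loop here is to treat AI output strictly as a draft. The sketch below is a hypothetical helper, not part of dbt or Lightdash: it fills in only the missing column descriptions with model suggestions, never overwrites text a person wrote, and marks every drafted entry for review. The column names and suggested texts are made up.

```python
def draft_descriptions(columns, suggestions):
    """Fill in missing column descriptions with suggestions, marked for review.

    `columns` maps column name -> existing description ("" when missing).
    `suggestions` maps column name -> text proposed by a model (hypothetical).
    Existing human-written descriptions are never overwritten.
    """
    drafted = {}
    for name, description in columns.items():
        if description:
            drafted[name] = {"description": description, "needs_review": False}
        elif name in suggestions:
            drafted[name] = {"description": suggestions[name], "needs_review": True}
        else:
            drafted[name] = {"description": "", "needs_review": True}
    return drafted


columns = {"order_id": "Primary key of the order.", "status": "", "amount": ""}
suggestions = {"order_id": "An identifier.", "status": "Current state of the order."}

for name, entry in draft_descriptions(columns, suggestions).items():
    print(name, entry)
```

In this example `order_id` keeps its human-written description, `status` receives a draft that needs review, and `amount` stays empty but is flagged so that it is not forgotten.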

Interactive Chart and Dashboard Creation

Lightdash already offers an excellent UI and UX for data exploration and visualization. However, the process can be further enhanced with AI assistance. For instance, after a user inputs a natural language query like “Show me the monthly revenue for the last year,” the AI can instantly generate a draft chart or dashboard based on the query. The user can then fine-tune this draft, perhaps changing the type of chart, adding more data points, or applying additional filters.

This human-AI collaboration can be particularly useful for users who are not well-versed in SQL or data analytics. They get a head start with the AI-generated draft, which they can then customize as per their needs. For data experts, this feature can significantly speed up the process of chart and dashboard creation, allowing them to focus more on interpretation and decision-making.

Insight Generation

One of the ultimate goals of BI is to derive actionable insights from data. Traditionally, this has been a wholly human-driven activity. However, AI can play a significant role here as well. Once a dashboard is created, the AI can analyze the data and generate initial insights in natural language, like “The revenue has been steadily increasing for the last six months but saw a sudden dip in July.”

These AI-generated insights serve as a starting point for human analysts. They can either validate these insights or delve deeper into the data to understand the anomalies or trends better. This collaborative approach ensures that the insights are not only quick but also thoroughly vetted.
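Validation of such a statement can often be reduced to a small, checkable calculation. The following sketch uses made-up monthly figures and assumes every month has non-zero revenue. It computes month-over-month changes so that an analyst can confirm whether a claimed “sudden dip in July” is actually in the data.

```python
def month_over_month_changes(revenue_by_month):
    """Return (month, percent change vs. previous month) pairs."""
    changes = []
    for (_, prev), (month, curr) in zip(revenue_by_month, revenue_by_month[1:]):
        changes.append((month, round((curr - prev) / prev * 100, 1)))
    return changes


def largest_drop(revenue_by_month):
    """Return the month with the most negative change, or None if no month fell."""
    drops = [c for c in month_over_month_changes(revenue_by_month) if c[1] < 0]
    return min(drops, key=lambda c: c[1]) if drops else None


# Made-up figures for illustration.
revenue = [("Feb", 100), ("Mar", 110), ("Apr", 121), ("May", 130),
           ("Jun", 140), ("Jul", 105), ("Aug", 150)]
print(largest_drop(revenue))  # ('Jul', -25.0)
```

If the computed result disagrees with the generated sentence, the insight is discarded or corrected before anyone acts on it.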

Agent-Assisted Tasks

The concept of agents, as described in LangChain’s documentation, involves using an LLM to choose a sequence of actions to take. Agents can be particularly useful in automating routine tasks in Lightdash. For example, an agent could be programmed to generate daily revenue reports every morning. However, instead of just automating this task, the agent can be designed to seek human approval before sending out the report. This ensures that the report is both timely and accurate.
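The approval step can be sketched independently of any agent framework. In the Python example below, all names are hypothetical and a lambda stands in for a real review step. The key design choice is that sending is impossible until a person has explicitly approved the draft.

```python
from dataclasses import dataclass


@dataclass
class Report:
    title: str
    body: str
    approved: bool = False


def review(report, approve):
    """Record a human decision; `approve` is a callable returning True or False."""
    report.approved = bool(approve(report))
    return report


def send_if_approved(report, send):
    """Call `send` only for approved reports; return whether it was sent."""
    if not report.approved:
        return False
    send(report)
    return True


outbox = []
draft = Report(title="Daily revenue", body="(agent-generated summary)")

# Without approval nothing leaves the system.
print(send_if_approved(draft, outbox.append))  # False

# A reviewer approves; here a lambda stands in for a real review step.
review(draft, approve=lambda r: True)
print(send_if_approved(draft, outbox.append))  # True
```

The agent can generate as many drafts as it likes, but the default state of every report is unapproved, so a missed review results in a delayed report rather than an unchecked one.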

Agents can also assist in more complex tasks. For instance, they can be programmed to monitor certain KPIs and alert human analysts if any irregularities are detected. The analyst can then decide whether the irregularity is an anomaly that needs further investigation or a data error that needs to be corrected.
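A monitoring rule of this kind can be very simple. The sketch below uses a z-score against recent history, which is one common heuristic and only an illustration; the threshold of 3.0 and the sample values are arbitrary assumptions. It decides only whether to alert, and leaves the interpretation to the analyst.

```python
import statistics


def flag_irregularity(history, latest, threshold=3.0):
    """Return True when `latest` is more than `threshold` standard deviations
    away from the mean of `history` (which needs at least two values)."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat history: any different value is irregular.
        return latest != mean
    return abs(latest - mean) / stdev > threshold


recent_daily_revenue = [100, 102, 98, 101, 99]  # made-up values

print(flag_irregularity(recent_daily_revenue, 100.5))  # False
print(flag_irregularity(recent_daily_revenue, 120))    # True
```

Whether a flagged value is a genuine anomaly or a data error is exactly the judgment that stays with the human analyst.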

In essence, agents act as a bridge between full automation and human decision-making, ensuring that while tasks are carried out efficiently, human expertise is not sidelined. This is particularly important in a field like BI, where the stakes are high, and the cost of errors can be significant.

The integration of AI and LLMs into Lightdash offers a multitude of possibilities for enhancing data analytics. From augmented dbt model and metadata generation to interactive chart and dashboard creation, insight generation, and agent-assisted tasks, the potential for human-AI collaboration is immense. As the technology matures, it’s not hard to imagine a future where BI tools like Lightdash become more intelligent, efficient, and collaborative, revolutionizing the way we approach data analytics.

Summary

In this part 3 of our series on Lightdash, we have ventured into the exciting realm of AI-powered Business Intelligence. The landscape of data analytics is undergoing a seismic shift, thanks in part to the advent of Large Language Models and advanced AI technologies. While these innovations offer incredible potential, they also present unique challenges that need to be addressed.

Firstly, the reliability of outputs generated by LLMs and AI is a significant concern. With companies managing thousands of tables across various products, fully automating data analytics remains a complex endeavor. Secondly, the ease with which these AI-generated outputs can be validated poses another challenge, especially when they are as intricate as hundreds of lines of SQL code.

Lightdash, a BI tool that already stands out for its excellent UI and UX, offers a promising solution to these challenges. Its seamless integration with dbt allows for the transformation and quality testing of data, providing a more reliable foundation for AI-generated analytics in addition to human-generated analytics. The APIs of Lightdash further extend its capabilities, opening up the possibility of integrating AI functionalities directly into the platform.

We explored several potential use cases where human-AI collaboration could significantly enhance the capabilities of Lightdash. From augmented dbt model and metadata generation to interactive chart and dashboard creation, the possibilities are endless. AI can assist in initial data exploration and insight generation, while human expertise can be leveraged for validation and more nuanced understanding. The concept of agents adds another layer of automation, bridging the gap between machine efficiency and human oversight.

In conclusion, Lightdash is not just a powerful BI tool for today; I personally think it holds immense potential for the future. As we look ahead, it’s clear that the integration of AI and LLMs into platforms like Lightdash will revolutionize the field of data analytics. The challenges are real, but so are the opportunities. Lightdash could very well be at the forefront of this AI-powered transformation in BI. That huge potential is why we at Ubie chose Lightdash.


Yu Ishikawa

Data Engineering / Machine Learning / MLOps / Data Governance / Privacy Engineering