Decipher the Hype and Reality of ChatGPT for BI and Analytics
ChatGPT’s power is astounding and, even more importantly, it is growing incredibly fast. Almost every BI and analytics vendor has initiatives in this area, and while several have implemented integrations, applications are still primarily experimental. The promise is huge, but of course there are plenty of skeptics with valid reasons.
This blog highlights areas where ChatGPT can help business intelligence initiatives, and discusses areas where there may be a bit more hype than reality, for now.
What is ChatGPT?
Let's step back and level-set on what ChatGPT is and what we can use it for.
What better way to define ChatGPT than to ask it to do so for us?
“ChatGPT is a large-scale language model developed by OpenAI, based on the GPT (Generative Pretrained Transformer) architecture. It is one of the most powerful and advanced natural language processing (NLP) models available today.
ChatGPT is capable of generating high-quality human-like text in response to a wide range of prompts and questions. It has been trained on a vast corpus of text data from the internet, including books, articles, and websites, and can generate text in multiple languages.
ChatGPT has a wide range of potential applications, including chatbots, language translation, content generation, and data analysis. It is particularly well-suited for tasks that require a deep understanding of natural languages, such as answering complex questions, summarizing long documents, and generating creative writing.”
While this blog focuses on ChatGPT because of its current publicity, similar and equally powerful efforts by Google DeepMind and others are worth noting.
How does ChatGPT benefit data analytics?
While ChatGPT snuck data analysis into its explanation, the core of its abilities is a probabilistic method for generating remarkably human-like text on almost any topic. Variations of generative models can deal with images and music. It is difficult to do justice to these tools in a short blog, so we will leave the background at that.
ChatGPT has amazing capabilities when it comes to text analysis (e.g. extracting insights and sentiment), text interpretation of visual analysis, and content generation in anything from human to computer languages. However, ChatGPT is not designed to perform actual computational forecasting or other quantitative analysis. Instead, it can support the forecasting process by analyzing text data and generating insights that inform the forecasting model.
So let's explore some of the areas where ChatGPT can be used in BI and Analytics:
1. Data and Charts Interpretation into Contextual “Text”
BI tools excel in data visualization. They generate beautiful charts in many varieties, but require a human to interpret the data, identify trends, and communicate them effectively.
One of the main advances over old-school Excel is that BI solutions visualize data far more effectively and let you change visualizations dynamically. ChatGPT may allow you to take an entirely different angle and explain the data in plain English (similar to natural language query tools), with key findings such as averages and outliers. Here is an example we generated from ChatGPT.
"Sales increase by 20% during the holidays, with the highest sales for electronics, toys, and home goods. Northeast and West regions have higher sales volumes of 25% and 30%, respectively, compared to the Midwest and South regions. Online sales are growing at a rate of 15%. Focus on high-performing categories and regions during the holidays, which can lead to a potential sales increase of up to 20%. Invest in online sales channels and promotional campaigns to drive sales growth."
That is certainly easier to consume than a lot of charts.
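Since ChatGPT is not designed for computation, one way to approach narration is to calculate the key findings with ordinary code and ask the model only to word them. Here is a minimal sketch of that idea. The sales figures, the one-standard-deviation outlier rule, and the function names are all illustrative assumptions, and the sketch stops at building the prompt rather than calling any particular model.

```python
from statistics import mean, stdev

# Made-up illustrative figures, not real sales data.
SALES_BY_REGION = {"Northeast": 100, "West": 100, "Midwest": 100, "South": 200}


def summarize(sales_by_region):
    """Compute key findings (average and outliers) deterministically."""
    values = list(sales_by_region.values())
    average = mean(values)
    spread = stdev(values)  # sample standard deviation
    # Hypothetical rule: a region is an outlier if it sits more than
    # one standard deviation away from the average.
    outliers = sorted(
        region for region, value in sales_by_region.items()
        if abs(value - average) > spread
    )
    return {"average": average, "outliers": outliers}


def build_prompt(findings):
    """Turn the computed findings into a prompt for a text-generation model."""
    outliers = ", ".join(findings["outliers"]) or "none"
    return (
        "Write a two-sentence summary for a business reader. "
        "Use only these facts and do not add numbers of your own. "
        f"Average regional sales: {findings['average']}. "
        f"Outlier regions: {outliers}."
    )


findings = summarize(SALES_BY_REGION)
prompt = build_prompt(findings)
print(prompt)
```

Keeping the arithmetic outside the model means the numbers in the narrative can be traced back to the data, while the model contributes only the wording.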
However, training a model to be accurate requires a good handle on the data structure and a high level of consistency. It may be easier to apply in relatively narrow areas such as sales analytics, but given the vast array of analytics use cases, a single practical solution from a BI vendor is unlikely.
Today there are simpler and more practical AI-enabled tools that achieve this reliably, such as Yellowfin’s Assisted Insights. It provides narration mixed with drill-down charts while maintaining data integrity, which is more challenging with the probabilistic approach that ChatGPT takes.
ChatGPT will create many new opportunities for innovative domain-specific analytic solutions. We expect a host of new vendors to emerge that could leverage more traditional BI platforms to integrate or embed in their solutions. This will accelerate delivery and optimize resources, while still providing differentiation. This aligns with our expectation that opportunities for embedded BI and analytics will continue to grow.
2. Code Generation and “Programming”
BI tools use programming languages to access and interact with data. Even visual interfaces translate the combination of inputs into code. The most common language used in database access and manipulation is SQL.
As noted earlier, ChatGPT is part of a class of tools called Large Language Models (LLMs), which excel at language tasks, including translation. It is entirely possible to input English-language prompts and generate corresponding SQL code (or Python, JS, etc.). Many vendors are experimenting with this approach; some, like ThoughtSpot and Tellius, have formal product integrations. However, results are uneven at best. The generated code is rarely production-ready, but it can certainly help with ideas or boost productivity. It is difficult to predict the pace of advancement here, as the universe of appropriate programming data for model training is far more limited.
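Because generated SQL is rarely production-ready, it helps to check a candidate query against the real schema before anyone runs it. The sketch below shows one possible guardrail using Python's built-in sqlite3 module. The `sales` table, its columns, and the two candidate queries are hypothetical stand-ins for model output, and a real deployment would validate against its own database engine.

```python
import sqlite3


def check_generated_sql(conn, sql):
    """Return (ok, message) for a candidate query without running it."""
    statement = sql.strip().rstrip(";")
    # Only read-only queries are accepted in this sketch.
    if not statement.lower().startswith("select"):
        return False, "only SELECT statements are accepted"
    try:
        # EXPLAIN QUERY PLAN makes SQLite compile the statement against
        # the schema, so unknown tables or columns surface as errors.
        conn.execute("EXPLAIN QUERY PLAN " + statement)
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    return True, "ok"


# Hypothetical schema standing in for a real reporting database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, category TEXT, amount REAL)")

# Hand-written stand-ins for what a model might return.
good_sql = "SELECT region, SUM(amount) FROM sales GROUP BY region;"
bad_sql = "SELECT region, SUM(revenue) FROM sales GROUP BY region;"

print(check_generated_sql(conn, good_sql))
print(check_generated_sql(conn, bad_sql))
```

A check like this catches references to columns that do not exist, but it cannot tell whether a valid query actually answers the question that was asked, so human review still matters.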
Recognizing some of these limitations, Yellowfin adopted a slightly different approach called Guided NLQ (Natural Language Query). It is a departure from natural language processing (NLP) models, but it is a more practical approach that handles even fairly complex use cases requiring several layers of nested queries. With Guided NLQ, users are presented with valid data elements that help them build queries that still read as English-language prompts.
Guided NLQ trades some of the elegance of completely free text entry for accuracy and practicality. As NLQ evolves with advancements in tools like ChatGPT, augmenting or replacing Guided NLQ with NLP will become possible. In fact, the repository of training data that Yellowfin generates will be very useful.
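To make the trade-off concrete, here is a toy sketch of the general idea of guiding a query with valid data elements. It is not Yellowfin's implementation; the vocabularies, the `sales` table, and the function names are all hypothetical. The user picks from known phrases, and the query is assembled only from those picks.

```python
# Hypothetical vocabularies mapping English phrases to SQL fragments.
ALLOWED_METRICS = {"total sales": "SUM(amount)", "average sale": "AVG(amount)"}
ALLOWED_DIMENSIONS = {"region": "region", "category": "category"}


def suggest(options, typed):
    """List the valid phrases that start with what the user has typed."""
    prefix = typed.strip().lower()
    return sorted(name for name in options if name.startswith(prefix))


def build_query(metric, dimension):
    """Assemble SQL from validated picks; free text never reaches the query."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {metric}")
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    column = ALLOWED_DIMENSIONS[dimension]
    expression = ALLOWED_METRICS[metric]
    return f"SELECT {column}, {expression} FROM sales GROUP BY {column}"


# The user types "tot", picks the suggestion, then picks a dimension.
print(suggest(ALLOWED_METRICS, "tot"))
print(build_query("total sales", "region"))
```

Every query produced this way is valid by construction, which is the accuracy side of the trade. The cost is that users can only ask what the vocabulary anticipates.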
3. AI-Centric Analytics Platforms
The ideas described so far apply to all BI and analytics vendors and will likely be important features over the next several years. Keep in mind that such features will be relatively easy to integrate into products, given API-centric deployment approaches. In general, a key question for vendors is how friendly their platforms are to APIs (Application Programming Interfaces).
The ease of implementing these new API architectures is giving birth to interesting new players, startups like seek.ai and olli.ai that take a fully AI-centric approach to building BI and analytics platforms. Relying exclusively on natural language prompts to drive analysis is innovative, but perhaps also unproven at scale. There are multiple open source projects, such as Auto-Analyst (https://github.com/aadityaubhat/auto-analyst), that may offer easier and cheaper ways to experiment, although these solutions have many limitations. Again, domain centricity is likely to be key to success for LLM-enabled AI-centric analytics platforms.
At Yellowfin, we are exploring all areas where it is practical and useful to implement ChatGPT. BI tools, especially SaaS ones, require a decent amount of power. We even considered internal use cases where data tends to be fairly consistent. However, our existing traditional statistical and machine learning models appear more appropriate for such use cases. LLMs are best for text analysis, so the use cases outlined above appear the most relevant for now.
Try Yellowfin For Yourself
Learn how Yellowfin can provide your customers and users with a sophisticated, AI-powered analytics experience for your unique use cases.