Asking your business data questions in plain English

Every company has the same pattern. Someone needs a number. Revenue by product line. Budget variance by department. Top customers by margin.

If they are technical, they write a query. If they are not, they ask someone who can. Or they dig through a dashboard that almost answers their question but not quite. Or they export to Excel and spend an hour pivoting.

Every ad hoc question means waiting. And most business questions are ad hoc.

What we built

We built a system where you ask a question in plain language. The AI generates the appropriate database query. The database executes it. Only the results come back to the conversation. The AI then presents the answer in a readable format.

The AI never sees your raw data. It generates the query, the database runs it, and only the small result set returns. This matters for cost, for performance, and for data privacy.
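The flow above can be sketched in a few lines. This is a minimal illustration, not our production code: the LLM call is stubbed out with a hardcoded mapping, and the table and column names are invented for the example.

```python
import sqlite3

def generate_sql(question: str, context: str) -> str:
    """Stand-in for the AI call. In the real system, a model receives the
    question plus the context layer and returns SQL. Hardcoded here so the
    sketch runs on its own."""
    return (
        "SELECT department, SUM(amount) AS total "
        "FROM transactions GROUP BY department ORDER BY total DESC"
    )

def ask(question: str, conn: sqlite3.Connection, context: str = "") -> list:
    sql = generate_sql(question, context)  # the AI sees the question and schema context only
    return conn.execute(sql).fetchall()    # the database runs it; only results come back

# Toy data: the model never sees these rows, only the schema description.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (department TEXT, amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)",
                 [("Marketing", 9640), ("Engineering", 25100), ("Marketing", 8960)])

print(ask("Total spend by department?", conn))
# → [('Engineering', 25100.0), ('Marketing', 18600.0)]
```

The key design point: the conversation only ever contains the question, the schema context, and the small result set, never the 70,000 underlying rows.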

Running in production

This is currently running internally at DigiDuo. Our dataset: 70,000 rows across 6 business tables — budget, transactions, orders, customers, revenue targets, and employees.

Here is what actual interactions look like:

Question: “What were our top 5 customers by margin last quarter?”

| Customer | Gross Margin | Margin % |
|---|---|---|
| Acme Solutions | 34,200 EUR | 42.1% |
| Nordic Partners | 28,750 EUR | 38.6% |
| Baltic Freight | 21,400 EUR | 35.2% |
| TechServe Group | 19,800 EUR | 33.9% |
| DataPoint Ltd | 17,600 EUR | 31.4% |

Question: “Show me departments that exceeded budget by more than 10% in any month this year.”

| Department | Month | Budget | Actual | Overspend |
|---|---|---|---|---|
| Marketing | February | 8,000 EUR | 9,640 EUR | +20.5% |
| Engineering | March | 22,000 EUR | 25,100 EUR | +14.1% |
| Marketing | March | 8,000 EUR | 8,960 EUR | +12.0% |

Question: “Compare actual revenue vs targets for Q1, broken down by product line.”

| Product Line | Target | Actual | Variance |
|---|---|---|---|
| Consulting | 120,000 EUR | 134,500 EUR | +12.1% |
| Managed Services | 85,000 EUR | 79,200 EUR | -6.8% |
| Training | 30,000 EUR | 28,400 EUR | -5.3% |
| Licensing | 45,000 EUR | 51,100 EUR | +13.6% |
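To make the first question concrete, here is the kind of SQL the system might generate for "top customers by margin", run against a toy two-customer dataset. The schema is hypothetical (the `revenue` and `cost` columns are invented for illustration); only the `cust_id` → `customer_id` join mirrors what the post describes.

```python
import sqlite3

# Toy schema loosely mirroring the orders/customers tables described above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER, cust_id INTEGER, revenue REAL, cost REAL);
INSERT INTO customers VALUES (1, 'Acme Solutions'), (2, 'Nordic Partners');
INSERT INTO orders VALUES (10, 1, 81200, 47000), (11, 2, 74500, 45750);
""")

# The kind of query the AI might emit for "top 5 customers by margin":
sql = """
SELECT c.name,
       SUM(o.revenue - o.cost) AS gross_margin,
       ROUND(100.0 * SUM(o.revenue - o.cost) / SUM(o.revenue), 1) AS margin_pct
FROM orders o
JOIN customers c ON o.cust_id = c.customer_id
GROUP BY c.name
ORDER BY gross_margin DESC
LIMIT 5
"""
for row in conn.execute(sql):
    print(row)
# → ('Acme Solutions', 34200.0, 42.1)
# → ('Nordic Partners', 28750.0, 38.6)
```

The user never sees this SQL unless they ask for it; they just see the formatted table.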

No SQL. No dashboard hunting. No waiting for the data team.

Why not just paste data into ChatGPT?

Fair question. Here is why that does not work at scale:

Data size. 70,000 rows do not fit in an AI context window. Even if they did, performance degrades badly with large inputs.

Cost. Sending large datasets as context with every query is enormously expensive. Our approach sends a small query description, not the data itself.

No live connection. Pasting data is a snapshot. By the time you finish your analysis, the data may be stale. Our system queries live data every time.

Privacy. With our approach, the AI generates the query but never sees the underlying dataset. Only the filtered, aggregated results enter the conversation.

The secret ingredient: a context layer

The system works because of a context layer — a markdown file that tells the AI what the data actually means.

Table relationships. Column naming conventions. Business logic rules. What “margin” means in your company. Which fiscal year definition to use. How departments map to cost centres.

Without this context layer, the AI generates wrong SQL. It guesses at column names. It misunderstands relationships. It applies the wrong filters.

With it, accuracy is high. The AI knows that cust_id in the orders table maps to customer_id in the customers table. It knows that your fiscal year starts in April. It knows that “margin” means gross margin, not net.

This context file takes a few hours to write. It saves hundreds of hours of wrong answers.
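An entry in the context layer might look like this. The content is illustrative, not our actual file:

```markdown
## Tables and joins
- orders.cust_id → customers.customer_id (many-to-one)
- transactions.dept_code → departments.cost_centre

## Business rules
- "Margin" always means gross margin (revenue minus direct cost), never net.
- Fiscal year starts 1 April; "Q1" means April-June unless the user says "calendar Q1".
- Budget figures are monthly, in EUR, excluding VAT.
```

Because it is plain markdown, anyone who knows the business can review and correct it, no code required.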

Three data modes

We built three modes depending on what the situation requires:

Direct file query. The AI queries the source file directly every time. Always current. Best for data that changes frequently.

Loaded tables. Data is loaded into a local database for faster querying. A stable snapshot. Best for analysis sessions where you need speed and consistency.

Scheduled refresh. An automated pipeline that refreshes the database on a set schedule — daily, hourly, whatever fits. Best for production use where you need both speed and freshness.
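The "loaded tables" mode is essentially a bulk load of an export into a local database; run the same load on a timer and you have "scheduled refresh". A minimal sketch using only the Python standard library, with hypothetical file and table names:

```python
import csv
import sqlite3

def load_csv(conn: sqlite3.Connection, path: str, table: str) -> int:
    """Load a CSV export into a local SQLite table (snapshot mode).
    Rerun on a schedule to get the 'scheduled refresh' mode.
    Returns the number of rows loaded."""
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')   # replace the old snapshot
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    conn.commit()
    return len(data)
```

Direct file query skips this step entirely and reads the source file on every question, trading speed for freshness.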

What it takes to set up

For a standard setup: 1-2 weeks. That includes understanding your data, building the context layer, configuring the query system, and testing with real questions from your team.

It works with CSV files, Excel exports, ERP data dumps, and any structured data source. If your data lives in rows and columns, this approach works.

The bottom line

The question is not whether your team needs data. It is whether they can get it without asking someone else.

Ready to transform your business with AI?

Let's talk about what's possible for your specific situation.

Get in touch →