Polytope - A multidimensional data engine
In the prior article, I described my experience taking a small, long-running compiler project and using AI coding tools to extend it with first a TUI and later a web-based mini-IDE. That work taught me a great deal about:
- Refining a specification through back-and-forth conversation with an LLM
- Defining outcomes and tests up front
- Keeping the spec and implementation plan outside of the LLM’s context (as project files)
- Paying close attention to the recurring traps that emerge once implementation begins
One of the earliest pieces of advice I encountered about using LLMs effectively was: don’t ask them to jump too far in a single step. Breaking a large goal into smaller, focused, testable sub-goals tends to produce far better results.
That said, every few months the latest models arrive with noticeably stronger capability — and increasingly they can perform some of this “divide and conquer” automatically. Once I began using proper coding environments (Claude Code, Gemini Code Assist, and Codex), it became obvious that:
- Maintaining TODO lists
- Defining phases
- Checkpointing
- Writing at least some tests
…are now built into the philosophy of all major tools. Attentive monitoring is still essential, but the workflows themselves are structured to encourage good engineering practice.
LLMs nevertheless remain prone to the occasional odd behaviour — especially as their working context becomes full. One of my “favourite” Claude Code quirks is when it suddenly announces: “This is getting complicated, let’s do something simpler.”
It then quietly redefines the entire problem to suit the simpler goal!
Still, with a bit of guidance — and with the willingness to stop a chain of actions whenever something odd appears — you can usually keep the “junior–intern robot programmer” more or less on track.
A New Experiment: Building an Analytic Data Engine
With a little experience under my belt, I wanted to attempt something more ambitious: creating an analytic data engine that could eventually be embedded into psimulang.
This is a substantial undertaking — large enough to:
- Provide an excellent testbed for comparing different coding models
- Refine my own workflow for AI-assisted development
- Explore how well LLMs handle sophisticated architectural design
Comparisons between models age quickly, of course, but they remain interesting. Capabilities differ in surprising ways, and sometimes for reasons you can infer from design choices, context window behaviour, or planning features. Even within a single vendor’s lineup, offerings have diverged: context-window utilization, planning depth, “thinking time,” and scheduling heuristics have all evolved markedly over the past year.
Goals of the Data Engine Development
The high-level goals were:
1. Build a Haskell Engine Library
- A DSL for dimensions and multidimensional cubes (“ORTHOs”), with calculations and consolidation hierarchies.
- A loader DSL for projecting tabular data into a chosen multidimensional shape.
- A reporting DSL for defining report renderings over populated ORTHOs.
- Storage strategies suitable for both dense and sparse data.
- A runtime evaluator capable of executing calculations and consolidations.
- A loader that can ingest data into an ORTHO.
- A reporting engine that can execute report definitions against ORTHO data.
2. Integrate the Engine Into psimulang
- Extend psimulang syntax with blocks corresponding to each DSL object.
- Have the psimulang analyzer emit and serialize DSL structures.
- Pass serialized DSL to new runtime intrinsics.
- Maintain a polytope environment within the psimulang runtime.
- Add commands for polytope lifecycle management and report execution.
These are non-trivial tasks. I followed the same pattern as before: each step began with a requirements and implementation-plan session, refined through conversation. Once ready, the LLM produced:
- A master plan committed to the project
- A bootstrap prompt describing invariants, architectural expectations, library preferences, and anti-patterns to avoid
Only after this review did I begin an implementation session referencing the bootstrap prompt.
Designing the DSLs
For the DSLs, I adapted the patterns I knew from working on an early MOLAP engine in the 1990s. The key objects are:
- DIMENSION — a coordinate set, e.g., “country” or “measure”.
- ORTHO — a multidimensional shape defined by a list of DIMENSIONs; each intersection is a cell with associated storage.
- RULE — a calculation relating one cell to others (a multidimensional formula).
- MODEL — a set of RULEs applied to one or more ORTHOs.
- REPORT — a set of visual objects (tables, charts) each bound to data, plus a sectioning construct for multidimensional iteration.
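To make the first two objects concrete, here is a minimal Haskell sketch of how a DIMENSION with consolidated members and a sparsely stored ORTHO might be modelled. It is an illustration under stated assumptions, not the engine's actual types: the names `Field`, `Dimension`, `Ortho`, `leavesOf`, and `cellValue` are hypothetical, storage is sparse only (a `Map` keyed by coordinates), and consolidation is computed on read rather than by a separate step.

```haskell
import qualified Data.Map.Strict as Map

-- | A dimension member: a plain leaf, or a consolidated member that sums others.
--   (Hypothetical type, loosely mirroring FIELD x and FIELD x = SUM ...)
data Field
  = Leaf String
  | Sum String [String]   -- member name and the members it consolidates
  deriving (Show, Eq)

data Dimension = Dimension
  { dimName   :: String
  , dimFields :: [Field]
  } deriving (Show, Eq)

fieldName :: Field -> String
fieldName (Leaf n)  = n
fieldName (Sum n _) = n

-- | One member name per dimension, in the ORTHO's dimension order.
type Coord = [String]

-- | Sparse storage: only populated leaf cells are kept in the map.
data Ortho = Ortho
  { orthoDims  :: [Dimension]
  , orthoCells :: Map.Map Coord Double
  } deriving (Show, Eq)

-- | Expand a member into the leaf members it covers.
--   Assumes the consolidation hierarchy is acyclic.
leavesOf :: Dimension -> String -> [String]
leavesOf dim name =
  case filter ((== name) . fieldName) (dimFields dim) of
    (Sum _ children : _) -> concatMap (leavesOf dim) children
    _                    -> [name]

-- | Read a cell, summing every leaf cell under the requested coordinate.
--   Unpopulated cells count as 0.
cellValue :: Ortho -> Coord -> Double
cellValue o coord =
  sum [ Map.findWithDefault 0 leafCoord (orthoCells o)
      | leafCoord <- sequence (zipWith leavesOf (orthoDims o) coord) ]
```

Here `sequence` over the per-dimension leaf lists yields their Cartesian product, so a coordinate naming a consolidated member in several dimensions sums every leaf intersection beneath it. A dense strategy would swap the `Map` for an array indexed by member position.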
All objects live within a ConcurrentEnvironment, the core resource container. Objects legally created by the constructor DSL are added to the environment and then manipulated via lifecycle commands. These commands are queued using Haskell’s Software Transactional Memory (STM) and applied atomically when executed. Object-level effects will use a generational scheme with atomic visibility, and dependency deletion is guarded.
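The queue-then-apply pattern described above can be sketched with the `stm` package. This is a simplified illustration, not the real ConcurrentEnvironment: the `Command`, `Env`, `applyCommand`, and `executeQueued` names are hypothetical, objects are reduced to a name plus the names they depend on, and the generation counter is only a stand-in for the generational scheme.

```haskell
import Control.Concurrent.STM
import Control.Monad (foldM)
import qualified Data.Map.Strict as Map

-- | Hypothetical lifecycle commands.
data Command
  = Create String [String]   -- object name and the objects it depends on
  | Delete String
  deriving (Show, Eq)

-- | Object name mapped to the names it depends on.
type Objects = Map.Map String [String]

data Env = Env
  { envObjects :: TVar Objects
  , envQueue   :: TQueue Command
  , envGen     :: TVar Int      -- bumped once per successfully applied batch
  }

newEnv :: IO Env
newEnv = Env <$> newTVarIO Map.empty <*> newTQueueIO <*> newTVarIO 0

-- | Queue a command; nothing changes until the batch is executed.
enqueue :: Env -> Command -> STM ()
enqueue env = writeTQueue (envQueue env)

-- | Pure state transition; Left explains why a command was rejected.
applyCommand :: Objects -> Command -> Either String Objects
applyCommand objs (Create name deps)
  | Map.member name objs            = Left ("already exists: " ++ name)
  | any (`Map.notMember` objs) deps = Left ("missing dependency for: " ++ name)
  | otherwise                       = Right (Map.insert name deps objs)
applyCommand objs (Delete name)
  | Map.notMember name objs          = Left ("no such object: " ++ name)
  | any (elem name) (Map.elems objs) = Left ("still in use: " ++ name)  -- guarded deletion
  | otherwise                        = Right (Map.delete name objs)

-- | Drain the queue and apply the whole batch in one transaction.
--   On success the new state and generation become visible together;
--   on failure the batch is discarded and the state is left untouched.
executeQueued :: Env -> STM (Either String Int)
executeQueued env = do
  cmds <- flushTQueue (envQueue env)
  objs <- readTVar (envObjects env)
  case foldM applyCommand objs cmds of
    Left err    -> pure (Left err)
    Right objs' -> do
      writeTVar (envObjects env) objs'
      modifyTVar' (envGen env) (+ 1)
      Right <$> readTVar (envGen env)
```

Because `executeQueued` is a single STM action, a caller running it with `atomically` never exposes a half-applied batch to other threads. Keeping `applyCommand` pure also makes the dependency guard easy to test without any concurrency.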
This level of complexity is ideal for testing the limits of AI-assisted coding — and for deepening my understanding of effective prompting, process control, and long-form LLM collaboration at scale.
As is often the case, the initial DSL generation from Claude was fast — and impressively confident.
[LWE TBD: tidy and elaborate this section]
Observations and Problems:
- The Environment started simple and later gained concurrent features, but the LLM left the old non-concurrent version in place for "backward compatibility". It took a fair bit of effort to excise.
- It kept forgetting to serialize with Store, which we had chosen, and fell back to Show/Read, despite the surrounding code already using Store.
- It forgot that Store instances can be derived via Generic, and built dozens of instances longhand.
- There were lots of test failures as we progressed, but the tests worked nicely for catching regressions and guiding fixes, so long as you remembered to remind the LLM to use them.
- The LLM would forget to implement whole swathes of a development phase and then declare it 'done'. Always review at suitable intervals: ask for a code quality review (covering missing code, TODOs, vestigial code, anti-patterns, performance issues, etc.).
- Claude gets stuck and starts saying things like "X is hanging, let's simplify".
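For reference, the Generic-based derivation mentioned in the list above looks roughly like this with the `store` package. The `FieldDef` and `DimensionDef` types are hypothetical stand-ins for the real DSL structures; the point is only that an empty `instance Store` declaration is enough once the type derives `Generic`.

```haskell
{-# LANGUAGE DeriveGeneric #-}

import Data.Store (Store, decode, encode)
import GHC.Generics (Generic)

-- Hypothetical DSL structures, used only for illustration.
data FieldDef = FieldDef
  { fieldLabel :: String
  , fieldIndex :: Maybe Int
  } deriving (Show, Eq, Generic)

-- Empty instance body: the methods come from the Generic-based defaults.
instance Store FieldDef

data DimensionDef = DimensionDef
  { dimLabel  :: String
  , dimFields :: [FieldDef]
  } deriving (Show, Eq, Generic)

instance Store DimensionDef

-- | Encode to a strict ByteString and decode again, checking nothing was lost.
roundTrip :: DimensionDef -> Bool
roundTrip d =
  case decode (encode d) of
    Right d' -> d' == d
    Left _   -> False
```

A round-trip property like `roundTrip` is also a cheap regression test to hand to the LLM, since it fails as soon as a type quietly drifts back to Show/Read.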
Claude is a powerful coder, but ChatGPT is great at deep analysis: when Claude gets stuck, have it write a briefing and a prompt for ChatGPT.
-- Business Dashboard Demo in Psimulang
-- Defines all dimensions, ORTHOs, data loads, and report layout
-- ---------------------------------------------------------------------
-- Dimension Definitions
-- ---------------------------------------------------------------------
DIMENSION Quarter
FIELD Q1 INDEX 1
FIELD Q2 INDEX 2
FIELD Q3 INDEX 3
FIELD Q4 INDEX 4
FIELD "All Quarters" = SUM Q1:Q4
END DIMENSION
DIMENSION Month
FIELD Jan INDEX 1
FIELD Feb INDEX 2
FIELD Mar INDEX 3
FIELD Apr INDEX 4
FIELD May INDEX 5
FIELD Jun INDEX 6
FIELD Jul INDEX 7
FIELD Aug INDEX 8
FIELD Sep INDEX 9
FIELD Oct INDEX 10
FIELD Nov INDEX 11
FIELD Dec INDEX 12
FIELD "All Months" = SUM Jan:Dec
END DIMENSION
DIMENSION Region
FIELD "North America"
FIELD Europe
FIELD "Asia Pacific"
FIELD "Rest of World"
FIELD "All Regions" = SUM "North America";Europe;"Asia Pacific";"Rest of World"
END DIMENSION
DIMENSION Product
FIELD Software
FIELD Services
FIELD Hardware
FIELD "All Products" = SUM Software;Services;Hardware
END DIMENSION
DIMENSION ProductCategory
FIELD Enterprise
FIELD Professional
FIELD Standard
END DIMENSION
DIMENSION Segment
FIELD Enterprise
FIELD "Mid-Market"
FIELD "Small Business"
END DIMENSION
DIMENSION ExpenseCategory
FIELD "Sales & Marketing"
FIELD "R&D"
FIELD Operations
FIELD Administration
END DIMENSION
DIMENSION Measure
FIELD Cost
FIELD GrossMargin
FIELD NetMargin
FIELD OperatingMargin
FIELD Revenue
END DIMENSION
DIMENSION Metric
FIELD "Order Fulfillment Rate"
FIELD "Inventory Turnover"
FIELD "Customer Satisfaction"
FIELD "Employee Productivity"
END DIMENSION
DIMENSION KPIAttribute
FIELD Achievement
FIELD Target
FIELD Status
END DIMENSION
-- Measure dimension will be populated via LOAD operations as needed
-- ---------------------------------------------------------------------
-- ORTHO Definitions
-- ---------------------------------------------------------------------
ORTHO BusinessFinancials
DIMENSIONS Quarter, Measure
END ORTHO
ORTHO BusinessRegionalSales
DIMENSIONS Region, Product, Quarter
END ORTHO
ORTHO BusinessMonthlySales
DIMENSIONS Month, ProductCategory
END ORTHO
ORTHO BusinessCustomers
DIMENSIONS Segment, Month
END ORTHO
ORTHO CustomerCounts
DIMENSIONS Segment, Month
END ORTHO
ORTHO BusinessKPIs
DIMENSIONS Metric, KPIAttribute
END ORTHO
ORTHO BusinessExpenses
DIMENSIONS ExpenseCategory
END ORTHO
-- ---------------------------------------------------------------------
-- Data Loads from Polytope CSV sources
-- ---------------------------------------------------------------------
LET load_financials = LOAD "Quarterly financial data"
READ FILE HOST "polytope/data/business_quarterly_revenue.csv"
WRITE ORTHO BusinessFinancials
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 TEXT
COLUMN 3 NUMBER
DIMENSION Quarter FIELD FROM COLUMN 1
DIMENSION Measure FIELD FROM COLUMN 2
VALUE FROM COLUMN 3
SKIP 1
END LOAD
LET load_regional = LOAD "Regional sales data"
READ FILE HOST "polytope/data/business_regional_sales.csv"
WRITE ORTHO BusinessRegionalSales
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 TEXT
COLUMN 3 TEXT
COLUMN 4 NUMBER
DIMENSION Region FIELD FROM COLUMN 1
DIMENSION Product FIELD FROM COLUMN 2
DIMENSION Quarter FIELD FROM COLUMN 3
VALUE FROM COLUMN 4
SKIP 1
END LOAD
LET load_monthly = LOAD "Monthly sales data"
READ FILE HOST "polytope/data/business_monthly_sales.csv"
WRITE ORTHO BusinessMonthlySales
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 TEXT
COLUMN 3 NUMBER
DIMENSION Month FIELD FROM COLUMN 1
DIMENSION ProductCategory FIELD FROM COLUMN 2
VALUE FROM COLUMN 3
SKIP 1
END LOAD
LET load_customers = LOAD "Customer growth data"
READ FILE HOST "polytope/data/business_customer_data.csv"
WRITE ORTHO BusinessCustomers
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 TEXT
COLUMN 3 NUMBER
COLUMN 4 NUMBER
COLUMN 5 NUMBER
DIMENSION Segment FIELD FROM COLUMN 1
DIMENSION Month FIELD FROM COLUMN 2
VALUE FROM COLUMN 3
SKIP 1
END LOAD
LET load_customer_counts = LOAD "Customer counts"
READ FILE HOST "polytope/data/business_customer_data.csv"
WRITE ORTHO CustomerCounts
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 TEXT
COLUMN 3 NUMBER
COLUMN 4 NUMBER
COLUMN 5 NUMBER
DIMENSION Segment FIELD FROM COLUMN 1
DIMENSION Month FIELD FROM COLUMN 2
VALUE FROM COLUMN 4
SKIP 1
END LOAD
LET load_kpis_achievement = LOAD "KPI metrics - Achievement"
READ FILE HOST "polytope/data/business_kpi_metrics.csv"
WRITE ORTHO BusinessKPIs
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 NUMBER
COLUMN 3 NUMBER
COLUMN 4 TEXT
DIMENSION Metric FIELD FROM COLUMN 1
DIMENSION KPIAttribute FIELD "Achievement"
VALUE FROM COLUMN 2
SKIP 1
END LOAD
LET load_kpis_target = LOAD "KPI metrics - Target"
READ FILE HOST "polytope/data/business_kpi_metrics.csv"
WRITE ORTHO BusinessKPIs
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 NUMBER
COLUMN 3 NUMBER
COLUMN 4 TEXT
DIMENSION Metric FIELD FROM COLUMN 1
DIMENSION KPIAttribute FIELD "Target"
VALUE FROM COLUMN 3
SKIP 1
END LOAD
LET load_kpis_status = LOAD "KPI metrics - Status"
READ FILE HOST "polytope/data/business_kpi_metrics.csv"
WRITE ORTHO BusinessKPIs
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 NUMBER
COLUMN 3 NUMBER
COLUMN 4 NUMBER
DIMENSION Metric FIELD FROM COLUMN 1
DIMENSION KPIAttribute FIELD "Status"
VALUE FROM COLUMN 4
SKIP 1
END LOAD
LET load_expenses = LOAD "Expense breakdown"
READ FILE HOST "polytope/data/business_expenses.csv"
WRITE ORTHO BusinessExpenses
FORMAT COLUMN DIRECTED USING "," "\""
COLUMN 1 TEXT
COLUMN 2 NUMBER
DIMENSION ExpenseCategory FIELD FROM COLUMN 1
VALUE FROM COLUMN 2
SKIP 1
END LOAD
LET consolidatedRegional = CONSOLIDATE BusinessRegionalSales
LET consolidatedFinancials = CONSOLIDATE BusinessFinancials
LET consolidatedMonthly = CONSOLIDATE BusinessMonthlySales
LET consolidatedCustomers = CONSOLIDATE BusinessCustomers
LET consolidatedCounts = CONSOLIDATE CustomerCounts
LET consolidatedExpenses = CONSOLIDATE BusinessExpenses
LET consolidatedKPIs = CONSOLIDATE BusinessKPIs
-- ---------------------------------------------------------------------
-- Business Dashboard Report Definition
-- ---------------------------------------------------------------------
REPORT BusinessDashboard
TITLE "Q4 2023 Business Performance Dashboard"
SECTION "Executive Summary"
PRINT LEFT
"Key performance indicators and business metrics for Q4 2023."
END PRINT
TABLE "Key Metrics"
SOURCE ORTHO BusinessKPIs
COLUMNS KPIAttribute
HEADINGS <BOLD, COLOR blue>
FIELDS Achievement <COLOR green, FIXED 1>
FIELDS Target <FIXED 1>
FIELDS Status <FIXED 0, RANGE 0 <COLOR red, BOLD, REPLACE "Below Target">, RANGE 1 <COLOR orange, BOLD, REPLACE "On Target">, RANGE 2 <COLOR green, BOLD, REPLACE "Good">, RANGE 3 <COLOR lightblue, BOLD, REPLACE "Excellent">>
END COLUMNS
ROWS Metric
HEADINGS <BOLD>
FIELDS Metric
END ROWS
END TABLE
BESIDE
GRAPH "Revenue by Product Line" PIE
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "All Regions"
METADATA measure = "Sales"
FIELDS LIST Software, Services, Hardware
PLOT Product
MEASURES Sales
AXIS NUMERIC
LEGEND RIGHT
END GRAPH
BESIDE
GRAPH "Quarterly Revenue Trend" BAR
SOURCE ORTHO BusinessFinancials
METADATA measure = "Revenue"
PLOT Quarter
MEASURES Revenue
AXIS NUMERIC
LEGEND TOP
END GRAPH
END SECTION
SECTION "Sales Performance Analysis"
PRINT LEFT
"Detailed breakdown of sales performance across regions and product categories."
END PRINT
BESIDE
SECTION "Regional Performance"
TABLE "North America Performance"
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "North America"
COLUMNS Region
HEADINGS <COLOR navy>
FIELDS Sales <CURRENCY "usd", FIXED 2, RIGHT, COLOR green, BOLD>
END COLUMNS
ROWS Product
HEADINGS <BOLD>
FIELDS Product
END ROWS
END TABLE
TABLE "Europe Performance"
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "Europe"
COLUMNS Region
HEADINGS <COLOR navy>
FIELDS Sales <CURRENCY "usd", FIXED 2, RIGHT>
END COLUMNS
ROWS Product
HEADINGS <BOLD>
FIELDS Product
END ROWS
END TABLE
TABLE "Asia Pacific Performance"
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "Asia Pacific"
COLUMNS Region
HEADINGS <COLOR navy>
FIELDS Sales <CURRENCY "usd", FIXED 2, RIGHT>
END COLUMNS
ROWS Product
HEADINGS <BOLD>
FIELDS Product
END ROWS
END TABLE
TABLE "Rest of World Performance"
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "Rest of World"
COLUMNS Region
HEADINGS <COLOR navy>
FIELDS Sales <CURRENCY "usd", FIXED 2, RIGHT>
END COLUMNS
ROWS Product
HEADINGS <BOLD>
FIELDS Product
END ROWS
END TABLE
TABLE "All Regions Performance"
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4", Region = "All Regions"
COLUMNS Region
HEADINGS <COLOR navy>
FIELDS Sales <CURRENCY "usd", FIXED 2, RIGHT, BOLD, COLOR blue>
END COLUMNS
ROWS Product
HEADINGS <BOLD>
FIELDS Product
END ROWS
END TABLE
END SECTION
BESIDE
GRAPH "Regional Sales Distribution" BAR
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4"
METADATA measure = "Sales", Product = "All Products"
PLOT Region
MEASURES Sales
AXIS NUMERIC
LEGEND TOP
END GRAPH
PRINT LEFT
"Product performance across all regions:"
END PRINT
GRAPH "Product Sales by Region" BAR
SOURCE ORTHO BusinessRegionalSales
FILTER Quarter = "Q4"
METADATA measure = "Sales"
PLOT Region
REFERENCE Product
MEASURES Sales
AXIS NUMERIC
LEGEND RIGHT
END GRAPH
GRAPH "Monthly Sales Trend" LINE
SOURCE ORTHO BusinessMonthlySales
METADATA measure = "Sales"
FIELDS RANGE Jan TO Dec
PLOT Month
REFERENCE ProductCategory
MEASURES Sales
AXIS NUMERIC
LEGEND TOP
END GRAPH
END SECTION
SECTION "Customer Analytics"
PRINT LEFT
"Customer segmentation and behavior analysis."
END PRINT
BESIDE
TABLE "Customer Segments"
SOURCE ORTHO BusinessCustomers
FILTER Month = "Dec"
COLUMNS Month
HEADINGS <BOLD>
FIELDS NewCustomers
END COLUMNS
ROWS Segment
HEADINGS <BOLD>
FIELDS Segment
END ROWS
END TABLE
BESIDE
GRAPH "Customer Distribution" PIE
SOURCE ORTHO CustomerCounts
FILTER Month = "Dec"
METADATA measure = "CustomerCount"
PLOT Segment
MEASURES CustomerCount
AXIS NUMERIC
LEGEND RIGHT
END GRAPH
GRAPH "Customer Growth Trend" AREA
SOURCE ORTHO BusinessCustomers
METADATA measure = "NewCustomers"
FIELDS RANGE Jan TO Dec
PLOT Month
REFERENCE Segment
MEASURES NewCustomers
AXIS NUMERIC
LEGEND TOP
END GRAPH
END SECTION
SECTION "Financial Performance"
PRINT LEFT
"Detailed financial metrics and profitability analysis."
END PRINT
TABLE "Q4 Financial Summary"
SOURCE ORTHO BusinessFinancials
FILTER Quarter = "Q4"
COLUMNS Quarter
HEADINGS <COLOR navy>
FIELDS Value <CURRENCY "usd", FIXED 2, RIGHT, BOLD, RANGE :1000000 <COLOR red>, RANGE 1000000+:5000000 <COLOR orange>, RANGE 5000000: <COLOR green>>
END COLUMNS
ROWS Measure
HEADINGS <BOLD>
FIELDS Measure
END ROWS
END TABLE
BESIDE
GRAPH "Quarterly Revenue Trend" LINE
SOURCE ORTHO BusinessFinancials
METADATA measure = "Revenue"
PLOT Quarter
MEASURES Revenue
AXIS NUMERIC
LEGEND TOP
END GRAPH
BESIDE
GRAPH "Q4 Expense Breakdown" BAR
SOURCE ORTHO BusinessExpenses
METADATA measure = "Expenses"
PLOT ExpenseCategory
MEASURES Expenses
AXIS NUMERIC
LEGEND TOP
END GRAPH
END SECTION
SECTION "Operational Metrics"
PRINT LEFT
"Key operational performance indicators."
END PRINT
BESIDE
TABLE "Operational KPIs"
SOURCE ORTHO BusinessKPIs
COLUMNS KPIAttribute
HEADINGS <BOLD>
FIELDS Achievement;Target;Status
END COLUMNS
ROWS Metric
HEADINGS <BOLD>
FIELDS Metric
END ROWS
END TABLE
BESIDE
GRAPH "KPI Achievement" BAR
SOURCE ORTHO BusinessKPIs
METADATA measure = "Achievement"
PLOT Metric
MEASURES Achievement
AXIS NUMERIC
LEGEND RIGHT
END GRAPH
END SECTION
END REPORT
LET dashboard_result = CALL BusinessDashboard
WRITE "Business dashboard generated (CALL result = " dashboard_result ")"
The psimulang code (including declarative data and report blocks for a business dashboard)
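To show what the LOAD blocks in the listing ask of the engine, here is a minimal Haskell sketch of projecting already-parsed rows into ORTHO cells. It is not the actual loader: `Source`, `LoadSpec`, `projectRow`, and `loadRows` are hypothetical names, CSV parsing and quoting are assumed to have happened already, and rows that cannot be projected are silently dropped, which a real loader would presumably report.

```haskell
import qualified Data.Map.Strict as Map
import Text.Read (readMaybe)

-- | Where a dimension's member comes from: a 1-based column
--   (DIMENSION d FIELD FROM COLUMN n) or a fixed member (DIMENSION d FIELD "x").
data Source = FromColumn Int | Fixed String

data LoadSpec = LoadSpec
  { specSkip  :: Int        -- leading rows to skip (SKIP n)
  , specDims  :: [Source]   -- one source per ORTHO dimension, in order
  , specValue :: Int        -- 1-based column holding the cell value
  }

-- | Safe 1-based column lookup.
column :: Int -> [String] -> Maybe String
column n row
  | n >= 1 && n <= length row = Just (row !! (n - 1))
  | otherwise                 = Nothing

-- | Turn one row into (coordinate, value), or Nothing if a column is
--   missing or the value is not numeric.
projectRow :: LoadSpec -> [String] -> Maybe ([String], Double)
projectRow spec row = do
  coord <- mapM member (specDims spec)
  value <- column (specValue spec) row >>= readMaybe
  pure (coord, value)
  where
    member (FromColumn n) = column n row
    member (Fixed name)   = Just name

-- | Project all rows after the skipped header rows into a sparse cell map.
--   If two rows hit the same coordinate, the later row wins.
loadRows :: LoadSpec -> [[String]] -> Map.Map [String] Double
loadRows spec rows =
  Map.fromList
    [ cell | Just cell <- map (projectRow spec) (drop (specSkip spec) rows) ]
```

Under these assumptions, the `load_kpis_target` block would correspond to `LoadSpec 1 [FromColumn 1, Fixed "Target"] 3`: the Metric member is read from column 1, the KPIAttribute member is pinned to "Target", and the value comes from column 3.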

For the psimulang syntax work, I tried Gemini 2.5 Pro and Codex. Codex was promising, but its error rate was quite high. Claude fared better. ChatGPT was excellent at debugging, though, even when Claude seemed stuck.
Feedback on differences between LLMs:
- Gemini: big context window, great for wide code analysis, documentation, etc.
- Codex/ChatGPT (4o/5): great at complex analysis and fixing complex bugs
- Claude (Opus 4.1, Sonnet 4.5): great at general coding