Almost every software project needs to generate text, and it usually has a specialized language to do it. From printf to PHP, we keep inventing new ways to print text again and again. But, for each one I used, I always felt like we could do better.

Finally I did what every coder with too much free time does and built my own. (Thanks, 2020.) The result is Acutis. Maybe it’s better than its predecessors, but I’ll let you decide that yourself. Either way, it’s certainly different.

I wrote the following document to share what I learned along the way. This isn’t so much a tutorial or a guide as it is the story of my journey. I’m not a computer scientist by education or profession, and I’m sure that much of the knowledge herein is elementary to compiler experts. Some of it may also seem a bit sophisticated to beginner coders. Regardless of the reader’s skill level, I hope to convey my decisions and the problems I solved more than merely the technical details.

To see Acutis in action, visit its home page here or check out its source code here.

What Acutis is, or: what makes this language different than any other language

Acutis is a simple language based on sophisticated ideas. What sets Acutis apart from its peer template languages is that it’s statically compiled using a structural type system, and it automatically validates its input to guarantee predictable behavior.

An Acutis template consists of text that renders as-is, sections that conditionally render, and sections that render variables from the input data.

This is a basic Acutis template:

Hello {% name %},

{% match hasMetBefore with false ~%}
  It's my pleasure to meet you.
{%~ with true ~%}
  I'm happy to see you again.
{%~ /match %}

Given a JSON input of {"name": "John", "hasMetBefore": false} this template will render:

Hello John,

It's my pleasure to meet you.

So far, this is nothing too special. We use match and with instead of the more-common if and else, but in this example that difference is only cosmetic.

Things get more interesting once you begin pattern-matching larger data structures. Consider this template:

{% match article
   with {published: true, title, dates: {posted, updated}} %}
  {% title %} was posted on {% posted %} and updated on {% updated %}.
{% with {published: false} %}
  {* Don't render unpublished articles. *}
{% /match %}

Now we aren’t just introducing different code paths, we’re also describing the exact structure of the input for each path and using the names from that description to print variables.

The compiler uses this structural information to infer a type scheme for your template. With that type scheme, it can parse your input data to ensure that it conforms.

A template is rarely useful on its own. We can include template components with an XML-esque syntax:

{% Header title=siteTitle description / %}

{% Article date=articleDate %}
  Write your article text here.
{% /Article %}

This all may seem elementary, and indeed simplicity is by design. But the story of how I ended up with this design is long and sinuous.

Why I made a new language, or: how to scratch your own itch

Most template languages suffer from an unfortunate design where they mash any inputs together and try to render whatever they can without crashing. Templates built with JavaScript will gladly print nonsense like “[object Object]” while acting oblivious that anything went wrong. It feels like having an assistant who does their job wrong often enough that you always have to check behind them. When you ask them why they did what they did, they can’t explain it.

I needed a smarter assistant. It should always output exactly what I specified. If doing that was impossible, then it would report what was wrong with the template source or what was wrong with the input data. Could that be so hard?

In no particular order, I wanted:

  • A language with a fully specified behavior including clear success and failure states, and that never rendered nonsense as a kind of pseudo-error.
  • A static analyzer that could guarantee type coherence, find unreachable code, and detect conditionals that didn’t cover all cases.
  • Pattern matching.
  • Type inference.
  • The ability to use my websites’ data without manually converting it.
  • A JavaScript runtime.
  • Support for asynchronous execution, preferably by using JavaScript promises.
  • The warm feeling I get when the pieces of a program all “click” together with no room for error.

These are all individually solved problems across many languages, but I hadn’t seen a template language that solved them together in the way I wanted. So I went to work.

Design decisions, or: loving to crash loud and long and clear

Before we get into the nitty-gritty, I wish to make you aware of the Acutis language manual. It contains much more detail about how the language works, and I don’t want to bog down this document by repeating it all here. Feel free to reference it if any of the following information seems scant.

Also, as an aside, I found writing the manual to be both highly rewarding and helpful for designing the language. When I needed to figure out how a feature should work, I would try writing about it. If it was hard to describe in English, then it was probably a bad idea to implement in code as well. Good ideas are intuitive. Trying to explain bad ideas only reveals how bad they are.

Now, for the language itself, I employed the philosophy of making “impossible states unrepresentable.” A good type-checker can make incoherent types impossible, and a pattern-matching compiler can identify unused cases. In Acutis, there’s no printing arbitrary values with unspecified outputs. It either works or it crashes.

File "example.acutis", 1:4-1:33
Matching error.
This pattern-matching is not exhaustive.
Here's an example of a pattern which is not matched:
An example error message.

I wanted to improve on the development experience I found in other template languages. Even with fancy editor features, you still have to continuously execute the templates and inspect their output to catch bugs while you edit. In contrast, Acutis “fails loudly.” It crashes as soon as it catches a whiff that something in your code doesn’t add up, and it prints a message explaining exactly why. This lets you write Acutis code using the timeless “edit, compile, run” cycle. As a bonus, this eliminates (or at least minimizes) the need for extra tooling for analysis and debugging.

The Acutis language is also deliberately constrained à la the “principle of least power.” Sometimes a less expressive language makes it easier to build what you need. Acutis lacks conventional functions (or “shortcodes” in the jargon of other template languages). It doesn’t support imperative loops or assigning and manipulating variables. It doesn’t even support recursion, which puts it in a whole category of unpowerful-ness.

I found that these limitations don’t hinder me from building websites using Acutis. I don’t need a Swiss-army knife to make templates, just a really sharp blade. Any other tools are better taken from more powerful languages. Instead of using functions or variables to manipulate data within the templates, it’s easier to simply preprocess the data (using any language of my choice). Acutis supports executing external code as if it was a template component, which works as an escape hatch for those cases where I need something more sophisticated.

And the hardest design decision for any software is naming it. I settled on dubbing it after Carlo Acutis. I had started working on it just after his beatification in 2020, and his patronage seems most appropriate for a compiler that builds websites.

Playing the orchestra, or: how Acutis basically works

Acutis is built using OCaml, a functional programming language that’s used for both research and industry. OCaml’s features and ecosystem happen to make it well-suited for creating compilers. I also personally find it pleasant to work with.

The Acutis compiler starts with a lexer, which breaks down each template into discrete tokens. The lexer itself is made with ocamllex, an OCaml program which generates lexical analyzers. Next, we feed the tokens to a parser, which produces an abstract syntax tree (AST). This parser is made with the parser generator Menhir.

After the lexer and parser produce an AST, our work has only just begun. The next step is to feed that AST into a type-checker, which will validate that the types are coherent and produce a typed tree (like the abstract syntax tree, but type-ier). Now that we’ve warmed up with that, we send the typed tree to the pattern-matching compiler, which reads the matrices of patterns and produces a decision tree for each. We’re almost done, but what do we want to do with the result now? Run it? Print it to a file? We can do either by folding over it with tagless-final-style language semantics which map to either executable functions or printers for JavaScript code.

[Diagram: Template source → Lexer → Tokens → Parser → Abstract syntax tree → Type checker → Typed tree → Pattern matching → Optimized tree → Tagless-final instructions → Renderer or JavaScript printer. The lexer, parser, type checker, pattern matching, and tagless-final instructions are compile steps; the tokens and trees are data structures; the renderer and JavaScript printer are runtimes.]
A simplified overview of the Acutis compiler.

At least, that’s the conductor’s viewpoint, watching the instruments play their parts in the orchestra as a whole. Each of these steps deserves a more detailed explanation, but first let’s look at some of the decisions I was forced to make along the way.

One fact which became obvious early was that the type definitions, the pattern matching, and the runtime representation of data were all intertwined. What the type-checker produced needed to be suitable for building decision trees, which needed to match how the data is represented during runtime. The choices I made at each step caused cascades of pieces to rearrange throughout the project.

I took a page from the OCaml book by separating types and values. Types only exist during compilation, a technique called “type erasure.” The values constructed at runtime carry no type information. This allows for a leaner runtime, and it creates a starker border between compile-time and runtime, where certain operations are only possible in either one or the other.

At runtime, everything is either a primitive value (integer, floating-point number, or string), a tuple, or a hash table. One benefit to this limitation is that you only need to deal with a few kinds of values, which makes compiling pattern-matching and doing optimizations much easier.

To give specific examples of what type erasure does: true is represented as 1, and false and null are both 0. Nullable values are “boxed” into a 1-tuple, so a non-null a becomes (a). Lists are represented as nested 2-tuples, so [a, b, c] becomes (a, (b, (c, 0))). An empty list is 0.
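As a rough illustration of that erasure, here is a minimal OCaml sketch. The `value` type and the `of_bool`, `of_option`, and `of_list` helpers are names invented for this example; they are assumptions, not the definitions in the Acutis source.

```ocaml
(* Hypothetical erased runtime values: primitives, tuples, and hash tables. *)
type value =
  | Int of int
  | Float of float
  | String of string
  | Tuple of value array
  | Table of (string, value) Hashtbl.t

(* true is 1 and false is 0. *)
let of_bool b = Int (if b then 1 else 0)

(* null is 0, and a non-null value is boxed into a 1-tuple. *)
let of_option = function
  | None -> Int 0
  | Some v -> Tuple [| v |]

(* [a, b, c] becomes (a, (b, (c, 0))), and the empty list is 0. *)
let of_list items =
  List.fold_right (fun item tail -> Tuple [| item; tail |]) items (Int 0)

let () =
  (* false, null, and the empty list are indistinguishable at runtime. *)
  assert (of_bool false = of_option None);
  assert (of_option None = of_list []);
  (* Boxing keeps a nested "non-null wrapping a null" distinct from null. *)
  assert (of_option (Some (of_option None)) <> of_option None)
```

Because the same 0 stands for three different things, nothing at runtime can tell them apart. Only the types checked at compile time say which one a given 0 means.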

But type erasure like this only works with a compile-time type-checker that is 100% sound, which brings us to the next section.

The messy world of inference and unification, or: how the Acutis type-checker works

Type-checking at compile time, or “static typing,” is one of my favorite ways to catch programming errors. Unlike most static analysis techniques, which typically rely on heuristics or case-by-case rules, type-checkers can mathematically prove a whole category of runtime errors are impossible. The more that your program is built on disciplined rules with a sound foundation, the easier it is for both humans and machines to reason about it.

I had looked at several type systems in existing languages, hoping that I could borrow them without reinventing the wheel, but none fit exactly what I had in mind. I wanted a system that worked completely by inference so that type annotations would be unneeded. I needed a structural system where compound types could be extensible. Finally, and this was the hard part, it needed to work ergonomically with my (and your) messy data.

I started with the JSON specification, which includes some standard types: number, string, boolean, null, array, and key-value pairs (objects). But you can only safely model a small subset of real-world data with only those types, at least with my notion of safe. So this required some additions.

The easiest problem to fix was null. In many languages, anything can be nullable. Acutis instead treats null like the “option” type in OCaml, where it’s a distinct type that can contain other types. This plays well with type erasure because at runtime the value null doesn’t necessarily need to exist; you can represent it as something like 0. Non-null values get wrapped into a structure, so nested nullable values are safely possible.

Another problem was with the container types: array and object. You can safely have a dynamically sized container of homogeneously typed values or a fixed-size container of heterogeneously typed values, but a dynamic heterogeneous container breaks soundness. To solve this, I split the containers into safer definitions. I split array into list (dynamically sized, containing a single homogeneous type) and tuple (fixed-size, containing multiple heterogeneous types). I split object into dictionary (dynamic, homogeneous) and record (fixed, heterogeneous).

The final problem was JSON’s lack of sum types. For most practical data, you need some way to express that different data structures may appear in the same place. Although it may seem tempting to simply allow unions like string | int, that is only safe in limited situations. It also breaks type erasure, where the compiler doesn’t preserve certain type information at runtime. Even more problematic is the need to unify complex records. Although it’s possible to model a union like {title: string, pages: int} | {title: string, runtime: int}, making an algorithm to do so efficiently and reliably is more trouble than it’s worth. When the type-checker sees a new record structure and tries to unify it with an existing union, how does it decide which case to use? Or does it add a new case? How will the pattern-matching runtime determine which case it’s handling?

As usual, the simpler and more explicit rules are easier for both humans and machines. Acutis allows records to unify when they share a “discriminator” field. Let’s add one to our previous example: {@type: "book", title: string, pages: int} | {@type: "film", title: string, runtime: int}. The @ labels a field as a discriminator. This rule is not only simple to implement in the type-checker, but it also naturally fits most practical JSON data. You very likely want some kind of ad-hoc discriminator field in your data anyway to help classify variations.

The same type-checking logic also applies to enumeration types, which we write in Acutis like @"book" | @"film" or @0 | @1 | @2. I don’t personally find these useful for the kinds of data that get rendered onto web pages, but they’re easy to add given the presence of record unions already. They also allow an easy way to implement booleans, where false | true is really just an alternate syntax for @0 | @1.

With these features, we now have a basic implementation of what fancy-pants languages call algebraic data types. The example above is analogous to an OCaml type like Book of {title: string; pages: int} | Film of {title: string; runtime: int}. In the OCaml type, the compiler internally manages how each value gets discriminated. In the Acutis version, the user defines it themselves. This means that we don’t need to rely on special fields (like the GraphQL __typename field). It balances flexibility with soundness.
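To make the comparison concrete, here is a minimal OCaml sketch of decoding discriminated records into that variant. The `json` type is a tiny stand-in invented for this example, `decode` is a hypothetical helper, and the sketch assumes that the discriminator arrives in the input as a plain field named "type". None of this is the Acutis decoder itself.

```ocaml
(* The OCaml analogue of the Acutis union from the text. *)
type media =
  | Book of { title : string; pages : int }
  | Film of { title : string; runtime : int }

(* A tiny stand-in for JSON input (an assumption for this sketch). *)
type json = Str of string | Num of int | Obj of (string * json) list

(* The discriminator field picks the case before any other field is read,
   so there is never any guessing about which record shape applies. *)
let decode json =
  match json with
  | Obj fields -> (
      let field key = List.assoc_opt key fields in
      match (field "type", field "title") with
      | Some (Str "book"), Some (Str title) -> (
          match field "pages" with
          | Some (Num pages) -> Ok (Book { title; pages })
          | _ -> Error "book: expected an integer field \"pages\"")
      | Some (Str "film"), Some (Str title) -> (
          match field "runtime" with
          | Some (Num runtime) -> Ok (Film { title; runtime })
          | _ -> Error "film: expected an integer field \"runtime\"")
      | Some (Str ("book" | "film")), _ ->
          Error "expected a string field \"title\""
      | _ -> Error "field \"type\" must be \"book\" or \"film\"")
  | _ -> Error "expected an object"

let () =
  let input = Obj [ ("type", Str "film"); ("title", Str "T"); ("runtime", Num 90) ] in
  assert (decode input = Ok (Film { title = "T"; runtime = 90 }))
```

A record with an unknown or missing discriminator fails immediately with an error, which is the “fail loudly” behavior described earlier.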

My highest priority when inventing this type system was guaranteeing soundness (which is provable), but my second priority was having good ergonomics and practicality (which is mostly subjective). I like to think I did a decent job at both.

Decisions, decisions, or: how Acutis pattern-matching works

If you’ve used pattern matching in other languages, then the Acutis version probably works the same, at least on the surface. You write a sequence of patterns and the runtime checks to see if any of them match the input data.

{% match author with {name, books: [{title}, ..._]} %}
  {% name %}'s latest book is {% title %}.
{% with {name, books: []} %}
  {% name %} hasn't published any books yet.
{% /match %}

The above code roughly says: “take the name field from author. If the books field is a list with at least one item, take the first item’s title field and render the first case. If books is an empty list, render the second case.”
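For readers coming from OCaml, a rough analogue of that template might look like the sketch below. The `book` and `author` record types and the `render` function are assumptions written for this comparison. In Acutis the types would be inferred from the patterns instead of declared.

```ocaml
(* Hypothetical types matching the shape that the template's patterns describe. *)
type book = { title : string }
type author = { name : string; books : book list }

let render = function
  (* At least one book: bind the first item's title. *)
  | { name; books = { title } :: _ } ->
      Printf.sprintf "%s's latest book is %s." name title
  (* Empty list. *)
  | { name; books = [] } ->
      Printf.sprintf "%s hasn't published any books yet." name

let () =
  print_endline (render { name = "Ann"; books = [ { title = "T" } ] })
```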

My initial implementation of pattern-matching simply checked every single value on every single pattern until it got a positive hit. This wasn’t very efficient, but worse was that it had no way to analyze the patterns. I needed to detect partial matches (where a possible case isn’t covered) and redundant patterns (that will never be covered). It turns out that solving those can get complicated.

The Acutis compiler presently takes each matrix of patterns and creates a decision tree. In our above example, the tree would describe something like: “bind the field name, if books is empty then go to case 2, if books is not empty then bind field title and then go to case 1.” That’s an oversimplification, since the real tree has instructions like when to open and close each structure (records, list items, etc.), but it’s the basic idea.

The most complicating feature of the Acutis decision trees is how they handle “wildcards,” which are patterns that match anything (e.g. a variable like name or _). The example above is simple, but complex cases are common. Imagine if we matched on a specific value or data structure and then, in a later case, also matched it with a wildcard.

Consider the tree that these cases would produce:

{% match a,  b,  c
   with 10, 11, 12 %} case 0
{% with  _, 21, 22 %} case 1
{% with 30, 31, 32 %} case 2
{% with 30,  _, 42 %} case 3
{% with  _,  _,  _ %} case 4
{% /match %}

“If a = 10 then check if b = 11 then check if c = 12, then…” But what if b = 21? Where do you go from there? What if c = 32? Do you need to go back and test a again?

Acutis uses a basic yet effective technique called wildcard expansion. When you add a wildcard node to a tree, it takes the nodes after the wildcard and copies them into all of the following nodes in the existing tree. In our example, that means that after checking for a = 10 then the next node will check for b = 11 as well as b = 21, and follow the tree accordingly. If a = 30 or if a doesn’t match any of the given numbers, then it will also check if b = 21, and follow the resulting tree from there.

a = 10
  b = 11
    c = 12: case 0
    c = wildcard: case 4
  b = 21
    c = 22: case 1
    c = wildcard: case 4
  b = wildcard
    c = wildcard: case 4
a = 30
  b = 21
    c = 22: case 1
    c = 42: case 3
    c = wildcard: case 4
  b = 31
    c = 32: case 2
    c = 42: case 3
    c = wildcard: case 4
  b = wildcard
    c = 42: case 3
    c = wildcard: case 4
a = wildcard
  b = 21
    c = 22: case 1
    c = wildcard: case 4
  b = wildcard
    c = wildcard: case 4
A decision tree for our "match a, b, c" example.

This approach brings us advantages, albeit not without downsides. One benefit is that its trees guarantee that every value will only need to be tested once at runtime. We can always move forward at each decision point without backtracking.

We also get some features for “free” with this algorithm. Detecting unused patterns is simple. When you merge a new pattern into an existing tree, just check to see if the resulting tree differs from the original. If it doesn’t, then it means that all of the new pattern’s cases were covered already, ergo it’s unused.

It’s also easy to identify non-exhaustive sequences of patterns. You just need to traverse the tree until you find a node without a wildcard (or, for unions or enums, until you find a node missing one of its cases).

This tree also represents everything you need for runtime instructions. It tells you exactly how to scan the data to find each possible match.
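The ideas above can be sketched in OCaml for the simplest possible setting: rows of integer patterns, like the "match a, b, c" example. This is a toy model written for this article, not the Acutis implementation, and every name in it (`pat`, `tree`, `of_row`, `add`, `compile`, `run`, `exhaustive`) is an assumption. One detail is my own simplification: when a new literal branch would be an exact copy of the wildcard branch, the sketch skips it, so that comparing the tree before and after a merge reliably reveals unused rows.

```ocaml
(* A pattern over integers: a literal or a wildcard. *)
type pat = Lit of int | Wild

(* Each Switch tests one value; [wildcard] is the branch for anything else. *)
type tree =
  | Exit of int
  | Switch of { cases : (int * tree) list; wildcard : tree option }

(* The tree for a single row of patterns. *)
let rec of_row pats exit =
  match pats with
  | [] -> Exit exit
  | Lit i :: rest -> Switch { cases = [ (i, of_row rest exit) ]; wildcard = None }
  | Wild :: rest -> Switch { cases = []; wildcard = Some (of_row rest exit) }

(* Merge a later row into a tree. Earlier rows keep priority. *)
let rec add tree pats exit =
  match (tree, pats) with
  | Exit _, _ | _, [] -> tree
  | Switch { cases; wildcard }, Lit i :: rest ->
      let cases =
        if List.mem_assoc i cases then
          List.map (fun (j, t) -> if j = i then (j, add t rest exit) else (j, t)) cases
        else
          match wildcard with
          | None -> cases @ [ (i, of_row rest exit) ]
          | Some w ->
              (* Expansion: a new literal branch starts as a copy of the
                 wildcard branch. Skip it if the row adds nothing to it. *)
              let sub = add w rest exit in
              if sub = w then cases else cases @ [ (i, sub) ]
      in
      Switch { cases; wildcard }
  | Switch { cases; wildcard }, Wild :: rest ->
      (* Expansion: a wildcard row is copied into every existing branch. *)
      let cases = List.map (fun (j, t) -> (j, add t rest exit)) cases in
      let wildcard =
        match wildcard with
        | None -> Some (of_row rest exit)
        | Some w -> Some (add w rest exit)
      in
      Switch { cases; wildcard }

(* Build a tree, numbering rows from 0. A row that leaves the tree unchanged
   was already covered by earlier rows, so it is reported as unused. *)
let compile rows =
  match rows with
  | [] -> invalid_arg "compile: no rows"
  | first :: rest ->
      let step (exit, tree, unused) row =
        let tree' = add tree row exit in
        (exit + 1, tree', if tree' = tree then exit :: unused else unused)
      in
      let _, tree, unused = List.fold_left step (1, of_row first 0, []) rest in
      (tree, List.rev unused)

(* Runtime: each value is tested once, with no backtracking. *)
let rec run tree values =
  match (tree, values) with
  | Exit exit, _ -> Some exit
  | Switch _, [] -> None
  | Switch { cases; wildcard }, v :: rest -> (
      match List.assoc_opt v cases with
      | Some t -> run t rest
      | None -> Option.bind wildcard (fun t -> run t rest))

(* Literals can never cover every integer, so each node needs a wildcard. *)
let rec exhaustive = function
  | Exit _ -> true
  | Switch { wildcard = None; _ } -> false
  | Switch { cases; wildcard = Some w } ->
      exhaustive w && List.for_all (fun (_, t) -> exhaustive t) cases

let rows =
  [ [ Lit 10; Lit 11; Lit 12 ];
    [ Wild; Lit 21; Lit 22 ];
    [ Lit 30; Lit 31; Lit 32 ];
    [ Lit 30; Wild; Lit 42 ];
    [ Wild; Wild; Wild ] ]

let tree, unused = compile rows

let () =
  assert (run tree [ 10; 21; 22 ] = Some 1);
  assert (run tree [ 30; 31; 42 ] = Some 3);
  assert (unused = []);
  assert (exhaustive tree)
```

Dropping the final all-wildcard row makes `exhaustive` return false, and appending a duplicate of any earlier row makes `compile` report it as unused. Unions and enums would need the extra check the text mentions (a node missing one of its cases), which this integer-only sketch leaves out.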

So we have one algorithm that gives us dead code analysis, non-exhaustive match analysis, and runtime instructions all at once. What’s the downside?

Pattern-matching aficionados have probably noticed a glaring issue with this strategy. Most compilers avoid wildcard expansion because it leads to an exponential explosion in output size. Just look at the graph above and how many times the paths to four cases get multiplied. While this is a real concern (my own tests in the Acutis repository produce some quite large outputs), I decided that it was a cost I could afford. Acutis’ practical application is with simple patterns. The problematic templates are mostly pathological, and the mildly concerning ones have not caused any issues in practice yet.

It’s also not an easy algorithm to implement. It’s long, is difficult to debug, uses advanced OCaml features, and is brittle to changes. But, from what I’ve seen, pattern-matching compilation is complex no matter what algorithm you use. So I’ll take it.

Either way, it took me long enough to figure this one out, so I’m not in a rush to invent a new algorithm all over again.

Template components, or: why Acutis doesn’t have functions

Acutis templates can include other templates, otherwise known as components. It does so with an XML-inspired key-value syntax with a closing slash.

{% Header title="My title" / %}

{% Body %}
  Content can go inside a component like this.
{% /Body %}

In the above example, a Header and Body template are both included using the data supplied to their parameters (“props”). Components in Acutis only have access to the data they’re directly supplied; they don’t automatically inherit everything in their parent template. I like to keep everything simple and explicit.

The Acutis language lacks something that’s standard in just about every other template language: functions, filters, shortcodes, or whatever else other languages call them. In Acutis, template components are analogous to functions, and they can be implemented as actual functions in your language of choice. You just need a function that returns a string (or a promise of a string for async templates), and then you can use the Acutis API to define your function’s type scheme.

I believe this design gives Acutis a pragmatic balance to its limited power. If you want to do something that requires a general-purpose language with more expressiveness (and you will eventually), then you can just call one and do whatever you need with that.

After enough trouble building this language that simply prints text, I’m happy to outsource bigger features to the professionals.

The many Acutis runtimes, or: how I added more languages into my language

The data structure that represents a compiled Acutis template neatly maps to runtime instructions. This made it simple for me to compose a set of functions that takes this structure, takes some input data, and “folds” across it to render the template.

But there was one problem. I thought this was too simple. A mere interpreter like this seemed hardly fitting for a statically-typed, compiled language like Acutis. What was the fun of going through all of these compile steps if the end result was just a simple fold function? Surely I could do better.

I wanted a compile target, and so I began work on a new module that takes that same data structure and maps it to another data structure that represents JavaScript commands. But soon I had more problems. Representing JavaScript (or any language) as a data structure is quite tedious and error-prone. I had to juggle two incompatible runtimes along with a structure which needed to fit both of them. What happens when I change some internal detail in the pattern-matching, or when I decide to add yet-another compile target?

Fortunately, we have an elegant way to unify all of this together. Using “tagless final” style, we can write only one set of folding functions and let them either execute the code, print JavaScript, or do anything else we need.

[Diagram: External JSON-like data → Decodes into internal data structure → Executes template instructions → Result text. External components also feed into the "Executes template instructions" step. Both the decoding step and the executing step can produce an error message instead.]
A simplified overview of the Acutis runtime.

To do this, we have to write our functions using semantics for a new, abstract language. We then define those semantics in terms of OCaml functions. For example, we define an if function which takes a boolean value and two functions for the then and else paths. let is a function which takes an expression and a function for what we do with that expression’s result. And so on. You can see the entire semantics I use here.

Once we’ve defined language semantics and have written our folding functions that use them, the next step is to define implementations. Executing the semantics as OCaml code is easy. Our if function is simply fun b ~then_ ~else_ -> if b then then_ () else else_ (), a function wrapper around the built-in if expression. Executing our functions like this just renders the template.

Compiling to JavaScript means defining the language as a pretty-printer. Executing a function calls the OCaml Format module to print code. For example, our if function prints using an OCaml format string like this: "@[<hv 2>@[<hv 2>if (@,%a@;<0 -2>)@] {@ %a@;<1 -2>} else {@ %a@;<1 -2>}@]". (That prints if () {} else {} with indentation.)
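Here is a minimal sketch of the tagless-final idea, reduced to an if and an output statement. The `SEM` signature, the `Template` functor, and the `Eval` and `Js` modules are all names invented for this example, and the JavaScript variable `result` is an assumption too. The real Acutis semantics are far larger, and the format string here is deliberately plainer than the one quoted above.

```ocaml
(* A hypothetical, drastically reduced abstract language. *)
module type SEM = sig
  type 'a exp (* an expression that produces an 'a *)
  type stmt (* a statement *)

  val bool : bool -> bool exp
  val string : string -> string exp
  val if_ : bool exp -> then_:(unit -> stmt) -> else_:(unit -> stmt) -> stmt
  val emit : string exp -> stmt
end

(* A "template" written once, against the abstract language only. *)
module Template (S : SEM) = struct
  let greet has_met =
    S.if_ (S.bool has_met)
      ~then_:(fun () -> S.emit (S.string "Welcome back."))
      ~else_:(fun () -> S.emit (S.string "Nice to meet you."))
end

(* Interpretation 1: execute directly, appending output to a buffer. *)
module Eval = struct
  type 'a exp = 'a
  type stmt = Buffer.t -> unit

  let bool b = b
  let string s = s
  let if_ b ~then_ ~else_ = if b then then_ () else else_ ()
  let emit s buf = Buffer.add_string buf s
end

(* Interpretation 2: pretty-print JavaScript source with Format.
   %S prints an OCaml string literal, which is close enough to a
   JavaScript one for plain ASCII text like this. *)
module Js = struct
  type 'a exp = Format.formatter -> unit
  type stmt = Format.formatter -> unit

  let bool b ppf = Format.pp_print_bool ppf b
  let string s ppf = Format.fprintf ppf "%S" s

  let if_ b ~then_ ~else_ ppf =
    Format.fprintf ppf "if (%t) { %t } else { %t }" b (then_ ()) (else_ ())

  let emit s ppf = Format.fprintf ppf "result += %t;" s
end

module Run = Template (Eval)
module Print = Template (Js)

let rendered =
  let buf = Buffer.create 16 in
  Run.greet false buf;
  Buffer.contents buf

let javascript = Format.asprintf "%t" (Print.greet false)

let () =
  print_endline rendered;
  print_endline javascript
```

The `greet` function is written once and never mentions buffers or formatters. Swapping the module passed to `Template` is all it takes to go from rendering text to printing code.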

In practice, this composes and scales elegantly. Each component of this process is independent enough that we can modify each separately, and adding new features means just writing a couple more functions.

Using Acutis in the real world, or: how I finally made a website out of this

With all of that machinery in place, we can now write some templates to make some web pages.

Building a website requires more than just templates, though. So I called upon the Eleventy static-site generator. Eleventy runs on Node.js and has a broad API that includes support for custom template engines.

We can package Acutis into JavaScript using the Js_of_ocaml compiler. And, thanks to Acutis template components being implementable as functions, we can call virtually anything in the Node.js ecosystem while building our site. (I use that for things like image optimization.)

This setup, compiling to JS and piggybacking on Eleventy, honestly adds more complexity than I’m completely comfortable with. Eleventy itself is very “JavaScript-y,” for lack of a better term, and it relies on quirky JS features which sometimes cause more harm than good. But it’s undeniably practical, and implementing a whole static-site generator is not something I’m interested in at this time. Ultimately, I’m just glad that the Acutis code is flexible enough that we can plug it into other programs like this with minimal fuss. If it works, it works.

The end, for now

And that’s how I built the web page that published this text. As usual for these kinds of projects, it took several years of work to have a mostly-decent program that does a task that I could have done with an existing tool in a fraction of the time.

Do you want to use Acutis for your OCaml or JavaScript projects? I’m not actively encouraging people to work with it, but that’s only because I don’t have the time to respond to support questions. I also recognize that everything about its design is based on my personal needs. If you think you could use Acutis for something, or if you want to learn something from its code, then go ahead; it’s under an open-source license for a reason.

Meanwhile, I need to use Acutis to build a few thousand more websites to eventually get a return on my time investment. Or maybe the real reward was the esoteric compiler knowledge I gained along the way.