writing
long-form pieces on the thing i keep coming back to: getting models to produce structure you can actually trust.
a 4-part series · neuro-symbolic LLM systems
Creating the first morphological analyser for Spoken Tamil
Spoken Tamil is hard for language models to understand. Come see why, and how to get an LLM to break a Spoken Tamil word into its parts without inventing parts that don't exist. The series builds from “what is this even?” to a working neuro-symbolic system, traced end to end. Pages are interactive, so you don't need to know how to read Tamil script or even know what morphology is.
-
01
→
What is morphological analysis?
Breaking a word into its smallest meaningful pieces — and why Spoken Tamil has no algorithm to do it. The interactive primer; no Tamil required.
-
02
→
Where a language model goes wrong
The same word, put to a capable model eight times: fluent, confident answers that don't agree, and the majority is wrong.
-
03
→
Teaching a model to admit it doesn't know
The fix: the model proposes, a curated knowledge graph disposes. A ReAct loop, typed tools, and a fully traced run.
-
04
→
The recipe: what it takes
The transferable method behind the system — six things that have to be true to make a model admit it doesn't know, with ReAct as the spine.