What we're thinking about.
Short, honest notes on software delivery, AI systems, local models, data pipelines, and the choices that actually matter once something has to work in production.
Local LLMs
10 April 2026 · 4 min read
Gemma's second act: not ready for agents, but worth watching
We spent a week running the new Gemma models through the same agent workloads we push to Claude every day. Short version: if you were hoping to pull your Anthropic bill to zero and run everything locally, you're not there yet. But the trajectory is interesting.
Read the post →
AI Tooling
11 April 2026 · 3 min read
Block's Goose: an open-source coding agent worth running locally
Block's open-source AI coding agent Goose has been picking up serious momentum. It runs as a local agent on your machine, connects to your tools via MCP, and handles multi-step engineering tasks with less hand-holding than most alternatives. Here's what we've found running it.
Read the post →
AI Research
12 April 2026 · 4 min read
Karpathy's AutoResearch and what it could mean for neuroscience data
Andrej Karpathy's AutoResearch project points toward a future where AI agents run the grunt work of scientific investigation autonomously. I've been thinking about what that looks like applied to OpenNeuro, one of the largest publicly available collections of human neuroscience data.
Read the post →
LLMs
8 April 2026 · 5 min read
What is a Large Language Model (LLM)? A practical guide for Australian businesses
A Large Language Model (LLM) is an AI system trained on massive amounts of text to predict, generate, and reason with language. LLMs power tools like ChatGPT, Claude, and Gemini. Here's what Australian businesses actually need to understand about how they work and where they fit.
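As a quick taste of the core idea (not from the post itself): at heart, a language model predicts the next token from what came before. This toy sketch does it with bigram counts, where a real LLM uses billions of learned parameters, but the objective is the same.

```python
from collections import Counter, defaultdict

# Count, for each word, which words follow it in a tiny corpus.
corpus = "the cat sat on the mat the cat slept".split()
follows = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    follows[a][b] += 1

def predict_next(word):
    # Predict the most frequent follower — next-token prediction by counting.
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in the corpus
```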
Read the post →
AI Agents
7 April 2026 · 5 min read
How AI agents work: a plain-English guide for Australian businesses
An AI agent is a system that uses a language model as its reasoning engine, gives it access to tools (APIs, databases, browsers), and lets it plan and act across multiple steps to complete a goal. Here's what that means in practice for Australian businesses looking to automate complex workflows.
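A minimal sketch of that plan-act-observe loop, with the LLM call stubbed out (`plan_step` and the `lookup` tool are illustrative names, not a real framework):

```python
def plan_step(goal, history):
    # Stand-in for an LLM call: a real agent would prompt the model here
    # with the goal and the tool results gathered so far.
    if not history:
        return ("lookup", goal)      # no information yet: use a tool
    return ("finish", history[-1])   # enough information: report back

TOOLS = {"lookup": lambda q: f"result for {q!r}"}

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):        # cap steps so the agent can't loop forever
        action, arg = plan_step(goal, history)
        if action == "finish":
            return arg
        history.append(TOOLS[action](arg))  # execute the tool, feed result back
    return None

print(run_agent("quarterly revenue"))
```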
Read the post →
AI Agents
6 April 2026 · 5 min read
What is RAG (Retrieval-Augmented Generation) and when should you use it?
RAG (Retrieval-Augmented Generation) is a technique that gives an LLM access to a knowledge base at query time, so it can answer questions using your data, not just what it learned during training. It's the standard architecture for internal AI search, document Q&A, and knowledge management systems.
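The retrieve-then-generate pattern in miniature: score documents against the query and prepend the best match to the prompt. This sketch uses word overlap as the score; production RAG swaps that for embedding similarity, but the shape is the same.

```python
DOCS = [
    "Leave requests are approved by your direct manager.",
    "Invoices over $10,000 require two signatures.",
]

def retrieve(query):
    # Pick the document sharing the most words with the query.
    words = set(query.lower().split())
    return max(DOCS, key=lambda d: len(words & set(d.lower().split())))

def build_prompt(query):
    # The LLM answers from the retrieved context, not just its training data.
    return f"Context: {retrieve(query)}\nQuestion: {query}\nAnswer:"

print(build_prompt("who approves leave requests"))
```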
Read the post →
ML Fundamentals
5 April 2026 · 4 min read
What is an MLP (Multi-Layer Perceptron)? The foundational neural network explained
A Multi-Layer Perceptron (MLP) is the simplest form of neural network: layers of neurons connected by weights, trained to map inputs to outputs. Despite being one of the oldest neural architectures, MLPs remain useful for tabular data, classification, and regression tasks where more complex architectures are overkill.
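The whole forward pass fits in a few lines of NumPy: weighted sum, nonlinearity, weighted sum. The weights below are set by hand (rather than trained) so the two-layer network computes XOR, something a single layer famously cannot.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

# Hand-picked weights: hidden layer, then output layer.
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0], [-2.0]])

def mlp(x):
    h = relu(x @ W1 + b1)    # hidden layer: weighted sum + nonlinearity
    return (h @ W2).ravel()  # output layer: weighted sum of hidden units

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(mlp(X))  # XOR of each input pair: [0. 1. 1. 0.]
```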
Read the post →
ML Fundamentals
4 April 2026 · 4 min read
What is a CNN (Convolutional Neural Network)? How convolutions learn features
A Convolutional Neural Network (CNN) is a neural network architecture designed to process grid-structured data like images by learning local spatial features using convolutional filters. CNNs are the reason modern computer vision works, from satellite imagery analysis to document classification.
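To make "learning local spatial features" concrete, here is a single 3×3 convolution applied by hand. The filter is a fixed Sobel kernel rather than a learned one, but it shows what a CNN filter does: respond where a local pattern (here, a vertical edge) appears.

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each output is a local weighted sum.
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((5, 5))
image[:, 2:] = 1.0                    # right half bright: a vertical edge
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
# Nonzero only where the window straddles the edge.
print(conv2d(image, sobel_x))
```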
Read the post →
ML Fundamentals
3 April 2026 · 5 min read
Transformer architecture explained: the model behind every modern LLM
The transformer is the neural network architecture that powers GPT, Claude, Gemini, and every major LLM. Introduced in the 2017 paper 'Attention Is All You Need', it replaced recurrent networks with self-attention, enabling parallel training and scaling to unprecedented sizes.
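The core operation, scaled dot-product self-attention, is small enough to sketch in NumPy (single head, random weights for illustration): every position attends to every other in one matrix multiply, which is what made parallel training possible.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                              # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                 # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one updated vector per token
```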
Read the post →
ML Fundamentals
2 April 2026 · 4 min read
What is an LSTM (Long Short-Term Memory)? Sequential modelling explained
An LSTM (Long Short-Term Memory) is a type of recurrent neural network designed to learn patterns in sequential data over long time spans. Before transformers dominated the field, LSTMs were the architecture of choice for time-series forecasting, language modelling, and any task where order and history matter.
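One LSTM cell step in NumPy (random weights, for illustration): the forget, input, and output gates decide what the cell state keeps, adds, and exposes, which is how the architecture carries information across long sequences.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    z = np.concatenate([x, h]) @ W + b  # all four gate pre-activations at once
    f, i, o, g = np.split(z, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # forget old, add new
    h_new = sigmoid(o) * np.tanh(c_new)               # expose part of the cell
    return h_new, c_new

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(n_in + n_hid, 4 * n_hid))
b = np.zeros(4 * n_hid)
h = c = np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):   # run the cell over a 10-step sequence
    h, c = lstm_step(x, h, c, W, b)
print(h.shape)  # (4,): the final hidden state summarises the sequence
```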
Read the post →
ML Fundamentals
1 April 2026 · 3 min read
What is a BiLSTM (Bidirectional LSTM) and when does bidirectionality matter?
A BiLSTM (Bidirectional LSTM) runs two LSTM layers over the same sequence: one forward (left to right) and one backward (right to left). The outputs are concatenated, giving the model context from both directions. It's the standard choice when full sequence context matters and you don't need to generate text left to right.
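Bidirectionality is easiest to see with a simplified recurrent cell standing in for the full LSTM: run one pass left to right, another right to left, re-align, and concatenate per position.

```python
import numpy as np

def rnn_pass(xs, W, U):
    # A plain tanh RNN stands in for an LSTM here; the wiring is what matters.
    h = np.zeros(U.shape[0])
    outs = []
    for x in xs:
        h = np.tanh(x @ W + h @ U)   # hidden state carries history forward
        outs.append(h)
    return np.stack(outs)

rng = np.random.default_rng(0)
xs = rng.normal(size=(6, 3))                 # 6 timesteps, 3 features
W, U = rng.normal(size=(3, 4)), rng.normal(size=(4, 4))

fwd = rnn_pass(xs, W, U)                     # left-to-right context
bwd = rnn_pass(xs[::-1], W, U)[::-1]         # right-to-left, re-aligned
bi = np.concatenate([fwd, bwd], axis=-1)     # both directions per timestep
print(bi.shape)  # (6, 8): each position sees its past and its future
```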
Read the post →
ML Fundamentals
31 March 2026 · 4 min read
What is ResNet (Residual Network) and what did residual connections solve?
ResNet (Residual Network) introduced skip connections that let gradients flow directly through deep networks, solving the degradation problem that made very deep neural networks harder to train, not better, before 2015. ResNet-50 and its variants remain widely used in production computer vision systems today.
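A residual block in NumPy makes the trick visible: the output is f(x) + x, so even when the learned transform f contributes nothing, the identity path carries the signal (and, during training, the gradient) straight through.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def residual_block(x, W1, W2):
    out = relu(x @ W1)       # learned transform, first layer
    out = out @ W2           # second layer, no activation yet
    return relu(out + x)     # skip connection: add the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(8,))
W1 = np.zeros((8, 8))        # a "useless" learned transform: all zeros
W2 = rng.normal(size=(8, 8))

# With a zeroed transform the block reduces to relu(x): the identity
# path alone carries the signal, so depth can't make things worse.
print(np.allclose(residual_block(x, W1, W2), relu(x)))  # True
```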
Read the post →
ML Fundamentals
30 March 2026 · 4 min read
What is CatBoost and why is it still the go-to for tabular data in production?
CatBoost is a gradient boosting library developed by Yandex that handles categorical features natively, trains quickly, and often outperforms neural networks on structured tabular data. It's one of the most reliable models for business prediction tasks in finance, retail, and resource industries.
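To show what gradient boosting actually does (this is the generic technique, not CatBoost's API — CatBoost adds ordered boosting and categorical handling on top), here is a toy one-dimensional version: start from the mean, then repeatedly fit a depth-1 stump to the residuals and add a damped copy of it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 200)
y = np.where(X < 5, 1.0, 3.0) + rng.normal(0, 0.1, X.size)  # noisy step

def fit_stump(X, resid):
    # Best single-threshold split of the residuals (a depth-1 tree).
    best = None
    for t in np.unique(X):
        left, right = resid[X < t], resid[X >= t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(X < t, left.mean(), right.mean())
        sse = ((resid - pred) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]

pred = np.full_like(y, y.mean())   # round 0: predict the mean everywhere
lr = 0.5                           # learning rate damps each correction
for _ in range(20):
    t, lv, rv = fit_stump(X, y - pred)        # fit the current residuals
    pred += lr * np.where(X < t, lv, rv)      # add the damped correction

print(round(float(np.mean((y - pred) ** 2)), 3))  # MSE approaches the noise floor
```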
Read the post →