Blog

Welcome to my technical blog, where I share insights about:

  • AI Innovation
  • AI Tooling
  • AI System Implementation
  • Deep Learning Techniques

Stay Updated

You can connect with me or follow me to stay updated with AI technical insights, news, and best practices:

Subscribe to Updates

Connect with me on LinkedIn

Follow me on X (Twitter)

Micro blogging with SolveIt and SolveBlog

I am trying something new: micro blogging.

The plan is to write three or four short posts a week. After every interesting lesson, video or podcast I listen to, I will write something about it. Not a summary. More like what I took away, what I questioned, what connected to other things I have been thinking about. The writing itself is the point: it reinforces learning and recall in a way passive consumption does not.

Selection and Ensemble Strategies for Embedding Retrieval

In my previous post, I questioned why many RAG systems use embeddings 4-6x larger than necessary. Simple factual content plateaus at 256-512 dimensions. Complex technical content plateaus at 768 dimensions. Test with your own data and you will probably find you need far fewer dimensions than model and service providers' recommendations suggest.

What if you cannot afford even the optimal dimensions? This post covers three techniques I tested: domain-adaptive dimension selection, ensemble approaches and cascaded retrieval. These techniques help when you operate below optimal dimensions or need better performance within resource constraints.
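
To make the third technique concrete, here is a minimal sketch of one common form of cascaded retrieval: a cheap first pass over every document using only a prefix of each vector, followed by a full-dimension rescoring of the shortlist. This is an illustration under assumptions, not necessarily the setup tested in the post. The function names, `coarse_dims` and `shortlist_size` are hypothetical, and using a vector prefix as the coarse representation only makes sense for embeddings trained to tolerate truncation (for example, Matryoshka-style models).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cascaded_search(query, docs, coarse_dims, shortlist_size, top_k):
    """Return indices of the top_k documents using a two-stage search.

    Assumption: the first `coarse_dims` values of each vector are a usable
    low-dimensional embedding on their own.
    """
    # Stage 1: score every document on the truncated vectors (cheap).
    shortlist = sorted(
        range(len(docs)),
        key=lambda i: cosine(query[:coarse_dims], docs[i][:coarse_dims]),
        reverse=True,
    )[:shortlist_size]

    # Stage 2: rescore only the shortlist on the full vectors (expensive).
    return sorted(
        shortlist,
        key=lambda i: cosine(query, docs[i]),
        reverse=True,
    )[:top_k]


# Toy 4-dimensional vectors, purely illustrative.
docs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.1],
]
query = [1.0, 0.0, 0.0, 0.0]
top = cascaded_search(query, docs, coarse_dims=2, shortlist_size=3, top_k=2)
print(top)
```

The trade-off sits in `shortlist_size`: a larger shortlist recovers more of the full-dimension ranking but gives back some of the savings from the coarse pass.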

Match Embedding Dimensions to Your Domain, Not Defaults

Vector database costs scale with embedding dimensions. Most systems use 768-3072 dimensions. You may only need 256-768.
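
A quick back-of-the-envelope calculation shows how raw vector storage grows with dimension count. This is a rough sketch that assumes 32-bit floats (4 bytes per value) and ignores index overhead and metadata. The corpus size of 10 million vectors and the helper name `raw_vector_storage_gb` are hypothetical, chosen only for illustration.

```python
def raw_vector_storage_gb(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    """Raw storage for dense vectors in GB (10^9 bytes), excluding index overhead."""
    return num_vectors * dims * bytes_per_value / 1e9


NUM_VECTORS = 10_000_000  # hypothetical corpus size

for dims in (256, 768, 3072):
    size_gb = raw_vector_storage_gb(NUM_VECTORS, dims)
    print(f"{dims:>5} dims: {size_gb:.2f} GB")
```

Under these assumptions, storage is linear in the dimension count, so moving from 3072 to 256 dimensions cuts raw vector storage by a factor of 12.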

When you choose an embedding model, you skip the most important decision. You glance at MTEB leaderboards, check the costs, then move straight into chunking strategies and RAG architecture. Most practitioners treat the embedding model as just a configuration variable. But when your embedding model cannot match relevant chunks in the vector search step, everything downstream suffers.

The Human is the Agent: How SolveIt Changed My Programming Journey After 25 Years

I have been programming, off and on, for over 25 years, with a background in pre-AlexNet AI and experience as a technical reviewer for AI publications and courses. As an early adopter of AI tooling and LLMs, I thought I had a good sense of what AI could do for programming. Then I joined the first cohort of SolveIt students and something unexpected happened. Despite my experience with AI tools, SolveIt changed how I approach programming in ways I did not anticipate.

Improving LLM & RAG Systems: Essential Concepts for Practitioners

Building effective, production-ready LLM and RAG systems requires more than just theoretical knowledge. This intermediate guide outlines concepts and techniques for overcoming real-world implementation challenges, optimising performance, and ensuring system reliability. Whether you're scaling an existing deployment or planning your first production system, these essential insights will help you navigate the complexities of modern LLM and RAG architecture.

Speculative Decoding: Using LLMs Efficiently

Speculative decoding makes large language models (LLMs) work more efficiently.

Large language models are transforming how we write code, but running them efficiently remains a challenge. Even with powerful hardware, code completion can feel sluggish, breaking our concentration just when we need it most. The bottleneck isn't necessarily computational power - it's how efficiently we use it. This is where speculative decoding comes in.

How to Validate AI Solutions Before Committing Resources

The biggest risk in AI projects isn't the technology - it's the gap between expectation and reality. But what if you could validate your AI solution in days, not months?

When marketing and advertising agencies develop AI-powered concepts for clients, they face a practical challenge: how to validate technical feasibility before committing significant resources. Traditional approaches involving detailed specifications and lengthy proposals often prove inefficient with AI projects, where real-world performance can differ significantly from paper specifications.