My Current AI & Dev Tech Stack
What tools, models and platforms I use to build and ship automations and projects
It's been a little while since my last post, and with a welcome influx of new subscribers (thank you!), it feels like the perfect time to tackle a question I get asked frequently: what specific tools and AI models am I actually using day-to-day?
If you've been following the AI space, you know things change fast. The releases in 2025 alone are a perfect example of this:
January:
DeepSeek R1
OpenAI o3-mini
February:
Anthropic Claude Sonnet 3.7
OpenAI GPT-4.5
March:
Google Gemini 2.5 Pro
April:
OpenAI o3 & o4-mini
What seems like the best model or tool one week might be superseded the next. Because of this constant change, my approach isn't about blind loyalty to any single company or platform. Instead, it's grounded in flexibility. It's about understanding what different tools do well (and where they fall short), how they feel to integrate into a workflow, and crucially, retaining the freedom to adapt. Getting locked into one ecosystem right now can be a real disadvantage.
So, consider this less a definitive list of "the best" tools, and more a snapshot of my current thinking. It’s about the why behind the what – how I piece together tools to build things, automate tasks, and navigate this evolving landscape.
Moving Fast with No-code Tools: Prototyping Workflows Before Coding
Before I even think about writing custom code for a new process or automation, I almost always start with visual automation platforms. Why? Speed and simplicity, especially early on. These tools let you connect different apps and services visually, like drawing a flowchart. Need to grab data from an email, send it to a spreadsheet, and then notify someone in Slack? You can often build a working version of that in minutes.
Another big factor is that with AI models getting cheaper and more capable, there's a lot of low-hanging fruit when it comes to automations. You don't always need AI agents. A great example of that: speech-to-text models have gotten a lot better, and the results become even more useful when they're fed into an LLM. Off the top of my head, a few use cases that only require a few steps (a small code sketch follows the list):
Quickly logging a sales call
Documenting a project update
“Writing” a report
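To make the first use case concrete, here is a minimal Python sketch of "quickly logging a sales call" without a visual tool. It assumes the OpenAI Python SDK; the model names, file name, and Slack webhook URL are placeholders, not recommendations.

```python
# Sketch: voice memo -> transcript -> structured summary -> Slack message.
# Assumes the OpenAI Python SDK; model names and webhook URL are placeholders.
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Speech to text
with open("sales_call.mp3", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

# 2. Feed the transcript into an LLM for a structured write-up
summary = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "Summarise this sales call: outcome, objections, next steps."},
        {"role": "user", "content": transcript.text},
    ],
)

# 3. Post the result where the team will see it
requests.post(
    "https://hooks.slack.com/services/XXX",  # placeholder webhook
    json={"text": summary.choices[0].message.content},
)
```

In a tool like Make.com or N8N, each of those three steps is just a node you drag in, which is exactly why these flows take minutes rather than days.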
A massive advantage of these tools is that they handle the messy parts of talking to other software: authentication (logging in securely) and the specific requirements of different APIs are mostly pre-configured, and you don't have to host any code yourself. Below is an example of a document analyzer that takes about 30 minutes to set up and can send different messages based on the contents of an invoice that lands in Dropbox.
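In the visual builder this is a short chain of nodes; as a rough Python equivalent of the same logic (the Dropbox trigger, model name, and notification targets are all placeholders), it boils down to "extract text, ask the model a question, branch on the answer":

```python
# Sketch of the invoice-analyzer logic: classify an invoice, then branch.
# The Dropbox trigger, model name, and notification targets are placeholders.
from openai import OpenAI

client = OpenAI()

def route_invoice(invoice_text: str) -> str:
    """Ask the model whether this invoice needs attention; return a routing label."""
    reply = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {"role": "system", "content": "Answer with exactly one word: URGENT if the invoice is overdue or above 10000 EUR, otherwise ROUTINE."},
            {"role": "user", "content": invoice_text},
        ],
    )
    return reply.choices[0].message.content.strip().upper()

label = route_invoice(open("invoice.txt").read())
if label == "URGENT":
    print("Notify finance immediately")   # e.g. a direct Slack message
else:
    print("Log it and move on")           # e.g. append a row to a spreadsheet
```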

These platforms are also incredible for rapid prototyping and experimentation. If a client has an idea for a workflow, or if I want to test if connecting Service A to Service B is feasible, I can quickly build it in a tool like Make.com or N8N.
Make.com is powerful and relatively user-friendly, great for handling complex data visually, though its pricing can sometimes push you towards less readable workflows.
N8N offers more control, feels closer to coding, and can be self-hosted (a plus for cost and privacy), but requires a bit more setup; its active community and templates are a big help here (N8N uses a source-available "Sustainable Use License" with some commercial restrictions).
Even Zapier, while simpler and potentially more costly for complex tasks, is great for quickly validating basic connections between common apps.
Example of a flow that would get expensive in Make.com: analysing and collecting messages in Discord, checking whether each one has already been seen, and passing on only the new messages (sketched below):
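The reason this gets pricey in Make.com is that every message and every dedup check counts as an operation. As a rough Python sketch of the same logic (the channel ID, bot token, storage file, and downstream handler are placeholders), the "have we seen this already?" step is just a set lookup:

```python
# Sketch: collect new Discord messages, skip ones already seen, pass the rest on.
# Channel ID, bot token, and the downstream handler are placeholders.
import json, os, requests

CHANNEL_ID = "123456789"  # placeholder
HEADERS = {"Authorization": f"Bot {os.environ['DISCORD_TOKEN']}"}
SEEN_FILE = "seen_ids.json"

seen = set(json.load(open(SEEN_FILE))) if os.path.exists(SEEN_FILE) else set()

resp = requests.get(
    f"https://discord.com/api/v10/channels/{CHANNEL_ID}/messages",
    headers=HEADERS,
    params={"limit": 50},
)
resp.raise_for_status()

for message in resp.json():
    if message["id"] in seen:
        continue
    print("Pass on:", message["content"])  # e.g. feed into an LLM or another workflow
    seen.add(message["id"])

json.dump(sorted(seen), open(SEEN_FILE, "w"))
```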
Using these tools, I can get something working, show it to a client, get feedback, and iterate fast. We can tweak the logic, change the steps, and see the results almost immediately. Once we've landed on a workflow that everyone's happy with and that proves its value, then we can decide if it needs the robustness, performance, or specific control that comes with a custom-coded solution. These visual tools act as a stepping stone, letting us validate ideas in the real world before committing significant development resources.
While newer, AI-focused automation tools (like Relevance AI, Relay.app) are popping up, I tend to rely on the stability and broad integrations of these established players. N8N also does a great job keeping up with the fast pace of change, making it a good option for AI-native workflows.
Building Something New with Code
Once a workflow is validated or if a project requires custom functionality from the start, it's time to write code. Here’s how I typically navigate that process:
Work on a Project Description
Before diving headfirst into writing code, I almost always take some time to discuss the project with an LLM. I outline the idea and what I'm hoping to do. There are a few things that are very important to take into consideration:
First of all, when working on the cutting edge, understand that the model may not know some things because its training data doesn't go that far. It can search the web to help, but it won't natively understand that GPT-4 is a model nobody uses anymore.
Second, if you're diving into something you don't know much about yourself, give the model room to correct you. This can be done easily by adding a line or two to your prompt, like: "I'm thinking of doing it through X and Y. Is this a good idea, or is there a better way?"
Discuss edge cases, challenges, and what success would look like. Outline all of it, and only once you've got a good idea, move on. Any extra time spent in this phase will save many more hours down the line. Below is a rough idea of what that opening prompt can look like.
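This is just an illustration of the framing, not a template; the project, wording, and model name are placeholders, and the same prompt works equally well pasted into a chat window instead of sent through the API.

```python
# Sketch of the kind of planning conversation I start before writing any code.
# The project description and model name are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

planning_prompt = """
I want to build a tool that watches a Dropbox folder for invoices and posts a summary to Slack.
I'm thinking of doing it with a polling script and a webhook. Is this a good idea, or is there
a better way? Please push back if you disagree, and tell me when you're unsure rather than guessing,
since some of the tools involved may be newer than your training data.

Before we write any code, let's agree on: edge cases, failure modes, and what "done" looks like.
"""

reply = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": planning_prompt}],
)
print(reply.choices[0].message.content)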
The Coding Itself (With Careful AI Assistance):
My Main Coding Tool - Aider: I rely heavily on AI for coding, but not by blindly copy-pasting or using fully automated "agentic" systems. My primary tool is Aider. Why? Freedom and Control. It's open-source and works with any AI model (GPT-4.1, Gemini 2.5 Pro, Sonnet 3.7, local models, etc.), letting me pick the best brain for the specific task – crucial when capabilities change constantly. More importantly, it gives me control. I tell it exactly which files to read and which to edit. I see every proposed change and approve it. It integrates with Git, making every AI edit a trackable commit that's easy to undo.
I have played with other tools like Cursor and Windsurf, but I always end up back at Aider as I feel most in control over every single aspect, from model selection to settings.
Why Not Fully Agentic Coders? Tools like Replit's agent or others aiming for full automation (like Lovable) are impressive for getting a starting point quickly. They can automate steps like running code, reading errors, and attempting fixes. However, their effectiveness is ultimately limited by the underlying LLM's intelligence. More critically, they lack granular control and easy-to-read insight into what has changed. When an agent tries to "solve" a problem autonomously, it can sometimes take shortcuts that look good on the surface but hide deeper issues. For example:
"Oh, one test is failing? Let's just delete that test! Problem solved!"
“There’s an error there! Let’s just hardcode a simulated response if it errors again; that way at least we can move on!”
Without careful oversight, these systems can lead you down the wrong path or produce brittle code. Once a project moves beyond the initial sketch, I need the transparency and explicit control Aider provides.
Where I Code - Replit & VS Code: For quick projects or coding on the go, Replit's online environment is hard to beat – no setup, accessible anywhere, and a clean environment every time. When working locally, I'll settle in with VS Code on my desktop, still using Aider in its integrated terminal.
The Foundation - Backend Made Easier: Most apps need somewhere to store data and manage users. Building this from scratch is often reinventing the wheel.
My Shortcuts - Supabase & Pocketbase: Tools like these are really helpful. They offer pre-built databases, user logins, and APIs. Using them lets me skip weeks of foundational backend work and jump straight to building the unique features of whatever I'm creating.
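As a rough idea of what "skipping the foundational work" looks like in practice, here is a minimal sketch using the supabase-py client; the table name, columns, and credentials are made up for illustration.

```python
# Sketch: auth + data storage with Supabase instead of a hand-rolled backend.
# URL, key, and the "projects" table are placeholders for illustration.
import os
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

# User signup comes for free
supabase.auth.sign_up({"email": "test@example.com", "password": "a-strong-password"})

# So does a database with an API on top
supabase.table("projects").insert({"name": "Invoice analyzer", "status": "prototype"}).execute()
rows = supabase.table("projects").select("*").eq("status", "prototype").execute()
print(rows.data)
```

Pocketbase gives you a very similar starting point in a single self-hosted binary, which is why I reach for one of the two rather than writing auth and CRUD endpoints from scratch.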
The Blueprint - Keeping Track with GitHub: Code needs a home where changes are tracked. GitHub (or similar) is pretty much essential for saving versions, collaborating, and having the safety net to roll back if things go sideways. Integrates really well with Aider.
Getting it Live - Deployment with Railway: To make something a live application people can use, it needs to run somewhere.
My Preferred Platform - Railway: This is my current favorite place to deploy applications. It connects directly to GitHub and makes the process very smooth via a clean visual interface. Need a database? Deploying an open-source tool? Setting up automatic updates when I push code? Railway handles it elegantly. Their "Functions" feature is particularly useful for instantly deploying small AI-generated APIs. It just removes a lot of friction from the deployment process.
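For context, the "small APIs" I end up deploying are often nothing more than a file like the sketch below (FastAPI here purely as an example, and the endpoints are placeholders); Railway picks the repo up from GitHub and runs it with a start command.

```python
# Minimal example of the kind of small API I end up deploying.
# Railway would run this via a start command such as:
#   uvicorn main:app --host 0.0.0.0 --port $PORT
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
def health():
    # Simple liveness check for the deployment platform
    return {"status": "ok"}

@app.post("/summarize")
def summarize(payload: dict):
    # Placeholder: a real version would call an LLM on payload["text"]
    return {"summary": payload.get("text", "")[:200]}
```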
The AI Brains: Choosing Models Based on Behaviour, Not Just Benchmarks
Underpinning tools like Aider, or used directly for analysis and generation, are the Large Language Models (LLMs). My selection process here is fluid and pragmatic. Here's the thing: right now, the top-tier models (Gemini 2.5 Pro, o3, Sonnet 3.7, GPT-4.1) are often remarkably close in raw capability or "smarts" on many benchmarks.
This convergence makes their behaviour – their personality, quirks, and interaction style – the critical deciding factor. A few percentage points on a benchmark mean little if the model is frustrating to work with or ill-suited to your task's style. If one model was significantly ahead in intelligence, you might tolerate its flaws. But when they're this close, usability and how the model actually interacts become paramount.
My Current Roster & Reasoning:
Gemini 2.5 Pro (Google): My current favourite, the reliable all-rounder. Great for coding, writing, and general tasks. Its thinking process is insightful, it balances detail well, and crucially, it offers constructive pushback, acting more like a collaborator. Its biggest strength, aside from simply high-quality output, might be its lack of major annoying quirks – it just works smoothly. It also stays capable even when using lots of context.
o3 (OpenAI): The specialist for heavy lifting. When faced with truly complex problems requiring deep reasoning or intricate code generation that stumps other models, o3 is often the one that can push through. It excels at multi-step analysis and finding solutions others miss. Its directness can sometimes feel less collaborative, but for raw power on tough challenges (including deep-dive web searches), it's my go-to.
GPT-4.1 (OpenAI): The meticulous rule-follower. If I need an AI to follow a complex set of instructions to the letter, especially for automation, GPT-4.1 (Regular or Mini) is unparalleled. It's a specialist for tasks demanding precision and predictability.
Claude Sonnet 3.7 (Anthropic): The (sometimes overly) enthusiastic initiator. This model is brilliant when you want it to take initiative, explore tangents, and build things out proactively – think AI agent or "vibe coding" where you give it a rough direction and let it run. However, this same tendency makes it frustrating when you need precise control; it often adds unwanted complexity or ignores constraints.
Honorable Mentions: Beyond the main ones I use regularly, a few others deserve a shout-out for specific niches:
Mistral Small (24b): This model is impressively fast and surprisingly capable for its size. It's great for quick, smaller tasks like rewriting text, summarizing, or powering simple RAG chatbots where speed is key.
GPT-4.1 Mini (OpenAI): Phenomenal value for the price. It's particularly good at function calling (using tools) and excels at following complex, structured prompts. This makes it a strong contender for tasks like data extraction or classification where you need reliable output based on specific rules. In other words, perfect for automated tasks.
DeepSeek v3: Offers performance that gets remarkably close to the top-tier models but at a significantly lower cost. It's readily available through API providers like Fireworks.ai or Together.ai, making it an excellent budget-friendly option for demanding tasks without sacrificing too much quality.
The Takeaway on Models: Forget chasing the absolute "smartest" on a leaderboard. Focus on the job you need done and match it to the model whose behaviour fits best. Do you need obedience, collaboration, initiative, or raw power? Right now that's often the more important question.
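To make "match the job to the model" concrete, here is a toy sketch of how I think about it, using OpenRouter's OpenAI-compatible endpoint so one client can reach all of them. The model slugs and task labels are illustrative, not a recommendation, and the slugs may differ from OpenRouter's current naming.

```python
# Sketch: pick a model by the behaviour the task needs, not by leaderboard rank.
# Model slugs follow OpenRouter-style naming and are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

MODEL_FOR_TASK = {
    "general":         "google/gemini-2.5-pro",       # reliable all-rounder
    "hard_reasoning":  "openai/o3",                    # heavy lifting
    "strict_pipeline": "openai/gpt-4.1",               # follows instructions to the letter
    "exploratory":     "anthropic/claude-3.7-sonnet",  # takes initiative
    "cheap_and_fast":  "openai/gpt-4.1-mini",          # automation workhorse
}

def run(task_type: str, prompt: str) -> str:
    reply = client.chat.completions.create(
        model=MODEL_FOR_TASK[task_type],
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content

print(run("cheap_and_fast", "Classify this email as invoice / receipt / other: ..."))
```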
How to Find New Models: Check out OpenRouter's LLM Rankings. They rank models by how many tokens are used across different categories. It's not a measure of what's best, but it can surface new models worth trying.
AI Chat Interfaces
There are, of course, different platforms for talking to AI models, usually provided by the company that makes the model. ChatGPT, Gemini, Claude, Mistral's Le Chat: I've used them all and still do. However, I'm finding myself using MCPs (Model Context Protocol servers) more and more, whether that's NotionMCP to organize my projects and quickly store updates, or Make's and n8n's MCP integrations to quickly give an LLM access to almost any tool. Having an AI model communicate directly with a database works really well.
I'm currently using two chat platforms the most. The first is the Claude Desktop app, because of its MCP support. The second is Cherry Studio, which lets me plug in any API key and use different models, with decent MCP support; unfortunately, it's a little buggy at times. If anyone knows a good alternative chat interface that allows plug-and-play with different models and has good MCP support, please let me know!
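To illustrate why the MCP route appeals to me: exposing a tool of your own is only a few lines with the official MCP Python SDK, at least as I understand its FastMCP helper. The "project status" lookup below is a made-up placeholder tool.

```python
# Sketch of a tiny MCP server, based on my understanding of the official Python SDK.
# The "project status" lookup is a made-up placeholder tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("project-notes")

@mcp.tool()
def project_status(name: str) -> str:
    """Return the current status of a project by name."""
    fake_db = {"invoice-analyzer": "prototype validated, moving to code"}
    return fake_db.get(name, "unknown project")

if __name__ == "__main__":
    mcp.run()  # an MCP client like Claude Desktop connects to this over stdio
```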
Staying Adaptable
So, that's my current approach – a flexible toolkit assembled to navigate a landscape where AI tools change this quickly. It's built around choice, understanding the nuances of each piece, and prioritizing adaptability over allegiance to any single vendor. It works for me today, but ask me again in six months, and I guarantee parts of it will have changed.
What does your stack look like? How are you navigating this constant change? Any tools or workflows you’re finding particularly effective? I’d love to hear your thoughts!