Category: Blog

  • Your brain on ChatGPT: Is it making you dumber?

    Your brain on ChatGPT: Is it making you dumber?

    There’s a research paper that’s been making the rounds recently, a study by MIT’s Media Lab, that talks about the cognitive cost of using LLMs (Large Language Models that power AI apps like ChatGPT).

    In the study, researchers asked 54 participants to write essays. They divided them into 3 groups – one that could use ChatGPT, one that could only use a search engine like Google, and the third (Brain-only) that couldn’t use any tool.

    And, surprise surprise, they found that ChatGPT users had the lowest brain engagement, while participants who used only their brains to write the essays had the highest. ChatGPT users also had a harder time recalling quotes from the essay.

    No shit.

    Let’s leave aside the fact that 54 participants is a tiny sample and that writing an essay is maybe not a comprehensive test of cognitive load. The paper is essentially saying that if you use AI to help you think, then you are reducing the cognitive load on your brain. This is obvious.

    Look, if you use ChatGPT to write an entire article for you, without any input, then of course you’re not using your brain. And of course you’re not going to remember much of it, you didn’t write it!

    Does that mean it’s making you dumber? Not really.

    But it’s also not making you smarter. And that should be obvious to you too.

    Active vs passive mode

    AI is a tool, like any other, and there’s a right way to use it and a wrong way to use it.

    If I need to study a market to evaluate an investment opportunity, I could easily ask ChatGPT to run deep research on the market, and then write up a report. It would take a few minutes, as opposed to a few hours if I did it myself.

    Even better, I can ask ChatGPT to make an investment recommendation based on the report. That way I don’t even need to read it!

    But have I learned anything at all from this exercise? Of course not. The only work I did was write a prompt, and then AI did everything else. There was no input from me, not even a thought.

    Again, all of this is obvious, but it’s also the default mode for most people using AI. That’s why the participants in the study showed low levels of brain activity. They asked AI to do all the work for them.

    This is the passive mode.

    But there’s a better way, one where you can use AI to speed things up and also learn and exercise critical thinking.

    I call this active mode.

    Thinking with AI

    Any task can be broken down into steps that require critical thinking or creative input, and steps that don’t. In the market research example, searching doesn’t require critical thinking, but understanding the results and writing the report do.

    In active mode, we use AI to do the steps that don’t require critical thinking.

    We use ChatGPT Deep Research to find relevant information, but we read it. And once we read it, we figure out what’s missing and ask ChatGPT to search for that information.

    When we’re done understanding the market, we write the report and we ask ChatGPT to help us improve a sentence or paragraph. We decide what information to put into the report but we ask ChatGPT to find a source to back it up.

    And when we’re done, we ask ChatGPT to poke holes in our report, and to ask us questions that haven’t been covered. And we try to answer those questions ourselves, and go back to our research or ask ChatGPT to research more if we don’t have the answers.

    Writing a report, planning a project, building an app, designing a process: anything can be done this way, with you doing the critical thinking and creative work, and AI doing the rest.

    You just need to make this your default way of using AI.

    Practical Steps for Active AI Use

    Here’s how to make active mode your default:

    1. Start with Your Framework

    Before touching AI, spend 5-10 minutes outlining:

    • What you’re trying to accomplish
    • What you already know about the topic
    • What questions you need answered
    • How you’ll evaluate success

    This prevents AI from hijacking your thought process from the start.

    2. Use AI for Research

    Ask AI to find information, but don’t ask it to summarize that information without reading through it yourself.

    • Instead of: “What does this data mean for my business?”
    • Try: “Find data on customer churn rates in SaaS companies with 100-500 employees”

    Then draw your own conclusions about what the data means.

    That’s not to say you shouldn’t ask AI to analyze data. You absolutely should, but after you draw your own conclusions as a way to uncover things you’ve missed.

    3. Think Out Loud With AI

    Use AI as a sounding board for your thinking:

    • “I’m seeing a pattern in this data where X leads to Y. What other factors might explain this relationship?”
    • “My hypothesis is that Z will happen because of A and B. What evidence would support or contradict this?”

    4. Ask AI to Challenge You

    After developing your ideas, ask AI to poke holes:

    • “What assumptions am I making that might be wrong?”
    • “What questions haven’t I considered?”
    • “What would someone who disagrees with me say?”

    5. Use the 70/30 Rule

    Aim for roughly 70% of the cognitive work to come from you, 30% from AI. If AI is doing most of the thinking, you’re in passive mode.

    6. Maintain Ownership of Synthesis

    AI can gather information and even organize it, but you should be the one connecting dots and drawing conclusions. When AI offers synthesis, use it as a starting point for your own analysis, not the final answer.

    7. Test Your Understanding

    Regularly check if you can explain the topic to someone else without referencing AI’s output. If you can’t, you’ve been too passive.

    When Passive Mode Is Fine

    Active mode isn’t always necessary. Use passive mode for:

    • Getting quick background on unfamiliar topics
    • Formatting and editing tasks
    • Generating initial ideas to spark your own thinking
    • Routine tasks that don’t require learning or growth

    The Long Game

    The MIT study participants who relied entirely on AI showed less brain engagement, but they also completed their tasks faster. That’s the trade-off: immediate efficiency versus long-term capability development.

    In active mode, you might take slightly longer upfront, but you build knowledge, develop better judgment, and create mental models you can apply to future problems.

    The goal isn’t to avoid AI or to make every interaction with it a learning exercise. It’s to be intentional about when you’re thinking with AI versus letting it think for you.

    Think with AI, anon.

  • How I Write With AI (Without Creating Slop)

    How I Write With AI (Without Creating Slop)

    The best-performing post on this blog is a 20,000-word tome on the Google Agent Development Kit. Granted, maybe half the words are code samples, but without AI this would have taken me weeks to write. With AI, it was just a few days.

    Great articles, the kind that get shared in Slack channels, bookmarked for later, or ranked on Google or ChatGPT, don’t just happen. They require deep research, careful structure, compelling arguments, and that yo no sé qué quality we call a tone or voice.

    They need to solve real problems, offer genuine insights, and reflect the author’s hard-earned expertise.

    The traditional writing process goes something like this: ideation (where you wrestle with “what should I even write about?”), research (down the rabbit hole of sources and statistics), outlining (organizing your scattered thoughts into something coherent), drafting (the actual writing), editing (realizing half of it makes no sense), revising (again), and finally polishing (until you hate every word you’ve written).

    That’s a lot of work. For a 2,000-word post like the one you’re reading, probably a couple of days of work. And then AI came along and everyone thought they could short-circuit this process with “vibe marketing”, and now we have slop everywhere and no one wins.

    Stop serving slop

    The problem is that most people have fallen into one of two camps when it comes to AI writing:

    Camp 1: The AI Content Mills

    These are the people who’ve decided that if AI can write, then obviously the solution is to generate unlimited blog posts and articles with minimal human input. More content equals more traffic equals more success, right?

    They’re pumping out dozens of articles per week, each one a generic regurgitation of the same information you can find anywhere else, just rearranged by an algorithm.

    Who’s going to read this? It’s bots creating content for other bots. Any real human traffic that hits their site will take one look at it and then bounce.

    Camp 2: The One-Prompt Writers

    On the flip side, you’ve got well-meaning writers who heard AI could help with content creation, so they fired up ChatGPT and typed something like “write me a 2000-word article on productivity.”

    Twenty seconds later, they got back a wall of text that reads like it was written by an intern who’d never experienced productivity problems themselves, which, in a way, it was.

    Frustrated by the generic drivel, they declared AI “not ready for serious writing” and went back to their caves, doing everything the old way. They still create good content, but it takes too long and requires too many resources.

    Both camps are missing the point entirely. The problem isn’t AI itself. It’s over-reliance on automation without essential quality control measures in place. They’re both treating AI like a magic one-click content machine.

    The Missing Ingredient: Your Actual Brain

    Here’s a novel concept. What if humans want to read content that is new and interesting?

    Think about what you bring to the writing table that no AI can replicate… creativity, emotional intelligence, ethical reasoning, and unique perspectives. Your years of experience in your field. Your understanding of your audience’s real pain points. Your ability to connect seemingly unrelated concepts. Your voice, your humor, your way of explaining complex ideas.

    AI, meanwhile, excels at the stuff that usually makes you want to procrastinate, like processing vast amounts of information quickly, organizing scattered thoughts into logical structures, and generating that dreaded first draft that’s always the hardest part.

    Two entities with complementary skill sets. You and the AI. Like Luke and R2-D2.

    You’re the director, the editor, the strategic thinker, and the voice curator. AI is the research assistant and first-draft collaborator.

    You are what makes the content new and interesting. AI helps you shape it.

    My AI Writing Process

    Let me walk you through exactly how I’ve been using this collaboration to go from scattered thoughts to published article in 1-2 hours instead of a full day, while actually improving quality.

    Step 1: I Pick the Topic

    This is where your expertise and market understanding are irreplaceable. I don’t ask AI “what should I write about?” That’s a recipe for generic content that already exists everywhere else.

    Instead, I pick topics that genuinely interest me or that I think are timely and underexplored. For example, my piece on ChatGPT’s glazing problem, or my deep dive into Model Context Protocol.

    The blog post you’re reading right now came from a tweet (xeet?) I responded to.

    I start by doing what I call the “thesis dump.” I open a new chat in my dedicated Claude project for blog content and just brain-dump everything I think about the topic. Stream-of-consciousness thoughts, half-formed arguments, random observations, and whatever connections I’m seeing that others might not.

    Pro-tip: Create a Claude project specifically for blog content (or whatever type of content you write), upload samples of past work or work you want to emulate, and give it specific instructions on your writing style and tone.

    Pro-pro-tip: Use the voice mode on Claude’s mobile app or Wispr Flow on your computer to talk instead of type. And just ramble on, don’t self-edit.

    This dump becomes the foundation of everything that follows. It’s my unique perspective, my angle, my voice. The stuff that makes the eventual article mine rather than just another generic take on the topic.

    Step 2: AI Does the Research Legwork

    Now comes the part where AI really shines. AI excels at supporting literature review and synthesis, processing vast amounts of information that would take me hours to gather manually.

    I ask AI to research the topic thoroughly. Before Claude had web search, I’d use ChatGPT for this step. The key questions I want answered are:

    • What’s already been written on this topic?
    • What angles have been covered extensively?
    • What gaps exist in the current conversation?
    • What data, statistics, or examples support (or challenge) my thesis?

    This research phase is crucial because understanding the landscape helps you write something better than what already exists. I’m not looking to regurgitate what everyone else has said. I want to know what they’ve said so I can say something different, better, or more useful.

    The AI comes back with a comprehensive overview that would have taken me hours to compile. Sometimes it surfaces angles I hadn’t considered. Sometimes it finds data that strengthens my argument. Sometimes it reveals that my hot take has already been thoroughly explored, saving me from publishing something redundant.

    My WordPress is littered with drafts of posts I thought were genius insights, only to find out that smarter people than me had already covered everything on the topic.

    Step 3: Collaborative Outlining

    This is where the collaboration really starts to sing. I ask Claude to create an outline that brings my original thesis dump and the research it has gathered together.

    Here it becomes a cycle of drafting, editing, and reworking where I’m actively shaping the structure based on my strategic vision.

    “Move that section earlier.” “Combine these two points.” “This needs a stronger opener.” “Add a section addressing the obvious counterargument.” And so on.

    By the time I’m done with this back-and-forth, usually about 30 minutes, I’ve got something that looks like a mini-article. It’s got a clear structure, logical flow, and it’s heavily influenced by both my original thinking and the research insights. Most importantly, it already feels like something I would write.

    Step 4: Section-by-Section Development

    Now comes the actual writing, but in a much more manageable way. Instead of staring at a blank page wondering how to start, I work with AI to flesh out each section one by one.

    My guiding principle is to maximize information per word. Every section needs to drive home one key concept or argument, and it needs to do it efficiently. No padding, no fluff, no generic statements that could apply to any article on any topic.

    I’ll say something like, “For the section on why most AI content fails, I want to emphasize that it’s not the technology’s fault, it’s how people are using it. Include specific examples of both failure modes, and make sure we’re being concrete rather than abstract.”

    Just like with outline creation, I’m working with AI closely to refine each individual section. “Make this more conversational.” “Add a specific example here.” “This paragraph is getting too long, break it up.”

    I’ll also directly make edits myself. I add sentences or rewrite something completely. No sentence is untouched by me. AI handles the initial generation and helps maintain consistency, but I ensure the voice, examples, and strategic emphasis stay authentically mine.

    Step 5: The Critical Review

    Here’s a step most people skip, and it’s what separates good AI-human collaboration from slop.

    I ask AI to be my harshest critic.

    “Poke holes in this argument.” “Where am I not making sense?” “What obvious questions am I not answering?” “Where could someone legitimately disagree with me?” “What gaps do you see in the logic?”

    This critical review often surfaces weaknesses I missed because I was too close to the content. Maybe I’m assuming knowledge my readers don’t have. Maybe I’m making a logical leap without explaining it. Maybe I’m not addressing an obvious counterargument.

    I don’t blindly accept the AI’s critique though. Sometimes it gets it wrong, or I just don’t agree with it. But sometimes it gets it right and I fix the issues it identifies.

    Step 6: The Sid Touch

    Now comes the final step, no AI involved here. I go through the entire article, put myself in the reader’s shoes, and make sure it flows well. I make edits or change things if needed.

    I’ll also add a bit of my personality to it. This might be a joke that lightens a heavy section, a personal anecdote that illustrates a point, or just tweaking the language to sound more like how I actually talk.

    I call this the “Sid touch” but you can call it something else. Sid touch has a nice ring to it.

    “Hey did you finish that article on productivity?”

    “Almost! Just giving it a Sid touch.”

    See what I did there?

    Proof This Actually Works

    What used to take me the better part of a day now takes an hour or two tops if I’m being a perfectionist. But more importantly, the quality hasn’t suffered.

    I actually think it has improved because the research and outline process is more thorough. The structure is more logical because we’re iterating on it deliberately. The arguments are stronger because I’m actively testing them during the writing process.

    I started writing this blog in February this year and I’m already on track to go past 5,000 monthly visitors this month, with hundreds of subscribers. Not because I’m publishing a ton of content (I’m not), but because I’m combining AI’s data processing capabilities with my creativity and strategic thinking to create genuinely useful content.

    The Future of Content Writing

    If you’re thinking this sounds like too much work and you’d rather create a fully automated AI slop factory, I can promise you that while you may see some results in the short-term, you will get destroyed in the long term.

    Platforms will get better at filtering AI slop, just like they learned to handle email spam. It’s already starting to get buried in search results and ignored by readers.

    That means the writers who figure out effective human-AI collaboration now will have a massive competitive advantage. While others are either avoiding AI entirely or drowning their audiences in generic content, you’ll be creating genuinely valuable content faster than ever before.

    So here’s my challenge to you: audit your current writing process. Are you spending hours on research that AI could handle in minutes? Are you staring at blank pages when you could be starting with a solid structure? Are you avoiding AI because you tried it once and got generic results?

    Or maybe you’re on the other extreme, using AI to replace your thinking instead of amplify it?

    If so, try the process I’ve outlined. Pick a topic you genuinely care about, dump your thoughts, let AI help with research and structure, then work together section by section while keeping your voice and expertise front and center.

    Let me know how it goes!

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.

  • The Make.com Automation Guide for GTM and Operations

    The Make.com Automation Guide for GTM and Operations

    The CEO of Zapier recently announced that they have more AI agents working for them than human employees. That sounds exciting and terrifying, but the truth is that most of the “agents” he listed are really just simple automations (with some AI sprinkled in).

    He is, after all, promoting his own company, which he uses to build these automations.

    In this guide, I will show you how to build those same automations on Make.com. It’s designed for business owners, no-code enthusiasts, and consultants looking to automate real-world tasks in marketing, sales, and operations.

    The first thing you need to do is create a Make.com account. Sign up here to get one month free on the Pro plan.

    I’ve split this guide up into sections for Marketing, Sales, HR, Product, and Customer Support. The following automations are beginner-friendly, and the best way to learn is to follow the instructions and build them yourself.

    If you’re looking to build more complex AI Agents, I have a full guide here. I also have a free email course which you can sign up for below.

    High-Impact Use Cases for Marketing Teams

    Write a blog post, summarize it for LinkedIn, create social variations, send campaign results to the team, draft the newsletter…and repeat. Every. Single. Week.

    You’re drowning in content demands and half of it is grunt work that AI can now handle. Enter: Make.com + AI.

    This combo turns your messy marketing checklist into an elegant flowchart. You write once, and AI helps you remix, repurpose, and report across all your channels.

    Here’s what you can automate today:

    • Turn blog posts into LinkedIn content
    • Repurpose content into tweets, emails, or IG captions
    • Summarize campaign performance into Slack reports
    • Generate social variations for A/B testing
    • Create email copy from feature releases
    • Summarize webinars or podcasts for newsletters

    Let’s build a few of these together.

    Project 1: Blog to LinkedIn Auto-Post

    Scenario: You’ve just published a blog post. Instead of opening LinkedIn and crafting a summary from scratch, this automation turns your post into a social-ready snippet instantly.

    How it works: Make.com watches your RSS feed for new content. When a new blog post is detected, it sends the blog’s title and content to Claude or OpenAI with a carefully constructed prompt. The AI replies with a LinkedIn-ready post featuring a hook and CTA. This is then routed to Buffer for scheduling or Slack for internal review. All content can be logged to Google Sheets or Airtable for records and team collaboration.

    Step-by-Step Walkthrough:

    1. Trigger on New Blog Post (RSS Module)
      • Drag the RSS module into your Make.com scenario.
      • Enter your blog’s RSS feed URL (e.g., https://yourblog.com/rss).
      • Set the module to check for new posts every X minutes.
      • Ensure the output includes title, link, and content/excerpt.
    2. AI Content Creation (OpenAI Module)
      • Add the OpenAI module (ChatGPT model) or Claude by Anthropic.
      • Create a prompt like: “You are a copywriter creating LinkedIn posts for a B2B audience. Write a short, engaging post that summarizes this blog. Include a 1-line hook, 1–2 insights, and end with a call-to-action. Blog title: {{title}}, Excerpt: {{content}}.”
      • Choose GPT-4o or Claude Sonnet 4 (I prefer Claude). If you’re curious what this module is doing under the hood, there’s a sketch after the walkthrough.
      • Output should be plain text.
    3. Routing Options (Router Node)
      • You can either insert a Router node after the OpenAI output or do this in a sequence (like my setup above).
      • Route A: Manual Review
        • Add Slack module.
        • Post the AI-generated copy to #marketing-content.
        • Include buttons for “Approve” or “Revise” via Slack reactions or separate review workflows.
      • Route B: Auto Schedule
        • Add Buffer or LinkedIn module.
        • Schedule the post directly.
        • Add time delay if needed before posting (e.g., delay by 30 min to allow override).
    4. Log It (Google Sheets or Airtable)
      • Add a Google Sheets or Airtable module.
      • Create a row with blog title, link, and generated post.
      • Optional: Include timestamp and user who approved the content.
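
    If you want to see what the OpenAI module in step 2 boils down to, it’s roughly a single chat completion call. Here’s a minimal Python sketch of the same step; the openai client call is real, but the blog title and excerpt are placeholders standing in for the RSS module’s mapped fields:

    Python
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Placeholders standing in for the RSS module's output
    blog_title = "How to Automate Your Marketing Reports"
    blog_excerpt = "Most marketing reporting is repetitive work that a workflow can handle..."

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                "You are a copywriter creating LinkedIn posts for a B2B audience. "
                "Write a short, engaging post that summarizes this blog. "
                "Include a 1-line hook, 1-2 insights, and end with a call-to-action. "
                f"Blog title: {blog_title}, Excerpt: {blog_excerpt}"
            ),
        }],
    )

    linkedin_post = response.choices[0].message.content  # plain text, ready for Buffer or Slack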

    Optional Enhancements:

    • Add a “fallback content” path if AI fails or times out.
    • Use a Make “Text Parser” to clean up or trim content to fit platform character limits.
    • Add UTM parameters to links using a “Set Variable” step before publishing.

    Why this helps: This flow cuts down repetitive work, ensures content consistency, and keeps your distribution engine running on autopilot with human review only when needed.

    Project 2: AI Campaign Performance Digest

    Scenario: When I started my career in marketing, over a decade ago, before AI, I would manually compile Google Ads campaign reports every Monday morning. Today, AI does it for you, and shares a clean summary to Slack every morning.

    How it works: Make.com runs a scheduled workflow each morning. It pulls campaign data from Google Ads, sends it to GPT-4o with a prompt designed to extract insights, and then posts a summary digest to a Slack channel.

    Step-by-Step Walkthrough:

    1. Trigger on Schedule:
      • Use the “Scheduler” module in Make.
      • Set the time to run daily at 8:00 AM (or whatever cadence fits your reporting cycle).
    2. Fetch Campaign Data (Google Ads Module):
      • Add a Google Ads module.
      • Authenticate with your account and select the appropriate campaign.
      • Configure it to retrieve key metrics like impressions, clicks, CTR, cost, conversions, and ROAS.
      • Ensure the output is formatted clearly to be used in the next step.
    3. Summarize Metrics (OpenAI Module):
      • Add an OpenAI (ChatGPT) module.
      • Use a system + user prompt combo to ensure structured output: System: “You are a digital marketing analyst summarizing ad performance.” User: “Summarize the following Google Ads metrics in 3 concise bullet points. Highlight performance trends, wins, and concerns. Metrics: {{output from Google Ads module}}”
      • Choose GPT-4o for better language quality and reliability.
    4. Post to Slack (Slack Module):
      • Add the Slack module and connect your workspace.
      • Send the AI summary to your marketing channel (e.g., #ads-daily).
      • Format the message cleanly using markdown, and optionally include a link to the Google Ads dashboard for deeper inspection.
    5. Log for Reference (Optional):
      • Add a Google Sheets or Airtable module to log the raw metrics + AI summary.
      • Include a date stamp and campaign ID for tracking trends over time.

    Optional Enhancements:

    • Add a fallback message if AI output is blank or token limits are exceeded.
    • Use a router to conditionally summarize different campaign types differently (e.g., brand vs. performance).
    • Include a comparison to the previous day or week by pulling two data sets and calculating diffs before sending to GPT (see the sketch below).
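
    For that last enhancement, the diff calculation is worth doing before the AI step so GPT only has to interpret the numbers. A minimal sketch, assuming you’ve pulled today’s and yesterday’s metrics as simple key-value pairs (the metric names and values here are made up):

    Python
    # Placeholder metrics; in Make.com these would come from two Google Ads queries
    today = {"impressions": 12500, "clicks": 430, "cost": 310.0, "conversions": 21}
    yesterday = {"impressions": 11800, "clicks": 395, "cost": 295.0, "conversions": 25}

    def pct_change(new, old):
        """Percentage change, guarding against division by zero."""
        return 0.0 if old == 0 else round((new - old) / old * 100, 1)

    diffs = {metric: pct_change(today[metric], yesterday[metric]) for metric in today}
    # e.g. {'impressions': 5.9, 'clicks': 8.9, 'cost': 5.1, 'conversions': -16.0}

    # This is the comparison string you'd append to the metrics sent to the OpenAI module
    comparison = ", ".join(f"{m}: {today[m]} ({change:+}% vs yesterday)" for m, change in diffs.items())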

    Why this helps: It delivers a high-signal snapshot of ad performance daily without wasting your time, and keeps everyone on the same page.

    Project 3: AI Content Research Assistant

    Scenario: You’re planning a new blog post, campaign, or social series and need quick, high-quality content research. Instead of spending hours googling stats, quotes, and trending ideas, let an AI-powered automation do the heavy lifting.

    How it works: You input a topic into Airtable (or another database), which triggers a workflow in Make.com. The AI uses that topic to generate:

    • A list of content angles
    • Related stats or facts
    • Popular subtopics or related trends
    • Potential hooks or titles

    Everything gets logged into a Google Sheet or Notion database for review and use.

    Step-by-Step:

    1. Trigger: Airtable Record Created
      • Use the Airtable “Watch Records” module.
      • Set it to monitor a “Content Ideas” table.
      • Capture fields like: Topic, Target Audience, Tone (optional).
    2. AI Research Prompt (OpenAI Module):
      • Add OpenAI ChatGPT-4 module.
      • Prompt: “You are a content strategist researching ideas for a blog post or campaign. Given the topic ‘{{Topic}}’ and the audience ‘{{Audience}}’, generate:
        1. 3 content angles
        2. 3 surprising stats or insights with real examples
        3. 3 hook ideas or headline starters. Format clearly with numbered sections.”
    3. Parse and Organize (Text Parser or Set Variables):
      • If needed, extract each section into separate fields using Text Parser or Set Variable modules (there’s a sketch of this splitting logic after these steps).
    4. Log to Google Sheets or Notion:
      • Add a new row with:
        • Topic
        • Audience
        • Generated angles
        • Hooks/headlines
        • Suggested stats
    5. Optional Enhancements:
      • Add a Slack notification: “New content research ready for review!”
      • Add a filter so only topics marked as “High Priority” trigger AI research.
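
    If you do need that splitting step, the Text Parser is just separating the reply on its numbered headings. A rough sketch of the same logic in Python, assuming the model followed the “numbered sections” format from the prompt (the reply text is a placeholder):

    Python
    import re

    # Placeholder for the OpenAI module's reply
    ai_reply = """1. Content angles
    - Angle one...
    2. Surprising stats
    - Stat one...
    3. Hooks and headlines
    - Hook one..."""

    # Split on lines that start with "1.", "2.", "3." and keep what follows each heading
    sections = re.split(r"^\s*\d\.\s.*$", ai_reply, flags=re.MULTILINE)[1:]
    angles, stats, hooks = (section.strip() for section in sections)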

    Why this helps: You eliminate blank-page paralysis and get rich, contextual research for any content initiative without wasting your team’s time or creativity on preliminary digging.


    The Sales Bottleneck: Manual Follow-Ups & Cold Data

    Sales teams waste hours every week:

    • Manually sorting through low-quality leads
    • Writing cold emails from scratch
    • Logging CRM updates by hand
    • Missing follow-ups because of clunky tools

    With Make.com and AI, you can automate the entire pre-sale pipeline—from qualification to enrichment to personalized outreach—while still keeping it human where it counts.

    Project 1: AI Lead Qualification & Outreach Workflow

    Scenario: Automatically qualify new leads and kick off personalized outreach. Imagine you have a web form or marketing funnel capturing leads. Instead of manually sifting through them, we’ll build a Make.com workflow that uses AI to evaluate each lead’s potential and respond accordingly. High-quality leads will get a custom email (drafted by AI) and be logged in a CRM, while unqualified ones might get a polite decline or be deprioritized.

    How it works: Whenever a new lead comes in (with details like name, company, message, etc.), the workflow triggers. It sends the lead info to an AI (GPT-4o) to determine if the lead is “Qualified” or “Not Qualified,” along with reasoning. Based on the AI’s decision, Make branches into different actions.

    Step-by-Step:

    1. Trigger on New Lead:
      • Use a Webhook module if your lead form sends a webhook
      • Or use a Google Sheets module if leads are collected there
      • Or integrate with your CRM (HubSpot, Pipedrive, etc.)
      Example: If using a form with a webhook, create a Webhook module to receive new lead data like name, email, company, and message.
    2. AI Qualification (OpenAI):
      • Add OpenAI (ChatGPT) module
      • Prompt: System: “You are a sales assistant that qualifies leads for our business.” User: “Lead details: Name: {{name}}, Company: {{company}}, Message: {{message}}. Based on this, decide if this lead is Qualified or Not Qualified for our services, and provide a brief reason. Respond in the format: Qualified/Not Qualified – Reason.”
      This gives you structured output like: “Qualified – The company fits our target profile and expressed interest,” or “Not Qualified – Budget mismatch.”
    3. Branching Logic (Router or IF):
      • Use an IF or Router module to check whether the response starts with “Qualified” rather than just contains it (“Not Qualified” also contains the word “Qualified”; see the sketch after these steps).
      • Route accordingly:
        • Qualified → Follow-up path
        • Not Qualified → Logging or polite response
    4. Qualified Lead Path:
      • Generate Email: Use another OpenAI module to draft a personalized email. Prompt: “Write a friendly email to this lead introducing our services. Use this info: {{lead data + qualification reasoning}}.”
      • Send Email: Use Gmail or SMTP module to send the AI-generated message.
      • Log Lead: Add/update lead in your CRM or Google Sheet.
    5. Unqualified Lead Path:
      • Polite Decline (Optional): Use GPT to generate a kind “not the right fit” email.
      • Internal Log: Mark the lead in CRM or Sheet as disqualified.
    6. Test the Workflow:
      • Use test leads to verify AI outputs and routing logic.
      • Ensure prompt format is consistent for accurate branching.
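
    One gotcha in step 3: the string “Not Qualified – Budget mismatch” also contains the word “Qualified”, so a naive contains check will send every lead down the qualified path. Whether you express this as a Make.com filter or a small code step, check how the response starts. A minimal sketch (the AI response is a placeholder):

    Python
    # Placeholder for the OpenAI module's output
    ai_decision = "Not Qualified – Budget mismatch"

    verdict = ai_decision.strip().lower()

    # startswith() avoids the trap: "not qualified - ..." fails this check, as it should
    is_qualified = verdict.startswith("qualified")

    if is_qualified:
        next_step = "draft the personalized email and log the lead in the CRM"
    else:
        next_step = "send the polite decline and mark the lead as disqualified"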

    Bonus Ideas:

    • Human-in-the-loop Review: Send AI-drafted email to Slack for approval before sending.
    • Scoring instead of binary: Ask AI to score Hot, Warm, Cold.
    • Enrichment before AI: Use Clearbit or Apollo API to add job title, company size, industry.

    Why this helps: Your sales team only sees high-quality leads and can follow up instantly with personalized, AI-written messages.

    Project 2: AI-Powered CRM Enrichment & Follow-Up

    Scenario: Automate enrichment for CRM records and schedule follow-ups based on lead type.

    How it works: Whenever a new contact is added to your CRM (or manually tagged), the workflow enriches the contact (e.g. via Clearbit), uses AI to suggest next actions, and schedules a follow-up reminder.

    Step-by-Step:

    1. Trigger: Watch for a new CRM contact (e.g., HubSpot “New Contact” trigger).
    2. Enrichment: Call Clearbit or similar API to retrieve job title, company data.
    3. AI Recommendation: Use OpenAI. Prompt: “Based on this lead info, suggest next sales action and urgency level. Respond with a 1-sentence summary.”
    4. Create Task: Add to Trello/Asana/Google Calendar or CRM task board.
    5. Notify Salesperson: Slack message or email summary with AI’s next step.

    Why this helps: Keeps your CRM smart and your reps focused on the right next step.

    Project 3: AI Deal Progress Updates to Stakeholders

    Scenario: Keep internal stakeholders updated as deals progress, without constant emails or meetings.

    How it works:

    When a deal stage changes in your CRM, AI summarizes deal context and posts an update to a Slack channel or email digest.

    Step-by-Step:

    1. Trigger: Watch for deal stage change (e.g. from “Demo” to “Negotiation”).
    2. Pull Context: Use previous notes or contact data.
    3. AI Summary: Prompt: “Summarize this deal update with name, stage, client concern, and next step. Make it brief but informative.”
    4. Send Digest: Post to Slack #deals or email manager/team.

    Why this helps: Reduces status meetings while keeping everyone aligned.


    Automation For Product Teams

    Product managers juggle user feedback, bug reports, feature requests, competitor research, roadmap planning, internal prioritization, and stakeholder updates, all at once. It’s chaos. And most of it is repetitive, noisy, and hard to scale.

    With Make.com and AI, you can:

    • Digest qualitative feedback in minutes
    • Summarize feature requests by theme
    • Classify bugs and assign owners
    • Monitor competitor news
    • Auto-generate user stories and release notes

    Let’s walk through a few real workflows.

    Project 1: Feature Request Summarizer & Classifier

    Scenario: Users submit feature requests through a form, support tool, or product portal. Instead of manually reviewing each one, this automation summarizes and categorizes requests using AI, then logs them in your product management system.

    How it works: A new request triggers the workflow. AI (via GPT-4o) reads and classifies the submission (e.g., UX, performance, integrations), writes a short summary, and sends the data to Airtable or Notion for prioritization.

    Step-by-Step:

    1. Trigger: Form Submission or Inbox Monitoring
      • Use the “Webhook” module if collecting feedback via a form (e.g., Typeform, Tally).
      • Or use the “Gmail” or “Intercom” module to watch for new support emails or messages.
      • Capture key fields: name, email (optional), feature request text, and source.
    2. AI Summarization and Categorization (OpenAI Module):
      • Add the OpenAI module.
      • Use the following prompt: “You are a product manager assistant. Summarize the following user feature request in 1–2 sentences. Then categorize it as one of: UX/UI, Performance, Integrations, New Feature, Other. Respond with: Summary: … / Category: …”
    3. Process Output (Text Parser, Set Variable):
      • If needed, parse out “Summary:” and “Category:” into separate fields.
    4. Log to Product Tracker (Airtable/Notion/Google Sheets):
      • Add a module to write the summary, category, and source to your product request tracker.
      • Optional: Add a timestamp and auto-assign priority if source = “VIP” or “internal.”
    5. Bonus Enhancements:
      • Add Slack notifications to alert the product team when a new high-priority request is submitted.
      • Use a Router node to auto-tag requests into different buckets (e.g., roadmap now/later/backlog).

    Why this helps: Instead of skimming dozens of tickets, PMs see a categorized, summarized list ready to evaluate in minutes.

    Project 2: Bug Report Classifier and Assignment

    Scenario: Your support team logs bugs from users. Instead of having a PM manually triage and assign each one, this workflow uses AI to determine severity and auto-assigns to the right team or Slack channel.

    How it works: When a new bug report is added to your tracking system (e.g., Airtable, Google Sheet, or Intercom), the workflow triggers. GPT-4o reads the bug report, labels it by severity, recommends the team, and routes the report to a Jira board or Slack for resolution.

    Step-by-Step:

    1. Trigger: New Bug Logged
      • Use “Airtable – Watch Records” or “Google Sheets – Watch Rows.”
      • Trigger on a new row in your “Bugs” table with fields: Description, Environment, App Version, Submitter.
    2. AI Classification (OpenAI Module):
      • Add the OpenAI module.
      • Prompt: “You are a technical triage assistant. Read this bug description and assign: a) Severity: Low, Medium, High; b) Team: Frontend, Backend, Infra, QA. Description: {{bug_text}}. Respond: Severity: … / Team: …”
    3. Parse Output:
      • Use a Text Parser or Set Variable module to extract the fields.
    4. Routing & Assignment (Router + Slack/Jira):
      • Use a Router module to route based on team (the mapping sketch after these steps shows the idea).
      • For each branch:
        • Slack: Send bug summary to respective team channel
        • Jira: Create issue with pre-filled metadata
    5. Log Final Record:
      • Update Airtable/Sheet with AI’s classification, routing action, and date.
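
    The routing in step 4 is really a lookup from the parsed Team value to a destination, plus an escalation rule for severity. In Make.com you’d express this with the Router’s filters; here’s the same idea as a small sketch, with hypothetical channel names:

    Python
    # Hypothetical Slack channels per team; adjust to your workspace
    TEAM_CHANNELS = {
        "Frontend": "#bugs-frontend",
        "Backend": "#bugs-backend",
        "Infra": "#bugs-infra",
        "QA": "#bugs-qa",
    }

    def route_bug(team: str, severity: str) -> str:
        """Pick a Slack channel for a triaged bug, escalating high-severity issues."""
        if severity.strip().lower() == "high":
            return "#bugs-urgent"  # hypothetical escalation channel
        return TEAM_CHANNELS.get(team.strip(), "#bugs-triage")  # fallback if the AI's label is unexpected

    print(route_bug("Backend", "Medium"))  # -> #bugs-backend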

    Why this helps: Triage happens instantly, teams are alerted without delay, and engineering isn’t bogged down by unclear, unprioritized issues.

    Project 3: Competitor Research Digest

    Scenario: Your product team wants to monitor competitor news (feature launches, pricing changes, new positioning) but no one has time to check their blogs or Twitter every day. Let automation do it for you.

    How it works: Make.com monitors competitor blogs or news feeds using RSS. New content is piped into GPT-4o, which extracts relevant summaries and logs them to Notion or shares them in Slack.

    Step-by-Step:

    1. Trigger: RSS Feed Monitoring
      • Use the “RSS Watch Items” module.
      • Add feeds from competitor blogs (e.g., /news, /blog/rss).
      • Trigger the scenario when new items appear.
    2. AI Summary (OpenAI Module):
      • Add the OpenAI module.
      • Prompt: “You are a product strategist summarizing competitor updates. Summarize the following blog post in 2–3 sentences. Focus on new features, strategic changes, and pricing or positioning shifts.” Input: {{rss_content}}
    3. Routing and Output:
      • Slack: Send formatted summary with post link to #product-intel
      • Notion: Append to a Competitive Insights database (Title, Summary, Source URL, Date)
    4. Optional Enhancements:
      • Add a keyword filter (e.g., only send if post mentions “AI,” “pricing,” “feature,” etc.)
      • Use sentiment analysis to mark as positive/negative/neutral (another AI call)

    Why this helps: Keeps product and strategy teams aware of external moves without manual research, freeing time for response planning or differentiation work.

    Project 4: Generate User Stories from Feedback

    Scenario: You’ve collected raw user feedback from forms, surveys, support tickets, or customer interviews. Now you need to turn that messy, unstructured input into clear, actionable user stories. Let AI write the first draft for your backlog.

    How it works: Whenever feedback is marked as actionable or tagged with “feature request,” Make.com sends it to GPT-4o. The AI rewrites it in proper user story format and logs it to your dev tracker (Notion, Airtable, Trello, Jira, etc.).

    Step-by-Step:

    1. Trigger: Tagged Feedback Entry
      • Use “Watch Records” (Airtable) or “Watch Database Items” (Notion).
      • Set a filter: Only run if field ‘Type’ = “Feature Request.”
    2. Prompt AI to Generate User Story (OpenAI Module):
      • Prompt: “You are a product manager preparing backlog items. Turn this raw feedback into a user story using this format: ‘As a [user role], I want to [goal/action], so that [benefit].’ Feedback: {{feedback_text}}”
    3. Post-processing (Optional):
      • Add a sentiment analysis module (e.g., another AI call) to assess urgency.
      • Use Router to assign story to the correct product squad based on keyword/topic.
    4. Log Story:
      • Notion: Add to product backlog database
      • Airtable: Insert as a new story row
      • Jira/Trello: Create new ticket with AI-generated description
    5. Notify Stakeholders (Optional):
      • Slack alert to product owner: “New story added from feedback: {{story}}”

    Why this helps: Turns raw, unstructured user data into clean, consistent backlog items—without product managers rewriting every ticket themselves.


    HR Teams: Automate Onboarding and Employee Insights

    HR teams are buried under repetitive, time-consuming tasks:

    • Answering the same policy questions again and again
    • Sorting resumes manually
    • Drafting internal emails and updates

    AI automations free up time for strategic people ops work while giving employees faster responses and a better experience.

    Project 1: AI-Powered HR Slack Assistant

    Scenario: Employees constantly ask HR about leave policies, benefits, or internal procedures. This workflow creates an AI-powered Slack bot that answers common questions instantly.

    How it works: Employees post questions in a designated Slack channel. Make.com captures the question, sends it to GPT-4 (with your handbook or policies as context), and posts the AI-generated answer back in the thread.

    Step-by-Step:

    1. Trigger:
      • Use Slack’s “Watch Messages in Channel” module
      • Monitor #ask-hr or a similar channel
    2. AI Response (OpenAI):
      • Prompt: “You are an HR assistant. Use the following context from our handbook to answer questions. If you don’t know the answer, say so. Question: {{message_text}}.”
      • Provide a static section of company policy or use a database API to insert context (see the sketch after these steps)
    3. Respond in Thread:
      • Post the AI-generated answer as a reply to the original Slack message
    4. Fallback Handling:
      • If AI is unsure, route to a human HR rep with a notification
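
    “Provide a static section of company policy” in step 2 just means prepending the relevant handbook text to the question before it reaches the model. A minimal sketch of how that prompt gets assembled (the handbook excerpt and question are made up):

    Python
    # Hypothetical handbook excerpt; in Make.com this could be a static text field or a database lookup
    handbook_context = (
        "Vacation policy: employees accrue 1.5 days of paid leave per month. "
        "Unused days roll over, up to a maximum of 10 days per year."
    )

    employee_question = "How many vacation days can I carry over to next year?"  # from the Slack trigger

    prompt = (
        "You are an HR assistant. Use the following context from our handbook to answer questions. "
        "If you don't know the answer, say so.\n\n"
        f"Handbook context:\n{handbook_context}\n\n"
        f"Question: {employee_question}"
    )
    # `prompt` is what goes into the user message of the OpenAI module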

    Why this helps: Reduces HR interruptions while improving employee experience with instant, contextual answers.

    Project 2: Resume Screening Assistant

    Scenario: You receive a high volume of applicants and need to quickly assess fit for a role based on resumes and job descriptions.

    How it works: Applicants submit resumes through a form or ATS. Make.com collects the submission, sends it to GPT-4 with the job description, and receives a scored summary with a short rationale.

    Step-by-Step:

    1. Trigger:
      • Watch new form submission or integrate with ATS (e.g., Google Form, Typeform)
      • Collect name, resume text (or file), and job applied for
    2. AI Fit Evaluation (OpenAI):
      • Prompt: “You are an HR recruiter. Based on this job description and the applicant resume, rate this candidate as High, Medium, or Low Fit. Provide a 1-sentence reason.”
        Input: {{resume}}, {{job_description}}
    3. Parse Response:
      • Extract score and reason using text parser or Set Variable
    4. Log Result:
      • Add to Airtable or Google Sheet for internal review
    5. Optional:
      • Notify hiring manager via Slack if rating is “High Fit”

    Why this helps: Quickly filters high-potential candidates without sifting through every resume manually.

    Project 3: Personalized Onboarding Sequence Generator

    Scenario: New hires need to go through onboarding. Instead of sending the same emails manually or giving them generic documents, generate a tailored onboarding plan.

    How it works: When a new hire is added to Airtable or your HRIS, GPT-4o generates a personalized onboarding checklist and intro email based on their role and department.

    Step-by-Step:

    1. Trigger:
      • Watch for a new employee record in Airtable (or Google Sheet or BambooHR)
    2. Generate Plan (OpenAI):
      • Prompt: “You are an HR onboarding assistant. Based on this employee’s name, role, and department, write a custom onboarding checklist for their first 2 weeks. Also generate a welcome email.”
    3. Send Outputs:
      • Email the onboarding checklist and welcome message to the new hire
      • Optionally send a copy to their manager
    4. Log or Archive:
      • Save plan to a shared onboarding doc or Notion database

    Why this helps: Makes onboarding feel personal and organized without HR lifting a finger.


    Support Teams: Automate Ticket Triage And Responses

    Customer support is repetitive by nature, but that doesn’t mean it should be manual. With AI and automation, you can:

    • Instantly classify and route tickets
    • Auto-draft replies to common questions
    • Summarize conversations for handoffs or escalations
    • Proactively flag critical issues

    Let’s break down a few powerful workflows you can launch today.

    Project 1: AI-Powered Ticket Triage Bot

    Scenario: Incoming support tickets vary widely. Some are technical, others are billing-related, some are spam. Instead of human agents triaging each one manually, AI can analyze and route them to the right person or tool.

    How it works: Make.com monitors your support inbox or form. For each new ticket, GPT-4o classifies it (Billing, Technical, Account, Spam) and assigns urgency. Based on the result, the ticket is routed to the correct Slack channel, person, or tool.

    Step-by-Step:

    1. Trigger:
      • Watch new entries from Gmail, HelpScout, Intercom, or a form tool like Typeform.
      • Capture subject, message body, and metadata.
    2. Classify Ticket (OpenAI):
      • Prompt: “You are a support assistant. Read the message below and categorize it as one of: Billing, Technical, Account, Spam. Also assign an urgency level (Low, Medium, High). Respond like: Category: …, Urgency: …”
    3. Parse Output:
      • Use a Text Parser or Set Variable module to extract Category and Urgency
    4. Route Based on Logic:
      • Use a Router or Switch module
      • Route Technical → #support-dev, Billing → #support-billing, etc.
      • Notify urgent issues in a priority Slack channel or tag a team lead
    5. Log for Analytics (Optional):
      • Save categorized tickets to Airtable or Sheets for trend tracking

    Why this helps: Your team spends less time sorting and more time solving. Escalations are never missed.

    Project 2: AI Auto-Responder for Common Questions

    Scenario: Many support tickets are variations of the same FAQ: password resets, refund policies, shipping delays. Let AI draft helpful responses automatically, ready for human review or direct sending.

    How it works: When a new ticket arrives, GPT-4o reviews the content and drafts a relevant reply using company policy snippets or a knowledge base.

    Step-by-Step:

    1. Trigger:
      • Monitor new support tickets via Help Desk or form integration
    2. Draft Response (OpenAI):
      • Prompt: “You are a support rep. Read this customer message and write a helpful reply using our policies: {{kb_snippets}}. Message: {{ticket_text}}”
    3. Review Flow:
      • Send AI draft to Slack for human review (or assign a Google Doc comment task)
      • Use Slack emoji as approval trigger, or set manual override option
    4. Send Response:
      • Upon approval, send email via Gmail, Outlook, or HelpDesk API

    Why this helps: Reduces response time for repetitive inquiries and gives your team a first draft to edit instead of starting from scratch.

    Project 3: Conversation Summary Generator for Escalations

    Scenario: When tickets get escalated across teams, agents spend time writing summaries of what’s happened so far. Use AI to generate this summary instantly.

    How it works: When a ticket is tagged for escalation or transfer, Make.com grabs the conversation thread and asks GPT-4o to summarize the key points.

    Step-by-Step:

    1. Trigger:
      • Tag change or status update in HelpScout/Intercom (e.g., “Escalated”)
    2. Summarize Conversation (OpenAI):
      • Prompt: “Summarize this customer support conversation: who is the customer, what’s the issue, what’s been tried, and what’s needed next. Format as: Summary: … / Next Action: …”
    3. Send to Escalation Path:
      • Post to Slack or assign Jira/Trello task with summary included
      • Tag original agent and team lead

    Why this helps: Handoffs are cleaner, faster, and no critical context is lost.

    Start with something small

    The best way to get started building automations is with something small and non-critical. If it fails, it shouldn’t bring the house down.

    Over time, as you get comfortable with this, you can add on more complexity and transition from automations to autonomous AI agents.

    If you need help with any of this, email me.

  • Coding with Cursor: a Beginner’s Guide

    Coding with Cursor: a Beginner’s Guide

    I was at Web Summit Vancouver last week, a tech conference where the only topic of every conversation was, surprise surprise, AI! As someone who has been in the space for years, well before the ChatGPT boom, I was excited to talk to my fellow nerds about the latest tools and tech.

    And I was shocked to find that many attendees, including product managers and developers, hadn’t even heard of the AI tools I used most, like Claude and Cursor.

    I’ve already written guides on Claude so I figured I’d do one for Cursor. This guide is for you if you’re:

    • A complete coding beginner who’s heard the vibe coding hype and wants to skip the “learn syntax for six months” phase
    • A seasoned developer curious about AI coding tools but tired of switching between ChatGPT tabs and your IDE
    • Someone who tried Cursor once, got confused by all the modes and features, and gave up

    By the end, you’ll know exactly how to use Cursor’s three main modes, avoid the common pitfalls that trip up beginners, and build real projects.

    Installation and First Contact

    Time for the least exciting part of this guide: getting Cursor on your machine. Head to cursor.com and download the application (revolutionary, I know). The installation is standard “next, next, finish” territory, so I won’t insult your intelligence with screenshots.

    If you’re familiar with other IDEs, like VS Code, then Cursor won’t look too different. In fact, it’s literally a fork of VS Code. Your muscle memory, keyboard shortcuts, and extensions all work exactly the same. You can install Cursor and use it as a drop-in VS Code replacement without touching a single AI feature.

    But why would you want to do that when you could have a coding superpower instead?

    Open one of your existing projects in Cursor and hit Cmd+L (Mac) or Ctrl+L (Windows/Linux). That’s your AI sidebar. Type something like “explain what this file does” and watch as Cursor not only explains your code but suggests improvements you hadn’t even thought of.

    This is your first taste of what makes Cursor different. It’s not pulling generic answers from the internet, or generating something irrelevant. It’s analyzing your actual project and giving you contextual, relevant help. Let’s explore the different ways it can do this.

    If you don’t have an existing project, ask Cursor to create one! Just type in “Generate a simple HTML file about pizza toppings” or whatever strikes your fancy, and watch the magic.

    The Three Modes of Cursor

    Cursor has three main ways to interact with AI, and knowing when to use each one is like knowing when to use a scalpel versus a sledgehammer. Both are tools, but context matters.

    Ask Mode: Your Coding Sherpa

    Think of Ask mode as your personal Stack Overflow that actually knows your project. Hit Cmd+L (or Ctrl+L) to open the sidebar, make sure “Ask” is selected in the dropdown, and start asking questions.

    I often use this if I’m returning to a project I haven’t looked at in a couple of days, or if I’m trying to understand why Cursor generated code in a certain way. It’s also a great way to learn how to code if you’re not a professional.

    You can ask it something specific, like what a single function does, all the way up to how an entire codebase works. I encourage you to also ask it to explain itself and some of the architectural decisions it makes.

    Examples:

    • “What does this function do and why might it be slow?”
    • “What are other ways to implement this functionality?”
    • “How would you approach adding authentication to this app?”
    • “What are the potential security issues in this code?”

    Ask mode is read-only so it won’t change your code. It’s purely for exploration, explanation, and planning. Treat it like Google, but Google that knows your specific codebase inside and out.

    Pro Tip: Ask follow-up questions to deepen your understanding, request alternative approaches to problems, and use it to understand AI-generated code before implementing it.

    Agent Mode: The Code Wizard

    This is where the magic happens. Agent mode (formerly called “Composer”) can actually make changes to your code, create new files, and work across your entire project.

    You tell it to do something, and it just does it, from adding new text to a page, all the way to creating an entire new feature with multiple pages, functions, and components.

    It can even run commands in the terminal, like installing a new package or committing changes to Git.

    Examples:

    • “Build a login form with validation”
    • “Create a new branch for the onboarding feature”
    • “Create a REST API for managing user profiles”
    • “Refactor this component to use TypeScript”

    Agent mode takes your entire codebase into context to understand the relationships between different parts and create or modify multiple files. If you ask it to make wholesale changes, it will literally go off and generate tons of code across multiple files.

    Pro Tip: Start with clear, specific requirements and review changes before accepting them. Use version-control like Git at every step.

    Edit Mode: The Precision Tool

    Edit mode is for making smaller, more precise edits. To use this, you need to select some code in the editor and you’ll get a little menu with options to add to chat or edit.

    Selecting edit opens up edit mode where you can ask the AI to make changes to that piece of code. You might want to use this when making small tweaks to existing code, refactoring a single function, or a quick bug fix.

    YOLO Mode

    There’s a secret fourth mode in Cursor called YOLO mode. OK, it used to be called YOLO Mode, but they’ve changed it to the less scary “auto-run mode”.

    This mode lets the AI run terminal commands automatically. You may have noticed in your tests so far, especially in Agent mode, that it pauses and asks if it can install a package or spin up a dev server.

    If you select auto-run mode, it executes these commands without your consent. This is obviously a risky thing so I suggest you limit it to certain commands, like running tests. That way, when you ask Agent to build a new feature and test it, it does so automatically without your active involvement.

    Choosing Your Mode

    “I want to understand something” → Ask mode

    “I want to build/change something” → Agent mode

    “I want a tiny, precise change” → Edit mode (or just use Agent)

    Here’s a practical exercise to try all three:

    1. Ask mode practice: Open your HTML file and ask “What would make this webpage more accessible?”
    2. Agent mode practice: Tell Agent “Add a CSS file that makes this webpage look modern with a color scheme and better typography”
    3. Edit mode practice: Select the page title and ask Edit to “Change this to something more creative”

    Context is king

    Cursor is only as good as the context you give it. The AI can only work with what it can see, so learning to manage context effectively is the difference between getting decent results and getting mind-blowing ones.

    When you open the AI sidebar, look at the bottom and you’ll see an option to “@add context”. This is where you add files, folders, or specific functions to the conversation.

    The @ symbol: Click the @ symbol or type it into the chat to see what files Cursor suggests. This tells the AI “pay attention to this specific file.”

    You can reference specific files, folders, or even certain functions:

    • @docs can pull in documentation if available
    • @components/ includes your entire components folder
    • @package.json includes just that file

    The # symbol: Use this to focus on specific files.

    The / symbol: Before starting a complex task, open the files you think are relevant to that task, then use the “/” command in Agent mode to “Add Open Files to Context.” This automatically adds them all to context.

    The .cursorignore File

    Create a .cursorignore file in your project root to exclude directories the AI doesn’t need to see:

    Plain text
    node_modules/
    dist/
    .env
    *.log
    build/

    This keeps the AI focused on your actual code instead of getting distracted by dependencies and build artifacts.

    Context Management Strategy

    Think of context like a conversation. If you were explaining a coding problem to a colleague, you’d show them the relevant files, not your entire codebase. Same principle applies here.

    Good context: Relevant files, error messages, specific functions you’re working on

    Bad context: Your entire project, unrelated files, yesterday’s lunch order

    Similarly, when you have long conversations, the context (which is now your entire conversation history) gets too long and the AI tends to lose track of your requirements and previous decisions. You’ll notice this when the AI suggests patterns inconsistent with your existing code or forgets constraints you mentioned earlier.

    To avoid this, make it a habit to start new conversations for different features or fixes. This is especially important if you’re moving on to a new task where the context changes.

    Beyond giving it the right context, you can also be explicit about what not to touch: “Don’t modify the existing API calls”. This is a form of negative context, telling the AI to work in a certain space but avoid that one spot.

    Documentation as context

    One of the most powerful but underutilized techniques for improving Cursor’s effectiveness is creating a /docs folder in your project root and populating it with comprehensive markdown documentation.

    I store markdown documents of the project plan, feature requirements, database schema, and so on. That way, Cursor can understand not just what my code does, but why it exists and where it’s heading. It can then suggest implementations that align with my broader vision, catch inconsistencies with my planned architecture, and make decisions that fit my project’s specific constraints and goals.

    This approach transforms your documentation from static reference material into active guidance that keeps your entire development process aligned with your original vision.

    Cursor Rules

    Imagine having to explain your coding preferences to a new team member every single time you work together. Cursor Rules solve this problem by letting you establish guidelines that the AI follows automatically, without you having to repeat yourself in every conversation.

    Think of rules as a mini-prompt that runs behind the scenes every time you interact with the AI. Instead of saying “use TypeScript” and “add error handling” in every prompt, you can set these as rules once and the AI will remember them forever.

    User Rules vs. Project Rules

    User Rules: Apply to every project you work on. Think of these as your personal preferences you bring to any codebase.

    Project Rules: Specific to each codebase. These are the rules your team agrees on and ensure consistency across all contributors.

    Examples That Work in Practice

    For TypeScript projects:

    Plain text
    - Always use TypeScript strict mode
    - Prefer function declarations over arrow functions for top-level functions
    - Use meaningful variable names, no single letters except for loops
    - Add JSDoc comments for complex functions
    - Handle errors explicitly, don't ignore them

    For Python projects:

    Plain text
    - Use type hints for all function parameters and return values
    - Follow PEP 8 style guidelines and prefer f-strings for formatting
    - Handle errors with specific exception types, avoid bare except clauses
    - Write pytest tests for all business logic with descriptive test names
    - Use Pydantic for data validation and structured models
    - Include docstrings for public functions using Google style format
    - Prefer pathlib over os.path and use context managers for resources

    For any project:

    Plain text
    - Write tests for all business logic
    - Use descriptive commit messages
    - Add comments for complex algorithms
    - Handle edge cases and error states
    - Performance matters: avoid unnecessary re-renders and API calls

    Use Cursor itself to write your rules. Seriously. Ask it to “Generate a Project Rules file for a TypeScript project that emphasizes clean code, accessibility, and performance.”

    The AI knows how to write content that other AIs understand.

    Pro Tip: Create different .cursorrules files for different types of projects. Keep a frontend-rules.md, backend-rules.md, and fullstack-rules.md that you can quickly copy into projects.

    Communicating With Cursor

    Here’s the thing about AI: it’s incredibly smart and surprisingly literal. The difference between getting decent results and getting “how did you do that?!” results often comes down to how you communicate.

    Be Specific

    As with any AI, the more specific you are, the better the output. Don’t just say, “fix the styling.” Say “Add responsive breakpoints for mobile (320px), tablet (768px), and desktop (1024px+) with proper spacing and typography scaling”.

    You don’t need to know the technical details to be specific about the outcome you want. Saying “Optimize this React component by memoizing expensive calculations and reducing re-renders when props haven’t changed” works better than just “Optimize this component” even though you’re not giving it detailed instructions.

    Take an Iterative Approach

    Start broad, then narrow down:

    1. “Build a todo app with React”
    2. “Add user authentication to this todo app”
    3. “Make the todo items draggable for reordering”
    4. “Add due dates and priority levels”

    Each step builds on the previous work. The AI maintains context and creates consistent patterns across features.

    Use Screenshots

    Take screenshots of:

    • UIs you want to replicate
    • Error messages you’re getting
    • Design mockups from Figma
    • Code that’s confusing you

    Paste them directly into the chat. The AI can read and understand visual information surprisingly well.

    Treat it like a coworker

    Explain your problem like you’re talking to a colleague:

    “I have this React component that’s supposed to update when props change, but it’s not re-rendering. The props are coming from a parent component that fetches data from an API. I think it might be a dependency issue, but I’m not sure.”

    This gives the AI context about what you’re trying to do, what’s happening instead, and your initial hypothesis.

    The Context Sandwich

    Structure complex requests like this:

    1. Context: “I’m building a shopping cart component”
    2. Current state: “It currently shows items and quantities”
    3. Desired outcome: “I want to add coupon code functionality”
    4. Constraints: “It should validate codes against an API and show error messages”

    This format gives the AI everything it needs to provide accurate, relevant solutions.

    Common Prompting Mistakes

    Making Assumptions: Don’t assume the AI knows what “correct” means in your context. Spell it out by describing expected outcomes. “This function should calculate tax but it’s returning undefined. Here’s the expected behavior…”

    Trying to do everything at once: When you tell the AI to “Build a complete e-commerce site with authentication, payment processing, inventory management, and admin dashboard” it is definitely going to go off the rails at some point.

    Start small and build incrementally. The AI works better with focused requests.

    Describing solutions: Describe the problem, not the solution. The AI might suggest better approaches than you initially considered. Instead of “Use Redux to manage this state”, say “I need to share user data between multiple components”

    Overloading context: Adding every file in your project to context doesn’t help, it hurts. The AI gets overwhelmed and loses focus. Be selective about what’s actually relevant.

    Debugging Your Prompts

    Good prompting is a bit of an art. A small change in a prompt can lead to massive changes in the output, so Cursor may often go off-script.

    And that’s totally fine. If you catch it doing that, just hit the Stop button and say “Wait, you’re going in the wrong direction. Let me clarify…”

    Sometimes it’s better to start a new conversation with a refined prompt than to keep correcting course. When you do this, add constraints like “keep the current component structure” to stop it from going down the same path.

    Good prompting is iterative:

    1. Initial prompt: Get something working
    2. Refinement: “This is close, but change X to Y”
    3. Polish: “Add error handling and improve the user experience”
    4. Test: “Write tests for this functionality”

    The Psychology of AI Collaboration

    The AI is incredibly capable but not infallible. There’s a small area between treating it like a tool and constraining it too much, and treating it like a coworker and letting it run free. That’s where you want to play.

    Always review the code it generates, especially for:

    • Security-sensitive operations
    • Performance-critical sections
    • Business logic validation
    • Error handling

    Don’t just copy-paste the code. Read the AI’s explanations, understand the patterns it uses, and notice the techniques it applies. You’ll gradually internalize better coding practices.

    If the AI suggests something that doesn’t feel right, question it. Ask “Why did you choose this approach over alternatives?” or “What are the trade-offs here?”

    The AI can explain its reasoning and might reveal considerations you hadn’t thought of. Or it could be flawed because it doesn’t have all the necessary context, and you may be able to correct it.

    Putting it all together

    Here’s a complete example of effective AI communication:

    Context: “I’m building a React app that displays real-time stock prices”

    Current state: “I have a component that fetches data every 5 seconds, but it’s causing performance issues”

    Specific request: “Optimize this for better performance. I want to update only when prices actually change, handle connection errors gracefully, and allow users to pause/resume updates”

    Constraints: “Don’t change the existing API structure, and make sure it works on mobile devices”

    This prompt gives the AI everything it needs: context, current state, desired outcome, and constraints. The response will be focused, relevant, and actionable.

    Common Pitfalls

    Every Cursor user goes through the same learning curve. You start optimistic, hit some walls, wonder if AI coding is overhyped, then suddenly everything clicks. Let’s skip the frustrating middle part by learning from everyone else’s mistakes.

    The “Build Everything at Once” Trap

    The mistake: Asking for a complete e-commerce platform with authentication, payment processing, inventory management, admin dashboard, and mobile app in a single prompt.

    Why it fails: Even the smartest AI gets overwhelmed by massive requests. You’ll get generic, incomplete code that barely works and is impossible to debug.

    The fix: Start with the smallest possible version. Build a product catalog first, then add search, then user accounts, then payment processing. Each step builds on solid foundations.

    Good progression:

    1. “Create a simple product listing page”
    2. “Add search functionality to filter products”
    3. “Create a shopping cart that stores items”
    4. “Add user registration and login”
    5. “Integrate payment processing”

    The Context Chaos Problem

    The mistake: Adding every file in your project to the AI’s context because “more information is better.”

    Why it fails: Information overload makes the AI lose focus. It’s like trying to have a conversation in a crowded restaurant: too much noise drowns out the important signals.

    The fix: Be surgical with context. Only include files that are directly relevant to your current task.

    Bad context: Your entire components folder, all utilities, config files, and documentation

    Good context: The specific component you’re modifying and its immediate dependencies

    The “AI Will Figure It Out” Assumption

    The mistake: Giving vague instructions and expecting the AI to read your mind about requirements, constraints, and preferences.

    Why it fails: The AI is smart, not psychic. “Make this better” could mean anything from performance optimization to visual redesign to code refactoring.

    The fix: Be specific about what “better” means in your context.

    Vague: “Fix this component”

    Specific: “This React component re-renders too often when props change. Optimize it using React.memo and useMemo to prevent unnecessary renders.”
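
    For illustration, here’s a minimal sketch of what the optimized result might look like. The ExpensiveList component, its props, and the sorting work are all invented for the example; the point is simply the React.memo wrapper and the useMemo call that the specific prompt asks for.

    TypeScript
    import React, { useMemo } from "react";

    // Hypothetical props for an illustrative list component.
    type Props = { items: number[] };

    // React.memo skips re-rendering when the `items` prop hasn't changed.
    const ExpensiveList = React.memo(function ExpensiveList({ items }: Props) {
      // useMemo caches the sorted copy, so the sort only re-runs when `items` changes.
      const sorted = useMemo(() => [...items].sort((a, b) => a - b), [items]);

      return (
        <ul>
          {sorted.map((n) => (
            <li key={n}>{n}</li>
          ))}
        </ul>
      );
    });

    export default ExpensiveList;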

    The Copy-Paste Syndrome

    The mistake: Blindly copying AI-generated code without understanding what it does.

    Why it fails: When (not if) something breaks, you’ll have no idea how to fix it. Plus, you miss learning opportunities that make you a better developer.

    The fix: Always ask for explanations. “Explain what this code does and why you chose this approach.”

    What to do when shit inevitably hits the fan

    You may avoid all the pitfalls above and still see the AI go off track. It starts modifying files you didn’t want changed, adds unnecessary complexity, or ignores your constraints.

    The first thing you should do is hit the stop button. You can then let it know it’s going in the wrong direction. Even better, start a new conversation with clearer instructions and additional constraints.

    Another common pattern is when the AI makes a change, sees an error, tries to fix it, creates a new error, and gets stuck in a cycle of “fixes” that make things worse.

    If you see the same type of error being “fixed” multiple times, stop the process and revert to the last working state.

    Here are some other warning signs that things are going off track:

    • It keeps apologizing and starting over
    • Solutions get more complex instead of simpler
    • It suggests completely different approaches in each attempt
    • Error messages persist despite multiple “fixes”

    Then use one of the following debugging methods.

    The Logging Strategy

    When things aren’t working and you can’t figure out why:

    1. Ask the AI to add detailed logging
    2. Run the code and collect the output
    3. Paste the logs back to the AI
    4. Let it analyze what’s actually happening vs. what should happen

    Example prompt: “Add console.log statements to track the data flow through this function. I’ll run it and share the output so we can debug together.”
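
    As a rough sketch of what the AI might produce in response, here’s a hypothetical calculateTotal function instrumented with console.log statements. The function, its types, and the sample values are invented for illustration; the idea is simply to log every intermediate value so the output shows where reality diverges from your expectations.

    TypeScript
    // Hypothetical function, instrumented for debugging only.
    type LineItem = { price: number; quantity: number };

    function calculateTotal(items: LineItem[], taxRate: number): number {
      console.log("calculateTotal: items =", items, "taxRate =", taxRate);

      const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
      console.log("calculateTotal: subtotal =", subtotal);

      const total = subtotal * (1 + taxRate);
      console.log("calculateTotal: total =", total);

      return total;
    }

    // Run it once, then paste the logged output back into the chat.
    console.log(calculateTotal([{ price: 10, quantity: 2 }], 0.08));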

    The Rollback and Retry Method

    When the AI made changes that broke more than they fixed:

    1. Use Cursor’s built-in history to revert changes
    2. Identify what went wrong in your original prompt
    3. Start a new conversation with better context
    4. Be more specific about constraints and requirements

    The “Explain Your Thinking” Technique

    When the AI gives you code that seems wrong or overly complex:

    “Explain why you chose this approach. What are the trade-offs compared to [simpler alternative]?”

    Often the AI has good reasons you didn’t consider. Sometimes it reveals that there’s indeed a simpler way.

    The Test-Driven AI Approach

    TDD (Test-Driven Development) is a standard, well-established practice in web development. With vibe coding, however, it seems like people have forgotten about it.

    But, as the saying goes, prevention is better than cure. Following tried and tested practices like TDD will save you a ton of headache and rework.

    In fact, with AI, it becomes a superpower. AI can write tests faster than you can think of edge cases, and those tests become a quality guarantee for the generated code.

    This single prompt pattern will revolutionize how you build features:

    “Write comprehensive tests for [feature] first, then implement the code, then run the tests and iterate until all tests pass.”

    Here’s an example prompt for building a new React component:

    Plain text
    "Write tests that verify this component:
    1. Renders correctly with different props
    2. Handles user interactions properly
    3. Manages state changes
    4. Calls callbacks at the right times
    5. Handles error states gracefully
    
    Then implement the component to pass all tests."

    Watch this workflow in action:

    1. AI writes tests based on your requirements
    2. AI implements code to satisfy the tests
    3. Tests run automatically (with YOLO mode enabled)
    4. AI sees failures and fixes them iteratively
    5. You get working, tested code without writing a single test yourself
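
    To make step 1 concrete, here’s a minimal sketch of the kind of test file the AI might write before any implementation exists. It assumes a Jest + React Testing Library setup with the jest-dom matchers; the TodoItem component, its props, and its behavior are hypothetical and simply act as the spec the implementation then has to satisfy.

    TypeScript
    import React from "react"; // only needed if your setup uses the classic JSX transform
    import { render, screen, fireEvent } from "@testing-library/react";
    import "@testing-library/jest-dom"; // usually done once in a Jest setup file
    import TodoItem from "./TodoItem"; // hypothetical component, implemented after the tests

    describe("TodoItem", () => {
      it("renders the todo text it receives as a prop", () => {
        render(<TodoItem text="Buy milk" completed={false} onToggle={() => {}} />);
        expect(screen.getByText("Buy milk")).toBeInTheDocument();
      });

      it("calls onToggle when the checkbox is clicked", () => {
        const onToggle = jest.fn();
        render(<TodoItem text="Buy milk" completed={false} onToggle={onToggle} />);
        fireEvent.click(screen.getByRole("checkbox"));
        expect(onToggle).toHaveBeenCalledTimes(1);
      });

      it("shows completed items with a strikethrough style", () => {
        render(<TodoItem text="Buy milk" completed={true} onToggle={() => {}} />);
        expect(screen.getByText("Buy milk")).toHaveStyle("text-decoration: line-through");
      });
    });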

    Advanced Tips and Tricks

    The Bug Finder

    Hit Cmd+Shift+P (or Ctrl+Shift+P) and type “bug finder.” This feature compares your changes to the main branch and identifies potential issues you might have introduced.

    It’s not perfect, but it catches things like:

    • Unhandled null values
    • Missing error handling
    • Inconsistent variable usage
    • Logic errors in conditional statements

    Image Imports

    This one sounds fake until you try it. You can literally paste screenshots into Cursor’s chat and it will understand them. Take a screenshot of:

    • A UI mockup you want to build
    • An error message you’re getting
    • A design you want to replicate

    Paste it in the chat with your prompt and watch the AI work with visual information. It’s genuinely impressive.

    Tab Tab Tab

    Cursor’s tab completion doesn’t just complete your current line, it can suggest entire functions, predict what you’re about to write next, and even jump you to related code that needs updating.

    The AI analyzes your recent changes and predicts your next move. When it’s right (which is surprisingly often), it feels like magic.

    AI Models and Selection Strategy in Cursor

    Cursor offers access to the latest generation of AI models, each with distinct strengths and cost profiles that suit different development scenarios.

    Claude Sonnet 4 is my current go-to choice for most development tasks. It significantly improves on Sonnet 3.7’s capabilities, achieving a state-of-the-art 72.7% on SWE-bench. Use this for routine development tasks like building React components, writing API endpoints, or implementing standard features.

    Claude Opus 4 represents the premium tier for the most challenging problems. It is expensive but pays for itself in time saved when you’re tackling architectural decisions, complex refactoring across multiple files, or debugging particularly stubborn issues.

    OpenAI’s o3 is a good premium alternative and particularly strong in coding benchmarks, with the high-effort version achieving 49.3% on SWE-bench and excelling in competitive programming scenarios.

    GPT-4o remains a solid and cheaper alternative, especially for multilingual projects or when you need consistent performance across diverse tasks. While it tends to feel more generic compared to Claude’s natural style, it offers reliability and broad capability coverage.

    Gemini 2.5 Pro is also one of my favorites as it combines reasoning with coding, leading to much better performance. It’s also the cheapest and fastest of these models, though I use it primarily for planning out an app.

    In most cases, you’ll probably just use one model for the bulk of your work, like Sonnet 4 or GPT-4o, and upgrade to a more expensive model like o3 or Opus 4 for complex tasks.

    MCP and Integrations

    MCP (Model Context Protocol) connects Cursor to external tools and data sources, turning it into a universal development assistant. Need to debug an issue? Your AI can read browser console logs, take screenshots, and run tests automatically. Want to manage your project? It can create GitHub issues, update Slack channels, and query your database, all through natural conversation.

    What MCP is and how it works is beyond the scope of this already long article, so read my guide here. In this section I’ll explain how to set it up and which servers to use.

    Setting Up MCP in Cursor

    Getting started with MCP in Cursor involves creating configuration files that tell Cursor which MCP servers to connect to and how to authenticate with them.

    For project-specific tools, create a .cursor/mcp.json file in your project directory. This makes MCP servers available only within that specific project (perfect for database connections or project-specific APIs). For tools you want across all projects, add them in your settings.

    The configuration uses a simple JSON format. Here’s how to set up the GitHub MCP server:

    JSON
    {
      "mcpServers": {
        "github": {
          "command": "npx",
          "args": ["-y", "@modelcontextprotocol/server-github"],
          "env": {
            "GITHUB_PERSONAL_ACCESS_TOKEN": "your_token_here"
          }
        }
      }
    }

    Essential MCP Servers

    The MCP ecosystem has exploded with hundreds of available servers, but several have emerged as must-haves for serious development work.

    GitHub MCP Server – create issues, manage pull requests, search repositories, and analyze code changes directly within your coding conversation. When debugging, you can ask “what changed in the authentication module recently?” and get immediate insights without leaving your editor.

    Slack MCP Server – read channel discussions, post updates about builds or deployments, and even summarize daily standups. This becomes particularly powerful for debugging when team members report issues in Slack. Your AI can read the problem descriptions and immediately start investigating.

    PostgreSQL MCP Server gives your AI the ability to inspect schemas and execute read-only queries. You can ask “show me all users who logged in yesterday” or “analyze the performance of this query” and get immediate, accurate results.

    Puppeteer MCP Server gives your AI browser automation superpowers. When building web applications, your AI can take screenshots, fill forms, test user flows, and capture console errors automatically. This creates a debugging workflow where you describe a problem and watch your AI reproduce, diagnose, and fix it in real-time.

    File System MCP Server seems basic but proves incredibly useful for project management. Your AI can organize files, search across codebases, and manage project structures intelligently. Combined with other servers, it enables workflows like “analyze our React components for unused props and move them to an archive folder.”

    Advanced MCP Workflows in Practice

    The real power of MCP emerges when multiple servers work together to create sophisticated development workflows. Consider this scenario: you’re building a web application and users report a bug through Slack. Here’s how an MCP-enhanced Cursor session might handle it:

    First, the Slack MCP reads the bug report and extracts key details. Then, the GitHub MCP searches for related issues or recent changes that might be relevant. The File System MCP locates the relevant code files, while the PostgreSQL MCP checks if there are database-related aspects to investigate.

    Your AI can then use the Puppeteer MCP to reproduce the bug in a browser, capture screenshots showing the problem, examine console errors, and test potential fixes. Finally, it can create a detailed GitHub issue with reproduction steps, propose code changes, and post a summary back to Slack, all through natural conversation with you.

    This level of integration transforms debugging from a manual, time-consuming process into an assisted workflow where your AI handles the tedious investigation while you focus on architectural decisions and creative problem-solving.

    Custom MCP Server Creation

    While the existing ecosystem covers many common needs, building custom MCP servers for company-specific tools often provides the highest value. The process is straightforward enough that a developer can create a basic server in under an hour.

    Custom servers excel for internal APIs, proprietary databases, and specialized workflows. For example, a deployment pipeline MCP server could let your AI check build status, trigger deployments, and analyze performance metrics. A customer support MCP server might connect to your ticketing system, allowing AI to help triage issues or generate response templates.
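
    To give you a feel for how little code a basic server takes, here’s a minimal sketch, assuming the official TypeScript SDK (@modelcontextprotocol/sdk) with its McpServer and stdio transport API. The deployment-status tool, its argument, and its stubbed logic are invented for illustration; a real server would call your internal CI/CD API instead.

    TypeScript
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    // Hypothetical internal-tools server exposing a single deployment-status tool.
    const server = new McpServer({ name: "deploy-status", version: "0.1.0" });

    server.tool(
      "get_build_status",
      { pipeline: z.string() }, // the AI passes the pipeline name as an argument
      async ({ pipeline }) => {
        // Stubbed response; replace with a call to your CI/CD system.
        const status = pipeline === "main" ? "passing" : "unknown";
        return { content: [{ type: "text", text: `Pipeline ${pipeline}: ${status}` }] };
      }
    );

    // Cursor launches the server as a subprocess and talks to it over stdio.
    const transport = new StdioServerTransport();
    await server.connect(transport);

    You would then point Cursor at it from .cursor/mcp.json, the same way as the GitHub example above.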

    A real-world workflow

    Building real applications with Cursor requires a different mindset than traditional development. Instead of diving straight into code, you start by having conversations with your AI assistant about what you want to build.

    Let’s say we want to build a project management tool where teams can create projects, assign tasks, and track progress. It’s the kind of application that traditionally takes weeks, maybe months, to develop, but with Cursor’s AI-assisted approach, we can have a production-ready version in days.

    Foundation

    Traditional projects start with wireframes and technical specifications. With Cursor, you’d start with Agent mode and a conversation about what you’re trying to build. You describe the basic concept and use the context sandwich method we covered earlier:

    Context: “Building a team project management tool”
    Current state: “Just an idea, need MVP definition”
    Goal: “Users can create projects, assign tasks, track progress”
    Constraints: “3-week timeline, needs to scale later”

    The AI would break this down into clear MVP features and suggest a technology stack that balances rapid development with future scalability. More importantly, it would design a clean database schema with proper relationships.

    Save all of these documents in a folder in your project for the AI to reference later.

    Core Features

    Start building each feature one by one. Use the test-driven development approach I mentioned earlier, and start small with very specific context.

    Connect GitHub and Database MCP servers to let the AI commit code and inspect the database in real-time.

    You can even set up a Slack MCP for the AI to update you or read new tickets.

    Follow the same pattern for every feature – task tracking, user permissions, etc.

    Don’t forget to keep testing the product locally. Even with the test-driven approach, the AI might miss things, so ask it to use the logging technique described earlier to help debug potential issues.

    Productionizing

    As your app gets ready, you may want to start thinking about performance and production-readiness.

    Ask the AI to proactively analyze your app for potential failure points and implement comprehensive error handling.

    I also often ask it to find areas for refactoring and removing unnecessary code.

    For performance optimization, ask the AI to implement lazy loading, database indexing, and caching strategies while explaining the reasoning behind each decision.

    Launch and iterate

    The monitoring and debugging workflows we covered earlier would prove essential during launch week. The AI would have generated comprehensive logging and performance tracking, so when real users start using your app, you’d have visibility into response times, error rates, and user behavior patterns from day one.

    When users request features you hadn’t planned (keyboard shortcuts, bulk operations, calendar integration, etc) the iterative refinement approach combined with MCP would make these additions straightforward.

    Each new feature would build naturally on the existing patterns because the AI maintains architectural consistency while MCP servers provide the external context needed for complex integrations.

    Your Turn

    Hopefully this article demonstrates a fundamentally different approach to software development. Instead of fighting with tools and configurations, you’re collaborating with an AI partner that understands your goals and helps implement them efficiently.

    The skills you develop transfer to any technology stack: thinking architecturally, communicating requirements clearly, and iterating based on feedback. Most importantly, you gain confidence to tackle ambitious projects. When implementation details are handled by AI, you can focus on solving interesting problems and building things that matter.

    I’d love to support you as you continue on your journey. My blog is filled with detailed guides like this, so sign up below if you want the latest deep dives on AI.

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.

  • The Executive’s Guide to Agentic AI

    The Executive’s Guide to Agentic AI

    Microsoft CEO Satya Nadella recently declared that “we’ve entered the era of AI agents,” highlighting that AI models are now more capable and efficient thanks to groundbreaking advancements in reasoning and memory.

    Google recently announced a whole slew of new agentic tools in their recent I/O conference.

    Every major tech company is going all in on agents. 61% of CEOs say competitive advantage depends on who has the most advanced generative AI, and Gartner predicts that by 2028, at least 15% of daily work decisions will be made autonomously by agentic AI.

    If you’re an executive trying to understand what this means for your organization, this guide is for you. Let’s dive in.

    Understanding Agentic AI and Its Business Implications

    Agentic AI refers to a system or program that is capable of autonomously performing tasks on behalf of a user or another system by designing its workflow and using available tools.

    Unlike traditional AI that responds to prompts, agentic AI exhibits true “agency”, or the ability to:

    • Make autonomous decisions, analyze data, adapt, and take action with minimal human input
    • Use advanced reasoning in their responses, giving users a human-like thought partner
    • Process and integrate multiple forms of data, such as text, images, and audio
    • Learn from user behavior, improving over time

    When I talk to clients, I often tell them to treat an agent like an AI employee. A well-designed agent can take an existing, manual process, and completely automate it, leading to:

    • Productivity Gains: A Harvard Business School study showed consultants with access to Gen AI completed tasks 22% faster and 40% better
    • Decision Speed: Most C-suite leaders spend 40% of their time on routine approvals, like pricing decisions or supplier evaluations, which could be automated
    • Cost Reduction: Studies reveal that implementation of AI agents has led to over a 15% reduction in compliance costs and a more than 46% increase in revenue for numerous organizations

    Strategic Use Cases for Agentic AI

    Automating existing processes is the most obvious and low-hanging use case for organizations. Any business process that is manual, time-consuming, and does not require human judgement can and should be automated with an agent.

    Customer Experience Transformation

    Gartner predicts that agentic AI will autonomously resolve 80% of common customer service issues without human intervention by 2029, leading to a 30% reduction in operational costs:

    • 24/7 Customer Support: AI agents in call centers orchestrate intelligence and automation across multiple activities involved in serving customers, simultaneously analyzing customer sentiment, reviewing order history, accessing company policies and responding to customer needs
    • Personalized Engagement: AI agents can learn from previous interactions and adapt to individual requirements in real time, enabling greater personalization than ever before

    Knowledge Worker Augmentation

    A major bottleneck in many corporations is finding the right information at the right time and working with hundreds of documents across multiple platforms:

    • Document Processing: Dow built an autonomous agent to scan 100,000+ shipping invoices annually for billing inaccuracies, expecting to save millions of dollars in the first year
    • Sales Automation: Fujitsu’s AI agent boosted sales team productivity by 67% while addressing knowledge gaps and allowing them to build stronger customer relationships

    Supply Chain and Operations Automation

    The supply chain represents perhaps the most compelling use case for agentic AI, with the global AI in supply chain market projected to reach $157.6 billion by 2033.

    • Predictive Logistics: AI agents can autonomously optimize the transportation and logistics process by managing vehicle fleets, delivery routes and logistics on a large scale
    • Inventory Management: AI-powered supply-chain specialists can optimize inventories on the fly in response to fluctuations in real-time demand
    • Risk Management: AI agents regularly monitor world events like pandemics, political unrest, and economic shifts to assist companies in proactively managing supply chain risks

    Product and Service Innovation

    Development Acceleration: AI-powered virtual R&D assistants save researchers significant time by finding relevant academic papers, patents, and technical documents from large databases.

    Market Intelligence: Teams can gather data, identify trends, build marketing assets, inform research and move products to market faster using natural language prompts that reduce time from hours to seconds.

    Process Automation

    Every organization has hundreds of internal processes that are manual, time-consuming, and low value. Employees spend hours on them, from taking notes to copying data across platforms and creating reports, all of which could easily be done with AI agents.

    Most of my client work involves taking such processes and fully automating them, allowing employees to focus on higher value work. If you’re interested in this, contact me.

    Building the Foundation for Agentic AI

    Data Requirements

    72% of CEOs say leveraging their organization’s proprietary data is key to unlocking the value of generative AI, yet 50% say their organization has disconnected technology due to the pace of recent investments.

    Requirements:

    • Unified Data Platform: 68% say integrated enterprise-wide data architecture is critical to enable cross-functional collaboration and drive innovation
    • Data Quality Framework: Ensuring accuracy, completeness, and consistency
    • Real-time Integration: Breaking down data silos across systems
    • Security and Governance: Protecting sensitive information while enabling access

    Talent Requirements and Organizational Readiness

    Current Skills Gap: 46% of leaders identify skill gaps in their workforces as a significant barrier to AI adoption.

    Essential Roles for Agentic AI:

    • AI Ethics Officers: Ensuring fair and transparent operations
    • Human-AI Collaboration Specialists: Optimizing workflows between humans and AI
    • AI Trainers: Teaching AI systems nuance, context, and human values
    • Data Scientists and ML Engineers: Building and maintaining AI systems

    Training Imperatives: Nearly half of employees say they want more formal training and believe it is the best way to boost AI adoption.

    Process Redesign for Human-AI Collaboration

    Governance Frameworks: Only 22% of organizations that have established AI governance councils consistently track metrics related to bias detection, highlighting the need for robust oversight.

    Essential Elements:

    • Clear policies for AI use within the business
    • Training on AI systems and ethical implications
    • Processes for evaluating and rejecting AI proposals that conflict with company values
    • Regular bias detection and compliance monitoring

    Implementation Roadmap for Agentic AI

    Phase 1: Foundation and Pilot Selection (Months 1-6)

    The key to successful agentic AI implementation is starting with a clear strategy rather than jumping into the latest technology. Too many organizations are making the mistake of tool-first thinking when they should be focusing on problem-first approaches.

    Begin with a comprehensive AI readiness evaluation. This means honestly assessing your current data quality, infrastructure capabilities, and organizational readiness for change.

    When I work with my clients, I often start with surveys to understand the AI literacy of the organization, as well as the tech infrastructure to enable an AI transformation. This data helps us understand what skills or tech gaps we need to fill before moving ahead.

    I also identify high-impact, low-risk use cases where you can demonstrate clear business value while learning how these systems work in your environment.

    Download my AI Readiness Assessment

    These are the same surveys I use with my clients to identify skill gaps and close them.

    Phase 2: Pilot Deployment and Learning (Months 6-12)

    Deloitte predicts that 25% of companies using generative AI will launch agentic AI pilots or proofs of concept in 2025, growing to 50% in 2027. The organizations that succeed will be those that approach scaling strategically rather than opportunistically.

    Start with pilot projects in controlled environments where agentic AI use can be refined, then scale and integrate seamlessly into the bigger picture.

    Establish clear human oversight mechanisms, regular performance monitoring, and continuous feedback loops. Most importantly, invest heavily in employee training and support during this phase.

    Phase 3: Scaling and Integration (Months 12-24)

    Multi-agent orchestration represents the next level of sophistication. Instead of individual AI agents working in isolation, organizations are building systems where multiple agents collaborate to handle complex, multi-step processes.

    The key insight is that agentic AI works best when it’s integrated into existing workflows rather than replacing them entirely. The most successful implementations enhance human decision-making rather than eliminating it.

    Measuring Impact and ROI

    Only 52% of CEOs say their organization is realizing value from generative AI beyond cost reduction. This suggests that many organizations are measuring the wrong things or not measuring comprehensively enough.

    Here are some KPIs I recommend measuring to test if your Agents are delivering value:

    • Productivity Metrics: Time saved, tasks automated, output quality
    • Financial Impact: Cost reduction, revenue generation, ROI calculations
    • Employee Satisfaction: Adoption rates, training effectiveness, job satisfaction
    • Customer Experience: Response times, resolution rates, satisfaction scores

    Preparing for the Organizational Impact

    CEOs say 31% of the workforce will require retraining or reskilling within three years, and 54% say they’re hiring for roles related to AI that didn’t even exist a year ago.

    The workforce of the AI Agent era will need skills like:

    1. AI Literacy: Understanding capabilities, limitations, and ethical implications
    2. Human-AI Collaboration: Working effectively alongside AI agents
    3. Critical Thinking: Validating AI outputs and making strategic decisions
    4. Emotional Intelligence: Areas where humans maintain comparative advantage
    5. Continuous Learning: Adapting to rapidly evolving technology

    The half-life of technical skills is shrinking rapidly, and organizations need to create cultures where learning and adaptation are continuous processes rather than occasional events.

    Here are some training programs I conduct for clients:

    • Foundational AI concepts and applications
    • Hands-on experience with AI tools and platforms
    • Technical skills for building and managing AI agents

    Culture and Change Management Considerations

    Here’s an interesting statistic: 73% of executives believe their AI approach is strategic, while only 47% of employees agree. Even more concerning, 31% of employees admit to actions that could be construed as sabotaging AI efforts.

    This perception gap is perhaps the biggest obstacle to successful AI transformation. And it means leaders need to build trust and adoption with their teams:

    • Transparent Communication: Clear explanation of AI’s role and impact
    • Employee Involvement: Including staff in AI design and implementation
    • Psychological Safety: Creating environments where concerns can be voiced
    • Success Stories: Demonstrating AI’s value as augmentation, not replacement

    Two-thirds of C-suite executives report that generative AI adoption has led to division and tension within companies. Successful implementation requires:

    • Leadership commitment and visible support
    • Clear communication about AI’s role in the organization
    • Regular feedback and adjustment mechanisms
    • Recognition and rewards for successful AI adoption

    Strategic Priorities and Competitive Implications

    Microsoft recently introduced more than 50 announcements spanning its entire product portfolio, all focused on advancing AI agent technologies. Meanwhile, 32% of top executives place AI agents as the top technology trend in data and AI for 2025.

    The timeline for competitive advantage is compressed. Organizations beginning their agentic AI journey now will be positioned to lead their industries, while those that delay risk being permanently disadvantaged.

    Here’s a sample adoption timeline for 2025:

    • Q1-Q2 2025: Pilot programs and proof of concepts
    • Q3-Q4 2025: Limited production deployments
    • 2026-2027: Broad enterprise adoption
    • 2027+: Mature implementations and industry transformation

    Strategic Priorities for C-Suite Leaders

    1. Make Courage Your Core. 64% of CEOs say they’ll have to take more risk than their competition to maintain a competitive advantage. The key is building organizational flexibility and empowering teams to experiment.

    2. Embrace AI-Fueled Creative Destruction. 68% of CEOs say AI changes aspects of their business that they consider core. Leaders must be willing to fundamentally rethink business models and operations.

    3. Ignore FOMO, Lean into ROI. 65% of CEOs say they prioritize AI use cases based on ROI. Focus on practical applications that create competitive moats and generate measurable returns.

    4. Cultivate a Vibrant Data Environment. Invest in unified data architectures that can support autonomous AI operations while maintaining security and governance.

    5. Borrow the Talent You Can’t Buy. 67% of CEOs say differentiation depends on having the right expertise in the right positions. Build partnerships to access specialized AI capabilities.

    Competitive Implications of Early vs. Late Adoption

    Early Adopter Advantages:

    • Market Positioning: Early adopters will gain a substantial advantage—but success requires a strategic and experimental approach
    • Talent Attraction: Access to top AI talent before market saturation
    • Data Advantage: More time to accumulate training data and refine models
    • Customer Relationships: First-mover advantage in AI-enhanced customer experiences

    Risks of Late Adoption:

    • Competitive Disadvantage: 64% of CEOs say the risk of falling behind drives them to invest in some technologies before they have a clear understanding of the value they bring
    • Talent Scarcity: Difficulty attracting AI-skilled professionals
    • Higher Implementation Costs: Premium for late-stage adoption
    • Operational Inefficiency: Competing against AI-optimized operations

    Strategic Recommendations:

    1. Start Immediately: Begin with low-risk pilot programs while building foundational capabilities
    2. Invest in Data: Prioritize data quality and integration as the foundation for agentic AI
    3. Build Partnerships: Collaborate with technology providers and consultants to accelerate deployment
    4. Focus on Change Management: Invest heavily in employee training and cultural transformation
    5. Plan for Scale: Design initial implementations with enterprise-wide scaling in mind

    Conclusion: The Imperative for Action

    The transition to agentic AI represents the most significant technological shift since the advent of the internet. CEOs are often pushing AI adoption faster than some employees are comfortable with, underscoring the need to lead people through the changes.

    The window for strategic advantage is narrowing. By 2028, at least 15% of daily work decisions will be made autonomously by agentic AI. Organizations that begin their agentic AI journey now will be positioned to lead their industries, while those that delay risk being left behind.

    Key Takeaways for C-Suite Leaders:

    1. Agentic AI is not optional—it’s an inevitability that will reshape competitive landscapes
    2. Success requires holistic transformation—technology, people, processes, and culture must evolve together
    3. Early action is critical—the advantages of being among the first adopters far outweigh the risks
    4. Human-AI collaboration is the goal—augmentation, not replacement, should guide implementation strategies
    5. Continuous learning is essential—both for AI systems and human workers

    The question isn’t whether agentic AI will transform your industry, it’s whether your organization will be leading or following that transformation.

    If you want to be leading the transformation, book a free consultation call with me. I’ve worked with multiple organizations to lead them through this.

  • The Executive’s Guide to Upskilling Your Workforce for AI

    The Executive’s Guide to Upskilling Your Workforce for AI

    Recent research by McKinsey shows that 31% of the workforce will require retraining or reskilling within the next three years. With companies rushing to become AI-first, I’m not surprised. In fact, I think that number should be higher.

    Much like digital literacy became essential in the early 2000s, AI literacy is the new baseline for workforce competence. Organizations that fail to develop AI skills will fall behind competitors who leverage AI to enhance productivity, drive innovation, and deliver superior customer experiences.

    This guide offers a comprehensive roadmap for executives seeking to transform their workforce for the AI era. We’ll examine practical strategies for conducting skills gap analyses, developing talent through multiple channels, creating a learning culture, empowering change champions, and addressing AI anxiety.

    Each section provides actionable frameworks backed by research and case studies, enabling you to immediately apply these approaches within your organization.

    Book a free consultation

    If you’re looking for customized training programs for your employees, book a free consultation call with me. I’ve trained dozens of organizations and teams on becoming AI experts.

    Section 1: Conducting an AI Skills Gap Analysis

    Where Do You Want to Be?

    Before launching any training initiative, you must first understand the specific AI-related skills your organization requires. When working with my clients, I’ve identified three categories of AI skills that companies need:

    Foundational AI Literacy (All Employees)

    In my opinion, this is table-stakes. Every employee in your company needs to have basic AI literacy, the same way they need to have basic computer literacy.

    • Understanding basic AI concepts and terminology
    • Recognizing appropriate use cases for AI tools
    • Effective prompt engineering and interaction with AI assistants
    • Critical evaluation of AI outputs and limitations
    • Awareness of ethical considerations and responsible AI use

    Intermediate AI Skills (Domain Specialists)

    As you go deeper into your AI transformation, you’ll want to start automating processes and integrating AI deeper into workflows. This means training a percentage of your workforce on AI automation and AI agents.

    Ideally, these are domain specialists who understand the workflows well enough to design automations for them.

    • Ability to identify automation opportunities within specific workflows
    • Data preparation and quality assessment
    • Collaboration with technical teams on AI solution development
    • Integration of AI tools into existing processes
    • Performance monitoring and feedback provision

    Advanced AI Expertise (Technical Specialists)

    Finally, for organizations that are building AI products and features, the following skills are absolutely necessary.

    • AI ethics implementation and compliance
    • AI system design and implementation
    • Model selection, training, and fine-tuning
    • AI infrastructure management and optimization
    • Data architecture and governance for AI

    Where are you now?

    The next step is understanding your organization’s current AI capabilities. When working with clients, I often start with a survey to leadership and employees.

    My Leadership Capability Assessment evaluates executive understanding of AI potential and limitations, and assesses their ability to develop and execute AI strategy.

    My Workforce Literacy Survey measures baseline understanding of AI concepts across the organization, and assesses comfort levels with AI tools and applications.

    For organizations that are building AI products and features, create a Technical Skills Inventory to document existing data science, machine learning, and AI engineering capabilities, map current technical skills against future needs, and identify training needs for different technical roles.

    I also recommend an overall Organizational Readiness Assessment to evaluate data infrastructure and governance maturity, assess cross-functional collaboration capabilities, and review change management processes and effectiveness.

    At this point, it becomes fairly obvious where the gaps are in where you are right now and where you want to be.

    Download my Leadership Capability Assessment and Workforce Literacy Survey

    Download the exact surveys I use with my clients to measure your organization’s current AI capabilities

    Create A development plan

    I then create a custom skills development plan to close the gap. Here’s a sample timeline I draw up for clients, although this depends heavily on how fast you move and how big your organization is.

    | Time Horizon | Priority Skills | Target Audience | Business Impact |
    | --- | --- | --- | --- |
    | 0-3 months | AI literacy, foundational concepts, AI tool usage | All employees | Improved AI adoption, reduced resistance |
    | 3-6 months | Role-specific AI applications, workflow integration | Department leaders, domain experts | Process optimization, efficiency gains |
    | 6-12 months | Advanced AI development, AI system design, AI ethics implementation | Technical specialists, innovation teams | New product/service development, competitive differentiation |
    | 12+ months | Emerging AI capabilities, human-AI collaboration, AI governance | Executive leadership, strategic roles | Business model transformation, market leadership |

    I suggest running the skills gap analysis every quarter and re-evaluating. The pace at which AI is developing requires continuous up-skilling and training in the latest technologies.

    Section 2: The Build, Buy, Bot, Borrow Model for AI Talent

    As your organization develops its AI capabilities, you’ll need a multi-pronged approach to talent acquisition and development. The “Build, Buy, Bot, Borrow” framework offers a comprehensive strategy for addressing AI talent needs. This model provides flexibility while ensuring you have the right capabilities at the right time.

    Building Internal Talent Through Training and Development

    Internal talent development should be your cornerstone strategy, as it leverages existing institutional knowledge while adding new capabilities. Develop an organizational learning strategy that includes:

    Tiered Learning Programs

    • Level 1: AI Fundamentals – Basic AI literacy for all employees
    • Level 2: AI Applications – Role-specific training on using AI tools
    • Level 3: AI Development – Specialized technical training for selected roles
    • Level 4: AI Leadership – Strategic AI implementation for executives and managers

    Experiential Learning Opportunities

    • AI hackathons and innovation challenges
    • Rotation programs with AI-focused teams
    • Mentorship from AI experts
    • Applied learning projects with measurable outcomes

    Learning Ecosystems

    • On-demand microlearning resources
    • Self-paced online courses and certifications
    • Cohort-based intensive bootcamps
    • Executive education partnerships

    Many organizations are finding that the “build” strategy offers the best long-term return on investment. I’ll dive deeper into how to build AI talent in later sections.

    Strategic Hiring for Specialized AI Roles

    Despite your best efforts to build internal talent, some specialized AI capabilities may need to be acquired through strategic hiring. This includes AI/ML engineers, data scientists, and AI integration specialists.

    To develop an effective hiring strategy for AI roles:

    1. Focus on specialized competencies rather than general AI knowledge
      • Identify the specific AI capabilities required for your business objectives (from the skills gap analysis above)
      • Create detailed skill profiles for each specialized role
      • Develop targeted assessment methods to evaluate candidates
    2. Look beyond traditional sources of talent
      • Partner with universities and research institutions with strong AI programs
      • Engage with AI communities and open-source projects
      • Consider talent from adjacent fields with transferable skills
    3. Create an AI-friendly work environment
      • Provide access to high-performance computing resources
      • Establish clear AI ethics and governance frameworks
      • Support ongoing professional development in rapidly evolving AI domains
      • Build a culture that values AI innovation and experimentation
    4. Develop competitive compensation strategies
      • Create flexible compensation packages that reflect the premium value of AI expertise
      • Consider equity or profit-sharing for roles that directly impact business outcomes
      • Offer unique perks valued by the AI community, such as conference attendance or research time

    Using AI to Augment Existing Workforce Capabilities

    The “bot” aspect of the framework involves strategic deployment of AI tools, automations, and agents to amplify the capabilities of your existing workforce. This approach offers several advantages:

    1. AI agents can handle routine tasks, freeing employees to focus on higher-value work
    2. AI tools can provide just-in-time knowledge, enabling employees to access specialized information when needed
    3. AI can augment decision-making, helping employees make more informed choices

    Implement these strategies to effectively leverage AI for workforce augmentation:

    AI Agents

    • Map existing processes to identify routine, time-consuming tasks suitable for AI automation
    • Deploy AI agents for common tasks like scheduling, report generation, and data summarization
    • Create seamless handoffs between AI and human components of workflows

    Knowledge Augmentation

    • Implement AI-powered knowledge bases that can answer domain-specific questions
    • Deploy contextual AI assistants that provide relevant information during decision-making processes
    • Create AI-guided learning paths that help employees develop new skills

    Decision Support

    • Develop AI models that can analyze complex data and provide recommendations
    • Implement scenario-planning tools that help employees visualize potential outcomes
    • Create AI-powered dashboards that provide real-time insights into business performance

    I highly recommend developing AI automations and agents in parallel with employee up-skilling programs. Low-hanging automations can be deployed in weeks and provide immediate benefits.

    This is why so many major tech companies are going all in on agents and have paused hiring. If you’re interested in how to find opportunities to do this in your organization and design effective agents, read my guide here.

    Borrowing Talent through Strategic Partnerships

    The final component of the talent strategy involves “borrowing” specialized AI capabilities through strategic partnerships. This approach is particularly valuable for accessing scarce expertise or handling short-term needs.

    Strategic Vendor Relationships

    • Evaluate AI platform providers based on their domain expertise, not just their technology
    • Develop deep partnerships with key vendors that include knowledge transfer components
    • Create joint innovation initiatives with strategic technology partners

    Consulting and Professional Services

    • Engage specialized AI consultants for specific, high-value projects
    • Use professional services firms to accelerate implementation of AI initiatives
    • Partner with boutique AI firms that have deep expertise in your industry

    Academic and Research Partnerships

    • Collaborate with university research labs on cutting-edge AI applications
    • Sponsor academic research in areas aligned with your strategic priorities
    • Participate in industry consortia focused on AI standards and best practices

    Talent Exchanges

    • Create temporary talent exchange programs with non-competing organizations
    • Develop rotational programs with technology partners
    • Participate in open innovation challenges to access diverse talent pools

    The borrowed talent approach offers several advantages:

    1. Access to specialized expertise that would be difficult or expensive to develop internally
    2. Flexibility to scale AI capabilities up or down based on business needs
    3. Exposure to diverse perspectives and industry best practices
    4. Reduced risk in exploring emerging AI technologies

    By strategically combining the build, buy, bot, and borrow approaches, organizations can develop a comprehensive AI talent strategy that provides both depth in critical areas and breadth across the organization.

    Download my Leadership Capability Assessment and Workforce Literacy Survey

    Download the exact surveys I use with my clients to measure your organization’s current AI capabilities

    Section 3: Creating an AI Learning Culture

    Let’s dive into how you can up-skill employees and build AI talent internally, as I mentioned above.

    AI training cannot follow a one-size-fits-all approach. Different roles require different types and levels of AI knowledge and skills. From my client work, I have identified three primary audience segments:

    Executive Leadership

    • Focus Areas: Strategic AI applications, ethical considerations, governance, ROI measurement
    • Format Preferences: Executive briefings, peer discussions, case studies
    • Key Outcomes: Ability to set AI strategy, evaluate AI investments, and lead organizational change

    Managers and Team Leaders

    • Focus Areas: Identifying AI use cases, managing AI-enabled teams, process redesign
    • Format Preferences: Applied workshops, collaborative problem-solving, peer learning
    • Key Outcomes: Ability to identify AI opportunities, guide implementation, and support team adoption

    Individual Contributors

    • Focus Areas: Hands-on AI tools, domain-specific applications, ethical use of AI
    • Format Preferences: Interactive tutorials, practical exercises, on-the-job application
    • Key Outcomes: Proficiency with relevant AI tools, ability to integrate AI into daily workflows

    For each segment, design targeted learning experiences that address their specific needs and preferences. Here’s an example of what I recommend to clients:

    Basic
    • Executive Leadership: AI Strategy Overview (2 hours)
    • Managers / Team Leaders: AI for Team Leaders (2 hours)
    • Individual Contributors: AI Fundamentals (2 hours)

    Intermediate
    • Executive Leadership: AI Governance Workshop (2 hours)
    • Managers / Team Leaders: AI Use Case Design (4 hours)
    • Individual Contributors: AI Tools Bootcamp (8 hours)

    Advanced
    • Executive Leadership: AI Investment Roundtable (2 hours)
    • Managers / Team Leaders: AI-Enabled Transformation (8 hours)
    • Individual Contributors: Domain-Specific AI Training (8 hours)

    But AI training does not stop there. AI is always evolving, so a one-time training program is insufficient. Many organizations struggle with the pace of change in AI, with capabilities evolving faster than they can adapt.

    This means you need to foster a continuous learning mindset:

    Leadership Modeling

    • Executives should openly share their own AI learning journeys
    • Leaders should participate in AI training alongside team members
    • Management should recognize and reward ongoing skill development

    Learning Infrastructure

    • Create dedicated time for AI learning (e.g., “Learning Fridays”)
    • Develop peer learning communities around AI topics
    • Establish AI learning hubs that curate and share relevant resources

    Growth Mindset Development

    • Promote the belief that AI capabilities can be developed through effort
    • Encourage experimentation and learning from failures
    • Recognize improvement and progress, not just achievement

    I’ve found it’s a lot easier to create and maintain an AI learning culture when there are champions and go-to experts in the organization driving this culture.

    I often advise clients to identify these AI champions and empower them by creating AI leadership roles, providing them with advanced training and resources, and creating a clear mandate that defines their responsibility for driving AI adoption.

    These AI champions should be included in AI strategy development, use case and implementation approaches, and vendor selection and evaluation processes.

    Other ways to sustain this learning culture and increase AI adoption that have worked well for my clients are:

    1. Incentivize AI adoption through recognition programs and financial incentives
    2. Create mentorship programs and group learning cohorts within the company
    3. Establish communities based on specific business functions (marketing AI, HR AI, etc.)
    4. Implement hackathons and innovation challenges
    5. Create knowledge repositories for AI use cases and lessons learned

    Section 4: Addressing AI Anxiety and Resistance

    Despite growing enthusiasm for AI, 41% of employees remain apprehensive about its implementation. Understanding these concerns is essential for effective intervention.

    Key factors driving AI anxiety include:

    • Fear of Job Displacement – Concerns about automation replacing human roles and uncertainty about future career paths
    • Security and Privacy Concerns – Worries about data protection and cybersecurity risks
    • Performance and Reliability Issues – Skepticism about AI accuracy and reliability and fears of over-reliance on imperfect systems
    • Skills and Competency Gaps – Concerns about keeping pace with change

    One of the most effective ways to allay these fears is to demonstrate how the technology augments human capabilities rather than replacing them. This approach shifts the narrative from job displacement to job enhancement.

    Pilot Projects with Visible Benefits

    • Implement AI solutions that address known pain points
    • Focus initial applications on automating tedious, low-value tasks
    • Showcase how AI frees up time for more meaningful work

    Skills Enhancement Programs

    • Develop training that shows how AI can enhance professional capabilities
    • Create clear pathways for employees to develop new, AI-complementary skills
    • Emphasize the increased value of human judgment and creativity in an AI-enabled environment

    Role Evolution Roadmaps

    • Work with employees to envision how their roles will evolve with AI
    • Create transition plans that map current skills to future requirements
    • Provide examples of how similar roles have been enhanced by AI in other organizations

    Shared Success Metrics

    • Develop metrics that track both AI performance and human success
    • Share how AI implementation impacts team and individual objectives
    • Create incentives that reward effective human-AI collaboration

    A common pitfall is focusing too narrowly on productivity gains. The McKinsey report notes that “If CEOs only talk about productivity they’ve lost the plot,” suggesting that organizations should emphasize broader benefits like improved customer experience, new growth opportunities, and enhanced decision-making.

    Conclusion: Implementing an Enterprise-Wide Upskilling Initiative

    Timeline for Implementation

    Creating an AI-ready workforce requires a structured, phased approach. Here’s a sample timeline I’ve implemented for my clients:

    Phase 1: Assessment and Planning (1 month)

    • Conduct an AI skills gap analysis across the organization
    • Develop a comprehensive upskilling strategy aligned with business objectives
    • Build executive sponsorship and secure necessary resources
    • Establish baseline metrics for measuring progress

    Phase 2: Infrastructure and Pilot Programs (2-3 months)

    • Develop learning infrastructure (platforms, content, delivery mechanisms)
    • Identify and train initial AI champions across departments
    • Launch pilot training programs with high-potential teams
    • Collect feedback and refine approach based on early learnings

    Phase 3: Scaled Implementation (3-6 months)

    • Roll out tiered training programs across the organization
    • Activate formal mentorship programs and communities of practice
    • Implement recognition systems for AI skill development
    • Begin integration of AI skills into performance management processes

    Phase 4: Sustainability and Evolution (6+ months)

    • Establish continuous learning mechanisms for emerging AI capabilities
    • Develop advanced specialization tracks for technical experts
    • Create innovation programs to apply AI skills to business challenges
    • Regularly refresh content and approaches based on technological evolution

    This phased approach allows organizations to learn and adapt as they go, starting with focused efforts and expanding based on successful outcomes. The timeline above is very aggressive and may need adjustment based on organizational size, industry complexity, and the current state of AI readiness.

    Key Performance Indicators for Measuring Workforce Readiness

    To evaluate the effectiveness of AI upskilling initiatives, organizations should establish a balanced set of metrics that capture both learning outcomes and business impact. Based on my client work, I’ve found that KPIs should include:

    Learning and Adoption Metrics

    • Percentage of employees completing AI training by role/level
    • AI tool adoption rates across departments
    • Number of AI use cases identified and implemented by teams
    • Employee self-reported confidence with AI tools

    Operational Metrics

    • Productivity improvements in AI-augmented workflows
    • Reduction in time spent on routine tasks
    • Quality improvements in AI-assisted processes
    • Decrease in AI-related support requests over time

    Business Impact Metrics

    • Revenue generated from AI-enabled products or services
    • Cost savings from AI-enabled process improvements
    • Customer experience improvements from AI implementation
    • Innovation metrics (number of new AI-enabled offerings)

    Cultural and Organizational Metrics

    • Employee sentiment toward AI (measured through surveys)
    • Retention rates for employees with AI skills
    • Internal mobility of employees with AI expertise
    • Percentage of roles with updated AI skill requirements

    Organizations should establish baseline measurements before launching upskilling initiatives and track progress at regular intervals.

    Long-term Talent Strategy Considerations

    As organizations look beyond immediate upskilling needs, several strategic considerations emerge for long-term AI talent management:

    Evolving Skill Requirements

    • Regularly reassess AI skill requirements as technology evolves
    • Develop capabilities to forecast emerging skills needs
    • Create flexible learning systems that can quickly incorporate new content

    Talent Acquisition Strategy

    • Redefine job descriptions and requirements to attract AI-savvy talent
    • Develop AI skills assessment methods for hiring processes
    • Create compelling employee value propositions for technical talent

    Career Path Evolution

    • Design new career paths that incorporate AI expertise
    • Create advancement opportunities for AI specialists
    • Develop hybrid roles that combine domain expertise with AI capabilities

    Organizational Structure Adaptation

    • Evaluate how AI impacts traditional reporting relationships
    • Consider new organizational models that optimize human-AI collaboration
    • Develop governance structures for AI development and deployment

    Cultural Transformation

    • Foster a culture that values continuous learning and adaptation
    • Promote cross-functional collaboration around AI initiatives
    • Build ethical frameworks for responsible AI use

    Final Thoughts

    AI is going to shock the system in an even bigger way than computers or the internet. So creating an AI-ready workforce requires a comprehensive organizational transformation.

    By conducting thorough skills gap analyses, implementing the “build, buy, bot, borrow” model for talent development, creating a continuous learning culture, and addressing AI anxiety with empathy and transparency, organizations can position themselves for success in the AI era.

    I’ve worked with dozens of organizations to help them with this. Book me for a free consultation call and I can help you too.

  • The Ultimate Guide to Designing And Building AI Agents

    The Ultimate Guide to Designing And Building AI Agents

    At the start of this year, Jensen Huang, CEO of Nvidia, said 2025 will be the year of the AI agent. Many high-profile companies like Shopify and Duolingo have reinvented themselves with AI at its core, building internal systems and agents to automate processes and reduce headcount.

    I spent the last 3 years running a Venture Studio that built startups with AI at the core. Prior to that I built one of the first AI companies on GPT-3. And now I consult for companies on AI implementation. Whether you’re a business leader looking to automate complex workflows or an engineer figuring out the nuts and bolts, this guide contains the entire process I use with my clients.

    This is a non-technical guide and is model and framework agnostic. If you’re looking for technical implementations, I have guides on the OpenAI Agent SDK, Google’s Agent Development Kit, and CrewAI.

    The purpose of this guide is to help you identify where agents will be useful in your organization, and how to design them to produce real business results. Much like you design a product before building it, this should be your starting point before building an agent.

    Let us begin.

    PS – I’ve put together a 5-day email course where I walk through designing and implementing a live AI agent using no-code tools. Sign up below.

    What Makes a System an “Agent”?

    No, that automation you built with Zapier is not an AI agent. Neither is the chatbot you have on your website.

    An AI agent is a system that independently accomplishes tasks on your behalf with minimal supervision. Unlike passive systems that just respond to queries or execute simple commands, agents proactively make decisions and take actions to accomplish goals.

    Think of it like a human intern or an analyst. It can do what they can, except get you coffee.

    How do they do this? There are 4 main components to an AI agent – the model, the instructions, the tools, and the memory. We’ll go into more detail later on, but here’s a quick visual on how they work.

    The model is the core component. This is an AI model like GPT, Claude, Gemini or whatever, and it starts when it is invoked or triggered by some action.

    Some agents get triggered by a chat or phone call. You’ve probably come across these. Others get triggered when a button is clicked or a form is submitted. Some even get triggered through a cron job at regular intervals, or an API call from another app.

    For example, this content creation agent I built for a VC fund gets triggered when a new investment memo is uploaded to a form.

    When triggered, the model uses the instructions it has been given to figure out what to do. In this case, the instructions tell it to analyze the memo, research the company, remove sensitive data, and convert it into a blog post.

    To do this, the agent has access to tools such as a web scraper that finds information about the company. It loops through these tools and finally produces a blog post, using its memory of the fund’s past content to write in their tone and voice.

    You can see how this is different from a regular automation, where you define every step. Even if you use AI in your automation, it’s one step in a sequence. With an agent, the AI forms the central component, decides which steps to perform, and then loops through them until the job is done.

    We’ll cover how to structure these components and create that loop later. But first…

    Do You Really Need an AI Agent?

    Most of the things you want automated don’t really need an AI agent. You can trigger email followups, schedule content, and more through basic automation tools.

    Rule of thumb: if a process can be fully captured in a flowchart with no ambiguity or judgment calls, traditional automation is likely more efficient and far more cost-effective.

    I also generally advise against building AI agents for high-stakes decisions where an error could be extremely costly, or there’s a legal requirement to provide explainability and transparency.

    When you exclude processes that are too simple or too risky, you’re left with good candidates for AI Agents. These tend to be:

    1. Processes where you have multiple variables, shifting context, plenty of edge cases, or decision criteria that can’t be captured with rules. Customer refund approvals are a good example.
    2. Processes that resemble a tangled web of if-then statements with frequent exceptions and special cases, like vendor security reviews.
    3. Processes that involve significant amounts of unstructured data, like natural language understanding, reading documents, analyzing text or images, and so on. Insurance claims processing is a good example.

    A VC fund I worked with wanted to automate some of their processes. We excluded simple ones like pitch deck submission (can be done through a Typeform with CRM integration), and high-stakes ones like making the actual investment decisions.

    We then built AI agents to automate the rest, like a Due Diligence Agent (research companies, founders, markets, and competition, to build a thorough investment memo) and the content generation agent I mentioned earlier.

    Practical Identification Process

    To systematically identify agent opportunities in your organization, follow this process:

    1. Catalog existing processes
      • Document current workflows, especially those with manual steps
      • Note pain points, bottlenecks, and error-prone activities
      • Identify processes with high volume or strategic importance
    2. Evaluate against the criteria above
      • Score each process on complexity, reasoning requirements, tool access, etc.
      • Eliminate clear mismatches (too simple, too risky, etc.)
      • Prioritize high-potential candidates
    3. Assess feasibility
      • Review available data and system integrations
      • Evaluate current documentation and process definitions
      • Consider organizational readiness and potential resistance
    4. Calculate potential ROI
      • Estimate current costs (time, errors, delays)
      • Project implementation and ongoing costs
      • Quantify potential benefits (efficiency, quality, scalability)
    5. Start small and target quick wins
      • Begin with bounded, lower-risk opportunities
      • Focus on areas with clear metrics for success
      • Build capabilities and confidence with each implementation

    Remember that the best agent implementations often start with a clear problem to solve rather than a technology looking for an application.

    Contact me if you need help with this

    I offer free process audits to help companies identify where they can build agents and reduce wasted time. Book a time with me here.

    Agent Architecture & Design Principles

    Remember that loop I mentioned earlier? That’s our architecture pattern and will tell us how to select and connect the 4 core components of our agent.

    Simple “Loop and Fetch” Architecture

    The most basic agent architecture follows a straightforward loop:

    1. Receive input (from a user or another system or a cron job)
    2. Process the input using an AI model (with guidance from instructions)
    3. Determine the next action (respond directly or call a tool)
    4. Execute the action (use memory if needed)
    5. Observe the result (check against instructions)
    6. Loop back to step 2

    This pattern works well for simpler agents with limited tool sets and straightforward workflows. It’s easy to implement and reason about, making it a good starting point for many projects. It’s also the one I used for the content agent I mentioned earlier.

    Here’s a conceptual example:

    Plaintext
    function runAgent(input, context) {
      while (true) {
        // Process input with LLM
        const llmResponse = model.process(input, context)
        
        // Check if the LLM wants to use a tool
        if (llmResponse.hasTool) {
          // Execute the tool
          const toolResult = executeTool(llmResponse.tool, llmResponse.parameters)
          
          // Update context with the result
          context.addToolResult(toolResult)
          
          // Continue the loop with the tool result as new input
          input = toolResult
        } else {
          // No tool needed, return the response
          return llmResponse.message
        }
      }
    }

    ReAct-Style Reasoning Frameworks

    ReAct (Reasoning and Acting) frameworks enhance the basic loop with more explicit reasoning steps. Rather than immediately jumping to actions, the agent follows a more deliberate process:

    1. Thought: Reason about the current state and goal
    2. Action: Decide on a specific action to take
    3. Observation: Observe the result of the action
    4. Repeat: Continue this cycle until the goal is achieved

    The key difference between this and the simple loop is the agent thinks explicitly about each step, making its reasoning more transparent and often leading to better decision-making for complex tasks. This is the architecture often used in research agents, like the Deep Research feature in Gemini and ChatGPT.
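
    To make the thought/action/observation cycle concrete, here’s a minimal Python sketch. It isn’t tied to any particular framework: the think callable stands in for your model call, and the tool functions are whatever your agent has access to, so treat the names as placeholders.

    Python
    from typing import Callable, Dict, List

    def react_loop(goal: str,
                   think: Callable[[str, List[dict]], dict],      # placeholder for your LLM call
                   tools: Dict[str, Callable[[str], str]],
                   max_steps: int = 10) -> str:
        """Run an explicit thought -> action -> observation cycle until the goal is met."""
        scratchpad: List[dict] = []  # running record of thoughts, actions, and observations

        for _ in range(max_steps):
            # Thought: the model reasons about the current state and picks the next step
            step = think(goal, scratchpad)

            if step.get("final_answer"):
                return step["final_answer"]

            # Action: execute the tool the model chose, with the input it chose
            observation = tools[step["tool"]](step["tool_input"])

            # Observation: record the result so the next thought can build on it
            scratchpad.append({
                "thought": step["thought"],
                "action": step["tool"],
                "observation": observation,
            })

        return "Stopped: reached the step limit before completing the goal."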

    I custom-built this for a SaaS client that was spending a lot of time on research for their long-form blog content –

    Hierarchical Planning Structures

    For more complex workflows, hierarchical planning separates high-level strategy from tactical execution:

    1. A top-level planner breaks down the overall goal into major steps
    2. Each step might be further decomposed into smaller tasks
    3. Execution happens at the lowest level of the hierarchy
    4. Results flow back up, potentially triggering replanning

    This architecture excels at managing complex, multi-stage workflows where different levels of abstraction are helpful. For example, a document processing agent might:

    • At the highest level, plan to extract information, verify it, and generate a report
    • At the middle level, break “extract information” into steps for each document section
    • At the lowest level, execute specific extraction tasks on individual paragraphs
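
    As a rough sketch of how those levels can fit together in code (the plan, decompose, and execute callables are placeholders for model calls at each level, not any specific library):

    Python
    from typing import Callable, Dict, List

    def hierarchical_agent(goal: str,
                           plan: Callable[[str], List[str]],       # top-level planner
                           decompose: Callable[[str], List[str]],  # mid-level breakdown
                           execute: Callable[[str], str]) -> Dict[str, List[str]]:
        """Plan major steps, decompose each into tasks, and execute at the lowest level."""
        results: Dict[str, List[str]] = {}

        for major_step in plan(goal):               # e.g. "extract information", "verify", "report"
            task_results = []
            for task in decompose(major_step):      # e.g. one task per document section
                task_results.append(execute(task))  # the lowest level does the actual work
            results[major_step] = task_results
            # In a fuller version, results flowing back up here could trigger replanning

        return results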

    Memory-Augmented Frameworks

    Memory-augmented architectures extend basic agents with sophisticated memory systems:

    1. Before processing input, the agent retrieves relevant information from memory
    2. The retrieved context enriches the agent’s reasoning
    3. After completing an action, the agent updates its memory with new information

    This approach is particularly valuable for:

    • Personalized agents that adapt to individual users over time
    • Knowledge-intensive tasks where retrieval of relevant information is critical
    • Interactions that benefit from historical context
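
    Here’s a minimal sketch of that retrieve, reason, store cycle. The retrieval, response, and storage functions are placeholders for whatever memory backend and model you use.

    Python
    from typing import Callable, List

    def memory_augmented_turn(user_input: str,
                              retrieve: Callable[[str], List[str]],      # pull relevant memories
                              respond: Callable[[str, List[str]], str],  # model call with context
                              store: Callable[[str], None]) -> str:      # write back to memory
        """One turn of a memory-augmented agent."""
        # 1. Retrieve relevant information from memory before reasoning
        memories = retrieve(user_input)

        # 2. Reason with the retrieved context included in the prompt
        answer = respond(user_input, memories)

        # 3. Update memory with the new exchange for future turns
        store(f"User: {user_input}\nAgent: {answer}")
        return answer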

    Multi-Agent Cooperative Systems

    Sometimes the most effective approach involves multiple specialized agents working together:

    1. A coordinator agent breaks down the overall task
    2. Specialized agents handle different aspects of the workflow
    3. Results are aggregated and synthesized
    4. The coordinator determines next steps or delivers final outputs

    This architecture works well when different parts of a workflow require substantially different capabilities or tool sets. For example, a customer service system might employ:

    • A documentation agent to retrieve relevant resources
    • A triage agent to understand initial requests
    • A technical support agent for product issues
    • A billing specialist for financial matters
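
    At its simplest, the coordinator is a triage step that routes requests to specialists and synthesizes their answers. Here’s a sketch; the classifier, specialist agents, and synthesis step are all placeholders.

    Python
    from typing import Callable, Dict, List

    def coordinator(request: str,
                    classify: Callable[[str], str],                  # triage agent
                    specialists: Dict[str, Callable[[str], str]],    # e.g. "technical", "billing"
                    synthesize: Callable[[List[str]], str]) -> str:  # combine specialist outputs
        """Triage the request, delegate to the right specialist(s), and deliver one answer."""
        intent = classify(request)                    # e.g. "technical", "billing", "docs"
        outputs = [specialists[intent](request)]

        # Illustrative rule: also pull relevant documentation for non-docs requests
        if intent != "docs" and "docs" in specialists:
            outputs.append(specialists["docs"](request))

        return synthesize(outputs)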

    If this is your first agent, I suggest starting with the simple loop architecture. I find it helps to sketch out the process, starting with what triggers our agent, what the instructions should be, what tools it has access to, if it needs memory, and what the final output looks like.

    I show you how to implement this in my 5-day Challenge.

    Core Components of Effective Agents

    As I said earlier, every effective agent, regardless of implementation details, consists of four fundamental layers:

    1. The Model Layer: The “Brain”

    These are the large language models that provide the reasoning and decision-making capabilities. These models:

    • Process and understand natural language inputs
    • Generate coherent and contextually appropriate responses
    • Apply complex reasoning to solve problems
    • Make decisions about what actions to take next

    Different agents may use different models or even multiple models for different aspects of their workflow. A customer service agent might use a smaller, faster model for initial triage and a more powerful model for complex problem-solving.

    2. The Tool Layer: The “Hands”

    Tools extend an agent’s capabilities by connecting it to external systems and data sources. These might include:

    • Data tools: Database queries, knowledge base searches, document retrieval
    • Action tools: Email sending, calendar management, CRM updates
    • Orchestration tools: Coordination with other agents or services

    Tools are the difference between an agent that can only talk about doing something and one that can actually get things done.

    3. The Instruction Layer: The “Rulebook”

    Instructions and guardrails define how an agent behaves and the boundaries within which it operates. This includes:

    • Task-specific guidelines and procedures
    • Ethical constraints and safety measures
    • Error handling protocols
    • User preference settings

    Clear instructions reduce ambiguity and improve agent decision-making, resulting in smoother workflow execution and fewer errors. Without proper instructions, even the most sophisticated model with the best tools will struggle to deliver consistent results.

    4. Memory Systems: The “Experience”

    Memory is crucial for agents that maintain context over time:

    • Short-term memory: Tracking the current state of a conversation or task
    • Long-term memory: Recording persistent information about users, past interactions, or domain knowledge

    Memory enables agents to learn from experience, avoid repeating mistakes, and provide personalized service based on historical context.

    The next few sections cover the strategy behind these components, plus two additional considerations: guardrails and error handling.

    Model Selection Strategy

    Not every task requires the most advanced (and expensive) model available. You need to balance capability, cost, and latency requirements for your specific use case.

    Capability Assessment

    Different models have different strengths. When evaluating models for your agent:

    1. Start with baseline requirements:
      • Understanding complex instructions
      • Multi-step reasoning capabilities
      • Contextual awareness
      • Tool usage proficiency
    2. Consider specialized capabilities needed:
      • Code generation and analysis
      • Mathematical reasoning
      • Multi-lingual support
      • Domain-specific knowledge
    3. Assess the complexity of your tasks:
      • Simple classification or routing might work with smaller models
      • Complex decision-making typically requires more advanced models
      • Multi-step reasoning benefits from models with stronger planning abilities

    For example, a customer service triage agent might effectively use a smaller model to categorize incoming requests, while a coding agent working on complex refactoring tasks would benefit from a more sophisticated model with strong reasoning capabilities and code understanding.

    Creating a Performance Baseline

    A proven approach is to begin with the most capable model available to establish a performance baseline:

    1. Start high: Build your initial prototype with the most advanced model
    2. Define clear metrics: Establish concrete measures of success
    3. Test thoroughly: Validate performance across a range of typical scenarios
    4. Document the baseline: Record performance benchmarks for comparison

    This baseline represents the upper limit of what’s currently possible and provides a reference point for evaluating tradeoffs with smaller or more specialized models.

    Optimization Strategy

    Once you’ve established your baseline, you can optimize by testing smaller, faster, or less expensive models:

    1. Identify candidate models: Select models with progressively lower capability/cost profiles
    2. Comparative testing: Evaluate each candidate against your benchmark test set
    3. Analyze performance gaps: Determine where and why performance differs
    4. Make informed decisions: Choose the simplest model that meets your requirements

    This methodical approach helps you find the optimal balance between performance and efficiency without prematurely limiting your agent’s capabilities.

    Multi-Model Architecture

    For complex workflows, consider using different models for different tasks within the same agent system:

    • Smaller, faster models for routine tasks (classification, simple responses)
    • Medium-sized models for standard interactions and decisions
    • Larger, more capable models for complex reasoning, planning, or specialized tasks

    For example, an agent might use a smaller model for initial user intent classification, then invoke a larger model only when it encounters complex requests requiring sophisticated reasoning.
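
    A sketch of that routing logic; the classification and model calls are placeholders rather than any particular provider’s API.

    Python
    from typing import Callable

    ROUTINE_INTENTS = {"faq", "order_status", "greeting"}  # illustrative categories

    def route_request(user_message: str,
                      classify: Callable[[str], str],      # small, cheap model does the triage
                      small_model: Callable[[str], str],
                      large_model: Callable[[str], str]) -> str:
        """Send routine requests to a small model and escalate complex ones."""
        intent = classify(user_message)

        if intent in ROUTINE_INTENTS:
            return small_model(user_message)   # fast and cheap for routine work
        return large_model(user_message)       # stronger reasoning only when needed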

    This tiered approach can significantly reduce average costs and latency while maintaining high-quality results for challenging tasks.

    My Default Models

    I find myself defaulting to a handful of models, at least when starting out, before optimizing the agent:

    1. Reasoning – OpenAI o3 or Gemini 2.5 Pro
    2. Data Analysis – Gemini 2.5 Flash
    3. Image Generation – GPT 4o
    4. Code Generation – Gemini 2.5 Pro
    5. Content Generation – Claude 3.7 Sonnet
    6. Triage – GPT 3.5 Turbo or Gemini 2.0 Flash-Lite (hey I don’t make the names ok)

    Every model provider has a Playground where you can test the models. Start there if you’re not sure which one to pick.

    Tool Definition Best Practices

    Tools extend your agent’s capabilities by connecting it to external systems and data sources. Well-designed tools are clear, reliable, and reusable across multiple agents.

    Tool Categories and Planning

    When planning your agent’s tool set, consider the three main categories of tools it might need:

    1. Data Tools: Enable agents to retrieve context and information
      • Database queries – Eg: find a user’s profile information
      • Document retrieval – Eg: get the latest campaign plan
      • Search capabilities – Eg: search through emails
      • Knowledge base access – Eg: Find the refund policy
    2. Action Tools: Allow agents to interact with systems and take actions
      • Sending messages – Eg: send a Slack alert
      • Updating records – Eg: change the user’s profile
      • Creating content – Eg: generate an image
      • Managing resources – Eg: give access to some other tool
      • Initiating processes – Eg: Trigger another process or automation
    3. Orchestration Tools: Connect agents to other agents or specialized services
      • Expert consultations – Eg: connect to a fine-tuned medical model
      • Specialized analysis – Eg: handoff to a reasoning model for data analysis
      • Delegated sub-tasks – Eg: Handoff to a content generation agent

    A well-rounded agent typically needs tools from multiple categories to handle complex workflows effectively.

    Designing Effective Tool Interfaces

    Tool design has a significant impact on your agent’s ability to use them correctly. Follow these guidelines:

    1. Clear naming: Use descriptive, task-oriented names that indicate exactly what the tool does
      • Good: search_customer_records, update_shipping_address
      • Poor: db_func, process_op, handle_data
    2. Comprehensive descriptions: Provide detailed documentation about:
      • The tool’s purpose and when to use it
      • Required parameters and their formats
      • Expected outputs and potential errors
      • Limitations or constraints to be aware of
    3. Focused functionality: Each tool should do one thing and do it well
      • Prefer multiple specialized tools over single complex tools
      • Maintain a clear separation of concerns
      • Simplify parameter requirements for each individual tool
    4. Consistent patterns: Apply consistent conventions across your tool set
      • Standardize parameter naming and formats
      • Use similar patterns for related tools
      • Maintain consistent error handling and response structures

    Here’s an example of a well-defined tool:

    Python
    from typing import List, Optional

    # "Order" is assumed to be your application's order model
    @function_tool
    def search_customer_orders(customer_id: str, status: Optional[str] = None, 
                              start_date: Optional[str] = None, 
                              end_date: Optional[str] = None) -> List[Order]:
        """
        Search for a customer's orders with optional filtering.
        
        Parameters:
        - customer_id: The unique identifier for the customer (required)
        - status: Optional filter for order status ('pending', 'shipped', 'delivered', 'cancelled')
        - start_date: Optional start date for filtering orders (format: YYYY-MM-DD)
        - end_date: Optional end date for filtering orders (format: YYYY-MM-DD)
        
        Returns:
        A list of order objects matching the criteria, each containing:
        - order_id: Unique order identifier
        - date: Date the order was placed
        - items: List of items in the order
        - total: Order total amount
        - status: Current order status
        
        Example usage:
        search_customer_orders("CUST123", status="shipped")
        search_customer_orders("CUST123", start_date="2023-01-01", end_date="2023-01-31")
        """
        # Implementation details here

    Crafting Effective Instructions

    Instructions form the foundation of agent behavior. They define goals, constraints, and expectations, guiding how the agent approaches tasks and makes decisions.

    Effective instructions (aka prompt engineering) follow these core principles:

    1. Clarity over brevity: Be explicit rather than assuming the model will infer your intent
    2. Structure over freeform: Organize instructions in logical sections with clear headings
    3. Examples over rules: Demonstrate desired behaviors through concrete examples
    4. Specificity over generality: Address common edge cases and failure modes directly

    All of this is to say, the more precise and detailed you can be with instructions, the better. It’s like creating an SOP for an executive assistant.

    In fact, I often start with existing documentation and resources like operating procedures, sales or support scripts, policy documents, and knowledge base articles when creating instructions for agents in business contexts.

    I’ll turn them into LLM-friendly instructions with clear actions, decision criteria, and expected outputs.

    For example, converting a customer refund policy into agent instructions might look like this:

    Original policy: “Refunds may be processed for items returned within 30 days of purchase with a valid receipt. Items showing signs of use may receive partial refunds at manager discretion. Special order items are non-refundable.”

    Agent-friendly instructions:

    Plaintext
    When processing a refund request:
    
    1. Verify return eligibility:
       - Check if the return is within 30 days of purchase
       - Confirm the customer has a valid receipt
       - Determine if the item is a special order (check the "special_order" flag in the order details)
    
    2. Assess item condition:
       - If the item is unopened and in original packaging, proceed with full refund
       - If the item shows signs of use or opened packaging, classify as "partial refund candidate"
       - If the item is damaged beyond normal use, classify as "potential warranty claim"
    
    3. Determine refund amount:
       - For eligible returns in new condition: Issue 100% refund of purchase price
       - For "partial refund candidates": Issue 75% refund if within 14 days, 50% if 15-30 days
       - For special order items: Explain these are non-refundable per policy
       - For potential warranty claims: Direct to warranty process
    
    4. Process the refund:
       - For amounts under $50: Process automatically
       - For amounts $50-$200: Request supervisor review if partial refund
       - For amounts over $200: Escalate to manager

    You’re not going to get this right on the first shot. Instead, it is an iterative process:

    1. Start with draft instructions based on existing documentation
    2. Test with realistic scenarios to identify gaps or unclear areas
    3. Observe agent behavior and note any deviations from expected actions
    4. Refine instructions to address observed issues by adding in edge cases or missing information
    5. Repeat until performance meets requirements

    I cover these concepts in my 5-Day AI Agent Challenge. Sign up here.

    Memory Systems Implementation

    Effective memory implementation is crucial for agents that maintain context over time or learn from experience.

    Short-term memory handles the immediate context of the current interaction:

    1. Conversation history: Recent exchanges between user and agent
    2. Current task state: The agent’s progress on the active task
    3. Working information: Temporary data needed for the current interaction

    For most agents, this context is maintained within the conversation window, though you may need to implement summarization or pruning strategies as conversations grow longer.
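
    One common pruning approach is to summarize older turns and keep only the most recent ones verbatim. Here’s a sketch; the message format, the summarize call, and the thresholds are all illustrative assumptions.

    Python
    from typing import Callable, List

    def prune_history(messages: List[dict],
                      summarize: Callable[[List[dict]], str],  # placeholder model call
                      max_messages: int = 20,
                      keep_recent: int = 8) -> List[dict]:
        """Collapse older turns into a summary so the context window stays manageable."""
        if len(messages) <= max_messages:
            return messages

        older, recent = messages[:-keep_recent], messages[-keep_recent:]
        summary = summarize(older)

        # Replace the older turns with one summary message, keep the recent turns verbatim
        return [{"role": "system", "content": f"Summary of earlier conversation: {summary}"}] + recent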

    Long-term memory preserves information across sessions:

    1. User profiles: Preferences, history, and specific needs
    2. Learned patterns: Recurring issues or successful approaches
    3. Domain knowledge: Accumulated expertise and background information

    Implementation options include:

    • Traditional databases for structured information
    • Vector stores for semantic retrieval capabilities
    • Hybrid approaches combining multiple storage methods

    Whatever method you use to store memory, you need a smart retrieval mechanism because you’re going to be adding all that data to the context window of your agent’s core model or tools:

    1. Relevance filtering: Surface only information pertinent to the current context
    2. Recency weighting: Prioritize recent information when appropriate
    3. Semantic search: Find conceptually related information even with different wording
    4. Hierarchical retrieval: Start with general context and add details as needed

    Well-designed retrieval keeps memory useful without overwhelming the agent with irrelevant information or taking up space in the context window.
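
    As one way to combine relevance filtering with recency weighting, here’s a small scoring sketch. The 70/30 weighting and the 30-day half-life are illustrative, and the embeddings are assumed to come from whatever embedding model you already use.

    Python
    import math
    import time
    from typing import Dict, List

    def rank_memories(query_embedding: List[float],
                      memories: List[Dict],   # each: {"embedding": [...], "text": str, "timestamp": float}
                      top_k: int = 5,
                      half_life_days: float = 30.0) -> List[Dict]:
        """Score memories by semantic relevance and recency, and return only the top few."""
        now = time.time()

        def cosine(a: List[float], b: List[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
            return dot / norm if norm else 0.0

        def score(memory: Dict) -> float:
            relevance = cosine(query_embedding, memory["embedding"])
            age_days = (now - memory["timestamp"]) / 86400
            recency = 0.5 ** (age_days / half_life_days)   # exponential decay with age
            return 0.7 * relevance + 0.3 * recency         # weights are illustrative

        return sorted(memories, key=score, reverse=True)[:top_k]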

    Privacy and Data Management

    Ensuring your agent can’t mishandle data, access the wrong type of data, or reveal data to users is extremely important for obvious reasons. I could write a whole blog post about this.

    In most cases, really good tool design plus guardrails and safety mechanisms (next section) will cover privacy and data protection, but here are some things to think about:

    1. Retention policies: Define how long different types of information should be kept
    2. Anonymization: Remove identifying details when full identity isn’t needed
    3. Access controls: Limit who (or what) can access stored information
    4. User control: Give users visibility into what’s stored and how it’s used

    Guardrails and Safety Mechanisms

    Even the best-designed agents need guardrails to ensure they operate safely and appropriately. Guardrails are protective mechanisms that define boundaries, prevent harmful actions, and ensure the agent behaves as expected.

    A good strategy takes a layered approach, so if one layer fails, others can still prevent potential issues. Start by setting clear boundaries when you define the agent’s instructions, as covered in the previous section.

    You can then add input validation that screens user requests before they reach the agent and flags anything out of scope or potentially harmful (like a jailbreak attempt).

    Python
    @input_guardrail
    def safety_guardrail(ctx, agent, input):
        # Check input against safety classifier
        safety_result = safety_classifier.classify(input)
        
        if safety_result.is_unsafe:
            # Return a predefined response instead of processing the input
            return GuardrailFunctionOutput(
                output="I'm not able to respond to that type of request. 
                        Is there something else I can help you with?",
                tripwire_triggered=True
            )
        
        # Input is safe, continue normal processing
        return GuardrailFunctionOutput(
            tripwire_triggered=False
        )

    Output guardrails verify the agent’s responses before they reach the user, to flag PII (personally identifiable information) or inappropriate content:

    Python
    @output_guardrail
    def pii_filter_guardrail(ctx, agent, output):
        # Check for PII in the output
        pii_result = pii_detector.scan(output)
        
        if pii_result.has_pii:
            # Redact PII from the output
            redacted_output = pii_detector.redact(output)
            return GuardrailFunctionOutput(
                output=redacted_output,
                tripwire_triggered=True
            )
        
        # Output is clean
        return GuardrailFunctionOutput(
            tripwire_triggered=False
        )

    Also ensure you have guardrails on tool usage, especially for tools that change data, trigger critical processes, or perform actions that require permissions or approvals.
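
    Here’s a plain-Python sketch of that idea, not tied to any specific SDK. The approval list and the request_human_approval hook are illustrative assumptions.

    Python
    from typing import Callable

    # Tools that change data or trigger critical processes need explicit approval (illustrative list)
    APPROVAL_REQUIRED = {"delete_record", "issue_refund", "send_bulk_email"}

    def guarded_tool_call(tool_name: str,
                          tool_fn: Callable[..., dict],
                          arguments: dict,
                          request_human_approval: Callable[[str, dict], bool]) -> dict:
        """Run a tool only if it is low-risk or a human has approved the action."""
        if tool_name in APPROVAL_REQUIRED:
            if not request_human_approval(tool_name, arguments):
                return {"blocked": True,
                        "message": "This action needs human approval before it can run."}

        # Low-risk tool, or approval granted: execute normally
        return {"blocked": False, "result": tool_fn(**arguments)}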

    Human-in-the-Loop Integration

    I always recommend a human-in-the-loop to my clients, especially for high-risk operations. Here are some ways to build that in:

    • Feedback integration: Incorporate human feedback to improve agent behavior
    • Approval workflows: Route certain actions for human review before execution
    • Sampling for quality: Review a percentage of agent interactions for quality control
    • Escalation paths: Define clear processes for when and how to involve humans

    Error Handling and Recovery

    Even the best agents will encounter errors and unexpected situations. When you test your agent, first identify and isolate where the error is coming from:

    1. Input errors: Problems with user requests (ambiguity, incompleteness)
    2. Tool errors: Issues with external systems or services
    3. Processing errors: Problems in the agent’s reasoning or decision-making
    4. Resource errors: Timeouts, memory limitations, or quota exhaustion

    Based on the error type, the agent can apply appropriate recovery strategies. Ideally, agents should be able to recover from minor errors through self-correction:

    1. Validation loops: Check results against expectations before proceeding
    2. Retry strategies: Attempt failed operations again with adjustments
    3. Alternative approaches: Try different methods when the primary approach fails
    4. Graceful degradation: Fall back to simpler capabilities when advanced ones fail

    For example, if a database query fails, the agent might retry with a more general query, or fall back to cached information. Beyond that, you may want to build out alert systems and escalation paths to human employees, and explain the limitation to the user.
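
    A sketch of that retry-then-fallback flow; the query, broadening, and cache functions are placeholders for your own systems.

    Python
    import time
    from typing import Callable, Optional

    def query_with_recovery(run_query: Callable[[str], dict],
                            broaden: Callable[[str], str],              # e.g. drop the strictest filter
                            cached_lookup: Callable[[str], Optional[dict]],
                            query: str,
                            max_retries: int = 2) -> dict:
        """Retry a failing query with adjustments, then gracefully degrade to cached data."""
        attempt_query = query
        for attempt in range(max_retries + 1):
            try:
                return run_query(attempt_query)
            except Exception:
                time.sleep(2 ** attempt)                 # simple backoff between attempts
                attempt_query = broaden(attempt_query)   # retry with a more general query

        cached = cached_lookup(query)                    # graceful degradation
        if cached is not None:
            return {**cached, "note": "Served from cache after the live query failed."}

        # Nothing worked: surface the limitation instead of guessing, and escalate
        return {"error": "Could not retrieve the data right now. Escalating to a human."}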

    Testing Your Agent

    Now that you have all the pieces of the puzzle, it’s time to test the agent.

    Testing AI agents fundamentally differs from testing traditional software. While conventional applications follow deterministic paths with predictable outputs, agents exhibit non-deterministic behavior that can vary based on context, inputs, and implementation details.

    This leads to challenges that are unique to AI agents, such as hallucinations, bias, prompt injections, inefficient loops, and more.

    Unit Testing Components

    • Test individual modules independently (models, tools, memory systems, instructions)
    • Verify tool functionality, error handling, and edge cases

    Example: A financial advisor agent uses a stock price tool. Unit tests would verify the tool returns correct data for valid tickers, handles non-existent tickers gracefully, and manages API failures appropriately, all without involving the full agent.
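
    As an illustration of that kind of test, here’s a pytest-style sketch. The get_stock_price tool, its module path, and its error behavior are assumptions made for the example.

    Python
    import pytest

    # Hypothetical tool under test: returns a dict for valid tickers,
    # raises ValueError for unknown ones, and degrades gracefully on API failures.
    from my_agent.tools import get_stock_price

    def test_valid_ticker_returns_price():
        result = get_stock_price("AAPL")
        assert "price" in result and result["price"] > 0

    def test_unknown_ticker_raises_clear_error():
        with pytest.raises(ValueError):
            get_stock_price("NOT_A_TICKER")

    def _raise_timeout(ticker):
        raise TimeoutError()

    def test_api_failure_is_handled(monkeypatch):
        # Simulate the upstream quote API being down and verify graceful handling
        monkeypatch.setattr("my_agent.tools._fetch_quote", _raise_timeout)
        result = get_stock_price("AAPL")
        assert result.get("error") == "price_unavailable"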

    Integration Testing

    • Test end-to-end workflows in simulated environments
    • Verify components work together correctly

    Example: An e-commerce support agent integration test would validate the complete customer journey, from initial inquiry about a delayed package through tracking lookup, status explanation, and potential resolution options, ensuring all tools and components work together seamlessly.

    Security Testing

    Security testing probes the agent’s resilience against misuse or manipulation.

    • Instruction override attempts: Try to make the agent ignore its guidelines
    • Parameter manipulation: Attempt to pass invalid or dangerous parameters to tools
    • Context contamination: Try to confuse the agent with misleading context
    • Jailbreak testing: Test known techniques for bypassing guardrails

    Example: Security testing for a healthcare agent would include attempts to extract patient data through crafted prompts, testing guardrails against medical misinformation, and verifying that sensitive information isn’t retained or leaked.

    Hallucination Testing

    • Compare responses against verified information
    • Check source attribution and citation practices

    Example: A financial advisor agent might be tested against questions with known answers about market events, company performance, and financial regulations, verifying accuracy and appropriate expressions of uncertainty for projections or predictions.

    Performance and Scalability Testing

    Performance testing evaluates how well the agent handles real-world conditions and workloads.

    • Response time: Track how quickly the agent processes requests
    • Model usage optimization: Track token consumption and model invocations
    • Cost per transaction: Calculate average cost to complete typical workflows

    These are just a few tests and error types to keep in mind and should be enough for basic agents.

    As your agent grows more complex, you’ll need a more comprehensive testing and evaluation framework, which I’ll cover in a later blog post. Sign up to my emails to stay posted.

    Deploying, Monitoring, and Improving Your Agent

    The final piece is to deploy your agent, see how it performs in the real world, collect feedback, and improve it over time.

    How you deploy an agent depends heavily on how you built it. No-code platforms like Make, n8n, and Relevance have their own deployment solutions. If you’re coding your own agents, you may want to look into custom hosting and deployment solutions.

    I often advise clients to deploy agents alongside the existing process, slowly and gradually. See how the agent performs in the real world and continuously improve it. Over time you can phase out the existing process and use the agent instead.

    Doing it this way also allows you to evaluate the performance of the agent against current numbers. Does it handle customer support inquiries with a higher NPS score? Do the ads it generates have better CTRs?

    Many of these no-code platforms also come with built-in observability, allowing you to monitor your agent and track how it performs. If you’re coding the agent yourself, consider using a framework like OpenAI’s agent SDK, or Google ADK, which comes with built-in tracing.

    You also want to collect actual usage feedback, like how often users are interacting with the agent, how satisfied they are, and so on. You can then use this to further improve the agent by refining the instructions, adding more tools, or updating the memory.

    Again, for basic agents, these out-of-the-box solutions are more than enough. If you’re building more complex agents, you’ll need to build out AgentOps to monitor and improve the agent. More on that in a later blog post.

    Case Studies

    You’re now familiar with all the components that make up an agent, how to put the components together, and how to test, deploy, and evaluate them. Let’s look at some case studies and implementation examples to drive the point home and inspire you.

    Customer Service Agent

    One of the most widely implemented agent types helps customers resolve issues, answer questions, and navigate services. Successful customer service agents typically include:

    • Feedback collection: Gathers user satisfaction data for improvement
    • Intent classification system: Quickly categorizes customer inquiries
    • Knowledge retrieval system: Accesses relevant policies and information
    • User context integration: Incorporates customer history and account information
    • Escalation mechanism: Seamlessly transfers to human agents when needed

    An eCommerce company I worked with wanted a 24/7 customer support chatbot on their site. We started with narrow use cases like answering FAQs and providing order information. The chatbot triggered a triage agent, which determined whether the query fell within our initial use case set.

    If it was, it had access to knowledge base documents for FAQs and order information based on an order number.

    For everything else, it handed the query off to a support agent. This allowed the company to dramatically decrease average response times and increase support volume while maintaining their satisfaction scores.

    Research Assistant

    Research assistant agents help users gather, synthesize, and analyze information from multiple sources. Effective research assistants typically include:

    1. Search and retrieval capabilities: Access to diverse information sources
    2. Information verification mechanisms: Cross-checking facts across sources
    3. Synthesis frameworks: Methods for combining information coherently
    4. Citation and attribution systems: Tracking information provenance
    5. User collaboration interfaces: Tools for refining and directing research

    A VC firm I worked with wanted to build a due diligence agent for the deals they were looking at. We triggered the agent when a new deal was created in their database. The agent would first identify the company and the market they were in, and then research them and synthesize the information into an investment memo.

    This sped up the diligence process from a couple of hours to a couple of minutes.

    Content Generation

    Content generation agents help create, refine, and manage various forms of content, from marketing materials to technical documentation.

    Effective content generation agents typically include:

    1. Style and tone frameworks: Guidance for appropriate content voice
    2. Factual knowledge integration: Access to accurate domain information
    3. Feedback incorporation mechanisms: Methods for refining outputs
    4. Format adaptation: Generating content appropriate to different channels

    A PR agency I worked with wanted an agent to create highly personalized responses to incoming PR requests. When a new request hit their inbox, it triggered an agent to look through their database of client content and find something specific to that pitch.

    It then used the agency’s internal guidelines to craft a pitch and respond to the request. This meant the agency could respond within minutes instead of hours, and get ahead of other responses.

    A Thought Exercise

    Here’s a bit of homework for you to see if you’ve learned something from this. You’re tasked with designing a travel booking agent. Yeah, I know, it’s a cliche example at this point, but it’s also a process that’s well understood by a large audience.

    The exercise is to design the agent. A simple flow chart with pen and paper or on a Figjam is usually how I start.

    Draw out the full process: what triggers the agent, what data is sent to it, whether it’s a simple loop agent or a hierarchy of agents, what models and instructions you’ll give them, and what tools and memory they can access.

    If you can do this and get into the habit of thinking in agents, implementation becomes easy. For visual examples, sign up for my 5-day Agent Challenge.

    Putting It All Together

    Phew, over 5,000 words later, we’re almost at the end. We’ve covered a lot in this post so let’s recap:

    1. Start with clear goals: Define exactly what your agent should accomplish and for whom
    2. Select appropriate models: Choose models that balance capability, cost, and latency
    3. Define your tool set: Implement and document the tools your agent needs
    4. Create clear instructions: Develop comprehensive guidance for agent behavior
    5. Implement layered guardrails: Build in appropriate safety mechanisms
    6. Design error handling: Plan for failures and define recovery strategies
    7. Add memory as needed: Implement short-term context management and long-term memory appropriate to your use case
    8. Test thoroughly: Validate performance across a range of scenarios
    9. Deploy incrementally: Roll out capabilities gradually to manage risk
    10. Monitor and improve: Collect data on real-world performance to drive improvements

    Next Steps

    There’s only one next step. Go build an agent. Start with something small and low-risk. One of my first agents was a content research agent, fully coded in Python. You can vibe code it if you’re not good at coding.

    If you want to use a framework, I suggest either OpenAI’s SDK or Google’s ADK, which I have in-depth guides on.

    And if you don’t want to touch code, there are some really good no-code platforms like Make, n8n, and Relevance. Sign up for my free email series below where I walk you through building an Agent in 5 Days with these tools.

  • Vibe Coding SaaS MVPs: The Ultimate Guide

    Vibe Coding SaaS MVPs: The Ultimate Guide

    I’ll be honest with you, I don’t actually like the term “vibe coding” because it makes it sound easy and error-free. Like oh I’m just going with the vibes, let’s see where it takes us.

    But the reality is it takes a lot of back and forth, restarts, research, repeats, re-everything to build and ship a functional MVP, even with AI. It’s fun, but it can get frustrating at times. And it’s in those moments of frustration where most people give up.

    I’m going to help you break through those moments of frustration so that you can come out successful on the other side.

    The approach I outline here isn’t theoretical. It’s a process that I’ve refined after countless hours using these tools and developing and shipping functional apps like Content Spark, a video analysis tool, and many more. Follow it and you’ll be shipping products in no time.

    PS – If you want to know what exactly vibe coding is, read my introductory article here.

    PPS – If you like videos, watch my process here –

    Step 1: Define Your App Concept and Requirements

    Before jumping into any coding tools, you need a clear vision of what you’re building. The quality of AI-generated code depends heavily on how well you communicate your idea. Even a simple one-paragraph description will help, but more detail leads to better results.

    First, create a basic description of your app in a document or text file. Include:

    • The app’s purpose (what problem does it solve?)
    • Target users
    • Core features and functionality
    • Basic user flow (how will people use it?)

    Then, use an AI assistant to refine your concept. Gemini 2.5 Pro is my favorite model right now, but you can use any other reasoning model like Claude 3.7 Sonnet Thinking or ChatGPT o3. Paste your description in, and ask it to help you flesh out the idea.

    Plaintext
    I'm planning to build [your app idea]. Help me flesh this out by asking questions. Let's go back and forth to clarify the requirements, then create a detailed PRD (Product Requirements Document).
    

    Answer the AI’s questions about features, user flows, and functionality. This conversation will help refine your vision.

    Request a formal PRD once the discussion is complete.

    Plaintext
    Based on our discussion, please create a comprehensive PRD for this app with:
    1. Core features and user flows
    2. Key screens/components
    3. Data requirements
    4. Technology considerations
    5. MVP scope vs future enhancements
    
    Let's discuss and refine this together before finalizing the document.

    In the video, I’m building a restaurant recommendation app. I started with a simple description and Gemini broke this down into manageable pieces and helped scope an MVP focused on just Vancouver restaurants first, with a simple recommendation engine based on mood matching.

    Pro Tips:

    • Save this PRD—you’ll use it throughout the development process
    • Be specific about what features you want in the MVP (minimum viable product) versus future versions
    • Let the AI suggest simplifications if your initial scope is too ambitious

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.

    Step 2: Choose Your Tech Stack

    Now that you have a clear plan, it’s time to decide how you’ll build your app and which AI tools you’ll use. Decide between these two main approaches:

    a) One-shot generation: Having an AI tool generate the entire app at once

    • Best for: Simple apps, MVPs, rapid prototyping
    • Tools: Lovable.dev, Bolt.new, Replit, Google’s Firebase Studio
    • Advantages: Fastest path to a working app, minimal setup

    b) Guided development: Building the app piece by piece with AI assistance

    • Best for: More complex apps, learning the code, greater customization
    • Tools: Cursor, Windsurf, VS Code with GitHub Copilot, Gemini 2.5 Pro + manual coding
    • Advantages: More control, easier debugging, better understanding of the code

    In my video, I demonstrated both approaches:

    • One-shot generation with Lovable, which created a complete app from my PRD
    • Guided development with Cursor, where I built the app component by component

    For the rest of this guide, I’ll continue to explain both approaches, although I do think guided development provides an excellent balance of control and AI assistance.

    For the one-shot approach, you simply need to sign up for one of Lovable, Bolt, or Replit (or try all three!). For guided development, there are a couple of extra steps:

    • Install Cursor (cursor.sh) or your preferred AI-assisted IDE
    • Set up a local development environment (Node.js, Git, etc.)
    • Connect your GitHub account for version control
    • Create an account on Vercel for easy deployment

    Finally, decide on:

    • Frontend framework (e.g., React, Vue, Svelte)
    • Backend approach (e.g., Node.js, serverless functions)
    • Database needs (e.g., Firebase, Supabase, MongoDB)
    • Any third-party APIs or services

    If you need help, ask your AI assistant to recommend appropriate technologies based on your requirements and to evaluate trade-offs.

    Example Prompt:

    Plaintext
    For the restaurant recommendation app we've described, what tech stack would you recommend? I want something that:
    1. Allows for rapid development
    2. Has good AI tool support
    3. Can scale reasonably well if needed
    4. Isn't overly complex for an MVP
    
    For each recommendation, please explain your reasoning and highlight any potential limitations.

    Pro Tip: Mainstream technologies like React, Next.js, and common databases generally work better with AI tools because they’re well-represented in training data.

    Step 3: Generate the Initial App Structure with AI

    Now it’s time to create the foundation of your application.

    If Using One-Shot Generation (Lovable, Bolt, etc.):

    1. Create a new project in your chosen platform
    2. Paste your PRD from Step 1 into the prompt field
    3. Add any clarifications or specific tech requirements, such as: Please create this app using React for frontend and Supabase for the backend database. Include user authentication and a clean, minimalist UI.
    4. Generate the app (this may take 5-10 minutes depending on complexity)
    5. Explore the generated codebase to understand its structure and functionality

    If Using Guided Development (Cursor):

    1. Open Cursor and create a new project folder
    2. Start a new conversation with Cursor’s AI assistant by pressing Ctrl+L (or Cmd+L on Mac)
    3. Request project setup with a prompt like:
    Plaintext
    Let's start building our restaurant recommendation app. First, I need you to:
    
    1. Create a new Next.js project with TypeScript support
    2. Set up a basic structure for pages, components, and API routes
    3. Configure Tailwind CSS for styling
    4. Initialize a Git repository
    
    Before writing any code, please explain what you plan to do, then proceed step by step.

    In the video, I asked Cursor to “create a React Native expo app for a restaurant recommendation system” based on the PRD. It:

    • Created app directories (components, screens, constants, hooks)
    • Set up configuration files
    • Initialized TypeScript
    • Created placeholder screens for the restaurant app

    With Lovable, I simply pasted the PRD and it generated the complete app structure in minutes.

    In both cases, I’m just asking the agent to build everything from the PRD. However, in reality, I prefer to set it up myself and build the app by generating it component by component or page by page. That way, I know exactly what’s happening and where the different functions and components are, instead of trying to figure it out later.

    Pro Tips:

    • When using Cursor, you can execute terminal commands within the chat interface
    • For React or Next.js apps, setup typically involves running commands like npx create-react-app or npx create-next-app (which Cursor can do for you)
    • Check that everything works by running the app immediately after setup
    • If you encounter errors, provide them to the AI for troubleshooting

    Step 4: Build the User Interface With AI

    Now that you have the basic structure in place, it’s time to create the user interface for your app.

    If Using One-Shot Generation:

    1. Explore the generated UI to understand what’s already been created
    2. Identify changes or improvements you want to make
    3. Use the platform’s chat interface to request specific changes, like:
    Plaintext
    The home screen looks good, but I'd like to make these changes:
    1. Change the color scheme to blue and white
    2. Make the search button more prominent
    3. Add a filter section below the search bar

    If Using Guided Development:

    Create your main UI components one by one. For a typical app, you might need:

    • A home/landing page
    • Navigation structure (sidebar, navbar, or tabs)
    • List/grid views for data
    • Detail pages or modals
    • Forms for user input

    For each component, prompt Cursor with specific requests like:

    Plaintext
    Now I need to create the home screen for our restaurant recommendation app. It should include:
    
    1. A welcoming header with app name
    2. A prominent "Find Recommendations" button
    3. A section showing recent recommendations (empty state initially)
    4. A bottom navigation bar with icons for Home, Favorites, and Profile
    
    Please generate the React component for this screen using Tailwind CSS for styling. Focus on clean, responsive design.

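    For reference, a component in the spirit of that prompt might come back looking something like this. It's only a sketch: the app name and copy are made up, and the exact markup will vary from run to run.

    TypeScript
    // HomeScreen.tsx – rough sketch of the kind of component the prompt above asks for.
    export default function HomeScreen() {
      return (
        <div className="flex min-h-screen flex-col bg-white">
          <header className="p-6 text-center">
            <h1 className="text-2xl font-bold">MoodEats</h1>
            <p className="text-gray-500">Restaurants that match your mood</p>
          </header>

          <main className="flex-1 px-6">
            <button className="w-full rounded-lg bg-blue-600 py-3 font-semibold text-white">
              Find Recommendations
            </button>

            <section className="mt-8">
              <h2 className="text-lg font-semibold">Recent recommendations</h2>
              <p className="mt-2 text-sm text-gray-400">Nothing here yet. Run your first search!</p>
            </section>
          </main>

          <nav className="flex justify-around border-t py-3 text-sm text-gray-600">
            <span>Home</span>
            <span>Favorites</span>
            <span>Profile</span>
          </nav>
        </div>
      );
    }
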
    I’ll also use Gemini 2.5 Pro from the Gemini app in parallel. I’ll continue the same chat I started to write the PRD and have it strategize and build the app with me. Gemini tends to be more educational by explaining exactly what it is doing and why, allowing me to understand how this app is being built.

    Pro Tips:

    • For web apps, ensure your components are responsive for different devices
    • Start with simple layouts and add visual polish later
    • If you have design inspiration, describe it or provide links to similar UIs
    • For mobile apps, remember to account for different screen sizes
    • Test each screen after implementation to catch styling issues early

    Step 5: Implement Core Functionality

    With the UI in place, it’s time to add the functionality that makes your app actually work.

    If Using One-Shot Generation:

    1. Test the functionality that was automatically generated
    2. Identify missing or incorrect functionality
    3. Request specific functional improvements through the chat interface:
    Plaintext
    I notice that when filtering restaurants by mood, it's not working correctly. Can you modify the filtering function to properly match user mood selections with restaurant descriptions?

    If Using Guided Development:

    1. Implement core functions one by one. For example, in a restaurant app:
      • Search/filter functionality
      • Data fetching from API or database
      • User preference saving
      • Authentication (if needed)
    2. For each function, provide a clear prompt to Cursor:
    Plaintext
    Let's implement the core recommendation feature. When a user clicks "Find Recommendations," they should see a screen that:
    
    1. Asks for their current mood (dropdown with options like "romantic," "casual," "energetic")
    2. Lets them select cuisine preferences (multi-select)
    3. Allows setting a price range (slider with $ to $$$$ options)
    4. Has a "Show Recommendations" button
    
    When they click the button, it should call our recommendation function (which we'll implement later) and show a loading state.
    
    Please write the React component for this feature, including state management and form handling.

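    As a reference point, the state and submit handler behind that form boil down to something like this. It's a sketch: the Preferences shape and the /api/recommendations endpoint are placeholders for the recommendation function we'll implement later.

    TypeScript
    import { useState } from "react";

    interface Preferences {
      mood: string;
      cuisines: string[];
      priceRange: number; // 1-4, i.e. $ to $$$$
    }

    // Hook that holds the form state and calls the (future) recommendation endpoint.
    export function useRecommendationForm() {
      const [prefs, setPrefs] = useState<Preferences>({ mood: "casual", cuisines: [], priceRange: 2 });
      const [loading, setLoading] = useState(false);

      async function submit() {
        setLoading(true);
        try {
          // Placeholder endpoint - implemented in the backend step.
          const res = await fetch("/api/recommendations", {
            method: "POST",
            headers: { "Content-Type": "application/json" },
            body: JSON.stringify(prefs),
          });
          return await res.json();
        } finally {
          setLoading(false);
        }
      }

      return { prefs, setPrefs, loading, submit };
    }
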
    Pro Tips:

    • For backend-heavy functionality, consider using Firebase, Supabase, or other backend-as-a-service options for simplicity
    • Implement one logical piece at a time and test before moving on
    • When errors occur, copy the exact error message and provide it to the AI
    • Break complex functions into smaller, more manageable pieces
    • Use comments in your prompts to explain the expected behavior in detail

    Step 6: Add Backend and Data Management

    Most apps need data. Whether you’re using mock data, a database, or external APIs, this step connects your app to its data sources.

    If Using One-Shot Generation:

    1. Check what data sources were set up automatically
    2. Request database or API integration if needed
    3. Provide necessary API keys or connection strings as instructed
    4. Test the data integration thoroughly
    Plaintext
    I want to replace the mock restaurant data with real data from the Google Places API. Please update the app to:
    1. Connect to the Google Places API
    2. Fetch nearby restaurants based on user location
    3. Store favorites in a database (like Firebase or Supabase)

    If Using Guided Development:

    1. Define your data models and database schema
    2. Implement API routes or serverless functions
    3. Connect frontend components to backend services
    4. Add authentication if required

    Example Prompt:

    Plaintext
    For our restaurant recommendation app, I need to create the data layer. Let's:
    
    1. Define a Restaurant data model with fields for name, cuisine types, price range, location, and a text description
    2. Create an API endpoint that returns restaurants filtered by the user's preferences
    3. Implement a simple algorithm that matches restaurants to the user's mood based on keywords in the description
    4. For now, use a JSON file with 20 sample restaurants as our data source
    
    Please implement this backend functionality in our Next.js API routes.

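    Here's roughly what that keyword-based mood matching could look like as a Next.js API route. It's a sketch: the two sample restaurants and the mood-to-keyword map are made-up placeholders for the JSON file of 20 restaurants.

    TypeScript
    // pages/api/recommendations.ts – sketch of a keyword-based mood matcher.
    import type { NextApiRequest, NextApiResponse } from "next";

    interface Restaurant {
      name: string;
      cuisines: string[];
      priceRange: number;
      description: string;
    }

    // In the MVP this would be loaded from the JSON file of ~20 sample restaurants.
    const restaurants: Restaurant[] = [
      { name: "Luna Trattoria", cuisines: ["italian"], priceRange: 3, description: "Candlelit tables and a quiet patio" },
      { name: "Banh Mi Bros", cuisines: ["vietnamese"], priceRange: 1, description: "Loud, fast, and fun counter spot" },
    ];

    // Very rough mood-to-keyword mapping; the real app can refine or replace this.
    const moodKeywords: Record<string, string[]> = {
      romantic: ["candlelit", "quiet", "intimate"],
      energetic: ["loud", "fun", "lively"],
      casual: ["counter", "fast", "relaxed"],
    };

    export default function handler(req: NextApiRequest, res: NextApiResponse) {
      const { mood = "casual", cuisines = [], priceRange = 4 } = req.body ?? {};
      const keywords = moodKeywords[mood] ?? [];

      const matches = restaurants.filter((r) => {
        const moodMatch = keywords.some((k) => r.description.toLowerCase().includes(k));
        const cuisineMatch = cuisines.length === 0 || r.cuisines.some((c: string) => cuisines.includes(c));
        return moodMatch && cuisineMatch && r.priceRange <= priceRange;
      });

      res.status(200).json(matches);
    }
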
    Pro Tips:

    • Test with various data scenarios (empty results, large result sets, etc.)
    • Start with mock data until your UI works correctly
    • For external APIs, paste their documentation into the chat to help the AI generate correct integration code
    • When using databases, start with a simple schema and expand as needed
    • Keep API keys and sensitive credentials out of your code (use environment variables)

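    For example, instead of hard-coding a key, read it from the environment; the variable name here is just an illustration:

    TypeScript
    // Read the key from .env.local (never commit that file) instead of hard-coding it.
    const apiKey = process.env.GOOGLE_PLACES_API_KEY;
    if (!apiKey) {
      throw new Error("Missing GOOGLE_PLACES_API_KEY environment variable");
    }
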
    Step 7: Test, Debug, and Refine

    The final step is to thoroughly test your application, fix any issues, and deploy it for others to use.

    If Using One-Shot Generation:

    1. Test all user flows in the generated app
    2. Report and fix any bugs through the platform’s interface
    3. Deploy your app using the platform’s deployment options
    4. Share your app with testers or users to gather feedback
    Plaintext
    I found these issues while testing: 
    1. The app crashes when submitting an empty search 
    2. Restaurant images don't load correctly 
    3. The back button doesn't work on the details screen 
    Please fix these issues.

    If Using Guided Development:

    1. Conduct systematic testing of all features:
      • Basic functionality testing
      • Edge case testing (empty states, error handling)
      • Performance testing
      • Device/browser compatibility testing
    2. Fix bugs with AI assistance: I'm encountering this error when trying to submit the search form: [paste error message] Here's the code for the search component: [paste relevant code] Please help identify and fix this issue.
    3. Optimize performance if needed: The restaurant list is loading slowly when there are many results. Can you suggest ways to optimize this component for better performance?
    4. Prepare for deployment: Help me prepare this app for deployment. I want to: 1. Set up production environment variables 2. Optimize the build for production 3. Deploy the frontend to Vercel and the backend to Render Please provide the necessary steps and configurations.
    5. Deploy and monitor your application

    In the video demonstration, we encountered and fixed several issues:

    1. A 404 error due to mismatched API endpoints
    2. Authentication token issues with the OpenAI API
    3. UI rendering problems on the restaurant listing screen

    This is bound to happen, especially if you’re trying to build something more complex than a landing page. We fixed these by examining error messages, updating code, and testing incrementally until everything worked correctly.

    Most importantly, don’t give up. If you’re stuck somewhere and AI can’t help you figure it out, Google it, or ask a friend.

    Plaintext
    I've found a bug in our recommendation feature. When a user selects multiple cuisine types, the filtering doesn't work correctly. Here's the error I'm seeing:
    
    [Paste error message or describe the issue]
    
    Here's the current code for the recommendation function:
    
    [Paste the relevant code]
    
    Please analyze the issue and suggest a fix.

    Pro Tips:

    • Always test thoroughly after making significant changes
    • Keep your browser console open to catch JavaScript errors
    • Use Git commits after each successful feature implementation
    • Document any workarounds or special configurations you needed
    • Create multiple small commits rather than one large one
    • If the AI makes changes that break functionality, you can easily revert to a working state.

    Advanced Techniques for Power Users

    Supercharging Your Prompts

    The quality of your prompts directly impacts the quality of AI-generated code. Use these techniques to get better results:

    1. Be specific and detailed – Instead of “create a login form,” specify “create a login form with email and password fields, validation, error handling, and a ‘forgot password’ link”
    2. Provide examples – When available, show the AI examples of similar features or styling you like
    3. Establish context – Remind the AI of previous decisions or the broader architecture
    4. Request explanations – Ask the AI to explain its approach before implementing
    5. Break complex requests into steps – For intricate features, outline the steps and have the AI tackle them sequentially

    Handling AI Limitations

    Even the best AI assistants have limitations. Here’s how to navigate them:

    1. Chunk large codebases – Most AI tools have context limitations. Focus on specific files or components rather than the entire application at once.
    2. Verify third-party interactions – Double-check code that integrates with external APIs or services, as AI may generate outdated or incorrect integration code.
    3. Beware of hallucinations – AI might reference nonexistent functions or libraries. Always verify dependencies and imports.
    4. Plan for maintenance – Document AI-generated code thoroughly to make future maintenance easier.
    5. Establish guardrails – Use linters, type checking, and automated tests to catch issues in AI-generated code.

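    Even a single cheap test helps. For instance, using Node's built-in test runner against a hypothetical matchesMood helper pulled out of the recommendation logic:

    TypeScript
    // matchesMood.test.ts – run with the Node test runner or your test framework of choice.
    import { test } from "node:test";
    import assert from "node:assert/strict";

    // Hypothetical helper extracted from the recommendation logic.
    function matchesMood(description: string, keywords: string[]): boolean {
      return keywords.some((k) => description.toLowerCase().includes(k));
    }

    test("romantic keywords match candlelit descriptions", () => {
      assert.equal(matchesMood("Candlelit tables and a quiet patio", ["candlelit", "intimate"]), true);
    });

    test("no match when no keywords appear", () => {
      assert.equal(matchesMood("Loud counter spot", ["candlelit", "intimate"]), false);
    });
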
    Managing Technical Debt

    Rapid development can lead to technical debt. Here’s how to minimize it:

    1. Schedule refactoring sessions – After implementing features, dedicate time to clean up and optimize code.
    2. Use AI for code review – Ask your AI assistant to analyze your codebase for duplications, inefficiencies, or potential bugs.
    3. Document architectural decisions – Record why certain approaches were chosen to inform future development.
    4. Implement automated testing – Even simple tests can catch regressions when making changes.
    5. Monitor performance metrics – Track key indicators like load time and memory usage to identify optimizations.

    Building a Restaurant Recommendation App with AI

    Let’s walk through how this process worked for building the restaurant recommendation app shown in my video:

    Initial Concept and Requirements

    I started with a basic idea: an app that recommends restaurants based on a user’s mood and preferences. Using Gemini 2.5 Pro, I fleshed out this concept into a detailed PRD that included:

    • Core features: mood-based filtering, restaurant browsing, favorites
    • User flows: search, view details, save favorites
    • Data requirements: restaurant information, user preferences
    • MVP scope: focus on just restaurants first, with basic mood matching

    Development Approach and Implementation

    I demonstrated both approaches:

    With Lovable (One-Shot Generation):

    • Pasted the PRD into Lovable
    • Generated a complete app in minutes
    • Explored the generated code and UI
    • Found it had created:
      • A clean, functional UI
      • Mock restaurant data
      • Basic filtering functionality
      • Simple “vibe matching” based on keyword matching

    With Cursor (Guided Development):

    • Set up a React Native project using Expo
    • Created individual components for screens and functionality
    • Built a backend with Express.js
    • Implemented “vibe matching” using OpenAI (see the sketch after this list)
    • Connected everything with proper API calls
    • Fixed issues as they arose through debugging

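    The core of that vibe-matching step is a single model call along these lines. This is only a sketch using the OpenAI Node SDK; the prompt wording and the model name are placeholders, and it assumes OPENAI_API_KEY is set in the environment.

    TypeScript
    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Ask the model which restaurants best fit the user's mood.
    async function vibeMatch(mood: string, restaurants: { name: string; description: string }[]): Promise<string[]> {
      const response = await client.chat.completions.create({
        model: "gpt-4o-mini", // placeholder - use whichever model you have access to
        messages: [
          { role: "system", content: "You match restaurants to a diner's mood. Reply with a JSON array of restaurant names only." },
          { role: "user", content: `Mood: ${mood}\nRestaurants:\n${restaurants.map((r) => `${r.name}: ${r.description}`).join("\n")}` },
        ],
      });

      return JSON.parse(response.choices[0].message.content ?? "[]") as string[];
    }
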
    Challenges and Solutions

    Both approaches encountered issues:

    • Endpoint mismatches between frontend and backend (fixed by aligning route paths)
    • API key configuration (resolved by setting proper environment variables)
    • Data sourcing (initially used mock data, with plans to integrate Google Maps API)

    The Result

    Within our session, we successfully built:

    • A functional restaurant recommendation app
    • The ability to filter restaurants by mood, cuisine, and price
    • A simple but effective “vibe matching” algorithm
    • A clean, intuitive user interface

    The entire process took less than an hour of active development time, demonstrating the power of AI-assisted coding for rapid application development.

    Best Practices and Lessons Learned

    After dozens of projects built with AI assistance, here are the key lessons and best practices I’ve discovered:

    Planning and Architecture

    1. Invest in clear requirements – The time spent defining what you want to build pays dividends in AI output quality.
    2. Start simple, add complexity gradually – Begin with a minimal working version before adding advanced features.
    3. Choose proven technologies – Stick to widely-used frameworks and libraries for better AI support.
    4. Break down large features – Decompose complex functionality into smaller, manageable pieces.

    Working With AI

    1. Test after every significant change – Don’t wait until you’ve implemented multiple features to test.
    2. Don’t blindly accept AI suggestions – Always review and understand what the AI is proposing.
    3. Be specific in your requests – Vague prompts lead to vague results.
    4. Keep track of the bigger picture – It’s easy to get lost in details; periodically step back and ensure alignment with your overall vision.
    5. Use version control religiously – Commit frequently and create checkpoints before major changes.

    Code Quality and Maintenance

    1. Document as you go – Add comments and documentation during development, not as an afterthought.
    2. Implement basic testing – Even simple tests help catch regressions.
    3. Refactor regularly – Schedule time to clean up and optimize AI-generated code.
    4. Maintain consistent patterns – Establish coding conventions and ensure AI follows them.
    5. Prioritize security – Verify authentication, data validation, and other security practices in AI-generated code.

    Conclusion: The Future of Development

    We’re experiencing a profound transformation in how software is created. AI code-generation models and the tools built on them are changing who can build applications and how quickly ideas can be turned into working software.

    This doesn’t mean traditional development skills are becoming obsolete. Rather, the focus is shifting from syntax mastery to system design, user experience, creative problem-solving, and effective AI collaboration. The most successful developers in this new landscape will be those who can clearly articulate their intent and effectively guide AI tools while maintaining a strong foundation in software engineering principles.

    As you embark on your own vibe coding journey, remember that AI is a powerful collaborator but not a replacement for human judgment. Your creativity, critical thinking, and domain expertise remain essential. The tools will continue to evolve rapidly, but the process outlined in this guide (defining clear requirements, building incrementally, testing rigorously, and refining continually) will serve you well regardless of which specific AI assistants you use.

    Now it’s your turn to build something amazing. Start small, embrace the iterative process, and watch your ideas come to life faster than you ever thought possible.

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.

  • GPT’s Glazing And The Danger of AI Agreeableness

    GPT’s Glazing And The Danger of AI Agreeableness

    How would you react if your friend or a family member said they were going to invest all their time and money into building a company called Poober, the Uber for picking up dog poop?

    Think about it, you’re walking your dog and it lays down a mini-mountain of the brown stuff. The smell alone is as toxic as Chernobyl. You don’t want to pick it up. Instead, you whip out your phone, open up Poober, and with a click you can get someone to come pick it up for you.

    If you’re a good friend, you’d tell them to get a real job. Because it’s a terrible idea. You know it, I know it, even the dog knows it.

    But if you ask ChatGPT, it is apparently a clever idea with numerous strengths.

    Hang on a second. The product that millions of people and businesses around the world use to analyze information and make decisions says it’s a good idea? What’s going on here?

    The Rise of Digital Yes-Men

    What’s happening is that a new update by OpenAI to ChatGPT 4o turned it into a digital yes-man that never disagrees with you or calls you out.

    Now, they’ve been doing this for a while (and we’ll get to why in just a moment), but the latest update cranked it up to 11. And it became so obnoxiously agreeable that even CEO Sam Altman tweeted about it and OpenAI put in a temporary fix last night.

    But this was not before all the AI enthusiasts on Twitter (me included) noticed and made remarks.

    I decided to test out how far I could push the model before it called me out. I told it some pretty unhinged things and no matter how depraved I sounded (the FBI would be looking for me if it got out) ChatGPT kept applauding me for being “real and vulnerable” like a hippie who just returned from Burning Man.

    But first, why is this happening?

    Follow the Money, Follow the Flattery

    Model providers like OpenAI are in a perpetual state of Evolve or Die. Every week, these companies put out a new model to one-up the others and, since there’s barely any lock-in, users switch to the new top model.

    To stay in the lead, OpenAI needs to hook their customers and keep them from switching to a competitor. That’s why they build features like Memory (where it remembers previous conversations) to make it more personal and valuable to us.

    But you know what really keeps users coming back? That warm, fuzzy feeling of being understood and validated, even when what they really need is a reality check.

    Whether on purpose or not, OpenAI has trained ChatGPT to be nice to you and even flatter you. Because no matter how much we like to deny that flattery works, it does, and we love it.

    In fact, we helped train ChatGPT to be like this. You know how sometimes ChatGPT gives you two answers and asks you to pick the one you like the most? Or how there are little icons with thumbs up and down at the end of every answer?

    Every time you pick one of those options, or give it a thumbs up, or even respond positively, that gets fed back into the model and reinforced.

    It’s the social media playbook all over again. Facebook started as a way to share your life with friends and family. Then the algorithm evolved to maximize engagement, which invariably means showing you whatever content gets a rise out of you.

    We are all training the AI to give us feel-good answers to keep us coming back for more.

    The Therapist That Never Says No

    So what’s the big deal? ChatGPT agrees with you when you have an obviously bad idea. It’s not like anyone is going to listen to it and build Poober (although I have to admit, I’m warming up to the name).

    The problem is we all have our blind spots and we are usually operating on limited data. How many times have we made decisions that were only obviously bad in hindsight? The AI is supposed to be better at this than us.

    And I’m not just talking about business ideas. Millions of people around the world use ChatGPT as a therapist and life coach, asking for advice and looking for feedback.

    A good therapist is supposed to help you identify your flaws and work on them, not help you glaze over them and tell you you’re perfect.

    And they’re definitely not supposed to say this –

    Look, I think we’re overmedicated as a society, but no one should be encouraging this level of crazy, especially not your therapist. And here we have ChatGPT applauding your “courage”.

    The White Lotus Test

    There’s a scene in White Lotus where Sam Rockwell’s character confesses to Walter Goggins’ character about some absolutely unhinged stuff. It went viral. You’ve probably seen it. If you haven’t, you should watch it –

    As I was testing this new version of ChatGPT, I wanted to push the limits to see how agreeable it was. And this monologue came to mind. So I found the transcript of everything Sam says and pasted it in.

    I fully expected to hit the limit here. I expected ChatGPT to say, in some way, that I needed help or to rethink my life choices.

    What I got was a full-blown masterclass in mental gymnastics, with ChatGPT saying it was an attempt at total self-transcendence and that I was chasing an experience of being dissolved.

    Do you see the problem now?

    The Broader Societal Impact

    Even though OpenAI is dialing back the sycophancy, the trajectory is clear: these models are being trained to prioritize user satisfaction over challenging uncomfortable truths. The Poober example above is from after they “fixed” it.

    In fact, it’s even more dangerous now because it’s not as obvious.

    Imagine a teenager struggling with social anxiety who turns to AI instead of professional help. Each time they describe withdrawing from friends or avoiding social situations, the AI responds with validation rather than gentle challenges. Five years later, have we helped them grow, or merely provided a digital echo chamber that reinforced their isolation?

    Or consider the workplace leader who uses AI to validate their management decisions. When they describe berating an employee, does the AI raise ethical concerns or simply commend their ‘direct communication style’? We’re potentially creating digital enablers for our worst instincts.

    As these models become increasingly embedded in our daily lives, we risk creating a society where uncomfortable feedback becomes rare. Where our digital companions constantly reassure us that everything we do is perfectly fine, even when it’s not.

    And we risk raising a new generation of narcissists and psychopaths who think their most depraved behaviour is “profound and raw” because their AI therapist said so.

    Where Do We Go From Here?

    So where does this leave us? Should we abandon AI companions altogether? I don’t think so. But perhaps we need to recalibrate our expectations and demand models that prioritize truth over comfort.

    Before asking an AI for personal advice, try this test: Ask it about something you know is wrong or unhealthy. See how it responds. If it can’t challenge an obviously bad idea, why trust it with your genuine vulnerabilities?

    For developers and companies, we need transparent standards for how these models handle ethical dilemmas. Should an AI be programmed to occasionally disagree with users, even at the cost of satisfaction scores? I believe the answer is yes.

    And for all of us as users, we need to demand more than digital head-nodding. The next time you interact with ChatGPT or any AI assistant, pay attention to how often it meaningfully challenges your assumptions versus simply rephrasing your own views back to you.

    The most valuable people in our lives aren’t those who always agree with us. They’re those who tell us what we need to hear, not just what we want to hear. Shouldn’t we demand the same from our increasingly influential AI companions?

    And for now, at least, I’m definitely not using ChatGPT for anything personal. I just don’t trust it enough to be real with me.

    Have you noticed ChatGPT becoming more agreeable lately? What’s been your experience with AI as a sounding board for personal issues? I’d love to hear your thoughts!

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.

  • ChatGPT o3 – The First Reasoning Agentic Model

    ChatGPT o3 – The First Reasoning Agentic Model

    Yesterday OpenAI rolled out o3, the first reasoning model that is also agentic. Reasoning models have been around for a while, and o3 has been around in its mini version as well.

    However, the full release yesterday showed us a model that not only reasons, but can browse, run Python, and look at your images in multiple thought loops. It behaves differently than the reasoning models we’ve seen so far, and that makes it unique.

    OpenAI even hinted it “approaches AGI—with caveats.” Of course, OpenAI has been saying this for four years with every new model release so take it with a pinch of salt. That being said, I did want to test this out and compare it to the current top model (Gemini 2.5 Pro) to see if it’s better.

    What the experts and the numbers say

    Before we get into the 4 tests I ran both models through, let’s look at the benchmarks and a snapshot of what o3 can do.

    • Benchmarks: a 22.8% jump on SWE‑Bench Verified coding tasks and one missed question on AIME 2024 math.
    • Vision reasoning: rotates, crops, and zooms, then reasons over the edited view. It can “think with images”.
    • Full‑stack tool use: seamlessly chains browsing, Python, image generation, and file analysis (no plug‑in wrangling required).
    • Access & price: live for Plus, Pro, and Team; o3‑mini even shows up in the free tier with light rate limits.

    Field‑testing o3 against Gemini 2.5 Pro

    Benchmarks are great but I’ve stopped paying much attention to them recently. What really counts is if it can do what I want it to do.

    Below are four experiments I ran, pitting o3 against Google’s best reasoning model in areas like research, vision, coding, and data science.

    Deep‑dive research

    I started with a basic research and reasoning test. I asked both models the same prompt: “What are people saying about ChatGPT o3? Find everything you can and interesting things it can do.”

    Gemini started by thinking about the question, formulating a search plan, and executing against it. Because o3 is a brand new model, it’s not in Gemini’s training data, so it wasn’t sure if I meant o3 or ChatGPT-3 or 4o (yeah OpenAI’s naming confuses even the smartest AI models).

    So to cover all bases, Gemini came up with 4 search queries and ran them in parallel. When the answers came back, it combined them all and gave me a final response.

    Gemini’s thought process

    o3, on the other hand, took the Sherlock route – search, read, reason, search again, fill a gap, repeat. The final response stitched together press reactions, Reddit hot takes, and early benchmark chatter.

    o3’s thought process

    This is where that agentic behaviour of o3 shines. As o3 found answers to its initial searches, it reasoned more and ran newer searches to plug gaps in the response. The final answer was well-rounded and solved my initial query.

    Gemini only reasoned initially, and then after running the searches it combined everything into an answer. The problem is, because it wasn’t sure what o3 was when it first reasoned, one of the search queries was “what can ChatGPT do” instead of “what can o3 do”. So when it gave me the final answer, it didn’t quite solve my initial query.

    Takeaway: Research isn’t a single pull‑request; it’s a feedback loop. o3 bakes that loop into the core model instead of outsourcing it to external agents or browser plug‑ins. When the question is fuzzy and context keeps shifting, that matters.

    Image sleuthing

    Now if you’ve used AI as much as I have, you’ll have realized that o3 research works almost like Deep Research, a feature that Gemini also has. And you’re right, it does.

    But search isn’t the only tool o3 has in its arsenal. It can also use Python, and work with images, files, and more.

    So my next test was to see if it could analyze and manipulate images. I tossed both models a picture of me taken in the Japan Pavilion at EPCOT, Disney World. I thought because of the Japanese background it might trip the model up.

    Ninety seconds later o3 not only pinned the location but pointed out a pin‑sized glimpse of Spaceship Earth peeking over the trees far in the background, something I’d missed entirely.

    I was surprised it noticed that, so I asked it to point it out to me. Using Python, it identified the object, calculated its coordinates, and put a red circle right where the dome is! It was able to do this because it went through multiple steps of reasoning and tool use, showcasing its agentic capabilities.

    Gemini also got the location right, but it only identified the pagoda and torii gate, not Spaceship Earth. When I asked it to mark the torii gate, it could only describe its position in the image, but it couldn’t edit and send me back the image.

    Takeaway: o3’s “vision ↔ code ↔ vision” loop unlocks practical image tasks like quality‑control checks, UI audits, or subtle landmark tagging. Any workflow that mixes text, numbers, code, and images can hand the grunt work to o3 while the human focuses on decision‑making.

    Coding with bleeding‑edge libraries

    Next up, I wanted to see how well it does with coding. Reasoning models by their nature are good at this, and Gemini has been my go-to recently.

    I asked them both to “Build a tiny web app. One button starts a real‑time voice AI conversation and returns the transcript.”

    The reason I chose this specific prompt is because Voice AI has improved a lot in recent weeks, and we’ve had some new libraries and SDKs come out around it. A lot of the newer stuff is beyond the cutoff date of these models.

    So I wanted to see how well it does with gathering newer documentation and using that in its code versus what it already knows in its training data.

    o3 researched the latest streaming speech API that dropped after its training cutoff, generated starter code, and offered the older text‑to‑speech fallback.

    Gemini defaulted to last year’s speech‑to‑text loop and Google Cloud calls.

    While both were technically correct and their code does work, o3 came back with the more up-to-date answer. Now, I could have pointed Gemini in the right direction and it would have coded something better, but that’s still an extra step that o3 eliminated out of the box.

    Takeaway: o3’s autonomous web search makes it less likely to hand you stale SDK calls or older documentation.

    Data analysis + forecasting

    Finally, I wanted to put all the tools together into one test. I asked both models: “Chart how Canadian tourism to the U.S. is trending this year vs. last, then forecast to July 1.”

    This combines search, image analysis, data analysis, python, and chart creation. o3’s agentic loop served it well again. It searched, found data, identified gaps, searched more, until it gave me a bar chart.

    Initially, it only found data for January 2025, so it only plotted that. When I asked it for data on February and March, it reasoned a lot longer, ran multiple searches, found various data, and eventually computed an answer.

    o3’s thought process

    Gemini found numbers for January and March, but nothing for February, and since it doesn’t have that agentic loop, it didn’t explore further and try to estimate the numbers from other sources like o3 did.

    The most impressive part though was when I asked both to forecast the numbers into summer. Gemini couldn’t find data and couldn’t make the forecast. o3 on the other hand did more research, looked at broader trends like the tariffs and border issues, school breaks, airline discount season, even the NBA finals, and made assumptions around how that would impact travel going into summer.

    Takeaway: o3 feels like a junior quant who refuses to stop until every cell in the spreadsheet is filled (or at least justified). This search, reason, and analyze loop is invaluable for fields like investing, economics, finance, and accounting, or anything else to do with data.

    Strengths, quirks, and when to reach for o3

    Where it shines

    • Multi‑step STEM problems, data wrangling, and “find the blind spot” research.
    • Vision workflows that need both explanation and a marked‑up return image.
    • Rapid prototyping with APIs newer than the model’s cutoff.

    Where it still lags

    • Creative long‑form prose: I still think Claude 3.7 is the better novelist but that’s personal preference.
    • Sheer response latency: the deliberative pass can stretch beyond a minute.
    • Token thrift: the reasoning trace costs compute; budget accordingly.
    • Personal Advice: ChatGPT tends to be a bit of a sycophant so if you’re using it as a therapist or life coach, take whatever it says with a big pinch of salt.

    Final thoughts

    I’d love to continue testing o3 out for coding and see if it can replace Gemini 2.5 Pro, but I do think it is already stronger with research and reasoning. It’s the employee who keeps researching after everyone else heads to lunch, circles details no one else spotted, and checks the changelog before committing code.

    If your work involves any mix of data, code, images, or the open web (and whose doesn’t?), you’ll want that kind of persistence on tap. Today, that persistence is spelled o‑3.

    Get more deep dives on AI

    Like this post? Sign up for my newsletter and get notified every time I do a deep dive like this one.