Factory.ai: A Guide To Building A Software Development Droid Army

Last week, Factory gave us a masterclass in how to launch a product in a crowded space. While every major AI company and their aunt already has a CLI coding agent, all I kept hearing about was Factory and their Droid agents.

So, is it just another CLI coding agent, or is there some sauce to the hype? In this article, I’m going to do a deep dive into how to set up Factory, build (or fix) apps with it, and all the features that make it stand out in this crowded space.

Quick note – I’ve previously written about Claude Code and Amp, which have been my two coding agents of choice, so I’ll naturally make comparisons to them or reference some of their features in this as contrast. I’ve also written about patterns to use when coding with AI, which is model/agent/provider agnostic, so I won’t be covering them again in this post.

Let’s dive in.

Are These The Droids You’re Looking For?

Fun fact, Factory incorporated as The San Francisco Droid Company but were forced to change their name because Lucasfilm took offence. But yes, it’s a Star Wars reference and they kept the droids, so you’ll be seeing more Star Wars references through this post. Don’t say I didn’t warn you.

The Droids seem to be one of the main differentiators. The core philosophy here is that software development is more than just coding and code gen. There are a bunch of tasks that many software engineers don’t particularly enjoy doing. In Factory, you just hand it off to a Droid that specializes in that task.

They’re really just specialized agents. You can set your own up in Claude Code and Amp, but in Factory they come pre-built with optimized system prompts, specialized tools, and an appropriate model.

Code Droid: Your main engineering Droid. Handles feature development, refactoring, bug fixes, and implementation work. This is the Droid you’ll interact with most for actual coding tasks.

Knowledge Droid: Research and documentation specialist. Searches your codebase, docs, and the entire internet to answer complex questions. Writes specs, generates documentation, and helps you understand legacy systems.

Reliability Droid: Your on-call specialist. Triages production alerts, performs root cause analysis, troubleshoots incidents, and documents the resolution. Saves your sleep schedule.

Product Droid: Ticket and PM work automation. Manages your backlog, prioritizes tickets, handles assignment, and transforms rambling Slack threads into coherent product specs.

Tutorial Droid: Helps you learn Factory itself. Think of it as your onboarding assistant.

Installing the CLI: Getting Your Droid Army Ready

Factory has a web interface and an IDE extension, but I’m going to focus on the CLI as it’s what most developers use these days. It’s pretty easy to install:

Bash
# Install droid
curl -fsSL https://app.factory.ai/cli | sh

# Navigate to your project
cd your-project

# Start your development session
droid

On first launch, you’ll see Droid’s welcome screen in a full-screen terminal interface. If prompted, sign in via your browser to authenticate. You start off with a bunch of free tokens, so you can use it right away.

If you’ve used Claude Code, Amp, or any other coding CLI, you’ll find the interface familiar. In fact, it has the same “multiple modes” feature as Claude Code, where you can cycle through default, automatic, and planning using Shift-Tab.

If you’re in a project with existing code, start by asking droid to explain it to you. It will read your codebase and respond with insights about your project structure, test frameworks, conventions, and how everything connects.

Specification Mode: Planning Before Building

Now switch to Spec mode by hitting Shift-Tab and explain what you want it to do.

Bash
> Add a feature for users to export their personal data as JSON.
> Include proper error handling and rate limiting to prevent abuse.
> Follow our existing patterns for API endpoints.

Droid generates a complete specification that includes:

  • Acceptance Criteria: What “done” looks like
  • Implementation Plan: Step-by-step approach
  • Technical Details: Libraries, patterns, security considerations
  • File Changes: Which files will be created/modified
  • Testing Strategy: What tests need to be written

Build Mode

You review the spec. If something’s wrong or missing, you can hit Escape and correct it. Once you’re satisfied, you have multiple options. You can accept the spec and let it run in default mode, where it asks for permission for every change. Or you can proceed with one of 3 levels of autonomy:

  • Proceed, manual approval (Low): Allow file edits but approve every other change
  • Proceed, allow safe commands (Medium): Droid handles reversible changes automatically, asks for risky ones
  • Proceed, allow all commands (High): Full autonomy, Droid handles everything

Start with low autonomy and as you build trust with the tool, work your way up. Follow my patterns to ensure that if anything goes wrong, it can always be saved.

Spec Files Are Saved

One really interesting feature is that Droid saves approved specs as markdown files in .factory/docs/. You can toggle this on or off and specify the save directory in the settings (using the /settings command). This means:

  • You have documentation of decisions
  • New team members can understand why things were built certain ways
  • Future Droid sessions can reference these decisions

When using Claude Code I often ask it to save the plan as a markdown, so I love that this is an automatic feature in Factory.
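Because approved specs are plain markdown files, ordinary shell tools work on them. Here is a minimal sketch, assuming the default `.factory/docs/` directory mentioned above. The `latest_spec` helper name is mine, not part of Factory.

```bash
# Hypothetical helper: print the most recently modified spec file.
# Assumes specs live in .factory/docs/ (the default mentioned above);
# pass another directory as the first argument if you changed it in /settings.
latest_spec() {
  local dir="${1:-.factory/docs}"
  # ls -t sorts by modification time, newest first; head keeps the first entry.
  # If the directory is missing or has no .md files, nothing is printed.
  ls -t "$dir"/*.md 2>/dev/null | head -n 1
}
```

Something like `git add "$(latest_spec)"` would then commit the latest spec alongside the code it describes.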

Roger, Roger: Context For Your Droids

Another differentiating feature of Factory is the way it manages context. I’ve written about this before in how to build your own coding agent, but giving your agent the right context is what makes or breaks its performance.

Think about it, all these agents use the same underlying models, right? So why does one perform better? It’s the way they handle context. And Factory has multiple layers to it.

Layer 1: The AGENTS.md File

The primary context file is AGENTS.md, a standard file that tells AI agents how to work with your project. If you’re coming from Claude Code, it’s basically the same as the CLAUDE.md file. It gets ingested at the start of every conversation.

Your codebase has conventions that aren’t in the code itself, like how to run tests, code style preferences, security requirements, PR guidelines, and build/deployment processes. AGENTS.md documents these for Droids (and other AI coding tools). It’s something you should set up for every project at the start.

If you have a CLAUDE.md file already, just duplicate it and rename it to AGENTS.md. Or you can ask Droid to write one for you. It should look something like this:

Markdown
# MyProject

Brief overview of what this project does.

## Build & Commands

- Install dependencies: `pnpm install`
- Start dev server: `pnpm dev`
- Run tests: `pnpm test --run`
- Run single test: `pnpm test --run <path>.test.ts`
- Type-check: `pnpm check`
- Auto-fix style: `pnpm check:fix`
- Build for production: `pnpm build`

## Project Layout

├─ client/      → React + Vite frontend
├─ server/      → Express backend
├─ shared/      → Shared utilities
└─ tests/       → Integration tests

- Frontend code ONLY in `client/`
- Backend code ONLY in `server/`
- Shared code in `shared/`

## Development Patterns

**Code Style**:
- TypeScript strict mode
- Single quotes, trailing commas, no semicolons
- 100-character line limit
- Use functional patterns where possible
- Avoid `@ts-ignore` - fix the type issue instead

**Testing**:
- Write tests FIRST for bug fixes
- Visual diff loop for UI changes
- Integration tests for API endpoints
- Unit tests for business logic

**Never**:
- Never force-push `main` branch
- Never commit API keys or secrets
- Never introduce new dependencies without team discussion
- Never skip running `pnpm check` before committing

## Git Workflow

1. Branch from `main` with descriptive name: `feature/<slug>` or `bugfix/<slug>`
2. Run `pnpm check` locally before committing
3. Force-push allowed ONLY on feature branches using `git push --force-with-lease`
4. PR title format: `[Component] Description`
5. PR must include:
   - Description of changes
   - Testing performed
   - Screenshots for UI changes

## Security

- All API endpoints must validate input
- Use parameterized queries for database operations
- Never log sensitive data
- API keys and secrets in environment variables only
- Rate limiting on all public endpoints

## Performance

- Images must be optimized before committing
- Frontend bundles should stay under 500KB
- API endpoints should respond in under 200ms
- Use lazy loading for routes

## Common Commands

**Reset database**:
```bash
pnpm db:reset
```

You can also set up multiple AGENTS.md files to manage context better:

/AGENTS.md ← Repository-level conventions
/packages/api/AGENTS.md ← API-specific conventions
/packages/web/AGENTS.md ← Frontend-specific conventions
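To make the layering concrete, here is a small sketch that lists every AGENTS.md that would apply to a given file, from the repository root down to the file’s own directory. I haven’t verified how Droid actually merges or prioritizes nested files, so the root-to-leaf order and the `agents_for` name are assumptions for illustration only.

```bash
# Hypothetical helper: list the AGENTS.md files that apply to a path,
# ordered from the repository root (general) to the nearest directory (specific).
# Usage: agents_for <repo_root> <path/relative/to/root>
agents_for() {
  local root="$1" rel="$2"
  local dir="$root" parent rest part
  # The repository-level file comes first, if present.
  [ -f "$dir/AGENTS.md" ] && echo "$dir/AGENTS.md"
  parent=$(dirname "$rel")
  # A file directly in the root has no further directories to check.
  [ "$parent" = "." ] && return 0
  rest="$parent"
  while [ -n "$rest" ]; do
    part="${rest%%/*}"              # first remaining path component
    dir="$dir/$part"
    [ -f "$dir/AGENTS.md" ] && echo "$dir/AGENTS.md"
    case "$rest" in
      */*) rest="${rest#*/}" ;;     # drop the component just handled
      *)   rest="" ;;               # that was the last one
    esac
  done
  return 0
}
```

With the layout above, `agents_for . packages/api/src/routes.ts` would print the root file followed by `packages/api/AGENTS.md`, and skip the web one.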

Layer 2: Dynamic Code Context

When you submit a query, Droid’s first move is usually to search for the most relevant files, without you having to specify them manually. You can of course @-mention files, but it’s best to let it figure things out on its own and help it only when needed.

Since it already has an understanding of your repository from the AGENTS.md file, it knows where to go looking. It picks out the right sections of code, makes sure it isn’t duplicating context, and also lazy loads context (only pulls in context when necessary).

Factory also captures build outputs, test results, and so on as you execute commands to add to the context.

Layer 3: Tool Integrations

One big friction point in development is dealing with context scattered across code, docs, tickets, etc.

When you go through the sign up process in the Factory web app, the first thing it will prompt you to do is integrate your development tools, so the Droids have the context they need.

The most essential integration is your source code repository. You can connect Factory to your GitHub or GitLab account (cloud or self-hosted) so it can access your codebase. This is required because the Droids need to read and write code on your projects.

But the real differentiator is the integrations to other tools where context lives:

Observability & Logs (Sentry, Datadog):

  • Error traces from production
  • Performance metrics
  • Incident history
  • Stack traces

Documentation (Notion, Google Docs):

  • Architecture decision records (ADRs)
  • Design documents
  • Onboarding guides
  • API specifications

Project Management (Jira, Linear):

  • Ticket descriptions and requirements
  • Acceptance criteria
  • Related issues and dependencies
  • Discussion threads

Communication (Slack):

  • Technical discussions
  • Decisions made in channels
  • Problem-solving threads
  • Team conventions established in chat

Version Control (GitHub, GitLab):

  • Branch strategies
  • Commit history and messages
  • Pull request discussions
  • Code review feedback

If you connect these tools, your Droid can understand your entire project. It can see your code, read design docs, check Jira tickets, review logs from Sentry, and more, all to give you better help.

Layer 4: Organizational Memory

Factory maintains two types of persistent memory that survive across sessions:

User Memory (Private to you):

  • Your development environment setup (OS, containers, tools)
  • Your work history (repos you’ve edited, features you’ve built)
  • Your preferences (diff view style, explanation depth, testing approach)
  • Your common patterns (how you structure code, naming conventions you prefer)

Organization Memory (Shared across team):

  • Company-wide style guides and conventions
  • Security requirements and compliance rules
  • Architecture patterns and anti-patterns
  • Onboarding procedures

How Memory Works:

As you interact with Droids, Factory quietly records stable facts. If you say “Remember that our staging environment is at staging.company.com”, Factory saves this. Next session, Droid already knows.

If your teammate says “Always use snake_case for API endpoints”, that goes into Org Memory. Now every developer’s Droid follows this convention automatically.

Context In Action

Let’s say you’re implementing a new feature and need to follow the architecture defined in a design doc.

Bash
> Implement the notification system described in this Notion doc:
> https://notion.so/team/notification-system-architecture

Behind the Scenes:

  1. Droid fetches Notion document content
  2. Parses architecture decisions and requirements
  3. Search finds existing notification patterns
  4. Org Memory recalls team’s event-driven architecture conventions
  5. AGENTS.md shows where notification code should live

Droid implements according to:

  • Architecture specified in the doc
  • Existing patterns in your codebase
  • Team conventions from Org Memory
  • Your project structure

Customizing Factory

Factory.ai becomes even more powerful when you hook it into the broader ecosystem of tools and services your project uses. We’ve already discussed integrations like source control, project trackers, and knowledge bases for providing context.

Here we’ll focus on tips for integrating external APIs or data sources into your Factory workflows, and using custom AI models or agents.

Connecting APIs & External Data

Suppose your project needs data from a third-party API (e.g., a weather service or your company’s internal API). While building your project, you can certainly have the AI write code to call those APIs (it’s quite good at using SDKs or HTTP requests if you provide the API docs).

Another approach is using the web access tool if enabled: Factory’s Droids can have a web browsing tool to fetch content from URLs. You could give the AI a link to API documentation or an external knowledge source and it can then fetch and read it to inform its actions (with your permission).

Always ensure you’re not exposing sensitive credentials in the chat. Use environment variables for any secrets.
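Here is a minimal sketch of the environment-variable approach. `WEATHER_API_KEY`, the `require_env` helper, and the example URL are placeholders I made up for illustration, not anything Factory defines.

```bash
# Hypothetical helper: fail fast if a required secret is missing, so the key
# never has to be pasted into a prompt or committed to the repo.
# Only call it with variable names you control, since it uses eval.
require_env() {
  local name="$1" value
  # Indirect lookup of the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "Missing required environment variable: $name" >&2
    return 1
  fi
}

# Example usage (placeholder service and variable name):
#   require_env WEATHER_API_KEY || exit 1
#   curl -fsS -H "Authorization: Bearer $WEATHER_API_KEY" "https://api.example.com/v1/forecast"
```

The code Droid writes can then read the key from the environment at runtime, and the chat only ever sees the variable name.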

Using Slack and Chats

Factory integrates with communication platforms like Slack, which means you can interact with your Droids through chat channels.

For instance, you can mention it with questions or commands. Type “@factory summarize the changes in release 1.2” and the AI will respond in thread with answers or code suggestions.

Or ask it to fix an error with “@factory help debug this error: <paste error log>” and it will go off and do it on its own.

Customizing and Extending Agents

You can also create Custom Droids (essentially custom sub-agents), much like you do in Claude Code. For example, you could create a “Security Auditor” droid that has a system prompt instructing it to only focus on security concerns, with tools set to read-only mode.

You define these in .factory/droids/ as markdown files with some YAML frontmatter (name, which model to use, which tools it’s allowed, etc.) and instructions. Once enabled, your main Droid (the primary assistant) can delegate tasks to these sub-droids.
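As a rough sketch, here is how you might scaffold the “Security Auditor” droid from the shell. The frontmatter keys mirror the ones described above (name, model, tools), but the exact key names, the `inherit` value, and the tool names are assumptions on my part, so check Factory’s docs before relying on them.

```bash
# Sketch: scaffold a read-only "Security Auditor" droid definition.
# create_security_droid is a hypothetical helper name.
create_security_droid() {
  local dir="${1:-.factory/droids}"
  mkdir -p "$dir"
  # Quoted 'EOF' keeps the file content literal (no variable expansion).
  cat > "$dir/security-auditor.md" <<'EOF'
---
name: security-auditor
model: inherit          # assumption: reuse whatever model the main Droid uses
tools: [Read, Grep]     # assumption: hypothetical names for read-only tools
---

You are a security auditor. Review the code you are given for security
concerns only: injection risks, missing input validation, leaked secrets,
and unsafe dependencies. Do not modify files. Report findings as a list
with file, line, severity, and a suggested fix.
EOF
}
```

Running `create_security_droid` from the project root writes `.factory/droids/security-auditor.md`, which you can then edit by hand.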

Custom Slash Commands

In a similar vein, you can create your own slash commands to automate routine actions or prompts. For example, you might have a /run-test command that triggers a shell script to run your test suite and returns results to the chat. The AI could then monitor those logs and alert if something looks wrong.

Factory allows you to define these commands either as static markdown templates (the content gets injected into the conversation) or as executable scripts that actually run in your environment.
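For the executable flavor, the script behind a /run-test command could look roughly like this. It is a sketch under assumptions: `run_tests` and the `TEST_CMD` override are names I invented, the default command is borrowed from the pnpm example in AGENTS.md, and where the script lives and how it gets registered as a command is not shown here.

```bash
# Sketch of the logic behind a hypothetical /run-test command.
# TEST_CMD is an assumed override; the default matches the AGENTS.md example.
run_tests() {
  local cmd="${TEST_CMD:-pnpm test --run}"
  local log status
  log=$(mktemp)
  # Run the suite, capturing stdout and stderr so a short tail can be returned.
  if sh -c "$cmd" >"$log" 2>&1; then
    status=0
    echo "RESULT: tests passed"
  else
    status=$?   # exit code of the test command
    echo "RESULT: tests failed (exit code $status)"
    echo "--- last 20 lines of output ---"
    tail -n 20 "$log"
  fi
  rm -f "$log"
  return "$status"
}
```

A real command script would end with a call to `run_tests`. Printing a one-line verdict first and only the tail of the log keeps the output small enough for the agent to read without flooding its context.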

Bring Your Own Model Key

While Factory comes with all the latest coding models (which you can select using /model), you can also use your own key. The benefit is you still get Factory’s orchestration, memory, and interface, but with the model of your choice. You would pay your own API costs but get to use Factory for free.

Droid Exec

Droid Exec is Factory’s headless CLI mode: instead of an interactive chat, you run a single, non-interactive command that does the work and exits. It’s built for automation like CI pipelines, cron jobs, pre-commit hooks, and one-off batch scripts.

So you can say something like:

Bash
droid exec --auto high "run tests, commit all changes, and push to main"

And just walk away. Your droid will follow your commands and complete the task on its own.
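If you wire this into CI or a cron job, a thin wrapper helps keep the autonomy level explicit. This is a sketch, not Factory’s recommended setup: `run_droid_task` and `DROID_BIN` are hypothetical names, only `droid exec --auto <level> "<prompt>"` comes from the example above, and I’m assuming `low` and `medium` are accepted the same way `high` is.

```bash
# Sketch: wrap droid exec for unattended runs.
# Usage: run_droid_task <low|medium|high> "<prompt>"
run_droid_task() {
  local level="$1" prompt="$2"
  local bin="${DROID_BIN:-droid}"   # hypothetical override, handy for testing
  # Reject anything that is not one of the three autonomy levels.
  case "$level" in
    low|medium|high) ;;
    *) echo "Unknown autonomy level: $level" >&2; return 2 ;;
  esac
  # Bail out with a clear message if the CLI is not installed on this machine.
  if ! command -v "$bin" >/dev/null 2>&1; then
    echo "droid CLI not found; skipping" >&2
    return 127
  fi
  "$bin" exec --auto "$level" "$prompt"
}
```

For example, `run_droid_task low "summarize failing tests"` keeps an unattended job on the lowest level, in line with the advice to start with low autonomy and work your way up.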

There’s Three Of Us and One Of Him

As I mentioned earlier, Factory also has a web app and an IDE integration.

The web application provides an interactive chat-based environment for your AI development assistant. On your first login, you’ll typically see a default session with Code Droid selected (the agent specialized in coding) and an empty workspace ready to connect to your code.

You can connect directly to a remote repository on GitHub or to your local repository via the Factory Bridge app. And once you do that, you can run Factory as a cloud agent!

The UI here is pretty much a chat interface, so you’d use it just like the terminal. You still have @ commands to select certain files or even a Google doc or Linear ticket.

You can also upload files directly into the chat if you want the AI to consider code, data, or even screenshots that aren’t already in the repository.

Sessions and Collaboration

Each chat corresponds to a session, which can be project-specific. Factory is designed for teams, so sessions can potentially be shared or revisited by your team members (for example, an ongoing “incident response” session in Slack, or a brainstorming session for a design doc).

In the web app, you can create multiple sessions (e.g., one per feature or task) and switch between them. You can also see any sessions you started from the CLI. Useful if you want to catch up on a previous session or share with a teammate.

Guess I’m The Commander Now

Factory has actually been around for a couple of years, but they’ve been focused mostly on enterprise deployments. This is obvious from its team features and integrations.

With the recent launch, it looks like they’re trying to enter the broader market, and their message seems to be that they’re a platform to deploy agents not just for code generation, but across the software development lifecycle and the tools your company uses to build and manage products.

So if you’re a solo developer, you probably won’t notice much of a difference switching from Claude Code or Codex, aside from how the agent works in your terminal or IDE.

But if you’re part of a larger engineering team with an existing codebase, Factory is a much different experience, especially if you plug in all your tools and set up automations where your droids can run in the background and get tasks done.

And at that point, you can focus on the big picture while the droid army executes your vision.

Kinda like a commander.
