Windmill Labs

Workers

Isolated workers that scale with your workload

Route jobs by tag, scale horizontally, and deploy workers anywhere. Every job runs in its own isolated environment.

Where your code actually runs

When you run a script or a flow in Windmill, the code does not execute on the server. It is picked up by a worker, an isolated process that runs one job at a time. The server handles the API, UI, and job queuing. Workers handle execution. You can run one worker or hundreds, on the same machine or across continents.

How workers work

Workers pull jobs one at a time from a shared PostgreSQL queue, with no coordinator and no message broker. A single worker running ~100ms jobs back to back handles about 10 jobs per second, or ~26M jobs/month. Adding more workers increases throughput linearly.
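The monthly figure follows from simple arithmetic. A rough sketch, assuming a 30-day month and a worker that runs jobs back to back with no idle time (the function name is illustrative and not part of Windmill):

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: 30-day month


def jobs_per_month(job_seconds: float, workers: int = 1) -> float:
    """Upper bound on jobs per month if every worker runs jobs back to back."""
    jobs_per_second_per_worker = 1.0 / job_seconds
    return workers * jobs_per_second_per_worker * SECONDS_PER_MONTH


single = jobs_per_month(0.1)            # one worker, 100ms jobs
ten = jobs_per_month(0.1, workers=10)   # linear scaling: 10x the workers, 10x the jobs
print(f"{single / 1e6:.1f}M jobs/month")  # 25.9M jobs/month
```

This is an upper bound: it ignores queue pull latency and per-job overhead, so real numbers will sit somewhat below it.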

Read the docs

Worker groups

Workers are organized into groups with tags that determine which jobs they pick up. Default groups handle language jobs and SQL/API calls. Custom groups can be created for GPU, high-memory, or environment-specific workloads.
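The tag matching described above can be sketched in a few lines. This is a minimal illustration, not Windmill's implementation: the group names, the tag names, and the `eligible_groups` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerGroup:
    name: str
    tags: frozenset


# Hypothetical groups and tags, chosen for illustration only.
GROUPS = [
    WorkerGroup("default", frozenset({"python", "typescript"})),
    WorkerGroup("native", frozenset({"sql", "api"})),
    WorkerGroup("gpu", frozenset({"gpu"})),
]


def eligible_groups(job_tag: str, groups: list) -> list:
    """Return the names of the groups whose tag set contains the job's tag."""
    return [group.name for group in groups if job_tag in group.tags]


print(eligible_groups("gpu", GROUPS))      # ['gpu']
print(eligible_groups("python", GROUPS))   # ['default']
print(eligible_groups("unknown", GROUPS))  # []
```

A job whose tag matches no group stays queued until a worker with that tag comes online, which is why custom tags are usually added together with the group that serves them.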


Dedicated workers

A dedicated worker (Enterprise) stays pinned to a single script and keeps its runtime permanently warm. Execution overhead drops to ~12ms per job versus ~50ms for standard workers. For lightweight tasks, dedicated workers are 1.35x faster than AWS Lambda.


Agent workers

Agent workers (Enterprise) connect to the server over HTTP only, with no direct database access. They can run behind firewalls, in remote data centers, or on edge infrastructure, and they run on Linux, Windows, or macOS without Docker.


Autoscaling

Workers scale automatically based on queue depth. Kubernetes-native autoscaling, ECS, and custom script-based scaling are all supported, and worker pools scale to zero when idle.
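A custom scaling script typically reduces to a function from queue depth to a target worker count. The sketch below shows one plausible shape for that function. The `desired_workers` name, the `jobs_per_worker` ratio, and the `max_workers` cap are assumptions made for illustration, not Windmill settings.

```python
import math


def desired_workers(queue_depth: int, jobs_per_worker: int = 10,
                    max_workers: int = 100) -> int:
    """Target worker count for a given number of queued jobs.

    Returns 0 for an empty queue (scale to zero) and never exceeds max_workers.
    """
    if queue_depth <= 0:
        return 0
    return min(max_workers, math.ceil(queue_depth / jobs_per_worker))


for depth in (0, 1, 25, 5000):
    print(depth, "->", desired_workers(depth))
```

Rounding up means a single queued job is enough to bring one worker back from zero, while the cap keeps a sudden backlog from requesting unbounded capacity.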


Concurrency and priority

Global concurrency limits prevent scripts from exceeding API rate limits. Priority levels from 1 to 100 ensure critical jobs always run first.
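The priority half of this can be pictured as a priority queue. The sketch below is a generic illustration built on Python's `heapq`, not Windmill's scheduler. It assumes that a higher number means higher priority and that ties are broken by submission order; the class and job names are hypothetical.

```python
import heapq
import itertools


class PriorityJobQueue:
    """Pops the highest-priority job first; ties go to the earliest submission."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()  # monotonically increasing tie-breaker

    def push(self, job: str, priority: int = 1) -> None:
        if not 1 <= priority <= 100:
            raise ValueError("priority must be between 1 and 100")
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), job))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]


queue = PriorityJobQueue()
queue.push("nightly-report", priority=1)
queue.push("payment-webhook", priority=100)
queue.push("cache-refresh", priority=50)
order = [queue.pop() for _ in range(3)]
print(order)  # ['payment-webhook', 'cache-refresh', 'nightly-report']
```

The sequence counter keeps the ordering stable, so two jobs at the same priority run in the order they were queued.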


Benchmarks

Windmill is the fastest job orchestrator in the industry. It scales linearly to 100+ workers with near-theoretical throughput: 100 workers achieve 981 jobs/sec on 100ms jobs, 98.1% of the 1,000 jobs/sec theoretical maximum. Dedicated workers run a fibonacci benchmark in 54ms versus 73ms on AWS Lambda. Benchmark data is publicly available and reproducible.
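The headline numbers can be cross-checked with a short calculation. The sketch below uses only the figures quoted above; the `scaling_efficiency` helper is an illustrative name, and the ideal throughput assumes each worker runs jobs back to back.

```python
def scaling_efficiency(measured_jobs_per_sec: float, workers: int,
                       job_seconds: float) -> float:
    """Measured throughput as a fraction of the ideal (workers / job duration)."""
    ideal = workers / job_seconds
    return measured_jobs_per_sec / ideal


efficiency = scaling_efficiency(981, workers=100, job_seconds=0.1)
speedup = 73 / 54  # Lambda time divided by dedicated-worker time

print(f"efficiency: {efficiency:.1%}")       # efficiency: 98.1%
print(f"speedup vs Lambda: {speedup:.2f}x")  # speedup vs Lambda: 1.35x
```

The 1.35x ratio matches the dedicated-worker comparison with AWS Lambda quoted earlier on this page.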



Build your internal platform on Windmill

Scripts, flows, apps, and infrastructure in one place.