koyracloud is an open-source, self-hosted Platform-as-a-Service (PaaS) for Docker Swarm. You connect a git repo with a small .paas/app.yaml manifest and koyracloud builds a container image, runs it behind HTTPS, and gives you secrets, persistent storage, custom domains, push-to-deploy, rollback, background workers, cron jobs and a Redis bus — the 'connect a repo and it deploys' experience of Vercel or Render, on your own hardware.

How do I self-host koyracloud?

Clone https://github.com/hikmahtech/koyracloud, set deploy/koyracloud.env and the Docker secrets described in deploy/README.md, then run ./deploy/deploy.sh against your swarm context. You need a Docker Swarm with Traefik (HTTPS + ACME) and an NFS export reachable by the nodes.

Is koyracloud free and open source?

Yes. koyracloud is open source under the AGPL-3.0 license and free to self-host.

Does koyracloud support background workers and cron jobs?

Yes. In .paas/app.yaml you can declare always-on workers, scheduled cron jobs (5-field UTC), and a shared Redis bus (redis: true) that injects a per-app-isolated REDIS_URL — so a web app can push events to a background worker, all from the same repo and image.

How is koyracloud different from Vercel, Render or Heroku?

koyracloud runs entirely on your own Docker Swarm — no cloud vendor, no per-seat or per-build pricing, and your code and data never leave your infrastructure. It targets single-operator and homelab use: trusted code, internal apps and side projects.

Why we build each app into its own image, off NFS

Name: koyracloud
Author: Arshad Ansari

When we started koyracloud, the natural architecture felt obvious: keep each app's code, node_modules, and venv on a shared NFS volume. A one-off container would land on that volume, build the dependencies in place, and then a long-running service would serve the code directly from there. To avoid rebuilding every time, we hand-rolled a dependency hash check—does the lockfile match what's cached here? If yes, skip the build.

It worked fine for small deployments and light traffic. But as the platform grew, something unexpected happened: NFS became our bottleneck, and not in a fun way.

The NFS Many-Small-Files Problem

NFS is architected around network round-trips. It handles large files reasonably well. But a node_modules tree (or a Python venv) is thousands of tiny files living in nested hierarchies. Every time the build container walked the directory tree to check what existed, or the app container listed a directory, NFS had to make a network call and wait for metadata.

When we triggered a build—especially a heavy one that needed to download and extract packages—the I/O traffic exploded. The NFS server spiked. That's expected and manageable at first. But what wasn't expected was what happened to the control plane.

The control plane's database lived on the same NFS volume. When builds crawled and hammered the NFS server, the database couldn't get the I/O it needed. Healthchecks started timing out. And a few times, mid-deploy, the control plane would fail its own healthcheck, crash, and take the entire deployment with it.

We'd essentially built a way for app deployments to knock over the platform itself.

The hand-rolled dependency hashing didn't help much either. We were checking file mtimes and computing hashes, which meant traversing the same directory trees that were causing I/O contention in the first place. More trips to NFS, more pain.

The Image-Based Rewrite

The fix was to flip the model upside down.

Instead of building dependencies onto a shared NFS volume, we build each app into a Docker image. The build happens on the node's local disk—fast, no network overhead, no contention with the control plane. Once the image is built, we push it to a built-in internal registry (a registry:2 service running as a Swarm stack). Then we tell Swarm to run the app from that image.

The flow looks like this:

repo → clone to local build dir → docker build → docker push → docker stack deploy → pull + run on any node

Any Swarm node can pull and run the image. Apps are no longer pinned to a specific build node. They can reschedule on any healthy node, any time. NFS is now touched only for what it's actually good at: persisted application data (the persist: directories in the manifest) and the registry storage itself (which is read-heavy and batched, not a hotly-contested I/O workload).

Docker's layer cache replaces our hand-rolled hashing. If the dependencies haven't changed, the layer is already there. If they have, Docker rebuilds just that layer. It's simpler, faster, and doesn't require us to maintain sync logic.

Why This Is Cleaner

A few wins fell out of this:

No blast radius to the control plane. Builds run on local disk. Heavy I/O doesn't starve the database. You can deploy apps without risking the platform.

Cache that actually works. Docker layers are a proven, efficient caching mechanism. We didn't have to maintain our own hashing and sync logic. That code is gone—one of those rare refactors where we removed more than we added.

Reschedulability. Apps run from images, not live code on a volume. A node goes down, Swarm pulls the image on another node and keeps going. No fiddling with NFS mount states or waiting for volume mounts to settle. No tying apps to specific hardware.

Isolation. Each deploy gets its own image. If something goes wrong with one app's build, it doesn't affect others. If you need to roll back, you're just running a different image. Clean.

The Trade-offs We Accepted

We do maintain an internal registry now, which is another service to keep alive. The registry itself stores blobs on NFS (or any persistent backend), but the access pattern is different—mostly reads, batched, and not the fine-grained metadata hammering that killed us before.

Builds take a few more seconds per deploy because of the push step. But they're no longer fighting I/O contention on a shared volume, so the total time is faster and more predictable.

We had to think through image garbage collection—don't want the registry filling up over time—but that's a solved problem with a quick policy.

The Outcome

The platform is now resilient in a way it wasn't before. Deploying apps doesn't risk the control plane. Builds are predictable. Apps can run anywhere and reschedule freely. We have less custom code to maintain. The failure mode that could take the whole platform down is gone.

Sometimes the obvious first design teaches you something. In this case, it taught us that NFS and many-small-files are fundamentally at odds, and that Docker images are a better unit of deployment than live code on a shared volume. Worth the rewrite.

For more on how koyracloud orchestrates deploys and manages images, see the architecture docs.