About PlayerZero
PlayerZero is building a self-healing system for software that automates defect resolution and development. Engineering and support teams use it to:
- autonomously debug problems in the software (technical support)
- fix issues directly in the code
- prevent these problems from recurring
PlayerZero is backed by leading investors such as Foundation Capital, WndrCo, and Green Bay Ventures, and by operators including Matei Zaharia, Drew Houston, Dylan Field, and Guillermo Rauch, among others.
We believe that as software development speeds up, engineering and support teams face greater challenges maintaining software for their customers. We see this as an opportunity to reinvent how software is supported.
About the role:
We’re searching for a platform engineer who lives and breathes distributed systems. You’ll craft the core platform that synchronizes petabytes of data, indexes billions of lines of code, and coordinates fleets of AI agents running in parallel. If “five-nines,” “exactly-once,” and “sub-second latency” set your heart racing, you’ll thrive here.
In this role, you will:
- Design cloud-native architectures that scale elastically to handle thousands of microservices and GPU workers.
- Build high-throughput data planes for log ingest, change-data-capture (CDC), and real-time feature stores powering our self-healing loop.
- Implement ultra-low-latency indexes (vector, inverted, graph) that serve semantic queries across billions of code tokens.
- Synchronize state across clusters and regions so Fortune 500 customers see consistent results—no matter how big their codebase.
- Automate agent orchestration so hundreds of LLM-driven “fix bots” can work concurrently without stepping on each other.
- Harden reliability & security: champion chaos testing, live migrations, and defense-in-depth for customer data.
- Collaborate closely with ML researchers and product teams to translate novel ideas into battle-tested infrastructure.
You might thrive in this role if you have:
- 5-10+ years of experience designing and operating large-scale distributed systems in production.
- Deep knowledge of consensus, sharding, replication, and back-pressure techniques.
- Proven success synchronizing data pipelines for high-volume, multi-tenant environments.
- Proven success navigating and debugging large distributed codebases and infrastructure.
- Comfort writing infrastructure-as-code (Terraform / Pulumi) and automation (Argo, GitHub Actions, Buildkite).