Platform & Ingestion Engineer (Data Ops & Infrastructure)
Own the seams — the evidence creation pipeline that turns raw documents into verified evidence, and the build system that packages everything for deployment.
Stack: Python (data pipelines), Docker/Kubernetes, GitHub Actions / CI/CD, TypeScript, document parsing (OCR, AST)
Why this role exists
The enforcement architecture needs two things to function: clean evidence in the database, and reliable deployment artifacts that work in any environment. You own both. You build the Kura evidence pipeline that transforms raw compliance PDFs into structured, verified database records. And you build the CI/CD and container orchestration that packages our multi-language stack into artifacts that deploy to cloud, API endpoints, or air-gapped bare-metal.
What you'll do
- Build the evidence creation pipeline. Write the workers that ingest compliance documents, extract structural hierarchies, generate vector embeddings, and load them into PostgreSQL. Work closely with our taxonomist to enforce SIRE rules during ingestion — malformed text never enters the index.
- Own container orchestration. Write the Docker and Kubernetes configurations that network our Aurora containers with local inference servers and the frontend application.
- Build multi-target CI/CD. Every commit builds into two targets: a cloud-ready image and a self-contained offline tarball. Build the release pipelines that make both work reliably.
- Handle evidence versioning. Regulatory frameworks change. When ISO 27001 is revised, the pipeline must invalidate superseded chunks and integrate the new hierarchy without breaking existing Kura rules or downstream governance gates.
- Keep glue code invisible. Integration code should be fault-tolerant and require zero maintenance once deployed. Build the system, not the fix.
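To make the first bullet concrete, here is a minimal sketch of the kind of validation gate the ingestion workers enforce. The names (`Chunk`, `ingest`, the `validate` hook) are illustrative, not Kura's actual API; the point is that malformed text is rejected before it can enter the index.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Chunk:
    doc_id: str
    section: str   # hierarchical path within the framework, e.g. "A.5.1"
    text: str

def ingest(doc_id: str,
           sections: dict[str, str],
           validate: Callable[[str], bool]) -> list[Chunk]:
    """Normalize and validate each extracted section before indexing.

    Validation happens at the gate: a section that fails never becomes
    a chunk, so downstream stages only ever see clean records.
    """
    chunks = []
    for section, text in sections.items():
        cleaned = " ".join(text.split())  # collapse OCR whitespace noise
        if not validate(cleaned):
            raise ValueError(f"{doc_id} {section}: failed ingestion rules")
        chunks.append(Chunk(doc_id, section, cleaned))
    return chunks
```

In the real pipeline the validator would encode SIRE rules and the output would feed the embedding and PostgreSQL load stages; this sketch only shows the fail-closed shape of the gate.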
What we're looking for
- Data + DevOps hybrid: Equally comfortable writing a GitHub Actions workflow and a Python script to chunk a 500-page PDF. You don't just move data — you understand how to transform it.
- Container mastery: Deep experience containerizing complex, multi-language architectures. You optimize image sizes and manage internal microservice networking.
- Scripting proficiency: Strong Python for data processing and shell scripting for infrastructure automation.
- Pragmatism: You automate what should be automated and don't over-engineer what shouldn't be.
Why Kenshiki?
You'll build the data arteries and delivery mechanisms for a governed AI system deployed into environments where failure is not an abstraction. The pipelines you build are what make evidence-grade AI possible.