Description:
DevOps Engineer – GPU Infrastructure

We are seeking a skilled and motivated DevOps Engineer to join our team in building and maintaining high-performance infrastructure for GPU-based workloads. In this role, you'll be responsible for developing scalable, reliable systems across both on-premises and cloud environments. You’ll work closely with engineering teams to streamline CI/CD pipelines, automate operations, and support advanced compute environments.

Key Responsibilities
- Design and implement scalable infrastructure using Kubernetes across both on-prem and major cloud service providers (CSPs)
- Develop and maintain CI/CD pipelines with tools like Buildkite, GitHub Actions, and Jenkins to ensure smooth and reliable software delivery
- Automate infrastructure operations using Ansible, Python, and Bash to reduce manual toil and improve system consistency
- Manage service deployment within Kubernetes using Helm and GitOps-style workflows
- Configure and support GPU servers, including lifecycle management, health monitoring, and test automation
- Maintain node health and security, ensuring timely updates and proactive monitoring of GPU server fleets
- Provision, scale, and maintain Kubernetes clusters

Required Qualifications
- 2+ years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering
- Proficiency in Ansible, Python, and Bash for automation and tooling
- Solid hands-on experience with Kubernetes, Docker, and Helm
- Strong knowledge of CI/CD pipeline design, version control best practices, and build systems
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Nagios)

Nice to Have
- Familiarity with GPU-based compute environments and automated CI/test workflows
- Experience with infrastructure-as-code (IaC) tools such as Terraform
- Familiarity with container security practices and CVE scanning
- Background in high-performance computing (HPC), Slurm, or ML/AI training pipelines

VIVA is an equal opportunity employer.
All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.