Certified Woman & Minority Owned

DevOps Engineer


Reference Number: GDCADE29

Experience: Not Disclosed
Location: Santa Clara, CA
Duration: 12 months
Salary: Not Disclosed
Job type: Not Disclosed
Industry: Manufacturing
Pay rate: $81.69/hour - $86.69/hour
Job Description

DevOps Engineer – GPU Infrastructure

We are seeking a skilled and motivated DevOps Engineer to join our team in building and maintaining high-performance infrastructure for GPU-based workloads. In this role, you'll be responsible for developing scalable, reliable systems across both on-premises and cloud environments. You'll work closely with engineering teams to streamline CI/CD pipelines, automate operations, and support advanced compute environments.

Key Responsibilities

  • Design and implement scalable infrastructure using Kubernetes across both on-premises environments and major cloud service providers (CSPs)
  • Develop and maintain CI/CD pipelines with tools like Buildkite, GitHub Actions, and Jenkins to ensure smooth and reliable software delivery
  • Automate infrastructure operations using Ansible, Python, and Bash to reduce manual toil and improve system consistency
  • Manage service deployment within Kubernetes using Helm and GitOps-style workflows
  • Configure and support GPU servers, including lifecycle management, health monitoring, and test automation
  • Maintain node health and security, ensuring timely updates and proactive monitoring of GPU server fleets
  • Provision, scale, and maintain Kubernetes clusters

Required Qualifications

  • 2+ years of experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering
  • Proficiency in Ansible, Python, and Bash for automation and tooling
  • Solid hands-on experience with Kubernetes, Docker, and Helm
  • Strong knowledge of CI/CD pipeline design, version control best practices, and build systems
  • Experience with monitoring and observability tools (e.g., Prometheus, Grafana, Nagios)

Nice to Have

  • Familiarity with GPU-based compute environments and automated CI/test workflows
  • Experience with infrastructure-as-code (IaC) tools such as Terraform
  • Familiarity with container security practices and CVE scanning
  • Background in high-performance computing (HPC), Slurm, or ML/AI training pipelines

VIVA is an equal opportunity employer. All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.

