Machine Learning Engineer

Reference Number: RKMNDE14

Brooklyn Park, MN

5.5 Months

Services

Job Description

JOB SUMMARY

We are seeking a skilled Machine Learning Engineer (Contractor) to support and maintain critical batch workflows that generate large-scale forecasts. These workflows are orchestrated through a custom in-house scheduling tool and use R, Python, Bash, and Spark on a YARN-managed on-premises cluster. The engineer will be responsible for smooth daily operations, including monitoring, restarting, and troubleshooting jobs to minimize downtime and maintain system reliability. Strong problem-solving skills and the ability to quickly diagnose issues across multiple technologies are key in this role.

In addition to operations, the engineer will contribute development expertise by enhancing the functionality, stability, and performance of existing jobs. This includes submitting code changes in Python (PySpark), Bash, or Terraform to improve orchestration and infrastructure configurations. The role also involves implementing observability metrics and monitoring solutions using tools such as OpenTelemetry (OTEL), Kibana, REST APIs, and custom instrumentation. The ideal candidate will be comfortable collaborating via GitHub pull requests (PRs), proactive in identifying improvement opportunities, and effective at balancing operational support with development contributions.

Required Skills

Technical Skills:
Proficiency with Python (PySpark), Bash, and working knowledge of R
Experience with Apache Spark on YARN-managed clusters (large-scale, on-premises environments preferred)
Familiarity with workflow orchestration tools (Airflow, Luigi, or custom equivalents)
Experience with Terraform (infrastructure-as-code)
Strong background in job monitoring and troubleshooting in distributed environments
Knowledge of observability/monitoring practices using OTEL, Kibana, REST APIs, and custom metrics instrumentation
Hands-on experience with GitHub workflows (pull requests, branching strategies, code reviews)

Soft Skills:
Strong analytical and troubleshooting skills with attention to detail
Clear and effective communication, especially in cross-functional environments
Ability to prioritize operational stability while driving code improvements
Proactive mindset with a focus on reliability and continuous improvement
Collaborative attitude, able to work effectively with developers, data scientists, and operations staff

TECHNICAL SKILLS

Must Have

Apache Hadoop, Apache Hive, Apache Spark, Apache Spark ecosystem, Big Data
Docker
Git/GitHub
PySpark
Python

Nice To Have
Airflow or Similar Orchestration Tools
Bash Scripting
Grafana
Kibana
MLOps
OpenTelemetry
R
Terraform

Must have: Spark, Hadoop, orchestration, observability, etc.

VIVA is an equal opportunity employer. All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.