Certified Woman & Minority Owned

Hadoop Platform Engineer


Reference Number: RKMNDE26

Hadoop Platform Engineer
experience  Not Disclosed
location  Brooklyn Park, MN (100% Remote)
duration  5.0 Months
salary  $95/hour - $100/hour
jobtype  Not Disclosed
Industry  Services
Job Description


This role is specifically for a Hadoop Platform Engineer.

The engineer will have deep, hands-on experience across the Apache Hadoop ecosystem, focused on building and maintaining a reliable, scalable data platform, along with strong core Hadoop and Java expertise that enables them to diagnose, optimize, and tune performance at the cluster and infrastructure levels. This role will also drive platform observability improvements, including standardizing monitoring, implementing health checks, and developing automated alerting systems, all to proactively identify and resolve issues before they impact users.

Design, build, and maintain a reliable, scalable, and high-performance Hadoop platform that supports large-scale data processing and analytics workloads.
Diagnose and optimize cluster-level performance issues across core Apache Hadoop components (HDFS, YARN, MapReduce, Hive, Spark, HBase) using deep Hadoop and Java expertise.
Develop and standardize monitoring and observability frameworks for the Hadoop ecosystem, ensuring proactive detection of system health issues.
Leverage strong Linux expertise (especially Ubuntu) to analyze system bottlenecks, perform kernel tuning, and optimize resource allocation for Hadoop workloads.
Implement automated health checks and alerting systems to reduce operational noise and minimize reliance on user-reported issues.
Collaborate with data engineering, platform, and infrastructure teams to tune resource utilization, improve job efficiency, and ensure cluster stability.
Establish and enforce operational standards for performance tuning, capacity planning, and version upgrades across the Hadoop platform.
Automate repetitive operational tasks and improve workflow efficiency using scripting languages (Python, Bash, or Scala).
Maintain and enhance data security, governance, and compliance practices within the Hadoop ecosystem.
Drive root cause analysis (RCA) and develop preventive measures for recurring production incidents.
Document best practices, operational runbooks, and configuration standards to ensure knowledge sharing and consistent platform management.
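To illustrate the automated health checks and alerting described above, here is a minimal sketch of a NameNode health evaluator. The metric names follow the HDFS NameNode FSNamesystem/FSNamesystemState JMX beans, but the thresholds and function name are hypothetical, not taken from this posting:

```python
import json

# Hypothetical alert thresholds for illustration, not production values.
CAPACITY_ALERT_PCT = 85.0
MAX_DEAD_NODES = 0

def check_namenode_health(jmx_metrics):
    """Evaluate HDFS health from NameNode JMX metrics.

    `jmx_metrics` is a dict of values parsed from the NameNode's /jmx
    endpoint (FSNamesystem / FSNamesystemState beans). Returns a list of
    human-readable alert strings; an empty list means healthy.
    """
    alerts = []
    used_pct = 100.0 * jmx_metrics["CapacityUsed"] / jmx_metrics["CapacityTotal"]
    if used_pct > CAPACITY_ALERT_PCT:
        alerts.append(
            f"HDFS capacity at {used_pct:.1f}% (threshold {CAPACITY_ALERT_PCT}%)"
        )
    if jmx_metrics["NumDeadDataNodes"] > MAX_DEAD_NODES:
        alerts.append(f"{jmx_metrics['NumDeadDataNodes']} dead DataNode(s)")
    if jmx_metrics["MissingBlocks"] > 0:
        alerts.append(f"{jmx_metrics['MissingBlocks']} missing block(s)")
    return alerts

if __name__ == "__main__":
    # Sample metrics: 90% full with one dead DataNode -> two alerts.
    sample = {"CapacityTotal": 1000, "CapacityUsed": 900,
              "NumDeadDataNodes": 1, "MissingBlocks": 0}
    print(json.dumps(check_namenode_health(sample), indent=2))
```

A real deployment would fetch the JMX JSON over HTTP and route non-empty alert lists to a pager or chat channel; keeping the evaluation logic pure, as above, makes it easy to unit-test.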

This role also requires experience with automation frameworks such as Chef and Ansible.
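As a sketch of the kind of Ansible automation this role involves, the fragment below deploys and schedules a node-level health-check script. The host group, file paths, and script name are hypothetical, assuming an Ubuntu-based Hadoop install:

```yaml
# Hypothetical playbook: install and schedule a health-check script.
- name: Deploy Hadoop node health check (illustrative)
  hosts: hadoop_workers
  become: true
  tasks:
    - name: Copy health-check script to worker nodes
      ansible.builtin.copy:
        src: files/hadoop_health_check.sh
        dest: /usr/local/bin/hadoop_health_check.sh
        mode: "0755"

    - name: Run the check every 5 minutes via cron
      ansible.builtin.cron:
        name: "hadoop health check"
        minute: "*/5"
        job: "/usr/local/bin/hadoop_health_check.sh"
```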

Must Have

Ansible
Apache Hadoop
Apache Hive
Apache Spark
Chef
Hadoop Distributed File System (HDFS)
Java
Linux

Nice To Have

Grafana
Python
Trino

Notes:
Remote
M-F, 8:00 AM - 5:00 PM


VIVA is an equal opportunity employer. All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.

Apply for this Job
