Position Requirements:

Responsibilities:
- Participates in the planning and execution of policies, practices, and projects that acquire, control, protect, and enhance the value of data assets.
- Facilitates obtaining data from a variety of sources, in the right formats, ensuring adherence to data quality standards and resolving any information flow and content issues.
- Builds robust data pipelines that clean, transform, and aggregate unorganized data into databases.

Qualifications:
- Proficiency in developing, maintaining, monitoring, and operating long-running data pipelines or processing systems on Cloudera Data Platform
- Understanding of data extraction, transformation, loading, and performance tuning for solutions that consume multiple input data streams, using code-based, Git- and DevOps-enabled technologies in Python or SQL such as PySpark, pandas, or dbt
- 5+ years of experience in application/data development (e.g., Python)
- 5+ years of experience with data integration/ingestion tools (e.g., Apache NiFi)
- Advanced knowledge of SQL, Java, Microsoft SQL Server, and distributed data/computing platforms (e.g., Apache NiFi, Hadoop, MapReduce, Hive, HBase, Kafka, Spark)
- Experience with Scrum and Kanban methodologies
- Experience with UNIX/Linux, including basic commands and shell scripting
- Experience implementing and maintaining continuous integration/continuous delivery (CI/CD) pipelines and data platform management
By submitting this form, you consent to the VIVA team contacting you via phone or email.