Description:
We are seeking a highly technical and self-directed Senior Software Engineer to contribute to the development of data processing pipelines for a new AI-enabled data analytics product targeted at Large Ag customers.
Ideal candidates will have:
5+ years of professional software development experience using Python
3+ years of hands-on experience with Databricks and PySpark in production environments
We are looking for mid-career professionals with a proven track record of deploying cloud-native solutions in fast-paced software delivery environments.
In addition to technical expertise, successful candidates will demonstrate:
Strong communication skills, with the ability to clearly articulate technical concepts to both technical and non-technical stakeholders (this is essential; please vet candidates accordingly)
The ability to work effectively with limited supervision in a distributed team environment
A disciplined engineering approach: breaking down work into small, reviewable increments, authoring focused pull requests, and iterating toward solutions progressively rather than in large, delayed batches
Key Responsibilities:
Author and optimize PySpark ETL and streaming jobs on Databricks to ensure efficient, scalable, and reliable data processing workflows
Design and implement Databricks-native solutions — including Delta Live Tables, Structured Streaming, and Vector Search — to process large-scale datasets for analytical and operational use cases
Build and maintain CI/CD pipelines using GitHub Actions, with a strong emphasis on code quality, test coverage, and incremental delivery
Contribute infrastructure-as-code using Terraform
Support field testing and customer operations by debugging and resolving data issues
Work closely with data scientists to productionize prototypes and proof-of-concept models
Required Skills & Experience:
Excellent coding skills in Python with experience deploying production-grade software
Deep professional experience building Databricks workflows, optimizing PySpark queries, and working with Delta Lake
Hands-on experience with modern Databricks capabilities, particularly Structured Streaming, Delta Live Tables, and Vector Search
Demonstrated proficiency with GitHub: authoring well-scoped pull requests, conducting code reviews, and managing collaborative branching workflows
Solid understanding of cloud computing fundamentals, with working knowledge of AWS services such as S3, Lambda, and IAM
Preferred Experience:
Experience with event-driven architectures and streaming data pipelines (e.g., Kafka, Kinesis)
Prior experience in cross-functional teams involving product, data science, and backend engineering
Experience working with geospatial data and related libraries
Notes:
Onsite
VIVA is an equal opportunity employer. All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.