Description: This work currently runs separately from the Negotiation AI MVP but must be future-ready for seamless integration.

Project: Supplier Contract Ingestion & Data Pipeline for Negotiation AI

Role Overview:
As our Data Engineer, you will own the end-to-end data pipelines. This includes designing scalable databases, developing ingestion workflows, collaborating with our internal Machine Learning Engineering team, and structuring supplier spend data. You’ll work closely with the Full Stack Developer to co-design the database schema for the Negotiation AI and ensure future compatibility with the ingestion pipeline.

Key Deliverables:
Ingestion Pipeline: Build and deploy a robust ETL/ELT pipeline using Azure to ingest 50,000+ contracts.
Metadata Extraction: Configure and run OCR workflows (e.g., OlmOCR/Azure Document Intelligence) to extract key contract fields such as dates, parties, and terms.
Scalable Database Schema: Design and implement a schema in Azure PostgreSQL to store contract metadata, OCR outputs, and supplier spend data. Collaborate with the Software Developer to design a future-ready schema for AI consumption.

Required Skills & Experience

Data Engineering & ETL/ELT
Experience with Azure PostgreSQL or similar relational databases
Skilled in building scalable ETL/ELT pipelines (preferably using Azure)
Proficient in Python for scripting and automation

OCR Collaboration
Ability to work with internal Machine Learning Engineering teams to validate and structure extracted data
Familiarity with OCR tools (e.g., Azure Document Intelligence, Tesseract) is a plus

SAP Ariba Integration
Exposure to cXML, ARBCI, SOAP/REST protocols is a plus
Comfortable with API authentication (OAuth, tokens) and enterprise-grade security

Agile Collaboration & Documentation
Comfortable working in sprints and cross-functional teams
Able to use GitHub Copilot to document practices for handover

Preferred Qualifications
Experience with large-scale contract ingestion projects
Familiarity with procurement systems and contract lifecycle management
Background in integrating data pipelines with AI or analytics platforms

Focused Scope with Future Impact: Lay the foundation for an AI-driven negotiation platform
Cutting-Edge Tools: Work with SAP Ariba, OCR, Azure, and advanced analytics
Collaborative Environment: Partner with Software Developers and AI specialists

Notes:
8:00 AM - 5:00 PM

VIVA is an equal opportunity employer. All qualified applicants have an equal opportunity for placement, and all employees have an equal opportunity to develop on the job. This means that VIVA will not discriminate against any employee or qualified applicant on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.