Lead Data Engineer + AI
Client: Altimetrik Takeda
Location: Remote
Minimum 3 years of experience as a Lead required.

Remote, USA Full-time
About the role
We're looking for a Senior Data Engineer to build and scale our Lakehouse and AI data pipelines on Databricks. You'll design robust ETL/ELT, enable feature engineering for ML/LLM use cases, and drive best practices for reliability, performance, and cost.

What you'll do
• Design, build, and maintain batch/streaming pipelines in Python + PySpark on Databricks (Delta Lake, Autoloader, Structured Streaming).
• Implement data models (Bronze/Silver/Gold), optimize with partitioning, Z-ORDER, and indexing, and manage reliability (DLT/Jobs, monitoring, alerting).
• Enable ML/AI: feature engineering, MLflow experiment tracking, model registries, and model/feature serving; support RAG pipelines (embeddings, vector stores).
• Establish data quality checks (e.g., Great Expectations), lineage, and governance (Unity Catalog, RBAC).
• Collaborate with Data Science/ML and Product to productionize models and AI workflows; champion CI/CD and IaC.
• Troubleshoot performance and cost issues; mentor engineers and set coding standards.

Must-have qualifications
• 10+ years in data engineering with a track record of production pipelines.
• Expert in Python and PySpark (UDFs, window functions, Spark SQL, Catalyst basics).
• Deep hands-on Databricks experience: Delta Lake, Jobs/Workflows, Structured Streaming, SQL Warehouses; practical tuning and cost optimization.
• Strong SQL and data modeling (dimensional, medallion, CDC).
• ML/AI enablement experience: MLflow, feature stores, model deployment/monitoring; familiarity with LLM workflows (embeddings, vectorization, prompt/response logging).
• Cloud proficiency on AWS/Azure/GCP (object storage, IAM, networking).
• CI/CD (GitHub/GitLab/Azure DevOps), testing (pytest), and observability (logs/metrics).

Nice to have
• Databricks Delta Live Tables, Unity Catalog automation, Model Serving.
• Orchestration (Airflow/Databricks Workflows), messaging (Kafka/Kinesis/Event Hubs).
• Data quality & lineage tools (Great Expectations, OpenLineage).
• Vector DBs (FAISS, pgvector, Pinecone), RAG frameworks (LangChain/LlamaIndex).
• IaC (Terraform), security/compliance (PII handling, data masking).
• Experience interfacing with BI tools (Power BI, Tableau, Databricks SQL).
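The responsibilities above center on medallion (Bronze/Silver/Gold) modeling, CDC-style deduplication, and quality gates. As a purely illustrative sketch, with plain Python standing in for PySpark/Delta Lake and hypothetical helper names (`expect_not_null`, `promote_to_silver`) rather than any real Databricks or Great Expectations API, the Bronze-to-Silver promotion this role would own looks conceptually like this:

```python
# Illustrative Bronze -> Silver promotion with a quality gate.
# Plain Python stands in for PySpark/Delta Lake on Databricks; the helper
# names (expect_not_null, promote_to_silver) are hypothetical, not a real API.

def expect_not_null(rows, column):
    """Great Expectations-style check: `column` must be non-null in every row."""
    failures = [r for r in rows if r.get(column) is None]
    return len(failures) == 0, failures

def promote_to_silver(bronze_rows, key="order_id", version_col="updated_at"):
    """Drop rows failing the quality gate, then keep the latest row per key
    (a simple CDC-style deduplication, as a Silver-layer MERGE would do)."""
    _, failures = expect_not_null(bronze_rows, key)
    clean = [r for r in bronze_rows if r not in failures]
    latest = {}
    for row in sorted(clean, key=lambda r: r[version_col]):
        latest[row[key]] = row  # later versions overwrite earlier ones
    return list(latest.values())

bronze = [
    {"order_id": 1, "amount": 10.0, "updated_at": 1},
    {"order_id": 1, "amount": 12.0, "updated_at": 2},  # newer version of order 1
    {"order_id": None, "amount": 5.0, "updated_at": 3},  # fails the null check
]
silver = promote_to_silver(bronze)
print(silver)  # one clean, deduplicated row for order 1
```

In the actual role this shape would be expressed as a Delta Lake MERGE or a DLT expectation rather than hand-rolled Python, but the pattern (validate, deduplicate, promote) is the same.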
Apply Now
