Lead/Sr. Data Engineer + AI
Role: Sr./Lead Data Engineer + AI
Location: Boston, MA - Remote
Experience Needed: 10 to 15 years for Lead; 5 to 10 years for Senior. A minimum of 3 years of experience as a Lead is required.

About the role:
We're looking for a Senior Data Engineer to build and scale our lakehouse and AI data pipelines on Databricks. You'll design robust ETL/ELT, enable feature engineering for ML/LLM use cases, and drive best practices for reliability, performance, and cost.

What you'll do:
Design, build, and maintain batch/streaming pipelines in Python + PySpark on Databricks (Delta Lake, Auto Loader, Structured Streaming).
Implement data models (Bronze/Silver/Gold), optimize with partitioning, Z-ORDER, and indexing, and manage reliability (DLT/Jobs, monitoring, alerting).
Enable ML/AI: feature engineering, MLflow experiment tracking, model registries, and model/feature serving; support RAG pipelines (embeddings, vector stores).
Establish data quality checks (e.g., Great Expectations), lineage, and governance (Unity Catalog, RBAC).
Collaborate with Data Science/ML and Product to productionize models and AI workflows; champion CI/CD and IaC.
Troubleshoot performance and cost issues; mentor engineers and set coding standards.

Must-have qualifications:
10+ years in data engineering with a track record of production pipelines.
Expert in Python and PySpark (UDFs, window functions, Spark SQL, Catalyst basics).
Deep hands-on Databricks experience: Delta Lake, Jobs/Workflows, Structured Streaming, SQL Warehouses; practical tuning and cost optimization.
Strong SQL and data modeling (dimensional, medallion, CDC).
ML/AI enablement experience: MLflow, feature stores, model deployment/monitoring; familiarity with LLM workflows (embeddings, vectorization, prompt/response logging).
Cloud proficiency on AWS/Azure/Google Cloud Platform (object storage, IAM, networking).