Data Engineer
Position Overview
We are looking for an experienced Data Engineer to design and implement a scalable data platform using Databricks. The ideal candidate will have strong expertise in building modern data architectures, integrating diverse data sources (SQL Server, MongoDB, InfluxDB), and enabling analytics and reporting use cases.
Key Responsibilities
- Design, build, and maintain a robust data platform on Databricks.
- Develop efficient ETL/ELT pipelines to ingest data from SQL Server, MongoDB, and InfluxDB into Databricks Delta Lake.
- Implement real-time and batch data ingestion strategies using tools such as Kafka or Azure Event Hubs.
- Optimize data storage and processing for performance, scalability, and cost-effectiveness.
- Build data models to support BI, advanced analytics, and machine learning use cases.
- Collaborate with stakeholders (Data Scientists, Analysts, Product Teams) to define data requirements.
- Ensure data quality, governance, and security across the platform.
- Monitor, troubleshoot, and enhance data workflows for reliability.
Required Skills & Qualifications
- Proven experience in Databricks (Delta Lake, Spark, PySpark, SQL).
- Strong expertise in data integration from multiple sources (SQL Server, MongoDB, InfluxDB).
- Hands-on experience with ETL/ELT pipeline development and orchestration (e.g., Airflow or Azure Data Factory).
- Proficiency in data modeling, data warehousing concepts, and performance tuning.
- Familiarity with real-time data streaming (Kafka, Azure Event Hubs, or similar).
- Strong programming skills in Python and SQL.
- Experience with cloud platforms, particularly Azure.
- Excellent problem-solving skills and the ability to work in a fast-paced environment.
- Familiarity with Change Data Capture (CDC) patterns for incremental ingestion.
Preferred Qualifications
- Advanced experience with InfluxDB or other time-series databases.
- Exposure to machine learning workflows within Databricks.
- Knowledge of data governance frameworks (Unity Catalog, Purview, or similar).