
Data Engineer (Azure Databricks)


Not specified
0 years of work experience

Job Description

Responsibilities

  • Design, build and maintain scalable and efficient ETL/ELT pipelines in Azure Databricks to process structured, semi-structured and unstructured insurance data from multiple internal and external sources.
  • Collaborate with data architects, modelers, analysts and business stakeholders to gather data requirements and deliver fit-for-purpose data assets that support analytics, regulatory and operational needs.
  • Develop, test and optimize data transformation routines as well as batch and streaming solutions (leveraging tools such as Azure Data Factory, Data Lake Storage Gen2, Azure Event Hubs and Kafka) to ensure timely and accurate data delivery.
  • Implement rigorous data quality, validation and cleansing procedures — with a focus on enhancing reliability for high-stakes insurance use cases, reporting and regulatory outputs.
  • Integrate Informatica tools to facilitate data governance, including the capture of data lineage, metadata and data cataloguing as required by regulatory and business frameworks.
  • Ensure robust data security by following best practices for RBAC, managed identities, encryption and compliance with Hong Kong's PDPO, GDPR and other relevant regulatory requirements.
  • Automate and maintain deployment pipelines using GitHub Actions to ensure efficient, repeatable and auditable data workflows and code releases.
  • Conduct root cause analysis, troubleshoot pipeline failures and proactively identify and resolve data quality or performance issues.
  • Produce and maintain comprehensive technical documentation for pipelines, transformation rules and operational procedures to ensure transparency, reuse and compliance.
  • Apply subject matter expertise in Hong Kong Life and General Insurance to ensure that development captures local business needs and industry-specific standards.
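To give a concrete flavour of the pipeline and data quality work described above, here is a minimal PySpark sketch of the kind of ETL step typically run on Azure Databricks. It is illustrative only: the storage path, column names and Delta table names are hypothetical and are not taken from this posting.

    # Minimal, illustrative PySpark ETL step with a simple data quality gate.
    # All paths, columns and table names below are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()  # already available in Databricks notebooks

    # Ingest raw policy records from a landing zone in ADLS Gen2.
    raw = spark.read.json("abfss://landing@examplelake.dfs.core.windows.net/policies/")

    # Transform: cast amounts and derive a proper date column.
    policies = (
        raw.withColumn("premium", F.col("premium").cast("decimal(18,2)"))
           .withColumn("effective_date", F.to_date("effective_date"))
    )

    # Data quality gate: keep rows with a policy number and a positive premium,
    # quarantine everything else for investigation.
    valid = policies.filter(F.col("policy_no").isNotNull() & (F.col("premium") > 0))
    rejected = policies.subtract(valid)

    # Persist curated and quarantined outputs as Delta tables for reporting and audit.
    valid.write.format("delta").mode("append").saveAsTable("curated.policies")
    rejected.write.format("delta").mode("append").saveAsTable("quarantine.policies_rejected")

In practice a step like this would be orchestrated through Azure Data Factory or Databricks Workflows and promoted through a GitHub Actions pipeline, in line with the responsibilities above.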


Requirements

  • Bachelor's degree in Information Technology, Computer Science, Data Engineering or a related discipline.
  • 3+ years of experience as a data engineer, building and maintaining ETL/ELT processes and data pipelines on Azure Databricks (using PySpark or Scala), with a focus on structured, semi-structured and unstructured insurance data.
  • Strong experience orchestrating data ingestion, transformation and loading workflows using Azure Data Factory and Azure Data Lake Storage Gen2.
  • Advanced proficiency in Python and Spark for data engineering, data cleaning, transformation and feature engineering in Databricks for analytics and machine learning.
  • Experience integrating batch and streaming data sources via Kafka or Azure Event Hubs for real-time or near-real-time insurance applications.
  • Hands-on use of Informatica for data quality, lineage and governance to support business and regulatory standards in insurance.
  • Familiarity with automation and CI/CD of Databricks workflows using GitHub Actions.
  • Understanding of data security, RBAC, Key Vault, encryption and best practices for compliance in the insurance sector.
  • Experience optimizing data pipelines to support ML workflows and BI/reporting tools.
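As a further illustration tied to the streaming requirement above, the sketch below reads events from Azure Event Hubs through its Kafka-compatible endpoint with Spark Structured Streaming and lands them in a Delta table. The namespace, topic, table name and checkpoint path are placeholder assumptions, and in a real deployment the connection string would be retrieved from Azure Key Vault rather than hard-coded.

    # Illustrative Structured Streaming ingest from Azure Event Hubs (Kafka endpoint).
    # Namespace, topic, table and checkpoint path are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    bootstrap = "example-namespace.servicebus.windows.net:9093"
    connection_string = "<Event Hubs connection string>"  # fetch from Azure Key Vault in practice

    # Note: on Databricks the Kafka client is shaded (kafkashaded...PlainLoginModule);
    # on vanilla Spark use org.apache.kafka.common.security.plain.PlainLoginModule.
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{connection_string}";'
    )

    claims = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", bootstrap)
        .option("subscribe", "claims-events")
        .option("kafka.security.protocol", "SASL_SSL")
        .option("kafka.sasl.mechanism", "PLAIN")
        .option("kafka.sasl.jaas.config", jaas)
        .load()
    )

    # Land the raw events into a Delta table for near-real-time downstream processing.
    query = (
        claims.writeStream.format("delta")
        .option("checkpointLocation", "/tmp/checkpoints/claims-events")
        .toTable("raw.claims_events")
    )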



Job Type
Work Location: Not specified

About the Recruiting Company
IT Channel (Asia) Limited