
You will design and develop data pipelines using Python, PySpark, and SQL within the AWS ecosystem. You will also implement and manage complex data workflows using Apache Airflow and maintain data catalogs with AWS Glue.
Ideally, a degree in Information Technology, Computer Science, or a related field
Ideally, 5+ years of experience in the data engineering field
Strong expertise in Python
Strong expertise in PySpark
Strong expertise in SQL
Strong expertise in the overall AWS data ecosystem
Proficiency with GitHub
Experience with GitLab as a version control system
Experience with S3 buckets for storing large volumes of raw and processed data
Experience with Apache Airflow (Amazon MWAA) for orchestrating tasks
Experience with Apache Iceberg (or similar) for managing data in a data lake
Experience with the AWS Glue Data Catalog to organize metadata
Experience with Amazon Athena for interactive querying
Familiarity with data modeling techniques
Knowledge of the stages data moves through in a data lake (e.g., the bronze/silver/gold layers of the Medallion Architecture)
Familiarity with Terraform (nice-to-have)
Familiarity with CI/CD pipelines (nice-to-have)
To apply, please follow the link provided.