Senior Data Engineer
Cape Town – Western Cape – South Africa
Salary: Up to R80 000 pm CTC
Area: South Africa
Type: Remote
Build serious data systems that actually matter. This is a high-impact senior role for a data engineer who knows how to get the best out of Spark, writes strong Python, and enjoys turning messy legacy pipelines into clean, scalable systems. You will join a fast-moving team working on modern cloud data platforms, lakehouse architecture, and large-scale processing, where performance, quality, and good engineering judgment count.
The core tech includes Spark, PySpark, Python, Delta Lake, Parquet, Azure Synapse, SQL, Docker, and modern orchestration approaches.
Responsibilities
- Design, build, and optimise high-performance data pipelines using Python and PySpark
- Improve Spark workloads through better memory use, partitioning, shuffle tuning, and DAG optimisation
- Refactor legacy SQL-heavy ETL processes into modular, reusable Python libraries
- Build and maintain lakehouse data layers across Bronze, Silver, and Gold
- Work with Delta Lake and Parquet to improve versioning, schema management, and storage performance
- Help drive a code-first approach to orchestration and reduce reliance on cloud-specific tooling
- Support a cloud-agnostic engineering approach with portable, scalable solutions
- Contribute to code reviews, testing standards, and overall platform quality
- Partner with analysts, data scientists, and business teams to deliver practical data solutions
- Mentor junior engineers and help shape strong engineering standards across the team
Requirements
- Bachelor’s degree in Computer Science, Information Systems, Engineering, or a related field
- 6+ years of experience working with Spark or PySpark in production environments
- Strong Python skills, with experience building maintainable, production-grade applications
- Proven ability to identify and fix Spark performance bottlenecks using the Spark UI
- Solid SQL skills, including the ability to interpret and migrate existing ETL logic
- Experience with Azure Synapse Analytics, Dedicated SQL Pools, and Data Factory
- Strong hands-on experience with Delta Lake and Parquet in high-volume environments
- Experience with Docker and open-source or portable engineering standards
- Strong understanding of scalable data architecture and modern data engineering best practices
- Experience working in collaborative engineering teams on complex data platforms
- Ability to work across technical and non-technical teams and communicate clearly
- A strong grasp of security, compliance, and data governance in data engineering environments
If you are a senior data engineer who wants to own performance, shape modern data platforms, and work with a strong technical stack, apply now.