Johannesburg: Big Data Data Engineer posted by PBT Group
Job Description
We are seeking a skilled Data Engineer to design and develop scalable data pipelines that ingest raw, unstructured JSON data from source systems and transform it into clean, structured datasets within our Hadoop-based data platform. The ideal candidate will play a critical role in enabling data availability, quality, and usability by engineering the movement of data from the Raw Layer to the Published and Functional Layers.
Key Responsibilities:
- Design, build, and maintain robust data pipelines to ingest raw JSON data from source systems into the Hadoop Distributed File System (HDFS).
- Transform and enrich unstructured data into structured formats (e.g., Parquet, ORC) for the Published Layer using tools like PySpark, Hive, or Spark SQL.
- Develop workflows to further process and organize data into Functional Layers optimized for business reporting and analytics.
- Implement data validation, cleansing, schema enforcement, and deduplication as part of the transformation process.
- Collaborate with Data Analysts, BI Developers, and Business Users to understand data requirements and ensure datasets are production-ready.
- Optimize ETL/ELT processes for performance and reliability in a large-scale distributed environment.
- Maintain metadata, lineage, and documentation for transparency and governance.
- Monitor pipeline performance and implement error handling and alerting mechanisms.
Technical Skills & Experience:
- 3+ years of experience in data engineering or ETL development within a big data environment.
- Strong experience with Hadoop ecosystem tools: HDFS, Hive, Spark, YARN, and Sqoop.
- Proficiency in PySpark, Spark SQL, and HQL (Hive Query Language).
- Experience working with unstructured JSON data and transforming it into structured formats.
- Solid understanding of data lake architectures: Raw, Published, and Functional layers.
- Familiarity with workflow orchestration tools like Airflow, Oozie, or NiFi.
- Experience with schema design, data modeling, and partitioning strategies.
- Comfortable with version control tools (e.g., Git) and CI/CD processes.
Nice to Have:
- Experience with data cataloging and governance tools (e.g., Apache Atlas, Alation).
- Exposure to cloud-based Hadoop platforms like AWS EMR, Azure HDInsight, or GCP Dataproc.
- Experience with containerization (e.g., Docker) and/or Kubernetes for pipeline deployment.
- Familiarity with data quality frameworks (e.g., Deequ, Great Expectations).
Qualifications:
- Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field.
- Relevant certifications (e.g., Cloudera, Databricks, AWS Big Data) are a plus.
* In order to comply with the POPI Act, for future career opportunities, we require your permission to maintain your personal details on our database. By completing and returning this form, you give PBT your consent.
* If you have not received any feedback after 2 weeks, please consider your application unsuccessful.