
South Africa: Data Engineering (US Working Hours)

Date 2024-12-18
Job Title Data Engineering (US Working Hours)
Salary Market related, monthly
Category IT Computing Software
Location ZA / South Africa

Job Summary

The core advanced data engineering skillset is a comprehensive combination of technical expertise, platform knowledge, and problem-solving abilities required to build, maintain, and optimize robust, scalable, and efficient data systems.

Data Architecture and Design
- Data Modeling: Create normalized and denormalized schemas (3NF, star, snowflake). Design data lakes, warehouses, and marts optimized for analytical or transactional workloads. Incorporate modern paradigms like data mesh, lakehouse, and delta architecture.
- ETL/ELT Pipelines: Develop end-to-end pipelines for extracting, transforming, and loading data. Optimize pipelines for real-time and batch processing.
- Metadata Management: Implement data lineage, cataloging, and tagging for better discoverability and governance.

Distributed Computing and Big Data Technologies
- Proficiency with big data platforms:
  - Apache Spark (PySpark, Sparklyr).
  - Hadoop ecosystem (HDFS, Hive, MapReduce).
  - Apache Iceberg or Delta Lake for versioned data lake storage.
- Manage large-scale, distributed datasets efficiently.
- Utilize query engines like Presto, Trino, or Dremio for federated data access.

Data Storage Systems
- Expertise in working with different types of storage systems:
  - Relational Databases (RDBMS): SQL Server, PostgreSQL, MySQL, etc.
  - NoSQL Databases: MongoDB, Cassandra, DynamoDB.
  - Cloud Data Warehouses: Snowflake, Google BigQuery, Azure Synapse, AWS Redshift.
  - Object Storage: Amazon S3, Azure Blob Storage, Google Cloud Storage.
- Optimize storage strategies for cost and performance: partitioning, bucketing, indexing, and compaction.

Programming and Scripting
- Advanced knowledge of programming languages:
  - Python (pandas, PySpark, SQLAlchemy).
  - SQL (window functions, CTEs, query optimization).
  - R (data wrangling, Sparklyr for data processing).
  - Java or Scala (for Spark and Hadoop customizations).
- Proficiency in scripting for automation (e.g., Bash, PowerShell).

Real-Time and Streaming Data
- Expertise in real-time data processing:
  - Apache Kafka, Kinesis, Azure Event Hub for event streaming.
  - Apache Flink or Spark Streaming for real-time ETL.
- Implement event-driven architectures using message queues.
- Handle time-series data and process live feeds for real-time analytics.

Cloud Platforms and Services
- Experience with cloud environments:
  - AWS: Lambda, Glue, EMR, Redshift, S3, Athena.
  - Azure: Data Factory, Synapse, Data Lake, Databricks.
  - GCP: BigQuery, Dataflow, Dataproc.
- Manage infrastructure-as-code (IaC) using tools like Terraform or CloudFormation.
- Leverage cloud-native features like auto-scaling, serverless compute, and managed services.

DevOps and Automation
- Implement CI/CD pipelines for data workflows. Tools: Jenkins, GitHub Actions, GitLab CI, Azure DevOps.
- Monitor and automate tasks using orchestration tools:
  - Apache Airflow, Prefect, Dagster.
  - Managed services like AWS Step Functions or Azure Data Factory.
- Automate resource provisioning using tools like Kubernetes or Docker.

Data Governance, Security, and Compliance
- Data Governance: Implement role-based access control (RBAC) and attribute-based access control (ABAC). Maintain master data and metadata consistency.
- Security: Apply encryption at rest and in transit. Secure data pipelines with IAM roles, OAuth, or API keys. Implement network security (e.g., firewalls, VPCs).
- Compliance: Ensure adherence to regulations like GDPR, CCPA, HIPAA, or SOC. Track and document audit trails for data usage.

Performance Optimization
- Optimize query and pipeline performance:
  - Query tuning (partition pruning, caching, broadcast joins).
  - Reduce IO costs and bottlenecks with columnar formats like Parquet or ORC.
- Use distributed computing patterns to parallelize workloads.
- Implement incremental data processing to avoid full dataset reprocessing.

Advanced Data Integration
- Work with API-driven data integration:
  - Consume and build REST/GraphQL APIs.
  - Implement integrations with SaaS platforms (e.g., Salesforce, Twilio, Google Ads).
- Integrate disparate systems using ETL/ELT tools like Informatica, Talend, dbt (data build tool), or Azure Data Factory.

Data Analytics and Machine Learning Integration
- Enable data science workflows by preparing data for ML: feature engineering, data cleaning, and transformations.
- Integrate machine learning pipelines:
  - Use Spark MLlib, TensorFlow, or scikit-learn in ETL pipelines.
  - Automate scoring and prediction serving using ML models.

Monitoring and Observability
- Set up monitoring for data pipelines. Tools: Prometheus, Grafana, or ELK stack.
- Create alerts for SLA breaches or job failures.
- Track pipeline and job health with detailed logs and metrics.

Business and Communication Skills
- Translate complex technical concepts into business terms.
- Collaborate with stakeholders to define data requirements and SLAs.
- Design data systems that align with business goals and use cases.

Continuous Learning and Adaptability
- Stay updated with the latest trends and tools in data engineering, e.g., data mesh architecture, Fabric, and AI-integrated data workflows.
- Actively engage in learning through online courses, certifications, and community contributions. Certifications like Databricks Certified Data Engineer, AWS Data Analytics Specialty, or Google Professional Data Engineer.


Data Engineering (US Working Hours) position available in ZA, South Africa. The job was posted on 2024-12-18 in the IT Computing Software category.


