Zettamine - Sr. Data Engineer

Nexthire
US - Indiana

  • Job Type: Full time
  • 30+ days ago

Job Description

Job Title: Senior Data Engineer (PySpark)

Experience: 5 to 8 Years

Location: Bangalore

Job Summary:

We are looking for a highly skilled and experienced Senior Data Engineer to join our team in Bangalore. The ideal candidate will have a strong background in building scalable, high-performance data pipelines using PySpark and the Apache ecosystem. This role involves close collaboration with Data Scientists, Analysts, and cross-functional teams to deliver robust data solutions.

Key Responsibilities:

Design, develop, and optimize distributed data pipelines using PySpark.

Work with Apache tools such as Hadoop, Hive, HDFS, and others for large-scale data ingestion, transformation, and processing.

Ensure the performance, reliability, and scalability of ETL workflows in production environments.

Collaborate with stakeholders to gather requirements and deliver scalable data solutions.

Implement robust data quality checks and lineage tracking for auditability and transparency.

Handle data integration from diverse structured and unstructured sources.

Use Apache NiFi, where applicable, for automated data flow orchestration.

Write clean and maintainable code primarily in Python, with working knowledge of Java.

Participate in architectural discussions and performance tuning initiatives.

Required Skills:

5 to 7 years of experience in data engineering roles.

Expertise in PySpark for distributed computing and data transformation.

Strong understanding of the Apache ecosystem (Hadoop, Hive, Spark, HDFS).

Knowledge of ETL principles, data modeling, and data warehousing concepts.

Experience working with large-scale datasets and optimizing performance.

Hands-on proficiency with SQL and exposure to NoSQL databases.

Solid coding skills in Python, with working knowledge of Java.

Experience with version control (Git) and working in CI/CD environments.



