Big Data Engineer

  • Permanent
  • Full time
  • Hybrid (India)

Job Title - Big Data Engineer

Location - Bangalore, Chennai, Gurgaon

Experience required - 5-8 years


Job Summary:

We are seeking Big Data Engineers to join our offshore team. The ideal candidates will have extensive experience in PySpark, real-time data processing, and Kafka. They should be well-versed in designing and implementing ETL pipelines and working with streaming data frameworks in cloud-based environments.



Key Responsibilities:

  • Develop, optimize, and maintain streaming data pipelines using PySpark and Apache Kafka.
  • Design and implement scalable ETL processes for real-time and batch data processing.
  • Work with Amazon EMR, Apache Spark, Apache NiFi, or similar frameworks to build near-real-time data pipelines.
  • Develop data solutions and frameworks to handle high-volume, high-velocity data streams.
  • Implement data storage solutions for structured and unstructured data, ensuring efficiency and reliability.
  • Write clean, maintainable, and well-documented code using Python, Groovy, or Java.
  • Optimize data structures and schemas for data processing and retrieval efficiency.
  • Collaborate with cross-functional teams to define, design, and implement data-driven solutions.
  • Troubleshoot and resolve performance bottlenecks and data quality issues.
  • Stay updated with the latest technologies and best practices in big data and streaming architectures.
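For a flavor of the day-to-day work described above, here is a minimal sketch of a PySpark Structured Streaming job consuming from Kafka; the broker address, topic name, event schema, and checkpoint path are placeholder assumptions for illustration, not details of our actual stack:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Hypothetical event schema; a real pipeline would match the producer's contract.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("amount", DoubleType()),
])

# Subscribe to a Kafka topic (broker address and topic name are placeholders).
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "events")
       .load())

# Kafka delivers raw bytes; parse the JSON payload and drop malformed rows.
events = (raw.select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*")
          .where(col("event_id").isNotNull()))

# Console sink for illustration only; a production job would write to a real store.
query = (events.writeStream
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/events")  # placeholder path
         .outputMode("append")
         .start())
query.awaitTermination()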



Required Qualifications & Skills:

  • Proficiency in PySpark and strong understanding of distributed computing principles.
  • Hands-on experience with Apache Kafka (including ksqlDB, MirrorMaker, or similar tools) for real-time data streaming.
  • Strong programming skills in at least one of the following: Python, Groovy, or Java.
  • Good understanding of data structures, ETL design, and data storage solutions.
  • Experience working with Amazon EMR, Apache Spark, Apache NiFi, or similar streaming data frameworks.
  • Ability to design and implement scalable and high-performance data pipelines.
  • Experience with cloud-based big data solutions (AWS, GCP, or Azure) is a plus.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration abilities.



Preferred Qualifications:

  • Experience with SQL and NoSQL databases.
  • Exposure to containerization technologies such as Docker and Kubernetes.
  • Familiarity with CI/CD pipelines for data engineering workflows.
  • Understanding of data governance, security, and compliance best practices.



Why Join Us?

  • Work on cutting-edge real-time streaming and big data technologies.
  • Collaborate with an expert team in a fast-paced, dynamic environment.
  • Competitive compensation and opportunities for professional growth.
  • Gain experience in modern cloud-based data architectures.

If you are passionate about data engineering, big data, and real-time analytics, we invite you to apply and be a part of our innovative team!



Please apply here or send your CV to sandya.velamuri@derisk360.com