Data Engineering

SPAN Technology Services

STS-DATA ENGINEERING-1

Erode


Job Summary:
We are looking for motivated Data Engineers to join our team and help design, build, and maintain scalable data pipelines and warehouse solutions. Ideal candidates will have hands-on experience with modern data engineering tools and a strong interest in real-time and batch data processing.

Responsibilities

  • Design and implement data models to support analytical and reporting needs, ensuring efficient schema design for warehouse and transactional systems
  • Build, maintain, and optimize ETL/ELT pipelines for batch and real-time data processing using Python, PySpark, Apache Kafka, and Apache Flink
  • Set up and manage data warehouse infrastructure (Doris), including table design, partitioning strategies, and performance tuning
  • Develop and schedule data workflows using Apache Airflow, ensuring the reliability and monitoring of pipeline jobs
  • Integrate data from multiple sources including PostgreSQL, MS SQL, and MinIO into the data warehouse
  • Containerize data pipeline components using Docker for consistent deployment across environments
  • Ensure data quality, consistency, and integrity across pipelines through validation and testing
  • Collaborate with analysts, data scientists, and other engineering teams to understand data requirements and deliver solutions
  • Troubleshoot and resolve issues related to data pipelines, storage, and processing performance
  • Document data flows, schemas, and pipeline architecture for team reference

Required Qualifications

  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field
  • 1 to 3 years of hands-on experience in data engineering
  • Proficiency in Python and SQL for data processing and transformation
  • Experience with Apache Kafka for stream processing and messaging
  • Experience with Apache Flink or similar stream/batch processing frameworks
  • Hands-on experience with PySpark for large-scale data processing
  • Familiarity with Apache Airflow for workflow orchestration and scheduling
  • Working knowledge of relational databases (PostgreSQL, MS SQL)
  • Exposure to OLAP/data warehouse systems (Doris or similar columnar databases)
  • Experience with Docker for containerization of data pipelines
  • Familiarity with object storage solutions (MinIO or similar S3-compatible storage)
  • Understanding of data warehousing concepts, data modeling, and ETL/ELT pipeline design

Preferred Skills

  • Experience building real-time data pipelines using Kafka + Flink
  • Knowledge of distributed computing concepts (partitioning, parallelism, fault tolerance)
  • Familiarity with CI/CD practices for deploying data pipelines
  • Experience with version control (Git)
  • Understanding of data quality, validation, and monitoring practices
  • Basic knowledge of cloud platforms (AWS, Azure, or GCP) is a plus

Soft Skills

  • Strong problem-solving and debugging skills
  • Good communication and collaboration abilities
  • Eagerness to learn and adapt to new tools in a fast-paced environment

How to apply

To apply for this job, log in on our website. If you don't have an account yet, please register.