Senior Data Engineer
INFOCUSP INNOVATIONS
Key Responsibilities
1. Graph Database Architecture & Development
- Design, develop, and maintain enterprise-scale graph data models using Neo4j.
- Architect and optimize graph storage, indexing, querying, and relationship modeling for high-performance workloads.
- Build and maintain knowledge graph solutions that integrate data from multiple structured and unstructured sources.
- Ensure scalability, reliability, and performance of Neo4j deployments in production environments.
2. Data Pipeline Engineering
- Design and implement end-to-end data extraction, transformation, and loading (ETL/ELT) pipelines.
- Build production-grade data ingestion frameworks for processing large volumes of data from diverse sources.
- Develop automated workflows for data validation, enrichment, lineage tracking, and monitoring.
- Optimize pipeline performance and operational reliability.
3. Unstructured Data Processing
- Develop systems for processing and extracting insights from documents, PDFs, reports, emails, web content, and other unstructured datasets.
- Transform extracted entities and relationships into graph-ready formats for Neo4j ingestion.
- Ensure high-quality data normalization, deduplication, and graph enrichment processes.
4. Workflow Orchestration
- Build, schedule, monitor, and maintain data workflows using Dagster / Airflow.
- Design reusable, modular, and observable pipeline architectures.
- Implement workflow monitoring, error handling, retries, and operational dashboards.
5. Collaboration & Ownership
- Work closely with data engineers, software engineers, AI/ML engineers, product teams, and business stakeholders.
- Take ownership of technical initiatives from design through production deployment.
- Communicate architecture decisions, trade-offs, and implementation plans effectively.
Qualifications
- 5+ years of experience in Data Engineering or Graph Data Engineering.
- Strong production experience with Neo4j, including deployment, scaling, optimization, and maintenance.
- Proven experience designing and implementing knowledge graphs and graph-based architectures.
- Experience building production-grade data extraction and ingestion pipelines.
- Strong experience working with unstructured data processing systems.
- Hands-on experience with Dagster / Airflow for workflow orchestration and pipeline management.
- Strong proficiency in Python and related data engineering libraries.
- Experience with data modeling, ETL/ELT design, and distributed data processing.
- Strong understanding of data quality, observability, monitoring, and operational best practices.
- Excellent communication skills and the ability to work independently with minimal supervision.