Data Engineer

World Bank Group

Chennai

Both national and international

GF - Professional and Technical level

Speaks English

Application deadline: July 03, 2026

Summary by Impactpool

The World Bank is seeking a Data Engineer to design, build, and maintain data infrastructure that supports data-driven decision-making. This role involves developing ETL processes, optimizing data retrieval performance, and collaborating with stakeholders to gather data requirements. The Data Engineer will work on data pipeline development, integration, and transformation, while ensuring data quality and observability. The position requires strong technical skills and the ability to work in a collaborative environment to support the organization's data initiatives.

Candidate Requirements:

Master's degree with 5 years of experience, or Bachelor's degree with 7 years
Expertise in Data Engineering and scalable data pipelines
Knowledge of data modeling and integration techniques
Experience with modern data lake architectures and Databricks
Familiarity with DevOps principles and automation
Understanding of Agile environments and cross-functional collaboration
Strong communication and stakeholder management skills

Data Engineer

Job #:	req37213
Organization:	World Bank
Sector:	Information Technology
Grade:	GF
Term Duration:	3 years 0 months
Recruitment Type:	Local Recruitment
Location:	Chennai, India
Required Language(s):	English
Preferred Language(s):
Closing Date:	7/3/2026 (MM/DD/YYYY) at 11:59pm UTC

Description

Do you want to build a career that is truly worthwhile? Working at the World Bank Group provides a unique opportunity for you to help our clients solve their greatest development challenges. The World Bank Group is one of the largest sources of funding and knowledge for developing countries; a unique global partnership of five institutions dedicated to ending extreme poverty, increasing shared prosperity and promoting sustainable development. With 189 member countries and more than 130 offices worldwide, we work with public and private sector partners, investing in groundbreaking projects and using data, research, and technology to develop solutions to the most urgent global challenges. For more information, visit www.worldbank.org

ITS Vice Presidency Context

The Information and Technology Solutions (ITS) Vice Presidential Unit (VPU) enables the World Bank Group to achieve its mission of ending extreme poverty and boosting shared prosperity on a livable planet by delivering transformative information and technologies to its staff working in more than 150 locations. For more information on ITS, see this video: https://www.youtube.com/watch?reload=9&v=VTFGffa1Y7w

Unit Context:

The ITS Data Office is the central entity within the World Bank Group’s Information and Technology Solutions (ITS) department responsible for enabling data, AI, information, and knowledge capabilities across the institution. It comprises four Units focused on platforms & tools, product & service delivery, enablement and governance. The office plays a pivotal role in advancing the Bank’s digital transformation, supporting business domains with trusted data, information and AI capabilities, and fostering a culture of responsible innovation.

The Platforms & Tools unit is responsible for building, integrating, and continuously modernizing the foundational technology infrastructure that powers data, AI, archives, and knowledge services across the World Bank Group. The unit leads the rationalization and simplification of legacy systems, and modernization towards platforms that are scalable, secure, interoperable, and designed for self-service and adoption. The unit plays a critical role in enabling enterprise-wide transformation by delivering data environments, digitization infrastructure, and open knowledge repositories that are AI-ready and aligned with business needs.

Duties and accountabilities:

Role Purpose:

The Data Engineer is responsible for designing, building, and maintaining the data infrastructure that supports the organization's data-driven decision-making processes. With limited supervision, this role develops ETL processes, optimizes data retrieval performance, and collaborates with stakeholders to gather and understand data requirements, ultimately supporting the organization's data integration and transformation initiatives.

Key Responsibilities:

Data Pipeline Development

• Design, develop, and maintain data pipelines for ingestion, transformation, and serving across batch and streaming workloads

• Build ETL/ELT workflows to integrate data from diverse sources into enterprise data platforms

• Develop data transformation logic using Apache Spark, PySpark, SparkSQL, and SQL

• Implement change data capture (CDC) patterns for real-time and near-real-time data synchronization

• Build streaming data pipelines for real-time analytics and operational use cases

• Optimize pipeline performance, resource utilization, and cost efficiency

Federated Data Pipelines & Domain Enablement

• Support federated data pipeline architecture that enables Line of Business (LOB) teams to own and manage their domain data

• Contribute to self-serve data infrastructure that abstracts complexity and allows domain teams to build pipelines independently

• Develop standardized pipeline deployment patterns that LOB teams can adopt while maintaining autonomy

• Support domain teams in building data products that are discoverable, interoperable, and compliant with enterprise standards

• Enable distributed data processing across domains while ensuring consistency through federated governance

• Assist in establishing data contracts and interoperability standards that allow seamless data sharing across domains

• Support the balance between domain autonomy and enterprise-wide governance requirements

Templates, Blueprints & Patterns

• Develop reusable pipeline templates and Infrastructure as Code (IaC) patterns for common data product types

• Create blueprints for data ingestion, transformation, quality validation, and serving that LOB teams can customize

• Build standardized patterns for batch pipelines, streaming pipelines, CDC implementations, and API-based integrations

• Contribute to a pattern library covering medallion architecture, dimensional modeling, and data product packaging

• Document best practices and reference architectures that guide LOB teams in building compliant, high-quality pipelines

• Develop starter kits and accelerators that reduce time-to-value for domain teams building new data products

• Create cookbooks and implementation guides that translate enterprise standards into actionable steps

• Support LOB teams in adopting templates while allowing appropriate customization for domain-specific needs

Data Integration

• Integrate data from multiple internal and external sources into unified data assets

• Build reusable data integration patterns and connectors for enterprise data sources

• Implement data ingestion using Auto Loader, COPY INTO, and other ingestion frameworks

• Develop API-based data integrations and file-based data processing workflows

• Ensure data consistency and reliability across integrated sources

• Support data migration efforts and legacy system integrations

Data Modeling & Transformation

• Implement medallion architecture patterns (bronze, silver, gold) for data organization and quality progression

• Develop dimensional models, fact tables, and aggregations for analytics use cases

• Build data transformation logic that ensures accuracy, consistency, and business alignment

• Create reusable transformation components and modular pipeline designs

• Optimize data models for query performance and consumption patterns

• Support schema evolution and data versioning requirements

Data Quality & Testing

• Implement data quality checks, validation rules, and automated testing within pipelines

• Develop data profiling and anomaly detection to identify quality issues

• Build data reconciliation processes to ensure accuracy across systems

• Implement unit testing, integration testing, and regression testing for pipelines

• Monitor data quality metrics and remediate issues proactively

• Document data quality rules and thresholds for pipeline outputs

Data Observability & Operations

• Implement logging, monitoring, and alerting for pipeline health and performance

• Build dashboards to track pipeline execution, data freshness, and quality metrics

• Develop automated error handling, retry logic, and failure notifications

• Support incident response and troubleshooting for pipeline failures

• Implement data lineage tracking to support auditability and impact analysis

• Ensure pipelines meet SLAs for data availability and freshness

Analytics & AI Enablement

• Build data pipelines that enable analytics, reporting, and business intelligence use cases

• Prepare and serve data for machine learning and AI workloads

• Develop feature engineering pipelines for ML model development

• Create semantic layers and curated datasets that enable self-service analytics

• Support integration with analytics tools including Power BI and Tableau

• Build data products with clear documentation and consumption guidance

Collaboration & Enablement

• Partner with data architects to align pipeline development with architectural standards

• Collaborate with business analysts and data scientists to understand data requirements

• Work with platform engineers to leverage platform capabilities effectively

• Contribute to technical documentation, runbooks, and knowledge sharing

• Support data consumers in understanding and accessing data assets

• Participate in code reviews and follow engineering best practices

Coaching & Technical Mentorship

• Support data engineering delivery with contractor and consultant teams under guidance from senior team members

• Contribute to knowledge-sharing sessions and workshops to build data engineering capability across LOB teams

• Document best practices, lessons learned, and technical standards for data engineering

• Stay current with industry trends in data mesh, federated architectures, and cloud data services

• Share insights and learnings with the broader team to foster continuous improvement

Continuous Improvement

• Assist in evaluating emerging data engineering technologies, frameworks, and tools

• Identify opportunities to enhance pipeline performance, reliability, and cost efficiency

• Contribute to the evolution of best practices and standards for data engineering

• Propose automation opportunities to reduce manual effort and improve consistency

Other duties as assigned

Selection Criteria

Education and Experience:

• Typically requires a master's degree with 5 years of experience or a bachelor's degree with a minimum of 7 years of relevant experience, or an equivalent combination of education and experience.

Core Skills and Capabilities

• Demonstrated expertise in Data Engineering, including the design, development, and optimization of scalable data pipelines, data platforms, and data processing solutions.

• Strong knowledge of data modeling, data structures and algorithms, and data integration techniques to support efficient and reliable data management.

• Advanced experience designing and implementing modern data lake architectures and leveraging Databricks to build and maintain data engineering solutions.

• Proven experience applying DevOps principles and practices, including automation, deployment, monitoring, and continuous improvement of data products and platforms.

• Strong understanding of workflow management and orchestration tools to support complex data processing and integration workflows.

• Experience managing and supporting the Product Development Life Cycle (PDLC), from requirements gathering and solution design through deployment and operational support.

• Demonstrated ability to leverage business intelligence concepts and tools to deliver actionable insights and support data-driven decision-making.

• Strong business acumen with the ability to understand organizational priorities and translate business requirements into effective technical solutions.

• Experience working within Agile environments, including the Scaled Agile Framework (SAFe), and collaborating effectively across cross-functional teams.

• Excellent stakeholder management, communication, and influencing skills, with the ability to build consensus and drive outcomes across technical and non-technical audiences.

Recommended Certifications

• SAFe Product Owner/Product Manager (PO/PM) certification or other relevant Agile certifications.

• Industry-recognized certifications in Data Engineering, Data Analytics, Platform Architecture, Data Integration, Cloud Technologies, or related disciplines.

WBG Culture Attributes:

1. Sense of urgency: Anticipate and quickly respond to the needs of internal and external stakeholders.
2. Thoughtful risk-taking: Challenge the status quo and push boundaries to achieve greater impact.
3. Empowerment and accountability: Empower yourself and others to act and hold each other accountable for results.

The World Bank Group offers comprehensive benefits, including a retirement plan; medical, life and disability insurance; and paid leave, including parental leave, as well as reasonable accommodations for individuals with disabilities.

We are proud to be an equal opportunity and inclusive employer with a dedicated and committed workforce, and do not discriminate based on gender, gender identity, religion, race, ethnicity, sexual orientation, or disability.

Learn more about working at the World Bank and IFC, including our values and inspiring stories.

At Impactpool we do our best to provide you the most accurate info, but closing dates may be wrong on our site. Please check on the recruiting organization's page for the exact info. Candidates are responsible for complying with deadlines and are encouraged to submit applications well ahead.

Before applying, please make sure that you have read the requirements for the position and that you qualify. Applications from non-qualifying applicants will most likely be discarded by the recruiting manager.
