Senior Research Fellow - Data Engineer-National Project Workforce
- Organization: ICRISAT - The International Crops Research Institute for the Semi-Arid Tropics
- Location: Hyderabad
- Grade: Senior Research Fellow - Data Engineer
- Occupational Groups:
  - Engineering
  - Statistics
  - Human Resources
  - Information Technology and Computer Science
  - Scientist and Researcher
  - Project and Programme Management
- Closing Date: 2025-07-30
Senior Research Fellow - Data Engineer
ICRISAT seeks applications from motivated and dynamic Indian nationals for the position of Senior Research Fellow – Data Engineer under the Monsoon Mission-III project supported by the Ministry of Earth Sciences (MoES), Government of India. The role focuses on designing and managing robust data pipelines, integrating and processing multi-source datasets (climate, weather, agronomic), and supporting the backend data infrastructure required for generating AI-powered agro-advisories. The position also contributes to enabling AI workflows such as agent-based advisory generation, fine-tuning of ML/LLM models, and vector database integration to support personalized recommendations and dashboards. This role is critical to advancing the organization’s mission of improving agricultural productivity and sustainability in semi-arid regions across Asia and sub-Saharan Africa through advanced remote sensing applications in agriculture.
ICRISAT is a non-profit, non-political organization that conducts agricultural research for development in Asia and sub-Saharan Africa with a wide array of partners throughout the world. Covering 6.5 million square kilometers of land in 55 countries, the semi-arid or dryland tropics are home to over 2 billion people, 644 million of whom are the poorest of the poor. ICRISAT and its partners help empower these disadvantaged populations to overcome poverty, hunger and a degraded environment through better agricultural production systems.
ICRISAT is headquartered at Patancheru near Hyderabad, India, with two regional hubs and eight country offices in sub-Saharan Africa. ICRISAT envisions a prosperous, food-secure and resilient dryland tropics. Its mission is to reduce poverty, hunger, malnutrition and environmental degradation in the dryland tropics. ICRISAT conducts research on its mandate crops of chickpea, pigeonpea, groundnut, sorghum, pearl millet and finger millet in the arid and semi-arid tropics. The Institute focuses its work on the drylands and on protecting the environment. Tropical dryland areas are usually seen as resource-poor and perennially beset by shocks such as drought, thereby trapping dryland communities in poverty and hunger and making them dependent on external aid. Please visit www.icrisat.org
Responsibilities:
Design, develop, and maintain data pipelines and architectures to support real-time and historical data integration from different sources and datasets.
Clean, transform, and harmonize multi-source data to enable seamless analytical workflows and input-ready formats for models.
Set up and manage relational and NoSQL databases to ensure scalable and query-efficient storage for climate, soil, crop, and farmer data.
Support and automate statistical analysis and ML workflows, including bias correction of gridded datasets, forecast skill score analysis (e.g., RMSE, correlation, CRPSS), uncertainty quantification and scenario-based projections, and historical pattern analysis and seasonal outlook assessments.
Collaborate with domain experts to operationalize forecast-informed decision rules and integrate them into automated pipelines.
Contribute to machine learning and AI workflows, including model training, inference, embedding generation, and RAG-based pipelines for advisory automation.
Enable agentic LLM-based workflows for dynamic, context-specific agro-advisories using AI platforms.
Create interactive dashboards and analytical tools using Power BI and Python/R for internal teams and external stakeholders.
Ensure documentation, version control, and reproducibility of all data workflows and modeling pipelines.
Essential Qualifications:
Master’s degree in statistics, mathematics, data science, agricultural engineering, or a related quantitative discipline.
A minimum of 1 year of experience in statistical modeling or applied machine learning, preferably in agriculture, climate science, or environmental domains.
Strong programming skills in Python and R, with hands-on experience in statistical computing, machine learning workflows, and automation of data analysis.
Proven expertise in handling large datasets and working with SQL/NoSQL databases, including database design, data cleaning, and transformation. Experience with Apache Druid is an advantage.
Familiarity with forecast data analysis, including bias correction, skill score computation (e.g., MAE, RMSE, correlation), and evaluation of seasonal/short-term climate models.
Proficiency in Power BI or other visualization tools for interactive reporting and dashboards.
Working knowledge of machine learning libraries (e.g., scikit-learn, XGBoost, TensorFlow) for predictive modeling, clustering, and classification tasks.
Strong written and verbal communication skills, with the ability to work in interdisciplinary teams.
Desirable Qualifications:
Experience with vector databases (e.g., FAISS, ChromaDB, Pinecone) for embedding search and integration into AI workflows.
Familiarity with Retrieval-Augmented Generation (RAG) pipelines, LLM fine-tuning, and agentic architectures for building AI systems.
Hands-on experience with climate and weather datasets (e.g., IMD, CHIRPS, ERA5).
Exposure to bias correction methods, forecast verification metrics, and scenario-based projection models in a climate context.
Understanding of time-series analysis, spatio-temporal data handling, and uncertainty quantification.
Knowledge of cloud data services (e.g., AWS S3, Google BigQuery), APIs for automated data ingestion, and distributed processing tools.
Familiarity with Jupyter Notebooks, RMarkdown, and version-controlled collaborative workflows using Git.
Experience working in interdisciplinary teams involving agriculture, meteorology, and data science.
General:
This is a contractual role for a period of 36 months (3 years).
How to apply:
The position will remain open until a suitable candidate is identified. Shortlisting will start from 30 July 2025. All applicants should apply with their latest resume and the names and contact information of three references who are knowledgeable about their professional qualifications and work experience. All applications will be acknowledged; however, only short-listed candidates will be contacted.
ICRISAT is an equal opportunity employer and is committed to increasing diversity and maintaining a progressive and inclusive workplace. We welcome applications from all qualified candidates regardless of their ethnicity, race, gender, religious beliefs, sexual orientation, age, marital status or whether they have a disability.
Applications from non-qualifying applicants will most likely be discarded by the recruiting manager.