The Context of Program Evaluation in the Global South

Evidence from developing nations increasingly underscores the need for improved evaluation frameworks to ensure the long-term sustainability of South-South cooperation. Nations in the Global South stress the importance of creating, testing, and consistently applying monitoring and evaluation (M&E) approaches designed specifically for the principles and practices of South-South and triangular cooperation. At present there is a significant gap in this area, pointing to potential shortcomings in how these initiatives are designed, delivered, managed, and monitored. These challenges do not suggest inherent problems with this form of cooperation; rather, they indicate possible deficiencies in its supporting systems (United Nations Office for South-South Cooperation, 2018). To fully realize the developmental benefits of South-South and triangular cooperation, especially in reaching excluded and marginalized populations, greater attention must be given to addressing these challenges.

As interest in these cooperation modalities grows, stakeholders are calling for discussions on methodologies to assess the impact of these initiatives. However, several technical challenges hinder evaluation: the absence of a universal definition of South-South and triangular cooperation, the diverse nature of the activities and actors involved, and differing perspectives on how to measure contributions. Stakeholders have proposed various frameworks to tackle these challenges. Examples include the framework detailed by China Agricultural University based on China-United Republic of Tanzania collaboration, the NeST Africa chapter’s framework drawn from extensive multi-stakeholder engagement, and the South-South Technical Cooperation Management Manual published by the Brazilian Cooperation Agency (ABC). In addition, AMEXCID (Mexico) has outlined a strategy for institutionalizing an evaluation policy, including pilots to assess management processes, service quality, and project relevance and results. While India lacks an overarching assessment system, the Research and Information System for Developing Countries (RIS) think tank has conducted limited case studies to develop a methodological toolkit and analytical framework for assessing the impact of South-South cooperation.

Program evaluation initiatives have surged across the Global South in recent years. However, the evaluation discourse remains focused on narrower functions such as monitoring and auditing, often driven by the requirements of donors or funders. Moreover, the emphasis on evaluating “impact” often leaves program implementers with insufficient information to improve program performance or to understand the mechanisms underlying program success or failure. This paper explores the gaps and challenges associated with evaluation in the Global South and proposes recommendations for embracing contemporary evaluation approaches that recognize the complexity and context specificity of the international development sector. It also advocates for intentional efforts by researchers, policymakers, and practitioners to build local capacity for designing and conducting evaluations.
Program evaluation, the process of generating and interpreting information to assess the value and effectiveness of public programs, is a crucial tool for understanding the successes and shortcomings of public health, education, and other social programs. In the Global South’s international development sector, evaluation plays a vital role in discerning what works and why. When appropriately implemented, program and policy evaluation helps policymakers and program planners identify development gaps, plan interventions, and assess the efficacy of programs and policies. Evaluation also serves as a valuable tool for understanding the distributional impact of development initiatives, offering insight into how programs operate and for whom (Satlaj & Trupti, 2019).

Methodological Bias

Impact evaluations employing experimental designs are currently considered the gold standard in the international development sector. However, evaluation scholars and practitioners increasingly recognize the limitations of “impact measurement” itself. Some argue that a program may not be suitable for a randomized controlled trial (RCT) and might benefit more from program improvement techniques such as formative evaluation. Scholars emphasize the need to reconsider “impact measurement” as the sole criterion for judging program success. The discourse has also shifted toward acknowledging the complexity of causality, calling for evaluators to be context-aware and literate in multiple ways of thinking about causality. Despite this, the dominance of methods such as RCTs often crowds out complexity-oriented approaches, even when they may be more suitable.

Human-Centered Design and Developmental Evaluation

Developmental evaluation (DE) is a form of program evaluation that informs and refines innovation, including program development (Patton, 2011). Formative and summative evaluations tend to assume a linear trajectory for programs or for changes in knowledge, behavior, and outcomes. In contrast, developmental evaluation responds to the nonlinear nature of change often seen in complex social systems. DE is currently used in many fields where nonprofits play important roles, from agriculture and human services to international development, the arts, education, and health. Another technique that has gained salience for addressing complexity and innovation is human-centered design (HCD), which shares many parallels with developmental evaluation and attends specifically to user experience throughout the program design process. More generally, it involves a cyclical process of observation, prototyping, and testing (Bason, 2017). Although human-centered design appears focused on initiation (program design) and evaluation on assessment after the fact, the two approaches share a number of commonalities. Both support rapid-cycle learning among program staff and leadership to bolster learning and innovative program development (Patton, 2010; Patton, McKegg & Wehipeihana, 2015).

Theory-Driven Evaluation

In recent years, theory-driven evaluations have gained traction among evaluators who believe that the purpose of evaluation extends beyond determining whether an intervention works. This approach holds that evaluation should also seek to understand how and why an intervention is effective.
Theory-driven evaluations rely on a conceptual framework called program theory: the explicit or implicit assumptions about what actions are needed to address a social, educational, or health problem and why those actions will be effective. This approach strengthens an evaluation’s ability to explain the change a program produces, distinguishing implementation failure from theory failure. Unlike impact evaluations using experimental methods, theory-driven evaluations offer insight into scaling up or replicating programs in different settings by explaining the underlying mechanisms responsible for program success.

Evaluation Capacity Building

Addressing the gaps in