Efficiency analysis has never been a simple push-button technology. Within a performance assessment, various interactions can complicate the analysis. Indeed, changing the modelling technique, or the input or output variables, might result in significantly different efficiency scores. Therefore, a systematic checklist covering the several phases required to assess performance would make efficiency analysis less costly, more reliable, more repeatable, more manageable and faster.
In addition, the increasing performance of computers enables researchers to evaluate and examine larger datasets. In particular, increased computing power has made it possible to evaluate large surveys in education (e.g., the OECD PISA dataset, the Department for Education and Skills in England (DfES) or the Belgian SiBo), business performance (e.g., World Economic Forum, CEO confidence surveys) or consumer confidence, and to analyse large statistical databases (e.g., on company performance). Nowadays, the weakest link lies (again) with the researcher, who has to obtain an overview of the dataset. Indeed, datasets with more than 800 variables (such as the PISA survey) require significant effort from the researcher. Therefore, researchers increasingly collaborate with different stakeholders (e.g., policy makers, practitioners), who may be novice users of DEA. This in turn makes the analysis more difficult. A standardized process could assist the researcher and reduce the possibility of making mistakes. Many studies dealing with large datasets, e.g., in data mining, or analysing complicated processes, as in systems engineering, have developed step-by-step frameworks. See, for example, the data mining life cycles of CRISP-DM (CRoss Industry Standard Process for Data Mining) and SEMMA (Sample, Explore, Modify, Model, Assess), and the SDLC (Systems Development Life Cycle) as a standard process for developing systems (Olson and Delen, 2008; Cerrito, 2007; Blanchard and Fabrycky, 2006). This paper presents an alternative step-by-step framework which should facilitate the collaboration between stakeholders and researchers.
In this article, we focus on non-parametric models to examine the performance of entities. Indeed, the user does not observe the production process (i.e., the transformation of inputs into outputs). Whereas parametric models assume a particular a priori specification of the production process, non-parametric models let the data speak for themselves. In particular, they estimate the relationship between inputs and outputs with minimal assumptions (Charnes et al., 1985). This makes non-parametric models extremely attractive. We particularly focus on the widely applied non-parametric Data Envelopment Analysis (DEA) model (for an overview of the more than 4000 papers published on DEA between 1978 and 2007, see Emrouznejad et al., 2008). Nevertheless, the different phases of the suggested framework are not limited to the traditional DEA model. As other methods follow similar phases, the framework can, with some modification, also be used for Stochastic Frontier Analysis (SFA; Meeusen and van den Broeck, 1977) or other parametric applications. Note that particular models (e.g., order-m, bootstrap, SFA; see below) cannot be interchanged (e.g., there is no double bootstrap in SFA). Nevertheless, a similar framework can be adopted for parametric methods.
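To make the distinction concrete, consider a minimal sketch of a parametric specification in the spirit of the stochastic frontier model of Meeusen and van den Broeck (1977); the notation is ours and serves only as an illustration:
\[
\ln y_j = f(x_j;\beta) + v_j - u_j, \qquad v_j \sim N(0,\sigma_v^2), \quad u_j \ge 0,
\]
where $f(\cdot;\beta)$ is an a priori functional form (e.g., Cobb-Douglas), $v_j$ captures statistical noise and $u_j$ denotes inefficiency. A non-parametric model such as DEA avoids specifying $f(\cdot;\beta)$ altogether and instead envelops the observed input-output combinations.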
The DEA model is based on a linear programming technique which evaluates the efficiency of entities relative to best practice observations (Charnes et al., 1978). To do so, the researcher has to specify input and output variables. Although this might seem a straightforward task, the effort increases significantly as the available data grow. To this end, the present paper introduces a step-by-step framework to evaluate large and unexplored datasets. In this sense, the paper links with the previous work of Avkiran (1999), Belton and Vickers (1993), Brown (2006), Dyson et al. (2001), Hollingsworth (2008) and Pedraja-Chaparro et al. (1999). Although previous papers have clearly indicated the pitfalls of DEA (Dyson et al., 2001), provided guidelines for novice users (Avkiran, 1999), offered visual tools for an insightful implementation (Belton and Vickers, 1993), or discussed the difficulties and opportunities of efficiency measurement (Hollingsworth, 2008), this paper explicitly targets the mixture of experienced and novice researchers. Indeed, experienced researchers (e.g., academics or consultants) frequently collaborate with stakeholders (e.g., civil servants or CEOs) who are less aware of the various methodological advances in the literature. Without a clear framework, the stakeholders may resist the implementation of more advanced techniques (and prefer, e.g., a simple bivariate analysis). Only through a step-by-step analysis, which gradually constructs the ultimate model, can inexperienced stakeholders be persuaded of the value of advanced (non-)parametric methods. As such (and in contrast to the previous literature), the framework is presented as a process model which covers problem definition, data collection, model specification and interpretation of the results. The process model provides a practical tool to guide novice users through the set-up of an efficiency analysis application.
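For readers less familiar with DEA, the following is a minimal sketch of the input-oriented envelopment form of the model of Charnes et al. (1978); the notation is ours and is given for illustration only. For an evaluated entity $o$ among $n$ entities, with inputs $x_{io}$ and outputs $y_{ro}$, one solves
\[
\min_{\theta,\lambda}\ \theta \quad \text{s.t.}\quad \sum_{j=1}^{n}\lambda_j x_{ij}\le\theta x_{io}\ \ \forall i,\qquad \sum_{j=1}^{n}\lambda_j y_{rj}\ge y_{ro}\ \ \forall r,\qquad \lambda_j\ge 0\ \ \forall j,
\]
where $\theta\le 1$ is the efficiency score of entity $o$ and the weights $\lambda_j$ identify the best practice observations against which $o$ is benchmarked; entities with $\theta=1$ lie on the estimated best practice frontier.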
The contributions of the paper arise from three particular features of the proposed process model, which provides both the structure and the flexibility to suit most non-parametric projects comparing a set of entities, especially those with a large number of units.
Firstly, the proposed model for processing non-parametric projects can help us understand and manage interactions in the complex process of efficiency analysis. For the novice analyst, the process model thus provides guidance, helps to structure the project, and gives advice for each phase of the process. This should result in a more reliable model specification (both in terms of the modelling technique and in terms of the selected inputs and outputs). The experienced analyst can benefit from a checklist for each phase to make sure that nothing important has been forgotten. But the most important role of a standard process is to allow the systematic treatment of the comprehensive phases of large non-parametric projects, which facilitates the process (e.g., by making it more repeatable and less expensive).
Secondly, structure arises from the checklist for setting up non-parametric analyses. Indeed, non-parametric models such as DEA (including Free Disposal Hull, FDH; Deprins et al., 1984) are not push-button technologies but, on the contrary, constitute a complex process requiring various tools to identify the appropriate set of inputs and outputs and to select a suitable model. The success of non-parametric projects depends on the proper mix of managerial information and the skills of the analyst.
Thirdly, consider the flexibility. The suggested framework consists of six connected phases with various feedback loops. This is a particularly attractive feature for the inexperienced stakeholder, who will observe that early (methodological) choices can have an effect on later phases.
In sum, the framework helps to link different tools and different people with diverse skills and backgrounds in order to deliver an efficient and effective project.
The paper unfolds as follows. The next section gives an overview of the proposed framework. Sections 3 to 7 each describe a particular phase of the COOPER-framework. Indeed, each of the phases comprises several sub-phases, which in turn cover a broad literature. We present the sub-phases systematically. Finally, we present some concluding remarks.