preliminary version

Working Paper

99-08

Maximum Entropy Specification of PMP in CAPRI

Thomas Heckelei, Wolfgang Britz

University of Bonn

Thomas Heckelei, Ph.D. is a lecturer and research associate at the Institute of Agricultural Policy, University of Bonn. His current research areas are quantitative agricultural sector modelling and econometric methodology.
Phone: +49-228-73 23 23
URL: agp.uni-bonn.de\agpo\staff\heckel_e.htm
E-mail: [email protected]

Dr. Wolfgang Britz is a research associate and lecturer at the Institute of Agricultural Policy, University of Bonn, and specialises in quantitative economic modelling. In the CAPRI group in Bonn, he is responsible for the methodological concept and the EDP realisation.
Phone: +49-228-73 25 02
URL: agp.uni-bonn.de\agpo\staff\britz_e.htm
E-mail: [email protected]

Institute contact Phone: +49-228-72 23 31 Fax: +49-228-98 22 923 Address: Institute of Agricultural Policy, Nussallee 21, D-53115 Bonn, Germany

The series "CAPRI, Working papers" contains preliminary manuscripts which are not (yet) published in professional journals and are prepared in the context of the project "Common Agricultural Policy Impact Analysis", funded by the EU-Commission under the FAIR program. Comments and criticisms are welcome and should be sent to the author(s) directly. All citations need to be cleared with the author(s).

1 Introduction
2 The Maximum Entropy approach to Positive Mathematical Programming
   2.1 Reminder on PMP
   2.2 Introduction to Maximum Entropy (ME) Estimation
   2.3 Combination of Maximum Entropy Estimation and Positive Mathematical Programming
3 A PMP-ME approach based on a cross sectional sample
   3.1 Rationale and mathematical formulation
   3.2 Curvature restriction
4 An application to crop production in France
   4.1 Definition of support points
   4.2 Estimation and simulation results
5 Summary and Conclusions
6 References

1 Introduction

This paper deals with the specification of the non-linear objective functions in the regional programming models of CAPRI based on Positive Mathematical Programming (PMP). The application of PMP allows the supply models to be calibrated to observed base year levels (HOWITT 1995). Compared to calibration techniques based on bounds, e.g. so-called flexibility constraints, the methodology has two clear advantages with respect to the simulation behaviour of the resulting model:

• the response is not restricted by weakly justified constraints
• the response is smooth compared to a linear programming problem.

The conditions necessary for perfect calibration still allow for infinitely many different specifications of the objective function, each one corresponding to a specific behaviour of the calibrated models in simulations. Rather arbitrary approaches have been reviewed in HECKELEI (1997). PARIS and HOWITT (1998) generalise and objectify the choice of a specific objective function in the calibration step by employing a "Maximum Entropy (ME)" procedure. Their approach, which opens new possibilities for CAPRI, is reviewed and discussed in section 2, pointing at some theoretical and practical drawbacks. The subsequent section describes in detail an ME-PMP approach for crop production in the CAPRI model that addresses these problems. It is designed to exploit information contained in a cross sectional sample of NUTS 2 regions to specify regionally specific quadratic objective functions with cross effects for crop activities. Section 4 presents results of an ex-post simulation for France to illustrate the simulation behaviour of the resulting model. The last section summarises the findings, draws conclusions with respect to applicability and interpretation of the approach and identifies possible directions of further research.

2 The Maximum Entropy approach to Positive Mathematical Programming

2.1 Reminder on PMP

First we remind the reader of the two steps involved in PMP to calibrate typical linear programming models to observed activity levels (see HOWITT 1995 and HECKELEI 1997 for a more detailed description). The general idea of PMP is to use the information contained in the dual variables of an LP problem¹ that is bound to observed activity levels by calibration constraints (Step 1), in order to specify a non-linear objective function such that the observed activity levels are reproduced by the optimal solution of the new programming problem without bounds (Step 2).

¹ The method can be applied to NLP problems as well. To ease understanding, a simple but general layout of an LP is discussed here.

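Step 1 of this procedure is formally described in the following way:

\[
\begin{aligned}
\max_{x} \; Z &= p'y - c'x \\
\text{subject to} \quad A \begin{bmatrix} x \\ y \end{bmatrix} &\le b \quad [\pi] \\
x &\le (x^o + \varepsilon) \quad [\lambda] \\
x &\ge 0
\end{aligned}
\tag{1}
\]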

where²

Z = objective function value
p = (n×1) vector of product prices
x = (n×1) vector of production activity levels
y = (l×1) vector of sales and purchase activities
c = (n×1) vector of variable cost per unit of activity
A = (m×(n+l)) matrix of coefficients in resource constraints
b = (m×1) vector of available resource quantities
π = (m×1) vector of dual variables associated with the resource constraints
λ = dual variables associated with the calibration constraints
x^o = (n×1) vector of observed production activity levels
ε = (n×1) vector of small positive numbers

² Matrices and vectors are printed bold.

The addition of the calibration constraints forces the optimal solution of the linear programming model (1) to almost perfectly reproduce the observed base year activity levels x^o, given that the specified resource constraints allow for this solution (which they should if the data are consistent). "Almost perfectly" is defined by the range of the positive perturbations of the calibration constraints, ε, which are introduced to prevent linear dependencies between resource and calibration constraints. The latter would provoke degenerate dual solutions with marginal values arbitrarily distributed across resource and calibration constraints.

In Step 2 of the procedure, the λ are employed to specify a non-linear objective function such that the marginal costs of the preferable activities are equal to their respective revenues at the base year activity levels x^o. Given that the implied variable cost function has the right curvature properties (convex in activity levels), the solution to the resulting programming problem will be a "boundary point, which is the combination of binding constraints and first order conditions" (HOWITT, 1995, p.330) and equal to the primal result of (1). Furthermore, the dual values of the resource constraints, π, will be recovered as well. Any non-linear convex cost function with correctly calibrated first derivatives will reproduce the base year solution.

The model's simulation behaviour, i.e. its response when all or parts of p, c, A and b are changed, depends to a large extent on the matrix of second derivatives of the objective function. Information on the latter is not contained in the dual values of the calibration constraints, λ, obtained from the solution of (1). We will shed some light on this important problem in the following. For reasons of computational simplicity, for lack of strong arguments for other types of functions, and because this functional form has been chosen below for the specification of the regional programming models in CAPRI, we will illustrate the ME estimation suggested by PARIS and HOWITT (1998)³ with the following general version of a quadratic variable cost function:

\[
C^v = d'x + \tfrac{1}{2}\, x'Qx
\tag{2}
\]

with
C^v = variable costs
d = (n×1) vector of parameters associated with the linear term
Q = (n×n) symmetric⁴ positive definite matrix of parameters associated with the quadratic term of C^v.

The parameters of (2) need to be specified such that

\[
\frac{\partial C^v(x^o)}{\partial x} = MC^v = d + Qx^o = c + \lambda .
\tag{3}
\]

This specification problem is "ill-posed", because the number of parameters to be specified (= n + n(n+1)/2) is greater than the number of observations (= n observations on marginal cost). It should be pointed out again that the simulation behaviour of the resulting model will differ drastically within the feasible set of (3) depending on the matrix of second derivatives, Q. However, the number of parameters allows for enough flexibility to potentially include further constraints relating to Q. Traditional econometric approaches could handle this type of problem if an appropriate number of a priori restrictions on the parameters leaves enough degrees of freedom. Early applications of PMP go without any type of estimation by setting the off-diagonal elements of Q to zero and calculating the remaining parameters by some standard approach (see HECKELEI 1997 for a discussion). Although these approaches work perfectly well with respect to the calibration property of PMP, by setting appropriate first order derivatives of the objective function according to (3), the resulting simulation behaviour is completely arbitrary because it is not taken into account in the specification.

PARIS and HOWITT suggest using ME estimation in this context, with the following advantages:

• It allows for a more objective specification of the parameters of the non-linear cost function based on an "econometric type" criterion.
• It allows different functional forms to be employed for the objective function.
• It has the potential of incorporating more than one observation on activity levels into the specification of the parameters, thereby broadening the information base for the specification (see also PARIS 1997 for the application of the least squares criterion in this context).
• It decreases the need to decide on a priori restrictions on the parameters compared to a traditional econometric approach.
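The point that condition (3) pins down the calibration but not the simulation response can be illustrated with a minimal single-activity sketch. All numbers are hypothetical, and the sketch assumes that no resource constraint binds, so that marginal revenue at the observed level equals c + λ. The function name `optimal_level` is introduced here for illustration only.

```python
# Single-activity sketch; every number below is hypothetical.
c = 100.0       # accounting cost per unit of the activity
lam = 50.0      # dual value of the calibration constraint (Step 1)
x_obs = 10.0    # observed activity level


def optimal_level(revenue, d, q):
    """Level maximising revenue*x - (d*x + 0.5*q*x**2) for q > 0."""
    return (revenue - d) / q


# "Standard approach": d equals accounting cost and q = lambda / x_obs.
standard = (c, lam / x_obs)
# Another admissible choice: any (d, q) with d + q*x_obs = c + lam and q > 0.
alternative = (50.0, 10.0)

# Assumption: no resource constraint binds, so marginal revenue is c + lam.
base_revenue = c + lam
for d, q in (standard, alternative):
    assert abs(d + q * x_obs - (c + lam)) < 1e-9                  # condition (3)
    assert abs(optimal_level(base_revenue, d, q) - x_obs) < 1e-9  # calibrates

# Simulation: revenue per unit rises from 150 to 160.
new_revenue = 160.0
response_standard = optimal_level(new_revenue, *standard)
response_alternative = optimal_level(new_revenue, *alternative)
print(response_standard, response_alternative)
```

Both parameter pairs reproduce the observed level of 10, yet in this constructed case they respond differently to the same revenue change (12 versus 11), which is the arbitrariness the text refers to.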
³ In the 1998 article, PARIS and HOWITT show the general applicability of their approach also with respect to other functional forms. Compared to equation (2) they choose, however, a somewhat restricted quadratic functional form by excluding linear parameters.

⁴ The second cross derivatives of the cost function (2),
\[
\frac{\partial^2 C^v}{\partial x_j \partial x_i} = \frac{\partial^2 C^v}{\partial x_i \partial x_j} = 0.5 \cdot (Q_{ij} + Q_{ji}),
\]
are symmetric by Young's theorem. Therefore, we can directly assume symmetry of the Q-matrix (Q_{i,j} = Q_{j,i} ∀ i,j) with no loss of flexibility.

2.2 Introduction to Maximum Entropy (ME) Estimation

GOLAN, JUDGE and MILLER (1996) have introduced Maximum Entropy estimation to a wider community of applied econometricians. "Entropy" is a measure of the amount of "chaos" in a system. The more chaos exists, the higher the entropy of the system. In information theory the equivalent to chaos is the "noise" that exists when transferring a signal or a message. Let us imagine K previously known possibilities of interpretation of a message, each having - a priori - the same probability. For example, we know that somebody shouts a number between 0 and 9 to us, i.e. we have K = 10 different possible interpretations of what we hear. The entropy of the message is smallest if the probability of one of the K interpretations is equal to one and, consequently, all other probabilities are zero. In this case the informational content of the message is maximised, because the outcome can be identified with probability one. Conversely, the entropy is maximised and the informational content minimised if the probabilities for all K outcomes are the same after hearing the message. The message was just "noise" and added no information to our a priori knowledge. Given that k = 1,...,K is the set of discrete possible informational contents of a message and p_k the probability that k is true, the entropy is defined by

\[
H(p) = -\sum_{k=1}^{K} p_k \ln p_k = -p' \ln p ,
\tag{4}
\]

where p_k ln p_k = 0 for p_k = 0. It can easily be verified that H(p) is maximised when all probabilities are equal, i.e. for a uniform distribution over all discrete points k, and minimised when one of the probabilities is equal to 1 and all others are zero. The probabilities p_k allow expectations to be calculated. If we hear only noise in our example with the numbers between 0 and 9, the expectation is 4.5. If we hear something that could have been either 1 or 2, each with probability 0.5, our expectation is 1.5. The entropy is now much smaller, because we know more than before: all other probabilities are 0. This example also illustrates that the expectation does not have to be equal to one of the "support points" (as the possible outcomes are usually called), but usually lies somewhere in between.

The problem of identifying the "message" in our data is a common one in empirical economics. For example, the elements of a parameter vector β of length M from a linear model

\[
y = X\beta
\tag{5}
\]

cannot be uniquely determined if the number of observations on y and X is less than the number of elements in β (t = 1,...,T with T < M). Given K support points z_{k,m} with associated probabilities p_{k,m} for each parameter β_m, the expectations of the parameters are

\[
E[\beta_m] = \sum_{k=1}^{K} z_{k,m}\, p_{k,m} \quad \forall m
\tag{6}
\]

which can be interpreted as the "estimates" of the parameters. We now search for the set of probabilities of the support points that adds the least amount of information while remaining consistent with the data, i.e. we maximise the entropy in the following way:

\[
\begin{aligned}
\max_{p} \; H(p) &= -\sum_{k=1}^{K}\sum_{m=1}^{M} p_{k,m} \ln p_{k,m} \\
\text{s.t.} \quad \sum_{m=1}^{M} x_{m,t}\, E[\beta_m] &= y_t, \quad \forall t \\
E[\beta_m] &= \sum_{k=1}^{K} z_{k,m}\, p_{k,m}, \quad \forall m \\
\sum_{k=1}^{K} p_{k,m} &= 1, \quad \forall m
\end{aligned}
\tag{7}
\]

The first set of constraints guarantees that the resulting expectations of the parameters exactly satisfy the T observations in the form of our linear model. The second set of constraints defines the expectations of the parameters, and the third guarantees that the probabilities over all support points of a parameter β_m sum to 1. In contrast to classical econometric problems, no error term is necessary here because the problem is underdetermined. There is an infinite number of parameter vectors which fit the data exactly, with a sum of squared errors equal to zero. Therefore, minimisation of the sum of squared errors is not feasible as an estimation criterion; instead, the maximum entropy criterion is used (see GOLAN, JUDGE and MILLER 1996, p. 85ff. for a treatment of "generalized" maximum entropy formulations with included error terms).

The crucial problem for empirical applications of the entropy approach is the choice of the support points. In the example above, the support points from 0 to 9 were previously known. But if we intend to estimate, for example, a demand system with a large number of parameters, it is rather difficult to introduce appropriate support points for the parameters of the system, especially if prior information only exists on the sign and magnitude of elasticities, which are complex functions of parameters and variables. For the case of general a priori ignorance about the parameters, it is suggested to distribute the support points over a large interval, i.e. from a large negative to a large positive number, since the larger the interval of the supports, the less the entropy criterion penalises deviations from the a priori expectation. However, for most economic models a general idea about the order of magnitude of certain parameters does exist, and the support points can be chosen accordingly.
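The number example and a very small instance of problem (7) can be sketched as follows. The sketch assumes a single observation (T = 1) on two parameters (M = 2). In that case the first order conditions of (7) give probabilities proportional to exp(-μ x_m z_{k,m}), and the one multiplier μ can be found by bisection. The names `solve_me` and `tilted`, the support points and the data values are all hypothetical choices made for illustration.

```python
import math


def entropy(probs):
    """H(p) = -sum p ln p, with 0 ln 0 treated as 0 (equation (4))."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def expectation(supports, probs):
    return sum(z * p for z, p in zip(supports, probs))


# The number example: pure noise versus "either 1 or 2".
digits = list(range(10))
noise = [0.1] * 10
one_or_two = [0.5 if k in (1, 2) else 0.0 for k in digits]


def tilted(supports, x, mu):
    """Probabilities proportional to exp(-mu * x * z_k), computed stably."""
    exponents = [-mu * x * z for z in supports]
    top = max(exponents)
    weights = [math.exp(e - top) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def solve_me(x, y, supports, lo=-50.0, hi=50.0, iters=200):
    """ME solution of sum_m x_m E[beta_m] = y for one observation (T = 1)."""
    def fitted(mu):
        return sum(xm * expectation(z, tilted(z, xm, mu))
                   for xm, z in zip(x, supports))

    for _ in range(iters):            # fitted(mu) is decreasing in mu
        mid = 0.5 * (lo + hi)
        if fitted(mid) > y:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    probs = [tilted(z, xm, mu) for xm, z in zip(x, supports)]
    betas = [expectation(z, p) for z, p in zip(supports, probs)]
    return betas, probs


# One observation, two parameters: 1*beta_1 + 2*beta_2 = 3 (hypothetical data).
x = [1.0, 2.0]
y = 3.0
supports = [[-10.0, 0.0, 10.0], [-10.0, 0.0, 10.0]]
betas, probs = solve_me(x, y, supports)
print(betas, [entropy(p) for p in probs])
```

The data constraint is met exactly, while the probabilities stay as close to uniform as the observation allows. Widening the supports in this sketch makes a given deviation from the centre cheaper in entropy terms, in line with the remark on ignorance priors above.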
2.3 Combination of Maximum Entropy Estimation and Positive Mathematical Programming

Let us now look back at the problem of "estimating" the parameter vector d and the matrix Q of the variable cost function (2) in the second step of the PMP approach. First, we need to define support points for the parameters. As a starting point one could centre the linear parameters d around the observed accounting cost per unit of the activity, c. For example, we could choose 4 support points for each parameter by setting⁵

\[
zd_i = \begin{bmatrix} -3 \cdot c_i \\ -1 \cdot c_i \\ +1 \cdot c_i \\ +3 \cdot c_i \end{bmatrix} \quad \forall i = 1,\dots,n
\tag{8}
\]

⁵ The variance of the maximum entropy estimates is negatively correlated with the number of support points defined and has a limit value for an infinite number of support points (see GOLAN, JUDGE and MILLER 1996, p.139). There is no general rule for the "right" number of support points, but tests with our models have shown that choosing more than 4 support points does not change the numerical results of the calculated parameter expectations to an extent of any practical relevance.

In the case of the Q-matrix we have to distinguish the diagonal elements (= change in marginal cost of activity i with respect to the level of activity i) from the off-diagonal elements (= change in marginal cost of activity i with respect to the level of activity j). Given that the a priori expectations for the linear parameter vector d are the accounting costs (supports centred around c_i in equation (8)), it is consistent with condition (3) to centre the support points for q_ii around λ_i/x_i^o and those for the off-diagonal elements q_ij around zero. The centres of the support points for the diagonal elements, λ_i/x_i^o, are positive, which is a necessary condition for convexity of C^v. A suitable specification for the support points of Q would then be

\[
zq_{i,i} = \begin{bmatrix} 2 \cdot \lambda_i / x_i^o \\ \tfrac{4}{3} \cdot \lambda_i / x_i^o \\ \tfrac{2}{3} \cdot \lambda_i / x_i^o \\ 0 \cdot \lambda_i / x_i^o \end{bmatrix} \;\; \forall i = 1,\dots,n
\quad \text{and} \quad
zq_{i,j} = \begin{bmatrix} -3 \cdot \lambda_i / x_j^o \\ -1 \cdot \lambda_i / x_j^o \\ +1 \cdot \lambda_i / x_j^o \\ +3 \cdot \lambda_i / x_j^o \end{bmatrix} \;\; \forall i \ne j \wedge i,j = 1,\dots,n
\tag{9}
\]

Denoting the probabilities for the K support points zd_i, i = 1,...,n, and zq_{i,j}, i,j = 1,...,n, as pd_{k,i} and pq_{k,i,j}, respectively, the expectations of the parameters are calculated as

\[
\begin{aligned}
E[d_i] &= \sum_{k=1}^{K} pd_{k,i}\, zd_{k,i}, \quad \forall i = 1,\dots,n \\
E[q_{i,j}] &= \sum_{k=1}^{K} pq_{k,i,j}\, zq_{k,i,j}, \quad \forall i,j = 1,\dots,n
\end{aligned}
\tag{10}
\]

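Based on the symmetry of Q, we can formulate the following ME problem:

\[
\max_{p} \; H(p) = -\sum_{k=1}^{K}\sum_{i=1}^{n} pd_{k,i} \ln pd_{k,i} - \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=1}^{n} pq_{k,i,j} \ln pq_{k,i,j}
\]
subject to
\[
\begin{aligned}
\sum_{k=1}^{K} pd_{k,i} &= 1, \quad \forall i \\
\sum_{k=1}^{K} pq_{k,i,j} &= 1, \quad \forall i,j \\
\sum_{k=1}^{K} pd_{k,i}\, zd_{k,i} &= E[d_i], \quad \forall i \\
\sum_{k=1}^{K} pq_{k,i,j}\, zq_{k,i,j} &= E[q_{i,j}], \quad \forall i,j \\
E[d_i] + \sum_{j=1}^{n} E[q_{i,j}]\, x_j^o &= c_i + \lambda_i, \quad \forall i \\
E[q_{i,j}] &= E[q_{j,i}], \quad \forall i < j
\end{aligned}
\tag{11}
\]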

Note that the formulation of the support points in equations (8) and (9) implies that the expectations of the parameters associated with a uniform distribution over the probabilities fulfil the "data constraint" of the estimation problem represented by equation (3). At this point we need to pause and address the question of under what conditions an ME formulation for estimating the parameters of the quadratic cost function is useful. If we have only one (1×n) vector of marginal costs available (from calibrating one linear programming problem to one base year solution) and define the support points according to (8) and (9), the ME criterion will have no reason to deviate from the uniform distribution over the probabilities, since the centres of the support points already satisfy the data constraints. Specifically for our case, the resulting parameter estimates will be exactly the ones implied by the "standard approach" as defined in HECKELEI (1997), i.e. the linear parameters of the cost function are equal to the respective activity's accounting costs, the off-diagonal elements of the Q-matrix are zero, and the diagonal elements are equal to λ_i/x_i^o. The ME application yields different results (including nonzero cross effects between activities) only if at least one of the following conditions is met:

1. The support points are centred around values that do not satisfy the data constraints.
2. Additional "hard" information is included in the form of other data constraints.

Regarding point 1, centring the support points around other values is easily done. As the support points imply that the expectations based on the uniform distribution are a priori the most probable values in the ME approach, these settings should be carefully guided by the amount of information available on the values of the parameters. Otherwise, arbitrary results will be obtained which will ultimately define the simulation behaviour of the model. This occurs, for example, in the approach of PARIS and HOWITT (1998), who reparameterise the Q-matrix based on an LDL' (Cholesky) decomposition to ensure appropriate curvature properties of the estimated cost function. They set support points for the L and D matrices as if L and D directly represented the off-diagonal and diagonal elements of the Q matrix, respectively, implying - in our view - arbitrary a priori expectations for the parameters of Q. As a consequence they obtain nonzero cross cost effects of activities from their ME solution merely because of this purely technically motivated choice of support points. Since they have just one vector of marginal costs available from calibrating one model, the information provided by the data is rather limited and the choice of support points is extremely influential.

With respect to point 2, introducing additional "hard information", further ideas have already been implemented to specify the diagonal elements of Q: parameters of the linear and quadratic terms have been chosen to reflect exogenously given yield distributions and to force average cost to be equal to accounting cost (HOWITT 1995). Others calibrated the parameters to own-price elasticities for the base year levels. All these approaches allowed the direct calculation of the parameters based on the marginal cost equations and additional conditions without employing any "estimation" criterion. The ME formulation was introduced by PARIS and HOWITT to accommodate more general functional forms with more parameters, typically including cross cost effects between activities. Although they hint in their conclusions at the possibility of including more observations in the analysis, their own examples stick to one observation. Since no other valid information is introduced by setting the support points (see above), there is no reason to believe that the recovered cost function is any closer to reality than the results of ad hoc approaches without ME estimation. In particular, since no information on second derivatives is taken into account, the resulting simulation behaviour of the model is at least as unpredictable as in the case of the standard approaches.
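The observation that a single vector of marginal costs gives the ME criterion no reason to leave the uniform distribution can be checked numerically. The sketch below uses hypothetical data for three activities. The supports for Q follow the multipliers of equation (9); the supports for d are taken as a symmetric spread around c_i, which is an assumption made for this illustration.

```python
def mean(values):
    """Expectation under a uniform distribution over the support points."""
    return sum(values) / len(values)


# Hypothetical data for n = 3 activities.
c = [100.0, 80.0, 60.0]      # accounting costs per unit
lam = [50.0, 20.0, 10.0]     # duals of the calibration constraints
x_obs = [10.0, 5.0, 4.0]     # observed activity levels
n = len(c)

# Supports for d centred on c_i (illustrative assumption).
zd = [[m * c[i] for m in (0.0, 2.0 / 3.0, 4.0 / 3.0, 2.0)] for i in range(n)]


def q_supports(i, j):
    """Supports for q_ij with the multipliers of equation (9)."""
    if i == j:
        return [m * lam[i] / x_obs[i] for m in (2.0, 4.0 / 3.0, 2.0 / 3.0, 0.0)]
    return [m * lam[i] / x_obs[j] for m in (-3.0, -1.0, 1.0, 3.0)]


# Expectations implied by uniform probabilities (maximum entropy without data).
exp_d = [mean(zd[i]) for i in range(n)]
exp_q = [[mean(q_supports(i, j)) for j in range(n)] for i in range(n)]

# Residuals of the data constraint E[d_i] + sum_j E[q_ij] x_j = c_i + lambda_i.
residuals = [
    exp_d[i] + sum(exp_q[i][j] * x_obs[j] for j in range(n)) - (c[i] + lam[i])
    for i in range(n)
]
print(residuals)
```

Because the residuals vanish up to rounding, the unconstrained entropy maximum is already feasible, and the result coincides with the standard approach: d equal to accounting cost, zero off-diagonal elements and λ_i/x_i^o on the diagonal.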

3 A PMP-ME approach based on a cross sectional sample

As a consequence of the considerations at the end of the last section, we searched for a possibility to include more than one observation on marginal cost. The additional information will hopefully allow us to specify cost functions for the regional programming models that imply a valid substitution behaviour between activities for simulation purposes. The CAPRI data base allows the use of time series as well as cross sectional samples for this purpose. Since technical manageability seemed (and proved) to be a restrictive factor, and the inclusion of the time domain would considerably increase the complexity of the analysis, we first restricted our approach to the use of a cross sectional sample. However, we later use time series observations to validate the resulting model specification.

3.1 Rationale and mathematical formulation

Our objective here is to estimate a quadratic cost function with cross cost effects (full Q-matrix) between crop production activities. Suppose one can generate R (1×n) vectors of marginal costs from a set of R regional programming models by applying the first step of PMP. In order to exploit this information for the specification of quadratic cost functions for all regions, we need to define appropriate restrictions on the parameters across regions, since otherwise no informational gain is achieved. Consider the following suggestion for a "scaled" regional vector of marginal cost applied to crop production activities:

\[
\begin{aligned}
MC_r^v &= d_r + Q_r x_r^o \\
Q_r &= ar \cdot cpi_r^{\,c} \cdot \frac{1}{L_r}\, S_r B S_r' \\
s_{r,i,i} &= \frac{\sum_r x_{r,i}^o \big/ \sum_r L_r}{x_{r,i}^o / L_r}
\end{aligned}
\tag{12}
\]

with
r = regional index
ar = average revenue per ha over all regions, i.e. ar = Σ_r p'y_r / Σ_r L_r
cpi = "crop profitability index" defined as regional average revenue per ha relative to average revenue per ha over all regions, i.e. cpi_r = (p'y_r / L_r) / ar
c = exponent of the crop profitability index, to be estimated
S_r = diagonal scaling matrix for region r
B = (n×n) parameter matrix to be estimated
L_r = total arable land in region r

Equation (12) shows that we introduce a parameter restriction on the regional matrices of slopes Q_r by assuming that differences between regions depend on

• the factor (ar cpi_r^c), which reflects differences in regional profitability. The exact magnitude of the effect is defined endogenously by estimating the exponent of the crop profitability index cpi. The latter captures the economic effect of differences in soil, climatic conditions etc., as no other information seemed suitable and manageable for the time being.

• the relation between the crop share at the sectoral level and at the regional level (pre- and post-multiplying B with S_r). The effect of the scaling can be understood if we compare two regions with identical total area but different shares of a crop. If the Q_r matrices are identical for the two regions, the same absolute change in levels causes the same change in marginal cost. Let us assume the increase equals 10 ha. If region one had a crop level of 1 ha, the relative increase in area would be 1000%. For region two, with an assumed level of 100 ha, it would be just 10%. The scaling of the B matrix by one over the crop's share ensures the same marginal cost increases in both regions for the same percentage increase in crop acreage. In order to allow the supports for B to be set without reflecting differences in average crop shares across regions, we have normalised the scaling matrix by introducing the average sectoral share.

• the reciprocal of total land available in the region. The reasoning follows the argument above: the relative increase in marginal cost provoked by a rotational change starting from the same rotation should not depend on the total area in the region.

The specification implies that - apart from the effect of the crop profitability index - the Q_r's are identical for regions with the same crop rotation. The reader may notice that L_r can be substituted out of equation (12). We think, however, that the motivation for the specification given above can be more easily understood based on the given notation.

We motivated the use of more than one observation by the fact that the second derivatives of the cost function strongly influence the simulation behaviour of the model. Where does this information hide in equation (12)? Observed rotations and marginal costs recovered by the calibration step differ between regions. The matrix B - common across regions - is estimated so as to describe the differences in marginal costs depending on the differences in crop shares. The parameters are now estimated such that changing region i's rotation to the rotation in region j causes changes in marginal cost matching the observed differences between the two regions (again apart from the effect of the crop profitability index). These differences in first derivatives comprise information about the second derivatives which are relevant for simulation runs. This is the important contribution of the cross-sectional analysis: the simulation behaviour resulting from the ME problems no longer depends in an arbitrary way on the support points, but is based on a clear hypothesis about the relation between crop rotation and marginal costs. The linear terms d_r in equation (12) allow for enough degrees of freedom to fit the data. Compared to previous applications of PMP in aggregate models, this approach - for the first time - allows the simulation behaviour implied by the estimated cost function to be based on data rather than on arbitrary parameter restrictions.

The general formulation of the appropriate ME problem is straightforward but certainly not easy to read and digest:

\[
\max_{p} \; H(p) = -\sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{r=1}^{R} pd_{k,i,r} \ln pd_{k,i,r} - \sum_{k=1}^{K}\sum_{i=1}^{n}\sum_{j=i}^{n} pb_{k,i,j} \ln pb_{k,i,j} - \sum_{k=1}^{K} pc_k \ln pc_k
\]
subject to
\[
\begin{aligned}
\sum_{k=1}^{K} pd_{k,i,r} &= 1, \quad \forall i, r \\
\sum_{k=1}^{K} pb_{k,i,j} &= 1, \quad \forall i \text{ and } j \ge i \\
\sum_{k=1}^{K} pc_k &= 1 \\
\sum_{k=1}^{K} pd_{k,i,r}\, zd_{k,i,r} &= E[d_{i,r}], \quad \forall i, r \\
\sum_{k=1}^{K} pb_{k,i,j}\, zb_{k,i,j} &= E[b_{i,j}], \quad \forall i \text{ and } j \ge i \\
\sum_{k=1}^{K} pc_k\, zc_k &= E[c] \\
E[d_{i,r}] + ar \cdot cpi_r^{E[c]} \cdot \sum_{j=1}^{n} s_{r,i,i}\, s_{r,j,j}\, E[b_{i,j}]\, \frac{x_{r,j}^o}{L_r} &= c_{i,r} + \lambda_{i,r}, \quad \forall i, r \\
E[b_{i,j}] &= E[b_{j,i}], \quad \forall i < j
\end{aligned}
\tag{13}
\]
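To make the scaling in equation (12) more tangible, the following sketch builds Q_r for two hypothetical regions with the same crop rotation but different total land. It reads the scaling factor as s_{r,i,i} = (average sectoral share) / (regional share), as described in the text. The function names and all numerical values are assumptions for illustration only.

```python
def regional_q(B, x_obs, land, avg_share, ar, cpi, c_exp):
    """Q_r = ar * cpi**c * (1 / L_r) * S_r B S_r' following equation (12)."""
    n = len(B)
    # Diagonal of S_r: average sectoral crop share over regional crop share.
    s = [avg_share[i] / (x_obs[i] / land) for i in range(n)]
    factor = ar * cpi ** c_exp / land
    return [[factor * s[i] * B[i][j] * s[j] for j in range(n)] for i in range(n)]


def mat_vec(Q, x):
    return [sum(q * v for q, v in zip(row, x)) for row in Q]


# Hypothetical common parameter matrix and sector-level quantities.
B = [[2.0, -0.5], [-0.5, 1.0]]
avg_share = [0.6, 0.4]        # average sectoral crop shares
ar, cpi, c_exp = 1000.0, 1.1, 0.5

# Two regions with the same rotation (70% / 30%) but different total land.
land_a, x_a = 1000.0, [700.0, 300.0]
land_b, x_b = 5000.0, [3500.0, 1500.0]

q_a = regional_q(B, x_a, land_a, avg_share, ar, cpi, c_exp)
q_b = regional_q(B, x_b, land_b, avg_share, ar, cpi, c_exp)

# Quadratic part of marginal cost at the observed levels, Q_r x_r.
mc_a = mat_vec(q_a, x_a)
mc_b = mat_vec(q_b, x_b)
print(mc_a, mc_b)
```

In this sketch the term Q_r x_r is the same in both regions, so the quadratic part of marginal cost depends on the rotation and the profitability factor but not on total land, while the elements of Q_r themselves shrink in proportion to 1/L_r.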

3.2 Curvature restriction

The current formulation in (13) does not guarantee that a positive definite matrix B - and consequently positive definite matrices Q_r - will be recovered. A violated curvature property might result in a specification of the objective function that does not calibrate to the base year, since only first order but not second order conditions for a maximum are satisfied at the observed activity levels. A theoretical solution to this problem is to include sufficient conditions for positive definiteness in the ME formulation by restricting all principal minors of B to be positive. For models of more than didactic size, however, the necessary number of constraints and their highly non-linear character render this solution computationally infeasible. As mentioned above, PARIS and HOWITT (1998) suggest using a reparameterisation of the model based on the Cholesky decomposition

\[
B = LDL'
\tag{14}
\]

in the ME step. L is an (n×n) unit lower triangular matrix and D an (n×n) diagonal matrix with all positive elements. Instead of defining support points for B directly, they now need to be defined for the relevant elements of L and D. As long as it is guaranteed that D has only positive elements on its diagonal (by defining only non-negative support points for the elements of D), LDL' will always produce a positive definite matrix B. However, the procedure has two disadvantages:

1. The results for B depend on the order of rows in the matrix - a quite disturbing effect.
2. The relation between the supports for L and D and those for B is rather complex and, as far as we know, is not reflected in applications of LDL' in the context of PMP.

In order to circumvent these problems, a "classic" Cholesky decomposition of the form B = LL' is indirectly used as a set of constraints of the problem, where L is a lower triangular matrix whose elements can be calculated from B as follows:

\[
\begin{aligned}
l_{i,i} &= \sqrt{\, b_{i,i} - \sum_{k=1}^{i-1} l_{i,k}^2 \,} \\
l_{j,i} &= \frac{b_{j,i} - \sum_{k=1}^{i-1} l_{j,k}\, l_{i,k}}{l_{i,i}}
\end{aligned}
\qquad \forall i = 1,\dots,n;\; j = i+1,\dots,n
\tag{15}
\]

Because B is supposed to be a symmetric and positive definite matrix, the expression under the square root is always positive and all l_{i,j} are real (GOLUB & LOAN 1989). Appropriate lower bounds on l_{i,i} deviating from zero ensure that the second equation in (15) is defined and that the resulting cost function is "curved enough" to avoid numerical instabilities during simulation runs. For the 15×15 size of the matrix B in the application presented below, the solver still handles this implementation of the Cholesky decomposition without any problem⁶.
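The recursion in (15) can be written out directly. The sketch below is a plain implementation outside any optimisation model. The function name `cholesky_lower` and the lower bound `min_diag` are hypothetical, and the check on `min_diag` only mimics the role of the lower bounds on l_{i,i} mentioned above.

```python
import math


def cholesky_lower(B, min_diag=1e-8):
    """Return lower triangular L with B = L L', following equation (15).

    Raises ValueError when a diagonal element would fall below min_diag,
    which signals that B is not (sufficiently) positive definite.
    """
    n = len(B)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        under_root = B[i][i] - sum(L[i][k] ** 2 for k in range(i))
        if under_root < min_diag ** 2:
            raise ValueError("matrix is not positive definite enough")
        L[i][i] = math.sqrt(under_root)
        for j in range(i + 1, n):
            cross = sum(L[j][k] * L[i][k] for k in range(i))
            L[j][i] = (B[j][i] - cross) / L[i][i]
    return L


def times_transpose(L):
    """Compute L L' to verify the factorisation."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric positive definite matrix.
B = [[4.0, 2.0, -2.0],
     [2.0, 10.0, 2.0],
     [-2.0, 2.0, 6.0]]
L = cholesky_lower(B)
print(L)
print(times_transpose(L))
```

For a symmetric matrix that is not positive definite, such as [[1, 2], [2, 1]], the expression under the square root turns negative in the second step and the function raises an error. Inside an ME problem the same equations would act as constraints that rule such matrices out.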

Footnote 6: In earlier tests, a pragmatic solution was chosen for the curvature problem by incorporating the first and second order minors in the optimisation step, i.e. extending the conditions in (13) by $E[b_{i,i}] \ge 0\ \forall i$ and $E[b_{i,i}]\,E[b_{j,j}] \ge E[b_{i,j}]\,E[b_{j,i}]\ \forall i,j$. Furthermore, it was deemed suitable to restrict all off-diagonal elements to be smaller than the diagonal elements: $E[b_{i,i}]\,E[b_{i,i}] \ge E[b_{j,i}]\,E[b_{j,i}]\ \forall i,j$. The resulting matrix is checked for definiteness after the ME step by using a so-called modified Cholesky decomposition, which ensures definiteness by applying optimal correction factors to the diagonal elements only (GILL, MURRAY & WRIGHT 1989, pp. 108 ff.). This procedure has proven to be operational for very large matrices.

4 An application to crop production in France

In this section we describe an application of the suggested approach to CAPRI's regional programming models for France for the base year 1990. This allowed an ex-post simulation experiment across the CAP reform of 1992. Before turning to the results of this experiment, the specification of the support points for the parameters needs to be presented.

4.1 Definition of support points

The support points for the exponent c of the crop profitability index cpi in (13) are defined as

$$
z_c = \{-2,\ -\tfrac{2}{3},\ +\tfrac{2}{3},\ +2\} \tag{16}
$$

so that the index covers the range from $1/cpi_r^2$ to $cpi_r^2$. We expected the parameter c to be rather close to 0, which would imply only a small relevance of the general cropping conditions for changes in marginal costs.

The linear terms d reflect marginal costs when all production activity levels x are zero. Since an interpretation in economic terms is hardly possible and irrelevant, especially as "fallow land" is one of the production activities, the spread of the support points $z_d$ is consequently defined as an ignorance prior

$$
z_d = \{-90,\ -30,\ +30,\ +90\}\ \mathit{ar} \tag{17}
$$

where $\mathit{ar}$ reflects the national average in revenue per ha. The support points for B are defined as follows:

$$
z_{b_{i,j}} =
\begin{cases}
\{+10,\ +6.66,\ +3.3,\ +0.001\} & \forall\, i = j \\
\{-2,\ -\tfrac{2}{3},\ +\tfrac{2}{3},\ +2\} & \forall\, i \ne j
\end{cases}
\ \left(MC_i + MC_j\right)\,
\frac{\sum_r L_r}{\sum_r \left(x^{o}_{i,r} + x^{o}_{j,r}\right)}\,
\frac{1}{\mathit{ar}} \tag{18}
$$
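To make the construction of the supports for B concrete, the following Python sketch evaluates (18) for given data, under our reading of that equation: a scaling vector multiplied by $(MC_i + MC_j)$, by total land over the summed national areas of crops i and j, and by $1/\mathit{ar}$. All function and argument names are hypothetical, and the numbers in the example are invented for illustration only.

```python
DIAG_SCALE = [10.0, 6.66, 3.3, 0.001]                 # scaling vector for i == j
OFFDIAG_SCALE = [-2.0, -2.0 / 3.0, 2.0 / 3.0, 2.0]    # scaling vector for i != j


def support_points_b(mc, x_obs, land, ar):
    """Support points for each element b_ij, following (18).

    mc    : marginal cost per crop (list of length n)
    x_obs : observed activity levels, one list of n crop areas per region
    land  : total land per region
    ar    : national average revenue per ha
    """
    n = len(mc)
    total_land = sum(land)
    total_x = [sum(region[i] for region in x_obs) for i in range(n)]
    supports = {}
    for i in range(n):
        for j in range(n):
            scale = DIAG_SCALE if i == j else OFFDIAG_SCALE
            # marginal cost related to the average national crop share, divided by ar
            factor = (mc[i] + mc[j]) * total_land / (total_x[i] + total_x[j]) / ar
            supports[(i, j)] = [s * factor for s in scale]
    return supports


def expected_value(probabilities, support):
    """ME point estimate: probability-weighted sum of the support points."""
    return sum(p * z for p, z in zip(probabilities, support))


# Invented example: two crops, two regions
z_b = support_points_b(mc=[100.0, 200.0],
                       x_obs=[[30.0, 10.0], [20.0, 40.0]],
                       land=[50.0, 50.0],
                       ar=400.0)
```

Because every diagonal support point is positive, any probability-weighted combination of them is positive as well, whereas off-diagonal estimates can take either sign.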

The division by average profitability just offsets the multiplication in (12) (see footnote 7). In connection with equation (12) we already motivated a scaling with the total land available in the region. Since the B matrix is uniform across regions, one must again abstract from regional size when defining the support points and relate marginal costs MC instead to average national crop shares $\sum_r (x^{o}_{i,r} + x^{o}_{j,r}) / \sum_r L_r$. Leaving the scaling vectors in brackets aside for the moment, we are left for the diagonal elements $b_{i,i}$ with average marginal cost divided by average crop share. However, the scaling vector for diagonal elements in (18) is not centred around 1, since we know that they have to be positive but do not have any clear idea of their maximum size. Off-diagonal elements are centred around zero to allow for cross cost effects in both directions, but their absolute size is assumed to be smaller than that of the diagonal elements.

We experimented with different scaling vectors for the support points, and they certainly affect the outcome of the estimation, especially if the supports become significantly smaller. However, the ultimate specification presented above leaves considerable ranges for the parameters. Parameter estimates were rather stable for variations of the support point scaling around this size. Therefore, we regard the estimation of the matrix B as mainly "data-dominated". The influence of the support points on the outcomes is in any case considerably smaller than in applications with just one observation.

4.2 Estimation and simulation results

The approach discussed above estimates a non-linear cost function depending on crop production activity levels, based on observed regional differences in marginal costs at just one point in time. Naturally, doubts may be raised as to whether that cross-sectional information can simply be mapped into the time domain by assuming that changes in crop rotation over time in each single region have a similar effect on variable costs as the differences in observed crop rotations for a set of regions at one point in time. We consequently check the resulting simulation behaviour of the models in an ex-post simulation exercise.

We started the evaluation of the results with simulation experiments based on 10 % increases of prices and calculated the point elasticity of the aggregated national change in area with respect to the price change. Based on the "conservative" supports defined for B, the resulting elasticity matrices looked normal to us, as diagonal elasticities were in the range between 0.5 and 2 and the cross elements were smaller, with signs in both directions. But our knowledge, and we assume the knowledge of most other modellers, does not allow us to judge whether the elasticities are close to the "true" ones, especially in this case where production activities are rather differentiated. We would accept as plausible elasticity matrices whose simulation results could easily differ by a factor of four for certain crop shares.
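The elasticity experiment can be sketched for a single, strongly simplified model. The sketch assumes one region, a single land constraint, a quadratic cost function with linear terms d and matrix B, and no binding non-negativity constraints, so that the optimum follows from a linear system of first order conditions. This is not the CAPRI model itself; all names and numbers are hypothetical.

```python
def solve_linear(A, b):
    """Solve A y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def optimal_areas(rev, d, B, land):
    """Maximise rev'x - d'x - 0.5 x'Bx subject to sum(x) = land.

    First order conditions: rev - d - Bx - mu = 0 and sum(x) = land,
    where mu is the shadow price of land.
    """
    n = len(rev)
    A = [B[i][:] + [1.0] for i in range(n)] + [[1.0] * n + [0.0]]
    rhs = [rev[i] - d[i] for i in range(n)] + [land]
    return solve_linear(A, rhs)[:n]


def area_elasticities(rev, d, B, land, shock=0.10):
    """E[i][j]: percentage change in area i per percentage change in revenue j."""
    n = len(rev)
    base = optimal_areas(rev, d, B, land)
    E = [[0.0] * n for _ in range(n)]
    for j in range(n):
        shocked = rev[:]
        shocked[j] *= 1.0 + shock
        x = optimal_areas(shocked, d, B, land)
        for i in range(n):
            E[i][j] = (x[i] - base[i]) / base[i] / shock
    return E


# Invented two-crop example
E_example = area_elasticities(rev=[10.0, 10.0], d=[0.0, 0.0],
                              B=[[2.0, 0.0], [0.0, 2.0]], land=10.0)
```

In this toy case the own-price elasticity is positive and the cross-price elasticity negative, because the fixed land constraint forces one crop to give up the area the other gains.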

Footnote 7: The formulation related to ar normalises B during the program run, which avoids numerical problems with the Cholesky decomposition.

Instead, we opted for an ex-post validation which, given limited time, covers just one EU Member State. We used three-year averages for the calibration year and the simulation year, based on the yearly data in the CAPRI data base. Although the natural choice for a German team would be data for Germany, we chose the 22 NUTS-2 regions in France, because German data were deemed unsuitable due to the noise introduced by unification. Given data availability, we used averages from 1989 to 1991 ("1990") for the calibration and from 1993 to 1995 ("1994") for the simulation. The move from 1991 to 1994 has the advantage that the 1992 CAP reform lies just in between, which offers a good opportunity to test the model under a significant policy change.

However, some restrictions apply. We had no data on the participation in voluntary set-aside programs before the CAP reform; important information was therefore left out of the calibration step. Naturally, no data on obligatory set-aside and non-food production, both introduced by the 1992 CAP reform, entered the calibration for 1990. We therefore had to make some assumptions regarding these activities:

• The parameters in d and B relating to voluntary and obligatory set-aside were set equal to the ones obtained for fallow land in 1990, assuming that they have the same rotational effects as represented by the cost function. Nevertheless, voluntary and obligatory set-aside are still treated in the simulation according to the policy formulation in the CAP reform, i.e. they are linked to the production of grandes cultures in the appropriate way (see below).

• The driving forces of non-food production on set-aside were unknown to us with respect to hard quantitative information. We know that prices paid for non-food rape, for example, differed drastically from the average rape prices during these years, and the availability of production contracts plays a further important role. We saw little hope of reliably modelling non-food production endogenously without data on it in the calibration step. Therefore, we fixed non-food production to known levels in 1994. As non-food production accounts for around 10 % of total oilseeds, the resulting improvement in the model's fit is not dramatic. Furthermore, we applied the same assumption to the approaches against which the ME-PMP calibrated model is compared.

The set-aside regulation is modelled by constraints: the obligation must be fulfilled by an appropriate level of obligatory set-aside or non-food production on set-aside. Voluntary set-aside may be added as long as the sum of total set-aside including non-food production does not exceed 33 % of the endogenously determined grandes cultures area. Premiums are cut if regional base areas are exceeded. As the presented ME-PMP approach is only suitable for annual crops, we fixed animal production and perennials to observed levels in 1994.

In order to judge whether the new methodology has comparative advantages, we included a "standard PMP" approach in the ex-post validation as well. Here, only diagonal elements of B are specified such that the linear and quadratic terms for each production activity i implicitly define an average variable cost ($AC_i^v$) which matches the observed accounting cost $c_i$ for the base year. More precisely, these crop specific parameters are calculated according to

$$
MC_i^v = d_i + b_{i,i}\, x_i^{o} = c_i + \lambda_i, \qquad
AC_i^v = d_i + \tfrac{1}{2}\, b_{i,i}\, x_i^{o} = c_i \tag{19}
$$

resulting in

$$
b_{i,i} = \frac{2\lambda_i}{x_i^{o}} \quad \text{and} \quad d_i = c_i - \lambda_i \tag{20}
$$
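The step from (19) to (20) can be checked numerically with a few lines of Python. Subtracting the average cost condition from the marginal cost condition gives $\tfrac{1}{2} b_{i,i} x_i^{o} = \lambda_i$, from which both parameters follow. The function name and the example values are hypothetical; $\lambda_i$ is taken as given.

```python
def standard_pmp_parameters(c, lam, x_obs):
    """Return (d, b) per crop according to (20).

    c     : observed accounting cost per crop
    lam   : lambda_i per crop, as used in (19)
    x_obs : observed base-year activity levels
    """
    b = [2.0 * l / x for l, x in zip(lam, x_obs)]
    d = [ci - li for ci, li in zip(c, lam)]
    return d, b


# Invented example for a single crop
c, lam, x_obs = [300.0], [50.0], [100.0]
d, b = standard_pmp_parameters(c, lam, x_obs)

# Both conditions in (19) hold at the observed activity level
marginal_cost = d[0] + b[0] * x_obs[0]        # equals c + lambda
average_cost = d[0] + 0.5 * b[0] * x_obs[0]   # equals c
```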


Furthermore, we defined an "intelligent no change" forecast by taking the 1990 levels of annual crops and reducing them, where applicable, by set-aside obligations. The resulting areas were then made consistent with the available land in 1994.

Figure 1 shows the percentage deviation of "simulated" from observed national production activity levels in 1994 for the three approaches. The "standard approach" shows rather high deviations for some major crops. Somewhat surprisingly, the "no-change" forecast is comparatively close to observed production activity levels. Apparently, the 1992 CAP reform had, at least in France, a relatively small impact on the aggregate crop rotation apart from the set-aside effect. With this in mind, the fit of the ME-PMP approach based on the cross sectional sample is rather promising: apart from sunflowers and potatoes it beats the "no-change" results. The sum of absolute deviations in levels weighted by the observed levels amounts to just 3 % (see "Total").

Figure 1: Percentage deviation of simulated from observed production activity levels for the French aggregate

(Vertical axis: percentage deviation, -50 to 50. Values by approach:)

Activity              Standard    ME    No change
Wheat                      -11    -2           -4
Barley                      -3     8            9
Maize                        3    -2           -8
Cereals                     -4     1           -2
Sunflowers                 -42    -3           -1
Rape                        33     6          -13
Oilseeds                     2     1           -5
Pulses                     -28    -1          -12
Potatoes                    22     8            0
Arable fodder                9    -3            7
Fallow / set-aside           2     0          -17
Total                       11     3            7

But the CAPRI model is designed as a regional model, and consequently the variation in regional forecasting errors is at least of the same importance. The results in Figure 1 could stem from a rather "lucky" offsetting of large regional under- and overestimates of activity levels. Therefore, we additionally checked the regional fit by calculating mean absolute percentage deviations over regions, presented for the most important activities and aggregates in Figure 2. The standard approach was again no real competitor. However, the performance of the ME-PMP approach is about the same as "no-change", apart from the aggregate of fallow land and set-aside. As explained above, problems could be expected here as no substantial information entered the calibration step.

Figure 2: Mean absolute percentage deviations of simulated from observed production activity levels across regions

(Vertical axis: mean absolute percentage deviation, 0 to 50. Values by approach:)

Activity              Standard    ME    No change
Wheat                       24     7            8
Barley                      42    12           14
Maize                       37    15           16
Cereals                      7     5            4
Sunflowers                  49    14           11
Rape                        40    23           24
Oilseeds                    15    14           13
Pulses                      29    14           18
Potatoes                    31    14           10
Arable fodder               11    11            8
Fallow / set-aside          36    32           17
Total                       29    14           12
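The difference between the national measure in Figure 1 and the regional measure in Figure 2 can be illustrated with a small Python sketch. The exact weighting used for Figure 2 is not spelled out in the text, so an unweighted mean over regions is assumed here; function names and numbers are hypothetical.

```python
def national_pct_deviation(sim, obs):
    """Percentage deviation of the national aggregate (sum over regions)."""
    return 100.0 * (sum(sim) - sum(obs)) / sum(obs)


def mean_abs_pct_deviation(sim, obs):
    """Unweighted mean over regions of the absolute percentage deviation."""
    return 100.0 * sum(abs(s - o) / o for s, o in zip(sim, obs)) / len(obs)


def weighted_abs_deviation(sim, obs):
    """Sum of absolute deviations in levels relative to the sum of observed
    levels, in percent (the type of measure described for "Total")."""
    return 100.0 * sum(abs(s - o) for s, o in zip(sim, obs)) / sum(obs)


# Invented example: one crop in two regions, errors of opposite sign
observed = [100.0, 100.0]
simulated = [120.0, 80.0]

aggregate_error = national_pct_deviation(simulated, observed)   # errors cancel
regional_error = mean_abs_pct_deviation(simulated, observed)    # errors do not cancel
```

With regional errors of opposite sign the national aggregate looks perfect while the regional measure reveals a sizeable misfit, which is the "lucky" offsetting the regional check is meant to detect.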

5 Summary and Conclusions

So far, most PMP applications in aggregate programming models have suffered from a rather arbitrary specification of the non-linear objective function. Classical econometric approaches cannot be applied to this typically underdetermined estimation problem, a problem overcome by using the Maximum Entropy criterion as proposed in PARIS and HOWITT (1998). Their application, however, included just one observation on marginal cost and additionally suffered from an unfortunate definition of supports based on a specific implementation of the Cholesky decomposition employed to ensure the correct curvature of the estimated cost function. These problems are addressed by the ME-PMP approach presented in this paper, which

• uses a cross-sectional sample in order to derive changes in marginal cost based on observed differences between regions with different crop rotations, and

• provides a solution for the curvature problem with limited computational burden and direct definition of support points for the parameters of interest.

An ex-post validation of the resulting model specification simulated the 1992 CAP reform for crop production in France. The results show a promising fit to observed production activity levels, not only for the national aggregate but also in the regional dimension. The ex-post simulation exercise, rarely executed and published in the context of aggregate programming models, shows the general validity of the calibration procedure for the regional programming models in CAPRI. The allocation behaviour of the resulting models is clearly superior to standard applications of PMP (see CYPRIS, 1999).

Nevertheless, the general approach leaves ample opportunities to put the validation on a broader base, to improve the economic foundation, and to introduce more empirical information into the calibration step. Specific issues on our research agenda are:

• the additional use of time series observations to extend the information base and to estimate time-dependent parameters,

• the elaboration of explicit theoretical links of current PMP applications with duality theory,

• the investigation of links to duality based econometric models with explicit allocation of fixed factors to different production activities.


6 References

CYPRIS, Ch. (1999): Positiv Mathematische Programmierung (PMP) im Agrarsektormodell RAUMIS. Dissertation, forthcoming.

GILL, P.E., MURRAY, W. and M.H. WRIGHT (1989): Practical Optimization. Academic Press, London.

GOLAN, A., JUDGE, G. and D. MILLER (1996): Maximum Entropy Econometrics. Chichester UK: Wiley.

GOLUB, G.H. and C.F. VAN LOAN (1989): Matrix Computations, 2nd edn. Johns Hopkins University Press.

HECKELEI, T. (1997): Positive Mathematical Programming: Review of the Standard Approach. CAPRI-working paper 97-03.

HOWITT, R.E. (1995): Positive Mathematical Programming. AJAE, 77(2), pp. 329-42.

PARIS, Q. (1997): State-of-the-Art in Use and Interpretation of Positive Mathematical Programming. Presentation at the CAPRI-workshop in Reggio Emilia.

PARIS, Q. and R.E. HOWITT (1998): An Analysis of Ill-Posed Production Problems Using Maximum Entropy. AJAE, 80(1), pp. 124-138.

List of CAPRI Working Papers:

97-01: Britz, Wolfgang; Heckelei, Thomas: Pre-study for a medium-term simulation and forecast model of the agricultural sector for the EU
97-02: Britz, Wolfgang: Regionalization of EU-data in the CAPRI project
97-03: Heckelei, Thomas: Positive Mathematical Programming: Review of the Standard Approach
97-04: Meudt, Markus; Britz, Wolfgang: The CAPRI nitrogen balance
97-05: Löhe, Wolfgang; Britz, Wolfgang: EU's Regulation 2078/92 in Germany and experiences of modelling less intensive production alternatives
97-06: Möllmann, Claus: FADN/RICA Farm Accountancy Data Network Short Introduction
97-07: Löhe, Wolfgang: Specification of variable inputs in RAUMIS
97-08: María Sancho and J.M. García Alvarez-Coque: Changing agricultural systems in the context of "compatible" agriculture. The Spanish "experience"
97-09: Helmi Ahmed El Kamel and J.M. García Alvarez-Coque: Modelling the supply response of perennial crops
97-10: Patrick Gaffney: A Projection of Irish Agricultural Structure Using Markov Chain Analysis
97-11: P. Nasuelli, G. Palladino, M. Setti, C. Zanasi, G. Zucchi: A bottom-up approach for the CAPRI project
97-12: P. Nasuelli, G. Palladino, M. Setti, C. Zanasi, G. Zucchi: FEED MODULE: Requirements functions and Restriction factors
98-01: Heckelei, Thomas; Britz, Wolfgang: EV-Risk analysis for Germany
98-02: Heckelei, Thomas; Britz, Wolfgang; Löhe, Wolfgang: Recursive dynamic or comparative static solution for CAPRI
98-03: Löhe, Wolfgang; Britz, Wolfgang: Modelling alternative technologies based on the RAUMIS-NRW approach
98-04: Sander, Reinhard: General status of the project
98-05: Scott R. Steele and Patrick Gaffney: A Regional Analysis of the Changing Structure of Agricultural Land Holdings
98-06: Heinz Peter Witzke and Wolfgang Britz: A Maximum Entropy Approach to the Calibration of Highly Differentiated Demand System
98-07: Wolfgang Britz: A Synthetic Non-Spatial Multi-Commodity Model as Market Component for CAPRI
98-08: Wolfgang Britz and Stefan Sieber: Estimating feed input demand of the German Compound Feed Industry
98-09: Eoghan Garvey and Scott Steele: Short Term Forecasts of Structural Changes in Irish Agriculture
98-10: Helmi Ahmed El Kamel and Sonia Iborra Gómez: A regionalized data base and forecasts for the supply of Perennial Crops
98-11: Nasuelli P., Palladino G., Setti M., Tampellini V., Zanasi C.: A regionalized analysis of the environmental impact of the animal production activities – nitrogen and methane emissions
98-12: Nasuelli P., Palladino G., Setti M., Zanasi C.: Estimation of the elasticity of substitution between imported and domestically produced goods. An application of the Armington approach.
99-01: Fischer, Jürg: Energy Inputs in Swiss Agriculture
99-02: Nasuelli P., Palladino G., Setti M., Zanasi C.: A demographic model for the definition of the livestock activity level for the EU regions
99-03: Britz, Wolfgang: Conducting Simulation with CAPRI
99-04: Britz, Wolfgang: A Maximum Entropy based approach to input coefficients for the SPEL/EU data base
99-05: Sander, Reinhard: Political variables in CAPRI