
SECTION I

INTRODUCTION

CO-EVOLUTIONARY learning refers to a broad class of population-based, stochastic search algorithms that involve the simultaneous evolution of competing solutions with coupled fitness [1]. The co-evolutionary search process is characterized by the adaptation of solutions in some form of representation through repeated applications of variation and selection [2]. Co-evolutionary learning offers an attractive alternative approach to problem solving in cases where obtaining an absolute quality measurement to guide the search for solutions is difficult or impossible. One such problem is game-playing [3], [4], [5], [6], [7], [8]. Unlike classical machine learning, which requires an absolute quality measurement, the search process in co-evolutionary learning can be guided by strategic interactions between competing solutions (learners). Early studies [9], [10], [11] have further argued that the co-evolutionary search may benefit from these strategic interactions from one generation to the next, resulting in an arms race of increasingly innovative solutions.

Generalization is one of the main research issues in co-evolutionary learning. Recently, we have formulated a theoretical framework for generalization in co-evolutionary learning [12]. Other past studies, such as [13], have investigated an approach to analyze performance in co-evolving populations through non-local adaptation. A general framework for statistical comparison of the performance of evolutionary algorithms has also recently been formulated [14]. In line with these studies, the generalization framework offers a rigorous approach to performance analysis of co-evolutionary learning, whether for individual co-evolved solutions or for the population of co-evolved solutions in any generation.

We have demonstrated the generalization framework in the context of game-playing. Generalization performance of a strategy (solution) is estimated using a collection of random test strategies (test cases) by taking the average game outcomes, with confidence bounds provided by Chebyshev's theorem [15]. Chebyshev's bounds have the advantage that they hold for any distribution of game outcomes. However, such a distribution-free framework leads to unnecessarily loose confidence bounds. In this paper, we have taken advantage of the near-Gaussian nature of average game outcomes through the central limit theorem [16] and provided tighter bounds based on parametric testing. Furthermore, we can strictly control the condition (i.e., sample size under a given precision) under which the distribution of average game outcomes (generalization performance estimates) converges to a Gaussian through the Berry-Esseen theorem [17].

These improvements to the generalization framework now provide the means with which we can develop a general and principled approach to improve generalization performance in co-evolutionary learning that can be implemented efficiently. Ideally, if we computed the true generalization performance of any co-evolving solution and directly used it as the fitness measure, co-evolutionary learning would lead to the search for solutions with higher generalization performance. However, direct estimation of the true generalization performance using the distribution-free framework can be computationally expensive [12]. Our new theoretical contributions that exploit the near-Gaussian nature of generalization estimates allow us: 1) to find out in a principled manner the required number of test cases for robust estimation (given a controlled level of precision) of generalization performance, and 2) to subsequently use small samples of random test cases sufficient for robust estimation to compute generalization estimates of solutions directly as the fitness measure to guide and improve co-evolutionary learning.

Early studies [18], [19], [20] have shown that the classical co-evolutionary learning approach that uses relative fitness (i.e., fitness evaluation that depends on other competing solutions in the population) does not necessarily lead to solutions with increasingly higher generalization performance. Others have investigated various approaches to improve performance in co-evolutionary learning, e.g., by exploiting diversity in the population [21], [22], [23], using other notions for the fitness measure such as Pareto dominance [24], and using archives of test cases [25], [26], [27], among others. The study in [28] investigated how performance can be improved in a cooperative co-evolutionary learning framework (where a population member represents only part of a complete solution), in contrast to most other studies, which have considered the competitive co-evolutionary learning framework (where a population member represents a complete solution).

Unlike these past studies, we demonstrate an approach that improves generalization performance in a principled manner, can be implemented as an efficient algorithm (e.g., using small samples of test cases), and is verified in a principled manner as well. Our approach directly uses generalization estimates as the fitness measure in a competitive co-evolutionary learning setting. A series of empirical studies involving the iterated prisoner's dilemma (IPD) and the more complex Othello game is used to demonstrate how the new approach improves on the classical approach: evolved strategies with increasingly higher generalization performance are obtained using relatively small samples of test strategies. This is achieved without the large performance fluctuations typical of the classical approach. The new approach also leads to faster co-evolutionary search, where we can strictly control the condition (sample sizes) under which the speedup is achieved (not at the cost of weakened precision in the estimates). It is faster than using the distribution-free framework (it requires an order of magnitude fewer test strategies) to achieve similarly high generalization performance.

More importantly, our approach does not depend on the complexity of the game. For some games that are more complex (under some measures of game complexity), more test strategies may be required to estimate the generalization performance of a strategy for a given level of precision. However, this will come out automatically and in a principled manner from our analysis. The necessary sample size for robust estimations can then be set and subsequently, generalization estimates can be computed and directly used as the fitness measure to guide co-evolutionary search of strategies with higher generalization performance.

We note that this paper is a first step toward understanding and developing theoretically motivated frameworks of co-evolutionary learning that can lead to improvements in the generalization performance of solutions. Although our generalization framework makes no assumption on the underlying distribution of test cases, we demonstrate one application where the underlying distribution in the generalization measure is fixed and known a priori. Generalization estimates are directly used as the fitness measure to improve generalization performance in co-evolutionary learning. This has the effect of reformulating co-evolutionary learning as an evolutionary learning approach, but with the advantage of a principled and efficient methodology that has the potential of outperforming the classical co-evolutionary approach on difficult learning problems such as games. Further studies may involve extending the generalization framework to formulate co-evolutionary learning systems where the population acting as test samples can adapt to approximate a particular distribution that solutions should generalize to.

The rest of this paper is organized as follows. Section II presents the theoretical framework for statistical estimation of generalization performance, and improvements made to provide tighter bounds through the central limit and Berry-Esseen theorems. We mention two kinds of parametric testing: 1) making statistical claims on the hypothesized performance of a strategy, and 2) comparing performance differences of a pair of strategies. Section III demonstrates how one can find out and set the required number of test strategies for robust estimation of generalization performance in a principled manner, using the IPD game for illustration. It is shown that a small number of test strategies is sufficient to estimate generalization performance with good accuracy. Section IV demonstrates how generalization estimates can be used directly as the fitness measure to improve co-evolutionary learning. We first illustrate the new co-evolutionary approach using the IPD game and later consider the more complex Othello game. Finally, Section V concludes the paper with some remarks for future studies.

SECTION II

STATISTICAL ESTIMATION OF GENERALIZATION PERFORMANCE IN CO-EVOLUTIONARY LEARNING

A. Games

In co-evolutionary learning, the quality of a solution is determined relative to other competing solutions in the population through interactions. This can be framed in the context of game-playing, i.e., an interaction is a game played between two strategies (solutions) [12]. We assume that there is a potentially vast but finite set of possible strategies that can be involved in playing the game. At each time step, a strategy can select a move from a finite set of possible moves to play the game. Endowing strategies with memory of their own and opponents' moves results in an exponential explosion in the number of such strategies.

Consider a game and a set ${\cal S}$ of $M$ distinct strategies, ${\cal S}=\{1,2,\ldots,M\}$. Denote the game outcome of strategy $i$ playing against the opponent strategy $j$ by $G_{i}(j)$. Different definitions of $G_{i}(j)$ (different notions of game outcomes) for the generalization performance indicate different measures of quality [12]. The win-lose function for $G_{i}(j)$ is given by
$$G_{\rm W}(i,j)=\cases{C_{\rm WIN}, &if $g(i,j)>g(j,i)$\cr C_{\rm LOSE},&otherwise}\eqno{(1)}$$
where $g(i,j)$ and $g(j,i)$ are the payoffs for strategies $i$ and $j$ at the end of the game, respectively, and $C_{\rm WIN}$ and $C_{\rm LOSE}$ are arbitrary constants with $C_{\rm WIN}>C_{\rm LOSE}$. We use $C_{\rm WIN}=1$ and $C_{\rm LOSE}=0$ and arbitrarily choose a stricter form of $G_{\rm W}(i,j)$ (a loss is awarded to both sides in the case of a tie). This choice of $G_{\rm W}(i,j)$ simplifies the analysis of the example we present later: for the IPD game, the "all defect" strategy that plays full defection is known to be the only strategy with the maximum generalization performance when $G_{\rm W}(i,j)$ is used, irrespective of how test strategies are distributed in ${\cal S}$. This is not necessarily true for other definitions, such as the average-payoff function.

B. Estimating Generalization Performance

A priori some strategies may be favored over the others, or all strategies can be considered with equal probability. Let the selection of individual test strategies from ${\cal S}$ be represented by a random variable $J$ taking on values $j\in{\cal S}$ with probability $P_{\cal S}(j)$. $G_{i}$ is the true generalization performance of a strategy $i$ and is defined as the mean performance (game outcome) against all possible test strategies $j$
$$G_{i}=\sum_{j=1}^{M}P_{\cal S}(j)\,G_{i}(j).\eqno{(2)}$$

In other words, $G_{i}$ is the mean of the random variable $G_{i}(J)$
$$G_{i}=E_{P_{\cal S}}[G_{i}(J)].$$

In particular, when all strategies are equally likely to be selected as test strategies, i.e., when $P_{\cal S}$ is uniform, we have
$$G_{i}={{1}\over{M}}\sum_{j=1}^{M}G_{i}(j).\eqno{(3)}$$

The size $M$ of the strategy space ${\cal S}$ can be very large, making direct estimation of $G_{i}$, $i\in{\cal S}$, through (2) infeasible. In practice, one can estimate $G_{i}$ through a random sample $S_{N}$ of $N$ test strategies drawn i.i.d. from ${\cal S}$ with probability $P_{\cal S}$. The estimated generalization performance of strategy $i$ is denoted by $\hat{G_{i}}(S_{N})$ and given as follows:
$$\hat{G_{i}}(S_{N})={{1}\over{N}}\sum_{j\in S_{N}}G_{i}(j).\eqno{(4)}$$
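As a minimal illustration of (4), the following sketch estimates the generalization performance of a strategy by Monte Carlo sampling. The helpers sample_test_strategy (drawing one test strategy from ${\cal S}$ according to $P_{\cal S}$) and play_game (returning the game outcome $G_{i}(j)$, e.g., the win-lose outcome of (1)) are assumptions for illustration, not part of the framework itself.

    def estimate_generalization(strategy, sample_test_strategy, play_game, n_tests):
        """Estimate G_i(S_N) as in (4): the mean game outcome of `strategy`
        against n_tests test strategies drawn i.i.d. from the strategy space."""
        outcomes = [play_game(strategy, sample_test_strategy())
                    for _ in range(n_tests)]
        return sum(outcomes) / n_tests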

If the game outcome $G_{i}(J)$ varies within a finite interval $[G_{\rm MIN},G_{\rm MAX}]$ of size $R$, the variance of $G_{i}(J)$ is upper-bounded by $\sigma^{2}_{\rm MAX}=(G_{\rm MAX}-G_{\rm MIN})^{2}/4=R^{2}/4$. Using Chebyshev's theorem [15], we obtain
$$P(\vert\hat{G_{i}}-G_{i}\vert\geq\epsilon)\leq{{R^{2}}\over{4N{\epsilon}^{2}}}\eqno{(5)}$$
for any positive number $\epsilon>0$. Note that Chebyshev's bounds (5) are distribution-free, i.e., no particular form of distribution of $G_{i}(J)$ is assumed. One can make statistical claims of how confident one is in the accuracy of an estimate given a random test sample of a known size $N$ using Chebyshev's bounds [12].

C. Error Estimations for Gaussian-Distributed Generalization Estimates

Selection of the sample $S_{N}$ of test strategies can be formalized through a random variable ${\cal S}_{N}$ on ${\cal S}^{N}$ endowed with the product measure induced by $P_{\cal S}$. Estimates of the generalization performance of strategy $i$ can be viewed as realizations of the random variable $\hat{G_{i}}({\cal S}_{N})$. Since game outcomes $G_{i}(J)$ have finite mean and variance, by the central limit theorem, for large enough $N$, $\hat{G_{i}}({\cal S}_{N})$ is Gaussian-distributed. Claims regarding the "speed of convergence" of the (cumulative) distribution of $\hat{G_{i}}({\cal S}_{N})$ to the (cumulative) distribution of a Gaussian can be made quantitative using the Berry-Esseen theorem [17].

First, normalize $G_{i}(J)$ to zero mean
$$X_{i}(J)=G_{i}(J)-G_{i}.\eqno{(6)}$$

Denote the variance of $G_{i}(J)$ [and hence the variance of $X_{i}(J)$] by $\sigma_{i}^{2}$. Since $G_{i}(J)$ can take on values in a finite domain, the third absolute moment
$$\rho_{i}=E_{P_{\cal S}}[\vert X_{i}(J)\vert^{3}]$$
of $X_{i}$ is finite.

Second, normalize $\hat{G_{i}}({\cal S}_{N})$ to zero mean
$$Y_{i}({\cal S}_{N})=\hat{G_{i}}({\cal S}_{N})-G_{i}={{1}\over{N}}\sum_{j\in{\cal S}_{N}}X_{i}(j).\eqno{(7)}$$

Third, normalize $\hat{G_{i}}({\cal S}_{N})$ to unit standard deviation
$$Z_{i}({\cal S}_{N})={{Y_{i}({\cal S}_{N})}\over{\sigma_{i}/\sqrt{N}}}.\eqno{(8)}$$

The Berry-Esseen theorem states that the cumulative distribution function (CDF) $F_{i}$ of $Z_{i}({\cal S}_{N})$ converges (pointwise) to the CDF $\Phi$ of the standard normal distribution $N(0,1)$. For any $x\in{\mathbb R}$
$$\left\vert F_{i}(x)-\Phi(x)\right\vert\leq{{0.7975}\over{\sqrt{N}}}{{\rho_{i}}\over{\sigma_{i}^{3}}}.\eqno{(9)}$$

It is noted that only information on $\sigma_{i}$ and $\rho_{i}$ is required to make an estimate of the pointwise difference between the CDFs of $Z_{i}({\cal S}_{N})$ and $N(0,1)$. In practice, since the (theoretical) moments $\sigma_{i}$ and $\rho_{i}$ are unknown, we use their empirical estimates. To ensure that the CDFs of $Z_{i}({\cal S}_{N})$ and $N(0,1)$ do not differ pointwise by more than $\epsilon>0$, we need at least
$$N_{\rm CDF}(\epsilon)={{0.7975^{2}}\over{\epsilon^{2}}}{{\rho_{i}^{2}}\over{(\sigma_{i}^{2})^{3}}}\eqno{(10)}$$
test strategies.
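For reference, (10) translates into a one-line helper; sigma_sq and rho are the (empirically estimated) variance and third absolute central moment of the game outcomes, as in the text.

    def berry_esseen_sample_size(sigma_sq, rho, epsilon):
        """Minimum N from (10) so that the CDF of the normalized generalization
        estimate differs pointwise from the standard normal CDF by at most epsilon."""
        # Callers would round this bound up to an integer sample size.
        return (0.7975 ** 2 / epsilon ** 2) * (rho ** 2 / sigma_sq ** 3)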

Let us now assume that the generalization estimates are Gaussian-distributed (e.g., using the analysis above, we gather enough test points to make the means almost Gaussian-distributed). Denote by $z_{\alpha/2}$ the upper $\alpha/2$ point of $N(0,1)$, i.e., the area under the standard normal density over $(z_{\alpha/2},\infty)$ is $\alpha/2$, and over $[-z_{\alpha/2},z_{\alpha/2}]$ it is $1-\alpha$. For large strategy samples $S_{N}$, the estimated generalization performance $\hat{G_{i}}(S_{N})$ of strategy $i\in{\cal S}$ has standard error $\sigma_{i}/\sqrt{N}$. Since $\sigma_{i}$ is generally unknown, the standard error can be estimated as
$${{\hat\sigma_{i}(S_{N})}\over{\sqrt{N}}}=\sqrt{{\sum\nolimits_{j\in S_{N}}(G_{i}(j)-\hat{G_{i}}(S_{N}))^{2}}\over{N(N-1)}}\eqno{(11)}$$
and the $100(1-\alpha)\%$ error margin of $\hat{G_{i}}(S_{N})$ is $z_{\alpha/2}\sigma_{i}/\sqrt{N}$, or, if $\sigma_{i}$ is unknown
$$\Upsilon_{i}(\alpha,N)=z_{\alpha/2}\sqrt{{\sum\nolimits_{j\in S_{N}}(G_{i}(j)-\hat{G_{i}}(S_{N}))^{2}}\over{N(N-1)}}.\eqno{(12)}$$

Requiring that the error margin be at most $\delta>0$ leads to samples of at least
$$N_{\rm em}(\delta)={{z_{\alpha/2}^{2}\sigma_{i}^{2}}\over{\delta^{2}}}\eqno{(13)}$$
test strategies.

In other words, to be $100(1-\alpha)\%$ sure that the estimation error $\vert\hat{G_{i}}(S_{N})-G_{i}\vert$ will not exceed $\delta$, we need $N_{\rm em}(\delta)$ test strategies. Stated in terms of a confidence interval, a $100(1-\alpha)\%$ confidence interval for the true generalization performance of strategy $i$ is
$$(\hat{G_{i}}(S_{N})-\Upsilon_{i}(\alpha,N),\ \hat{G_{i}}(S_{N})+\Upsilon_{i}(\alpha,N)).\eqno{(14)}$$
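A minimal sketch of (11)-(14), using only the Python standard library (NormalDist for the upper $\alpha/2$ point); outcomes is assumed to be the list of game outcomes $G_{i}(j)$ for $j\in S_{N}$.

    import math
    from statistics import NormalDist

    def confidence_interval(outcomes, alpha=0.05):
        """100(1-alpha)% confidence interval (14) via the estimated standard
        error (11) and error margin (12)."""
        n = len(outcomes)
        mean = sum(outcomes) / n
        se = math.sqrt(sum((x - mean) ** 2 for x in outcomes) / (n * (n - 1)))
        z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0,1)
        margin = z * se                          # Upsilon_i(alpha, N) from (12)
        return mean - margin, mean + margin

    def required_sample_size(sigma, delta, alpha=0.05):
        """Sample size N_em(delta) from (13) so the error margin is at most delta."""
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return (z * sigma / delta) ** 2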

D. Statistical Testing for Comparison of Strategies

One can also make statistical claims regarding the hypothesized performance of the studied strategies. For example, one may be only interested in strategies with true generalization performance greater than some threshold $\tilde G$. In this case, we can test whether $i$ is a "bad" strategy by testing for $G_{i}<\tilde G$. The hypothesis $H_{1}$ that $G_{i}<\tilde G$ is substantiated at a significance level of $\alpha\%$ (against the null hypothesis $H_{0}$ that $G_{i}=\tilde G$) if the test statistic
$$Z^{\prime}_{i}(S_{N},\tilde G)={{\hat{G_{i}}(S_{N})-\tilde G}\over{\sqrt{{\sum\nolimits_{j\in S_{N}}(G_{i}(j)-\hat{G_{i}}(S_{N}))^{2}}\over{N(N-1)}}}}\eqno{(15)}$$
falls below $-z_{\alpha}$, i.e., if $Z^{\prime}_{i}(S_{N},\tilde G)\leq-z_{\alpha}$. Alternatively, the hypothesis that strategy $i$ is an acceptable strategy, i.e., $G_{i}>\tilde G$, is accepted (against $H_{0}$) at a significance level of $\alpha\%$ if $Z^{\prime}_{i}(S_{N},\tilde G)\geq z_{\alpha}$. We can also simply test for $G_{i}\ne\tilde G$, in which case we require $\vert Z^{\prime}_{i}(S_{N},\tilde G)\vert\geq z_{\alpha/2}$.

Crucially, we can compare two strategies $i,j\in{\cal S}$ for their relative performance. This can be important in the evolutionary or co-evolutionary learning setting when constructing a new generation of strategies. Assume that both strategies $i$ and $j$ play against the same set of $N$ test strategies $S_{N}=\{t_{1},t_{2},\ldots,t_{N}\}$. Statistical tests regarding the relation between the true generalization performances of $i$ and $j$ can be made using paired tests. One computes a series of performance differences on $S_{N}$
$$D(n)=G_{i}(t_{n})-G_{j}(t_{n}),\quad n=1,2,\ldots,N.$$

The performance differences are then analyzed as a single sample. At a significance level of $\alpha\%$, strategy $i$ appears to be better than strategy $j$ by more than a margin $\tilde D$ (against the null hypothesis that $i$ beats $j$ exactly by the margin $\tilde D$) if $Z^{\prime\prime}_{i}(S_{N},\tilde D)\geq z_{\alpha}$, where
$$Z^{\prime\prime}_{i}(S_{N},\tilde D)={{\hat{D}(S_{N})-\tilde D}\over{\sqrt{{\sum\nolimits_{n=1}^{N}(D(n)-\hat{D}(S_{N}))^{2}}\over{N(N-1)}}}}\eqno{(16)}$$
and
$$\hat{D}(S_{N})={{1}\over{N}}\sum_{n=1}^{N}D(n).\eqno{(17)}$$

For simply testing whether strategy $i$ outperforms strategy $j$, we set the margin to 0, i.e., $\tilde D=0$. Analogously, strategy $i$ appears to be worse than strategy $j$ at a significance level of $\alpha\%$ provided $Z^{\prime\prime}_{i}(S_{N},0)\leq-z_{\alpha}$.

Finally, strategies $i$ and $j$ appear to be different at a significance level of $\alpha\%$ if $\vert Z^{\prime\prime}_{i}(S_{N},0)\vert\geq z_{\alpha/2}$. We stress that the comparison of strategies $i,j\in{\cal S}$ is done through a set of test strategies in $S_{N}$ and not through a game of strategy $i$ playing against strategy $j$. Although one may want to compare one strategy with another directly by having them compete against each other, it should be noted that specific properties of games such as intransitivity may lead to misleading results.

For small samples $S_{N}$ of test strategies, we would need to use the $t$-statistic instead of the normally distributed $Z$-statistic employed here. The $t$-statistic follows Student's $t$-distribution with $N-1$ degrees of freedom. However, for the sample sizes $N\geq 50$ used in this paper, Student's $t$-distribution can be conveniently replaced by the standard normal distribution $N(0,1)$.
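The two test statistics can be sketched as follows, assuming the game outcomes of both strategies against the same test sample are available as equal-length lists (a hedged illustration; it omits, e.g., the degenerate case of zero sample variance).

    import math
    from statistics import NormalDist

    def z_vs_threshold(outcomes, threshold):
        """Test statistic Z'_i(S_N, G~) from (15)."""
        n = len(outcomes)
        mean = sum(outcomes) / n
        se = math.sqrt(sum((x - mean) ** 2 for x in outcomes) / (n * (n - 1)))
        return (mean - threshold) / se

    def paired_z(outcomes_i, outcomes_j, margin=0.0):
        """Paired test statistic Z''_i(S_N, D~) from (16)-(17): per-opponent
        differences D(n) analyzed as a single sample."""
        diffs = [gi - gj for gi, gj in zip(outcomes_i, outcomes_j)]
        n = len(diffs)
        d_mean = sum(diffs) / n
        se = math.sqrt(sum((d - d_mean) ** 2 for d in diffs) / (n * (n - 1)))
        return (d_mean - margin) / se

    # Example: strategy i appears better than strategy j at significance level alpha
    # if paired_z(outcomes_i, outcomes_j) >= NormalDist().inv_cdf(1 - alpha).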

E. Properties of Gaussian-Distributed Generalization Estimates

It is common to use, instead of the true standard deviation $\sigma_{i}$ of game outcomes for strategy $i$, its sample estimate [see (11)]
$$\hat\sigma_{i}(S_{N})=\sqrt{{\sum\nolimits_{j\in S_{N}}(G_{i}(j)-\hat{G_{i}}(S_{N}))^{2}}\over{N-1}}.\eqno{(18)}$$

If we generate $n$ i.i.d. test strategy samples $S_{N}^{r}$, $r=1,2,\ldots,n$, each of size $N$, then the generalization performance estimates
$$\hat{G_{i}}(S^{r}_{N})={{1}\over{N}}\sum_{j\in S^{r}_{N}}G_{i}(j)$$
are close to being Gaussian-distributed with mean $G_{i}$ and standard deviation $\sigma_{i}/\sqrt{N}$ (for large enough $N$). Such generalization estimates can be used to estimate the confidence interval for $\sigma_{i}$ as follows.

The sample variance of the estimates $\hat{G_{i}}(S^{r}_{N})$ is
$$V^{2}_{n}={{\sum\nolimits_{r=1}^{n}(\hat{G_{i}}(S^{r}_{N})-\Gamma_{i})^{2}}\over{n-1}}\eqno{(19)}$$
where
$$\Gamma_{i}={{1}\over{n}}\sum_{r=1}^{n}\hat{G_{i}}(S^{r}_{N}).\eqno{(20)}$$

The normalized sample variance of a Gaussian-distributed $\hat{G_{i}}(S^{r}_{N})$
$$U^{2}_{n}={{(n-1)V^{2}_{n}}\over{\sigma_{i}^{2}/N}}\eqno{(21)}$$
is known to be $\chi^{2}$-distributed with $n-1$ degrees of freedom.1

The $100(1-\alpha)\%$ confidence interval for $\sigma_{i}/\sqrt{N}$ is
$$\left(V_{n}\sqrt{{n-1}\over{\chi^{2}_{\alpha/2}}},\ V_{n}\sqrt{{n-1}\over{\chi^{2}_{1-\alpha/2}}}\right)\eqno{(22)}$$
where $\chi^{2}_{\beta}$ is the value such that the area to the right of $\chi^{2}_{\beta}$ under the $\chi^{2}$ distribution with $n-1$ degrees of freedom is $\beta$. It follows that the $100(1-\alpha)\%$ confidence interval for $\sigma_{i}$ is
$$\left(V_{n}\sqrt{{N(n-1)}\over{\chi^{2}_{\alpha/2}}},\ V_{n}\sqrt{{N(n-1)}\over{\chi^{2}_{1-\alpha/2}}}\right)\eqno{(23)}$$
which can be rewritten as
$$\left(\sqrt{{N\cdot\sum\nolimits_{r=1}^{n}(\hat{G_{i}}(S^{r}_{N})-\Gamma_{i})^{2}}\over{\chi^{2}_{\alpha/2}}},\ \sqrt{{N\cdot\sum\nolimits_{r=1}^{n}(\hat{G_{i}}(S^{r}_{N})-\Gamma_{i})^{2}}\over{\chi^{2}_{1-\alpha/2}}}\right).\eqno{(24)}$$
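A sketch of the interval (23)/(24), assuming the $\chi^{2}$ quantiles are taken from scipy (an assumed dependency); estimates is assumed to hold the $n$ generalization estimates, each computed on a test sample of size $N$.

    import math
    from scipy.stats import chi2  # assumed dependency for chi-square quantiles

    def sigma_confidence_interval(estimates, N, alpha=0.05):
        """100(1-alpha)% confidence interval (23)/(24) for sigma_i."""
        n = len(estimates)
        gamma = sum(estimates) / n                     # Gamma_i from (20)
        ss = sum((g - gamma) ** 2 for g in estimates)  # (n-1) * V_n^2, see (19)
        q_hi = chi2.ppf(1 - alpha / 2, df=n - 1)       # chi^2_{alpha/2}: area alpha/2 to its right
        q_lo = chi2.ppf(alpha / 2, df=n - 1)           # chi^2_{1-alpha/2}
        return math.sqrt(N * ss / q_hi), math.sqrt(N * ss / q_lo)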

F. Ramifications of Statistical Estimation of Generalization Performance in Co-Evolutionary Learning

This framework provides a computationally feasible approach to estimate generalization performance in co-evolutionary learning. A small sample of test strategies may be sufficient to estimate the generalization performance of strategies, even though the strategy space is huge. Furthermore, the framework has potential application in developing efficient algorithms to improve co-evolutionary search. Our theoretical framework allows us to develop a methodology to find the number of test strategies required for the robust estimation of generalization performance. Subsequently, generalization estimates obtained using a small sample of test strategies (compared to the case of direct estimation of the true generalization) can lead to the co-evolutionary search of strategies with increasingly higher generalization performance since the selection of evolved strategies is based on their estimated generalization performances.

SECTION III

EXAMPLES OF STATISTICAL ESTIMATION OF GENERALIZATION PERFORMANCE IN CO-EVOLUTIONARY LEARNING

We first illustrate several examples of statistical estimation of generalization performance in co-evolutionary learning based on our theoretical framework in Section II. We consider the three-choice IPD game with deterministic and reactive, memory-one strategies since we can compute the true generalization performance (for simplicity, we assume that test strategies are randomly sampled from ${\cal S}$ with a uniform distribution). We demonstrate how one can find and set the required number of random test strategies for robust estimation (given a controlled level of precision) of generalization performance for subsequent use of generalization estimates directly as the fitness measure in co-evolutionary learning. Our results show that a smaller number of test strategies than predicted previously in [12] is sufficient for robust estimation of generalization performance.

A. Iterated Prisoner's Dilemma Game

In the classical, two-player IPD game, each player is given two choices to play, cooperate or defect [30]. The game is formulated with a predefined payoff matrix specifying the payoff a player receives given the joint move it made with the opponent. Both players receive $R$ (reward) units of payoff if both cooperate. They both receive $P$ (punishment) units of payoff if they both defect. However, when one player cooperates while the other defects, the cooperator receives $S$ (sucker) units of payoff while the defector receives $T$ (temptation) units of payoff. The values $R$, $S$, $T$, and $P$ must satisfy the constraints $T>R>P>S$ and $R>(S+T)/2$. Any set of values can be used as long as they satisfy the IPD constraints (we use $T=5$, $R=4$, $P=1$, and $S=0$). The game is played over a series of moves (repeated interactions), with both players choosing between the two alternative choices on each move.

The classical IPD game has been extended to more complex versions, e.g., the IPD with multiple, discrete levels of cooperation [31], [32], [33], [34], [35]. The $n$-choice IPD game can be formulated using payoffs obtained through the following linear interpolation:
$$p_{\rm A}=2.5-0.5c_{\rm A}+2c_{\rm B},\quad -1\leq c_{\rm A},c_{\rm B}\leq 1\eqno{(25)}$$
where $p_{\rm A}$ is the payoff to player A, given that $c_{\rm A}$ and $c_{\rm B}$ are the cooperation levels of the choices that players A and B make, respectively. The payoff matrix for the three-choice IPD game is given in Fig. 1 [12].

Fig. 1. Payoff matrix for the two-player three-choice IPD game [12]. Each element of the matrix gives the payoff for player A.

The payoff matrix for any $n$-choice IPD game must satisfy the following conditions [32]:

  1. for $c_{\rm A}<c^{\prime}_{\rm A}$ and constant $c_{\rm B}$: $p_{\rm A}(c_{\rm A},c_{\rm B})>p_{\rm A}(c^{\prime}_{\rm A},c_{\rm B})$;
  2. for $c_{\rm A}\leq c^{\prime}_{\rm A}$ and $c_{\rm B}<c^{\prime}_{\rm B}$: $p_{\rm A}(c_{\rm A},c_{\rm B})<p_{\rm A}(c^{\prime}_{\rm A},c^{\prime}_{\rm B})$;
  3. for $c_{\rm A}<c^{\prime}_{\rm A}$ and $c_{\rm B}<c^{\prime}_{\rm B}$: $p_{\rm A}(c^{\prime}_{\rm A},c^{\prime}_{\rm B})>(1/2)(p_{\rm A}(c_{\rm A},c^{\prime}_{\rm B})+p_{\rm A}(c^{\prime}_{\rm A},c_{\rm B}))$.

These conditions are analogous to those for the classical IPDs: 1) defection always pays more; 2) mutual cooperation has a higher payoff than mutual defection; and 3) alternating between cooperation and defection pays less in comparison to just playing cooperation.
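As a small illustration (not part of the original formulation), the interpolation (25) can be written as a helper; evaluated at the extreme cooperation levels it reproduces the classical payoffs $T=5$, $R=4$, $P=1$, $S=0$ quoted above, and evaluating it on the choice set $\{-1, 0, +1\}$ yields the three-choice payoff matrix of Fig. 1.

    def ipd_payoff(c_a, c_b):
        """Payoff to player A from the linear interpolation (25),
        with cooperation levels c_a, c_b in [-1, 1]."""
        return 2.5 - 0.5 * c_a + 2.0 * c_b

    # Three-choice IPD: cooperation levels -1 (defection), 0, +1 (full cooperation).
    choices = [-1.0, 0.0, 1.0]
    payoff_matrix = {(ca, cb): ipd_payoff(ca, cb) for ca in choices for cb in choices}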

B. What is the Required Number of Test Strategies?

We would like to find out and set the required number $N$ of random test strategies drawn i.i.d. from ${\cal S}$ for robust estimation of generalization performance for a game. Instead of making assumptions about the complexity of a game and its impact on the required number of random test strategies, we demonstrate a principled approach based on our theoretical framework in Section II. Our approach exploits the near-Gaussian nature of generalization estimates and finds the rate at which the distribution of generalization estimates converges to a Gaussian as the number of random test strategies used to compute the estimates grows.

We illustrate our approach for the three-choice IPD game. We first collect a sample of 50 base strategies $i$, which we obtain by randomly sampling from ${\cal S}$ with a uniform distribution. We also collect 1000 independent samples $S_{N}$ to compute 1000 estimates $\hat{G}_{i}(S_{N})$. Each random sample $S_{N}$ consists of $N$ test strategies drawn i.i.d. from ${\cal S}$ with a uniform distribution.

For each base strategy $i$, we directly estimate the true generalization performance $G_{i}$ from (2) and normalize $G_{i}(J)$ by taking $X_{i}(J)=G_{i}(J)-G_{i}$. We can then compute estimates of the variance and the third absolute moment of $X_{i}(J)$ with respect to $S_{N}$, i.e., for each strategy $i$, we have a 1000-sample estimate of $\hat{\sigma}_{i}^{2}$ and another 1000-sample estimate of $\hat{\rho}_{i}$.

From the Berry-Esseen theorem (10) [17], we can compute for each base strategy $i$ the deviation from the Gaussian
$$\epsilon={{0.7975\,\hat{\rho}_{i}}\over{\sqrt{N}\,\hat{\sigma}_{i}^{3}}}\eqno{(26)}$$
given different sample sizes of $S_{N}$. By systematically computing the error $\epsilon$ for $S_{N}$ with $N=\{50, 100, 200, 300, 400, 500, 1000, 2000, 3000, 4000, 5000, 10\,000, 50\,000\}$, we can observe how fast the distribution of generalization estimates converges to a Gaussian.

Since we do not know the true value of $\epsilon$, we take a pessimistic estimate of $\epsilon$. Both 1000-sample estimates of $\hat{\sigma}_{i}^{2}$ and $\hat{\rho}_{i}$ are first rank-ordered in ascending order. A pessimistic estimate of $\epsilon$ is then obtained by taking a smaller value (2.5th percentile) of $\hat{\sigma}_{i}^{2}$ and a larger value (97.5th percentile) of $\hat{\rho}_{i}$.
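A sketch of this pessimistic estimate of (26), assuming the two lists hold the 1000-sample estimates described above; the percentile indexing is a simple rank-based choice for illustration.

    import math

    def pessimistic_epsilon(sigma_sq_estimates, rho_estimates, n_tests):
        """Pessimistic Berry-Esseen deviation (26): a low (2.5th percentile)
        variance estimate combined with a high (97.5th percentile) estimate
        of the third absolute moment."""
        sigma_sq = sorted(sigma_sq_estimates)[int(0.025 * len(sigma_sq_estimates))]
        rho = sorted(rho_estimates)[int(0.975 * len(rho_estimates)) - 1]
        return 0.7975 * rho / (math.sqrt(n_tests) * sigma_sq ** 1.5)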

Although we can directly compute the quantile intervals for $\hat{\sigma}_{i}^{2}$ from the $\chi^{2}$-distribution, we have only loose bounds for $\hat{\rho}_{i}$ (based on the inequality from [29, p. 210]), which would result in unnecessarily larger values in our pessimistic estimate of $\epsilon$. Our comparison of quantiles from a 1000-sample estimate of $\hat{\sigma}_{i}^{2}$ between empirical estimates and estimates obtained directly from the $\chi^{2}$-distribution (24) indicates an absolute difference of around 0.03 when $N=50$ (the smallest sample size we consider) and smaller differences for larger values of $N$ on average. Given the small absolute difference and the fact that we are already computing pessimistic estimates of $\epsilon$, we use empirically obtained quantiles for $\hat{\sigma}_{i}^{2}$ and $\hat{\rho}_{i}$ in subsequent experiments.2

Fig. 2 plots the results for the 50 strategies $i$, showing $\epsilon$ against $N$. Table I lists $\epsilon$ for different values of $N$ for ten strategies $i$. Naturally, increasing the sample size $N$ leads to decreasing values of $\epsilon$. However, there is a tradeoff between more robust estimation of generalization performance and increasing computational cost. Fig. 2 shows that $\epsilon$ decreases rapidly as $N$ increases from 50 to 1000, but starts to level off from around $N=1000$ onward. Table I suggests that at $N=2000$, $S_{N}$ would provide a sufficiently robust estimate of generalization performance for a reasonable computational cost, since one would need a five-fold increase of $N$ to $10\,000$ to reduce $\epsilon$ by half.3 Furthermore, since for non-pathological strategies,4 $\hat{\sigma}_{i}^{2}$ and $\hat{\rho}_{i}$ in (26) are finite moments bounded away from 0, for larger $N$, $\epsilon$ is dominated by the term $N^{-1/2}$. In our experiments, this implies that the tradeoff between more robust estimation of generalization and computational cost is roughly the same for most of the strategies.

Fig. 2. Pessimistic estimate of $\epsilon$ as a function of sample size $N$ of test strategies for 50 random base strategies $i$.
TABLE I PESSIMISTIC ESTIMATES OF $\epsilon$ FROM (26) FOR TEN STRATEGIES $i$

Leaving the previous analysis aside for a moment and assuming that estimates $\hat{G}_{i}({\cal S}_{N})$ for a base strategy $i$ are Gaussian-distributed, from (13), we obtain the error
$$\delta={{z_{\alpha/2}\,\hat{\sigma}_{i}}\over{\sqrt{N}}}.\eqno{(27)}$$

We compute the pessimistic estimate of $\delta$ by taking the 97.5th percentile of $\hat{\sigma}_{i}^{2}$ from the rank-ordered 1000-sample estimates. Our results for $\delta$ also indicate a tradeoff between more robust estimation and increasing computational cost, and suggest that $S_{N}$ at $N=2000$ would provide a sufficiently robust estimate of generalization performance for a reasonable computational expense. The results in Table II show that the absolute difference between $\epsilon$ and $\delta$ becomes smaller for larger sample sizes (e.g., at $N>1000$ the absolute difference is less than 0.01).
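For completeness, the pessimistic estimate of (27) can be sketched in the same style as the one for (26) above; the 97.5th-percentile choice mirrors the text, and the indexing is again illustrative.

    from statistics import NormalDist

    def pessimistic_delta(sigma_sq_estimates, n_tests, alpha=0.05):
        """Pessimistic error margin (27): z_{alpha/2} * sigma / sqrt(N) with a
        high (97.5th percentile) variance estimate."""
        sigma_sq = sorted(sigma_sq_estimates)[int(0.975 * len(sigma_sq_estimates)) - 1]
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return z * sigma_sq ** 0.5 / n_tests ** 0.5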

TABLE II STATISTICS OF $\{\vert\epsilon-\delta\vert\}_{i}$ FOR 50 STRATEGIES $i$

We also illustrate how two strategies can be compared with respect to their generalization performances through a test of statistical significance based on the normally distributed $Z$-statistic. For example, Table III shows numerical results for the $p$-values obtained from $Z$-tests directly using (16) to determine whether one strategy outperforms another with respect to a sample of random test strategies of size $N$. In this case, since the two strategies differ substantially in performance, a small sample of test strategies is sufficient to test for statistical significance (at around $N=400$, the $p$-values are smaller than the significance level of 0.05). Our experiments with other pairs of strategies with smaller performance differences indicate the need for a larger sample of test strategies to test for statistical significance.

TABLE III COMPUTED $p$-VALUES OF $Z$-TESTS TO DETERMINE WHETHER A STRATEGY $i$ OUTPERFORMS A STRATEGY $j$

We have illustrated examples of statistical estimation of generalization performance in co-evolutionary learning. Our studies indicate that the number of test strategies required for robust estimation (given a controlled level of precision) of generalization performance is smaller than predicted earlier using the distribution-free (Chebyshev) framework [12]. This has an obvious impact on the use of generalization estimates as a fitness measure in co-evolutionary learning, since estimations have to be repeated throughout the evolutionary process. Although we use the IPD game as an example, our theoretical framework presented in Section II can be applied to other, more complex problems or scenarios. The information we need to find and set the required number of test strategies for robust estimation involves only the second-order (variance) and third-order absolute moments, which can themselves be estimated. As an example, in Section IV we illustrate how the framework can be applied to the co-evolutionary learning of the more complex Othello game.

SECTION IV

USING THE NOTION OF STATISTICAL ESTIMATION OF GENERALIZATION PERFORMANCE AS FITNESS MEASURE IN CO-EVOLUTIONARY LEARNING

We will investigate the notion of directly using generalization estimates as a form of fitness measure in co-evolutionary learning. Ideally, we would like to directly estimate the true generalization performance of each evolved strategy. In this case, co-evolutionary learning would lead to the search for strategies with increasingly higher generalization performance since the selection of evolved strategies is based on their generalization performances.5 However, such direct estimation can be computationally expensive. Instead, we investigate the use of relatively small samples of test strategies to guide and improve the co-evolutionary search, following our earlier study of the number of test strategies required for robust estimation. We first study this new approach of directly using generalization estimates as the fitness measure for the co-evolutionary learning of the IPD game before applying it to the more complex game of Othello.

A. Co-Evolutionary Learning of IPD

1) Strategy Representation

Various strategy representations for the co-evolutionary learning of IPD have been studied in the past, e.g., the look-up table with bit-string encoding [3], finite state machines [37], [38], and neural networks [32], [35], [39], [40]. The study in [41] further investigated other forms of representation, such as cellular representations of finite state machines and Markov chains, among others, and their impact on the evolution of cooperation. We use the direct look-up table strategy representation [32], which directly represents IPD strategy behaviors through a one-to-one mapping between the genotype space (strategy representation) and the phenotype space (behaviors). The main advantage of using this representation is that the search space given by the strategy representation and the strategy space are the same (assuming a uniform strategy distribution in ${\cal S}$), which simplifies and allows direct investigation of the co-evolutionary search for strategies with higher generalization performance [12].

For a deterministic and reactive, memory-one $n$-choice IPD strategy, the direct look-up table representation takes the form of $m_{ij}$, $i,j=1,2,\ldots,n$, table elements that specify the choice to be made given the inputs $i$ (player's own previous choice) and $j$ (opponent's previous choice). The first move $m_{\rm fm}$ is specified independently rather than using pre-game inputs (two for memory-one strategies). $m_{ij}$ and $m_{\rm fm}$ can take any of the $n$ values (choices) used to produce the payoffs in the payoff matrix through a linear interpolation. Fig. 3 illustrates the direct look-up table representation for the three-choice IPD strategy [32], where each table element can take $+1$, 0, or $-1$.

Fig. 3. Direct look-up table representation for the deterministic and reactive memory-one IPD strategy that considers three choices (also includes $m_{\rm fm}$ for the first move, which is not shown in the figure).

Mutation is used to generate an offspring from a parent strategy when using the direct look-up table strategy representation [32]. Mutation replaces the original choice of an element in the direct look-up table with one of the remaining $n-1$ possible choices with an equal probability of $1/(n-1)$. Each element ($m_{ij}$ and $m_{\rm fm}$) has a fixed probability $p_{\rm m}$ of being replaced. This mutation provides sufficient variation of strategy behaviors directly through the direct look-up table representation (even for the more complex IPD game with more choices) [32].
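A minimal sketch of the representation and its mutation operator, under the assumption that the three choices are encoded as -1, 0, and +1 as in Fig. 3; the class name and interface are illustrative, not from the original study.

    import copy
    import random

    class DirectLookupStrategy:
        """Direct look-up table for a deterministic, reactive, memory-one
        n-choice IPD strategy: an n-by-n table m_ij of choices plus the
        independently specified first move m_fm."""

        def __init__(self, choices=(-1, 0, 1)):
            self.choices = list(choices)
            n = len(self.choices)
            self.table = [[random.choice(self.choices) for _ in range(n)]
                          for _ in range(n)]
            self.first_move = random.choice(self.choices)

        def move(self, own_prev=None, opp_prev=None):
            """Return the next choice given both players' previous choices."""
            if own_prev is None:  # first move of the game
                return self.first_move
            i = self.choices.index(own_prev)
            j = self.choices.index(opp_prev)
            return self.table[i][j]

        def offspring(self, p_m=0.05):
            """Copy the parent, then replace each element, with probability p_m,
            by one of the remaining n-1 choices chosen with equal probability."""
            child = copy.deepcopy(self)
            n = len(child.choices)
            for i in range(n):
                for j in range(n):
                    if random.random() < p_m:
                        child.table[i][j] = random.choice(
                            [c for c in child.choices if c != child.table[i][j]])
            if random.random() < p_m:
                child.first_move = random.choice(
                    [c for c in child.choices if c != child.first_move])
            return child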

2) Co-Evolutionary Learning Procedure

The following describes the classical co-evolutionary learning procedure [12], [32].

  1. Generation step, $t=1$. Initialize $\vert{\rm POP}\vert/2$ parent strategies $i=1,2,\ldots,\vert{\rm POP}\vert/2$ randomly.
  2. Generate $\vert{\rm POP}\vert/2$ offspring strategies $i=\vert{\rm POP}\vert/2+1,\vert{\rm POP}\vert/2+2,\ldots,\vert{\rm POP}\vert$ from the $\vert{\rm POP}\vert/2$ parent strategies through a mutation operator with $p_{\rm m}=0.05$.
  3. All pairs of strategies in the population POP compete, including the pair where a strategy plays itself (round-robin tournament). For $\vert{\rm POP}\vert$ strategies, every strategy competes in a total of $\vert{\rm POP}\vert$ games. The fitness of a strategy $i$ is ${{1}\over{\vert{\rm POP}\vert}}\sum_{j\in{\rm POP}}G_{i}(j)$.
  4. Select the best $\vert{\rm POP}\vert/2$ strategies based on fitness. Increment the generation step, $t\leftarrow t+1$.
  5. Steps 2-4 are repeated until the termination criterion (a fixed number of generations) is met.

All IPD games involve a fixed game length of 150 iterations. A fixed and sufficiently long duration for the evolutionary process ($t=300$) is used. As in [12], we observe how the generalization performance of co-evolutionary learning (we measure the generalization performance of the top performing evolved strategy) changes during the evolutionary process. All experiments are repeated in 30 independent runs to allow for statistical analysis.

The classical co-evolutionary learning (CCL) is used as a baseline for comparison with the improved co-evolutionary learning (ICL) that directly uses generalization performance estimates as the fitness measure. We first study a simple implementation of this co-evolutionary learning approach where the estimate $\hat{G}_{i}(S_{N})$ is directly used as the fitness of the evolved strategy $i$. The procedure for this new approach is the same as the baseline with the exception of Step 3, to allow a more direct comparison. We investigate the approach with $\vert{\rm POP_{ICL}}\vert=20$ and $S_{N}$ with different sample sizes $N=\{50, 500, 1000, 2000, 10\,000, 50\,000\}$. The sample $S_{N}$ is generated anew every generation. We use a sample size of $N=50\,000$ to provide a "ground truth" estimate close to the true generalization performance (based on the distribution-free framework) with which to compare results for cases where much smaller samples are used. For a more direct comparison, the baseline CCL uses $\vert{\rm POP_{CCL}}\vert=50$ since the experiments that use generalization estimates directly as fitness in ICL start with $N=50$.
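The only difference between the two fitness evaluations (Step 3) can be sketched as follows; play_game(i, j) returning the game outcome $G_{i}(j)$ is an assumed helper, and the sketch leaves out the selection and mutation steps, which are identical in both procedures.

    def ccl_fitness(population, play_game):
        """Relative fitness of CCL (Step 3): round-robin over the population,
        including self-play; the fitness of i is its mean game outcome."""
        return [sum(play_game(i, j) for j in population) / len(population)
                for i in population]

    def icl_fitness(population, test_sample, play_game):
        """ICL fitness: the generalization estimate (4) against a fresh random
        sample S_N of test strategies, used directly as the fitness of i."""
        return [sum(play_game(i, j) for j in test_sample) / len(test_sample)
                for i in population]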

B. Results and Discussion

Fig. 4 shows the results of our experiments. Each graph plots the true generalization performance $G_{i}$ of the top performing strategy of the population throughout the evolutionary process for all 30 independent runs. In particular, Fig. 4(a) shows that when co-evolutionary learning uses a fitness measure based on relative performance between competing strategies in the population, the search process can exhibit large fluctuations in the generalization performance of strategies throughout co-evolution. This is consistent with observations from previous studies such as [21], [22], and [32], where it has been shown that fluctuations in the generalization performance during co-evolution are due to overspecialization of the population to a specific strategy that is then replaced by other strategies that can exploit it. Results from our baseline experiment show that the use of a relative fitness measure does not necessarily lead to the co-evolutionary learning of strategies with increasingly higher generalization performance.

Fig. 4. Comparison of CCL and different ICLs for the three-choice IPD game. (a) CCL. (b) ICL-N50. (c) ICL-N500. (d) ICL-N2000. (e) ICL-N10000. (f) ICL-N50000. Shown are plots of the true generalization performance $G_{i}$ of the top performing strategy of the population throughout the evolutionary process for all 30 independent runs.

However, for all ICL experiments where estimates $\hat{G}_{i}(S_{N})$ are directly used as the fitness measure, no evolutionary run is observed to exhibit large fluctuations in generalization performance (Fig. 4). This is in contrast to CCL, where runs exhibit large fluctuations during co-evolution [Fig. 4(a)]. Starting with the case of a small sample of size 50, the search process of ICL-N50 ($S_{N}$, $N=50$) exhibits only small fluctuations in the generalization performance during co-evolution. These fluctuations are a result of sampling errors from using a small sample of test strategies to estimate $\hat{G}_{i}(S_{N})$, which can affect the ranking of strategies $i$ in the co-evolving population for selection.

Results from Fig. 4 suggest that when generalization estimates are directly used as the fitness measure, co-evolutionary learning converges to higher generalization performance. For example, when a sample of 500 test strategies is used to estimate $\hat{G}_{i}(S_{N})$, more evolutionary runs converge to higher generalization performance without fluctuations compared to the case when 50 test strategies are used. However, we do not observe significant differences at the end of the evolutionary runs for ICLs when the sample size is increased further, i.e., between ICL-N2000 and ICL-N50000 (Fig. 4). Closer inspection of the evolved strategies reveals that they play nearly "all defect" or are in fact "all defect" strategies. This observation is expected since the "all defect" strategy has the maximum generalization performance for the game outcome defined by (1).

We have also collected various statistics on the $G_{i}$ measurements of CCL and ICLs using different sample sizes in Table IV. The table shows that, starting from a small sample of 50 test strategies, the increase in the generalization performance of ICL is statistically significant in comparison to CCL. The generalization performance of ICL appears to have settled, with no significant increase when the sample size $N$ is increased from 500 to $50\,000$ (which is the sample size based on the distribution-free framework and close in number to all possible strategies for the three-choice IPD). The estimates $\hat{G}_{i}(S_{N})$ appear to be robust at small sample sizes of $S_{N}$ for guiding and improving co-evolutionary search to obtain strategies with high generalization performance. The co-evolutionary learning is also much faster since significantly smaller sample sizes (around an order of magnitude fewer test strategies) are sufficient to achieve similarly high generalization performance.

TABLE IV SUMMARY OF RESULTS FOR DIFFERENT CO-EVOLUTIONARY LEARNING APPROACHES FOR THE THREE-CHOICE IPD TAKEN AT THE FINAL GENERATION

At this point, we have compared only the generalization performance of the co-evolutionary learning that directly uses $\hat{G}_{i}(S_{N})$ with the classical co-evolutionary learning that uses a relative fitness measure. However, it is of interest to investigate co-evolutionary learning that uses a fitness measure consisting of a mixture of the two fitness values to determine the impact on generalization performance. We consider the simple implementation of a weighted sum of fitness measures
$${\rm fitness}_{i}=\eta\left({{1}\over{N}}\sum_{j\in S_{N}}G_{i}(j)\right)+(1-\eta)\left({{1}\over{\vert{\rm POP}\vert}}\sum_{k\in{\rm POP}}G_{i}(k)\right)\eqno{(28)}$$
where higher $\eta$ values give more weight to the contribution of the estimates $\hat{G}_{i}(S_{N})$ in the selection of evolved strategies. We investigate this approach where the estimate $\hat{G}_{i}(S_{N})$ is computed with $N=10\,000$ test strategies (to ensure a reasonable tradeoff between accuracy and computational expense) and $\eta$ at 0.25 (MCL25-N10000), 0.50 (MCL50-N10000), and 0.75 (MCL75-N10000).
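The mixed fitness (28) extends the two evaluations sketched earlier in a straightforward way (again assuming the play_game helper).

    def mixed_fitness(strategy, population, test_sample, play_game, eta):
        """Weighted-sum fitness (28): eta weights the generalization estimate
        against the test sample S_N, and (1 - eta) weights the relative fitness
        from the round-robin against the population."""
        gen_estimate = sum(play_game(strategy, j) for j in test_sample) / len(test_sample)
        relative = sum(play_game(strategy, k) for k in population) / len(population)
        return eta * gen_estimate + (1.0 - eta) * relative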

The results show that this co-evolutionary learning is able to search for strategies with high generalization performance (Fig. 5). However, the inclusion of relative fitness leads to fluctuations in the generalization performance of co-evolutionary learning. The fluctuations are smaller and localized around a high generalization performance when the contribution of relative fitness is reduced [Fig. 5(b)]. Our results suggest that the co-evolutionary search for strategies with high generalization performance is due to the estimate $\hat{G}_{i}(S_{N})$ that contributes to the fitness measure. Including relative fitness has no positive impact on the generalization performance of co-evolutionary learning.

Fig. 5. Different MCL-N10000s for the three-choice IPD game. (a) MCL25-N10000 ($\eta=0.25$). (b) MCL75-N10000 ($\eta=0.75$). Shown are plots of the true generalization performance $G_{i}$ of the top performing strategy of the population throughout the evolutionary process for all 30 independent runs.

C. Co-Evolutionary Learning of Othello

In this section, we demonstrate our new approach on a more complex problem. As an example, we show that ICL also improves on classical co-evolutionary learning for the more complex game of Othello. We can achieve similarly high generalization performance using estimates requiring an order of magnitude fewer test strategies than the distribution-free framework. We do not necessarily need larger samples of test strategies when applying the new approach to more complex games. Instead, we can find out in advance the required number of test strategies for robust estimation before applying ICL to the game of Othello.

1) Othello

Othello is a deterministic, perfect information, zero-sum board game played by two players (black and white) that alternately place their (same colored) pieces on an eight-by-eight board. The game starts with each player having two pieces already on the board, as shown in Fig. 6. In Othello, the black player starts the game by making the first move. A legal move is one where the new piece is placed adjacent horizontally, vertically, or diagonally to an opponent's existing piece [e.g., Fig. 6(b)] such that at least one of the opponent's pieces lies between the player's new piece and existing pieces [e.g., Fig. 6(c)]. The move is completed when the opponent's surrounded pieces are flipped over to become the player's pieces [e.g., Fig. 6(d)]. A player that cannot make a legal move forfeits the turn and passes the move to the opponent. The game ends when all the squares of the board are filled with pieces, or when neither player is able to make a legal move [7].

Fig. 6. Illustration of basic Othello moves. (a) Positions of the respective players' pieces at the start of the game. (b) Possible legal moves (indicated by black, crossed circles) at a later point of the game. (c) Black player selecting a legal move. (d) Black move is completed where surrounded white pieces are flipped over to become black pieces [7].

2) Strategy Representation

Among the strategy representations that have been studied for the co-evolutionary learning of Othello strategies (in the form of a board evaluation function) are weighted piece counters [42] and neural networks [7], [43]. We consider the simple strategy representation of a weighted piece counter in the following empirical study. This is to allow a more direct investigation of the impact of fitness evaluation in the co-evolutionary search of Othello strategies with higher generalization performance.

A weighted piece counter (WPC) representing the board evaluation function of an Othello game strategy can take the form of a vector of 64 weights, indexed as $w_{rc}$, $r=1,\ldots,8$, $c=1,\ldots,8$, where $r$ and $c$ represent the position indexes for rows and columns of the eight-by-eight Othello board, respectively. Let the Othello board state be the vector of 64 pieces, indexed as $x_{rc}$, $r=1,\ldots,8$, $c=1,\ldots,8$, where $r$ and $c$ represent the position indexes for rows and columns. $x_{rc}$ takes the value $+1$, $-1$, or 0 for a black piece, a white piece, and an empty square, respectively [42].

The WPC takes the Othello board state as input and outputs a value that gives the worth of the board state. This value is computed as
$${\rm WPC}({\mathbf x})=\sum_{r=1}^{8}\sum_{c=1}^{8}w_{rc}\,x_{rc}\eqno{(29)}$$
where a more positive value of ${\rm WPC}({\mathbf x})$ indicates the WPC's interpretation that the board state ${\mathbf x}$ is more favorable if the WPC is a black player, while a more negative value indicates that the board state ${\mathbf x}$ is more favorable if the WPC is a white player [42].

We consider a simple mutation operator, where the WPC weight of the offspring $w^{\prime}_{rc}$ is obtained by adding a small random value to the corresponding WPC weight of the parent $w_{rc}$
$$w^{\prime}_{rc}=w_{rc}+k\,F_{rc},\quad r=1,\ldots,8,\quad c=1,\ldots,8\eqno{(30)}$$
where $k$ is a scaling constant ($k=0.1$) and $F_{rc}$ is a real number randomly drawn from $[-1,1]$ with a uniform distribution and resampled for every combination of $r$ and $c$ (a total of 64 weights). For the experiments, we consider the space of Othello strategies given by the WPC representation with $w_{rc}\in[-10,10]$. This simple mutation operator can provide sufficient variation of the Othello game strategy represented in the form of a WPC evaluation function. We note that the choices for the various parameters are not optimized. The main emphasis of our study is to investigate the impact of generalization performance estimates used as the fitness measure in improving the generalization performance of co-evolutionary learning.
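A sketch of the WPC evaluation (29) and mutation (30), with the board and the weights stored as 8x8 lists; clamping mutated weights to the stated range $[-10,10]$ is our assumption about how the search space constraint is enforced, not a detail given in the text.

    import random

    def wpc_value(weights, board):
        """Board evaluation (29): dot product of the 64 WPC weights with the
        64 board entries (+1 black, -1 white, 0 empty)."""
        return sum(weights[r][c] * board[r][c] for r in range(8) for c in range(8))

    def wpc_offspring(weights, k=0.1, bound=10.0):
        """Mutation (30): add k * F_rc to every weight, with F_rc drawn uniformly
        from [-1, 1] and resampled per position."""
        return [[max(-bound, min(bound, weights[r][c] + k * random.uniform(-1.0, 1.0)))
                 for c in range(8)] for r in range(8)]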

3) Measuring Generalization Performance on Othello Strategies

Unlike the IPD game, which is symmetric, the Othello game is not necessarily symmetric, i.e., the black and white players may not have the same sets of available strategies [23]. In this case, we consider two estimates of generalization performance. We estimate the generalization performance of a black WPC through Othello game-plays against a random test sample of white WPCs. Conversely, we estimate the generalization performance of a white WPC through Othello game-plays against a random test sample of black WPCs. A random test sample of Othello WPCs is obtained through random sampling of $w_{rc}$ from $[-10,10]$ with a uniform distribution, resampled for every combination of $r$ and $c$. We use a random sample of $50\,000$ test WPCs (of the opposite color) to directly estimate the generalization performance of an evolved WPC since we cannot compute the true generalization performance.

4) Co-Evolutionary Learning Procedure

Given the approach we use to measure the generalization performance of evolved Othello WPCs, we repeat all experimental settings twice: once for black WPCs and once for white WPCs. For example, the CCL of black WPCs is described as follows.

  1. Generation step, $t=1$. Initialize $\vert{\rm POP}\vert/2$ parent strategies $i=1,2,\ldots,\vert{\rm POP}\vert/2$ randomly. For a ${\rm WPC}_{i}$, $w_{rc}^{i}$ is a real number sampled uniformly at random from $[-0.2,0.2]$ and resampled for every combination of $r$ and $c$.
  2. Generate $\vert{\rm POP}\vert/2$ offspring strategies $i=\vert{\rm POP}\vert/2+1,\vert{\rm POP}\vert/2+2,\ldots,\vert{\rm POP}\vert$ from the $\vert{\rm POP}\vert/2$ parent strategies through the mutation operator given by (30).
  3. All pairs of strategies in the population POP compete, including the pair where a strategy plays itself (round-robin tournament). For $\vert{\rm POP}\vert$ strategies, every strategy competes in a total of $\vert{\rm POP}\vert$ games. The fitness of a black WPC strategy $i$ is ${{1}\over{\vert{\rm POP}\vert}}\sum_{j\in{\rm POP}}G_{i}(j)$, where $G_{i}(j)$ is the game outcome to $i$ for an Othello game played by $i$ (black) and $j$ (white).
  4. Select the best $\vert{\rm POP}\vert/2$ strategies based on fitness. Increment the generation step, $t\leftarrow t+1$.
  5. Steps 2–4 are repeated until the termination criterion (i.e., a fixed number of generations) is met; a code sketch of this procedure is given after the list.
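The following sketch puts steps 1–5 together for the CCL of black WPCs, reusing the mutate and play_othello helpers sketched above. The population size, generation count, and the tie-breaking implicit in the sort are illustrative assumptions.

    import numpy as np

    def ccl_black(play_othello, pop_size=50, generations=200, rng=None):
        rng = rng or np.random.default_rng()
        # Step 1: |POP|/2 random parents with w_rc drawn uniformly from [-0.2, 0.2].
        parents = [rng.uniform(-0.2, 0.2, size=(8, 8)) for _ in range(pop_size // 2)]
        for _ in range(generations):
            # Step 2: one offspring per parent via the mutation operator (30).
            offspring = [mutate(p, rng=rng) for p in parents]
            pop = parents + offspring
            # Step 3: round-robin tournament (each strategy also plays itself);
            # fitness of black strategy i is its average outcome over |POP| games.
            fitness = [np.mean([play_othello(black, white) for white in pop])
                       for black in pop]
            # Step 4: keep the best |POP|/2 strategies as the next parents.
            best = np.argsort(fitness)[::-1][: pop_size // 2]
            parents = [pop[k] for k in best]
        return parents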

For the co-evolutionary learning of Othello, we consider a shorter evolutionary duration of 200 generations compared to the co-evolutionary learning of the IPD, due to the higher computational expense of a single Othello game compared to a single IPD game. All experiments are repeated in 30 independent runs to allow for statistical analysis.

ICLs with $\vert{\rm POP_{ICL}}\vert=20$ and different sample sizes $N=\{50, 500, 1000, 5000, 10\,000, 50\,000\}$ for estimating ${\hat{G}}_{i}(S_{N})$ are considered, while CCL with $\vert{\rm POP_{CCL}}\vert=50$ is used as a baseline for a more direct comparison. The sample size $N=50\,000$ provides a "ground truth" estimate close to the true generalization performance with which to compare results for cases where much smaller samples are used. Note that different samples $S_{N}$ are used to estimate ${\hat{G}}_{i}(S_{N})$ as the fitness measure in ICL and to estimate the generalization performance of ICL for analysis.
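In terms of implementation, the only change in ICL relative to CCL is the fitness assignment: each strategy's fitness is its estimate ${\hat{G}}_{i}(S_{N})$ against a random sample of $N$ test WPCs of the opposite color. Whether the sample is shared across the population or redrawn per strategy is not fixed here; sharing it, as sketched below, is our assumption.

    import numpy as np

    def icl_fitness_black(pop, n_tests, play_othello, rng=None):
        # Fitness of each black WPC in pop is the estimate G_hat_i(S_N):
        # its average outcome against a sample of n_tests white test WPCs.
        rng = rng or np.random.default_rng()
        tests = [random_test_wpc(rng) for _ in range(n_tests)]
        return [float(np.mean([play_othello(black, white) for white in tests]))
                for black in pop]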

5) Results and Discussion

Fig. 7 shows results for our experiments. Each graph plots the estimated generalization performance ${\hat{G}}_{i}(S_{N})$ ($N=50\,000$) of the top performing strategy of the population throughout the evolutionary process for all 30 independent runs. As with the CCL of the simpler IPD game [Fig. 4(a)], results for the CCL of black and white WPCs indicate a search process with large fluctuations in the generalization performance of strategies throughout co-evolution [Fig. 7(a) and (b)]. Our results suggest that co-evolutionary learning does not necessarily lead to Othello WPC strategies with increasingly higher generalization performance when a relative fitness measure is used.

Fig. 7. Comparison of CCL and different ICLs for the Othello game. (a) CCL black WPC. (b) CCL white WPC. (c) ICL-N500 black WPC. (d) ICL-N500 white WPC. (e) ICL-N50000 black WPC. (f) ICL-N50000 white WPC. Shown are plots of the estimated generalization performance ${\hat{G}}_{i}(S_{N})$ (with $N=50\,000$) of the top performing strategy of the population throughout the evolutionary process for all 30 independent runs.

When estimates ${\hat{G}}_{i}(S_{N})$ are directly used as the fitness measure in co-evolutionary learning, fluctuations in the generalization performance are reduced and the co-evolutionary search converges to higher generalization performance than in the case of CCL for the Othello game (Fig. 7). We observe that ICL can find WPCs with higher generalization performance, although small fluctuations can be seen during co-evolution when estimates ${\hat{G}}_{i}(S_{N})$ are computed using a small sample of 50 test strategies. A further increase in sample size leads to further improvements in the generalization performance of ICL, e.g., when $N=500$ [Fig. 7(c) and (d)].

There is a point beyond which a significant increase in the sample size does not bring about a significant increase in the generalization performance of ICL. For example, results for ICL-N5000 are similar to those of ICL-N50000 [Fig. 7(e) and (f)]. This is consistent with results from experiments to find the required number of test strategies for robust estimation of generalization performance (Fig. 8). The figure suggests a tradeoff at $N=5000$ for $S_{N}$ to provide sufficiently robust estimation at a reasonable computation cost, since substantially increasing $N$ to $50\,000$ would not lead to a significant decrease in the error $\epsilon$ for the Othello game.
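To illustrate the kind of diminishing returns behind this tradeoff, one can track how the precision of the estimate improves with $N$. The sketch below uses a plain normal-approximation confidence half-width as a simplified stand-in for the paper's pessimistic $\epsilon$ (which is derived from the Berry-Esseen analysis), so the numbers it produces are only indicative.

    import numpy as np

    def halfwidth(outcomes, z=1.96):
        # Normal-approximation 95% half-width of the mean game outcome:
        # a simplified proxy for the precision at this sample size.
        outcomes = np.asarray(outcomes, dtype=float)
        return z * outcomes.std(ddof=1) / np.sqrt(len(outcomes))

    # Example usage: compare precision for increasing sample sizes, where
    # `outcomes` holds game outcomes of one base strategy against test WPCs.
    # for n in (50, 500, 1000, 5000, 10_000, 50_000):
    #     print(n, halfwidth(outcomes[:n]))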

Fig. 8. Pessimistic estimate of $\epsilon$ as a function of sample size $N$ of test strategies for 50 random base strategies $i$ for the Othello game. (a) Black WPC. (b) White WPC.

Tables V and VI compare the generalization performance of CCL with that of the ICLs at the end of the generational runs for black and white WPCs, respectively. They show that there is a positive and significant impact on the generalization performance of co-evolutionary learning when estimates ${\hat{G}}_{i}(S_{N})$ are directly used as the fitness measure: the means over 30 runs are higher while the standard errors at the 95% confidence level are lower when comparing ICLs with CCL. In addition, results of controlled experiments for co-evolutionary learning with the fitness measure being a mixture of the estimate ${\hat{G}}_{i}(S_{N})$ and the relative fitness (28) indicate that when the contribution of the relative fitness is reduced while that of the estimate ${\hat{G}}_{i}(S_{N})$ is increased, higher generalization performance is obtained with smaller fluctuations throughout co-evolution. These results further support our observation from Fig. 7 that co-evolution converges to higher generalization performance without large fluctuations as a result of directly using the generalization estimate ${\hat{G}}_{i}(S_{N})$ as the fitness measure.

TABLE V SUMMARY OF RESULTS FOR DIFFERENT CO-EVOLUTIONARY LEARNING APPROACHES FOR BLACK OTHELLO WPC TAKEN AT THE FINAL GENERATION
TABLE VI SUMMARY OF RESULTS FOR DIFFERENT CO-EVOLUTIONARY LEARNING APPROACHES FOR WHITE OTHELLO WPC TAKEN AT THE FINAL GENERATION

Our empirical studies indicate that the use of generalization estimates directly as the fitness measure can have a positive and significant impact on the generalization performance of co-evolutionary learning for both the IPD and Othello games. The new approach (ICL) can obtain strategies with higher generalization performance without large performance fluctuations and is faster than the case where the distribution-free framework is used, requiring an order of magnitude smaller number of test strategies to achieve similarly high generalization performance. More importantly, it is not necessary to use larger samples of test strategies when applying ICL to more complex games. One can observe the similarity in the rate at which the error $\epsilon$ decreases with increasing sample size $N$ for both the IPD and Othello games (Figs. 2 and 8), and subsequently the similarity of the impact of using ${\hat{G}}_{i}(S_{N})$ on the generalization performance of ICL (Figs. 4 and 7). We stress that one can use our approach to find and set the required number of test strategies for robust estimation in a principled manner before applying ICL to a new game.

We do note that there are many issues related to the design of co-evolutionary learning systems for high performance. For example, design issues can be problem-specific and involve representation, variation, and selection operators [4], [7], as well as more sophisticated system development incorporating domain knowledge [44] that has the potential to provide superior solutions compared to other learning approaches [8]. We only address the issue of selection (generalization estimates used to guide co-evolutionary search) in a principled manner that can also be implemented practically. Although fine-tuning parameters such as the mutation rate and population size can have an impact on our numerical results, our general observations would hold. In addition, various selection and variation approaches can have different impacts on generalization performance in co-evolutionary learning for different real-world problems (and games in particular). Here, it is of interest to use common tools for rigorous quantitative analysis, such as the generalization measures we have formulated in [12]. As an example, we have previously studied both generalization estimates using an unbiased sample of random test strategies (obtained through uniform sampling of ${\cal S}$) and a biased sample of random test strategies that are superior in game-play and more likely to be encountered in a competitive setting (obtained through a multiple partial enumerative search). We have also recently started a preliminary investigation of the impact of diversity on the generalization performance of co-evolutionary learning [45].

SECTION V

CONCLUSION

We have addressed the issue of loose confidence bounds associated with the distribution-free (Chebyshev's) framework we formulated earlier for the estimation of generalization performance in co-evolutionary learning and demonstrated in the context of game-playing. Although Chebyshev's bounds hold for any distribution of game outcomes, they lead to high computational requirements, i.e., a large sample of random test strategies is needed to estimate the generalization performance of a strategy as average game outcomes against test strategies. In this paper, we take advantage of the near-Gaussian nature of average game outcomes (generalization performance estimates) through the central limit theorem and provide tighter bounds based on parametric testing. Furthermore, we can strictly control the condition (sample size under a given precision) under which the distribution of average game outcomes converges to a Gaussian through the Berry-Esseen theorem.

These improvements to our generalization framework provide the means with which we develop a general and principled approach to improve generalization performance in co-evolutionary learning that can be implemented as an efficient algorithm. Ideally, co-evolutionary learning using the true generalization performance directly as the fitness measure would be able to search for solutions with higher generalization performance. However, direct estimation of the true generalization performance using the distribution-free framework can be computationally expensive. Our new theoretical contributions that exploit the near-Gaussian nature of generalization estimates provide the means with which we can now: 1) find out in a principled manner the required number of test cases for robust estimations of generalization performance, and 2) subsequently use the small sample of random test cases to compute generalization estimates of solutions directly as the fitness measure to guide and improve co-evolutionary learning.

We have demonstrated our approach on the co-evolutionary learning of the IPD and the more complex Othello game. Our new approach is shown to improve on the classical approach in that we can obtain increasingly higher generalization performance using relatively small samples of test strategies and without the large performance fluctuations typical of the classical approach. Our new approach also leads to a faster co-evolutionary search where we can strictly control the condition (sample sizes) under which the speedup is achieved (not at the cost of weakening precision in the estimates). It is much faster than the distribution-free framework approach as it requires an order of magnitude smaller number of test strategies to achieve similarly high generalization performance for both the IPD and Othello games. Note that our approach does not depend on the complexity of the game; that is, no assumption needs to be made about the complexity of the game and how it may affect the required number of test strategies for robust estimations of generalization performance.

This paper is a first step toward understanding and developing theoretically motivated frameworks of co-evolutionary learning that can lead to improvements in the generalization performance of solutions. There are other research issues relating to generalization performance in co-evolutionary learning that need to be addressed. Although our generalization framework makes no assumption on the underlying distribution of test cases $(P_{\cal S})$, we have demonstrated one application where $P_{\cal S}$ in the generalization measure is fixed and known a priori, and generalization estimates are directly used as the fitness measure to improve the generalization performance of co-evolutionary learning (in effect, reformulating the approach as evolutionary learning). There are problems where such an assumption has to be relaxed, and it is of interest to us in future studies to formulate naturally and precisely co-evolutionary learning systems where the population acting as test samples can adapt to approximate a particular distribution that solutions should generalize to.

ACKNOWLEDGMENT

The authors would like to thank Prof. S. Lucas and Prof. T. Runarsson for providing access to their Othello game engine that was used for the experiments of this paper.

Footnotes

This work was supported in part by the Engineering and Physical Sciences Research Council, under Grant GR/T10671/01 on “Market Based Control of Complex Computational Systems.”

S. Y. Chong is with the School of Computer Science, University of Nottingham, Semenyih 43500, Malaysia. He is also with the Automated Scheduling, Optimization and Planning Research Group, School of Computer Science, University of Nottingham, Nottingham NG8 1BB, U.K. (e-mail: siang-yew.chong@nottingham.edu.my).

P. Tiňo is with the School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K. (e-mail: p.tino@cs.bham.ac.uk).

D. C. Ku is with the Faculty of Information Technology, Multimedia University, Cyberjaya 63100, Malaysia (e-mail: dcku@mmu.edu.my).

X. Yao is with the Center of Excellence for Research in Computational Intelligence and Applications, School of Computer Science, University of Birmingham, Edgbaston, Birmingham B15 2TT, U.K. (e-mail: x.yao@cs.bham.ac.uk).

1There is a trivial inequality for the third moment [29, p. 210] that however leads to rather broad bounds.

2The non-parametric quantile estimation is performed in the usual manner on ordered samples. Uniform approximation of the true distribution function by an empirical distribution function based on sample values is guaranteed, e.g., by the Glivenko-Cantelli theorem [29], [36].

3Note that because we take a pessimistic estimate of $\epsilon$, it is possible that the computed value of $\epsilon$ is greater than the real one, especially for small sample sizes (e.g., strategy #7 at $N=50$).

4By pathological strategies we mean strategies with very little variation of game outcomes when playing against a wide variety of opponent test strategies.

5In the same way, the selection process in Pareto co-evolution [24] is based on the Pareto-dominance relationship, although establishing such a relationship requires competing solutions to interact with (solve) a sample of test cases.


Authors

Siang Yew Chong

Siang Yew Chong (M'99) received the B.Eng. (Honors) and M.Eng.Sc. degrees in electronics engineering from Multimedia University, Melaka, Malaysia, in 2002 and 2004, respectively, and the Ph.D. degree in computer science from the University of Birmingham, Edgbaston, Birmingham, U.K., in 2007.

He was a Research Associate with the Center of Excellence for Research in Computational Intelligence and Applications, School of Computer Science, University of Birmingham, in 2007. Currently, he is an Honorary Research Fellow with the School of Computer Science, University of Birmingham. He joined the School of Computer Science, University of Nottingham, Semenyih, Malaysia, in 2008, and is currently a member of the Automated Scheduling, Optimization and Planning Research Group, School of Computer Science, University of Nottingham, Nottingham, U.K. He has co-edited the book The Iterated Prisoners' Dilemma: 20 Years On (Singapore: World Scientific Press, 2007). His current research interests include evolutionary computation, machine learning, and game theory.

Dr. Chong received the Outstanding Ph.D. Dissertation Award from the IEEE Computational Intelligence Society in 2009 for his work on co-evolutionary learning. He was awarded the Student Travel Grant for the 2003 Congress on Evolutionary Computation. He is a recipient of the 2011 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION Outstanding Paper Award, with P. Tiňo and X. Yao, for his work on co-evolutionary learning published in 2008. He is an Associate Editor of the IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES.

Peter Tiňo

Peter Tiňo received the M.S. degree from the Slovak University of Technology, Bratislava, Slovakia, in 1988, and the Ph.D. degree from the Slovak Academy of Sciences, Bratislava, in 1997.

From 1994 to 1995, he was a Fulbright Fellow with the NEC Research Institute, Princeton, NJ. He was a Post-Doctoral Fellow with the Austrian Research Institute for AI, Vienna, Austria, from 1997 to 2000, and a Research Associate with Aston University, Birmingham, U.K., from 2000 to 2003. From 2003 to 2006, he was a Lecturer with the School of Computer Science, University of Birmingham, Edgbaston, Birmingham, U.K., and has been a Senior Lecturer there since 2007. His current research interests include probabilistic modeling and visualization of structured data, statistical pattern recognition, dynamical systems, evolutionary computation, and fractal analysis.

Dr. Tiňo received the Fulbright Fellowship in 1994 and the U.K.–Hong Kong Fellowship for Excellence in 2008. He received the Outstanding Paper of the Year Award from the IEEE TRANSACTIONS ON NEURAL NETWORKS with T. Lin, B. G. Horne, and C. L. Giles in 1998 for his work on recurrent neural networks. He received the 2002 Best Paper Award at the International Conference on Artificial Neural Networks with B. Hammer. He is on the editorial board of several journals.

Day Chyi Ku

Day Chyi Ku received the B.Eng. (Honors) degree in electrical and electronics engineering and the M.S. degree in computer systems engineering from University Putra Malaysia, Selangor, Malaysia, in 2000 and 2002, respectively, and the Ph.D. degree in engineering design from Brunel University, London, U.K., in 2007.

From 2002 to 2003, she was a Lecturer with the Faculty of Information Science and Technology, Multimedia University (MMU), Melaka, Malaysia. She was then with Frontier Developments Ltd., Trinity College, University of Cambridge, Cambridge, U.K., to work on the next-generation game engine during her final year of Ph.D. studies, in 2006. Since 2008, she has been a Lecturer with the Faculty of Information Technology, MMU, Cyberjaya, Malaysia. Her current research interests include computer graphics, human-computer interaction, and games.

Dr. Ku received the Vice Chancellor's Travel Prize, Brunel University, for the 2006 Eurographics.

Xin Yao

Xin Yao (M'91–SM'96–F'03) received the B.S. degree from the University of Science and Technology of China (USTC), Hefei, China, in 1982, the M.S. degree from the North China Institute of Computing Technology, Beijing, China, in 1985, and the Ph.D. degree from USTC in 1990.

From 1985 to 1990, he was an Associate Lecturer and Lecturer with USTC, while working toward the Ph.D. degree. He was a Post-Doctoral Fellow with the Computer Sciences Laboratory, Australian National University, Canberra, Australia, in 1990, and continued his work on simulated annealing and evolutionary algorithms. He joined the Knowledge-Based Systems Group, CSIRO Division of Building, Construction and Engineering, Melbourne, Australia, in 1991, where he worked primarily on an industrial project on automatic inspection of sewage pipes. He returned to Canberra in 1992 to take up a lectureship with the School of Computer Science, University College, University of New South Wales, Australian Defense Force Academy, Canberra, where he was later promoted to a Senior Lecturer and Associate Professor. Attracted by the English weather, he moved to the University of Birmingham, Edgbaston, Birmingham, U.K., as a Professor (Chair) with the School of Computer Science in 1999. Currently, he is the Director of the Center of Excellence for Research in Computational Intelligence and Applications, a Distinguished Visiting Professor with USTC, and a Visiting Professor with three other universities. His major research interests include evolutionary computation, neural network ensembles, and real-world applications. He has more than 350 refereed publications in these areas.

Dr. Yao received the President's Award for Outstanding Thesis from the Chinese Academy of Sciences for his Ph.D. work on simulated annealing and evolutionary algorithms. He received the 2001 IEEE Donald G. Fink Prize Paper Award for his work on evolutionary artificial neural networks. From 2003 to 2008, he was the Editor-in-Chief of the IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION. He is an Associate Editor or an editorial board member of 12 international journals. He is the Editor of the World Scientific Book Series on Advances in Natural Computation. He has given more than 60 invited keynote and plenary speeches at international conferences worldwide.
