IEEE Transactions on Neural Networks

Issue 2 • Feb. 2011

  • Table of contents

    Page(s): C1
  • IEEE Transactions on Neural Networks publication information

    Page(s): C2
  • Optimized Discriminative Kernel for SVM Scoring and Its Application to Speaker Verification

    Page(s): 173 - 185

    The decision-making process of many binary classification systems is based on the likelihood ratio (LR) scores of test patterns. This paper shows that LR scores can be expressed in terms of the similarity between the supervectors (SVs) formed by stacking the mean vectors of Gaussian mixture models corresponding to the test patterns, the target model, and the background model. By interpreting support vector machine (SVM) kernels as a specific similarity (or discriminant) function between SVs, this paper shows that LR scoring is a special case of SVM scoring and that most sequence kernels can be obtained by assuming a specific form for the similarity function of SVs. This paper further shows that this assumption can be relaxed to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate the positive and negative classes using either regression analysis or SVM training. The idea was applied to both high- and low-level speaker verification. In both cases, results show that the proposed kernels achieve better performance than several state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.

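    As a concrete illustration of the supervector view described above, the sketch below builds GMM supervectors on synthetic data and scores a test utterance with a plain linear kernel. This is a minimal sketch only: the crude refit in `supervector` stands in for MAP adaptation, the score is a simple difference of similarities, and the paper's optimized discriminative kernel is not reproduced.

```python
# Hypothetical GMM-supervector scoring sketch (not the paper's method).
import numpy as np
from sklearn.mixture import GaussianMixture

def supervector(frames, ubm):
    """Stack the mean vectors of a GMM fit to `frames`, initialized at the
    universal background model (a stand-in for MAP adaptation)."""
    gmm = GaussianMixture(n_components=ubm.n_components,
                          means_init=ubm.means_, max_iter=5, random_state=0)
    gmm.fit(frames)
    return gmm.means_.ravel()

rng = np.random.default_rng(0)
bg = rng.normal(size=(500, 4))            # background data
target = rng.normal(1.0, 1.0, (200, 4))   # target-speaker data
test = rng.normal(1.0, 1.0, (100, 4))     # test utterance

ubm = GaussianMixture(n_components=8, random_state=0).fit(bg)
s_bg, s_tgt, s_tst = (supervector(x, ubm) for x in (bg, target, test))

# LR-style score as a difference of linear-kernel similarities between the
# test supervector and the target/background supervectors
score = s_tst @ s_tgt - s_tst @ s_bg
print("score:", score)
```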
  • Count Data Modeling and Classification Using Finite Mixtures of Distributions

    Page(s): 186 - 198

    In this paper, we consider the problem of constructing accurate and flexible statistical representations for count data, which we often confront in many areas such as data mining, computer vision, and information retrieval. In particular, we analyze and compare several generative approaches widely used for count data clustering, namely multinomial, multinomial Dirichlet, and multinomial generalized Dirichlet mixture models. Moreover, we propose a clustering approach via a mixture model based on a composition of the Liouville family of distributions, from which we select the Beta-Liouville distribution, and the multinomial. The novel proposed model, which we call multinomial Beta-Liouville mixture, is optimized by deterministic annealing expectation-maximization and minimum description length, and strives to achieve high accuracy in count data clustering and model selection. An important feature of the multinomial Beta-Liouville mixture is that it has fewer parameters than the recently proposed multinomial generalized Dirichlet mixture. The performance evaluation is conducted through a set of extensive empirical experiments, which concern text and image texture modeling and classification and shape modeling, and highlights the merits of the proposed models and approaches.

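    For reference, the sketch below runs EM for a plain multinomial mixture over count vectors, the standard baseline the paper compares against. It is a minimal sketch of that baseline only; the multinomial Beta-Liouville mixture and the deterministic annealing / MDL machinery are not reproduced.

```python
# Minimal EM for a multinomial mixture over count data (baseline sketch).
import numpy as np

def multinomial_mixture_em(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                      # mixing weights
    theta = rng.dirichlet(np.ones(d), size=k)     # per-component multinomials
    for _ in range(iters):
        # E-step: responsibilities from log-likelihoods (the multinomial
        # coefficient is constant per data point and can be ignored)
        logp = X @ np.log(theta).T + np.log(pi)   # shape (n, k)
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reweighted counts
        pi = r.mean(axis=0)
        theta = (r.T @ X) + 1e-9
        theta /= theta.sum(axis=1, keepdims=True)
    return pi, theta, r

X = np.random.default_rng(1).poisson(3.0, size=(200, 10))  # toy count data
pi, theta, r = multinomial_mixture_em(X, k=3)
labels = r.argmax(axis=1)   # hard cluster assignments
```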
  • Domain Adaptation via Transfer Component Analysis

    Page(s): 199 - 210

    Domain adaptation allows knowledge from a source domain to be transferred to a different but related target domain. Intuitively, discovering a good feature representation across domains is crucial. In this paper, we first propose to find such a representation through a new learning method, transfer component analysis (TCA), for domain adaptation. TCA tries to learn some transfer components across domains in a reproducing kernel Hilbert space using maximum mean discrepancy. In the subspace spanned by these transfer components, data properties are preserved and data distributions in different domains are close to each other. As a result, with the new representations in this subspace, we can apply standard machine learning methods to train classifiers or regression models in the source domain for use in the target domain. Furthermore, in order to uncover the knowledge hidden in the relations between the data labels from the source and target domains, we extend TCA to a semisupervised learning setting, which encodes label information into transfer components learning. We call this extension semisupervised TCA. The main contribution of our work is that we propose a novel dimensionality reduction framework for reducing the distance between domains in a latent space for domain adaptation. We propose both unsupervised and semisupervised feature extraction approaches, which can dramatically reduce the distance between domain distributions by projecting data onto the learned transfer components. Finally, our approach can handle large datasets and naturally lead to out-of-sample generalization. The effectiveness and efficiency of our approach are verified by experiments on five toy datasets and two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.

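    The sketch below shows unsupervised TCA in its commonly reported form: the transfer components are the leading eigenvectors of (KLK + μI)⁻¹KHK, where L encodes the maximum mean discrepancy between domains and H is the centering matrix. Hyperparameters (mu, gamma, dim) here are arbitrary choices, not values from the paper.

```python
# Minimal unsupervised TCA sketch (standard formulation, arbitrary params).
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def tca(Xs, Xt, dim=2, mu=1.0, gamma=1.0):
    n1, n2 = len(Xs), len(Xt)
    n = n1 + n2
    K = rbf_kernel(np.vstack([Xs, Xt]), gamma=gamma)
    # MMD coefficient matrix L: +1/n1^2 within source, +1/n2^2 within
    # target, -1/(n1*n2) across domains
    e = np.concatenate([np.full(n1, 1.0 / n1), np.full(n2, -1.0 / n2)])
    L = np.outer(e, e)
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    A = np.linalg.solve(K @ L @ K + mu * np.eye(n), K @ H @ K)
    vals, vecs = np.linalg.eig(A)
    W = np.real(vecs[:, np.argsort(-np.real(vals))[:dim]])
    Z = K @ W                                      # shared-subspace features
    return Z[:n1], Z[n1:]

rng = np.random.default_rng(0)
Zs, Zt = tca(rng.normal(0.0, 1.0, (50, 5)),        # source domain
             rng.normal(0.5, 1.2, (60, 5)))        # shifted target domain
```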
  • Mixing Matrix Estimation From Sparse Mixtures With Unknown Number of Sources

    Page(s): 211 - 221

    In blind source separation, many methods have been proposed to estimate the mixing matrix by exploiting sparsity. However, they often need to know the source number a priori, which is very inconvenient in practice. In this paper, a new method, namely nonlinear projection and column masking (NPCM), is proposed to estimate the mixing matrix. A major advantage of NPCM is that it does not need any knowledge of the source number. In NPCM, the objective function is based on a nonlinear projection, and its maxima correspond exactly to the columns of the mixing matrix. Thus each column can be estimated by locating a maximum and then deflated by a masking operation. This procedure is repeated until the value of the objective function drops sharply to zero, so the mixing matrix and the number of sources are estimated simultaneously. Because the masking procedure may result in some small and useless local maxima, particle swarm optimization (PSO) is introduced to optimize the objective function. Feasibility and efficiency of PSO are also discussed. Comparative experimental results show the efficiency of NPCM, especially when the number of sources is unknown and the sources are only weakly sparse.

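    The toy sketch below illustrates the project-and-deflate idea on a two-sensor mixture: sparse observations concentrate along the columns of the mixing matrix, so each column appears as a peak of a smooth angular objective and is then masked out, and the loop stops when the objective collapses. The objective, thresholds, and grid search here are illustrative stand-ins; the paper's actual nonlinear projection and its PSO optimizer are not reproduced.

```python
# Toy project-and-deflate illustration (not the NPCM objective or PSO).
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.2, -0.5],
              [0.3, 1.0, 0.9]])                     # unknown 2 x 3 mixing matrix
S = rng.laplace(size=(3, 2000)) * (rng.random((3, 2000)) < 0.1)
X = A @ S                                           # sparse two-sensor mixtures
X = X[:, np.abs(X).sum(axis=0) > 1e-6]              # drop all-zero observations
U = X / np.linalg.norm(X, axis=0)                   # normalize to the unit circle
U = U * np.where(U[0] < 0, -1.0, 1.0)               # fold antipodal directions

grid = np.linspace(0.0, np.pi, 2000)                # candidate directions
D = np.stack([np.cos(grid), np.sin(grid)])          # 2 x 2000
cols, mask = [], np.ones(U.shape[1], dtype=bool)
for _ in range(6):                                  # at most 6 deflation rounds
    # smooth "projection" objective: large where many points align with d
    vals = np.exp(200.0 * (np.abs(D.T @ U[:, mask]) - 1.0)).sum(axis=1)
    if vals.max() < 5.0:                            # objective collapsed: stop
        break
    d = D[:, vals.argmax()]
    cols.append(d)                                  # one estimated column
    mask &= np.abs(d @ U) < 0.999                   # mask points near the column
print(np.column_stack(cols))   # compare with the normalized columns of A;
print(len(cols))               # the count also estimates the source number
```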
  • Dynamic Channel Assignment for Large-Scale Cellular Networks Using Noisy Chaotic Neural Network

    Page(s): 222 - 232

    This paper presents a novel dynamic channel assignment (DCA) technique for large-scale cellular networks (LCNs) using a noisy chaotic neural network. In this technique, an LCN is first decomposed into many subnets, which are designated as decomposed cellular subnets (DCSs). The DCA process is performed independently in every subnet to alleviate the signaling overheads and to apportion the DCA computational load among the subnets. Then a novel energy function is formulated to avoid mutual interference among neighboring subnets based on the real-time interference channel table. In each subnet, the proposed energy function also satisfies three interference constraints among cells and the number of required channels of each cell, and simultaneously minimizes the total number of assigned channels to improve spectrum utilization. A typical 441-cell LCN with 70 available channels, which can be decomposed into nine 49-cell DCSs, is examined in terms of blocking probability, under both uniform and hot-spot traffic patterns, to demonstrate the validity of the proposed technique.

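    For intuition about the underlying constraint problem, the toy sketch below assigns channels greedily on a 3x3 cell grid while respecting co-cell and adjacent-cell channel separations and keeping the total span small. This is a plain greedy baseline for illustration only; the paper's noisy chaotic neural network and energy-function formulation are not reproduced, and the demands and separations are made up.

```python
# Toy greedy channel assignment on a 3x3 cell grid (not the paper's method).
import numpy as np

n_cells, n_channels = 9, 20
demand = np.array([3, 2, 4, 2, 3, 2, 4, 3, 2])        # channels needed per cell
# compat[i, j] = required channel separation between cells i and j
compat = np.zeros((n_cells, n_cells), dtype=int)
for a in range(n_cells):
    for b in range(n_cells):
        if a == b:
            compat[a, b] = 2                          # co-cell separation
        elif abs(a // 3 - b // 3) + abs(a % 3 - b % 3) == 1:
            compat[a, b] = 1                          # adjacent cells must differ

assigned = [[] for _ in range(n_cells)]
for i in np.argsort(-demand):                         # hardest cells first
    while len(assigned[i]) < demand[i]:
        # lowest channel satisfying every separation constraint
        # (a real DCA system would also handle infeasibility)
        ch = next(c for c in range(n_channels)
                  if all(abs(c - u) >= compat[i, j]
                         for j in range(n_cells) for u in assigned[j]))
        assigned[i].append(ch)

print(assigned)
print("channels used:", 1 + max(c for a in assigned for c in a))
```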
  • Graph Characterization via Ihara Coefficients

    Page(s): 233 - 245

    The novel contributions of this paper are twofold. First, we demonstrate how to characterize unweighted graphs in a permutation-invariant manner using the polynomial coefficients from the Ihara zeta function, i.e., the Ihara coefficients. Second, we generalize the definition of the Ihara coefficients to edge-weighted graphs. For an unweighted graph, the Ihara zeta function is the reciprocal of a quasi characteristic polynomial of the adjacency matrix of the associated oriented line graph. Since the Ihara zeta function has poles that give rise to infinities, the most convenient numerically stable representation is to work with the coefficients of the quasi characteristic polynomial. Moreover, the polynomial coefficients are invariant to vertex order permutations and also convey information concerning the cycle structure of the graph. To generalize the representation to edge-weighted graphs, we make use of the reduced Bartholdi zeta function. We prove that the computation of the Ihara coefficients for unweighted graphs is a special case of our proposed method for unit edge weights. We also present a spectral analysis of the Ihara coefficients and indicate their advantages over other graph spectral methods. We apply the proposed graph characterization method to capture graph-class structure and to cluster graphs. Experimental results reveal that the Ihara coefficients are more effective than methods based on Laplacian spectra.

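    Following the unweighted construction the abstract describes, the coefficients can be read off the characteristic polynomial of the oriented line graph's adjacency operator (the arc-to-arc matrix with backtracking forbidden). The sketch below is a minimal unweighted-case illustration; the edge-weighted Bartholdi generalization is not reproduced.

```python
# Ihara-style coefficients from the oriented line graph (unweighted sketch).
import numpy as np

def ihara_coefficients(edges):
    """Characteristic-polynomial coefficients of the adjacency matrix of
    the oriented line graph of an undirected, unweighted graph."""
    arcs = edges + [(v, u) for u, v in edges]        # both orientations
    m = len(arcs)
    T = np.zeros((m, m))
    for i, (a, b) in enumerate(arcs):
        for j, (c, d) in enumerate(arcs):
            if b == c and d != a:                    # consecutive, no backtrack
                T[i, j] = 1.0
    return np.poly(T)                                # det(uI - T), highest first

# a 4-vertex graph with minimum degree 2, so the zeta function is defined
print(ihara_coefficients([(0, 1), (1, 2), (2, 0), (0, 3), (3, 2)]))
```

    Because the coefficients are functions of the spectrum of T, they are invariant to vertex relabeling, which is what makes them usable as permutation-invariant graph features.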
  • Logistic Regression by Means of Evolutionary Radial Basis Function Neural Networks

    Page(s): 246 - 263

    This paper proposes a hybrid multilogistic methodology, named logistic regression using initial and radial basis function (RBF) covariates. The process for obtaining the coefficients is carried out in three steps. First, an evolutionary programming (EP) algorithm is applied to produce an RBF neural network (RBFNN) with a reduced number of RBF transformations and the simplest structure possible. Then, the initial attribute space (or, as it is commonly known in the logistic regression literature, the covariate space) is transformed by adding the nonlinear transformations of the input variables given by the RBFs of the best individual in the final generation. Finally, a maximum likelihood optimization method determines the coefficients associated with a multilogistic regression model built in this augmented covariate space. In this final step, two different multilogistic regression algorithms are applied: one considers all initial and RBF covariates (multilogistic initial-RBF regression) and the other incrementally constructs the model and applies cross validation, resulting in an automatic covariate selection [simplelogistic initial-RBF regression (SLIRBF)]. Both methods include a regularization parameter, which is also optimized. The methodology proposed is tested using 18 benchmark classification problems from well-known machine learning repositories and two real agronomical problems. The results are compared with the corresponding multilogistic regression methods applied to the initial covariate space, to the RBFNNs obtained by the EP algorithm, and to other probabilistic classifiers, including different RBFNN design methods [e.g., relaxed variable kernel density estimation, support vector machines, a sparse classifier (sparse multinomial logistic regression)] and a procedure similar to SLIRBF but using product unit basis functions. The SLIRBF models are found to be competitive when compared with the corresponding multilogistic regression methods and the RBFEP method. A measure of statistical significance is used, which indicates that SLIRBF reaches the state of the art.

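    The augmented-covariate idea itself is easy to demonstrate: append RBF transformations of the inputs to the original covariates and fit a regularized multilogistic model on the enlarged space. In the minimal sketch below, k-means centers stand in for the paper's evolutionary programming step, and the kernel width and regularization are arbitrary.

```python
# Logistic regression on initial + RBF covariates (k-means stands in for EP).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_iris(return_X_y=True)
centers = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X).cluster_centers_
Phi = rbf_kernel(X, centers, gamma=0.5)        # RBF transformations of inputs
X_aug = np.hstack([X, Phi])                    # initial + RBF covariate space
clf = LogisticRegression(C=1.0, max_iter=2000).fit(X_aug, y)
print("train accuracy:", clf.score(X_aug, y))
```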
  • Guiding Hidden Layer Representations for Improved Rule Extraction From Neural Networks

    Page(s): 264 - 275

    The production of relatively large and opaque weight matrices by error backpropagation learning has inspired substantial research on how to extract symbolic human-readable rules from trained networks. While considerable progress has been made, the results at present are still relatively limited, in part due to the large numbers of symbolic rules that can be generated. Most past work to address this issue has focused on progressively more powerful methods for rule extraction (RE) that try to minimize the number of weights and/or improve rule expressiveness. In contrast, here we take a different approach in which we modify the error backpropagation training process so that it learns a different hidden layer representation of input patterns than would normally occur. Using five publicly available datasets, we show via computational experiments that the modified learning method helps to extract fewer rules without increasing individual rule complexity and without decreasing classification accuracy. We conclude that modifying error backpropagation so that it more effectively separates learned pattern encodings in the hidden layer is an effective way to improve contemporary RE methods.

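    One simple way to realize the general idea of steering hidden representations is to add a penalty that pushes hidden activations toward 0/1, so that learned encodings separate cleanly and are easier to convert into rules. The sketch below implements that variant from scratch; it is an assumed illustration of the concept, not the paper's specific modification.

```python
# MLP trained with an added binarization penalty beta * h * (1 - h) on the
# hidden layer (an assumed stand-in for the paper's modified backprop).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]   # toy binary target

nh, beta, lr = 8, 0.1, 0.5
W1 = rng.normal(0, 0.5, (4, nh)); b1 = np.zeros(nh)
W2 = rng.normal(0, 0.5, (nh, 1)); b2 = np.zeros(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):
    h = sig(X @ W1 + b1)                             # hidden encodings
    p = sig(h @ W2 + b2)
    dz2 = (p - y) * p * (1 - p) / len(X)             # backprop of the MSE term
    dh = dz2 @ W2.T + beta * (1 - 2 * h) / h.size    # + binarization penalty
    dz1 = dh * h * (1 - h)
    W2 -= lr * h.T @ dz2; b2 -= lr * dz2.sum(0)
    W1 -= lr * X.T @ dz1; b1 -= lr * dz1.sum(0)

# with the penalty, most hidden activations saturate near 0 or 1
print("fraction of near-binary activations:", np.mean((h < 0.05) | (h > 0.95)))
```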
  • Multiconlitron: A General Piecewise Linear Classifier

    Page(s): 276 - 289

    Based on the “convexly separable” concept, we present a solid geometric theory and a new general framework for designing piecewise linear classifiers for two arbitrarily complicated nonintersecting classes by using a “multiconlitron”: a union of multiple conlitrons, each of which comprises a set of hyperplanes (linear functions) surrounding a convex region and separating two convexly separable datasets. We propose a new iterative algorithm, the cross distance minimization algorithm (CDMA), to compute hard-margin non-kernel support vector machines (SVMs) via the nearest point pair between two convex polytopes. Using CDMA, we derive two new algorithms, the support conlitron algorithm (SCA) and the support multiconlitron algorithm (SMA), to construct support conlitrons and support multiconlitrons, respectively, which are unique and can separate two classes by a maximum margin, as in an SVM. Comparative experiments show that SMA can outperform linear SVM on many of the selected databases and provide results similar to radial basis function SVM on some of them, while SCA performs better than linear SVM on three out of four applicable databases. Other experiments show that SMA and SCA may be further improved, suggesting further potential in the new research direction of piecewise linear learning.

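    The geometric core is classical: for linearly separable classes, the maximum-margin separator can be read off the nearest point pair of the two convex hulls. The sketch below finds that pair with a generic constrained solve via scipy rather than the paper's CDMA iteration, which is not reproduced here.

```python
# Maximum-margin separator from the nearest convex-hull point pair
# (generic QP via scipy; the paper's CDMA iteration is not reproduced).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
P = rng.normal([0, 0], 0.5, (30, 2))       # class +1
Q = rng.normal([3, 3], 0.5, (30, 2))       # class -1

def nearest_hull_points(P, Q):
    nP, nQ = len(P), len(Q)
    # minimize ||sum_i a_i P_i - sum_j b_j Q_j||^2 over the two simplices
    f = lambda w: np.sum((w[:nP] @ P - w[nP:] @ Q) ** 2)
    cons = [{"type": "eq", "fun": lambda w: w[:nP].sum() - 1},
            {"type": "eq", "fun": lambda w: w[nP:].sum() - 1}]
    w0 = np.concatenate([np.full(nP, 1 / nP), np.full(nQ, 1 / nQ)])
    res = minimize(f, w0, bounds=[(0, 1)] * (nP + nQ), constraints=cons)
    return res.x[:nP] @ P, res.x[nP:] @ Q

p, q = nearest_hull_points(P, Q)
w = p - q                                   # maximum-margin normal vector
b = (p + q) @ w / 2                         # hyperplane through the midpoint
print("margin:", np.linalg.norm(p - q) / 2)
print("train accuracy:", np.mean(np.sign(np.vstack([P, Q]) @ w - b)
                                 == np.r_[np.ones(30), -np.ones(30)]))
```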
  • Client–Server Multitask Learning From Distributed Datasets

    Page(s): 290 - 303

    A client-server architecture to simultaneously solve multiple learning tasks from distributed datasets is described. In this architecture, each client corresponds to an individual learning task and its associated dataset of examples. The goal of the architecture is to perform information fusion from multiple datasets while preserving the privacy of individual data. The role of the server is to collect data in real time from the clients and codify the information in a common database. This information can be used by all the clients to solve their individual learning tasks, so that each client can exploit the information content of all the datasets without actually having access to the private data of others. The proposed algorithmic framework, based on regularization and kernel methods, uses a suitable class of “mixed effect” kernels. The methodology is illustrated through a simulated recommendation system, as well as an experiment involving pharmacological data coming from a multicentric clinical trial.

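    The "mixed effect" kernel idea can be sketched directly: blend a shared component, active across all tasks, with a task-specific component, active only within a task, and use the result in an ordinary kernel method. Below is a minimal kernel ridge regression sketch under that assumption; the client-server protocol and the paper's exact kernel class are not reproduced, and the blend weight lam is arbitrary.

```python
# Multitask kernel ridge regression with a mixed-effect kernel (sketch).
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def mixed_effect_kernel(X1, t1, X2, t2, lam=0.5, gamma=1.0):
    K = rbf_kernel(X1, X2, gamma=gamma)
    same_task = (t1[:, None] == t2[None, :]).astype(float)
    # shared component for all pairs + extra component within a task
    return lam * K + (1 - lam) * K * same_task

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (120, 1))
tasks = np.repeat([0, 1, 2], 40)                       # three "clients"
shift = np.array([0.0, 0.3, -0.3])[tasks]              # task-specific effect
y = np.sin(3 * X[:, 0]) + shift + 0.1 * rng.normal(size=120)

K = mixed_effect_kernel(X, tasks, X, tasks)
alpha = np.linalg.solve(K + 0.1 * np.eye(len(X)), y)   # kernel ridge solve
Xq = np.linspace(-1, 1, 5)[:, None]
Kq = mixed_effect_kernel(Xq, np.full(5, 1), X, tasks)  # predictions for task 1
print(Kq @ alpha)
```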
  • Learning Ensembles of Neural Networks by Means of a Bayesian Artificial Immune System

    Page(s): 304 - 316

    In this paper, we apply an immune-inspired approach to design ensembles of heterogeneous neural networks for classification problems. Our proposal, called Bayesian artificial immune system, is an estimation of distribution algorithm that replaces the traditional mutation and cloning operators with a probabilistic model, more specifically a Bayesian network, representing the joint distribution of promising solutions. Among the additional attributes provided by the Bayesian framework inserted into an immune-inspired search algorithm are the automatic control of the population size along the search and the inherent ability to promote and preserve diversity among the candidate solutions. Both are attributes generally absent from alternative estimation of distribution algorithms, and both were shown to be useful attributes when implementing the generation and selection of components of the ensemble, thus leading to high-performance classifiers. Several aspects of the design are illustrated in practical applications, including a comparative analysis with other attempts to synthesize ensembles.

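    For readers unfamiliar with estimation of distribution algorithms, the toy loop below selects ensemble members by repeatedly sampling candidate subsets, keeping the promising ones, and refitting the probabilistic model to them. An independent-Bernoulli model and a made-up fitness stand in for the paper's Bayesian network and classifier evaluation.

```python
# Toy estimation-of-distribution loop for ensemble member selection
# (independent Bernoulli model stands in for the Bayesian network).
import numpy as np

rng = np.random.default_rng(0)
n_members, pop = 12, 40
member_acc = rng.uniform(0.6, 0.9, n_members)          # assumed member scores

def fitness(mask):
    k = mask.sum()
    if k == 0:
        return 0.0
    # toy score: mean member accuracy, mild reward for size, diversity cost
    return member_acc[mask.astype(bool)].mean() + 0.05 * np.log(k) - 0.02 * k

p = np.full(n_members, 0.5)                            # inclusion probabilities
for _ in range(30):
    popu = (rng.random((pop, n_members)) < p).astype(int)   # sample candidates
    scores = np.array([fitness(m) for m in popu])
    elite = popu[np.argsort(-scores)[: pop // 4]]      # promising solutions
    p = 0.5 * p + 0.5 * elite.mean(axis=0)             # refit the model
print("selected members:", np.where(p > 0.5)[0])
```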
  • Objective Functions of Online Weight Noise Injection Training Algorithms for MLPs

    Page(s): 317 - 323

    Injecting weight noise during training has been a simple strategy for improving the fault tolerance of multilayer perceptrons (MLPs) for almost two decades, and several online training algorithms have been proposed in this regard. However, there are some misconceptions about the objective functions being minimized by these algorithms. Some existing results mistakenly equate the objective function of a weight noise injection algorithm with the prediction error of a trained MLP affected by weight noise. In this brief, we clarify these misconceptions. Two weight noise injection scenarios are considered: one based on additive weight noise injection and the other based on multiplicative weight noise injection. To avoid the misconceptions, we use their mean updating equations to analyze the objective functions. For additive weight noise injection, we show that the true objective function is identical to the prediction error of a faulty MLP whose weights are affected by additive weight noise; it consists of the conventional mean square error and a smoothing regularizer. For multiplicative weight noise injection, we show that the objective function differs from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise. With these results, some existing misconceptions regarding MLP training with weight noise injection can now be resolved.

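    The training procedure under discussion is simple to state in code: at each online step, noise is added to the weights, the gradient is computed at the noisy weights, and the update is applied to the clean weights. The sketch below shows the additive scenario on a linear model; per the brief, this behaves like minimizing the mean square error plus a smoothing regularizer.

```python
# Online training with additive weight noise injection (linear-model sketch).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.05 * rng.normal(size=500)

w, lr, sigma = np.zeros(3), 0.05, 0.1
for t in range(5000):
    i = rng.integers(len(X))
    w_noisy = w + sigma * rng.normal(size=3)   # inject additive weight noise
    err = X[i] @ w_noisy - y[i]                # error at the noisy weights
    w -= lr * err * X[i]                       # update the clean weights
print(w)   # close to w_true; the noise acts as implicit regularization
```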
  • Stabilizing Effects of Impulses in Discrete-Time Delayed Neural Networks

    Page(s): 323 - 329

    This brief studies the global exponential stability of the equilibrium point of discrete-time delayed Hopfield neural networks (DHNNs) with impulse effects by using difference inequalities. We consider the stabilizing effects of impulses even when the corresponding impulse-free DHNN is not asymptotically stable. The obtained results characterize the aggregated effects of the impulses and of the deviation of the impulse-free DHNN from its equilibrium point on the exponential stability of the whole system. It is shown that, because of the effects of impulses, the impulsive discrete-time DHNN may be exponentially stable even if the evolution of the impulse-free component deviates from its equilibrium point exponentially.

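    The stabilizing mechanism is easy to see in a toy simulation: a scalar delayed recursion that diverges on its own is pulled back by contracting impulses applied at fixed intervals. All parameters below are illustrative only and are not drawn from the brief's stability conditions.

```python
# Toy simulation: impulses stabilize an otherwise unstable delayed recursion.
import numpy as np

a, b, tau = 1.05, 0.1, 3          # free dynamics grow (rate exceeds 1)
c, N = 0.3, 5                     # impulse strength and impulse interval
x = [1.0] * (tau + 1)             # constant initial history
for k in range(tau, 200):
    x_next = a * x[k] + b * np.tanh(x[k - tau])   # delayed Hopfield-style step
    if (k + 1) % N == 0:
        x_next *= c               # contracting impulse every N steps
    x.append(x_next)
print("final |x|:", abs(x[-1]))   # decays despite the unstable free dynamics
```

    Per impulse interval the free dynamics grow by at most roughly a^N, so the net contraction c * a^N < 1 here, which is the aggregated-effect trade-off the brief formalizes.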
  • Exponential Synchronization of Linearly Coupled Neural Networks With Impulsive Disturbances

    Page(s): 329 - 336

    This brief investigates globally exponential synchronization for linearly coupled neural networks (NNs) with time-varying delay and impulsive disturbances. Since the impulsive effects discussed in this brief are regarded as disturbances, the impulses should not happen too frequently; the concept of average impulsive interval is used to formalize this requirement. By referring to an impulsive delay differential inequality, we investigate the globally exponential synchronization of linearly coupled NNs with impulsive disturbances. The derived sufficient condition is closely related to the time delay, impulse strengths, average impulsive interval, and coupling structure of the systems. The obtained criterion is given in terms of an algebraic inequality that is easy to verify, and hence our result is valid for large-scale systems. The results extend and improve upon earlier work. As a numerical example, a small-world network composed of impulsively coupled chaotic delayed NN nodes is given to illustrate the theoretical result.

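    A toy two-node version of the phenomenon: coupled chaotic maps that synchronize exponentially are perturbed by occasional desynchronizing impulses, and as long as the impulses are sparse enough the synchronization error stays small between kicks. This is an illustrative discrete-time stand-in, not the brief's delayed NN model, and all parameters are made up.

```python
# Toy illustration: synchronization under sparse impulsive disturbances.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: 4 * x * (1 - x)                # chaotic logistic map
x, y = 0.3, 0.7
errs = []
for k in range(300):
    cx = x + 0.45 * (y - x)                  # diffusive (linear) coupling
    cy = y + 0.45 * (x - y)
    x, y = f(cx), f(cy)
    if k % 25 == 0:                          # sparse impulsive disturbance
        y = min(1.0, max(0.0, y + 0.05 * rng.normal()))
    errs.append(abs(x - y))
print("late-time sync error:", np.mean(errs[-50:]))  # small: impulses absorbed
```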
  • Call for papers IEEE Transactions on Neural Networks Special Issue: Online Learning in Kernel Methods

    Page(s): 336
  • IEEE Computational Intelligence Society Information

    Page(s): C3
  • IEEE Transactions on Neural Networks Information for authors

    Page(s): C4

Aims & Scope

IEEE Transactions on Neural Networks is devoted to the science and technology of neural networks, publishing significant technical knowledge, exploratory developments, and applications of neural networks, from biology to software to hardware.

 

This Transactions ceased production in 2011. The current retitled publication is IEEE Transactions on Neural Networks and Learning Systems.
