MLSP2018
PROGRAM

Many thanks to everyone who has contributed to the success of MLSP 2018!

The MLSP 2018 Best Student Paper Awards go to:

Eero Siivola, Aki Vehtari, Javier González, Jarno Vanhatalo and Michael Riis Andersen. Correcting Boundary Over-Exploration Deficiencies In Bayesian Optimization With Virtual Derivative Sign Observations.

Ismail Senoz and Bert De Vries. Online Variational Message Passing In The Hierarchical Gaussian Filter.

Jesper L. Hinrich, Søren F. V. Nielsen, Kristoffer H. Madsen and Morten Mørup. Variational Bayesian Partially Observed Non-Negative Tensor Factorization.

The best student papers were selected by the Best Paper Award Selection Committee, composed of three distinguished researchers:

Simo Särkkä, Aalto University, Finland
David Miller, The Pennsylvania State University
Jim Reilly, McMaster University

The program book: Click here

The independent one-day Satellite Workshop on Machine Learning for Audio Processing has been cancelled.

Presentation and Session Chair Guidelines: Click here


Monday September 17, 2018
9:00-17:00, 2nd floor: Registration
13:15-13:30, Kilden, 2nd floor: Opening Greeting
Chair: Zheng-Hua Tan, Aalborg University, Denmark
13:30-14:30, Kilden, 2nd floor: Keynote Lecture: The Bayesian Bonus: Benefits of Being Bayesian in the Deep Learning Era

Max Welling
University of Amsterdam, the Netherlands

Chair: Børge Lindberg, Aalborg University, Denmark

14:30-15:00, 2nd floor: Coffee Break
15:00-16:30, Kilden, 2nd floor: Tutorial: Opening the Black Box - How to Interpret Machine Learning Functions and Their Decisions

Lars Kai Hansen and Laura Rieger
Section for Cognitive Systems, DTU Compute Technical University of Denmark, Denmark

Chair: Ulisses M. Braga-Neto, Texas A&M University, USA

16:45-18:15, Kilden, 2nd floor: Tutorial: Bayesian Filtering and Smoothing Methods for Machine Learning

Simo Särkkä
Aalto University, Finland

Chair: Zhanyu Ma, Beijing University of Posts and Telecommunications, China

18:15-18:40, Kilden, 2nd floor: MLSP 2018 Data Competition: Map Synchronization (video)
Yuxin Chen, Yuejie Chi, Junting Dong, Qixing Huang, Kaixiang Lei, Haoyun Wang, Qianqian Wang, Kun Xu, Xiaowei Zhou
19:00-20:00, Ground floor: Welcome Reception Hosted by the City of Aalborg


Tuesday September 18, 2018
8:30-12:00, 2nd floor: Registration
8:30-9:00, 2nd floor: Welcome Refreshment
9:00-10:00, Kilden, 2nd floor: Keynote Lecture: End-to-end Speech Recognition Systems Explored

Dong Yu
Tencent AI Lab, Seattle, USA

Chair: Zheng-Hua Tan, Aalborg University, Denmark

10:00-10:30, 2nd floor: Coffee Break
10:30-12:30, Kilden, 2nd floor: Lecture Session 1: Speech and audio processing
Chair: Tomoko Matsui, The Institute of Statistical Mathematics, Japan

10:30

In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised non-negative matrix factorization (NMF). More precisely, we use a variational autoencoder as a speaker-independent supervised generative speech model, highlighting the conceptual similarities that this approach shares with its NMF-based counterpart. In order to be free of generalization issues regarding the noisy recording environments, we follow the approach of having a supervised model only for the target speech signal, the noise model being based on unsupervised NMF. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the variational autoencoder and estimating the unsupervised model parameters. Experiments show that the proposed method outperforms a semi-supervised NMF baseline and a state-of-the-art fully supervised deep learning approach.
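For reference alongside the NMF-based speech variance model discussed above, the following is a minimal sketch of non-negative matrix factorization with the classic Lee-Seung multiplicative updates on synthetic data; it illustrates the generic technique only, not the authors' variational autoencoder model, and the toy matrix sizes are assumptions.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Euclidean NMF via Lee-Seung multiplicative updates: V ≈ W @ H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis spectra
    return W, H

# Toy non-negative "spectrogram" with an exact rank-2 factorization.
rng = np.random.default_rng(1)
V = rng.random((20, 2)) @ rng.random((2, 30))
W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The multiplicative form keeps W and H non-negative by construction, which is why it is the standard baseline update rule for spectrogram models of this kind.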
 
10:50
Logical Access Attacks Detection Through Audio Fingerprinting In Automatic Speaker Verification
Juan Manuel Espín López, Roberto Font Ruiz, Javier Gómez Marín-Blazquez, Francisco Esquembre Martínez

Automatic Speaker Verification (ASV) is being implemented in many applications, where maximum security and robustness against attacks must be guaranteed. One of the most challenging attacks that an ASV system can face is the so-called "logical access attack", in which the attacker has the possibility to directly inject a compromised audio sample into the system. The development of countermeasures for this kind of attack has received little attention to date. When the injected audio is identical to a sample previously seen by the system, current audio fingerprinting techniques can detect most of these attacks. However, we show that, with trivial modifications that do not require any special signal processing knowledge, the audio can bypass these countermeasures while keeping the ability of the ASV system to authenticate the user. To address this issue, we propose an alternative method and validate it against a variety of audio perturbations. It generalizes previous fingerprinting techniques and acquires robustness against changes of tempo.
 
11:10

In this paper, we aim to understand what makes replay spoofing detection difficult in the context of the ASVspoof 2017 corpus. We use FFT spectra, mel frequency cepstral coefficients (MFCC) and inverted MFCC (IMFCC) frontends and investigate different backends based on Convolutional Neural Networks (CNNs), Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs). On this database, we find that IMFCC frontend based systems show smaller equal error rate (EER) for high quality replay attacks but higher EER for low quality replay attacks in comparison to the baseline. However, we find that it is not straightforward to understand the influence of an acoustic environment (AE), a playback device (PD) and a recording device (RD) of a replay spoofing attack. One reason is the unavailability of metadata for genuine recordings. Second, it is difficult to account for the effects of the factors: AE, PD and RD, and their interactions. Finally, our frame-level analysis shows that the presence of cues (recording artefacts) in the first few frames of genuine signals (missing from replayed ones) influence class prediction.
 
11:30

Traditional deep denoising autoencoders (DDAE) use magnitude domain features and training targets to separate speech from background noise. Phase enhancement, however, has recently been shown to improve perceptual and objective speech quality. We present an approach that uses a DDAE to estimate phase-aware training targets from phase-aware input features. This network is denoted as a phase-aware deep denoising autoencoder (paDDAE). The short-time Fourier transform (STFT) of noisy speech is the network input, and the network estimates a phase-aware time-frequency mask. The proposed approach is evaluated across multiple conditions, including various signal-to-noise ratios (SNRs), noise types, and speakers. The results show that the paDDAE offers improvements over traditional DDAEs in terms of objective speech quality and intelligibility.
 
11:50
Speech Emotion Recognition Using Cyclostationary Spectral Analysis
Amin Jalili, Sadid Sahami, Chong-Yung Chi, Rassoul Amirfattahi

Inspired by the modulated and non-stationary nature of speech signals, this paper proposes a new feature extraction scheme for speech emotion recognition (SER) using cyclostationary spectral analysis (CSA). This spectral analysis discloses the underlying first-order and second-order (hidden) periodicities in emotional speech signals using the estimated spectral correlation function (SCF) via the FAM algorithm. Experiments on the Berlin database of emotional speech (EmoDB) show that the proposed scheme using cyclostationary spectral features (CSFs) significantly outperforms state-of-the-art methods in terms of recognition accuracy.
 
12:10
Noise-Adaptive Deep Neural Network For Single-Channel Speech Enhancement
Hanwook Chung, Taesup Kim, Eric Plourde, Benoit Champagne

We introduce a noise-adaptive feed-forward deep neural network (DNN) for single-channel speech enhancement. The goal is to better exploit individual noise characteristics while training a spectral mapping DNN. To this end, we employ noise-dependent adaptation vectors, which are obtained based on the output of an auxiliary noise classification DNN, to adjust the weights and biases of the spectral mapping DNN. The parameters of the spectral mapping DNN, noise classification DNN and adaptation vectors are estimated jointly during the training stage. During the enhancement stage, we combine a classical unsupervised speech enhancement algorithm with the proposed DNN-based approach to further improve the enhanced speech quality. Experiments show that the proposed method provides better enhancement performance than the selected benchmark algorithms.
12:30-14:00, 1st floor: Lunch
14:00-16:00, Kilden, 2nd floor: Lecture Session 2: Bayesian learning and modeling
Chair: James Reilly, McMaster University, Canada

14:00

This paper shows that kernel-based estimates of unknown input-output maps can be complemented with uncertainty bounds more robust than those commonly derived in the Gaussian regression framework. This is obtained by using the kernel not to define Gaussian priors but a much vaster class of symmetric distributions. This class is then handled by extending the recently developed sign-perturbed sums (SPS) framework to the Bayesian setting.
 
14:20
Markov Recurrent Neural Networks
Che-Yu Kuo, Jen-Tzung Chien

Deep learning has achieved great success in many real-world applications. For speech and language processing, recurrent neural networks are learned to characterize sequential patterns and extract temporal information based on dynamic states which evolve through time and are stored as an internal memory. The traditional simple transition function, using input-to-hidden and hidden-to-hidden weights, is often insufficient. To strengthen the learning capability, it is crucial to explore the diversity of latent structure in sequential signals and learn the stochastic trajectory of signal transitions to improve sequential prediction. This paper proposes stochastic modeling of transitions in deep sequential learning. Our idea is to enhance the latent variable representation by discovering the Markov state transitions in sequential data based on a K-state long short-term memory (LSTM) model. Such a latent state machine is capable of learning the complicated latent semantics in highly structured and heterogeneous sequential data. Gumbel-softmax is introduced to implement the stochastic learning procedure with discrete states. Experimental results on visual and text language modeling illustrate the merit of the proposed stochastic transitions in sequential prediction with a limited number of parameters.
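The Gumbel-softmax trick mentioned in the abstract can be sketched in a few lines. This is a generic illustration with made-up logits, not the authors' K-state LSTM: Gumbel(0, 1) noise is added to the logits and a temperature-scaled softmax produces a relaxed one-hot sample.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """One relaxed one-hot sample; as tau -> 0 it approaches a discrete draw."""
    u = rng.uniform(1e-12, 1.0, size=len(logits))
    g = -np.log(-np.log(u))                  # Gumbel(0, 1) noise
    y = (np.asarray(logits, float) + g) / tau
    y = np.exp(y - y.max())                  # numerically stable softmax
    return y / y.sum()

rng = np.random.default_rng(0)
logits = np.log(np.array([0.7, 0.2, 0.1]))  # assumed 3-state probabilities
sample = gumbel_softmax(logits, tau=0.1, rng=rng)
# By the Gumbel-max argument, argmax frequencies follow the categorical law.
freq0 = np.mean([np.argmax(gumbel_softmax(logits, 0.1, rng)) == 0
                 for _ in range(4000)])
```

In a network, the relaxation makes the discrete state draw differentiable with respect to the logits, which is what allows end-to-end training of the stochastic transitions.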
 
14:40
Learning Stochastic Differential Equations With Gaussian Processes Without Gradient Matching
Cagatay Yildiz, Markus Heinonen, Jukka Intosalmi, Henrik Mannerström, Harri Lähdesmäki

We introduce a novel paradigm for learning non-parametric drift and diffusion functions for stochastic differential equation (SDE). The proposed model learns to simulate path distributions that match observations with non-uniform time increments and arbitrary sparseness, which is in contrast with gradient matching that does not optimize simulated responses. We formulate sensitivity equations for learning and demonstrate that our general stochastic distribution optimisation leads to robust and efficient learning of SDE systems.
 
15:00

Bayesian optimization is known to be a method of choice when it comes to solving optimization problems involving black-box, non-convex and low-dimensional functions in a few iterations. Yet, how to scale this method up to higher dimensions is a challenging and still unsolved research issue. In this paper, we first present and structure recent axes of research addressing this topic. We then experimentally compare three selected high-dimensional Bayesian optimization algorithms to random search on diverse high-dimensional functions. Our results suggest that no algorithm consistently outperforms the others across all types of difficulties encountered and that random search is in general very competitive, confirming recent research results.
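The random-search baseline that the abstract reports as very competitive is simple to reproduce. The sketch below uses an assumed toy objective (a 10-dimensional sphere function), not one of the paper's benchmark functions.

```python
import numpy as np

def random_search(f, bounds, n_eval=2000, seed=0):
    """Minimize f by uniform sampling within box bounds; return best (x, f(x))."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds).T
    best_x, best_f = None, np.inf
    for _ in range(n_eval):
        x = rng.uniform(lo, hi)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# 10-dimensional sphere function: global minimum 0 at the origin.
sphere = lambda x: float(np.sum(x ** 2))
bounds = [(-1.0, 1.0)] * 10
x_best, f_best = random_search(sphere, bounds)
```

Its appeal as a baseline is exactly this simplicity: no surrogate model, no acquisition optimization, and trivially parallel evaluations.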
 
15:20
Correcting Boundary Over-Exploration Deficiencies In Bayesian Optimization With Virtual Derivative Sign Observations
Eero Siivola, Aki Vehtari, Javier González, Jarno Vanhatalo, Michael Riis Andersen

Bayesian optimization (BO) is a global optimization strategy designed to find the minimum of an expensive black-box function, typically defined on a compact subset of R^d, by using a Gaussian process (GP) as a surrogate model for the objective. Although currently available acquisition functions address this goal with different degrees of success, an over-exploration effect of the contour of the search space is typically observed. However, in problems like the configuration of machine learning algorithms, the function domain is conservatively large and with a high probability the global minimum does not sit on the boundary of the domain. We propose a method to incorporate this knowledge into the search process by adding virtual derivative observations in the GP at the boundary of the search space. We use the properties of GPs to impose conditions on the partial derivatives of the objective. The method is applicable with any acquisition function, it is easy to use and consistently reduces the number of evaluations required to optimize the objective irrespective of the acquisition used. We illustrate the benefits of our approach in an extensive experimental comparison.
 
15:40
Computational Optimization For Normal Form Realization Of Bayesian Model Graphs
Giovanni Di Gennaro, Amedeo Buonanno, Francesco A. N. Palmieri

Bayesian networks in their Factor Graph Reduced Normal Form (FGrn) represent a very appealing paradigm for the realization of structures for probabilistic inference. Unfortunately, the computational and memory complexity of such networks remains high, especially if the network has to extend to large structures such as multi-layer and highly connected graphs. In this paper we focus on details of probability propagation and learning that can reduce such complexity. More specifically, we propose new algorithms and create a library that allows a significant reduction in costs with respect to direct use of the standard sum-product and Maximum Likelihood (ML) learning algorithms. Analysis and results are presented with reference to a Latent Variable Model (LVM).
16:00-16:30, 2nd floor: Coffee Break
16:30-18:30, Kilden, 2nd floor: Poster Session 1: Bayesian learning and modeling
Chair: Gianluigi Pillonetto, University of Padova, Italy

16:30
Model-Order Selection In Statistical Shape Models
Alma Eguizabal, Peter J. Schreier, David Ramírez

Statistical shape models enhance machine learning algorithms by providing prior information about deformation. A Point Distribution Model (PDM) is a popular landmark-based statistical shape model for segmentation. It requires choosing a model order, which determines how much of the variation seen in the training data is accounted for by the PDM. A good choice of the model order depends on the number of training samples and the noise level in the training data set. Yet the most common approach for choosing the model order simply keeps a predetermined percentage of the total shape variation. In this paper, we present a technique for choosing the model order based on information-theoretic criteria, and we show empirical evidence that the model order chosen by this technique provides a good trade-off between over- and underfitting.
 
16:30
Optimal Classifier Model Status Selection Using Bayes Boundary Uncertainty
David Ha, Emilie Delattre, Yuya Tomotoshi, Masahiro Senda, Hideyuki Watanabe, Shigeru Katagiri, Miho Ohsaki

We propose a method to select the optimal parameter status for any classifier model. In the statistical pattern recognition framework, optimal classification is defined as achieving the minimum classification error probability (Bayes error). Although the error probability is defined on infinite data, in practice only a finite amount of data is available. Using the same finite data for classifier training and evaluation provides a serious underestimate of the Bayes error. Traditional solutions consist in holding out some of the available data for evaluation, which unavoidably decreases the data available for either training or evaluation. By contrast, our proposed method uses the same data for training and evaluation in a single training without splitting, which is made possible by evaluating the ideality of the classifier's classification boundary instead of estimating the error probability. Here, ideal classification boundary (Bayes boundary) refers to the boundary that leads to the Bayes error. We use the fact that the Bayes boundary solely consists of uncertain samples, namely samples whose class posterior probability is equal for the two classes separated by the boundary. Tests on several real-life datasets and experimental comparison to Cross-Validation clearly show the potential of our method.
 
16:30

In this work, we appropriate the popular tool of Gaussian processes to solve the problem of reconstructing networks from time-series perturbation data. To this end, we propose a construction for multivariate Gaussian processes to describe the continuous-time trajectories of the states of the network entities. We then show that this construction admits a state-space representation for the network dynamics. By exploiting Kalman filtering techniques, we are able to infer the underlying network in a computationally efficient manner.
 
16:30
Space-Time Extension Of The MEM Approach For Electromagnetic Neuroimaging
Marie-Christine Roubaud, Jean-Marc Lina, Julie Carrier, Bruno Torrésani
×

The wavelet Maximum Entropy on the Mean (wMEM) approach to the MEG inverse problem is revisited and extended to infer brain activity from full space-time data. The resulting dimensionality increase is tackled using a collection of techniques, that includes time and space dimension reduction (using respectively wavelet and spatial filter based reductions), Kronecker product modeling for covariance matrices, and numerical manipulation of the free energy directly in matrix form. This leads to a smooth numerical optimization problem of reasonable dimension, solved using standard approaches. The method is applied to the MEG inverse problem. Results of a simulation study in the context of slow wave localization from sleep MEG data are presented and discussed.
 
16:30

We address the problem of online state and parameter estimation in hierarchical Bayesian nonlinear dynamic systems. We focus on the Hierarchical Gaussian Filter (HGF), which is a popular model in the computational neuroscience literature. For this filter, explicit equations for online state estimation (and offline parameter estimation) have been derived before. We extend this work by casting the HGF as a probabilistic factor graph and present variational message passing update rules that facilitate both online state and parameter estimation as well as online tracking of the free energy (or ELBO), which can be used as a proxy for Bayesian evidence. Due to the locality and modularity of the factor graph framework, our approach supports application of HGFs and variations as plug-in modules to a wide variety of dynamic modelling applications.
 
16:30

Neuroimage correspondence analysis is critical in applications that model neurodegenerative disease progression. Establishing meaningful relations between non-rigid objects such as brain structures poses a challenging topic in the bio-imaging signal processing field. In this paper, we introduce a novel nonlinear probabilistic latent variable model approach to infer shape correspondences of brain structures. To this end, we perform an unsupervised clustering process that is automatically carried out by a nonlinear kernelized probabilistic latent variable model. The kernel embeddings are accomplished by using random Fourier features as nonlinear mappings of 3D shape descriptors. We experimentally show how the model proposed can accurately establish meaningful relations between any pair of non-rigid shapes such as those brain structures related to the Alzheimer’s disease.
 
16:30

The total variation distance is a core statistical distance between probability measures that satisfies the metric axioms, with value always falling in $[0,1]$. Since the total variation distance does not admit closed-form expressions for statistical mixtures, one often has to rely in practice on costly numerical integrations or on fast Monte Carlo approximations that however do not guarantee deterministic bounds. In this work, we consider two methods for bounding the total variation of univariate mixture models: The first is based on the information monotonicity property of the total variation to design guaranteed nested deterministic lower bounds. The second method relies on computing the geometric lower and upper envelopes of weighted mixture components to derive deterministic bounds based on density ratio. We demonstrate the tightness of our bounds through simulating Gaussian, Gamma and Rayleigh mixture models.
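The costly numerical-integration route mentioned above can be sketched directly: evaluate $0.5\int|p-q|$ on a fine grid. This is a generic brute-force computation for assumed Gaussian mixtures, not the paper's deterministic bounds; the grid limits and mixture parameters are illustrative choices.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_pdf(x, weights, mus, sigmas):
    return sum(w * gauss_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

def tv_distance(p_params, q_params, lo=-20.0, hi=20.0, n=200_000):
    """Total variation 0.5 * integral of |p - q|, via a Riemann sum."""
    x = np.linspace(lo, hi, n)
    dx = x[1] - x[0]
    p = mixture_pdf(x, *p_params)
    q = mixture_pdf(x, *q_params)
    return 0.5 * np.sum(np.abs(p - q)) * dx

# Two-component mixture vs. a single Gaussian.
p = ([0.5, 0.5], [-2.0, 2.0], [1.0, 1.0])
q = ([1.0], [0.0], [1.0])
tv_pq = tv_distance(p, q)
# Sanity check against the closed form for two unit Gaussians with mean
# gap d: TV = 2*Phi(d/2) - 1, which is about 0.6827 for d = 2.
tv_gauss = tv_distance(([1.0], [0.0], [1.0]), ([1.0], [2.0], [1.0]))
```

The grid evaluation illustrates why deterministic bounds are attractive: the integration cost grows with the required accuracy and domain size, whereas bounds come with guarantees at fixed cost.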
 
16:30

In this paper, the connection between the Matérn kernel and scale mixtures of squared exponential kernels is explored. It is shown that the Matérn kernel can be approximated by a finite scale mixture of squared exponential kernels through a quadrature approximation which in turn allows for (i) state space approximations of the Matérn kernel for arbitrary smoothness parameters using established state space approximations of the squared exponential kernel and (ii) exact calculation of the Bayesian quadrature weights for the approximate kernel under a Gaussian measure. The method is demonstrated in inference in a log-Gaussian Cox process as well as in approximating a Gaussian integral arising from a financial problem using Bayesian quadrature.
 
16:30

Transfer learning is a framework that includes--among other topics--the design of knowledge transfer mechanisms between Bayesian filters. Transfer learning strategies in this context typically rely on a complete stochastic dependence structure being specified between the participating learning procedures (filters). This paper proposes a method that does not require such a restrictive assumption. The solution in this incomplete modelling case is based on the fully probabilistic design of an unknown probability distribution which conditions on knowledge in the form of an externally supplied distribution. We are specifically interested in the situation where the external distribution accumulates knowledge dynamically via Kalman filtering. Simulations demonstrate that the proposed algorithm outperforms alternative methods for transferring this dynamic knowledge from the external Kalman filter.
 
16:30
Causality Analysis Based On Matrix Transfer Entropy
Rongjin Ma, Badong Chen, Jianfeng Xiao, Jingli Shao

Transfer Entropy (TE) is one of the most commonly used methods to detect the causal relationship between a pair of time series. However, the computational complexity of the TE is very high, because its calculation needs to estimate the probability distribution of the variables. In order to solve this problem, we propose a new version of the TE based on the concept of Matrix Entropy (MT), called Matrix Transfer Entropy (MTE). MTE can be used for two variables with linear or non-linear causal relationships. Compared with the traditional TE, the new approach can achieve more robust results. Bypassing the estimation of the probability density functions (PDFs) of the variables, the computational complexity of the MTE is not high. Experimental results on two toy examples are provided to demonstrate the performance of the MTE. Additionally, the new method is applied to a real clinical dataset to analyze the cardiorespiratory causality.
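For intuition about the distribution-estimation step the abstract refers to, classical transfer entropy for discrete series can be computed with a plug-in (counting) estimator. This is the traditional baseline, not the authors' matrix-entropy method, and the delayed-copy toy data below is an assumption for illustration.

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate of TE_{X->Y} in bits for discrete series (lag 1)."""
    n = len(x) - 1
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))
    pairs_yx = Counter(zip(y[:-1], x[:-1]))
    pairs_yy = Counter(zip(y[1:], y[:-1]))
    singles = Counter(y[:-1])
    te = 0.0
    for (y1, y0, x0), c in triples.items():
        p_full = c / pairs_yx[(y0, x0)]             # p(y_{t+1} | y_t, x_t)
        p_self = pairs_yy[(y1, y0)] / singles[y0]   # p(y_{t+1} | y_t)
        te += (c / n) * np.log2(p_full / p_self)
    return te

rng = np.random.default_rng(0)
x = rng.integers(0, 2, 10_000)
y = np.empty_like(x)
y[0] = 0
y[1:] = x[:-1]                  # y copies x with a one-step delay
te_xy = transfer_entropy(x, y)  # strong X -> Y flow, close to 1 bit
te_yx = transfer_entropy(y, x)  # no Y -> X flow, close to 0 bits
```

The asymmetry te_xy >> te_yx is exactly the directed signature TE is designed to expose; the counting step is also where the computational burden criticized in the abstract comes from.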


Wednesday September 19, 2018
8:30-12:00, 2nd floor: Registration
8:30-9:00, 2nd floor: Welcome Refreshment
9:00-10:00, Kilden, 2nd floor: Keynote Lecture: Temporal models with low-rank spectrogram

Cédric Févotte
CNRS, Toulouse, France

Chair: Nelly Pustelnik, ENS Lyon, France

10:00-10:30, 1st and 2nd floor: Coffee Break
10:30-12:30, Kilden, 2nd floor: Lecture Session 3: Semi-supervised and Unsupervised learning
Chair: David Jonathan Miller, Penn State University, USA

10:30
Unsupervised Parsimonious Cluster-Based Anomaly Detection (PCAD)
David Jonathan Miller, George Kesidis, Zhicong Qiu

Group anomaly detection (AD), i.e. detection of clusters of anomalous samples in a test batch, with the samples in a given such cluster exhibiting a common pattern of atypicality (relative to a null model) has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, which is well-captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test (GLRT-like) group AD framework, with PMMs used for both the null and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.
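The BIC adjudication step described above can be illustrated on a toy one-dimensional example. Everything below is an assumption for illustration: a tiny EM fit of a Gaussian mixture stands in for the paper's parsimonious mixture models, and the synthetic "null" and "anomalous cluster" data are made up.

```python
import numpy as np

def gmm_loglik(x, k, n_iter=100, seed=0):
    """Log-likelihood of a k-component 1-D GMM fitted with a small EM loop."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)
    sigma = np.full(k, x.std())
    w = np.full(k, 1.0 / k)
    def densities():
        return w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (
            sigma * np.sqrt(2.0 * np.pi))
    for _ in range(n_iter):
        dens = densities()
        r = dens / dens.sum(axis=1, keepdims=True)   # E-step responsibilities
        nk = r.sum(axis=0)
        w = nk / len(x)                              # M-step updates
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 0.1)               # guard against collapse
    return float(np.log(densities().sum(axis=1)).sum())

def bic(loglik, n_params, n):
    return n_params * np.log(n) - 2.0 * loglik       # lower is better

rng = np.random.default_rng(0)
null = rng.normal(0.0, 1.0, 500)                     # no anomalous cluster
mixed = np.concatenate([rng.normal(0.0, 1.0, 400),
                        rng.normal(6.0, 0.5, 100)])  # anomalous cluster present
# A k-component 1-D GMM has 3k - 1 free parameters (means, sigmas, weights).
bic1_null = bic(gmm_loglik(null, 1), 2, len(null))
bic2_null = bic(gmm_loglik(null, 2), 5, len(null))
bic1_mix = bic(gmm_loglik(mixed, 1), 2, len(mixed))
bic2_mix = bic(gmm_loglik(mixed, 2), 5, len(mixed))
```

On the null data the penalty term rejects the extra component, while on the contaminated data the likelihood gain dominates, mirroring how BIC adjudicates between the null and alternative hypotheses.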
 
10:50

Many methods for processing classical or quantum data are based on estimating some parameters, e.g. those of adaptive filters, artificial neural networks (including deep learning approaches), blind source separation systems or quantum gates. For classical signals, these methods include stochastic algorithms, such as stochastic gradient descent. We here first introduce a partly related, general, type of stochastic algorithms for quantum data and we prove the asymptotic efficiency of the proposed estimator. We then show the attractiveness of this stochastic approach for the quantum version of blind multiple-input multiple-output system identification and blind source separation: the resulting model estimation methods can operate with a single copy of each considered quantum state, whereas the previous methods require many copies of the same (unknown) states to be available and are thus "less blind". Numerical tests show that good performance is obtained with 10^4 such quantum states.
 
11:10

In this paper, the problem of distributed semi-supervised multi-label classification (S^2MLC) over a networked system is considered, and a distributed semi-supervised multi-label learning (dS^2ML^2) algorithm is developed. In our algorithm, to utilize the information of both labeled and unlabeled data, the maximum entropy principle with the entropy regularization is used to design the cost function. Besides, to exploit the higher order correlation among labels, a common low-dimensional subspace as well as label-specific weight vectors is learned. The effectiveness of the proposed algorithm is verified by simulations on two real datasets.
 
11:30

Multivariate signal processing is often based on dimensionality reduction techniques. We propose a new method, Dynamical Component Analysis (DyCA), leading to a classification of the underlying dynamics and - for a certain type of dynamics - to a signal subspace representing the dynamics of the data. In this paper the algorithm is derived leading to a generalized eigenvalue problem of correlation matrices. The application of the DyCA on high-dimensional chaotic signals is presented both for simulated data as well as real EEG data of epileptic seizures.
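The generalized eigenvalue problem of correlation matrices that DyCA reduces to can be solved by whitening with a Cholesky factor, a standard numerical route. The sketch below uses assumed random symmetric matrices rather than EEG-derived correlation matrices.

```python
import numpy as np

def generalized_eig(A, B):
    """Solve A v = lambda B v for symmetric A and SPD B via whitening."""
    L = np.linalg.cholesky(B)              # B = L @ L.T
    Linv = np.linalg.inv(L)
    M = Linv @ A @ Linv.T                  # equivalent symmetric standard problem
    lam, U = np.linalg.eigh(M)
    V = Linv.T @ U                         # map eigenvectors back: v = L^{-T} u
    return lam, V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 5)); A = X + X.T               # symmetric
Y = rng.normal(size=(5, 5)); B = Y @ Y.T + 5 * np.eye(5)  # SPD
lam, V = generalized_eig(A, B)
resid = np.max(np.abs(A @ V - B @ V * lam))            # check A v = lam B v
```

Reducing to a symmetric standard eigenproblem keeps the eigenvalues real, which matters when the eigenvectors are interpreted as projection directions of a signal subspace.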
 
11:50
Variational Bayesian Partially Observed Non-Negative Tensor Factorization
Jesper L. Hinrich, Søren F. V. Nielsen, Kristoffer H. Madsen, Morten Mørup

Non-negative matrix and tensor factorization (NMF/NTF) have become important tools for extracting part-based representations in data. It is however unclear when an NMF or NTF approach is most suited for data and how reliably the models predict when trained on partially observed data. We presently extend a recently proposed variational Bayesian NMF (VB-NMF) to non-negative tensor factorization (VB-NTF) for partially observed data. This admits bi- and multi-linear structure quantification considering both model prediction and evidence. We evaluate the developed VB-NTF on synthetic and a real dataset of gene expression in the human brain and contrast the performance to VB-NMF and conventional NMF/NTF. We find that the gene expressions are better accounted for by VB-NMF than VB-NTF and that VB-NMF/VB-NTF more robustly handle partially observed data than conventional NMF/NTF. In particular, probabilistic modeling is beneficial when large amounts of data are missing and/or the model order is over-specified.
 
12:10

Given a convolutional dictionary underlying a set of observed signals, can a carefully designed auto-encoder recover the dictionary in the presence of noise? We introduce an auto-encoder architecture, termed constrained recurrent sparse auto-encoder (CRsAE), that answers this question in the affirmative. Given an input signal and an approximate dictionary, the encoder finds a sparse approximation using FISTA. The decoder reconstructs the signal by applying the dictionary to the output of the encoder. The encoder and decoder in CRsAE parallel the sparse-coding and dictionary update steps in optimization-based alternating-minimization schemes for dictionary learning. As such, the parameters of the encoder and decoder are not independent, a constraint which we enforce for the first time. We derive the back-propagation algorithm for CRsAE. CRsAE is a framework for blind source separation that, knowing only the number of sources (dictionary elements) and assuming that only sparsely many of them overlap, is able to separate them. We demonstrate its utility in the context of spike sorting, a source separation problem in computational neuroscience. We demonstrate the ability of CRsAE to recover the underlying dictionary and characterize its sensitivity as a function of SNR.
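The FISTA encoder step described above amounts to accelerated proximal-gradient iterations for an L1-penalized least-squares problem. A minimal non-convolutional sketch follows; the convolutional dictionary, tied decoder, and back-propagation of CRsAE are omitted, and the dictionary size and sparsity level are assumptions.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista(A, b, lam, n_iter=500):
    """FISTA for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - b)
        x_new = soft(z - grad / L, lam / L)            # proximal gradient step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(60, 100))              # assumed dictionary, 100 atoms
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [2.0, -3.0, 1.5]      # 3-sparse code
b = A @ x_true                              # noiseless observations
x_hat = fista(A, b, lam=0.05)
```

The momentum term is what distinguishes FISTA from plain ISTA, improving the convergence rate from O(1/k) to O(1/k^2) in objective value.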
12:30-13:30, 1st floor: Lunch
13:30-15:30, Kilden, 2nd floor: Lecture Session 4: Sparse learning
Chair: Pierre Chainais, University Lille, France

13:30

Logistic regression has been extensively used to perform classification in machine learning and signal/image processing. Bayesian formulations of this model with sparsity-inducing priors are particularly relevant when one is interested in drawing credibility intervals with few active coefficients. Along these lines, the derivation of efficient simulation-based methods is still an active research area because of the analytically challenging form of the binomial likelihood. This paper tackles the sparse Bayesian binary logistic regression problem by relying on the recent split-and-augmented Gibbs sampler (SPA). Contrary to usual data augmentation strategies, this Markov chain Monte Carlo (MCMC) algorithm scales in high dimension and divides the initial sampling problem into simpler ones. These sampling steps are then addressed with efficient state-of-the-art methods, namely proximal MCMC algorithms that can benefit from the recent closed-form expression of the proximal operator of the logistic cost function. SPA appears to be faster than efficient proximal MCMC algorithms and presents a reasonable computational cost compared to optimization-based methods, with the advantage of producing credibility intervals. Experiments on handwritten digit classification problems illustrate the performance of the proposed approach.
 
13:50

Given a set of dictionary filters, the most widely used formulation of the convolutional sparse coding (CSC) problem is convolutional basis pursuit denoising (CBPDN), in which an image is represented as a sum over a set of convolutions of coefficient maps. When the input image is noisy, CBPDN's regularization parameter greatly influences the quality of the reconstructed image. Results for an automatic and sensible selection of this parameter are very limited for the CSC / CBPDN case. In this paper we propose a regularization-parameter-free method to solve the CSC problem via its projection-onto-the-L1-ball formulation, coupled with a warm-start-like strategy which, driven by Morozov's discrepancy principle, adaptively increases/decreases its constraint at each major iteration. While our proposed method is slower than solving CSC for a fixed regularization parameter, our computational results also show that our method's reconstruction quality is, on average, very close (within 0.16 SNR, 0.16 PSNR, 0.003 SSIM) to that obtained when the regularization parameter for CBPDN is selected to produce the best (SNR) quality result.
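The core building block of the L1-ball formulation, Euclidean projection onto an L1 ball, can be sketched as follows. This is the standard sort-based projection, not the authors' implementation:

```python
import numpy as np

def project_l1_ball(v, tau):
    """Euclidean projection of v onto {x : ||x||_1 <= tau}.

    Standard sort-and-threshold algorithm; tau plays the role of
    the adaptively adjusted constraint in the abstract above.
    """
    if np.abs(v).sum() <= tau:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                     # sorted magnitudes, descending
    css = np.cumsum(u)
    # largest index where the running threshold is still positive
    rho = np.nonzero(u * np.arange(1, len(u) + 1) > (css - tau))[0][-1]
    theta = (css[rho] - tau) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)
```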
 
14:10
BALSON: Bayesian Least Squares Optimization With Nonnegative L1-Norm Constraint
Jiyang Xie, Zhanyu Ma, Guoqiang Zhang, Jing-Hao Xue, Jen-Tzung Chien, Zhiqing Lin, Jun Guo

A Bayesian approach termed BAyesian Least Squares Optimization with Nonnegative L1-norm constraint (BALSON) is proposed. The error distribution of the data fitting is described by a Gaussian likelihood, and the parameter distribution is assumed to be a Dirichlet distribution. By Bayes' rule, searching for the optimal parameters is equivalent to finding the mode of the posterior distribution. In order to explicitly characterize the nonnegative L1-norm constraint of the parameters, we further approximate the true posterior distribution by a Dirichlet distribution. We estimate the moments of the approximated Dirichlet posterior distribution by sampling methods; four sampling methods have been introduced and implemented. With the estimated posterior distributions, the original parameters can be effectively reconstructed in polynomial fitting problems, and the BALSON framework is found to perform better than conventional methods.
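The moment-estimation-by-sampling idea can be illustrated on a toy Dirichlet, where the analytic mean is available for comparison. This is illustrative only; the paper's four samplers are not reproduced here:

```python
import numpy as np

def dirichlet_mean_mc(alpha, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the mean of a Dirichlet(alpha) distribution.

    Sketches the estimate-moments-by-sampling step; the exact mean
    alpha / alpha.sum() exists here, so the estimate can be checked.
    """
    rng = np.random.default_rng(seed)
    samples = rng.dirichlet(alpha, size=n_samples)   # each row sums to 1, all entries >= 0
    return samples.mean(axis=0)
```

Note that every Dirichlet sample is nonnegative and sums to one, which is exactly the nonnegative L1-norm constraint the approach exploits.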
 
14:30

In this paper, a joint graph-signal recovery approach is investigated for a set of noisy graph signals generated by a causal graph process. By leveraging the Kalman filter framework, a three-step iterative algorithm, called the Graph Kalman Filter (GKF), is used to predict and update the signal estimate as well as the graph topology. As in the regular Kalman filter, we first predict the signal state from the prior available data and then update and correct this prediction with the newly arrived measurement. Contrary to the conventional Kalman filter, however, the transition matrix is unknown, so we relate it to the graph weight matrix, which can be extracted by graph topology estimation. Thus, given the set of updated graph signals, we update the graph topology estimate, and with it the graph weight and state transition matrices. Since the proposed method is recursive and updates its estimates online, it can track changes in the underlying graph topology and signals as new data arrive sequentially, and it is moreover suitable for non-stationary processes. The experimental results show that, across different scenarios, GKF attains a lower mean squared error of the signal estimate than an available batch approach. Moreover, the proposed GKF has a low normalized mean squared error in terms of graph topology inference and rapidly reaches the error of the batch method as the number of measurements increases.
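A single predict/update cycle of the underlying Kalman recursion might look as follows. In the GKF the transition matrix `A` would be tied to the estimated graph weight matrix; here it is simply given (illustrative sketch, not the authors' algorithm):

```python
import numpy as np

def kalman_step(x, P, A, Q, H, R, y):
    """One predict/update cycle of a linear Kalman filter.

    x, P: current state estimate and covariance
    A, Q: transition matrix and process noise covariance
    H, R: observation matrix and measurement noise covariance
    y:    new measurement
    """
    # Predict
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```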
 
14:50

An ensemble of neural networks is known to be more robust and accurate than an individual network, though usually at linearly increased cost in both training and testing. In this work, we propose a two-stage method to learn Sparse Structured Ensembles (SSEs) for neural networks. In the first stage, we run SG-MCMC with group sparse priors to draw an ensemble of samples from the posterior distribution of network parameters. In the second stage, we apply weight pruning to each sampled network and then perform retraining over the remaining connections. By learning SSEs in this way, we not only achieve high prediction accuracy but also reduce memory and computation cost in both training and testing. We conduct a series of evaluation experiments by learning SSE ensembles with both FNNs and LSTMs. To the best of our knowledge, this work represents the first methodology and empirical study integrating SG-MCMC, group sparse priors and network pruning for learning NN ensembles.
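The second-stage pruning step can be sketched as plain magnitude pruning. The paper combines pruning with group sparse priors from SG-MCMC; this simplified per-weight version is only illustrative:

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of weights.

    Illustrative stand-in for the weight-pruning stage; the remaining
    connections would then be retrained.
    """
    flat = np.abs(weights).ravel()
    k = int(np.floor(sparsity * flat.size))
    if k == 0:
        return weights.copy()
    thresh = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > thresh             # ties at the threshold are pruned too
    return weights * mask
```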
 
15:10

Natively learned separable filters for Convolutional Sparse Coding (CSC) have recently been shown to provide reconstruction performance equivalent to their non-separable counterparts (as opposed to approximated separable filters), while reducing computational cost. Furthermore, multiple approaches to optimize the dictionary update stage of Convolutional Dictionary Learning (CDL) methods based on the Accelerated Proximal Gradient (APG) framework have recently been proposed. In this paper, we propose a novel separable filter learning method based on the rank-1 decomposition, and test its performance against the existing separable approaches. In addition, we evaluate how APG-based variations couple with our proposed method to improve computational runtime. Our results show that the filters learned through our proposed method match the performance of other natively learned separable filters, while providing a significant runtime improvement in the learning process through our APG-based implementation.
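The rank-1 (separable) decomposition of a 2-D filter can be sketched via the SVD. This is an illustrative factorization showing why a rank-1 filter is separable, not the paper's learning method:

```python
import numpy as np

def rank1_factor(filt):
    """Best rank-1 approximation of a 2-D filter as an outer product v h^T.

    A rank-1 filter is separable: convolving with it equals convolving
    with the column factor v and then the row factor h, which is cheaper.
    """
    U, s, Vt = np.linalg.svd(filt)
    v = U[:, 0] * np.sqrt(s[0])
    h = Vt[0, :] * np.sqrt(s[0])
    return v, h
```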
15:30-16:00  2nd floor  Coffee Break
16:00-18:00  Kilden, 2nd floor  Poster Session 2: Unsupervised to supervised learning
Chair: Paul Rodriguez, Pontifical Catholic University of Peru, Peru

16:00
Anomaly Detection Of Attacks (ADA) On DNN Classifiers At Test Time
David Jonathan Miller, Yujia Wang, George Kesidis

A significant threat to the wide deployment of machine learning-based classifiers is adversarial learning attacks, especially at test time. Recently there has been significant development in defending against such attacks. Several such works seek to robustify the classifier to make "correct" decisions on perturbed patterns. We argue it is often operationally more important to detect the attack, rather than to "correctly classify" in the face of it (classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural net and a novel decision statistic built upon the Kullback-Leibler divergence. This paper addresses: 1) When is it appropriate to aim to "correctly classify" a perturbed pattern? 2) What is a good AD detection statistic, one which exploits all likely sources of anomalousness associated with a test-time attack? 3) Where in a deep neural net (DNN) (in an early layer, a middle layer, or at the penultimate layer) will the most anomalous signature manifest? Tested on the MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.
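A KL-based decision statistic can be sketched as follows, assuming one already has a class posterior from a null-hypothesis density model of some layer and the DNN's softmax output. This is illustrative; the paper's statistic is more elaborate:

```python
import numpy as np

def kl_anomaly_score(p_null, q_dnn, eps=1e-12):
    """Kullback-Leibler divergence KL(p || q) as an anomaly statistic.

    p_null: class posterior from a null (attack-free) density model
    q_dnn:  the network's softmax posterior for the same input
    A large divergence between the two flags a possible test-time attack.
    """
    p = np.clip(p_null, eps, 1.0)
    q = np.clip(q_dnn, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))
```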
 
16:00
K-SVD With A Real L0 Optimization: Application To Image Denoising
Yuan Liu, Stéphane Canu, Paul Honeine, Su Ruan

This paper deals with sparse coding for dictionary learning in sparse representations. Because sparse coding involves an l0-norm, most, if not all, existing solutions only provide an approximate solution. Instead, in this paper, a real l0 optimization is considered for the sparse coding problem, providing a global optimal solution. The proposed method reformulates the optimization problem as a Mixed-Integer Quadratic Program (MIQP), allowing then to obtain the global optimal solution by using an off-the-shelf solver. Because computing time is the main disadvantage of this approach, two techniques are proposed to improve its computational speed. One is to add suitable constraints and the other to use an appropriate initialization. The results obtained on an image denoising task demonstrate the feasibility of the MIQP approach for processing well-known benchmark images while achieving good performance compared with the most advanced methods.
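For tiny dictionaries, the globally optimal l0 solution that the MIQP targets can be checked by brute force over supports. This is an illustrative stand-in for an MIQP solver, feasible only at toy scale:

```python
import numpy as np
from itertools import combinations

def l0_sparse_code_exact(D, y, k):
    """Globally optimal k-sparse code by exhaustive search over supports.

    Solves min_x ||y - D x||_2 s.t. ||x||_0 <= k exactly, by trying every
    size-k support and least-squares fitting on it. Exponential in the
    dictionary size, so only usable to sanity-check small instances.
    """
    n = D.shape[1]
    best_x, best_err = np.zeros(n), np.inf
    for support in combinations(range(n), k):
        Ds = D[:, support]
        coef, *_ = np.linalg.lstsq(Ds, y, rcond=None)
        err = np.linalg.norm(y - Ds @ coef)
        if err < best_err:
            best_err = err
            best_x = np.zeros(n)
            best_x[list(support)] = coef
    return best_x
```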
 
16:00

Many domain adaptation methods are based on learning a projection or a transformation of the source and target domains to a common domain and training a classifier there, while the performance of such algorithms has not been theoretically studied yet. Previous studies proposing generalization bounds for domain adaptation relate the target loss to the discrepancy between the source and target distributions, however, do not take into account the possible effects of learning a transformation between the two domains. In this work, we present generalization bounds that study the target performance of domain adaptation methods learning a transformation of the source and target domains along with a hypothesis. We show that, under some conditions on the loss regularity, if the domain transformations reduce the distribution distance at a sufficiently high rate, then the expected target loss can be bounded with probability improving at an exponential rate with the number of labeled samples.
 
16:00

Salient dictionary learning has recently proven to be effective for unsupervised activity video summarization by key-frame extraction. All relevant methods select a small subset of the original data points/video frames as dictionary atoms/representatives that, in concert, both optimally reconstruct the original entire dataset/video sequence and are salient. Therefore, they attempt to simultaneously optimize a reconstruction term, pushing towards a dictionary/summary that best reconstructs the entire dataset, and a saliency term, pushing towards a dictionary composed of salient data points. In this paper, a hypothesis is proposed and empirically tested, namely that more salient data points can be obtained by attempting to restrain reconstruction error separately for each original data point. Thus, salient dictionary learning is extended by adding a third term to the objective function, pushing towards optimal point reconstruction. A pre-existing greedy, iterative algorithm for salient dictionary learning is modified according to the proposed extension in two alternative ways. The resulting methods achieve state-of-the-art performance in three databases, verifying the validity of our hypothesis.
 
16:00

A multi-layer perceptron for indicating the number of targets present in a range-velocity cell of automotive radar sensors is examined and compared with a state-of-the-art approach based on a Generalized Likelihood Ratio Test. The multi-target indication is typically used for direction-of-arrival (DoA) estimation to decide whether resolution in the angular domain is necessary. We focus on the practically relevant challenge of deciding between a single-target and a two-target scenario. Compared to the state-of-the-art approach, which requires a preceding maximum likelihood DoA estimate and a precise array model, the proposed multi-layer perceptron directly operates on the single-snapshot spatial covariance matrix estimate. The array model is inherently learned by the network during the training process. The evaluation of the MLP in terms of classification accuracy shows that a performance similar to the Generalized Likelihood Ratio Test is achieved.
 
16:00
Enhanced Noisy Sparse Subspace Clustering Via Reweighted L1-Minimization
Jwo-Yuh Wu, Liang-Chi Huang, Ming-Hsun Yang, Ling-Hua Chang, Chun-Hung Liu

Sparse subspace clustering (SSC) relies on sparse regression for accurate neighbor identification. Inspired by recent progress in compressive sensing (CS), this paper proposes a new sparse regression scheme for SSC via reweighted L1-minimization, which also generalizes a two-step L1-minimization algorithm introduced by E. J. Candès et al. in [The Annals of Statistics, vol. 42, no. 2, pp. 669–699, 2014] without incurring an extra complexity burden. To fully exploit the prior information conveyed by the computed sparse vector in the first step, our approach places a weight on each component of the regression vector and solves a weighted LASSO in the second step. We discuss the impact of weighting on neighbor identification, argue that a popular weighting rule used in the CS literature is not suitable for SSC, and propose a new weighting scheme for enhancing neighbor identification accuracy. Extensive simulation results are provided to validate our discussion and evidence the effectiveness of the proposed approach. Some key issues for future work are also highlighted.
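The generic two-step reweighting idea (a plain LASSO, then a weighted LASSO with weights built from the first solution) can be sketched with ISTA. Note that the weighting rule below is the popular CS rule the paper argues against for SSC; the paper's own scheme is not reproduced here:

```python
import numpy as np

def ista_weighted(D, y, lam, w, n_iter=500):
    """ISTA for min_x 0.5*||y - D x||^2 + lam * sum_i w_i |x_i|."""
    L = np.linalg.norm(D, 2) ** 2
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = x - D.T @ (D @ x - y) / L
        x = np.sign(g) * np.maximum(np.abs(g) - lam * w / L, 0.0)
    return x

def reweighted_l1(D, y, lam, eps=1e-2):
    """Two-step reweighted L1: plain lasso, then weighted lasso with
    the classic 1/(|x| + eps) weights (illustrative, SSC-agnostic)."""
    x1 = ista_weighted(D, y, lam, np.ones(D.shape[1]))
    w = 1.0 / (np.abs(x1) + eps)
    w = w / w.min()      # largest first-step coefficients keep weight ~1
    return ista_weighted(D, y, lam, w)
```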
 
16:00
Acoustic Scene Classification: A Competition Review
Shayan Gharib, Honain Derrar, Daisuke Niizumi, Tuukka Senttula, Janne Tommola, Toni Heittola, Tuomas Virtanen, Heikki Huttunen

In this paper we study the problem of acoustic scene classification, i.e., categorization of audio sequences into mutually exclusive classes based on their spectral content. We describe the methods and results discovered during a competition organized in the context of a graduate machine learning course; both by the students and external participants. We identify the most suitable methods and study the impact of each by performing an ablation study of the mixture of approaches. We also compare the results with a neural network baseline, and show the improvement over that. Finally, we discuss the impact of using a competition as a part of a university course, and justify its importance in the curriculum based on student feedback.
 
16:00

Many supervised dimensionality reduction methods have been proposed in recent years. Linear manifold learning methods often have limited flexibility in learning effective representations, whereas nonlinear methods mainly focus on the embedding of the training samples and do not consider how well the embedding generalizes to initially unseen test samples. In this paper, we build on recent theoretical results on the generalization performance of supervised manifold learners, which state that in order to achieve good generalization performance, a trade-off needs to be sought between the separation of different classes in the embedding and the possibility of constructing out-of-sample interpolators with good Lipschitz regularity. In the light of these results, we propose a new supervised manifold learning algorithm that computes an embedding of the training samples along with a smooth interpolation function generalizing the embedding to the whole space. Our method is based on a learning objective that explicitly takes into account the generalization performance on novel test samples. Experimental results show that the proposed method achieves high classification accuracy in comparison with state-of-the-art supervised manifold learning algorithms.
 
16:00
Remote Sensing Image Regression For Heterogeneous Change Detection
Luigi Tommaso Luppino, Filippo Maria Bianchi, Gabriele Moser, Stian Normann Anfinsen

Change detection in heterogeneous multitemporal satellite images is an emerging topic in remote sensing. In this paper we propose a framework, based on image regression, to perform change detection in such images. Our method learns a transformation to map the first image to the domain of the other image, and vice versa. Four regression methods are selected to carry out the transformation: Gaussian processes, support vector machines, random forests, and a recently proposed kernel regression method called homogeneous pixel transformation. To evaluate not only the potential and limitations of our framework, but also the pros and cons of each regression method, we perform experiments on two data sets. The results indicate that random forests achieve good performance and are fast and robust to hyperparameters, whereas the homogeneous pixel transformation method can achieve better accuracy at the cost of higher complexity.
 
16:00

In recovering low-dimensional representations of high-dimensional data, graph- or manifold-regularized schemes have been investigated as a key tool in many areas to preserve the neighborhood structure of the data set. In spite of their effectiveness, these methods are often not tractable in practice, because graph structures of data lead to a large matrix (e.g., an affinity or Laplacian matrix) and the methods require eigenanalysis of it iteratively. In this paper, we propose an efficient low-rank matrix approximation that is regularized by graph information derived from the row and column range spaces of the data. To deal with the high computational complexity, we leverage the Nyström method, which has been widely used to approximate the low-rank component of Symmetric Positive Semi-Definite (SPSD) matrices with sampling. Moreover, we devise a Clustered Nyström extension with QR decomposition to efficiently aggregate more information from samples and to accurately approximate the low-rank structure. We compare the performance of the proposed algorithm with several other general algorithms in clustering experiments on benchmark datasets. Our experimental results show that our method has a favorable running speed while its accuracy is better than or comparable to the competing methods.
 
16:00
APE: Archetypal-Prototypal Embeddings For Audio Classification
Arshdeep Singh, Anshul Thakur, Padmanabhan Rajan

Archetypal analysis deals with representing data points using archetypes, which capture the boundary of the data. Prototypal analysis deals with representing data points using prototypes, which capture the average behaviour of the data. Utilising these two complementary representations, we propose a simple, fixed-length representation for audio signals. We employ a well-studied method for determining archetypes, and utilise Gaussian mixture modelling to represent prototypes. Archetypes and prototypes are concatenated and utilised within a dictionary learning framework. Time-frequency representations of audio signals are projected on these dictionaries, under simplex constraints, to obtain the proposed archetypal prototypal embedding or APE. Experimental results on the tasks of bioacoustic classification and acoustic scene classification demonstrate the effectiveness of the APE representation for audio classification.
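The simplex-constrained projection step used when coding signals against such dictionaries relies on Euclidean projection onto the probability simplex, which can be sketched with the standard sort-based algorithm (illustrative, not the authors' implementation):

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.

    Standard sort-and-threshold algorithm; coding coefficients on a
    simplex give convex combinations of dictionary atoms.
    """
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)
```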
 
16:00
Label Propagation For Learning With Label Proportions
Rafael Poyiadzi, Raul Santos-Rodriguez, Niall Twomey

Learning with Label Proportions (LLP) is the problem of recovering the underlying true labels of a dataset when the data is presented in the form of bags. This paradigm is particularly suitable in contexts where providing individual labels is expensive and label aggregates are more easily obtained. In the healthcare domain, it is a burden for a patient to keep a detailed diary of their daily routines, but often they will be amenable to providing higher-level summaries of daily behavior. We present a novel and efficient graph-based algorithm that encourages local smoothness and exploits the global structure of the data, while preserving the 'mass' of each bag.
 
16:00
Detecting Industrial Fouling By Monotonicity During Ultrasonic Cleaning
Chang Rajani, Arto Klami, Ari Salmi, Timo Rauhala, Edward Hæggström, Petri Myllymäki

High-power ultrasound permits non-invasive cleaning of industrial equipment, but to make such cleaning systems energy efficient, one needs to recognize when the structure has been sufficiently cleaned without using invasive diagnostic tools. This can be done using ultrasound reflections generated inside the structure. This inverse modeling problem cannot be solved by forward modeling for irregular and complex structures, and it is also difficult to tackle with machine learning since human-annotated labels are hard to get. We provide a deep learning solution that relies on the physical properties of the cleaning process, namely the fact that the amount of fouling is reduced as we clean more. Using this monotonicity property as indirect supervision, we develop a semi-supervised model for detecting when the equipment has been cleaned.
 
16:00

Standard classification models are usually additive models, which only consider the contributions from the main effects of features. When the features are highly correlated, the interactions between features provide not only additional features, but also the underlying graph structure between features. In this paper, we integrate into the multiclass SVM a strong hierarchy regularization in order to learn both the main effects and the interactions. A primal-dual proximal algorithm with epigraphical projection is proposed to minimize the objective function. The proposed algorithm is applied to a face classification task on the Extended YaleB database and the results validate its effectiveness.
 
16:00

This paper demonstrates a semi-supervised learning approach to frame-level proximity and touch recognition with machine learning algorithms for sequential modeling. We focus on capacitive sensing, which is employable in low-cost embedded devices and provides high sensing capability. We optimize our models to run with minimum complexity to enable the use of state-of-the-art machine learning models in low-cost embedded devices. We evaluate two different models, based either on recurrent neural networks (RNN) with gated recurrent units or on hidden Markov models (HMM). We show that the developed models are capable of robust proximity and touch recognition invariant to interference factors. The RNN model, however, outperforms the HMM model, reaching a superior frame-level recognition accuracy of 97.1% on a challenging set of touches containing multiple interference factors such as gloves of different materials and invalid touches, where the test persons swiped over the sensor.
18:45-22:00  Musikkens Hus  Banquet


Thursday September 20, 2018
8:30-12:00  2nd floor  Registration
8:30-9:00  2nd floor  Welcome Refreshment
9:00-10:00  Kilden, 2nd floor  Industrial Keynote: A reality check on data-driven business – what are the real-life potential and barriers?

Kaare Brandt Petersen
Implement Consulting Group, Copenhagen, Denmark

Chair: Søren Holdt Jensen, Aalborg University, Denmark

10:00-10:30  2nd floor  Coffee Break
10:30-12:30  Kilden, 2nd floor  Poster Session 3: Neural network and deep learning
Chair: Robert Jensen, The Arctic University of Norway, Norway

10:30
Simple Deep Learning Network Via Tensor-Train Haar-Wavelet Decomposition Without Retraining
Wei-Zhi Huang, Sung-Hsien Hsieh, Chun-Shien Lu, Soo-Chang Pei

Deep neural networks have revolutionized machine learning recently. However, they suffer from high computation and memory cost, such that deploying them on hardware with limited resources (e.g., mobile devices) becomes a challenge. To address this problem, we propose a new technique, called Tensor-Train Haar-wavelet decomposition, that decomposes a large weight tensor from a fully-connected layer into a sequence of partial Haar-wavelet matrices without retraining. The novelty originates from the deterministic partial Haar-wavelet matrices, such that we only need to store row indices instead of the whole matrix. Empirical results demonstrate that our method achieves efficient model compression with limited accuracy loss, even without retraining.
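As background, a full orthonormal Haar matrix, a plausible building block for the deterministic Haar-wavelet factors mentioned above, can be built recursively. This is an illustrative construction only; the paper's partial matrices and Tensor-Train factorization are not reproduced:

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar wavelet matrix of size n x n (n a power of two).

    Built by the standard recursion: averages (top block) and
    differences (bottom block), normalized by sqrt(2).
    """
    if n == 1:
        return np.array([[1.0]])
    h = haar_matrix(n // 2)
    top = np.kron(h, [1.0, 1.0])              # averaging rows
    bot = np.kron(np.eye(n // 2), [1.0, -1.0])  # differencing rows
    return np.vstack([top, bot]) / np.sqrt(2.0)
```

Because the matrix is deterministic, only which rows are used needs to be stored, which is the storage advantage the abstract describes.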
 
10:30
Evaluation Of Loss Functions For Estimation Of Latent Vectors From GAN
Arun Patro, Vishnu Vardhan Makkapati, Jayanta Mukhopadhyay

Generative Adversarial Networks (GANs) are being used to learn distributions of image data. We attempt to estimate the latent vector that results in the best approximation of a real-world image. We estimate the latent vector by using a metric that compares the original image and its generated version. Existing methods estimate it by minimizing the error between these two images. In our work, we also maximize the signal content in the generated image while formulating these metrics. We present several metrics based on error, signal-to-noise ratio and the energy of the gradient image. We evaluate them using images of t-shirts and present quantitative and qualitative results. We demonstrate an application of the proposed methods to generate new designs that are inspired by the input ones.
 
10:30
Quality Preserving Face De-Identification Against Deep CNNs
Panteleimon Chriskos, Rosen Zhelev, Vasileios Mygdalis, Ioannis Pitas

In this paper, two face de-identification methods are proposed for hindering face identification by a deep neural network. Our work focuses on achieving a delicate balance, so that the facial images are misclassified by the deep network, while a human observer can still identify the persons depicted in a scene. The proposed methods achieve face de-identification by partly degrading image quality in order to hinder face recognition by deep neural networks, while maintaining the highest possible image quality. To this end, we employ de-identification methods based on singular value decomposition and image hypersphere projections, respectively. From the conducted experiments, it can be concluded that these methods are capable of reducing correct face identification rates of the VGG-Face network by over 90%. Moreover, it is shown that at these error rates adequate image quality is preserved, as demonstrated by the values of the complex wavelet structural similarity index, allowing face recognition by humans, contrary to most face de-identification methods.
 
10:30

In this paper, we propose a novel technique that combines the concept of spatially targeted optical flow with image processing, for affect state recognition, concerning a wide variety of learner types, including children with autism and mainstream children. We exploit the advantages of deep neural networks in image classification by adopting a two-stream CNN approach for the recognition task, based on gaze analysis. As there is no publicly available dataset containing such a variety of learner types, a dataset was created in order to evaluate the performance of our algorithm. We validate our approach using this dataset, by optimising a mean-square error loss function, obtaining promising results for this challenging task.
 
10:30

Strapdown inertial navigation systems are sensitive to the quality of the data provided by the accelerometer and gyroscope. Low-grade IMUs in handheld smart devices pose a problem for inertial odometry on these devices. We propose a scheme for constraining the inertial odometry problem by complementing non-linear state estimation with a CNN-based deep-learning model that infers the momentary speed from a window of IMU samples. We show the feasibility of the model using a wide range of data from an iPhone, and present proof-of-concept results for how the model can be combined with an inertial navigation system for three-dimensional inertial navigation.
 
10:30

This paper demonstrates how elementary convolutional neural networks can be used to classify noise signals of three types: normal, uniform, exponential -- where the signals have identical power, which means that a classifier has to rely on their "structural" properties. A key innovation in our approach, as compared to existing research, is that our networks take raw data as input and automatically generate a selection of informative features. We have also analyzed the structure of trained convolutional networks and their decision-making process. Robustness to contamination of input data (a model of channel/sensor cut-off) and the capability to detect the prevailing signal in a mixture of signals under conditions of a priori uncertainty have been evaluated as well. The study has shown that neural networks are effective in applications involving narrowband or broadband stochastic processes, as well as distinct patterns, and can therefore be used for signal processing tasks.
 
10:30

Convolutional neural networks (CNNs) have attracted researchers' increasing attention for almost three decades now, achieving superior results in domains such as computer vision, signal processing, etc. Their success can be mainly attributed to a specific network architecture, which is conceived by assigning values to a large number of hyper-parameters, each influencing the resulting error rate. Yet the search for good hyper-parameter values is a challenging task, usually done manually and taking a considerable amount of work. This paper is dedicated to the problem of designing automated hyper-parameter search algorithms for convolutional architectures. We propose two algorithms based on the metaheuristics of evolutionary computation and local search. To our knowledge, they have never been applied to the case of CNN architectures before. Using image recognition datasets, we compare the algorithms and show that they can produce CNNs with nearly state-of-the-art performance without any user interference, saving much tedious effort.
 
10:30
Convex Likelihood Alignments For Bioacoustic Classification
Anshul Thakur, Arshdeep Singh, Padmanabhan Rajan

In this work, we propose a bioacoustic classification framework based on Gaussian mixture models (GMM) and archetypal analysis (AA). The framework utilizes acoustic topic modelling to obtain an intermediate symbolic representation where the discrimination between target classes is more evident than in the input feature space. The proposed framework utilizes the GMM as an acoustic topic model, and weighted likelihoods obtained from this GMM are utilized as the intermediate symbolic representation. Class-specific archetypal dictionaries are used to obtain the proposed feature representation, designated as convex likelihood alignments (CLAs), from this intermediate representation. Class-specific signatures are highly evident in these CLAs, making them an ideal representation for various bioacoustic classification tasks. Through experiments on two available datasets, it is shown that the proposed CLAs yield comparable or better results than state-of-the-art approaches.
 
10:30

Warm-restart techniques for training deep neural networks often achieve better recognition accuracies and can be regarded as an easy way to obtain multiple neural networks from a single training process at no additional training cost. Ensembles of intermediate neural networks obtained by warm restarts can provide higher accuracy than the single neural network obtained at the end of the whole training process. However, existing warm-restart and ensemble methods use fixed cyclic schedules and offer little parameter adaptation. This paper extends the class of possible warm-restart schedule strategies and clarifies their effectiveness for recognition performance. Specifically, we propose parameterized functions and various cycle schedules to improve recognition accuracies through deep neural networks with no additional training cost. Experiments on CIFAR-10 and CIFAR-100 show that our methods can achieve more accurate rates than the existing cyclic training and ensemble methods.
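The fixed cyclic baseline that such work generalizes resembles the SGDR cosine schedule, which can be sketched as follows. This is illustrative only; the paper's parameterized schedule families are not reproduced:

```python
import math

def sgdr_lr(step, base_lr, cycle_len, t_mult=1.0):
    """SGDR-style warm-restart schedule: cosine decay within a cycle,
    reset to base_lr at each restart; cycle length grows by t_mult.

    Intermediate networks snapshotted at the end of each cycle can then
    be averaged into an ensemble at no extra training cost.
    """
    t, length = step, cycle_len
    while t >= length:           # find position within the current cycle
        t -= length
        length = max(1, int(length * t_mult))
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * t / length))
```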
 
10:30
Deep Neural Networks For Application Awareness In SDN-Based Network
Jun Xu, Jingyu Wang, Qi Qi, Bo He, Haifeng Sun

Accurate traffic classification is essential for traffic engineering and Quality of Service (QoS) guarantees, especially in the Internet of Things (IoT). Different applications have different network resource requirements, so an excellent classification algorithm can realize application awareness in traffic engineering and significantly improve QoS. Software Defined Networking (SDN), with its centralized control of network resources, provides opportunities for fine-grained resource allocation. However, many issues arise when deep learning is employed in SDN; for example, sampling and classifying traffic data consume a lot of IO and computing resources on the SDN controller. In this paper, we deploy a Deep Neural Network (DNN) on a Virtualized Network Function (VNF) to solve the problems of applying deep learning in SDN. The experiments show that the proposed DNN model outperforms existing traffic classification algorithms and that the SDN controller can assign more appropriate route paths for different types of traffic, greatly improving network QoS.
 
10:30
Light Field Based Face Recognition Via A Fused Deep Representation
Alireza Sepas-Moghaddam, Paulo Lobato Correia, Kamal Nasrollahi, Thomas B Moeslund, Fernando Pereira

The emergence of light field cameras opens new frontiers for biometric recognition. This paper proposes the first deep CNN solution for light field based face recognition, exploiting the richer information available in a lenslet light field image. Additionally, for the first time, the exploitation of disparity maps together with 2D-RGB images and depth maps is considered in a fusion scheme to further improve face recognition performance. The proposed solution uses the 2D-RGB central sub-aperture view as well as the disparity and depth maps extracted from the full set of sub-aperture images associated with a lenslet light field. Feature extraction is then performed using a VGG-Face deep descriptor for texture and independently fine-tuned models for the disparity and depth maps. Finally, the extracted features are concatenated and fed into an SVM classifier. A comprehensive set of experiments conducted on the IST-EURECOM light field face database shows the superior performance of the fused deep representation on varied and challenging recognition tasks.
 
10:30

Echo State Networks (ESNs) are simplified recurrent neural network models composed of a reservoir and a linear, trainable readout layer. The reservoir is tuned by hyper-parameters that control the network behaviour. ESNs are known to be effective when configured in a region of (hyper-)parameter space called the Edge of Criticality (EoC), where the system is maximally sensitive to perturbations. In this paper, we propose binary ESNs, which are architecturally equivalent to standard ESNs but use binary activation functions and binary recurrent weights. For these networks, we derive a closed-form expression for the EoC in the autonomous case and perform simulations to assess their behaviour with noisy neurons and in the presence of an input signal. We propose a theoretical explanation for the fact that the variance of the input plays a major role in characterizing the EoC.
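As a rough illustration of the architecture described above (a sketch, not the authors' implementation), a binary ESN reservoir update with sign activations and ±1 recurrent weights might look like this; the sizing, scaling, and tie-breaking choices are illustrative:

```python
import numpy as np

def binary_esn_states(u, n=200, rho=0.9, seed=0):
    """Run a binary echo state network reservoir on a scalar input sequence u.

    Recurrent and input weights are drawn from {-1, +1}; the recurrent matrix
    is scaled by rho / sqrt(n), and neurons use a sign activation, so every
    reservoir state is a vector of +/-1 values.
    """
    rng = np.random.default_rng(seed)
    W = rng.choice([-1.0, 1.0], size=(n, n)) * rho / np.sqrt(n)  # binary recurrent weights
    w_in = rng.choice([-1.0, 1.0], size=n)                       # binary input weights
    x = np.ones(n)
    states = []
    for u_t in u:
        x = np.sign(W @ x + w_in * u_t)   # binary activation
        x[x == 0] = 1.0                   # resolve exact ties to +1
        states.append(x.copy())
    return np.array(states)
```

A linear readout would then be trained by ridge regression on the collected states, exactly as for standard ESNs.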
 
10:30

Recently, multiple high-performance algorithms have been developed to infer high-resolution images from low-resolution input using deep learning. The related problem of super-resolution from blurred or corrupted low-resolution images has, however, received much less attention. In this work, we propose a new deep learning approach that simultaneously addresses deblurring and super-resolution from blurred low-resolution images. We evaluate the state-of-the-art super-resolution convolutional neural network (SRCNN) architecture proposed in [1] for the blurred reconstruction scenario and propose a revised, deeper architecture whose superiority we demonstrate experimentally, both when the levels of blur are known a priori and when they are not.
 
10:30
Detection Of Cut Points For Automatic Music Rearrangement
Daniel Stoller, Vincent Akkermans, Simon Dixon

Existing music recordings are often rearranged, for example to fit their duration and structure to video content. Often an expert is needed to find suitable cut points allowing for imperceptible transitions between different sections. In previous work, the search for these cuts is restricted to the beginnings of beats or measures, and only timbre and loudness are taken into account, while melodic expectations and instrument continuity are neglected. We instead aim to learn these features by training neural networks on a dataset of over 300 popular Western songs to classify which note onsets are suitable entry or exit points for a cut. We investigate existing and novel architectures and different feature representations, and find that the best performance is achieved using neural networks with two-dimensional convolutions applied to spectrogram input covering several seconds of audio at a high temporal resolution of 23 or 46 ms. Finally, we analyse our best model using saliency maps and find that it attends to rhythmical structures and to the presence of sounds at the onset position, suggesting that instrument activity is important for predicting cut quality.
12:30-13:301st floorLunch
13:30-14:30Kilden, 2nd floorLecture Session 5: Neural network and deep learning
Chair: Jen-Tzung Chien, National Chiao Tung University, Taiwan

13:30
Recurrent Neural Networks With Flexible Gates Using Kernel Activation Functions
Simone Scardapane, Steven Van Vaerenbergh, Danilo Comminiello, Simone Totaro, Aurelio Uncini

Gated recurrent neural networks have achieved remarkable results in the analysis of sequential data. Inside these networks, gates control the flow of information, allowing the network to model even very long-term dependencies in the data. In this paper, we investigate whether the original gate equation (a linear projection followed by an element-wise sigmoid) can be improved. In particular, we design a more flexible architecture, with a small number of adaptable parameters, which can model a wider range of gating functions than the classical one. To this end, we replace the sigmoid function in the standard gate with a non-parametric formulation extending the recently proposed kernel activation function (KAF), with the addition of a residual skip-connection. A set of experiments on sequential variants of the MNIST dataset shows that adopting this novel gate improves accuracy with negligible computational cost and with a large reduction in the number of training iterations.
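A minimal sketch of the gating idea, assuming a fixed Gaussian-kernel dictionary and a sigmoid residual skip-connection (the mixing coefficients `alpha` would be trained jointly with the network; all names and the kernel bandwidth are illustrative, not the paper's exact formulation):

```python
import numpy as np

def kaf_gate(s, alpha, dict_pts, gamma=1.0):
    """Kernel activation function gate applied to pre-activations s.

    The gate output is the classical sigmoid (the residual skip-connection)
    plus a learnable expansion over Gaussian kernels centered on a fixed
    dictionary of points; with alpha = 0 it reduces to the standard gate.
    """
    # Gaussian kernels between each pre-activation and each dictionary point
    K = np.exp(-gamma * (s[..., None] - dict_pts) ** 2)
    sigmoid = 1.0 / (1.0 + np.exp(-s))   # residual skip: the classical gate
    return sigmoid + K @ alpha           # flexible, learnable correction
```

Because the correction vanishes when `alpha` is zero, the network can start from the standard sigmoid gate and adapt the gate shape only where the data demand it.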
 
13:50
Sketchsegnet: A RNN Model For Labeling Sketch Strokes
Xingyuan Wu, Yonggang Qi, Jun Liu, Jie Yang

We investigate the problem of stroke-level sketch segmentation: training machines to assign semantic part labels to the strokes of an input sketch. Solving sketch segmentation opens the door to fine-grained sketch interpretation, which can benefit many novel sketch-based applications, including sketch recognition and sketch-based image retrieval. In this paper, we treat stroke-level sketch segmentation as a sequence-to-sequence generation problem and present a recurrent neural network (RNN)-based model, SketchSegNet, that translates sequences of strokes into their semantic part labels. In addition, we propose the first large-scale stroke-level sketch segmentation dataset, composed of 57K annotated free-hand human sketches selected from QuickDraw. Experimental results on this novel dataset show that our approach achieves an average stroke-labeling accuracy of over 90%.
 
14:10

Attention over natural language aims to spotlight the meaningful region, with representative keywords, from which the meanings needed to accomplish a task of interest can be extracted. The attention parameter is a latent variable that is conventionally estimated indirectly by minimizing the classification loss. For the task of question answering (QA), the classification loss may not sufficiently reflect the target answer. This paper proposes a direct solution that attends to the meaningful region by also minimizing a reconstruction loss over auxiliary or supporting data, which are available in different scenarios. In particular, the classification and reconstruction losses are minimized jointly under an end-to-end memory network, realizing memory-augmented question answering. This supportive attention is implemented as a sequence-to-sequence model that reconstructs the supporting sentence to assure translation invariance. The merit of the method is demonstrated for sequential learning on the bAbI QA and dialog tasks.
14:30-15:002nd floorCoffee Break
15:00-17:00Kilden, 2nd floorPoster Session 4: Biomedical applications
Chair: Simo Särkkä, Aalto University, Finland

15:00
Machine Learning As Digital Therapy Assessment For Mobile Gait Rehabilitation
Javier Conte Alcaraz, Sanam Moghaddamnia, Nils Poschadel, Jürgen Peissig

We present a novel real-time acoustic feedback (RTAF) method based on machine learning that shortens rehabilitation and improves its progress. Wearable technology (WT) has emerged as a viable means of providing low-cost digital healthcare and therapy outside medical environments such as hospitals and clinics. In this paper we show that RTAF together with WT offers an excellent solution for use in rehabilitation. The machine learning based RTAF method, along with a study proving its effectiveness, is presented. The results show a faster recovery time when using RTAF. The proposed RTAF thus shows great potential for supporting digital healthcare, therapy, and rehabilitation.
 
15:00
Spectro-Temporal ECG Analysis For Atrial Fibrillation Detection
Zheng Zhao, Simo Särkkä, Ali Bahrami Rad

This article is concerned with spectro-temporal (i.e., time-varying spectrum) analysis of ECG signals for atrial fibrillation (AF) detection. We propose a Bayesian spectro-temporal representation of the ECG signal using a state-space model and Kalman filter. The 2D spectro-temporal data are then classified by a densely connected convolutional network (DenseNet) into four classes: AF, non-AF normal rhythms (Normal), non-AF abnormal rhythms (Others), and noisy segments (Noisy). The performance of the proposed algorithm is evaluated and scored on the PhysioNet/Computing in Cardiology (CinC) 2017 dataset. The experimental results show that the proposed method achieves an overall F1 score of 80.2%, in line with state-of-the-art algorithms. In addition, the proposed spectro-temporal estimation approach outperforms standard time-frequency analysis methods for AF detection, namely the short-time Fourier transform, the continuous wavelet transform, and autoregressive spectral estimation.
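The spectro-temporal front end can be illustrated in simplified form (a sketch, not the authors' exact model or parameters) by tracking a bank of stochastic resonators with one Kalman filter per frequency:

```python
import numpy as np

def resonator_kalman_power(y, freqs, fs, q=0.01, r=0.1):
    """Bayesian spectro-temporal estimate of a 1D signal y (sketch).

    Each frequency is modelled as a 2D rotating state (a stochastic
    resonator) whose in-phase/quadrature components are tracked by a
    Kalman filter; the returned array holds per-step amplitude envelopes,
    i.e. a time-varying spectrum. q and r are illustrative noise levels.
    """
    power = np.zeros((len(y), len(freqs)))
    for j, f in enumerate(freqs):
        w = 2.0 * np.pi * f / fs
        A = np.array([[np.cos(w), -np.sin(w)],
                      [np.sin(w),  np.cos(w)]])   # rotation by w per sample
        H = np.array([[1.0, 0.0]])                # we observe the in-phase part
        m, P = np.zeros(2), np.eye(2)
        for t, yt in enumerate(y):
            m, P = A @ m, A @ P @ A.T + q * np.eye(2)   # predict
            S = H @ P @ H.T + r                         # innovation variance
            K = P @ H.T / S                             # Kalman gain
            m = m + (K * (yt - H @ m)).ravel()          # update mean
            P = P - K @ H @ P                           # update covariance
            power[t, j] = np.hypot(m[0], m[1])          # amplitude envelope
    return power
```

The resulting 2D array plays the role of the spectro-temporal image that is passed to the DenseNet classifier.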
 
15:00
Cross-Corpus EEG-Based Emotion Recognition
Soheil Rayatdoost, Mohammad Soleymani

Lack of generalization is a common problem in automatic emotion recognition. The present study aims to explore the suitability of the existing EEG features for emotion recognition and investigate the performance of emotion recognition methods across different corpora. We introduce a novel dataset which includes spontaneous emotions and was analyzed in addition to the existing datasets for cross-corpus evaluation. We demonstrate that the performance of the existing methods significantly decreases when evaluated across different corpora. The best results are obtained by a convolutional neural network fed by spectral topography maps from different bands. We provide some evidence that stimuli-related sensory information is learned by machine learning models for emotion recognition using EEG signals.
 
15:00

This paper proposes a Bayesian approach to parameter estimation in electrocardiogram state space models. The on-line nature of the proposed method allows it to be applied to real-world electrocardiogram recordings with varying beat morphology, heart rate, and noise; it thereby provides clear advantages over the conventional Gaussian kernel approach. The applicability of the proposed method is demonstrated on benchmark electrocardiogram data. The results indicate that the method provides a promising framework for noise reduction and wave delineation in electrocardiograms.
 
15:00
A Deep Learning Architecture To Detect Events In EEG Signals During Sleep
Stanislas Chambon, Valentin Thorey, Pierrick J Arnal, Emmanuel Mignot, Alexandre Gramfort

Electroencephalography (EEG) during sleep is used by clinicians to evaluate various neurological disorders. In sleep medicine, it is relevant to detect macro-events (> 10 s) such as sleep stages, and micro-events (< 2 s) such as spindles and K-complexes. Annotating such events requires a trained sleep expert, a time-consuming and tedious process with large inter-scorer variability. Automatic algorithms have been developed to detect various types of events, but these are event-specific. We propose a deep learning method that jointly predicts the locations, durations, and types of events in EEG time series. It relies on a convolutional neural network that builds a feature representation from raw EEG signals. Numerical experiments demonstrate the efficiency of this new approach on various event detection tasks compared to current state-of-the-art, event-specific algorithms.
 
15:00
Single-Channel EEG Classification By Multi-Channel Tensor Subspace Learning And Regression
Simon Van Eyndhoven, Martijn Boussé, Borbála Hunyadi, Lieven De Lathauwer, Sabine Van Huffel

The classification of brain states using neural recordings such as electroencephalography (EEG) finds applications in both medical and non-medical contexts, such as detecting epileptic seizures or discriminating mental states in brain-computer interfaces, respectively. Although this endeavor is well-established, existing solutions are typically restricted to lab or hospital conditions because they operate on recordings from a set of EEG electrodes that covers the whole head. By contrast, a true breakthrough for these applications would be the deployment "in the real world", by means of wearable devices that encompass just one (or a few) channels. Such a reduction of the available information inevitably makes the classification task more challenging. We tackle this issue by means of a multilinear subspace learning step (using data from multiple channels during training) and subsequently solving a regression problem with a low-rank structure to classify new trials (using data from only a single channel during testing). We demonstrate the feasibility of this approach on EEG data recorded during a mental arithmetic task.
 
15:00
Uncertainty Modeling And Interpretability In Convolutional Neural Networks For Polyp Segmentation
Kristoffer Knutsen Wickstrøm, Michael Kampffmeyer, Robert Jenssen

Convolutional Neural Networks (CNNs) are propelling advances in a range of different computer vision tasks such as object detection and object segmentation. Their success has motivated research in applications of such models for medical image analysis. If CNN-based models are to be helpful in a medical context, they need to be precise, interpretable, and uncertainty in predictions must be well understood. In this paper, we develop and evaluate recent advances in uncertainty estimation and model interpretability in the context of semantic segmentation of polyps from colonoscopy images. We evaluate and enhance several architectures of Fully Convolutional Networks (FCNs) for semantic segmentation of colorectal polyps and provide a comparison between these models. Our highest performing model achieves a 76.06% mean IOU accuracy on the EndoScene dataset, a considerable improvement over the previous state-of-the-art.
 
15:00

We propose a new algorithm for inference of gene regulatory networks (GRN) from noisy gene expression data based on maximum-likelihood (ML) adaptive filtering and the discrete fish school search algorithm (DFSS). The approach is based on the general partially-observed Boolean dynamical system (POBDS) model, and as such can be used for simultaneous state and parameter estimation for any Boolean dynamical system observed in noise. The proposed DFSS-ML-BKF algorithm combines the ML adaptive Boolean Kalman Filter (ML-BKF) with DFSS, a version of the Fish School Search algorithm tailored for discrete parameter spaces. Results based on synthetic gene expression time-series data using the well-known p53-MDM2 negative-feedback loop GRN demonstrate that DFSS-ML-BKF can infer the network topology accurately and efficiently.
 
15:00

Type 1 Diabetes is characterized by the lack of insulin-producing beta cells in the pancreas. The artificial pancreas promises to alleviate the burdens of self-management. While the physical components of the system – the continuous glucose monitor and insulin pump – have experienced rapid advances, a technological bottleneck remains in the control algorithm, which is responsible for translating data from the former into instructions for the latter. In this work, we propose to bring machine learning techniques to bear upon the challenges of blood glucose control. Specifically, we employ reinforcement learning to learn an optimal insulin policy. Learning is generalized using nonparametric regression with functional features, exploiting information contained in the shape of the glucose curve. Our algorithm is model-free, data-driven and personalized. In-silico simulations with T1D models demonstrate the potential of the proposed algorithm.
 
15:00
Chronic Wound Tissue Classification Using Convolutional Networks And Color Space Reduction
Vitor Godeiro, José Francisco Silva Neto, Bruno Motta De Carvalho, Julianny Ferraz, Bruno Santana, Renata Antonaci Gama

Chronic wounds are ulcers presenting a difficult or nearly interrupted cicatrization process, which increases the risk of complications to patients' health, such as amputation and infections. This research proposes a general noninvasive methodology for the segmentation and analysis of chronic wound images by computing the wound areas affected by necrosis. Invasive techniques, such as manual planimetry with plastic films, are usually used for this calculation. We investigated algorithms to perform the segmentation of wounds as well as the use of several convolutional networks for classifying tissue as Necrotic, Granulation or Slough. We tested four architectures: U-Net, SegNet, FCN8 and FCN32, and proposed a color space reduction methodology that increased the reported accuracies, specificities, sensitivities and Dice coefficients for all four networks, achieving very good levels.
17:00-17:30Kilden, 2nd floorClosing Ceremony
Powered by CONWIZ, © Copyright 2018 | Privacy Policy and Terms of Use | Contact Page Editor