B. Kégl

Introduction to
multivariate discrimination

In T. Delemontex and A. Lucotte, editors, SOS
2012 – IN2P3 School of Statistics. EPJ Web of Conferences, 2013.

For physicists and other practitioners.

B. Kégl

Open problem: A
(missing) boosting-type convergence result for AdaBoost.MH with factorized
multi-class classifiers

In Conference on Computational Learning Theory (open problems
track), 2014.

In the companion paper we showed that AdaBoost.MH works very well in practice with factorized base classifiers. This note states the open problem of the exponential convergence of the algorithm.

B. Kégl

The return of
AdaBoost.MH: multi-class Hamming trees

In International Conference on Learning Representations, 2014.

We train vector-valued decision trees within the framework of AdaBoost.MH. The key element of the method is a vector-valued decision stump, factorized into an input-independent vote vector of length K and a label-independent scalar classifier.
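The factorization can be sketched in a few lines. This is an illustrative toy, not the paper's code: the function names and the NumPy formulation are my own, and training (choosing the feature, threshold, and vote vector) is omitted.

```python
import numpy as np

def scalar_stump(x, feature, threshold):
    """Label-independent binary stump: +1 if the feature exceeds the threshold."""
    return 1.0 if x[feature] > threshold else -1.0

def factorized_stump(x, v, feature, threshold):
    """Vector-valued output over K classes: the vote vector times the scalar stump."""
    return v * scalar_stump(x, feature, threshold)

v = np.array([+1.0, -1.0, -1.0])      # input-independent votes for K = 3 classes
x = np.array([0.2, 0.7])
print(factorized_stump(x, v, feature=1, threshold=0.5))  # → [ 1. -1. -1.]
```

The point of the factorization is that the K-dimensional output is driven by a single scalar cut, so the stump can be trained almost as cheaply as a binary one.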

I. Hegedüs,
R. Busa-Fekete, R. Ormándi, M. Jelasity, and B. Kégl

Peer-to-peer multi-class boosting

In International European Conference on Parallel and Distributed
Computing, pages 389–400, June 2012.

D. Benbouzid,
R. Busa-Fekete, and B. Kégl

Fast
classification using sparse decision DAGs

In International Conference on Machine Learning, Volume 29, June
2012.

We build sparse decision DAGs (directed acyclic graphs) from a list of base classifiers using a Markov decision process (MDP). Each instance can decide to use or to skip each base classifier, based on the current state of the classifier being built. The algorithm clearly outperforms state-of-the-art cascade detectors, and it is readily applicable to multi-class classification.
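The per-instance data path can be sketched as follows. In the paper the skip/quit policy is learned by reinforcement learning on the MDP; here it is a hand-coded stand-in, and all names are illustrative.

```python
def classify(x, base_classifiers, policy):
    """Evaluate base classifiers in sequence; the policy may skip or quit early."""
    score, cost = 0.0, 0
    for i, h in enumerate(base_classifiers):
        action = policy(i, score)          # 'eval', 'skip', or 'quit'
        if action == 'quit':
            break
        if action == 'eval':
            score += h(x)
            cost += 1
    return (1 if score >= 0 else -1), cost

# Hypothetical policy: stop as soon as the score is confidently positive,
# so easy instances pay for only a few base-classifier evaluations.
policy = lambda i, score: 'quit' if score >= 2 else 'eval'
hs = [lambda x: 1.0] * 5
label, cost = classify(3.0, hs, policy)
print(label, cost)  # → 1 2
```

The sparsity comes from the fact that, unlike a cascade, different instances may traverse different subsets of the classifier list.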

D. Benbouzid,
R. Busa-Fekete, N. Casagrande, F.-D. Collin, and B. Kégl

MultiBoost: a
multi-purpose boosting package

Journal of Machine Learning Research, 13:549–553, 2012.

The MultiBoost package provides a fast C++ implementation of multi-class/multi-label/multi-task boosting algorithms. It is based on AdaBoost.MH but it also implements popular cascade classifiers and FilterBoost. The package contains common multi-class base learners (stumps, trees, products, Haar filters). Further base learners and strong learners following the boosting paradigm can be easily implemented in a flexible framework.

R. Busa-Fekete and
B. Kégl

Fast boosting using
adversarial bandits

In International Conference on Machine Learning, Volume 27, pages
143–150, June 2010.

We improved on our previous bandit boosting paper by using an adversarial bandit algorithm instead of stochastic bandits. The practical performance is comparable but this choice allows us to prove a weak-to-strong-learning theorem, which means that the proposed technique remains a boosting algorithm in a formal sense.

R. Busa-Fekete and
B. Kégl

Accelerating
AdaBoost using UCB

In KDDCup 2009 (JMLR workshop and conference proceedings),
Volume 7, pages 111–122, 2009.

We show how to accelerate AdaBoost using multi-armed bandits (MABs). The basic idea is that we gradually learn the utility of the features (or, in general, subsets of base learners) using MABs, and use this utility to bias the random selection in each boosting iteration.
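The selection mechanism can be sketched with plain UCB1. This is a toy under stated assumptions: arms stand for features (or subsets of base learners), and the reward function below is a noisy stand-in for the utility learned in the paper, not its actual definition.

```python
import math, random

def ucb1_select(counts, rewards, t):
    """Pick the arm maximizing average reward plus the UCB1 exploration bonus."""
    for arm in range(len(counts)):
        if counts[arm] == 0:
            return arm                      # play every arm once first
    return max(range(len(counts)),
               key=lambda a: rewards[a] / counts[a]
                             + math.sqrt(2 * math.log(t) / counts[a]))

random.seed(0)
true_utility = [0.1, 0.3, 0.9]             # hypothetical per-feature utilities
counts, rewards = [0] * 3, [0.0] * 3
for t in range(1, 1001):
    arm = ucb1_select(counts, rewards, t)
    counts[arm] += 1
    rewards[arm] += true_utility[arm] + random.gauss(0, 0.1)
print(counts)   # the high-utility feature (arm 2) is pulled most often
```

In the boosting setting the chosen arm determines where the base learner searches in that iteration, so most of the training time is spent on features that have historically produced good edges.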

B. Kégl and
R. Busa-Fekete

Boosting products of
base classifiers

In International Conference on Machine Learning, Volume 26, pages
497–504, June 2009.

We show how to boost products of base learners. Similarly to trees, we call the base learner as a subroutine but in an iterative rather than recursive fashion. In experiments on relatively large data sets we clearly outperform boosted trees. We also propose an improved base learner for nominal features and show that boosting the product of two of these new subset indicator base learners solves the maximum margin matrix factorization problem.
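The combination rule of a product base classifier can be sketched as follows (a toy illustration; training of the individual stumps, done iteratively with the other terms fixed, is omitted):

```python
def stump(x, feature, threshold):
    """Binary decision stump returning ±1."""
    return 1.0 if x[feature] > threshold else -1.0

def product_classifier(x, stumps):
    """The base classifier's output is the product of its member stumps."""
    out = 1.0
    for feature, threshold in stumps:
        out *= stump(x, feature, threshold)
    return out

# A product of two stumps on different features implements an XOR-like
# decision that no single stump can express:
for x in ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):
    print(product_classifier(x, [(0, 0.5), (1, 0.5)]))  # → 1.0 -1.0 -1.0 1.0
```

This is the sense in which the base learner is called as a subroutine iteratively rather than recursively: each call refits one factor of the product instead of splitting the data as a tree would.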

S. Gambs, B. Kégl,
and E. Aïmeur

Privacy-preserving
boosting

Data Mining and Knowledge Discovery, 14(1):131–170, 2007.

How to AdaBoost when the data set is divided among two or more participants who do not want to share possibly sensitive details of their data.

B. Kégl and
L. Wang

Boosting
on manifolds: adaptive regularization of base classifiers

In Advances in Neural Information Processing Systems, Volume 17,
pages 665–672. The MIT Press, Dec. 2004.

How to use an additive regularization penalty at the base-learner level, and what the implied global objective function is.

B. Kégl

Generalization
error and algorithmic convergence of median boosting

In Advances in Neural Information Processing Systems, Volume 17,
pages 657–664. The MIT Press, Dec. 2004.

B. Kégl

Robust regression by
boosting the median

In 16th Conference on Computational Learning Theory, pages
258–272, Aug. 2003.

B. Kégl

Confidence-rated
regression by boosting the median

Technical Report 1241, Department of Computer Science, University of Montreal,
2003.

This TR contains all the detailed information on MedBoost published in the two conference papers above. It is interesting that the "natural" extension of AdaBoost to regression implies that one should use the weighted median of the base regressors instead of the weighted mean usually used in additive models. By "natural" I mean that you can prove a weak-to-strong-learning theorem. While this was already noted in Freund and Schapire's original AdaBoost paper, the original algorithm is not practical without using Rätsch and Warmuth's marginal boosting idea.
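The weighted median described above can be sketched for a single prediction: sort the base-regressor outputs and return the first value at which the cumulative weight reaches half of the total weight.

```python
def weighted_median(values, weights):
    """Weighted median of base-regressor outputs (ties broken at the lower value)."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum >= total / 2:
            return v

# Outlier-robust: the wild prediction 100.0 carries weight but cannot
# drag the combined output the way a weighted mean would.
print(weighted_median([1.0, 2.0, 100.0], [0.4, 0.4, 0.2]))  # → 2.0
```

The contrast with the weighted mean (which here would be 21.2) is exactly why the median combination is the robust choice for regression.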

R. Busa-Fekete,
B. Kégl, T. Éltető, and Gy. Szarvas

Tune and mix:
learning to rank using ensembles of calibrated multi-class
classifiers

Machine Learning Journal, 93(2):261–292, 2013.

We show that a point-wise ranking approach (based on multi-class classification, calibration, and mixing models) is competitive with more complex pairwise and listwise methods, especially on large data sets. Our extensive experimental study is itself an important contribution: we compare most of the existing learning-to-rank techniques on all of the available large-scale benchmark data sets using a standardized implementation of the NDCG score.

R. Busa-Fekete,
B. Kégl, T. Éltető, and Gy. Szarvas

An apple-to-apple comparison of learning-to-rank algorithms in terms of
Normalized Discounted Cumulative Gain

In ECAI Workshop on Preference Learning, August 2012.

R. Busa-Fekete,
B. Kégl, T. Éltető, and Gy. Szarvas

A robust ranking
methodology based on diverse calibration of AdaBoost

In European Conference on Machine Learning, Volume 22, pages
263–279, 2011.

We show that a simple boosting-based pointwise ensemble approach with a slight listwise touch at the last combination step is competitive with sophisticated (and computationally expensive) ranking techniques.

R. Busa-Fekete,
B. Kégl, T. Éltető, and Gy. Szarvas

Ranking by
calibrated AdaBoost

In Yahoo! Ranking Challenge 2010 (JMLR workshop and conference
proceedings), Volume 14, pages 37–48, 2011.

Our boosting-based pointwise ensemble approach ended up 6th in Track 1 and 11th in Track 2.

A. Roodaki,
R. Bardenet, O. Cappé, and B. Kégl

Detection and estimation of high energy physical particles using Monte
Carlo methods

In GRETSI, Brest, France, September 2013.

R. Bardenet,
M. Brendel, B. Kégl, and M. Sebag

Collaborative
hyperparameter tuning

In International Conference on Machine Learning, June 2013.

We simultaneously tune hyperparameters of a given algorithm on several data sets with a surrogate-based ranking and optimization technique.

R. Bardenet,
O. Cappé, G. Fort, and B. Kégl

Adaptive
Metropolis with online relabeling

In International Conference on Artificial Intelligence and
Statistics, Volume 22, pages 91–99, Apr. 2012.

We propose a novel adaptive MCMC algorithm named AMOR for efficiently simulating from permutation-invariant targets occurring in, for example, Bayesian analysis of mixture models.

R. Bardenet and
B. Kégl

An adaptive
Monte-Carlo Markov chain algorithm for inference from mixture
signals

Journal of Physics: Conference Series, 368(1):012044, 2012.

We describe adaptive Metropolis with online relabeling (AMOR) for modeling mixture signals and illustrate the algorithm on a synthetic mixture model inspired by the muonic water Cherenkov signal of the surface detectors in the Pierre Auger Experiment.

J. Bergstra,
R. Bardenet, Y. Bengio, and B. Kégl

Algorithms for
hyper-parameter optimization

In Advances in Neural Information Processing Systems, Volume 24.
The MIT Press, Dec. 2011.

We optimize hyper-parameters using random search and two new greedy sequential methods based on the expected improvement criterion, and compare the methods on tasks of training neural networks and deep belief networks.
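The expected-improvement criterion that the sequential methods maximize can be written down directly. This is a generic sketch (for minimization), assuming the surrogate provides a Gaussian predictive mean mu and standard deviation sigma at a candidate point; it is not tied to the paper's specific surrogates.

```python
import math

def expected_improvement(mu, sigma, y_best):
    """EI of a candidate under a Gaussian surrogate prediction (minimization)."""
    if sigma == 0.0:
        return max(y_best - mu, 0.0)
    z = (y_best - mu) / sigma
    phi = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)   # standard normal pdf
    Phi = 0.5 * (1 + math.erf(z / math.sqrt(2)))          # standard normal cdf
    return (y_best - mu) * Phi + sigma * phi

# A candidate with the same predicted mean but more uncertainty scores higher,
# which is what makes EI explore:
print(expected_improvement(0.5, 0.3, 0.6) > expected_improvement(0.5, 0.1, 0.6))  # → True
```

Sequential optimization then amounts to repeatedly evaluating the configuration that maximizes this quantity under the current surrogate.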

R. Bardenet and
B. Kégl

Surrogating the
surrogate: accelerating Gaussian-process-based global optimization with a
mixture cross-entropy algorithm

In International Conference on Machine Learning, Volume 27, pages
55–62, June 2010.

We propose a mixture cross-entropy optimizer and use it for merit function optimization in a Gaussian-process surrogate optimization framework. We outperform the classical single-Gaussian cross-entropy method when the fitness function is highly multimodal, and we improve on standard exhaustive search in GP-based surrogate optimization.
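For context, the plain single-Gaussian cross-entropy method that the paper improves on can be sketched in a few lines: sample a population, keep an elite fraction, refit the Gaussian to the elites, and repeat. (The mixture version replaces the single Gaussian with a Gaussian mixture to track several modes; all names and settings below are illustrative.)

```python
import numpy as np

def cross_entropy_minimize(f, dim, iters=30, pop=100, elite=10, seed=0):
    """Single-Gaussian cross-entropy minimization of f over R^dim."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim) * 2.0
    for _ in range(iters):
        X = rng.normal(mu, sigma, size=(pop, dim))          # sample candidates
        eliteX = X[np.argsort([f(x) for x in X])[:elite]]   # keep the best
        mu, sigma = eliteX.mean(axis=0), eliteX.std(axis=0) + 1e-6  # refit
    return mu

sphere = lambda x: float(np.sum((x - 1.0) ** 2))   # minimum at (1, 1)
print(np.round(cross_entropy_minimize(sphere, dim=2), 2))  # ≈ [1. 1.]
```

On a unimodal merit function this works well; the failure mode motivating the paper is that the single Gaussian collapses onto one basin when the merit function is highly multimodal.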

B. Kégl

Correlation-based
construction of neighborhood and edge features

In International Conference on Learning Representations (workshop
track), 2014.

Motivated by an abstract notion of low-level edge detector filters, we propose a simple method of unsupervised feature construction based on pairwise statistics of features. In the first step, we construct neighborhoods of features by regrouping features that correlate. Then we use these subsets as filters to produce new neighborhood features. Next, we connect neighborhood features that correlate, and construct edge features by subtracting correlated neighborhood features from each other.
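The first step of the construction can be sketched on synthetic data (a toy illustration; the correlation threshold and data are mine, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
# Three features sharing a common source (hence correlated) plus three noise features:
X = np.hstack([base + 0.1 * rng.normal(size=(500, 1)) for _ in range(3)]
              + [rng.normal(size=(500, 3))])

C = np.corrcoef(X, rowvar=False)
neighborhood = np.flatnonzero(np.abs(C[0]) > 0.5)   # features correlating with feature 0
print(neighborhood)                                  # → [0 1 2]: the correlated features group

# The neighborhood acts as a filter producing one new feature:
neighborhood_feature = X[:, neighborhood].mean(axis=1)
```

Edge features would then be differences of correlated neighborhood features, mimicking what a learned edge-detector filter bank produces on images.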

A. N. Gorban,
B. Kégl, D. C. Wunsch, and A. Zinovyev, editors

Principal Manifolds
for Data Visualization and Dimension Reduction

Volume 58 of Lecture Notes in Computational Science and
Engineering

Springer, 2008.

Proceedings of a small workshop organized by Alexander Gorban. The book has a Russian web site that used to come up as the third hit if you type главных компонент ("principal component" in Russian) into Google.

N. Le Roux,
Y. Bengio, P. Lamblin, M. Joliveau, and B. Kégl

Learning
the 2-D topology of images

In Advances in Neural Information Processing Systems, Volume 20,
pages 841–848. The MIT Press, Dec. 2007.

Take a set of images and shuffle their pixels in an arbitrary way but using the same permutation for all the images. The two-dimensional pixel topology can be reconstructed by looking at the pairwise correlations of pixels across the set, using only a few thousand pictures.
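A toy 1-D version of the idea can be run in a few lines: pixels that are spatial neighbors stay more correlated across a set of smooth images than distant pixels, so pairwise correlations alone reveal the topology even after shuffling. (The smoothing model and sizes below are illustrative.)

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_pixels = 2000, 10
noise = rng.normal(size=(n_images, n_pixels + 2))
images = (noise[:, :-2] + noise[:, 1:-1] + noise[:, 2:]) / 3  # smoothing → neighbors correlate

perm = rng.permutation(n_pixels)        # shuffle the pixel order, same for all images
shuffled = images[:, perm]
C = np.abs(np.corrcoef(shuffled, rowvar=False))
np.fill_diagonal(C, 0)

# For each shuffled pixel, its most correlated pixel is its true spatial neighbor:
nearest = C.argmax(axis=1)
print(all(abs(perm[i] - perm[nearest[i]]) == 1 for i in range(n_pixels)))  # → True
```

Chaining these nearest-neighbor relations recovers the 1-D ordering up to reflection; the paper does the analogous reconstruction for the 2-D pixel grid.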

B. Kégl

Intrinsic dimension
estimation using packing numbers

In Advances in Neural Information Processing Systems, Volume 15,
pages 681–688. The MIT Press, Dec. 2002.

A simple algorithm to measure the degrees of freedom in a multivariate data set. I have a web site with Java software. Pretty high impact considering that it's "only" a conference paper.
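A minimal sketch in the spirit of the method: greedily build r-packings of the data at two scales and estimate the dimension from how the packing number scales with r. The greedy packing and the choice of scales below are my own simplifications.

```python
import numpy as np

def packing_number(X, r):
    """Greedy maximal r-packing: keep points pairwise farther than r apart."""
    centers = []
    for x in X:
        if all(np.linalg.norm(x - c) > r for c in centers):
            centers.append(x)
    return len(centers)

rng = np.random.default_rng(0)
plane = rng.uniform(size=(2000, 2))                  # intrinsically 2-D data...
X = np.hstack([plane, np.zeros((2000, 1))])          # ...embedded in 3-D
r1, r2 = 0.05, 0.1
M1, M2 = packing_number(X, r1), packing_number(X, r2)
D = -(np.log(M2) - np.log(M1)) / (np.log(r2) - np.log(r1))
print(round(D, 1))                                    # close to 2, not 3
```

Since the packing number scales roughly as r^(-D) on a D-dimensional set, the two-scale log-ratio recovers the intrinsic dimension regardless of the embedding dimension.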

B. Kégl,
A. Krzyżak, T. Linder, and K. Zeger

Learning and
design of principal curves

IEEE Transactions on Pattern Analysis and Machine Intelligence,
22(3):281–297, 2000.

Algorithm and theory for estimating curves that go through the "middle" of a multivariate "cloud" of points. Arguably my most significant research contribution in terms of impact. I have another paper on the subject in the Applications/Image processing section, and three conference papers (not listed) that overlap with the papers and my thesis. I have a web page dedicated to the subject with Java software that works on most platforms.

B. Kégl

Principal
Curves: Learning, Design, and Applications

PhD thesis, Concordia University, Montreal, Canada, 1999.

A. Antos,
B. Kégl, T. Linder, and G. Lugosi

Data-dependent
margin-based generalization bounds for classification

Journal of Machine Learning Research, 3:73–98, 2003.

A. Krzyżak,
J. Sąsiadek, and B. Kégl

Non-parametric identification of dynamic non-linear systems by a Hermite
series approach

International Journal of Systems Science, 32(10):1261–1285,
2001.

B. Kégl,
A. Krzyżak, and H. Niemann

Radial basis function networks and complexity regularization in function
learning and classification

In International Conference on Pattern Recognition, pages
2/81–86, Sept. 2000.

B. Kégl,
T. Bertin-Mahieux, and D. Eck

Metropolis-Hastings
sampling in a FilterBoost music classifier

In International Workshop on Machine Learning and Music (ICML08
Workshop), July 2008.

J. Bergstra,
N. Casagrande, D. Erhan, D. Eck, and B. Kégl

Aggregate features
and AdaBoost for music classification

Machine Learning Journal, 65(2/3):473–484, 2006.

N. Casagrande,
D. Eck, and B. Kégl

Frame-level audio
feature extraction using AdaBoost

In 6th International Conference on Music Information Retrieval,
pages 345–350, Sept. 2005.

N. Casagrande,
D. Eck, and B. Kégl

Geometry in sound:
A speech/music audio classifier inspired by an image
classifier

In International Computer Music Conference, Volume 17, Sept.
2005.

J. Perez,
C. Germain-Renaud, B. Kégl, and C. Loomis

Multiobjective
reinforcement learning for responsive grids

Journal of Grid Computing, 8(3), 2010.

J. Perez,
C. Germain-Renaud, B. Kégl, and C. Loomis

Responsive elastic computing

In IEEE International Conference on Autonomic Computing, Volume 6,
pages 55–64, June 2009.

J. Perez,
C. Germain-Renaud, B. Kégl, and C. Loomis

Utility-based
reinforcement learning for reactive grids

In IEEE International Conference on Autonomic Computing, Volume 5,
pages 205–206, June 2008.

S. Bouktif,
H. Sahraoui, B. Kégl, D. Azar, and D. Precup

Improving rule set based software quality prediction: A genetic algorithm based
approach

Journal of Object Technology, 3(4):227–241, 2004.

D. Azar,
S. Bouktif, B. Kégl, H. Sahraoui, and D. Precup

Combining and adapting software quality predictive models by genetic
algorithms

In 17th IEEE International Conference on Automated Software
Engineering, pages 285–288, Sept. 2002.

S. Bouktif,
B. Kégl, and H. Sahraoui

Combining software quality predictive models: an evolutionary
approach

In IEEE International Conference on Software Maintenance, pages
385–392, Oct. 2002.

G.Y. Chen and
B. Kégl

Invariant pattern
recognition using contourlets and AdaBoost

Pattern Recognition, 43(3):579–583, 2010.

G.Y. Chen and
B. Kégl

Image denoising
with complex ridgelets

Pattern Recognition, 40(2):578–585, 2007.

B. Kégl and
A. Krzyżak

Piecewise linear
skeletonization using principal curves

IEEE Transactions on Pattern Analysis and Machine Intelligence,
24(1):59–74, 2002.

P. Abreu et
al.

Muons in air showers at
the Pierre Auger Observatory: Measurement of atmospheric production
depth

Physical Review D, 90:012012, 2014.

Auger Collaboration paper.

B. Kégl and for
the Pierre Auger Collaboration

Measurement of the
muon signal using the temporal and spectral structure of the signals in
surface detectors of the Pierre Auger Observatory

In 33rd International Cosmic Ray Conference (ICRC), 2013.

We measure the muonic signal in surface detectors at a core distance of 1 km for showers of 10 EeV primary energy, using different filtering techniques and a multivariate approach.

B. Kégl,
D. Veberič, and D. Garcia Gamez

A
simple estimator of S_{μ19}(1000)

Technical Report 2012-127, Auger Project Technical Note, 2012.

We develop an estimator for the muon fraction using a parametric function of simple statistics extracted from the FADC signal.

D. Veberič,
B. Kégl, and S. Mićanović

Estimating
muon fraction from individual traces of SD stations

Technical Report 2012-126, Auger Project Technical Note, 2012.

B. Kégl,
R. Busa-Fekete, K. Louedec, R. Bardenet, X. Garrido, I.C. Mariş,
D. Monnier-Ragaigne, S. Dagoret-Campagne, and M. Urban

Reconstructing
N_{μ19}(1000)

Technical Report 2011-054, Auger Project Technical Note, 2011.

We develop an estimator for the number of muons in an Auger tank given the FADC signal by tuning the jump method with a multivariate approach. Using the tankwise estimator, we reconstruct the lateral distribution function in an empirical Bayes setup that can be intuitively understood as a constrained fit where constraints are estimated from the data.

R. Bardenet,
B. Kégl, and D. Veberič

Single
muon response: The signal model

Technical Report 2010-110, Auger Project Technical Note, 2010.

The lowest part of the comprehensive generative model for the tank signal.

B. Kégl,
M. Unger, and R. Busa-Fekete

A
nonparametric approach to estimate X_{max} using Gaussian
process regression

Technical Report 2010-034, Auger Project Technical Note, 2010.

B. Kégl and
D. Veberič

Single
muon response: Tracklength

Technical Report 2009-043, Auger Project Technical Note, 2009.

The tank geometry part of the comprehensive generative model for the tank signal.

J. Abraham et
al.

Correlation of the highest
energy cosmic rays with nearby extragalactic objects

Science, 318:938–943, 2007.

Auger Collaboration paper.
Erratum.

The famous AGN correlation paper that made the cover of Science in November 2007, and was selected as one of the ten breakthroughs of 2007. It's not without controversy though.