On the gap between restricted isometry properties and sparse recovery conditions

by Sjoerd Dirksen, Guillaume Lecué & Holger Rauhut

Download the paper

The digital commons: a political and economic game-changer

by Henri Verdier & Charles Murciano
This paper addresses the current intellectual and legal status of the commons. Specifically, we explore the notion of the digital (or information) commons and its specificities with regard to the classic concept of the commons in economic theory. The digital commons concerns new ways of administering an information resource by a community, made possible by information and communications technology. It constitutes a means of sharing socially valued resources.

Economists agree on a classic conception of common goods, designating a rival and non-exclusive resource. Because the digital commons is immaterial, this definition is unsatisfactory. In addition, the work of Elinor Ostrom emphasizes the duality of the commons: both a resource used in common and a property-rights regime running counter to the paradigm of private property. Our paper thus examines this ambiguity in the digital age.

Careful study of examples of digital commons promoted by the state can clarify the contributory logic at work in these commons. We argue that this approach enables a new form of public action. Allied with the “multitude”, the public authorities could, by nurturing and supervising the commons, arm themselves against the growing hegemony of big monopolistic platforms, whose logic is increasingly opposed to that of the state.

Download the paper

Optimal Real-Time Bidding Strategies

by Joaquin Fernandez-Tapia, Olivier Guéant & Jean-Michel Lasry

The ad-trading desks of media-buying agencies are increasingly relying on complex algorithms for purchasing advertising inventory. In particular, Real-Time Bidding (RTB) algorithms respond to many auctions – usually Vickrey auctions – throughout the day for buying ad-inventory with the aim of maximizing one or several key performance indicators (KPI). The optimization problems faced by companies building bidding strategies are new and interesting for the community of applied mathematicians. In this article, we introduce a stochastic optimal control model that addresses the question of the optimal bidding strategy in various realistic contexts: the maximization of the inventory bought with a given amount of cash in the framework of audience strategies, the maximization of the number of conversions/acquisitions with a given amount of cash, etc. In our model, the sequence of auctions is modeled by a Poisson process and the price to beat for each auction is modeled by a random variable following almost any probability distribution. We show that the optimal bids are characterized by a Hamilton-Jacobi-Bellman equation, and that almost-closed form solutions can be found by using a fluid limit. Numerical examples are also carried out.

Keywords: Real-Time Bidding, Vickrey auctions, Stochastic optimal control, Convex analysis, Fluid limit approximation.
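As a toy illustration of the setting (not of the authors' HJB-based strategy), one can simulate a stream of second-price auctions arriving as a Poisson process and track the spend and win rate of a deliberately naive constant bid; the horizon, intensity, budget and bid level below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy simulation of the auction stream: auctions arrive as a Poisson
# process and each is a second-price (Vickrey) auction against a random
# price to beat. The bid here is a naive constant, not an optimal strategy.
T = 1.0              # trading horizon (hypothetical units)
intensity = 1000.0   # Poisson arrival rate of auctions
budget = 50.0        # available cash
bid = 0.12           # constant bid level

n_auctions = rng.poisson(intensity * T)
prices_to_beat = rng.lognormal(mean=-2.5, sigma=0.5, size=n_auctions)

wins, spend = 0, 0.0
for p in prices_to_beat:
    if bid > p and spend + p <= budget:
        wins += 1
        spend += p          # second-price rule: pay the price to beat
win_rate = wins / max(n_auctions, 1)
```

An optimal strategy would instead modulate the bid over time as a function of remaining cash and time, which is exactly what the Hamilton-Jacobi-Bellman characterization delivers.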

Download the paper

Correlations of Correlations Are Not Reliable Statistics: Implications for Multivariate Pattern Analysis

by Bertrand Thirion, Fabian Pedregosa, Michael Eickenberg & Gaël Varoquaux

Representational Similarity Analysis is a popular framework to flexibly represent the statistical dependencies between multi-voxel patterns on the one hand, and sensory or cognitive stimuli on the other hand. It has been used in an inferential framework, whereby significance is given by a permutation test on the samples. In this paper, we outline an issue with this statistical procedure: namely that the so-called pattern similarity used can be influenced by various effects, such as noise variance, which can lead to inflated type I error rates. What we propose is to rely instead on proper linear models.
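A minimal sketch of the permutation procedure discussed above, on synthetic data: build correlation-distance representational dissimilarity matrices (RDMs) for two pattern sets, take the correlation of their off-diagonal entries (the "correlation of correlations"), and compare it to a label-permutation null. All sizes and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def rdm(patterns):
    """Representational dissimilarity matrix: 1 minus the Pearson
    correlation between every pair of condition patterns (rows)."""
    return 1.0 - np.corrcoef(patterns)

# Hypothetical data: 8 conditions x 50 voxels for two measurements,
# the second being a noisier version of the first.
a = rng.normal(size=(8, 50))
b = a + rng.normal(scale=2.0, size=(8, 50))

ra, rb = rdm(a), rdm(b)
iu = np.triu_indices(8, k=1)                      # off-diagonal entries
observed = np.corrcoef(ra[iu], rb[iu])[0, 1]      # correlation of correlations

# Permutation test: relabel the conditions of one RDM and recompute.
null = []
for _ in range(500):
    perm = rng.permutation(8)
    null.append(np.corrcoef(ra[perm][:, perm][iu], rb[iu])[0, 1])
p_value = float(np.mean(np.array(null) >= observed))
```

The paper's point is that statistics of this form are sensitive to nuisance effects such as per-condition noise variance, so the resulting p-values need not be valid; a linear model on the patterns themselves avoids the issue.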

Download the paper

On the Consistency of Ordinal Regression Methods

by Fabian Pedregosa, Francis Bach & Alexandre Gramfort

Many of the ordinal regression models that have been proposed in the literature can be seen as methods that minimize a convex surrogate of the zero-one, absolute, or squared loss functions. A key property that allows one to study the statistical implications of such approximations is that of Fisher consistency. In this paper we characterize the Fisher consistency of a rich family of surrogate loss functions used in the context of ordinal regression, including support vector ordinal regression, ORBoosting and least absolute deviation. We show that, for a family of surrogate loss functions that subsumes support vector ordinal regression and ORBoosting, consistency can be fully characterized by the derivative of a real-valued function at zero, as happens for convex margin-based surrogates in binary classification. We also derive excess risk bounds for a surrogate of the absolute error that generalize existing risk bounds for binary classification. Finally, our analysis suggests a novel surrogate of the squared error loss. To assess the empirical performance of this surrogate, we benchmark it in terms of cross-validation error on 9 different datasets, where it outperforms competing approaches on 7 out of 9 datasets.
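One concrete fact underlying consistency for the absolute loss can be checked numerically: the prediction minimizing the expected absolute error over ordered labels is a median of the conditional label distribution, which is precisely the target a Fisher-consistent surrogate must recover. The distribution below is hypothetical.

```python
import numpy as np

# With labels 1..5 and conditional distribution p_j = P(y = j | x),
# the prediction minimizing the expected absolute error is a median of
# that distribution. The probabilities below are hypothetical.
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
labels = np.arange(1, 6)

expected_abs = np.array([np.sum(p * np.abs(labels - k)) for k in labels])
best = int(labels[np.argmin(expected_abs)])      # minimizer of expected |y - k|

cdf = np.cumsum(p)
median = int(labels[np.searchsorted(cdf, 0.5)])  # a median of the distribution
```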

Download the paper

Agents’ Behavior on Multi-Dealer-to-Client Bond Trading Platforms

by Jean-David Fermanian, Olivier Guéant & Arnaud Rachez

For the last two decades, most financial markets have undergone an evolution toward electronification. The market for corporate bonds is one of the last major financial markets to follow this unavoidable path. Traditionally quote-driven (that is, dealer-driven) rather than order-driven, the market for corporate bonds is still mainly dominated by voice trading, but many electronic platforms have emerged that make it possible for buy-side agents to simultaneously request quotes from several dealers, or even to trade directly with other buy-siders. The research presented in this article is based on a large proprietary database of requests for quotes (RFQ) sent, through the multi-dealer-to-client (MD2C) platforms operated by Bloomberg Fixed Income Trading and Tradeweb, to one of the major liquidity providers in European corporate bonds. Our goals are (i) to model the RFQ process on these platforms and the resulting competition between dealers, (ii) to use the RFQ database to infer from our model the behavior of both dealers and clients on MD2C platforms, and (iii) to study the influence of several bond characteristics on the behavior of market participants.

Download the paper

A Smoothed Dual Approach for Variational Wasserstein Problems


by Marco Cuturi, Gabriel Peyré

Variational problems that involve Wasserstein distances have been recently proposed to summarize and learn from probability measures. Despite being conceptually simple, such problems are computationally challenging because they involve minimizing over quantities (Wasserstein distances) that are themselves hard to compute. We show that the dual formulation of Wasserstein variational problems introduced recently by Carlier et al. (2014) can be regularized using an entropic smoothing, which leads to smooth, differentiable, convex optimization problems that are simpler to implement and numerically more stable. We illustrate the versatility of this approach by applying it to the computation of Wasserstein barycenters and gradient flows of spatial regularization functionals.
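To see entropic smoothing at work, here is a generic Sinkhorn iteration for entropy-regularized optimal transport between two histograms. This is a standard sketch of the entropy-regularized approach, not the authors' dual barycenter algorithm; the grid size and regularization strength are hypothetical.

```python
import numpy as np

# Sinkhorn iterations for entropy-regularized optimal transport between
# two histograms on a 1-D grid. Smoothing the problem with an entropic
# term turns it into a simple alternating scaling scheme.
n = 50
x = np.linspace(0.0, 1.0, n)
mu = np.exp(-((x - 0.3) ** 2) / 0.01)
mu /= mu.sum()
nu = np.exp(-((x - 0.7) ** 2) / 0.02)
nu /= nu.sum()

C = (x[:, None] - x[None, :]) ** 2      # squared-distance cost matrix
eps = 0.05                               # entropic regularization strength
K = np.exp(-C / eps)                     # Gibbs kernel

u = np.ones(n)
for _ in range(1000):                    # alternating marginal scalings
    v = nu / (K.T @ u)
    u = mu / (K @ v)

plan = u[:, None] * K * v[None, :]       # regularized transport plan
cost = float(np.sum(plan * C))           # smoothed transport cost
```

The smoothed problem is differentiable in the marginals, which is what makes it usable inside variational objectives such as barycenters.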

Download the paper

Estimation of matrices with row sparsity

by O. Klopp & A. B. Tsybakov

A growing number of applications involve recovering a sparse matrix from noisy observations. In this paper, we consider the setting where each row of the unknown matrix is sparse. We establish minimax optimal rates of convergence for estimating matrices with row sparsity. A major focus of the present paper is the derivation of lower bounds.
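A simple estimator in a closely related setting (every non-null row dense, all other rows zero) can be sketched as row-wise group soft-thresholding of the noisy observation, which zeroes out rows whose norm falls below a threshold. This is only illustrative, not the paper's minimax-optimal procedure; dimensions, noise level and threshold are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Row-wise group soft-thresholding of a noisy matrix: rows whose norm
# falls below the threshold tau are set to zero, the others are shrunk.
m, T, s = 20, 30, 3                  # rows, columns, number of non-null rows
A = np.zeros((m, T))
A[:s] = rng.normal(loc=2.0, size=(s, T))
Y = A + 0.5 * rng.normal(size=(m, T))     # noisy observation

tau = 0.5 * np.sqrt(T) + 2.0              # threshold ~ noise level * sqrt(T)
norms = np.linalg.norm(Y, axis=1)
shrink = np.maximum(1.0 - tau / norms, 0.0)
A_hat = shrink[:, None] * Y
support = set(np.flatnonzero(np.linalg.norm(A_hat, axis=1) > 0))
```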

Download the paper

Posterior concentration rates for counting processes with Aalen multiplicative intensities

by Sophie Donnet, Vincent Rivoirard, Judith Rousseau and Catia Scricciolo

We provide general conditions to derive posterior concentration rates for Aalen counting processes. The conditions are designed to resemble those proposed in the literature for the problem of density estimation, for instance in Ghosal et al. (2000), so that existing results on density estimation can be adapted to the present setting. We apply the general theorem to some prior models, including Dirichlet process mixtures of uniform densities, to estimate monotone non-increasing intensities and log-splines.

Download the paper

Optimal Exponential Bounds on the Accuracy of Classification

by G. Kerkyacharian, A. B. Tsybakov, V. Temlyakov, D. Picard & V. Koltchinskii

Consider a standard binary classification problem, in which (X, Y) is a random couple in X × {0, 1}, and the training data consist of n i.i.d. copies of (X, Y). Given a binary classifier f : X → {0, 1}, the generalization error of f is defined by R(f) = P{Y ≠ f(X)}. Its minimum R∗ over all binary classifiers f is called the Bayes risk and is attained at a Bayes classifier. The performance of any binary classifier f̂n based on the training data is characterized by the excess risk R(f̂n) − R∗. We study Bahadur-type exponential bounds on the following minimax accuracy confidence function based on the excess risk: (…)
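The quantities defined above can be illustrated on a toy discrete distribution; the values of P(X = x) and η(x) = P(Y = 1 | X = x) below are hypothetical.

```python
import numpy as np

# Hypothetical finite setting: three values of X with marginal
# probabilities px and regression function eta(x) = P(Y = 1 | X = x).
px = np.array([0.5, 0.3, 0.2])
eta = np.array([0.9, 0.4, 0.2])

bayes = (eta > 0.5).astype(int)                              # Bayes classifier
bayes_risk = float(np.sum(px * np.minimum(eta, 1 - eta)))    # R*

f = np.array([1, 1, 0])                                      # some other classifier
risk_f = float(np.sum(px * np.where(f == 1, 1 - eta, eta)))  # R(f) = P{Y != f(X)}
excess = risk_f - bayes_risk                                 # excess risk, >= 0
```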

Download the paper

Minimax rate of convergence and the performance of empirical risk minimization in phase retrieval

by Guillaume Lecué and Shahar Mendelson

We study the performance of Empirical Risk Minimization in both noisy and noiseless phase retrieval problems, indexed by subsets of ℝⁿ and relative to subgaussian sampling; that is, when the given data are yᵢ = ⟨aᵢ, x₀⟩² + wᵢ for a subgaussian random vector a, independent subgaussian noise w, and a fixed but unknown x₀ that belongs to a given T ⊂ ℝⁿ.
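A minimal sketch of empirical risk minimization in the noiseless case: minimize the empirical squared loss over x by gradient descent from a spectral initialization. This is a standard solver sketch with hypothetical dimensions and step size; the paper's contribution is the statistical analysis of ERM, not a particular solver.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gradient descent on the empirical squared loss for noiseless phase
# retrieval, started from a spectral initialization.
n_dim, m = 10, 200
x0 = rng.normal(size=n_dim)
x0 /= np.linalg.norm(x0)
A = rng.normal(size=(m, n_dim))            # (sub)gaussian sampling vectors
y = (A @ x0) ** 2                          # noiseless data y_i = <a_i, x0>^2

# Spectral initialization: top eigenvector of (1/m) sum_i y_i a_i a_i^T
M = (A.T * y) @ A / m
evals, evecs = np.linalg.eigh(M)
x = evecs[:, -1] * np.sqrt(np.mean(y))

for _ in range(500):                       # gradient descent on the ERM objective
    r = (A @ x) ** 2 - y
    x -= 0.01 * (4.0 / m) * (A.T @ (r * (A @ x)))

err = min(np.linalg.norm(x - x0), np.linalg.norm(x + x0))   # global sign ambiguity
```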


Download the paper

Aggregation and minimax approach in high-dimensional estimation

by Alexandre B. Tsybakov

Given a collection of estimators, the problem of linear, convex or model selection type aggregation consists in constructing a new estimator, called the aggregate, which is nearly as good as the best among them (or nearly as good as their best linear or convex combination), with respect to a given risk criterion. When the underlying model is sparse, which means that it is well approximated by a linear combination of a small number of functions in the dictionary, aggregation techniques turn out to be very useful in taking advantage of sparsity. On the other hand, aggregation is a general technique for producing adaptive nonparametric estimators, which is more powerful than the classical methods since it allows one to combine estimators of different nature. Aggregates are usually constructed by mixing the initial estimators or functions of the dictionary with data-dependent weights that can be computed in several possible ways. An important example is given by aggregates with exponential weights. They satisfy sharp oracle inequalities that allow one to treat in a unified way three different problems: adaptive nonparametric estimation, aggregation and sparse estimation.
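A minimal sketch of aggregation with exponential weights: each candidate in a dictionary of estimators is weighted by exp(−risk/β) and the weights are normalized. The regression setup, dictionary and temperature below are hypothetical; the paper's oracle inequalities concern carefully calibrated versions of this scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential-weights aggregation over a small dictionary of candidate
# estimators, weighted by their empirical risks.
x = np.linspace(0.0, 1.0, 200)
truth = np.sin(2 * np.pi * x)
y = truth + 0.3 * rng.normal(size=x.size)

# Dictionary: sinusoids at several frequencies
dictionary = np.array([np.sin(2 * np.pi * k * x) for k in (1, 2, 3, 4)])

risks = np.mean((y - dictionary) ** 2, axis=1)   # empirical risks
beta = 0.1                                       # temperature
weights = np.exp(-risks / beta)
weights /= weights.sum()

aggregate = weights @ dictionary                 # the aggregate estimator
```

Because the weights concentrate exponentially fast on low-risk candidates, the aggregate behaves nearly as well as the best element of the dictionary.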

Download the paper

Adaptive Lasso and group-Lasso for functional Poisson regression

by S. Ivanoff, F. Picard & V. Rivoirard

High-dimensional Poisson regression has become a standard framework for the analysis of massive count datasets. In this work we estimate the intensity function of the Poisson regression model by using a dictionary approach, which generalizes the classical basis approach, combined with a Lasso or a group-Lasso procedure. Selection depends on penalty weights that need to be calibrated. Standard methodologies developed in the Gaussian framework cannot be directly applied to Poisson models due to heteroscedasticity. Here we provide data-driven weights for the Lasso and the group-Lasso derived from concentration inequalities adapted to the Poisson case. We show that the associated Lasso and group-Lasso procedures are theoretically optimal in the oracle approach. Simulations are used to assess the empirical performance of our procedure, and an original application to the analysis of Next Generation Sequencing data is provided.
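The weighted-ℓ1 idea can be sketched with a proximal-gradient solver for Poisson regression. The weights below are a crude variance-based plug-in chosen for illustration, not the concentration-based weights derived in the paper; all dimensions and constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Proximal-gradient solver for Poisson regression with a weighted l1
# penalty: minimize sum_i [exp(x_i @ b) - y_i * (x_i @ b)] + sum_j w_j |b_j|.
n, d = 200, 10
X = rng.normal(scale=0.3, size=(n, d))
b_true = np.zeros(d)
b_true[:2] = [1.0, -0.8]
y = rng.poisson(np.exp(X @ b_true))

w = 2.0 * np.sqrt(np.maximum((X ** 2).T @ y, 1.0))   # heuristic penalty weights

def soft_threshold(u, t):
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

b = np.zeros(d)
step = 1e-3
for _ in range(5000):
    grad = X.T @ (np.exp(X @ b) - y)     # gradient of the Poisson loss
    b = soft_threshold(b - step * grad, step * w)

def objective(beta):
    z = X @ beta
    return np.sum(np.exp(z) - y * z) + np.sum(w * np.abs(beta))
```

The coordinate-wise weights w_j play the role of the calibrated penalties: larger observed variance along a coordinate demands a larger penalty, which is exactly the heteroscedasticity issue the paper addresses.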

Download the paper

Estimation of Low-Rank Covariance Function

by V. Koltchinskii, K. Lounici & A. B. Tsybakov

We consider the problem of estimating a low-rank covariance function K(t, u) of a Gaussian process S(t), t ∈ [0, 1], based on n i.i.d. copies of S observed in white noise. We suggest a new estimation procedure adapting simultaneously to the low-rank structure and the smoothness of the covariance function. The new procedure is based on nuclear-norm penalization and outperforms the sample covariance function by a polynomial factor in the sample size n. Other results include a minimax lower bound for estimation of low-rank covariance functions, showing that our procedure is optimal, as well as a scheme to estimate the unknown noise variance of the Gaussian process.
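A rough numerical analogue of nuclear-norm shrinkage on a grid: form the sample covariance of noisy sample paths and soft-threshold its eigenvalues, which recovers the low-rank structure. The grid size, rank and threshold below are hypothetical, and the paper's estimator additionally adapts to the smoothness of the covariance function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Eigenvalue soft-thresholding of a sample covariance matrix on a grid:
# a crude analogue of nuclear-norm penalization for a rank-2 covariance
# observed through n noisy sample paths.
t = np.linspace(0.0, 1.0, 40)
phi = np.vstack([np.sin(np.pi * t), np.sin(2 * np.pi * t)])   # two factors
K_true = phi.T @ phi                      # rank-2 covariance on the grid

n = 100
Z = rng.normal(size=(n, 2))               # factor loadings per path
S = Z @ phi + 0.3 * rng.normal(size=(n, t.size))   # paths in white noise

K_hat = (S.T @ S) / n                     # sample covariance
evals, evecs = np.linalg.eigh(K_hat)
evals_shrunk = np.maximum(evals - 0.5, 0.0)        # soft-threshold the spectrum
K_low = (evecs * evals_shrunk) @ evecs.T
rank = int(np.sum(evals_shrunk > 0))
```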

Download the paper