College of Engineering
https://hdl.handle.net/2144/971
Fri, 30 Jul 2021 18:00:37 GMT
https://hdl.handle.net/2144/42800
A high probability analysis of adaptive SGD with momentum
Li, Xiaoyu; Orabona, Francesco
Stochastic Gradient Descent (SGD) and its variants are the most widely used algorithms in machine learning applications. In particular, SGD with adaptive learning rates and momentum is the industry standard to train deep networks. Despite the enormous success of these methods, our theoretical understanding of these variants in the non-convex setting is not complete, with most of the results only proving convergence in expectation and with strong assumptions on the stochastic gradients. In this paper, we present a high probability analysis for adaptive and momentum algorithms, under weak assumptions on the function, stochastic gradients, and learning rates. We use it to prove for the first time the convergence of the gradients to zero in high probability in the smooth non-convex setting for Delayed AdaGrad with momentum.
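A minimal sketch of one plausible form of the analyzed update (parameter names `eta`, `mu`, `eps` are illustrative, not the paper's notation): the "delayed" step size at time t uses only the squared gradients accumulated strictly before t, combined with a heavy-ball momentum buffer.

```python
import numpy as np

def delayed_adagrad_momentum(grad, x0, T, eta=0.1, mu=0.9, eps=1.0):
    """Sketch of Delayed AdaGrad with momentum (illustrative form only):
    the coordinate-wise step size at time t excludes the current gradient."""
    x = x0.astype(float)
    m = np.zeros_like(x)          # momentum buffer
    g2_sum = np.zeros_like(x)     # accumulated squared gradients
    for _ in range(T):
        g = grad(x)
        step = eta / np.sqrt(eps + g2_sum)   # "delayed": current g excluded
        m = mu * m + step * g
        x = x - m
        g2_sum += g * g           # only now fold in the current gradient
    return x

# toy smooth problem f(x) = 0.5 * ||x||^2, whose gradient is x itself
x_final = delayed_adagrad_momentum(lambda x: x, np.array([1.0, -2.0]), T=300)
```

On this toy quadratic the iterates contract toward the minimizer at zero, illustrating the gradient-to-zero behavior the abstract refers to.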
Fri, 17 Jul 2020
https://hdl.handle.net/2144/42799
Temporal variability in implicit online learning
Campolongo, Nicolò; Orabona, Francesco
In the setting of online learning, implicit algorithms turn out to be highly successful from a practical standpoint. However, the tightest regret analyses only show marginal improvements over Online Mirror Descent. In this work, we shed light on this behavior by carrying out a careful regret analysis. We prove a novel static regret bound that depends on the temporal variability of the sequence of loss functions, a quantity which is often encountered when considering dynamic competitors. We show, for example, that the regret can be constant if the temporal variability is constant and the learning rate is tuned appropriately, without the need of smooth losses. Moreover, we present an adaptive algorithm that achieves this regret bound without prior knowledge of the temporal variability and prove a matching lower bound. Finally, we validate our theoretical findings on classification and regression datasets.
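To make the distinction concrete, an implicit (proximal) online update solves a small optimization exactly rather than linearizing the loss. For the squared loss this has a closed form; the sketch below assumes that loss and unit costs, not the paper's general setting.

```python
import numpy as np

def implicit_update(x, a, b, eta):
    """Implicit online update for the squared loss l(z) = 0.5 * (a @ z - b)**2:
    x_plus = argmin_z  l(z) + ||z - x||^2 / (2 * eta).
    Setting the gradient to zero and solving gives this closed form."""
    r = (a @ x - b) / (1.0 + eta * (a @ a))   # residual shrunk by the implicit step
    return x - eta * r * a
```

The returned point satisfies the proximal optimality condition exactly, which is the source of the stability advantage over a plain (explicit) mirror-descent step.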
Sun, 06 Dec 2020
https://hdl.handle.net/2144/42757
Mechanical MNIST Crack Path
Mohammadzadeh, Saeed; Lejeune, Emma
The Mechanical MNIST Crack Path dataset contains Finite Element simulation results from phase-field models of quasi-static brittle fracture in heterogeneous material domains subjected to prescribed loading and boundary conditions. For all samples, the material domain is a square with a side length of $1$. There is an initial crack of fixed length ($0.25$) on the left edge of each domain. The bottom edge of the domain is fixed in $x$ (horizontal) and $y$ (vertical), the right edge of the domain is fixed in $x$ and free in $y$, and the left edge is free in both $x$ and $y$. The top edge is free in $x$, and in $y$ it is displaced such that, at each step, the displacement increases linearly from zero at the top right corner to the maximum displacement on the top left corner. Maximum displacement starts at $0.0$ and increases to $0.02$ by increments of $0.0001$ ($200$ simulation steps in total). The heterogeneous material distribution is obtained by adding rigid circular inclusions to the domain using the Fashion MNIST bitmaps as the reference location for the center of the inclusions. Specifically, each center point location is generated randomly inside a square region defined by the corresponding Fashion MNIST pixel when the pixel has an intensity value higher than $10$. In addition, a minimum center-to-center distance limit of $0.0525$ is applied while generating these center points for each sample. The values of Young’s Modulus $(E)$, Fracture Toughness $(G_f)$, and Failure Strength $(f_t)$ near each inclusion are increased with respect to the background domain by a variable rigidity ratio $r$. The background value for $E$ is $210000$, the background value for $G_f$ is $2.7$, and the background value for $f_t$ is $2445.42$. The rigidity ratio throughout the domain depends on position with respect to all inclusion centers such that the closer a point is to the inclusion center the higher the rigidity ratio will be. 
We note that the full algorithm for constructing the heterogeneous material property distribution is included in the simulation scripts shared on GitHub. The following information is included in our dataset: (1) A rigidity ratio array to capture heterogeneous material distribution reported over a uniform $64\times64$ grid, (2) the damage field at the final level of applied displacement reported over a uniform $256\times256$ grid, and (3) the force-displacement curves for each simulation. All simulations are conducted with the FEniCS computing platform (https://fenicsproject.org/). The code to reproduce these simulations is hosted on GitHub (https://github.com/saeedmhz/phase-field).
The paper “Predicting Mechanically Driven Full-Field Quantities of Interest with Deep Learning-Based Metamodels” can be found at <link to be posted>. A larger, more flexible and information-rich version of this dataset is available at <link to be posted>. All code necessary to reproduce these finite element simulations is available on GitHub (https://github.com/saeedmhz/phase-field). For questions, please contact Emma Lejeune (elejeune@bu.edu).
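The authors' exact center-generation algorithm is in their GitHub scripts; the sketch below is only an illustrative reading of the description above (one candidate center per Fashion MNIST pixel with intensity above the threshold, drawn uniformly inside that pixel's square region of the unit domain, kept only if it respects the minimum center-to-center distance).

```python
import numpy as np

def sample_inclusion_centers(bitmap, min_dist=0.0525, intensity_thresh=10, seed=0):
    """Illustrative sketch (not the authors' exact script): sample inclusion
    centers in the unit square from a Fashion-MNIST-style bitmap."""
    rng = np.random.default_rng(seed)
    n = bitmap.shape[0]              # 28 for Fashion MNIST
    px = 1.0 / n                     # pixel side length in the unit square
    centers = []
    for i in range(n):
        for j in range(n):
            if bitmap[i, j] > intensity_thresh:
                # uniform draw inside this pixel's square region
                # (row 0 of the bitmap mapped to the top of the domain)
                c = np.array([(j + rng.random()) * px,
                              (n - 1 - i + rng.random()) * px])
                if all(np.linalg.norm(c - p) >= min_dist for p in centers):
                    centers.append(c)
    return np.array(centers)
```

A real run would also apply the rigidity-ratio field around each accepted center, which this sketch omits.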
Thu, 01 Jul 2021
https://hdl.handle.net/2144/42719
Seeing around corners with edge-resolved transient imaging
Rapp, Joshua; Saunders, Charles; Tachella, Julián; Murray-Bruce, John; Altmann, Yoann; Tourneret, Jean-Yves; McLaughlin, Stephen; Dawson, Robin M.A.; Wong, Franco N.C.; Goyal, Vivek K.
Non-line-of-sight (NLOS) imaging is a rapidly growing field seeking to form images of objects outside the field of view, with potential applications in autonomous navigation, reconnaissance, and even medical imaging. The critical challenge of NLOS imaging is that diffuse reflections scatter light in all directions, resulting in weak signals and a loss of directional information. To address this problem, we propose a method for seeing around corners that derives angular resolution from vertical edges and longitudinal resolution from the temporal response to a pulsed light source. We introduce an acquisition strategy, scene response model, and reconstruction algorithm that enable the formation of 2.5-dimensional representations (a plan view plus heights) and a 180° field of view for large-scale scenes. Our experiments demonstrate accurate reconstructions of hidden rooms up to 3 meters in each dimension despite a small scan aperture (1.5-centimeter radius) and only 45 measurement locations.
Mon, 23 Nov 2020
https://hdl.handle.net/2144/42718
Beliefs in decision-making cascades
Seo, Daewon; Raman, Ravi Kiran; Rhim, Joong Bum; Goyal, Vivek K.; Varshney, Lav R.
This work explores a social learning problem with agents having nonidentical noise variances and mismatched beliefs. We consider an N-agent binary hypothesis test in which each agent sequentially makes a decision based not only on a private observation, but also on preceding agents' decisions. In addition, the agents have their own beliefs instead of the true prior, and have nonidentical noise variances in the private signal. We focus on the Bayes risk of the last agent, where preceding agents are selfish. We first derive the optimal decision rule by recursive belief update and conclude, counterintuitively, that beliefs deviating from the true prior could be optimal in this setting. The effect of nonidentical noise levels in the two-agent case is also considered and analytical properties of the optimal belief curves are given. Next, we consider a predecessor selection problem wherein the subsequent agent of a certain belief chooses a predecessor from a set of candidates with varying beliefs. We characterize the decision region for choosing such a predecessor and argue that a subsequent agent with beliefs varying from the true prior often ends up selecting a suboptimal predecessor, indicating the need for a social planner. Lastly, we discuss an augmented intelligence design problem that uses a model of human behavior from cumulative prospect theory and investigate its near-optimality and suboptimality.
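A minimal two-agent sketch of the recursive belief update described above, under assumptions not fixed by the abstract: Gaussian private signals with means 0 (H0) and 1 (H1), uniform error costs, and agent 2 knowing agent 1's threshold rule, so the public decision is a Bernoulli observation with computable likelihoods.

```python
from math import erf, log, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def agent1_decision(y1, q1, sigma=1.0):
    # likelihood-ratio test under agent 1's own belief q1 = P(H = 1)
    llr = (y1 - 0.5) / sigma**2          # log N(y; 1, s) / N(y; 0, s)
    return int(llr > log((1 - q1) / q1))

def agent2_decision(y2, d1, q1, q2, sigma=1.0):
    # recursive belief update: fold agent 1's public decision d1 into
    # agent 2's belief q2, then run agent 2's own likelihood-ratio test
    thr = 0.5 + sigma**2 * log((1 - q1) / q1)      # agent 1's y-threshold
    p1 = 1 - Phi((thr - 1) / sigma) if d1 else Phi((thr - 1) / sigma)
    p0 = 1 - Phi(thr / sigma) if d1 else Phi(thr / sigma)
    post = q2 * p1 / (q2 * p1 + (1 - q2) * p0)     # updated belief after d1
    llr = (y2 - 0.5) / sigma**2
    return int(llr > log((1 - post) / post))
```

With matched beliefs (q1 = q2 = true prior) this reduces to standard sequential Bayesian detection; the paper's point is that deliberately mismatched q values can lower the last agent's Bayes risk.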
Tue, 01 Oct 2019
https://hdl.handle.net/2144/42717
Keep ballots secret: on the futility of social learning in decision making by voting
Rhim, Joong Bum; Goyal, Vivek K.
We show that social learning is not useful in a model of team binary decision making by voting, where each vote carries equal weight. Specifically, we consider Bayesian binary hypothesis testing where agents have any conditionally-independent observation distribution and their local decisions are fused by any L-out-of-N fusion rule. The agents make local decisions sequentially, with each allowed to use its own private signal and all precedent local decisions. Though social learning generally occurs in that precedent local decisions affect an agent's belief, optimal team performance is obtained when all precedent local decisions are ignored. Thus, social learning is futile, and secret ballots are optimal. This conclusion contrasts with typical studies of social learning because we include a fusion center rather than concentrating on the performance of the latest-acting agents.
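The fusion side of this setup is easy to make concrete. Assuming conditionally independent and identically distributed local votes (a special case of the model above), the team Bayes error of an L-out-of-N fusion rule follows from two binomial tails:

```python
from math import comb

def fusion_error(N, L, p_d, p_f, prior1=0.5):
    """Bayes error of an L-out-of-N fusion rule: decide H1 iff at least L of N
    independent agents vote 1. p_d = P(vote 1 | H1), p_f = P(vote 1 | H0)."""
    def tail(p):  # P(at least L successes out of N Bernoulli(p) votes)
        return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(L, N + 1))
    miss = 1 - tail(p_d)            # team misses H1: fewer than L votes
    false_alarm = tail(p_f)         # team falsely declares H1
    return prior1 * miss + (1 - prior1) * false_alarm
```

For example, with p_d = 0.9 and p_f = 0.1, a 2-out-of-3 majority beats a single agent; the paper's result is that, under secret ballots, optimizing the local rules for such a fusion center already extracts everything social learning could offer.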
Wed, 01 May 2013
https://hdl.handle.net/2144/42716
Social teaching: being informative vs. being right in sequential decision making
Rhim, Joong Bum; Goyal, Vivek K.
We consider sequential Bayesian binary hypothesis testing where each individual agent makes a binary decision motivated only by minimization of her own perception of the Bayes risk. The information available to each agent is an initial belief, a private signal, and decisions of all earlier-acting agents; it follows that each agent should apply a standard Bayesian update of her belief as in social learning. The effect of the set of initial beliefs on the decision-making performance of the last agent is studied. In general, the optimal initial beliefs are not equal to the actual prior probability. When the private signals are described by Gaussian likelihoods, they also are not haphazard, but rather follow a systematic pattern: The earlier-acting agents should act as if the prior probability is larger than it is in reality when the true prior probability is small, and vice versa. We interpret this as being open-minded toward the unlikely hypothesis. Such open-mindedness increases but does not maximize the mutual information between the true hypothesis and a decision.
Mon, 01 Jul 2013
https://hdl.handle.net/2144/42715
Distributed hypothesis testing with social learning and symmetric fusion
Rhim, Joong Bum; Goyal, Vivek K.
We study the utility of social learning in a distributed detection model with agents sharing the same goal: a collective decision that optimizes an agreed upon criterion. We show that social learning is helpful in some cases but is provably futile (and thus essentially a distraction) in other cases. Specifically, we consider Bayesian binary hypothesis testing performed by a distributed detection and fusion system, where all decision-making agents have binary votes that carry equal weight. Decision-making agents in the team sequentially make local decisions based on their own private signals and all precedent local decisions. It is shown that the optimal decision rule is not affected by precedent local decisions when all agents observe conditionally independent and identically distributed private signals. Perfect Bayesian reasoning will cancel out all effects of social learning. When the agents observe private signals with different signal-to-noise ratios, social learning is again futile if the team decision is only approved by unanimity. Otherwise, social learning can strictly improve the team performance. Furthermore, the order in which agents make their decisions affects the team decision.
Mon, 01 Dec 2014
https://hdl.handle.net/2144/42712
Reduced damage in electron microscopy by using interaction-free measurement and conditional reillumination
Agarwal, Akshay; Berggren, Karl K.; van Staaden, Yuri J.; Goyal, Vivek K.
Interaction-free measurement (IFM) has been proposed as a means of high-resolution, low-damage imaging of radiation-sensitive samples, such as biomolecules and proteins. The basic setup for IFM is a Mach-Zehnder interferometer, and recent progress in nanofabricated electron-diffraction gratings has made it possible to incorporate a Mach-Zehnder interferometer in a transmission electron microscope (TEM). Therefore, the limits of performance of IFM with such an interferometer and a shot-noise limited electron source (such as that in a TEM) are of interest. In this paper, we compare the error probability and sample damage for ideal IFM and classical imaging schemes, through theoretical analysis and numerical simulation. We consider a sample that is either completely transparent or completely opaque at each pixel. In our analysis, we also evaluate the impact of an additional detector for scattered electrons. Inclusion of the scattering detector results in reduction of error by up to an order of magnitude, for both IFM and classical schemes. We also investigate a sample reillumination scheme based on updating priors after each round of illumination and find that this scheme further reduces error. Implementation of these methods is likely achievable with existing instrumentation and would result in improved resolution in low-dose electron microscopy.
https://hdl.handle.net/2144/42711
Asymptotic analysis of MAP estimation via the replica method and applications to compressed sensing
Rangan, Sundeep; Fletcher, Alyson K.; Goyal, Vivek K.
The replica method is a nonrigorous but well-known technique from statistical physics used in the asymptotic analysis of large, random, nonlinear problems. This paper applies the replica method, under the assumption of replica symmetry, to study estimators that are maximum a posteriori (MAP) under a postulated prior distribution. It is shown that with random linear measurements and Gaussian noise, the replica-symmetric prediction of the asymptotic behavior of the postulated MAP estimate of an n-dimensional vector “decouples” as n scalar postulated MAP estimators. The result is based on applying a hardening argument to the replica analysis of postulated posterior mean estimators of Tanaka and of Guo and Verdú. The replica-symmetric postulated MAP analysis can be readily applied to many estimators used in compressed sensing, including basis pursuit, least absolute shrinkage and selection operator (LASSO), linear estimation with thresholding, and zero norm-regularized estimation. In the case of LASSO estimation, the scalar estimator reduces to a soft-thresholding operator, and for zero norm-regularized estimation, it reduces to a hard threshold. Among other benefits, the replica method provides a computationally tractable method for precisely predicting various performance metrics including mean-squared error and sparsity pattern recovery probability.
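The two scalar estimators mentioned at the end are standard and easy to state (threshold parameter `t` here is generic; the replica analysis determines its effective value):

```python
import numpy as np

def soft_threshold(y, t):
    """Scalar-equivalent LASSO (l1-regularized) MAP estimator:
    shrink toward zero by t, clipping to zero inside [-t, t]."""
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def hard_threshold(y, t):
    """Scalar-equivalent zero-norm-regularized estimator:
    keep y unchanged if |y| > t, otherwise set it to zero."""
    return np.where(np.abs(y) > t, y, 0.0)
```

The qualitative difference matters for the predicted MSE: soft thresholding biases every retained coefficient by t, while hard thresholding is unbiased on the retained set but discontinuous at the threshold.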
Thu, 01 Mar 2012
https://hdl.handle.net/2144/42710
Message-passing de-quantization with applications to compressed sensing
Kamilov, U.S.; Goyal, V.K.; Rangan, S.
Estimation of a vector from quantized linear measurements is a common problem for which simple linear techniques are suboptimal, sometimes greatly so. This paper develops message-passing de-quantization (MPDQ) algorithms for minimum mean-squared error estimation of a random vector from quantized linear measurements, notably allowing the linear expansion to be overcomplete or undercomplete and the scalar quantization to be regular or non-regular. The algorithm is based on generalized approximate message passing (GAMP), a recently developed Gaussian approximation of loopy belief propagation for estimation with linear transforms and nonlinear componentwise-separable output channels. For MPDQ, scalar quantization of measurements is incorporated into the output channel formalism, leading to the first tractable and effective method for high-dimensional estimation problems involving non-regular scalar quantization. The algorithm is computationally simple and can incorporate arbitrary separable priors on the input vector including sparsity-inducing priors that arise in the context of compressed sensing. Moreover, under the assumption of a Gaussian measurement matrix with i.i.d. entries, the asymptotic error performance of MPDQ can be accurately predicted and tracked through a simple set of scalar state evolution equations. We additionally use state evolution to design MSE-optimal scalar quantizers for MPDQ signal reconstruction and empirically demonstrate the superior error performance of the resulting quantizers. In particular, our results show that non-regular quantization can greatly improve rate-distortion performance in some problems with oversampling or with undersampling combined with a sparsity-inducing prior.
Sat, 01 Dec 2012
https://hdl.handle.net/2144/42709
An information-theoretic characterization of channels that die
Varshney, Lav R.; Mitter, Sanjoy K.; Goyal, Vivek K.
Given the possibility of communication systems failing catastrophically, we investigate limits to communicating over channels that fail at random times. These channels are finite-state semi-Markov channels. We show that communication with arbitrarily small probability of error is not possible. Making use of results in finite blocklength channel coding, we determine sequences of blocklengths that optimize transmission volume communicated at fixed maximum message error probabilities. We provide a partial ordering of communication channels. A dynamic programming formulation is used to show the structural result that channel state feedback does not improve performance.
Sat, 01 Sep 2012
https://hdl.handle.net/2144/42552
Quantization of prior probabilities for collaborative distributed hypothesis testing
Rhim, Joong Bum; Varshney, Lav R.; Goyal, Vivek K.
This paper studies the quantization of prior probabilities, drawn from an ensemble, in distributed detection with data fusion by combination of binary local decisions. Design and performance equivalences between a team of N agents and a more powerful single agent are obtained. Effects of identical quantization and diverse quantization on mean Bayes risk are compared. It is shown that when agents using diverse quantizers interact to agree on a perceived common risk, the effective number of quantization levels is increased. With this collaboration, optimal diverse regular quantization with K cells per quantizer performs as well as optimal identical quantization with N(K-1)+1 cells per quantizer. Similar results are obtained for the maximum Bayes risk error criterion.
Sat, 01 Sep 2012
https://hdl.handle.net/2144/42545
Computational nanosensing from defocus in single particle interferometric reflectance microscopy
Yurdakul, Celalettin; Ünlü, M. Selim
Single particle interferometric reflectance (SPIR) microscopy has been studied as a powerful imaging platform for label-free and highly sensitive biological nanoparticle detection and characterization. SPIR's interferometric nature yields a unique 3D defocus intensity profile of the nanoparticles over a large field of view. Here, we utilize this defocus information to recover high signal-to-noise ratio nanoparticle images with a computationally and memory efficient reconstruction framework. Our direct inversion approach recovers this image from a 3D defocus intensity stack using the vectorial-optics-based forward model developed for sub-diffraction-limited dielectric nanoparticles captured on a layered substrate. We demonstrate proof-of-concept experiments on silica beads with a 50 nm nominal diameter.
Tue, 01 Dec 2020
https://hdl.handle.net/2144/42544
Time-stampless adaptive nonuniform sampling for stochastic signals
Feizi, Soheil; Goyal, Vivek K.; Medard, Muriel
In this paper, we introduce a time-stampless adaptive nonuniform sampling (TANS) framework, in which time increments between samples are determined by a function of the m most recent increments and sample values. Since only past samples are used in computing time increments, it is not necessary to save sampling times (time stamps) for use in the reconstruction process. We focus on two TANS schemes for discrete-time stochastic signals: a greedy method, and a method based on dynamic programming. We analyze the performances of these schemes by computing (or bounding) their trade-offs between sampling rate and expected reconstruction distortion for autoregressive and Markovian signals. Simulation results support the analysis of the sampling schemes. We show that, by opportunistically adapting to local signal characteristics, TANS may lead to improved power efficiency in some applications.
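A heuristic sketch of the key TANS idea, not the paper's actual greedy or dynamic-programming rule: the next time increment is a function of only the m most recent samples and increments (so no time stamps need to be stored), with faster local variation triggering denser sampling. The parameters `t_min`, `t_max`, and `gain` are illustrative.

```python
def next_increment(recent_samples, recent_increments, t_min=0.1, t_max=2.0, gain=1.0):
    """Sketch of a TANS-style rule: map the average absolute slope over the
    stored window to a clipped time increment for the next sample.
    recent_increments[i] is the time gap between recent_samples[i] and [i+1]."""
    slopes = [abs(s1 - s0) / dt
              for (s0, s1), dt in zip(zip(recent_samples, recent_samples[1:]),
                                      recent_increments)]
    avg_slope = sum(slopes) / len(slopes)
    # steep recent signal -> small increment; flat signal -> large increment
    return min(t_max, max(t_min, gain / (1e-9 + avg_slope)))
```

Because the rule depends only on the stored window, the reconstructor can regenerate every sampling time from the samples themselves, which is what makes the scheme time-stampless.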
Mon, 01 Oct 2012
https://hdl.handle.net/2144/42543
Ranked sparse signal support detection
Fletcher, A.K.; Rangan, S.; Goyal, V.K.
This paper considers the problem of detecting the support (sparsity pattern) of a sparse vector from random noisy measurements. Conditional power of a component of the sparse vector is defined as the energy conditioned on the component being nonzero. Analysis of a simplified version of orthogonal matching pursuit (OMP) called sequential OMP (SequOMP) demonstrates the importance of knowledge of the rankings of conditional powers. When the simple SequOMP algorithm is applied to components in nonincreasing order of conditional power, the detrimental effect of dynamic range on thresholding performance is eliminated. Furthermore, under the most favorable conditional powers, the performance of SequOMP approaches maximum likelihood performance at high signal-to-noise ratio.
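A simplified sketch of the single-pass structure described above (the paper's SequOMP additionally projects onto the orthogonal complement of all previously selected columns; here each accepted column is only deflated once, and `thresh` is a generic parameter):

```python
import numpy as np

def sequomp(A, y, order, thresh):
    """Simplified sequential OMP sketch: visit columns once, in the given order
    (ideally nonincreasing conditional power); accept an index if the residual's
    normalized correlation with that column exceeds the threshold."""
    support = []
    r = y.astype(float).copy()
    for j in order:
        a = A[:, j]
        c = a @ r / (a @ a)        # least-squares coefficient of column j
        if abs(c) > thresh:
            support.append(j)
            r = r - c * a          # remove this column's contribution
    return support
```

Visiting columns in nonincreasing conditional-power order is exactly what removes the dynamic-range penalty of plain thresholding: strong components are detected and subtracted before they can mask weak ones.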
Thu, 01 Nov 2012
https://hdl.handle.net/2144/42542
Distributed functional scalar quantization simplified
Sun, John Z.; Misra, Vinith; Goyal, Vivek K.
Distributed functional scalar quantization (DFSQ) theory provides optimality conditions and predicts performance of data acquisition systems in which a computation on acquired data is desired. We address two limitations of previous works: prohibitively expensive decoder design and a restriction to source distributions with bounded support. We show that a much simpler decoder has equivalent asymptotic performance to the conditional expectation estimator studied previously, thus reducing decoder design complexity. The simpler decoder features decoupled communication and computation blocks. Moreover, we extend the DFSQ framework with the simpler decoder to source distributions with unbounded support. Finally, through simulation results, we demonstrate that performance at moderate coding rates is well predicted by the asymptotic analysis, and we give new insight on the rate of convergence.
Mon, 01 Jul 2013
https://hdl.handle.net/2144/42541
Intersensor collaboration in distributed quantization networks
Sun, John Z.; Goyal, Vivek K.
Several key results in distributed source coding offer the intuition that little improvement in compression can be gained from intersensor communication when the information is coded in long blocks. However, when sensors are restricted to code their observations in small blocks (e.g., one) or desire fidelity of a computation applied to source realizations, intelligent collaboration between sensors can greatly reduce distortion. For networks where sensors are allowed to "chat" using a side channel that is unobservable at the fusion center, we provide asymptotically-exact characterization of distortion performance and optimal quantizer design in the high-resolution (low-distortion) regime using a framework called distributed functional scalar quantization (DFSQ). The key result is that chatting can dramatically improve performance even when intersensor communication is at very low rate. We also solve the rate allocation problem when communication links have heterogeneous costs and provide a detailed example to demonstrate the theoretical and practical gains from chatting. This example for maximum computation gives insight on the gap between chatting and distributed networks, and how to optimize the intersensor communication.
Sun, 01 Sep 2013
https://hdl.handle.net/2144/42540
Distributed scalar quantization for computing: high-resolution analysis and extensions
Misra, Vinith; Goyal, Vivek K.; Varshney, Lav R.
Communication of quantized information is frequently followed by a computation. We consider situations of distributed functional scalar quantization: distributed scalar quantization of (possibly correlated) sources followed by centralized computation of a function. Under smoothness conditions on the sources and function, companding scalar quantizer designs are developed to minimize mean-squared error (MSE) of the computed function as the quantizer resolution is allowed to grow. Striking improvements over quantizers designed without consideration of the function are possible and are larger in the entropy-constrained setting than in the fixed-rate setting. As extensions to the basic analysis, we characterize a large class of functions for which regular quantization suffices, consider certain functions for which asymptotic optimality is achieved without arbitrarily fine quantization, and allow limited collaboration between source encoders. In the entropy-constrained setting, a single bit per sample communicated between encoders can have an arbitrarily large effect on functional distortion. In contrast, such communication has very little effect in the fixed-rate setting.
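The companding construction underlying these designs is compact enough to sketch: warp the source by a compressor, quantize uniformly, then unwarp. The compressor's slope (the point density) controls where resolution is spent; in DFSQ it is chosen with the downstream function in mind. The μ-law-style compressor below is purely illustrative.

```python
import numpy as np

def compand_quantize(x, compressor, expander, levels):
    """Companding scalar quantizer on [0, 1]: compress, quantize uniformly to
    `levels` cells, and reconstruct at the cell midpoint through the expander."""
    u = compressor(x)
    k = np.clip(np.floor(u * levels), 0, levels - 1)   # uniform cell index
    return expander((k + 0.5) / levels)                # midpoint reconstruction

# illustrative mu-law-style compressor/expander pair on [0, 1]
mu = 100.0
c = lambda x: np.log1p(mu * x) / np.log1p(mu)
c_inv = lambda u: np.expm1(u * np.log1p(mu)) / mu
```

This particular compressor concentrates cells near zero; a function-aware design would instead shape the point density by the sensitivity of the computed function, which is where the "striking improvements" above come from.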
Mon, 01 Aug 2011
https://hdl.handle.net/2144/42526
Simultaneously sparse solutions to linear inverse problems with multiple system matrices and a single observation vector
Zelinski, Adam C.; Goyal, Vivek K.; Adalsteinsson, Elfar
A problem that arises in slice-selective magnetic resonance imaging (MRI) radio-frequency (RF) excitation pulse design is abstracted as a novel linear inverse problem with a simultaneous sparsity constraint. Multiple unknown signal vectors are to be determined, where each passes through a different system matrix and the results are added to yield a single observation vector. Given the matrices and lone observation, the objective is to find a simultaneously sparse set of unknown vectors that approximately solves the system. We refer to this as the multiple-system single-output (MSSO) simultaneous sparse approximation problem. This manuscript contrasts the MSSO problem with other simultaneous sparsity problems and conducts an initial exploration of algorithms with which to solve it. Greedy algorithms and techniques based on convex relaxation are derived and compared empirically. Experiments involve sparsity pattern recovery in noiseless and noisy settings and MRI RF pulse design.
Wed, 20 Jan 2010