**2017**

**Friday, October 27, 3-4 pm, 3401 Walnut St Room 401B**

Speakers:

1. Luiz Chamon (University of Pennsylvania)

2. Anupama Jha (University of Pennsylvania)

—

Luiz Chamon (University of Pennsylvania)

**Title**

Approximate Supermodularity Bounds for Experimental Design

**Abstract**

This work provides performance guarantees for the greedy solution of experimental design problems. In particular, it focuses on A- and E-optimal designs, for which typical guarantees do not apply since the mean-square error and the maximum eigenvalue of the estimation error covariance matrix are not supermodular. To do so, it leverages the concept of approximate supermodularity to derive non-asymptotic worst-case suboptimality bounds for these greedy solutions. These bounds reveal that as the SNR of the experiments decreases, these cost functions behave increasingly as supermodular functions. As such, greedy A- and E-optimal designs approach (1-1/e)-optimality. These results reconcile the empirical success of greedy experimental design with the non-supermodularity of the A- and E-optimality criteria.

**Bio**

Luiz Chamon is a Ph.D. student in the Department of Electrical and Systems Engineering at the University of Pennsylvania working with Prof. Alejandro Ribeiro. His research interests include signal processing, discrete optimization, statistics, and control.

—

Anupama Jha (University of Pennsylvania)

**Title**

Integrative Deep Models for Alternative Splicing

**Abstract**

Advancements in sequencing technologies have highlighted the role of alternative splicing (AS) in increasing transcriptome complexity. This role of AS, combined with the relation of aberrant splicing to malignant states, motivated two streams of research, experimental and computational. The first involves a myriad of techniques such as RNA-Seq and CLIP-Seq to identify splicing regulators and their putative targets. The second involves probabilistic models, also known as splicing codes, which infer regulatory mechanisms and predict splicing outcome directly from genomic sequence. To date, these models have utilized only expression data. In this work we address two related challenges: Can we improve on previous models for AS outcome prediction and can we integrate additional sources of data to improve predictions for AS regulatory factors. We perform a detailed comparison of two previous modeling approaches, Bayesian and Deep Neural networks, dissecting the confounding effects of datasets and target functions. We then develop a new target function for AS prediction and show that it significantly improves model accuracy. Next, we develop a modeling framework to incorporate CLIP-Seq, knockdown and over-expression experiments, which are inherently noisy and suffer from missing values. Using several datasets involving key splice factors in mouse brain, muscle and heart we demonstrate both the prediction improvements and biological insights offered by our new models. Overall, the framework we propose offers a scalable integrative solution to improve splicing code modeling as vast amounts of relevant genomic data become available.

**Bio**

Anupama Jha is a Ph.D. student at the University of Pennsylvania working with Prof. Yoseph Barash. Her research interests lie in the intersection of machine learning and computational biology.

**Friday, October 13, 3-4 pm, 3401 Walnut St Room 401B**

1. Santiago Paternain (University of Pennsylvania)

2. Arpit Agarwal (University of Pennsylvania)

—

Santiago Paternain (University of Pennsylvania)

**Title**

A Second Order Method for Nonconvex Optimization

**Abstract**

Machine learning problems such as neural network training, tensor decomposition, and matrix factorization, require local minimization of a nonconvex function. This local minimization is challenged by the presence of saddle points, of which there can be many and from which descent methods may take inordinately large number of iterations to escape. In this talk we present a second-order method that modifies the update of Newton’s method by replacing the negative eigenvalues of the Hessian by their absolute values and uses a truncated version of the resulting matrix to account for the objective’s curvature. The method is shown to escape saddles in at most $1 + \log_{3/2} (\delta/2\varepsilon)$ iterations where $\varepsilon$ is the target optimality and $\delta$ characterizes a point sufficiently far away from the saddle. The base of this exponential escape is $3/2$ independently of problem constants. Adding classical properties of Newton’s method, the paper proves convergence to a local minimum with probability $1-p$ in $O\left(\log(1/p)) + O(\log(1/\varepsilon)\right)$ iterations.

**Bio**

Santiago Paternain is a Ph.D. student in the Department of Electrical and Systems Engineering at the University of Pennsylvania working with Prof. Alejandro Ribeiro. His research interests include optimization and control of dynamical systems.

—

Arpit Agarwal (University of Pennsylvania)

**Title**

Learning with Limited Rounds of Adaptivity: Coin Tossing, Multi-Armed Bandits, and Ranking from Pairwise Comparisons

**Abstract**

In many learning settings, active/adaptive querying is possible, but the number of rounds of adaptivity is limited. We study the relationship between query complexity and adaptivity in identifying the k-most biased coins among a set of n coins with unknown biases. This problem is a common abstraction of many well-studied problems, including the problem of identifying the k-best arms in a stochastic multi-armed bandit, and the problem of top-k ranking from pairwise comparisons.

An r-round adaptive algorithm for the k most biased coins problem specifies in each round the set of coin tosses to be performed based on the observed outcomes in earlier rounds, and outputs the set of k-most biased coins at the end of r rounds. When r = 1, the algorithm is known as non-adaptive; when r is unbounded, the algorithm is known as fully adaptive. While the power of adaptivity in reducing query complexity is well known, full adaptivity requires repeated interaction with the coin tossing (feedback generation) mechanism, and is highly sequential, since the set of coins to be tossed in each round can only be determined after we have observed the outcomes of the coin tosses from the previous round. In contrast, algorithms with only few rounds of adaptivity require fewer rounds of interaction with the feedback generation mechanism, and offer the benefits of parallelism in algorithmic decision-making. Motivated by these considerations, we consider the question of how much adaptivity is needed to realize the optimal worst case query complexity for identifying the k most biased coins. Given any positive integer r, we derive essentially matching upper and lower bounds on the query complexity of r-round algorithms. We then show that Θ(log* n) rounds are both necessary and sufficient for achieving the optimal worst case query complexity for identifying the k-most biased coins. In particular, our algorithm achieves the optimal query complexity in at most log* n rounds, which implies that on any realistic input, 5 parallel rounds of exploration suffice to achieve the optimal worst-case sample complexity. The best known algorithm prior to our work required Θ(log n) rounds to achieve the optimal worst case query complexity even for the special case of k = 1.

**Bio**

Arpit Agarwal is a Ph.D. student in the Department of Computer & Information Science at University of Pennsylvania working with Prof. Shivani Agarwal. His research interests span the areas of machine learning, learning theory and information elicitation, with a focus on problems at the interface of ranking and crowdsourcing.