


Amazon Web Services for Higher Education

With Amazon Web Services, you can support teaching and accelerate research.


Empowering researchers to accelerate research

When most universities reopened after the pandemic, they continued to emphasize social distancing on campus and in facilities such as student dormitories. Most universities worldwide still offer both in-person and online teaching and examinations, and they encourage research work to move toward cloud-based collaboration. In recent years, the growth of cloud computing and cloud-native applications has sharply reduced the difficulty and cost of deploying HPC. Managed services built on cloud-native technology, such as Amazon Web Services Batch, remove the need to operate and maintain the underlying hardware, and they can draw on Amazon Web Services' global cloud resources to build larger HPC clusters with a wider variety of hardware.

Cloud-based HPC solutions are being adopted more and more widely, yet misconceptions about their cost, security, and performance persist. Challenging these assumptions and breaking down the common barriers to cloud-based HPC is essential. On Amazon Web Services, researchers have access to purpose-built HPC tools and services, along with scientific and technical expertise, so they can accelerate the pace of R&D and stay focused on the research itself. Through this site, researchers can learn how Amazon Web Services customers use cloud HPC to run workloads such as materials simulation, structural mechanics, synthetic biology, remote sensing and mapping, molecular dynamics, medical imaging simulation, and intelligent genomics research.
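To make the managed-HPC idea above concrete, the sketch below shows how a containerized research job might be submitted to Amazon Web Services Batch using the Python SDK (boto3). It is a minimal illustration only: the job queue, job definition, and container command are hypothetical placeholders, not taken from any customer solution described on this page.

```python
# Minimal sketch: submitting one containerized HPC task to an existing
# AWS Batch job queue via boto3. Queue, job definition, and command are
# hypothetical placeholders; a real pipeline would submit many such jobs.
import boto3

batch = boto3.client("batch", region_name="us-east-1")  # assumed region

response = batch.submit_job(
    jobName="md-simulation-run-001",        # free-form name for this run
    jobQueue="hpc-research-queue",          # assumed pre-created job queue
    jobDefinition="md-sim-jobdef:1",        # assumed registered job definition
    containerOverrides={
        "command": ["python", "run_simulation.py", "--steps", "1000000"],
        "resourceRequirements": [
            {"type": "VCPU", "value": "32"},
            {"type": "MEMORY", "value": "65536"},  # memory in MiB
        ],
    },
)

print("Submitted job:", response["jobId"])
```

Because the compute environment behind the queue is managed by the service, capacity scales with the number of queued jobs rather than with hardware the research group has to operate.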

Research highlights on Amazon Web Services

Through elastic, scalable compute in the cloud, Amazon Web Services supports research tasks from cloud-based prototype validation and rapid deployment to large-scale computation. At the same time, Amazon Web Services is committed to providing a cloud platform for showcasing outstanding research, encouraging researchers to publicize academically valuable results through Amazon Web Services and, building on their published high-quality work, to share with the world findings in new fields, new theory and practice, and innovation.

"Cytochrome P450 Gene Family Analysis": With the support of Amazon Web Services, researchers at the Australian Museum Research Institute deployed an automated FALCON genome-assembly pipeline in the cloud, using 12 to 20 R3.8xlarge Spot Instances to study the koala genome. As shown in the figure below, the phylogenetic tree of the koala CYP2 gene family reveals two independent monophyletic expansions within the CYP2C subfamily (indicated by red arcs).

[Figure: phylogenetic tree of the koala CYP2 gene family, from "Cytochrome P450 Gene Family Analysis"; red arcs mark the two independent monophyletic CYP2C expansions]
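For readers who want a sense of what provisioning a cluster like the one above involves, here is a minimal boto3 sketch that requests a batch of r3.8xlarge Spot Instances. The AMI, key pair, security group, and region are hypothetical placeholders; the FALCON assembly pipeline itself is not shown.

```python
# Minimal sketch: requesting r3.8xlarge Spot Instances, in the spirit of the
# 12-20 instance assembly cluster described above. AMI, key pair, security
# group, and region are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="ap-southeast-2")  # assumed region

response = ec2.request_spot_instances(
    InstanceCount=12,                        # scale between 12 and 20 as needed
    Type="one-time",
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",  # assumed AMI with the assembly toolchain
        "InstanceType": "r3.8xlarge",
        "KeyName": "research-keypair",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
    },
)

for req in response["SpotInstanceRequests"]:
    print("Spot request:", req["SpotInstanceRequestId"], req["State"])
```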

Test results of selected industry software on Amazon Web Services

Research directly or indirectly supported by Amazon Web Services

Article title | Authors | Abstract | Times cited
Sampling Matters in Deep Embedding Learning Wu, CY; Manmatha, R; Smola, AJ; Krahenbuhl, P Deep embeddings answer one simple question: How similar are two images? Learning these embeddings is the bedrock of verification, zero-shot learning, and visual search. The most prominent approaches optimize a deep convolutional network with a suitable loss function, such as contrastive loss or triplet loss. While a rich line of work focuses solely on the loss functions, we show in this paper that selecting training examples plays an equally important role. We propose distance weighted sampling, which selects more informative and stable examples than traditional approaches. In addition, we show that a simple margin based loss is sufficient to outperform all other loss functions. We evaluate our approach on the Stanford Online Products, CAR196, and the CUB200-2011 datasets for image retrieval and clustering, and on the LFW dataset for face verification. Our method achieves state-of-the-art performance on all of them. 294
The QUIC Transport Protocol: Design and Internet-Scale Deployment Langley, A; Riddoch, A; Wilk, A; Vicente, A; Krasic, C; Zhang, D; Yang, F; Kouranov, F; Swett, I; Iyengar, J; Bailey, J; Dorfman, J; Roskind, J; Kulik, J; Westin, P; Tenneti, R; Shade, R; Hamilton, R; Vasiliev, V; Chang, WT; Shi, ZY We present our experience with QUIC, an encrypted, multiplexed, and low-latency transport protocol designed from the ground up to improve transport performance for HTTPS traffic and to enable rapid deployment and continued evolution of transport mechanisms. QUIC has been globally deployed at Google on thousands of servers and is used to serve traffic to a range of clients including a widely-used web browser (Chrome) and a popular mobile video streaming app (YouTube). We estimate that 7% of Internet traffic is now QUIC. We describe our motivations for developing a new transport, the principles that guided our design, the Internet-scale process that we used to perform iterative experiments on QUIC, performance improvements seen by our various services, and our experience deploying QUIC globally. We also share lessons about transport design and the Internet ecosystem that we learned from our deployment. 254
Neural Body Fitting: Unifying Deep Learning and Model Based Human Pose and Shape Estimation Omran, M; Lassner, C; Pons-Moll, G; Gehler, PV; Schiele, B Direct prediction of 3D body pose and shape remains a challenge even for highly parameterized deep learning models. Mapping from the 2D image space to the prediction space is difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach (Neural Body Fitting (NBF)). It integrates a statistical body model within a CNN, leveraging reliable bottom-up semantic body part segmentation and robust top-down body model constraints. NBF is fully differentiable and can be trained using 2D and 3D annotations. In detailed experiments, we analyze how the components of our model affect performance, especially the use of part segmentations as an explicit intermediate representation, and present a robust, efficiently trainable framework for 3D human pose estimation from 2D images with competitive results on standard benchmarks. Code will be made available at http://github.com/mohomran/neural_body_fitting 186
A FORMAL PROOF OF THE KEPLER CONJECTURE Hales, T; Adams, M; Bauer, G; Dang, TD; Harrison, J; Hoang, LT; Kaliszyk, C; Magron, V; Mclaughlin, S; Nguyen, TT; Nguyen, QT; Nipkow, T; Obua, S; Pleso, J; Rute, J; Solovyev, A; Ta, THA; Tran, NT; Trieu, TD; Urban, J; Vu, K; Zumkeller, R This article describes a formal proof of the Kepler conjecture on dense sphere packings in a combination of the HOL Light and Isabelle proof assistants. This paper constitutes the official published account of the now completed Flyspeck project. 146
Efficient Privacy-Preserving Ciphertext-Policy Attribute Based-Encryption and Broadcast Encryption Zhou, ZB; Huang, DJ; Wang, ZJ Ciphertext Policy Attribute-Based Encryption (CP-ABE) enforces expressive data access policies and each policy consists of a number of attributes. Most existing CP-ABE schemes incur a very large ciphertext size, which increases linearly with respect to the number of attributes in the access policy. Recently, Herranz et al. proposed a construction of CP-ABE with constant ciphertext. However, Herranz et al. do not consider the recipients' anonymity and the access policies are exposed to potential malicious attackers. On the other hand, existing privacy preserving schemes protect the anonymity but require bulky, linearly increasing ciphertext size. In this paper, we proposed a new construction of CP-ABE, named Privacy Preserving Constant CP-ABE (denoted as PP-CP-ABE) that significantly reduces the ciphertext to a constant size with any given number of attributes. Furthermore, PP-CP-ABE leverages a hidden policy construction such that the recipients' privacy is preserved efficiently. As far as we know, PP-CP-ABE is the first construction with such properties. Furthermore, we developed a Privacy Preserving Attribute-Based Broadcast Encryption (PP-AB-BE) scheme. Compared to existing Broadcast Encryption (BE) schemes, PP-AB-BE is more flexible because a broadcasted message can be encrypted by an expressive hidden access policy, either with or without explicit specifying the receivers. Moreover, PP-AB-BE significantly reduces the storage and communication overhead to the order of O(log N), where N is the system size. Also, we proved, using information theoretical approaches, PP-AB-BE attains minimal bound on storage overhead for each user to cover all possible subgroups in the communication system. 107
Compressed Video Action Recognition Wu, CY; Zaheer, M; Hu, HX; Manmatha, R; Smola, AJ; Krahenbuhl, P Training robust deep video representations has proven to be much more challenging than learning deep image representations. This is in part due to the enormous size of raw video streams and the high temporal redundancy; the true and interesting signal is often drowned in too much irrelevant data. Motivated by that the superfluous information can be reduced by up to two orders of magnitude by video compression (using H.264, HEVC, etc.), we propose to train a deep network directly on the compressed video. This representation has a higher information density, and we found the training to be easier. In addition, the signals in a compressed video provide free, albeit noisy, motion information. We propose novel techniques to use them effectively. Our approach is about 4.6 times faster than Res3D and 2.7 times faster than ResNet-152. On the task of action recognition, our approach outperforms all the other methods on the UCF-101, HMDB-51, and Charades dataset. 106
The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings Kozlowski, AC; Taddy, M; Evans, JA We argue word embedding models are a useful tool for the study of culture using a historical analysis of shared understandings of social class as an empirical case. Word embeddings represent semantic relations between words as relationships between vectors in a high-dimensional space, specifying a relational model of meaning consistent with contemporary theories of culture. Dimensions induced by word differences (rich - poor) in these spaces correspond to dimensions of cultural meaning, and the projection of words onto these dimensions reflects widely shared associations, which we validate with surveys. Analyzing text from millions of books published over 100 years, we show that the markers of class continuously shifted amidst the economic transformations of the twentieth century, yet the basic cultural dimensions of class remained remarkably stable. The notable exception is education, which became tightly linked to affluence independent of its association with cultivated taste. 105
Optimal recharging scheduling for urban electric buses: A case study in Davis Wang, YS; Huang, YX; Xu, JP; Barclay, N In this paper, a modeling framework to optimize electric bus recharging schedules is developed, which determines both the planning and operational decisions while minimizing total annual costs. The model is demonstrated using a real-world transit network based in Davis, California. The results showed that range anxiety can be eliminated by adopting certain recharging strategies. Sensitivity analyses revealed that the model could provide transit agencies with comprehensive guidance on the utilization of electric buses and development of a fast charging system. The comparative analyses showed that it was more economical and environmentally friendly to utilize electric buses than diesel buses. (C) 2017 Elsevier Ltd. All rights reserved. 100
Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision Niemeyer, M; Mescheder, L; Oechsle, M; Geiger, A Learning-based 3D reconstruction methods have shown impressive results. However, most methods require 3D supervision which is often hard to obtain for real-world datasets. Recently, several works have proposed differentiable rendering techniques to train reconstruction models from RGB images. Unfortunately, these approaches are currently restricted to voxel- and mesh-based representations, suffering from discretization or low resolution. In this work, we propose a differentiable rendering formulation for implicit shape and texture representations. Implicit representations have recently gained popularity as they represent shape and texture continuously. Our key insight is that depth gradients can be derived analytically using the concept of implicit differentiation. This allows us to learn implicit shape and texture representations directly from RGB images. We experimentally show that our single-view reconstructions rival those learned with full 3D supervision. Moreover, we find that our method can be used for multi-view 3D reconstruction, directly resulting in watertight meshes. 98
MeshCNN: A Network with an Edge Hanocka, R; Hertz, A; Fish, N; Giryes, R; Fleishman, S; Cohen-Or, D Polygonal meshes provide an efficient representation for 3D shapes. They explicitly capture both shape surface and topology, and leverage non-uniformity to represent large flat regions as well as sharp, intricate features. This non-uniformity and irregularity, however, inhibits mesh analysis efforts using neural networks that combine convolution and pooling operations. In this paper, we utilize the unique properties of the mesh for a direct analysis of 3D shapes using MeshCNN, a convolutional neural network designed specifically for triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized convolution and pooling layers that operate on the mesh edges, by leveraging their intrinsic geodesic connections. Convolutions are applied on edges and the four edges of their incident triangles, and pooling is applied via an edge collapse operation that retains surface topology, thereby, generating new mesh connectivity for the subsequent convolutions. MeshCNN learns which edges to collapse, thus forming a task-driven process where the network exposes and expands the important features while discarding the redundant ones. We demonstrate the effectiveness of MeshCNN on various learning tasks applied to 3D meshes. 96
Log-based Predictive Maintenance Sipos, R; Fradkin, D; Moerchen, F; Wang, Z Predictive maintenance strives to anticipate equipment failures to allow for advance scheduling of corrective maintenance, thereby preventing unexpected equipment downtime and improving service quality for the customers. We present a data-driven approach based on multiple-instance learning for predicting equipment failures by mining equipment event logs which, while usually not designed for predicting failures, contain rich operational information. We discuss problem domain and formulation, evaluation metrics and predictive maintenance work flow. We experimentally compare our approach to competing methods. For evaluation, we use real life datasets with billions of log messages from two large fleets of medical equipment. We share insights gained from mining such data. Our predictive maintenance approach, deployed by a major medical device provider over the past several months, learns and evaluates predictive models from terabytes of log data, and actively monitors thousands of medical scanners. 95
Street-view change detection with deconvolutional networks Alcantarilla, PF; Stent, S; Ros, G; Arroyo, R; Gherardi, R We propose a system for performing structural change detection in street-view videos captured by a vehicle-mounted monocular camera over time. Our approach is motivated by the need for more frequent and efficient updates in the large-scale maps used in autonomous vehicle navigation. Our method chains a multi-sensor fusion SLAM and fast dense 3D reconstruction pipeline, which provide coarsely registered image pairs to a deep Deconvolutional Network (DN) for pixel-wise change detection. We investigate two DN architectures for change detection, the first one is based on the idea of stacking contraction and expansion blocks while the second one is based on the idea of Fully Convolutional Networks. To train and evaluate our networks we introduce a new urban change detection dataset which is an order of magnitude larger than existing datasets and contains challenging changes due to seasonal and lighting variations. Our method outperforms existing literature on this dataset, which we make available to the community, and an existing panoramic change detection dataset, demonstrating its wide applicability. 92
Uncovering Collusive Spammers in Chinese Review Websites Xu, C; Zhang, J; Chang, KY; Long, C As the rapid development of China's e-commerce in recent years and the underlying evolution of adversarial spamming tactics, more sophisticated spamming activities may carry out in Chinese review websites. Empirical analysis, on recently crawled product reviews from a popular Chinese e-commerce website, reveals the failure of many state-of-the-art spam indicators on detecting collusive spammers. Two novel methods are then proposed: 1) a KNN-based method that considers the pairwise similarity of two reviewers based on their group-level relational information and selects k most similar reviewers for voting; 2) a more general graph-based classification method that jointly classifies a set of reviewers based on their pairwise transaction correlations. Experimental results show that both our methods promisingly outperform the indicator-only classifiers in various settings. 88
Learning From More Than One Data Source: Data Fusion Techniques for Sensorimotor Rhythm-Based Brain-Computer Interfaces Fazli, S; Dahne, S; Samek, W; Biessmann, F; Muller, KR Brain-computer interfaces (BCIs) are successfully used in scientific, therapeutic and other applications. Remaining challenges are among others a low signal-to-noise ratio of neural signals, lack of robustness for decoders in the presence of inter-trial and inter-subject variability, time constraints on the calibration phase and the use of BCIs outside a controlled lab environment. Recent advances in BCI research addressed these issues by novel combinations of complementary analysis as well as recording techniques, so called hybrid BCIs. In this paper, we review a number of data fusion techniques for BCI along with hybrid methods for BCI that have recently emerged. Our focus will be on sensorimotor rhythm-based BCIs. We will give an overview of the three main lines of research in this area, integration of complementary features of neural activation, integration of multiple previous sessions and of multiple subjects, and show how these techniques can be used to enhance modern BCI systems. 69
ManTra-Net: Manipulation Tracing Network For Detection And Localization of Image Forgeries With Anomalous Features Wu, Y; AbdAlmageed, W; Natarajan, P To fight against real-life image forgery, which commonly involves different types and combined manipulations, we propose a unified deep neural architecture called ManTra-Net. Unlike many existing solutions, ManTra-Net is an end-to-end network that performs both detection and localization without extra preprocessing and postprocessing. ManTra-Net is a fully convolutional network and handles images of arbitrary sizes and many known forgery types such as splicing, copy-move, removal, enhancement, and even unknown types. This paper has three salient contributions. We design a simple yet effective self-supervised learning task to learn robust image manipulation traces from classifying 385 image manipulation types. Further, we formulate the forgery localization problem as a local anomaly detection problem, design a Z-score feature to capture local anomaly, and propose a novel long short-term memory solution to assess local anomalies. Finally, we carefully conduct ablation experiments to systematically optimize the proposed network design. Our extensive experimental results demonstrate the generalizability, robustness and superiority of ManTra-Net, not only in single types of manipulations/forgeries, but also in their complicated combinations. 67
Semantic Correlation Promoted Shape-Variant Context for Segmentation Ding, HH; Jiang, XD; Shuai, B; Liu, AQ; Wang, G Context is essential for semantic segmentation. Due to the diverse shapes of objects and their complex layout in various scene images, the spatial scales and shapes of contexts for different objects have very large variation. It is thus ineffective or inefficient to aggregate various context information from a predefined fixed region. In this work, we propose to generate a scale- and shape-variant semantic mask for each pixel to confine its contextual region. To this end, we first propose a novel paired convolution to infer the semantic correlation of the pair and based on that to generate a shape mask. Using the inferred spatial scope of the contextual region, we propose a shape-variant convolution, of which the receptive field is controlled by the shape mask that varies with the appearance of input. In this way, the proposed network aggregates the context information of a pixel from its semantic-correlated region instead of a predefined fixed region. Furthermore, this work also proposes a labeling denoising model to reduce wrong predictions caused by the noisy low-level features. Without bells and whistles, the proposed segmentation network achieves new state-of-the-arts consistently on the six public segmentation datasets. 66
3D Face Morphable Models In-the-Wild Booth, J; Antonakos, E; Ploumpis, S; Trigeorgis, G; Panagakis, Y; Zafeiriou, S 3D Morphable Models (3DMMs) are powerful statistical models of 3D facial shape and texture, and among the state-of-the-art methods for reconstructing facial shape from single images. With the advent of new 3D sensors, many 3D facial datasets have been collected containing both neutral as well as expressive faces. However, all datasets are captured under controlled conditions. Thus, even though powerful 3D facial shape models can be learnt from such data, it is difficult to build statistical texture models that are sufficient to reconstruct faces captured in unconstrained conditions (in-the-wild). In this paper, we propose the first, to the best of our knowledge, in-the-wild 3DMM by combining a powerful statistical model of facial shape, which describes both identity and expression, with an in-the-wild texture model. We show that the employment of such an in-the-wild texture model greatly simplifies the fitting procedure, because there is no need to optimise with regards to the illumination parameters. Furthermore, we propose a new fast algorithm for fitting the 3DMM in arbitrary images. Finally, we have captured the first 3D facial database with relatively unconstrained conditions and report quantitative evaluations with state-of-the-art performance. Complementary qualitative reconstruction results are demonstrated on standard in-the-wild facial databases. 64
Local Collaborative Ranking Lee, J; Bengio, S; Kim, S; Lebanon, G; Singer, Y Personalized recommendation systems are used in a wide variety of applications such as electronic commerce, social networks, web search, and more. Collaborative filtering approaches to recommendation systems typically assume that the rating matrix (e.g., movie ratings by viewers) is low-rank. In this paper, we examine an alternative approach in which the rating matrix is locally low-rank. Concretely, we assume that the rating matrix is low-rank within certain neighborhoods of the metric space defined by (user, item) pairs. We combine a recent approach for local low-rank approximation based on the Frobenius norm with a general empirical risk minimization for ranking losses. Our experiments indicate that the combination of a mixture of local low-rank matrices each of which was trained to minimize a ranking loss outperforms many of the currently used state-of-the-art recommendation systems. Moreover, our method is easy to parallelize, making it a viable approach for large scale real-world rank-based recommendation systems. 64
PAMTRI: Pose-Aware Multi-Task Learning for Vehicle Re-Identification Using Highly Randomized Synthetic Data Tang, Z; Naphade, M; Birchfield, S; Tremblay, J; Hodge, W; Kumar, R; Wang, S; Yang, XD In comparison with person re-identification (ReID), which has been widely studied in the research community, vehicle ReID has received less attention. Vehicle ReID is challenging due to 1) high intra-class variability (caused by the dependency of shape and appearance on viewpoint), and 2) small inter-class variability (caused by the similarity in shape and appearance between vehicles produced by different manufacturers). To address these challenges, we propose a Pose-Aware Multi-Task Re-Identification (PAMTRI) framework. This approach includes two innovations compared with previous methods. First, it overcomes viewpoint dependency by explicitly reasoning about vehicle pose and shape via keypoints, heatmaps and segments from pose estimation. Second, it jointly classifies semantic vehicle attributes (colors and types) while performing ReID, through multi-task learning with the embedded pose representations. Since manually labeling images with detailed pose and attribute information is prohibitive, we create a large-scale highly randomized synthetic dataset with automatically annotated vehicle attributes for training. Extensive experiments validate the effectiveness of each proposed component, showing that PAMTRI achieves significant improvement over state-of-the-art on two mainstream vehicle ReID benchmarks: VeRi and CityFlow-ReID. 62
Dynamic Word Embeddings for Evolving Semantic Discovery Yao, ZJ; Sun, YF; Ding, WC; Rao, N; Xiong, H Word evolution refers to the changing meanings and associations of words throughout time, as a byproduct of human language evolution. By studying word evolution, we can infer social trends and language constructs over different periods of human history. However, traditional techniques such as word representation learning do not adequately capture the evolving language structure and vocabulary. In this paper, we develop a dynamic statistical model to learn time-aware word vector representation. We propose a model that simultaneously learns time-aware embeddings and solves the resulting alignment problem. This model is trained on a crawled NYTimes dataset. Additionally, we develop multiple intuitive evaluation strategies of temporal word embeddings. Our qualitative and quantitative tests indicate that our method not only reliably captures this evolution over time, but also consistently outperforms state-of-the-art temporal embedding approaches on both semantic accuracy and alignment quality. 62
OpenTag: Open Attribute Value Extraction from Product Profiles Zheng, GN; Mukherjee, S; Dong, XLN; Li, FF Extraction of missing attribute values is to find values describing an attribute of interest from a free text input. Most past related work on extraction of missing attribute values work with a closed world assumption with the possible set of values known beforehand, or use dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do this with limited human annotation or supervision? We study this problem in the context of product catalogs that often have missing values for many attributes of interest. In this work, we leverage product profile information such as titles and descriptions to discover missing values of product attributes. We develop a novel deep tagging model OpenTag for this extraction problem with the following contributions: (1) we formalize the problem as a sequence tagging task, and propose a joint model exploiting recurrent neural networks (specifically, bidirectional LSTM) to capture context and semantics, and Conditional Random Fields (CRF) to enforce tagging consistency; (2) we develop a novel attention mechanism to provide interpretable explanation for our model's decisions; (3) we propose a novel sampling strategy exploring active learning to reduce the burden of human annotation. OpenTag does not use any dictionary or hand-crafted features as in prior works. Extensive experiments in real-life datasets in different domains show that OpenTag with our active learning strategy discovers new attribute values from as few as 150 annotated samples (reduction in 3.3x amount of annotation effort) with a high F-score of 83%, outperforming state-of-the-art models. 61
STAR: Preventing flow-table overflow in software-defined networks Guo, ZH; Liu, RY; Xu, Y; Gushchin, A; Walid, A; Chao, HJ The emerging Software-Defined Networking (SDN) enables network innovation and flexible control for network operations. A key component of SDN is the flow table at each switch, which stores flow entries that define how to process the received flows. In a network that has a large number of active flows, flow tables at switches can be easily overflowed, which could cause blocking of new flows or eviction of entries of some active flows. The eviction of active flow entries, however, can severely degrade the network performance and overload the SDN controller. In this paper, we propose Software-defined Adaptive Routing (STAR), an online routing scheme that efficiently utilizes limited flow-table resources to maximize network performance. In particular, STAR detects real-time flow-table utilization of each switch, intelligently evicts expired flow entries when needed to accommodate new flows, and selects routing paths for new flows based on flow-table utilizations of switches across the network. Simulation results based on the Spanish backbone network show that, STAR outperforms existing schemes by decreasing the controller's workload for routing new flows by about 87%, reducing packet delay by 49%, and increasing average throughput by 123% on average when the flow-table resource is scarce. (C) 2017 Elsevier B.V. All rights reserved. 61
Multivariate Machine Learning Methods for Fusing Multimodal Functional Neuroimaging Data Dahne, S; Biessmann, F; Samek, W; Haufe, S; Goltz, D; Gundlach, C; Villringer, A; Fazli, S; Muller, KR Multimodal data are ubiquitous in engineering, communications, robotics, computer vision, or more generally speaking in industry and the sciences. All disciplines have developed their respective sets of analytic tools to fuse the information that is available in all measured modalities. In this paper, we provide a review of classical as well as recent machine learning methods (specifically factor models) for fusing information from functional neuroimaging techniques such as: LFP, EEG, MEG, fNIRS, and fMRI. Early and late fusion scenarios are distinguished, and appropriate factor models for the respective scenarios are presented along with example applications from selected multimodal neuroimaging studies. Further emphasis is given to the interpretability of the resulting model parameters, in particular by highlighting how factor models relate to physical models needed for source localization. The methods we discuss allow for the extraction of information from neural data, which ultimately contributes to 1) better neuroscientific understanding; 2) enhance diagnostic performance; and 3) discover neural signals of interest that correlate maximally with a given cognitive paradigm. While we clearly study the multimodal functional neuroimaging challenge, the discussed machine learning techniques have a wide applicability, i.e., in general data fusion, and may thus be informative to the general interested reader. 60
DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild Guler, RA; Trigeorgis, G; Antonakos, E; Snape, P; Zafeiriou, S; Kokkinos, I In this paper we propose to learn a mapping from image pixels into a dense template grid through a fully convolutional network. We formulate this task as a regression problem and train our network by leveraging upon manually annotated facial landmarks in-the-wild. We use such landmarks to establish a dense correspondence field between a three-dimensional object template and the input image, which then serves as the ground-truth for training our regression system. We show that we can combine ideas from semantic segmentation with regression networks, yielding a highly-accurate 'quantized regression' architecture. Our system, called DenseReg, allows us to estimate dense image-to-template correspondences in a fully convolutional manner. As such our network can provide useful correspondence information as a stand-alone system, while when used as an initialization for Statistical Deformable Models we obtain landmark localization results that largely outperform the current state-of-the-art on the challenging 300W benchmark. We thoroughly evaluate our method on a host of facial analysis tasks, and demonstrate its use for other correspondence estimation tasks, such as the human body and the human ear. DenseReg code is made available at http://alpguler.com/DenseReg.html along with supplementary materials. 57
Semantic Segmentation With Context Encoding and Multi-Path Decoding Ding, HH; Jiang, XD; Shuai, B; Liu, AQ; Wang, G Semantic image segmentation aims to classify every pixel of a scene image to one of many classes. It implicitly involves object recognition, localization, and boundary delineation. In this paper, we propose a segmentation network called CGBNet to enhance the segmentation performance by context encoding and multi-path decoding. We first propose a context encoding module that generates context-contrasted local feature to make use of the informative context and the discriminative local information. This context encoding module greatly improves the segmentation performance, especially for inconspicuous objects. Furthermore, we propose a scale-selection scheme to selectively fuse the segmentation results from different-scales of features at every spatial position. It adaptively selects appropriate score maps from rich scales of features. To improve the segmentation performance results at boundary, we further propose a boundary delineation module that encourages the location-specific very-low-level features near the boundaries to take part in the final prediction and suppresses them far from the boundaries. The proposed segmentation network achieves very competitive performance in terms of all three different evaluation metrics consistently on the six popular scene segmentation datasets, Pascal Context, SUN-RGBD, Sift Flow, COCO Stuff, ADE20K, and Cityscapes. 55
De novo design of a non-local beta-sheet protein with high stability and accuracy Marcos, E; Chidyausiku, TM; McShan, AC; Evangelidis, T; Nerli, S; Carter, L; Nivon, LG; Davis, A; Oberdorfer, G; Tripsianes, K; Sgourakis, NG; Baker, D beta-sheet proteins carry out critical functions in biology, and hence are attractive scaffolds for computational protein design. Despite this potential, de novo design of all-beta-sheet proteins from first principles lags far behind the design of all-alpha or mixed-alpha beta domains owing to their non-local nature and the tendency of exposed beta-strand edges to aggregate. Through study of loops connecting unpaired beta-strands (beta-arches), we have identified a series of structural relationships between loop geometry, side chain directionality and beta-strand length that arise from hydrogen bonding and packing constraints on regular beta-sheet structures. We use these rules to de novo design jellyroll structures with double-stranded beta-helices formed by eight antiparallel beta-strands. The nuclear magnetic resonance structure of a hyperthermostable design closely matched the computational model, demonstrating accurate control over the beta-sheet structure and loop geometry. Our results open the door to the design of a broad range of non-local beta-sheet protein structures. 51
Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques Haeb-Umbach, R; Watanabe, S; Nakatani, T; Bacchiani, M; Hoffmeister, B; Seltzer, ML; Zen, H; Souden, M Once a popular theme of futuristic science fiction or far-fetched technology forecasts, digital home assistants with a spoken language interface have become a ubiquitous commodity today. This success has been made possible by major advancements in signal processing and machine learning for so-called far-field speech recognition, where the commands are spoken at a distance from the sound-capturing device. The challenges encountered are quite unique and different from many other use cases of automatic speech recognition (ASR). The purpose of this article is to describe, in a way that is amenable to the nonspecialist, the key speech processing algorithms that enable reliable, fully hands-free speech interaction with digital home assistants. These technologies include multichannel acoustic echo cancellation (MAEC), microphone array processing and dereverberation techniques for signal enhancement, reliable wake-up word and end-of-interaction detection, and high-quality speech synthesis as well as sophisticated statistical models for speech and language, learned from large amounts of heterogeneous training data. In all of these fields, deep learning (DL) has played a critical role. 50
Measuring Group Differences in High-Dimensional Choices: Method and Application to Congressional Speech Gentzkow, M; Shapiro, JM; Taddy, M We study the problem of measuring group differences in choices when the dimensionality of the choice set is large. We show that standard approaches suffer from a severe finite-sample bias, and we propose an estimator that applies recent advances in machine learning to address this bias. We apply this method to measure trends in the partisanship of congressional speech from 1873 to 2016, defining partisanship to be the ease with which an observer could infer a congressperson's party from a single utterance. Our estimates imply that partisanship is far greater in recent years than in the past, and that it increased sharply in the early 1990s after remaining low and relatively constant over the preceding century. 49
SPOTLIGHT: Detecting Anomalies in Streaming Graphs Eswaran, D; Faloutsos, C; Guha, S; Mishra, N How do we spot interesting events from e-mail or transportation logs? How can we detect port scan or denial of service attacks from IP-IP communication data? In general, given a sequence of weighted, directed or bipartite graphs, each summarizing a snapshot of activity in a time window, how can we spot anomalous graphs containing the sudden appearance or disappearance of large dense subgraphs (e.g., near bicliques) in near real-time using sublinear memory? To this end, we propose a randomized sketching-based approach called SpotLight, which guarantees that an anomalous graph is mapped 'far' away from 'normal' instances in the sketch space with high probability for appropriate choice of parameters. Extensive experiments on real-world datasets show that SpotLight (a) improves accuracy by at least 8.4% compared to prior approaches, (b) is fast and can process millions of edges within a few minutes, (c) scales linearly with the number of edges and sketching dimensions and (d) leads to interesting discoveries in practice. 48
Fast, Automated, Scalable Generation of Textured 3D Models of Indoor Environments Turner, E; Cheng, P; Zakhor, A 3D modeling of building architecture from mobile scanning is a rapidly advancing field. These models are used in virtual reality, gaming, navigation, and simulation applications. State-of-the-art scanning produces accurate point-clouds of building interiors containing hundreds of millions of points. This paper presents several scalable surface reconstruction techniques to generate watertight meshes that preserve sharp features in the geometry common to buildings. Our techniques can automatically produce high-resolution meshes that preserve the fine detail of the environment by performing a ray-carving volumetric approach to surface reconstruction. We present methods to automatically generate 2D floor plans of scanned building environments by detecting walls and room separations. These floor plans can be used to generate simplified 3D meshes that remove furniture and other temporary objects. We propose a method to texture-map these models from captured camera imagery to produce photo-realistic models. We apply these techniques to several data sets of building interiors, including multi-story datasets. 48
GANs for Biological Image Synthesis Osokin, A; Chessel, A; Salas, REC; Vaggi, F In this paper, we propose a novel application of Generative Adversarial Networks (GAN) to the synthesis of cells imaged by fluorescence microscopy. Compared to natural images, cells tend to have a simpler and more geometric global structure that facilitates image generation. However, the correlation between the spatial pattern of different fluorescent proteins reflects important biological functions, and synthesized images have to capture these relationships to be relevant for biological applications. We adapt GANs to the task at hand and propose new models with causal dependencies between image channels that can generate multichannel images, which would be impossible to obtain experimentally. We evaluate our approach using two independent techniques and compare it against sensible baselines. Finally, we demonstrate that by interpolating across the latent space we can mimic the known changes in protein localization that occur through time during the cell cycle, allowing us to predict temporal evolution from static images. 46
Deep Single-Image Portrait Relighting Zhou, H; Hadap, S; Sunkavalli, K; Jacobs, DW Conventional physically-based methods for relighting portrait images need to solve an inverse rendering problem, estimating face geometry, reflectance and lighting. However, the inaccurate estimation of face components can cause strong artifacts in relighting, leading to unsatisfactory results. In this work, we apply a physically-based portrait relighting method to generate a large scale, high quality, in the wild portrait relighting dataset (DPR). A deep Convolutional Neural Network (CNN) is then trained using this dataset to generate a relit portrait image by using a source image and a target lighting as input. The training procedure regularizes the generated results, removing the artifacts caused by physically-based relighting methods. A GAN loss is further applied to improve the quality of the relit portrait image. Our trained network can relight portrait images with resolutions as high as 1024 x 1024. We evaluate the proposed method on the proposed DPR dataset, Flickr portrait dataset and Multi-PIE dataset both qualitatively and quantitatively. Our experiments demonstrate that the proposed method achieves state-of-the-art results. Please refer to https://zhhoper.github.io/dpr.html for dataset and code. 44
Webly-Supervised Video Recognition by Mutually Voting for Relevant Web Images and Web Video Frames Gan, C; Sun, C; Duan, LX; Gong, BQ Video recognition usually requires a large amount of training samples, which are expensive to be collected. An alternative and cheap solution is to draw from the large-scale images and videos from the Web. With modern search engines, the top ranked images or videos are usually highly correlated to the query, implying the potential to harvest the labeling-free Web images and videos for video recognition. However, there are two key difficulties that prevent us from using the Web data directly. First, they are typically noisy and may be from a completely different domain from that of users' interest (e.g. cartoons). Second, Web videos are usually untrimmed and very lengthy, where some query-relevant frames are often hidden in between the irrelevant ones. A question thus naturally arises: to what extent can such noisy Web images and videos be utilized for labeling-free video recognition? In this paper, we propose a novel approach to mutually voting for relevant Web images and video frames, where two forces are balanced, i.e. aggressive matching and passive video frame selection. We validate our approach on three large-scale video recognition datasets. 41
Building better measures of role ambiguity and role conflict: The validation of new role stressor scales Bowling, NA; Khazon, S; Alarcon, GM; Blackmore, CE; Bragg, CB; Hoepf, MR; Barelka, A; Kennedy, K; Wang, Q; Li, HY Occupational stress researchers have given considerable attention to role ambiguity and role conflict as predictors of employee health, job attitudes and behaviour. However, the validity of the Rizzo, House, and Lirtzman's (1970) scales - the most popular role stressor measures - has been a source of disagreement among researchers. In response to the disputed validity of the Rizzo et al. scales, we developed new measures of role ambiguity and role conflict and conducted five studies to examine their psychometric qualities (Study 1 N = 101 U.S. workers; Study 2 N = 118 workers primarily employed in the U.S.; Study 3 N = 135 employed U.S. MBA students; Study 4 N = 973 members of the U.S. Air Force (USAF); Study 5 N = 234 workers primarily employed in the U.S.). Across these five studies, we found that the new role stressor scales have desirable psychometric qualities: they displayed high levels of substantive validity, high levels of internal consistency and test-retest reliability, they produced an interpretable factor structure, and we found evidence of their construct validity. We therefore recommend that these new scales be used in future research on role stress. 41
Sustainable computational science: the ReScience initiative Rougier, NP; Hinsen, K; Alexandre, F; Arildsen, T; Barba, LA; Benureau, FCY; Brown, CT; de Buyl, P; Caglayan, O; Davison, AP; Delsuc, MA; Detorakis, G; Diem, AK; Drix, D; Enel, P; Girard, B; Guest, O; Hall, MG; Henriques, RN; Hinaut, X; Jaron, KS; Khamassi, M; Klein, A; Manninen, T; Marchesi, P; McGlinn, D; Metzner, C; Petchey, O; Plesser, HE; Poisot, T; Ram, K; Ram, Y; Roesch, E; Rossant, C; Rostami, V; Shifman, A; Stachelek, J; Stimberg, M; Stollmeier, F; Vaggi, F; Viejo, G; Vitay, J; Vostinar, AE; Yurchak, R; Zito, T Computer science offers a large set of tools for prototyping, writing, running, testing, validating, sharing and reproducing results; however, computational science lags behind. In the best case, authors may provide their source code as a compressed archive and they may feel confident their research is reproducible. But this is not exactly true. James Buckheit and David Donoho proposed more than two decades ago that an article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result. This implies new workflows, in particular in peer-reviews. Existing journals have been slow to adapt: source codes are rarely requested and are hardly ever actually executed to check that they produce the results advertised in the article. ReScience is a peer-reviewed journal that targets computational research and encourages the explicit replication of already published research, promoting new and open-source implementations in order to ensure that the original research can be replicated from its description. To achieve this goal, the whole publishing chain is radically different from other traditional scientific journals. ReScience resides on GitHub where each new implementation of a computational study is made available together with comments, explanations, and software tests. 39
SUBIC: A supervised, structured binary code for image search Jain, H; Zepeda, J; Perez, P; Gribonval, R For large-scale visual search, highly compressed yet meaningful representations of images are essential. Structured vector quantizers based on product quantization and its variants are usually employed to achieve such compression while minimizing the loss of accuracy. Yet, unlike binary hashing schemes, these unsupervised methods have not yet benefited from the supervision, end-to-end learning and novel architectures ushered in by the deep learning revolution. We hence propose herein a novel method to make deep convolutional neural networks produce supervised, compact, structured binary codes for visual search. Our method makes use of a novel block-softmax non-linearity and of batch-based entropy losses that together induce structure in the learned encodings. We show that our method outperforms state-of-the-art compact representations based on deep hashing or structured quantization in single and cross-domain category retrieval, instance retrieval and classification. We make our code and models publicly available online. 35
Temporal Structure Mining for Weakly Supervised Action Detection Yu, T; Ren, Z; Li, YC; Yan, EX; Xu, N; Yuan, JS Different from the fully-supervised action detection problem that is dependent on expensive frame-level annotations, weakly supervised action detection (WSAD) only needs video-level annotations, making it more practical for real-world applications. Existing WSAD methods detect action instances by scoring each video segment (a stack of frames) individually. Most of them fail to model the temporal relations among video segments and cannot effectively characterize action instances possessing latent temporal structure. To alleviate this problem in WSAD, we propose the temporal structure mining (TSM) approach. In TSM, each action instance is modeled as a multi-phase process and phase evolving within an action instance, i.e., the temporal structure, is exploited. In this framework, phase filters are used to calculate the confidence scores of the presence of an action's phases in each segment. Since in the WSAD task, frame-level annotations are not available and thus phase filters cannot be trained directly. To tackle the challenge, we treat each segment's phase as a hidden variable. We use segments' confidence scores from each phase filter to construct a table and determine hidden variables, i.e., phases of segments, by a maximal circulant path discovery along the table. Experiments conducted on three benchmark datasets demonstrate good performance of the proposed TSM. 35
Improving mobility by optimizing the number, location and usage of loading/unloading bays for urban freight vehicles Alho, AR; Silva, JDE; de Sousa, JP; Blanco, E The role of urban freight vehicle trips in fulfilling the consumption needs of people in urban areas is often overshadowed by externality-causing parking practices (e.g., double-parking associated with traffic delays). Loading/unloading bays are generally viewed as an effective way to avoid freight vehicles double-parking, but are often misused by non-freight vehicles. We assess the potential of reducing freight vehicles double-parking mobility impacts by changing: (a) the spatial configuration (number, location, size) of loading/unloading bays and, (b) the non-freight vehicles parking rules compliance levels. Parking demand models were created with data from an establishment-based freight survey and a parking observation exercise. Two case studies were defined for 1 km(2) zones in the city of Lisbon, Portugal. Alternative bay systems were derived from an iterative implementation of the maximize capacitated coverage algorithm to a range of bays to be located. Parking operations in current and alternative bay systems were compared using a microsimulation. Bay systems' ability in reducing double-parking impacts was assessed via a set of indicators (e.g., average speed). Freight traffic causes a disproportionate amount of externalities and the current bay configuration leads to greater mobility impacts than some of the proposed systems. Enforcement was a crucial element in reducing parking operations impact on traffic flow in one of the case-studies. Road network characteristics were demonstrated to play a role in the adequate strategy of arranging the spatial configuration of bays. (C) 2017 Elsevier Ltd. All rights reserved. 35
The 4th AI City Challenge Naphade, M; Wang, S; Anastasiu, DC; Tang, Z; Chang, MC; Yang, XD; Zheng, L; Sharma, A; Chellappa, R; Chakraborty, P The AI City Challenge was created to accelerate intelligent video analysis that helps make cities smarter and safer. Transportation is one of the largest segments that can benefit from actionable insights derived from data captured by sensors, where computer vision and deep learning have shown promise in achieving large-scale practical deployment. The 4th annual edition of the AI City Challenge has attracted 315 participating teams across 37 countries, who leverage city-scale real traffic data and high-quality synthetic data to compete in four challenge tracks. Track 1 addressed video-based automatic vehicle counting, where the evaluation is conducted on both algorithmic effectiveness and computational efficiency. Track 2 addressed city-scale vehicle re-identification with augmented synthetic data to substantially increase the training set for the task. Track 3 addressed city-scale multi-target multi-camera vehicle tracking. Track 4 addressed traffic anomaly detection. The evaluation system shows two leader boards, in which a general leader board shows all submitted results, and a public leader board shows results limited to our contest participation rules, that teams are not allowed to use external data in their work. The general leader board shows results more close to real-world situations where annotated data are limited. Our results show promise that AI technology can enable smarter and safer transportation systems. 35
Mining sequential patterns for classification Fradkin, D; Morchen, F While a number of efficient sequential pattern mining algorithms were developed over the years, they can still take a long time and produce a huge number of patterns, many of which are redundant. These properties are especially frustrating when the goal of pattern mining is to find patterns for use as features in classification problems. In this paper, we describe BIDE-Discriminative, a modification of BIDE that uses class information for direct mining of predictive sequential patterns. We then perform an extensive evaluation on nine real-life datasets of the different ways in which the basic BIDE-Discriminative can be used in real multi-class classification problems, including 1-versus-rest and model-based search tree approaches. The results of our experiments show that 1-versus-rest provides an efficient solution with good classification performance. 34
An Evaluation of Empirical Bayes's Estimation of Value-Added Teacher Performance Measures Guarino, CM; Maxfield, M; Reckase, MD; Thompson, PN; Wooldridge, JM Empirical Bayes's (EB) estimation has become a popular procedure used to calculate teacher value added, often as a way to make imprecise estimates more reliable. In this article, we review the theory of EB estimation and use simulated and real student achievement data to study the ability of EB estimators to properly rank teachers. We compare the performance of EB estimators with that of other widely used value-added estimators under different teacher assignment scenarios. We find that, although EB estimators generally perform well under random assignment (RA) of teachers to classrooms, their performance suffers under nonrandom teacher assignment. Under non-RA, estimators that explicitly (if imperfectly) control for the teacher assignment mechanism perform the best out of all the estimators we examine. We also find that shrinking the estimates, as in EB estimation, does not itself substantially boost performance. 33
Estimating Node Importance in Knowledge Graphs Using Graph Neural Networks Park, N; Kan, A; Dong, XL; Zhao, T; Faloutsos, C How can we estimate the importance of nodes in a knowledge graph (KG)? A KG is a multi-relational graph that has proven valuable for many tasks including question answering and semantic search. In this paper, we present GENI, a method for tackling the problem of estimating node importance in KGs, which enables several downstream applications such as item recommendation and resource allocation. While a number of approaches have been developed to address this problem for general graphs, they do not fully utilize information available in KGs, or lack flexibility needed to model complex relationship between entities and their importance. To address these limitations, we explore supervised machine learning algorithms. In particular, building upon recent advancement of graph neural networks (GNNs), we develop GENI, a GNN-based method designed to deal with distinctive challenges involved with predicting node importance in KGs. Our method performs an aggregation of importance scores instead of aggregating node embeddings via predicate-aware attention mechanism and flexible centrality adjustment. In our evaluation of GENI and existing methods on predicting node importance in real-world KGs with different characteristics, GENI achieves 5-17% higher NDCG@100 than the state of the art. 32
Robust 3D Hand Pose Estimation From Single Depth Images Using Multi-View CNNs Ge, LH; Liang, H; Yuan, JS; Thalmann, D Articulated hand pose estimation is one of core technologies in human-computer interaction. Despite the recent progress, most existing methods still cannot achieve satisfactory performance, partly due to the difficulty of the embedded high-dimensional nonlinear regression problem. Most existing data-driven methods directly regress 3D hand pose from 2D depth image, which cannot fully utilize the depth information. In this paper, we propose a novel multi-view convolutional neural network (CNN)-based approach for 3D hand pose estimation. To better exploit 3D information in the depth image, we project the point cloud generated from the query depth image onto multiple views of two projection settings and integrate them for more robust estimation. Multi-view CNNs are trained to learn the mapping from projected images to heat-maps, which reflect probability distributions of joints on each view. These multi-view heat-maps are then fused to estimate the optimal 3D hand pose with learned pose priors, and the unreliable information in multi-view heat-maps is suppressed using a view selection method. Experimental results show that the proposed method is superior to the state-of-the-art methods on two challenging data sets. Furthermore, a cross-data set experiment also validates that our proposed approach has good generalization ability. 32
Action recognition with spatial-temporal discriminative filter banks Martinez, B; Modolo, D; Xiong, YJ; Tighe, J Action recognition has seen a dramatic performance improvement in the last few years. Most of the current state-of-the-art literature either aims at improving performance through changes to the backbone CNN network, or they explore different trade-offs between computational efficiency and performance, again through altering the backbone network. However, almost all of these works maintain the same last layers of the network, which simply consist of a global average pooling followed by a fully connected layer. In this work we focus on how to improve the representation capacity of the network, but rather than altering the backbone, we focus on improving the last layers of the network, where changes have low impact in terms of computational cost. In particular, we show that current architectures have poor sensitivity to finer details and we exploit recent advances in the fine-grained recognition literature to improve our model in this aspect. With the proposed approach, we obtain state-of-the-art performance on Kinetics-400 and Something-Something-V1, the two major large-scale action recognition benchmarks. 31
Adaptive, Personalized Diversity for Visual Discovery Teo, CH; Nassif, H; Hill, D; Srinivasan, S; Goodman, M; Mohan, V; Vishwanathan, SVN Search queries are appropriate when users have explicit intent, but they perform poorly when the intent is difficult to express or if the user is simply looking to be inspired. Visual browsing systems allow e-commerce platforms to address these scenarios while offering the user an engaging shopping experience. Here we explore extensions in the direction of adaptive personalization and item diversification within Stream, a new form of visual browsing and discovery by Amazon. Our system presents the user with a diverse set of interesting items while adapting to user interactions. Our solution consists of three components (1) a Bayesian regression model for scoring the relevance of items while leveraging uncertainty, (2) a submodular diversification framework that re-ranks the top scoring items based on category, and (3) personalized category preferences learned from the user's behavior. When tested on live traffic, our algorithms show a strong lift in click-through-rate and session duration. 30
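As a rough illustration of the diversification component described in this entry, the toy re-ranker below greedily selects items while discounting categories that have already been picked. It is only a sketch of the general idea; the penalty weight and the (item, category, score) tuple format are invented for this example and are not the production system described above.

def diversify(scored_items, k, penalty=0.3):
    # scored_items: list of (item_id, category, relevance_score) tuples.
    # Greedily pick k items, discounting an item's score by how many items
    # of the same category have already been selected.
    pool = list(scored_items)
    chosen, picked_per_cat = [], {}
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda t: t[2] - penalty * picked_per_cat.get(t[1], 0))
        chosen.append(best)
        picked_per_cat[best[1]] = picked_per_cat.get(best[1], 0) + 1
        pool.remove(best)
    return chosen

print(diversify([("a", "shoes", 0.90), ("b", "shoes", 0.85), ("c", "bags", 0.60)], k=2))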
Data Integration and Machine Learning: A Natural Synergy Dong, XL; Rekatsinas, T There is now more data to analyze than ever before. As data volume and variety have increased, so have the ties between machine learning and data integration become stronger. For machine learning to be effective, one must utilize data from the greatest possible variety of sources; and this is why data integration plays a key role. At the same time machine learning is driving automation in data integration, resulting in overall reduction of integration costs and improved accuracy. This tutorial focuses on three aspects of the synergistic relationship between data integration and machine learning: (1) we survey how state-of-the-art data integration solutions rely on machine learning-based approaches for accurate results and effective human-in-the-loop pipelines, (2) we review how end-to-end machine learning applications rely on data integration to identify accurate, clean, and relevant data for their analytics exercises, and (3) we discuss open research challenges and opportunities that span across data integration and machine learning. 30
Deep View Synthesis from Sparse Photometric Images Xu, ZX; Bi, S; Sunkavalli, K; Hadap, S; Su, H; Ramamoorthi, R The goal of light transport acquisition is to take images from a sparse set of lighting and viewing directions, and combine them to enable arbitrary relighting with changing view. While relighting from sparse images has received significant attention, there has been relatively less progress on view synthesis from a sparse set of photometric images-images captured under controlled conditions, lit by a single directional source; we use a spherical gantry to position the camera on a sphere surrounding the object. In this paper, we synthesize novel viewpoints across a wide range of viewing directions (covering a 60 degrees cone) from a sparse set of just six viewing directions. While our approach relates to previous view synthesis and image-based rendering techniques, those methods are usually restricted to much smaller baselines, and are captured under environment illumination. At our baselines, input images have few correspondences and large occlusions; however we benefit from structured photometric images. Our method is based on a deep convolutional network trained to directly synthesize new views from the six input views. This network combines 3D convolutions on a plane sweep volume with a novel per-view per-depth plane attention map prediction network to effectively aggregate multi-view appearance. We train our network with a large-scale synthetic dataset of 1000 scenes with complex geometry and material properties. In practice, it is able to synthesize novel viewpoints for captured real data and reproduces complex appearance effects like occlusions, view-dependent specularities and hard shadows. Moreover, the method can also be combined with previous relighting techniques to enable changing both lighting and view, and applied to computer vision problems like multiview stereo from sparse image sets. 30
Optimal Quantile Approximation in Streams Karnin, Z; Lang, K; Liberty, E This paper resolves one of the longest standing basic problems in the streaming computational model. Namely, optimal construction of quantile sketches. An epsilon approximate quantile sketch receives a stream of items x(1),..., x(n) and allows one to approximate the rank of any query item up to additive error epsilon(n) with probability at least 1 - delta. The rank of a query x is the number of stream items such that x(i) <= x. The minimal sketch size required for this task is trivially at least 1/epsilon. Felber and Ostrovsky obtain a O((1/epsilon) log(1/epsilon)) space sketch for a fixed delta. Without restrictions on the nature of the stream or the ratio between epsilon and n, no better upper or lower bounds were known to date. This paper obtains an O((1/epsilon) log log(1/delta)) space sketch and a matching lower bound. This resolves the open problem and proves a qualitative gap between randomized and deterministic quantile sketching for which an Omega((1/epsilon) log(1/epsilon)) lower bound is known. One of our contributions is a novel representation and modification of the widely used merge-and-reduce construction. This modification allows for an analysis which is both tight and extremely simple. The same technique was reported, in private communications, to be useful for improving other sketching objectives and geometric coreset constructions. 29
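To give a concrete feel for the compaction idea behind such quantile sketches, here is a deliberately simplified Python sketch: each level keeps a buffer, and when a buffer fills it is sorted and every other element (with a random offset) is promoted to the next level, where items carry doubled weight. The constant buffer size below is our simplification; the paper's optimal space bound relies on a more careful capacity schedule and analysis.

import random

class ToyQuantileSketch:
    def __init__(self, k=64):
        self.k = k              # buffer capacity per level (simplified: constant)
        self.levels = [[]]      # items stored at level L carry weight 2**L

    def update(self, x):
        self.levels[0].append(x)
        lvl = 0
        while len(self.levels[lvl]) >= self.k:
            buf = sorted(self.levels[lvl])
            if lvl + 1 == len(self.levels):
                self.levels.append([])
            # compact: keep every other item, starting at a random offset
            self.levels[lvl + 1].extend(buf[random.randint(0, 1)::2])
            self.levels[lvl] = []
            lvl += 1

    def rank(self, x):
        # approximate number of inserted items that are <= x
        return sum(2 ** lvl * sum(1 for v in buf if v <= x)
                   for lvl, buf in enumerate(self.levels))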
Adaptive Quasi-Dynamic Traffic Light Control Fleck, JL; Cassandras, CG; Geng, YF We consider the traffic light control problem for a single intersection modeled as a stochastic hybrid system. We study a quasi-dynamic policy based on partial state information defined by detecting whether vehicle backlogs are above or below certain thresholds. The policy is parameterized by green and red cycle lengths as well as the road content thresholds. Using infinitesimal perturbation analysis, we derive online gradient estimators of a cost metric with respect to the controllable light cycles and threshold parameters and use these estimators to iteratively adjust all the controllable parameters through an online gradient-based algorithm so as to improve the overall system performance under various traffic conditions. The results obtained by applying this methodology to a simulated urban setting are also included. 29
Salience Rank: Efficient Keyphrase Extraction with Topic Modeling Teneva, N; Cheng, WW Topical PageRank (TPR) uses latent topic distribution inferred by Latent Dirichlet Allocation (LDA) to perform ranking of noun phrases extracted from documents. The ranking procedure consists of running PageRank K times, where K is the number of topics used in the LDA model. In this paper, we propose a modification of TPR, called Salience Rank. Salience Rank only needs to run PageRank once and extracts comparable or better keyphrases on benchmark datasets. In addition to quality and efficiency benefits, our method has the flexibility to extract keyphrases with varying tradeoffs between topic specificity and corpus specificity. 29
Balancing flow table occupancy and link utilization in software-defined networks Guo, ZH; Xu, Y; Liu, RY; Gushchin, A; Chen, KY; Walid, A; Chao, HJ Software-Defined Networking (SDN) employs a centralized control with a global network view and provides great opportunities to improve network performance. However, due to the limitation of flow table space at the switches and unbalanced traffic allocation on links, an SDN may suffer from flow table overflow and inefficient bandwidth allocation among flows, increasing the controller's burden and degrading network performance. In this paper, we present a dynamic routing scheme named DIFF that differentiates flows based on their impact on network resource and adaptively selects routing paths for them to mitigate the problems of flow-table overflow and inefficient bandwidth allocation. DIFF pre-generates a set of paths for each pair of source-destination edge switches and intelligently selects the paths from the pre-generated path-sets for new flows with an objective to balance flow-table utilizations. It adaptively reroutes some elephant flows to achieve maximum throughput under the rule of max-min fair bandwidth allocation. Simulation results show that DIFF simultaneously balances the flow table and link utilizations, reduces the controller's workload and packet delay, while increasing network throughput, compared with baseline schemes. 29
An efficient pattern mining approach for event detection in multivariate temporal data Batal, I; Cooper, GF; Fradkin, D; Harrison, J; Moerchen, F; Hauskrecht, M This work proposes a pattern mining approach to learn event detection models from complex multivariate temporal data, such as electronic health records. We present recent temporal pattern mining, a novel approach for efficiently finding predictive patterns for event detection problems. This approach first converts the time series data into time-interval sequences of temporal abstractions. It then constructs more complex time-interval patterns backward in time using temporal operators. We also present the minimal predictive recent temporal patterns framework for selecting a small set of predictive and non-spurious patterns. We apply our methods for predicting adverse medical events in real-world clinical data. The results demonstrate the benefits of our methods in learning accurate event detection models, which is a key step for developing intelligent patient monitoring and decision support systems. 29
Taming Pretrained Transformers for Extreme Multi-label Text Classification Chang, WC; Yu, HF; Zhong, K; Yang, YM; Dhillon, IS We consider the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. For example, the input text could be a product description on Amazon.com and the labels could be product categories. XMC is an important yet challenging problem in the NLP community. Recently, deep pretrained transformer models have achieved state-of-the-art performance on many NLP tasks including sentence classification, albeit with small label sets. However, naively applying deep transformer models to the XMC problem leads to sub-optimal performance due to the large output space and the label sparsity issue. In this paper, we propose X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem. The proposed method achieves new state-of-the-art results on four XMC benchmark datasets. In particular, on a Wiki dataset with around 0.5 million labels, the prec@1 of X-Transformer is 77.28%, a substantial improvement over state-of-the-art XMC approaches Parabel (linear) and AttentionXML (neural), which achieve 68.70% and 76.95% precision@1, respectively. We further apply X-Transformer to a product2query dataset from Amazon and gained 10.7% relative improvement on prec@1 over Parabel. 28
Tropical-Forest Structure and Biomass Dynamics from TanDEM-X Radar Interferometry Treuhaft, R; Lei, Y; Goncalves, F; Keller, M; dos Santos, JR; Neumann, M; Almeida, A Changes in tropical-forest structure and aboveground biomass (AGB) contribute directly to atmospheric changes in CO2, which, in turn, bear on global climate. This paper demonstrates the capability of radar-interferometric phase-height time series at X-band (wavelength = 3 cm) to monitor changes in vertical structure and AGB, with sub-hectare and monthly spatial and temporal resolution, respectively. The phase-height observation is described, with a focus on how it is related to vegetation-density, radar-power vertical profiles, and mean canopy heights, which are, in turn, related to AGB. The study site covers 18 x 60 km in the Tapajos National Forest in the Brazilian Amazon. Phase-heights over Tapajos were measured by DLR's TanDEM-X radar interferometer 32 times in a 3.2 year period from 2011-2014. Fieldwork was done on 78 secondary and primary forest plots. In the absence of disturbance, rates of change of phase-height for the 78 plots were estimated by fitting the phase-heights to time with a linear model. Phase-height time series for the disturbed plots were fit to the logistic function to track jumps in phase-height. The epochs of clearing for the disturbed plots were identified with approximate to 1-month accuracy. The size of the phase-height change due to disturbance was estimated with approximate to 2-m accuracy. The monthly time resolution will facilitate REDD+ monitoring. Phase-height rates of change were shown to correlate with LiDAR RH90 height rates taken over a subset of the TanDEM-X data's time span (2012-2013). The average rate of change of phase-height across all 78 plots was 0.5 m-yr(-1) with a standard deviation of 0.6 m-yr(-1). For 42 secondary forest plots, the average rate of change of phase-height was 0.8 m-yr(-1) with a standard deviation of 0.6 m-yr(-1). For 36 primary forest plots, the average phase-height rate was 0.1 m-yr(-1) with a standard deviation of 0.5 m-yr(-1). A method for converting phase-height rates to AGB-rates of change was developed using previously measured phase-heights and field-estimated AGB. For all 78 plots, the average AGB-rate was 1.7 Mg-ha(-1)-yr(-1) with a standard deviation of 4.0 Mg-ha(-1)-yr(-1). The secondary-plot average AGB-rate was 2.1 Mg-ha(-1)-yr(-1), with a standard deviation of 2.4 Mg-ha(-1)-yr(-1). For primary plots, the AGB average rate was 1.1 Mg-ha(-1)-yr(-1) with a standard deviation of 5.2 Mg-ha(-1)-yr(-1). Given the standard deviations and the number of plots in each category, rates in secondary forests and all forests were significantly different from zero; rates in primary forests were consistent with zero. AGB-rates were compared to change models for Tapajos and to LiDAR-based change measurements in other tropical forests. Strategies for improving AGB dynamical monitoring with X-band interferometry are discussed. 28
A Scalable Asynchronous Distributed Algorithm for Topic Modeling Yu, HF; Hsieh, CJ; Yun, H; Vishwanathan, SVN; Dhillon, IS Learning meaningful topic models with massive document collections which contain millions of documents and billions of tokens is challenging because of two reasons. First, one needs to deal with a large number of topics (typically on the order of thousands). Second, one needs a scalable and efficient way of distributing the computation across multiple machines. In this paper, we present a novel algorithm F+Nomad LDA which simultaneously tackles both these problems. In order to handle large number of topics we use an appropriately modified Fenwick tree. This data structure allows us to sample from a multinomial distribution over T items in O(log T) time. Moreover, when topic counts change the data structure can be updated in O(log T) time. In order to distribute the computation across multiple processors, we present a novel asynchronous framework inspired by the Nomad algorithm of [25]. We show that F+Nomad LDA significantly outperforms recent state-of-the-art topic modeling approaches on massive problems which involve millions of documents, billions of words, and thousands of topics. 27
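The Fenwick-tree trick mentioned in this abstract is easy to illustrate in isolation: store topic weights in a binary indexed tree so that updating a weight and drawing a topic in proportion to the weights both cost O(log T). The sketch below shows only that data structure, with hypothetical names; it is not the distributed F+Nomad LDA sampler itself.

import random

class FenwickSampler:
    def __init__(self, n):
        self.n = n
        self.tree = [0.0] * (n + 1)          # 1-based Fenwick tree of weights

    def add(self, topic, delta):             # topic is a 0-based index
        i = topic + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)

    def total(self):
        s, i = 0.0, self.n
        while i > 0:
            s += self.tree[i]
            i -= i & (-i)
        return s

    def sample(self):
        # Descend the tree to find the topic whose cumulative-weight
        # interval contains u; this is the O(log T) multinomial draw.
        u = random.random() * self.total()
        pos, step = 0, 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] <= u:
                u -= self.tree[nxt]
                pos = nxt
            step >>= 1
        return pos                            # 0-based topic index

s = FenwickSampler(4)
for topic, weight in enumerate([1.0, 3.0, 0.5, 5.5]):
    s.add(topic, weight)
print(s.sample())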
Ultra High-Dimensional Nonlinear Feature Selection for Big Biological Data Yamada, M; Tang, JL; Lugo-Martinez, J; Hodzic, E; Shrestha, R; Saha, A; Ouyang, H; Yin, DW; Mamitsuka, H; Sahinalp, C; Radivojac, P; Menczer, F; Chang, Y Machine learning methods are used to discover complex nonlinear relationships in biological and medical data. However, sophisticated learning models are computationally unfeasible for data with millions of features. Here, we introduce the first feature selection method for nonlinear learning problems that can scale up to large, ultra-high dimensional biological data. More specifically, we scale up the novel Hilbert-Schmidt Independence Criterion Lasso (HSIC Lasso) to handle millions of features with tens of thousand samples. The proposed method is guaranteed to find an optimal subset of maximally predictive features with minimal redundancy, yielding higher predictive power and improved interpretability. Its effectiveness is demonstrated through applications to classify phenotypes based on module expression in human prostate cancer patients and to detect enzymes among protein structures. We achieve high accuracy with as few as 20 out of one million features-a dimensionality reduction of 99.998 percent. Our algorithm can be implemented on commodity cloud computing platforms. The dramatic reduction of features may lead to the ubiquitous deployment of sophisticated prediction models in mobile health care applications. 27
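The quantity at the heart of HSIC Lasso, the empirical Hilbert-Schmidt Independence Criterion, is compact enough to show directly. The snippet below computes a (biased) HSIC estimate between two one-dimensional variables using Gaussian kernels; it is only meant to convey what the criterion measures, and the fixed kernel bandwidths are arbitrary choices for illustration rather than the settings used in the paper.

import numpy as np

def rbf_gram(v, sigma):
    d2 = (v[:, None] - v[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    # Biased empirical HSIC: trace(K H L H) / (n - 1)^2, where H centres
    # the Gram matrices. Larger values indicate stronger dependence.
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    K, L = rbf_gram(x, sigma_x), rbf_gram(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2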
CERES: Distantly Supervised Relation Extraction from the Semi-Structured Web Lockard, C; Dong, XLN; Einolghozati, A; Shiralkar, P The web contains countless semi-structured websites, which can be a rich source of information for populating knowledge bases. Existing methods for extracting relations from the DOM trees of semi-structured webpages can achieve high precision and recall only when manual annotations for each website are available. Although there have been efforts to learn extractors from automatically generated labels, these methods are not sufficiently robust to succeed in settings with complex schemas and information-rich websites. In this paper we present a new method for automatic extraction from semi-structured websites based on distant supervision. We automatically generate training labels by aligning an existing knowledge base with a website and leveraging the unique structural characteristics of semi-structured websites. We then train a classifier based on the potentially noisy and incomplete labels to predict new relation instances. Our method can compete with annotation-based techniques in the literature in terms of extraction quality. A large-scale experiment on over 400,000 pages from dozens of multi-lingual long-tail websites harvested 1.25 million facts at a precision of 90%. 25
Experiences with GreenGPS-Fuel-Efficient Navigation Using Participatory Sensing Saremi, F; Fatemieh, O; Ahmadi, H; Wang, HY; Abdelzaher, T; Ganti, R; Liu, HC; Hu, SH; Li, S; Su, L Participatory sensing services based on mobile phones constitute an important growing area of mobile computing. Most services start small and hence are initially sparsely deployed. Unless a mobile service adds value while sparsely deployed, it may not survive conditions of sparse deployment. The paper offers a generic solution to this problem and illustrates this solution in the context of GreenGPS, a navigation service that allows drivers to find the most fuel-efficient routes customized for their vehicles between arbitrary end-points. Specifically, when the participatory sensing service is sparsely deployed, we demonstrate a general framework for generalization from sparse collected data to produce models extending beyond the current data coverage. This generalization allows the mobile service to offer value under broader conditions. GreenGPS uses our developed participatory sensing infrastructure and generalization algorithms to perform inexpensive data collection, aggregation, and modeling in an end-to-end automated fashion. The models are subsequently used by our backend engine to predict customized fuel-efficient routes for both members and non-members of the service. GreenGPS is offered as a mobile phone application and can be easily deployed and used by individuals. A preliminary study of our green navigation idea was performed in [1]; however, the effort was focused on a proof-of-concept implementation that involved substantial offline and manual processing. In contrast, the results and conclusions in the current paper are based on a more advanced and accurate model and extensive data from a real-world phone-based implementation and deployment, which enables reliable and automatic end-to-end data collection and route recommendation. The system further benefits from lower cost and easier deployment. To evaluate the green navigation service efficiency, we conducted a user subject study consisting of 22 users driving different vehicles over the course of several months in Urbana-Champaign, IL. The experimental results using the collected data suggest that fuel savings of 21.5 percent over the fastest, 11.2 percent over the shortest, and 8.4 percent over the Garmin eco routes can be achieved by following GreenGPS green routes. The study confirms that our navigation service can survive conditions of sparse deployment and at the same time achieve accurate fuel predictions and lead to significant fuel savings. 25
Understanding Mixup Training Methods Liang, DJ; Yang, F; Zhang, T; Yang, P Mixup is a neural network training method that generates new samples by linear interpolation of multiple samples and their labels. The mixup training method has better generalization ability than the traditional empirical risk minimization method (ERM). But there is a lack of a more intuitive understanding of why mixup will perform better. In this paper, several different sample mixing methods are used to test how neural networks learn and infer from mixed samples to illustrate how mixups work as a data augmentation method and how it regularizes neural networks. Then, a method of weighting noise perturbation was designed to visualize the loss functions of mixup and ERM training methods to analyze the properties of their high-dimensional decision surfaces. Finally, by analyzing the mixture of samples and their labels, a spatial mixup approach was proposed that achieved the state-of-the-art performance on the CIFAR and ImageNet data sets. This method also enables the generative adversarial nets to have more stable training process and more diverse sample generation ability. 24
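Because the basic mixup operation is only a few lines, a sketch may help connect this abstract to practice. The snippet below mixes a batch of inputs and one-hot labels with a Beta-distributed coefficient; it shows the standard operation the paper analyses, not the specific spatial-mixup variant the authors propose.

import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    # Interpolate each example with a randomly chosen partner from the same
    # batch; labels are interpolated with the same coefficient.
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix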
Enhancing product robustness in reliability-based design optimization Zhuang, XT; Pan, R; Du, XP Different types of uncertainties need to be addressed in a product design optimization process. In this paper, the uncertainties in both product design variables and environmental noise variables are considered. The reliability-based design optimization (RBDO) is integrated with robust product design (RPD) to concurrently reduce the production cost and the long-term operation cost, including quality loss, in the process of product design. This problem leads to a multi-objective optimization with probabilistic constraints. In addition, the model uncertainties associated with a surrogate model that is derived from numerical computation methods, such as finite element analysis, are addressed. A hierarchical experimental design approach, augmented by a sequential sampling strategy, is proposed to construct the response surface of the product performance function for finding optimal design solutions. The proposed method is demonstrated through an engineering example. 24
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification Prokudin, S; Gehler, P; Nowozin, S Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as on low resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution is making strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art. 23
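The basic building block of this model, the von Mises likelihood over angles, is simple to write down. The helper below returns the negative log-likelihood for a single predicted mean direction and concentration; the mixture extension and the network that predicts these parameters are beyond this illustration, and the function name is ours.

import numpy as np
from scipy.special import i0

def von_mises_nll(theta, mu, kappa):
    # -log p(theta; mu, kappa), where p(theta) is the von Mises density
    # exp(kappa * cos(theta - mu)) / (2 * pi * I0(kappa)).
    return -(kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * i0(kappa)))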
Visual Recognition in RGB Images and Videos by Learning from RGB-D Data Li, W; Chen, L; Xu, D; Van Gool, L In this work, we propose a framework for recognizing RGB images or videos by learning from RGB-D training data that contains additional depth information. We formulate this task as a new unsupervised domain adaptation (UDA) problem, in which we aim to take advantage of the additional depth features in the source domain and also cope with the data distribution mismatch between the source and target domains. To handle the domain distribution mismatch, we propose to learn an optimal projection matrix to map the samples from both domains into a common subspace such that the domain distribution mismatch can be reduced. Such projection matrix can be effectively optimized by exploiting different strategies. Moreover, we also use different ways to utilize the additional depth features. To simultaneously cope with the above two issues, we formulate a unified learning framework called domain adaptation from multi-view to single-view (DAM2S). By defining various forms of regularizers in our DAM2S framework, different strategies can be readily incorporated to learn robust SVM classifiers for classifying the target samples, and three methods are developed under our DAM2S framework. We conduct comprehensive experiments for object recognition, cross-dataset and cross-view action recognition, which demonstrate the effectiveness of our proposed methods for recognizing RGB images and videos by learning from RGB-D data. 23
Requirements-Driven Test Generation for Autonomous Vehicles With Machine Learning Components Tuncali, CE; Fainekos, G; Prokhorov, D; Ito, H; Kapinski, J Autonomous vehicles are complex systems that are challenging to test and debug. A requirements-driven approach to the development process can decrease the resources required to design and test these systems, while simultaneously increasing the reliability. We present a testing framework that uses signal temporal logic (STL), which is a precise and unambiguous requirements language. Our framework evaluates test cases against the STL formulae and additionally uses the requirements to automatically identify test cases that fail to satisfy the requirements. One of the key features of our tool is the support for machine learning (ML) components in the system design, such as deep neural networks. The framework allows evaluation of the control algorithms, including the ML components, and it also includes models of CCD camera, lidar, and radar sensors, as well as the vehicle environment. We use multiple methods to generate test cases, including covering arrays, which is an efficient method to search discrete variable spaces. The resulting test cases can be used to debug the controller design by identifying controller behaviors that do not satisfy requirements. The test cases can also enhance the testing phase of development by identifying critical corner cases that correspond to the limits of the system's allowed behaviors. We present STL requirements for an autonomous vehicle system, which capture both component-level and system-level behaviors. Additionally, we present three driving scenarios and demonstrate how our requirements-driven testing framework can be used to identify critical system behaviors, which can be used to support the development process. 23
DEFEATnet-A Deep Conventional Image Representation for Image Classification Gao, SH; Duan, LX; Tsang, IW To study underlying possibilities for the successes of conventional image representation and deep neural networks (DNNs) in image representation, we propose a deep feature extraction, encoding, and pooling network (DEFEATnet) architecture, which is a marriage between conventional image representation approaches and DNNs. In particular, in DEFEATnet, each layer consists of three components: feature extraction, feature encoding, and pooling. The primary advantage of DEFEATnet is twofold. First, it consolidates the prior knowledge (e.g., translation invariance) from extracting, encoding, and pooling handcrafted features, as in the conventional feature representation approaches. Second, it represents the object parts at different granularities by gradually increasing the local receptive fields in different layers, as in DNNs. Moreover, DEFEATnet is a generalized framework that can readily incorporate all types of local features as well as all kinds of well-designed feature encoding and pooling methods. Since prior knowledge is preserved in DEFEATnet, it is especially useful for image representation on small/medium size data sets, where DNNs usually fail due to the lack of sufficient training data. Promising experimental results clearly show that DEFEATnets outperform shallow conventional image representation approaches by a large margin when the same type of features, feature encoding and pooling are used. The extensive experiments also demonstrate the effectiveness of the deep architecture of our DEFEATnet in improving the robustness for image presentation. 22
Collaborative Active Visual Recognition from Crowds: A Distributed Ensemble Approach Hua, G; Long, CJ; Yang, M; Gao, Y Active learning is an effective way of engaging users to interactively train models for visual recognition more efficiently. The vast majority of previous works focused on active learning with a single human oracle. The problem of active learning with multiple oracles in a collaborative setting has not been well explored. We present a collaborative computational model for active learning with multiple human oracles, the input from whom may possess different levels of noises. It leads to not only an ensemble kernel machine that is robust to label noises, but also a principled label quality measure to online detect irresponsible labelers. Instead of running independent active learning processes for each individual human oracle, our model captures the inherent correlations among the labelers through shared data among them. Our experiments with both simulated and real crowd-sourced noisy labels demonstrate the efficacy of our model. 22
Stratification of amyotrophic lateral sclerosis patients: a crowdsourcing approach Kueffner, R; Zach, N; Bronfeld, M; Norel, R; Atassi, N; Balagurusamy, V; Di Camillo, B; Chio, A; Cudkowicz, M; Dillenberger, D; Garcia-Garcia, J; Hardiman, O; Hoff, B; Knight, J; Leitner, ML; Li, G; Mangravite, L; Norman, T; Wang, LX; Alkallas, R; Anghel, C; Avril, J; Bacardit, J; Balser, B; Balser, J; Bar-Sinai, Y; Ben-David, N; Ben-Zion, E; Bliss, R; Cai, J; Chernyshev, A; Chiang, JH; Chicco, D; Corriveau, BAN; Dai, JQ; Deshpande, Y; Desplats, E; Durgin, JS; Espiritu, SMG; Fan, F; Fevrier, P; Fridley, BL; Godzik, A; Golinska, A; Gordon, J; Graw, S; Guo, YL; Herpelinck, T; Hopkins, J; Huang, B; Jacobsen, J; Jahandideh, S; Jeon, J; Ji, WK; Jung, K; Karanevich, A; Koestler, DC; Kozak, M; Kurz, C; Lalansingh, C; Larrieu, T; Lazzarini, N; Lerner, B; Lesinski, W; Liang, XT; Lin, XH; Lowe, J; Mackey, L; Meier, R; Min, WW; Mnich, K; Nahmias, V; Noel-MacDonnell, J; O'Donnell, A; Paadre, S; Park, J; Polewko-Klim, A; Raghavan, R; Rudnicki, W; Saghapour, E; Salomond, JB; Sankaran, K; Sendorek, D; Sharan, V; Shiah, YJ; Sirois, JK; Sumanaweera, DN; Usset, J; Vang, YS; Vens, C; Wadden, D; Wang, D; Wong, WC; Xie, XH; Xu, ZQ; Yang, HT; Yu, X; Zhang, HC; Zhang, L; Zhang, SH; Zhu, SF; Xiao, JF; Fang, WC; Peng, J; Yang, C; Chang, HJ; Stolovitzky, G Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disease where substantial heterogeneity in clinical presentation urgently requires a better stratification of patients for the development of drug trials and clinical care. In this study we explored stratification through a crowdsourcing approach, the DREAM Prize4Life ALS Stratification Challenge. Using data from > 10,000 patients from ALS clinical trials and 1479 patients from community-based patient registers, more than 30 teams developed new approaches for machine learning and clustering, outperforming the best current predictions of disease outcome. We propose a new method to integrate and analyze patient clusters across methods, showing a clear pattern of consistent and clinically relevant sub-groups of patients that also enabled the reliable classification of new patients. Our analyses reveal novel insights in ALS and describe for the first time the potential of a crowdsourcing to uncover hidden patient sub-populations, and to accelerate disease understanding and therapeutic development. 21
Learning an event sequence embedding for dense event-based deep stereo Tulyakov, S; Fleuret, F; Kiefel, M; Gehler, P; Hirsch, M Today, a frame-based camera is the sensor of choice for machine vision applications. However, these cameras, originally developed for acquisition of static images rather than for sensing of dynamic uncontrolled visual environments, suffer from high power consumption, data rate, latency and low dynamic range. An event-based image sensor addresses these drawbacks by mimicking a biological retina. Instead of measuring the intensity of every pixel in a fixed time interval, it reports events of significant pixel intensity changes. Every such event is represented by its position, sign of change, and timestamp, accurate to the microsecond. Asynchronous event sequences require special handling, since traditional algorithms work only with synchronous, spatially gridded data. To address this problem we introduce a new module for event sequence embedding, for use in different applications. The module builds a representation of an event sequence by firstly aggregating information locally across time, using a novel fully-connected layer for an irregularly sampled continuous domain, and then across discrete spatial domain. Based on this module, we design a deep learning-based stereo method for event-based cameras. The proposed method is the first learning-based stereo method for an event-based camera and the only method that produces dense results. We show large performance increases on the Multi Vehicle Stereo Event Camera Dataset (MVSEC), which became the standard set for the benchmarking of event-based stereo methods. 20
Thermophysical and mechanical properties of novel high-entropy metal nitride-carbides Wen, TQ; Ye, BL; Nguyen, MC; Ma, MD; Chu, YH In this work, a novel (Hf0.2Zr0.2Ta0.2Nb0.2Ti0.2)(N0.5C0.5) high-entropy nitride-carbide (HENC-1) with multi-cationic and -anionic sublattice structure was reported and their thermophysical and mechanical properties were studied for the first time. The results of the first-principles calculations showed that HENC-1 had the highest mixing entropy of 1.151R, which resulted in the lowest Gibbs free energy above 600 K among HENC-1, (Hf0.2Zr0.2Ta0.2Nb0.2Ti0.2)N high-entropy nitrides (HEN-1), and (Hf0.2Zr0.2Ta0.2Nb0.2Ti0.2)C high-entropy carbides (HEC-1). In this case, HENC-1 samples were successfully fabricated by hot-pressing sintering technique at the lowest temperature (1773 K) among HENC-1, HEN-1 and HEC-1 samples. The as-fabricated HENC-1 samples showed a single rock-salt structure of metal nitride-carbides and high compositional uniformity. Meanwhile, they exhibited high microhardness of 19.5 +/- 0.3 GPa at an applied load of 9.8 N and nanohardness of 33.4 +/- 0.5 GPa and simultaneously possessed a high bulk modulus of 258 GPa, Young's modulus of 429 GPa, shear modulus of 176 GPa, and elastic modulus of 572 +/- 7 GPa. Their hardness and modulus are the highest among HENC-1, HEN-1 and HEC-1 samples, which could be attributed to the presence of mass disorder and lattice distortion from the multi-anionic sublattice structure and small grain in HENC-1 samples. In addition, the thermal conductivity of HENC-1 samples was significantly lower than the average value from the rule of mixture between HEC-1 and HEN-1 samples in the range of 300-800 K, which was due to the presence of lattice distortion from the multi-anionic sublattice structure in HENC-1 samples. 20
Learning Generative ConvNets via Multi-grid Modeling and Sampling Gao, RQ; Lu, Y; Zhou, JP; Zhu, SC; Wu, YN This paper proposes a multi-grid method for learning energy-based generative ConvNet models of images. For each grid, we learn an energy-based probabilistic model where the energy function is defined by a bottom-up convolutional neural network (ConvNet or CNN). Learning such a model requires generating synthesized examples from the model. Within each iteration of our learning algorithm, for each observed training image, we generate synthesized images at multiple grids by initializing the finite-step MCMC sampling from a minimal 1x1 version of the training image. The synthesized image at each subsequent grid is obtained by a finite-step MCMC initialized from the synthesized image generated at the previous coarser grid. After obtaining the synthesized examples, the parameters of the models at multiple grids are updated separately and simultaneously based on the differences between synthesized and observed examples. We show that this multi-grid method can learn realistic energy-based generative ConvNet models, and it outperforms the original contrastive divergence (CD) and persistent CD. 20
Newton: Gravitating Towards the Physical Limits of Crossbar Acceleration Nag, A; Balasubramonian, R; Srikumar, V; Walker, R; Shafiee, A; Strachan, JP; Muralimanohar, N Many recent works take advantage of highly parallel analog in-situ computation in memristor crossbars to accelerate the many vector-matrix multiplication operations in deep neural networks (DNNs). However, these in-situ accelerators have two significant shortcomings: The ADCs account for a large fraction of chip power and area, and these accelerators adopt a homogeneous design in which every resource is provisioned for the worst case. By addressing both problems, the new architecture, called Newton, moves closer to achieving optimal energy per neuron for crossbar accelerators. We introduce new techniques that apply at different levels of the tile hierarchy, some leveraging heterogeneity and others relying on divide-and-conquer numeric algorithms to reduce computations and ADC pressure. Finally, we place constraints on how a workload is mapped to tiles, thus helping reduce resource-provisioning in tiles. For many convolutional-neural-network (CNN) dataflows and structures, Newton achieves a 77-percent decrease in power, 51-percent improvement in energy-efficiency, and 2.1x higher throughput/area, relative to the state-of-the-art In-Situ Analog Arithmetic in Crossbars (ISAAC) accelerator. 20
Learning Gaze Transitions from Depth to Improve Video Saliency Estimation Leifman, G; Rudoy, D; Swedish, T; Bayro-Corrochano, E; Raskar, R In this paper we introduce a novel Depth-Aware Video Saliency approach to predict human focus of attention when viewing videos that contain a depth map (RGBD) on a 2D screen. Saliency estimation in this scenario is highly important since in the near future 3D video content will be easily acquired yet hard to display. Despite considerable progress in 3D display technologies, most are still expensive and require special glasses for viewing, so RGBD content is primarily viewed on 2D screens, removing the depth channel from the final viewing experience. We train a generative convolutional neural network that predicts the 2D viewing saliency map for a given frame using the RGBD pixel values and previous fixation estimates in the video. To evaluate the performance of our approach, we present a new comprehensive database of 2D viewing eye-fixation ground-truth for RGBD videos. Our experiments indicate that it is beneficial to integrate depth into video saliency estimates for content that is viewed on a 2D display. We demonstrate that our approach outperforms state-of-the-art methods for video saliency, achieving 15% relative improvement. 19
Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources Niu, L; Xu, XX; Chen, L; Duan, LX; Xu, D In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from Flickr video search engine) and Web images (e.g., from Bing and Google image search engines). We address this problem by formulating it as a new multi-domain adaptation problem, in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains and all types of visual features from the target domain. Considering the target domain is more relevant to some source domains, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to effectively make use of the heterogeneous sources. In MDA-HS, we simultaneously seek for the optimal weights of multiple source domains, infer the labels of target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+) to effectively incorporate the valuable textual information into our MDA-HS method, based on the recent learning using privileged information paradigm. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain. 19
Kestrel: Video Analytics for Augmented Multi-Camera Vehicle Tracking Qiu, H; Liu, XC; Rallapalli, S; Bency, AJ; Chan, K; Urgaonkar, R; Manjunath, BS; Govindan, R In the future, the video-enabled camera will be the most pervasive type of sensor in the Internet of Things. Such cameras will enable continuous surveillance through heterogeneous camera networks consisting of fixed camera systems as well as cameras on mobile devices. The challenge in these networks is to enable efficient video analytics: the ability to process videos cheaply and quickly to enable searching for specific events or sequences of events. In this paper, we discuss the design and implementation of Kestrel, a video analytics system that tracks the path of vehicles across a heterogeneous camera network. In Kestrel, fixed camera feeds are processed on the cloud, and mobile devices are invoked only to resolve ambiguities in vehicle tracks. Kestrel's mobile device pipeline detects objects using a deep neural network, extracts attributes using cheap visual features, and resolves path ambiguities by careful association of vehicle visual descriptors, while using several optimizations to conserve energy and reduce latency. Our evaluations show that Kestrel can achieve precision and recall comparable to a fixed camera network of the same size and topology, while reducing energy usage on mobile devices by more than an order of magnitude. 18
Recommending Product Sizes to Customers Sembium, V; Rastogi, R; Saroop, A; Merugu, S We propose a novel latent factor model for recommending product size fits {Small, Fit, Large} to customers. Latent factors for customers and products in our model correspond to their physical true size, and are learnt from past product purchase and returns data. The outcome for a customer, product pair is predicted based on the difference between customer and product true sizes, and efficient algorithms are proposed for computing customer and product true size values that minimize two loss function variants. In experiments with Amazon shoe datasets, we show that our latent factor models incorporating personas, and leveraging return codes show a 17-21% AUC improvement compared to baselines. In an online A/B test, our algorithms show an improvement of 0.49% in percentage of Fit transactions over control. 17
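The core modelling idea in this entry is small enough to sketch: once latent true sizes have been learnt for a customer and a product, the predicted outcome depends only on their difference. The toy rule below uses made-up thresholds purely to illustrate that decision step; learning the latent sizes from purchase and return data is the substance of the paper.

def predict_fit(customer_true_size, product_true_size, b_small=-1.0, b_large=1.0):
    # If the customer's latent size greatly exceeds the product's, the product
    # is predicted to run "Small" for that customer, and vice versa.
    gap = customer_true_size - product_true_size
    if gap > b_large:
        return "Small"
    if gap < b_small:
        return "Large"
    return "Fit"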
Efficient Learning on Point Clouds with Basis Point Sets Prokudin, S; Lassner, C; Romero, J With an increased availability of 3D scanning technology, point clouds are moving into the focus of computer vision as a rich representation of everyday scenes. However, they are hard to handle for machine learning algorithms due to their unordered structure. One common approach is to apply occupancy grid mapping, which dramatically increases the amount of data stored and at the same time loses details through discretization. Recently, deep learning models were proposed to handle point clouds directly and achieve input permutation invariance. However, these architectures often use an increased number of parameters and are computationally inefficient. In this work we propose basis point sets (BPS) as a highly efficient and fully general way to process point clouds with machine learning algorithms. The basis point set representation is a residual representation that can be computed efficiently and can be used with standard neural network architectures and other machine learning algorithms. Using the proposed representation as the input to a simple fully connected network allows us to match the performance of PointNet on a shape classification task, while using three orders of magnitude less floating point operations. In a second experiment, we show how the proposed representation can be used for registering high resolution meshes to noisy 3D scans. Here, we present the first method for single-pass high-resolution mesh registration, avoiding time-consuming per-scan optimization and allowing real-time execution. 17
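The basis-point-set encoding itself can be sketched in a few lines: fix a set of basis points once, then describe any point cloud by the distance from each basis point to its nearest cloud point, which yields a fixed-length vector regardless of cloud size or ordering. The snippet below shows only this minimal distance-based variant, with arbitrary sizes; the paper also discusses richer per-basis-point features.

import numpy as np
from scipy.spatial import cKDTree

def bps_encode(cloud, basis):
    # cloud: (N, 3) array of points; basis: (K, 3) fixed basis points.
    # Returns a (K,) vector of nearest-neighbour distances.
    dists, _ = cKDTree(cloud).query(basis, k=1)
    return dists

rng = np.random.default_rng(0)
basis = rng.uniform(-1.0, 1.0, size=(512, 3))      # shared by every cloud
features = bps_encode(rng.normal(size=(2048, 3)), basis)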
Attack Detection and Approximation in Nonlinear Networked Control Systems Using Neural Networks Niu, HF; Bhowmick, C; Jagannathan, S In networked control systems (NCS), a certain class of attacks on the communication network is known to raise traffic flows causing delays and packet losses to increase. This paper presents a novel neural network (NN)-based attack detection and estimation scheme that captures the abnormal traffic flow due to a class of attacks on the communication links within the feedback loop of an NCS. By modeling the unknown network flow as a nonlinear function at the bottleneck node and using a NN observer, the network attack detection residual is defined and utilized to determine the onset of an attack in the communication network when the residual exceeds a predefined threshold. Upon detection, another NN is used to estimate the flow injected by the attack. For the physical system, we develop an attack detection scheme by using an adaptive dynamic programming-based optimal event-triggered NN controller in the presence of network delays and packet losses. Attacks on the network as well as on the sensors of the physical system can be detected and estimated with the proposed scheme. The simulation results confirm theoretical conclusions. 17
Prevalence of Potentially Distracting Noncare Activities and Their Effects on Vigilance, Workload, and Nonroutine Events during Anesthesia Care Slagle, JM; Porterfield, ES; Lorinc, AN; Afshartous, D; Shotwell, MS; Weinger, MB Background: When workload is low, anesthesia providers may perform non-patient care activities of a clinical, educational, or personal nature. Data are limited on the incidence or impact of distractions on actual care. We examined the prevalence of self-initiated nonclinical distractions and their effects on anesthesia workload, vigilance, and the occurrence of nonroutine events. Methods: In 319 qualifying cases in an academic medical center using a Web-based electronic medical chart, a trained observer recorded video and performed behavioral task analysis. Participant workload and response to a vigilance (alarm) light were randomly measured. Postoperatively, participants were interviewed to elicit possible nonroutine events. Two anesthesiologists reviewed each event to evaluate their association with distractions. Results: At least one self-initiated distraction was observed in 171 cases (54%), largely during maintenance. Distractions accounted for 2% of case time and lasted 2.3s (median). The most common distraction was personal internet use. Distractions were more common in longer cases but were not affected by case type or American Society of Anesthesiologists physical status. Workload ratings were significantly lower during distraction-containing case periods and vigilance latencies were significantly longer in cases without any distractions. Three distractions were temporally associated with, but did not cause, events. Conclusions: Both nurse anesthetists and residents performed potentially distracting tasks of a personal and/or educational nature in a majority of cases. Self-initiated distractions were rarely associated with events. This study suggests that anesthesia professionals using sound judgment can self-manage nonclinical activities. Future efforts should focus on eliminating more cognitively absorbing and less escapable distractions, as well as training in distraction management. 17
Constrained Assortment Optimization Under the Markov Chain-based Choice Model Desir, A; Goyal, V; Segev, D; Ye, C Assortment optimization is an important problem that arises in many practical applications such as retailing and online advertising. The fundamental goal is to select a subset of items to offer from a universe of substitutable items to maximize expected revenue when customers exhibit a random substitution behavior captured by a choice model. We study assortment optimization under the Markov chain choice model in the presence of capacity constraints that arise naturally in many applications. The Markov chain choice model considers item substitutions as transitions in a Markov chain and provides a good approximation for a large class of random utility models, thereby addressing the challenging problem of model selection in choice modeling. In this paper, we present constant factor approximation algorithms for the cardinality- and capacity-constrained assortment-optimization problem under the Markov chain model. We show that this problem is APX-hard even when all item prices are uniform, meaning that, unless P= NP, it is not possible to obtain an approximation better than a particular constant. Our algorithmic approach is based on a new externality adjustment paradigm that exactly captures the externality of adding an item to a given assortment on the remaining set of items, thereby allowing us to linearize a nonlinear, nonsubmodular, and nonmonotone revenue function and to design an iterative algorithm that iteratively builds up a provably good assortment. 16
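For readers who want to see how the Markov chain choice model turns an assortment into choice probabilities, the sketch below treats offered items as absorbing states and solves for absorption probabilities. The optimization itself and the no-purchase option are left out, and the variable names are ours rather than the paper's.

import numpy as np

def choice_probabilities(lam, rho, offered):
    # lam: arrival probabilities over the n items; rho: n x n row-stochastic
    # substitution (transition) matrix; offered: boolean mask of the assortment.
    # Offered items absorb customers, so an item's choice probability is its
    # absorption probability.
    lam = np.asarray(lam, float)
    offered = np.asarray(offered, bool)
    S, T = np.flatnonzero(offered), np.flatnonzero(~offered)
    Q, R = rho[np.ix_(T, T)], rho[np.ix_(T, S)]
    B = np.linalg.solve(np.eye(len(T)) - Q, R)     # transient-to-absorbing probabilities
    probs = np.zeros(len(lam))
    probs[S] = lam[S] + lam[T] @ B
    return probs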
The Virtual Caliper: Rapid Creation of Metrically Accurate Avatars from 3D Measurements Pujades, S; Mohler, B; Thaler, A; Tesch, J; Mahmood, N; Hesse, N; Bulthoff, HH; Black, MJ Creating metrically accurate avatars is important for many applications such as virtual clothing try-on, ergonomics, medicine, immersive social media, telepresence, and gaming. Creating avatars that precisely represent a particular individual is challenging, however, due to the need for expensive 3D scanners, privacy issues with photographs or videos, and difficulty in making accurate tailoring measurements. We overcome these challenges by creating The Virtual Caliper, which uses VR game controllers to make simple measurements. First, we establish what body measurements users can reliably make on their own body. We find several distance measurements to be good candidates and then verify that these are linearly related to 3D body shape as represented by the SMPL body model. The Virtual Caliper enables novice users to accurately measure themselves and create an avatar with their own body shape. We evaluate the metric accuracy relative to ground truth 3D body scan data, compare the method quantitatively to other avatar creation tools, and perform extensive perceptual studies. We also provide a software application to the community that enables novices to rapidly create avatars in fewer than five minutes. Not only is our approach more rapid than existing methods, it exports a metrically accurate 3D avatar model that is rigged and skinned. 16
Factors Influencing Perceived Fairness in Algorithmic Decision-Making: Algorithm Outcomes, Development Procedures, and Individual Differences Wang, RT; Harper, FM; Zhu, HY Algorithmic decision-making systems are increasingly used throughout the public and private sectors to make important decisions or assist humans in making these decisions with real social consequences. While there has been substantial research in recent years to build fair decision-making algorithms, there has been less research seeking to understand the factors that affect people's perceptions of fairness in these systems, which we argue is also important for their broader acceptance. In this research, we conduct an online experiment to better understand perceptions of fairness, focusing on three sets of factors: algorithm outcomes, algorithm development and deployment procedures, and individual differences. We find that people rate the algorithm as more fair when the algorithm predicts in their favor, even surpassing the negative effects of describing algorithms that are very biased against particular demographic groups. We find that this effect is moderated by several variables, including participants' education level, gender, and several aspects of the development procedure. Our findings suggest that systems that evaluate algorithmic fairness through users' feedback must consider the possibility of outcome favorability bias. 15
Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis Liao, YY; Schwarz, K; Mescheder, L; Geiger, A In recent years, Generative Adversarial Networks have achieved impressive results in photorealistic image synthesis. This progress nurtures hopes that one day the classical rendering pipeline can be replaced by efficient models that are learned directly from images. However, current image synthesis models operate in the 2D domain where disentangling 3D properties such as camera viewpoint or object pose is challenging. Furthermore, they lack an interpretable and controllable representation. Our key hypothesis is that the image generation process should be modeled in 3D space as the physical world surrounding us is intrinsically three-dimensional. We define the new task of 3D controllable image synthesis and propose an approach for solving it by reasoning both in 3D space and in the 2D image domain. We demonstrate that our model is able to disentangle latent 3D factors of simple multi-object scenes in an unsupervised fashion from raw images. Compared to pure 2D baselines, it allows for synthesizing scenes that are consistent wrt. changes in viewpoint or object pose. We further evaluate various 3D representations in terms of their usefulness for this challenging task. 15
Exploring Sparseness and Self-Similarity for Action Recognition Sun, C; Junejo, IN; Tappen, M; Foroosh, H We propose that the dynamics of an action in video data forms a sparse self-similar manifold in the space-time volume, which can be fully characterized by a linear rank decomposition. Inspired by the recurrence plot theory, we introduce the concept of Joint Self-Similarity Volume (Joint-SSV) to model this sparse action manifold, and hence propose a new optimized rank-1 tensor approximation of the Joint-SSV to obtain compact low-dimensional descriptors that very accurately characterize an action in a video sequence. We show that these descriptor vectors make it possible to recognize actions without explicitly aligning the videos in time in order to compensate for speed of execution or differences in video frame rates. Moreover, we show that the proposed method is generic, in the sense that it can be applied using different low-level features, such as silhouettes, tracked points, histogram of oriented gradients, and so forth. Therefore, our method does not necessarily require explicit tracking of features in the space-time volume. Our experimental results on five public data sets demonstrate that our method produces promising results and outperforms many baseline methods. 15
Why fairness cannot be automated: Bridging the gap between EU non-discrimination law and AI Wachter, S; Mittelstadt, B; Russell, C In recent years a substantial literature has emerged concerning bias, discrimination, and fairness in artificial intelligence (AI) and machine learning. Connecting this work to existing legal non-discrimination frameworks is essential to create tools and methods that are practically useful across divergent legal regimes. While much work has been undertaken from an American legal perspective, comparatively little has mapped the effects and requirements of EU law. This Article addresses this critical gap between legal, technical, and organisational notions of algorithmic fairness. Through analysis of EU non-discrimination law and jurisprudence of the European Court of Justice (ECJ) and national courts, we identify a critical incompatibility between European notions of discrimination and existing work on algorithmic and automated fairness. A clear gap exists between statistical measures of fairness as embedded in myriad fairness toolkits and governance mechanisms and the context-sensitive, often intuitive and ambiguous discrimination metrics and evidential requirements used by the ECJ; we refer to this approach as contextual equality. This Article makes three contributions. First, we review the evidential requirements to bring a claim under EU non-discrimination law. Due to the disparate nature of algorithmic and human discrimination, the EU's current requirements are too contextual, reliant on intuition, and open to judicial interpretation to be automated. Many of the concepts fundamental to bringing a claim, such as the composition of the disadvantaged and advantaged group, the severity and type of harm suffered, and requirements for the relevance and admissibility of evidence, require normative or political choices to be made by the judiciary on a caseby-case basis. We show that automating fairness or non-discrimination in Europe may be impossible because the law, by design, does not provide a static or homogenous framework suited to testing for discrimination in AI systems. Second, we show how the legal protection offered by non-discrimination law is challenged when AI, not humans, discriminate. Humans discriminate due to negative attitudes (e.g. stereotypes, prejudice) and unintentional biases (e.g. organisational practices or internalised stereotypes) which can act as a signal to victims that discrimination has occurred. Equivalent signalling mechanisms and agency do not exist in algorithmic systems. Compared to traditional forms of discrimination, automated discrimination is more abstract and unintuitive, subtle, intangible, and difficult to detect. The increasing use of algorithms disrupts traditional legal remedies and procedures for detection, investigation, prevention, and correction of discrimination which have predominantly relied upon intuition. Consistent assessment procedures that define a common standard for statistical evidence to detect and assess prima facie automated discrimination are urgently needed to support judges, regulators, system controllers and developers, and claimants. Finally, we examine how existing work on fairness in machine learning lines up with procedures for assessing cases under EU non-discrimination law. A 'gold standard' for assessment of prima facie discrimination has been advanced by the European Court of Justice but not yet translated into standard assessment procedures for automated discrimination. 
We propose 'conditional demographic disparity' (CDD) as a standard baseline statistical measurement that aligns with the Court's 'gold standard'. Establishing a standard set of statistical evidence for automated discrimination cases can help ensure consistent procedures for assessment, but not judicial interpretation, of cases involving AI and automated systems. Through this proposal for procedural regularity in the identification and assessment of automated discrimination, we clarify how to build considerations of fairness into automated systems as far as possible while still respecting and enabling the contextual approach to judicial interpretation practiced under EU non-discrimination law. 15
Experimental Demonstration and Calibration of a 16-Element Active Incoherent Millimeter-Wave Imaging Array Vakalis, S; Gong, L; He, YX; Papapolymerou, J; Nanzer, JA In this article, an active incoherent millimeter-wave imaging array is presented, along with its calibration procedure. Active incoherent millimeter-wave imaging uses the transmission of incoherent signals from multiple transmitters to mimic the properties of thermal radiation, enabling interferometric image reconstruction that can be realized in a snap-shot mode, without beamsteering. Due to the use of transmitters, the sensitivity requirement on the receivers is significantly relaxed compared with passive millimeter-wave imaging systems that detect low-power thermal radiation, making it possible to use standard commercial hardware, therefore decreasing the cost considerably. No exact knowledge of the transmit illumination is needed; thus, the coordination of the transmitters is minimal, further simplifying the system implementation. In this work, a 16-element Ka-band millimeter-wave imager is built and presented using commercial components and in-house fabricated antennas, along with a calibration procedure to account for amplitude and phase variations in the hardware. Experimental 2-D snapshot image reconstructions are presented. 14
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Brattoli, B; Tighe, J; Zhdanov, F; Perona, P; Chalupka, K Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once, and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: Previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at github.com/bbrattoli/ZeroShotVideoClassification. 14
Robust Speech Recognition Via Anchor Word Representations King, B; Chen, IF; Vaizman, Y; Liu, YZ; Maas, R; Parthasarathi, SHK; Hoffmeister, B A challenge for speech recognition for voice-controlled household devices, like the Amazon Echo or Google Home, is robustness against interfering background speech. Formulated as a far-field speech recognition problem, another person or media device in proximity can produce background speech that can interfere with the device-directed speech. We expand on our previous work on device-directed speech detection in the far-field speech setting and introduce two approaches for robust acoustic modeling. Both methods are based on the idea of using an anchor word taken from the device-directed speech. Our first method employs a simple yet effective normalization of the acoustic features by subtracting the mean derived over the anchor word. The second method utilizes an encoder network projecting the anchor word onto a fixed-size embedding, which serves as an additional input to the acoustic model. The encoder network and acoustic model are jointly trained. Results on an in-house dataset reveal that, in the presence of background speech, the proposed approaches can achieve up to 35% relative word error rate reduction. 14
Knowledge Verification for Long-Tail Verticals Li, FR; Dong, XLN; Langen, A; Li, Y Collecting structured knowledge for real-world entities has become a critical task for many applications. A big gap between the knowledge in existing knowledge repositories and the knowledge in the real world is the knowledge on tail verticals (i.e., less popular domains). Such knowledge, though not necessarily globally popular, can be personal hobbies to many people and thus collectively impactful. This paper studies the problem of knowledge verification for tail verticals; that is, deciding the correctness of a given triple. Through comprehensive experimental study we answer the following questions. 1) Can we find evidence for tail knowledge from an extensive set of sources, including knowledge bases, the web, and query logs? 2) Can we judge correctness of the triples based on the collected evidence? 3) How can we further improve knowledge verification on tail verticals? Our empirical study suggests a new knowledge-verification framework, which we call FACTY, that applies various kinds of evidence collection techniques followed by knowledge fusion. FACTY can verify 50% of the (correct) tail knowledge with a precision of 84%, and it significantly outperforms state-of-the-art methods. Detailed error analysis on the obtained results suggests future research directions. 14
Identifying similar days for air traffic management Gorripaty, S; Liu, Y; Hansen, M; Pozdnukhov, A Air traffic managers face challenging decisions due to uncertainty in weather and air traffic. One way to support their decisions is to identify similar historical days, the traffic management actions taken on those days, and the resulting outcomes. We develop similarity measures based on quarter-hourly capacity and demand data at four case study airports: EWR, SFO, ORD and JFK. We find that dimensionality reduction is feasible for capacity data, and base similarity on principal components. Dimensionality reduction cannot be efficiently performed on demand data; consequently, similarity is based on original data. We find that both capacity and demand data lack natural clusters and propose a continuous similarity measure. Finally, we estimate overall capacity and demand similarities, which are visualized using Metric Multidimensional Scaling plots. We observe that most days with air traffic management activity are similar to certain other days, validating the potential of this approach for decision support. 14
A Sparse Topic Model for Extracting Aspect-Specific Summaries from Online Reviews Rakesh, V; Ding, WC; Ahuja, A; Rao, N; Sun, YF; Reddy, CK Online reviews have become an inevitable part of a consumer's decision making process, where the likelihood of purchase not only depends on the product's overall rating, but also on the description of its aspects. Therefore, e-commerce websites such as Amazon and Walmart constantly encourage users to write good quality reviews and categorically summarize different facets of the products. However, despite such attempts, it takes a significant effort to skim through thousands of reviews and look for answers that address the query of consumers. For example, a gamer might be interested in buying a monitor with fast refresh rates and support for Gsync and Freesync technologies, while a photographer might be interested in aspects such as color depth and accuracy. To address these challenges, in this paper, we propose a generative aspect summarization model called APSUM that is capable of providing fine-grained summaries of online reviews. To overcome the inherent problem of aspect sparsity, we impose dual constraints: (a) a spike-and-slab prior over the document-topic distribution and (b) a linguistic supervision over the word-topic distribution. Using a rigorous set of experiments, we show that the proposed model is capable of outperforming the state-of-the-art aspect summarization model over a variety of datasets and deliver intuitive fine-grained summaries that could simplify the purchase decisions of consumers. 14
Protecting Privacy of Users in Brain-Computer Interface Applications Agarwal, A; Dowsley, R; McKinney, ND; Wu, DR; Lin, CT; De Cock, M; Nascimento, ACA Machine learning (ML) is revolutionizing research and industry. Many ML applications rely on the use of large amounts of personal data for training and inference. Among the most intimate exploited data sources is electroencephalogram (EEG) data, a kind of data that is so rich with information that application developers can easily gain knowledge beyond the professed scope from unprotected EEG signals, including passwords, ATM PINs, and other intimate data. The challenge we address is how to engage in meaningful ML with EEG data while protecting the privacy of users. Hence, we propose cryptographic protocols based on secure multiparty computation (SMC) to perform linear regression over EEG signals from many users in a fully privacy-preserving(PP) fashion, i.e., such that each individual's EEG signals are not revealed to anyone else. To illustrate the potential of our secure framework, we show how it allows estimating the drowsiness of drivers from their EEG signals as would be possible in the unencrypted case, and at a very reasonable computational cost. Our solution is the first application of commodity-based SMC to EEG data, as well as the largest documented experiment of secret sharing-based SMC in general, namely, with 15 players involved in all the computations. 14
Fashion Outfit Complementary Item Retrieval Lin, YL; Tran, S; Davis, LS Complementary fashion item recommendation is critical for fashion outfit completion. Existing methods mainly focus on outfit compatibility prediction but not in a retrieval setting. We propose a new framework for outfit complementary item retrieval. Specifically, a category-based subspace attention network is presented, which is a scalable approach for learning the subspace attentions. In addition, we introduce an outfit ranking loss that better models the item relationships of an entire outfit. We evaluate our method on the outfit compatibility, FITB and new retrieval tasks. Experimental results demonstrate that our approach outperforms state-of-the-art methods in both compatibility prediction and complementary item retrieval. 13
Semantic Product Search Nigam, P; Song, YW; Mohan, V; Lakshman, V; Ding, W; Shingavi, A; Teo, CH; Gu, H; Yin, B We study the problem of semantic matching in product search, that is, given a customer query, retrieve all semantically related products from the catalog. Pure lexical matching via an inverted index falls short in this respect due to several factors: a) lack of understanding of hypernyms, synonyms, and antonyms, b) fragility to morphological variants (e.g. woman vs. women), and c) sensitivity to spelling errors. To address these issues, we train a deep learning model for semantic matching using customer behavior data. Much of the recent work on large-scale semantic search using deep learning focuses on ranking for web search. In contrast, semantic matching for product search presents several novel challenges, which we elucidate in this paper. We address these challenges by a) developing a new loss function that has an inbuilt threshold to differentiate between random negative examples, impressed but not purchased examples, and positive examples (purchased items), b) using average pooling in conjunction with n-grams to capture short-range linguistic patterns, c) using hashing to handle out of vocabulary tokens, and d) using a model parallel training architecture to scale across 8 GPUs. We present compelling offline results that demonstrate at least 4.7% improvement in Recall@100 and 14.5% improvement in mean average precision (MAP) over baseline state-of-the-art semantic search methods using the same tokenization method. Moreover, we present results and discuss learnings from online A/B tests which demonstrate the efficacy of our method. 13
Estimating Parameters of Nonlinear Systems Using the Elitist Particle Filter Based on Evolutionary Strategies Huemmer, C; Hofmann, C; Maas, R; Kellermann, W In this paper, we present the elitist particle filter based on evolutionary strategies (EPFES) as an efficient approach to estimate the statistics of a latent state vector capturing the relevant information of a nonlinear system. Similar to classical particle filtering, the EPFES consists of a set of particles and respective weights which represent different realizations of the latent state vector and their likelihood of being the solution of the optimization problem. As main innovation, the EPFES includes an evolutionary elitist-particle selection scheme which combines long-term information with instantaneous sampling from an approximated continuous posterior distribution. In this paper, we propose two advancements of the previously published elitist-particle selection process. Further, the EPFES is shown to be a generalization of the widely-used Gaussian particle filter and thus evaluated with respect to the latter: First, we consider the univariate nonstationary growth model with time-variant latent state variable to evaluate the tracking capabilities of the EPFES for instantaneously calculated particle weights. This is followed by addressing the problem of single-channel nonlinear acoustic echo cancellation as a challenging benchmark task for identifying an unknown system of large search space: the nonlinear acoustic echo path is modeled by a cascade of a parameterized preprocessor (to model the loudspeaker signal distortions) and a linear FIR filter (to model the sound wave propagation and the microphone). By using long-term information, we highlight the efficacy of the well-generalizing EPFES in estimating the preprocessor parameters for a simulated scenario and a real smartphone recording. Finally, we illustrate similarities between the EPFES and evolutionary algorithms to outline future improvements by fusing the achievements of both fields of research. 13
Online Multiple Targets Detection and Tracking from Mobile Robot in Cluttered Indoor Environments with Depth Camera Zhou, Y; Yang, YF; Yi, M; Bai, X; Liu, WY; Latecki, LJ Indoor environment is a common scene in our everyday life, and detecting and tracking multiple targets in this environment is a key component for many applications. However, this task still remains challenging due to limited space, intrinsic target appearance variation, e.g. full or partial occlusion, large pose deformation, and scale change. In the proposed approach, we give a novel framework for detection and tracking in indoor environments, and extend it to robot navigation. One of the key components of our approach is a virtual top view created from an RGB-D camera, which is named ground plane projection (GPP). The key advantage of using GPP is the fact that the intrinsic target appearance variation and extrinsic noise is far less likely to appear in GPP than in a regular side-view image. Moreover, it is a very simple task to determine free space in GPP without any appearance learning even from a moving camera. Hence GPP is very different from the top-view image obtained from a ceiling mounted camera. We perform both object detection and tracking in GPP. Two kinds of GPP images are utilized: gray GPP, which represents the maximal height of 3D points projecting to each pixel, and binary GPP, which is obtained by thresholding the gray GPP. For detection, a simple connected component labeling is used to detect footprints of targets in binary GPP. For tracking, a novel Pixel Level Association (PLA) strategy is proposed to link the same target in consecutive frames in gray GPP. It utilizes optical flow in gray GPP, which to our best knowledge has never been done before. Then we back project the detected and tracked objects in GPP to original, side-view (RGB) images. Hence we are able to detect and track objects in the side-view (RGB) images. Our system is able to robustly detect and track multiple moving targets in real time. The detection process does not rely on any target model, which means we do not need any training process. Moreover, tracking does not require any manual initialization, since all entering objects are robustly detected. We also extend the novel framework to robot navigation by tracking. As our experimental results demonstrate, our approach can achieve near perfect detection and tracking results. The performance gain in comparison to state-of-the-art trackers is most significant in the presence of occlusion and background clutter. 13
Toward Achieving Robust Low-Level and High-Level Scene Parsing Shuai, B; Ding, HH; Liu, T; Wang, G; Jiang, XD In this paper, we address the challenging task of scene segmentation. We first discuss and compare two widely used approaches to retain detailed spatial information from a pre-trained convolutional neural network (CNN): dilation and skip. Then, we demonstrate that the parsing performance of the skip network can be noticeably improved by modifying the parameterization of skip layers. Furthermore, we introduce a dense skip architecture to retain a rich set of low-level information from the pre-trained CNN, which is essential to improve the low-level parsing performance. Meanwhile, we propose a CCN and place it on top of pre-trained CNNs, which is used to aggregate contexts for high-level feature maps so that robust high-level parsing can be achieved. We name our segmentation network enhanced fully convolutional network (EFCN) based on its significantly enhanced structure over FCN. Extensive experimental studies justify each contribution separately. Without bells and whistles, EFCN achieves state-of-the-art results on segmentation datasets of ADE20K, Pascal Context, SUN-RGBD, and Pascal VOC 2012. 13
Bayesian Models for Product Size Recommendations Sembium, V; Rastogi, R; Tekumalla, L; Saroop, A Lack of calibrated product sizing in popular categories such as apparel and shoes leads to customers purchasing incorrect sizes, which in turn results in high return rates due to fit issues. We address the problem of product size recommendations based on customer purchase and return data. We propose a novel approach based on Bayesian logit and probit regression models with ordinal categories {Small, Fit, Large} to model size fits as a function of the difference between latent sizes of customers and products. We propose posterior computation based on mean-field variational inference, leveraging the Polya-Gamma augmentation for the logit prior, that results in simple updates, enabling our technique to efficiently handle large datasets. Our Bayesian approach effectively deals with issues arising from noise and sparsity in the data, providing robust recommendations. Offline experiments with real-life shoe datasets show that our model outperforms the state-of-the-art in 5 of 6 datasets, and leads to an improvement of 17-26% in AUC over baselines when predicting size fit outcomes. 12
Image Based Virtual Try-on Network from Unpaired Data Neuberger, A; Borenstein, E; Hilleli, B; Oks, E; Alpert, S This paper presents a new image-based virtual try-on approach (Outfit-VITON) that helps visualize how a composition of clothing items selected from various reference images form a cohesive outfit on a person in a query image. Our algorithm has two distinctive properties. First, it is inexpensive, as it simply requires a large set of single (non-corresponding) images (both real and catalog) of people wearing various garments without explicit 3D information. The training phase requires only single images, eliminating the need for manually creating image pairs, where one image shows a person wearing a particular garment and the other shows the same catalog garment alone. Secondly, it can synthesize images of multiple garments composed into a single, coherent outfit; and it enables control of the type of garments rendered in the final outfit. Once trained, our approach can then synthesize a cohesive outfit from multiple images of clothed human models, while fitting the outfit to the body shape and pose of the query person. An online optimization step takes care of fine details such as intricate textures and logos. Quantitative and qualitative evaluations on an image dataset containing large shape and style variations demonstrate superior accuracy compared to existing state-of-the-art methods, especially when dealing with highly detailed garments. 12
Image Search with Text Feedback by Visiolinguistic Attention Learning Chen, YB; Gong, SG; Bazzani, L Image search with text feedback has promising impacts in various real-world applications, such as e-commerce and internet search. Given a reference image and text feedback from a user, the goal is to retrieve images that not only resemble the input image, but also change certain aspects in accordance with the given text. This is a challenging task as it requires the synergistic understanding of both image and text. In this work, we tackle this task by a novel Visiolinguistic Attention Learning (VAL) framework. Specifically, we propose a composite transformer that can be seamlessly plugged in a CNN to selectively preserve and transform the visual features conditioned on language semantics. By inserting multiple composite transformers at varying depths, VAL is incentivized to encapsulate the multi-granular visiolinguistic information, thus yielding an expressive representation for effective image search. We conduct comprehensive evaluation on three datasets: Fashion200k, Shoes and FashionIQ. Extensive experiments show our model exceeds existing approaches on all datasets, demonstrating consistent superiority in coping with various text feedbacks, including attribute-like and natural language descriptions. 12
Topological Interference Management With User Admission Control via Riemannian Optimization Shi, YM; Mishra, B; Chen, W Topological interference management (TIM) provides a promising way to manage interference only based on the network connectivity information. Previous works on the TIM problem mainly focus on using the index coding approach and graph theory to establish conditions of network topologies to achieve the feasibility of topological interference management. In this paper, we propose a novel user admission control approach via sparse and low-rank optimization to maximize the number of admitted users for achieving the feasibility of topological interference management. However, the resulting sparse and low-rank optimization problem is non-convex and highly intractable, for which the conventional convex relaxation approaches are inapplicable, e.g., a simple ℓ1-norm relaxation approach yields the objective unbounded and non-convex. To assist efficient algorithms design for the formulated rank-constrained (i.e., degrees-of-freedom (DoFs) allocation) ℓ0-norm maximization (i.e., user capacity maximization) problem, we propose a novel non-convex but smoothed ℓ1-regularized minimization approach to induce sparsity pattern with bounded objective values. We further develop a Riemannian trust-region algorithm to solve the resulting rank-constrained smooth non-convex optimization problem via exploiting the quotient manifold of fixed-rank matrices. Simulation results demonstrate the effectiveness and optimality of the proposed Riemannian algorithm to maximize the number of admitted users for topological interference management. 12
An Overview of Coding Tools in AV1: the First Video Codec from the Alliance for Open Media Chen, Y; Mukherjee, D; Han, JN; Grange, A; Xu, YW; Parker, S; Chen, C; Su, H; Joshi, U; Chiang, CH; Wang, YQ; Wilkins, P; Bankoski, J; Trudeau, L; Egge, N; Valin, JM; Davies, T; Midtskogen, S; Norkin, A; de Rivaz, P; Liu, Z In 2018, the Alliance for Open Media (AOMedia) finalized its first video compression format AV1, which is jointly developed by the industry consortium of leading video technology companies. The main goal of AV1 is to provide an open source and royalty-free video coding format that substantially outperforms state-of-the-art codecs available on the market in compression efficiency while remaining practical decoding complexity as well as being optimized for hardware feasibility and scalability on modern devices. To give detailed insights into how the targeted performance and feasibility is realized, this paper provides a technical overview of key coding techniques in AV1. Besides, the coding performance gains are validated by video compression tests performed with the libaom AV1 encoder against the libvpx VP9 encoder. Preliminary comparison with two leading HEVC encoders, x265 and HM, and the reference software of VVC is also conducted on AOM's common test set and an open 4k set. 12
Do State Laws Protecting Older Workers from Discrimination Reduce Age Discrimination in Hiring? Evidence from a Field Experiment Neumark, D; Burn, I; Button, P; Chehras, N We conduct a resume field experiment in all US states to study how state laws protecting older workers from age discrimination affect age discrimination in hiring for retail sales jobs. We relate the difference in callback rates between old and young applicants to states' variation in age and disability discrimination laws. These laws could boost hiring of older applicants, although they could have the unintended consequence of deterring hiring if they increase termination costs. In our preferred estimates that are weighted to be representative of the workforce, we find evidence that there is less discrimination against older men and women in states where age discrimination law allows larger damages and more limited evidence that there is less discrimination against older women in states where disability discrimination law allows larger damages. Our clearest result is that the laws do not have the unintended consequence of lowering callbacks for older workers. 12
Data Quality - The Role of Empiricism Sadiq, S; Dasu, T; Dong, XL; Freire, J; Ilyas, IF; Link, S; Miller, RJ; Naumann, F; Zhou, XF; Srivastava, D We outline a call to action for promoting empiricism in data quality research. The action points result from an analysis of the landscape of data quality research. The landscape exhibits two dimensions of empiricism in data quality research relating to type of metrics and scope of method. Our study indicates the presence of a data continuum ranging from real to synthetic data, which has implications for how data quality methods are evaluated. The dimensions of empiricism and their inter-relationships provide a means of positioning data quality research, and help expose limitations, gaps and opportunities. 12
Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models Drugman, T; Pylkkonen, J; Kneser, R The goal of this paper is to simulate the benefits of jointly applying active learning (AL) and semi-supervised training (SST) in a new speech recognition application. Our data selection approach relies on confidence filtering, and its impact on both the acoustic and language models (AM and LM) is studied. While AL is known to be beneficial to AM training, we show that it also carries out substantial improvements to the LM when combined with SST. Sophisticated confidence models, on the other hand, did not prove to yield any data selection gain. Our results indicate that, while SST is crucial at the beginning of the labeling process, its gains degrade rapidly as AL is set in place. The final simulation reports that AL allows a transcription cost reduction of about 70% over random selection. Alternatively, for a fixed transcription budget, the proposed approach improves the word error rate by about 12.5% relative. 11
A High-Performance Algorithm for Identifying Frequent Items in Data Streams Anderson, D; Bevan, P; Lang, K; Liberty, E; Rhodes, L; Thaler, J Estimating frequencies of items over data streams is a common building block in streaming data measurement and analysis. Misra and Gries introduced their seminal algorithm for the problem in 1982, and the problem has since been revisited many times due to its practicality and applicability. We describe a highly optimized version of Misra and Gries' algorithm that is suitable for deployment in industrial settings. Our code is made public via an open source library called Data Sketches that is already used by several companies and production systems. Our algorithm improves on two theoretical and practical aspects of prior work. First, it handles weighted updates in amortized constant time, a common requirement in practice. Second, it uses a simple and fast method for merging summaries that asymptotically improves on prior work even for unweighted streams. We describe experiments confirming that our algorithms are more efficient than prior proposals. 11
A Robust Approach for Mitigating Risks in Cyber Supply Chains Zheng, KY; Albert, LA In recent years, there have been growing concerns regarding risks in federal information technology (IT) supply chains in the United States that protect cyber infrastructure. A critical need faced by decisionmakers is to prioritize investment in security mitigations to maximally reduce risks in IT supply chains. We extend existing stochastic expected budgeted maximum multiple coverage models that identify good solutions on average that may be unacceptable in certain circumstances. We propose three alternative models that consider different robustness methods that hedge against worst-case risks, including models that maximize the worst-case coverage, minimize the worst-case regret, and maximize the average coverage in the (1-alpha) worst cases (conditional value at risk). We illustrate the solutions to the robust methods with a case study and discuss the insights their solutions provide into mitigation selection compared to an expected-value maximizer. Our study provides valuable tools and insights for decisionmakers with different risk attitudes to manage cybersecurity risks under uncertainty. 11
A budgeted maximum multiple coverage model for cybersecurity planning and management Zheng, KY; Albert, LA; Luedtke, JR; Towle, E This article studies how to identify strategies for mitigating cyber-infrastructure vulnerabilities. We propose an optimization framework that prioritizes the investment in security mitigations to maximize the coverage of vulnerabilities. We use multiple coverage to reflect the implementation of a layered defense, and we consider the possibility of coverage failure to address the uncertainty in the effectiveness of some mitigations. Budgeted Maximum Multiple Coverage (BMMC) problems are formulated, and we demonstrate that the problems are submodular maximization problems subject to a knapsack constraint. Other variants of the problem are formulated given different possible requirements for selecting mitigations, including unit cost cardinality constraints and group cardinality constraints. We design greedy approximation algorithms for identifying near-optimal solutions to the models. We demonstrate an optimal (1-1/e)-approximation ratio for BMMC and a variation of BMMC that considers the possibility of coverage failure, and a 1/2-approximation ratio for a variation of BMMC that uses a cardinality constraint and group cardinality constraints. The computational study suggests that our models yield robust solutions that use a layered defense and provide an effective mechanism to hedge against the risk of possible coverage failure. We also find that the approximation algorithms efficiently identify near-optimal solutions, and that a Benders branch-and-cut algorithm we propose can find provably optimal solutions to the vast majority of our test instances within an hour for the variations of the proposed models that consider coverage failures. 11
Lot targeting and lot dispatching decision policies for semiconductor manufacturing: optimisation under uncertainty with simulation validation Siebert, M; Bartlett, K; Kim, H; Ahmed, S; Lee, J; Nazzal, D; Nemhauser, G; Sokol, J Because semiconductor manufacturing is a complex and dynamic process, production scheduling in this industry typically relies on simple decision policies that use local rather than global information. Such myopic policies may lead to increased congestion in the material handling system and negatively impact throughput. In this paper, we propose a fluid-model lot dispatching policy that iteratively optimises lot selection based on current WIP distribution of the entire system. Furthermore, we propose to split the decision policies into two phases in order to include travel times information into the dispatching and targeting decisions. We provide simulation results for a prototype facility that show that our proposed policies outperform commonly used dispatching rules in throughput, machine utilisation and machine target accuracy. 11
Collective Multi-type Entity Alignment Between Knowledge Graphs Zhu, Q; Wei, H; Sisman, B; Zheng, D; Faloutsos, C; Dong, XL; Han, JW Knowledge graph (eg. Freebase, YAGO) is a multi-relational graph representing rich factual information among entities of various types. Entity alignment is the key step towards knowledge graph integration from multiple sources. It aims to identify entities across different knowledge graphs that refer to the same real world entity. However, current entity alignment systems overlook the sparsity of different knowledge graphs and can not align multi-type entities by one single model. In this paper, we present a Collective Graph neural network for Multi-type entity Alignment, called CG-MuAlign. Different from previous work, CG-MuAlign jointly aligns multiple types of entities, collectively leverages the neighborhood information and generalizes to unlabeled entity types. Specifically, we propose novel collective aggregation function tailored for this task, that (1) relieves the incompleteness of knowledge graphs via both cross-graph and self attentions, (2) scales up efficiently with mini-batch training paradigm and effective neighborhood sampling strategy. We conduct experiments on real world knowledge graphs with millions of entities and observe the superior performance beyond existing methods. In addition, the running time of our approach is much less than the current state-of-the-art deep learning methods. 11
Measuring Technological Innovation over the Long Run Kelly, B; Papanikolaou, D; Seru, A; Taddy, M We use textual analysis of high-dimensional data from patent documents to create new indicators of technological innovation. We identify important patents based on textual similarity of a given patent to previous and subsequent work: these patents are distinct from previous work but related to subsequent innovations. Our importance indicators correlate with existing measures of patent quality but also provide complementary information. We identify breakthrough innovations as the most important patents-those in the right tail of our measure-and construct time series indices of technological change at the aggregate and sectoral levels. Our technology indices capture the evolution of technological waves over a long time span (1840 to the present) and cover innovation by private and public firms as well as nonprofit organizations and the US government. Advances in electricity and transportation drive the index in the 1880s, chemicals and electricity in the 1920s and 1930s, and computers and communication in the post-1980s. 11
Arousal and economic decision making Jahedi, S; Deck, C; Ariely, D Previous experiments have found that subjecting participants to cognitive load leads to poorer decision making, consistent with dual-system models of behavior. Rather than taxing the cognitive system, this paper reports the results of an experiment that takes a complementary approach: arousing the emotional system. The results indicate that exposure to arousing visual stimuli as compared to neutral images has a negligible impact on performance in arithmetic tasks, impatience, risk taking in the domain of losses, and snack choice, although we find that arousal modestly increases risk-taking in the gains domain and increases susceptibility to anchoring effects. We find the effect of arousal on decision making to be smaller and less consistent than the effect of increased cognitive load for the same tasks. 11
MuSCA: a multi-scale source-sink carbon allocation model to explore carbon allocation in plants. An application to static apple tree structures Reyes, F; Pallas, B; Pradal, C; Vaggi, F; Zanotelli, D; Tagliavini, M; Gianelle, D; Costes, E Background and aims Carbon allocation in plants is usually represented at a topological scale, specific to each model. This makes the results obtained with different models, and the impact of their scales of representation, difficult to compare. In this study, we developed a multi-scale carbon allocation model (MuSCA) that allows the use of different, user-defined, topological scales of a plant, and assessment of the impact of each spatial scale on simulated results and computation time. Methods Model multi-scale consistency and behaviour were tested on three realistic apple tree structures. Carbon allocation was computed at five scales, spanning from the metamer (the finest scale, used as a reference) up to first-order branches, and for different values of a sap friction coefficient. Fruit dry mass increments were compared across spatial scales and with field data. Key Results The model was able to represent effects of competition for carbon assimilates on fruit growth. Intermediate friction parameter values provided results that best fitted field data. Fruit growth simulated at the metamer scale differed of -1 % in respect to results obtained at growth unit scale and up to 60 % in respect to first order branch and fruiting unit scales. Generally, the coarser the spatial scale the more predicted fruit growth diverged from the reference. Coherence in fruit growth across scales was also differentially impacted, depending on the tree structure considered. Decreasing the topological resolution reduced computation time by up to four orders of magnitude. Conclusions MuSCA revealed that the topological scale has a major influence on the simulation of carbon allocation. This suggests that the scale should be a factor that is carefully evaluated when using a carbon allocation model, or when comparing results produced by different models. Finally, with MuSCA, trade-off between computation time and prediction accuracy can be evaluated by changing topological scales. 11
Online cluster validity indices for performance monitoring of streaming data clustering Moshtaghi, M; Bezdek, JC; Erfani, SM; Leckie, C; Bailey, J Cluster analysis is used to explore structure in unlabeled batch data sets in a wide range of applications. An important part of cluster analysis is validating the quality of computationally obtained clusters. A large number of different internal indices have been developed for validation in the offline setting. However, this concept cannot be directly extended to the online setting because streaming algorithms do not retain the data, nor maintain a partition of it, both needed by batch cluster validity indices. In this paper, we develop two incremental versions (with and without forgetting factors) of the Xie-Beni and Davies-Bouldin validity indices, and use them to monitor and control two streaming clustering algorithms (sk-means and online ellipsoidal clustering). In this context, our new incremental validity indices are more accurately viewed as performance monitoring functions. We also show that incremental cluster validity indices can send a distress signal to online monitors when evolving structure leads an algorithm astray. Our numerical examples indicate that the incremental Xie-Beni index with a forgetting factor is superior to the other three indices tested. 10
An Integrated ANN - GA Approach to Maximise the Material Removal Rate and Surface Roughness of Wire Cut EDM on Titanium Alloy Karthikeyan, R; Kumar, VS; Punitha, A; Chavan, UM This investigation was planned to get the optimized material removal rate and surface roughness of Wire Cut Electric Discharge Machining (WCEDM) on Ti6Al4V by taking into consideration four input factors such as pulse on, pulse off, voltage and input power. Taguchi supported L9 orthogonal array was used to determine the total number of experimental conditions and its values of material removal rate (MRR) and surface roughness (SR) were calculated. Instead of trying the traditional regression model, in this investigation ANN model was constructed; as ANN is more effective when the number of experiments is restricted. To optimize the material removal rate and surface roughness, a feed forward artificial neural network model was developed and genetic algorithm was used by optimizing the weighing factors of the network in the neural power software. Finally, the model was achieved with the root mean square error of 0.0059 and 0.0033 for MRR and SR respectively. In turn, the optimized values of MRR and SR were found to be 7429 mm³/min and 2.1068 μm. 10
Automatic Speaker Recognition with Limited Data Li, RR; Jiang, JY; Liu, JH; Hsieh, CC; Wang, W Automatic speaker recognition (ASR) is a stepping-stone technology towards semantic multimedia understanding and benefits versatile downstream applications. In recent years, neural network-based ASR methods have demonstrated remarkable power to achieve excellent recognition performance with sufficient training data. However, it is impractical to collect sufficient training data for every user, especially for fresh users. Therefore, a large portion of users usually has a very limited number of training instances. As a consequence, the lack of training data prevents ASR systems from accurately learning users' acoustic biometrics, jeopardizes the downstream applications, and eventually impairs user experience. In this work, we propose an adversarial few-shot learning-based speaker identification framework (AFEASI) to develop robust speaker identification models with only a limited number of training instances. We first employ metric learning-based few-shot learning to learn speaker acoustic representations, where the limited instances are comprehensively utilized to improve the identification performance. In addition, adversarial learning is applied to further enhance the generalization and robustness for speaker identification with adversarial examples. Experiments conducted on a publicly available large-scale dataset demonstrate that AFEASI significantly outperforms eleven baseline methods. An in-depth analysis further indicates both effectiveness and robustness of the proposed method. 10
Learning Implicit Surface Light Fields Oechsle, M; Niemeyer, M; Reiser, C; Mescheder, L; Strauss, T; Geiger, A Implicit representations of 3D objects have recently achieved impressive results on learning-based 3D reconstruction tasks. While existing works use simple texture models to represent object appearance, photo-realistic image synthesis requires reasoning about the complex interplay of light, geometry and surface properties. In this work, we propose a novel implicit representation for capturing the visual appearance of an object in terms of its surface light field. In contrast to existing representations, our implicit model represents surface light fields in a continuous fashion and independent of the geometry. Moreover, we condition the surface light field with respect to the location and color of a small light source. Compared to traditional surface light field models, this allows us to manipulate the light source and relight the object using environment maps. We further demonstrate the capabilities of our model to predict the visual appearance of an unseen object from a single real RGB image and corresponding 3D shape information. As evidenced by our experiments, our model is able to infer rich visual appearance including shadows and specular reflections. Finally, we show that the proposed representation can be embedded into a variational auto-encoder for generating novel appearances that conform to the specified illumination conditions. 10
Understanding the relationship between levels of mobile technology use in high school physics classrooms and the learning outcome Zhai, XM; Zhang, ML; Li, M; Zhang, XJ Mobile technology has been increasingly adopted in science education. We generally assume that more innovative use of mobile technology leads to a greater learning outcome. Yet, there is a lack of empirical research to support this assumption. To fill in the gap, we drew upon data from 803 high school students who had used mobile devices for five months in physics classrooms. Using the SAMR model (ie, Substitution, Augmentation, Modification, and Redefinition), we distinguished their uses into two levels: Substitution (replacing traditional instructional approach with mobile technology without functional improvement), and augmentation (enhancing instruction with affordances provided by mobile technology). Using Hierarchical Linear Modeling analysis, we found that the augmentation level of use was positively correlated with the physics learning outcome, but the substitution level of use was not. We further identified four sub-types of uses within the augmentation level. We found that after-school remediating activities and student-teacher displaying activities were positively correlated with student physics achievement, but teacher-assigned activities had no significant correlation and learning aid activities had a negative correlation with the learning outcome. This study provided empirical evidence to support the assumption that a higher level of mobile technology use may be related to a greater learning outcome and that the impact of mobile technology may be determined by multiple factors such as who initiates the use and whether the use enhances or distracts students' knowledge construction. 10
The Effects of Empirical Keying of Personality Measures on Faking and Criterion-Related Validity Cucina, JM; Vasilopoulos, NL; Su, CW; Busciglio, HH; Cozma, I; DeCostanza, AH; Martin, NR; Shaw, MN We investigated the effects of empirical keying on scoring personality measures. To our knowledge, this is the first published study to investigate the use of empirical keying for personality in a selection context. We hypothesized that empirical keying maximizes use of the information provided in responses to personality items. We also hypothesized that it reduces faking since the relationship between response options and performance is not obvious to respondents. Four studies were used to test the hypotheses. In Study 1, the criterion-related validity of empirically keyed personality measures was investigated using applicant data from a law enforcement officer predictive validation study. A combination of training and job performance measures was used as criteria. In Study 2, two empirical keys were created for long and short measures of the five factors. The criterion-related validities of the empirical keys were investigated using Freshman GPA (FGPA) as a criterion. In Study 3, one set of the empirical keys from Study 2 was applied to experimental data to examine the effects of empirical keying on applicant faking and on the relationship of personality scores and cognitive ability. In Study 4, we examined the generalizability of empirical keying across different organizations. Across the studies, option- and item-level empirical keying increased criterion-related validities for academic, training, and job performance. Empirical keying also reduced the effects of faking. Thus, both hypotheses were supported. We recommend that psychologists using personality measures to predict performance should consider the use of empirical keying as it enhanced validity and reduced faking. 10
Ceres: Harvesting Knowledge from the Semi-structured Web Dong, XL Knowledge graphs have been used to support a wide range of applications and enhance search and QA for Google, Bing, Amazon Alexa, etc. However, we often miss long-tail knowledge, including unpopular entities, unpopular relations, and unpopular verticals. In this talk we describe our efforts in harvesting knowledge from semi-structured websites, which are often populated according to some templates using vast volume of data stored in underlying databases. We describe our Ceres system, which extracts knowledge from semi-structured web. AutoCeres is a ClosedIE system that extracts knowledge according to existing ontology. It improves the accuracy of fully automatic knowledge extraction from 60%+ of state-of-the-art to 90%+ on semi-structured data. OpenCeres is the first-ever OpenIE system on semi-structured data, that is able to identify new relations not readily included in existing ontologies. ZeroShotCeres goes further and enables extracting knowledge for completely new domains, where there is no seed knowledge for bootstrapping the extraction. Finally, we describe our other efforts in ontology alignment, entity linkage, graph mining, and QA, that allow us to best leverage the knowledge we extract for search and QA. 9
DER-TEE: Secure Distributed Energy Resource Operations Through Trusted Execution Environments Sebastian, DJ; Agrawal, U; Tamimi, A; Hahn, A The high penetration of renewable energy means the grid is increasingly dependent on consumer-owned devices operation, providing a growing nexus between the Internet of Things (IoT) and the smart grid. However, these devices are much more vulnerable as they are connected, through interconnections to utility, manufacturers, third-party operators, and other consumer IoT devices. Therefore, novel security mechanisms are needed to protect these devices, especially ensuring the integrity of critical measurements and control messages. Fortunately, the growing prevalence of hardware-enforced trusted execution environments (TEEs) provides an opportunity to utilize their secure storage and cryptographic functions to provide enhanced security to various IoT platforms. This paper will demonstrate a TEE-based architecture for smart inverters that utilizes hardware and software-based isolation to prevent tampering of inverter telemetry data. Furthermore, it provides an implementation of the proposed architecture on an ARM TrustZone-enabled platform using open portable TEE (OP-TEE) on a Raspberry-Pi. The developed implementation is evaluated under a set of cybersecurity metrics. 9
Node Embedding via Word Embedding for Network Community Discovery Ding, WC; Lin, C; Ishwar, P Neural node embeddings have recently emerged as a powerful representation for supervised learning tasks involving graph-structured data. We leverage this recent advance to develop a novel algorithm for unsupervised community discovery in graphs. Through extensive experimental studies on simulated and real-world data, we demonstrate that the proposed approach consistently improves over the current state-of-the-art. Specifically, our approach empirically attains the information-theoretic limits for community recovery under the benchmark stochastic block models for graph generation and exhibits better stability and accuracy over both spectral clustering and acyclic belief propagation in the community recovery limits. 9
Cartogram Visualization for Bivariate Geo-Statistical Data Nusrat, S; Alam, MJ; Scheidegger, C; Kobourov, S We describe bivariate cartograms, a technique specifically designed to allow for the simultaneous comparison of two geo-statistical variables. Traditional cartograms are designed to show only a single statistical variable, but in practice, it is often useful to show two variables (e.g., the total sales for two competing companies) simultaneously. We illustrate bivariate cartograms using Dorling-style cartograms, yet the technique is simple and generalizable to other cartogram types, such as contiguous cartograms, rectangular cartograms, and non-contiguous cartograms. An interactive feature makes it possible to switch between bivariate cartograms, and the traditional (monovariate) cartograms. Bivariate cartograms make it easy to find more geographic patterns and outliers in a pre-attentive way than previous approaches, as shown in Fig. 2. They are most effective for showing two variables from the same domain (e.g., population in two different years, sales for two different companies), although they can also be used for variables from different domains (e.g., population and income). We also describe a small-scale evaluation of the proposed techniques that indicates bivariate cartograms are especially effective for finding geo-statistical patterns, trends and outliers. 9
A dynamic ambulance routing model with multiple response Yoon, S; Albert, LA Emergency medical services systems equipped with both advanced and basic emergency vehicles often dispatch both types of vehicles to one call, which is called multiple response. Multiple response allows for faster response times at the potential cost of making more vehicles unavailable for service. To evaluate the value of multiple response, we formulate a Markov decision process model that dynamically determines which type of vehicle(s) to dispatch. We show that the optimal policies are class separable. Numerical experiments demonstrate that multiple response can significantly improve system performance when patients' health needs are uncertain. 9
Sustainability at Scale: Towards Bridging the Intention-Behavior Gap with Sustainable Recommendations Tomkins, S; Isley, S; London, B; Getoor, L Finding sustainable products and evaluating their claims is a significant barrier facing sustainability-minded customers. Tools that reduce both these burdens are likely to boost the sale of sustainable products. However, it is difficult to determine the sustainability characteristics of these products - there are a variety of certifications and definitions of sustainability, and quality labeling requires input from domain experts. In this paper, we propose a flexible probabilistic framework that uses domain knowledge to identify sustainable products and customers, and uses these labels to predict customer purchases. We evaluate our approach on grocery items from the Amazon catalog. Our proposed approach outperforms established recommender system models in predicting future purchases while jointly inferring sustainability scores for customers and products. 9
Content-Based Weak Supervision for Ad-Hoc Re-Ranking MacAvaney, S; Yates, A; Hui, K; Frieder, O One challenge with neural ranking is the need for a large amount of manually-labeled relevance judgments for training. In contrast with prior work, we examine the use of weak supervision sources for training that yield pseudo query-document pairs that already exhibit relevance (e.g., newswire headline-content pairs and encyclopedic heading-paragraph pairs). We also propose filtering techniques to eliminate training samples that are too far out of domain using two techniques: a heuristic-based approach and a novel supervised filter that re-purposes a neural ranker. Using several leading neural ranking architectures and multiple weak supervision datasets, we show that these sources of training pairs are effective on their own (outperforming prior weak supervision techniques), and that filtering can further improve performance. 9
Researcher mobility at a US research-intensive university: Implications for research and internationalization strategies Payumo, JG; Lan, G; Arasu, P This study offers a unique lens on the patterns, productivity, and impact of researcher mobility at a US research-intensive university. Bibliometric data for Washington State University (WSU) was extracted from Elsevier's Scopus database and analyzed for the 10-year period from 2002 to 2012. We grouped researchers into four categories based on common patterns of movement into, within, and out of the USA: mobile (inflow, outflow, and transitory) versus non-mobile (stationary). We compared the research performances of these different groups using two normalized indicators: relative research productivity and the field-weighted citation impact of the researchers' publications. Our analysis showed that 83% of active researchers at WSU were mobile during the 10-year period based on their having both publications affiliated with WSU and publications affiliated with at least one other institution. The publications of mobile researchers had higher impact compared to non-mobile researchers. Additionally, WSU researchers who primarily moved between other US-based institutions produced publications with higher impact compared to those of internationally mobile researchers, though the latter group was more prolific. Transitory researchers (those spending less than 2 years at either WSU or another institution) comprised the largest sub-group of mobile researchers at 59%. The results of this study offer additional evidence about the value to US universities of researcher mobility and greater research collaborations with both domestic and international partners. 9
Rivulet: A Fault-Tolerant Platform for Smart-Home Applications Ardekani, MS; Singh, RP; Agrawal, N; Terry, DB; Suminto, RO Rivulet is a fault-tolerant distributed platform for running smart-home applications; it can tolerate failures typical for a home environment (e.g., link losses, network partitions, sensor failures, and device crashes). In contrast to existing cloud-centric solutions, which rely exclusively on a home gateway device, Rivulet leverages redundant smart consumer appliances (e.g., TVs, Refrigerators) to spread sensing and actuation across devices local to the home, and avoids making the Smart-Home Hub a single point of failure. Rivulet ensures event delivery in the presence of link loss, network partitions and other failures in the home, to enable applications with reliable sensing in the case of sensor failures, and event processing in the presence of device crashes. In this paper, we present the design and implementation of Rivulet, and evaluate its effective handling of failures in a smart home. 9
A physics-informed deep learning approach for bearing fault detection Shen, S; Lu, H; Sadoughi, M; Hu, C; Nemani, V; Thelen, A; Webster, K; Darr, M; Sidon, J; Kenny, S In recent years, advances in computer technology and the emergence of big data have enabled deep learning to achieve impressive successes in bearing condition monitoring and fault detection. While existing deep learning approaches are able to efficiently detect and classify bearing faults, most of these approaches depend exclusively on data and do not incorporate physical knowledge into the learning and prediction processes-or more importantly, embed the physical knowledge of bearing faults into the model training process, which makes the model physically meaningful. To address this challenge, we propose a physics-informed deep learning approach that consists of a simple threshold model and a deep convolutional neural network (CNN) model for bearing fault detection. In the proposed physics-informed deep learning approach, the threshold model first assesses the health classes of bearings based on known physics of bearing faults. Then, the CNN model automatically extracts high-level characteristic features from the input data and makes full use of these features to predict the health class of a bearing. We designed a loss function for training and validating the CNN model that selectively amplifies the effect of the physical knowledge assimilated by the threshold model when embedding this knowledge into the CNN model. The proposed physics-informed deep learning approach was validated using (1) data from 18 bearings on an agricultural machine operating in the field, and (2) data from bearings on a laboratory test stand in the Case Western Reserve University (CWRU) Bearing Data Center. 9
Two Indias: The structure of primary health care markets in rural Indian villages with implications for policy Das, J; Daniels, B; Ashok, M; Shim, EY; Muralidharan, K We visited 1519 villages across 19 Indian states in 2009 to (a) count all health care providers and (b) elicit their quality as measured through tests of medical knowledge. We document three main findings. First, 75% of villages have at least one health care provider and 64% of care is sought in villages with 3 or more providers. Most providers are in the private sector (86%) and, within the private sector, the majority are 'informal providers' without any formal medical training. Our estimates suggest that such informal providers account for 68% of the total provider population in rural India. Second, there is considerable variation in quality across states and formal qualifications are a poor predictor of quality. For instance, the medical knowledge of informal providers in Tamil Nadu and Karnataka is higher than that of fully trained doctors in Bihar and Uttar Pradesh. Surprisingly, the share of informal providers does not decline with socioeconomic status. Instead, their quality, along with the quality of doctors in the private and public sector, increases sharply. Third, India is divided into two nations not just by quality of health care providers, but also by costs: Better performing states provide higher quality at lower per-visit costs, suggesting that they are on a different production possibility frontier. These patterns are consistent with significant variation across states in the availability and quality of medical education. Our results highlight the complex structure of health care markets, the large share of private informal providers, and the substantial variation in the quality and cost of care across and within markets in rural India. Measuring and accounting for this complexity is essential for health care policy in India. 9
An integrated acquisition policy for supplier selection and lot sizing considering total quantity discounts and a quality constraint Li, X; Ventura, JA; Venegas, BB; Kweon, SJ; Hwang, SW We consider a two-stage supply chain where a buyer purchases a product from multiple capacitated suppliers to satisfy a constant demand rate over a finite planning horizon. Suppliers have different perfect rates and offer total quantity discounts. The buyer selects suppliers and allocates orders to them that satisfy a minimum average quality level. A mathematical model is proposed with the objective of minimizing the total cost per time unit. The model is solved by dualizing the quality constraint. The relaxed model is solved by an efficient dynamic programming algorithm. The subgradient method is used to solve the dual problem. 9
Generating lineage-resolved, complete metagenome-assembled genomes from complex microbial communities Bickhart, DM; Kolmogorov, M; Tseng, E; Portik, DM; Korobeynikov, A; Tolstoganov, I; Uritskiy, G; Liachko, I; Sullivan, ST; Shin, SB; Zorea, A; Andreu, VP; Panke-Buisse, K; Medema, MH; Mizrahi, I; Pevzner, PA; Smith, TPL Microbial communities might include distinct lineages of closely related organisms that complicate metagenomic assembly and prevent the generation of complete metagenome-assembled genomes (MAGs). Here we show that deep sequencing using long (HiFi) reads combined with Hi-C binning can address this challenge even for complex microbial communities. Using existing methods, we sequenced the sheep fecal metagenome and identified 428 MAGs with more than 90% completeness, including 44 MAGs in single circular contigs. To resolve closely related strains (lineages), we developed MAGPhase, which separates lineages of related organisms by discriminating variant haplotypes across hundreds of kilobases of genomic sequence. MAGPhase identified 220 lineage-resolved MAGs in our dataset. The ability to resolve closely related microbes in complex microbial communities improves the identification of biosynthetic gene clusters and the precision of assigning mobile genetic elements to host genomes. We identified 1,400 complete and 350 partial biosynthetic gene clusters, most of which are novel, as well as 424 (298) potential host-viral (host-plasmid) associations using Hi-C data. Metagenome sequencing can now distinguish closely related microbes using long reads and haplotype phasing. 9
Probabilistic Semantic Retrieval for Surveillance Videos With Activity Graphs Chen, YT; Wang, J; Bai, YN; Castanon, G; Saligrama, V We present a novel framework for finding complex activities matching user-described queries in cluttered surveillance videos. The wide diversity of queries coupled with the unavailability of annotated activity data limits our ability to train activity models. To bridge the semantic gap, we propose letting users describe an activity as a semantic graph with object attributes and inter-object relationships associated with nodes and edges, respectively. We learn node/edge-level visual predictors during training and, at test-time, propose retrieving activity by identifying likely locations that match the semantic graph. We formulate a novel conditional random field-based probabilistic activity localization objective that accounts for misdetections, misclassifications and track losses, and outputs a likelihood score for a candidate grounded location of the query in the video. We seek groundings that maximize overall precision and recall. To handle the combinatorial search over all high-probability groundings, we propose a highest precision subgraph matching algorithm. Our method outperforms existing retrieval methods on benchmarked datasets. 8
Cooperative Highway Work Zone Merge Control Based on Reinforcement Learning in a Connected and Automated Environment Ren, TZ; Xie, YC; Jiang, LM Given the aging infrastructure and the anticipated growing number of highway work zones in the U.S.A., it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected, and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicle in the closed lane learns how to adjust its longitudinal position optimally to find a safe gap in the open lane using an off-policy soft actor critic reinforcement learning (RL) algorithm, considering its surrounding traffic conditions. The learning results are captured in convolutional neural networks and used to control individual vehicles in the testing phase. By adding the metering zones and taking the locations, speeds, and accelerations of surrounding vehicles into account, cooperation among vehicles is implicitly considered. This RL-based model is trained and evaluated using a microscopic traffic simulator. The results show that this cooperative RL-based merge control significantly outperforms popular strategies such as late merge and early merge in terms of both mobility and safety measures. It also performs better than a strategy assuming all vehicles are equipped with cooperative adaptive cruise control. 8
Bayesian Adversarial Human Motion Synthesis Zhao, R; Su, H; Ji, Q We propose a generative probabilistic model for human motion synthesis. Our model has a hierarchy of three layers. At the bottom layer, we utilize Hidden semi-Markov Model (HSMM), which explicitly models the spatial pose, temporal transition and speed variations in motion sequences. At the middle layer, HSMM parameters are treated as random variables which are allowed to vary across data instances in order to capture large intra- and inter-class variations. At the top layer, hyperparameters define the prior distributions of parameters, preventing the model from overfitting. By explicitly capturing the distribution of the data and parameters, our model has a more compact parameterization compared to GAN-based generative models. We formulate the data synthesis as an adversarial Bayesian inference problem, in which the distributions of generator and discriminator parameters are obtained for data synthesis. We evaluate our method through a variety of metrics, where we show advantage than other competing methods with better fidelity and diversity. We further evaluate the synthesis quality as a data augmentation method for recognition task. Finally, we demonstrate the benefit of our fully probabilistic approach in data restoration task. 8
Employing Semantic Context for Sparse Information Extraction Assessment Li, PP; Wang, HX; Li, HS; Wu, XD A huge amount of text available on the World Wide Web presents an unprecedented opportunity for information extraction (IE). One important assumption in IE is that frequent extractions are more likely to be correct. Sparse IE is hence a challenging task because no matter how big a corpus is, there are extractions supported by only a small amount of evidence in the corpus. However, there is limited research on sparse IE, especially in the assessment of the validity of sparse IEs. Motivated by this, we introduce a lightweight, explicit semantic approach for assessing sparse IE. We first use a large semantic network consisting of millions of concepts, entities, and attributes to explicitly model the context of any semantic relationship. Second, we learn from three semantic contexts using different base classifiers to select an optimal classification model for assessing sparse extractions. Finally, experiments show that as compared with several state-of-the-art approaches, our approach can significantly improve the F-score in the assessment of sparse extractions while maintaining efficiency. 8
Kernel Square-Loss Exemplar Machines for Image Retrieval Rezende, RS; Zepeda, J; Ponce, J; Bach, F; Perez, P Zepeda and Perez [41] have recently demonstrated the promise of the exemplar SVM (ESVM) as a feature encoder for image retrieval. This paper extends this approach in several directions: We first show that replacing the hinge loss by the square loss in the ESVM cost function significantly reduces encoding time with negligible effect on accuracy. We call this model square-loss exemplar machine, or SLEM. We then introduce a kernelized SLEM which can be implemented efficiently through low-rank matrix decomposition, and displays improved performance. Both SLEM variants exploit the fact that the negative examples are fixed, so most of the SLEM computational complexity is relegated to an offline process independent of the positive examples. Our experiments establish the performance and computational advantages of our approach using a large array of base features and standard image retrieval datasets. 8
AUTOKNOW: Self-Driving Knowledge Collection for Products of Thousands of Types Dong, XL; He, X; Kan, A; Li, X; Liang, Y; Ma, J; Xu, YE; Zhang, CW; Zhao, T; Saldana, GB; Deshpande, S; Manduca, AM; Ren, J; Singh, SP; Xiao, F; Chang, HS; Karamanolakis, G; Mao, YN; Wang, YQ; Faloutsos, C; McCallum, A; Han, JW Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across large number of categories, as well as large and constantly growing number of products. We describe AUTOKNOW, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AUTOKNOW is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AUTOKNOW has been operational in collecting product knowledge for over 11K product types. 8
Competition in the chaperone-client network subordinates cell-cycle entry to growth and stress Moreno, DF; Parisi, E; Yahya, G; Vaggi, F; Csikasz-Nagy, A; Aldea, M The precise coordination of growth and proliferation has a universal prevalence in cell homeostasis. As a prominent property, cell size is modulated by the coordination between these processes in bacterial, yeast, and mammalian cells, but the underlying molecular mechanisms are largely unknown. Here, we show that multifunctional chaperone systems play a concerted and limiting role in cell-cycle entry, specifically driving nuclear accumulation of the G1 Cdk-cyclin complex. Based on these findings, we establish and test a molecular competition model that recapitulates cell-cycle-entry dependence on growth rate. As key predictions at a single-cell level, we show that availability of the Ydj1 chaperone and nuclear accumulation of the G1 cyclin Cln3 are inversely dependent on growth rate and readily respond to changes in protein synthesis and stress conditions that alter protein folding requirements. Thus, chaperone workload would subordinate Start to the biosynthetic machinery and dynamically adjust proliferation to the growth potential of the cell. 8
ProductQnA: Answering User Questions on E-Commerce Product Pages Kulkarni, A; Mehta, K; Garg, S; Bansal, V; Rasiwasia, N; Sengamedu, SH Product pages on e-commerce websites often overwhelm their customers with a wealth of data, making discovery of relevant information a challenge. Motivated by this, we present a novel framework to answer both factoid and non-factoid user questions on product pages. We propose several question-answer matching models leveraging both deep-learned distributional semantics and semantics imposed by a structured resource like a domain-specific ontology. The proposed framework supports the use of a combination of these models and we show, through empirical evaluation, that a cascade of these models does much better in meeting the high precision requirements of such a question-answering system. Evaluation on user-asked questions shows that the proposed system achieves 66% higher precision compared to an IDF-weighted average of word vectors baseline [1]. 7
MULTIIMPORT: Inferring Node Importance in a Knowledge Graph from Multiple Input signals Park, N; Kan, A; Dong, XL; Zhao, T; Faloutsos, C Given multiple input signals, how can we infer node importance in a knowledge graph (KG)? Node importance estimation is a crucial and challenging task that can benefit a lot of applications including recommendation, search, and query disambiguation. A key challenge towards this goal is how to effectively use input from different sources. On the one hand, a KG is a rich source of information, with multiple types of nodes and edges. On the other hand, there are external input signals, such as the number of votes or pageviews, which can directly tell us about the importance of entities in a KG. While several methods have been developed to tackle this problem, their use of these external signals has been limited as they are not designed to consider multiple signals simultaneously. In this paper, we develop an end-to-end model MultiImport, which infers latent node importance from multiple, potentially overlapping, input signals. MultiImport is a latent variable model that captures the relation between node importance and input signals, and effectively learns from multiple signals with potential conflicts. Also, MultiImport provides an effective estimator based on attentive graph neural networks. We ran experiments on real-world KGs to show that MultiImport handles several challenges involved with inferring node importance from multiple input signals, and consistently outperforms existing methods, achieving up to 23.7% higher NDCG@100 than the state-of-the-art method. 7
QPipe: Quantiles Sketch Fully in the Data Plane Ivkin, N; Yu, ZL; Braverman, V; Jin, X Efficient network management requires collecting a variety of statistics over the packet flows. Monitoring the flows directly in the data plane allows the system to detect anomalies faster. However, monitoring algorithms have to handle a throughput of 10^9 packets per second and to maintain a very low memory footprint. Widely adopted sampling-based approaches suffer from low accuracy in estimations. Thus, it is natural to ask: Is it possible to maintain important statistics in the data plane using a small memory footprint? In this paper, we answer this question in the affirmative for the important case of quantiles. We introduce QPipe, the first quantiles sketching algorithm that can be implemented entirely in the data plane. Our main technical contribution is an on-the-plane implementation of a variant of the SweepKLL [27] algorithm. Specifically, we give novel implementations of argmin(), the major building block of SweepKLL, which is usually not supported in the data plane of a commodity switch. We prototype QPipe in P4 and compare its performance with a sampling-based baseline. Our evaluations demonstrate a 10x memory reduction for a fixed approximation error and a 90x error improvement for a fixed amount of memory. We conclude that QPipe can be an attractive alternative to sampling-based methods. 7
Language-Agnostic Representation Learning for Product Search on E-Commerce Platforms Ahuja, A; Rao, N; Katariya, S; Subbian, K; Reddy, CK Product search forms an indispensable component of any e-commerce service, and helps customers find products of their interest from a large catalog on these websites. When products that are irrelevant to the search query are surfaced, it leads to a poor customer experience, thus reducing user trust and increasing the likelihood of churn. While identifying and removing such results from product search is crucial, doing so is a burdensome task that requires large amounts of human annotated data to train accurate models. This problem is exacerbated when products are cross-listed across countries that speak multiple languages, and customers specify queries in multiple languages and from different cultural contexts. In this work, we propose a novel multi-lingual multi-task learning framework, to jointly train product search models on multiple languages, with limited amount of training data from each language. By aligning the query and product representations from different languages into a language-independent vector space of queries and products, respectively, the proposed model improves the performance over baseline search models in any given language. We evaluate the performance of our model on real data collected from a leading e-commerce service. Our experimental evaluation demonstrates up to 23% relative improvement in the classification F1-score compared to the state-of-the-art baseline models. 7
A numerical study of 3D frequency-domain elastic full-waveform inversion Pan, GD; Liang, L; Habashy, TM We have developed a 3D elastic full-waveform inversion (FWI) algorithm with forward modeling and inversion performed in the frequency domain. The Helmholtz equation is solved with a second-order finite-difference method using an iterative solver equipped with an efficient complex-shifted incomplete LU-based preconditioner. The inversion is based on the minimization of the data misfit functional and a total variation regularization for the unknown model parameters. We implement the Gauss-Newton method as the optimization engine for the inversions. The codes are parallelized with a message passing interface based on the number of shots and receivers. We examine the performance of this elastic FWI algorithm and workflow on synthetic examples including surface seismic and vertical seismic profile configurations. With various initial models, we manage to obtain high-quality velocity images for 3D earth models. 7
Information-Estimation Relationships Over Binomial and Negative Binomial Models Taborda, CG; Guo, DN; Perez-Cruz, F In recent years, a number of new connections between information measures and estimation have been found under various models, including, predominantly, Gaussian and Poisson models. This paper develops similar results for the binomial and negative binomial models. In particular, it is shown that the derivative of the relative entropy and the derivative of the mutual information for the binomial and negative binomial models can be expressed through the expectation of closed-form expressions that have conditional estimates as the main argument. Under mild conditions, those derivatives take the form of an expected Bregman divergence. 7
Active Frame, Location, and Detector Selection for Automated and Manual Video Annotation Karasev, V; Ravichandran, A; Soatto, S We describe an information-driven active selection approach to determine which detectors to deploy at which location in which frame of a video to minimize semantic class label uncertainty at every pixel, with the smallest computational cost that ensures a given uncertainty bound. We show minimal performance reduction compared to a paragon algorithm running all detectors at all locations in all frames, at a small fraction of the computational cost. Our method can handle uncertainty in the labeling mechanism, so it can handle both oracles (manual annotation) or noisy detectors (automated annotation). 7
Exploiting Web Images for Weakly Supervised Object Detection Tao, QY; Yang, H; Cai, JF In recent years, the performance of object detection has advanced significantly with the evolution of deep convolutional neural networks. However, the state-of-the-art object detection methods still rely on accurate bounding box annotations that require extensive human labeling. Object detection without bounding box annotations, that is, weakly supervised detection methods, are still lagging far behind. As weakly supervised detection only uses image level labels and does not require the ground truth of bounding box location and label of each object in an image, it is generally very difficult to distill knowledge of the actual appearances of objects. Inspired by curriculum learning, this paper proposes an easy-to-hard knowledge transfer scheme that incorporates easy web images to provide prior knowledge of object appearance as a good starting point. While exploiting large-scale free web imagery, we introduce a sophisticated labor-free method to construct a web dataset with good diversity in object appearance. After that, semantic relevance and distribution relevance are introduced and utilized in the proposed curriculum training scheme. Our end-to-end learning with the constructed web data achieves remarkable improvement across most object classes, especially for the classes that are often considered hard in other works. 7
Not too ugly to be tasty: Guiding consumer food inferences for the greater good Pfeiffer, BE; Sundar, A; Deval, H The issue of food waste is an important societal challenge with a significant environmental impact. An important issue contributing to food waste is consumers' unwillingness to purchase suboptimal food. Past literature has shown that people prefer perfectly formed food to abnormally shaped food when given a choice, but much of the mechanism underlying this preference is not well documented. Using a framework based on the halo effect, the authors focus on consumers' affective and cognitive responses that cause them to shy away from produce that does not meet the usual aesthetic criteria. Results demonstrate that consumers find well-formed produce vs. deformed produce to be more aesthetically pleasing (beautiful) and that this positive affective reaction leads to more positive consumer inferences of taste, health, and quality. Results also indicate that consumers view sellers of well-formed produce as more competent than sellers of deformed produce and that this perception is driven by perceptions of beauty and consumer inferences of taste, health, and quality. Lastly, results show that the effects of form on consumer inferences may depend on different distribution channels. Shopping at a farmers' market mitigates the impact of the deformation on consumer inferences. Given that form and actual taste, health, and quality are not generally correlated, the results indicate that consumers are making inaccurate inferences. Exploring these inferences has the potential to open new avenues to educate consumers. 7
How Cognitive Models of Human Body Experience Might Push Robotics Schurmann, T; Mohler, BJ; Peters, J; Beckerle, P In the last decades, cognitive models of multisensory integration in human beings have been developed and applied to model human body experience. Recent research indicates that Bayesian and connectionist models might push developments in various branches of robotics: assistive robotic devices might adapt to their human users aiming at increased device embodiment, e.g., in prosthetics, and humanoid robots could be endowed with human-like capabilities regarding their surrounding space, e.g., by keeping safe or socially appropriate distances to other agents. In this perspective paper, we review cognitive models that aim to approximate the process of human sensorimotor behavior generation, discuss their challenges and potentials in robotics, and give an overview of existing approaches. While model accuracy is still subject to improvement, human-inspired cognitive models support the understanding of how the modulating factors of human body experience are blended. Implementing the resulting insights in adaptive and learning control algorithms could help to tailor assistive devices to their users' individual body experience. Humanoid robots that develop their own body schema could consider this body knowledge in control and learn to optimize their physical interaction with humans and their environment. Cognitive body experience models should be improved in accuracy and online capabilities to achieve these ambitious goals, which would foster human-centered directions in various fields of robotics. 7
Code-Level Model Checking in the Software Development Workflow Chong, N; Cook, B; Kallas, K; Khazem, K; Monteiro, FR; Schwartz-Narbonne, D; Tasiran, S; Tautschnig, M; Tuttle, MR This experience report describes a style of applying symbolic model checking developed over the course of four years at Amazon Web Services. Lessons learned are drawn from proving properties of numerous C-based systems, e.g., custom hypervisors, encryption code, boot loaders, and an IoT operating system. Using our methodology, we find that we can prove the correctness of industrial low-level C-based systems with reasonable effort and predictability. Furthermore, Amazon Web Services developers are increasingly writing their own formal specifications. All proofs discussed in this paper are publicly available on GitHub. 7
Convolutional neural network with median layers for denoising salt-and-pepper contaminations Liang, LM; Deng, S; Gueguen, L; Wei, MQ; Wu, XM; Qin, J We propose a deep fully convolutional neural network with a new type of layer, named the median layer, to restore images contaminated by salt-and-pepper (s&p) noise. A median layer simply performs median filtering on all feature channels. By adding this kind of layer into some widely used fully convolutional deep neural networks, we develop an end-to-end network that removes extremely high-level s&p noise without performing any non-trivial preprocessing tasks. Experiments show that inserting median layers into a simple fully-convolutional network with the L2 loss significantly boosts the signal-to-noise ratio. Quantitative comparisons testify that our network outperforms the state-of-the-art methods with a limited amount of training data. 7
Dynamic Prediction of the Incident Duration Using Adaptive Feature Set Ghosh, B; Asif, MT; Dauwels, J; Fastenrath, U; Guo, HL Non-recurring incidents such as accidents and vehicle breakdowns are the leading causes of severe traffic congestion in large cities. Consequently, anticipating the duration of such events in advance can be highly useful in mitigating the resultant congestion. However, availability of partial information or ever-changing ground conditions makes the task of forecasting the duration particularly challenging. In this paper, we propose an adaptive ensemble model that can provide reasonable forecasts even when a limited amount of information is available and further improves the prediction accuracy as more information becomes available during the course of the incidents. Furthermore, we consider the scenarios where the historical incident reports may not always contain accurate information about the duration of the incidents. To mitigate this issue, we first quantify the effective duration of the incidents by looking for the change points in traffic state and then utilize this information to predict the duration of the incidents. We compare the prediction performance of different traditional regression methods, and the experimental results show that the Treebagger outperforms other methods. For the incidents with duration in the range of 36 - 200 min, the mean absolute percentage error (MAPE) in predicting the duration is in the range of 25% - 55%. Moreover, for longer duration incidents (greater than 65 min), prediction improves significantly with time. For example, the MAPE value varies over time from 76% to 50% for incidents having a duration greater than 200 min. Finally, the overall MAPE value averaged over all incidents improves by 50% with elapsed time for prediction of reported as well as effective duration. 7
Intrinsic Gaussian processes on complex constrained domains Niu, M; Cheung, P; Lin, LZ; Dai, ZW; Lawrence, N; Dunson, D We propose a class of intrinsic Gaussian processes (GPs) for interpolation, regression and classification on manifolds with a primary focus on complex constrained domains or irregularly shaped spaces arising as subsets or submanifolds of R, R^2, R^3 and beyond. For example, intrinsic GPs can accommodate spatial domains arising as complex subsets of Euclidean space. Intrinsic GPs respect the potentially complex boundary or interior conditions as well as the intrinsic geometry of the spaces. The key novelty of the approach proposed is to utilize the relationship between heat kernels and the transition density of Brownian motion on manifolds for constructing and approximating valid and computationally feasible covariance kernels. This enables intrinsic GPs to be practically applied in great generality, whereas existing approaches for smoothing on constrained domains are limited to simple special cases. The broad utilities of the intrinsic GP approach are illustrated through simulation studies and data examples. 7
Network-Aware Feasible Repairs for Erasure-Coded Storage Sipos, M; Gahm, J; Venkat, N; Oran, D A significant amount of research on using erasure coding for distributed storage has focused on reducing the amount of data that needs to be transferred to replace failed nodes. This continues to be an active topic as the introduction of faster storage devices looks to put an even greater strain on the network. However, with a few notable exceptions, most published work assumes a flat, static network topology between the nodes of the system. We propose a general framework to find the lowest cost feasible repairs in a more realistic, heterogeneous and dynamic network, and examine how the number of repair strategies to consider can be reduced for three distinct erasure codes. We devote a significant part of the paper to determining the set of feasible repairs for random linear network coding (RLNC) and describe a system of efficient checks using techniques from the arsenal of dynamic programming. Our solution involves decomposing the problem into smaller steps, memorizing, and then reusing intermediate results. All computationally intensive operations are performed prior to the failure of a node to ensure that the repair can start with minimal delay, based on up-to-date network information. We show that all three codes benefit from being network aware and find that the extra computations required for RLNC can be reduced to a viable level for a wide range of parameter values. 7
Sim2Real in Robotics and Automation: Applications and Challenges Hofer, S; Bekris, K; Handa, A; Gamboa, JC; Mozifian, M; Golemo, F; Atkeson, C; Fox, D; Goldberg, K; Leonard, J; Liu, CK; Peters, J; Song, SR; Welinder, P; White, M To perform reliably and consistently over sustained periods of time, large-scale automation critically relies on computer simulation. Simulation allows us and supervisory AI to effectively design, validate, and continuously improve complex processes, and helps practitioners to gain insight into the operation and justify future investments. While numerous successful applications of simulation in industry exist, such as circuit simulation, finite element methods, and computer-aided design (CAD), state-of-the-art simulators fall short of accurately modeling physical phenomena, such as friction, impact, and deformation. 7
Deep Learning for IoT Lin, T Deep learning and other machine learning approaches are deployed in many systems related to the Internet of Things (IoT). However, these systems face the challenge that adversaries can exploit loopholes to hack them by tampering with historical data. This paper first presents an overview of adversarial machine learning. Then, we illustrate that traditional methods, such as Petri Nets, cannot solve this new problem efficiently. After that, this paper uses an example of triage (filter) analysis from an IoT cyber security operations center. Filter analysis plays a significant role in IoT cyber operations. The overwhelming data flood is well beyond a cyber analyst's analytical reasoning. To make IoT data analysis more efficient, we propose a retrieval method based on deep learning (a recurrent neural network). In addition, this paper presents research on a data retrieval solution to avoid hacking by adversaries in the field of adversarial machine learning. It further outlines new approaches for implementing this framework in IoT settings based on adversarial deep learning. 6
Quality Inference Based Task Assignment in Mobile Crowdsensing Gao, XF; Huang, HW; Liu, CL; Wu, F; Chen, GH With the increase of mobile devices, Mobile Crowdsensing (MCS) has become an efficient way to ubiquitously sense and collect environmental data. Compared to traditional sensor networks, MCS has a vital advantage in that workers play an active role in collecting and sensing data. However, due to the openness of MCS, workers and sensors are of different qualities. Low-quality sensors and workers may yield noisy or even inaccurate data. This highlights the importance of inferring the quality of workers and sensors and of seeking a valid task assignment with sufficient total quality for MCS. To solve the problem, we adopt truth inference methods to iteratively infer the truth and qualities. Based on the quality inference, this paper proposes a task assignment problem called quality-bounded task assignment with redundancy constraint (QTAR). Different from the traditional task assignment problem, a redundancy constraint is added to satisfy the prerequisites of truth inference, which requires that each task be assigned a certain number of workers or more. We prove that QTAR is NP-complete and propose a (2 + epsilon)-approximation algorithm for QTAR, called QTA. Finally, experiments are conducted on both synthetic data and a real dataset. The results of the experiments demonstrate the efficiency and effectiveness of our algorithms. 6
HydraList: A Scalable In-Memory Index Using Asynchronous Updates and Partial Replication Mathew, A; Min, C Increased capacity of main memory has led to the rise of in-memory databases. With disk access eliminated, efficiency of index structures has become critical for performance in these systems. An ideal index structure should exhibit high performance for a wide variety of workloads, be scalable, and efficient in handling large data sets. Unfortunately, our evaluation shows that most state-of-the-art index structures fail to meet these three goals. For an index to be performant with large data sets, it should ideally have time complexity independent of the key set size. To ensure scalability, critical sections should be minimized and synchronization mechanisms carefully designed to reduce cache coherence traffic. Moreover, complex memory hierarchy in servers makes data placement and memory access patterns important for high performance across all workload types. In this paper, we present HydraList, a new concurrent, scalable, and high performance in-memory index structure for massive multi-core machines. The key insight behind our design of HydraList is that an index structure can be divided into two components (search and data layers) which can be updated independently leading to lower synchronization overhead. By isolating the search layer, we are able to replicate it across NUMA nodes and reduce cache misses and remote memory accesses. As a result, our evaluation shows that HydraList outperforms other index structures especially in a variety of workloads and key types. 6
Measuring the impact of lexical and structural inconsistencies on developers' cognitive load during bug localization Fakhoury, S; Roy, D; Ma, Y; Arnaoudova, V; Adesope, O A large portion of the cost of any software lies in the time spent by developers in understanding a program's source code before any changes can be undertaken. Measuring program comprehension is not a trivial task. In fact, different studies use self-reported and various psycho-physiological measures as proxies. In this research, we propose a methodology using functional Near Infrared Spectroscopy (fNIRS) and eye tracking devices as an objective measure of program comprehension that allows researchers to conduct studies in environments close to real world settings, at identifier level of granularity. We validate our methodology and apply it to study the impact of lexical, structural, and readability issues on developers' cognitive load during bug localization tasks. Our study involves 25 undergraduate and graduate students and 21 metrics. Results show that the existence of lexical inconsistencies in the source code significantly increases the cognitive load experienced by participants not only on identifiers involved in the inconsistencies but also throughout the entire code snippet. We did not find statistical evidence that structural inconsistencies increase the average cognitive load that participants experience, however, both types of inconsistencies result in lower performance in terms of time and success rate. Finally, we observe that self-reported task difficulty, cognitive load, and fixation duration do not correlate and appear to be measuring different aspects of task difficulty. 6
Mapping Sentiments to Themes of Customer Reactions on Social Media during a Security Hack: A Justice Theory Perspective Ivaturi, K; Bhagwatwar, A As social media continues to transform firm-customer interactions, firms must leverage customer reactions to generate actionable insights, especially in contexts (e.g., crisis events) where customer reactions are critical. Using justice theory, we categorize customer reactions of two firms, Home Depot and Target, during the time-frame of a security hack to understand key themes/topics. We then map the themes/topics to customer sentiments in those reactions. We found that customers associate justice more with simple procedures than with the experience of dealing with the firm. In addition, it is critical for firms to carefully assess and control customer sentiments on social media during crisis events. 6
SMPLpix: Neural Avatars from 3D Human Models Prokudin, S; Black, MJ; Romero, J Recent advances in deep generative models have led to an unprecedented level of realism for synthetically generated images of humans. However, one of the remaining fundamental limitations of these models is the ability to flexibly control the generative process, e.g. change the camera and human pose while retaining the subject identity. At the same time, deformable human body models like SMPL [34] and its successors provide full control over pose and shape, but rely on classic computer graphics pipelines for rendering. Such rendering pipelines require explicit mesh rasterization that (a) does not have the potential to fix artifacts or lack of realism in the original 3D geometry and (b) until recently, were not fully incorporated into deep learning frameworks. In this work, we propose to bridge the gap between classic geometry-based rendering and the latest generative networks operating in pixel space. We train a network that directly converts a sparse set of 3D mesh vertices into photorealistic images, alleviating the need for traditional rasterization mechanism. We train our model on a large corpus of human 3D models and corresponding real photos, and show the advantage over conventional differentiable renderers both in terms of the level of photorealism and rendering efficiency. 6
Modeling highly imbalanced crash severity data by ensemble methods and global sensitivity analysis Jiang, LM; Xie, YC; Wen, X; Ren, TZ Crash severity has been extensively studied and numerous methods have been developed for investigating the relationship between crash outcome and explanatory variables. Crash severity data are often characterized by highly imbalanced severity distributions, with most crashes in the Property-Damage-Only (PDO) category and the severe crash category making up only a fraction of the total observations. Many methods perform better on outcome categories with the most observations than other categories. This often leads to a high modeling accuracy for PDO crashes but poor accuracies for other severity categories. This research introduces two ensemble methods to model imbalanced crash severity data: AdaBoost and Gradient Boosting. It also adopts a more reasonable performance metric, F1 score, for model selection. It is found that AdaBoost and Gradient Boosting outperform other benchmark methods and generate more balanced prediction accuracies. Additionally, a global sensitivity analysis is adopted to determine the individual and joint impacts of explanatory factors on crash severity outcome. Vertical curve, seat belt use, accident type, road characteristics, and truck percentage are found to be the most influential factors. Finally, a simulation-based approach is used to further study how the impact of a particular factor may vary with respect to different value ranges. 6
Benchmarking liquidity proxies: The case of EU sovereign bonds Langedijk, S; Monokroussos, G; Papanagiotou, E We examine effective measures of liquidity in the context of EU sovereign bonds and the Basel III regulatory framework. We observe that the empirical correlations between benchmarks and proxies are typically very low and in general become weaker as the frequency over which these relationships are examined becomes higher, and that the relative strength of the various proxies may change with the frequency considered. The main implications of our results for the EU sovereign bond market are (i) the use of liquidity proxies may lead to erroneous conclusions; (ii) any liquidity measure needs to be assessed against the relevant timeframe for conversion into cash; and (iii) the end-of-day spread is the best performing proxy across different frequencies. 6
Multi-Stream End-to-End Speech Recognition Li, RZ; Wang, XF; Mallidi, SH; Watanabe, S; Hori, T; Hermansky, H Attention-based methods and Connectionist Temporal Classification (CTC) networks have been promising research directions for end-to-end (E2E) Automatic Speech Recognition (ASR). The joint CTC/Attention model has achieved great success by utilizing both architectures during multi-task training and joint decoding. In this article, we present a multi-stream framework based on joint CTC/Attention E2E ASR with parallel streams represented by separate encoders aiming to capture diverse information. On top of the regular attention networks, the Hierarchical Attention Network (HAN) is introduced to steer the decoder toward the most informative encoders. A separate CTC network is assigned to each stream to force monotonic alignments. Two representative frameworks have been proposed and discussed: the Multi-Encoder Multi-Resolution (MEM-Res) framework and the Multi-Encoder Multi-Array (MEM-Array) framework. In the MEM-Res framework, two heterogeneous encoders with different architectures, temporal resolutions and separate CTC networks work in parallel to extract complementary information from the same acoustics. Experiments are conducted on Wall Street Journal (WSJ) and CHiME-4, resulting in relative Word Error Rate (WER) reductions of 18.0-32.1% and a best WER of 3.6% on the WSJ eval92 test set. The MEM-Array framework aims at improving far-field ASR robustness using multiple microphone arrays which are activated by separate encoders. Compared with the best single-array results, the proposed framework achieves relative WER reductions of 3.7% and 9.7% on the AMI and DIRHA multi-array corpora, respectively, which also outperforms conventional fusion strategies. 6
Sequential-based Adversarial Optimisation for Personalised Top-N Item Recommendation Manotumruksa, J; Yilmaz, E Personalised top-N item recommendation systems aim to generate a ranked list of interesting items to users based on their interactions (e.g. click, purchase and rating). Recently, various sequential-based factorised approaches have been proposed to exploit deep neural networks to effectively capture the users' dynamic preferences from their sequences of interactions. These factorised approaches usually rely on a pairwise ranking objective such as the Bayesian Personalised Ranking (BPR) for optimisation. However, previous works have shown that optimising factorised approaches with BPR can hinder the generalisation, which can degrade the quality of item recommendations. To address this challenge, we propose a Sequential-based Adversarial Optimisation (SAO) framework that effectively enhances the generalisation of sequential-based factorised approaches. Comprehensive experiments on six public datasets demonstrate the effectiveness of the SAO framework in enhancing the performance of the state-of-the-art sequential-based factorised approach in terms of NDCG by 3-14%. 6
Interdiction models for delaying adversarial attacks against critical information technology infrastructure Zheng, KY; Albert, LA Information technology (IT) infrastructure relies on a globalized supply chain that is vulnerable to numerous risks from adversarial attacks. It is important to protect IT infrastructure from these dynamic, persistent risks by delaying adversarial exploits. In this paper, we propose max-min interdiction models for critical infrastructure protection that prioritizes cost-effective security mitigations to maximally delay adversarial attacks. We consider attacks originating from multiple adversaries, each of which aims to find a critical path through the attack surface to complete the corresponding attack as soon as possible. Decision-makers can deploy mitigations to delay attack exploits, however, mitigation effectiveness is sometimes uncertain. We propose a stochastic model variant to address this uncertainty by incorporating random delay times. The proposed models can be reformulated as a nested max-max problem using dualization. We propose a Lagrangian heuristic approach that decomposes the max-max problem into a number of smaller subproblems, and updates upper and lower bounds to the original problem via subgradient optimization. We evaluate the perfect information solution value as an alternative method for updating the upper bound. Computational results demonstrate that the Lagrangian heuristic identifies near-optimal solutions efficiently, which outperforms a general purpose mixed-integer programming solver on medium and large instances. 6
IRIS-HiSA: Highly Scalable and Available Carrier-Grade SDN Controller Cluster Shin, J; Kim, T; Lee, B; Yang, S As software defined networking (SDN) extends its applications to carrier-grade networks, the need for high scalability and availability of an SDN controller is becoming increasingly important. Although existing works have shown the feasibility of a distributed controller, the switches in the data plane are required to know some of the internal specifics such as the IP addresses of the individual controller instances. This constraint increases the operational complexity as the number of controller instances increases. In this paper, we propose a distributed controller cluster architecture called IRIS-HiSA. The main goal is to support seamless load balancing and failover with horizontal scalability, as is done in existing works, but one of the distinguishing features of IRIS-HiSA is to provide transparency to the switches in the data plane. Thus, the switches do not need to know the internal details of the controller cluster, and they simply access it in the same way a single controller is accessed. In addition to providing seamless load balancing and failover, a performance evaluation is conducted to analyze the high scalability, in which the throughput of flow setup increases proportionally with the number of controller instances. 6
Feasibility of Image Registration for Ultrasound-Guided Prostate Radiotherapy Based on Similarity Measurement by a Convolutional Neural Network Zhu, N; Najafi, M; Han, B; Hancock, S; Hristov, D Purpose: Registration of 3-dimensional ultrasound images poses a challenge for ultrasound-guided radiation therapy of the prostate since ultrasound image content changes significantly with anatomic motion and ultrasound probe position. The purpose of this work is to investigate the feasibility of using a pretrained deep convolutional neural network for similarity measurement in image registration of 3-dimensional transperineal ultrasound prostate images. Methods: We propose convolutional neural network-based registration that maximizes a similarity score between two identically sized 3-dimensional regions of interest: one encompassing the prostate within a simulation (reference) 3-dimensional ultrasound image and another that sweeps different spatial locations around the expected prostate position within a pretreatment 3-dimensional ultrasound image. The similarity score is calculated by (1) extracting pairs of corresponding 2-dimensional slices (patches) from the regions of interest, (2) providing these pairs as an input to a pretrained convolutional neural network which assigns a similarity score to each pair, and (3) calculating an overall similarity by summing all pairwise scores. The convolutional neural network method was evaluated against ground truth registrations determined by matching implanted fiducial markers visualized in a pretreatment orthogonal pair of x-ray images. The convolutional neural network method was further compared to manual registration and a standard, commonly used intensity-based automatic registration approach based on advanced normalized correlation. Results: For 83 image pairs from 5 patients, convolutional neural network registration errors were smaller than 5 mm in 81% of the cases. In comparison, manual registration errors were smaller than 5 mm in 61% of the cases and advanced normalized correlation registration errors were smaller than 5 mm in only 25% of the cases. Conclusion: Evaluation of the convolutional neural network against manual registration and an advanced normalized correlation-based registration demonstrated better accuracy and reliability of the convolutional neural network. This suggests that with training on a large data set of transperineal ultrasound prostate images, the convolutional neural network method has potential for robust ultrasound-to-ultrasound registration. 6
The Minor fall, the Major lift: inferring emotional valence of musical chords through lyrics Kolchinsky, A; Dhande, N; Park, K; Ahn, YY We investigate the association between musical chords and lyrics by analysing a large dataset of user-contributed guitar tablatures. Motivated by the idea that the emotional content of chords is reflected in the words used in corresponding lyrics, we analyse associations between lyrics and chord categories. We also examine the usage patterns of chords and lyrics in different musical genres, historical eras and geographical regions. Our overall results confirm a previously known association between Major chords and positive valence. We also report a wide variation in this association across regions, genres and eras. Our results suggest possible existence of different emotional associations for other types of chords. 6
How Twitter Can Support the HIV/AIDS Response to Achieve the 2030 Eradication Goal: In-Depth Thematic Analysis of World AIDS Day Tweets Odlum, M; Yoon, S; Broadwell, P; Brewer, R; Kuang, D Background: HIV/AIDS is a tremendous public health crisis, with a call for its eradication by 2030. A human rights response through civil society engagement is critical to support and sustain HIV eradication efforts. However, ongoing civil engagement is a challenge. Objective: This study aimed to demonstrate the use of Twitter data to assess public sentiment in support of civil society engagement. Methods: Tweets were collected during World AIDS Days 2014 and 2015. A total of 39,940 unique tweets (>10 billion users) in 2014 and 78,215 unique tweets (>33 billion users) in 2015 were analyzed. Response frequencies were aggregated using natural language processing. Hierarchical rank-2 nonnegative matrix factorization algorithm generated a hierarchy of tweets into binary trees. Tweet hierarchy clusters were thematically organized by the Joint United Nations Programme on HIV/AIDS core action principles and categorized under HIV/AIDS Prevention, Treatment or Care, or Support. Results: Topics tweeted 35 times or more were visualized. Results show a decrease in 2015 in the frequency of tweets associated with the fight to end HIV/AIDS, the recognition of women, and to achieve an AIDS-free generation. Moreover, an increase in tweets was associated with an integrative approach to the HIV/AIDS response. Hierarchical thematic differences in 2015 included no prevention discussion and the recognition of the pandemic's impact and discrimination. In addition, a decrease was observed in motivation to fast track the pandemic's end and combat HIV/AIDS. Conclusions: The human rights-based response to HIV/AIDS eradication is critical. Findings demonstrate the usefulness of Twitter as a low-cost method to assess public sentiment for enhanced knowledge, increased hope, and revitalized expectations for HIV/AIDS eradication. 6
Joint Entropy-Assisted Graphene Oxide-Based Multiplexing Biosensing Platform for Simultaneous Detection of Multiple Proteases Zhang, YW; Chen, XH; Yuan, SQ; Wang, L; Guan, XY Due to the limited clinical utility of individual biomarkers, there is growing recognition of the need for combining multiple biomarkers as a panel to improve the accuracy and efficacy of disease diagnosis and prognosis. The conventional method to detect multiple analyte species is to construct a sensor array, which consists of an array of individual selective probes for different species. In this work, by using cancer biomarker matrix metalloproteinases (MMPs) and a disintegrin and metalloproteinases (ADAMs) as model analytes and functionalized nanographene oxide (nGO) as a sensing element, we developed a multiplexing fluorescence sensor in a nonarray format for simultaneous measurement of the activities of multiple proteases. The constructed nGO-based biosensor was rapid, sensitive, and selective and was also utilized for the successful profiling of ADAMs/MMPs in simulated serum samples. Furthermore, we showed that joint entropy and programming could be utilized to guide experiment design, especially in terms of the selection of a subset of proteases from the entire MMPs/ADAMs family as an appropriate biomarker panel. Our developed nGO-based multiplex sensing platform should find useful application in early cancer detection and diagnosis. 6
Sequentially adaptive Bayesian learning algorithms for inference and optimization Geweke, J; Durham, G The sequentially adaptive Bayesian learning algorithm (SABL) builds on and ties together ideas from sequential Monte Carlo and simulated annealing. The algorithm can be used to simulate from Bayesian posterior distributions, using either data tempering or power tempering, or for optimization. A key feature of SABL is that the introduction of information is adaptive and controlled, ensuring that the algorithm performs reliably and efficiently in a wide variety of applications with off-the-shelf settings, minimizing the need for tedious tuning, tinkering, trial and error by users. The algorithm is pleasingly parallel, and a Matlab toolbox implementing the algorithm is able to make efficient use of massively parallel computing environments such as graphics processing units (GPUs) with minimal user effort. This paper describes the algorithm, provides theoretical foundations, applies the algorithm to Bayesian inference and optimization problems illustrating key properties of its operation, and briefly describes the open source software implementation. (C) 2018 Elsevier B.V. All rights reserved. 6
UPS: Unified PMU-Data Storage System to Enhance TD PMU Data Usability Kosen, I; Huang, C; Chen, Z; Zhang, XC; Min, L; Zhou, D; Zhu, L; Liu, YL The emerging distribution-level phasor measurement unit (D-PMU) is expected to play an important role in enhancing distribution system observability and situational-awareness. It is a demanding yet challenging task to develop advanced D-PMU data management and analytics tools to improve D-PMU data usability and further promote D-PMU projects deployment. This paper focuses on D-PMU data processing and storage. It presents a brief review of existing D-PMU data storage systems and points out their limitations on high-performance, flexibility, and scalability. To overcome the limitations, a unified PMU-data storage system (UPS) is proposed. Specifically, a unified I/O interface between storage servers and computing jobs is developed to effectively reduce the overhead of managing various computing jobs and data analytics over multiple storage infrastructures; and PMUCache with PMUCache partitioning and PMUCache replacement algorithms are designed to support in-situ data processing and shared distributed data storage and further serve the computing jobs/queries with different quality of service (QoS) requirements. Through a series of experiments, it is demonstrated that UPS achieves high performance on fast and big data processing and storage, and efficiently increases the flexibility and scalability of PMU data management systems. 6
Providing a Single Ground-Truth for Illuminant Estimation for the ColorChecker Dataset Hemrit, G; Finlayson, GD; Gijsenij, A; Gehler, P; Bianco, S; Drew, MS; Funt, B; Shi, LL The ColorChecker dataset is one of the most widely used image sets for evaluating and ranking illuminant estimation algorithms. However, this single set of images has at least 3 different sets of ground-truth (i.e., correct answers) associated with it. In the literature it is often asserted that one algorithm is better than another when the algorithms in question have been tuned and tested with the different ground-truths. In this short correspondence we present some of the background as to why the 3 existing ground-truths are different and go on to make a new single and recommended set of correct answers. Experiments reinforce the importance of this work in that we show that the total ordering of a set of algorithms may be reversed depending on whether we use the new or legacy ground-truth data. 6
Tuning Materials-Binding Peptide Sequences toward Gold- and Silver-Binding Selectivity with Bayesian Optimization Hughes, ZE; Nguyen, MA; Wang, JL; Liu, Y; Swihart, MT; Poloczek, M; Frazier, PI; Knecht, MR; Walsh, TR Peptide sequence engineering can potentially deliver materials-selective binding capabilities, which would be highly attractive in numerous biotic and abiotic nanomaterials applications. However, the number of known materials-selective peptide sequences is small, and identification of new sequences is laborious and haphazard. Previous attempts have sought to use machine learning and other informatics approaches that rely on existing data sets to accelerate the discovery of materials-selective peptides, but too few materials-selective sequences are known to enable reliable prediction. Moreover, this knowledge base is expensive to expand. Here, we combine a comprehensive and integrated experimental and modeling effort and introduce a Bayesian Effective Search for Optimal Sequences (BESOS) approach to address this challenge. Through this combined approach, we significantly expand the data set of Au-selective peptide sequences and identify an additional Ag-selective peptide sequence. Analysis of the binding motifs for the Ag-binders offers a roadmap for future prediction with machine learning, which should guide identification of further Ag-selective sequences. These discoveries will enable wider and more versatile integration of Ag nanoparticles in biological platforms. 6
AIM 2019 Challenge on Video Temporal Super-Resolution: Methods and Results Nah, S; Son, S; Timofte, R; Lee, KM; Li, SY; Pan, Z; Xu, XY; Sun, WX; Choi, M; Kim, H; Han, B; Xu, N; Park, B; Yu, S; Kim, S; Jeong, J; Shen, W; Bao, WB; Zhai, GT; Chen, L; Gao, ZY; Chen, GN; Lu, YH; Duan, R; Liu, T; Zhang, LJ; Park, W; Kim, M; Pisha, G; Naor, E; Aloni, L Videos contain various types and strengths of motions that may look unnaturally discontinuous in time when the recorded frame rate is low. This paper reviews the first AIM challenge on video temporal super-resolution (frame interpolation) with a focus on the proposed solutions and results. From low-frame-rate (15 fps) video sequences, the challenge participants are asked to submit higher-frame-rate (60 fps) video sequences by estimating temporally intermediate frames. We employ the REDS VTSR dataset derived from diverse videos captured in a hand-held camera for training and evaluation purposes. The competition had 62 registered participants, and a total of 8 teams competed in the final testing phase. The challenge winning methods achieve the state-of-the-art in video temporal super-resolution. 6
BLOCK PUBLIC ACCESS: Trust Safety Verification of Access Control Policies Bouchet, M; Cook, B; Cutler, B; Druzkina, A; Gacek, A; Hadarean, L; Jhala, R; Marshall, B; Peebles, D; Rungta, N; Schlesinger, C; Stephens, C; Varming, C; Warfield, A Data stored in cloud services is highly sensitive and so access to it is controlled via policies written in domain-specific languages (DSLs). The expressiveness of these DSLs provides users flexibility to cover a wide variety of use cases; however, unintended misconfigurations can lead to potential security issues. We introduce BLOCK PUBLIC ACCESS, a tool that formally verifies policies to ensure that they only allow access to trusted principals, i.e. that they prohibit access to the general public. To this end, we formalize the notion of Trust Safety that formally characterizes whether or not a policy allows unconstrained (public) access. Next, we present a method to compile the policy down to a logical formula whose unsatisfiability can be (1) checked by SMT and (2) ensures Trust Safety. The constructs of the policy DSLs render unsatisfiability checking PSPACE-complete, which precludes verifying the millions of requests per second seen at cloud scale. Hence, we present an approach that leverages the structure of the policy DSL to compute a much smaller residual policy that corresponds only to untrusted accesses. Our approach allows BLOCK PUBLIC ACCESS to, in the common case, syntactically verify Trust Safety without having to query the SMT solver. We have implemented BLOCK PUBLIC ACCESS and present an evaluation showing how the above optimization yields a low-latency policy verifier that the S3 team at Amazon Web Services has integrated into their authorization system, where it is currently in production, analyzing millions of policies every day to ensure that client buckets do not grant unintended public access. 5
Cloud Reliability Izrailevsky, Y; Bell, C Cloud computing allows engineers to rapidly develop complex systems and deploy them continuously, at global scale. This can create unique reliability risks. Cloud infrastructure providers are constantly developing new ideas and incorporating them into their products and services in order to increase the reliability of their platforms. Cloud application developers can also employ a number of architectural techniques and operational best practices to further boost the availability of their systems. 5
Differentially Private Real-Time Streaming Data Publication Based on Sliding Window Under Exponential Decay Sun, L; Ge, C; Huang, X; Wu, YJ; Gao, Y Continuous response of range queries on streaming data provides useful information for many practical applications as well as the risk of privacy disclosure. The existing research on differentially private streaming data publication mostly pays close attention to boosting query accuracy, but pays less attention to query efficiency, and ignores the effect of timeliness on data weight. In this paper, we propose an effective algorithm for differentially private streaming data publication under an exponential decay mode. Firstly, by introducing the Fenwick tree to divide and reorganize data items in the stream, we achieve constant time complexity for inserting a new item and getting the prefix sum, and time complexity linear in the number of data items for building a tree. After that, we use the advantage of the matrix mechanism to deal with relevant queries and reduce the global sensitivity. In addition, we choose a proper diagonal matrix to further improve the range query accuracy. Finally, considering exponential decay, every data item is weighted by the decay factor. By putting the Fenwick tree and matrix optimization together, we present a complete algorithm for differentially private real-time streaming data publication. The experiments compare the algorithm in this paper with similar algorithms for streaming data release under exponential decay. Experimental results show that the algorithm in this paper effectively improves query efficiency while ensuring the quality of the query. 5
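The Fenwick-tree building block used in the paper above can be made concrete with a short sketch. The following Python fragment is illustrative only and is not the authors' algorithm: it maintains a prefix-sum structure over a stream with logarithmic-time updates, weights items by a hypothetical exponential decay factor, and adds Laplace noise to a range query with deliberately simplified noise calibration (the `decay` and `epsilon` values are assumptions).

```python
import numpy as np


class Fenwick:
    """Binary indexed tree: O(log n) point update and prefix sum."""

    def __init__(self, n):
        self.n = n
        self.tree = [0.0] * (n + 1)

    def add(self, i, value):          # 1-based position in the stream
        while i <= self.n:
            self.tree[i] += value
            i += i & (-i)

    def prefix(self, i):              # sum of positions 1..i
        total = 0.0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)
        return total


rng = np.random.default_rng(0)
n, decay, epsilon = 1000, 0.999, 1.0    # hypothetical stream length and parameters
ft = Fenwick(n)
for t in range(1, n + 1):
    x = float(rng.integers(0, 2))       # one 0/1 item per time step
    ft.add(t, x * decay ** (n - t))     # older items receive smaller weight

# Noisy answer to a range query over positions [l, r]; noise scale is simplified.
l, r = 101, 600
true_sum = ft.prefix(r) - ft.prefix(l - 1)
noisy_sum = true_sum + rng.laplace(0.0, 1.0 / epsilon)
print(f"true={true_sum:.2f}  noisy={noisy_sum:.2f}")
```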
TesseTrack: End-to-End Learnable Multi-Person Articulated 3D Pose Tracking Reddy, ND; Guigues, L; Pishchulin, L; Eledath, J; Narasimhan, SG We consider the task of 3D pose estimation and tracking of multiple people seen in an arbitrary number of camera feeds. We propose TesseTrack, a novel top-down approach that simultaneously reasons about multiple individuals' 3D body joint reconstructions and associations in space and time in a single end-to-end learnable framework. At the core of our approach is a novel spatio-temporal formulation that operates in a common voxelized feature space aggregated from single or multiple camera views. After a person detection step, a 4D CNN produces short-term person-specific representations which are then linked across time by a differentiable matcher. The linked descriptions are then merged and deconvolved into 3D poses. This joint spatio-temporal formulation contrasts with previous piecewise strategies that treat 2D pose estimation, 2D-to-3D lifting, and 3D pose tracking as independent sub-problems that are error-prone when solved in isolation. Furthermore, unlike previous methods, TesseTrack is robust to changes in the number of camera views and achieves very good results even if a single view is available at inference time. Quantitative evaluation of 3D pose reconstruction accuracy on standard benchmarks shows significant improvements over the state of the art. Evaluation of multi-person articulated 3D pose tracking in our novel evaluation framework demonstrates the superiority of TesseTrack over strong baselines. 5
Hybrid coexpression link similarity graph clustering for mining biological modules from multiple gene expression datasets Salem, S; Ozcaglar, C Background: Advances in genomic technologies have enabled the accumulation of a vast amount of genomic data, including gene expression data for multiple species under various biological and environmental conditions. Integration of these gene expression datasets is a promising strategy to alleviate the challenges of protein functional annotation and biological module discovery based on a single gene expression dataset, which suffers from spurious coexpression. Results: We propose a joint mining algorithm that constructs a weighted hybrid similarity graph whose nodes are the coexpression links. The weight of an edge between two coexpression links in this hybrid graph is a linear combination of the topological similarities and co-appearance similarities of the corresponding two coexpression links. Clustering the weighted hybrid similarity graph yields recurrent coexpression link clusters (modules). Experimental results on human gene expression datasets show that the reported modules are functionally homogeneous as evident by their enrichment with biological process GO terms and KEGG pathways. 5
The Role of Attributes in Product Quality Comparisons Moraes, F; Yang, J; Zhang, RT; Murdock, V In online shopping, quality is a key consideration when purchasing an item. Since customers cannot physically touch or try out an item before buying it, they must assess its quality from information gathered online. In a typical eCommerce setting, the customer is presented with seller-generated content from the product catalog, such as an image of the product, a textual description, and lists or comparisons of attributes. In addition to catalog attributes, customers often have access to customer-generated content such as reviews and product questions and answers. In a crowdsourced study, we asked crowd workers to compare product pairs from kitchen, electronics, home, beauty and office categories. In a side-by-side comparison, we asked them to choose the product that is higher quality, and further to identify the attributes that contributed to their judgment, where the attributes were both seller-generated and customer-generated. We find that customers tend to perceive more expensive items as higher quality but that their purchase decisions are uncorrelated with quality, suggesting that customers seek a trade-off between price and quality when making purchase decisions. Crowd workers placed a higher value on attributes derived from customer-generated content such as reviews than on catalog attributes. Among the catalog attributes, brand, item material and pack size were most often selected. Finally, attributes with a low correlation with perceived quality are nonetheless useful in predicting purchases in a machine-learned system. 5
LeChatelier-Samuelson principle in games and pass-through of shocks Alexandrov, A; Bedre-Defolie, O The LeChatelier-Samuelson principle states that, as a reaction to a shock, an agent's short-run adjustment of an affected action is smaller than its long-run adjustment (when the agent can also adjust other related actions). We extend the principle to strategic environments where the long-run adjustment also accounts for other players adjusting their strategies. We show that the principle holds for supermodular games (strategic complements) satisfying monotone comparative statics and provide sufficient conditions for the principle to hold in games of strategic substitutes/heterogeneity. We discuss the principle's implications for cost pass-through of multiproduct firms. (C) 2016 Elsevier Inc. All rights reserved. 5
Pricing in non-convex markets with quadratic deliverability costs Kuang, XL; Lamadrid, AJ; Zuluaga, LF This article studies the problem of obtaining equilibrium clearing prices for markets with non-convexities when it is relevant to account for convex quadratic deliverability costs and constraints. In a general market, such a situation arises when quadratic commodity or transactions costs are relevant. In the particular case of electricity markets, there is a mix of resources including dispatchable and renewable energy sources, leading to the presence of integer variables and quadratic costs reflecting ramping needs. To illustrate our results, we compute and analyze the equilibrium clearing prices of Scarf's classical market problem with the addition of ramping costs. (C) 2019 Elsevier B.V. All rights reserved. 5
Selfie: reflections on TLS 1.3 with PSK Drucker, N; Gueron, S TLS 1.3 allows two parties to establish a shared session key from an out-of-band agreed pre-shared key (PSK). The PSK is used to mutually authenticate the parties, under the assumption that it is not shared with others. This allows the parties to skip the certificate verification steps, saving bandwidth, communication rounds, and latency. In this paper, we identify a vulnerability in this specific TLS 1.3 option by showing a new reflection attack that we call Selfie. This attack uses the fact that TLS does not mandate explicit authentication of the server and the client, and leverages it to break the protocol's mutual authentication property. We explain the root cause of this TLS 1.3 vulnerability, provide a fully detailed demonstration of a Selfie attack using the TLS implementation of OpenSSL, and propose mitigation. The Selfie attack is the first attack on TLS 1.3 after its official release in 2018. It is surprising because it uncovers an interesting gap in the existing TLS 1.3 models that the security proofs rely on. We explain the gap in these model assumptions and show how it affects the proofs in this case. 5
RESLING: a scalable and generic framework to mine top-k representative subgraph patterns Natarajan, D; Ranu, S Mining subgraph patterns is an active area of research due to its wide-ranging applications. Examples include frequent subgraph mining, discriminative subgraph mining, and statistically significant subgraph mining. Existing research has primarily focused on mining all subgraph patterns in the database. However, due to the exponential subgraph search space, the number of patterns mined is typically too large for any human-mediated analysis. Consequently, deriving insights from the mined patterns is hard for domain scientists. In addition, subgraph pattern mining is posed in multiple forms: the function that models if a subgraph is a pattern varies based on the application, and the database could be over multiple graphs or a single, large graph. In this paper, we ask the following question: Given a subgraph importance function and a budget k, which are the k subgraph patterns that best represent all other patterns of interest? We show that the problem is NP-hard, and propose a generic framework called Resling that adapts to arbitrary subgraph importance functions and is generalizable to both transactional graph databases and single, large graphs. Resling derives its power by structuring the search space in the form of an edit map, where each subgraph is a node, and two subgraphs are connected if they have an edit distance of one. We rank nodes in the edit map through two random walk based algorithms: vertex-reinforced random walks (Resling-VR) and negative-reinforced random walks (Resling-NR). Experiments show that Resling-VR is up to 20 times more representative of the pattern space and two orders of magnitude faster than the state-of-the-art techniques. Resling-NR further improves the running time while maintaining comparable or better performance in representative power. 5
Closed-loop identification for plants under model predictive control Esmaili, A; Li, JY; Xie, JY; Isom, JD Model predictive controllers incorporate step response models for pairings of independent and dependent variables. Motivated by the fact that it may be time-consuming to conduct open-loop experiments to identify the step response models, the paper assesses the performance of closed-loop system identification on MPC-equipped plants, using both simulated and actual plant data. Pure feedback closed-loop system identification is shown to be effective for an identifiable simulated system and an industrial hydrogen production plant. The use of closed-loop system identification as a mechanism for monitoring model quality in MPC implementations may enhance the long-term sustainability of the implementation. (C) 2018 Elsevier Ltd. All rights reserved. 5
Market pricing with single-generator-failure security constraints Li, C; Hedman, KW; Zhang, MH Regional transmission organisations and independent system operators include different types of security requirements to approximate system security issues. Transmission line contingencies are well handled in state-of-the-art market models with line outage distribution factors and, at the same time, the impacts of transmission line contingencies are reflected in energy prices. However, there is a lack of efficient mechanisms to handle generator contingencies and reflect the impacts of generator contingencies on energy prices. In this study, a set of security constraints to withstand single-generator-failure contingencies are presented and the market implications are studied. A new component of locational marginal prices, a marginal security component, which is a weighted shadow price of the security constraints, is proposed to better represent energy prices. A 3-bus system example is given to illustrate the market implications. The results are confirmed on a 73-bus system test case. 5
Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web Dong, XL; Hajishirzi, H; Lockard, C; Shiralkar, P How do we surface the large amount of information present in HTML documents on the Web, from news articles to Rotten Tomatoes pages to tables of sports scores? Such information can enable a variety of applications including knowledge base construction, question answering, recommendation, and more. In this tutorial, we present approaches for information extraction (IE) from Web data that can be differentiated along two key dimensions: 1) the diversity in data modality that is leveraged, e.g. text, visual, XML/HTML, and 2) the thrust to develop scalable approaches with zero to limited human supervision. 5
Bayesian networks for supporting query processing over incomplete autonomous databases Raghunathan, R; De, S; Kambhampati, S As the information available to naïve users through autonomous data sources continues to increase, mediators become important to ensure that the wealth of information available is tapped effectively. A key challenge that these information mediators need to handle is the varying levels of incompleteness in the underlying databases in terms of missing attribute values. Existing approaches such as QPIAD aim to mine and use Approximate Functional Dependencies (AFDs) to predict and retrieve relevant incomplete tuples. These approaches make independence assumptions about missing values, which critically hobbles their performance when there are tuples containing missing values for multiple correlated attributes. In this paper, we present a principled probabilistic alternative that views an incomplete tuple as defining a distribution over the complete tuples that it stands for. We learn this distribution in terms of Bayesian networks. Our approach involves mining/learning Bayesian networks from a sample of the database, and using it to do both imputation (predict a missing value) and query rewriting (retrieve relevant results with incompleteness on the query-constrained attributes, when the data sources are autonomous). We present empirical studies to demonstrate that (i) at higher levels of incompleteness, when multiple attribute values are missing, Bayesian networks do provide a significantly higher classification accuracy and (ii) the relevant possible answers retrieved by the queries reformulated using Bayesian networks provide higher precision and recall than AFDs while keeping query processing costs manageable. 5
Production and Inventory Control for a Make-to-Stock/Calibrate-to-Order System with Dedicated and Shared Resources Demirel, S; Duenyas, I; Kapuscinski, R Consider a firm that produces multiple products on dedicated production lines (stage 1), which are further customized/calibrated on a shared resource (stage 2), common to all products. The dedicated production lines and the shared resource for calibration face capacity uncertainties. The firm holds inventory of products that are not yet calibrated and carries out calibration when an order is received. We analyze a multiperiod inventory model for two products and derive the optimal production policy at stage 1 as well as the optimal allocation of the shared resource to demands at stage 2. For the shared resource, because of its capacity uncertainty, not only the total planned production quantities matter, but also the sequence in which the products are processed. We characterize the optimal allocation of the shared resource and show that the optimal policy keeps the ending inventories of products as close to a so-called target path as possible. For the dedicated production lines, because of their capacity uncertainty, the optimal production policy depends on the initial inventories. We identify and characterize the structural properties of the optimal production policy. Through a numerical study, we explore how the presence of finite shared capacity influences the inventory targets. We find that the behavior may be counterintuitive: when multiple products share a finite capacity in stage 2, the inventory target for the product having a larger dedicated production capacity or less capacity variability in stage 1 can be higher than the other product. We finally provide sensitivity analysis for the optimal policy and test the performance of simple heuristic policies. 5
A Scalable and Generic Framework to Mine Top-k Representative Subgraph Patterns Natarajan, D; Ranu, S Mining subgraph patterns is an active area of research. Existing research has primarily focused on mining all subgraph patterns in the database. However, due to the exponential subgraph search space, the number of patterns mined is typically too large for any human-mediated analysis. Consequently, deriving insights from the mined patterns is hard for domain scientists. In addition, subgraph pattern mining is posed in multiple forms: the function that models if a subgraph is a pattern varies based on the application, and the database could be over multiple graphs or a single, large graph. In this paper, we ask the following question: Given a subgraph importance function and a budget k, which are the k subgraph patterns that best represent all other patterns of interest? We show that the problem is NP-hard, and propose a generic framework called RESLING that adapts to arbitrary subgraph importance functions and is generalizable to both transactional graph databases and single, large graphs. Experiments show that RESLING is up to 20 times more representative of the pattern space and two orders of magnitude faster than the state-of-the-art techniques. 5
Scalable Distributed Service Integrity Attestation for Software-as-a-Service Clouds Du, J; Dean, DJ; Tan, YM; Gu, XH; Yu, T Software-as-a-service (SaaS) cloud systems enable application service providers to deliver their applications via massive cloud computing infrastructures. However, due to their sharing nature, SaaS clouds are vulnerable to malicious attacks. In this paper, we present IntTest, a scalable and effective service integrity attestation framework for SaaS clouds. IntTest provides a novel integrated attestation graph analysis scheme that can provide stronger attacker pinpointing power than previous schemes. Moreover, IntTest can automatically enhance result quality by replacing bad results produced by malicious attackers with good results produced by benign service providers. We have implemented a prototype of the IntTest system and tested it on a production cloud computing infrastructure using IBM System S stream processing applications. Our experimental results show that IntTest can achieve higher attacker pinpointing accuracy than existing approaches. IntTest does not require any special hardware or secure kernel support and imposes little performance impact to the application, which makes it practical for large-scale cloud systems. 5
You must have clicked on this ad by mistake! Data-driven identification of accidental clicks on mobile ads with applications to advertiser cost discounting and click-through rate prediction Tolomei, G; Lalmas, M; Farahat, A; Haines, A In the cost per click pricing model, an advertiser pays an ad network only when a user clicks on an ad; in turn, the ad network gives a share of that revenue to the publisher where the ad was impressed. Still, advertisers may be unsatisfied with ad networks charging them for valueless clicks, or so-called accidental clicks. These happen when users click on an ad, are redirected to the advertiser website and bounce back without spending any time on the ad landing page. Charging advertisers for such clicks is detrimental in the long term as the advertiser may decide to run their campaigns on other ad networks. In addition, machine-learned click models trained to predict which ad will bring the highest revenue may overestimate an ad's click-through rate, and as a consequence negatively impact revenue for both the ad network and the publisher. In this work, we propose a data-driven method to detect accidental clicks from the perspective of the ad network. We collect observations of the time spent by users on a large set of ad landing pages, i.e., dwell time. We notice that the majority of per-ad distributions of dwell time fit a mixture of distributions, where each component may correspond to a particular type of click, the first one being accidental. We then estimate dwell time thresholds of accidental clicks from that component. Using our method to identify accidental clicks, we then propose a technique that smoothly discounts the advertiser's cost of accidental clicks at billing time. Experiments conducted on a large dataset of ads served on Yahoo mobile apps confirm that our thresholds are stable over time, and revenue loss in the short term is marginal. We also compare the performance of an existing machine-learned click model trained on all ad clicks with that of the same model trained only on non-accidental clicks. There, we observe an increase in both ad click-through rate (+3.9%) and revenue (+0.2%) on ads served by the Yahoo Gemini network when using the latter. These two applications validate the need to consider accidental clicks both for billing advertisers and for training ad click models. 5
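The dwell-time mixture idea above can be sketched with standard tooling. The snippet below is a simplified illustration, not the paper's model: it fits a two-component Gaussian mixture to synthetic log dwell times with scikit-learn and reads a threshold off the lower-mean component; the data, the choice of two components, and the two-sigma cut-off are all assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic dwell times (seconds): a short "accidental" component and a longer
# "engaged" component. Real per-ad distributions would come from click logs.
rng = np.random.default_rng(0)
dwell = np.concatenate([
    rng.lognormal(mean=0.0, sigma=0.4, size=2000),   # accidental-like clicks
    rng.lognormal(mean=2.5, sigma=0.6, size=8000),   # engaged clicks
])

# Fit a two-component mixture on log dwell time (a simplifying assumption).
X = np.log(dwell).reshape(-1, 1)
gm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Treat the component with the smaller mean as "accidental" and take a high
# quantile of that component as the dwell-time threshold.
accidental = int(np.argmin(gm.means_.ravel()))
mu = gm.means_.ravel()[accidental]
sigma = float(np.sqrt(gm.covariances_.ravel()[accidental]))
threshold_seconds = float(np.exp(mu + 2 * sigma))   # roughly the 97.7th percentile
print(f"estimated accidental-click threshold: {threshold_seconds:.2f} s")
```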
Reinforcement Learning for the Adaptive Scheduling of Educational Activities Bassen, J; Balaji, B; Schaarschmidt, M; Thille, C; Painter, J; Zimmaro, D; Gamest, A; Fast, E; Mitchell, JC Adaptive instruction for online education can increase learning gains and decrease the work required of learners, instructors, and course designers. Reinforcement Learning (RL) is a promising tool for developing instructional policies, as RL models can learn complex relationships between course activities, learner actions, and educational outcomes. This paper demonstrates the first RL model to schedule educational activities in real time for a large online course through active learning. Our model learns to assign a sequence of course activities while maximizing learning gains and minimizing the number of items assigned. Using a controlled experiment with over 1,000 learners, we investigate how this scheduling policy affects learning gains, dropout rates, and qualitative learner feedback. We show that our model produces better learning gains using fewer educational activities than a linear assignment condition, and produces similar learning gains to a self-directed condition using fewer educational activities and with lower dropout rates. 5
QC-MDPC Decoders with Several Shades of Gray Drucker, N; Gueron, S; Kostic, D QC-MDPC code-based KEMs rely on decoders that have a small or even negligible Decoding Failure Rate (DFR). These decoders should be efficient and implementable in constant-time. One example for a QC-MDPC KEM is the Round-2 candidate of the NIST PQC standardization project, BIKE. We have recently shown that the Black-Gray decoder achieves the required properties. In this paper, we define several new variants of the Black-Gray decoder. One of them, called Black-Gray-Flip, needs only 7 steps to achieve a smaller DFR than Black-Gray with 9 steps, for the same block size. On current AVX512 platforms, our BIKE-1 (Level-1) constant-time decapsulation is 1.9x faster than the previous decapsulation with Black-Gray. We also report an additional 1.25x decapsulation speedup using the new AVX512-VBMI2 and vector-PCLMULQDQ instructions available on the Ice-Lake micro-architecture. 5
Can Adversarial Weight Perturbations Inject Neural Backdoors? Garg, S; Kumar, A; Goel, V; Liang, YY Adversarial machine learning has exposed several security hazards of neural models. Thus far, the concept of an adversarial perturbation has exclusively been used with reference to the input space, referring to a small, imperceptible change which can cause an ML model to err. In this work we extend the idea of adversarial perturbations to the space of model weights, specifically to inject backdoors in trained DNNs, which exposes a security risk of publicly available trained models. Here, injecting a backdoor refers to obtaining a desired outcome from the model when a trigger pattern is added to the input, while retaining the original predictions on a non-triggered input. From the perspective of an adversary, we characterize these adversarial perturbations to be constrained within an ℓ∞ norm around the original model weights. We introduce adversarial perturbations in model weights using a composite loss on the predictions of the original model and the desired trigger through projected gradient descent. Our results show that backdoors can be successfully injected with a very small average relative change in model weight values for several CV and NLP applications. 5
Two methods of data reconciliation for pipeline networks Isom, JD; Stamps, AT; Esmaili, A; Mancilla, C The paper compares the two most common methods of data reconciliation for pipeline networks. The first method, an unscented Kalman filter (UKF), uses a system of nonlinear implicit ordinary differential equations derived from the governing partial differential equations. The second method, a quadratic program, relies on a transformation of the system of nonlinear ordinary differential equations into a set of linear difference equations, with the linearization optimized for known pressure and flow ranges using a novel linearization technique. Both the UKF and the quadratic programming approaches for data reconciliation in gas pipeline networks are viable for networks of small to moderate size. Given the reduced number of simplifying assumptions and the resulting improved accuracy, the UKF may be preferable when the computational problem is tractable. The quadratic programming approach is faster, accepts lower fidelity models, and provides acceptable accuracy. (C) 2018 Elsevier Ltd. All rights reserved. 5
An efficient approach based on selective partitioning for maximal frequent itemsets mining Bai, A; Dhabu, M; Jagtap, V; Deshpande, PS We present a maximal frequent itemset (MFI) mining algorithm based on selective partitioning called SelPMiner. It makes use of a novel data format named Itemset-count tree-a compact and optimized representation in the form of partition that reduces memory requirement. It also does selective partitioning of the database, which reduces runtime to scan database. As the algorithm progressively searches for longer frequent itemsets in a depth-first manner, it creates new partitions with even smaller sizes having less dimensions and unique data instances, which results in faster support counting. SelPMiner uses a number of optimizations to prune the search space. We also prove upper bounds on the amount of memory consumed by these partitions. Experimental comparisons of the SelPMiner algorithm with popular existing fastest MFI mining algorithms on different types of datasets show significant speedup in computation time for many cases. SelPMiner works especially well when the minimum support is low and consumes less memory. 5
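To make the notion of a maximal frequent itemset concrete, here is a tiny brute-force illustration in Python; it does not reproduce SelPMiner's Itemset-count tree or selective partitioning, and the transaction database and minimum support below are made up.

```python
from itertools import combinations

# Toy transaction database; brute force is only feasible for tiny examples and
# serves solely to make the MFI definition concrete.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c", "d"},
    {"a", "c", "d"},
    {"b", "d"},
]
min_support = 2
items = sorted(set().union(*transactions))


def support(itemset):
    """Number of transactions containing every item of the itemset."""
    return sum(itemset <= t for t in transactions)


# All frequent itemsets (exponential enumeration over the item universe).
frequent = [frozenset(c)
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if support(set(c)) >= min_support]

# Maximal frequent itemsets: frequent sets with no frequent proper superset.
maximal = [f for f in frequent if not any(f < g for g in frequent)]
print(sorted(map(sorted, maximal)))   # e.g. [['a','b','c'], ['a','c','d'], ['b','d']]
```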
ACES: Automatic Configuration of Energy Harvesting Sensors with Reinforcement Learning Fraternali, F; Balaji, B; Agarwal, Y; Gupta, RK Many modern smart building applications are supported by wireless sensors to sense physical parameters, given the flexibility they offer and the reduced cost of deployment. However, most wireless sensors are powered by batteries today, and large deployments are inhibited by the requirement of periodic battery replacement. Energy harvesting sensors provide an attractive alternative, but they need to provide adequate quality of service to applications given the uncertainty of energy availability. We propose ACES, which uses reinforcement learning to maximize sensing quality of energy harvesting sensors for periodic and event-driven indoor sensing with available energy. Our custom-built sensor platform uses a supercapacitor to store energy and Bluetooth Low Energy to relay sensors data. Using simulations and real deployments, we use the data collected to continually adapt the sensing of each node to changing environmental patterns and transfer learning to reduce the training time in real deployments. In our 60-node deployment lasting 2 weeks, nodes stop operations only 0.1% of the time, and collection of data is comparable with current battery-powered nodes. We show that ACES reduces the node duty-cycle period by an average of 33% compared to three prior reinforcement learning techniques while continuously learning environmental changes over time. 5
A dual boundary classifier for predicting acute hypotensive episodes in critical care Bhattacharya, S; Huddar, V; Rajan, V; Reddy, CK An Acute Hypotensive Episode (AHE) is the sudden onset of a sustained period of low blood pressure and is one among the most critical conditions in Intensive Care Units (ICU). Without timely medical care, it can lead to an irreversible organ damage and death. By identifying patients at risk for AHE early, adequate medical intervention can save lives and improve patient outcomes. In this paper, we design a novel dual-boundary classification based approach for identifying patients at risk for AHE. Our algorithm uses only simple summary statistics of past Blood Pressure measurements and can be used in an online environment facilitating real-time updates and prediction. We perform extensive experiments with more than 4,500 patient records and demonstrate that our method outperforms the previous best approaches of AHE prediction. Our method can identify AHE patients two hours in advance of the onset, giving sufficient time for appropriate clinical intervention with nearly 80% sensitivity and at 95% specificity, thus having very few false positives. 5
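As an illustration of predicting from simple summary statistics of past blood pressure, the sketch below trains a plain logistic-regression classifier on synthetic windows; it is not the paper's dual-boundary method, and the feature set, window length, and simulated values are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic mean-arterial-pressure (MAP) windows; a real system would use the
# past measurements of each patient. Everything below is illustrative only.
rng = np.random.default_rng(1)


def summarize(window):
    """Simple summary statistics of a blood-pressure window."""
    slope = np.polyfit(np.arange(len(window)), window, 1)[0]
    return [window.mean(), window.std(), window.min(), slope]


def make_window(ahe):
    """Simulate 60 minutes of MAP; AHE-bound windows are lower and drifting down."""
    base = rng.normal(70 if ahe else 90, 5)
    drift = -0.3 if ahe else 0.0
    return base + drift * np.arange(60) + rng.normal(0, 3, 60)


labels = [1] * 200 + [0] * 200
X = np.array([summarize(make_window(ahe)) for ahe in labels])
y = np.array(labels)

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy:", clf.score(X, y))
```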
Knowledge Graph Semantic Enhancement of Input Data for Improving AI Bhatt, S; Sheth, A; Shalin, V; Zhao, JJ Intelligent systems designed using machine learning algorithms require a large number of labeled data. Background knowledge provides complementary, real-world factual information that can augment the limited labeled data to train a machine learning algorithm. The term Knowledge Graph (KG) is in vogue as for many practical applications, it is convenient and useful to organize this background knowledge in the form of a graph. Recent academic research and implemented industrial intelligent systems have shown promising performance for machine learning algorithms that combine training data with a knowledge graph. In this article, we discuss the use of relevant KGs to enhance the input data for two applications that use machine learning-recommendation and community detection. The KG improves both accuracy and explainability. 5
The Art and Craft of Fraudulent App Promotion in Google Play Rahman, M; Hernandez, N; Recabarren, R; Ahmed, SI; Carbunar, B Black Hat App Search Optimization (ASO) in the form of fake reviews and sockpuppet accounts, is prevalent in peer-opinion sites, e.g., app stores, with negative implications on the digital and real lives of their users. To detect and filter fraud, a growing body of research has provided insights into various aspects of fraud posting activities, and made assumptions about the working procedures of the fraudsters from online data. However, such assumptions often lack empirical evidence from the actual fraud perpetrators. To address this problem, in this paper, we present results of both a qualitative study with 18 ASO workers we recruited from 5 freelancing sites, concerning activities they performed on Google Play, and a quantitative investigation with fraud-related data collected from other 39 ASO workers. We reveal findings concerning various aspects of ASO worker capabilities and behaviors, including novel insights into their working patterns, and supporting evidence for several existing assumptions. Further, we found and report participant-revealed techniques to bypass Google-imposed verifications, concrete strategies to avoid detection, and even strategies that leverage fraud detection to enhance fraud efficacy. We report a Google site vulnerability that enabled us to infer the mobile device models used to post more than 198 million reviews in Google Play, including 9,942 fake reviews. We discuss the deeper implications of our findings, including their potential use to develop the next generation fraud detection and prevention systems. 5
Address Translation for Throughput-Oriented Accelerators Pichai, B; Hsu, L; Bhattacharjee, A With processor vendors embracing hardware heterogeneity, providing low-overhead hardware and software abstractions to support easy-to-use programming models is a critical problem. In this context, this work sets the foundation for designing memory management units (MMUs) for GPUs in CPU/GPU systems, the key mechanism necessary to support the increasingly important unified address space paradigm in heterogeneous systems. 5
Practical Access Pattern Privacy by Combining PIR and Oblivious Shuffle Zhang, ZL; Wang, K; Lin, WP; Fu, AWC; Wong, RCW We consider the following secure data retrieval problem: a client outsources encrypted data blocks to a semi-trusted cloud server and later retrieves blocks without disclosing access patterns. Existing PIR and ORAM solutions suffer from serious performance bottlenecks in terms of communication or computation costs. To help fill this void, we introduce access pattern unlinkability, which separates access pattern privacy into short-term privacy at the individual query level and long-term privacy at the query distribution level. This new security definition provides tunable trade-offs between privacy and query performance. We present an efficient construction, called the SBR protocol, using PIR and Oblivious Shuffling to enable secure data retrieval while satisfying access pattern unlinkability. Both analytical and empirical analyses show that SBR exhibits flexibility and usability in practice. 5
Code-level model checking in the software development workflow at Amazon Web Services Chong, N; Cook, B; Eidelman, J; Kallas, K; Khazem, K; Monteiro, FR; Schwartz-Narbonne, D; Tasiran, S; Tautschnig, M; Tuttle, MR This article describes a style of applying symbolic model checking developed over the course of four years at Amazon Web Services. Lessons learned are drawn from proving properties of numerous C-based systems, for example, custom hypervisors, encryption code, boot loaders, and an IoT operating system. Using our methodology, we find that we can prove the correctness of industrial low-level C-based systems with reasonable effort and predictability. Furthermore, Amazon Web Services developers are increasingly writing their own formal specifications. As part of this effort, we have developed a CI system that allows integration of the proofs into standard development workflows and extended the proof tools to provide better feedback to users. All proofs discussed in this article are publicly available on GitHub. 5
Cry75Aa (Mpp75Aa) Insecticidal Proteins for Controlling the Western Corn Rootworm, Diabrotica virgifera virgifera LeConte (Coleoptera: Chrysomelidae), Isolated from the Insect-Pathogenic Bacterium Brevibacillus laterosporus Bowen, D; Yin, Y; Flasinski, S; Chay, C; Bean, G; Milligan, J; Moar, W; Pan, A; Werner, B; Buckman, K; Howe, A; Ciche, T; Turner, K; Pleau, M; Zhang, J; Kouadio, JL; Hibbard, BE; Price, P; Roberts, J This study describes three closely related proteins cloned from Brevibacillus laterosporus strains that are lethal upon feeding to Diabrotica virgifera virgifera LeConte, the western corn rootworm (WCR). Mpp75Aa1, Mpp75Aa2, and Mpp75Aa3 were toxic to WCR larvae when the larvae were fed purified protein. Transgenic plants expressing each mMpp75Aa protein were protected from feeding damage and showed a significant reduction in adult emergence from infested plants by both susceptible Cry3Bb1 and Cry34Ab1/Cry35Ab1-resistant WCR. These results demonstrate that proteins from B. laterosporus are as efficacious as the well-known Bacillus thuringiensis insecticidal proteins in controlling major insect pests such as WCR. The deployment of transgenic maize expressing mMpp75Aa, along with other active molecules lacking cross-resistance, has the potential to be a useful tool for control of WCR populations resistant to current B. thuringiensis traits. IMPORTANCE Insects feeding on roots of crops can damage the plant roots, resulting in yield loss due to poor water and nutrient uptake and plant lodging. In maize, the western corn rootworm (WCR) can cause severe damage to the roots, resulting in significant economic loss for farmers. Genetically modified (GM) plants expressing Bacillus thuringiensis insect control proteins have provided a solution for control of these pests. In recent years, populations of WCR resistant to the B. thuringiensis proteins in commercial GM maize have emerged. There is a need to develop new insecticidal traits for the control of WCR populations resistant to current commercial traits. New proteins with commercial-level efficacy on WCR from sources other than B. thuringiensis are becoming more critical. The Mpp75Aa proteins from B. laterosporus, when expressed in maize, are efficacious against the resistant populations of WCR and have the potential to provide solutions for control of resistant WCR. 5
Towards Crowd-based Customer Service: A Mixed-Initiative Tool for Managing Q&A Sites Piccardi, T; Convertino, G; Zancanaro, M; Wang, J; Archambeau, C In this paper, we propose a mixed-initiative approach to integrate a Q&A site based on a crowd of volunteers with a standard operator-based help desk, ensuring quality of customer service. Q&A sites have emerged as an efficient way to address questions in various domains by leveraging crowd knowledge. However, they lack sufficient reliability to be the sole basis of customer service applications. We built a proof-of-concept mixed-initiative tool that helps a crowd-manager to decide if a question will get a satisfactory and timely answer by the crowd or if it should be redirected to a dedicated operator. A user experiment found that our tool reduced the participants' cognitive load and improved their performance, in terms of their precision and recall. In particular, those with higher performance benefited more than those with lower performance. 5
Economic Predictions With Big Data: The Illusion of Sparsity Giannone, D; Lenza, M; Primiceri, GE We compare sparse and dense representations of predictive models in macroeconomics, microeconomics, and finance. To deal with a large number of possible predictors, we specify a prior that allows for both variable selection and shrinkage. The posterior distribution does not typically concentrate on a single sparse model, but on a wide set of models that often include many predictors. 5
Entity Matching Meets Data Science: A Progress Report from the Magellan Project Govind, Y; Konda, P; Suganthan, GCP; Martinkus, P; Nagarajan, P; Li, H; Soundararajan, A; Mudgal, S; Ballard, JR; Zhang, HJ; Ardalan, A; Das, S; Paulsen, D; Saini, A; Paulson, E; Park, Y; Carter, M; Sun, MJ; Fung, GM; Doan, AH Entity matching (EM) finds data instances that refer to the same real-world entity. In 2015, we started the Magellan project at UW-Madison, joint with industrial partners, to build EM systems. Most current EM systems are stand-alone monoliths. In contrast, Magellan borrows ideas from the field of data science (DS), to build a new kind of EM systems, which is an ecosystem of interoperable tools. This paper provides a progress report on the past 3.5 years of Magellan, focusing on the system aspects and on how ideas from the field of data science have been adapted to the EM context. We argue why EM can be viewed as a special class of DS problems, and thus can benefit from system building ideas in DS. We discuss how these ideas have been adapted to build PyMatcher and CloudMatcher, EM tools for power users and lay users. These tools have been successfully used in 21 EM tasks at 12 companies and domain science groups, and have been pushed into production for many customers. We report on the lessons learned, and outline a new envisioned Magellan ecosystem, which consists of not just on-premise Python tools, but also interoperable microservices deployed, executed, and scaled out on the cloud, using tools such as Dockers and Kubernetes. 5
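A minimal flavor of the entity matching task is shown below with a plain string-similarity matcher; this is only a toy stand-in for Magellan's PyMatcher/CloudMatcher workflow, and the product strings and scoring are invented for illustration.

```python
from difflib import SequenceMatcher

# Two toy tables of product descriptions that partially refer to the same
# real-world items; a learned matcher would replace this simple similarity.
table_a = ["Canon EOS 5D Mark IV body", "Apple iPhone 12 64GB black"]
table_b = ["canon eos 5d mk iv (body only)",
           "Samsung Galaxy S21",
           "iphone 12 black 64 gb (apple)"]


def similarity(a, b):
    """Case-insensitive character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


for a in table_a:
    best = max(table_b, key=lambda b: similarity(a, b))
    print(f"{a!r} -> best candidate {best!r} (score {similarity(a, best):.2f})")
```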
Crowdsourcing Enumeration Queries: Estimators and Interfaces Trushkowsky, B; Kraska, T; Franklin, MJ; Sarkar, P; Ramachandran, V Hybrid human/computer database systems promise to greatly expand the usefulness of query processing by incorporating the crowd for data gathering and other tasks. Such systems raise many implementation questions. Perhaps the most fundamental issue is that the closed world assumption underlying relational query semantics does not hold in such systems. As a consequence, the meaning of even simple queries can be called into question. Furthermore, query progress monitoring becomes difficult due to non-uniformities in the arrival of crowdsourced data and peculiarities of how people work in crowdsourcing systems. To address these issues, we develop statistical tools that enable users and systems developers to reason about query completeness. These tools can also help drive query execution and crowdsourcing strategies. We evaluate our techniques using experiments on a popular crowdsourcing platform. 5
Computer-based inhibitory control training in children with Attention-Deficit/Hyperactivity Disorder (ADHD): Evidence for behavioral and neural impact Meyer, KN; Santillana, R; Miller, B; Clapp, W; Way, M; Bridgman-Goines, K; Sheridan, MA Attention-deficit hyperactivity disorder (ADHD) is the most commonly diagnosed psychological disorder of childhood. Medication and cognitive behavioral therapy are effective treatments for many children; however, adherence to medication and therapy regimens is low. Thus, identifying effective adjunct treatments is imperative. Previous studies exploring computerized training programs as supplementary treatments have targeted working memory or attention. However, many lines of research suggest inhibitory control (IC) plays a central role in ADHD pathophysiology, which makes IC a potential intervention target. In this randomized control trial (NCT03363568), we target IC using a modified stop-signal task (SST) training designed by NeuroScouting, LLC in 40 children with ADHD, aged 8 to 11 years. Children were randomly assigned to adaptive treatment (n = 20) or non-adaptive control (n = 20) with identical stimuli and task goals. Children trained at home for at least 5 days a week (about 15 min/day) for 4 weeks. Relative to the control group, the treatment group showed decreased relative theta power in resting EEG and trending improvements in parent ratings of attention (i.e. decreases in inattentive behaviors). Both groups showed improved SST performance. There was no evidence of treatment effects on hyperactivity or teacher ratings of symptoms. Results suggest training IC alone has potential to positively impact symptoms of ADHD and provide evidence for neural underpinnings of this impact (change in theta power; change in N200 latency). This shows promising initial results for the use of computerized IC training as a potential adjunct treatment option for children with ADHD. 5
Streaming Algorithms for Estimating the Matching Size in Planar Graphs and Beyond Esfandiari, H; Hajiaghayi, M; Liaghat, V; Monemizadeh, M; Onak, K We consider the problem of estimating the size of a maximum matching when the edges are revealed in a streaming fashion. When the input graph is planar, we present a simple and elegant streaming algorithm that, with high probability, estimates the size of a maximum matching within a constant factor using Õ(n^(2/3)) space, where n is the number of vertices. The approach generalizes to the family of graphs that have bounded arboricity, which include graphs with an excluded constant-size minor. To the best of our knowledge, this is the first result for estimating the size of a maximum matching in the adversarial-order streaming model (as opposed to the random-order streaming model) in o(n) space. We circumvent the barriers inherent in the adversarial-order model by exploiting several structural properties of planar graphs, and more generally, graphs with bounded arboricity. We further reduce the required memory size to Õ(√n) for three restricted settings: (i) when the input graph is a forest; (ii) when we have 2-passes and the input graph has bounded arboricity; and (iii) when the edges arrive in random order and the input graph has bounded arboricity. Finally, we design a reduction from the Boolean Hidden Matching Problem to show that there is no randomized streaming algorithm that estimates the size of the maximum matching to within a factor better than 3/2 and uses only o(√n) bits of space. Using the same reduction, we show that there is no deterministic algorithm that computes this kind of estimate in o(n) bits of space. The lower bounds hold even for graphs that are collections of paths of constant length. 5
Privacy- and Utility-Preserving Textual Analysis via Calibrated Multivariate Perturbations Feyisetan, O; Balle, B; Drake, T; Diethe, T Accurately learning from user data while providing quantifiable privacy guarantees provides an opportunity to build better ML models while maintaining user trust. This paper presents a formal approach to carrying out privacy preserving text perturbation using the notion of d_χ-privacy designed to achieve geo-indistinguishability in location data. Our approach applies carefully calibrated noise to vector representation of words in a high-dimensional space as defined by word embedding models. We present a privacy proof that satisfies d_χ-privacy where the privacy parameter epsilon provides guarantees with respect to a distance metric defined by the word embedding space. We demonstrate how ε can be selected by analyzing plausible deniability statistics backed up by large scale analysis on GloVe and fastText embeddings. We conduct privacy audit experiments against 2 baseline models and utility experiments on 3 datasets to demonstrate the tradeoff between privacy and utility for varying values of epsilon on different task types. Our results demonstrate practical utility (< 2% utility loss for training binary classifiers) while providing better privacy guarantees than baseline models. 4
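The abstract above describes perturbing word vectors with calibrated noise and decoding back to vocabulary words. The sketch below illustrates only the general shape of that idea under simplifying assumptions: it uses spherical Gaussian noise scaled by 1/ε rather than the paper's calibrated multivariate mechanism, and the vocabulary and embedding matrix are toy placeholders standing in for GloVe/fastText vectors.

```python
import numpy as np

def perturb_word(word, vocab, emb, epsilon, rng=np.random.default_rng()):
    """Return a privacy-perturbed replacement for `word`.

    Sketch of metric-DP text perturbation: add noise (scaled by 1/epsilon)
    to the word's embedding, then snap back to the nearest vocabulary word.
    Smaller epsilon -> more noise -> stronger privacy, lower utility.
    """
    v = emb[vocab[word]]
    noise = rng.normal(scale=1.0 / epsilon, size=v.shape)  # simplified noise model
    z = v + noise
    # nearest-neighbour decoding back into the vocabulary
    dists = np.linalg.norm(emb - z, axis=1)
    words = list(vocab)
    return words[int(np.argmin(dists))]

# toy vocabulary and embeddings (placeholders for real GloVe/fastText vectors)
vocab = {"london": 0, "paris": 1, "cat": 2, "dog": 3}
emb = np.array([[0.9, 0.1], [0.85, 0.2], [0.1, 0.9], [0.15, 0.8]])

print(perturb_word("london", vocab, emb, epsilon=5.0))   # likely stays near "london"/"paris"
print(perturb_word("london", vocab, emb, epsilon=0.5))   # may jump to an unrelated word
```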
MRNet-Product2Vec: A Multi-task Recurrent Neural Network for Product Embeddings Biswas, A; Bhutani, M; Sanyal, S E-commerce websites such as Amazon, Alibaba, Flipkart, and Walmart sell billions of products. Machine learning (ML) algorithms involving products are often used to improve the customer experience and increase revenue, e.g., product similarity, recommendation, and price estimation. The products are required to be represented as features before training an ML algorithm. In this paper, we propose an approach called MRNet-Product2Vec for creating generic embeddings of products within an e-commerce ecosystem. We learn a dense and low-dimensional embedding where a diverse set of signals related to a product are explicitly injected into its representation. We train a Discriminative Multi-task Bidirectional Recurrent Neural Network (RNN), where the input is a product title fed through a Bidirectional RNN and at the output, product labels corresponding to fifteen different tasks are predicted. The task set includes several intrinsic characteristics about a product such as price, weight, size, color, popularity, and material. We evaluate the proposed embedding quantitatively and qualitatively. We demonstrate that they are almost as good as sparse and extremely high-dimensional TF-IDF representation in spite of having less than 3% of the TF-IDF dimension. We also use a multimodal autoencoder for comparing products from different language-regions and show preliminary yet promising qualitative results. 4
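As a rough illustration of the multi-task embedding idea described above (not the authors' MRNet-Product2Vec architecture), the following PyTorch sketch encodes a product title with a bidirectional RNN and attaches several classification heads; the vocabulary size, hidden sizes, and task list are made up for the example.

```python
import torch
import torch.nn as nn

class MultiTaskProductEncoder(nn.Module):
    """Toy multi-task BiRNN: the pooled recurrent state is the product embedding,
    and several heads predict product attributes (e.g. price bucket, color)."""
    def __init__(self, vocab_size=10000, emb_dim=64, hidden=128,
                 task_sizes=(5, 10, 3)):  # illustrative tasks: price bucket, color, size class
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleList(nn.Linear(2 * hidden, n) for n in task_sizes)

    def forward(self, title_ids):
        h, _ = self.rnn(self.embed(title_ids))       # (batch, seq, 2*hidden)
        product_vec = h.mean(dim=1)                  # shared, reusable product embedding
        return product_vec, [head(product_vec) for head in self.heads]

model = MultiTaskProductEncoder()
titles = torch.randint(0, 10000, (4, 12))            # 4 fake tokenised product titles
embedding, task_logits = model(titles)
loss = sum(nn.functional.cross_entropy(logits, torch.randint(0, logits.size(1), (4,)))
           for logits in task_logits)                # joint loss over all attribute tasks
loss.backward()
```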
Leveraging Hierarchical Representations for Preserving Privacy and Utility in Text Feyisetan, O; Diethe, T; Drake, T Guaranteeing a certain level of user privacy in an arbitrary piece of text is a challenging issue. However, with this challenge comes the potential of unlocking access to vast data stores for training machine learning models and supporting data driven decisions. We address this problem through the lens of d_χ-privacy, a generalization of Differential Privacy to non-Hamming distance metrics. In this work, we explore word representations in Hyperbolic space as a means of preserving privacy in text. We provide a proof satisfying d_χ-privacy, then we define a probability distribution in Hyperbolic space and describe a way to sample from it in high dimensions. Privacy is provided by perturbing vector representations of words in high dimensional Hyperbolic space to obtain a semantic generalization. We conduct a series of experiments to demonstrate the tradeoff between privacy and utility. Our privacy experiments illustrate protections against an authorship attribution algorithm while our utility experiments highlight the minimal impact of our perturbations on several downstream machine learning models. Compared to the Euclidean baseline, we observe > 20x greater guarantees on expected privacy against comparable worst case statistics. 4
Machine Learning at Amazon Herbrich, R In this talk I will give an introduction to the field of machine learning and discuss why it is a crucial technology for Amazon. Machine learning is the science of automatically extracting patterns from data in order to make automated predictions of future data. One way to differentiate machine learning tasks is by the following two factors: (1) How much noise is contained in the data? and (2) How far into the future is the prediction task? The former presents a limit to the learnability of a task regardless of which learning algorithm is used, whereas the latter has a crucial implication on the representation of the predictions: while most tasks in search and advertising typically only forecast minutes into the future, tasks in e-commerce can require predictions up to a year into the future. The further the forecast horizon, the more important it is to take account of uncertainty in both the learning algorithm and the representation of the predictions. I will discuss which learning frameworks are best suited for the various scenarios, that is, short-term predictions with little noise vs. long-term predictions with lots of noise, and present some ideas to combine representation learning with probabilistic methods. In the second half of the talk, I will give an overview of the applications of machine learning at Amazon, ranging from demand forecasting and machine translation to the automation of computer vision tasks and robotics. I will also discuss the importance of tools for data scientists and share learnings on bringing machine learning algorithms into products. 4
Using Phoneme Representations to Build Predictive Models Robust to ASR Errors Fang, AJ; Filice, S; Limsopatham, N; Rokhlenko, O Even though Automatic Speech Recognition (ASR) systems have significantly improved over the last decade, they still introduce a lot of errors when they transcribe voice to text. One of the most common reasons for these errors is phonetic confusion between similar-sounding expressions. As a result, ASR transcriptions often contain quasi oronyms, i.e., words or phrases that sound similar to the source ones, but that have completely different semantics (e.g., win instead of when or accessible on defecting instead of accessible and affecting). These errors significantly affect the performance of downstream Natural Language Understanding (NLU) models (e.g., intent classification, slot filling, etc.) and impair user experience. To make NLU models more robust to such errors, we propose novel phonetic-aware text representations. Specifically, we represent ASR transcriptions at the phoneme level, aiming to capture pronunciation similarities, which are typically neglected in word-level representations (e.g., word embeddings). To train and evaluate our phoneme representations, we generate noisy ASR transcriptions of four existing datasets - Stanford Sentiment Treebank, SQuAD, TREC Question Classification and Subjectivity Analysis - and show that common neural network architectures exploiting the proposed phoneme representations can effectively handle noisy transcriptions and significantly outperform state-of-the-art baselines. Finally, we confirm these results by testing our models on real utterances spoken to the Alexa virtual assistant. 4
Finite element analysis of the impact of bone nanostructure on its piezoelectric response Pai, S; Kwon, J; Liang, BW; Cho, HN; Soghrati, S The piezoelectric response of bone at the submicron scale is analyzed under mechanical loadings using the finite element (FE) method. A new algorithm is presented to virtually reconstruct realistic bone nanostructures, consisting of collagen fibrils embedded in a hydroxyapatite mineral network. This algorithm takes into account potential misalignments between fibrils, as well as the porous structure of the mineral phase. A parallel non-iterative mesh generation algorithm is utilized to create high-fidelity FE models for several representative volume elements (RVEs) of the bone with various fibrils volume fractions and misalignments. The piezoelectric response of each RVE is simulated under three types of loading: longitudinal compression, lateral compression, and shear. The resulting homogenized stress and electric field in RVEs with aligned fibrils showed a linear variation with the fibrils volume fraction under all loading conditions. For RVEs with misaligned fibrils, although more oscillations were observed in homogenized results, their differences from the results of RVEs with aligned fibrils subjected to lateral compression and shear loadings were negligible. However, under longitudinal compression, the electric field associated with RVEs with misaligned fibrils was notably higher than that of RVEs with aligned fibrils for the same volume fraction. 4
Shallow and Deep Syntactic/Semantic Structures for Passage Reranking in Question-Answering Systems Tymoshenko, K; Moschitti, A In this article, we extensively study the use of syntactic and semantic structures obtained with shallow and full syntactic parsers for answer passage reranking. We propose several dependency and constituent-based structures, also enriched with Linked Open Data (LD) knowledge to represent pairs of questions and answer passages. We encode such tree structures in learning-to-rank (L2R) algorithms using tree kernels, which can project them in tree substructure spaces, where each dimension represents a powerful syntactic/semantic feature. Additionally, since we define links between question and passage structures, our tree kernel spaces also include relational structural features. We carried out an extensive comparative experimentation of our models for automatic answer selection benchmarks on different TREC QA corpora as well as the newer Wikipedia-based dataset, namely WikiQA, which has been widely used to test sentence rerankers. The results consistently demonstrate that our structural semantic models achieve the state of the art in passage reranking. In particular, we derived the following important findings: (i) relational syntactic structures are essential to achieve superior results; (ii) models trained with dependency trees can outperform those trained with shallow trees, e.g., in case of sentence reranking; (iii) external knowledge automatically generated with focus and question classifiers is very effective; and (iv) the semantic information derived by LD and incorporated in syntactic structures can be used to replace the knowledge provided by the above-mentioned classifiers. This is a remarkable advantage as it enables our models to increase coverage and portability over new domains. 4
A Study of Context Dependencies in Multi-page Product Search Bi, KP; Teo, CH; Dattatreya, Y; Mohan, V; Croft, WB In product search, users tend to browse results on multiple search result pages (SERPs) (e.g., for queries on clothing and shoes) before deciding which item to purchase. Users' clicks can be considered implicit feedback that indicates their preferences and can be used to re-rank subsequent SERPs. Relevance feedback (RF) techniques are usually involved to deal with such scenarios. However, these methods are designed for document retrieval, where relevance is the most important criterion. In contrast, product search engines need to retrieve items that are not only relevant but also satisfactory in terms of customers' preferences. Personalization based on users' purchase history has been shown to be effective in product search [1]. However, this method captures users' long-term interests, which do not always align with their short-term interests, and does not benefit customers with little or no purchase history. In this paper, we study RF techniques based on both long-term and short-term context dependencies in multi-page product search. We also propose an end-to-end context-aware embedding model which can capture both types of context. Our experimental results show that short-term context leads to much better performance compared with long-term and no context. Moreover, our proposed model is more effective than state-of-the-art word-based RF models. 4
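The paper proposes an end-to-end context-aware embedding model; the snippet below shows only a classical Rocchio-style relevance-feedback baseline for the same scenario, where items clicked on earlier result pages (the short-term context) pull the query vector toward them before the next page is re-ranked. All vectors here are random placeholders.

```python
import numpy as np

def rerank_with_clicks(query_vec, item_vecs, clicked_idx, alpha=1.0, beta=0.75):
    """Rocchio-style re-ranking: pull the query toward items clicked on
    earlier result pages (short-term context), then re-score by cosine."""
    if clicked_idx:
        centroid = item_vecs[clicked_idx].mean(axis=0)
        query_vec = alpha * query_vec + beta * centroid
    sims = item_vecs @ query_vec / (
        np.linalg.norm(item_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return np.argsort(-sims)          # indices of items, best first

rng = np.random.default_rng(0)
items = rng.normal(size=(8, 16))      # toy catalogue item embeddings
query = rng.normal(size=16)
print(rerank_with_clicks(query, items, clicked_idx=[2, 5]))
```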
How Do Influencers Mention Brands in Social Media? Sponsorship Prediction of Instagram Posts Yang, X; Kim, S; Sun, YZ Brand mentioning is a type of word-of-mouth advertising method where a brand name is disclosed by social media users in posts. Recently, brand mentioning by influencers has raised great attention because of the strong viral effects on the huge fan base of influencers. In this paper, we study the brand mentioning practice of influencers. More specifically, we analyze a brand mentioning social network built on 18,523 Instagram influencers and 804,397 brand mentioning posts. As a result, we found four inspiring findings: (i) most influencers mention only a few brands in their posts; (ii) popular influencers tend to mention only popular brands while micro-influencers do not have a preference on brand popularity; (iii) audience have highly similar reactions to sponsored and non-sponsored posts; and (iv) compared to non-sponsored posts, sponsored brand mentioning posts favor fewer usertags and more hashtags with longer captions to exclusively promote the specific products. Moreover, we propose a neural network-based model to classify the sponsorship of posts utilizing network embedding and social media features. The experimental results show that our model achieves 80% accuracy and significantly outperforms baseline methods. 4
Data Integration and Machine Learning: A Natural Synergy Dong, XL; Rekatsinas, T As data volume and variety have increased, so have the ties between machine learning and data integration become stronger. For machine learning to be effective, one must utilize data from the greatest possible variety of sources; and this is why data integration plays a key role. At the same time machine learning is driving automation in data integration, resulting in overall reduction of integration costs and improved accuracy. This tutorial focuses on three aspects of the synergistic relationship between data integration and machine learning: (1) we survey how state-of-the-art data integration solutions rely on machine learning-based approaches for accurate results and effective human-in-the-loop pipelines, (2) we review how end-to-end machine learning applications rely on data integration to identify accurate, clean, and relevant data for their analytics exercises, and (3) we discuss open research challenges and opportunities that span across data integration and machine learning. 4
A Stochastic Programming Approach for Locating and Dispatching Two Types of Ambulances Yoon, S; Albert, LA; White, VM Emergency Medical Service systems aim to respond to emergency calls in a timely manner and provide prehospital care to patients. This paper addresses the problem of locating multiple types of emergency vehicles to stations while taking into account that vehicles are dispatched to prioritized patients with different health needs. We propose a two-stage stochastic-programming model that determines how to locate two types of ambulances in the first stage and dispatch them to prioritized emergency patients in the second stage after call-arrival scenarios are disclosed. We demonstrate how the base model can be adapted to include nontransport vehicles. A model formulation generalizes the base model to consider probabilistic travel times and general utilities for dispatching ambulances to prioritized patients. We evaluate the benefit of the model using two case studies, a value of the stochastic solution approach, and a simulation analysis. The case study is extended to study how to locate vehicles in the model extension with nontransport vehicles. Stochastic-programming models are computationally challenging for large-scale problem instances, and, therefore, we propose a solution technique based on Benders cuts. 4
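For readers unfamiliar with two-stage stochastic programming, the toy model below (written with the PuLP library, assumed available) shows the deterministic-equivalent structure the abstract refers to: first-stage location decisions are made before demand is known, while second-stage dispatch and shortfall variables are defined per demand scenario. It is a deliberately tiny illustration with one vehicle type and invented numbers, not the paper's formulation.

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpInteger, value

stations, scenarios = ["A", "B"], ["low", "high"]
prob_s = {"low": 0.6, "high": 0.4}                  # scenario probabilities (illustrative)
demand = {"low": 2, "high": 5}                      # emergency calls per scenario
locate_cost, unmet_penalty, capacity = 10, 100, 3   # invented cost figures

m = LpProblem("two_stage_ambulance_location", LpMinimize)
# First stage: how many ambulances to place at each base (decided here and now)
x = LpVariable.dicts("locate", stations, lowBound=0, cat=LpInteger)
# Second stage: dispatches and unmet calls, one copy per scenario (wait and see)
y = LpVariable.dicts("dispatch", [(i, s) for i in stations for s in scenarios], lowBound=0)
u = LpVariable.dicts("unmet", scenarios, lowBound=0)

# Objective: location cost plus expected penalty for unserved calls
m += (locate_cost * lpSum(x[i] for i in stations)
      + lpSum(prob_s[s] * unmet_penalty * u[s] for s in scenarios))
for s in scenarios:
    m += lpSum(y[(i, s)] for i in stations) + u[s] >= demand[s]   # cover demand or pay penalty
    for i in stations:
        m += y[(i, s)] <= x[i]                                    # cannot dispatch more than located
for i in stations:
    m += x[i] <= capacity
m.solve()
print({i: value(x[i]) for i in stations}, "expected cost:", value(m.objective))
```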
Compiler-Driven FPGA Virtualization with SYNERGY Landgraf, J; Yang, T; Lin, W; Rossbach, CJ; Schkufza, E FPGAs are increasingly common in modern applications, and cloud providers now support on-demand FPGA acceleration in data centers. Applications in data centers run on virtual infrastructure, where consolidation, multi-tenancy, and workload migration enable economies of scale that are fundamental to the provider's business. However, a general strategy for virtualizing FPGAs has yet to emerge. While manufacturers struggle with hardware-based approaches, we propose a compiler/runtime-based solution called SYNERGY. We show a compiler transformation for Verilog programs that produces code able to yield control to software at sub-clock-tick granularity according to the semantics of the original program. SYNERGY uses this property to efficiently support core virtualization primitives: suspend and resume, program migration, and spatial/temporal multiplexing, on hardware which is available today. We use SYNERGY to virtualize FPGA workloads across a cluster of Altera SoCs and Xilinx FPGAs on Amazon F1. The workloads require no modification, run within 3 - 4x of unvirtualized performance, and incur a modest increase in FPGA fabric utilization. 4
Political budget cycles and the civil service: Evidence from highway spending in US states Bostashvili, D; Ujhelyi, G We study political budget cycles in infrastructure spending that are conditional on bureaucratic organization. Bureaucrats can facilitate or hinder politicians' ability to engage in voter-friendly spending around elections. To test this idea, we use civil service reforms undertaken by US states in the second half of the 20th century to study political budget cycles in highway spending under civil service and patronage. We find that under patronage, highway spending is 12% higher in election years and 9% higher in the year before an election. By contrast, under civil service highway spending is essentially smooth over the electoral cycle. These findings provide a novel way through which civil service rules can stabilize government activity. 4
One-Pass Ranking Models for Low-Latency Product Recommendations Freno, A; Saveski, M; Jenatton, R; Archambeau, C Purchase logs collected in e-commerce platforms provide rich information about customer preferences. These logs can be leveraged to improve the quality of product recommendations by feeding them to machine-learned ranking models. However, a variety of deployment constraints limit the naive applicability of machine learning to this problem. First, the amount and the dimensionality of the data make in-memory learning simply not possible. Second, the drift of customers' preferences over time requires retraining the ranking model regularly with freshly collected data. This limits the time that is available for training to prohibitively short intervals. Third, ranking in real-time is necessary whenever the query complexity prevents us from caching the predictions. This constraint requires minimizing prediction time (or equivalently maximizing the data throughput), which in turn may prevent us from achieving the accuracy necessary in web scale industrial applications. In this paper, we investigate how the practical challenges faced in this setting can be tackled via an online learning to rank approach. Sparse models will be the key to reduce prediction latency, whereas one pass stochastic optimization will minimize the training time and restrict the memory footprint. Interestingly, and perhaps surprisingly, extensive experiments show that one-pass learning preserves most of the predictive performance. Additionally, we study a variety of online learning algorithms that enforce sparsity and provide insights to help the practitioner make an informed decision about which approach to pick. We report results on a massive purchase log dataset from the Amazon retail website, as well as on several benchmarks from the LETOR corpus. 4
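The abstract argues for sparse models learned in a single pass over the data. The sketch below is a generic illustration of that combination, not the paper's algorithm: one-pass stochastic gradient descent on a logistic loss, with a soft-thresholding step that keeps most weights at zero so that scoring stays fast.

```python
import numpy as np

def one_pass_sparse_train(stream, dim, lr=0.1, l1=0.01):
    """Single pass over (features, clicked) pairs; soft-thresholding after each
    SGD step keeps the weight vector sparse for low-latency scoring."""
    w = np.zeros(dim)
    for x, y in stream:                      # each example is seen exactly once
        p = 1.0 / (1.0 + np.exp(-w @ x))     # predicted click probability
        w -= lr * (p - y) * x                # logistic-loss gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr * l1, 0.0)   # L1 shrinkage
    return w

rng = np.random.default_rng(1)
stream = [(rng.normal(size=50), rng.integers(0, 2)) for _ in range(1000)]
w = one_pass_sparse_train(iter(stream), dim=50)
print("non-zero weights:", int(np.count_nonzero(w)), "of", w.size)
```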
Energy-Based Learning for Scene Graph Generation Suhail, M; Mittal, A; Siddiquie, B; Broaddus, C; Eledath, J; Medioni, G; Sigal, L Traditional scene graph generation methods are trained using cross-entropy losses that treat objects and relationships as independent entities. Such a formulation, however, ignores the structure in the output space, in an inherently structured prediction problem. In this work, we introduce a novel energy-based learning framework for generating scene graphs. The proposed formulation allows for efficiently incorporating the structure of scene graphs in the output space. This additional constraint in the learning framework acts as an inductive bias and allows models to learn efficiently from a small number of labels. We use the proposed energy-based framework to train existing state-of-the-art models and obtain a significant performance improvement, of up to 21% and 27%, on the Visual Genome [9] and GQA [5] benchmark datasets, respectively. Furthermore, we showcase the learning efficiency of the proposed framework by demonstrating superior performance in the zero- and few-shot settings where data is scarce. 4
Centralized and Decentralized Cache-Aided Interference Management in Heterogeneous Parallel Channels Piovano, E; Joudeh, H; Clerckx, B We consider the problem of cache-aided interference management in a network consisting of K_T single-antenna transmitters and K_R single-antenna receivers, where each node is equipped with a cache memory. Transmitters communicate with receivers over two heterogeneous parallel subchannels: the P-subchannel for which transmitters have perfect instantaneous knowledge of the channel state, and the N-subchannel for which the transmitters have no knowledge of the instantaneous channel state. Under the assumptions of uncoded placement and separable one-shot linear delivery over the two subchannels, we characterize the optimal degrees-of-freedom (DoF) to within a constant multiplicative factor of 2. We extend the result to a decentralized setting in which no coordination is required for content placement at the receivers. In this case, we characterize the optimal one-shot linear DoF to within a factor of 3. 4
Heuristic algorithms for solving a set of NP-hard single-machine scheduling problems with resource-dependent processing times Mor, B; Shabtay, D; Yedidsion, L In this paper we study a large set of single-machine scheduling problems with resource-dependent processing times that reduce (consolidate) to the same mathematical model. We take advantage of this property to present a set of universal (unified) heuristic algorithms capable of solving any of the problems that reduces to the unified model. By performing an extensive experimental study, we show that the suggested heuristics can solve large instances of this set of problems with a negligible average percent relative gap between the value of the heuristic solution and the value of a tight lower bound on the optimal solution value. 4
ConvERSe'20: The WSDM 2020 Workshop on Conversational Systems for E-Commerce Recommendations and Search Agichtein, E; Hakkani-Tur, D; Kallumadi, S; Malmasi, S Conversational systems have improved dramatically recently, and are receiving increasing attention in the academic literature. These systems are also being adopted in E-Commerce due to increased integration of E-Commerce search and recommendation source with virtual assistants such as Alexa, Siri, and Google assistant. However, significant research challenges remain spanning areas of dialogue systems, spoken natural language processing, human-computer interaction, and search and recommender systems, all of which are exacerbated by the demanding requirements of E-Commerce. The purpose of this workshop is to bring together researchers and practitioners in the areas of conversational systems, human-computer interaction, information retrieval, and recommender systems. Bringing diverse research areas together into a single workshop would accelerate progress on adapting conversational systems to the E-Commerce domain, to set a research agenda, to examine how to build and share data sets, and to establish common evaluation metrics and benchmarks to drive research progress. 4
Deep In-Memory Architectures in SRAM: An Analog Approach to Approximate Computing Kang, MG; Gonugondla, SK; Shanbhag, NR This article provides an overview of recently proposed deep in-memory architectures (DIMAs) in SRAM for energy- and latency-efficient hardware realization of machine learning (ML) algorithms. DIMA tackles the data movement problem in von Neumann architectures head-on by deeply embedding mixed-signal computations into a conventional memory array. In doing so, it trades off its computational signal-to-noise ratio (compute SNR) with energy and latency, and therefore, it represents an analog form of approximate computing. DIMA exploits the inherent error immunity of ML algorithms and SNR budgeting methods to operate its analog circuitry in a low-swing/low-compute SNR regime, thereby achieving >100× reduction in the energy-delay product (EDP) over an equivalent von Neumann architecture with no loss in inference accuracy. This article describes DIMA's computational pipeline and provides a Shannon-inspired rationale for its robustness to process, temperature, and voltage variations and design guidelines to manage its analog nonidealities. DIMA's versatility, effectiveness, and practicality demonstrated via multiple silicon IC prototypes in a 65-nm CMOS process are described. A DIMA-based instruction set architecture (ISA) to realize an end-to-end application-to-architecture mapping for accelerating diverse ML algorithms is also presented. Finally, DIMA's fundamental tradeoff between energy and accuracy in the low-compute SNR regime is analyzed to determine energy-optimum design parameters. 4
OCTET: Online Catalog Taxonomy Enrichment with Self-Supervision Mao, YN; Zhao, T; Kan, AR; Zhang, CW; Dong, XLN; Faloutsos, C; Han, JW Taxonomies have found wide applications in various domains, especially online for item categorization, browsing, and search. Despite the prevalent use of online catalog taxonomies, most of them in practice are maintained by humans, which is labor-intensive and difficult to scale. While taxonomy construction from scratch is considerably studied in the literature, how to effectively enrich existing incomplete taxonomies remains an open yet important research question. Taxonomy enrichment not only requires the robustness to deal with emerging terms but also the consistency between existing taxonomy structure and new term attachment. In this paper, we present a self-supervised end-to-end framework, OCTET, for Online Catalog Taxonomy EnrichmenT. OCTET leverages heterogeneous information unique to online catalog taxonomies such as user queries, items, and their relations to the taxonomy nodes while requiring no other supervision than the existing taxonomies. We propose to distantly train a sequence labeling model for term extraction and employ graph neural networks (GNNs) to capture the taxonomy structure as well as the query-item-taxonomy interactions for term attachment. Extensive experiments in different online domains demonstrate the superiority of OCTET over state-of-the-art methods via both automatic and human evaluations. Notably, OCTET enriches an online catalog taxonomy in production to 2 times larger in the open-world evaluation. 4
A Carbon-Aware Incentive Mechanism for Greening Colocation Data Centers Islam, MA; Mahmud, H; Ren, SL; Wang, XR The massive energy consumption of data centers worldwide has resulted in a large carbon footprint, raising serious concerns to sustainable IT initiatives and attracting a great amount of research attention. Nonetheless, the current efforts to date, despite encouraging, have been primarily centered around owner-operated data centers (e.g., Google data center), leaving out another major segment of data center industry-colocation data centers-much less explored. As a major hindrance to carbon efficiency desired by the operator, colocation suffers from split incentive: tenants may not be willing to manage their servers for carbon efficiency. In this paper, we aim at minimizing the carbon footprint of geo-distributed colocation data centers, while ensuring that the operator's cost meets a long-term budget constraint. We overcome the split incentive hurdle by devising a novel online carbon-aware incentive mechanism, called GreenColo, in which tenants voluntarily bid for energy reduction at self-determined prices and will receive financial rewards if their bids are accepted at runtime. Using trace based simulation we show that GreenColo results in a carbon footprint fairly close (23 versus 18 percent) to the optimal offline solution with future information, while being able to satisfy the colocation operator's long-term budget constraint. We demonstrate the effectiveness of GreenColo in practical scenarios via both simulation studies and scaled-down prototype experiments. Our results show that GreenColo can reduce the carbon footprint by up to 24 percent without incurring any additional cost for the colocation operator (compared to the no-incentive baseline case), while tenants receive financial rewards for free without violating service level agreement. 4
Design and Circuit Modeling of Graphene Plasmonic Nanoantennas Rakheja, S; Sengupta, P; Shakiah, SM Nanoantennas are critical elements of nanoscale wireless communication technologies with potential to overcome some of the limitations of on-chip interconnects. In this paper, we model the optical response of graphene-based nanoantennas (GNAs) using an equivalent resistive-inductive-capacitive (RLC) circuit by incorporating plasmonic effects that are present in graphene at terahertz (THz) frequencies. The equivalent circuit model is used to estimate the resonance characteristics of the nanoantenna, thereby facilitating geometry and material optimization. Three GNA structures are considered, namely bowtie, circular, and rectangular dimers with vacuum and silicon dioxide dielectric environment. We characterize the radiation efficiency, the Purcell factor, the quality factor, and the resonant frequency of various antennas with respect to the physical properties of the surrounding dielectric media and the graphene sheet. The GNAs are designed for a resonant frequency in the range of 10-25 THz with field enhancements around 10^4-10^5 and radiation efficiency of ~5-16%. The circuit model is applied to examine the trade-off between the radiation efficiency and the field enhancement in the antenna gap. While bowtie antennas have higher field enhancement due to charge accumulation in their pointed structure, they suffer from low radiation efficiency stemming from the large spreading resistance losses. The circuit model is validated against finite-difference time-domain (FDTD) simulations conducted in Lumerical, and an excellent match between the model and numerical results is demonstrated in the THz regime. The circuit models can be readily used in a hierarchical circuit simulator to design and optimize optical nanoantennas for THz communication in high-performance computing systems. 4
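The paper's contribution is deriving the R, L, and C values from graphene plasmonics; once those values are known, the resonance characteristics follow from textbook series-RLC relations, as in the small calculation below. The component values used here are placeholders chosen only to land near the quoted THz range, not values from the paper.

```python
import math

def rlc_resonance(R, L, C):
    """Series-RLC resonant frequency and quality factor (textbook relations);
    the R, L, C of a real graphene nanoantenna would come from the plasmonic
    model in the paper, not from the placeholder values used here."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(L * C))   # resonant frequency [Hz]
    Q = (1.0 / R) * math.sqrt(L / C)                # quality factor
    return f0, Q

# illustrative values chosen only so that f0 falls in the ~10-25 THz range
f0, Q = rlc_resonance(R=50.0, L=1e-13, C=1e-15)     # 50 ohm, 0.1 pH, 1 fF
print(f"f0 = {f0 / 1e12:.1f} THz, Q = {Q:.2f}")     # about 16 THz for these numbers
```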
Tag Ownership Transfer in Radio Frequency Identification Systems: A Survey of Existing Protocols and Open Challenges Taqieddin, E; Al-Dahoud, H; Niu, HF; Sarangapani, J Radio frequency identification (RFID) is a modern approach to identify and track several assets at once in a supply chain environment. In many RFID applications, tagged items are frequently transferred from one owner to another. Thus, there is a need for secure ownership transfer (OT) protocols that can perform the transfer while, at the same time, protect the privacy of owners. Several protocols have been proposed in an attempt to fulfill this requirement. In this paper, we provide a comprehensive and systematic review of the RFID OT protocols that appeared over the years of 2005-2018. In addition, we compare these protocols based on the security goals which involve their support of OT properties and their resistance to attacks. From the presented comparison, we draw attention to the open issues in this field and provide suggestions for the direction that future research should follow. Furthermore, we suggest a set of guidelines to be considered in the design of new protocols. To the best of our knowledge, this is the first comprehensive survey that reviews the available OT protocols from the early start up to the current state of the art. 4
Large Scale Analysis of Multitasking Behavior During Remote Meetings Cao, HC; Lee, CJ; Iqbal, S; Czerwinski, M; Wong, P; Rintel, S; Hecht, B; Teevan, J; Yang, LQ Virtual meetings are critical for remote work because of the need for synchronous collaboration in the absence of in-person interactions. In-meeting multitasking is closely linked to people's productivity and wellbeing. However, we currently have limited understanding of multitasking in remote meetings and its potential impact. In this paper, we present what we believe is the most comprehensive study of remote meeting multitasking behavior through an analysis of a large-scale telemetry dataset collected from February to May 2020 of U.S. Microsoft employees and a 715-person diary study. Our results demonstrate that intrinsic meeting characteristics such as size, length, time, and type significantly correlate with the extent to which people multitask, and multitasking can lead to both positive and negative outcomes. Our findings suggest important best-practice guidelines for remote meetings (e.g., avoid important meetings in the morning) and design implications for productivity tools (e.g., support positive remote multitasking). 4
Endogeneity and non-response bias in treatment evaluation - nonparametric identification of causal effects by instruments Fricke, H; Frolich, M; Huber, M; Lechner, M This paper proposes a nonparametric method for evaluating treatment effects in the presence of both treatment endogeneity and attrition/non-response bias, based on two instrumental variables. Using a discrete instrument for the treatment and an instrument with rich (in general continuous) support for non-response/attrition, we identify the average treatment effect on compliers as well as the total population under the assumption of additive separability of observed and unobserved variables affecting the outcome. We suggest non- and semiparametric estimators and apply the latter to assess the treatment effect of gym training, which is instrumented by a randomized cash incentive paid out conditional on visiting the gym, on self-assessed health among students at a Swiss university. The measurement of health is prone to non-response, which is instrumented by a cash lottery for participating in the follow-up survey. 4
To Share or Not to Share? Capacity Reservation in a Shared Supplier Qi, AY; Ahn, HS; Sinha, A When a supplier serves multiple buyers, the buyers often reserve the supplier's capacity in advance to secure the supply to fulfill their demand. In this study, we analyze two common types of capacity reservation: exclusive and first-priority reservations. Both reservations give a buyer first access to its reserved capacity, but the reservations differ in how the leftovers (if any) are used. In most cases, as long as the buyer gets to use the reserved capacity first, it does not pay attention to how the leftover capacity is utilized, leaving that to the supplier's discretion (first-priority). However, in a number of cases, buyers prohibit discretionary use of the reserved capacity (no one touches my leftovers) and implement the restriction by placing an employee at the supplier or installing monitoring devices (exclusive). One potential benefit of first-priority capacity is resource pooling: allowing access to one another's leftovers can reduce the amount of capacity reserved by the buyers while enabling the supplier to satisfy buyers' orders better in some cases. The Operations Management literature suggests that the benefit of resource pooling is greater when the demand correlation is negative and smaller when the correlation is positive. We investigate the capacity reservation type and level that each buyer chooses facing uncertain (and correlated) demand. We investigate how the reservation price and demand correlation affect the equilibrium outcome. We also examine the supplier's decision to set the optimal reservation prices. We find that at least one firm reserves first-priority capacity in equilibrium as long as the supplier offers a discount for first-priority capacity (or charges a premium for exclusive capacity). Depending on the reservation price difference and demand correlation, we find that the equilibrium outcome is inefficient (i.e., not Pareto optimal) for the buyers when they settle in a free-rider or a prisoner's dilemma equilibrium. We show that the supplier always induces both buyers to reserve a large amount of exclusive capacity so that the supplier can make profits from both capacity reservation and production. While this seems like the best scenario for the supplier, we show that, allowing bilateral capacity transfer (e.g., the buyers trading their reserved capacity) can improve not only the buyers' profits but also the supplier's profit. 4
Measuring problem prescription opioid use among patients receiving long-term opioid analgesic treatment: development and evaluation of an algorithm for use in EHR and claims data Carrell, DS; Albertson-Junkans, L; Ramaprasan, A; Scull, G; Mackwood, M; Johnson, E; Cronkite, DJ; Baer, A; Hansen, K; Green, CA; Hazlehurst, BL; Janoff, SL; Coplan, PM; DeVeaugh-Geiss, A; Grijalva, CG; Liang, CH; Enger, CL; Lange, J; Shortreed, SM; Von Korff, M Objective: Opioid surveillance in response to the opioid epidemic will benefit from scalable, automated algorithms for identifying patients with clinically documented signs of problem prescription opioid use. Existing algorithms lack accuracy. We sought to develop a high-sensitivity, high-specificity classification algorithm based on widely available structured health data to identify patients receiving chronic extended-release/long-acting (ER/LA) therapy with evidence of problem use to support subsequent epidemiologic investigations. Methods: Outpatient medical records of a probability sample of 2,000 Kaiser Permanente Washington patients receiving >= 60 days' supply of ER/LA opioids in a 90-day period from 1 January 2006 to 30 June 2015 were manually reviewed to determine the presence of clinically documented signs of problem use and used as a reference standard for algorithm development. Using 1,400 patients as training data, we constructed candidate predictors from demographic, enrollment, encounter, diagnosis, procedure, and medication data extracted from medical claims records or the equivalent from electronic health record (EHR) systems, and we used adaptive least absolute shrinkage and selection operator (LASSO) regression to develop a model. We evaluated this model in a comparable 600-patient validation set. We compared this model to ICD-9 diagnostic codes for opioid abuse, dependence, and poisoning. This study was registered with ClinicalTrials.gov as study NCT02667262 on 28 January 2016. Results: We operationalized 1,126 potential predictors characterizing patient demographics, procedures, diagnoses, timing, dose, and location of medication dispensing. The final model incorporating 53 predictors had a sensitivity of 0.582 at positive predictive value (PPV) of 0.572. ICD-9 codes for opioid abuse, dependence, and poisoning had a sensitivity of 0.390 at PPV of 0.599 in the same cohort. Conclusions: Scalable methods using widely available structured EHR/claims data to accurately identify problem opioid use among patients receiving long-term ER/LA therapy were unsuccessful. This approach may be useful for identifying patients needing clinical evaluation. 4
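The study's classifier was an adaptive LASSO model over 1,126 structured EHR/claims predictors. The snippet below is only a generic scikit-learn sketch of an L1-penalized (LASSO-style) logistic model on synthetic stand-in features, showing how such a pipeline selects a small subset of predictors and reports precision and recall of the kind discussed in the abstract; the feature matrix and labels here are simulated, not patient data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 200))             # stand-in for structured EHR/claims predictors
true_w = np.zeros(200)
true_w[:10] = 1.5                            # only a few predictors truly matter
y = (X @ true_w + rng.normal(size=2000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)   # L1 -> sparse model
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("selected predictors:", int(np.count_nonzero(clf.coef_)))
print("precision:", round(precision_score(y_te, pred), 3),
      "recall:", round(recall_score(y_te, pred), 3))
```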
Obsidian: Typestate and Assets for Safer Blockchain Programming Coblenz, M; Oei, R; Etzel, T; Koronkevich, P; Baker, M; Bloem, Y; Myers, BA; Sunshine, J; Aldrich, J Blockchain platforms are coming into use for processing critical transactions among participants who have not established mutual trust. Many blockchains are programmable, supporting smart contracts, which maintain persistent state and support transactions that transform the state. Unfortunately, bugs in many smart contracts have been exploited by hackers. Obsidian is a novel programming language with a type system that enables static detection of bugs that are common in smart contracts today. Obsidian is based on a core calculus, Silica, for which we proved type soundness. Obsidian uses typestate to detect improper state manipulation and uses linear types to detect abuse of assets. We integrated a permissions system that encodes a notion of ownership to allow for safe, flexible aliasing. We describe two case studies that evaluate Obsidian's applicability to the domains of parametric insurance and supply chain management, finding that Obsidian's type system facilitates reasoning about high-level states and ownership of resources. We compared our Obsidian implementation to a Solidity implementation, observing that the Solidity implementation requires much boilerplate checking and tracking of state, whereas Obsidian does this work statically. 4
megaSDM: integrating dispersal and time-step analyses into species distribution models Shipley, BR; Bach, R; Do, Y; Strathearn, H; McGuire, JL; Dilkina, B Understanding how species ranges shift as climates rapidly change informs us how to effectively conserve vulnerable species. Species distribution models (SDMs) are an important method for examining these range shifts. The tools for performing SDMs are ever improving. Here, we present the megaSDM R package. This package facilitates realistic spatiotemporal SDM analyses by incorporating dispersal probabilities, creating time-step maps of range change dynamics and efficiently handling large datasets and computationally intensive environmental subsampling techniques. Provided a list of species and environmental data, megaSDM synthesizes GIS processing, subsampling methods, MaxEnt modelling, dispersal rate restrictions and additional statistical tools to create a variety of outputs for each species, time period and climate scenario requested. For each of these, megaSDM generates a series of distribution maps and outputs visual representations of statistical data. megaSDM offers several advantages over other commonly used SDM tools. First, many of the functions in megaSDM natively implement parallelization, enabling the package to handle large amounts of data efficiently without the need for additional coding. megaSDM also implements environmental subsampling of occurrences, making the technique broadly available in a way that was not possible before due to computational considerations. Uniquely, megaSDM generates maps showing the expansion and contraction of a species range across all considered time periods (time-maps), and constrains both presence/absence and continuous suitability maps of species ranges according to species-specific dispersal constraints. The user can then directly compare non-dispersal and dispersal-limited distribution predictions. This paper discusses the unique features and highlights of megaSDM, describes the structure of the package and demonstrates the package's features and the model flow through examples. 4
An exploratory decision tree analysis to predict physical activity compliance rates in breast cancer survivors Paxton, RJ; Zhang, LF; Wei, CS; Price, D; Zhang, F; Courneya, KS; Kakadiaris, IA Background: The study of physical activity in cancer survivors has been limited to one cause, one effect relationships. In this exploratory study, we used recursive partitioning to examine multiple correlates that influence physical activity compliance rates in cancer survivors. Methods: African American breast cancer survivors (N = 267, Mean age = 54 years) participated in an online survey that examined correlates of physical activity. Recursive partitioning (RP) was used to examine complex and nonlinear associations between sociodemographic, medical, cancer-related, theoretical, and quality of life indicators. Results: Recursive partitioning revealed five distinct groups. Compliance with physical activity guidelines was highest (82% met guidelines) among survivors who reported higher mean action planning scores (P < 0.001) and lower mean barriers to physical activity (P = 0.035). Compliance with physical activity guidelines was lowest (9% met guidelines) among survivors who reported lower mean action and coping (P = 0.002) planning scores. Similarly, lower mean action planning scores and poor advanced lower functioning (P = 0.034), even in the context of higher coping planning scores, resulted in low physical activity compliance rates (13% met guidelines). Subsequent analyses revealed that body mass index (P = 0.019) and number of comorbidities (P = 0.003) were lowest in those with the highest compliance rates. Conclusion: Our findings support the notion that multiple factors determine physical activity compliance rates in African American breast cancer survivors. Interventions that encourage action and coping planning and reduce barriers in the context of addressing function limitations may increase physical activity compliance rates. 4
The 5th AI City Challenge Naphade, M; Wang, S; Anastasiu, DC; Tang, Z; Chang, MC; Yang, XD; Yao, Y; Zheng, L; Chakraborty, P; Sharma, A; Feng, Q; Ablavsky, V; Sclaroff, S The AI City Challenge was created with two goals in mind: (1) pushing the boundaries of research and development in intelligent video analysis for smarter cities use cases, and (2) assessing tasks where the level of performance is enough to cause real-world adoption. Transportation is a segment ripe for such adoption. The fifth AI City Challenge attracted 305 participating teams across 38 countries, who leveraged city-scale real traffic data and high-quality synthetic data to compete in five challenge tracks. Track 1 addressed video-based automatic vehicle counting, where the evaluation was conducted on both algorithmic effectiveness and computational efficiency. Track 2 addressed city-scale vehicle re-identification with augmented synthetic data to substantially increase the training set for the task. Track 3 addressed city-scale multi-target multi-camera vehicle tracking. Track 4 addressed traffic anomaly detection. Track 5 was a new track addressing vehicle retrieval using natural language descriptions. The evaluation system shows a general leader board of all submitted results, and a public leader board of results limited to the contest participation rules, where teams are not allowed to use external data in their work. The public leader board shows results closer to real-world situations where annotated data is limited. Results show the promise of AI in Smarter Transportation. State-of-the-art performance for some tasks shows that these technologies are ready for adoption in real-world systems. 4
Unsupervised quality estimation without reference corpus for subtitle machine translation using word embeddings Gupta, P; Shekhawat, S; Kumar, K We demonstrate the potential for using aligned bilingual word embeddings to create an unsupervised method to evaluate machine translations without a need for parallel translation corpus or reference corpus. We explain why movie subtitles differ from other text and share our experimental results conducted on them for four target languages (French, German, Portuguese and Spanish) with English source subtitles. We propose a novel automated evaluation method of calculating edits (insertion, deletion, substitution and shifts) to indicate translation quality and human aided post edit requirements to perfect machine translation. 3
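A toy version of the idea described above, under stated assumptions: words from the source and machine-translated subtitle are greedily aligned by cosine similarity in a shared bilingual embedding space, and tokens left unaligned are counted as edit-like operations. The embedding dictionaries and similarity threshold are illustrative placeholders, not the authors' exact procedure.

```python
import numpy as np

def edit_like_score(src_tokens, tgt_tokens, src_emb, tgt_emb, threshold=0.6):
    """Greedy one-to-one alignment by cosine similarity in a shared bilingual
    space; tokens left unaligned are counted as insertion/deletion/substitution edits."""
    def unit(v):
        return v / (np.linalg.norm(v) + 1e-9)
    S = np.array([unit(src_emb[w]) for w in src_tokens])
    T = np.array([unit(tgt_emb[w]) for w in tgt_tokens])
    sim = S @ T.T
    matched = 0
    while sim.size and sim.max() > threshold:
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        matched += 1
        sim[i, :] = -1.0      # each token may be aligned only once
        sim[:, j] = -1.0
    edits = (len(src_tokens) - matched) + (len(tgt_tokens) - matched)
    return edits / max(len(src_tokens), len(tgt_tokens))   # 0 = perfect, higher = worse

# toy aligned embeddings; "rouge" is a spurious word the MT system inserted
src_emb = {"the": np.array([1.0, 0.0]), "cat": np.array([0.0, 1.0])}
tgt_emb = {"le": np.array([0.9, 0.1]), "chat": np.array([0.1, 0.9]), "rouge": np.array([0.5, 0.5])}
print(edit_like_score(["the", "cat"], ["le", "chat", "rouge"], src_emb, tgt_emb))
```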
GAN-Control: Explicitly Controllable GANs Shoshan, A; Bhonker, N; Kviatkovsky, I; Medioni, G We present a framework for training GANs with explicit control over generated facial images. We are able to control the generated image by setting exact attributes such as age, pose, expression, etc. Most approaches for manipulating GAN-generated images achieve partial control by leveraging the latent space disentanglement properties, obtained implicitly after standard GAN training. Such methods are able to change the relative intensity of certain attributes, but not explicitly set their values. Recently proposed methods, designed for explicit control over human faces, harness morphable 3D face models (3DMM) to allow fine-grained control capabilities in GANs. Unlike these methods, our control is not constrained to 3DMM parameters and is extendable beyond the domain of human faces. Using contrastive learning, we obtain GANs with an explicitly disentangled latent space. This disentanglement is utilized to train control-encoders mapping human-interpretable inputs to suitable latent vectors, thus allowing explicit control. In the domain of human faces we demonstrate control over identity, age, pose, expression, hair color and illumination. We also demonstrate control capabilities of our framework in the domains of painted portraits and dog image generation. We demonstrate that our approach achieves state-of-the-art performance both qualitatively and quantitatively. 3
Detecting Expressions with Multimodal Transformers Parthasarathy, S; Sundaram, S Developing machine learning algorithms to understand person-to-person engagement can result in natural user experiences for communal devices such as Amazon Alexa. Among other cues such as voice activity and gaze, a person's audio-visual expression that includes tone of the voice and facial expression serves as an implicit signal of engagement between parties in a dialog. This study investigates deep-learning algorithms for audio-visual detection of a user's expression. We first implement an audio-visual baseline model with recurrent layers that shows competitive results compared to the current state of the art. Next, we propose the transformer architecture with encoder layers that better integrate audio-visual features for expression tracking. Performance on the Aff-Wild2 database shows that the proposed methods perform better than the baseline architecture with recurrent layers, with absolute gains of approximately 2% for arousal and valence descriptors. Further, multimodal architectures show significant improvements over models trained on single modalities with gains of up to 3.6%. Ablation studies show the significance of the visual modality for the expression detection on the Aff-Wild2 database. 3
Framework for economic cost assessment of land subsidence Kok, S; Costa, AL Land subsidence is increasingly recognised as a complex and costly challenge, especially in urban subsidence-prone areas. Often caused by overextraction of groundwater in order to meet increasing drinking and irrigation water demand, the expected damage from subsidence across the globe runs in the billions of dollars annually. Economic cost assessment can support the development of policies for dealing with subsidence, e.g. in problem analysis and evaluation of strategies. However, research on economic cost assessment of subsidence is limited and a standardized framework is lacking: this is an important basis for sound, comparable and consistent decision making and knowledge transfer across studies. Results from a review of existing literature indicate that there is (i) high variability in the literature in terms of subsidence characteristics and geographical origin, (ii) variation in addressed effects and applied economic assessment approaches and (iii) a lack of a standardized assessment framework. This complicates comparison and value transfer of results across settings. In this context, we propose a standardized framework for economic cost assessment of land subsidence along the lines of direct and indirect, market and non-market effects. 3
Seeker: Real-Time Interactive Search Biswas, A; Pham, TT; Vogelsong, M; Snyder, B; Nassif, H This paper introduces Seeker, a system that allows users to adaptively refine search rankings in real time, through a series of feedbacks in the form of likes and dislikes. When searching online, users may not know how to accurately describe their product of choice in words. An alternative approach is to search an embedding space, allowing the user to query using a representation of the item (like a tune for a song, or a picture for an object). However, this approach requires the user to possess an example representation of their desired item. Additionally, most current search systems do not allow the user to dynamically adapt the results with further feedback. On the other hand, users often have a mental picture of the desired item and are able to answer ordinal questions of the form: Is this item similar to what you have in mind? With this assumption, our algorithm allows for users to provide sequential feedback on search results to adapt the search feed. We show that our proposed approach works well both qualitatively and quantitatively. Unlike most previous representation-based search systems, we can quantify the quality of our algorithm by evaluating humans-in-the-loop experiments. 3
PennSyn2Real: Training Object Recognition Models Without Human Labeling Nguyen, T; Miller, ID; Cohen, A; Thakur, D; Guru, A; Prasad, S; Taylor, CJ; Chaudhari, P; Kumar, V Scalable training data generation is a critical problem in deep learning. We propose PennSyn2Real - a photo-realistic synthetic dataset consisting of more than 100 000 4K images of more than 20 types of micro aerial vehicles (MAVs). The dataset can be used to generate arbitrary numbers of training images for high-level computer vision tasks such as MAV detection and classification. Our data generation framework bootstraps chroma-keying, a mature cinematography technique, with a motion tracking system providing artifact-free and curated annotated images. Our system, therefore, allows object orientations and lighting to be controlled. This framework is easy to set up and can be applied to a broad range of objects, reducing the gap between synthetic and real-world data. We show that synthetic data generated using this framework can be directly used to train CNN models for common object recognition tasks such as detection and segmentation. We demonstrate competitive performance in comparison with training using only real images. Furthermore, bootstrapping the generated synthetic data in few-shot learning can significantly improve the overall performance, reducing the number of required training data samples to achieve the desired accuracy. 3
OpenCrowd: A Human-AI Collaborative Approach for Finding Social Influencers via Open-Ended Answers Aggregation Arous, I; Yang, J; Khayati, M; Cudre-Mauroux, P Finding social influencers is a fundamental task in many online applications ranging from brand marketing to opinion mining. Existing methods heavily rely on the availability of expert labels, whose collection is usually a laborious process even for domain experts. Using open-ended questions, crowdsourcing provides a cost-effective way to find a large number of social influencers in a short time. Individual crowd workers, however, only possess fragmented knowledge that is often of low quality. To tackle those issues, we present OpenCrowd, a unified Bayesian framework that seamlessly incorporates machine learning and crowdsourcing for effectively finding social influencers. To infer a set of influencers, OpenCrowd bootstraps the learning process using a small number of expert labels and then jointly learns a feature-based answer quality model and the reliability of the workers. Model parameters and worker reliability are updated iteratively, allowing their learning processes to benefit from each other until an agreement on the quality of the answers is reached. We derive a principled optimization algorithm based on variational inference with efficient updating rules for learning OpenCrowd parameters. Experimental results on finding social influencers in different domains show that our approach substantially improves the state of the art by 11.5% AUC. Moreover, we empirically show that our approach is particularly useful in finding micro-influencers, who are very directly engaged with smaller audiences. 3
Adverse Rainfall Shocks and Civil War: Myth or Reality? Maertens, R News reports and policymakers frequently link African civil conflicts and wars to agricultural crises caused by droughts. However, empirical studies of the relationship between rainfall and civil conflict or war remain inconclusive. I reexamine this relationship focusing on rainfall over each country's agricultural land during the growing seasons. I also incorporate that the relationship between rainfall and agricultural output is hump-shaped, as rainfall beyond a threshold decreases output. I find a U-shaped relationship between rainfall and the risk of civil conflict and war in (Sub-Saharan) African countries. This relationship mirrors the hump-shaped relationship between rainfall and agricultural output. 3
Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs Choudhary, N; Rao, N; Katariya, S; Subbian, K; Reddy, CK Knowledge Graphs (KGs) are ubiquitous structures for information storage in several real-world applications such as web search, e-commerce, social networks, and biology. Querying KGs remains a foundational and challenging problem due to their size and complexity. Promising approaches to tackle this problem include embedding the KG units (e.g., entities and relations) in a Euclidean space such that the query embedding contains the information relevant to its results. These approaches, however, fail to capture the hierarchical nature and semantic information of the entities present in the graph. Additionally, most of these approaches only utilize multihop queries (that can be modeled by simple translation operations) to learn embeddings and ignore more complex operations such as intersection, and union of simpler queries. To tackle such complex operations, in this paper, we formulate KG representation learning as a self-supervised logical query reasoning problem that utilizes translation, intersection and union queries over KGs. We propose Hyperboloid Embeddings (HypE), a novel self-supervised dynamic reasoning framework, that utilizes positive first-order existential queries on a KG to learn representations of its entities and relations as hyperboloids in a Poincare ball. HypE models the positive first-order queries as geometrical translation, intersection, and union. For the problem of KG reasoning in real-world datasets, the proposed HypE model significantly outperforms the state-of-the-art results. We also apply HypE to an anomaly detection task on a popular e-commerce website product taxonomy as well as hierarchically organized web articles and demonstrate significant performance improvements compared to existing baseline methods. Finally, we also visualize the learned HypE embeddings in a Poincare ball to clearly interpret and comprehend the representation space. 3
A Framework for pre-training hidden-unit conditional random fields and its extension to long short term memory networks Kim, YB; Stratos, K; Sarikaya, R In this paper, we introduce a simple unsupervised framework for pre-training hidden-unit conditional random fields (HUCRFs), i.e., learning initial parameter estimates for HUCRFs prior to supervised training. Our framework exploits the model structure of HUCRFs to make effective use of unlabeled data from the same domain or labeled data from a different domain. The key idea is to use the separation of HUCRF parameters between observations and labels: this allows us to pre-train observation parameters independently of label parameters. Pre-training is achieved by creating pseudo-labels from such resources. In the case of unlabeled data, we cluster observations and use the resulting clusters as pseudo-labels. Observation parameters can be trained on these resources and then transferred to initialize the supervised training process on the target labeled data. Experiments on various sequence labeling tasks demonstrate that the proposed pre-training method consistently yields significant improvement in performance. The core idea could be extended to other learning techniques including deep learning. We applied the proposed technique to recurrent neural networks (RNN) with long short term memory (LSTM) architecture and obtained similar gains. (C) 2017 Elsevier Ltd. All rights reserved. 3
Towards Efficient Tensor Decomposition-Based DNN Model Compression with Optimization Framework Yin, M; Sui, Y; Liao, SY; Yuan, B Advanced tensor decomposition, such as tensor train (TT) and tensor ring (TR), has been widely studied for deep neural network (DNN) model compression, especially for recurrent neural networks (RNNs). However, compressing convolutional neural networks (CNNs) using TT/TR always suffers significant accuracy loss. In this paper, we propose a systematic framework for tensor decomposition-based model compression using Alternating Direction Method of Multipliers (ADMM). By formulating TT decomposition-based model compression to an optimization problem with constraints on tensor ranks, we leverage ADMM technique to systemically solve this optimization problem in an iterative way. During this procedure, the entire DNN model is trained in the original structure instead of TT format, but gradually enjoys the desired low tensor rank characteristics. We then decompose this uncompressed model to TT format, and fine-tune it to finally obtain a high-accuracy TT-format DNN model. Our framework is very general, and it works for both CNNs and RNNs, and can be easily modified to fit other tensor decomposition approaches. We evaluate our proposed framework on different DNN models for image classification and video recognition tasks. Experimental results show that our ADMM-based TT-format models demonstrate very high compression performance with high accuracy. Notably, on CIFAR-100, with 2.3x and 2.4x compression ratios, our models have 1.96% and 2.21% higher top-1 accuracy than the original ResNet-20 and ResNet-32, respectively. For compressing ResNet-18 on ImageNet, our model achieves 2.47x FLOPs reduction without accuracy loss. 3
WPNets and PWNets: From the Perspective of Channel Fusion Liang, DJ; Yang, F; Zhang, T; Tian, J; Yang, P The performance and parameters of neural networks have a positive correlation, and there are a lot of parameter redundancies in the existing neural network architectures. By exploring the channels relationship of the whole and part of the neural network, the architectures of the convolution network with the tradeoff between the parameters and the performance are obtained. Two network architectures are implemented by dividing the convolution kernels of one layer into multiple groups, thus ensuring that the network has more connections and fewer parameters. In these two network architectures, the information of one network flows from the whole to the part, which is called whole-to-part connected networks (WPNets), and the information of the other network flows from the part to the whole, which is called part-to-whole connected networks (PWNets). WPNets use the whole channel information to enhance partial channel information, and the PWNets use partial channel information to generate or enhance the whole channel information. We evaluate the proposed architectures on three competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet), and our models obtain comparable results even with far fewer parameters compared to many state of the arts. Our network architecture code is available at github. 3
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases Weikum, G; Dong, XLN; Razniewski, S; Suchanek, F Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods. 3
Dynamic dispatch policies for emergency response with multiple types of vehicles Yoon, S; Albert, LA Emergency medical service (EMS) systems have two main goals when sending ambulances to patients: rapidly responding to patients and sending the right type of personnel to patients based on their health needs. We address these issues by formulating and studying a Markov decision process model that determines which type of ambulances (servers) to send to patients in real-time. The base model considers a loss system over a finite time horizon, and we provide a model variant that considers an infinite time horizon and the average reward criterion. Structural properties of the optimal policies are derived. Computational experiments using a real-world EMS dataset show that the optimal policies inform how to dynamically dispatch ambulance types to patients. We propose and evaluate three classes of heuristics, including a static constant threshold heuristic, a greedy heuristic, and a dynamic greedy threshold heuristic. Computational results suggest that the greedy threshold heuristic closely approximates the optimal policies and reduces the complexity of implementing dynamic policies in real settings. 3
SOTER on ROS: A Run-Time Assurance Framework on the Robot Operating System Shivakumar, S; Torfah, H; Desai, A; Seshia, SA We present an implementation of SOTER, a run-time assurance framework for building safe distributed mobile robotic (DMR) systems, on top of the Robot Operating System (ROS). The safety of DMR systems cannot always be guaranteed at design time, especially when complex, off-the-shelf components are used that cannot be verified easily. SOTER addresses this by providing a language-based approach for run-time assurance for DMR systems. SOTER implements the reactive robotic software using the language P, a domain-specific language designed for implementing asynchronous event-driven systems, along with an integrated run-time assurance system that allows programmers to use unfortified components but still provide safety guarantees. We describe an implementation of SOTER for ROS and demonstrate its efficacy using a multi-robot surveillance case study, with multiple run-time assurance modules. Through rigorous simulation, we show that SOTER enabled systems ensure safety, even when using unknown and untrusted components. 3
Enabling data mining of handwritten coursework Stahovich, TF; Lin, HL Data mining has become an increasingly important tool for education researchers and practitioners. However, work in this field has focused on data from online educational systems. Here, we present techniques to enable data mining of handwritten coursework, which is an essential component of instruction in many disciplines. Our techniques include methods for classifying pen strokes as diagram, equation, and cross-out strokes. The latter are used to strike out erroneous work. We have also created techniques for grouping equation strokes into equation groups and then individual characters. Our results demonstrate that our classification and grouping techniques are more accurate than prior techniques for this task. We also demonstrate applications of our techniques for automated assessment of student competence. We present a novel approach for measuring the correctness of exam solutions from an analysis of lexical features of handwritten equations. This analysis demonstrates, for example, that the number of equation groups correlates positively with grade. We also use our techniques to extend graphical protocol analysis to free-form, handwritten problem solutions. While prior work in a laboratory setting suggests that long pauses are indicative of low competence, our work shows that the frequency of long pauses during exams correlates positively with competence. (C) 2016 Published by Elsevier Ltd. 3
Arabic community question answering Nakov, P; Marquez, L; Moschitti, A; Mubarak, H We analyze resources and models for Arabic community Question Answering (cQA). In particular, we focus on CQA-MD, our cQA corpus for Arabic in the domain of medical forums. We describe the corpus and the main challenges it poses due to its mix of informal and formal language, and of different Arabic dialects, as well as due to its medical nature. We further present a shared task on cQA at SemEval, the International Workshop on Semantic Evaluation, based on this corpus. We discuss the features and the machine learning approaches used by the teams who participated in the task, with focus on the models that exploit syntactic information using convolutional tree kernels and neural word embeddings. We further analyze and extend the outcome of the SemEval challenge by training a meta-classifier combining the output of several systems. This allows us to compare different features and different learning algorithms in an indirect way. Finally, we analyze the most frequent errors common to all approaches, categorizing them into prototypical cases, and zooming into the way syntactic information in tree kernel approaches can help solve some of the most difficult cases. We believe that our analysis and the lessons learned from the process of corpus creation as well as from the shared task analysis will be helpful for future research on Arabic cQA. 3
Political Turnover, Bureaucratic Turnover, and the Quality of Public Services Akhtari, M; Moreira, D; Trucco, L We study how political turnover in mayoral elections in Brazil affects public service provision by local governments. Exploiting a regression discontinuity design for close elections, we find that municipalities with a new party in office experience upheavals in the municipal bureaucracy: new personnel are appointed across multiple service sectors, and at both managerial and non-managerial levels. In education, the increase in the replacement rate of personnel in schools controlled by the municipal government is accompanied by test scores that are 0.05-0.08 standard deviations lower. In contrast, turnover of the mayor's party does not impact local (non-municipal) schools. These findings suggest that political turnover can adversely affect the quality of public services when the bureaucracy is not shielded from the political process. 3
Narrative context-based data-to-text generation for ambient intelligence Jang, J; Noh, H; Lee, Y; Pantel, SM; Rim, H In this paper, we propose a language generation model for the world of ambient intelligence (AmI). Various devices in use today are connected to the Internet and are used to provide a considerable amount of information. Because language is the most effective way for humans to communicate with one another, one approach to controlling AmI devices is to use a smart assistant based on language systems. One such framework for data-to-text generation is the natural language generation (NLG) model that generates text from non-linguistic data. Previously proposed NLG models employed heuristic-based approaches to generate relatively short sentences. We find that such approaches are structurally inflexible and tend to generate text that is not diverse. Moreover, there are various domains where numerical values are important, such as sports, finance, and weather. These values need to be generated in terms of categorical information. (e.g., hits, homeruns, and strikeouts.) In the generated outputs, the numerical values often do not accurately correspond to categorical information. Our proposed data-to-text generation model provides both diversity and coherence of information through a narrative context and a copy mechanism. It allows for the learning of the narrative context and sentence structures from a domain corpus without requiring additional explanation of the intended category or sentential grammars. The results of experiments performed from various perspectives show that the proposed model generates text outputs containing diverse and coherent information. 3
Risk management for cyber-infrastructure protection: A bi-objective integer programming approach Schmidt, A; Albert, LA; Zheng, KY Information and communication technology supply chains present risks that are complex and difficult for organizations to manage. The cost and benefit of proposed security controls must be assessed to best match an organizational risk tolerance and direct the use of security resources. In this paper, we present integer and stochastic optimization models for selecting a portfolio of security controls within an organizational budget. We consider two objectives: to maximize the risk reduction across all potential attacks and to maximize the number of attacks whose risk levels are lower than a risk threshold after security controls are applied. Deterministic and stochastic bi-objective budgeted difficulty-threshold control selection problems are formulated for selecting mitigating controls to reflect an organization's risk preference. In the stochastic problem, we consider uncertainty as to whether the selected controls can reduce the risks associated with attacks. We demonstrate through a computational study that the trade-off between the two objectives is important to consider for certain risk preferences and budgets. We demonstrate the value of the stochastic model when a relatively high number of attacks are desired to be secured past a risk threshold and show the deterministic solution provides near optimal solutions otherwise. We provide an analysis of model solutions. 3
Aerial Road Segmentation in the Presence of Topological Label Noise Henry, C; Fraundorfer, F; Vig, E The availability of large-scale annotated datasets has enabled Fully-Convolutional Neural Networks to reach outstanding performance on road extraction in aerial images. However, high-quality pixel-level annotation is expensive to produce and even manually labeled data often contains topological errors. Trading off quality for quantity, many datasets rely on already available yet noisy labels, for example from OpenStreetMap. In this paper, we explore the training of custom U-Nets built with ResNet and DenseNet backbones using noise-aware losses that are robust towards label omission and registration noise. We perform an extensive evaluation of standard and noise-aware losses, including a novel Bootstrapped DICE-Coefficient loss, on two challenging road segmentation benchmarks. Our losses yield a consistent improvement in overall extraction quality and exhibit a strong capacity to cope with severe label noise. Our method generalizes well to two other fine-grained topology delineation tasks: surface crack detection for quality inspection and cell membrane extraction in electron microscopy imagery. 3
Reinforcement learning and its connections with neuroscience and psychology Subramanian, A; Chitlangia, S; Baths, V Reinforcement learning methods have recently been very successful at performing complex sequential tasks like playing Atari games, Go and Poker. These algorithms have outperformed humans in several tasks by learning from scratch, using only scalar rewards obtained through interaction with their environment. While there certainly has been considerable independent innovation to produce such results, many core ideas in reinforcement learning are inspired by phenomena in animal learning, psychology and neuroscience. In this paper, we comprehensively review a large number of findings in both neuroscience and psychology that evidence reinforcement learning as a promising candidate for modeling learning and decision making in the brain. In doing so, we construct a mapping between various classes of modern RL algorithms and specific findings in both neurophysiological and behavioral literature. We then discuss the implications of this observed relationship between RL, neuroscience and psychology and its role in advancing research in both AI and brain science. (C) 2021 Elsevier Ltd. All rights reserved. 3
Service Discovery based Blue-Green Deployment Technique in Cloud Native Environments Yang, B; Sailer, A; Jain, S; Tomala-Reyes, AE; Singh, M; Ramnath, A The paradigm of the cloud computing model brings the expectations of the services consumed from the cloud to be always on and available for global consumption around the clock. Hence, the challenge for the service providers becomes to accommodate the planned maintenance windows to keep them to a minimum duration in order to reduce the impact on the consumers of their services. In this paper, we assess the continuous delivery methodology called Blue/Green deployment which targets to enable service updates with zero maintenance windows, and thus with no disruption to the end users. To overcome current issues we proposed an optimized Blue/Green technique based on automated service discovery, dynamic routing and automated application deployment. Our experiments indicate that the service discovery based Blue/Green deployment has a better performance for service continuous delivery compared to the available technologies. 3
How To Use Neural Networks To Investigate Quantum Many-Body Physics Carrasquilla, J; Torlai, G Over the past few years, machine learning has emerged as a powerful computational tool to tackle complex problems in a broad range of scientific disciplines. In particular, artificial neural networks have been successfully used to mitigate the exponential complexity often encountered in quantum many-body physics, the study of properties of quantum systems built from a large number of interacting particles. In this article, we review some applications of neural networks in condensed matter physics and quantum information, with particular emphasis on hands-on tutorials serving as a quick start for a newcomer to the field. The prerequisites of this tutorial are basic probability theory and calculus, linear algebra, basic notions of neural networks, statistical physics, and quantum mechanics. The reader is introduced to supervised machine learning with convolutional neural networks to learn a phase transition, unsupervised learning with restricted Boltzmann machines to perform quantum tomography, and the variational Monte Carlo method with recurrent neural networks for approximating the ground state of a many-body Hamiltonian. For each algorithm, we briefly review the key ingredients and their corresponding neural network implementation, and show numerical experiments for a system of interacting Rydberg atoms in two dimensions. 3
Learning pairwise patterns in Community Question Answering Filice, S; Moschitti, A In recent years, forums offering community Question Answering (cQA) services gained popularity on the web, as they offer a new opportunity for users to search and share knowledge. In fact, forums allow users to freely ask questions and expect answers from the community. Although the idea of receiving a direct, targeted response from other users is very attractive, it is not rare to see long threads of comments, where only a small portion of them are actually valid answers. In many cases users start conversations, ask for other information, and discuss about things, which are not central to the original topic. Therefore, finding the desired information in a long list of answers might be very time-consuming. Designing automatic systems to select good answers is not an easy task. In many cases the question and the answer do not share a large textual content, and approaches based on measuring the question-answer similarity will often fail. A more intriguing and promising approach would be trying to define valid question-answer templates and use a system to understand whether any of these templates is satisfied for a given question-answer pair. Unfortunately, the manual definition of these templates is extremely complex and requires a domain-expert. In this paper, we propose a supervised kernel-based framework that automatically learns from training question-answer pairs the syntactic/semantic patterns useful to recognize good answers. We carry out a detailed experimental evaluation, where we demonstrate that the proposed framework achieves state-of-the-art results on the Qatar Living datasets released in three different editions of the Community Question Answering Challenge of SemEval. 3
A New NTRU-Type Public-Key Cryptosystem over the Binary Field Gu, YY; Xie, XW; Gu, CS With the development of cloud computing and the convenience of wireless sensor networks, smart devices are widely used in daily life, but the security issues of the smart devices have not been well resolved. In this paper, we present a new NTRU-type public-key cryptosystem over the binary field. Specifically, the security of our scheme relies on the computational intractability of an unbalanced sparse polynomial ratio problem (DUSPR). Through theoretical analysis, we prove the correctness of our proposed cryptosystem. Furthermore, we implement our scheme using the NTL library, and conduct a group of experiments to evaluate the capabilities and consuming time of encryption and decryption. Our experimental results demonstrate that the NTRU-type public-key cryptosystem over the binary field is relatively practical and effective. 3
DeCaf: Diagnosing and Triaging Performance Issues in Large-Scale Cloud Services Bansal, C; Renganathan, S; Asudani, A; Midy, O; Janakiraman, M Large scale cloud services use Key Performance Indicators (KPIs) for tracking and monitoring performance. They usually have Service Level Objectives (SLOs) baked into the customer agreements which are tied to these KPIs. Dependency failures, code bugs, infrastructure failures, and other problems can cause performance regressions. It is critical to minimize the time and manual effort in diagnosing and triaging such issues to reduce customer impact. Large volume of logs and mixed type of attributes (categorical, continuous) in the logs makes diagnosis of regressions non-trivial. In this paper, we present the design, implementation and experience from building and deploying DeCaf, a system for automated diagnosis and triaging of KPI issues using service logs. It uses machine learning along with pattern mining to help service owners automatically root cause and triage performance issues. We present the learnings and results from case studies on two large scale cloud services in Microsoft where DeCaf successfully diagnosed 10 known and 31 unknown issues. DeCaf also automatically triages the identified issues by leveraging historical data. Our key insights are that for any such diagnosis tool to be effective in practice, it should a) scale to large volumes of service logs and attributes, b) support different types of KPIs and ranking functions, c) be integrated into the DevOps processes. 3
Learning a Complete Image Indexing Pipeline Jain, H; Zepeda, J; Perez, P; Gribonval, R To work at scale, a complete image indexing system comprises two components: An inverted file index to restrict the actual search to only a subset that should contain most of the items relevant to the query; An approximate distance computation mechanism to rapidly scan these lists. While supervised deep learning has recently enabled improvements to the latter, the former continues to be based on unsupervised clustering in the literature. In this work, we propose a first system that learns both components within a unifying neural framework of structured binary encoding. 3
Displacement-Based Simplified Seismic Loss Assessment of Steel Buildings Cantisani, G; Della Corte, G; Sullivan, TJ; Roldan, R Evaluating losses consequent to damages induced by earthquakes has become one main paradigm in seismic performance assessment. Within this context, this paper explores a simplified seismic loss assessment methodology for steel structures that utilises displacement-based assessment in place of non-linear dynamic analyses. To this end, two case studies have been considered: (i) an archetype existing single-storey non-residential building, (ii) a prototype new multi-story residential building. Comparing results obtained using the fully probabilistic and the simplified displacement-based assessment methodologies allows the benefits of the simplified method to be highlighted. Possible difficulties encountered in applying the simplified method are also identified. 3
The Trade-Comovement Puzzle Drozd, LA; Kolbin, S; Nosal, JB Standard international transmission mechanism of productivity shocks predicts a weak endogenous linkage between trade and business cycle synchronization: a problem known as the trade-comovement puzzle. We provide the foundational analysis of the puzzle, pointing to three natural candidate resolutions: (i) financial market frictions, (ii) Greenwood-Hercowitz-Huffman preferences, and (iii) dynamic trade elasticity that is low in the short run but high in the long run. We show the effects of each of these candidate resolutions analytically and evaluate them quantitatively. We find that while (i) and (ii) fall short of the data, (iii) goes a long way toward resolving the puzzle. 3
Optimal Pricing and Inventory Strategies for Introducing a New Product Based on Demand Substitution Effects Dong, ZS; Chen, W; Zhao, Q; Li, JQ This paper studies a single-period inventory-pricing problem with two substitutable products, which is very important in the area of Operations Management but has received little attention. The proposed problem focuses on determining the optimal price of the existing product and the inventory level of the new product. Inspired by practice, the problem considers various pricing strategies for the existing product as well as the cross elasticity of demand between existing and new products. A mathematical model has been developed for different pricing strategies to maximize the expected profit. It has been proven that the objective function is concave and there exists a unique optimal solution. Different sets of computational examples are conducted to show that the optimal pricing and inventory strategy generated by the model can increase profits. 3
Semantic granularity metric learning for visual search Manandhar, D; Bastan, M; Yap, KH Existing metric learning methods often do not consider different granularities in visual similarity. However, in many domains, images exhibit similarity at multiple granularities with visual semantic concepts, e.g. fashion demonstrates similarity ranging from clothing of the exact same instance to similar looks/design or common category. Therefore, training image triplets/pairs inherently possess different degrees of information. Nevertheless, the existing methods often treat them with equal importance, which hinders capturing underlying granularities in image similarity. In view of this, we propose a new semantic granularity metric learning (SGML) that develops a novel idea of detecting and leveraging attribute semantic space and integrating it into deep metric learning to capture multiple granularities of similarity. The proposed framework simultaneously learns image attributes and embeddings with a multitask-CNN where the tasks are linked by semantic granularity similarity mapping to leverage correlations between the tasks. To this end, we propose a new soft-binomial deviance loss that effectively integrates informativeness of training samples into metric learning on-the-fly during training. Compared to recent ensemble-based methods, SGML is conceptually elegant, computationally simple yet effective. Extensive experiments on benchmark datasets demonstrate its superiority, e.g., 1-4.5% Recall@1 improvement over the state-of-the-arts (Kim et al., 2018; Cakir et al., 2019) on DeepFashion-Inshop. 3
Characterizing Public Cloud Resource Contention to Support Virtual Machine Co-residency Prediction Han, XL; Schooley, R; Mackenzie, D; David, O; Lloyd, WJ Hypervisors used to implement virtual machines (VMs) for infrastructure-as-a-service (IaaS) cloud platforms have undergone continued improvements for the past decade. VM components including CPU, memory, network, and storage I/O have evolved from full software emulation, to paravirtualization, to hardware virtualization. While these innovations have helped reduce performance overhead when simulating a computer, considerable performance loss is still possible in the public cloud from resource contention of co-located VMs. In this paper, we investigate the extent of performance degradation from resource contention by leveraging well-known benchmarks run in parallel across three generations of virtualization hypervisors. Using a Python-based test harness we orchestrate execution of CPU, disk, and network I/O bound benchmarks across up to 48 VMs sharing the same Amazon Web Services dedicated host server. We found that executing benchmarks on hosts with many idle Linux VMs produced unexpected performance degradation. As public cloud users are interested in avoiding resource contention from co-located VMs, we next leveraged our dedicated host performance measurements as independent variables to train models to predict the number of co-resident VMs. We evaluated multiple linear regression and random forest models using test data from independent benchmark runs across 96 vCPU dedicated hosts running up to 48 x 2 vCPU VMs where we controlled VM placements. Multiple linear regression over normalized data achieved R^2 = .942, with mean absolute error of VM co-residency predictions of +/- 1.61 VMs. We then leveraged our models to infer VM co-residency among a set of 50 VMs on the public cloud, where co-location data is unavailable. Here models cannot be independently verified, but results suggest the relative occupancy level of public cloud hosts enabling users to infer when their VMs reside on busy hosts. Our results characterize how recent hypervisor and hardware advancements are addressing resource contention, while demonstrating the potential to leverage co-located benchmarks for VM co-residency prediction in a public cloud. 3
CVaR-LASSO Enhanced Index Replication (CLEIR): outperforming by minimizing downside risk Gendreau, B; Jin, Y; Nimalendran, M; Zhong, XL Index-funds are one of the most popular investment vehicles among investors, with total assets indexed to the S&P500 exceeding $8.7 trillion at the end of 2016. Recently, enhanced-index-funds, which seek to outperform an index while maintaining a similar risk-profile, have grown in popularity. We propose an enhanced-index-tracking method that uses the linear absolute shrinkage selection operator (LASSO) method to minimize the Conditional Value-at-Risk (CVaR) of the tracking error. This minimizes the large downside tracking-error while keeping the upside. Using historical and simulated data, our CLEIR method outperformed the benchmark with a tracking error of . The effect is more pronounced when the number of the constituents is large. Using 50-80 large stocks in the S&P 500 index, our method closely tracked the benchmark with an alpha of 2.55%. 3
Workshop on Privacy in NLP (PrivateNLP 2020) Feyisetan, O; Ghanavati, S; Thaine, P Privacy-preserving data analysis has become essential in Machine Learning (ML), where access to vast amounts of data can provide large gains in the accuracies of tuned models. A large proportion of user-contributed data comes from natural language, e.g., text transcriptions from voice assistants. It is therefore important for curated natural language datasets to preserve the privacy of the users whose data is collected and for the models trained on sensitive data to only retain non-identifying (i.e., generalizable) information. The workshop aims to bring together researchers and practitioners from academia and industry to discuss the challenges and approaches to designing, building, verifying, and testing privacy-preserving systems in the context of Natural Language Processing (NLP). 3
Spatial decomposition of magnetic anisotropy in magnets: Application to doped Fe16N2 Sun, Y; Yao, YX; Nguyen, MC; Wang, CZ; Ho, KM; Antropov, V We propose a scheme of decomposition of the total relativistic energy in solids to intra- and interatomic contributions. The method is based on a site variation of such fundamental constant as the speed of light. As a practical illustration of the method, we tested such decomposition in the case of a spin-orbit interaction variation for the decomposition of the magnetic anisotropy energy (MAE) in CoPt. We further studied the alpha-Fe16N2 magnet doped by Bi, Sb, Co, and Pt atoms. It was found that the addition of Pt atoms can enhance the MAE by as much as five times while Bi and Sb substitutions double the total MAE. Using the proposed technique, we demonstrate the spatial distribution of these enhancements. Our studies also suggest that Sb, Pt, and Co substitutions could be synthesized by experiments. 3
MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand Pose Estimation Chen, LJ; Lin, SY; Xie, YS; Lin, YY; Xie, XH Estimating 3D hand poses from a single RGB image is challenging because depth ambiguity makes the problem ill-posed. Training hand pose estimators with 3D hand mesh annotations and multi-view images often results in significant performance gains. However, existing multi-view datasets are relatively small with hand joints annotated by off-the-shelf trackers or automated through model predictions, both of which may be inaccurate and can introduce biases. Collecting large-scale multi-view 3D hand pose images with accurate mesh and joint annotations is valuable but strenuous. In this paper, we design a spin match algorithm that enables a rigid mesh model matching with any target mesh ground truth. Based on the match algorithm, we propose an efficient pipeline to generate a large-scale multi-view hand mesh (MVHM) dataset with accurate 3D hand mesh and joint labels. We further present a multi-view hand pose estimation approach to verify that training a hand pose estimator with our generated dataset greatly enhances the performance. Experimental results show that our approach achieves a performance of 0.990 in AUC(20-50) on the MHP dataset compared to the previous state-of-the-art of 0.939 on this dataset. Our dataset is available at https://github.com/Kuzphi/MVHM. 3
Randomized learning and generalization of fair and private classifiers: From PAC-Bayes to stability and differential privacy Oneto, L; Donini, M; Pontil, M; Shawe-Taylor, J We address the problem of randomized learning and generalization of fair and private classifiers. From one side we want to ensure that sensitive information does not unfairly influence the outcome of a classifier. From the other side we have to learn from data while preserving the privacy of individual observations. We initially face this issue in the PAC-Bayes framework presenting an approach which trades off and bounds the risk and the fairness of the randomized (Gibbs) classifier. Our new approach is able to handle several different state-of-the-art fairness measures. For this purpose, we further develop the idea that the PAC-Bayes prior can be defined based on the data-generating distribution without actually knowing it. In particular, we define a prior and a posterior which give more weight to functions with good generalization and fairness properties. Furthermore, we will show that this randomized classifier possesses interesting stability properties using the algorithmic distribution stability theory. Finally, we will show that the new posterior can be exploited to define a randomized accurate and fair algorithm. Differential privacy theory will allow us to derive that the latter algorithm has interesting privacy preserving properties ensuring our threefold goal of good generalization, fairness, and privacy of the final model. (C) 2020 Elsevier B.V. All rights reserved. 3
Policy Challenges in Mapping Internet Interdomain Congestion Claffy, K; Clark, DD; Bauer, S; Dhamdhere, A Interconnection links connecting broadband access providers with their peers, transit providers and major content providers are a potential point of discriminatory treatment and impairment of user experience. However, adequate data to shed light on this situation is lacking, and different actors can put forward opportunistic interpretations of data to support their points of view. In this article, we introduce a topology-aware model of interconnection to elucidate our own beliefs about how to measure interconnection links of access providers and how policymakers should interpret the results. We use six case studies that show how our conceptual model can guide a critical analysis of what is or should be measured and reported, and how to soundly interpret these measurements. 3
Maximizing Determinants under Matroid Constraints Madan, V; Nikolov, A; Singh, M; Tantipongpipat, U Given a set of vectors v_1, ..., v_n in R^d and a matroid M = ([n], I), we study the problem of finding a basis S of M such that det(sum_{i in S} v_i v_i^T) is maximized. This problem appears in a diverse set of areas, such as experimental design, fair allocation of goods, network design, and machine learning. The current best results include an e^(2k)-estimation for any matroid of rank k [8] and a (1 + epsilon)^d-approximation for a uniform matroid of rank k >= d + d/epsilon [30], where the rank k >= d denotes the desired size of the optimal set. Our main result is a new approximation algorithm for the general problem with an approximation guarantee that depends only on the dimension d of the vectors, and not on the size k of the output set. In particular, we show an (O(d))^d-estimation and an (O(d))^(d^3)-approximation for any matroid, giving a significant improvement over prior work when k >> d. Our result relies on showing that there exists an optimal solution to a convex programming relaxation for the problem which has sparse support; in particular, no more than O(d^2) variables of the solution have fractional values. The sparsity results rely on the interplay between the first-order optimality conditions for the convex program and matroid theory. We believe that the techniques introduced to show sparsity of optimal solutions to convex programs will be of independent interest. We also give a new randomized rounding algorithm that crucially exploits the sparsity of solutions to the convex program. To show the approximation guarantee, we utilize recent works on strongly log-concave polynomials [8], [4] and show new relationships between different convex programs [33], [6] studied for the problem. Finally, we show how to use the estimation algorithm to give an efficient deterministic approximation algorithm. Once again, the algorithm crucially relies on sparsity of the fractional solution to guarantee that the approximation factor depends solely on the dimension d. 3
DeepTC-Enhancer: Improving the Readability of Automatically Generated Tests Roy, D; Zhang, ZY; Ma, M; Arnaoudova, V; Panichella, A; Panichella, S; Gonzalez, D; Mirakhorli, M Automated test case generation tools have been successfully proposed to reduce the amount of human and infrastructure resources required to write and run test cases. However, recent studies demonstrate that the readability of generated tests is very limited due to (i) uninformative identifiers and (ii) lack of proper documentation. Prior studies proposed techniques to improve test readability by either generating natural language summaries or meaningful methods names. While these approaches are shown to improve test readability, they are also affected by two limitations: (1) generated summaries are often perceived as too verbose and redundant by developers, and (2) readable tests require both proper method names but also meaningful identifiers (within-method readability). In this work, we combine template based methods and Deep Learning (DL) approaches to automatically generate test case scenarios (elicited from natural language patterns of test case statements) as well as to train DL models on path-based representations of source code to generate meaningful identifier names. Our approach, called DeepTC-Enhancer, recommends documentation and identifier names with the ultimate goal of enhancing readability of automatically generated test cases. An empirical evaluation with 36 external and internal developers shows that (1) DeepTC-Enhancer outperforms significantly the baseline approach for generating summaries and performs equally with the baseline approach for test case renaming, (2) the transformation proposed by DeepTC-Enhancer results in a significant increase in readability of automatically generated test cases, and (3) there is a significant difference in the feature preferences between external and internal developers. 3
Agile reactive navigation for a non-holonomic mobile robot using a pixel processor array Liu, YN; Bose, L; Greatwood, C; Chen, JN; Fan, R; Richardson, T; Carey, SJ; Dudek, P; Mayol-Cuevas, W This paper presents an agile reactive navigation strategy for driving a non-holonomic ground vehicle around a pre-set course of gates in a cluttered environment using a low-cost processor array sensor. This enables machine vision tasks to be performed directly upon the sensor's image plane, rather than using a separate general-purpose computer. The authors demonstrate a small ground vehicle running through or avoiding multiple gates at high speed using minimal computational resources. To achieve this, target tracking algorithms are developed for the Pixel Processing Array and captured images are then processed directly on the vision sensor acquiring target information for controlling the ground vehicle. The algorithm can run at up to 2000 fps outdoors and 200 fps at indoor illumination levels. Conducting image processing at the sensor level avoids the bottleneck of image transfer encountered in conventional sensors. The real-time performance of on-board image processing and robustness is validated through experiments. Experimental results demonstrate the algorithm's ability to enable a ground vehicle to navigate at an average speed of 2.20 m/s for passing through multiple gates and 3.88 m/s for a 'slalom' task in an environment featuring significant visual clutter. 3
Optimal Covariate Balancing Conditions in Propensity Score Estimation Fan, JQ; Imai, K; Lee, I; Liu, H; Ning, Y; Yang, XL Inverse probability of treatment weighting (IPTW) is a popular method for estimating the average treatment effect (ATE). However, empirical studies show that the IPTW estimators can be sensitive to the misspecification of the propensity score model. To address this problem, researchers have proposed to estimate propensity score by directly optimizing the balance of pretreatment covariates. While these methods appear to empirically perform well, little is known about how the choice of balancing conditions affects their theoretical properties. To fill this gap, we first characterize the asymptotic bias and efficiency of the IPTW estimator based on the covariate balancing propensity score (CBPS) methodology under local model misspecification. Based on this analysis, we show how to optimally choose the covariate balancing functions and propose an optimal CBPS-based IPTW estimator. This estimator is doubly robust; it is consistent for the ATE if either the propensity score model or the outcome model is correct. In addition, the proposed estimator is locally semiparametric efficient when both models are correctly specified. To further relax the parametric assumptions, we extend our method by using a sieve estimation approach. We show that the resulting estimator is globally efficient under a set of much weaker assumptions and has a smaller asymptotic bias than the existing estimators. Finally, we evaluate the finite sample performance of the proposed estimators via simulation and empirical studies. An open-source software package is available for implementing the proposed methods. 3
Energy aware scheduling on heterogeneous multiprocessors with DVFS and Duplication Singh, J; Gujral, A; Singh, H; Singh, JU; Auluck, N Duplication and dynamic voltage/frequency scaling (DVFS) creates an interesting trade-off for scheduling task graphs on multiprocessors to improve energy consumption and schedule length (or makespan). With DVFS, tasks are made to run on low voltages, which decreases their computation power. However, it also increases their execution costs and hence, may increase the schedule length. Furthermore, applying DVFS on processors does not impact the communication delay/energy consumption. Duplicating a task on multiple processors reduces the communication delay among them, which further reduces the schedule length. Although duplication reduces the communication energy among processors, it also increases the overall computation energy. In this paper, we explore this trade-off between duplication and DVFS, and propose a polynomial time heuristic to schedule task graphs on heterogeneous multiprocessors. The tasks are carefully duplicated with DVFS to reduce its impact on the computation energy. The results demonstrate that the proposed algorithm is able to effectively balance the makespan and energy consumption over other algorithms in various scenarios. 3
Planar graphs: Random walks and bipartiteness testing Czumaj, A; Monemizadeh, M; Onak, K; Sohler, C We initiate the study of property testing in arbitrary planar graphs. We prove that bipartiteness can be tested in constant time, improving on the previous bound of Õ(√n) for graphs on n vertices. The constant-time testability was only known for planar graphs with bounded degree. Our algorithm is based on random walks. Since planar graphs have good separators, that is, bad expansion, our analysis diverges from standard techniques that involve the fast convergence of random walks on expanders. We reduce the problem to the task of detecting an odd-parity cycle in a multigraph induced by constant-length cycles. We iteratively reduce the length of cycles while preserving the detection probability, until the multigraph collapses to a collection of easily discoverable self-loops. Our approach extends to arbitrary minor-free graphs. We also believe that our techniques will find applications to testing other properties in arbitrary minor-free graphs. 3
Implementation, recruitment and baseline characteristics: A randomized trial of combined treatments for smoking cessation and weight control Bush, T; Lovejoy, J; Javitz, H; Mahuna, S; Torres, AJ; Wassum, K; Magnusson, B; Benedict, C; Spring, B Background: Two-thirds of treatment-seeking smokers are obese or overweight. Most smokers are concerned about gaining weight after quitting. The average smoker experiences modest post-quit weight gain which discourages many smokers from quitting. Although evidence suggests that combined interventions to help smokers quit smoking and prevent weight gain can be helpful, studies have not been replicated in real world settings. Methods: This paper describes recruitment and participant characteristics of the Best Quit Study, a 3-arm randomized controlled trial testing tobacco cessation treatment alone or combined with simultaneous or sequential weight management. Study participants were recruited via tobacco quitlines from August 5, 2013 to December 15, 2014. Results: Statistical analysis on baseline data was conducted in 2015/2016. Among 5082 potentially eligible callers to a tobacco quitline, 2540 were randomized (50% of eligible). Compared with individuals eligible but not randomized, those randomized were significantly more likely to be female (65.7% vs 54.5%, p < 0.01), overweight or obese (76.3% vs 62.5%, p < 0.01), more confident in quitting (p < 0.01), more addicted (first cigarette within 5 min: 50.0% vs 44.4%, p < 0.01), and have a chronic disease (28.6% vs. 24.4%, p < 0.01). Randomized groups were not statistically significantly different on demographics, tobacco or weight variables. Two-thirds of participants were female and white with a mean age of 43. Conclusions: Adding weight management interventions to tobacco cessation quitlines was feasible and acceptable to smokers. If successful for cessation and weight outcomes, a combined intervention may provide a treatment approach for addressing weight gain with smoking cessation through tobacco quitlines. 3
Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning Salvador, A; Gundogdu, E; Bazzani, L; Donoser, M Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models. In this work, we revisit existing approaches for cross-modal recipe retrieval and propose a simplified end-to-end model based on well established and high performing encoders for text and images. We introduce a hierarchical recipe Transformer which attentively encodes individual recipe components (titles, ingredients and instructions). Further, we propose a self-supervised loss function computed on top of pairs of individual recipe components, which is able to leverage semantic relationships within recipes, and enables training using both image-recipe and recipe-only samples. We conduct a thorough analysis and ablation studies to validate our design choices. As a result, our proposed method achieves state-of-the-art performance in the cross-modal recipe retrieval task on the Recipe1M dataset. We make code and models publicly available(1). 2
Machine Learning @ Amazon Rastogi, R In this talk, I will first provide an overview of key problem areas where we are applying Machine Learning (ML) techniques within Amazon such as product demand forecasting, product search, and information extraction from reviews, and associated technical challenges. I will then talk about three specific applications where we use a variety of methods to learn semantically rich representations of data: question answering where we use deep learning techniques, product size recommendations where we use probabilistic models, and fake reviews detection where we use tensor factorization algorithms. 2
SuperPart: Supervised graph partitioning for record linkage Reas, R; Ash, S; Barton, R; Borthwick, A Identifying sets of items that are equivalent to one another is a problem common to many fields. Systems addressing this generally have at their core a function s(d_i, d_j) for computing the similarity between pairs of records d_i, d_j. The output of s() can be interpreted as a weighted graph where edges indicate the likelihood of two records matching. Partitioning this graph into equivalence classes is non-trivial due to the presence of inconsistencies and imperfections in s(). Numerous algorithmic approaches to the problem have been proposed, but (1) it is unclear which approach should be used on a given dataset; (2) the algorithms do not generally output a confidence in their decisions; and (3) they require error-prone tuning to a particular notion of ground truth. We present SuperPart, a scalable, supervised learning approach to graph partitioning. We demonstrate that SuperPart yields competitive results on the problem of detecting equivalent records without manual selection of algorithms or an exhaustive search over hyperparameters. Also, we show the quality of SuperPart's confidence measures by reporting Area Under the Precision-Recall Curve metrics that exceed a baseline measure by 11%. Finally, to bolster additional research in this domain, we release three new datasets derived from real-world Amazon product data along with ground-truth partitionings. 2
Style-Aware Normalized Loss for Improving Arbitrary Style Transfer Cheng, JX; Jaiswal, A; Wu, Y; Natarajan, P; Natarajan, P Neural Style Transfer (NST) has quickly evolved from single-style to infinite-style models, also known as Arbitrary Style Transfer (AST). Although appealing results have been widely reported in literature, our empirical studies on four well-known AST approaches (GoogleMagenta [14], AdaIN [19], LinearTransfer [29], and SANet [37]) show that more than 50% of the time, AST stylized images are not acceptable to human users, typically due to under- or over-stylization. We systematically study the cause of this imbalanced style transferability (IST) and propose a simple yet effective solution to mitigate this issue. Our studies show that the IST issue is related to the conventional AST style loss, and reveal that the root cause is the equal weightage of training samples irrespective of the properties of their corresponding style images, which biases the model towards certain styles. Through investigation of the theoretical bounds of the AST style loss, we propose a new loss that largely overcomes IST. Theoretical analysis and experimental results validate the effectiveness of our loss, with over 80% relative improvement in style deception rate and 98% relatively higher preference in human evaluation. 2
A Real-Time Constraint Management Approach Through Constraint Similarity and Pattern Recognition in Power System Yu, FL; Tu, F In the real-time energy market, Regional Transmission Organization (RTO) dispatchers meet the energy demand while respecting transmission security constraints using the least-cost security constrained economic dispatch program, called the Unit Dispatch System (UDS). If a transmission violation detected by the Energy Management System (EMS) requires redispatch, the system operator transfers the transmission constraint information to UDS for resolution through a manual process. Dispatchers need to make an educated guess on constraint trends to support their manual redispatch process. However, as the number of constraints increases during peak hours, this process becomes more and more complicated and the system becomes less manageable. This paper addresses this operational difficulty through a similarity model that reveals the constraint relations and a heuristic pattern recognition method that categorizes constraints into controllable constraint groups (CCG). System operators then only need to manage a dominant constraint in each CCG using controllable units, instead of managing them all. This makes the system much more manageable during peak hours and therefore increases the effectiveness and efficiency of system operations. 2
LORE: A Large-Scale Offer Recommendation Engine with Eligibility and Capacity Constraints Makhijani, R; Chakrabarti, S; Struble, D; Liu, Y Businesses, such as Amazon, department store chains, home furnishing store chains, Uber, and Lyft, frequently offer deals, product discounts and incentives to drive sales, increase new product acceptance and engage with users. In order to appeal to diverse user groups, these businesses typically design more than one promotion offer but market different ones to different users. For instance, Uber offers a percentage discount in the rides to some users and a low fixed price to others. In this paper, we propose solutions to optimally recommend promotions and items to maximize user conversion constrained by user eligibility and item or offer capacity (limited quantity of items or offers) simultaneously. We achieve this through an offer recommendation model based on Min-Cost Flow network optimization, which enables us to satisfy the constraints within the optimization itself and solve it in polynomial time. We present two approaches that can be used in various settings: single period solution and sequential time period offering. We evaluate these approaches against competing methods using counterfactual evaluation in offline mode. We also discuss three practical aspects that may affect the online performance of constrained optimization: capacity determination, traffic arrival pattern and clustering for large scale setting. 2
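To make the constrained-assignment idea in the LORE entry above concrete, here is a minimal, hypothetical sketch (not the paper's system): a tiny user-offer assignment with eligibility and per-offer capacity constraints posed as a min-cost flow with networkx. All user names, offers, capacities, and conversion scores below are invented for illustration.

```python
# Illustrative sketch only: eligibility- and capacity-constrained offer assignment
# as a min-cost flow. Users, offers, capacities, and scores are made-up examples.
import networkx as nx

users = ["u1", "u2", "u3"]
offers = {"10pct_off": 1, "free_ship": 2}          # offer -> capacity (max redemptions)
# eligible (user, offer) pairs with a predicted conversion probability
score = {
    ("u1", "10pct_off"): 0.30, ("u1", "free_ship"): 0.10,
    ("u2", "10pct_off"): 0.25,
    ("u3", "free_ship"): 0.20,
}

G = nx.DiGraph()
for u in users:
    G.add_edge("SRC", u, capacity=1, weight=0)      # each user receives at most one offer
for (u, o), p in score.items():                     # only eligible pairs get an edge
    G.add_edge(u, o, capacity=1, weight=-int(1000 * p))  # negative cost rewards conversion
for o, cap in offers.items():
    G.add_edge(o, "SNK", capacity=cap, weight=0)

flow = nx.max_flow_min_cost(G, "SRC", "SNK")
assignment = {u: o for u in users for o, f in flow[u].items() if f > 0}
print(assignment)   # e.g. {'u1': 'free_ship', 'u2': '10pct_off', 'u3': 'free_ship'}
```

Maximizing flow first serves as many eligible users as the capacities allow; minimizing the (negative) cost among maximum flows then favors the assignments with the highest predicted conversion.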
Core self-evaluations, job complexity, and net worth: An examination of mediating and moderating factors Rosopa, PJ; McIntyre, AL; Fairbanks, IN; D'Souza, KB Core self-evaluations (CSE) is a higher-order latent variable composed of four lower-order variables-self-esteem, self-efficacy, emotional stability, and locus of control. Relatively little research has examined CSE as a distal predictor of financial success and the mechanisms that lead to financial success. Utilizing data from a longitudinal sample (N = 3364) collected over several decades, it was found that CSE had a positive effect on net worth, and that CSE had an indirect effect on net worth through job complexity. Additionally, job complexity and cognitive ability interacted in predicting net worth. Specifically, the positive association between job complexity and net worth became stronger as cognitive ability increased. Implications for the literature on the complex relationship between CSE and major life outcome variables and directions for future research are discussed. 2
Multi-stage stochastic programming models for provisioning cloud computing resources Bulbul, K; Noyan, N; Erol, H We focus on the resource provisioning problem of a cloud consumer from an Infrastructure-as-a-Service type of cloud. The cloud provider offers two deployment options, which can be mixed and matched as appropriate. Cloud instances may be reserved for a fixed time period in advance at a smaller usage cost per hour but require a full commitment and payment for the entire contract duration. In contrast, on-demand instances reflect a pay-as-you-go policy at a premium. The trade-off between these two options is rooted in the inherent uncertainty in demand and price and makes it attractive to complement a base reserved capacity with on-demand capacity to hedge against the spikes in demand. This paper provides several novel multi-stage stochastic programming formulations to enable a cloud consumer to handle the cloud resource provisioning problem at a tactical level. We first formulate the cloud resource provisioning problem as a risk-neutral multi-stage stochastic program, which serves as the base model for further modeling variants. In our second set of models, we also incorporate a certain concept of system reliability. In particular, chance constraints integrated into the base formulation require a minimum service level met from reserved capacity, provide more visibility into the future available capacity, and smooth out expensive on-demand usage by hedging against possible demand fluctuations. An extensive computational study demonstrates the value of the proposed models by discussing computational performance, gleaning practical managerial insights from the analysis of the solutions of the proposed models, and quantifying the value of the stochastic solutions. (C) 2020 Elsevier B.V. All rights reserved. 2
Rate-Splitting Multiple Access for Overloaded Cellular Internet of Things Mao, YJ; Piovano, E; Clerckx, B In the near future, it is envisioned that cellular networks will have to cope with extensive internet of things (IoT) devices. Therefore, a required feature of cellular IoT will be the capability to serve simultaneously a large number of devices with heterogeneous demands and qualities of channel state information at the transmitter (CSIT). In this paper, we focus on an overloaded multiple-input single-output (MISO) broadcast channel (BC) with two groups of CSIT qualities, namely, one group of users (representative of high-end devices) for which the transmitter has partial knowledge of the CSI, the other group of users (representative of IoT devices) for which the transmitter only has knowledge of the statistical CSI (i.e., the distribution information of the user channels). We introduce rate-splitting multiple access (RSMA), a new multiple access based on multi-antenna rate-splitting (RS) technique for cellular IoT. Two strategies are proposed, namely, time partitioning-RSMA (TP-RSMA) and power partitioning-RSMA (PP-RSMA). The former independently serves the two groups of users over orthogonal time slots while the latter jointly serves the two groups of users within the same time slot in a non-orthogonal manner. We first show at high signal-to-noise ratio (SNR) that PP-RSMA achieves the optimal degrees-of-freedom (DoF) in an overloaded MISO BC with heterogeneous CSIT qualities. We then show at finite SNR that PP-RSMA achieves explicit sum rate gain over TP-RSMA and all baseline schemes by marrying the benefits of PP and RSMA. Furthermore, PP-RSMA is robust to CSIT inaccuracy and flexible to cope with quality of service (QoS) rate constraints of all users. The DoF and rate analysis helps us in drawing the conclusion that PP-RSMA is a powerful framework for cellular IoT with a large number of devices. 2
Crowdsourced privacy-preserved feature tagging of short home videos for machine learning ASD detection Washington, P; Tariq, Q; Leblanc, E; Chrisman, B; Dunlap, K; Kline, A; Kalantarian, H; Penev, Y; Paskov, K; Voss, C; Stockham, N; Varma, M; Husic, A; Kent, J; Haber, N; Winograd, T; Wall, DP Standard medical diagnosis of mental health conditions requires licensed experts who are increasingly outnumbered by those at risk, limiting reach. We test the hypothesis that a trustworthy crowd of non-experts can efficiently annotate behavioral features needed for accurate machine learning detection of the common childhood developmental disorder Autism Spectrum Disorder (ASD) for children under 8 years old. We implement a novel process for identifying and certifying a trustworthy distributed workforce for video feature extraction, selecting a workforce of 102 workers from a pool of 1,107. Two previously validated ASD logistic regression classifiers, evaluated against parent-reported diagnoses, were used to assess the accuracy of the trusted crowd's ratings of unstructured home videos. A representative balanced sample (N=50 videos) of videos were evaluated with and without face box and pitch shift privacy alterations, with AUROC and AUPRC scores>0.98. With both privacy-preserving modifications, sensitivity is preserved (96.0%) while maintaining specificity (80.0%) and accuracy (88.0%) at levels comparable to prior classification methods without alterations. We find that machine learning classification from features extracted by a certified nonexpert crowd achieves high performance for ASD detection from natural home videos of the child at risk and maintains high sensitivity when privacy-preserving mechanisms are applied. These results suggest that privacy-safeguarded crowdsourced analysis of short home videos can help enable rapid and mobile machine-learning detection of developmental delays in children. 2
Sketch and Scale Geo-distributed tSNE and UMAP Wei, V; Ivkin, N; Braverman, V; Szalay, AS Running machine learning analytics over geographically distributed datasets is a rapidly arising problem in the world of data management policies ensuring privacy and data security. Visualizing high dimensional data using tools such as t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP) became a common practice for data scientists. Both tools scale poorly in time and memory. While recent optimizations showed successful handling of 10,000 data points, scaling beyond million points is still challenging. We introduce a novel framework: Sketch and Scale (SnS). It leverages a Count Sketch data structure to compress the data on the edge nodes, aggregates the reduced size sketches on the master node, and runs vanilla tSNE or UMAP on the summary, representing the densest areas, extracted from the aggregated sketch. We show this technique to be fully parallel, scale linearly in time, logarithmically in memory and communication, making it possible to analyze datasets with many millions, potentially billions of data points, spread across several data centers around the globe. We demonstrate the power of our method on two mid-size datasets: cancer data with 52 million 35-band pixels from multiplex images of tumor biopsies; and astrophysics data of 100 million stars with multi-color photometry from the Sloan Digital Sky Survey (SDSS). 2
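As a rough illustration of the summarize-then-embed idea in the entry above (a simplified sketch, not the authors' SnS implementation): each edge node quantizes its points to grid cells and counts them in a small Count-Min-style sketch; because such sketches are linear, the master merges them by addition and embeds only the densest cells. The grid step, sketch width, hash scheme, and synthetic data are arbitrary choices for this example.

```python
# Simplified sketch-and-summarize illustration (not the authors' code).
import numpy as np

D, W = 4, 1024                      # hash rows and counters per row
PRIMES = [2654435761, 2246822519, 3266489917, 668265263]

def cell_id(x, step=0.5):
    """Quantize a 2-D point to a coarse grid cell id."""
    return tuple(int(v) for v in np.floor(np.asarray(x) / step))

def update(sketch, seen, x):
    cid = cell_id(x)
    seen.add(cid)
    for d in range(D):
        sketch[d, (hash(cid) * PRIMES[d]) % W] += 1

def estimate(sketch, cid):
    return min(sketch[d, (hash(cid) * PRIMES[d]) % W] for d in range(D))

rng = np.random.default_rng(0)
sk_a, seen_a = np.zeros((D, W), dtype=int), set()   # edge node A
sk_b, seen_b = np.zeros((D, W), dtype=int), set()   # edge node B
for x in rng.normal(0.0, 1.0, size=(5000, 2)):
    update(sk_a, seen_a, x)
for x in rng.normal(3.0, 1.0, size=(5000, 2)):
    update(sk_b, seen_b, x)

merged = sk_a + sk_b                # sketches are linear, so merging is element-wise addition
candidates = seen_a | seen_b        # simplification: a real system tracks only heavy cells
dense = sorted(candidates, key=lambda c: estimate(merged, c), reverse=True)[:100]
centroids = np.array([np.array(c) * 0.5 + 0.25 for c in dense])
# vanilla tSNE/UMAP would now run on `centroids` (optionally weighted by their counts),
# e.g. sklearn.manifold.TSNE(perplexity=30).fit_transform(centroids)
```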
HIGH-FREQUENCY ADVERSARIAL DEFENSE FOR SPEECH AND AUDIO Olivier, R; Raj, B; Shah, M Recent work suggests that adversarial examples are enabled by high-frequency components in the dataset. In the speech domain where spectrograms are used extensively, masking those components seems like a sound direction for defenses against attacks. We explore a smoothing approach based on additive noise masking in priority high frequencies. We show that this approach is much more robust than the naive noise filtering approach, and a promising research direction. We successfully apply our defense on a Librispeech speaker identification task, and on the UrbanSound8K audio classification dataset. 2
Approximate Capacity of Fast Fading Interference Channels With no Instantaneous CSIT Sebastian, J; Karakus, C; Diggavi, S We develop a characterization of fading models, which assigns a number called logarithmic Jensen's gap to a given fading model. We show that as a consequence of a finite logarithmic Jensen's gap, an approximate capacity region can be obtained for fast fading interference channels (FF-ICs) for several scenarios. We illustrate three instances where a constant capacity gap can be obtained as a function of the logarithmic Jensen's gap. First, for an FF-IC with neither feedback nor instantaneous channel state information at transmitter (CSIT), if the fading distribution has finite logarithmic Jensen's gap, we show that a rate-splitting scheme based on the average interference-to-noise ratio can achieve its approximate capacity. Second, we show that a similar scheme can achieve the approximate capacity of FF-IC with feedback and delayed CSIT, if the fading distribution has finite logarithmic Jensen's gap. Third, when this condition holds, we show that point-to-point codes can achieve approximate capacity for a class of FF-ICs with feedback. We prove that the logarithmic Jensen's gap is finite for common fading models, including Rayleigh and Nakagami fading, thereby obtaining the approximate capacity region of FF-IC with these fading models. 2
ANT COLONY OPTIMIZATION ALGORITHMS WITH DIVERSIFIED SEARCH IN THE PROBLEM OF OPTIMIZATION OF AIRTRAVEL ITINERARY Hulianytskyi, L; Pavlenko, A The formulated problem is to find an optimal traveler's route in airline networks, taking into account the cost of the constructed route and user conditions with time-dependent connection costs. Ant colony system algorithms are proposed to solve the time-dependent problem represented by an extended flight graph. Unlike the available ant algorithm implementations, the developed algorithms take into account the properties of dynamic networks (time-dependent availability and connection cost) and user conditions. The improved approach to the diversification of search in ant colony system algorithms in terms of time dependence for a dense graph increased the quality of the constructed routes from different regions. The proposed algorithms are analyzed for efficiency based on the results of a computational experiment with real data. 2
A Solver-Aided Language for Test Input Generation Ringer, T; Grossman, D; Schwartz-Narbonne, D; Tasiran, S Developing a small but useful set of inputs for tests is challenging. We show that a domain-specific language backed by a constraint solver can help the programmer with this process. The solver can generate a set of test inputs and guarantee that each input is different from other inputs in a way that is useful for testing. This paper presents Iorek: a tool that empowers the programmer with the ability to express to any SMT solver what it means for inputs to be different. The core of Iorek is a rich language for constraining the set of inputs, which includes a novel bounded enumeration mechanism that makes it easy to define and encode a flexible notion of difference over a recursive structure. We demonstrate the flexibility of this mechanism for generating strings. We use Iorek to test real services and find that it is effective at finding bugs. We also build Iorek into a random testing tool and show that it increases coverage. 2
AUDIOVISUAL HIGHLIGHT DETECTION IN VIDEOS Mundnich, K; Fenster, A; Khare, A; Sundaram, S In this paper, we test the hypothesis that interesting events in unstructured videos are inherently audiovisual. We combine deep image representations for object recognition and scene understanding with representations from an audiovisual affect recognition model. To this set, we include content agnostic audio-visual synchrony representations and mel-frequency cepstral coefficients to capture other intrinsic properties of audio. These features are used in a modular supervised model. We present results from two experiments: efficacy study of single features on the task, and an ablation study where we leave one feature out at a time. For the video summarization task, our results indicate that the visual features carry most information, and including audiovisual features improves over visual-only information. To better study the task of highlight detection, we run a pilot experiment with highlights annotations for a small subset of video clips and fine-tune our best model on it. Results indicate that we can transfer knowledge from the video summarization task to a model trained specifically for the task of highlight detection. 2
Can malpractice pressure compel a physician to relocate? Ellyson, AM; Robertson, JC Existing literature considers the effect of changes in malpractice pressure by focusing on physician supply, and concludes that changes in tort laws have limited but some impact on physician movement. Using a panel dataset which follows a random sample of 28,227 family medicine physicians in the United States from 1992-2007, this paper evaluates whether changes in malpractice premiums impact a physician's decision to relocate their practice. Our findings suggest that even large premium growth has no impact on the physician relocation decision. Generally, these results suggest that family medicine physicians do not use relocation as a strategy to avoid malpractice pressure. However, some physicians are more inclined to relocate than others. Results indicate that group and hospital practice physicians are more likely to move to another state when premiums are high compared to solo and partnership practice physicians. (C) 2018 Elsevier Inc. All rights reserved. 2
Epistemic fragmentation poses a threat to the governance of online targeting Milano, S; Mittelstadt, B; Wachter, S; Russell, C Online targeting isolates individual consumers, causing what we call epistemic fragmentation. This phenomenon amplifies the harms of advertising and inflicts structural damage to the public forum. The two natural strategies to tackle the problem of regulating online targeted advertising, increasing consumer awareness and extending proactive monitoring, fail because even sophisticated individual consumers are vulnerable in isolation, and the contextual knowledge needed for effective proactive monitoring remains largely inaccessible to platforms and external regulators. The limitations of both consumer awareness and of proactive monitoring strategies can be attributed to their failure to address epistemic fragmentation. We call attention to a third possibility that we call a civic model of governance for online targeted advertising, which overcomes this problem, and describe four possible pathways to implement this model. Online targeted advertising fuelled by machine learning can lead to the isolation of individual consumers. This problem of 'epistemic fragmentation' cannot be tackled with current regulation strategies and a new, civic model of governance for advertising is needed. 2
The Multi-Temporal Urban Development SpaceNet Dataset Van Etten, A; Hogan, D; Manso, JM; Shermeyer, J; Weir, N; Lewis, R Satellite imagery analytics have numerous human development and disaster response applications, particularly when time series methods are involved. For example, quantifying population statistics is fundamental to 67 of the 231 United Nations Sustainable Development Goals Indicators, but the World Bank estimates that over 100 countries currently lack effective Civil Registration systems. To help address this deficit and develop novel computer vision methods for time series data, we present the Multi-Temporal Urban Development SpaceNet (MUDS, also known as SpaceNet 7) dataset. This open source dataset consists of medium resolution (4.0m) satellite imagery mosaics, which includes approximately 24 images (one per month) covering > 100 unique geographies, and comprises > 40,000 km² of imagery and exhaustive polygon labels of building footprints therein, totaling over 11M individual annotations. Each building is assigned a unique identifier (i.e. address), which permits tracking of individual objects over time. Label fidelity exceeds image resolution; this omniscient labeling is a unique feature of the dataset, and enables surprisingly precise algorithmic models to be crafted. We demonstrate methods to track building footprint construction (or demolition) over time, thereby directly assessing urbanization. Performance is measured with the newly developed SpaceNet Change and Object Tracking (SCOT) metric, which quantifies both object tracking as well as change detection. We demonstrate that despite the moderate resolution of the data, we are able to track individual building identifiers over time. 2
Additionality and Forest Conservation Regulation for Residential Development Newburn, DA; Ferris, JS We analyze the potential effects that a unique forest conservation regulation has on residential development, and assess the additionality in forest cover due to this regulation. We combine panel data on forest cover change from satellite imagery and parcel-level modeling on residential development, including residential subdivisions occurring before and after the regulation is adopted. Our results suggest that after adoption, there was a 21% increase in forest cover within subdivisions relative to the amount without the regulation. The heterogeneous effects of this regulation suggest that on average, forest cover increased for parcels with lower levels of existing forest cover. However, parcels with the highest levels of forest cover continue to have significant decreases in forest cover, despite the regulation, thereby resulting in fragmentation in regions with the most intact forest cover. 2
Using Lexical Properties of Handwritten Equations to Estimate the Correctness of Students' Solutions to Engineering Problems Stahovich, TF; Lin, H; Gyllen, J We present a technique that examines handwritten equations from a student's solution to an engineering problem and from this estimates the correctness of the work. More specifically, we demonstrate that lexical properties of the equations correlate with the grade a human grader would assign. We characterize these properties with a set of features that include the number of occurrences of various classes of symbols and binary and tripartite sequences of them. Support vector machine (SVM) regression models trained with these features achieved a correlation of r = .433 (p< .001) on a combined set of six exam problems. Prior work suggests that the number of long pauses in the writing that occur as a student solves a problem correlates with correctness. We found that combining this pause feature with our lexical features produced more accurate predictions than using either type of feature alone. SVM regression models trained using an optimized subset of three lexical features and the pause feature achieved an average correlation with grade across the six problems of r = .503 (p< .001). These techniques are an important step toward creating systems that can automatically assess handwritten coursework. 2
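A minimal sketch of the kind of model described in the entry above (not the authors' pipeline): SVM regression over symbol-count features plus a long-pause count, evaluated by the correlation between predicted and actual grades on held-out data. The feature names and the tiny synthetic dataset are invented for illustration.

```python
# Illustrative sketch only: grade prediction from count features with SVM regression.
import numpy as np
from scipy.stats import pearsonr
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(7)
n = 120
# hypothetical features: counts of operators, digits, variables, and long pauses
X = rng.poisson(lam=[8, 15, 6, 4], size=(n, 4)).astype(float)
grade = 0.4 * X[:, 0] + 0.2 * X[:, 2] - 0.5 * X[:, 3] + rng.normal(0, 2, n)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(X[:80], grade[:80])                 # train on the first 80 solutions
pred = model.predict(X[80:])                  # predict grades for the held-out 40
r, p = pearsonr(pred, grade[80:])
print(f"held-out correlation r = {r:.3f} (p = {p:.3g})")
```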
Urban Sensing for Anomalous Event Detection: Distinguishing Between Legitimate Traffic Changes and Abnormal Traffic Variability Zameni, M; He, MY; Moshtaghi, M; Ghafoori, Z; Leckie, C; Bezdek, JC; Ramamohanarao, K Sensors deployed in different parts of a city continuously record traffic data, such as vehicle flows and pedestrian counts. We define an unexpected change in the traffic counts as an anomalous local event. Reliable discovery of such events is very important in real-world applications such as real-time crash detection or traffic congestion detection. One of the main challenges to detecting anomalous local events is to distinguish them from legitimate global traffic changes, which happen due to seasonal effects, weather and holidays. Existing anomaly detection techniques often raise many false alarms for these legitimate traffic changes, making such techniques less reliable. To address this issue, we introduce an unsupervised anomaly detection system that represents relationships between different locations in a city. Our method uses training data to estimate the traffic count at each sensor location given the traffic counts at the other locations. The estimation error is then used to calculate the anomaly score at any given time and location in the network. We test our method on two real traffic datasets collected in the city of Melbourne, Australia, for detecting anomalous local events. Empirical results show the greater robustness of our method to legitimate global changes in traffic count than four benchmark anomaly detection methods examined in this paper. Data related to this paper are available at: https://vicroadsopendata-vicroadsmaps.opendata.arcgis.com/datasets/147696bb47544a209e0a5e79e165d1b0_0. 2
DISENTANGLEMENT FOR AUDIO-VISUAL EMOTION RECOGNITION USING MULTITASK SETUP Peri, R; Parthasarathy, S; Bradshaw, C; Sundaram, S Deep learning models trained on audio-visual data have been successfully used to achieve state-of-the-art performance for emotion recognition. In particular, models trained with multitask learning have shown additional performance improvements. However, such multitask models entangle information between the tasks, encoding the mutual dependencies present in label distributions in the real world data used for training. This work explores the disentanglement of multimodal signal representations for the primary task of emotion recognition and a secondary person identification task. In particular, we developed a multitask framework to extract low-dimensional embeddings that aim to capture emotion specific information, while containing minimal information related to person identity. We evaluate three different techniques for disentanglement and report results of up to 13% disentanglement while maintaining emotion recognition performance. 2
Boundary Preserving Dense Local Regions Kim, J; Grauman, K We propose a dense local region detector to extract features suitable for image matching and object recognition tasks. Whereas traditional local interest operators rely on repeatable structures that often cross object boundaries (e.g., corners, scale-space blobs), our sampling strategy is driven by segmentation, and thus preserves object boundaries and shape. At the same time, whereas existing region-based representations are sensitive to segmentation parameters and object deformations, our novel approach to robustly sample dense sites and determine their connectivity offers better repeatability. In extensive experiments, we find that the proposed region detector provides significantly better repeatability and localization accuracy for object matching compared to an array of existing feature detectors. In addition, we show our regions lead to excellent results on two benchmark tasks that require good feature matching: weakly supervised foreground discovery and nearest neighbor-based object recognition. 2
Budget-Management Strategies in Repeated Auctions Balseiro, S; Kim, A; Mahdian, M; Mirrokni, V In online advertising, advertisers purchase ad placements by participating in a long sequence of repeated auctions. One of the most important features that advertising platforms often provide and advertisers often use is budget management, which allows advertisers to control their cumulative expenditures. Advertisers typically declare the maximum daily amount they are willing to pay, and the platform adjusts allocations and payments to guarantee that cumulative expenditures do not exceed budgets. There are multiple ways to achieve this goal, and each one, when applied to all budget-constrained advertisers simultaneously, drives the system toward a different equilibrium. Our goal is to compare the system equilibria of a range of budget-management strategies. In particular, we consider six different budget-management strategies, including probabilistic throttling, thresholding, bid shading, reserve pricing, and two versions of multiplicative boosting. We show that these methods admit a system equilibrium, study their incentive properties, prove dominance relations among them in a simplified setting, and confirm our theoretical findings using real ad auction data from a sponsored search. Our study sheds light on the impact of budget-management strategies on the trade-off between the seller's profit and buyers' utility and may be of practical relevance for advertising platforms. 2
Action Modifiers: Learning from Adverbs in Instructional Videos Doughty, H; Laptev, I; Mayol-Cuevas, W; Damen, D We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations. Key to our method is the fact that the visual representation of the adverb is highly dependant on the action to which it applies, although the same adverb will modify multiple actions in a similar way. For instance, while 'spread quickly' and 'mix quickly' will look dissimilar, we can learn a common representation that allows us to recognize both, among other actions. We formulate this as an embedding problem, and use scaled dot-product attention to learn from weakly-supervised video narrations. We jointly learn adverbs as invertible transformations operating on the embedding space, so as to add or remove the effect of the adverb. As there is no prior work on weakly supervised learning of adverbs, we gather paired action-adverb annotations from a subset of the HowTo 100M dataset for 6 adverbs: quickly/slowly, finely/coarsely, and partially/completely. Our method outperforms all baselines for video-to-adverb retrieval with a performance of 0.719 mAP. We also demonstrate our model's ability to attend to the relevant video parts in order to determine the adverb for a given action. 2
On Asymptotic Distributions and Confidence Intervals for LIFT Measures in Data Mining Jiang, WX; Zhao, Y A LIFT measure, such as the response rate, lift, or the percentage of captured response, is a fundamental measure of effectiveness for a scoring rule obtained from data mining, which is estimated from a set of validation data. In this article, we study how to construct confidence intervals of the LIFT measures. We point out the subtlety of this task and explain how simple binomial confidence intervals can have incorrect coverage probabilities, due to omitting variation from the sample percentile of the scoring rule. We derive the asymptotic distribution using some advanced empirical process theory and the functional delta method in the Appendix. The additional variation is shown to be related to a conditional mean response, which can be estimated by a local averaging of the responses over the scores from the validation data. Alternatively, a subsampling method is shown to provide a valid confidence interval, without needing to estimate the conditional mean response. Numerical experiments are conducted to compare these different methods regarding the coverage probabilities and the lengths of the resulting confidence intervals. 2
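The subsampling approach mentioned in the entry above can be illustrated with a short, self-contained sketch (synthetic scores and responses, not the paper's data or code): compute the top-decile lift on a validation set and build a confidence interval from rescaled subsample deviations.

```python
# Illustrative sketch only: subsampling confidence interval for a top-decile lift.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
score = rng.uniform(size=n)
label = rng.binomial(1, 0.05 + 0.25 * score)          # response rate rises with score

def top_decile_lift(s, y):
    cut = np.quantile(s, 0.9)                         # sample 90th percentile of the scores
    return y[s >= cut].mean() / y.mean()

point = top_decile_lift(score, label)

m, B = 2_000, 500                                     # subsample size and replications
stats = []
for _ in range(B):
    idx = rng.choice(n, size=m, replace=False)
    stats.append(top_decile_lift(score[idx], label[idx]))

# subsampling CI: rescale subsample deviations by sqrt(m/n) (root-n rate assumed)
dev = np.sqrt(m / n) * (np.array(stats) - point)
lo, hi = point - np.quantile(dev, 0.975), point - np.quantile(dev, 0.025)
print(f"top-decile lift = {point:.2f}, 95% CI ~ ({lo:.2f}, {hi:.2f})")
```

Note that each subsample recomputes the score percentile, so the interval also reflects the variation from the sample percentile of the scoring rule discussed in the abstract.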
TOP-DOWN ATTENTION IN END-TO-END SPOKEN LANGUAGE UNDERSTANDING Chen, YX; Lu, WY; Mottini, A; Li, LE; Droppo, J; Du, Z; Zeng, B Spoken language understanding (SLU) is the task of inferring the semantics of spoken utterances. Traditionally, this has been achieved with a cascading combination of Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) modules that are optimized separately, which can lead to a suboptimal overall performance. More recently, End-to-End SLU (E2E SLU) was proposed to perform SLU directly from speech through a joint optimization of the modules, addressing some of the traditional SLU shortcomings. A key challenge of this approach is how to best integrate the feature learning of the ASR and NLU sub-tasks to maximize their performance. While it is known that in general, ASR models focus on low-level features, and NLU models need higher-level contextual information, ASR models can nonetheless also leverage top-down syntactic and semantic information to improve their recognition. Based on this insight, we propose Top-Down SLU (TD-SLU), a new transformer-based E2E SLU model that uses top-down attention and an attention gate to fuse high-level NLU features with low-level ASR features, which leads to a better optimization of both tasks. We have validated our model using the public FluentSpeech set, and a large custom dataset. Results show TD-SLU is able to outperform selected baselines both in terms of ASR and NLU quality metrics, and suggest that the added syntactic and semantic high-level information can improve the model's performance. 2
On Constant-Time QC-MDPC Decoders with Negligible Failure Rate Drucker, N; Gueron, S; Kostic, D The QC-MDPC code-based KEM Bit Flipping Key Encapsulation (BIKE) is one of the Round-2 candidates of the NIST PQC standardization project. It has a variant that is proved to be IND-CCA secure. The proof models the KEM with some black-box (ideal) primitives. Specifically, the decapsulation invokes an ideal primitive called decoder, required to deliver its output with a negligible Decoding Failure Rate (DFR). The concrete instantiation of BIKE substitutes this ideal primitive with a new decoding algorithm called Backflip, that is shown to have the required negligible DFR. However, it runs in a variable number of steps and this number depends on the input and on the key. This paper proposes a decoder that has a negligible DFR and also runs in a fixed (and small) number of steps. We propose that the instantiation of BIKE uses this decoder with our recommended parameters. We study the decoder's DFR as a function of the scheme's parameters to obtain a favorable balance between the communication bandwidth and the number of steps that the decoder runs. In addition, we build a constant-time software implementation of the proposed instantiation, and show that its performance characteristics are quite close to the IND-CPA variant. Finally, we discuss a subtle gap that needs to be resolved for every IND-CCA secure KEM (BIKE included) where the decapsulation has nonzero failure probability: the difference between average DFR and worst-case failure probability per key and ciphertext. 2
Monocular Depth Estimation via Deep Structured Models with Ordinal Constraints Ron, D; Duan, K; Ma, CY; Xu, N; Wang, SL; Hanumante, S; Sagar, D User interaction provides useful information for solving challenging computer vision problems in practice. In this paper, we show that a very limited number of user clicks could greatly boost monocular depth estimation performance and overcome monocular ambiguities. We formulate this task as a deep structured model, in which the structured pixel-wise depth estimation has ordinal constraints introduced by user clicks. We show that the inference of the proposed model could be efficiently solved through a feed-forward network. We demonstrate the effectiveness of the proposed model on NYU Depth V2 and Stanford 2D-3D datasets. On both datasets, we achieve state-of-the-art performance when encoding user interaction into our deep models. 2
The Future of Artificially Intelligent Assistants Muthukrishnan, M; Tomkins, A; Heck, L; Agarwal, D; Geramifard, A Artificial Intelligence has been present in literature at least since the ancient Greeks. Depictions present a wide range of perspectives of AI ranging from malefic overlords to depressive androids. Perhaps the most common recurring theme is the AI Assistant: C3PO from Star Wars; The Jetsons' Rosie the Robot; the benign hyper-efficient Minds of Iain M. Banks's Culture novels; the eerie HAL 9000 of Arthur C. Clarke's 2001: A Space Odyssey. Today, artificially intelligent assistants are actual products in the marketplace, based on startling recent progress in technologies like speaker-independent speech recognition. These products are in their infancy, but are improving rapidly. In this panel, we will address the product and technology landscape, and will ask a series of experts in the field plus the members of the audience to take a stance on what the future of artificially intelligent assistants will look like. 2
On the consistency of cognitive load Deck, C; Jahedi, S; Sheremeta, R There are many ways to induce cognitive load. In this paper, we manipulate cognitive capacity using four common techniques: a number memorization task, a visual pattern task, an auditory recall task, and time pressure. Under each load manipulation (as well as under `no load'), every participant completes a series of math problems, lottery tasks, logic puzzles, and allocation decisions. We find similar behavioral responses across all techniques: poorer performance on the math problems and logic puzzles, more risk aversion in the lottery tasks, and no systematic impact on allocation decisions. Using within-subject variation, we show that individuals whose math performance is most impacted for a given load manipulation (number memorization), are the same individuals whose performance is most impacted by other load manipulation, and in the other tasks. We also find that participants who scored above the median in a cognitive reflection test (CRT), and are thus able to resist the first response that comes to mind, are greatly impacted when placed under cognitive load; those scoring below the median in the CRT are not impacted much. (c) 2021 Elsevier B.V. All rights reserved. 2
MAXIMUM LIKELIHOOD FEATURES FOR GENERATIVE IMAGE MODELS Chang, LB; Borenstein, E; Zhang, W; Geman, S Most approaches to computer vision can be thought of as lying somewhere on a continuum between generative and discriminative. Although each approach has had its successes, recent advances have favored discriminative methods, most notably the convolutional neural network. Still, there is some doubt about whether this approach will scale to a human-level performance given the numbers of samples that are needed to train state-of-the-art systems. Here, we focus on the generative or Bayesian approach, which is more model based and, in theory, more efficient. Challenges include latent-variable modeling, computationally efficient inference, and data modeling. We restrict ourselves to the problem of data modeling, which is possibly the most daunting, and specifically to the generative modeling of image patches. We formulate a new approach, which can be broadly characterized as an application of conditional modeling, designed to sidestep the high-dimensionality and complexity of image data. A series of experiments, learning appearance models for faces and parts of faces, illustrates the flexibility and effectiveness of the approach. 2
Addressing Bias and Fairness in Search Systems Gao, RY; Shah, C Search systems have unprecedented influence on how and what information people access. These gateways to information on the one hand create an easy and universal access to online information, and on the other hand create biases that have shown to cause knowledge disparity and ill-decisions for information seekers. Most of the algorithms for indexing, retrieval, and ranking are heavily driven by the underlying data that itself is biased. In addition, orderings of the search results create position bias and exposure bias due to their considerable focus on relevance and user satisfaction. These and other forms of biases that are implicitly and sometimes explicitly woven in search systems are becoming increasing threats to information seeking and sense-making processes. In this tutorial, we will introduce the issues of biases in data, in algorithms, and overall in search processes and show how we could think about and create systems that are fairer, with increasing diversity and transparency. Specifically, the tutorial will present several fundamental concepts such as relevance, novelty, diversity, bias, and fairness using socio-technical terminologies taken from various communities, and dive deeper into metrics and frameworks that allow us to understand, extract, and materialize them. The tutorial will cover some of the most recent works in this area and show how this interdisciplinary research has opened up new challenges and opportunities for communities such as SIGIR. 2
Sustained credit card borrowing Grodzicki, D; Koulayev, S Using a large panel of credit card accounts, we examine the dynamics of credit card borrowing and repayment in the United States and what these imply for the expected costs of credit card debt to consumers. Our analysis reveals that: (a) credit cards are predominantly used to borrow, (b) card debt is sustained for long periods and balances frequently rise before being repaid, and (c) this debt is potentially more costly than anticipated. Specifically, we document that 82% of outstanding balances are debt and that 70% of this debt accrues to those borrowing continuously for a year or more. The expected annualized cost of an episode of continuous borrowing is 28% of its initial balance, or 13 percentage points higher than the average annual percentage rate. Moreover, credit scores decline during episodes, further raising the expected cost of borrowing on a card. 2
Artificial intelligence and medical imaging: Definition, state of the art and perspectives Brunelle, F; Brunelle, P In medicine, the emergence of artificial intelligence, which originated in the 1950s, is the outcome of three radical disruptive innovations: (1) the digitalization of medical imaging techniques, which allows their parametric use, (2) the development of algorithms that allow natural language processing of medical records, and (3) deep learning algorithms that can process uncategorized data (e.g. image classification). These systems can already automatically detect lesions and open the way to the detection of lung, prostate and breast cancers. Their accuracy is superior to that of radiologists. Integrated with other medical data such as clinical, biological and genetic information, they will deeply modify the organization and structure of the medical system. (C) 2019 l'Academie nationale de medecine. Published by Elsevier Masson SAS. All rights reserved. 2
The TIPS Liquidity Premium Andreasen, MM; Christensen, JHE; Riddell, S We introduce an arbitrage-free term structure model of nominal and real yields that accounts for liquidity risk in Treasury inflation-protected securities (TIPS). The novel feature of our model is to identify liquidity risk from individual TIPS prices by accounting for the tendency that TIPS, like most fixed-income securities, go into buy-and-hold investors' portfolios as time passes. We find a sizable and countercyclical TIPS liquidity premium, which helps our model to match TIPS prices. Accounting for liquidity risk also improves the model's ability to forecast inflation and match surveys of inflation expectations. 2
Error Invariants for Concurrent Traces Holzer, A; Schwartz-Narbonne, D; Befrouei, MT; Weissenbacher, G; Wies, T Error invariants are assertions that over-approximate the reachable program states at a given position in an error trace while only capturing states that will still lead to failure if execution of the trace is continued from that position. Such assertions reflect the effect of statements that are involved in the root cause of an error and its propagation, enabling slicing of statements that do not contribute to the error. Previous work on error invariants focused on sequential programs. We generalize error invariants to concurrent traces by augmenting them with additional information about hazards such as write-after-write events, which are often involved in race conditions and atomicity violations. By providing the option to include varying levels of details in error invariants-such as hazards and branching information-our approach allows the programmer to systematically analyze individual aspects of an error trace. We have implemented a hazard-sensitive slicing tool for concurrent traces based on error invariants and evaluated it on benchmarks covering a broad range of real-world concurrency bugs. Hazard-sensitive slicing significantly reduced the length of the considered traces and still maintained the root causes of the concurrency bugs. 2
Compare and Draw Lessons - Designing Resilience for Communities at Risk: Socio-technical Decision Support for Near-field Tsunamis Boulos, G; Huggins, LJ; Siciliano, MD; Ling, H; Yackovich, JC; Mosse, D; Comfort, LK Since the catastrophic 2004 Sumatran earthquake and tsunami, tsunami early warning systems have been established in every major ocean. These systems rely on Deep-ocean sensors for Assessment and Reporting of Tsunamis (DART buoys) as well as seismic stations and tidal gauges for detection. Yet these systems have largely missed the early detection of near-field tsunamis. To evaluate the potential impact of near-field tsunamis generated by earthquakes offshore of Padang, Indonesia, we present a design that couples an underwater sensor network with a land-based communications network to support risk assessment and response operations among emergency response organizations, including warning dissemination to the public. 2
The Critical Role of Process Analysis in Chemical Recycling and Upcycling of Waste Plastics Nicholson, SR; Rorrer, JE; Singh, A; Konev, MO; Rorrer, NA; Carpenter, AC; Jacobsen, AJ; Roman-Leshkov, Y; Beckham, GT There is an urgent need for new technologies to enable circularity for synthetic polymers, spurred by the accumulation of waste plastics in landfills and the environment and the contributions of plastics manufacturing to climate change. Chemical recycling is a promising means to convert waste plastics into molecular intermediates that can be remanufactured into new products. Given the growing interest in the development of new chemical recycling approaches, it is critical to evaluate the economics, energy use, greenhouse gas emissions, and other life cycle inventory metrics for emerging processes, relative to the incumbent, linear manufacturing practices employed today. Here we offer specific definitions for classes of chemical recycling and upcycling and describe general process concepts for the chemical recycling of mixed plastics waste. We present a framework for techno-economic analysis and life cycle assessment for both closed- and open-loop chemical recycling. Rigorous application of these process analysis tools will be required to enable impactful solutions for the plastics waste problem. 2
New frontiers in cognitive ability testing: working memory Martin, N; Capman, J; Boyce, A; Morgan, K; Gonzalez, MF; Adler, S Purpose Cognitive ability tests demonstrate strong relationships with job performance, but have several limitations; notably, subgroup differences based on race/ethnicity. As an alternative, the purpose of this paper is to develop a working memory assessment for personnel selection contexts. Design/methodology/approach The authors describe the development of Global Adaptive Memory Evaluation (G.A.M.E.) - a working memory assessment - along with three studies focused on refining and validating G.A.M.E., including examining test-taker reactions, reliability, subgroup differences, construct and criterion-related validity, and measurement equivalence across computer and mobile devices. Findings Evidence suggests that G.A.M.E. is a reliable and valid tool for employee selection. G.A.M.E. exhibited convergent validity with other cognitive assessments, predicted job performance, yielded smaller subgroup differences than traditional cognitive ability tests, was engaging for test-takers, and upheld equivalent measurement across computers and mobile devices. Research limitations/implications Additional research is needed on the use of working memory assessments as an alternative to traditional cognitive ability testing, including its advantages and disadvantages, relative to other constructs and methods. Practical implications The findings illustrate working memory's potential as an alternative to traditional cognitive ability assessments and highlight the need for cognitive ability tests that rely on modern theories of intelligence and leverage burgeoning mobile technology. Originality/value This paper highlights an alternative to traditional cognitive ability tests, namely, working memory assessments, and demonstrates how to design reliable, valid, engaging and mobile-compatible versions. 2
A bi-criteria multiple-choice secretary problem Yu, G; Jacobson, SH; Kiyavash, N This article studies a Bi-criteria Multiple-choice Secretary Problem (BMSP) with full information. A sequence of candidates arrive one at a time, with a two-dimensional attribute vector revealed upon arrival. A decision maker needs to select a total number of eta candidates to fill eta job openings, based on the attribute vectors of candidates. The objective of the decision maker is to maximize the expected sum of attribute values of selected candidates for both dimensions of the attribute vector. An approach for generating Pareto-optimal policies for BMSP is proposed using the weighted sum method. Moreover, closed-form expressions for values of both objective functions under Pareto-optimal policies for BMSP are provided to help a decision maker in the policy planning stage. These analysis techniques can be applied directly to solve the more general class of multi-criteria multiple-choice Secretary Problems, provided the objective functions are in the form of accumulating a product-form reward for each selected candidate. 2
Temporal-Aware Self-Supervised Learning for 3D Hand Pose and Mesh Estimation in Videos Chen, LJ; Lin, SY; Xie, YS; Lin, YY; Xie, XH Estimating 3D hand pose directly from RGB images is challenging but has gained steady progress recently by training deep models with annotated 3D poses. However annotating 3D poses is difficult and as such only a few 3D hand pose datasets are available, all with limited sample sizes. In this study, we propose a new framework of training 3D pose estimation models from RGB images without using explicit 3D annotations, i.e., trained with only 2D information. Our framework is motivated by two observations: 1) Videos provide richer information for estimating 3D poses as opposed to static images; 2) Estimated 3D poses ought to be consistent whether the videos are viewed in the forward order or reverse order. We leverage these two observations to develop a self-supervised learning model called temporal-aware self-supervised network (TASSN). By enforcing temporal consistency constraints, TASSN learns 3D hand poses and meshes from videos with only 2D keypoint position annotations. Experiments show that our model achieves surprisingly good results, with 3D estimation accuracy on par with the state-of-the-art models trained with 3D annotations, highlighting the benefit of the temporal consistency in constraining 3D prediction models. 2
READ: Recursive Autoencoders for Document Layout Generation Patil, AG; Ben-Eliezer, O; Perel, O; Averbuch-Elor, H Layout is a fundamental component of any graphic design. Creating large varieties of plausible document layouts can be a tedious task, requiring numerous constraints to be satisfied, including local ones relating different semantic elements and global constraints on the general appearance and spacing. In this paper, we present a novel framework, coined READ, for REcursive Autoencoders for Document layout generation, to generate plausible 2D layouts of documents in large quantities and varieties. First, we devise an exploratory recursive method to extract a structural decomposition of a single document. Leveraging a dataset of documents annotated with labeled bounding boxes, our recursive neural network learns to map the structural representation, given in the form of a simple hierarchy, to a compact code, the space of which is approximated by a Gaussian distribution. Novel hierarchies can be sampled from this space, obtaining new document layouts. Moreover, we introduce a combinatorial metric to measure structural similarity among document layouts. We deploy it to show that our method is able to generate highly variable and realistic layouts. We further demonstrate the utility of our generated layouts in the context of standard detection tasks on documents, showing that detection performance improves when the training data is augmented with generated documents whose layouts are produced by READ. 2
Cooperative Mixed Reality Leveraging Edge Computing and Communication Tang, SH; Chen, BH; Hochstetler, J; Hirsch, J; Fu, S Traditional Mixed Reality (MR) and Augmented Reality (AR) devices support a wide gamut of sensors, but the limited computational resources onboard such devices make advanced tasks difficult. With the introduction of Edge computing and more advanced Edge hardware, we are no longer bound to just the onboard processor or the Cloud as our only source of computational power. In our work, we introduce the use of Edge with MR devices to provide a cooperative perception capability to the MR device. We base our approach on the portability and low latency of the Edge. Through our prototype system, we demonstrate the potential of such devices and evaluate the performance and feasibility through real-world trials. Our evaluation proves that the system is capable of supporting cooperative perception tasks. 2
Relative Error Streaming Quantiles Cormode, G; Karnin, Z; Liberty, E; Thaler, J; Vesely, P Approximating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of n items from a data universe U equipped with a total order, the task is to compute a sketch (data structure) of size poly(log(n), 1/epsilon). Given the sketch and a query item y in U, one should be able to approximate its rank in the stream, i.e., the number of stream elements smaller than or equal to y. Most works to date focused on additive epsilon*n error approximation, culminating in the KLL sketch that achieved optimal asymptotic behavior. This paper investigates multiplicative (1 +/- epsilon)-error approximations to the rank. Practical motivation for multiplicative error stems from demands to understand the tails of distributions, and hence for sketches to be more accurate near extreme values. The most space-efficient algorithms due to prior work store either O(log(epsilon^2 n)/epsilon^2) or O(log^3(epsilon n)/epsilon) universe items. This paper presents a randomized algorithm storing O(log^1.5(epsilon n)/epsilon) items, which is within an O(sqrt(log(epsilon n))) factor of optimal. The algorithm does not require prior knowledge of the stream length and is fully mergeable, rendering it suitable for parallel and distributed computing environments. 2
The Impact of the Michigan Merit Curriculum on High School Math Course-Taking Kim, S; Wallsworth, G; Xu, R; Schneider, B; Frank, K; Jacob, B; Dynarski, S Michigan Merit Curriculum (MMC) is a statewide college-preparatory policy that applies to the high school graduating class of 2011 and later. Using detailed Michigan high school transcript data, this article examines the effect of the MMC on various students' course-taking and achievement outcomes. Our analyses suggest that (a) post-MMC cohorts took and passed approximately 0.2 additional years' of math courses, and students at low socioeconomic status (SES) schools drove nearly all of these effects; (b) post-policy students also completed higher-level courses, with the largest increase among the least prepared students; (c) we did not find strong evidence on students' ACT math scores; and (d) we found an increase in college enrollment rates for post-MMC cohorts, and the increase is mostly driven by well-prepared students. 2
EX3: Explainable Attribute-aware Item-set Recommendations Xian, YK; Zhao, T; Li, J; Chan, J; Kan, A; Ma, J; Dong, XL; Faloutsos, C; Karypis, G; Muthukrishnan, S; Zhang, YF Existing recommender systems in the e-commerce domain primarily focus on generating a set of relevant items as recommendations; however, few existing systems utilize underlying item attributes as a key organizing principle in presenting recommendations to users. Mining important attributes of items from customer perspectives and presenting them along with item sets as recommendations can provide users more explainability and help them make better purchase decision. In this work, we generalize the attribute-aware item-set recommendation problem, and develop a new approach to generate sets of items (recommendations) with corresponding important attributes (explanations) that can best justify why the items are recommended to users. In particular, we propose a system that learns important attributes from historical user behavior to derive item set recommendations, so that an organized view of recommendations and their attribute-driven explanations can help users more easily understand how the recommendations relate to their preferences. Our approach is geared towards real world scenarios: we expect a solution to be scalable to billions of items, and be able to learn item and attribute relevance automatically from user behavior without human annotations. To this end, we propose a multi-step learning-based framework called Extract-Expect-Explain (EX3), which is able to adaptively select recommended items and important attributes for users. We experiment on a large-scale real-world benchmark and the results show that our model outperforms state-of-the-art baselines by an 11.35% increase on NDCG with adaptive explainability for item set recommendation. 2
A user-centric network communication broker for multimedia collaborative computing Zhang, C; Sadjadi, SM; Sun, WX; Rangaswami, R; Deng, Y The development of collaborative multimedia applications today follows a vertical development approach, where each application is built on top of low-level network abstractions such as the socket interface. This stovepipe development process is a major inhibitor that drives up the cost of development and slows down the innovation pace of new generations of communication applications. In this paper, we propose a network communication broker (NCB) that provides a unified higher-level abstraction for the class of multimedia collaborative applications. We demonstrate how NCB encapsulates the complexity of network-level communication control and media delivery, and expedites the development of applications with various communication logics. We investigate the minimum necessary requirements for the NCB abstraction. We identify that the concept of user-level sessions involving multiple parties and multiple media, is critical to designing a reusable NCB to facilitate next-generation multimedia communications. Furthermore, the internal design of NCB decouples the user-level sessions from network-level sessions, so that the NCB framework can accommodate heterogeneous networks, and applications can be easily ported to new network environments. In addition, we demonstrate how the extensible and self-managing design of NCB supports dynamic adaptation in response to changes in network conditions and user requirements. 2
Improving Applicant Reactions to Forced-Choice Personality Measurement: Interventions to Reduce Threats to Test Takers' Self-Concepts Dalal, DK; Zhu, XY; Rangel, B; Boyce, AS; Lobene, E Previous research has demonstrated that selection decision making is improved with the use of valid pre-employment assessments, but these assessments can often engender negative reactions on the part of job candidates. Reactions to personality assessments tend to be particularly negative, and these reactions are even worse for forced-choice personality assessments. This latter issue is particularly troubling given the evidence showing that forced-choice measurement is quite effective at reducing deliberate response distortions (i.e., faking). Given the importance organizations place on candidate experience during the recruitment and selection process, improving applicants' reactions to valid selection assessments is important. Previous research has not, however, discussed the reasons or mechanisms behind why test takers have negative reactions to forced-choice assessments in particular. Here, we propose that forced-choice measurement threatens elements of the test taker's self-concept thereby engendering negative reactions to the assessment. Based on these theoretical arguments, we develop and test the efficacy of four format variations to a forced-choice assessment to improve test taker reactions. Results suggest that, compared to a traditional/standard forced-choice assessment, test takers reacted more positively to forced-choice assessment formats that (1) used a graded, as opposed to dichotomous, response scale (i.e., allowing for slightly (dis)agree responses); (2) included post-assessment performance feedback; and (3) removed the most socially undesirable items from the test. The theoretical and practical implications of these results are discussed. 2
Subblock-Based Motion Derivation and Inter Prediction Refinement in the Versatile Video Coding Standard Yang, HT; Chen, HB; Chen, JL; Esenlik, S; Sethuraman, S; Xiu, XY; Alshina, E; Luo, JC Efficient representation and coding of fine-granular motion information is one of the key research areas for exploiting inter-frame correlation in video coding. Representative techniques towards this direction are affine motion compensation (AMC), decoder-side motion vector refinement (DMVR), and subblock-based temporal motion vector prediction (SbTMVP). Fine-granular motion information is derived at subblock level for all the three coding tools. In addition, the obtained inter prediction can be further refined by two optical flow-based coding tools, the bi-directional optical flow (BDOF) for bi-directional inter prediction and the prediction refinement with optical flow (PROF) exclusively used in combination with AMC. The aforementioned five coding tools have been extensively studied and finally adopted in the Versatile Video Coding (VVC) standard. This paper presents technical details of each tool and highlights the design elements with the consideration of typical hardware implementations. Following the common test conditions defined by Joint Video Experts Team (JVET) for the development of VVC, 5.7% bitrate reduction on average is achieved by the five tools. For test sequences characterized by large and complex motion, up to 13.4% bitrate reduction is observed. Additionally, visual quality improvement is demonstrated and analyzed. 2
Generative Question Refinement with Deep Reinforcement Learning in Retrieval-based QA System Liu, Y; Zhang, CW; Yan, XH; Chang, Y; Yu, PS In real-world question-answering (QA) systems, ill-formed questions, such as wrong words, ill word order and noisy expressions, are common and may prevent the QA systems from understanding and answering accurately. In order to eliminate the effect of ill-formed questions, we approach the question refinement task and propose a unified model, Qrefine, to refine the ill-formed questions to well-formed questions. The basic idea is to learn a Seq2Seq model to generate a new question from the original one. To improve the quality and retrieval performance of the generated questions, we make two major improvements: 1) To better encode the semantics of ill-formed questions, we enrich the representation of questions with character embedding and the contextual word embedding such as BERT, besides the traditional context-free word embeddings; 2) To make it capable to generate desired questions, we train the model with deep reinforcement learning techniques that consider an appropriate wording of the generation as an immediate reward and the correlation between generated question and answer as time-delayed long-term rewards. Experimental results on real-world datasets show that the proposed Qrefine can generate refined questions with high readability but fewer mistakes than original questions provided by users. Moreover, the refined questions also significantly improve the accuracy of answer retrieval. 2
Dynamic Inventory Repositioning in On-Demand Rental Networks Benjaafar, S; Jiang, D; Li, X; Li, XB We consider a rental service with a fixed number of rental units distributed across multiple locations. The units are accessed by customers without prior reservation and on an on-demand basis. Customers can decide on how long to keep a unit and where to return it. Because of the randomness in demand and in returns, there is a need to periodically reposition inventory away from some locations and into others. In deciding on how much inventory to reposition and where, the system manager balances potential lost sales with repositioning costs. Although the problem is increasingly common in applications involving on-demand rental services, not much is known about the nature of the optimal policy for systems with a general network structure or about effective approaches to solving the problem. In this paper, first, we show that the optimal policy in each period can be described in terms of a well-specified region over the state space. Within this region, it is optimal not to reposition any inventory, whereas, outside the region, it is optimal to reposition but only such that the system moves to a new state that is on the boundary of the no-repositioning region. We also provide a simple check for when a state is in the no-repositioning region. Second, we leverage the features of the optimal policy, along with properties of the optimal cost function, to propose a provably convergent approximate dynamic programming algorithm to tackle problems with a large number of dimensions. 2
A nonparametric approach to identify age, time, and cohort effects Antonczyk, D; Fitzenberger, B; Mammen, E; Yu, K Empirical studies in the social sciences and biometrics often rely on data and models where a number of individuals born at different dates are observed at several points in time, and the relationship of interest centers on the effects of age a, cohort c, and time t. Because of t = a + c, the design is degenerate and one is automatically confronted with the associated (linear) identification problem studied intensively for parametric models (Mason and Fienberg 1985; MaCurdy and Mroz 1995; Kuang, Nielsen and Nielsen 2008a,b). Nonlinear time, age, and cohort effects can be identified in an additive model. The present study seeks to solve the identification problem employing a nonparametric estimation approach: We develop an additive model which is solved using a backfitting algorithm, in the spirit of Mammen et al. (1999). Our approach has the advantage that we do not have to worry about the parametric specification and its impact on the identification problem. The results can easily be interpreted, as the smooth backfitting algorithm is a projection of the data onto the space of additive models. We develop a complete asymptotic distribution theory for nonparametric estimators based on kernel smoothing and apply the method to a study on wage inequality in Germany between 1975 and 2004. (C) 2019 Elsevier B.V. All rights reserved. 2
Learning to Map Wikidata Entities To Predefined Topics Bhargava, P; Spasojevic, N; Ellinger, S; Rao, A; Menon, A; Fuhrmann, S; Hu, GN Recently much progress has been made in entity disambiguation and linking systems (EDL). Given a piece of text, EDL links words and phrases to entities in a knowledge base, where each entity defines a specific concept. Although extracted entities are informative, they are often too specific to be used directly by many applications. These applications usually require text content to be represented with a smaller set of predefined concepts or topics, belonging to a topical taxonomy, that matches their exact needs. In this study, we aim to build a system that maps Wikidata entities to such predefined topics. We explore a wide range of methods that map entities to topics, including GloVe similarity, Wikidata predicates, Wikipedia entity definitions, and entity-topic co-occurrences. These methods often predict entity-topic mappings that are reliable, i.e., have high precision, but tend to miss most of the mappings, i.e., have low recall. Therefore, we propose an ensemble system that effectively combines individual methods and yields much better performance, comparable with human annotators. 2
Burden of neurological diseases in the US revealed by web searches Baeza-Yates, R; Sangal, PM; Villoslada, P Background Analyzing the disease-related web searches of Internet users provides insight into the interests of the general population as well as the healthcare industry, which can be used to shape health care policies. Methods We analyzed the searches related to neurological diseases and drugs used in neurology using the most popular search engines in the US, Google and Bing/Yahoo. Results We found that the most frequently searched diseases were common diseases such as dementia or Attention Deficit/Hyperactivity Disorder (ADHD), as well as medium frequency diseases with high social impact such as Parkinson's disease, MS and ALS. The most frequently searched CNS drugs were generic drugs used for pain, followed by sleep disorders, dementia, ADHD, stroke and Parkinson's disease. Regarding the interests of the healthcare industry, ADHD, Alzheimer's disease, MS, ALS, meningitis, and hypersomnia received the higher advertising bids for neurological diseases, while painkillers and drugs for neuropathic pain, drugs for dementia or insomnia, and triptans had the highest advertising bidding prices. Conclusions Web searches reflect the interest of people and the healthcare industry, and are based either on the frequency or social impact of the disease. 2
Competitive Multi-Agent Deep Reinforcement Learning with Counterfactual Thinking Wang, Y; Wan, Y; Zhang, CW; Bai, L; Cui, LX; Yu, PS Counterfactual thinking describes a psychological phenomenon that people re-infer the possible results with different solutions about things that have already happened. It helps people to gain more experience from mistakes and thus to perform better in similar future tasks. This paper investigates the counterfactual thinking for agents to find optimal decision-making strategies in multi-agent reinforcement learning environments. In particular, we propose a multi-agent deep reinforcement learning model with a structure which mimics the human-psychological counterfactual thinking process to improve the competitive abilities for agents. To this end, our model generates several possible actions (intent actions) with a parallel policy structure and estimates the rewards and regrets for these intent actions based on its current understanding of the environment. Our model incorporates a scenario-based framework to link the estimated regrets with its inner policies. During the iterations, our model updates the parallel policies and the corresponding scenario-based regrets for agents simultaneously. To verify the effectiveness of our proposed model, we conduct extensive experiments. Experimental results show that counterfactual thinking can actually benefit the agents to obtain more accumulative rewards from the environments with fair information by comparing to their opponents. 2
Gaussian Mixture Models for Classification and Hypothesis Tests Under Differential Privacy Tong, XS; Xi, BW; Kantarcioglu, M; Inan, A Many statistical models are constructed using very basic statistics: mean vectors, variances, and covariances. Gaussian mixture models are such models. When a data set contains sensitive information and cannot be directly released to users, such models can be easily constructed based on noise added query responses. The models nonetheless provide preliminary results to users. Although the queried basic statistics meet the differential privacy guarantee, the complex models constructed using these statistics may not meet the differential privacy guarantee. However it is up to the users to decide how to query a database and how to further utilize the queried results. In this article, our goal is to understand the impact of differential privacy mechanism on Gaussian mixture models. Our approach involves querying basic statistics from a database under differential privacy protection, and using the noise added responses to build classifier and perform hypothesis tests. We discover that adding Laplace noises may have a non-negligible effect on model outputs. For example variance-covariance matrix after noise addition is no longer positive definite. We propose a heuristic algorithm to repair the noise added variance-covariance matrix. We then examine the classification error using the noise added responses, through experiments with both simulated data and real life data, and demonstrate under which conditions the impact of the added noises can be reduced. We compute the exact type I and type II errors under differential privacy for one sample z test, one sample t test, and two sample t test with equal variances. We then show under which condition a hypothesis test returns reliable result given differentially private means, variances and covariances. 2
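As an illustration of the covariance-repair step described in the abstract above, the following minimal Python sketch (not the authors' code; the eigenvalue-clipping heuristic, the noise scale, and the use of NumPy are assumptions) adds Laplace noise to a sample covariance matrix and then projects the noisy matrix back to the positive definite cone.

import numpy as np

def laplace_noisy_covariance(X, scale, rng=None):
    # Query the sample covariance and add i.i.d. Laplace noise to each entry,
    # symmetrizing so the perturbed matrix is still a symmetric matrix.
    rng = np.random.default_rng() if rng is None else rng
    cov = np.cov(X, rowvar=False)
    noise = rng.laplace(0.0, scale, size=cov.shape)
    return cov + (noise + noise.T) / 2.0

def repair_psd(matrix, floor=1e-6):
    # Heuristic repair (illustrative, not necessarily the paper's algorithm):
    # eigendecompose the symmetric noisy matrix and clip eigenvalues below a
    # small floor, yielding a positive definite variance-covariance matrix.
    vals, vecs = np.linalg.eigh(matrix)
    vals = np.clip(vals, floor, None)
    return (vecs * vals) @ vecs.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    noisy = laplace_noisy_covariance(X, scale=0.5, rng=rng)
    repaired = repair_psd(noisy)
    print(np.linalg.eigvalsh(noisy))     # may contain negative eigenvalues
    print(np.linalg.eigvalsh(repaired))  # all strictly positive

The repaired matrix can then be plugged into downstream classifiers or hypothesis tests built from the noise-added statistics, as the paper studies.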
An Ultra-Low-Power Image Signal Processor for Hierarchical Image Recognition With Deep Neural Networks An, HC; Schiferl, S; Venkatesan, S; Wesley, T; Zhang, QR; Wang, JC; Choo, KD; Liu, SY; Liu, BW; Li, ZY; Gong, LY; Zhong, HF; Blaauw, D; Dreslinski, R; Kim, HS; Sylvester, D We propose an ultra-low-power (ULP) image signal processor (ISP) that performs on-the-fly in-processing frame compression/decompression and hierarchical event recognition to exploit the temporal and spatial sparsity in an image sequence. This approach reduces energy consumption spent processing and transmitting unimportant image data to achieve a 16x imaging system energy gain in an intruder detection scenario. The ISP was fabricated in 40-nm CMOS and consumes only 170 at 5 frames/s for neural network-based intruder detection and 192 compressed image recording. 2
RATEmiRs: the rat atlas of tissue-specific and enriched miRNAs for discerning baseline expression exclusivity of candidate biomarkers Bushel, PR; Caiment, F; Wu, H; O'Lone, R; Day, F; Calley, J; Smith, A; Li, JY; Harrill, AH MicroRNAs (miRNAs) are small RNAs that regulate mRNA expression and have been targeted as biomarkers of organ damage and disease. To explore the utility of miRNAs to assess injury to specific tissues, a tissue atlas of miRNA abundance was constructed. The Rat Atlas of Tissue-specific and Enriched miRNAs (RATEmiRs) catalogues miRNA sequencing data from 21 and 23 tissues in male and female Sprague-Dawley rats, respectively. RATEmiRs identifies tissue-enriched (TE), tissue-specific (TS), or organ-specific (OS) miRNAs via comparisons of one or more tissue or organ vs others. We provide a brief overview of RATEmiRs and present how to use it to detect miRNA expression abundance of candidate biomarkers as well as to compare the expression of miRNAs between rat and human. The database is available at 2
Structural Results on Matching Estimation with Applications to Streaming Bury, M; Grigorescu, E; McGregor, A; Monemizadeh, M; Schwiegelshohn, C; Vorotnikova, S; Zhou, S We study the problem of estimating the size of a matching when the graph is revealed in a streaming fashion. Our results are multifold: We give a tight structural result relating the size of a maximum matching to the arboricity of a graph, which has been one of the most studied graph parameters for matching algorithms in data streams. One of the implications is an algorithm that estimates the matching size up to a factor of (alpha+2)(1+epsilon), where alpha is the arboricity, using O(n^(4/5)) space in dynamic streams, where n is the number of nodes in the graph. We also show that in the vertex arrival insertion-only model, an (alpha+2) approximation can be achieved using only O(log n) space. We further show that the weight of a maximum weighted matching can be efficiently estimated by augmenting any routine for estimating the size of an unweighted matching. Namely, given an algorithm for approximating the size of an unweighted matching, we obtain an approximation for the weighted case that loses only a factor of 2(1+epsilon), while only incurring a multiplicative logarithmic factor in the space bounds. The algorithm is implementable in any streaming model, including dynamic streams. We also investigate algebraic aspects of computing matchings in data streams, by proposing new algorithms and lower bounds based on analyzing the rank of the Tutte-matrix of the graph. In particular, we present an algorithm determining whether there exists a matching of size k using O(k^2 log n) space. We also show a space lower bound for small approximation factors to the maximum matching size in insertion-only streams. This lower bound also holds for approximating the rank of a matrix. 2
NetVision: On-Demand Video Processing in Wireless Networks Lu, ZQ; Chan, KV; Urgaonkar, R; Pu, SL; La Porta, T The vast adoption of mobile devices with cameras has greatly contributed to the proliferation of the creation and distribution of videos. For a variety of purposes, valuable information may be extracted from these videos. While the computational capability of mobile devices has greatly improved recently, video processing is still a demanding task for mobile devices. We design an on-demand video processing system, NetVision, that performs distributed video processing using deep learning across a wireless network of mobile and edge devices to answer queries while minimizing the query response time. However, the problem of minimal query response time for processing videos stored across a network is a strongly NP-hard problem. To deal with this, we design a greedy algorithm with bounded performance. To further deal with the dynamics of the transmission rate between mobile and edge devices, we design an adaptive algorithm. We built NetVision and deployed it on a small testbed. Based on the measurements of the testbed and by extensive simulations, we show that the greedy algorithm is close to the optimum and the adaptive algorithm performs better with more dynamic transmission rates. We then perform experiments on the small testbed to examine the realized system performance in both stationary networks and mobile networks. 2
Equilibrium strategies for multiple interdictors on a common network Sreekumaran, H; Hota, AR; Liu, AL; Uhan, NA; Sundaram, S In this work, we introduce multi-interdictor games, which model interactions among multiple interdictors with differing objectives operating on a common network. As a starting point, we focus on shortest path multi-interdictor (SPMI) games, where multiple interdictors try to increase the shortest path lengths of their own adversaries attempting to traverse a common network. We first establish results regarding the existence of equilibria for SPMI games under both discrete and continuous interdiction strategies. To compute such an equilibrium, we present a reformulation of the SPMI game, which leads to a generalized Nash equilibrium problem (GNEP) with non-shared constraints. While such a problem is computationally challenging in general, we show that under continuous interdiction actions, an SPMI game can be formulated as a linear complementarity problem and solved by Lemke's algorithm. In addition, we present decentralized heuristic algorithms based on best response dynamics for games under both continuous and discrete interdiction strategies. Finally, we establish theoretical lower bounds on the worst-case efficiency loss of equilibria in SPMI games, with such loss caused by the lack of coordination among noncooperative interdictors, and use the decentralized algorithms to numerically study the average-case efficiency loss. (C) 2020 Elsevier B.V. All rights reserved. 2
Common factors of commodity prices Delle Chiaie, S; Ferrara, L; Giannone, D In this paper, we extract latent factors from a large cross-section of commodity prices, including fuel and non-fuel commodities. We decompose each commodity price series into a global (or common) component, block-specific components, and a purely idiosyncratic component. We find that the bulk of the fluctuations in commodity prices are well summarized by a single global factor. This global factor is closely related to fluctuations in global economic activity and, since the early 2000s, has become more important in explaining variations in commodity prices. 2
Low-shot Learning in Natural Language Processing Xia, CY; Zhang, CW; Zhang, JW; Liang, TT; Peng, H; Yu, PS This paper studies the low-shot learning paradigm in Natural Language Processing (NLP), which aims to provide the ability to adapt to new tasks or new domains with limited annotation data, such as zero or few labeled examples. Specifically, low-shot learning unifies the zero-shot and few-shot learning paradigms. Diverse low-shot learning approaches, including capsule-based networks, data-augmentation methods, and memory networks, are discussed for different NLP tasks, for example, intent detection and named entity typing. We also provide potential future directions for low-shot learning in NLP. 2
Deeptime: a Python library for machine learning dynamical models from time series data Hoffmann, M; Scherer, M; Hempel, T; Mardt, A; de Silva, B; Husic, BE; Klus, S; Wu, H; Kutz, N; Brunton, SL; Noe, F Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use but also easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found under https://deeptime-ml.github.io/. 2
Real-time diabetic retinopathy screening by deep learning in a multisite national screening programme: a prospective interventional cohort study Ruamviboonsuk, P; Tiwari, R; Sayres, R; Nganthavee, V; Hemarat, K; Kongprayoon, A; Raman, R; Levinstein, B; Liu, Y; Schaekermann, M; Lee, R; Virmani, S; Widner, K; Chambers, J; Hersch, F; Peng, L; Webster, DR Background Diabetic retinopathy is a leading cause of preventable blindness, especially in low-income and middle-income countries (LMICs). Deep-learning systems have the potential to enhance diabetic retinopathy screenings in these settings, yet prospective studies assessing their usability and performance are scarce. Methods We did a prospective interventional cohort study to evaluate the real-world performance and feasibility of deploying a deep-learning system into the health-care system of Thailand. Patients with diabetes and listed on the national diabetes registry, aged 18 years or older, able to have their fundus photograph taken for at least one eye, and due for screening as per the Thai Ministry of Public Health guidelines were eligible for inclusion. Eligible patients were screened with the deep-learning system at nine primary care sites under Thailand's national diabetic retinopathy screening programme. Patients with a previous diagnosis of diabetic macular oedema, severe non-proliferative diabetic retinopathy, or proliferative diabetic retinopathy; previous laser treatment of the retina or retinal surgery; other non-diabetic retinopathy eye disease requiring referral to an ophthalmologist; or inability to have fundus photograph taken of both eyes for any reason were excluded. Deep-learning system-based interpretations of patient fundus images and referral recommendations were provided in real time. As a safety mechanism, regional retina specialists over-read each image. Performance of the deep-learning system (accuracy, sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) was measured against an adjudicated reference standard, provided by fellowship-trained retina specialists. This study is registered with the Thai national clinical trials registry TCRT20190902002. Findings Between Dec 12, 2018, and March 29, 2020, 7940 patients were screened for inclusion. 7651 (96.3%) patients were eligible for study analysis, and 2412 (31.5%) patients were referred for diabetic retinopathy, diabetic macular oedema, ungradable images, or low visual acuity. For vision-threatening diabetic retinopathy, the deep-learning system had an accuracy of 94.7% (95% CI 93.0-96.2), sensitivity of 91.4% (87.4-95.0), and specificity of 95.4% (94.1-96.7). The retina specialist over-readers had an accuracy of 93.5% (91.7-95.0; p=0.17), a sensitivity of 84.8% (79.4-90.0; p=0.024), and specificity of 95.5% (94.1-96.7; p=0.98). The PPV for the deep-learning system was 79.2% (95% CI 73.8-84.3) compared with 75.6% (69.8-81.1) for the over-readers. The NPV for the deep-learning system was 95.5% (92.8-97.9) compared with 92.4% (89.3-95.5) for the over-readers. Interpretation A deep-learning system can deliver real-time diabetic retinopathy detection capability similar to retina specialists in community-based screening settings. Socioenvironmental factors and workflows must be taken into consideration when implementing a deep-learning system within a large-scale screening programme in LMICs. Copyright (C) 2022 The Author(s). Published by Elsevier Ltd. 2
SUPERVEGAN: Super Resolution Video Enhancement GAN for Perceptually Improving Low Bitrate Streams Andrei, SS; Shapovalova, N; Mayol-Cuevas, W This paper presents a novel model family that we call SUPERVEGAN, for the problem of video enhancement for low bitrate streams by simultaneous video super resolution and removal of compression artifacts from low bitrates (e.g. 250Kbps). Our strategy is fully end-to-end, but we upsample and tackle the problem in two main stages. The first stage deals with removal of streaming compression artifacts and performs a partial upsampling, and the second stage performs the final upsampling and adds detail generatively. We also use a novel progressive training strategy for video together with the use of perceptual metrics. Our experiments show resilience to training bitrate and we show how to derive real-time models. We also introduce a novel bitrate equivalency test that enables the assessment of how much a model improves streams with respect to bitrate. We demonstrate efficacy on two publicly available HD datasets, LIVE-NFLX-II and Tears of Steel (TOS). We compare against a range of baselines and encoders, and our results demonstrate that our models achieve a perceptual equivalence of up to two times the input bitrate. In particular our 4X upsampling outperforms baseline methods on the LPIPS perceptual metric, and our 2X upsampling model also outperforms baselines on traditional metrics such as PSNR. 1
Training Strategies to Handle Missing Modalities for Audio-Visual Expression Recognition Parthasarathy, S; Sundaram, S Automatic audio-visual expression recognition can play an important role in communication services such as tele-health, VOIP calls and human-machine interaction. Accuracy of audio-visual expression recognition could benefit from the interplay between the two modalities. However, most audio-visual expression recognition systems, trained in ideal conditions, fail to generalize in real world scenarios where either the audio or visual modality could be missing due to a number of reasons such as limited bandwidth, interactors' orientation, or caller-initiated muting. This paper studies the performance of a state-of-the-art transformer when one of the modalities is missing. We conduct ablation studies to evaluate the model in the absence of either modality. Further, we propose a strategy to randomly ablate visual inputs during training at the clip or frame level to mimic real world scenarios. Results on in-the-wild data indicate significant generalization in the proposed models trained on missing cues, with gains up to 17% for frame level ablations, showing that these training strategies cope better with the loss of input modalities. 1
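The training strategy in the abstract above amounts to randomly dropping visual inputs during training, at either the clip or the frame level. Below is a minimal PyTorch-style sketch of such an ablation step; the tensor layout, drop probabilities, and zero-masking are illustrative assumptions rather than the paper's exact configuration.

import torch

def ablate_visual(visual, p_clip=0.3, p_frame=0.2):
    # visual: (batch, time, feat) visual features for a clip.
    # With probability p_clip, zero out the entire visual stream of a sample
    # (clip-level ablation); otherwise zero out random frames (frame-level).
    visual = visual.clone()
    batch, time, _ = visual.shape
    drop_clip = torch.rand(batch, device=visual.device) < p_clip
    visual[drop_clip] = 0.0
    frame_mask = torch.rand(batch, time, device=visual.device) < p_frame
    frame_mask &= ~drop_clip.unsqueeze(1)          # don't double-drop ablated clips
    visual[frame_mask] = 0.0
    return visual

# During training, the audio stream is left intact while the visual stream is
# randomly ablated, so the model learns to cope with missing visual cues:
# fused = model(audio, ablate_visual(visual))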
A context-sensitive real-time Spell Checker with language adaptability Gupta, P We present a novel language-adaptable spell checking system that detects spelling errors and suggests context-sensitive corrections in real-time. We show that our system can be extended to new languages with minimal language-specific processing. The available literature mostly discusses spell checkers for English, and there are no publicly available systems that can be extended to work for other languages out of the box. Most of the systems do not work in real-time. We explain the process of generating a language's word dictionary and n-gram probability dictionaries using Wikipedia-articles data and manually curated video subtitles. We present the results of generating a list of suggestions for a misspelled word. We also propose three approaches to create noisy channel datasets of real-world typographic errors. Finally, we show the effectiveness of language adaptability of our proposed system by extending it to 24 languages. 1
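As a rough illustration of the context-sensitive noisy-channel idea referenced above (this is not the author's system; the edit-distance-1 candidate generation, add-one smoothing, and bigram scoring are simplifying assumptions), candidate corrections can be generated from a word dictionary and ranked with an n-gram context probability:

from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    # All strings one edit away from `word` (deletes, swaps, replaces, inserts).
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    swaps = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in ALPHABET]
    inserts = [L + c + R for L, R in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)

def suggest(word, prev_word, unigrams, bigrams):
    # Rank in-vocabulary candidates by bigram context count times a crude
    # unigram channel prior; add-one smoothing keeps every score non-zero.
    candidates = [w for w in edits1(word) | {word} if w in unigrams]
    def score(w):
        return (bigrams[(prev_word, w)] + 1) * (unigrams[w] + 1)
    return sorted(candidates, key=score, reverse=True)[:5]

# Toy usage with counts built from any corpus (e.g. Wikipedia articles):
unigrams = Counter({"their": 50, "there": 40, "the": 300})
bigrams = Counter({("over", "there"): 12, ("over", "their"): 1})
print(suggest("ther", "over", unigrams, bigrams))  # 'there' ranks first

A production system would precompute these dictionaries per language and restrict candidate generation for latency, which is where the real-time and language-adaptability engineering in the paper comes in.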
Learning Sparse Models at Scale Herbrich, R Recently, learning deep models from dense data has received a lot of attention in tasks such as object recognition and signal processing. However, when dealing with non-sensory data about real-world entities, data is often sparse; for example, people interacting with products in e-Commerce, people interacting with each other in social networks, or word sequences in natural language. In this talk, I will share lessons learned over the past 10 years when learning predictive models based on sparse data: 1) how to scale the inference algorithms to distributed data settings, 2) how to automate the learning process by reducing the amount of hyper-parameters to zero, 3) how to deal with Zipf distributions when learning resource-constrained models, and 4) how to combine dense and sparse-learning algorithms. The talk will draw from many real-world experiences I gathered over the past decade in applications of the techniques in gaming, search, advertising and recommendations in systems developed at Microsoft, Facebook and Amazon. 1
Perspectives on Becoming an Applied Machine Learning Scientist Rasiwasia, N While becoming a scientist is all about creating a dent in the boundaries of human knowledge, applying that expertise in a corporate setting requires a transformation of the scientist skill set. In both academia and industry, the primary focus is problem solving. In academia, however, it is targeted toward creating a novel solution that pushes the state of the art, while, in industry, the solution is valuable if it leads to the desired change in business metrics. 1
Machine Learning @ Amazon Rastogi, R In this talk, I will first provide an overview of key problem areas where we are applying Machine Learning (ML) techniques within Amazon such as product demand forecasting, product search, and information extraction from reviews, and associated technical challenges. I will then talk about two specific applications where we use a variety of methods to learn semantically rich representations of data: question answering where we use deep learning techniques and product size recommendations where we use probabilistic models, and fake reviews detection where we use tensor factorization algorithms. 1
Transitioning to Agile-In a Large Organization Mohanarangam, K Change is difficult. A big organization that is set in their old ways of doing things will have big resistance to change. Transitioning to agile is one such change. There are several problems an enterprise would face when transitioning to agile from waterfall software development methodology. It might be as trivial as a team not being colocated, or it could be complete lack of trust between the development team and business partners. In this article, I will describe various problems that my team and I faced when we transitioned to agile in my previous organization and how we successfully overcame those problems and became a model team. 1
END-TO-END MULTI-CHANNEL TRANSFORMER FOR SPEECH RECOGNITION Chang, FJ; Radfar, M; Mouchtaris, A; King, B; Kunzmann, S Transformers are powerful neural architectures that allow integrating different modalities using attention mechanisms. In this paper, we leverage the neural transformer architectures for multi-channel speech recognition systems, where the spectral and spatial information collected from different microphones are integrated using attention layers. Our multi-channel transformer network mainly consists of three parts: channel-wise self attention layers (CSA), cross-channel attention layers (CCA), and multi-channel encoder-decoder attention layers (EDA). The CSA and CCA layers encode the contextual relationship within and between channels and across time, respectively. The channel-attended outputs from CSA and CCA are then fed into the EDA layers to help decode the next token given the preceding ones. The experiments show that in a far-field in-house dataset, our method outperforms the baseline single-channel transformer, as well as the super-directive and neural beamformers cascaded with the transformers. 1
Optimizing Speech Recognition Evaluation Using Stratified Sampling Pylkkonen, J; Drugman, T; Bisani, M Producing large enough quantities of high-quality transcriptions for accurate and reliable evaluation of an automatic speech recognition (ASR) system can be costly. It is therefore desirable to minimize the manual transcription work for producing metrics with an agreed precision. In this paper we demonstrate how to improve ASR evaluation precision using stratified sampling. We show that by altering the sampling, the deviations observed in the error metrics can be reduced by up to 30% compared to random sampling, or alternatively, the same precision can be obtained on about 30% smaller datasets. We compare different variants for conducting stratified sampling, including a novel sample allocation scheme tailored for word error rate. Experimental evidence is provided to assess the effect of different sampling schemes to evaluation precision. 1
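A minimal sketch of the general idea in the abstract above (not the paper's specific allocation scheme): split the unlabeled pool into strata, allocate the transcription budget in proportion to stratum size times within-stratum variability (Neyman-style allocation), and combine the per-stratum estimates. The stratum definitions, the proxy variability score, and the simple per-utterance averaging below are assumptions made for illustration.

import random

def allocate_budget(strata_sizes, strata_stds, budget):
    # Neyman-style allocation: sample more from large and/or high-variance strata.
    weights = [n * s for n, s in zip(strata_sizes, strata_stds)]
    total = sum(weights) or 1.0
    return [max(1, round(budget * w / total)) for w in weights]

def stratified_wer_estimate(strata, budget):
    # strata: list of lists of utterances; each utterance is a dict holding a
    # proxy variability score and (after manual transcription) its WER.
    sizes = [len(s) for s in strata]
    stds = [sum(u["proxy_std"] for u in s) / len(s) for s in strata]
    allocation = allocate_budget(sizes, stds, budget)
    estimate, total_size = 0.0, sum(sizes)
    for stratum, n_samples in zip(strata, allocation):
        sample = random.sample(stratum, min(n_samples, len(stratum)))
        # In practice the sampled utterances are sent for manual transcription;
        # here we assume their "wer" field has already been filled in.
        stratum_wer = sum(u["wer"] for u in sample) / len(sample)
        estimate += stratum_wer * len(stratum) / total_size
    return estimate

Because high-variance strata receive proportionally more of the budget, the variance of the combined estimate drops relative to uniform random sampling, which is the effect the paper quantifies.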
Online Dual Decomposition for Performance and Delivery-Based Distributed Ad Allocation Huang, JC; Jenatton, R; Archambeau, C Online optimization is central to display advertising, where we must sequentially allocate ad impressions to maximize the total welfare among advertisers, while respecting various advertiser-specified long-term constraints (e.g., total amount of the ad's budget that is consumed at the end of the campaign). In this paper, we present the online dual decomposition (ODD) framework for large-scale, online, distributed ad allocation, which combines dual decomposition and online convex optimization. ODD allows us to account for the distributed and the online nature of the ad allocation problem and is extensible to a variety of ad allocation problems arising in real-world display advertising systems. Moreover, ODD does not require assumptions about auction dynamics, stochastic or adversarial feedback, or any other characteristics of the ad marketplace. We further provide guarantees for the online solution as measured by bounds on cumulative regret. The regret analysis accounts for the impact of having to estimate constraints in an online setting before they are observed and for the dependence on the smoothness with which constraints and constraint violations are generated. We provide an extensive set of results from a large-scale production advertising system at Amazon to validate the framework and compare its behavior to various ad allocation algorithms. 1
A Multi-Objective Rule Optimizer with an Application to Risk Management Pulkkinen, P; Tiwari, N; Kumar, A; Jones, C Managing risk is important to any E-commerce merchant. Various machine learning (ML) models combined with a rule set as the decision layer is a common practice to manage the risks. Unlike the ML models that can be automatically refreshed periodically based on new risk patterns, rules are generally static and rely on manual updates. To tackle that, this paper presents a data-driven and automated rule optimization method that generates multiple Pareto-optimal rule sets representing different trade-offs between business objectives. This enables business owners to make informed decisions when choosing between optimized rule sets for changing business needs and risks. Furthermore, manual work in rule management is greatly reduced. For scalability this method leverages Apache Spark and runs either on a single host or in a distributed environment in the cloud. This allows us to perform the optimization in a distributed fashion using millions of transactions, hundreds of variables and hundreds of rules during the training. The proposed method is general but we used it for optimizing real-world E-commerce (Amazon) risk rule sets. It could also be used in other fields such as finance and medicine. 1
Performance prediction of deep learning applications training in GPU as a service systems Lattuada, M; Gianniti, E; Ardagna, D; Zhang, L Data analysts predict that the GPU as a service (GPUaaS) market will grow to support 3D models, animated video processing, gaming, and deep learning model training. The main cloud providers already offer in their catalogs VMs with different type and number of GPUs. Because of the significant difference in terms of performance and cost of this type of VMs, correctly selecting the most appropriate one to execute the required job is mandatory to minimize the training cost. Motivated by these considerations, this paper proposes performance models to predict GPU-deployed neural networks (NNs) training. The proposed approach is based on machine learning and exploits two main sets of features, thus capturing both NNs properties and hardware characteristics. Such data enable the learning of multiple linear regression models that, coupled with an established feature selection technique, become accurate prediction tools, with errors below 12% on average. An extensive experimental campaign, performed both on public and in-house private cloud deployments, considers popular deep NNs used for image classification and speech transcription. The results show that prediction errors remain small even when extrapolating outside the range spanned by the input data, with important implications for the models' applicability. 1
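In the same spirit as the abstract above (purely illustrative; the feature names, toy numbers, and use of scikit-learn are assumptions, not the authors' pipeline), a per-iteration training-time predictor can be fit as a multiple linear regression over network and hardware descriptors combined with a feature-selection step:

import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Hypothetical features per profiled run: [batch size, #parameters (M),
# FLOPs per batch (G), GPU memory bandwidth (GB/s), #GPUs]; target: seconds/iteration.
X = np.array([
    [32, 25.6, 4.1, 320, 1],
    [64, 25.6, 8.2, 320, 1],
    [32, 61.1, 7.7, 732, 1],
    [64, 61.1, 15.4, 732, 2],
    [128, 138.4, 31.0, 900, 4],
], dtype=float)
y = np.array([0.21, 0.39, 0.18, 0.22, 0.35])  # toy targets for illustration

# Feature selection followed by multiple linear regression, mirroring the
# paper's recipe of combining an established selection technique with linear models.
model = make_pipeline(SelectKBest(f_regression, k=3), LinearRegression())
model.fit(X, y)
print(model.predict([[96, 61.1, 23.1, 732, 2]]))  # predicted seconds/iteration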
Using imported intermediate goods: Selection and technology effects Gibson, MJ; Graciano, TA Producers that use imported intermediate goods tend to be much larger and more productive than others. Some of this is due to a selection effect: the most productive producers self-select into importing because only they can overcome the fixed costs of developing trade relationships with foreign input suppliers. Some of this is due to a technology effect: any given producer would have higher variable profits from operating the technology using imported intermediate goods. To account for the roles of these theoretical mechanisms, we develop a simple model of a competitive small open economy in which heterogeneous firms endogenously decide whether to use imported intermediate goods. The technology that uses imported intermediate goods is superior but requires a higher fixed cost of operating. The calibrated model captures the large performance advantage of importers and quantifies the selection and technology effects. 1
HETEROSKEDASTICITY AUTOCORRELATION ROBUST INFERENCE IN TIME SERIES REGRESSIONS WITH MISSING DATA Rho, SH; Vogelsang, TJ In this article, we investigate the properties of heteroskedasticity and autocorrelation robust (HAR) test statistics in time series regression settings when observations are missing. We primarily focus on the nonrandom missing process case where we treat the missing locations to be fixed as T -> infinity by mapping the missing and observed cutoff dates into points on [0,1] based on the proportion of time periods in the sample that occur up to those cutoff dates. We consider two models, the amplitude modulated series (Parzen, 1963) regression model, which amounts to plugging in zeros for missing observations, and the equal space regression model, which simply ignores the missing observations. When the amplitude modulated series regression model is used, the fixed-b limits of the HAR test statistics depend on the locations of missing observations but are otherwise pivotal. When the equal space regression model is used, the fixed-b limits of the HAR test statistics have the standard fixed-b limits as in Kiefer and Vogelsang (2005). We discuss methods for obtaining fixed-b critical values with a focus on bootstrap methods and find the naive i.i.d. bootstrap with missing dates fixed to be an effective and practical way to obtain the fixed-b critical values. 1
Meta-Embedding as Auxiliary Task Regularization O'Neill, J; Bollegala, D Word embeddings have been shown to benefit from ensembling several word embedding sources, often carried out using straightforward mathematical operations over the set of word vectors. More recently, self-supervised learning has been used to find a lower-dimensional representation, similar in size to the individual word embeddings within the ensemble. Reconstruction is typically carried out prior to use on a downstream task. In this work we frame meta-embedding as a multi-task learning problem by reconstructing the word embedding sources as an auxiliary task that regularizes and shares a meta-embedding layer with the main downstream task. We carry out intrinsic evaluation (6 word similarity datasets and 3 analogy datasets) and extrinsic evaluation (4 downstream tasks). For intrinsic task evaluation, supervision comes from various labeled word similarity datasets. Our experimental results show that the performance is improved for all word similarity datasets when compared to self-supervised learning methods with a mean increase of 11.33 in Spearman correlation. Specifically, the proposed method shows the best performance in 4 out of 6 of word similarity datasets when using a cosine reconstruction loss and Brier's word similarity loss. Moreover, improvements are also made when performing word meta-embedding reconstruction in sequence tagging and sentence meta-embedding for sentence classification. 1
FlashR: Parallelize and Scale R for Machine Learning using SSDs Zheng, D; Mhembere, D; Vogelstein, JT; Priebe, CE; Burns, R R is one of the most popular programming languages for statistics and machine learning, but it is slow and unable to scale to large datasets. The general approach for having an efficient algorithm in R is to implement it in C or FORTRAN and provide an R wrapper. FlashR accelerates and scales existing R code by parallelizing a large number of matrix functions in the R base package and scaling them beyond memory capacity with solid-state drives (SSDs). FlashR performs memory hierarchy aware execution to speed up parallelized R code by (i) evaluating matrix operations lazily, (ii) performing all operations in a DAG in a single execution and with only one pass over data to increase the ratio of computation to I/O, (iii) performing two levels of matrix partitioning and reordering computation on matrix partitions to reduce data movement in the memory hierarchy. We evaluate FlashR on various machine learning and statistics algorithms on inputs of up to four billion data points. Despite the huge performance gap between SSDs and RAM, FlashR on SSDs closely tracks the performance of FlashR in memory for many algorithms. The R implementations in FlashR outperforms H2O and Spark MLlib by a factor of 3-20. 1
On the Futility of Dynamics in Robust Mechanism Design Balseiro, SR; Kim, A; Russo, D We consider a principal who repeatedly interacts with a strategic agent holding private information. In each round, the agent observes an idiosyncratic shock drawn independently and identically from a distribution known to the agent but not to the principal. The utilities of the principal and the agent are determined by the values of the shock and outcomes that are chosen by the principal based on reports made by the agent. When the principal commits to a dynamic mechanism, the agent best-responds to maximize his aggregate utility over the whole time horizon. The principal's goal is to design a dynamic mechanism to minimize his worst-case regret, that is, the largest difference possible between the aggregate utility he could obtain if he knew the agent's distribution and the actual aggregate utility he obtains. We identify a broad class of games in which the principal's optimal mechanism is static without any meaningful dynamics. The optimal dynamic mechanism, if it exists, simply repeats an optimal mechanism for a single-round problem in each round. The minimax regret is the number of rounds times the minimax regret in the single-round problem. The class of games includes repeated selling of identical copies of a single good or multiple goods, repeated principal-agent relationships with hidden information, and repeated allocation of a resource without money. Outside this class of games, we construct examples in which a dynamic mechanism provably outperforms any static mechanism. 1
Localized Triplet Loss for Fine-grained Fashion Image Retrieval D'Innocente, A; Garg, N; Zhang, Y; Bazzani, L; Donoser, M Fashion retrieval methods aim at learning a clothing-specific embedding space where images are ranked based on their global visual similarity with a given query. However, global embeddings struggle to capture localized fine-grained similarities between images, because of aggregation operations. Our work deals with this problem by learning localized representations for fashion retrieval based on local interest points of prominent visual features specified by a user. We introduce a localized triplet loss function that compares samples based on corresponding patterns. We incorporate random local perturbation on the interest point as a key regularization technique to enforce local invariance of visual representations. Due to the absence of existing fashion datasets to train on localized representations, we introduce FashionLocalTriplets, a new high-quality dataset annotated by fashion specialists that contains triplets of women's dresses and interest points. The proposed model outperforms state-of-the-art global representations on FashionLocalTriplets. 1
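A compact sketch of the loss described in the abstract above (illustrative only; the patch extraction, margin value, and perturbation radius are assumptions): embeddings are computed from local patches around a user-specified interest point, the interest point is randomly jittered as a regularizer for local invariance, and a standard triplet margin loss compares anchor, positive, and negative.

import torch
import torch.nn.functional as F

def extract_patch(image, point, size=32):
    # image: (C, H, W); point: (x, y) interest point; returns a local crop.
    x, y = point
    h = size // 2
    return image[:, max(y - h, 0):y + h, max(x - h, 0):x + h]

def localized_triplet_loss(embed, anchor_img, pos_img, neg_img, point,
                           jitter=4, margin=0.2):
    # `embed` is any network mapping a batch of patches to embedding vectors
    # (assumed, e.g. a CNN with adaptive pooling). Randomly perturb the interest
    # point, embed the corresponding patches, and apply a triplet margin loss.
    dx, dy = torch.randint(-jitter, jitter + 1, (2,)).tolist()
    jittered = (point[0] + dx, point[1] + dy)
    a = embed(extract_patch(anchor_img, jittered).unsqueeze(0))
    p = embed(extract_patch(pos_img, point).unsqueeze(0))
    n = embed(extract_patch(neg_img, point).unsqueeze(0))
    return F.triplet_margin_loss(a, p, n, margin=margin)

The jitter term is the key regularization from the abstract: the anchor embedding must stay close to the positive even when the interest point moves slightly, enforcing local invariance of the representation.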
Understanding Social Interaction across Social Network Sites Zhang, P; Lu, T; Liu, BX; Gu, HS; Gu, N People tend to utilize multiple social network sites (SNSs) simultaneously to maintain some social relationships, which results in many overlapping relationships and interactions among SNSs. Although many studies have focused on social interaction and cross-SNS user footprint analysis and understanding, little research investigates social interaction from the perspective of two or more SNSs, and the interplay of interaction among SNSs has remained unknown. In this paper, we aim to explore whether interaction building in a new SNS hinders the interaction frequency in an existing site, and if so, what kinds of users and relationships' interactions are more or less likely to be affected. For these questions, we sampled 7,015 pairs of overlapping identities, 23,590 pairs of overlapping relationships and 6,771 pairs of overlapping interactions from Weibo and Douban and performed analysis by combining multiple methods such as Regression Discontinuity Design and a Random-effects Negative Binomial Regression model. Our results suggest that, whether viewed from the perspective of individuals or from the perspective of relationships, interaction construction in a new SNS is detrimental to interaction frequency in an existing site. Based on our findings, we also propose several valuable insights about how to enhance social interaction and promote its retention when users are engaged in interaction across multiple platforms. 1
MULTI-TASK SELF-SUPERVISED PRE-TRAINING FOR MUSIC CLASSIFICATION Wu, HH; Kao, CC; Tang, QM; Sun, M; McFee, B; Bello, JP; Wang, C Deep learning is very data hungry, and supervised learning especially requires massive labeled data to work well. Machine listening research often suffers from limited labeled data problem, as human annotations are costly to acquire, and annotations for audio are time consuming and less intuitive. Besides, models learned from labeled dataset often embed biases specific to that particular dataset. Therefore, unsupervised learning techniques become popular approaches in solving machine listening problems. Particularly, a self-supervised learning technique utilizing reconstructions of multiple hand-crafted audio features has shown promising results when it is applied to speech domain such as emotion recognition and automatic speech recognition (ASR). In this paper, we apply self-supervised and multi-task learning methods for pre-training music encoders, and explore various design choices including encoder architectures, weighting mechanisms to combine losses from multiple tasks, and worker selections of pretext tasks. We investigate how these design choices interact with various downstream music classification tasks. We find that using various music specific workers altogether with weighting mechanisms to balance the losses during pre-training helps improve and generalize to the downstream tasks. 1
Data Integration and Machine Learning: A Natural Synergy Dong, XL; Rekatsinas, T As data volume and variety have increased, the ties between machine learning and data integration have grown stronger. For machine learning to be effective, one must utilize data from the greatest possible variety of sources; and this is why data integration plays a key role. At the same time, machine learning is driving automation in data integration, resulting in overall reduction of integration costs and improved accuracy. This tutorial focuses on three aspects of the synergistic relationship between data integration and machine learning: (1) we survey how state-of-the-art data integration solutions rely on machine learning-based approaches for accurate results and effective human-in-the-loop pipelines, (2) we review how end-to-end machine learning applications rely on data integration to identify accurate, clean, and relevant data for their analytics exercises, and (3) we discuss open research challenges and opportunities that span across data integration and machine learning. 1
The Cyclicality of On-the-Job Search Effort Ahn, HJ; Shao, L This paper provides new evidence for cyclicality in the job-search effort of employed workers, on-the-job search (OJS) intensity, in the U.S., using the American Time Use Survey and various cyclical indicators. We find that the probability that an employed worker engages in OJS is statistically significantly countercyclical, while the time an employed job seeker spends on OJS is weakly countercyclical. The fear of job loss, employment uncertainty, and workers' financial situations are crucial in the job search decision of employed individuals. The results imply that the precautionary motive might be the key driver of the countercyclicality in OJS intensity. 1
Approximation algorithms for scheduling C-benevolent jobs on weighted machines Yu, G; Jacobson, SH This article considers a new variation of the online interval scheduling problem, which consists of scheduling C-benevolent jobs on multiple heterogeneous machines with different positive weights. The reward for completing a job assigned to a machine is given by the product of the job value and the machine weight. The objective of this scheduling problem is to maximize the total reward for completed jobs. Two classes of approximation algorithms are analyzed, Cooperative Greedy algorithms and Prioritized Greedy algorithms, with competitive ratios provided. We show that when the weight ratios between machines are small, the Cooperative Greedy algorithm outperforms the Prioritized Greedy algorithm. As the weight ratios increase, the Prioritized Greedy algorithm outperforms the Cooperative Greedy algorithm. Moreover, as the weight ratios approach infinity, the competitive ratio of the Prioritized Greedy algorithm approaches four. We also provide lower bounds of 3/2 and 9/7 on the competitive ratio of any deterministic algorithm for scheduling C-benevolent jobs on two and three machines with arbitrary weights, respectively. 1
eSNAP: Enabling Sensor Network Automatic Positioning in IoT Lighting Systems Abboud, K; Li, Y; Bermudez, S The configuration of sensor nodes in Internet-of-Things (IoT) systems is a time-consuming and labor-intensive process, often due to the lack of user interfaces in embedded sensor devices or the large number of nodes in a network. A crucial step in configuring an IoT node is mapping its identification (ID) to its physical location within the deployment area. This step is more pronounced in lighting systems, where both lighting control and sensing data need to be supported. In this article, we propose eSNAP, an ID-location mapping method that enables the automatic configuration of wireless sensor nodes in connected lighting systems. Such mapping allows setting up and maintaining a secure and reliable IoT network with little human intervention, while providing valuable contextual information for the sensor data, which is critical in most IoT applications, including lighting. Our proposed method combines available digitized building planning/design information with theories of the Euclidean distance matrices and combinatorial optimization to enable the automatic configuration of IoT nodes. Furthermore, we leverage the channel and time diversities of the received signal measurements obtained by off-the-shelf wireless RF modules to enhance the positioning accuracy over time. We evaluate and validate the proposed method using state-of-the-art wireless mesh networking standards on two real-world setups: 1) a 30-node lab setup implemented using the Bluetooth mesh standard-compliant embedded platform and 2) a real-world connected lighting system based on ZigBee mesh in a corporate office building. 1
A 0.44-µJ/dec, 39.9-µs/dec, Recurrent Attention In-Memory Processor for Keyword Spotting Dbouk, H; Gonugondla, SK; Sakr, C; Shanbhag, NR This article presents a deep learning-based classifier IC for keyword spotting (KWS) in 65-nm CMOS designed using an algorithm-hardware co-design approach. First, a recurrent attention model (RAM) algorithm for the KWS task (the KeyRAM algorithm) is proposed. The KeyRAM algorithm enables accuracy versus energy scalability via a confidence-based computation (CC) scheme, leading to a 2.5x reduction in computational complexity compared to state-of-the-art (SOTA) neural networks, and is well-suited for in-memory computing (IMC) since the bulk (89%) of its computations are 4-b matrix-vector multiplies. The KeyRAM IC comprises a multi-bit multi-bank IMC architecture with a digital co-processor. A sparsity-aware summation scheme is proposed to alleviate the challenge faced by IMCs when summing sparse activations. The digital co-processor employs diagonal major weight storage to compute without any stalls. This combination of the IMC and digital processors enables a balanced tradeoff between energy efficiency and high accuracy computation. The resultant KWS IC achieves SOTA decision latency of 39.9 µs with a decision energy <0.5 µJ/dec, which translates to more than 24x savings in the energy-delay product (EDP) of decisions over existing KWS ICs. 1
On the Redundancy in the Rank of Neural Network Parameters and Its Controllability Lee, C; Kim, YB; Ji, H; Lee, Y; Hur, Y; Lim, H In this paper, we show that parameters of a neural network can have redundancy in their ranks, both theoretically and empirically. When viewed as a function from one space to another, neural networks can exhibit feature correlation and slower training due to this redundancy. Motivated by this, we propose a novel regularization method to reduce the redundancy in the rank of parameters. It is a combination of an objective function that makes the parameter rank-deficient and a dynamic low-rank factorization algorithm that gradually reduces the size of this parameter by fusing linearly dependent vectors together. This regularization-by-pruning approach leads to a neural network with better training dynamics and fewer trainable parameters. We also present experimental results that verify our claims. When applied to a neural network trained to classify images, this method provides statistically significant improvement in accuracy and 7.1 times speedup in terms of number of steps required for training. Furthermore, this approach has the side benefit of reducing the network size, which led to a model with 30.65% fewer trainable parameters. 1
Shape-Sphere: A metric space for analysing time series by their shape Kowsar, Y; Moshtaghi, M; Velloso, E; Bezdek, JC; Kulik, L; Leckie, C Shape analogy is a key technique in analyzing time series. That is, time series are compared by how much they look alike. This concept has been applied for many years in geometry. Notably, none of the current techniques describe a time series as a geometric curve that is expressed by its relative location and form in space. To fill this gap, we introduce Shape-Sphere, a vector space where time series are presented as points on the surface of a sphere. We prove a pseudo-metric property for distances in Shape-Sphere. We show how to describe the average shape of a time series set using the pseudo-metric property of Shape-Sphere by deriving a centroid from the set. We demonstrate the effectiveness of the pseudo-metric property and its centroid in capturing the 'shape' of a time series set, using two important machine learning techniques, namely: Nearest Centroid Classifier and K-Means clustering, using 85 publicly available data sets. Shape-Sphere improves the nearest centroid classification results when the shape is the differentiating feature while keeping the quality of clustering equivalent to current state-of-the-art techniques. (c) 2021 Elsevier Inc. All rights reserved. 1
Design of a Compact Omnidirectional Leaky-Wave Antenna Fed by Higher Order Mode Fu, YH; Gong, L; Chan, KY; Huang, S; Nanzer, JA; Ramer, R This article presents a novel leaky-wave antenna (LWA) based on a dielectric-filled rectangular waveguide (RWG) and fed by a higher order mode. The proposed quasi-uniform LWA has a compact configuration that is suitable for microwave applications, featuring omnidirectional radiation patterns and low cross-polarization. In the frequency band of interest, the higher order mode, i.e., the TM11 mode, performs as a fast wave in the proposed quasi-uniform LWA, radiating via the transverse slots etched on the walls. The advantage of the TM11 mode for LWA applications is analyzed and demonstrated. This higher order mode is excited via a mode convertor from an inline coaxial line to the dielectric-filled RWG. To reduce sidelobe levels at high frequencies, a -25 dB Taylor amplitude distribution is applied to the etched slots. The dielectric-filled LWA with the inline coaxial feeding is fabricated and measured. The measurement is consistent with the simulation, showing frequency-driven beam-scanning capability with low cross-polarization in the elevation plane. The simulated and measured radiation patterns in the azimuth plane illustrate the omnidirectional radiation patterns and thus confirm the radiating performance of the TM11 mode in the proposed LWA. The advantages of the scanned beams and omnidirectional radiations are promising for radar systems. 1
Impartial Predictive Modeling and the Use of Proxy Variables Johnson, KD; Foster, DP; Stine, RA Fairness-aware data mining (FADM) aims to prevent algorithms from discriminating against protected groups. The literature has come to an impasse as to what constitutes explainable variability as opposed to discrimination. This distinction hinges on a rigorous understanding of the role of proxy variables; i.e., those variables which are associated with both the protected feature and the outcome of interest. We demonstrate that fairness is achieved by ensuring impartiality with respect to sensitive characteristics and provide a framework for impartiality by accounting for different perspectives on the data generating process. In particular, fairness can only be precisely defined in a full-data scenario in which all covariates are observed. We then analyze how these models may be conservatively estimated via regression in partial-data settings. Decomposing the regression estimates provides insights into previously unexplored distinctions between explainable variability and discrimination that illuminate the use of proxy variables in fairness-aware data mining. 1
Solutions with performance guarantees on tactical decisions for industrial gas network problems Cay, P; Esmali, A; Mancilla, C; Storer, RH; Zuluaga, LF In the gas distribution industry, creating a tactical strategy to meet customer demand while meeting the physical constraints in a gas pipeline network leads to complex and challenging optimization problems due to the non-linearity, non-convexity, and combinatorial nature of the corresponding mathematical formulation of the problem. In this article, we study the performance of different approaches presented in the literature to solve both natural gas and industrial gas problems to either find global optimal solutions or determine the optimality gap between a local optimal solution and a valid lower bound for the problem's objective. In addition to those considered in the literature, we consider alternative reformulations of the operational-level gas pipeline optimization problem. The performance of these alternative reformulations varies in terms of the optimality gap provided for a feasible solution of the problem and their solution time. In industry-sized problem instances, significant improvements are possible compared to solving the standard formulation of the problem. 1
Sparse Representations of Positive Functions via First- and Second-Order Pseudo-Mirror Descent Chakraborty, A; Rajawat, K; Koppel, A We consider expected risk minimization problems when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search space is a Reproducing Kernel Hilbert Space (RKHS). We develop first and second-order variants of stochastic mirror descent employing (i) pseudo-gradients and (ii) complexity-reducing projections. Compressive projection in the first-order scheme is executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with the iteration index in the stochastic setting. Moreover, pseudo-gradients are needed when gradient estimates for cost are only computable up to some numerical error, which arise in, e.g., integral approximations. Under constant step-size and compression budget, we establish tradeoffs between the radius of convergence of the expected sub-optimality and the projection budget parameter, as well as non-asymptotic bounds on the model complexity. To refine the solution's precision, we develop a second-order extension which employs recursively averaged pseudo-gradient outer-products to approximate the Hessian inverse, whose convergence in mean is established under an additional eigenvalue decay condition on the Hessian of the optimal RKHS element, which is unique to this work. Experiments demonstrate favorable performance on inhomogeneous Poisson Process intensity estimation in practice. 1
Recommendations as treatments Joachims, T; London, B; Su, Y; Swaminathan, A; Wang, LQ In recent years, a new line of research has taken an interventional view of recommender systems, where recommendations are viewed as actions that the system takes to have a desired effect. This interventional view has led to the development of counterfactual inference techniques for evaluating and optimizing recommendation policies. This article explains how these techniques enable unbiased offline evaluation and learning despite biased data, and how they can inform considerations of fairness and equity in recommender systems. 1
What Aspects of Formality Do Workers Value? Evidence from a Choice Experiment in Bangladesh Mahmud, M; Gutierrez, IA; Kumar, KB; Nataraj, S This study uses a choice experiment among 2,000 workers in Bangladesh to elicit willingness to pay (WTP) for job attributes: a contract, termination notice, working hours, paid leave, and a pension fund. Using a stated preference method allows calculation of WTP for benefits in this setting, despite the lack of data on worker transitions, and the fact that many workers are self-employed, which makes it difficult to use revealed preference methods. Workers highly value job stability: the average worker would be willing to forgo a 27 percent increase in income to obtain a one-year contract (relative to no contract), or to forgo a 12 percent increase to obtain thirty days of termination notice. There is substantial heterogeneity in WTP by type of employment and gender: women value shorter working hours more than men, while government workers place a higher value on contracts than do private-sector employees. 1
Nowcasting in real time using popularity priors Monokroussos, G; Zhao, YC We construct a Google Recession Index (GRI) using Google Trends data on internet search popularity, which tracks the public's attention to recession-related keywords in real time. We then compare nowcasts made with and without this index using both a standard dynamic factor model and a Bayesian approach with alternative prior setups. Our results indicate that using the Bayesian model with GRI-based popularity priors, we could identify the 2008Q3 turning point in real time, without sacrificing the accuracy of the nowcasts over the rest of the sample periods. (C) 2020 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved. 1
GhostLink: Latent Network Inference for Influence-aware Recommendation Mukherjee, S; Gunnemann, S Social influence plays a vital role in shaping a user's behavior in online communities dealing with items of fine taste like movies, food, and beer. For online recommendation, this implies that users' preferences and ratings are influenced due to other individuals. Given only time-stamped reviews of users, can we find out who-influences-whom, and characteristics of the underlying influence network? Can we use this network to improve recommendation? While prior works in social-aware recommendation have leveraged social interaction by considering the observed social network of users, many communities like Amazon, Beeradvocate, and Ratebeer do not have explicit user-user links. Therefore, we propose GhostLink, an unsupervised probabilistic graphical model, to automatically learn the latent influence network underlying a review community given only the temporal traces (timestamps) of users' posts and their content. Based on extensive experiments with four real-world datasets with 13 million reviews, we show that GhostLink improves item recommendation by around 23% over state-of-the-art methods that do not consider this influence. As additional use-cases, we show that GhostLink can be used to differentiate between users' latent preferences and influenced ones, as well as to detect influential users based on the learned influence graph. 1
Sparse Multi-Path Corrections in Fringe Projection Profilometry Zhang, Y; Lau, D; Wipf, D Three-dimensional scanning by means of structured light illumination is an active imaging technique involving projecting and capturing a series of striped patterns and then using the observed warping of stripes to reconstruct the target object's surface through triangulating each pixel in the camera to a unique projector coordinate corresponding to a particular feature in the projected patterns. The undesirable phenomenon of multi-path occurs when a camera pixel simultaneously sees features from multiple projector coordinates. Bimodal multi-path is a particularly common situation found along step edges, where the camera pixel sees both a foreground and background surface. Generalized from bimodal multi-path, this paper examines the phenomenon of sparse or N-modal multi-path as a more general case, where the camera pixel sees no fewer than two reflective surfaces, resulting in decoding errors. Using fringe projection profilometry, our proposed solution is to treat each camera pixel as an underdetermined linear system of equations and to find the sparsest (least number of paths) solution by taking an application-specific Bayesian learning approach. We validate this algorithm with both simulations and a number of challenging real-world scenarios, demonstrating that it outperforms state-of-the-art techniques. 1
A novel risk model for mortality and hospitalization following cardiac resynchronization therapy in patients with non-ischemic cardiomyopathy: the alpha-score Yang, SW; Liu, ZM; Hu, YR; Jing, R; Gu, M; Niu, HX; Ding, LG; Xing, AL; Zhang, S; Hua, W Background: Non-ischemic cardiomyopathy (NICM) has been associated with a better left ventricle reverse remodeling response and improved clinical outcomes after cardiac resynchronization therapy (CRT). The aims of our study were to identify the predictors of mortality and heart failure hospitalization in patients treated with CRT and to design a risk score for prognosis. Methods: A cohort of 422 consecutive NICM patients with CRT was retrospectively enrolled between January 2010 and December 2017. The primary endpoint was all-cause mortality and heart transplantation. Results: In a multivariate analysis, the predictors of all-cause death were left atrial diameter [Hazard ratio (HR): 1.056, 95% confidence interval (CI): 1.020-1.093, P=0.002]; non-left bundle branch block [HR: 1.793, 95% CI: 1.131-2.844, P=0.013]; high-sensitivity C-reactive protein [HR: 1.081, 95% CI: 1.029-1.134, P=0.002]; N-terminal pro-B-type natriuretic peptide [HR: 1.018, 95% CI: 1.007-1.030, P=0.002]; and New York Heart Association class IV [HR: 1.018, 95% CI: 1.007-1.030, P=0.002]. The Alpha-score (Atrial diameter, non-LBBB, Pro-BNP, Hs-CRP, NYHA class IV) was derived from each independent risk factor. The novel score had good calibration (Hosmer-Lemeshow test, P>0.05) and discrimination for both the primary endpoint [c-statistic: 0.749 (95% CI: 0.694-0.804), P<0.001] and heart failure hospitalization [c-statistic: 0.692 (95% CI: 0.639-0.745), P<0.001]. Conclusion: The Alpha-score may enable improved discrimination and accurate prediction of long-term outcomes among NICM patients with CRT. 1
Using the new VPMADD instructions for the new post quantum key encapsulation mechanism SIKE Gueron, S; Kostic, D This paper demonstrates the use of the new processor instructions VPMADD, intended to appear in the coming generation of Intel processors (codename Cannon Lake), in order to accelerate the newly proposed key encapsulation mechanism (KEM) named SIKE. SIKE is one of the submissions to the NIST standardization process on post-quantum cryptography, and is based on pseudo-random walks in supersingular isogeny graphs. While very small keys are the main advantage of SIKE, its extreme computational intensiveness makes it one of the slowest KEM proposals. Performance optimizations are needed. We address here the Level 1 parameters that target 64-bit quantum security, which are deemed sufficient for the NIST standardization effort. Thus, we focus on SIKE503, which operates over F_{p^2} with a 503-bit prime p. These short operands pose a significant challenge for using VPMADD effectively. We demonstrate several optimization methods to accelerate the F_p, F_{p^2}, and elliptic curve arithmetic, and predict a potential speedup by a factor of 1.72x. 1
New Evidence on National Board Certification as a Signal of Teacher Quality Horoi, I; Bhai, M Using longitudinal data from North Carolina that contains detailed identifiers, we estimate the effect of having a National Board for Professional Teaching Standards (NBPTS) teacher on academic achievement. We identify the effects of an NBPTS teacher by exploiting multiple sources of variation, including traditional-lagged achievement models, twin- and sibling-fixed effects, and aggregate grade-level variation. Our preferred estimates show that students taught by National Board certified teachers have higher math and reading scores by 0.04 and 0.01 of a standard deviation, respectively. We find that an NBPTS math teacher increases the present value of students' lifetime income by $48,000. (JEL I20, I21, J2444) 1
Linguistic Resources for Bhojpuri, Magahi, and Maithili: Statistics about Them, Their Similarity Estimates, and Baselines for Three Applications Mundotiya, RK; Singh, MK; Kapur, R; Mishra, S; Singh, AK Corpus preparation for low-resource languages and for development of human language technology to analyze or computationally process them is a laborious task, primarily due to the unavailability of expert linguists who are native speakers of these languages and also due to the time and resources required. Bhojpuri, Magahi, and Maithili, languages of the Purvanchal region of India (in the north-eastern parts), are low-resource languages belonging to the Indo-Aryan (or Indic) family. They are closely related to Hindi, which is a relatively high-resource language, which is why we compare them with Hindi. We collected corpora for these three languages from various sources and cleaned them to the extent possible, without changing the data in them. The text belongs to different domains and genres. We calculated some basic statistical measures for these corpora at character, word, syllable, and morpheme levels. These corpora were also annotated with parts-of-speech (POS) and chunk tags. The basic statistical measures were both absolute and relative and were expected to indicate linguistic properties, such as morphological, lexical, phonological, and syntactic complexities (or richness). The results were compared with a standard Hindi corpus. For most of the measures, we tried to match the corpus size across the languages to avoid the effect of corpus size, but in some cases it turned out that using the full corpus was better, even if sizes were very different. Although the results are not very clear, we tried to draw some conclusions about the languages and the corpora. For POS tagging and chunking, the BIS tagset was used to manually annotate the data. The POS-tagged data sizes are 16,067, 14,669, and 12,310 sentences, respectively, for Bhojpuri, Magahi, and Maithili. The sizes for chunking are 9,695 and 1,954 sentences for Bhojpuri and Maithili, respectively. The inter-annotator agreement for these annotations, using Cohen's Kappa, was 0.92, 0.64, and 0.74, respectively, for the three languages. These (annotated) corpora have been used for developing preliminary automated tools, which include POS tagger, Chunker, and Language Identifier. We have also developed the Bilingual dictionary (Purvanchal languages to Hindi) and a Synset (that can be integrated later in the Indo-WordNet) as additional resources. The main contribution of the work is the creation of basic resources for facilitating further language processing research for these languages, providing some quantitative measures about them and their similarities among themselves and with Hindi. For similarities, we use a somewhat novel measure of language similarity based on an n-gram-based language identification algorithm. An additional contribution is providing baselines for three basic NLP applications (POS tagging, chunking, and language identification) for these closely related languages. 1
The advantage of truncated permutations Gilboa, S; Gueron, S Constructing a Pseudo Random Function (PRF) is a fundamental problem in cryptology. Such a construction, implemented by truncating the last m bits of permutations of {0, 1}^n, was suggested by Hall et al. (1998). They conjectured that the distinguishing advantage of an adversary with q queries, Adv_{n,m}(q), is small if q = o(2^{(n+m)/2}), established an upper bound on Adv_{n,m}(q) that confirms the conjecture for m < n/7, and also declared a general lower bound Adv_{n,m}(q) = Omega(q^2 / 2^{n+m}). The conjecture was essentially confirmed by Bellare and Impagliazzo (1999). Nevertheless, the problem of estimating Adv_{n,m}(q) remained open. Combining the trivial bound 1, the birthday bound, and a result of Stam (1978) leads to the upper bound Adv_{n,m}(q) = O(min{q(q-1)/2^n, q/2^{n+m/2}, 1}). In this paper we show that this upper bound is tight for every 0 <= m < n and any q. This, in turn, verifies that the converse to the conjecture of Hall et al. is also correct, i.e., that Adv_{n,m}(q) is negligible only for q = o(2^{(n+m)/2}). (C) 2021 Elsevier B.V. All rights reserved. 1
Block-distributed Gradient Boosted Trees Vasiloudis, T; Cho, H; Bostrom, H The Gradient Boosted Tree (GBT) algorithm is one of the most popular machine learning algorithms used in production, for tasks that include Click-Through Rate (CTR) prediction and learning-to-rank. To deal with the massive datasets available today, many distributed GBT methods have been proposed. However, they all assume a row-distributed dataset, addressing scalability only with respect to the number of data points and not the number of features, and increasing communication cost for high-dimensional data. In order to allow for scalability across both the data point and feature dimensions, and reduce communication cost, we propose block-distributed GBTs. We achieve communication efficiency by making full use of the data sparsity and adapting the Quickscorer algorithm to the block-distributed setting. We evaluate our approach using datasets with millions of features, and demonstrate that we are able to achieve multiple orders of magnitude reduction in communication cost for sparse data, with no loss in accuracy, while providing a more scalable design. As a result, we are able to reduce the training time for high-dimensional data, and allow more cost-effective scale-out without the need for expensive network communication. 1
Fast and Effective Distribution-Key Recommendation for Amazon Redshift Parchas, P; Naamad, Y; Van Bouwel, P; Faloutsos, C; Petropoulos, M How should we split data among the nodes of a distributed data warehouse in order to boost performance for a forecasted workload? In this paper, we study the effect of different data partitioning schemes on the overall network cost of pairwise joins. We describe a generally-applicable data distribution framework initially designed for Amazon Redshift, a fully-managed petabyte-scale data warehouse in the cloud. To formalize the problem, we first introduce the Join Multi-Graph, a concise graph-theoretic representation of the workload history of a cluster. We then formulate the Distribution-Key Recommendation problem - a novel combinatorial problem on the Join Multi-Graph - and relate it to problems studied in other subfields of computer science. Our theoretical analysis proves that Distribution-Key Recommendation is NP-complete and is hard to approximate efficiently. Thus, we propose BAW, a hybrid approach that combines heuristic and exact algorithms to find a good data distribution scheme. Our extensive experimental evaluation on real and synthetic data showcases the efficacy of our method in recommending optimal (or close-to-optimal) distribution keys, which improve cluster performance by reducing network cost by up to 32x in some real workloads. 1
Unsupervised Bitext Mining and Translation via Self-Trained Contextual Embeddings Keung, P; Salazar, J; Lu, YC; Smith, NA We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text. We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training. We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods. We then improve an XLM-based unsupervised neural MT system pre-trained on Wikipedia by supplementing it with pseudo-parallel text mined from the same corpus, boosting unsupervised translation performance by up to 3.5 BLEU on the WMT'14 French-English and WMT'16 German-English tasks and outperforming the previous state-of-the-art. Finally, we enrich the IWSLT'15 English-Vietnamese corpus with pseudo-parallel Wikipedia sentence pairs, yielding a 1.2 BLEU improvement on the low-resource MT task. We demonstrate that unsupervised bitext mining is an effective way of augmenting MT datasets and complements existing techniques like initializing with pre-trained contextual embeddings. 1
Understanding Neural Networks and Individual Neuron Importance via Information-Ordered Cumulative Ablation Amjad, RA; Liu, KR; Geiger, BC In this work, we investigate the use of three information-theoretic quantities--entropy, mutual information with the class variable, and a class selectivity measure based on Kullback-Leibler (KL) divergence--to understand and study the behavior of already trained fully connected feedforward neural networks (NNs). We analyze the connection between these information-theoretic quantities and classification performance on the test set by cumulatively ablating neurons in networks trained on MNIST, FashionMNIST, and CIFAR-10. Our results parallel those recently published by Morcos et al., indicating that class selectivity is not a good indicator for classification performance. However, looking at individual layers separately, both mutual information and class selectivity are positively correlated with classification performance, at least for networks with ReLU activation functions. We provide explanations for this phenomenon and conclude that it is ill-advised to compare the proposed information-theoretic quantities across layers. Furthermore, we show that cumulative ablation of neurons with ascending or descending information-theoretic quantities can be used to formulate hypotheses regarding the joint behavior of multiple neurons, such as redundancy and synergy, with comparably low computational cost. We also draw connections to the information bottleneck theory for NNs. 1
BCFA: Bespoke Control Flow Analysis for CFA at Scale Ramu, R; Upadhyaya, GB; Nguyen, HA; Rajan, H Many data-driven software engineering tasks such as discovering programming patterns, mining API specifications, etc., perform source code analysis over control flow graphs (CFGs) at scale. Analyzing millions of CFGs can be expensive and performance of the analysis heavily depends on the underlying CFG traversal strategy. State-of-the-art analysis frameworks use a fixed traversal strategy. We argue that a single traversal strategy does not fit all kinds of analyses and CFGs and propose bespoke control flow analysis (BCFA). Given a control flow analysis (CFA) and a large number of CFGs, BCFA selects the most efficient traversal strategy for each CFG. BCFA extracts a set of properties of the CFA by analyzing the code of the CFA and combines it with properties of the CFG, such as branching factor and cyclicity, for selecting the optimal traversal strategy. We have implemented BCFA in Boa, and evaluated BCFA using a set of representative static analyses that mainly involve traversing CFGs and two large datasets containing 287 thousand and 162 million CFGs. Our results show that BCFA can speed up the large-scale analyses by 1%-28%. Further, BCFA has low overheads (less than 0.2%) and a low misprediction rate (less than 0.01%). 1
Identification and estimation of a triangular model with multiple endogenous variables and insufficiently many instrumental variables Huang, LQ; Khalil, U; Yildiz, N We develop a novel identification method for a partially linear model with multiple endogenous variables of interest but a single instrumental variable, which could even be binary. We present an easy-to-implement consistent estimator for the parametric part. This estimator retains the root-n convergence rate and asymptotic normality even though we have a generated regressor in our setup. The nonparametric part of the model is also identified. We also outline how our identification strategy can be extended to a fully non-parametric model. Finally, we use our methods to assess the impact of smoking during pregnancy on birth weight. (C) 2018 Elsevier B.V. All rights reserved. 1
Methods of Assessing the Rank Order of Prediction Models with Respect to Variance of Listening Test Ratings Pearce, A; Isabelle, S; Francois, H; Oh, E Although predictive models are widely used to predict the results of listening tests, there are currently no standardized statistical metrics for assessing the rank order, and commonly used rank order metrics do not consider the variance of the listening test data. This paper proposes two novel metrics for assessing rank order with respect to variance by adapting Spearman's Rho and Kendall's Tau and assesses the performance of these metrics against actual listening test data with standardized prediction models. 1
Bayesian Meta-Prior Learning Using Empirical Bayes Nabi, S; Nassif, H; Hong, J; Mamani, H; Imbens, G Adding domain knowledge to a learning system is known to improve results. In multiparameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, the various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in operation management and management science applications are the absence of informative priors and the inability to control parameter learning rates. In this study, we propose a hierarchical empirical Bayes approach that addresses both challenges and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a generalized linear model. Because the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our empirical Bayes method clamps features in each group together and uses the deployed model's observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem as well as an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. During both simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising because optimizing over sparse data is often a challenge. 1
A Constructive Method for Designing Safe Multirate Controllers for Differentially-Flat Systems Agrawal, DR; Parwana, H; Cosner, RK; Rosolia, U; Ames, AD; Panagou, D We present a multi-rate control architecture that leverages fundamental properties of differential flatness to synthesize controllers for safety-critical nonlinear dynamical systems. We propose a two-layer architecture, where the high-level generates reference trajectories using a linear Model Predictive Controller, and the low-level tracks this reference using a feedback controller. The novelty lies in how we couple these layers, to achieve formal guarantees on recursive feasibility of the MPC problem, and safety of the nonlinear system. Furthermore, using differential flatness, we provide a constructive means to synthesize the multi-rate controller, thereby removing the need to search for suitable Lyapunov or barrier functions, or to approximately linearize/discretize nonlinear dynamics. We show the synthesized controller is a convex optimization problem, making it amenable to real-time implementations. The method is demonstrated experimentally on a ground rover and a quadruped robotic system. 1
Med2Meta: Learning Representations of Medical Concepts with Meta-embeddings Chowdhury, S; Zhang, CW; Yu, PS; Luo, Y Distributed representations of medical concepts have been used to support downstream clinical tasks recently. Electronic Health Records (EHR) capture different aspects of patients' hospital encounters and serve as a rich source for augmenting clinical decision making by learning robust medical concept embeddings. However, the same medical concept can be recorded in different modalities (e.g., clinical notes, lab results) - with each capturing salient information unique to that modality - and a holistic representation calls for relevant feature ensemble from all information sources. We hypothesize that representations learned from heterogeneous data types would lead to performance enhancement on various clinical informatics and predictive modeling tasks. To this end, our proposed approach makes use of meta-embeddings, embeddings aggregated from learned embeddings. Firstly, modality-specific embeddings for each medical concept are learned with graph autoencoders. The ensemble of all the embeddings is then modeled as a meta-embedding learning problem to incorporate their correlating and complementary information through a joint reconstruction. Empirical results of our model on both quantitative and qualitative clinical evaluations have shown improvements over state-of-the-art embedding models, thus validating our hypothesis. 1
Sequential optimization with particle splitting-based reliability assessment for engineering design under uncertainties Zhuang, XT; Pan, R; Sun, Q The evaluation of probabilistic constraints plays an important role in reliability-based design optimization. Traditional simulation methods such as Monte Carlo simulation can provide highly accurate results, but they are often computationally intensive to implement. To improve the computational efficiency of the Monte Carlo method, this article proposes a particle splitting approach, a rare-event simulation technique that evaluates probabilistic constraints. The particle splitting-based reliability assessment is integrated into the iterative steps of design optimization. The proposed method provides an enhancement of subset simulation by increasing sample diversity and producing a stable solution. This method is further extended to address the problem with multiple probabilistic constraints. The performance of the particle splitting approach is compared with the most probable point based method and other approximation methods through examples. 1
Asymptotic analysis for multi-objective sequential stochastic assignment problems Yu, G; Jacobson, SH; Kiyavash, N We provide an asymptotic analysis of multi-objective sequential stochastic assignment problems (MOSSAP). In MOSSAP, a fixed number of tasks arrive sequentially, with an n-dimensional value vector revealed upon arrival. Each task is assigned to one of a group of known workers immediately upon arrival, with the reward given by an n-dimensional product-form vector. The objective is to maximize each component of the expected reward vector. We provide expressions for the asymptotic expected reward per task for each component of the reward vector and compare the convergence rates for three classes of Pareto optimal policies. 1
Deep Reinforcement Learning-Based Irrigation Scheduling Yang, Y; Hu, J; Porter, D; Marek, T; Heflin, K; Kong, H; Sun, L Machine learning has been widely applied in many areas, with promising results and large potential. In this article, deep reinforcement learning-based irrigation scheduling is proposed. This approach can automate the irrigation process and can achieve highly precise water application that results in higher simulated net return. Using this approach, the irrigation controller can automatically determine the optimal or near-optimal water application amount. Traditional reinforcement learning can be superior to traditional periodic and threshold-based irrigation scheduling. However, traditional reinforcement learning fails to accurately represent a real-world irrigation environment due to its limited state space. Compared with traditional reinforcement learning, the deep reinforcement learning method can better model a real-world environment based on multi-dimensional observations. Simulations for various weather conditions and crop types show that the proposed deep reinforcement learning irrigation scheduling can increase net return. 1
Evaluating teen options for preventing pregnancy: Impacts and mechanisms Luca, DL; Stevens, J; Rotz, D; Goesling, B; Lutz, R This paper presents findings from an experimental evaluation of the Teen Options to Prevent Pregnancy (TOPP) program, an 18-month intervention that consists of a unique combination of personalized contraceptive counseling, facilitated access to contraceptive services, and referrals to social services. We find that TOPP led to large and statistically significant increases in the use of long-acting reversible contraceptives (LARCs), accompanied by substantial reductions in repeat and unintended pregnancy among adolescent mothers. We provide an exploratory analysis of the channels through which TOPP achieved its impacts on contraceptive behavior and pregnancy outcomes. A back-of-the-envelope decomposition implies that the increase in LARC use can explain at most one-third of the reduction in repeat pregnancy. We provide suggestive evidence that direct access to contraceptive services was important for increasing LARC use and reducing repeat pregnancy. We did not find any spillover effects on non-targeted outcomes, such as educational attainment and benefit receipt. (c) 2021 Elsevier B.V. All rights reserved. 1
An HTM-Based Update-side Synchronization for RCU on NUMA systems Park, S; McKenney, PE; Dufour, L; Yeom, HY Read-copy update (RCU) can provide ideal scalability for read-mostly workloads, but some believe that it provides only poor performance for updates. This belief is due to the lack of RCU-centric update synchronization mechanisms. RCU instead works with a range of update-side mechanisms, such as locking. In fact, many developers embrace simplicity by using global locking. Logging, hardware transactional memory, or fine-grained locking can provide better scalability, but each of these approaches has limitations, such as imposing overhead on readers or poor scalability on nonuniform memory access (NUMA) systems, mainly due to their lack of NUMA-aware design principles. This paper introduces an RCU extension (RCX) that provides highly scalable RCU updates on NUMA systems while retaining RCU's read-side benefits. RCX is a software-based synchronization mechanism combining hardware transactional memory (HTM) and traditional locking based on our NUMA-aware design principles for RCU. Micro-benchmarks on a NUMA system having 144 hardware threads show RCX has up to 22.6 times better performance and up to 145 times lower HTM abort rates compared to a state-of-the-art RCU/HTM combination. To demonstrate the effectiveness and applicability of RCX, we have applied RCX to parallelize some of the Linux kernel memory management system and an in-memory database system. The optimized kernel and the database show up to 24 and 17 times better performance compared to the original version, respectively. 1
Orbiting radiation stars Foster, DP; Langford, J; Perez-Giz, G We study a spherically symmetric solution to the Einstein equations in which the source, which we call an orbiting radiation star (OR-star), is a compact object consisting of freely falling null particles. The solution avoids quantum scale regimes and hence neither relies upon nor ignores the interaction of quantum mechanics and gravitation. The OR-star spacetime exhibits a deep gravitational well yet remains singularity free. In fact, it is geometrically flat in the vicinity of the origin, with the flat region being of any desirable scale. The solution is observationally distinct from a black hole because a photon from infinity aimed at an OR-star escapes to infinity with a time delay. 1
When and Where: Behavior Dominant Location Forecasting with Micro-blog Streams Gautam, B; Basava, A; Singh, A; Agrawal, A The proliferation of smartphones and wearable devices has increased the availability of large amounts of geospatial streams to provide significant automated discovery of knowledge in pervasive environments, but the most prominent information related to changing interests has not yet been adequately capitalized on. In this paper, we provide a novel algorithm to exploit the dynamic fluctuations in a user's points-of-interest while forecasting the future place of visit with fine granularity. Our proposed algorithm is based on the dynamic formation of collective personality communities using different languages, opinions, geographical and temporal distributions for finding out optimized equivalent content. We performed extensive empirical experiments involving real-time streams derived from 0.6 million stream tuples of micro-blogs, comprising 1,945 social persons fused with a graph algorithm and a feed-forward neural network model as the predictive classifier. Lastly, the framework achieves 62.10% mean average precision on 120,000 embeddings of unlabeled users and, surprisingly, an 85.92% improvement over the state-of-the-art approach. 1
Optimal multi-unit mechanisms with private demands Devanur, NR; Haghpanah, N; Psomas, A A seller can produce multiple units of a single good. The buyer has constant marginal value for each unit she receives up to a demand, and zero marginal value for units beyond the demand. The marginal value and the demand are drawn from a distribution and are privately known to the buyer. We show that under natural regularity conditions on the distribution, the optimal (revenue-maximizing) selling mechanism is deterministic. It is a price schedule that specifies the payment based on the number of units purchased. Further, under the same conditions, the revenue as a function of the price schedule is concave, which in turn implies that the optimal price schedule can be found in polynomial time. We give a more detailed characterization of the optimal prices when there are only two possible demands. (C) 2020 Elsevier Inc. All rights reserved. 1
Voting with random classifiers (VORACE): theoretical and experimental analysis Cornelio, C; Donini, M; Loreggia, A; Pini, MS; Rossi, F In many machine learning scenarios, looking for the best classifier that fits a particular dataset can be very costly in terms of time and resources. Moreover, it can require deep knowledge of the specific domain. We propose a new technique which does not require profound expertise in the domain and avoids the commonly used strategy of hyper-parameter tuning and model selection. Our method is an innovative ensemble technique that uses voting rules over a set of randomly-generated classifiers. Given a new input sample, we interpret the output of each classifier as a ranking over the set of possible classes. We then aggregate these output rankings using a voting rule, which treats them as preferences over the classes. We show that our approach obtains good results compared to the state-of-the-art, both providing a theoretical analysis and an empirical evaluation of the approach on several datasets. 1
CrossBERT: a Triplet Neural Architecture for Ranking Entity Properties Manotumruksa, J; Dalton, J; Meij, E; Yilmaz, E Task-based Virtual Personal Assistants (VPAs) such as the Google Assistant, Alexa, and Siri are increasingly being adopted for a wide variety of tasks. These tasks are grounded in real-world entities and actions (e.g., book a hotel, organise a conference, or requesting funds). In this work we tackle the task of automatically constructing actionable knowledge graphs in response to a user query in order to support a wider variety of increasingly complex assistant tasks. We frame this as an entity property ranking task given a user query with annotated properties. We propose a new method for property ranking, CrossBERT. CrossBERT builds on the Bidirectional Encoder Representations from Transformers (BERT) and creates a new triplet network structure on cross query-property pairs that is used to rank properties. We also study the impact of using external evidence for query entities from textual entity descriptions. We perform experiments on two standard benchmark collections, the NTCIR-13 Actionable Knowledge Graph Generation (AKGG) task and Entity Property Identification (EPI) task. The results demonstrate that CrossBERT significantly outperforms the best performing runs from AKGG and EPI, as well as previous state-of-the-art BERT-based models. In particular, CrossBERT significantly improves Recall and NDCG by approximately 2-12% over the BERT models across the two used datasets. 1
Asymptotic Optimality of Base-Stock Policies for Perishable Inventory Systems Bu, JZ; Gong, XT; Chao, XL We consider periodic review perishable inventory systems with a fixed product lifetime. Unsatisfied demand can be either lost or backlogged. The objective is to minimize the long-run average holding, penalty, and outdating cost. The optimal policy for these systems is notoriously complex and computationally intractable because of the curse of dimensionality. Hence, various heuristic replenishment policies are proposed in the literature, including the base-stock policy, which raises the total inventory level to a constant in each review period. Whereas various studies show near-optimal numerical performances of base-stock policies in the classic system with zero replenishment lead time and a first-in-first-out issuance policy, the results on their theoretical performances are very limited. In this paper, we first focus on this classic system and show that a simple base-stock policy is asymptotically optimal when any one of the product lifetime, demand population size, unit penalty cost, and unit outdating cost becomes large; moreover, its optimality gap converges to zero exponentially fast in the first two parameters. We then study two important extensions. For a system under a last-in-first-out or even an arbitrary issuance policy, we prove that a simple base-stock policy is asymptotically optimal with large product lifetime, large unit penalty costs, and large unit outdating costs, and for a backlogging system with positive lead times, we prove that our results continue to hold with large product lifetime, large demand population sizes, and large unit outdating costs. Finally, we provide a numerical study to demonstrate the performances of base-stock policies in these systems. 1
Noncontact Human Body Voltage Measurement Using Microsoft Kinect and Field Mill for ESD Applications Yong, SH; Hosseinbeig, A; Yang, SY; Heaney, MB; Pommerenke, D A technique for measuring the voltage on a person by combining an overhead mounted electrostatic field sensor with Microsoft Kinect image recognition software and hardware is presented in this paper. The Kinect's built-in skeleton detection method was used to determine the posture, height, and location of a person relative to the position of the field sensor. The voltage is estimated using field strength and geometry information. Two different algorithms are proposed for calculating the human body voltage using field strength data captured by the field sensor and human body coordinates recognized by the Kinect. First, using calibration data, the human body voltage was obtained by charging a person to a known voltage and walking through the field of vision, then a correction was performed to convert the field sensor data into the human body voltage. The second algorithm is based on solving the Laplace equation. The human body is modeled using a superposition of spheroids, and the E-field is solved analytically. Results of the algorithms were compared, with the proposed methodology providing less than a +/- 15% error margin for the human body voltage over a detection area of 2.0 m x 2.5 m. 1
On empirical likelihood option pricing Zhong, XL; Cao, J; Jin, Y; Zheng, AW The Black-Scholes model is the gold standard for pricing derivatives and options in the modern financial industry. However, this method imposes some parametric assumptions on the stochastic process, and its performance becomes doubtful when these assumptions are violated. This paper investigates the application of a nonparametric method, namely the empirical likelihood (EL) method, in the study of option pricing. A blockwise EL procedure is proposed to deal with dependence in the data. Simulation and real data studies show that this new method performs reasonably well and, more importantly, outperforms classical models developed to account for jumps and stochastic volatility, thanks to the fact that nonparametric methods capture information about higher-order moments. 1
Algorithms for the selection of fluorescent reporters Vaidyanathan, P; Appleton, E; Tran, D; Vahid, A; Church, G; Densmore, D Molecular biologists rely on the use of fluorescent probes to take measurements of their model systems. These fluorophores fall into various classes (e.g. fluorescent dyes, fluorescent proteins, etc.), but they all share some general properties (such as excitation and emission spectra, brightness) and require similar equipment for data acquisition. Selecting an ideal set of fluorophores for a particular measurement technology or vice versa is a multidimensional problem that is difficult to solve with ad hoc methods due to the enormous solution space of possible fluorophore panels. Choosing sub-optimal fluorophore panels can result in unreliable or erroneous measurements of biochemical properties in model systems. Here, we describe a set of algorithms, implemented in an open-source software tool, for solving these problems efficiently to arrive at fluorophore panels optimized for maximal signal and minimal bleed-through. Vaidyanathan et al. present a heuristic algorithm for the selection of fluorescent reporters in the context of single-cell analysis. They present a tool to enable biologists to design multi-colour fluorophore panels based on specific equipment's configurations. The authors demonstrate the efficacy of their algorithm by comparing computational predictions with experimental observations. 1
1996-2017 GPS position time series, velocities and quality measures for the CORS Network Saleh, J; Yoon, S; Choi, K; Sun, LJ; Snay, R; McFarland, P; Williams, S; Haw, D; Coloma, F The CORS network is a volunteer-based network of Global Positioning System reference stations located mainly in the US and its territories. We discuss the most recent comprehensive reprocessing of all GPS data collected via this network since 1996. Daily data for GPS weeks 834 through 1933 were reprocessed leading to epoch 2010.0 co-ordinates and velocities of 3049 stations aligned to IGS14. The updated realization of the US National Spatial Reference System derived in this work has been in use since late 2019. As a validation of the results, the derived velocity field is compared to several other solutions and to three regional geophysical and geodetic velocity models. These comparisons uncovered unstable stations which move differently than the regional kinematics around them. Once these are ignored, we estimate the horizontal and vertical stability of this updated realization to be better than ~0.3 and ~0.6 mm/year, respectively. We use the position residuals and estimated uncertainties from this reprocessing to derive long-term stability measures for all active stations serving longer than 3 years. These measures exposed ~60 CORS with the poorest long-term stability, which have been consequently excluded from serving as mapping control. 1
The 1-dimensional discrete Voronoi game Banik, A; Bhattacharya, BB; Das, S; Das, S The one-dimensional discrete Voronoi game consists of two competing players P1 and P2 and a set of N users placed on a line segment. The players alternately place one facility each on the line segment for R rounds, where the objective is to maximize their own total payoffs. We prove bounds on the worst-case (over the arrangement of the N users) payoffs of the two players, and discuss algorithms for finding the optimal strategies of the players in the 2-round game. 1
Enhancing Few-Shot Image Classification with Unlabelled Examples Bateni, P; Barber, J; Van de Meent, JW; Wood, F We develop a transductive meta-learning method that uses unlabelled instances to improve few-shot image classification performance. Our approach combines a regularized Mahalanobis-distance-based soft k-means clustering procedure with a modified state of the art neural adaptive feature extractor to achieve improved test-time classification accuracy using unlabelled data. We evaluate our method on transductive few-shot learning tasks, in which the goal is to jointly predict labels for query (test) examples given a set of support (training) examples. We achieve state of the art performance on the Meta-Dataset, mini-ImageNet and tiered-ImageNet benchmarks. All trained models and code have been made publicly available. 1
ADMIRING: Adversarial Multi-Network Mining Zhou, QH; Li, LY; Cao, N; Ying, L; Tong, HH Multi-sourced networks naturally appear in many application domains, ranging from bioinformatics, social networks, neuroscience to management. Although state-of-the-art offers rich models and algorithms to find various patterns when input networks are given, it has largely remained nascent on how vulnerable the mining results are due to the adversarial attacks. In this paper, we address the problem of attacking multi-network mining through the way of deliberately perturbing the networks to alter the mining results. The key idea of the proposed method (ADMIRING) is effective influence functions on the Sylvester equation defined over the input networks, which plays a central and unifying role in various multi-network mining tasks. The proposed algorithms bear two main advantages, including (1) effectiveness, being able to accurately quantify the rate of change of the mining results in response to attacks; and (2) generality, being applicable to a variety of multi-network mining tasks (e.g., graph kernel, network alignment, cross-network node similarity) with different attacking strategies (e.g., edge/node removal, attribute alteration). 1
Application of machine learning methods to pathogen safety evaluation in biological manufacturing processes Panjwani, S; Cui, I; Spetsieris, K; Mleczko, M; Wang, WS; Zou, JX; Anwaruzzaman, M; Liu, S; Canales, R; Hesse, O The production of recombinant therapeutic proteins from animal or human cell lines entails the risk of endogenous viral contamination from cell substrates and adventitious agents from raw materials and environment. One of the approaches to control such potential viral contamination is to ensure the manufacturing process can adequately clear the potential viral contaminants. Viral clearance for production of human monoclonal antibodies is achieved by dedicated unit operations, such as low pH inactivation, viral filtration, and chromatographic separation. The process development of each viral clearance step for a new antibody production requires significant effort and resources invested in wet laboratory experiments for process characterization studies. Machine learning methods have the potential to help streamline the development and optimization of viral clearance unit operations for new therapeutic antibodies. The current work focuses on evaluating the usefulness of machine learning methods for process understanding and predictive modeling for viral clearance via a case study on low pH viral inactivation. 1
Comparative analysis of the intracellular responses to disease-related aggregation-prone proteins Melnik, A; Cappelletti, V; Vaggi, F; Piazza, I; Tognetti, M; Schwarz, C; Cereghetti, G; Ahmed, MA; Soste, M; Matlack, K; de Souza, N; Csikasz-Nagy, A; Picotti, P Aggregation-prone proteins (APPs) have been implicated in numerous human diseases but the underlying mechanisms are incompletely understood. Here we comparatively analysed cellular responses to different APPs. Our study is based on a systematic proteomic and phosphoproteomic analysis of a set of yeast proteotoxicity models expressing different human disease-related APPs, which accumulate intracellular APP inclusions and exhibit impaired growth. Clustering and functional enrichment analyses of quantitative proteome-level data reveal that the cellular response to APP expression, including the chaperone response, is specific to the APP, and largely differs from the response to a more generalized proteotoxic insult such as heat shock. We further observe an intriguing association between the subcellular location of inclusions and the location of the cellular response, and provide a rich dataset for future mechanistic studies. Our data suggest that care should be taken when designing research models to study intracellular aggregation, since the cellular response depends markedly on the specific APP and the location of inclusions. Further, therapeutic approaches aimed at boosting protein quality control in protein aggregation diseases should be tailored to the subcellular location affected by inclusion formation. Significance: We have examined the global cellular response, in terms of protein abundance and phosphorylation changes, to the expression of five human neurodegeneration-associated, aggregation-prone proteins (APPs) in a set of isogenic yeast models. Our results show that the cellular response to each APP is unique to that protein, is different from the response to thermal stress, and is associated with processes at the subcellular location of APP inclusion formation. These results further our understanding of how cells, in a model organism, respond to expression of APPs implicated in neurodegenerative diseases like Parkinson's, Alzheimer's, and ALS. They have implications for mechanisms of toxicity as well as of protective responses in the cell. The specificity of the response to each APP means that research models of these diseases should be tailored to the APP in question. The subcellular localization of the response suggests that therapeutic interventions should also be targeted within the cell. 1
FbHash-E: A time and memory efficient version of FbHash similarity hashing algorithm Singh, M; Khunteta, A; Ghosh, M; Chang, D; Sanadhya, SK With the rapid advancements in digital technologies and the exponential growth of digital artifacts, automated filtering of cybercrime data for digital investigation from a variety of resources has become the need of the hour. Many techniques, primarily based on the Approximate Matching approach, have been proposed in the literature to address this challenging task. In 2019, Chang et al. proposed one such algorithm, FbHash: A New Similarity Hashing Scheme for Digital Forensics, which was shown to produce the best correlation results compared to other existing techniques and also to resist active adversary attacks, unlike others. However, no performance analysis of the tool was given. In this work, we show that the current design structure of FbHash is slower and more memory intensive than its peers. We then propose a novel Bloom-filter-based efficient version, FbHash-E, that has a much lower memory footprint and is computationally faster than FbHash. While the speed of FbHash-E is comparable to other state-of-the-art tools, it is resistant (like its predecessor) to "intentional/intelligent modifications that can fool the tool" attacks, unlike its peers. Our version thus renders FbHash-E fit for practical use cases. We perform various modification tests to evaluate the security and correctness of FbHash-E. Our experiment results show that our scheme is secure against active attacks and detects similarity with 87% accuracy. Compared to FbHash, there is only a 3% drop in accuracy. We demonstrate the sensitivity and robustness of our proposed scheme by performing a variety of containment and resemblance tests. We show that FbHash-E can correlate files with up to 10% random noise with a 100% detection rate and is able to detect commonality as small as 1% between two documents with an appropriate similarity score. We also show that our proposed scheme performs best at identifying similarities between different versions of software or program files. We also introduce a new test, the consistency test, and exhibit that our tool produces consistent results across all files under a fixed category with very low standard deviation, unlike other tools where the standard deviation under a fixed test varies significantly. This shows that our tool is more robust and stable against different modifications. 1
Differential Input Current Regulation in Parallel Output Connected Battery Power Modules Kamel, M; Rehman, MU; Zhang, F; Zane, RA; Maksimovic, D Parallel output connected converters have been widely investigated with a focus on equal current and power sharing. However, parallel output connected battery power modules (BPMs) require unequal currents to enable state-of-charge (SOC) control in active battery management systems (BMS). This article presents simple differential input current regulation for SOC control. Compared with equal current sharing, differential current regulation is more critical for system stability due to the cross-coupling between the paralleled BPMs. The article proposes design guidelines that enable differential current control while considering the cross-coupling between the paralleled BPMs. The small-signal model of a battery brick consisting of N parallel output connected BPMs that operate in boost mode is presented. This article shows the effect of paralleling and differential currents on the individual input current regulation loops. Simulations and experiments verify the analysis. Experimental validation using a 300-W prototype consisting of three parallel output connected battery modules in an active BMS is presented. 1
Detecting Product Review Spammers Using Principles of Big Data Rout, JK; Dalmia, A; Rath, SK; Mohanta, BK; Ramasubbareddy, S; Gandomi, AH The growing consumerism has led to the importance of online reviews on the Internet. Opinions voiced by these reviews are taken into consideration by many consumers for making financial decisions online. This has led to the development of opinion spamming for profitable motives or otherwise. This work has been done to tackle the challenge of identifying such spammers, but the scale of the real-world review systems demands this problem to be tackled as a big data challenge. So, an effort has been made to detect online review spammers using the principle of big data. In this article, a rating-based model has been studied under the light of large-scale datasets (more than 80 million reviews by 20 million reviewers) using the Hadoop and Spark frameworks. Scale effects have been identified and mitigated to provide better context to large review systems. An improved computational framework has been presented to compute the overall spamcity of reviewers using exponential smoothing. The value of the smoothing factor was set empirically. Finally, future directions have been discussed. 1
Utilizing image and caption information for biomedical document classification Li, PY; Jiang, XY; Zhang, GB; Trabucco, JT; Raciti, D; Smith, C; Ringwald, M; Marai, GE; Arighi, C; Shatkay, H Motivation: Biomedical research findings are typically disseminated through publications. To simplify access to domain-specific knowledge while supporting the research community, several biomedical databases devote significant effort to manual curation of the literature, a labor-intensive process. The first step toward biocuration requires identifying articles relevant to the specific area on which the database focuses. Thus, automatically identifying publications relevant to a specific topic within a large volume of publications is an important task toward expediting the biocuration process and, in turn, biomedical research. Current methods focus on textual contents, typically extracted from the title-and-abstract. Notably, images and captions are often used in publications to convey pivotal evidence about processes, experiments and results. Results: We present a new document classification scheme, using both image and caption information, in addition to titles-and-abstracts. To use the image information, we introduce a new image representation, namely Figure-word, based on class labels of subfigures. We use word embeddings for representing captions and titles-and-abstracts. To utilize all three types of information, we introduce two information integration methods. The first combines Figure-words and textual features obtained from captions and titles-and-abstracts into a single larger vector for document representation; the second employs a meta-classification scheme. Our experiments and results demonstrate the usefulness of the newly proposed Figure-words for representing images. Moreover, the results showcase the value of Figure-words, captions and titles-and-abstracts in providing complementary information for document classification; these three sources of information, when combined, lead to an overall improved classification performance. 1
MM-Hand: 3D-Aware Multi-Modal Guided Hand Generative Network for 3D Hand Pose Synthesis Wu, ZY; Hoang, D; Lin, SY; Xie, YS; Chen, LJ; Lin, YY; Wang, ZY; Fan, W Estimating the 3D hand pose from a monocular RGB image is important but challenging. A solution is training on large-scale RGB hand images with accurate 3D hand keypoint annotations. However, it is too expensive in practice. Instead, we have developed a learning-based approach to synthesize realistic, diverse, and 3D pose-preserving hand images under the guidance of 3D pose information. We propose a 3D-aware multi-modal guided hand generative network (MM-Hand), together with a novel geometry-based curriculum learning strategy. Our extensive experimental results demonstrate that the 3D-annotated images generated by MM-Hand qualitatively and quantitatively outperform existing options. Moreover, the augmented data can consistently improve the quantitative performance of the state-of-the-art 3D hand pose estimators on two benchmark datasets. The code will be available at https://github.com/ScottHoang/mm-hand. 1
Estimating the Impact of High-Fidelity Rainfall Data on Traffic Conditions and Traffic Prediction Prokhorchuk, A; Mitrovic, N; Muhammad, U; Stevanovic, A; Asif, MT; Dauwels, J; Jaillet, P Accurate prediction of network-level traffic parameters during inclement weather conditions can greatly help in many transportation applications. Rainfall tends to have a quantifiable impact on driving behavior and traffic network performance. This impact is often studied for low-resolution rainfall data on small road networks, whereas this study investigates it in the context of a large traffic network and high-resolution rainfall radar images. First, the impact of rainfall intensity on traffic performance throughout the day and for different road categories is analyzed. Next, it is investigated whether including rainfall information can improve the predictive accuracy of the state-of-the-art traffic forecasting methods. Numerical results show that the impact of rainfall on traffic varies for different rainfall intensities as well as for different times of the day and days of the week. The results also show that incorporating rainfall data into prediction models improves their overall performance. The average reduction in mean absolute percentage error (MAPE) for models with rainfall data is 4.5%. Experiments with downsampled rainfall data were also performed, and it was concluded that incorporating higher resolution weather data does indeed lead to an increase in performance of traffic prediction models. 1
The Interaction Between Conscientiousness and General Mental Ability: Support for a Compensatory Interaction in Task Performance Harris-Watson, AM; Kung, MC; Tocci, MC; Boyce, AS; Weekley, JA; Guenole, N; Carter, NT We propose a compensatory interactive influence of conscientiousness and GMA in task performance such that conscientiousness is most beneficial to performance for low-GMA individuals. Drawing on trait-by-trait interaction theory and empirical evidence for a compensatory mechanism of conscientiousness for low GMA, we contrast our hypothesis with prior research on a conscientiousness-GMA interaction and argue that prior research considered a different interaction type. We argue that observing a compensatory interaction likely requires (a) considering the appropriate interaction form, including a possible curvilinear conscientiousness-performance relationship; (b) measuring the full conscientiousness domain (as opposed to motivation proxies); (c) narrowing the criterion domain to reflect task performance; and (d) appropriate psychometric scoring of variables to increase power and avoid Type I error. In four employee samples (N1 = 300; N2 = 261; N3 = 1,413; N4 = 948), we test the conscientiousness-GMA interaction. In three of four samples, results support a nuanced compensatory mechanism such that conscientiousness compensates for low to moderate GMA, and high conscientiousness may be detrimental to or unimportant for task performance in high-GMA individuals. 1
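The sketches below are illustrative only: they are rough, hedged approximations of techniques mentioned in the entries above, written for orientation, and they do not reproduce the cited papers' implementations.

For the noncontact body-voltage entry (Yong et al.), voltage is estimated from an overhead field-mill reading plus geometry. As a back-of-the-envelope intuition, the following sketch approximates the person as a single isolated conducting sphere rather than the paper's superposition of spheroids; the function name, sensor distance, and body radius are assumptions made up for illustration.

```python
# Crude illustrative estimate of body voltage from an overhead field-mill
# reading, approximating the person as an isolated conducting sphere.
# The cited work instead models the body as a superposition of spheroids
# and calibrates against Kinect-derived geometry; this is intuition only.

def body_voltage_estimate(e_field_v_per_m, sensor_distance_m, body_radius_m=0.25):
    """For an isolated sphere at potential V, the radial field at distance d
    from its centre is E = V * R / d**2, so V is roughly E * d**2 / R."""
    return e_field_v_per_m * sensor_distance_m ** 2 / body_radius_m

# Example: 2 kV/m measured 1.5 m above a person approximated by a 0.25 m sphere.
print(body_voltage_estimate(2000.0, 1.5))  # about 18 kV, a very rough figure
```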
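The review-spammer entry (Rout et al.) computes an overall spamcity per reviewer using exponential smoothing over rating-based signals. A minimal sketch of that idea follows, assuming a chronological list of per-review spam scores in [0, 1] and an illustrative smoothing factor; the paper's actual features and empirically chosen factor are not reproduced here.

```python
# Minimal sketch: combining a reviewer's chronological per-review spam scores
# with exponential smoothing. Score values and alpha are illustrative only.

def smoothed_spamcity(review_scores, alpha=0.3):
    """Return a single spamcity value from per-review scores in [0, 1],
    giving more weight to recent reviews via exponential smoothing."""
    if not review_scores:
        return 0.0
    s = review_scores[0]
    for score in review_scores[1:]:
        s = alpha * score + (1 - alpha) * s
    return s

# Example: a reviewer whose recent reviews look increasingly suspicious.
print(smoothed_spamcity([0.1, 0.2, 0.8, 0.9]))
```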
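The transductive few-shot entry (Bateni et al.) combines a regularized Mahalanobis-distance soft k-means step with an adaptive feature extractor. The sketch below shows only the clustering step on pre-extracted features, with an assumed shared regularized covariance, a fixed iteration count, and a temperature parameter; it is not the authors' released implementation.

```python
# Minimal NumPy sketch of soft k-means assignments with a regularized
# Mahalanobis distance, applied to pre-extracted support/query features.
import numpy as np

def soft_kmeans_mahalanobis(support, support_labels, query, n_iter=5, reg=1.0, temp=1.0):
    classes = np.unique(support_labels)
    centroids = np.stack([support[support_labels == c].mean(axis=0) for c in classes])
    # Shared covariance from the support set, regularized to stay invertible.
    cov = np.cov(support, rowvar=False) + reg * np.eye(support.shape[1])
    prec = np.linalg.inv(cov)
    for _ in range(n_iter):
        diff = query[:, None, :] - centroids[None, :, :]        # (Q, C, D)
        d2 = np.einsum('qcd,de,qce->qc', diff, prec, diff)      # squared Mahalanobis
        resp = np.exp(-d2 / temp)
        resp /= resp.sum(axis=1, keepdims=True)                 # soft assignments
        for k, c in enumerate(classes):
            sup_k = support[support_labels == c]
            num = sup_k.sum(axis=0) + (resp[:, k:k + 1] * query).sum(axis=0)
            den = len(sup_k) + resp[:, k].sum()
            centroids[k] = num / den                            # refit class centroids
    return resp  # soft class probabilities for each query example

# Toy usage: 5-way, 1-shot support set with 10 unlabelled query points.
rng = np.random.default_rng(0)
probs = soft_kmeans_mahalanobis(rng.normal(size=(5, 8)), np.arange(5), rng.normal(size=(10, 8)))
```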
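The FbHash-E entry (Singh et al.) attributes its memory savings to a Bloom-filter-based design. As general background only, the following is a tiny standalone Bloom filter; the sizes, hash construction, and item names are illustrative and unrelated to FbHash-E's actual parameters.

```python
# Tiny illustrative Bloom filter: a fixed-size bit array with k hash positions
# per item; membership tests can yield false positives but never false negatives.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, n_hashes=3):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive n_hashes bit positions from salted SHA-256 digests of the item.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
bf.add("chunk-hash-1")
print("chunk-hash-1" in bf, "chunk-hash-2" in bf)  # True False (with high probability)
```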

Get Started

Contact our education industry experts to begin your journey to the cloud.
1010 0766 - Beijing Region, operated by Sinnet (光环新网)
1010 0966 - Ningxia Region, operated by NWCD (西云数据)