This column collects papers in computer vision. Date: May 11, 2021. Source: Paper Digest.
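Follow the original WeChat public account 【计算机视觉联盟】 and reply 【西瓜书手推笔记】 to get my fully hand-derived machine learning notes!
Direct link to the notes: Machine Learning Hand-Derived Notes (GitHub)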
1, TITLE: Galois/monodromy Groups for Decomposing Minimal Problems in 3D Reconstruction
AUTHORS: Timothy Duff ; Viktor Korotynskiy ; Tomas Pajdla ; Margaret H. Regan
CATEGORY: math.AG [math.AG, cs.CV, cs.NA, math.NA]
HIGHLIGHT: For this problem of degree 64, we can reduce the degree to 16; the latter better reflecting the intrinsic difficulty of algebraically solving the problem.
2, TITLE: Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning
AUTHORS: Pan Lu et al.
CATEGORY: cs.CL [cs.CL, cs.AI, cs.CV, cs.FL]
HIGHLIGHT: Thus, we construct a new large-scale benchmark, Geometry3K, consisting of 3,002 geometry problems with dense annotation in formal language.
3, TITLE: Deep Feature Selection-and-fusion for RGB-D Semantic Segmentation
AUTHORS: Yuejiao Su ; Yuan Yuan ; Zhiyu Jiang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This work proposes a unified and efficient feature selection-and-fusion network (FSFNet), which contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information.
4, TITLE: Examining and Mitigating Kernel Saturation in Convolutional Neural Networks Using Negative Images
AUTHORS: Nidhi Gowdra ; Roopak Sinha ; Stephen MacDonell
CATEGORY: cs.CV [cs.CV, cs.NE]
HIGHLIGHT: In this paper, we analyze the effect of convolutional kernel saturation in CNNs and propose a simple data augmentation technique to mitigate saturation and increase classification accuracy, by supplementing negative images to the training dataset.
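As a rough illustration of the augmentation described in entry 4, the sketch below builds the negative of an 8-bit grayscale image by inverting every pixel and appends it to the training set. The function names, the nested-list image format, and the choice to reuse the original label for the negative copy are assumptions made for illustration; the highlight does not specify them, and this is not the authors' code.

```python
def negative_image(image, max_value=255):
    """Return the negative of an image given as rows of integer pixels."""
    return [[max_value - pixel for pixel in row] for row in image]


def augment_with_negatives(dataset, max_value=255):
    """Return the dataset plus one negative copy of every (image, label) pair.

    Assumption: the negative keeps the label of the original image.
    """
    augmented = list(dataset)
    for image, label in dataset:
        augmented.append((negative_image(image, max_value), label))
    return augmented


if __name__ == "__main__":
    tiny_dataset = [([[0, 128], [255, 64]], "cat")]
    for image, label in augment_with_negatives(tiny_dataset):
        print(label, image)
```

The augmented set is twice the size of the original, so in practice one might instead generate negatives on the fly during training.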
5, TITLE: Coupling Intent and Action for Pedestrian Crossing Behavior Prediction
AUTHORS: Yu Yao ; Ella Atkins ; Matthew Johnson-Roberson ; Ram Vasudevan ; Xiaoxiao Du
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we follow the neuroscience and psychological literature to define pedestrian crossing behavior as a combination of an unobserved inner will (a probabilistic representation of binary intent of crossing vs. not crossing) and a set of multi-class actions (e.g., walking, standing, etc.).
6, TITLE: Multi-Agent Semi-Siamese Training for Long-tail and Shallow Face Learning
AUTHORS: Hailin Shi et al.
CATEGORY: cs.CV [cs.CV, I.4.10]
HIGHLIGHT: Based on the Semi-Siamese Training (SST), we introduce an advanced solution, named Multi-Agent Semi-Siamese Training (MASST), to address these problems.
7, TITLE: PillarSegNet: Pillar-based Semantic Grid Map Estimation Using Sparse LiDAR Data
AUTHORS: Juncong Fei ; Kunyu Peng ; Philipp Heidenreich ; Frank Bieder ; Christoph Stiller
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: To train and evaluate our approach, we use both sparse and dense ground truth, where the dense ground truth is obtained from multiple superimposed scans.
8, TITLE: Estimation of 3D Human Pose Using Prior Knowledge
AUTHORS: Shu Chen ; Lei Zhang ; Beiji Zou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: The experimental results on the H36M show that the method performed better than other state-of-the-art three-dimensional human pose estimation approaches.
9, TITLE: E-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks
AUTHORS: Maxime Kayser et al.
CATEGORY: cs.CV [cs.CV, cs.CL, cs.LG]
HIGHLIGHT: In this work, we introduce e-ViL, a benchmark for explainable vision-language tasks that establishes a unified evaluation framework and provides the first comprehensive comparison of existing approaches that generate NLEs for VL tasks.
10, TITLE: PCA Event-Based Optical Flow for Visual Odometry
AUTHORS: Mahmoud Z. Khairallah ; Fabien Bonardi ; David Roussel ; Samia Bouchafa
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a Principal Component Analysis (PCA) approach to the problem of event-based optical flow estimation.
11, TITLE: Slash or Burn: Power Line and Vegetation Classification for Wildfire Prevention
AUTHORS: Austin Park ; Farzaneh Rajabi ; Ross Weber
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: Data is frequently taken from drone or satellite footage, but Google Street View offers an even more scalable and lower cost solution.
12, TITLE: Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning
AUTHORS: Dandan Guo ; Ruiying Lu ; Bo Chen ; Zequn Zeng ; Mingyuan Zhou
CATEGORY: cs.CV [cs.CV, stat.ML]
HIGHLIGHT: Inspired by recent successes in integrating semantic topics into this task, this paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework, which couples a visual extractor with a deep topic model to guide the learning of a language model.
13, TITLE: Analysis and Mitigations of Reverse Engineering Attacks on Local Feature Descriptors
AUTHORS: Deeksha Dangwal et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We take this a step further and model potential adversaries using a privacy threat model.
14, TITLE: TrTr: Visual Tracking with Transformer
AUTHORS: Moju Zhao ; Kei Okada ; Masayuki Inaba
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel tracker network based on a powerful attention mechanism called Transformer encoder-decoder architecture to gain global and rich contextual interdependencies.
15, TITLE: Unsupervised Human Pose Estimation Through Transforming Shape Templates
AUTHORS: Luca Schmidtke et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper we present a novel method for learning pose estimators for human adults and infants in an unsupervised fashion.
16, TITLE: Trajectory Prediction for Autonomous Driving with Topometric Map
AUTHORS: Jiaolong Xu ; Liang Xiao ; Dawei Zhao ; Yiming Nie ; Bin Dai
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this work, we propose an end-to-end transformer-network-based approach for map-less autonomous driving.
17, TITLE: Selective Probabilistic Classifier Based on Hypothesis Testing
AUTHORS: Saeed Bakhshi Germi ; Esa Rahtu ; Heikki Huttunen
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this paper, we propose a simple yet effective method to deal with the violation of the Closed-World Assumption for a classifier.
18, TITLE: An End-to-end Optical Character Recognition Approach for Ultra-low-resolution Printed Text Images
AUTHORS: Julian D. Gilbey ; Carola-Bibiane Schönlieb
CATEGORY: cs.CV [cs.CV, 68T10, I.7.5]
HIGHLIGHT: This approach is inspired by our understanding of the human visual system, and builds on established neural networks for performing OCR. We make our code and data (including a set of low-resolution images with their ground truths) publicly available as a benchmark for future work in this field.
19, TITLE: Seismic Fault Segmentation Via 3D-CNN Training By A Few 2D Slices Labels
AUTHORS: YiMin Dou ; Kewen Li ; Jianbing Zhu ; Xiao Li ; Yingjie Xi
CATEGORY: cs.CV [cs.CV, physics.geo-ph]
HIGHLIGHT: In this study, we present a new binary cross-entropy and smooth L1 loss (λ-BCE and λ-smooth L1) to effectively train 3D-CNN by sampling some 2D slices from 3D seismic data, so that the model can learn the segmentation of 3D seismic data from a few 2D slices.
20, TITLE: Distribution Matching for Heterogeneous Multi-Task Learning: A Large-scale Face Study
AUTHORS: Dimitrios Kollias ; Viktoriia Sharmanska ; Stefanos Zafeiriou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we deal with heterogeneous MTL, simultaneously addressing detection, classification & regression problems.
21, TITLE: Interaction Detection Between Vehicles and Vulnerable Road Users: A Deep Generative Approach with Attention
AUTHORS: Hao Cheng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a deep conditional generative model for interaction detection at such locations.
22, TITLE: Conformer: Local Features Coupling Global Representations for Visual Recognition
AUTHORS: Zhiliang Peng et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a hybrid network structure, termed Conformer, to take advantage of convolutional operations and self-attention mechanisms for enhanced representation learning.
23, TITLE: AnomalyHop: An SSL-based Image Anomaly Localization Method
AUTHORS: Kaitai Zhang et al.
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: An image anomaly localization method based on the successive subspace learning (SSL) framework, called AnomalyHop, is proposed in this work.
24, TITLE: Binarized Weight Error Networks With A Transition Regularization Term
AUTHORS: Savas Ozkan ; Gozde Bozdagi Akar
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes a novel binarized weight network (BT) for a resource-efficient neural structure.
25, TITLE: RelationTrack: Relation-aware Multiple Object Tracking with Decoupled Representation
AUTHORS: En Yu ; Zhuoling Li ; Shoudong Han ; Hongwei Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: With the target of alleviating this contradiction, we devise a module named Global Context Disentangling (GCD) that decouples the learned representation into detection-specific and ReID-specific embeddings.
26, TITLE: DocReader: Bounding-Box Free Training of A Document Information Extraction Model
AUTHORS: Shachar Klaiman ; Marius Lehne
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work we present DocReader, an end-to-end neural-network-based information extraction solution which can be trained using solely the images and the target values that need to be read.
27, TITLE: Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
AUTHORS: Mathew Monfort et al.
CATEGORY: cs.CV [cs.CV, cs.CL, cs.LG, cs.SD, eess.AS]
HIGHLIGHT: To address this, we present the Spoken Moments (S-MiT) dataset of 500k spoken captions each attributed to a unique short video depicting a broad range of different events.
28, TITLE: ICON: Learning Regular Maps Through Inverse Consistency
AUTHORS: Hastings Greer ; Roland Kwitt ; Francois-Xavier Vialard ; Marc Niethammer
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We explore what induces regularity for spatial transformations, e.g., when computing image registrations.
29, TITLE: An Autonomous Drone for Search and Rescue in Forests Using Airborne Optical Sectioning
AUTHORS: D. C. Schedl ; I. Kurmi ; O. Bimber
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a first prototype that finds people fully autonomously in densely occluded forests.
30, TITLE: Truly Shift-equivariant Convolutional Neural Networks with Adaptive Polyphase Upsampling
AUTHORS: Anadi Chaman ; Ivan Dokmanić
CATEGORY: cs.CV [cs.CV, eess.SP]
HIGHLIGHT: We address this problem by proposing adaptive polyphase upsampling (APS-U), a non-linear extension of conventional upsampling, which allows CNNs to exhibit perfect shift equivariance.
31, TITLE: End-to-End Optical Character Recognition for Bengali Handwritten Words
AUTHORS: Farisa Benta Safir ; Abu Quwsar Ohi ; M. F. Mridha ; Muhammad Mostafa Monowar ; Md. Abdul Hamid
CATEGORY: cs.CV [cs.CV, cs.IR]
HIGHLIGHT: This paper introduces an end-to-end OCR system for Bengali language.
32, TITLE: Active Terahertz Imaging Dataset for Concealed Object Detection
AUTHORS: Dong Liang ; Fei Xue ; Ling Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we provide a public dataset for evaluating multi-object detection algorithms in active terahertz imaging at a resolution of 5 mm by 5 mm.
33, TITLE: A Novel Triplet Sampling Method for Multi-Label Remote Sensing Image Search and Retrieval
AUTHORS: Tristan Kreuziger ; Mahdyar Ravanbakhsh ; Begüm Demir
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: To address this problem, in this paper we propose a novel triplet sampling method in the framework of deep neural networks (DNNs) defined for multi-label RS CBIR problems.
34, TITLE: Temporal-Spatial Feature Pyramid for Video Saliency Detection
AUTHORS: Qinyao Chang ; Shiping Zhu ; Lanyun Zhu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a 3D fully convolutional encoder-decoder architecture for video saliency detection, which combines scale, space and time information for video saliency modeling.
35, TITLE: You Only Learn One Representation: Unified Network for Multiple Tasks
AUTHORS: Chien-Yao Wang ; I-Hau Yeh ; Hong-Yuan Mark Liao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a unified network to encode implicit knowledge and explicit knowledge together, just like the human brain can learn knowledge from normal learning as well as subconsciousness learning.
36, TITLE: MDA-Net: Multi-Dimensional Attention-Based Neural Network for 3D Image Segmentation
AUTHORS: Rutu Gandhi ; Yi Hong
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: To address this challenge, we propose a multi-dimensional attention network (MDA-Net) to efficiently integrate slice-wise, spatial, and channel-wise attention into a U-Net based network, which results in high segmentation accuracy with a low computational cost.
37, TITLE: Action Shuffling for Weakly Supervised Temporal Localization
AUTHORS: Xiao-Yu Zhang ; Haichao Shi ; Changsheng Li ; Xinchu Shi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To be specific, we propose a novel two-branch network architecture with intra/inter-action shuffling, referred to as ActShufNet.
38, TITLE: Overcoming The Distance Estimation Bottleneck in Camera Trap Distance Sampling
AUTHORS: Timm Haucke ; Hjalmar S. Kühl ; Jacqueline Hoyer ; Volker Steinhage
CATEGORY: cs.CV [cs.CV, I.4.9; I.5.4]
HIGHLIGHT: To overcome this distance estimation bottleneck in CTDS, this study proposes a completely automatized workflow utilizing state-of-the-art methods of image processing and pattern recognition.
39, TITLE: Self-Supervised Adversarial Example Detection By Disentangled Representation
AUTHORS: Zhaoxi Zhang et al.
CATEGORY: cs.CV [cs.CV, cs.CR, cs.LG]
HIGHLIGHT: We compare our method with state-of-the-art self-supervised detection methods under different adversarial attacks and different victim models (30 attack settings), and it exhibits better performance in various measurements (AUC, FPR, TPR) for most attack settings.
40, TITLE: Stochastic Image-to-Video Synthesis Using CINNs
AUTHORS: Michael Dorkenwald et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics: Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.
41, TITLE: Self-Supervised Learning with Swin Transformers
AUTHORS: Zhenda Xie et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a self-supervised learning approach called MoBY, with Vision Transformers as its backbone architecture.
42, TITLE: Event-LSTM: An Unsupervised and Asynchronous Learning-based Representation for Event-based Data
AUTHORS: Lakshmi Annamalai ; Vignesh Ramanathan ; Chetan Singh Thakur
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome this limitation, we propose Event-LSTM, an unsupervised Auto-Encoder architecture made up of LSTM layers as a promising alternative to learn 2D grid representation from event sequence.
43, TITLE: Estimating Parkinsonism Severity in Natural Gait Videos of Older Adults with Dementia
AUTHORS: Andrea Sabo ; Sina Mehdizadeh ; Andrea Iaboni ; Babak Taati
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We propose a two-stage training approach consisting of a self-supervised pretraining stage that encourages the ST-GCN model to learn about gait patterns before predicting clinical scores in the finetuning stage.
44, TITLE: An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery
AUTHORS: Xuan Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a multipath encoder structure to extract features of multipath inputs, a multipath attention-fused block module to fuse multipath features, and a refinement attention-fused block module to fuse high-level abstract features and low-level spatial features.
45, TITLE: Video Anomaly Detection By The Duality Of Normality-Granted Optical Flow
AUTHORS: Hongyong Wang ; Xinjian Zhang ; Su Yang ; Weishan Zhang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose to discriminate anomalies from normal ones by the duality of normality-granted optical flow, which is conducive to predict normal frames but adverse to abnormal frames.
46, TITLE: Visual Grounding with Transformers
AUTHORS: Ye Du ; Zehua Fu ; Qingjie Liu ; Yunhong Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a transformer based approach for visual grounding.
47, TITLE: Primitive Representation Learning for Scene Text Recognition
AUTHORS: Ruijie Yan ; Liangrui Peng ; Shanyu Xiao ; Gang Yao
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a primitive representation learning method that aims to exploit intrinsic representations of scene text images.
48, TITLE: The IWildCam 2021 Competition Dataset
AUTHORS: Sara Beery ; Arushi Agarwal ; Elijah Cole ; Vighnesh Birodkar
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Object detection techniques can be used to find the number of individuals in each image.
49, TITLE: Human-Aided Saliency Maps Improve Generalization of Deep Learning
AUTHORS: Aidan Boyd ; Kevin Bowyer ; Adam Czajka
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We address these challenges in a novel way, with the first-ever (to our knowledge) exploration of encoding human judgement about salient regions of images into the training data.
50, TITLE: TextAdaIN: Fine-Grained AdaIN for Robust Text Recognition
AUTHORS: Oren Nuriel ; Sharon Fogel ; Ron Litman
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Motivated by this, we suggest an approach to regulate the reliance on local statistics that improves overall text recognition performance.
51, TITLE: KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation
AUTHORS: Mengqi Xue et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we introduce a novel task-oriented attention model, termed as KDExplainer, to shed light on the working mechanism underlying the vanilla KD.
52, TITLE: AFINet: Attentive Feature Integration Networks for Image Classification
AUTHORS: Xinglin Pan et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we design Attentive Feature Integration (AFI) modules, which are widely applicable to most recent network architectures, leading to new architectures named AFI-Nets.
53, TITLE: Preserving Privacy in Human-Motion Affect Recognition
AUTHORS: Matthew Malek-Podjaski ; Fani Deligianni
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To this end, we propose a cross-subject transfer learning technique for training a multi-encoder autoencoder deep neural network to learn disentangled latent representations of human motion features.
54, TITLE: Good Practices and A Strong Baseline for Traffic Anomaly Detection
AUTHORS: Yuxiang Zhao et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a straightforward and efficient framework that includes pre-processing, a dynamic track module, and post-processing.
55, TITLE: A Hybrid Model for Combining Neural Image Caption and K-Nearest Neighbor Approach for Image Captioning
AUTHORS: Kartik Arora ; Ajul Raj ; Arun Goel ; Seba Susan
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: A hybrid model is proposed that integrates two popular image captioning methods to generate a text-based summary describing the contents of the image.
56, TITLE: Dataset and Performance Comparison of Deep Learning Architectures for Plum Detection and Robotic Harvesting
AUTHORS: Jasper Brown ; Salah Sukkarieh
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this work, two new datasets are gathered during day and night operation of an actual robotic plum harvesting system.
57, TITLE: Fish Disease Detection Using Image Based Machine Learning Technique in Aquaculture
AUTHORS: Md Shoaib Ahmed ; Tanjim Taharat Aurpa ; Md. Abul Kalam Azad
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we aim to detect disease in salmon in aquaculture, as salmon aquaculture is the fastest-growing food production system globally, accounting for 70 percent (2.5 million tons) of the market.
58, TITLE: Beyond Monocular Deraining: Parallel Stereo Deraining Network Via Semantic Prior
AUTHORS: Kaihao Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a Paired Rain Removal Network (PRRNet), which exploits both stereo images and semantic information.
59, TITLE: Video Class Agnostic Segmentation with Contrastive Learning for Autonomous Driving
AUTHORS: Mennatullah Siam ; Alex Kendall ; Martin Jagersand
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel auxiliary contrastive loss to learn the segmentation of known classes and unknown objects. We further release a large-scale synthetic dataset for different autonomous driving scenarios that includes distinct and rare unknown objects.
60, TITLE: Elastic Weight Consolidation (EWC): Nuts and Bolts
AUTHORS: Abhishek Aich
CATEGORY: cs.CV [cs.CV, cs.LG, stat.ML]
HIGHLIGHT: In this report, we present theoretical support for the continual learning method Elastic Weight Consolidation, introduced in the paper titled 'Overcoming catastrophic forgetting in neural networks'.
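For readers unfamiliar with the method named in entry 60, the following is a minimal sketch of the quadratic penalty that Elastic Weight Consolidation adds to the loss of a new task: each parameter is pulled back toward its value after the previous task, weighted by a diagonal Fisher information estimate. The flat-list parameter format and the function names are assumptions for illustration, and how the Fisher values are estimated is left out.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Return (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2.

    params:        current parameters theta (flat list of floats)
    anchor_params: parameters theta_star learned on the previous task
    fisher_diag:   diagonal Fisher information estimates F_i
    lam:           strength of the pull toward the old parameters
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_loss(new_task_loss, params, anchor_params, fisher_diag, lam=1.0):
    """Total objective: loss on the new task plus the EWC penalty."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

Parameters with a large Fisher value are treated as important for the old task and are penalized more for moving, while parameters with a value near zero remain free to change.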
61, TITLE: Reconstructive Sequence-Graph Network for Video Summarization
AUTHORS: Bin Zhao ; Haopeng Li ; Xiaoqiang Lu ; Xuelong Li
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Motivated by this point, we propose a Reconstructive Sequence-Graph Network (RSGN) to encode the frames and shots as sequence and graph hierarchically, where the frame-level dependencies are encoded by Long Short-Term Memory (LSTM), and the shot-level dependencies are captured by the Graph Convolutional Network (GCN).
62, TITLE: Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
AUTHORS: Wei Lou et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes Dynamic-OFA, a novel dynamic DNN approach for state-of-the-art platform-aware NAS models (i.e. Once-for-all network (OFA)).
63, TITLE: Robust Training Using Natural Transformation
AUTHORS: Shuo Wang et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To bridge this gap, we present NaTra, an adversarial training scheme that is designed to improve the robustness of image classification algorithms.
64, TITLE: CFPNet-M: A Light-Weight Encoder-Decoder Based Network for Multimodal Biomedical Image Real-Time Segmentation
AUTHORS: Ange Lou ; Shuyue Guan ; Murray Loew
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: Based on these two modifications, we proposed a novel light-weight architecture -- Channel-wise Feature Pyramid Network for Medicine (CFPNet-M).
65, TITLE: Self-supervised Spectral Matching Network for Hyperspectral Target Detection
AUTHORS: Can Yao ; Yuan Yuan ; Zhiyu Jiang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: The model adopts a spectral similarity based matching network framework.
66, TITLE: Facial Emotion Recognition: State of The Art Performance on FER2013
AUTHORS: Yousif Khaireddin ; Zhuofa Chen
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this work, we achieve the highest single-network classification accuracy on the FER2013 dataset.
67, TITLE: ABCNet V2: Adaptive Bezier-Curve Network for Real-time End-to-end Text Spotting
AUTHORS: Yuliang Liu et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Our main contributions are four-fold: 1) For the first time, we adaptively fit arbitrarily-shaped text by a parameterized Bezier curve, which, compared with segmentation-based methods, can not only provide structured output but also controllable representation.
68, TITLE: SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
AUTHORS: Bing Li ; Cheng Zheng ; Silvio Giancola ; Bernard Ghanem
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We propose a novel scene flow estimation approach to capture and infer 3D motions from point clouds.
69, TITLE: CASIA-Face-Africa: A Large-scale African Face Image Database
AUTHORS: Jawad Muhammad ; Yunlong Wang ; Caiyong Wang ; Kunbo Zhang ; Zhenan Sun
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Many investigative studies on face recognition algorithms have reported higher false positive rates for cohorts of African subjects than for other cohorts. To this end, we collect a face image database named CASIA-Face-Africa, which contains 38,546 images of 1,183 African subjects.
70, TITLE: Domain-Specific Suppression for Adaptive Object Detection
AUTHORS: Yu Wang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose the domain-specific suppression, an exemplary and generalizable constraint to the original convolution gradients in backpropagation to detach the two parts of directions and suppress the domain-specific one.
71, TITLE: Incremental Training and Group Convolution Pruning for Runtime DNN Performance Scaling on Heterogeneous Embedded Platforms
AUTHORS: Lei Xun ; Long Tran-Thanh ; Bashir M Al-Hashimi ; Geoff V. Merrett
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a dynamic DNN using incremental training and group convolution pruning.
72, TITLE: Optimising Resource Management for Embedded Machine Learning
AUTHORS: Lei Xun ; Long Tran-Thanh ; Bashir M Al-Hashimi ; Geoff V. Merrett
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present approaches for online resource management in heterogeneous multi-core systems and show how they can be applied to optimise the performance of machine learning workloads.
73, TITLE: Improving Robustness for Pose Estimation Via Stable Heatmap Regression
AUTHORS: Yumeng Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In view of this problem, a stable heatmap regression method is proposed to alleviate network vulnerability to small perturbations.
74, TITLE: Sign-Agnostic CONet: Learning Implicit Surface Reconstructions By Sign-Agnostic Optimization of Convolutional Occupancy Networks
AUTHORS: Jiapeng Tang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we propose to learn implicit surface reconstruction by sign-agnostic optimization of convolutional occupancy networks, to simultaneously achieve advanced scalability, generality, and applicability in a unified framework.
75, TITLE: Boosting Semi-Supervised Face Recognition with Noise Robustness
AUTHORS: Yuchi Liu et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise arising from auto-labelling.
76, TITLE: An Enhanced Randomly Initialized Convolutional Neural Network for Columnar Cactus Recognition in Unmanned Aerial Vehicle Imagery
AUTHORS: Safa Ben Atitallah et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose an Enhanced Randomly Initialized Convolutional Neural Network (ERI-CNN) for the recognition of columnar cactus, which is an endemic plant that exists in the Tehuacán-Cuicatlán Valley in southeastern Mexico.
77, TITLE: Unsupervised Remote Sensing Super-Resolution Via Migration Image Prior
AUTHORS: Jiaming Wang et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: In this paper, we proposed a new unsupervised learning framework, called "MIP", which achieves SR tasks without low/high resolution image pairs.
78, TITLE: Learning to Predict Repeatability of Interest Points
AUTHORS: Anh-Dzung Doan ; Daniyar Turmukhambetov ; Yasir Latif ; Tat-Jun Chin ; Soohyun Bae
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: This paper proposes to predict the repeatability of an interest point as a function of time, which can tell us the lifespan of the interest point considering daily or seasonal variation.
79, TITLE: Improving Cost Learning for JPEG Steganography By Exploiting JPEG Domain Knowledge
AUTHORS: Weixuan Tang ; Bin Li ; Mauro Barni ; Jin Li ; Jiwu Huang
CATEGORY: cs.CR [cs.CR, cs.CV]
HIGHLIGHT: To address the issue, in this paper we extend an existing automatic cost learning scheme to JPEG, where the proposed scheme called JEC-RL (JPEG Embedding Cost with Reinforcement Learning) is explicitly designed to tailor the JPEG DCT structure.
80, TITLE: The Modulo Radon Transform: Theory, Algorithms and Applications
AUTHORS: Matthias Beckmann ; Ayush Bhandari ; Felix Krahmer
CATEGORY: cs.IT [cs.IT, cs.CV, eess.SP, math.IT]
HIGHLIGHT: By harnessing a joint design between hardware and algorithms, we present a single-shot HDR tomography approach, which to our knowledge, is the only approach that is backed by mathematical guarantees.
81, TITLE: Contrastive Conditional Transport for Representation Learning
AUTHORS: Huangjie Zheng et al.
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: This paper proposes contrastive conditional transport (CCT) that defines its CL loss over dependent sample-query pairs, which in practice is realized by drawing a random query, randomly selecting positive and negative samples, and contrastively reweighting these samples according to their distances to the query, exerting a greater force to both pull more distant positive samples towards the query and push closer negative samples away from the query.
82, TITLE: MetaKernel: Learning Variational Random Features with Limited Labels
AUTHORS: Yingjun Du et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we propose meta-learning kernels with random Fourier features for few-shot learning, which we call MetaKernel.
83, TITLE: Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels
AUTHORS: Erik Englesson ; Hossein Azizpour
CATEGORY: cs.LG [cs.LG, cs.CV, stat.ML]
HIGHLIGHT: We propose two novel loss functions based on Jensen-Shannon divergence for learning under label noise.
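As background for entry 83, here is a small sketch of the standard two-distribution Jensen-Shannon divergence, used as a loss between a one-hot label and a predicted class distribution. It is not the generalized loss the paper proposes, whose exact form the highlight does not give; the function names and the plain-list distribution format are assumptions for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    label = [0.0, 1.0, 0.0]        # one-hot target
    prediction = [0.2, 0.7, 0.1]   # softmax output of a classifier
    print(js_divergence(label, prediction))
```

Unlike cross-entropy, this quantity is symmetric and bounded above by log 2, so a single confidently mislabelled example cannot contribute an arbitrarily large loss.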
84, TITLE: In-Hindsight Quantization Range Estimation for Quantized Training
AUTHORS: Marios Fournarakis ; Markus Nagel
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: We propose a simple alternative to dynamic quantization, in-hindsight range estimation, that uses the quantization ranges estimated on previous iterations to quantize the present.
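The idea in entry 84 can be sketched as follows: the current tensor is quantized with a range that was fixed before it was seen, and its own minimum and maximum only influence the range used on the next iteration. The exponential-moving-average update, the class and function names, and the flat-list tensor format are assumptions for illustration; the highlight only states that ranges estimated on previous iterations are used.

```python
def quantize(values, lo, hi, num_bits=8):
    """Clip values to [lo, hi] and snap them to a uniform grid of 2**num_bits levels."""
    steps = 2 ** num_bits - 1
    scale = (hi - lo) / steps if hi > lo else 1.0
    quantized = []
    for v in values:
        clipped = min(max(v, lo), hi)
        quantized.append(lo + round((clipped - lo) / scale) * scale)
    return quantized


class InHindsightQuantizer:
    """Quantizes with the stored range, then updates the range for next time."""

    def __init__(self, init_lo, init_hi, momentum=0.9, num_bits=8):
        self.lo, self.hi = init_lo, init_hi
        self.momentum = momentum  # assumed EMA factor, not taken from the paper
        self.num_bits = num_bits

    def __call__(self, values):
        # Use the range carried over from earlier iterations.
        quantized = quantize(values, self.lo, self.hi, self.num_bits)
        # Only afterwards fold the current statistics into the stored range.
        m = self.momentum
        self.lo = m * self.lo + (1 - m) * min(values)
        self.hi = m * self.hi + (1 - m) * max(values)
        return quantized
```

Because the range is known before the tensor arrives, quantization does not have to wait for a min/max pass over the current values, in contrast to dynamic quantization, where the range depends on the tensor being quantized.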
85, TITLE: Learning High-Dimensional Distributions with Latent Neural Fokker-Planck Kernels
AUTHORS: Yufan Zhou ; Changyou Chen ; Jinhui Xu
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this paper, we introduce new techniques to formulate the problem as solving Fokker-Planck equation in a lower-dimensional latent space, aiming to mitigate challenges in high-dimensional data space.
86, TITLE: HamNet: Conformation-Guided Molecular Representation with Hamiltonian Neural Networks
AUTHORS: Ziyao Li ; Shuwen Yang ; Guojie Song ; Lingsheng Cai
CATEGORY: cs.LG [cs.LG, cs.CV, physics.chem-ph]
HIGHLIGHT: In this paper, we propose a novel molecular representation algorithm which preserves 3D conformations of molecules with a Molecular Hamiltonian Network (HamNet).
87, TITLE: Optimization of Graph Neural Networks: Implicit Acceleration By Skip Connections and More Depth
AUTHORS: Keyulu Xu ; Mozhi Zhang ; Stefanie Jegelka ; Kenji Kawaguchi
CATEGORY: cs.LG [cs.LG, cs.CV, math.OC, stat.ML]
HIGHLIGHT: We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs.
88, TITLE: De-homogenization Using Convolutional Neural Networks
AUTHORS: Martin O. Elingaard ; Niels Aage ; J. Andreas Bærentzen ; Ole Sigmund
CATEGORY: cs.LG [cs.LG, cs.CV, J.6; I.4.9; I.2.6]
HIGHLIGHT: This paper presents a deep learning-based de-homogenization method for structural compliance minimization.
89, TITLE: Pareto-Optimal Quantized ResNet Is Mostly 4-bit
AUTHORS: Amirali Abdolrashidi et al.
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this work, we use ResNet as a case study to systematically investigate the effects of quantization on inference compute cost-quality tradeoff curves.
90, TITLE: Learning Image Attacks Toward Vision Guided Autonomous Vehicles
AUTHORS: Hyung-Jin Yoon ; Hamid Jafarnejad Sani ; Petros Voulgaris
CATEGORY: cs.RO [cs.RO, cs.CR, cs.CV, cs.LG]
HIGHLIGHT: This paper presents an online adversarial machine learning framework that can effectively misguide autonomous vehicles' missions.
91, TITLE: Chameleon: A Semi-AutoML Framework Targeting Quick and Scalable Development and Deployment of Production-ready ML Systems for SMEs
AUTHORS: Johannes Otterbach ; Thomas Wollmann
CATEGORY: cs.SE [cs.SE, cs.CV, cs.LG]
HIGHLIGHT: Subsequently, we present how one can use a templatable framework in order to automate the experiment iteration cycle, as well as close the gap between development and deployment.
92, TITLE: A Framework for The Automation of Testing Computer Vision Systems
AUTHORS: Franz Wotawa ; Lorenz Klampfl ; Ledio Jahaj
CATEGORY: cs.SE [cs.SE, cs.CV, 68N30]
HIGHLIGHT: In this paper, we contribute to the area of testing vision software, and present a framework for the automated generation of tests for systems based on vision and image recognition.
93, TITLE: T-EMDE: Sketching-based Global Similarity for Cross-modal Retrieval
AUTHORS: Barbara Rychalska ; Mikolaj Wieczorek ; Jacek Dabrowski
CATEGORY: stat.ML [stat.ML, cs.CV, cs.LG, I.5.1; I.5.4; I.2.7; I.2.10; H.3.3]
HIGHLIGHT: With T-EMDE we introduce a trainable version of EMDE which allows full end-to-end training.
94, TITLE: Automatic Segmentation of Vertebral Features on Ultrasound Spine Images Using Stacked Hourglass Network
AUTHORS: Hong-Ye Zeng et al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We propose an automatic segmentation method based on Stacked Hourglass Network (SHN) to detect the spinous processes (SP) on ultrasound (US) spine images and to measure the SPAs of clinical scoliotic subjects.
95, TITLE: DiagSet: A Dataset for Prostate Cancer Histopathological Image Classification
AUTHORS: Michał Koziarski et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper we introduce a novel histopathological dataset for prostate cancer detection.
96, TITLE: Improved Simultaneous Multi-Slice Functional MRI Using Self-supervised Deep Learning
AUTHORS: Omer Burak Demirel et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG, eess.SP, physics.med-ph]
HIGHLIGHT: In this work, we extend self-supervised DL reconstruction to SMS imaging.
97, TITLE: Weakly Supervised Pan-cancer Segmentation Tool
AUTHORS: Marvin Lerousseau et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we propose a novel weakly supervised multi-instance learning approach that deciphers quantitative slide-level annotations which are fast to obtain and regularly present in clinical routine.
98, TITLE: Coconut Trees Detection and Segmentation in Aerial Imagery Using Mask Region-based Convolution Neural Network
AUTHORS: Muhammad Shakaib Iqbal ; Hazrat Ali ; Son N. Tran ; Talha Iqbal
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this article, a deep learning approach is presented for the detection and segmentation of coconut trees in aerial imagery provided through the AI competition organized by the World Bank in collaboration with OpenAerialMap and WeRobotics.
99, TITLE: Lightweight Image Super-Resolution with Hierarchical and Differentiable Neural Architecture Search
AUTHORS: Han Huang et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose a novel differentiable Neural Architecture Search (NAS) approach on both the cell-level and network-level to search for lightweight SISR models.
100, TITLE: Acute Lymphoblastic Leukemia Detection from Microscopic Images Using Weighted Ensemble of Convolutional Neural Networks
AUTHORS: Chayan Mondal et al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Since the proposed kappa value-based weighted ensemble yields a better result for the task addressed in this article, it can be tried in other domains of medical diagnostic applications.