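1. Introduction
2. Human keypoint detection datasets
3. Target construction for keypoint detection
4. Single-person 2D keypoint detection algorithms
5. Multi-person 2D keypoint detection algorithms
6. 3D keypoint detection algorithms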
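Keypoint detection covers facial keypoints, human body keypoints, and keypoints of specific object categories (such as hand bones). Human skeleton keypoint detection is one of the most popular and most difficult of these, and it is very widely applied, with promising prospects in autonomous driving. This article therefore focuses on human keypoint detection.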
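Human skeleton keypoint detection underlies many computer vision tasks, such as pose estimation, action recognition, human-computer interaction, virtual reality, smart homes, and autonomous driving. The human body is flexible and takes on a wide variety of poses, and a change in any body part produces a new pose. Keypoint visibility is strongly affected by pose, clothing, and viewpoint, as well as by environmental factors such as occlusion and lighting. Together these make human skeleton keypoint detection a highly challenging problem in computer vision. This article covers the topics listed in the outline above.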
LSP: http://sam.johnson.io/research/lsp.html
FLIC: https://bensapp.github.io/flic-dataset.html
MPII: http://human-pose.mpi-inf.mpg.de/
MSCOCO: http://cocodataset.org/#download
AI Challenger: https://challenger.ai/competition/keypoint/subject
PoseTrack: https://www.posetrack.net/users/download.php
3D datasets
Human3.6M: http://vision.imar.ro/human3.6m/description.php
HumanEva: http://humaneva.is.tue.mpg.de/
Total Capture: https://github.com/CMU-Perceptual-Computing-Lab/panoptic-toolbox, http://domedb.perception.cs.cmu.edu/dataset.html
JTA Dataset: http://aimagelab.ing.unimore.it/jta, https://github.com/fabbrimatteo/JTA-Dataset
MPI-INF-3DHP: http://gvv.mpi-inf.mpg.de/3dhp-dataset/
SURREAL: https://www.di.ens.fr/willow/research/surreal/data/
UP-3D: http://files.is.tuebingen.mpg.de/classner/up/
DensePose COCO: https://github.com/facebookresearch/DensePose, https://www.aiuai.cn/aifarm278.html, http://densepose.org/#dataset
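1) Coordinate
With the Coordinate target, the keypoint coordinates themselves are what the network regresses, so the network output directly gives the position of each keypoint.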
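As a rough illustration of a coordinate target, the sketch below normalizes pixel coordinates by the image size and scores a prediction with a mean squared error. The function names, the normalization to [0, 1], and the choice of loss are assumptions made for this example; they are not taken from any particular paper.

```python
def encode_coordinates(keypoints, width, height):
    """Turn pixel (x, y) keypoints into normalized regression targets."""
    return [(x / width, y / height) for x, y in keypoints]


def decode_coordinates(targets, width, height):
    """Map normalized network outputs back to pixel coordinates."""
    return [(u * width, v * height) for u, v in targets]


def l2_loss(predicted, target):
    """Mean squared distance between predicted and target keypoints."""
    total = 0.0
    for (px, py), (tx, ty) in zip(predicted, target):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / len(target)


# Hypothetical example: two keypoints in a 128 x 64 image.
targets = encode_coordinates([(64, 32), (32, 16)], width=128, height=64)
pixels = decode_coordinates(targets, width=128, height=64)
```

Because the output is the coordinate itself, no post-processing is needed beyond undoing the normalization.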
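2) Heatmap
With the Heatmap target, each keypoint class is represented by a probability map that assigns every pixel in the image a probability of belonging to that keypoint class. Naturally, pixels closer to the keypoint have probabilities closer to 1, and pixels farther away have probabilities closer to 0. This falloff can be modeled with a suitable function, such as a 2D Gaussian. When a pixel lies at different distances from several keypoints, and therefore gets a different probability with respect to each of them, the values can be combined by taking the max or the average.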
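The heatmap construction described above can be sketched in plain Python as follows. The unnormalized Gaussian (peak value 1 at the keypoint), the default sigma, and the function names are assumptions chosen for illustration.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma=2.0):
    """Build a height x width map that peaks at 1.0 on the keypoint (cx, cy)."""
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def merge_heatmaps(heatmaps, mode="max"):
    """Combine several maps of the same keypoint class pixel by pixel."""
    if mode == "max":
        combine = max
    else:  # any other mode falls back to the average
        combine = lambda values: sum(values) / len(values)
    height, width = len(heatmaps[0]), len(heatmaps[0][0])
    return [
        [combine([hm[y][x] for hm in heatmaps]) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical example: two keypoints of the same class in a 9 x 7 map.
left = gaussian_heatmap(9, 7, cx=2, cy=3)
right = gaussian_heatmap(9, 7, cx=6, cy=3)
merged = merge_heatmaps([left, right], mode="max")
```

Taking the max keeps each peak at 1, while averaging lowers the peaks when the keypoints are far apart.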
Differences between the two kinds of ground truth:
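3) Heatmap + Offsets
Heatmap + Offsets was proposed by Google at CVPR 2017. Unlike a plain heatmap, Google's heatmap assigns a probability of 1 to every pixel within a certain distance of the target keypoint. In addition to the heatmap, offsets are used to encode the displacement between each pixel within that range and the target keypoint.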
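A minimal sketch of this target follows, assuming a circular region of a given radius and offsets stored as (dx, dy) vectors pointing from the pixel to the keypoint. The function names and the data layout are illustrative assumptions, not the original implementation.

```python
def disk_heatmap_and_offsets(width, height, cx, cy, radius):
    """Return a binary heatmap and a per-pixel offset field for one keypoint."""
    heatmap = [[0] * width for _ in range(height)]
    offsets = [[(0.0, 0.0)] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Pixels inside the disk get probability 1 and a vector
            # pointing from the pixel to the keypoint.
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                heatmap[y][x] = 1
                offsets[y][x] = (cx - x, cy - y)
    return heatmap, offsets


def refine_position(x, y, offsets):
    """Move a pixel position to the keypoint location it votes for."""
    dx, dy = offsets[y][x]
    return (x + dx, y + dy)


# Hypothetical example: keypoint at (4.0, 3.0) with a radius of 2 pixels.
heatmap, offsets = disk_heatmap_and_offsets(9, 7, cx=4.0, cy=3.0, radius=2)
```

Every pixel inside the disk recovers the same keypoint position after adding its offset, which is what lets the offsets compensate for the coarse binary heatmap.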
1. DeepPose: Human Pose Estimation via Deep Neural Networks (CVPR 2014)
2. Efficient Object Localization Using Convolutional Networks (CVPR 2015)
3. Convolutional Pose Machines (2016)
4. Learning Feature Pyramids for Human Pose Estimation (ICCV 2017)
5. Stacked Hourglass Networks for Human Pose Estimation (2017)
6. Multi-Context Attention for Human Pose Estimation (2018)
7. A Cascaded Inception of Inception Network with Attention Modulated Feature Fusion for Human Pose Estimation (2018)
8. Deeply Learned Compositional Models for Human Pose Estimation (ECCV 2018)
9. Human Pose Estimation with Spatial Contextual Information (2019)
10. Cascade Feature Aggregation for Human Pose Estimation (2019)
11. Toward Fast and Accurate Human Pose Estimation via Soft-Gated Skip Connections (2020)
Multi-person keypoint detection methods fall into two groups, top-down and bottom-up:
Top-down:
1. RMPE: Regional Multi-Person Pose Estimation (2018)
2. Cascaded Pyramid Network for Multi-Person Pose Estimation (CPN) (2018)
3. Rethinking on Multi-Stage Networks for Human Pose Estimation (2019)
4. Spatial Shortcut Network for Human Pose Estimation (2019)
5. Deep High-Resolution Representation Learning for Human Pose Estimation (CVPR 2019)
Bottom-up:
1. OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019)
2. Single-Network Whole-Body Pose Estimation (ICCV 2019)
1. Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose (2017)
2. A Simple Yet Effective Baseline for 3D Human Pose Estimation (ICCV 2017)
3. RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation (CVPR 2019)
4. Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network (CVPR 2019)
5. Learnable Triangulation of Human Pose (ICCV 2019 oral)
6. Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation (CVPR 2019)
7. 3D Human Pose Estimation in Video with Temporal Convolutions and Semi-Supervised Training (CVPR 2019)
8. Semantic Graph Convolutional Networks for 3D Human Pose Regression (CVPR 2019)
9. Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks (ICCV 2019)
10. 3D Human Pose Estimation using Spatio-Temporal Networks with Explicit Occlusion Training (AAAI 2020)
11. Motion Guided 3D Pose Estimation from Videos (2020)
12. XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera (2020)
13. VIBE: Video Inference for Human Body Pose and Shape Estimation (CVPR 2020)