
Department of Electrical Engineering
Control Robotics and Machine Learning Lab
Technion - Israel Institute of Technology

Unsupervised domain adaptation for human movement

Background
Generative Adversarial Networks (GANs) are a class of neural networks that have gained popularity in the past couple of years, and for good reason. Put simply, they allow a network to learn to generate data with the same internal structure as other data. If that description sounds general, that is because GANs are powerful and flexible tools. To make things concrete, one of the most common applications of GANs is image generation. Say you have a collection of images, such as pictures of cats. A GAN can learn to generate pictures of cats like the real ones used for training, without replicating any individual image.
GANs typically use two neural networks that are trained against each other: G, the generator, and D, the discriminator. D is trained to discriminate between real images and the images generated by G, while G is trained to fool D.
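The adversarial training of G and D described above can be sketched as a single PyTorch training step. This is a minimal toy sketch, not the project's actual models: the 2-D data, the small MLPs, and the `train_step` helper are all assumptions for illustration.

```python
import torch
import torch.nn as nn

# Toy setup (hypothetical): 2-D "real" data, simple MLPs for G and D.
latent_dim, data_dim = 8, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1))

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    batch = real_batch.size(0)
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    z = torch.randn(batch, latent_dim)
    fake = G(z).detach()  # detach so D's step does not update G
    d_loss = (bce(D(real_batch), torch.ones(batch, 1))
              + bce(D(fake), torch.zeros(batch, 1)))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()
    # Generator step: G is trained to fool D, i.e. make D label fakes as real.
    z = torch.randn(batch, latent_dim)
    g_loss = bce(D(G(z)), torch.ones(batch, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```

In a real run this step is repeated over many batches; the alternation between the two optimizers is what makes the training adversarial.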
One of the most promising applications of GANs is unsupervised domain adaptation. Here we want to transfer inputs between two domains, say A and B, using GANs. For example, domain A can be images of horses and B images of zebras. We can then train a GAN to learn the mapping between the domains, even without pairing the images. For this type of problem, the CycleGAN architecture has recently shown promising results.
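The key idea that lets CycleGAN work without paired images is a cycle-consistency loss: mapping A to B and back should reconstruct the original input. A minimal sketch of that loss, assuming two placeholder linear generators on flat vectors (the real models would be convolutional):

```python
import torch
import torch.nn as nn

# Placeholder generators (hypothetical) mapping between domains A and B.
dim = 16
G_AB = nn.Linear(dim, dim)  # A -> B
G_BA = nn.Linear(dim, dim)  # B -> A
l1 = nn.L1Loss()

def cycle_loss(a, b, lam=10.0):
    # Forward cycle: A -> B -> A should reconstruct the A sample, and
    # symmetrically for B. lam weights the cycle term against the
    # adversarial losses, as in the CycleGAN paper.
    return lam * (l1(G_BA(G_AB(a)), a) + l1(G_AB(G_BA(b)), b))
```

This term is added to the usual adversarial losses of the two GANs; without it, nothing ties a generated B image back to its source A image.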
Skeleton tracking: 2D pose estimation is the problem of detecting key-points or "parts" of humans in an image, then coherently combining them into skeletons. Recently, the "OpenPose" library (https://github.com/CMU-Perceptual-Computing-Lab/openpose) was released for public use. This library implements a multi-person 2D pose estimation algorithm based on a deep neural network architecture. OpenPose significantly exceeded previous state-of-the-art results and runs in real time.
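One common way to use OpenPose is to run its demo binary on a directory of frames and parse the JSON it writes, one file per frame, where each detected person carries a flat [x, y, confidence, ...] keypoint list. A sketch, assuming the demo's `--image_dir`/`--write_json` flags and the JSON key names of recent OpenPose versions (the binary path is a placeholder; check the flags against the version you build):

```python
import json
import subprocess
from pathlib import Path

def run_openpose(image_dir, out_dir,
                 openpose_bin="./build/examples/openpose/openpose.bin"):
    # Assumed CLI flags from the OpenPose demo; verify against your install.
    subprocess.run([openpose_bin,
                    "--image_dir", str(image_dir),
                    "--write_json", str(out_dir),
                    "--display", "0",
                    "--render_pose", "0"],
                   check=True)

def load_skeletons(json_path):
    # OpenPose writes one JSON file per frame; each person's keypoints are a
    # flat [x0, y0, c0, x1, y1, c1, ...] list (c = detection confidence).
    data = json.loads(Path(json_path).read_text())
    skeletons = []
    for person in data.get("people", []):
        flat = person["pose_keypoints_2d"]
        skeletons.append([(flat[i], flat[i + 1], flat[i + 2])
                          for i in range(0, len(flat), 3)])
    return skeletons
```

The parsed (x, y, confidence) triples can then be rasterized into skeleton images for the A' and B' domains.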
Project Goal
Given an input video, the goal is to alter the body movements of the people in it to any other desired movement pattern (dancing, jumping, running, etc.). To that end, we will use two state-of-the-art machine-learning-based methods, namely human skeleton estimation and generative adversarial networks.
Project Steps
- Choose video data sets (denote them as domains A, B).
- Use OpenPose to extract the skeletons from both videos (denote these A', B').
- Train a CycleGAN/pix2pix model to map images from A to A' (G_AA', G_A'A) and from B to B' (G_BB', G_B'B).
- Given a video in A, map it as G_B'B(G_AA'(A)).
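The final inference step above chains the trained generators: extract a skeleton from each domain-A frame, then re-render it in domain B. Since skeletons carry no appearance, the A' output can be fed to the B-side generator. A sketch with placeholder convolutional generators standing in for the trained models (all names here are assumptions):

```python
import torch
import torch.nn as nn

# Placeholders (hypothetical) for trained generators:
#   G_AAp : domain-A video frame  -> skeleton image A'
#   G_BpB : skeleton image B'     -> domain-B video frame
channels = 3
G_AAp = nn.Conv2d(channels, channels, 3, padding=1)
G_BpB = nn.Conv2d(channels, channels, 3, padding=1)

def transfer_movement(frames_a):
    # frames_a: (num_frames, C, H, W) tensor of domain-A video frames.
    with torch.no_grad():
        skeletons = G_AAp(frames_a)   # step 1: extract skeleton representation
        return G_BpB(skeletons)       # step 2: render the skeleton in domain B
```

Run frame by frame over the input video, this yields a domain-B video performing the movements of the domain-A input.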
Required knowledge
- Prior knowledge of machine learning or neural networks is an advantage.
Links
GANs:
CycleGAN:
Skeleton Tracking:
Environment
- PyTorch/TensorFlow (Python).

