Skeleton key points

Introduction

The Chameleon simulator can record key animation points for each animatable character in the scene. Keypoint tracking allows trackable points to be defined when actors are created, so that the screen-space locations of those points can be tracked during the simulation.

Trackable points can be defined for any vertex of the skinned mesh, any of the skeleton transforms of the actor model, and the transforms of any game objects attached to the model for explicit tracking purposes.

Tracked points' screen space coordinates are written out to an annotations file.

The key points are the 2D screen-space projections of the transform positions of each bone in the animated skeleton. Rotations are not recorded.

Basic concepts

In the 3D model, each bone in an animated skeleton is specified by the position of its origin plus its rotation. Bones are stored in a hierarchy starting at a root bone, usually the hip, with the position and rotation of nested bones stored as local transforms relative to their immediate parent.

Pose estimation networks attempt to extract pose from 2D images without the context of a hierarchy, so the stored ground-truth pose information is in screen space, with the relationships between key points stored explicitly to form a node/edge graph. For points on a surface, such as face keypoints, no hierarchy exists, so the points and their mesh relationships are always stored explicitly.

In Chameleon, the connection relationships (if any) of both bones and keypoints are stored explicitly when the model is imported, so that bones and points can be combined into the same skeleton and so that multiple skeletons can be stored for each actor.

Selection of skeletons in the Scenario Editor

In the Scenario Editor, the actor configuration dialog exposes the available skeletons and allows the user to choose which ones will be tracked. Only one skeleton, or combination of skeletons, may be set for a specific category/subcategory.

Skeletons supported in this version

Chameleon will support the following skeletons:

Name     Purpose                                  Notes
minimal  head and foot tracking                   for top-down view
basic    general activity/gait/gesture tracking   Compatible with Google posenet
hands    detailed gesture tracking
face     emotion and gaze tracking

The skeletons are set in the simulation template; you can find out more about the simulation template and these settings here.

Minimal Skeleton

The minimal skeleton shows the keypoints of the top of the head and the bottom of the feet.
Minimal Skeleton.png

"minimal":
[
	"L_Foot, Head_Top",
	"R_Foot, Head_Top",
	"L_Foot, R_Foot"
]
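Each entry in a skeleton list is a comma-separated name pair; per the actor_skeleton definition later in this document, an empty second element marks a keypoint with no connection. As a minimal illustrative sketch (the parsing code below is not the simulator's own, just an assumption about how the strings would be consumed):

```python
def parse_skeleton(pairs):
    """Split each "A, B" entry into a (start, end) tuple.

    An entry with an empty second element (e.g. "L_Eye,") marks an
    unconnected keypoint and is returned as (name, None).
    """
    edges = []
    for entry in pairs:
        first, _, second = (part.strip() for part in entry.partition(","))
        edges.append((first, second or None))
    return edges

minimal = ["L_Foot, Head_Top", "R_Foot, Head_Top", "L_Foot, R_Foot"]
print(parse_skeleton(minimal))
# → [('L_Foot', 'Head_Top'), ('R_Foot', 'Head_Top'), ('L_Foot', 'R_Foot')]
```

Stripping whitespace around each name keeps the parser tolerant of the varied spacing used in the listings below.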

Basic Skeleton

The basic skeleton gives more joint keypoints and basic facial keypoints (eyes and nose).
Basic Skeleton.png

"basic":
[
	"DLib_34, ",
	"L_Eye, ",
	"R_Eye, ",
	"DLib_2, ",
	"DLib_16, ",
	"L_UpperArm, L_ForeArm",
	"R_UpperArm, R_ForeArm",
	"L_ForeArm, L_Hand",
	"R_ForeArm, R_Hand",
	"L_Thigh, L_UpperArm",
	"R_Thigh, R_UpperArm",
	"L_Calf, L_Thigh",
	"R_Calf, R_Thigh",
	"L_Ankle, L_Calf",
	"R_Ankle, R_Calf",
	"L_Foot, L_Ankle",
	"R_Foot, R_Ankle"
]

Hands Skeleton

The hand skeleton creates keypoints for every joint in the hand.
Hands-keypoints.png

"hands":
[
	"L_Hand,",
	"L_Thumb_1,   L_Hand ",
	"L_Thumb_2,   L_Thumb_1 ",
	"L_Thumb_3,   L_Thumb_2 ",
	"L_Thumb_Tip, L_Thumb_3 ",
	"L_Index_1,   L_Hand ",
	"L_Index_2,   L_Index_1 ",
	"L_Index_3,   L_Index_2 ",
	"L_Index_Tip, L_Index_3 ",
	"L_Middle_1,   L_Hand ",
	"L_Middle_2,   L_Middle_1 ",
	"L_Middle_3,   L_Middle_2 ",
	"L_Middle_Tip, L_Middle_3 ",
	"L_Ring_1,   L_Hand ",
	"L_Ring_2,   L_Ring_1 ",
	"L_Ring_3,   L_Ring_2 ",
	"L_Ring_Tip, L_Ring_3 ",
	"L_Little_1,   L_Hand ",
	"L_Little_2,   L_Little_1 ",
	"L_Little_3,   L_Little_2 ",
	"L_Little_Tip, L_Little_3 ",
	"R_Hand,",
	"R_Thumb_1,    R_Hand ",
	"R_Thumb_2,    R_Thumb_1 ",
	"R_Thumb_3,    R_Thumb_2 ",
	"R_Thumb_Tip,  R_Thumb_3 ",
	"R_Index_1,    R_Hand ",
	"R_Index_2,    R_Index_1 ",
	"R_Index_3,    R_Index_2 ",
	"R_Index_Tip,  R_Index_3 ",
	"R_Middle_1,   R_Hand ",
	"R_Middle_2,   R_Middle_1 ",
	"R_Middle_3,   R_Middle_2 ",
	"R_Middle_Tip, R_Middle_3 ",
	"R_Ring_1,     R_Hand ",
	"R_Ring_2,     R_Ring_1 ",
	"R_Ring_3,     R_Ring_2 ",
	"R_Ring_Tip,   R_Ring_3 ",
	"R_Little_1,   R_Hand ",
	"R_Little_2,   R_Little_1 ",
	"R_Little_3,   R_Little_2 ",
	"R_Little_Tip, R_Little_3 "
]

Face Skeleton

The face skeleton uses the DLib 68-point standard, which uses 68 points to map the jaw line, mouth, nose, eyes and eyebrows.
Face-keypoints.png

"face":
[
	"L_Eye,",
	"R_Eye, ",
	"DLib_1 , ",
	"DLib_2 , ",
	"DLib_3 , ",
	"DLib_4 , ",
	"DLib_5 , ",
	"DLib_6 , ",
	"DLib_7 , ",
	"DLib_8 , ",
	"DLib_9 , ",
	"DLib_10, ",
	"DLib_11, ",
	"DLib_12, ",
	"DLib_13, ",
	"DLib_14, ",
	"DLib_15, ",
	"DLib_16, ",
	"DLib_17, ",
	"DLib_18, ",
	"DLib_19, ",
	"DLib_20, ",
	"DLib_21, ",
	"DLib_22, ",
	"DLib_23, ",
	"DLib_24, ",
	"DLib_25, ",
	"DLib_26, ",
	"DLib_27, ",
	"DLib_28, ",
	"DLib_29, ",
	"DLib_30, ",
	"DLib_31, ",
	"DLib_32, ",
	"DLib_33, ",
	"DLib_34, ",
	"DLib_35, ",
	"DLib_36, ",
	"DLib_37, ",
	"DLib_38, ",
	"DLib_39, ",
	"DLib_40, ",
	"DLib_41, ",
	"DLib_42, ",
	"DLib_43, ",
	"DLib_44, ",
	"DLib_45, ",
	"DLib_46, ",
	"DLib_47, ",
	"DLib_48, ",
	"DLib_49, ",
	"DLib_50, ",
	"DLib_51, ",
	"DLib_52, ",
	"DLib_53, ",
	"DLib_54, ",
	"DLib_55, ",
	"DLib_56, ",
	"DLib_57, ",
	"DLib_58, ",
	"DLib_59, ",
	"DLib_60, ",
	"DLib_61, ",
	"DLib_62, ",
	"DLib_63, ",
	"DLib_64, ",
	"DLib_65, ",
	"DLib_66, ",
	"DLib_67, ",
	"DLib_68, "
]

Scenario File support for skeletons

The actor reference in the scenario file will have a new field consisting of a list of strings containing the names of the activated skeletons.

Sim Script support for skeletons

The user can choose to output skeleton annotations on a per-category/subcategory basis in the sim script editor. You can find out more about the simulation template settings here.

Key Point Recording

On initialisation, the activated skeletons defined in the scenario file are combined into a single output skeleton. While capture is active, the simulator evaluates the pose information on each update and returns the screen-space and visibility information, plus the InstanceID, to the camera cluster, which builds the keypoint entry for each actor and inserts it into the KeyPointAnnotations file for that image capture.

Common Objects in Context (COCO)

Information is available from https://cocodataset.org/#format-data; only the section on Keypoint Detection is relevant.

File Format

Key points for any image can be found by using the Key Point Annotations file alone; there is no need to access the annotations file. If there is a need to cross-reference the key points with information in the annotations file, the exact line reference can be found from the image filename and the unique ID, which are the keypoints dictionary key and keyPoints[key].id respectively.

Key points for each object in an image are stored as a simple array of numeric values. The information on how to interpret these can be found by concatenating the keyPoints.category and keyPoints.subCategory integer values to produce a unique integer that can be used as the key to the categories dictionary, where the node names and edge graphs are stored for each category.

COCO vs Mindtech classification

Mindtech divides tracked objects top-down into categories that then have subCategories. Each category has a unique number, and each subCategory has a number that is unique within that category. COCO divides objects bottom-up, so that each category is given a unique number and categories are then grouped into named supercategories that are not numbered. In order to mimic the single-number categories of COCO while staying compatible with the top-down categorisation, the key point annotation scheme forms a single id for each subCategory by concatenating ten times the category number with the subCategory number, so that category 1, subCategory 9 has the unique id 109 in the categoryMap. Software finds it by multiplying keyPoints.category by 10 and concatenating the result with the keyPoints.subCategory field.
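The concatenation rule above can be sketched in a few lines; the helper name is hypothetical, but the arithmetic follows the scheme as described:

```python
def category_key(category, sub_category):
    """Form the single categoryMap id by concatenating ten times the
    category number with the subCategory number."""
    return int(str(category * 10) + str(sub_category))

print(category_key(1, 9))  # → 109
print(category_key(1, 1))  # → 101, matching the "101" key in the example file
```

Because the id is built by string concatenation rather than pure arithmetic, a two-digit subCategory still produces an unambiguous key (e.g. category 1, subCategory 12 gives 1012, not 112).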

The Key Point Annotations file is accessed by the class:

KeyPointAnnotations

public class KeyPointAnnotations

Field name   Type                           Editable   Notes
keypoints    Dictionary<string, List>                  A dictionary whose key is the image filename and whose value is a list of keyPoints structs.
categories   Dictionary<int, categoryMap>              A dictionary where the key is an integer representing the annotation category of the object.

Inherited Fields

Field name      Type     Editable   Notes
scriptVersion   string   n          Specifies the version number of the script itself.
editorVersion   string   n          Specifies the version number of the specification used to generate this file.
status          string   n          The status of the version used to generate this file.
date            string   y          The date the file was generated.
company         string   y          Company name.
url             string   y          URL of the company website.
copyright       string   y          Copyright notice.
notes           string   y          Notes relating to the file.

Custom Types

actor_skeleton

public struct actor_skeleton

Field name    Type   Notes
keypoints     List   Array of keypoint names. These names correspond to the named keypoints selected to be part of this skeleton.
connections   List   Array of keypoint name pairs; this list is optional. If a keypoint has no connections, it can be omitted or stored with an empty string as the second element of the pair.

keyPoints

public struct keyPoints

Field name    Type   Editable   Notes
id            int               Unique id of the object; matches the id stored in the annotations file and the mask image.
points        List              A length-3k array, where k is the total number of keypoints defined for the category. Each keypoint has a 0-indexed location x,y and a visibility flag v, defined as v=0: not labeled (in which case x=y=0), v=1: labeled but not visible, and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment.
num_points    int               Indicates the number of labeled keypoints (v>0) for a given object.
category      int               Category id of the object; matches the category id stored in the annotations file.
subCategory   int               subCategory id of the object; matches the subCategory id stored in the annotations file.
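The flat points list can be regrouped into (x, y, v) triplets, and num_points recomputed as a consistency check. A minimal sketch (the function names are illustrative, not part of the file format):

```python
def decode_points(points):
    """Regroup the flat length-3k points list into (x, y, v) triplets."""
    assert len(points) % 3 == 0, "points must hold complete x,y,v triplets"
    return [tuple(points[i:i + 3]) for i in range(0, len(points), 3)]

def count_labeled(points):
    """num_points is the count of keypoints with visibility v > 0."""
    return sum(1 for _x, _y, v in decode_points(points) if v > 0)

flat = [12.0, 34.0, 2.0,   0.0, 0.0, 0.0,   56.0, 78.0, 1.0]
print(decode_points(flat))  # → [(12.0, 34.0, 2.0), (0.0, 0.0, 0.0), (56.0, 78.0, 1.0)]
print(count_labeled(flat))  # → 2
```

Comparing count_labeled(points) against the stored num_points field is a cheap way to validate an entry after parsing.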

categoryMap

public struct categoryMap

Field name      Type   Editable   Notes
keypointNames   List              A length-k array of keypoint names, where k is the total number of keypoints defined for the category. The order of names establishes the order to read the keyPoints.points list when mapping the points to the skeleton.
skeleton        List              A list of comma-separated string pairs defining the skeleton connectivity.

Output Examples

Example KeyPointAnnotations File

{
	"header":
	{
		"version":0.1,
		"status":"draft",
		"date":"03:10:2021",
		"company":"Mindtech",
		"url":"www.mindtech.global",
		"copyright":"Mindtech Global 2021",
		"notes":"Example keypoints"
	},
   "keypoints":
   {
	   "example_image":
	   [
		   {
			   "id":1,
			   "points":
			   [
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0
			   ],
			   "num_points":68,
			   "category":1,
			   "subCategory":1
		   }
	   ]
   },
   "categories":
   {
	   "101":
	   {
		   "keypointNames":
		   [
			"1","2","3","4","5","6","7","8","9","10","11","12",
			"13","14","15","16","17","18","19","20","21","22",
			"23","24","25","26","27","28","29","30","31","32",
			"33","34","35","36","37","38","39","40","41","42",
			"43","44","45","46","47","48","49","50","51","52",
			"53","54","55","56","57","58","59","60","61","62",
			"63","64","65","66","67","68"
		   ],
		   "skeleton":
		   [
			"1,2","2,3","3,4","4,5","5,6","6,7","7,8","8,9",
			"9,10","10,11","11,12","12,13","13,14","14,15",
			"15,16","16,17","18,18","19,18","20,18","21,18",
			"23,18","24,18","25,18","26,18","28,18","29,18",
			"30,18","32,18","33,18","34,18","35,18","37,18",
			"38,18","39,18","40,18","41,18","43,18","44,18",
			"45,18","46,18","47,18","49,18","50,18","51,18",
			"52,18","53,18","54,18","55,18","56,18","57,18",
			"58,18","59,18","61,18","62,18","63,18","64,18",
			"65,18","66,18","67,18"
		   ]
	   }
   }
}
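Putting the pieces together: once a KeyPointAnnotations file has been parsed (e.g. with json.load), the keypointNames order in the categoryMap maps the flat points list onto named keypoints. The helper below is a sketch under that assumption; the function name is hypothetical:

```python
def named_keypoints(annotations, image, index=0):
    """Map each keypoint name of one object onto its (x, y, v) triplet.

    'annotations' is a KeyPointAnnotations file already parsed into a
    dict. The categoryMap key is rebuilt by concatenating ten times the
    category number with the subCategory number, as described above.
    """
    obj = annotations["keypoints"][image][index]
    key = str(obj["category"] * 10) + str(obj["subCategory"])
    names = annotations["categories"][key]["keypointNames"]
    pts = obj["points"]
    # keypointNames order defines how the flat points list is read
    return {name: tuple(pts[3 * i:3 * i + 3]) for i, name in enumerate(names)}
```

With the example file above, named_keypoints(data, "example_image") would return a 68-entry dictionary keyed "1" to "68", each name mapped to a (0.0, 0.0, 1.0) triplet.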

Example Skeleton File

{
	 "version" : 2020.1,
	 "status": "ISSUED",
	 "date": "2020:7:19",
	 "company": "Mindtech",
	 "url": "https://www.mindtech.global",
	 "copyright": "Copyright Mindtech Global Ltd 2020",
	 "notes": "This is an example skeleton file",
	  "bones":
	  {
	       "CC_Base_BoneRoot": 
	       {
		       "hip_bone" : "CC_Base_Hip",
		       "left_toe_bone" : "CC_Base_ToeBase_L",
		       "head_center" : "CC_Base_Head"
		   }
	  },
	  "points": 
	  {
		   "CC_Base_Body": 
		   {
			"eye_1" : 1,
			"eye_2" : 2,
			"eye_3": 3
		   },
		   "CC_Base_Eye":
		  {
			"right_eyeball" : 1024,
			"left_eyeball" : 3024
		   }
	   },
	   "skeletons": 
	   {
		  "basic":
	      [
			"hip_bone, elbow_bone",
			"left_toe_bone, hip_bone",
			"head_center, right_eyeball",
			"head_center, left_eyeball"
		  ],
		  "face":
		   [
			"eye_1, eye_2",
			"eye_2, eye_3"
		   ]
	   }
}