Skeleton key points

Introduction

The Chameleon simulator can record key animation points for each animatable character in the scene. Keypoint tracking allows trackable points to be defined when actors are created, so that the screen-space locations of those points can be tracked during the simulation.

Trackable points can be defined for any vertex of the skinned mesh, any of the skeleton transforms of the actor model, and the transforms of any game objects attached to the model for explicit tracking purposes.

Tracked points' screen space coordinates are written out to an annotations file.

The key points are the 2D screen-space projections of the transform positions of each bone in the animated skeleton. Rotations are not recorded.

Basic concepts

In the 3D model, each bone in an animated skeleton is specified by the position of its origin plus its rotation. Bones are stored in a hierarchy starting at a root bone, usually the hip, with the position and rotation of nested bones stored as local transforms relative to their immediate parent.

Pose estimation networks attempt to extract pose from 2D images without the context of a hierarchy, so the stored ground-truth pose information is in screen space, with the relationships between key points stored explicitly to form a node/edge graph. For points on a surface, such as face keypoints, no hierarchy exists, so the points and their mesh relationships are always stored explicitly.

In Chameleon, the connection relationships (if any) of both bones and keypoints are stored explicitly when the model is imported, so that bones and points can be combined into the same skeleton and so that multiple skeletons can be stored for each actor.

Selection of skeletons in the Scenario Editor

In the Scenario Editor, the actor configuration dialog exposes the available skeletons and allows the user to choose which ones will be tracked. Only one skeleton, or combination of skeletons, may be set for a specific category/subcategory.

Skeletons supported in this version

Chameleon will support the following skeletons:

Name     Purpose                                  Notes
minimal  head and foot tracking                   for top-down view
basic    general activity/gait/gesture tracking   Compatible with Google posenet
hands    detailed gesture tracking
face     emotion and gaze tracking

The skeletons are set in the simulation template; you can find out more about the simulation template and these settings here.

Minimal Skeleton

The minimal skeleton shows the keypoints of the top of the head and the bottom of the feet.
Minimal Skeleton.png

"minimal":
[
	"L_Foot, Head_Top",
	"R_Foot, Head_Top",
	"L_Foot, R_Foot"
]
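Each entry in a skeleton list is a comma-separated name pair; per the actor_skeleton definition later in this document, an empty second element marks a keypoint with no connection. As a minimal illustrative sketch (the parsing code below is not the simulator's own, just an assumption about how the strings would be consumed):

```python
def parse_skeleton(pairs):
    """Split each "A, B" entry into a (start, end) tuple.

    An entry with an empty second element (e.g. "L_Eye,") marks an
    unconnected keypoint and is returned as (name, None).
    """
    edges = []
    for entry in pairs:
        first, _, second = (part.strip() for part in entry.partition(","))
        edges.append((first, second or None))
    return edges

minimal = ["L_Foot, Head_Top", "R_Foot, Head_Top", "L_Foot, R_Foot"]
print(parse_skeleton(minimal))
# → [('L_Foot', 'Head_Top'), ('R_Foot', 'Head_Top'), ('L_Foot', 'R_Foot')]
```

Stripping whitespace around each name keeps the parser tolerant of the varied spacing used in the listings below.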

Basic Skeleton

The basic skeleton gives more joint keypoints and basic facial keypoints (eyes and nose).
Basic Skeleton.png

"basic":
[
	"DLib_34, ",
	"L_Eye, ",
	"R_Eye, ",
	"DLib_2, ",
	"DLib_16, ",
	"L_UpperArm, L_ForeArm",
	"R_UpperArm, R_ForeArm",
	"L_ForeArm, L_Hand",
	"R_ForeArm, R_Hand",
	"L_Thigh, L_UpperArm",
	"R_Thigh, R_UpperArm",
	"L_Calf, L_Thigh",
	"R_Calf, R_Thigh",
	"L_Ankle, L_Calf",
	"R_Ankle, R_Calf",
	"L_Foot, L_Ankle",
	"R_Foot, R_Ankle"
]

Hands Skeleton

The hand skeleton creates keypoints for every joint in the hand.
Hands-keypoints.png

"hands":
[
	"L_Hand,",
	"L_Thumb_1,   L_Hand ",
	"L_Thumb_2,   L_Thumb_1 ",
	"L_Thumb_3,   L_Thumb_2 ",
	"L_Thumb_Tip, L_Thumb_3 ",
	"L_Index_1,   L_Hand ",
	"L_Index_2,   L_Index_1 ",
	"L_Index_3,   L_Index_2 ",
	"L_Index_Tip, L_Index_3 ",
	"L_Middle_1,   L_Hand ",
	"L_Middle_2,   L_Middle_1 ",
	"L_Middle_3,   L_Middle_2 ",
	"L_Middle_Tip, L_Middle_3 ",
	"L_Ring_1,   L_Hand ",
	"L_Ring_2,   L_Ring_1 ",
	"L_Ring_3,   L_Ring_2 ",
	"L_Ring_Tip, L_Ring_3 ",
	"L_Little_1,   L_Hand ",
	"L_Little_2,   L_Little_1 ",
	"L_Little_3,   L_Little_2 ",
	"L_Little_Tip, L_Little_3 ",
	"R_Hand,",
	"R_Thumb_1,    R_Hand ",
	"R_Thumb_2,    R_Thumb_1 ",
	"R_Thumb_3,    R_Thumb_2 ",
	"R_Thumb_Tip,  R_Thumb_3 ",
	"R_Index_1,    R_Hand ",
	"R_Index_2,    R_Index_1 ",
	"R_Index_3,    R_Index_2 ",
	"R_Index_Tip,  R_Index_3 ",
	"R_Middle_1,   R_Hand ",
	"R_Middle_2,   R_Middle_1 ",
	"R_Middle_3,   R_Middle_2 ",
	"R_Middle_Tip, R_Middle_3 ",
	"R_Ring_1,     R_Hand ",
	"R_Ring_2,     R_Ring_1 ",
	"R_Ring_3,     R_Ring_2 ",
	"R_Ring_Tip,   R_Ring_3 ",
	"R_Little_1,   R_Hand ",
	"R_Little_2,   R_Little_1 ",
	"R_Little_3,   R_Little_2 ",
	"R_Little_Tip, R_Little_3 "
]

Face Skeleton

The face skeleton uses the DLib 68-point standard, which uses 68 points to map the jaw line, mouth, nose, eyes and eyebrows.
Face-keypoints.png

"face":
[
	"L_Eye,",
	"R_Eye, ",
	"DLib_1 , ",
	"DLib_2 , ",
	"DLib_3 , ",
	"DLib_4 , ",
	"DLib_5 , ",
	"DLib_6 , ",
	"DLib_7 , ",
	"DLib_8 , ",
	"DLib_9 , ",
	"DLib_10, ",
	"DLib_11, ",
	"DLib_12, ",
	"DLib_13, ",
	"DLib_14, ",
	"DLib_15, ",
	"DLib_16, ",
	"DLib_17, ",
	"DLib_18, ",
	"DLib_19, ",
	"DLib_20, ",
	"DLib_21, ",
	"DLib_22, ",
	"DLib_23, ",
	"DLib_24, ",
	"DLib_25, ",
	"DLib_26, ",
	"DLib_27, ",
	"DLib_28, ",
	"DLib_29, ",
	"DLib_30, ",
	"DLib_31, ",
	"DLib_32, ",
	"DLib_33, ",
	"DLib_34, ",
	"DLib_35, ",
	"DLib_36, ",
	"DLib_37, ",
	"DLib_38, ",
	"DLib_39, ",
	"DLib_40, ",
	"DLib_41, ",
	"DLib_42, ",
	"DLib_43, ",
	"DLib_44, ",
	"DLib_45, ",
	"DLib_46, ",
	"DLib_47, ",
	"DLib_48, ",
	"DLib_49, ",
	"DLib_50, ",
	"DLib_51, ",
	"DLib_52, ",
	"DLib_53, ",
	"DLib_54, ",
	"DLib_55, ",
	"DLib_56, ",
	"DLib_57, ",
	"DLib_58, ",
	"DLib_59, ",
	"DLib_60, ",
	"DLib_61, ",
	"DLib_62, ",
	"DLib_63, ",
	"DLib_64, ",
	"DLib_65, ",
	"DLib_66, ",
	"DLib_67, ",
	"DLib_68, "
]

Scenario File support for skeletons

The actor reference in the scenario file will have a new field consisting of a list of strings containing the names of the activated skeletons.

Sim Script support for skeletons

The user can choose to output skeleton annotations on a per-category/subcategory basis in the sim script editor. You can find out more about the simulation template settings here.

Key Point Recording

On initialisation, the activated skeletons defined in the scenario file are combined into a single output skeleton. While capture is active, the simulator evaluates the pose information on each update and returns the screen-space and visibility information, plus the InstanceID, to the camera cluster, which builds the keypoint entry for each actor and inserts it into the KeyPointAnnotations file for that image capture.

Common Objects in Context (COCO)

Information is available from https://cocodataset.org/#format-data; only the section on Keypoint Detection is relevant.

File Format

Key points for any image can be found by using the Key Point Annotations file alone; there is no need to access the annotations file. If there is a need to cross-reference the key points with information in the annotations file, the exact line reference can be found from the image filename and the unique ID, which are the keypoints dictionary key and keyPoints[key].id respectively.

Key points for each object in an image are stored as a simple array of numeric values. The information on how to interpret these can be found by concatenating the keyPoints.category and keyPoints.subCategory integer values to produce a unique integer that can be used as the key to the categories dictionary, where the node names and edge graphs are stored for each category.

COCO vs Mindtech classification

Mindtech divides tracked objects top-down into categories that then have subCategories. Each category has a unique number, and each subCategory has a number that is unique within that category. COCO divides objects bottom-up, so that each category is given a unique number and categories are then grouped into named supercategories that are not numbered. In order to mimic the single-number categories of COCO while staying compatible with the top-down categorisation, the key point annotation scheme forms a single id for each subCategory by concatenating ten times the category number with the subCategory number, so that category 1, subCategory 9 has the unique id 109 in the categoryMap. Software finds it by multiplying keyPoints.category by 10 and concatenating the result with the keyPoints.subCategory field.
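The concatenation rule above can be sketched in a few lines; the helper name is hypothetical, but the arithmetic follows the scheme as described:

```python
def category_key(category, sub_category):
    """Form the single categoryMap id by concatenating ten times the
    category number with the subCategory number."""
    return int(str(category * 10) + str(sub_category))

print(category_key(1, 9))  # → 109
print(category_key(1, 1))  # → 101, matching the "101" key in the example file
```

Because the id is built by string concatenation rather than pure arithmetic, a two-digit subCategory still produces an unambiguous key (e.g. category 1, subCategory 12 gives 1012, not 112).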

The Key Point Annotations file is accessed by the class:

KeyPointAnnotations

public class KeyPointAnnotations

Field name   Type                           Editable   Notes
keypoints    Dictionary<string, List>                  A dictionary whose key is the image filename and whose value is a list of keyPoints structs.
categories   Dictionary<int, categoryMap>              A dictionary where the key is an integer representing the annotation category of the object.

Inherited Fields

Field name      Type     Editable   Notes
scriptVersion   string   n          Specifies the version number of the script itself.
editorVersion   string   n          Specifies the version number of the specification used to generate this file.
status          string   n          The status of the version used to generate this file.
date            string   y          The date the file was generated.
company         string   y          Company name.
url             string   y          URL of the company website.
copyright       string   y          Copyright notice.
notes           string   y          Notes relating to the file.

Custom Types

actor_skeleton

public struct actor_skeleton

Field name    Type   Notes
keypoints     List   Array of keypoint names. These names correspond to the named keypoints selected to be part of this skeleton.
connections   List   Array of keypoint name pairs; this list is optional. If a keypoint has no connections, it can be omitted or stored with an empty string as the second element of the pair.

keyPoints

public struct keyPoints

Field name    Type   Editable   Notes
id            int               Unique id of the object; matches the id stored in the annotations file and the mask image.
points        List              A length-3k array, where k is the total number of keypoints defined for the category. Each keypoint has a 0-indexed location x,y and a visibility flag v, defined as v=0: not labeled (in which case x=y=0), v=1: labeled but not visible, and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment.
num_points    int               Indicates the number of labeled keypoints (v>0) for a given object.
category      int               Category id of the object; matches the category id stored in the annotations file.
subCategory   int               subCategory id of the object; matches the subCategory id stored in the annotations file.
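The flat points list can be regrouped into (x, y, v) triplets, and num_points recomputed as a consistency check. A minimal sketch (the function names are illustrative, not part of the file format):

```python
def decode_points(points):
    """Regroup the flat length-3k points list into (x, y, v) triplets."""
    assert len(points) % 3 == 0, "points must hold complete x,y,v triplets"
    return [tuple(points[i:i + 3]) for i in range(0, len(points), 3)]

def count_labeled(points):
    """num_points is the count of keypoints with visibility v > 0."""
    return sum(1 for _x, _y, v in decode_points(points) if v > 0)

flat = [12.0, 34.0, 2.0,   0.0, 0.0, 0.0,   56.0, 78.0, 1.0]
print(decode_points(flat))  # → [(12.0, 34.0, 2.0), (0.0, 0.0, 0.0), (56.0, 78.0, 1.0)]
print(count_labeled(flat))  # → 2
```

Comparing count_labeled(points) against the stored num_points field is a cheap way to validate an entry after parsing.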

categoryMap

public struct categoryMap

Field name      Type   Editable   Notes
keypointNames   List              A length-k array of keypoint names, where k is the total number of keypoints defined for the category. The order of names establishes the order to read the keyPoints.points list when mapping the points to the skeleton.
skeleton        List              A list of comma-separated string pairs defining the skeleton connectivity.

Output Examples

Example KeyPointAnnotations File

{
	"header":
	{
		"version":0.1,
		"status":"draft",
		"date":"03:10:2021",
		"company":"Mindtech",
		"url":"www.mindtech.global",
		"copyright":"Mindtech Global 2021",
		"notes":"Example keypoints"
	},
   "keypoints":
   {
	   "example_image":
	   [
		   {
			   "id":1,
			   "points":
			   [
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
				0.0,0.0,1.0
			   ],
			   "num_points":68,
			   "category":1,
			   "subCategory":1
		   }
	   ]
   },
   "categories":
   {
	   "101":
	   {
		   "keypointNames":
		   [
			"1","2","3","4","5","6","7","8","9","10","11","12",
			"13","14","15","16","17","18","19","20","21","22",
			"23","24","25","26","27","28","29","30","31","32",
			"33","34","35","36","37","38","39","40","41","42",
			"43","44","45","46","47","48","49","50","51","52",
			"53","54","55","56","57","58","59","60","61","62",
			"63","64","65","66","67","68"
		   ],
		   "skeleton":
		   [
			"1,2","2,3","3,4","4,5","5,6","6,7","7,8","8,9",
			"9,10","10,11","11,12","12,13","13,14","14,15",
			"15,16","16,17","18,18","19,18","20,18","21,18",
			"23,18","24,18","25,18","26,18","28,18","29,18",
			"30,18","32,18","33,18","34,18","35,18","37,18",
			"38,18","39,18","40,18","41,18","43,18","44,18",
			"45,18","46,18","47,18","49,18","50,18","51,18",
			"52,18","53,18","54,18","55,18","56,18","57,18",
			"58,18","59,18","61,18","62,18","63,18","64,18",
			"65,18","66,18","67,18"
		   ]
	   }
   }
}
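Putting the pieces together: once a KeyPointAnnotations file has been parsed (e.g. with json.load), the keypointNames order in the categoryMap maps the flat points list onto named keypoints. The helper below is a sketch under that assumption; the function name is hypothetical:

```python
def named_keypoints(annotations, image, index=0):
    """Map each keypoint name of one object onto its (x, y, v) triplet.

    'annotations' is a KeyPointAnnotations file already parsed into a
    dict. The categoryMap key is rebuilt by concatenating ten times the
    category number with the subCategory number, as described above.
    """
    obj = annotations["keypoints"][image][index]
    key = str(obj["category"] * 10) + str(obj["subCategory"])
    names = annotations["categories"][key]["keypointNames"]
    pts = obj["points"]
    # keypointNames order defines how the flat points list is read
    return {name: tuple(pts[3 * i:3 * i + 3]) for i, name in enumerate(names)}
```

With the example file above, named_keypoints(data, "example_image") would return a 68-entry dictionary keyed "1" to "68", each name mapped to a (0.0, 0.0, 1.0) triplet.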

Example Skeleton File

{
	 "version" : 2020.1,
	 "status": "ISSUED",
	 "date": "2020:7:19",
	 "company": "Mindtech",
	 "url": "https://www.mindtech.global",
	 "copyright": "Copyright Mindtech Global Ltd 2020",
	 "notes": "This is an example skeleton file",
	  "bones":
	  {
	       "CC_Base_BoneRoot": 
	       {
		       "hip_bone" : "CC_Base_Hip",
		       "left_toe_bone" : "CC_Base_ToeBase_L",
		       "head_center" : "CC_Base_Head"
		   }
	  },
	  "points": 
	  {
		   "CC_Base_Body": 
		   {
			"eye_1" : 1,
			"eye_2" : 2,
			"eye_3": 3
		   },
		   "CC_Base_Eye":
		  {
			"right_eyeball" : 1024,
			"left_eyeball" : 3024
		   }
	   },
	   "skeletons": 
	   {
		  "basic":
	      [
			"hip_bone, elbow_bone",
			"left_toe_bone, hip_bone",
			"head_center, right_eyeball",
			"head_center, left_eyeball"
		  ],
		  "face":
		   [
			"eye_1, eye_2",
			"eye_2, eye_3"
		   ]
	   }
}