Skeleton key points
The Chameleon simulator can record key animation points for each animatable character in the scene. Keypoint tracking allows trackable points to be defined when actors are created, so that the screen-space locations of those points can be tracked during the simulation.
Trackable points can be defined for any vertex of the actor model's skinned mesh, any of the skeleton transforms of the actor model, and the transforms of any game objects attached to the model for explicit tracking purposes.
Tracked points' screen space coordinates are written out to an annotations file.
The key points are the 2D screen-space projections of the transform positions of each bone in the animated skeleton. Rotations are not recorded.
In the 3D model, each bone in an animated skeleton is specified by the position of its origin plus its rotation. Bones are stored in a hierarchy starting at a root bone, usually the hip, with the position and rotation of nested bones stored as local transforms relative to their immediate parent.
Pose estimation networks attempt to extract pose from 2D images without the context of a hierarchy, so the stored ground-truth pose information is in screen space, with the relationships between key points stored explicitly to form a node/edge graph. For points on a surface, such as face keypoints, no hierarchy exists, so the points and their mesh relationships are always stored explicitly.
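As an illustration of the screen-space projection described above, here is a minimal pinhole-camera sketch. This is illustrative only and not Chameleon's actual API; the function name and parameters are ours.

```python
# Minimal pinhole-camera sketch (illustrative, not Chameleon's API):
# project a bone's camera-space position (x, y, z) to pixel coordinates.
def project_to_screen(point, focal_length, image_width, image_height):
    x, y, z = point
    if z <= 0:
        return None  # behind the camera: no screen-space location
    u = image_width / 2 + focal_length * x / z
    v = image_height / 2 - focal_length * y / z  # screen y grows downward
    return (u, v)
```

A point on the optical axis lands at the image centre; for example, `project_to_screen((0.0, 0.0, 2.0), 500, 640, 480)` returns `(320.0, 240.0)`.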
In Chameleon, the connection relationships (if any) of both bones and keypoints are stored explicitly when the model is imported, so that bones and points can be combined into the same skeleton and multiple skeletons can be stored for each actor.
In the scenario editor, the actor configuration dialog will expose the available skeletons and allow the user to choose which ones will be tracked. Only one skeleton, or combination of skeletons, may be set for a specific category/subcategory.
Chameleon will support the following skeletons:
Name | Purpose | Notes |
---|---|---|
minimal | head and foot tracking for top-down view | |
basic | general activity/gait/gesture tracking. Compatible with Google PoseNet | |
hands | detailed gesture tracking | |
face | emotion and gaze tracking | |
The skeletons are set in the simulation template; you can find out more about the simulation template and these settings here.
The minimal skeleton provides keypoints for the top of the head and the bottoms of the feet.
"minimal":
[
"L_Foot, Head_Top",
"R_Foot, Head_Top"
"L_Foot, R_Foot"
]
The basic skeleton gives more joint keypoints and basic facial keypoints (eyes and nose).
"basic":
[
"DLib_34, ",
"L_Eye, ",
"R_Eye, "
"DLib_2, ",
"DLib_16, ",
"L_UpperArm, L_ForeArm",
"R_UpperArm, R_ForeArm",
"L_ForeArm, L_HAnd ",
"R_ForeArm, R_Hand",
"L_Hand, L_ForeArm",
"R_Hand, R_ForeArm",
"L_Thigh, L_UpperArm ",
"R_Thigh, R_UpperArm",
"L_Calf, L_Thigh",
"R_Calf, R_Thigh",
"L_Ankle, L_Calf",
"R_Ankle, R_Calf",
"L_Foot, L_Ankle",
"R_Foot, R_Ankle"
]
The hands skeleton creates keypoints for every joint in the hand.
"hands":
[
"L_Hand,",
"L_Thumb_1, L_Hand ",
"L_Thumb_2, L_Thumb_1 ",
"L_Thumb_3, L_Thumb_2 ",
"L_Thumb_Tip, L_Thumb_3 ",
"L_Index_1, L_Hand ",
"L_Index_2, L_Index_1 ",
"L_Index_3, L_Index_2 ",
"L_Index_Tip, L_Index_3 ",
"L_Middle_1, L_Hand ",
"L_Middle_2, L_Middle_1 ",
"L_Middle_3, L_Middle_2 ",
"L_Middle_Tip, L_Middle_3 ",
"L_Ring_1, L_Hand ",
"L_Ring_2, L_Ring_1 ",
"L_Ring_3, L_Ring_2 ",
"L_Ring_Tip, L_Ring_3 ",
"L_Little_1, L_Hand ",
"L_Little_2, L_Little_1 ",
"L_Little_3, L_Little_2 ",
"L_Little_Tip, L_Little_3 ",
"R_Hand,",
"R_Thumb_1, R_Hand ",
"R_Thumb_2, R_Thumb_1 ",
"R_Thumb_3, R_Thumb_2 ",
"R_Thumb_Tip, R_Thumb_3 ",
"R_Index_1, R_Hand ",
"R_Index_2, R_Index_1 ",
"R_Index_3, R_Index_2 ",
"R_Index_Tip, R_Index_3 ",
"R_Middle_1, R_Hand ",
"R_Middle_2, R_Middle_1 ",
"R_Middle_3, R_Middle_2 ",
"R_Middle_Tip, R_Middle_3 ",
"R_Ring_1, R_Hand ",
"R_Ring_2, R_Ring_1 ",
"R_Ring_3, R_Ring_2 ",
"R_Ring_Tip, R_Ring_3 ",
"R_Little_1, R_Hand ",
"R_Little_2, R_Little_1 ",
"R_Little_3, R_Little_2 ",
"R_Little_Tip, R_Little_3 "
]
The face skeleton uses the DLib 68-point standard, which uses 68 points to map the jaw line, eyebrows, nose, eyes and mouth.
"face":
[
"L_Eye,",
"R_Eye, ",
"DLib_1 , ",
"DLib_2 , ",
"DLib_3 , ",
"DLib_4 , ",
"DLib_5 , ",
"DLib_6 , ",
"DLib_7 , ",
"DLib_8 , ",
"DLib_9 , ",
"DLib_10, ",
"DLib_11, ",
"DLib_12, ",
"DLib_13, ",
"DLib_14, ",
"DLib_15, ",
"DLib_16, ",
"DLib_17, ",
"DLib_18, ",
"DLib_19, ",
"DLib_20, ",
"DLib_21, ",
"DLib_22, ",
"DLib_23, ",
"DLib_24, ",
"DLib_25, ",
"DLib_26, ",
"DLib_27, ",
"DLib_28, ",
"DLib_29, ",
"DLib_30, ",
"DLib_31, ",
"DLib_32, ",
"DLib_33, ",
"DLib_34, ",
"DLib_35, ",
"DLib_36, ",
"DLib_37, ",
"DLib_38, ",
"DLib_39, ",
"DLib_40, ",
"DLib_41, ",
"DLib_42, ",
"DLib_43, ",
"DLib_44, ",
"DLib_45, ",
"DLib_46, ",
"DLib_47, ",
"DLib_48, ",
"DLib_49, ",
"DLib_50, ",
"DLib_51, ",
"DLib_52, ",
"DLib_53, ",
"DLib_54, ",
"DLib_55, ",
"DLib_56, ",
"DLib_57, ",
"DLib_58, ",
"DLib_59, ",
"DLib_60, ",
"DLib_61, ",
"DLib_62, ",
"DLib_63, ",
"DLib_64, ",
"DLib_65, ",
"DLib_66, ",
"DLib_67, ",
"DLib_68, "
]
The actor reference in the scenario file will have a new field consisting of a list of strings containing the names of the activated skeletons.
The user can choose to output skeleton annotations on a per-category/subcategory basis in the sim script editor. You can find out more about the simulation template settings here.
On initialisation, the skeletons activated in the scenario file are combined into a single output skeleton. While capture is active, the simulator evaluates the pose information on each update and returns the screen-space and visibility information, plus the InstanceID, to the camera cluster, which builds the keypoint entry for each actor and inserts it into the KeyPointAnnotations file for that image capture.
Further information is available from https://cocodataset.org/#format-data; only the section on Keypoint Detection is relevant.
Key points for any image can be found using the Key Point Annotations file alone; there is no need to access the annotations file. If the key points do need to be cross-referenced with information in the annotations file, the exact entry can be found from the image filename and the unique ID, which are keyPoints.Key and keyPoints[key].id respectively.
Key points for each object in an image are stored as a simple array of integer values. The information on how to interpret these can be found by catenating the keyPoints.category and keyPoints.subCategory integer values to produce a unique integer that can be used as the key to the categories dictionary, where the node names and edge graphs are stored for each category.
Mindtech divides tracked objects top-down into categories that then have subCategories. Each category has a unique number, and each subCategory has a number that is unique within that category. COCO divides objects bottom-up, so that each category is given a unique number and categories are then grouped into named supercategories that are not numbered. To mimic the single-number categories of COCO while staying compatible with the top-down categorisation, the key point annotation scheme forms a single id for each subCategory by catenating ten times the category number with the subCategory number, so that category 1:subCategory 9 would have a unique id of 109 in the categoryMap. Software finds it by multiplying keyPoints.category by 10 and catenating the result with the keyPoints.subCategory field.
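The id-forming rule above can be sketched as follows (the helper name is ours, not part of the file format):

```python
def category_key(category: int, sub_category: int) -> int:
    # Catenate ten times the category number with the subCategory number,
    # e.g. category 1, subCategory 9 -> "10" + "9" -> 109.
    return int(str(category * 10) + str(sub_category))
```

For example, `category_key(1, 1)` gives `101`, matching the key used in the example categories dictionary later in this document.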
The Key Point Annotations file is accessed by the class:
public class KeyPointAnnotations
Field name | Type | Editable | Notes |
---|---|---|---|
keypoints | Dictionary<string, List> | | A dictionary whose key is the image filename and whose value is a list of keyPoints structs. |
categories | Dictionary<int, categoryMap> | | A dictionary whose key is an integer representing the annotation category of the object. |
public struct header
Field name | Type | Editable | Notes |
---|---|---|---|
scriptVersion | string | n | specifies the version number of the script itself. |
editorVersion | string | n | specifies the version number of the specification used to generate this file. |
status | string | n | the status of the version used to generate this file. |
date | string | y | the date the file was generated. |
company | string | y | company name. |
url | string | y | URL of the company website. |
copyright | string | y | copyright notice. |
notes | string | y | notes relating to the file. |
public struct actor_skeleton
Field name | Type | Notes |
---|---|---|
keypoints | List | array of keypoint names. These names correspond to the named keypoints selected to be part of this skeleton |
connections | List | Array of keypoint name pairs; this list is optional. If a keypoint has no connections, it can be omitted or stored with an empty string as the second element of the pair. |
public struct keyPoints
Field name | Type | Editable | Notes |
---|---|---|---|
id | int | | unique id of the object - matches the id stored in the annotations file and the mask image. |
points | List | | a length-3k array where k is the total number of keypoints defined for the category. Each keypoint has a 0-indexed location x,y and a visibility flag v, defined as v=0: not labeled (in which case x=y=0), v=1: labeled but not visible, and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment. |
num_points | int | | indicates the number of labeled keypoints (v>0) for a given object. |
category | int | | category id of the object; matches the category id stored in the annotations file. |
subCategory | int | | subCategory id of the object; matches the subCategory id stored in the annotations file. |
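To illustrate the points layout, here is a short sketch (the helper names are ours) that splits the flat array into (x, y, v) triplets and recomputes num_points:

```python
def decode_points(points):
    # The flat list is [x1, y1, v1, x2, y2, v2, ...]; group it into
    # one (x, y, v) triplet per keypoint.
    assert len(points) % 3 == 0
    return [tuple(points[i:i + 3]) for i in range(0, len(points), 3)]

def count_labeled(points):
    # num_points counts the keypoints with v > 0.
    return sum(1 for _x, _y, v in decode_points(points) if v > 0)
```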
public struct categoryMap
Field name | Type | Editable | Notes |
---|---|---|---|
keypointNames | List | | a length-k array of keypoint names where k is the total number of keypoints defined for the category. The order of names establishes the order in which to read the keyPoints.points list when mapping the points to the skeleton. |
skeleton | List | | a list of comma-separated string pairs defining the skeleton connectivity. |
An example Key Point Annotations file:
{
"header":
{
"version":0.1,
"status":"draft",
"date":"03:10:2021",
"company":"Mindtech",
"url":"www.mindtech.global",
"copyright":"Mindtech Global 2021",
"notes":"Example keypoints"
},
"keypoints":
{
"example_image":
[
{
"id":1,
"points":
[
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,
0.0,0.0,1.0
],
"num_points":68,
"category":1,
"subCategory":1
}
]
},
"categories":
{
"101":
{
"keypointNames":
[
"1","2","3","4","5","6","7","8","9","10","11","12",
"13","14","15","16","17","18","19","20","21","22",
"23","24","25","26","27","28","29","30","31","32",
"33","34","35","36","37","38","39","40","41","42",
"43","44","45","46","47","48","49","50","51","52",
"53","54","55","56","57","58","59","60","61","62",
"63","64","65","66","67","68"
],
"skeleton":
[
"1,2","2,3","3,4","4,5","5,6","6,7","7,8","8,9",
"9,10","10,11","11,12","12,13","13,14","14,15",
"15,16","16,17","18,18","19,18","20,18","21,18",
"23,18","24,18","25,18","26,18","28,18","29,18",
"30,18","32,18","33,18","34,18","35,18","37,18",
"38,18","39,18","40,18","41,18","43,18","44,18",
"45,18","46,18","47,18","49,18","50,18","51,18",
"52,18","53,18","54,18","55,18","56,18","57,18",
"58,18","59,18","61,18","62,18","63,18","64,18",
"65,18","66,18","67,18"
]
}
}
}
An example skeleton definition file, as imported with the actor model:
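Putting the pieces together, here is a hedged sketch of pairing each object's points with their names via the categories dictionary. The function name is ours; `doc` is assumed to be the parsed Key Point Annotations JSON (e.g. from `json.load`), whose categories keys are strings such as "101" as in the example above.

```python
def named_keypoints(doc, image_name):
    # For each object recorded against the image, look up its categoryMap
    # via the catenated category/subCategory key, then zip the keypoint
    # names onto the (x, y, v) triplets in order.
    result = []
    for obj in doc["keypoints"][image_name]:
        key = str(obj["category"] * 10) + str(obj["subCategory"])
        names = doc["categories"][key]["keypointNames"]
        pts = obj["points"]
        triplets = [tuple(pts[i:i + 3]) for i in range(0, len(pts), 3)]
        result.append(dict(zip(names, triplets)))
    return result
```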
{
"version" : 2020.1,
"status": "ISSUED",
"date": "2020:7:19",
"company": "Mindtech",
"url": "https://www/mindtech.global",
"copyright": "Copyright Mindetch Global Ltd 2020",
"notes": "This is an example skeleton file",
"bones":
{
"CC_Base_BoneRoot":
{
"hip_bone" : "CC_Base_Hip",
"left_toe_bone" : "CC_Base_ToeBase_L",
"head_center" : "CC_Base_Head"
}
},
"points":
{
"CC_Base_Body":
{
"eye_1" : 1,
"eye_2" : 2,
"eye_3": 3
},
"CC_Base_Eye":
{
"right_eyeball" : 1024,
"left_eyeball" : 3024
}
},
"skeletons":
{
"basic":
[
"hip_bone, elbow_bone",
"left_toe_bone, hip_bone",
"head_center, right_eyeball",
"head_center, "left_eyeball'
],
"face":
[
"eye_1, eye_2",
"eye_2, eye_3"
]
}
}
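The comma-separated pair strings used in the skeleton lists above can be parsed into an edge list along these lines (a sketch; the function name is ours):

```python
def parse_edges(pairs):
    # Each entry is "point_a, point_b"; an empty or missing second
    # element means the keypoint has no connection.
    edges = []
    for entry in pairs:
        a, _sep, b = (part.strip() for part in entry.partition(","))
        edges.append((a, b) if b else (a, None))
    return edges
```

For example, `parse_edges(["hip_bone, elbow_bone", "L_Hand,"])` yields one real edge plus one unconnected keypoint.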