Finger Manipulation#
Finger is a classic manipulation task from the DeepMind Control Suite. A two-link “finger” applies torques to interact with a rotating spinner. MotrixLab currently provides three Finger environments:
dm-finger-spin: make the spinner rotate continuously in the target direction
dm-finger-turn-easy: align the spinner tip (tip) with a target point (larger target radius)
dm-finger-turn-hard: same as dm-finger-turn-easy, but with a smaller target radius
Task Description#
Finger is a planar (x-z) interaction task:
The finger has 2 actuated hinge joints: proximal and distal
The spinner rotates around joint hinge, and tip denotes the spinner tip position
For Turn tasks, a target point is sampled around the spinner at the beginning of each episode
Action Space#
| Item | Details |
|---|---|
| Type | |
| Dimension | 2 |
The actions correspond to:
| Index | Action Description | Min Control | Max Control | XML Name | Joint Type |
|---|---|---|---|---|---|
| 0 | Torque applied to the proximal joint | -1 | 1 | proximal | hinge |
| 1 | Torque applied to the distal joint | -1 | 1 | distal | hinge |
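As a minimal sketch of the action layout above, the snippet below draws a random 2-dimensional action and clips an out-of-range one to [-1, 1]. The helper names `sample_random_action` and `clip_action` are hypothetical and not part of MotrixLab; only the dimension and bounds come from the table.

```python
import numpy as np

ACTION_DIM = 2  # index 0 -> proximal, index 1 -> distal (from the table above)


def sample_random_action(rng: np.random.Generator) -> np.ndarray:
    """Draw one normalized torque command per actuated joint (hypothetical helper)."""
    return rng.uniform(-1.0, 1.0, size=ACTION_DIM)


def clip_action(action: np.ndarray) -> np.ndarray:
    """Clamp a policy output to the documented control range [-1, 1]."""
    return np.clip(action, -1.0, 1.0)


rng = np.random.default_rng(0)
random_action = sample_random_action(rng)
clipped = clip_action(np.array([2.0, -3.0]))
```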
Observation Space#
MotrixLab follows dm_control-style observations, but flattens them into a single vector.
Spin Observation Space#
| Item | Details |
|---|---|
| Type | |
| Dimension | 9 |
The observation vector contains (in order):
position (4): qpos(proximal, distal) + tip_xz (tip position relative to the spinner in x-z)
velocity (3): qvel(proximal, distal, hinge) (hinge velocity is used by the Spin reward)
touch (2): log(1 + touchtop), log(1 + touchbottom)
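The ordering above can be illustrated with a short sketch that concatenates the three groups into one 9-dimensional vector. The function `spin_observation` and its arguments are assumptions made for illustration; MotrixLab's actual implementation may read these quantities differently.

```python
import numpy as np


def spin_observation(qpos, tip_xz, qvel, touchtop, touchbottom):
    """Flatten the Spin observation in the documented order (hypothetical helper).

    qpos:   (proximal, distal) joint angles
    tip_xz: spinner tip position relative to the spinner, x-z
    qvel:   (proximal, distal, hinge) joint velocities
    """
    position = np.concatenate([qpos, tip_xz])      # 4 values
    velocity = np.asarray(qvel, dtype=np.float64)  # 3 values
    touch = np.log1p([touchtop, touchbottom])      # 2 values, log(1 + x)
    return np.concatenate([position, velocity, touch])


obs = spin_observation([0.1, -0.2], [0.3, 0.4], [0.0, 0.0, -15.0], 0.0, 2.0)
```

The log(1 + x) transform compresses large contact readings while mapping "no contact" to exactly 0.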
Turn Observation Space#
| Item | Details |
|---|---|
| Type | |
| Dimension | 12 |
Compared to Spin, Turn adds:
target_position (2): target position relative to the spinner in x-z
dist_to_target (1): signed distance from tip to the target sphere surface (negative means “inside”)
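A signed distance of this kind is commonly computed as the Euclidean distance between tip and target centre minus the target radius. The sketch below assumes that definition and assumes both positions are expressed in the same spinner-relative x-z frame; `turn_extras` and `target_radius` are hypothetical names.

```python
import numpy as np


def turn_extras(tip_xz, target_xz, target_radius):
    """Return the 3 extra Turn entries: target_position (2) + dist_to_target (1)."""
    tip = np.asarray(tip_xz, dtype=np.float64)
    target = np.asarray(target_xz, dtype=np.float64)
    # Negative when the tip lies inside the target sphere.
    dist_to_target = np.linalg.norm(target - tip) - target_radius
    return np.concatenate([target, [dist_to_target]])


outside = turn_extras([0.0, 0.0], [0.3, 0.4], target_radius=0.1)
inside = turn_extras([0.3, 0.4], [0.3, 0.4], target_radius=0.1)
```

Appending these 3 values to the 9-dimensional Spin vector gives the 12-dimensional Turn observation.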
Reward Function Design#
Spin#
In dm_control, Spin is typically defined with a sparse threshold on spinner angular velocity. MotrixLab defaults to a dense/shaped reward for easier training, while also logging the sparse version:
spin_sparse = 1 if hinge_velocity <= -15 else 0
spin = clip(-hinge_velocity / 15, 0, 1)
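The two formulas translate directly into NumPy. The sketch below is only an illustration of those expressions; the function name `spin_rewards` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def spin_rewards(hinge_velocity, threshold=15.0):
    """Return (spin_sparse, spin) for one or many hinge velocities."""
    v = np.asarray(hinge_velocity, dtype=np.float64)
    spin_sparse = (v <= -threshold).astype(np.float64)  # 1 once the spinner is fast enough
    spin = np.clip(-v / threshold, 0.0, 1.0)            # linear ramp from 0 up to 1
    return spin_sparse, spin


sparse, dense = spin_rewards([-20.0, -15.0, -7.5, 0.0, 10.0])
```

The dense term reaches 1 exactly where the sparse term switches on, so it rewards partial progress without changing the optimum.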
Turn (Easy / Hard)#
Turn aims to bring the spinner tip into a target sphere around the spinner:
turn_sparse = 1 when dist_to_target <= 0
MotrixLab defaults to a shaped reward based on distance-to-target (exponential decay), and adds auxiliary terms to reduce “no-contact” failure modes and action jitter:
approach-to-spinner shaping
touch bonus
action magnitude / action change penalties
The final shaped reward is clipped to [0, 1].
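To make the structure of such a shaped reward concrete, here is a rough sketch that combines the terms listed above. Every weight, the decay scale, the exact form of each term, and the `finger_to_spinner` input are assumptions chosen for illustration; they are not MotrixLab's actual values or formulas.

```python
import numpy as np


def turn_reward_sketch(dist_to_target, finger_to_spinner, touch, action, prev_action,
                       decay=0.1, w_approach=0.1, w_touch=0.05,
                       w_action=0.01, w_delta=0.01):
    """Return (turn_sparse, shaped) with hypothetical weights."""
    turn_sparse = 1.0 if dist_to_target <= 0.0 else 0.0
    # Exponential decay in the distance to the target surface (1 when inside).
    distance_term = np.exp(-max(dist_to_target, 0.0) / decay)
    # Encourage the finger to approach the spinner (assumed form).
    approach_term = w_approach * np.exp(-finger_to_spinner / decay)
    # Small bonus whenever either touch sensor reports contact.
    touch_bonus = w_touch * float(np.any(np.asarray(touch) > 0.0))
    a = np.asarray(action, dtype=np.float64)
    prev = np.asarray(prev_action, dtype=np.float64)
    # Penalize large actions and abrupt action changes to reduce jitter.
    penalty = w_action * np.sum(a ** 2) + w_delta * np.sum((a - prev) ** 2)
    shaped = np.clip(distance_term + approach_term + touch_bonus - penalty, 0.0, 1.0)
    return turn_sparse, float(shaped)


near = turn_reward_sketch(-0.01, 0.0, [0.5, 0.0], [0.0, 0.0], [0.0, 0.0])
far = turn_reward_sketch(10.0, 1.0, [0.0, 0.0], [1.0, 1.0], [0.0, 0.0])
```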
Initial State#
proximal, distal joint angles are sampled uniformly within joint limits
spinner hinge angle is sampled uniformly in [-pi, pi]
for Turn tasks, the target is sampled around the spinner on the x-z plane at reset
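The reset procedure above could be sketched as follows. The joint limits passed in, the `target_distance` parameter, and the choice to place the target on a circle at a uniformly random angle are all assumptions for illustration; the real limits come from the model XML.

```python
import numpy as np


def sample_initial_state(rng, joint_limits, target_distance=None):
    """Sample a reset state (hypothetical helper).

    joint_limits: array of shape (2, 2) with [low, high] rows for proximal and distal.
    target_distance: if given (Turn tasks), distance of the target from the spinner.
    """
    limits = np.asarray(joint_limits, dtype=np.float64)
    state = {
        "finger_qpos": rng.uniform(limits[:, 0], limits[:, 1]),
        "hinge_angle": rng.uniform(-np.pi, np.pi),
    }
    if target_distance is not None:
        angle = rng.uniform(-np.pi, np.pi)
        # Target position in the x-z plane, relative to the spinner.
        state["target_xz"] = target_distance * np.array([np.sin(angle), np.cos(angle)])
    return state


rng = np.random.default_rng(0)
placeholder_limits = [[-1.5, 1.5], [-1.0, 1.0]]  # placeholder values, not the real limits
state = sample_initial_state(rng, placeholder_limits, target_distance=0.2)
```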
Episode Termination Conditions#
Termination#
The episode terminates if a NaN value appears in the observations
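A NaN check of this kind is straightforward with NumPy. The sketch below assumes a batch of flattened observations with shape (num_envs, obs_dim); the function name `nan_terminated` is hypothetical.

```python
import numpy as np


def nan_terminated(observations):
    """Return one boolean per environment: True if its observation contains NaN."""
    obs = np.asarray(observations, dtype=np.float64)
    return np.isnan(obs).any(axis=-1)


batch = np.zeros((3, 9))
batch[1, 4] = np.nan  # simulate a diverged environment
done = nan_terminated(batch)
```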
Usage Guide#
1. Environment Preview (random actions)#
uv run scripts/view.py --env dm-finger-spin
uv run scripts/view.py --env dm-finger-turn-easy
uv run scripts/view.py --env dm-finger-turn-hard
2. Start Training#
uv run scripts/train.py --env dm-finger-spin --train-backend torch
uv run scripts/train.py --env dm-finger-turn-easy --train-backend torch
uv run scripts/train.py --env dm-finger-turn-hard --train-backend torch
3. View Training Progress#
uv run tensorboard --logdir runs/dm-finger-spin
4. Test Training Results#
scripts/play.py will auto-discover the latest best_agent.* under runs/{env-name}/ (or you can pass --policy explicitly):
uv run scripts/play.py --env dm-finger-turn-hard
Expected Training Results#
dm-finger-spin: stable continuous rotation in the target direction
dm-finger-turn-easy: consistent contact and alignment with the (larger) target region, with reduced jitter
dm-finger-turn-hard: successful alignment with a smaller target region, typically requiring more training and better contact behavior