[Interactive viewer, 60.0 s clip. Panels: Ground truth; World model prediction. Controller input readout: Position (x, y, z), Orientation (r, p, y), Gripper (g).]