<h1>Load custom quadruped robot environments<a class="headerlink" href="#load-custom-quadruped-robot-environments" title="Link to this heading">¶</a></h1>
<p>In this tutorial we will see how to use the <code class="docutils literal notranslate"><span class="pre">MuJoCo/Ant-v5</span></code> framework to create a quadruped walking environment, using a model file (ending in <code class="docutils literal notranslate"><span class="pre">.xml</span></code>) without having to create a new class.</p>
<p>Steps:</p>
<ol class="arabic simple" start="0">
<li><p>Get the <strong>MJCF</strong> (or <strong>URDF</strong>) model file of your robot.</p>
<ul class="simple">
<li><p>Create your own model (see the <a class="reference external" href="https://mujoco.readthedocs.io/en/stable/modeling.html">Guide</a>), or</p></li>
<li><p>Find a ready-made model (in this tutorial, we will use a model from the <a class="reference external" href="https://github.com/google-deepmind/mujoco_menagerie"><strong>MuJoCo Menagerie</strong></a> collection).</p></li>
</ul>
</li>
<li><p>Load the model with the <code class="docutils literal notranslate"><span class="pre">xml_file</span></code> argument.</p></li>
<li><p>Tweak the environment parameters to get the desired behavior.</p>
<ol class="arabic simple">
<li><p>Tweak the environment simulation parameters.</p></li>
<li><p>Tweak the environment termination parameters.</p></li>
<li><p>Tweak the environment reward parameters.</p></li>
<li><p>Tweak the environment observation parameters.</p></li>
</ol>
</li>
<li><p>Train an agent to move your robot.</p></li>
</ol>
<p>The reader is expected to be familiar with the <code class="docutils literal notranslate"><span class="pre">Gymnasium</span></code> API &amp; library, the basics of robotics, and the included <code class="docutils literal notranslate"><span class="pre">Gymnasium/MuJoCo</span></code> environments and the robot models they use. Familiarity with the <strong>MJCF</strong> file model format and the <code class="docutils literal notranslate"><span class="pre">MuJoCo</span></code> simulator is not required but is recommended.</p>
<section id="setup">
<h2>Setup<a class="headerlink" href="#setup" title="Link to this heading">¶</a></h2>
<p>We will need <code class="docutils literal notranslate"><span class="pre">gymnasium&gt;=1.0.0</span></code>.</p>
<h2>Step 0.1 - Download a Robot Model<a class="headerlink" href="#step-0-1-download-a-robot-model" title="Link to this heading">¶</a></h2>
<p>In this tutorial we will load the <a class="reference external" href="https://github.com/google-deepmind/mujoco_menagerie/blob/main/unitree_go1/README.md">Unitree Go1</a> robot from the excellent <a class="reference external" href="https://github.com/google-deepmind/mujoco_menagerie">MuJoCo Menagerie</a> robot model collection.
<img alt="Unitree Go1 robot in a flat terrain scene" src="https://github.com/google-deepmind/mujoco_menagerie/blob/main/unitree_go1/go1.png?raw=true"/></p>
<p><code class="docutils literal notranslate"><span class="pre">Go1</span></code> is a quadruped robot; controlling it to move is a significant learning problem, much harder than that of the <code class="docutils literal notranslate"><span class="pre">Gymnasium/MuJoCo/Ant</span></code> environment.</p>
<p>We can download the whole MuJoCo Menagerie collection (which includes <code class="docutils literal notranslate"><span class="pre">Go1</span></code>).</p>
<p>You can use any other quadruped robot with this tutorial; just adjust the environment parameter values for your robot.</p>
</section>
<section id="step-1-load-the-model">
<h2>Step 1 - Load the model<a class="headerlink" href="#step-1-load-the-model" title="Link to this heading">¶</a></h2>
<p>To load the model, all we have to do is pass the <code class="docutils literal notranslate"><span class="pre">xml_file</span></code> argument to the <code class="docutils literal notranslate"><span class="pre">Ant-v5</span></code> framework.</p>
<p>Although this is enough to load the model, we will need to tweak some environment parameters to get the desired behavior. For now, we also explicitly set the simulation, termination, reward, and observation arguments, which we will tweak in the next step.</p>
<h2>Step 2 - Tweaking the Environment Parameters<a class="headerlink" href="#step-2-tweaking-the-environment-parameters" title="Link to this heading">¶</a></h2>
<p>Tweaking the environment parameters is essential to get the desired behavior for learning.
In the following subsections, the reader is encouraged to consult the <a class="reference external" href="https://gymnasium.farama.org/main/environments/mujoco/ant/#arguments">documentation of the arguments</a> for more detailed information.</p>
<h2>Step 2.1 - Tweaking the Environment Simulation Parameters<a class="headerlink" href="#step-2-1-tweaking-the-environment-simulation-parameters" title="Link to this heading">¶</a></h2>
<p>The arguments of interest are <code class="docutils literal notranslate"><span class="pre">frame_skip</span></code>, <code class="docutils literal notranslate"><span class="pre">reset_noise_scale</span></code> and <code class="docutils literal notranslate"><span class="pre">max_episode_steps</span></code>.</p>
<p>We want to tweak the <code class="docutils literal notranslate"><span class="pre">frame_skip</span></code> parameter to get <code class="docutils literal notranslate"><span class="pre">dt</span></code> to an acceptable value (typical values are <code class="docutils literal notranslate"><span class="pre">dt</span></code> <span class="math notranslate nohighlight">\(\in [0.01, 0.1]\)</span> seconds).</p>
<p>Reminder: <span class="math notranslate nohighlight">\(dt = frame\_skip \times model.opt.timestep\)</span>, where <code class="docutils literal notranslate"><span class="pre">model.opt.timestep</span></code> is the integrator time step selected in the MJCF model file.</p>
<p>The <code class="docutils literal notranslate"><span class="pre">Go1</span></code> model we are using has an integrator timestep of <code class="docutils literal notranslate"><span class="pre">0.002</span></code>, so by selecting <code class="docutils literal notranslate"><span class="pre">frame_skip=25</span></code> we can set the value of <code class="docutils literal notranslate"><span class="pre">dt</span></code> to <code class="docutils literal notranslate"><span class="pre">0.05s</span></code>.</p>
<p>To avoid overfitting the policy, <code class="docutils literal notranslate"><span class="pre">reset_noise_scale</span></code> should be set to a value appropriate to the size of the robot. We want the value to be as large as possible without the initial distribution of states being invalid (<code class="docutils literal notranslate"><span class="pre">Terminal</span></code> regardless of control actions); for <code class="docutils literal notranslate"><span class="pre">Go1</span></code> we choose a value of <code class="docutils literal notranslate"><span class="pre">0.1</span></code>.</p>
<p>Finally, <code class="docutils literal notranslate"><span class="pre">max_episode_steps</span></code> determines the number of steps per episode before <code class="docutils literal notranslate"><span class="pre">truncation</span></code>. Here we set it to 1000 to be consistent with the base <code class="docutils literal notranslate"><span class="pre">Gymnasium/MuJoCo</span></code> environments, but you can set it higher if needed.</p>
<pre>reset_noise_scale=0.1,    # set to avoid policy overfitting
frame_skip=25,            # set dt=0.05
max_episode_steps=1000,   # kept at 1000</pre>
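<p>The <code class="docutils literal notranslate"><span class="pre">dt</span></code> reminder above can be checked numerically (the timestep value is the one from the Go1 MJCF file):</p>

```python
# dt = frame_skip * model.opt.timestep
frame_skip = 25
timestep = 0.002  # integrator timestep from the Go1 MJCF model file
dt = frame_skip * timestep  # 0.05 s, inside the typical [0.01, 0.1] range
```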
<h2>Step 2.2 - Tweaking the Environment Termination Parameters<a class="headerlink" href="#step-2-2-tweaking-the-environment-termination-parameters" title="Link to this heading">¶</a></h2>
<p>Termination is important for robot environments to avoid sampling “useless” time steps.</p>
<p>The arguments of interest are <code class="docutils literal notranslate"><span class="pre">terminate_when_unhealthy</span></code> and <code class="docutils literal notranslate"><span class="pre">healthy_z_range</span></code>.</p>
<p>We want to set <code class="docutils literal notranslate"><span class="pre">healthy_z_range</span></code> to terminate the environment when the robot falls over or jumps really high. Here we have to choose a value that is logical for the height of the robot; for <code class="docutils literal notranslate"><span class="pre">Go1</span></code> we choose <code class="docutils literal notranslate"><span class="pre">(0.195, 0.75)</span></code>.
Note: <code class="docutils literal notranslate"><span class="pre">healthy_z_range</span></code> checks the absolute value of the height of the robot, so if your scene contains different levels of elevation it should be set to <code class="docutils literal notranslate"><span class="pre">(-np.inf, np.inf)</span></code>.</p>
<p>We could also set <code class="docutils literal notranslate"><span class="pre">terminate_when_unhealthy=False</span></code> to disable termination altogether, which is not desirable in the case of <code class="docutils literal notranslate"><span class="pre">Go1</span></code>.</p>
<pre>healthy_z_range=(0.195, 0.75),  # set to avoid sampling steps where the robot has fallen or jumped too high</pre>
<p>Note: If you need a different termination condition, you can write your own <code class="docutils literal notranslate"><span class="pre">TerminationWrapper</span></code> (see the <a class="reference external" href="https://gymnasium.farama.org/main/api/wrappers/">documentation</a>).</p>
<h2>Step 2.3 - Tweaking the Environment Reward Parameters<a class="headerlink" href="#step-2-3-tweaking-the-environment-reward-parameters" title="Link to this heading">¶</a></h2>
<p>The arguments of interest are <code class="docutils literal notranslate"><span class="pre">forward_reward_weight</span></code>, <code class="docutils literal notranslate"><span class="pre">ctrl_cost_weight</span></code>, <code class="docutils literal notranslate"><span class="pre">contact_cost_weight</span></code>, <code class="docutils literal notranslate"><span class="pre">healthy_reward</span></code>, and <code class="docutils literal notranslate"><span class="pre">main_body</span></code>.</p>
<p>For the arguments <code class="docutils literal notranslate"><span class="pre">forward_reward_weight</span></code>, <code class="docutils literal notranslate"><span class="pre">ctrl_cost_weight</span></code>, <code class="docutils literal notranslate"><span class="pre">contact_cost_weight</span></code> and <code class="docutils literal notranslate"><span class="pre">healthy_reward</span></code> we have to pick values that make sense for our robot. You can use the default <code class="docutils literal notranslate"><span class="pre">MuJoCo/Ant</span></code> parameters as references and tweak them if a change is needed for your environment. In the case of <code class="docutils literal notranslate"><span class="pre">Go1</span></code> we only change <code class="docutils literal notranslate"><span class="pre">ctrl_cost_weight</span></code>, since it has a higher actuator force range.</p>
<p>For the argument <code class="docutils literal notranslate"><span class="pre">main_body</span></code> we have to choose which body part is the main body (usually called something like “torso” or “trunk” in the model file) for the calculation of the <code class="docutils literal notranslate"><span class="pre">forward_reward</span></code>; in the case of <code class="docutils literal notranslate"><span class="pre">Go1</span></code> it is the <code class="docutils literal notranslate"><span class="pre">"trunk"</span></code> (Note: in most cases, including this one, it can be left at the default value).</p>
<pre>forward_reward_weight=1,  # kept the same as the 'Ant' environment
ctrl_cost_weight=0.05,    # changed because of the stronger motors of `Go1`
contact_cost_weight=5e-4, # kept the same as the 'Ant' environment
healthy_reward=1,         # kept the same as the 'Ant' environment
main_body=1,              # represents the "trunk" of the `Go1` robot</pre>
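<p>To see why <code class="docutils literal notranslate"><span class="pre">ctrl_cost_weight</span></code> matters for a robot with stronger motors: the control cost is a quadratic penalty on the action, so larger torques are penalized disproportionately. A small sketch (the action values are made up for illustration):</p>

```python
import numpy as np

def ctrl_cost(action, ctrl_cost_weight):
    # Quadratic penalty on the control signal, as in the Ant reward
    return ctrl_cost_weight * np.sum(np.square(action))

action = np.array([0.5, -1.0, 2.0])  # made-up torques; Go1 allows larger magnitudes
cost_ant_default = ctrl_cost(action, 0.5)   # Ant's default weight
cost_go1 = ctrl_cost(action, 0.05)          # reduced weight chosen for Go1
```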
<p>Note: If you need a different reward function, you can write your own <code class="docutils literal notranslate"><span class="pre">RewardWrapper</span></code> (see the <a class="reference external" href="https://gymnasium.farama.org/main/api/wrappers/reward_wrappers/">documentation</a>).</p>
<h2>Step 2.4 - Tweaking the Environment Observation Parameters<a class="headerlink" href="#step-2-4-tweaking-the-environment-observation-parameters" title="Link to this heading">¶</a></h2>
<p>The arguments of interest are <code class="docutils literal notranslate"><span class="pre">include_cfrc_ext_in_observation</span></code> and <code class="docutils literal notranslate"><span class="pre">exclude_current_positions_from_observation</span></code>.</p>
<p>Here for <code class="docutils literal notranslate"><span class="pre">Go1</span></code> we have no particular reason to change them.</p>
<pre>include_cfrc_ext_in_observation=True,           # kept the same as the 'Ant' environment
exclude_current_positions_from_observation=False,  # kept the same as the 'Ant' environment</pre>
<p>Note: If you need additional observation elements (such as additional sensors), you can write your own <code class="docutils literal notranslate"><span class="pre">ObservationWrapper</span></code> (see the <a class="reference external" href="https://gymnasium.farama.org/main/api/wrappers/observation_wrappers/">documentation</a>).</p>
</section>
<section id="step-3-train-your-agent">
<h2>Step 3 - Train your Agent<a class="headerlink" href="#step-3-train-your-agent" title="Link to this heading">¶</a></h2>
<p>Finally, we are done. We can use an RL algorithm to train an agent to walk/run the <code class="docutils literal notranslate"><span class="pre">Go1</span></code> robot, which can run at up to <code class="docutils literal notranslate"><span class="pre">4.7 m/s</span></code> according to the manufacturer.
Note: If you have followed this guide with your own robot model, you may discover during training that some environment parameters were not as desired; feel free to go back to step 2 and change anything as needed.</p>
</section>
<section id="epilogue">
<h2>Epilogue<a class="headerlink" href="#epilogue" title="Link to this heading">¶</a></h2>
<p>You can follow this guide to create most quadruped environments.
To create humanoid/bipedal robots, you can also follow this guide using the <code class="docutils literal notranslate"><span class="pre">Gymnasium/MuJoCo/Humanoid-v5</span></code> framework.</p>