grl.agents

QGPOAgent

class grl.agents.QGPOAgent(config, model)[source]
Overview:

The agent for the QGPO algorithm.

Interface:

__init__, act

__init__(config, model)[source]
Overview:

Initialize the agent.

Parameters:
  • config (EasyDict) – The configuration.

  • model (Union[torch.nn.Module, torch.nn.ModuleDict]) – The model.

act(obs, return_as_torch_tensor=False)[source]
Overview:

Given an observation, return an action.

Parameters:
  • obs (Union[np.ndarray, torch.Tensor, Dict]) – The observation.

  • return_as_torch_tensor (bool) – Whether to return the action as a torch tensor.

Returns:

The action.

Return type:

action (Union[np.ndarray, torch.Tensor, Dict])
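A minimal sketch of the documented `act` contract, using a stand-in class and a toy linear model rather than the real QGPOAgent (its actual config schema and trained model are not shown here): by default the action comes back as a numpy array, and `return_as_torch_tensor=True` keeps it as a torch tensor.

```python
import numpy as np
import torch

class StandInQGPOAgent:
    """Stand-in mirroring the documented QGPOAgent interface (hypothetical)."""

    def __init__(self, config, model):
        self.config = config
        self.model = model

    def act(self, obs, return_as_torch_tensor=False):
        # Convert the observation to a tensor and run the policy model
        # without tracking gradients, as inference-time action selection.
        obs_t = torch.as_tensor(obs, dtype=torch.float32)
        with torch.no_grad():
            action = self.model(obs_t)
        return action if return_as_torch_tensor else action.numpy()

agent = StandInQGPOAgent(config={}, model=torch.nn.Linear(4, 2))
obs = np.zeros(4, dtype=np.float32)
print(type(agent.act(obs)))                               # numpy.ndarray
print(type(agent.act(obs, return_as_torch_tensor=True)))  # torch.Tensor
```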

SRPOAgent

class grl.agents.SRPOAgent(config, model)[source]
Overview:

The agent for the SRPO algorithm.

Interface:

__init__, act

__init__(config, model)[source]
Overview:

Initialize the agent.

Parameters:
  • config (EasyDict) – The configuration.

  • model (Union[torch.nn.Module, torch.nn.ModuleDict]) – The model.

act(obs, return_as_torch_tensor=False)[source]
Overview:

Given an observation, return an action.

Parameters:
  • obs (Union[np.ndarray, torch.Tensor, Dict]) – The observation.

  • return_as_torch_tensor (bool) – Whether to return the action as a torch tensor.

Returns:

The action.

Return type:

action (Union[np.ndarray, torch.Tensor, Dict])
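The agent is typically driven inside an environment rollout loop. The sketch below shows that pattern with a stub environment and a stand-in agent; the environment class and its step/reset signatures are assumptions for illustration, not part of grl.

```python
import numpy as np
import torch

class StubEnv:
    """Hypothetical minimal environment for illustrating the rollout loop."""

    def reset(self):
        return np.zeros(3, dtype=np.float32)

    def step(self, action):
        # Returns (next_obs, reward, done, info); terminates immediately.
        return np.zeros(3, dtype=np.float32), 0.0, True, {}

class StandInSRPOAgent:
    """Stand-in mirroring the documented SRPOAgent interface (hypothetical)."""

    def __init__(self, config, model):
        self.config, self.model = config, model

    def act(self, obs, return_as_torch_tensor=False):
        with torch.no_grad():
            action = self.model(torch.as_tensor(obs, dtype=torch.float32))
        return action if return_as_torch_tensor else action.numpy()

env = StubEnv()
agent = StandInSRPOAgent(config={}, model=torch.nn.Linear(3, 1))
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    obs, reward, done, info = env.step(agent.act(obs))
    total_reward += reward
```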

GPAgent

class grl.agents.GPAgent(config, model)[source]
Overview:

The agent trained for generative policies. This class is designed to be used with the GMPGAlgorithm and GMPOAlgorithm.

Interface:

__init__, act

__init__(config, model)[source]
Overview:

Initialize the agent with the configuration and the model.

Parameters:
  • config (EasyDict) – The configuration.

  • model (Union[torch.nn.Module, torch.nn.ModuleDict]) – The model.

act(obs, return_as_torch_tensor=False)[source]
Overview:

Given an observation, return an action as a numpy array or a torch tensor.

Parameters:
  • obs (Union[np.ndarray, torch.Tensor, Dict]) – The observation.

  • return_as_torch_tensor (bool) – Whether to return the action as a torch tensor.

Returns:

The action.

Return type:

action (Union[np.ndarray, torch.Tensor, Dict])
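The documented `obs` type includes `Dict`, so a GPAgent-style `act` may receive a structured observation. The sketch below is a hypothetical stand-in that flattens a dict observation before calling the model; the real GPAgent's handling of dict inputs may differ.

```python
import numpy as np
import torch

class StandInGPAgent:
    """Stand-in mirroring the documented GPAgent interface (hypothetical)."""

    def __init__(self, config, model):
        self.config, self.model = config, model

    def act(self, obs, return_as_torch_tensor=False):
        if isinstance(obs, dict):
            # Flatten structured observations into a single vector
            # (illustrative choice, not the real grl behavior).
            obs = np.concatenate([np.ravel(v) for v in obs.values()])
        with torch.no_grad():
            action = self.model(torch.as_tensor(obs, dtype=torch.float32))
        return action if return_as_torch_tensor else action.numpy()

agent = StandInGPAgent(config={}, model=torch.nn.Linear(5, 2))
obs = {"position": np.zeros(3), "velocity": np.zeros(2)}
print(agent.act(obs).shape)  # (2,)
```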