grl.datasets

QGPOD4RLDataset

class grl.datasets.QGPOD4RLDataset(env_id)

Overview: Dataset for the QGPO algorithm. QGPO training is based on contrastive energy prediction, which requires both true actions and fake actions. True actions are sampled from the dataset; fake actions are sampled from the action support generated by the behaviour policy.
Interface: __init__, __getitem__, __len__.
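
To make the true/fake action pairing concrete, here is a minimal sketch of a dataset exposing the same three interface methods. It is not the grl implementation: the class name `ToyContrastiveDataset`, the dictionary keys `s`, `a` and `fake_a`, and the uniformly random fake actions are all assumptions made for illustration. In a real pipeline the fake actions would come from a trained behaviour policy.

```python
import random


class ToyContrastiveDataset:
    """Hypothetical stand-in, not the grl implementation.

    Each item pairs one true action (taken from the dataset) with a fixed
    set of fake actions (the precomputed action support for that state).
    """

    def __init__(self, states, actions, fake_actions):
        # All three sequences must be aligned by transition index.
        if not (len(states) == len(actions) == len(fake_actions)):
            raise ValueError("states, actions and fake_actions must have equal length")
        self.states = states
        self.actions = actions
        self.fake_actions = fake_actions

    def __getitem__(self, index):
        return {
            "s": self.states[index],
            "a": self.actions[index],            # true action
            "fake_a": self.fake_actions[index],  # K fake actions
        }

    def __len__(self):
        return len(self.states)


# Toy data: 4 transitions, 1-D states and actions, K = 3 fake actions each.
rng = random.Random(0)
states = [[float(i)] for i in range(4)]
actions = [[rng.uniform(-1.0, 1.0)] for _ in range(4)]
# Assumption: random placeholders instead of behaviour-policy samples.
fake_actions = [[[rng.uniform(-1.0, 1.0)] for _ in range(3)] for _ in range(4)]

dataset = ToyContrastiveDataset(states, actions, fake_actions)
sample = dataset[2]
```

Because the object implements `__getitem__` and `__len__`, it follows the map-style dataset protocol and could in principle be wrapped by a batching loader.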

__init__(env_id)

Overview: Initialization method of the QGPOD4RLDataset class.

Parameters: env_id (str) – The environment ID.

QGPODataset

class grl.datasets.QGPODataset

Overview: Dataset for the QGPO algorithm. QGPO training is based on contrastive energy prediction, which requires both true actions and fake actions. True actions are sampled from the dataset; fake actions are sampled from the action support generated by the behaviour policy.
Interface: __init__, __getitem__, __len__.

__init__()

Overview: Initialization method of the QGPODataset class.

GPD4RLDataset

class grl.datasets.GPD4RLDataset(env_id)

Overview: D4RL dataset for the Generative Policy algorithm.
Interface: __init__, __getitem__, __len__.

__init__(env_id)

Overview: Initialization method of the GPD4RLDataset class.

Parameters: env_id (str) – The environment ID.

GPDataset

class grl.datasets.GPDataset

Overview: Dataset for the Generative Policy algorithm. Training a Generative Policy sometimes requires both true actions and fake actions. True actions are sampled from the dataset; fake actions are sampled from the behaviour policy and serve as data augmentation.
Interface: __init__, __getitem__, __len__.
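
The augmentation step described above can be sketched as follows. This is an illustration under stated assumptions, not grl code: `add_fake_actions`, its `enabled` flag (reflecting that fake actions are only sometimes needed), and the placeholder `noisy_policy` are hypothetical names, and the Gaussian sampler merely stands in for a behaviour policy.

```python
import random


def add_fake_actions(transitions, sample_action, num_fake=4, enabled=True):
    """Return copies of the transitions, optionally augmented with fake actions.

    `sample_action` is a hypothetical callable (state -> action) standing in
    for a behaviour policy.
    """
    augmented = []
    for t in transitions:
        item = dict(t)  # shallow copy so the input list is left untouched
        if enabled:
            item["fake_a"] = [sample_action(t["s"]) for _ in range(num_fake)]
        augmented.append(item)
    return augmented


rng = random.Random(0)


def noisy_policy(state):
    # Placeholder behaviour policy: Gaussian action centred on the negated state.
    return [rng.gauss(-x, 0.1) for x in state]


transitions = [{"s": [0.5], "a": [-0.4]}, {"s": [-1.0], "a": [0.9]}]
with_fakes = add_fake_actions(transitions, noisy_policy, num_fake=4)
without_fakes = add_fake_actions(transitions, noisy_policy, enabled=False)
```

Keeping the true action under `a` and the sampled ones under a separate `fake_a` key means algorithms that do not need the augmentation can ignore it without any change to the item layout.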

__init__()

Overview: Initialization method of the GPDataset class.