Fido
rl::FidoControlSystem Class Reference

A highly effective reinforcement learning control system (Truell and Gruenstein)

#include <FidoControlSystem.h>

Inherits rl::WireFitQLearn.

Classes

struct  History
 

Public Member Functions

 FidoControlSystem (int stateDimensions, Action minAction, Action maxAction, int baseOfDimensions)
 Initializes a FidoControlSystem.
 
std::vector< double > chooseBoltzmanActionDynamic (State state)
 
void applyReinforcementToLastAction (double reward, State newState)
 Updates the control system's model by giving it a reward for its last action.
 
void reset ()
 Reverts the control system to a newly initialized state.
 
- Public Member Functions inherited from rl::WireFitQLearn
 WireFitQLearn (unsigned int stateDimensions, unsigned int actionDimensions_, unsigned int numHiddenLayers, unsigned int numNeuronsPerHiddenLayer, unsigned int numberOfWires_, Action minAction_, Action maxAction_, unsigned int baseOfDimensions_, Interpolator *interpolator_, net::Trainer *trainer_, double learningRate_, double devaluationFactor_)
 Initializes a completely new WireFitQLearn object with all necessary values.
 
 WireFitQLearn ()
 Initializes an empty, invalid WireFitQLearn object.
 
 WireFitQLearn (std::ifstream *input)
 Initializes a WireFitQLearn object from a stream.
 
Action chooseBestAction (State currentState)
 Gets the action that the network deems most beneficial for the current state.
 
Action chooseBoltzmanAction (State currentState, double explorationConstant)
 Gets an action using the Boltzmann softmax probability distribution.
 
void applyReinforcementToLastAction (double reward, State newState)
 Updates expected reward values.
 
void reset ()
 Resets the system's model to a newly initialized state.
 
void store (std::ofstream *output)
 Stores this model in a stream.
 

Public Attributes

const double initialExploration = 1
 
const unsigned int samplesOfHistory = 10
 
double explorationLevel
 
double lastUncertainty
 
- Public Attributes inherited from rl::WireFitQLearn
net::NeuralNet * network
 
Interpolator * interpolator
 
net::Trainer * trainer
 
unsigned int numberOfWires
 
unsigned int actionDimensions
 
double learningRate
 
double devaluationFactor
 
double controlPointsGDErrorTarget
 
double controlPointsGDLearningRate
 
int controlPointsGDMaxIterations
 
unsigned int baseOfDimensions
 
State lastState
 
Action minAction
 
Action maxAction
 
Action lastAction
 
net::NeuralNet * modelNet
 

Protected Member Functions

std::vector< FidoControlSystem::History > selectHistories ()
 
void trainOnHistories (std::vector< FidoControlSystem::History > selectedHistories)
 
void adjustExploration (double uncertainty)
 
double getError (std::vector< double > input, std::vector< double > correctOutput)
 
std::vector< Wire > newControlWiresForHistory (History history)
 
- Protected Member Functions inherited from rl::WireFitQLearn
std::vector< Wire > getWires (State state)
 
std::vector< Wire > getSetOfWires (const State &state, int baseOfDimensions)
 
std::vector< double > getRawOutput (std::vector< Wire > wires)
 
double highestReward (State state)
 
Action bestAction (State state)
 
double getQValue (double reward, const State &oldState, const State &newState, const Action &action, const std::vector< Wire > &controlWires)
 
std::vector< Wire > newControlWires (const Wire &correctWire, std::vector< Wire > controlWires)
 

Protected Attributes

std::vector< History > histories
 

Detailed Description

A highly effective reinforcement learning control system (Truell and Gruenstein)

Constructor & Destructor Documentation

FidoControlSystem::FidoControlSystem (int stateDimensions, Action minAction, Action maxAction, int baseOfDimensions)

Initializes a FidoControlSystem.

Parameters
stateDimensions  the number of dimensions of the state fed to the control system (i.e., the number of elements in the state vector)
minAction  the minimum possible action (e.g., a vector of doubles) that the control system may output
maxAction  the maximum possible action (e.g., a vector of doubles) that the control system may output
baseOfDimensions  the number of possible discrete values in each action dimension. For example, if baseOfDimensions = 2, minAction = {0, 0}, and maxAction = {1, 1}, then the possible actions are {0, 0}, {0, 1}, {1, 0}, and {1, 1}.

Member Function Documentation

void FidoControlSystem::adjustExploration (double uncertainty) [protected]

void FidoControlSystem::applyReinforcementToLastAction (double reward, State newState) [virtual]

Updates the control system's model by giving it a reward for its last action.

Parameters
reward  the reward associated with the control system's last action
newState  the new state vector (needed because states may change after an action is performed)

Implements rl::Learner.

std::vector< double > FidoControlSystem::chooseBoltzmanActionDynamic ( State  state)
double FidoControlSystem::getError (std::vector< double > input, std::vector< double > correctOutput) [protected]
std::vector< Wire > FidoControlSystem::newControlWiresForHistory ( History  history)
protected
void FidoControlSystem::reset () [virtual]

Reverts the control system to a newly initialized state.

Resets the control system's model and wipes its memory of past actions, states, and rewards.

Implements rl::Learner.

std::vector< FidoControlSystem::History > FidoControlSystem::selectHistories () [protected]
void FidoControlSystem::trainOnHistories (std::vector< FidoControlSystem::History > selectedHistories) [protected]

Member Data Documentation

double rl::FidoControlSystem::explorationLevel
std::vector< History > rl::FidoControlSystem::histories [protected]
const double rl::FidoControlSystem::initialExploration = 1
double rl::FidoControlSystem::lastUncertainty
const unsigned int rl::FidoControlSystem::samplesOfHistory = 10

The documentation for this class was generated from the following files: