In smart grid environments with distributed generation, homes are encouraged to generate power and sell it back to utilities. Time-of-use pricing and the introduction of storage devices greatly influence a user's decision on when to sell back power and how much to sell. A study of sequential decision-making algorithms that can optimize the user's total payoff is therefore necessary. In this paper, reinforcement learning is used to solve this optimization problem. The problem of determining when to sell back power is formulated as a Markov decision process, and a near-optimal strategy is chosen using policy iteration. The results show a significant increase in total rewards from selling back power with the proposed approach.
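
To make the formulation concrete, the following is a minimal sketch of policy iteration on a toy sell-back problem. It is not the paper's model: the state (battery level, price period), the two-period price schedule, the battery capacity, the generation distribution, and the discount factor are all hypothetical values chosen for illustration.

```python
# Toy sell-back MDP solved with policy iteration.
# Every constant below is an illustrative assumption, not a value from the paper.
PRICES = [1.0, 3.0]           # hypothetical sell-back price: off-peak, peak
CAPACITY = 2                  # hypothetical battery capacity (energy units)
GEN_PROB = {0: 0.5, 1: 0.5}   # hypothetical units generated per step -> probability
GAMMA = 0.9                   # hypothetical discount factor

# A state is (battery level, index of the current price period).
STATES = [(b, t) for b in range(CAPACITY + 1) for t in range(len(PRICES))]


def actions(state):
    """An action is the number of stored units to sell now (0..battery level)."""
    battery, _ = state
    return range(battery + 1)


def transitions(state, sell):
    """Return (immediate reward, [(probability, next_state), ...])."""
    battery, period = state
    reward = PRICES[period] * sell
    next_period = (period + 1) % len(PRICES)
    outcomes = [
        (p, (min(CAPACITY, battery - sell + gen), next_period))
        for gen, p in GEN_PROB.items()
    ]
    return reward, outcomes


def q_value(state, sell, values):
    """Expected discounted return of selling `sell` units, then following `values`."""
    reward, outcomes = transitions(state, sell)
    return reward + GAMMA * sum(p * values[s2] for p, s2 in outcomes)


def evaluate(policy, tol=1e-12):
    """Iterative policy evaluation: sweep until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            new = q_value(s, policy[s], values)
            delta = max(delta, abs(new - values[s]))
            values[s] = new
        if delta < tol:
            return values


def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: 0 for s in STATES}  # start from "never sell"
    while True:
        values = evaluate(policy)
        stable = True
        for s in STATES:
            best = max(actions(s), key=lambda a: q_value(s, a, values))
            # Switch only on a strict improvement to avoid cycling on ties.
            if q_value(s, best, values) > q_value(s, policy[s], values) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, values


if __name__ == "__main__":
    policy, values = policy_iteration()
    for s in STATES:
        print(s, "sell", policy[s], "value", round(values[s], 3))
```

Under these assumed numbers, the resulting policy stores energy during the off-peak period and sells the full battery during the peak period. A realistic model would need measured generation and price data and a much larger state space.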