volume 3 issue 10

APPLICATION ATO EMBEDDED AGENTS MARKOV DECISION PRECESSES

Abstract

Markov decision processes, called MMDPs in the multi-agent setting, are widely used to model many kinds of multi-agent systems. This paper proposes two new algorithms based on learning automata for solving MMDPs and determining the most effective policies. In the techniques developed, the Markov problem is represented as a directed graph. Each state of the problem is a node in this graph, and each action that causes a transition from one state to another is a directed edge between nodes. Each node is equipped with a learning automaton whose actions correspond to the node's outgoing edges. Each agent moves from node to node as it works toward the goal state. When the agent must decide which edge to follow next, the learning automaton at the current node guides the choice. The actions selected by the learning automata along the agent's path are then rewarded or penalized by the learning algorithm according to the cost of that path.
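The scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the update rule or the reward criterion, so this example assumes a linear reward-penalty update, deterministic transitions on a small acyclic graph, a single agent, and a reward whenever the path cost is no worse than a running average. The names `GRAPH`, `LearningAutomaton`, `run_episode`, and `train` are hypothetical.

```python
import random


class LearningAutomaton:
    """Holds one selection probability per outgoing edge of a node."""

    def __init__(self, n_actions):
        self.probs = [1.0 / n_actions] * n_actions

    def choose(self, rng):
        # Sample an edge index according to the current probabilities.
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def reward(self, i, a=0.1):
        # Assumed linear reward step: shift probability toward action i.
        self.probs = [p + a * (1.0 - p) if j == i else p * (1.0 - a)
                      for j, p in enumerate(self.probs)]

    def penalize(self, i, b=0.02):
        # Assumed linear penalty step: shift probability away from action i.
        r = len(self.probs)
        if r == 1:
            return  # a single edge cannot lose probability
        self.probs = [p * (1.0 - b) if j == i else b / (r - 1) + p * (1.0 - b)
                      for j, p in enumerate(self.probs)]


# Hypothetical example graph: node -> list of (next_node, edge_cost).
GRAPH = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("goal", 5.0), ("b", 1.0)],
    "b": [("goal", 1.0)],
}


def run_episode(graph, automata, start, goal, rng):
    """Walk from start to goal; return the (node, action) path and its cost."""
    node, path, cost = start, [], 0.0
    while node != goal:
        i = automata[node].choose(rng)
        next_node, edge_cost = graph[node][i]
        path.append((node, i))
        cost += edge_cost
        node = next_node
    return path, cost


def train(graph, start, goal, episodes=500, seed=0):
    rng = random.Random(seed)
    automata = {n: LearningAutomaton(len(edges)) for n, edges in graph.items()}
    avg_cost = None
    for _ in range(episodes):
        path, cost = run_episode(graph, automata, start, goal, rng)
        # Assumed criterion: reward paths no costlier than the running average.
        favorable = avg_cost is None or cost <= avg_cost
        for node, i in path:
            if favorable:
                automata[node].reward(i)
            else:
                automata[node].penalize(i)
        avg_cost = cost if avg_cost is None else 0.9 * avg_cost + 0.1 * cost
    return automata


if __name__ == "__main__":
    trained = train(GRAPH, "s", "goal")
    for name, la in trained.items():
        print(name, [round(p, 3) for p in la.probs])
```

Rewarding or penalizing every automaton on the path after the episode ends mirrors the abstract's description, in which feedback depends on the cost of the whole path rather than on individual edges.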

Keywords
  • Embedded
  • Markov Decision
  • Automation
  • Transitions
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Vikrant Lochan. (2020). APPLICATION ATO EMBEDDED AGENTS MARKOV DECISION PRECESSES. International Journal of Multidisciplinary Research and Studies, 3(10), 0–9. Retrieved from https://ijmras.com/index.php/ijmras/article/view/182
