WEIGHT SPEEDY Q-LEARNING FOR FEEDBACK STABILIZATION OF PROBABILISTIC BOOLEAN CONTROL NETWORKS
In this paper, a scalable reinforcement learning (RL)-based technique is presented for the control of probabilistic Boolean control networks (PBCNs). In particular, we propose an improved Q-learning (QL) algorithm: weight speedy Q-learning (WSQL). Based on WSQL, the feedback stabilization problem of PBCNs is solved, and a state feedback controller is designed that stabilizes the PBCN at a given equilibrium point. Depending on the controller design, the PBCN can achieve either finite-time stability or asymptotic stability. The presented method is model-free and scalable. We also verify the convergence of the proposed algorithm. Finally, simulation results illustrate that, compared with QL, the proposed algorithm converges to the fixed point faster.
Keywords: probabilistic Boolean control networks, feedback stabilization, weight speedy Q-learning, model-free technique.
Received: February 3, 2023; Accepted: March 17, 2023; Published: April 19, 2023
How to cite this article: Yangyang Chen, Weight speedy Q-learning for feedback stabilization of probabilistic Boolean control networks, Far East Journal of Applied Mathematics 116(2) (2023), 149-171. http://dx.doi.org/10.17654/0972096023009
This open access article is licensed under the Creative Commons Attribution 4.0 International License.
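For readers unfamiliar with speedy Q-learning, the sketch below illustrates the style of update the abstract refers to. It is a minimal tabular implementation of the speedy Q-learning rule of Azar et al. (2011) with a weight parameter `w` on the acceleration term (w = 1 recovers standard speedy Q-learning); the actual WSQL weighting scheme, the `ToyPBCN` dynamics, the environment interface (`reset`/`step`), and all hyperparameter values are illustrative assumptions, not the construction given in the paper.

```python
import numpy as np

class ToyPBCN:
    """Hypothetical 2-gene PBCN used only to exercise the learner:
    at each step one of two Boolean update rules is sampled with
    probability p / (1 - p). The state packs (x1, x2) into an
    integer 0..3; the reward is 1 at the target equilibrium 0."""

    def __init__(self, p=0.7, target=0, seed=0):
        self.p, self.target = p, target
        self.rng = np.random.default_rng(seed)

    def reset(self):
        return int(self.rng.integers(4))

    def step(self, s, a):
        x1, x2 = s >> 1, s & 1
        if self.rng.random() < self.p:      # constituent network 1
            y1, y2 = x2 & a, x1 | a
        else:                               # constituent network 2
            y1, y2 = x1 ^ a, x2 & (1 - a)
        s2 = (y1 << 1) | y2
        return s2, float(s2 == self.target)

def weighted_speedy_q_learning(env, n_states, n_actions, gamma=0.95,
                               w=1.0, episodes=2000, horizon=30):
    """Tabular speedy Q-learning with a weight w on the acceleration
    term; w = 1 recovers the update of Azar et al. (2011)."""
    Q_prev = np.zeros((n_states, n_actions))  # Q_{k-1}
    Q = np.zeros((n_states, n_actions))       # Q_k
    visits = np.zeros((n_states, n_actions))  # per-pair update counts

    for _ in range(episodes):
        s = env.reset()
        for _ in range(horizon):
            a = np.random.randint(n_actions)  # uniform exploration
            s2, r = env.step(s, a)
            visits[s, a] += 1
            alpha = 1.0 / visits[s, a]        # SQL step size 1/(k+1)

            # Empirical Bellman operator applied to Q_{k-1} and Q_k
            tq_prev = r + gamma * Q_prev[s2].max()
            tq_curr = r + gamma * Q[s2].max()

            # Speedy Q-learning update with weighted acceleration term
            q_new = (Q[s, a] + alpha * (tq_prev - Q[s, a])
                     + w * (1.0 - alpha) * (tq_curr - tq_prev))

            Q_prev[s, a], Q[s, a] = Q[s, a], q_new
            s = s2

    # Greedy policy = state feedback controller u = pi(x)
    return Q.argmax(axis=1), Q

pi, Q = weighted_speedy_q_learning(ToyPBCN(), n_states=4, n_actions=2)
print("state feedback law:", pi)
```

The returned argmax policy plays the role of the state feedback controller described in the abstract; whether the resulting closed loop is finite-time or asymptotically stable depends, as the abstract notes, on how that controller is designed.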