Doklady of the Russian Academy of Sciences. Mathematics, Informatics, Control Processes, 2023, Vol. 514, No. 2, pp. 212–224

OPTIMAL ANALYSIS OF A METHOD WITH BATCHING FOR STOCHASTIC FINITE-SUM VARIATIONAL INEQUALITIES

A. Pichugin 1, M. Pechin 1, A. Beznosikov 1*, A. Savchenko 2, A. Gasnikov 1

1 Moscow Institute of Physics and Technology
Dolgoprudny, Russia

2 Artificial Intelligence Laboratory, Sberbank PJSC
Moscow, Russia

* E-mail: beznosikov.an@phystech.edu

Received September 1, 2023
Revised September 15, 2023
Accepted October 18, 2023

Abstract

Variational inequalities are a unified optimization formulation that is of interest not only in its own right, but also because it encompasses both minimization and saddle-point problems. Meanwhile, modern applications motivate stochastic formulations of optimization problems. This paper presents an analysis of a method that achieves optimal convergence rates for monotone stochastic variational inequalities of finite-sum form. Unlike previous works, our method supports batching and retains optimal oracle complexity for any batch size. The effectiveness of the algorithm, especially for small but non-unit batch sizes, is confirmed by numerical experiments.

Keywords: stochastic optimization, variational inequalities, finite-sum problems, batching
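
For context, the finite-sum variational inequality studied here is the problem of finding a point z* in a convex set Z such that ⟨F(z*), z − z*⟩ ≥ 0 for all z in Z, where F(z) = (1/n) Σ_{i=1}^{n} F_i(z) is a monotone operator. Minimization is recovered with F = ∇f, and a convex-concave saddle-point problem min_x max_y f(x, y) is recovered with F(x, y) = (∇_x f(x, y), −∇_y f(x, y)). The sketch below illustrates only the general pattern of an extragradient-type step with mini-batched operator estimates; it is a minimal Python illustration under these assumptions, not the algorithm analyzed in the paper, and all names (ops, gamma, batch_size) are hypothetical.

    import numpy as np

    def batched_extragradient(ops, z0, gamma=0.1, batch_size=8, n_iters=1000, seed=0):
        # ops: list of callables F_i, each mapping R^d -> R^d; their average
        # is the monotone operator F. Defaults are illustrative, not tuned.
        rng = np.random.default_rng(seed)
        n = len(ops)
        z = np.asarray(z0, dtype=float)
        for _ in range(n_iters):
            # Mini-batched estimate of F(z): indices sampled with replacement.
            idx = rng.integers(0, n, size=batch_size)
            g = sum(ops[i](z) for i in idx) / batch_size
            z_half = z - gamma * g            # extrapolation (look-ahead) step
            # Fresh mini-batch evaluated at the look-ahead point.
            idx = rng.integers(0, n, size=batch_size)
            g_half = sum(ops[i](z_half) for i in idx) / batch_size
            z = z - gamma * g_half            # main step
        return z

For example, for a bilinear saddle-point problem min_x max_y (1/n) Σ_i x^T A_i y, one would take F_i(x, y) = (A_i y, −A_i^T x), which is monotone; a larger batch_size reduces the variance of each operator estimate at the cost of more oracle calls per iteration.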


There are no supplementary materials.
