Доклады Российской академии наук. Математика, информатика, процессы управления, 2023, Vol. 514, No. 2, pp. 60–71

NEURAL NETWORK METRIC LEARNING: A COMPARISON OF LOSS FUNCTIONS

R. L. Vasilev 1*, A. G. Dyakonov 2**

1 Yandex LLC
Moscow, Russia

2 Central University
Moscow, Russia

* E-mail: artnitolog@yandex.com
** E-mail: djakonov@mail.ru

Received June 30, 2023
Revised September 19, 2023
Accepted October 15, 2023

Abstract

A survey of metric learning methods based on deep neural networks is presented. These methods have appeared in recent years but were compared only with their predecessors, using neural networks of architectures that are by now outdated to learn the representations on which the metric is computed. We compare the described methods on datasets from several domains, using pretrained neural networks comparable in quality to the state of the art (SotA): ConvNeXt for images and DistilBERT for texts. Labeled datasets were used, split into two parts (training and test) so that the classes do not overlap (i.e., the test part contains no objects of the classes present in the training part). Such a large-scale fair comparison is carried out for the first time and leads to unexpected conclusions: some “older” methods, for example Tuplet Margin Loss, outperform their modern modifications as well as methods proposed in very recent works.
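For illustration, the evaluation protocol described above can be sketched in a few lines of code. This is a minimal sketch, not the authors' implementation: the toy labels, the embedding dimension, and the random embeddings stand in for real data and a pretrained ConvNeXt/DistilBERT encoder, while the loss is taken from the PyTorch Metric Learning library [35], which provides Tuplet Margin Loss among the compared losses.

```python
# Minimal sketch of the protocol from the abstract (assumptions: toy labels, random
# embeddings instead of a pretrained backbone, pytorch-metric-learning for the loss).
import numpy as np
import torch
from pytorch_metric_learning import losses

def class_disjoint_split(labels, train_fraction=0.5, seed=0):
    """Split class IDs (not objects) so that training and test classes do not overlap."""
    rng = np.random.default_rng(seed)
    classes = rng.permutation(np.unique(labels))
    n_train = int(len(classes) * train_fraction)
    train_classes = set(classes[:n_train].tolist())
    is_train = np.array([l in train_classes for l in labels])
    return np.flatnonzero(is_train), np.flatnonzero(~is_train)

labels = np.repeat(np.arange(6), 4)                 # 6 toy classes, 4 objects each
train_idx, test_idx = class_disjoint_split(labels)
assert not set(labels[train_idx]) & set(labels[test_idx])   # no shared classes

# One of the compared losses, with the library's default hyperparameters.
loss_fn = losses.TupletMarginLoss(margin=5.73, scale=64)

# Stand-in for embeddings produced by a pretrained encoder (e.g., ConvNeXt or DistilBERT).
embeddings = torch.randn(len(train_idx), 128, requires_grad=True)
loss = loss_fn(embeddings, torch.tensor(labels[train_idx]))  # minimized during fine-tuning
loss.backward()
```

Retrieval quality is then measured on embeddings of the test objects; since their classes never appear during training, the comparison evaluates how well the learned metric generalizes to unseen classes.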

Keywords: machine learning, deep learning, metric, similarity

References

  1. Wei Chen, Yang Liu, Weiping Wang, Bakker E.M., Georgiou T.K., Paul Fieguth, Li Liu, Lew M.S. Deep image retrieval: A survey. arXiv preprint, 2021.

  2. Reimers N., Gurevych I. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084, 2019.

  3. Iacopo Masi, Yue Wu, Tal Hassner, Prem Natarajan. Deep face recognition: A survey. In 2018 31st SIBGRAPI conference on graphics, patterns and images (SIBGRAPI). IEEE, 2018. P. 471–478.

  4. Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven CH Hoi. Deep learning for person re-identification: A survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

  5. Musgrave K., Belongie S., Ser-Nam Lim. A metric learning reality check. In European Conference on Computer Vision. Springer, 2020. P. 681–699.

  6. Johnson J., Douze M., Jégou H. Billion-scale similarity search with GPUs. IEEE Transactions on Big Data. 2019. V. 7. № 3. P. 535–547.

  7. Chopra S., Hadsell R., LeCun Y. Learning a similarity metric discriminatively, with application to face verification. In 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). IEEE. 2005. V. 1. P. 539–546.

  8. Schroff F., Kalenichenko D., Philbin J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. P. 815–823.

  9. Fatih Cakir, Kun He, Xide Xia, Brian Kulis, Stan Sclaroff. Deep metric learning to rank. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. P. 1861–1870.

  10. Ustinova E., Lempitsky V. Learning deep embeddings with histogram loss. Advances in Neural Information Processing Systems. 2016. V. 29.

  11. Wieczorek M., Rychalska B., Dąbrowski J. On the unreasonable effectiveness of centroids in image retrieval. In International Conference on Neural Information Processing. Springer, 2021. P. 212–223.

  12. Chao-Yuan Wu, Manmatha R., Smola A.J., Krahenbuhl P. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision. 2017. P. 2840–2848.

  13. Xun Wang, Xintong Han, Weilin Huang, Dengke Dong, Scott M.R. Multi-similarity loss with general pair weighting for deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. P. 5022–5030.

  14. Frosst N., Papernot N., Hinton G. Analyzing and improving representations with the soft nearest neighbor loss. In International conference on machine learning. PMLR, 2019. P. 2012–2020.

  15. Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan. Supervised contrastive learning. Advances in Neural Information Processing Systems. 2020. V. 33. P. 18661–18673.

  16. Tongtong Yuan, Weihong Deng, Jian Tang, Yinan Tang, Binghui Chen. Signal-to-noise ratio: A robust distance metric for deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019. P. 4815–4824.

  17. Baosheng Yu, Dacheng Tao. Deep metric learning with tuplet margin loss. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. P. 6490–6499.

  18. Yifan Sun, Changmao Cheng, Yuhan Zhang, Chi Zhang, Liang Zheng, Zhongdao Wang, Yichen Wei. Circle loss: A unified perspective of pair similarity optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. P. 6398–6407.

  19. Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. P. 4690–4699.

  20. Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, Wei Liu. Cosface: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2018. P. 5265–5274.

  21. Jiankang Deng, Jia Guo, Tongliang Liu, Mingming Gong, Stefanos Zafeiriou. Sub-center ArcFace: Boosting face recognition by large-scale noisy web faces. In European Conference on Computer Vision. Springer, 2020. P. 741–757.

  22. Qi Qian, Lei Shang, Baigui Sun, Juhua Hu, Hao Li, Rong Jin. Softtriple loss: Deep metric learning without triplet sampling. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019. P. 6450–6458.

  23. Sungyeon Kim, Dongwon Kim, Minsu Cho, Suha Kwak. Proxy anchor loss for deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020. P. 3238–3247.

  24. Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 3d object representations for fine-grained categorization. In 4th International IEEE Workshop on 3D Representation and Recognition (3dRR-13), Sydney, Australia, 2013.

  25. Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, Serge Belongie. The Caltech-UCSD Birds-200-2011 dataset. 2011.

  26. Hyun Oh Song, Yu Xiang, Stefanie Jegelka, Silvio Savarese. Deep metric learning via lifted structured feature embedding. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

  27. Ding-Nan Zou, Song-Hai Zhang, Tai-Jiang Mu, Min Zhang. A new dataset of dog breed images and a benchmark for fine-grained classification. Computational Visual Media. 2020. V. 6. № 4. P. 477–487.

  28. Ken Lang. Newsweeder: Learning to filter netnews. In Machine Learning Proceedings 1995, Elsevier, 1995. P. 331–339.

  29. Kamran Kowsari, Donald E Brown, Mojtaba Heidarysafa, Kiana Jafari Meimandi, Matthew S Gerber, Laura E Barnes. Hdltex: Hierarchical deep learning for text classification. In 2017 16th IEEE international conference on machine learning and applications (ICMLA). IEEE, 2017. P. 364–371.

  30. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2016. P. 770–778.

  31. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. P. 1–9.

  32. Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.

  33. Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108, 2019.

  34. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

  35. Kevin Musgrave, Serge Belongie, Ser-Nam Lim. PyTorch Metric Learning, 2020.

  36. Kingma D.P., Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.

  37. Wohlwend J., Elenberg E.R., Altschul S., Henry S., Tao Lei. Metric learning for dynamic text classification. arXiv preprint arXiv:1911.01026, 2019.

There are no supplementary materials.
