SEMANTICALLY ENHANCED COLLABORATIVE FILTERING FOR MOVIE RECOMMENDER SYSTEMS USING LARGE LANGUAGE MODELS

Phạm Thị Thu Trang, Đặng Khánh Hòa, Nguyễn Hoàng, Nguyễn Vũ Sơn

doi:10.59266/houjs.2026.1181

Keywords:

artificial intelligence, recommender systems, collaborative filtering, large language models, semantic embeddings

Abstract

Traditional recommender systems often struggle with data sparsity and with a limited semantic understanding of both user preferences and item content. To address these challenges, this paper proposes a new hybrid recommendation framework that integrates collaborative filtering with semantic features derived from movie metadata and from descriptions generated by a large language model (LLM). We use a pretrained LLM to automatically generate rich textual content for each movie, then encode that text with fastText to enrich the item representations. These semantic embeddings are combined with user-item interaction data to produce more accurate and more personalized recommendations. Experiments on the MovieLens-20M dataset show that the proposed model significantly outperforms traditional methods on RMSE, Precision@5, and Recall@5. These findings highlight the potential of LLMs and text augmentation for improving the effectiveness of recommender systems.
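The abstract does not state how the semantic embeddings are fused with the interaction data, so the following is only a minimal sketch of one common option, not the authors' method. It assumes an item-based neighbourhood model in which a rating-based cosine similarity is linearly blended with a cosine similarity over item embeddings. The blending weight `alpha`, the function names, and the toy data are all hypothetical; in a pipeline like the one described, `item_emb` would hold fastText vectors of the LLM-generated descriptions, whereas here it holds hand-made 2-dimensional vectors. The sketch also shows how RMSE, Precision@k, and Recall@k are typically computed.

```python
import numpy as np


def cosine_sim(matrix):
    """Pairwise cosine similarity between the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    unit = matrix / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T


def hybrid_item_similarity(ratings, item_emb, alpha=0.5):
    """Blend rating-based and embedding-based item-item similarity.

    ratings  : (n_users, n_items) array, 0 meaning "not rated".
    item_emb : (n_items, d) array of semantic item vectors.
    alpha    : hypothetical blending weight in [0, 1] (an assumption).
    """
    collaborative = cosine_sim(ratings.T)  # items described by their ratings
    semantic = cosine_sim(item_emb)        # items described by their text
    return alpha * collaborative + (1.0 - alpha) * semantic


def predict_rating(ratings, sim, user, item):
    """Similarity-weighted average of the ratings the user already gave."""
    rated = np.flatnonzero(ratings[user])
    rated = rated[rated != item]
    if rated.size == 0:
        return 0.0  # no information about this user
    weights = np.clip(sim[item, rated], 0.0, None)  # ignore negative similarity
    if weights.sum() == 0:
        return float(ratings[user, rated].mean())   # fall back to the user mean
    return float(weights @ ratings[user, rated] / weights.sum())


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def precision_recall_at_k(recommended, relevant, k=5):
    """Precision@k and Recall@k for one user's ranked recommendation list."""
    top_k = list(recommended)[:k]
    relevant = set(relevant)
    hits = len(set(top_k) & relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data (assumption): 4 users x 4 items, ratings on a 1-5 scale.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 0, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Stand-in for text embeddings: items 0-1 and items 2-3 are "semantically close".
item_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])

sim = hybrid_item_similarity(ratings, item_emb, alpha=0.5)
prediction = predict_rating(ratings, sim, user=0, item=2)
print(f"predicted rating of user 0 for item 2: {prediction:.2f}")
```

Because the semantic term does not depend on ratings, an item with few or no interactions still has meaningful neighbours, which is the sparsity problem the abstract refers to. Setting `alpha=1.0` recovers plain item-based collaborative filtering, which makes a convenient baseline for comparison.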
