AI-Optimized Kubernetes Scheduling: Node Affinity for Java Microservices

Authors

  • Sandeep Reddy Gundla, Lead Software Engineer, MACYS Inc, GA, USA

DOI:

https://doi.org/10.56830/IJSIE202405

Keywords:

AI-Optimized Scheduling, Kubernetes, Java Microservices, Node Affinity, Reinforcement Learning

Abstract

This study investigates AI-optimized Kubernetes scheduling that uses Node Affinity to improve the deployment and performance of Java microservices. It reviews the drawbacks of conventional rule-based schedulers: they do not adapt to changing workloads and resource requirements, which tends to cause inefficient node assignment, resource partitioning, and performance problems. The proposed framework integrates machine learning algorithms with the native Kubernetes scheduler, using supervised learning to predict resource requirements and reinforcement learning to produce adaptive schedules, so that the scheduler can make data-driven decisions. A custom scheduler plugin draws on historical and real-time data, including CPU, memory, I/O, pod latency, and throughput metrics, to perform predictive node scoring and affinity-based pod placement. Experiments conducted in an isolated Kubernetes cluster show that AI-optimized scheduling yields a 12% decrease in mean absolute error when forecasting required resources, a 25% improvement in microservice throughput, an 18% increase in CPU and memory utilization, and a 15% reduction in response time compared with the default scheduler. Placement efficiency for the Java microservices also increases by 22%, which indicates that the framework successfully matches microservice specifications to node capabilities. These findings indicate gains in performance, scalability, and cost-effectiveness, and they provide recommendations to help industry incorporate AI models into production Kubernetes practice. Future work will include advanced deep learning architectures, continuous model retraining, and extension to heterogeneous cloud workloads. The framework eliminates manual configuration overhead and can be continuously optimized, which supports resilient and efficient operation of larger-scale Kubernetes clusters and economical scaling across geographic regions. This research affirms the potential of AI-based scheduling for container orchestration, particularly for node affinity-based placement decisions such as those made for Java microservices.
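
The abstract does not give the plugin's scoring function, so the following is only a minimal sketch of how predictive node scoring combined with affinity-based placement might look. It assumes that a separate model has already predicted a pod's CPU and memory demand. The class names, label keys, equal weighting, and the tight-packing heuristic are all hypothetical stand-ins and are not taken from the paper, which describes a learned scoring policy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    labels: dict          # node labels, e.g. {"workload": "jvm"} (hypothetical key)
    cpu_free: float       # free CPU cores
    mem_free: float       # free memory in GiB


@dataclass
class Pod:
    name: str
    cpu_pred: float       # predicted CPU demand in cores (assumed to come from a model)
    mem_pred: float       # predicted memory demand in GiB
    required_labels: dict = field(default_factory=dict)  # hard affinity rule


def matches_affinity(pod, node):
    """True when the node carries every label the pod requires."""
    return all(node.labels.get(k) == v for k, v in pod.required_labels.items())


def score(pod, node):
    """Return a score in [0, 1], or None when the node is infeasible."""
    if not matches_affinity(pod, node):
        return None
    if node.cpu_free <= 0 or node.mem_free <= 0:
        return None
    if pod.cpu_pred > node.cpu_free or pod.mem_pred > node.mem_free:
        return None
    # Fraction of each resource left unused after placing the pod.
    cpu_left = (node.cpu_free - pod.cpu_pred) / node.cpu_free
    mem_left = (node.mem_free - pod.mem_pred) / node.mem_free
    # Assumed heuristic: less leftover capacity means a higher score.
    return 1.0 - 0.5 * (cpu_left + mem_left)


def pick_node(pod, nodes):
    """Choose the feasible node with the highest score, or None."""
    feasible = [(s, n) for n in nodes if (s := score(pod, n)) is not None]
    if not feasible:
        return None
    return max(feasible, key=lambda pair: pair[0])[1]


nodes = [
    Node("node-a", {"workload": "jvm"}, cpu_free=4.0, mem_free=16.0),
    Node("node-b", {"workload": "jvm"}, cpu_free=2.0, mem_free=8.0),
    Node("node-c", {"workload": "batch"}, cpu_free=8.0, mem_free=32.0),
]
pod = Pod("orders-service", cpu_pred=1.5, mem_pred=6.0,
          required_labels={"workload": "jvm"})

print(pick_node(pod, nodes).name)  # prints "node-b"
```

In this sketch node-c is filtered out by the affinity rule, and node-b is preferred over node-a because the predicted demand fills it more tightly. A reinforcement learning policy, as described in the abstract, would replace the fixed heuristic in `score` with a learned one.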

Published

2026-03-06

Section

Articles