Multilevel rules mining association for processing big data using genetic algorithm

  • Gebeyehu Belay Gebremeskel, Bahir Dar Institute of Technology, Bahir Dar University, Bahir Dar PO BOX 26, Ethiopia
  • Teshale Wubie Yilma, Department of Computer Science, Informatics College, Debre Tabor University, Debre Tabor PO BOX 272, Ethiopia
Article ID: 1819
Keywords: data mining; market basket data; genetic algorithm; association rules; apriori algorithm; optimization

Abstract

Data mining is a machine learning method and a subset of artificial intelligence that focuses on developing algorithms that enable a computer to learn from data and past experience within its context. Multilevel association rule mining is a crucial area for discovering interesting relationships between data elements at various levels of abstraction. Many existing algorithms for this problem rely on exhaustive search methods such as Apriori and FP-growth. However, these methods incur significant computational costs when searching for association rules in big data applications. We therefore propose a novel genetic algorithm-based method with three key innovations that speed up the search for multilevel association rules and reduce excessive computation. First, we use a category tree to describe multilevel application data sets as domain knowledge. Second, we introduce a tree-encoding schema based on the category tree to develop a heuristic multilevel association-mining algorithm. Third, we present a genetic algorithm based on the tree-encoding schema that greatly reduces the association rule search space. The method is valuable for mining multilevel association rules in big data applications.
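
To make the category tree and tree-encoding ideas concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the encoding, so the prefix-code scheme here is an assumption, and the tree contents, the transactions, and the names `PARENT`, `encode`, and `support` are all hypothetical. Each item is encoded as the sequence of child indices on its path from the root, so truncating a code to a given length generalizes the item to its ancestor at that level. The genetic search itself is not shown.

```python
# Hypothetical category tree: child -> parent (None marks the root).
PARENT = {
    "food": None,
    "dairy": "food",
    "bakery": "food",
    "milk": "dairy",
    "cheese": "dairy",
    "bread": "bakery",
    "bagel": "bakery",
}

# Hypothetical market basket transactions over leaf items.
TRANSACTIONS = [
    {"milk", "bread"},
    {"cheese", "bagel"},
    {"milk", "cheese"},
    {"bread"},
]


def path_to_root(item):
    """Return the nodes from the root down to `item`."""
    path = []
    while item is not None:
        path.append(item)
        item = PARENT[item]
    return path[::-1]


def children(node):
    """Children of `node`, in the insertion order of PARENT."""
    return [child for child, parent in PARENT.items() if parent == node]


def encode(item):
    """Assumed encoding: tuple of child indices along the root-to-item path."""
    path = path_to_root(item)
    return tuple(children(p).index(c) for p, c in zip(path, path[1:]))


def support(itemset_codes, transactions, level):
    """Fraction of transactions that contain every code in `itemset_codes`
    once each transaction item is truncated to `level` (root = level 0)."""
    wanted = set(itemset_codes)
    hits = 0
    for transaction in transactions:
        generalized = {encode(item)[:level] for item in transaction}
        if wanted <= generalized:
            hits += 1
    return hits / len(transactions)


# Level 1 asks about categories: dairy together with bakery.
level1 = support({encode("dairy"), encode("bakery")}, TRANSACTIONS, level=1)
# Level 2 asks about concrete items: milk together with bread.
level2 = support({encode("milk"), encode("bread")}, TRANSACTIONS, level=2)
```

In this toy data the category-level itemset is supported by half of the transactions while the item-level itemset is supported by a quarter, which is the kind of cross-level difference that multilevel mining is meant to expose. A genetic algorithm could use such a support value as part of a fitness function over encoded candidates, but that step is left out here.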

Published
2025-02-25
How to Cite
Gebremeskel, G. B., & Yilma, T. W. (2025). Multilevel rules mining association for processing big data using genetic algorithm. Computing and Artificial Intelligence, 3(1), 1819. https://doi.org/10.59400/cai1819