This paper addresses the discovery of association rules from transactional databases and explores the combination of GPU and cluster-based parallel computing techniques. Four HPC-based bees swarm optimization approaches are proposed. The first two, BSOMW (Bees Swarm Optimization based on Master-Workers) and MWBSO (Master-Workers based on Bees Swarm Optimization), exploit cluster-based intensive computing in the generation and evaluation steps. Because the evaluation step is the most time-consuming task, the third and fourth approaches, BSOMW-SEGPU (Bees Swarm Optimization based on Master-Workers and Single Evaluation on GPU) and MWBSO-MEGPU (Master-Workers based on Bees Swarm Optimization and Multiple Evaluation on GPU), use the GPU host to evaluate the generated rules. The proposed approaches are analyzed and empirically compared with state-of-the-art solutions. The results reveal that MWBSO-MEGPU outperforms the other proposed approaches in terms of speedup. Moreover, MWBSO-MEGPU outperforms existing HPC-based association rule mining (ARM) approaches on the Webdocs instance (the largest instance available on the web). The scalability of this approach is demonstrated on a big transactional database with more than 6 million transactions and 1 million items. To our knowledge, this is the first work that explores the combination of GPU and cluster-based parallel computing in association rule mining.
- Big data mining
- High-performance computing (HPC)
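
To illustrate why the evaluation step dominates the running time, the following minimal sequential sketch scores a single rule against a transactional database. It assumes that a rule is evaluated through the standard support and confidence measures; the exact fitness function used by the proposed approaches is not stated above, and the names `evaluate_rule`, `antecedent`, and `consequent` are hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

# Assumption: each transaction is a set of integer item identifiers.
Transaction = FrozenSet[int]


def evaluate_rule(
    antecedent: FrozenSet[int],
    consequent: FrozenSet[int],
    transactions: Sequence[Transaction],
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent => consequent.

    Hypothetical helper: one full scan of the database per rule.
    """
    n_antecedent = 0  # transactions containing the antecedent
    n_both = 0        # transactions containing antecedent and consequent
    for t in transactions:
        if antecedent <= t:
            n_antecedent += 1
            if consequent <= t:
                n_both += 1
    support = n_both / len(transactions) if transactions else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy database of four transactions (illustrative data only).
transactions = [
    frozenset({1, 2, 3}),
    frozenset({1, 2}),
    frozenset({2, 3}),
    frozenset({1, 3}),
]
support, confidence = evaluate_rule(frozenset({1}), frozenset({2}), transactions)
```

Each rule requires a full pass over the database, and the per-transaction checks are independent of one another, which is what makes this step a natural candidate for parallel evaluation. How the checks are actually distributed over GPU threads and cluster nodes is not specified here, so the sketch leaves that out.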