Abstract:
Crime statistics for South Africa, obtained from Kaggle, were used to build machine learning classification models. Earlier experiments used the same dataset but applied different techniques. The three classification algorithms employed in this study were selected on the basis of the accuracy reported in previous studies conducted on other datasets, and those results were also used for comparison during the experiment stage. The study used random forest (RF), K-nearest neighbor, and Naive Bayes classifiers. The algorithms were implemented in Python and trained on a pre-processed crime dataset. The analytical process consisted of data preparation and processing, missing value analysis, exploratory analysis, and finally model construction and evaluation. The best model is chosen according to the results. Under both evaluation approaches used in the study, the performance metrics and log loss, RF outperforms the other models.
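As an illustration of why log loss is reported alongside the usual metrics, the following minimal sketch computes multiclass log loss and accuracy in plain Python. The helper names `log_loss` and `accuracy`, the toy labels, and the predicted probabilities are all hypothetical and are not taken from the study's data or code. The example only shows that two classifiers with identical accuracy can differ in log loss, where lower is better.

```python
import math


def log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, proba):
        # Clip the probability to avoid log(0) for over-confident predictions.
        p = min(max(row[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


def accuracy(y_true, proba):
    """Fraction of samples whose most probable class matches the true label."""
    hits = 0
    for label, row in zip(y_true, proba):
        predicted = max(range(len(row)), key=row.__getitem__)
        hits += predicted == label
    return hits / len(y_true)


# Hypothetical toy data: four samples, three classes (assumed for illustration).
y_true = [0, 1, 2, 1]
confident = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.2, 0.7, 0.1]]
hesitant = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4], [0.3, 0.4, 0.3]]

for name, proba in [("confident", confident), ("hesitant", hesitant)]:
    print(name, "accuracy:", accuracy(y_true, proba), "log loss:", round(log_loss(y_true, proba), 4))
```

Both toy classifiers predict every label correctly, yet the one that assigns higher probability to the true class obtains the lower log loss. This is why a comparison on log loss can separate models that look similar on accuracy alone.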