J. Cruz-Arreola, V. Menéndez, R. Peña-Castillo, M.E. Castellanos
This proposal explores the use of data mining techniques to address a critical social issue: violence against children in southeastern Mexico. The objective is to characterize this phenomenon through data analysis, identifying the risk factors and profiles associated with different types of violence against children. The underlying assumption is that data-driven tools can support the design of more targeted prevention and intervention strategies and inform decisions about how resources for child protection are allocated.
The main data source consists of results from a risk-detection instrument applied to approximately 4,000 children in the Yucatán Peninsula. Because the instrument was applied in the region under study, the dataset supports an analysis grounded in the local context.
The study is based on the premise that violence against children is a complex phenomenon shaped by the interaction of individual, family, community, and socioeconomic factors. The data mining techniques were therefore selected for their ability to model nonlinear relationships, identify groups of individuals with similar risk profiles, and uncover associations among multiple variables.
Three key techniques will be applied: the J48 classification algorithm (the Weka implementation of the C4.5 decision tree), to categorize children into risk levels (low, medium, high); the k-means clustering algorithm, to group individuals based on shared characteristics related to risk factors; and the Apriori association rule algorithm, to detect frequent combinations of variables associated with reported cases of violence.
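To illustrate the third technique, the following is a minimal pure-Python sketch of Apriori frequent-itemset mining and rule extraction. It is not the implementation used in the study. The records are synthetic, and the item labels (`factor_a`, `factor_b`, `factor_c`, `case_reported`), the function names, and the support and confidence thresholds are all hypothetical placeholders chosen only to show the mechanics.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of records that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                a = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so frequent[a] exists.
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, sup, confidence))
    return rules


# Synthetic records with hypothetical labels; NOT data from the instrument.
records = [
    {"factor_a", "factor_b", "case_reported"},
    {"factor_a", "factor_b", "case_reported"},
    {"factor_a", "factor_c"},
    {"factor_b", "case_reported"},
    {"factor_c"},
]

frequent = apriori(records, min_support=0.4)
for antecedent, consequent, sup, conf in association_rules(frequent, 0.9):
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={sup:.2f} confidence={conf:.2f}")
```

In practice the thresholds would need to be tuned to the real dataset, and a rule with high confidence indicates co-occurrence in the records, not a causal relationship.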
The expected outcomes of this research include the development of more focused prevention strategies and improved decision-making processes by child protection institutions. Moreover, this approach offers a replicable framework for addressing similar challenges in other regions and demonstrates the value of data-driven tools in strengthening systems of care and protection for children.
Keywords: Data mining, child protection, child violence, predictive analysis.