Análise Comparativa de Redes Neurais Artificiais e Algoritmos de Ensemble para a Predição de Risco de Crédito
dc.creator | Manguinho, Maria Letícia da Silva | |
dc.date.accessioned | 2024-02-20T23:38:27Z | |
dc.date.available | 2024-02-20T23:38:27Z | |
dc.date.issued | 2023-12-06 | |
dc.identifier.citation | MANGUINHO, Maria Letícia da Silva. Análise Comparativa de Redes Neurais Artificiais e Algoritmos de Ensemble para a Predição de Risco de Crédito. Orientador: Antônio Correia de Sá Barreto Neto. 2023. Artigo (Tecnólogo em Análise e Desenvolvimento de Sistemas) - Instituto Federal de Educação, Ciência e Tecnologia de Pernambuco - Campus Paulista, Paulista, PE, 2023. | pt_BR |
dc.identifier.uri | https://repositorio.ifpe.edu.br/xmlui/handle/123456789/1177 | |
dc.description.abstract | In this study, the importance of credit risk analysis in the financial system is explored, highlighting its challenges and economic implications. Given the complexity of imbalanced data and the dynamics of the economic landscape, the aim of the work is to employ Artificial Intelligence techniques, such as Artificial Neural Networks and ensemble models, to enhance the detection of credit risk cases. Following the principles of the Cross Industry Standard Process for Data Mining, which is divided into five iterative phases, the research began with an understanding of the problem context and the establishment of clear objectives. Subsequently, a detailed analysis of credit datasets was conducted to understand the relevance of each attribute. After data preprocessing, including cleaning and treatment, modeling techniques were then applied. Finally, the models were evaluated using metrics such as accuracy, precision, and recall. Notably, ensemble models, especially Gradient Boosting and XGBoost, consistently performed well across all three datasets examined in this study. Both achieved a remarkable accuracy of 94%, the highest recorded in this research, emphasizing the effectiveness of these models. | pt_BR |
dc.format.extent | 21 p. | pt_BR |
dc.language | pt_BR | pt_BR |
dc.relation | ABEDIN, M. Z. et al. Combining weighted SMOTE with ensemble learning for the class-imbalanced prediction of small business credit risk. Complex & Intelligent Systems, Springer, p. 1–21, 2022.
ALONSO, A.; CARBO, J. M. Machine learning in credit risk: Measuring the dilemma between prediction and supervisory cost. Banco de Espana Working Paper, 2020.
BAO, W.; LIANJU, N.; YUE, K. Integration of unsupervised and supervised machine learning algorithms for credit risk assessment. Expert Systems with Applications, Elsevier, v. 128, p. 301–315, 2019.
BERRAR, D. et al. Cross-Validation. 2019.
BROWNLEE, J. How to use StandardScaler and MinMaxScaler transforms in Python. Machine Learning Mastery, v. 10, 2020.
BUSSMANN, N. et al. Explainable machine learning in credit risk management. Computational Economics, Springer, v. 57, p. 203–216, 2021.
CARUSO, G. et al. Cluster analysis for mixed data: An application to credit risk evaluation. Socio-Economic Planning Sciences, Elsevier, v. 73, p. 100850, 2021.
CHAWLA, N. V. et al. SMOTE: synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, v. 16, p. 321–357, 2002.
CHEN, T.; GUESTRIN, C. XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. [S.l.: s.n.], 2016. p. 785–794.
CREDIT Risk. 2020. ⟨https://www.kaggle.com/datasets/upadorprofzs/credit-risk⟩. Accessed: 21 Jun. 2023.
CREDIT Risk Dataset. 2020. ⟨https://www.kaggle.com/datasets/laotse/credit-risk-dataset⟩. Accessed: 21 Jun. 2023.
GERMAN Credit. 2019. ⟨https://online.stat.psu.edu/stat508/book/export/html/796⟩. Accessed: 21 Jun. 2023.
LI, Z. et al. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Transactions on Neural Networks and Learning Systems, IEEE, 2021.
LIU, W.; FAN, H.; XIA, M. Credit scoring based on tree-enhanced gradient boosting decision trees. Expert Systems with Applications, Elsevier, v. 189, p. 116034, 2022.
MARTÍNEZ-PLUMED, F. et al. CRISP-DM twenty years later: From data mining processes to data science trajectories. IEEE Transactions on Knowledge and Data Engineering, IEEE, v. 33, n. 8, p. 3048–3061, 2019.
MIIKKULAINEN, R. et al. Evolving deep neural networks. In: Artificial intelligence in the age of neural networks and brain computing. [S.l.]: Elsevier, 2019. p. 293–312.
RASCHKA, S. Python machine learning. [S.l.]: Packt Publishing Ltd, 2015.
RIGATTI, S. J. Random forest. Journal of Insurance Medicine, American Academy of Insurance Medicine, v. 47, n. 1, p. 31–39, 2017.
RODRÍGUEZ, P. et al. Beyond one-hot encoding: Lower dimensional target embedding. Image and Vision Computing, Elsevier, v. 75, p. 21–31, 2018.
SCHWARZ, K. Mind the gap: Disentangling credit and liquidity in risk spreads. Review of Finance, Oxford University Press, v. 23, n. 3, p. 557–597, 2019.
SHEN, F. et al. A novel ensemble classification model based on neural networks and a classifier optimisation technique for imbalanced credit risk evaluation. Physica A: Statistical Mechanics and its Applications, Elsevier, v. 526, p. 121073, 2019.
TAUD, H.; MAS, J. Multilayer perceptron (MLP). Geomatic approaches for modeling land change scenarios, Springer, p. 451–455, 2018.
TELES, G. et al. Artificial neural network and Bayesian network models for credit risk prediction. Journal of Artificial Intelligence and Systems, Institute of Electronics and Computer, v. 2, n. 1, p. 118–132, 2020.
TURNER, R. et al. Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020. In: PMLR. NeurIPS 2020 Competition and Demonstration Track. [S.l.], 2021. p. 3–26.
WICKHAM, H.; WICKHAM, H. Data analysis. [S.l.]: Springer, 2016.
WINSTON, P. H. Artificial intelligence. [S.l.]: Addison-Wesley Longman Publishing Co., Inc., 1984.
YANG, M. et al. Deep neural networks with L1 and L2 regularization for high dimensional corporate credit risk prediction. Expert Systems with Applications, Elsevier, v. 213, p. 118873, 2023.
YEGNANARAYANA, B. Artificial neural networks. [S.l.]: PHI Learning Pvt. Ltd., 2009.
ZHANG, C.; MA, Y. Ensemble machine learning: methods and applications. [S.l.]: Springer, 2012.
ZHANG, H. The optimality of naive Bayes. Aa, v. 1, n. 2, p. 3, 2004. | pt_BR |
dc.rights | Open Access | pt_BR |
dc.subject | Credit Risk Prediction | pt_BR |
dc.subject | Machine Learning | pt_BR |
dc.subject | Ensemble Models | pt_BR |
dc.subject | Artificial Neural Networks | pt_BR |
dc.subject | Comparative Evaluation | pt_BR |
dc.title | Análise Comparativa de Redes Neurais Artificiais e Algoritmos de Ensemble para a Predição de Risco de Crédito | pt_BR |
dc.type | Article | pt_BR |
dc.creator.Lattes | http://lattes.cnpq.br/9008035687531979 | pt_BR |
dc.contributor.advisor1 | Barreto Neto, Antônio Correia de Sá | |
dc.contributor.advisor1Lattes | http://lattes.cnpq.br/2773609778338983 | pt_BR |
dc.contributor.referee1 | Oliveira, Flávio Rosendo da Silva | |
dc.contributor.referee2 | Medeiros, Erika Carlos | |
dc.contributor.referee1Lattes | http://lattes.cnpq.br/6828380394080049 | pt_BR |
dc.contributor.referee2Lattes | http://lattes.cnpq.br/6574506939749437 | pt_BR |
dc.publisher.department | Campus Paulista | pt_BR |
dc.publisher.country | Brazil | pt_BR |
dc.subject.cnpq | CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO | pt_BR |
dc.description.resumo | This study explores the importance of credit risk analysis in the financial system, highlighting its challenges and economic implications. Given the complexity of imbalanced data and the dynamics of the economic landscape, the work aims to employ Artificial Intelligence techniques, such as Artificial Neural Networks and ensemble models, to improve the detection of credit risk cases. Following the principles of the Cross Industry Standard Process for Data Mining, which is divided into five iterative phases, the research began with an understanding of the problem context and the establishment of clear objectives. Next, the credit datasets were analyzed in detail to understand the relevance of each attribute. After data preprocessing, including cleaning and treatment, the modeling techniques were applied. Finally, the models were evaluated with metrics such as accuracy, precision, and recall. Notably, the ensemble models, especially Gradient Boosting and XGBoost, performed consistently on all three datasets examined in this study. Both reached an accuracy of 94%, the highest recorded in this research, which supports choosing these models for the problem of identifying high credit risk cases. | pt_BR |
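
The abstract names accuracy, precision, and recall as the evaluation metrics and notes that the credit data are imbalanced. The sketch below shows how the three metrics can be computed from raw predictions, and why accuracy alone can look strong on imbalanced data. It is illustrative only: the `evaluate` helper, the `Metrics` class, and the 90/10 sample are hypothetical and are not taken from the study or its datasets.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Define precision/recall as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(accuracy, precision, recall)


# Hypothetical imbalanced sample: 90 good loans (0) and 10 defaults (1).
y_true = [0] * 90 + [1] * 10

# A trivial baseline that never flags a default.
always_good = [0] * 100

# A model with 88 true negatives, 2 false alarms, 7 caught defaults, 3 missed.
model_pred = [0] * 88 + [1] * 2 + [1] * 7 + [0] * 3

baseline = evaluate(y_true, always_good)
model = evaluate(y_true, model_pred)
print(baseline)
print(model)
```

On this made-up sample the baseline reaches 90% accuracy while detecting no defaults at all (recall 0), which is why reporting precision and recall next to accuracy matters for imbalanced credit data.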