dc.creator: Hora, Thiago Jorge Lins da
dc.date.accessioned: 2026-01-29T18:04:54Z
dc.date.available: 2026-01-29T18:04:54Z
dc.date.issued: 2025-11-26
dc.identifier.citation: HORA, Thiago Jorge Lins da. Classificação de vocalizações em indivíduos com TEA: avaliação de modelos de machine learning. Artigo. 21f. Trabalho de conclusão de curso, Instituto Federal de Educação, Ciência e Tecnologia de Pernambuco, Campus Jaboatão dos Guararapes, (Tecnólogo em Análise e Desenvolvimento de Sistemas), Jaboatão dos Guararapes, 2025. [pt_BR]
dc.identifier.uri: https://repositorio.ifpe.edu.br/xmlui/handle/123456789/1974
dc.description.abstract: This study investigates the application of machine learning techniques to classify nonverbal vocalizations of minimally verbal individuals with Autism Spectrum Disorder (ASD). Using the American ReCANVo dataset, six categories of vocalizations were explored: delight, dysregulated, frustrated, request, selftalk, and social. The process involved acoustic feature extraction, class balancing, and training multiple models. The experimental evaluation showed that models trained exclusively on the ReCANVo dataset generalize poorly to the Portuguese-speaking context. In a test with 53 vocalizations from a Brazilian individual, the 75.47% accuracy achieved by the best model (SVC) proved misleading: it was concentrated on a single overrepresented class, while the model failed on the minority classes. These results demonstrate the lack of effective cross-cultural generalization and highlight the critical need for data aligned with the local linguistic context. To address this gap, the mobile application VocalizeAI was developed to enable the creation of a Brazilian dataset, which is essential for advancing assistive communication technologies for individuals with ASD. [pt_BR]
dc.format.extent: 21f. [pt_BR]
dc.language: pt_BR [pt_BR]
dc.rights: Open Access [pt_BR]
dc.subject: Autism spectrum disorder [pt_BR]
dc.subject: Vocalization [pt_BR]
dc.subject: Machine learning [pt_BR]
dc.subject: Classification - Audiovisual materials [pt_BR]
dc.title: Classificação de vocalizações em indivíduos com TEA: avaliação de modelos de machine learning [pt_BR]
dc.type: Article [pt_BR]
dc.creator.Lattes: http://lattes.cnpq.br/6273597801019660 [pt_BR]
dc.contributor.advisor1: Alencar, Roberto Luiz Sena de
dc.contributor.advisor1Lattes: http://lattes.cnpq.br/4839735568204936 [pt_BR]
dc.contributor.referee1: Alencar, Roberto Luiz Sena de
dc.contributor.referee2: Cabral, Luciano de Souza
dc.contributor.referee3: Andrade, Havana Diogo Alves
dc.contributor.referee1Lattes: http://lattes.cnpq.br/4839735568204936 [pt_BR]
dc.contributor.referee2Lattes: http://lattes.cnpq.br/9195362898891079 [pt_BR]
dc.contributor.referee3Lattes: http://lattes.cnpq.br/1553497037631903 [pt_BR]
dc.publisher.department: Jaboatão dos Guararapes [pt_BR]
dc.publisher.country: Brazil [pt_BR]
dc.subject.cnpq: EXACT AND EARTH SCIENCES::COMPUTER SCIENCE::COMPUTER SYSTEMS [pt_BR]
dc.description.resumo: This work investigates the application of machine learning techniques to the classification of nonverbal vocalizations of minimally verbal individuals with Autism Spectrum Disorder (ASD). Using the American ReCANVo dataset as a basis, six categories of vocalizations were explored: delight, dysregulated, frustrated, request, selftalk, and social. The process involved acoustic feature extraction, class balancing, and the training of multiple models. The experimental evaluation showed that models trained exclusively on the ReCANVo dataset are ineffective at generalizing to the Portuguese-speaking context. In a test with 53 vocalizations from a Brazilian individual, the 75.47% accuracy obtained by the best model (SVC) proved misleading, as it was concentrated in a single majority class while failing completely on the minority classes. These results demonstrate the absence of effective cross-cultural generalization and highlight the critical need for data aligned with the local linguistic context. To fill this gap, the mobile application VocalizeAI was developed; it will enable the creation of a Brazilian dataset, which is fundamental to advancing assistive communication technologies for individuals with ASD. [pt_BR]

