dc.contributor.advisor: Rigo, Sandro José
dc.contributor.author: Schwertner, Marco Antonio
dc.date.accessioned: 2020-11-25T17:48:54Z
dc.date.accessioned: 2022-09-22T19:41:07Z
dc.date.available: 2020-11-25T17:48:54Z
dc.date.available: 2022-09-22T19:41:07Z
dc.date.issued: 2020-08-24
dc.identifier.uri: https://hdl.handle.net/20.500.12032/63782
dc.description.abstract: With advances in preventive and personalized medicine, and with technological improvements that let patients interact more easily with their healthcare information, the volume of healthcare data gathered has increased. A relevant part of these data is recorded as unstructured natural-language free text, which makes it harder for Clinical Decision Support Systems (CDSS) to process. Consequently, healthcare professionals are overwhelmed when trying to keep up to date with a patient’s healthcare information, because they need more time to gather and analyze it manually. Furthermore, defining an oncology diagnosis and its treatment plan is a complex decision-making process because it is affected by a broad range of parameters. The main objective of this research is to apply several text classification methods to non-synthetic corpora of oncology clinical notes to help with this decision-making process. First, the corpora were obtained from an oncology EHR system from three different oncology clinics. Two corpus versions were created: the per-clinical-event version, with one medical note per record; and the per-patient version, with one record per patient containing his or her medical notes. Then, these corpora were preprocessed to improve the performance of the classifiers. As the last step, several machine learning and one deep learning text classification methods were trained on these corpora, with each patient’s diagnosis as enriched data. The following machine learning and deep learning classification methods were applied: Multilayer Perceptron (MLP) neural network, Logistic Regression, Decision Tree classifier, Random Forest classifier, K-nearest neighbors (KNN) classifier, and Long Short-Term Memory (LSTM).
An additional experiment with an MLP classifier was performed to evaluate the influence of the preprocessing step on the results. It found that the classifier’s mean accuracy rose from 26.1% to 86.7% with the per-clinical-event corpus, and reached 93.9% with the per-patient corpus. The best-performing classifier was the MLP with 2 hidden layers (800 and 500 neurons), which achieved 93.90% accuracy, a Macro F1 score of 93.61%, and a Weighted F1 score of 93.99%. The experiments were performed on a dataset of 3,308 medical notes from a small oncology clinic. (en)
dc.description.sponsorship: None (pt_BR)
dc.language: en (pt_BR)
dc.publisher: Universidade do Vale do Rio dos Sinos (pt_BR)
dc.rights: openAccess (pt_BR)
dc.subject: Artificial intelligence (en)
dc.subject: Inteligência artificial (pt_BR)
dc.title: Exploring text classification methods in oncological medical notes using machine learning and deep learning (en)
dc.type: Dissertation (pt_BR)
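
Note on the reported metrics: the abstract gives both a Macro F1 and a Weighted F1 score. The sketch below is not taken from the dissertation; it is a minimal, dependency-free illustration of how the two averages are conventionally computed, using hypothetical class labels ("A", "B") and made-up predictions. Macro F1 averages the per-class F1 scores equally, while Weighted F1 weights each class by its number of true instances, so the two diverge when classes are imbalanced.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of the per-class F1 scores, weighted by true-class support."""
    scores = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Hypothetical, imbalanced toy data: four notes of class "A", one of class "B".
y_true = ["A", "A", "A", "A", "B"]
y_pred = ["A", "A", "A", "A", "A"]  # the minority class is never predicted

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

In this toy case the missed minority class pulls Macro F1 well below Weighted F1. By the same reasoning, Macro and Weighted F1 values that lie close together indicate that per-class performance does not differ much between frequent and rare classes.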


Files in this item

File: Marco Antônio Schwertner_.pdf
Size: 4.127 Mb
Format: application/pdf