Show simple item record

dc.contributor.advisorValiati, Joao Francisco
dc.contributor.authorBarbosa, Alexandre Nunes
dc.date.accessioned2015-07-18T12:21:33Z
dc.date.accessioned2022-09-22T19:17:00Z
dc.date.available2015-07-18T12:21:33Z
dc.date.available2022-09-22T19:17:00Z
dc.date.issued2012-03-26
dc.identifier.urihttps://hdl.handle.net/20.500.12032/59057
dc.description.abstractThis study suggests a process of investigation of the content of a database, comprising descriptive and pre-structured data related to the health domain, more particularly in the area of Rheumatology. For the investigation of the database, three sets of interest were composed. The first one formed by a class of descriptive content related only to the area of Rheumatology in general, and another whose content belongs to other areas of medicine. The second and third sets were constituted after statistical analysis in the database. One of them formed by the descriptive content associated to the five highest frequencies of ICD codes, and another formed by descriptive content associated with the three highest frequencies of ICD codes related exclusively to the area of Rheumatology. These sets were pre-processed with classic Pre-processing techniques such as Stopword Removal and Stemming. In order to extract patterns that, through their interpretation, result in knowledge production, association and classification techniques were applied to the sets of interest, aiming at to relate the textual content that describes symptoms of diseases with pre-structured content, which defines the diagnosis of these diseases. The implementation of these techniques was carried out by applying the classification algorithm Support Vector Machines and the Association Rules Apriori Algorithm. For the development of this process, theoretical references concerning data mining were researched, including selection and review of scientific publications produced on text mining and related to Electronic Medical Record, focusing on the content of the databases used, techniques for pre-processing and mining used in the literature, as well as the reported results. The classification technique used in this study reached over 80% accurate results, demonstrating the capacity the algorithm has to correctly label health data related to the field of interest. Associations between text content and pre-structured content were also found, which, according to expert analysis, may be questioned as for the use of certain ICDs in the place of origin of the data.en
dc.description.sponsorshipUNISINOS - Universidade do Vale do Rio dos Sinospt_BR
dc.languagept_BRpt_BR
dc.publisherUniversidade do Vale do Rio dos Sinospt_BR
dc.rightsopenAccesspt_BR
dc.subjectProntuário médico eletrônicopt_BR
dc.subjectElectronic medical recorden
dc.titleDescoberta de conhecimento aplicado à base de dados textual de saúdept_BR
dc.typeDissertaçãopt_BR


Files in this item

FilesSizeFormatView
42c.pdf1.016Mbapplication/pdfView/Open

This item appears in the following Collection(s)

Show simple item record


© AUSJAL 2022

Asociación de Universidades Confiadas a la Compañía de Jesús en América Latina, AUSJAL
Av. Santa Teresa de Jesús Edif. Cerpe, Piso 2, Oficina AUSJAL Urb.
La Castellana, Chacao (1060) Caracas - Venezuela
Tel/Fax (+58-212)-266-13-41 /(+58-212)-266-85-62

Nuestras redes sociales

facebook Facebook

twitter Twitter

youtube Youtube

Asociaciones Jesuitas en el mundo
Ausjal en el mundo AJCU AUSJAL JESAM JCEP JCS JCAP