Descoberta de conhecimento aplicado à base de dados textual de saúde

Barbosa, Alexandre Nunes

dc.contributor.advisor	Valiati, Joao Francisco
dc.contributor.author	Barbosa, Alexandre Nunes
dc.date.accessioned	2015-07-18T12:21:33Z
dc.date.accessioned	2022-09-22T19:17:00Z
dc.date.available	2015-07-18T12:21:33Z
dc.date.available	2022-09-22T19:17:00Z
dc.date.issued	2012-03-26
dc.identifier.uri	https://hdl.handle.net/20.500.12032/59057
dc.description.abstract	This study proposes a process for investigating the content of a database comprising descriptive and pre-structured data from the health domain, specifically the area of Rheumatology. Three sets of interest were composed for the investigation. The first comprises two classes: one of descriptive content related only to Rheumatology in general, and another whose content belongs to other areas of medicine. The second and third sets were built after a statistical analysis of the database: one consists of the descriptive content associated with the five most frequent ICD codes, and the other of the descriptive content associated with the three most frequent ICD codes related exclusively to Rheumatology. These sets were pre-processed with classic pre-processing techniques such as stopword removal and stemming. To extract patterns whose interpretation produces knowledge, association and classification techniques were applied to the sets of interest, with the aim of relating the textual content that describes disease symptoms to the pre-structured content that defines the diagnosis of those diseases. These techniques were implemented with the Support Vector Machines classification algorithm and the Apriori association rules algorithm. To develop this process, theoretical references on data mining were surveyed, including the selection and review of scientific publications on text mining applied to Electronic Medical Records, with a focus on the content of the databases used, the pre-processing and mining techniques used in the literature, and the reported results. The classification technique used in this study achieved over 80% accuracy, demonstrating the algorithm's capacity to correctly label health data related to the field of interest. Associations between textual content and pre-structured content were also found which, according to expert analysis, call into question the use of certain ICD codes at the place where the data originated.	en
dc.description.sponsorship	UNISINOS - Universidade do Vale do Rio dos Sinos	pt_BR
dc.language	pt_BR	pt_BR
dc.publisher	Universidade do Vale do Rio dos Sinos	pt_BR
dc.rights	openAccess	pt_BR
dc.subject	Prontuário médico eletrônico	pt_BR
dc.subject	Electronic medical record	en
dc.title	Descoberta de conhecimento aplicado à base de dados textual de saúde	pt_BR
dc.type	Dissertação	pt_BR

Files in this item

Files	Size	Format	View
42c.pdf	1.016Mb	application/pdf	View/Open

This item appears in the following Collection(s)

Documentos - UNISINOS
