A new strategy to seed selection for the high recall task

Todescato, Matheus Vinícius

Use this identifier to cite or link to this item: https://rd.uffs.edu.br/handle/prefix/5001

Full metadata record

DC Field	Value	Language
dc.contributor.advisor1	Dal Bianco, Guilherme	-
dc.creator	Todescato, Matheus Vinícius	-
dc.date	2021-05-12	-
dc.date.accessioned	2022-02-01T19:53:55Z	-
dc.date.available	2022-02-01	-
dc.date.available	2022-02-01T19:53:55Z	-
dc.date.issued	2021-05-12	-
dc.identifier.uri	https://rd.uffs.edu.br/handle/prefix/5001	-
dc.description.abstract	High Recall Information Retrieval (HIRE) aims to identify all (or nearly all) relevant documents for a given query. HIRE is used, for example, in systematic literature reviews, where the goal is to identify all relevant scientific articles. The documents that HIRE selects as relevant determine the effort the user must spend to find the target information. One of HIRE's goals is therefore to return only relevant documents, so that the user is not overburdened with non-relevant information. Traditionally, supervised machine learning algorithms (e.g., SVM) are used at the core of HIRE to produce a ranking of relevant documents. However, such algorithms depend on an initial training set (seed) to start the learning process. In this work, we propose a new approach to producing the initial seed for HIRE that focuses on reducing user effort. Our approach combines active learning with a ranking strategy to select only the informative examples. Experiments show that our approach reduces the labeling effort by up to 18% while maintaining competitive recall.	pt_BR
dc.description.provenance	Submitted by Rafael Pinheiro de Almeida (rafael.almeida@uffs.edu.br) on 2022-02-01T18:32:16Z No. of bitstreams: 1 TODESCATO.pdf: 930671 bytes, checksum: 79dafcab46e574bba99e6ceb0eb9582c (MD5)	en
dc.description.provenance	Approved for entry into archive by Franciele Scaglioni da Cruz (franciele.cruz@uffs.edu.br) on 2022-02-01T19:53:55Z (GMT) No. of bitstreams: 1 TODESCATO.pdf: 930671 bytes, checksum: 79dafcab46e574bba99e6ceb0eb9582c (MD5)	en
dc.description.provenance	Made available in DSpace on 2022-02-01T19:53:55Z (GMT). No. of bitstreams: 1 TODESCATO.pdf: 930671 bytes, checksum: 79dafcab46e574bba99e6ceb0eb9582c (MD5) Previous issue date: 2021-05-12	en
dc.language	por	pt_BR
dc.publisher	Universidade Federal da Fronteira Sul	pt_BR
dc.publisher.country	Brazil	pt_BR
dc.publisher.department	Campus Chapecó	pt_BR
dc.publisher.initials	UFFS	pt_BR
dc.rights	Open Access	pt_BR
dc.subject	Information	pt_BR
dc.subject	Learning	pt_BR
dc.subject	Document	pt_BR
dc.subject	Information retrieval	pt_BR
dc.subject	Computer science	pt_BR
dc.title	A new strategy to seed selection for the high recall task	pt_BR
dc.type	Scientific Article	pt_BR
Appears in collections:	Computer Science

Files associated with this item:

File	Description	Size	Format
TODESCATO.pdf		908.86 kB	Adobe PDF
