Agile text mining with Sherlok

The successful development of an intelligent text mining application requires the collaboration of two main stakeholders: subject matter experts and text miners. In this paper, we describe a new methodology, agile text mining, to improve that collaboration. Agile text mining is characterized by short development cycles, frequent task redefinition, and continuous performance monitoring through integration tests. We introduce Sherlok, a system supporting the development of agile text mining applications, and present an application that extracts mentions of neurons from a very large corpus of scientific articles. The resulting code and models are publicly available.


Published in:
2015 IEEE International Conference on Big Data (Big Data), 1479-1484
Presented at:
2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October - 1 November 2015
Year:
2015
Publisher:
IEEE

Record created on 2016-10-28, modified on 2019-08-12

