%0 Journal Article %T Caipirini: using gene sets to rank literature %A Theodoros G Soldatos %A Seán I O'Donoghue %A Venkata P Satagopam %A Adriano Barbosa-Silva %A Georgios A Pavlopoulos %A Ana Wanderley-Nogueira %A Nina Soares-Cavalcanti %A Reinhard Schneider %J BioData Mining %D 2012 %I BioMed Central %R 10.1186/1756-0381-5-1 %X To evaluate the usefulness of Caipirini, we used two test cases, one related to the human cell cycle, and a second related to disease defense mechanisms in Arabidopsis thaliana. In both cases, the new method achieved high precision in finding literature related to the biological mechanisms underlying the input data sets. To our knowledge Caipirini is the first service enabling literature search directly based on biological relevance to gene sets; thus, Caipirini gives the research community a new way to unlock hidden knowledge from gene sets derived via high-throughput experiments. Keeping up-to-date with bioscience literature is becoming more challenging as the number of new papers appearing daily - currently over 2,000 - continues to increase. As a result, there is an increasing need for methods that can efficiently search this literature [1], and to this end a wide range of tools and services are now available [2,3]. Currently, most tools used for retrieving bioscience literature are based on keyword searches, although such approaches have limitations: firstly, it can be difficult for a researcher to find a set of keywords that exactly specify the biological functions she or he may be interested in; secondly, the ranking of results is usually not based on relevance to the biological functions of interest. Several recent methods have been proposed to address these limitations, e.g., ETBLAST [4] can launch literature searches based on a single text document such as an abstract; such methods allow searches to be defined implicitly, e.g., based on a text of interest, rather than having to explicitly define keywords.
Several tools have extended this approach, allowing collections of abstracts as input, e.g., PubFinder [5] and MScanner [6]. A common problem with all literature search methods is that only a fraction of the literature retrieved is truly of interest or relevance for the end-user. Recently, a new tool, MedlineRanker [7], partly addresses this problem by allow %U http://www.biodatamining.org/content/5/1/1