000114914 001__ 114914
000114914 005__ 20190416055708.0
000114914 0247_ $$2doi$$a10.1371/journal.pone.0000579
000114914 022__ $$a1932-6203
000114914 037__ $$aARTICLE
000114914 245__ $$aIndexing strategies for rapid searches of short words in genome sequences
000114914 269__ $$a2007
000114914 260__ $$bPublic Library of Science$$c2007
000114914 336__ $$aJournal Articles
000114914 520__ $$aSearching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
000114914 700__ $$aIseli, C.
000114914 700__ $$0243605$$g176602$$aAmbrosini, G.
000114914 700__ $$g113607$$aBucher, P.$$0244404
000114914 700__ $$aJongeneel, C. V.
000114914 773__ $$j2$$tPLoS ONE$$k6$$qe579
000114914 8564_ $$uhttps://infoscience.epfl.ch/record/114914/files/journal.pone.0000579.pdf$$zPublisher's version$$s254376$$yPublisher's version
000114914 909C0 $$xU11780$$0252244$$pGR-BUCHER
000114914 909CO $$ooai:infoscience.tind.io:114914$$qGLOBAL_SET$$pSV$$particle
000114914 917Z8 $$x182396
000114914 937__ $$aGR-BUCHER-ARTICLE-2007-002
000114914 973__ $$rREVIEWED$$sPUBLISHED$$aOTHER
000114914 980__ $$aARTICLE