Author: Schnürer, Raimund
Dates: 2024-07-30, 2024-07-30, 2024-07-29, 2024
URL: https://infoscience.epfl.ch/handle/20.500.14299/240495

Abstract: MapPool is a dataset of 75 million potential maps and textual captions. It has been derived from CommonPool, a dataset consisting of 12 billion text-image pairs from the Internet. The images have been encoded by a vision transformer and classified into maps and non-maps by a support vector machine. This approach outperforms previous models and yields a validation accuracy of 98.5%. The MapPool dataset may help to train data-intensive architectures in order to establish vision and language foundation models specialized in maps. The analysis of the dataset and the exploration of the embedding space offer considerable potential for future work. The dataset is accessible via https://geoai.icaci.org/mappool/

Language: en
Keywords: machine learning, dataset, map classification, foundation model
Title: MapPool - Bubbling up an extremely large corpus of maps for AI
Type: text::conference output::conference paper not in proceedings
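
The classification step described in the abstract (images encoded as embedding vectors, then separated into maps and non-maps by a support vector machine) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the function names (make_embeddings, train_linear_svm, run_demo) are hypothetical, the embeddings are synthetic Gaussian vectors rather than vision-transformer outputs, and the SVM is a plain linear one trained by stochastic subgradient descent. The record does not state the kernel, hyperparameters, or training procedure actually used for MapPool.

```python
import random


def make_embeddings(n, dim, rng):
    """Hypothetical stand-in for ViT image embeddings.

    Returns (vector, label) pairs with label +1 ("map") or -1 ("non-map").
    Real embeddings would come from a vision transformer, not a Gaussian.
    """
    samples = []
    for _ in range(n):
        label = rng.choice((1, -1))
        vector = [rng.gauss(label * 1.0, 1.0) for _ in range(dim)]
        samples.append((vector, label))
    return samples


def decision_value(w, b, x):
    """Signed score of the linear classifier; positive means "map"."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, dim, rng, lam=0.01, eta=0.01, epochs=10):
    """Linear SVM via stochastic subgradient descent on the
    L2-regularised hinge loss (an assumed training procedure)."""
    w = [0.0] * dim
    b = 0.0
    samples = list(samples)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            violated = y * decision_value(w, b, x) < 1.0
            # Regularisation step: shrink the weights towards zero.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if violated:
                # Hinge-loss step: push the margin of this sample up.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def accuracy(w, b, samples):
    """Fraction of samples whose predicted sign matches the label."""
    correct = sum(
        1 for x, y in samples if (decision_value(w, b, x) >= 0) == (y > 0)
    )
    return correct / len(samples)


def run_demo(seed=0, dim=8):
    """Train on synthetic embeddings and report held-out accuracy."""
    rng = random.Random(seed)
    train = make_embeddings(400, dim, rng)
    validation = make_embeddings(200, dim, rng)
    w, b = train_linear_svm(train, dim, rng)
    return accuracy(w, b, validation)


if __name__ == "__main__":
    print(f"validation accuracy: {run_demo():.3f}")
```

The accuracy printed by this sketch reflects only how well separated the synthetic clusters are and says nothing about the 98.5% reported for MapPool. In practice one would replace make_embeddings with embeddings computed from labelled images and would likely use an established SVM implementation instead of the hand-written training loop.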