Enhancing biomedical search interfaces with images
The system architecture. The online components include a web interface, an application server for data retrieval, and an Apache Lucene search engine. The offline components are responsible for labeling, model training, and document preprocessing.
Researchers: G. Elisabeta Marai, Juan Trelles Trabucco
Funding: US National Institutes of Health R01LM012527
Date: August 31, 2023
Searching for scientific documents is a pervasive task among researchers. Most academic search engines, such as Google Scholar, search over text data, yet image data provides relevant information that complements text-based queries. For instance, in biomedical and biology research, images show experimental results and provide cues about the methods followed. This work shows the importance of combining text and image-based features to help researchers find relevant documents in a COVID-19 collection. We derived a taxonomy that represents image content by acquisition method (e.g., captured with a microscope), integrated this information using deep learning image classifiers, and presented the results in hybrid image+text surrogates. Our design improved the user experience during document retrieval.