Object Based Image Retrieval Using Feature Maps of a YOLOv5 Network

University essay from KTH/Matematisk statistik

Abstract: As Machine Learning (ML) methods have gained traction in recent years, someproblems regarding the construction of such methods have arisen. One such problem isthe collection and labeling of data sets. Specifically when it comes to many applicationsof Computer Vision (CV), one needs a set of images, labeled as either being of someclass or not. Creating such data sets can be very time consuming. This project setsout to tackle this problem by constructing an end-to-end system for searching forobjects in images (i.e. an Object Based Image Retrieval (OBIR) method) using an objectdetection framework (You Only Look Once (YOLO) [16]). The goal of the project wasto create a method that; given an image of an object of interest q, search for that sameor similar objects in a set of other images S. The core concept of the idea is to passthe image q through an object detection model (in this case YOLOv5 [16]), create a”fingerprint” (can be seen as a sort of identity for an object) from a set of feature mapsextracted from the YOLOv5 [16] model and look for corresponding similar parts of aset of feature maps extracted from other images. An investigation regarding whichvalues to select for a few different parameters was conducted, including a comparisonof performance for a couple of different similarity metrics. In the table below,the parameter combination which resulted in the highest F_Top_300-score (a measureindicating the amount of relevant images retrieved among the top 300 recommendedimages) in the parameter selection phase is presented. Layer: 23Pool Methd: maxSim. Mtrc: eucFP Kern. Sz: 4 Evaluation of the method resulted in F_Top_300-scores as can be seen in the table below. Mouse: 0.820Duck: 0.640Coin: 0.770Jet ski: 0.443Handgun: 0.807Average: 0.696

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)