Selective Kernel Network based Crowding Counting and Crowd Density Estimation

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Managing crowd density has become an immense challenge for public authorities due to population growth and evolving human dynamics. Crowd counting estimates the number of individuals in a given area or scene, which makes it a practical technique for real-world applications such as surveillance and traffic control. It contributes to urban planning, retail analytics, and security systems by providing insights into population dynamics and aiding anomaly detection. This thesis implements and evaluates a selective kernel mechanism for crowd counting. The selective kernel block, introduced in computer vision research as the Selective Kernel (SK) Network [1], is an adapted convolution layer that substitutes for the standard convolution layer in a convolutional neural network (CNN). This adaptation has the potential to improve object detection and image regression tasks. Building on the C3 framework [2], the thesis applies the selective kernel mechanism to three state-of-the-art crowd counting designs, ResNet [3], CSRNet [4], and SANet [5], producing SK adaptive models. The evaluation mainly consists of collecting and comparing the Mean Absolute Error (MAE) and Mean Squared Error (MSE), together with crowd statistics and crowd density maps. The evaluations use the ShanghaiTech Part A (high-density crowd images collected at random from the web) and Part B (street views of similar scenes) datasets [6]. Across six comparisons on the two datasets, the SK adaptive models produced better predictions than the original models in four. In conclusion, the SK block offers two advantages: first, it improves feature extraction, especially when pretrained on large datasets; second, it improves image regression on more straightforward datasets. On the downside, its effect on sparse datasets is limited or even detrimental.
These findings suggest that the selective kernel approach holds promise for improving crowd counting in high-density crowd and street-view scenarios, supporting effective public management.
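
The two error metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's evaluation code: the function names and the sample counts are hypothetical, and it assumes that the predicted count of an image is the sum of its density map, which is the usual convention in density-based crowd counting. Some crowd counting codebases report the square root of the mean squared error under the name MSE, so the sketch returns both values.

```python
from math import sqrt


def count_from_density(density_map):
    """Estimate a crowd count as the sum of all density-map values.

    Assumption: density_map is a 2D list of non-negative floats.
    """
    return sum(sum(row) for row in density_map)


def count_errors(predicted, actual):
    """Return (MAE, MSE, RMSE) over per-image crowd counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse, sqrt(mse)


# Hypothetical counts for three images (not results from the thesis).
predicted = [102.0, 48.0, 310.0]
actual = [100.0, 50.0, 300.0]
mae, mse, rmse = count_errors(predicted, actual)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
```

MAE weights every image equally, while the squared term in MSE penalises large misses more heavily, which is why the two are usually reported together.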
