Evaluation of Machine Learning Primitives on a Digital Signal Processor
Abstract: Modern handheld devices rely on specialized hardware for evaluating machine learning algorithms. This thesis investigates the feasibility of using the digital signal processor, a part of the modem of the device, as an alternative to this specialized hardware. Memory management techniques and implementations for evaluating the machine learning primitives convolutional, max-pooling and fully connected layers are proposed. The implementations are evaluated based on to what degree they utilize available hardware units. New instructions for packing data and facilitating instruction pipelining are suggested and evaluated. The results show that convolutional and fully connected layers are well-suited to the processor used. The aptness of the convolutional layer is subject to the kernel being applied with a stride of 1 as larger strides cause the hardware usage to plummet. Max-pooling layers, while not ill-suited, are the most limited in terms of hardware usage. The proposed instructions are shown to have positive effects on the throughput of the implementations.
AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)