Automatic Reference Resolution for Pedestrian Wayfinding Systems

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Imagine that you are in a new city and want to explore it. Trying to navigate with maps leads to unnecessary confusion about street names and prevents you from enjoying a wonderful walk. A dialogue system that could guide you by means of a simple conversation, using salient landmarks in your immediate vicinity, would be much more helpful! Developing such a dialogue system is non-trivial and requires solving many complicated tasks. One such task, tackled in the present thesis, is reference resolution (RR), i.e. resolving utterances to the underlying geographical entities, called referents (if any). Utterances that have one or more referents are called referring expressions (REs). The RR task is decomposed into two subtasks: RE identification and the resolution itself. Neural network models for both subtasks have been designed and extensively evaluated. The model for RE identification, called RefNet, uses recurrent neural networks (RNNs) to handle sequential input, i.e. phrases. For each word in an utterance, RefNet outputs a label indicating whether the word is at the beginning of an RE, inside it, or outside it. The reference resolution model, called SpaceRefNet, uses RefNet's RNN layer to encode REs and a dedicated feature extractor to represent geographical objects. Both encodings are fed to a simple feed-forward network with a softmax prediction layer, which yields the probability of a match between the RE and the geographical object. Both models outperform their respective baselines and show promising results in general.
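
To make the per-word labelling scheme concrete, the following is a minimal sketch of how beginning/inside/outside labels could be turned back into referring expressions. It is not taken from the thesis: the label strings "B", "I" and "O", the function name extract_referring_expressions, and the example utterance are all assumptions made for illustration, and the labels are written by hand rather than predicted by RefNet.

```python
from typing import List


def extract_referring_expressions(tokens: List[str], labels: List[str]) -> List[str]:
    """Group tokens into referring expressions from per-word labels.

    Assumed label set (hypothetical, not from the thesis):
    "B" = first word of an RE, "I" = inside an RE, "O" = outside any RE.
    """
    expressions: List[str] = []
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new RE starts; close the previous one if it is still open.
            if current:
                expressions.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # Continue the RE that is currently open.
            current.append(token)
        else:
            # "O", or a stray "I" with no open RE: close any open RE.
            if current:
                expressions.append(" ".join(current))
            current = []
    if current:
        expressions.append(" ".join(current))
    return expressions


# Hand-written labels standing in for a tagger's output.
tokens = ["turn", "left", "at", "the", "red", "church", "near", "the", "bridge"]
labels = ["O", "O", "O", "B", "I", "I", "O", "B", "I"]
result = extract_referring_expressions(tokens, labels)
print(result)
```

In this sketch each recovered span would then be passed on to the resolution step, which scores it against candidate geographical objects.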
