EMBODIED QUESTION ANSWERING IN ROBOTIC ENVIRONMENT: Automatic generation of a synthetic question-answer data-set

University essay from Göteborgs universitet / Institutionen för filosofi, lingvistik och vetenskapsteori

Abstract: Embodied question answering is the task of asking a robot about objects in a 3D environment. The robot has to navigate the environment, find the entities in question, and then stop to answer the question. The answering system consists of navigation and visual-question-answering components. The agent is trained on a synthetic data-set of question-answers and navigational paths called EQA-MP3D. Each question in the data-set is an executable function that can be run in the environment to yield an answer. EQA-MP3D includes only two types of questions: color and location questions. The types of questions asked could be considered unnatural, and we observe that the question-answers contain biases. Our work extends the data-set by automatically generating size and spatial questions. We generate a total of 19 207 question-answers for training and 3 186 question-answers for validation. Our data extension is intended to train the system to answer more question types and enhance the system’s overall ability to perform the task.
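
The idea of a question as an executable function can be illustrated with a minimal sketch. It is not the EQA-MP3D implementation: the scene representation, the `volume_m3` attribute, the question wording, and all names below are hypothetical, chosen only to show how a color question and a size question might each carry a small program that is run against a scene to produce the answer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical object annotation; not the EQA-MP3D schema."""
    name: str
    room: str
    color: str
    volume_m3: float  # assumed size measure, for illustration only


@dataclass(frozen=True)
class Question:
    text: str
    # The "executable function": maps a scene to an answer string.
    program: Callable[[List[SceneObject]], str]


def find(scene: List[SceneObject], name: str, room: str) -> SceneObject:
    """Return the unique object with this name in this room."""
    matches = [o for o in scene if o.name == name and o.room == room]
    if len(matches) != 1:
        # Ambiguous or missing referents cannot yield a well-defined answer.
        raise ValueError(f"expected exactly one {name} in the {room}")
    return matches[0]


def color_question(name: str, room: str) -> Question:
    return Question(
        text=f"What color is the {name} in the {room}?",
        program=lambda scene: find(scene, name, room).color,
    )


def size_question(a: str, b: str, room: str) -> Question:
    def program(scene: List[SceneObject]) -> str:
        bigger = find(scene, a, room).volume_m3 > find(scene, b, room).volume_m3
        return "yes" if bigger else "no"

    return Question(
        text=f"Is the {a} bigger than the {b} in the {room}?",
        program=program,
    )


# A toy scene with made-up values.
scene = [
    SceneObject("sofa", "living room", "grey", 1.8),
    SceneObject("table", "living room", "brown", 0.6),
]

questions = [
    color_question("sofa", "living room"),
    size_question("sofa", "table", "living room"),
]

# Running each question's program on the scene yields question-answer pairs.
pairs = [(q.text, q.program(scene)) for q in questions]
for text, answer in pairs:
    print(text, "->", answer)
```

In this sketch, rejecting ambiguous referents in `find` is one simple way to avoid questions without a single correct answer; whether the actual generation pipeline filters questions this way is not stated in the abstract.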

