dc.description.abstract | A great variety of applications require Human Action Recognition (HAR) information as
input. The topic has become particularly prevalent in recent years because of its rapidly emerging real-world applications. Many works have tried to understand actions by observing only the actor's poses, introducing methods that model human appearance and pose in search of more robust features. However, several actions may be performed with comparable postures, and these strategies disregard other relevant features, which makes them less suitable for recognizing complex actions. Although several proposals have already been put forward, HAR is a problem that is still far from a definitive solution. Existing solutions focus on exploring different techniques for extracting features and enabling machine learning algorithms to identify actions. However, the variety of possible human actions, the small number of dataset examples, and the complexity of the task mean that further studies are still required to reach a final
solution. Meanwhile, as we try to make computers understand actions in videos, neuroscientists are trying to understand how the human brain recognizes activities. Their analyses show that object recognition is a hard task even for the brain. However, studies suggest that the brain's algorithm is relatively simple and most likely processes the visual input only once. In this thesis, we explore what is known so far about how the human brain recognizes actions in order to simulate this same behavior on a computer. A model that proves to be robust can serve as the basis for developing solutions in the most varied fields. To this end, we studied the neuroscience and physiology literature for information about how the human brain works. From this information, we developed the Brain Action model to simulate this behavior and introduced an algorithm workflow to implement the model on a computer. During the development of this research, we sought to understand how other proposals with similar methods solve the same problem, as well as solutions that explore different techniques. We gathered this knowledge to propose a model that combines techniques already accepted in the state of the art with the way the human mind recognizes actions. This proposal aimed to develop a model
that takes RGB videos as input and, by identifying the positions and movements of the elements in the scenes and using only the relationships among this information, is able to recognize human actions, targeting applications in various domains. We continued this research by implementing the model on a challenging surgical-operation HAR task and evaluating it with state-of-the-art metrics. During this process, we built our own surgical dataset with seven different classes, tested the model with three different machine learning classification methods, and achieved 44.1% correctly classified actions under cross-validation. Our contributions are threefold: (I) a new biologically inspired HAR model, (II) a new movement feature extraction design, and (III) a
HAR implementation for a surgical action recognition scenario. | en |