M.Yu. Uzdiaev – Post-graduate Student, Junior Research Scientist, St. Petersburg Institute for Informatics and Automation of RAS
D.K. Levonevskiy – Research Scientist, St. Petersburg Institute for Informatics and Automation of RAS
O.O. Shumskaya – Post-graduate Student, Junior Research Scientist, St. Petersburg Institute for Informatics and Automation of RAS
M.A. Letenkov – Junior Research Scientist, St. Petersburg Institute for Informatics and Automation of RAS
As information environments spread into more domains of human activity and machine learning methods are used extensively within them, the scope of potential destructive activity also broadens. One specific case of such activity is aggression. Aggression is a complex phenomenon that includes interdependent physiological, behavioral, affective, emotional and cognitive aspects. By studying the affective and cognitive aspects of human behavior, it is possible to determine the level of aggression. Manifestations of aggression can be detected in text or in images and video. Hence, the problem of detecting aggression among users of information environments should be solved on the basis of such data.
The importance of aggression detection is directly connected with potential threats to individuals or groups of users.
This paper studies different aggression detection methods implemented using artificial neural networks, which have proved efficient in applications such as object detection in images, voice-based user identification and face recognition. In this context, the aggression detection problem appears to be insufficiently researched. Moreover, there are currently no representative free datasets tailored to aggression detection and to training the respective neural networks.
In this work we propose using synthetic data as training sets to increase the efficiency of aggressive behavior detection. To this end, we use a generative adversarial network (GAN) to prepare such a training dataset. The GAN architecture consists of two principal parts: 1) a generator, which generates the data, and 2) a discriminator, which computes a numerical measure of the difference between the generated and the actual data. We use convolutional neural networks (CNN) as the generator and the discriminator. The generative part in this work is represented by a pre-trained MobileNet model with its output layer removed. Closer investigation of the generator revealed an implicit dependency between the model depth and the complexity of the training dataset. Additionally, this architecture is closely tied to the set of methods used to generate the synthetic training datasets.
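The interplay between the two parts of a GAN can be illustrated with a minimal sketch of the adversarial losses. The abstract does not state which loss the authors used, so the binary cross-entropy form and the non-saturating generator loss below are assumptions taken from the common GAN formulation; the function names and the sample discriminator outputs are hypothetical.

```python
import math


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Real samples should be scored 1, generated samples 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating form (assumed): generated samples should be scored 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs: probability that a sample is real.
d_real = [0.9, 0.8, 0.95]
d_fake = [0.1, 0.3, 0.2]

print("discriminator loss:", round(discriminator_loss(d_real, d_fake), 4))
print("generator loss:", round(generator_loss(d_fake), 4))
```

In a training loop the two losses would be minimized in alternation: the discriminator learns to separate real from generated samples, and the generator is updated so that its samples are scored as real.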
For multimodal human aggression detection in the absence of representative datasets, we propose using a transfer learning approach. Heterogeneous data analysis is based on deep representations taken from a pre-trained neural network for each data modality, as well as on a collective matrix factorization approach. The scientific novelty of the proposed method lies in the possibility of comprehensive analysis of all available data within a single model. By training models on synthetic data, we can reduce the required computing resources and avoid overfitting. Moreover, the proposed methods may be applied not only to aggression detection but also to broader problems related to emotion recognition.
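The idea of combining per-modality deep representations through collective matrix factorization can be sketched as follows. The abstract does not give the authors' formulation, so this is only one common variant: two feature matrices that describe the same samples are factorized with a shared sample factor `u`, fitted by alternating ridge-regularized least squares. The function names, the rank, the regularization value and the random matrices standing in for features from pre-trained networks are all assumptions.

```python
import numpy as np


def collective_mf(x_a, x_b, rank, n_iter=50, reg=1e-3, seed=0):
    """Fit x_a ~ u @ v_a.T and x_b ~ u @ v_b.T with a shared factor u."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((x_a.shape[0], rank))
    eye = reg * np.eye(rank)
    for _ in range(n_iter):
        # Update the modality-specific factors with u held fixed.
        gram_u = u.T @ u + eye
        v_a = np.linalg.solve(gram_u, u.T @ x_a).T
        v_b = np.linalg.solve(gram_u, u.T @ x_b).T
        # Update the shared factor using both modalities at once.
        gram_v = v_a.T @ v_a + v_b.T @ v_b + eye
        u = np.linalg.solve(gram_v, (x_a @ v_a + x_b @ v_b).T).T
    return u, v_a, v_b


def reconstruction_error(x_a, x_b, u, v_a, v_b):
    """Sum of squared reconstruction errors over both modalities."""
    err_a = np.linalg.norm(x_a - u @ v_a.T) ** 2
    err_b = np.linalg.norm(x_b - u @ v_b.T) ** 2
    return err_a + err_b


# Synthetic stand-ins for deep representations of 20 samples (hypothetical):
# 8 "image" features and 5 "text" features driven by 3 shared latent factors.
rng = np.random.default_rng(42)
latent = rng.standard_normal((20, 3))
x_img = latent @ rng.standard_normal((3, 8))
x_txt = latent @ rng.standard_normal((3, 5))

u, v_img, v_txt = collective_mf(x_img, x_txt, rank=3)
total = np.linalg.norm(x_img) ** 2 + np.linalg.norm(x_txt) ** 2
rel_err = reconstruction_error(x_img, x_txt, u, v_img, v_txt) / total
print("relative reconstruction error:", rel_err)
```

The rows of the shared factor `u` act as a joint per-sample representation that a downstream classifier could consume, which is one way to analyze all modalities within a single model.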