Radiotekhnika
Publishing house Radiotekhnika

The hybrid intelligent Russian language based dialog information system using metagraph approach

Keywords:

Yu.E. Gapanyuk – Ph.D.(Eng.), Associate Professor, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: gapyu@bmstu.ru
A.V. Leontiev – Post-graduate Student, Master, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: alekseyl@list.ru
I.I. Latkin – Post-graduate Student, Master, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: igor.latkin@outlook.com
S.V. Chernobrovkin – Post-graduate Student, Master, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: sergey.chernobrovkin@inbox.ru
M.A. Belyanova – Undergraduate, Bachelor, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: flerchy@gmail.com
O.N. Morozenkov – Student, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
E-mail: m@oleg.rocks


Modern dialog systems (chatbots) cope satisfactorily only with answering frequently asked questions. Answering user questions from a knowledge base is also a relevant task. Many users are now ready to supply their domain data as a denormalized table (or a set of denormalized tables). In this case, the typical requirements for a dialog system are: I. Answers to frequently asked questions; II. Answers to questions from the knowledge base presented as a denormalized table; III. Active dialog with the user, in which the system asks counter questions.
Answering a user's question is usually treated as a classification problem in which the answers are the target classes and the possible questions serve as features. Commercial systems also answer frequently asked questions using machine learning methods, for example Microsoft QnA Maker.
The frequently asked questions answering module combines the TF-IDF measure (weight 0.8) and the Doc2Vec measure (weight 0.2) and uses the cosine distance. For questions about the knowledge base, however, machine learning methods do not offer comparably stable solutions. Even the most promising methods, based on recurrent neural networks (LSTM, Seq2Seq), cannot answer knowledge base questions accurately.
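The weighted combination described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the TF-IDF weighting is a simple hand-written variant, `embed` is a hypothetical stand-in for a trained Doc2Vec model (it counts character trigrams), and the score uses cosine similarity, the complement of the cosine distance named in the text. The names `best_answer`, `tfidf_vectors` and `embed` are invented for this sketch.

```python
import math
from collections import Counter

W_TFIDF, W_EMBED = 0.8, 0.2  # weights stated in the abstract


def tokenize(text):
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def tfidf_vectors(docs):
    """Fit smoothed IDF weights on docs; return a text -> TF-IDF dict function."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(tokenize(d)))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(text):
        tf = Counter(tokenize(text))
        # Terms never seen in the FAQ questions are ignored.
        return {t: c * idf[t] for t, c in tf.items() if t in idf}

    return vectorize


def embed(text):
    """ASSUMPTION: placeholder for a Doc2Vec vector (character-trigram counts)."""
    s = f" {text.lower()} "
    return dict(Counter(s[i:i + 3] for i in range(len(s) - 2)))


def best_answer(question, faq):
    """Return (answer, score) for the stored question with the highest weighted score."""
    stored = list(faq)
    vec = tfidf_vectors(stored)
    q_t, q_e = vec(question), embed(question)
    scored = [
        (W_TFIDF * cosine(q_t, vec(k)) + W_EMBED * cosine(q_e, embed(k)), k)
        for k in stored
    ]
    score, key = max(scored)
    return faq[key], score


faq = {
    "how do i reset my password": "Use the reset link.",
    "what are your opening hours": "9 to 18.",
}
answer, score = best_answer("reset password please", faq)
```

In a real system the returned score would typically be compared with a threshold before the answer is shown; the abstract does not state such a threshold, so none is assumed here.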
A metagraph model serves as the model of the knowledge base. Its metavertices can be used both as data elements for answering questions and as information elements for conducting an active dialog. The history of the user's responses is saved in the user's session. Counter questions are put to the user on the basis of the knowledge base and the history of the user's answers.
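One way to picture the dual role of metavertices is the minimal sketch below. It is a simplification under assumptions: a full metagraph model also has edges and attributes on every element, while here a metavertex only nests other vertices, and the `question` field, the `next_step` function and the sample catalog are hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    attrs: dict = field(default_factory=dict)  # data used to answer a question


@dataclass
class MetaVertex(Vertex):
    # A metavertex contains a fragment of the graph (vertices or metavertices).
    children: list = field(default_factory=list)
    question: str = ""  # ASSUMPTION: counter question used to pick a child


def next_step(root, history):
    """Walk down from root using the answers saved in the session history.

    Returns ("ask", metavertex_name, question_text) when an answer is missing,
    or ("answer", vertex_name, attrs) when a data vertex has been reached.
    """
    node = root
    while isinstance(node, MetaVertex) and node.children:
        choice = history.get(node.name)
        child = next((c for c in node.children if c.name == choice), None)
        if child is None:
            options = ", ".join(c.name for c in node.children)
            return ("ask", node.name, f"{node.question} ({options})")
        node = child
    return ("answer", node.name, node.attrs)


# Hypothetical knowledge base derived from a denormalized table of offers.
catalog = MetaVertex("catalog", question="Which category?", children=[
    MetaVertex("mobile", question="Which plan?", children=[
        Vertex("basic", {"price": 10}),
        Vertex("premium", {"price": 25}),
    ]),
    Vertex("internet", {"price": 15}),
])

session = {}                               # history of the user's answers
first = next_step(catalog, session)        # asks about the category
session["catalog"] = "mobile"
second = next_step(catalog, session)       # asks about the plan
session["mobile"] = "premium"
final = next_step(catalog, session)        # reaches a data vertex
```

Here the same structure both drives the dialog (each unanswered metavertex yields a counter question) and supplies the data for the final answer.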
Linguistic preprocessing of user questions includes tokenization, transliteration, morphological normalization, phonetization, spell-checking, and synonym replacement.
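The preprocessing stages listed above could be chained roughly as in the sketch below. Everything in it is an assumption made for illustration: the transliteration map, lemma table, vocabulary and synonym table are tiny hypothetical stand-ins for real linguistic resources, the phonetic key is a toy rule (collapse doubled letters, devoice consonants), and using that key inside spell-checking is one possible reading of how the two stages interact.

```python
import re
from difflib import get_close_matches

# ASSUMPTION: toy tables; a real system would use full linguistic resources.
TRANSLIT = {
    "sh": "ш", "ch": "ч", "zh": "ж", "ya": "я", "yu": "ю",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о", "p": "п",
    "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "c": "ц", "y": "ы",
    "h": "х",
}
LEMMAS = {"тарифы": "тариф", "тарифа": "тариф", "цены": "цена"}
SYNONYMS = {"стоимость": "цена"}
VOCAB = ["тариф", "цена", "стоимость"]

# Longest keys first so that digraphs such as "sh" win over single letters.
_TRANSLIT_RE = re.compile("|".join(sorted(TRANSLIT, key=len, reverse=True)))
_DEVOICE = str.maketrans("бвгджз", "пфктшс")


def transliterate(token):
    """Map Latin-typed Russian to Cyrillic; Cyrillic input is left unchanged."""
    return _TRANSLIT_RE.sub(lambda m: TRANSLIT[m.group()], token)


def phonetic_key(word):
    """Toy phonetization: collapse repeated letters, then devoice consonants."""
    return re.sub(r"(.)\1+", r"\1", word).translate(_DEVOICE)


_BY_SOUND = {phonetic_key(w): w for w in VOCAB}


def correct(token):
    """Spell-check: exact match, then phonetic match, then closest spelling."""
    if token in VOCAB:
        return token
    by_sound = _BY_SOUND.get(phonetic_key(token))
    if by_sound:
        return by_sound
    close = get_close_matches(token, VOCAB, n=1, cutoff=0.75)
    return close[0] if close else token


def preprocess(text):
    tokens = re.findall(r"\w+", text.lower())   # tokenization
    result = []
    for tok in tokens:
        tok = transliterate(tok)                # transliteration
        tok = LEMMAS.get(tok, tok)              # morphological normalization
        tok = correct(tok)                      # phonetization + spell-checking
        tok = SYNONYMS.get(tok, tok)            # synonym replacement
        result.append(tok)
    return result


normalized = preprocess("Stoimost tarifa")
```

With these toy tables, a Latin-typed and inflected query is reduced to the same normalized tokens as its canonical Cyrillic form, which is what lets the later matching stages work on a small, stable vocabulary.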
The information system provides, as far as possible, seamless integration of the frequently asked questions answering module and the knowledge base processing module.
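The abstract does not say how the two modules are integrated. One plausible arrangement, shown purely as a hedged sketch, is a dispatcher that asks both modules for a scored candidate, returns the more confident one, and falls back to a clarifying question otherwise. The module signature, the `threshold` value, the fallback text and the stub modules are all assumptions of this example.

```python
from typing import Callable, Optional, Tuple

# ASSUMPTION: each module maps (question, session) to (answer or None, confidence).
Module = Callable[[str, dict], Tuple[Optional[str], float]]


def make_dispatcher(faq: Module, kb: Module, threshold: float = 0.5):
    """Build a single entry point over the FAQ module and the knowledge base module."""

    def answer(question: str, session: dict) -> str:
        candidates = [faq(question, session), kb(question, session)]
        text, score = max(candidates, key=lambda c: c[1])
        if text is None or score < threshold:
            # Neither module is confident enough: keep the dialog active.
            return "Could you clarify your question?"
        session.setdefault("history", []).append((question, text))
        return text

    return answer


# Stub modules standing in for the real FAQ and knowledge base components.
def faq_stub(question, session):
    return ("We are open 9 to 18.", 0.9) if "hours" in question else (None, 0.0)


def kb_stub(question, session):
    return ("The premium plan costs 25.", 0.8) if "premium" in question else (None, 0.0)


ask = make_dispatcher(faq_stub, kb_stub)
session = {}
r1 = ask("what are your hours", session)
r2 = ask("how much is premium", session)
r3 = ask("tell me a joke", session)
```

Because the user sees only one entry point and one shared session, switching between the two modules is invisible to them, which is one way to read the "seamless" requirement.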

May 29, 2020
