Yu.E. Gapanyuk – Ph.D.(Eng.), Associate Professor, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
G.I. Revunkov – Ph.D.(Eng.), Associate Professor, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
S.V. Zlobina – Student, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
Z.D. Kadiev – Master's Student, Department «Information Processing and Control Systems», Bauman Moscow State Technical University
One of the most modern approaches to processing large-scale graphs is to abandon the traditional representation of a graph as a set of vertices and a set of edges at the processing stage. From the processing standpoint, the model of a complex graph is considered at two levels. By analogy with the data model of a relational DBMS, these levels can be called the «logical» and the «physical» model of a complex graph. Continuous vector spaces are currently the traditional choice for the physical model of graph representation; however, there are no restrictions on the use of other types of spaces. The operation of mapping a graph into a vector space is called a «vector representation» or «embedding».
A knowledge graph embedding is a mathematical transformation of a graph into a vector or a set of vectors in a given vector space. It can be carried out for the vertices of the graph only, for both the vertices and the edges, or even for the entire graph as a whole. In the first two cases the result is a set of vectors; in the last case, a single vector for the whole graph. The main requirement is that the transformation adequately convey the semantics and topology of the original graph. The embedding process of a knowledge graph can be divided into three stages: choosing how entities and relationships are represented in the vector space; defining the scoring function; and training the embedding model. Existing embedding models can be divided into two groups: «translational distance models» and «semantic matching models». The translational distance models include TransE and TransR. The TransE model represents both nodes and links as vectors in the same vector space; the relation operator is treated as a translation vector from the subject to the object. The main disadvantage of the TransE model is the difficulty of describing one-to-many and many-to-many relationships. The TransR model uses two vector spaces, one for embedding entities and one for embedding relationships. This eliminates the disadvantage of the TransE model, but the TransR model is less computationally efficient.
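As an illustration, the TransE scoring idea (a true triple should satisfy h + r ≈ t, so lower translation distance means a more plausible triple) can be sketched in a few lines of Python. The embedding values here are invented for demonstration, not trained:

```python
import math

def transe_score(h, r, t):
    # TransE: the relation is a translation vector between subject and
    # object, so the score is the distance between (h + r) and t.
    # Lower score means a more plausible triple.
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 3-dimensional embeddings (illustrative values, not trained):
moscow     = [0.9, 0.1, 0.0]
capital_of = [0.0, 0.4, 0.5]
russia     = [0.9, 0.5, 0.5]
paris      = [0.2, 0.8, 0.3]

# The true triple scores lower (better) than the corrupted one.
assert transe_score(moscow, capital_of, russia) < transe_score(paris, capital_of, russia)
```

The one-to-many limitation mentioned above follows directly from this formulation: if one head and one relation must translate to several different tails, the tail embeddings are pushed toward the same point.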
The «semantic matching models» include RESCAL, DistMult, and HolE. The RESCAL model associates each entity with a vector that captures the latent meanings (factors) of that entity; each relationship is represented as a matrix that models pairwise interactions of the latent factors. The disadvantage of the model is its computational complexity. The DistMult model simplifies RESCAL, but it can only handle symmetric relationships, which is clearly insufficient for an abstract knowledge graph in which relations can be arbitrary. The HolE model combines the expressive power of RESCAL with the simplicity of DistMult. Experimental results show that this model is the most successful for the embedding of knowledge graphs.
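The symmetry limitation of DistMult and how HolE avoids it can be demonstrated directly. DistMult scores a triple as an element-wise product, which is unchanged when head and tail are swapped; HolE compresses the head–tail interactions with circular correlation, which is direction-sensitive. The vectors below are arbitrary illustrative values:

```python
def distmult_score(h, r, t):
    # DistMult: bilinear score with a diagonal relation matrix,
    # i.e. the sum of element-wise products h_i * r_i * t_i.
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))

def circular_correlation(a, b):
    # (a * b)_k = sum_i a_i * b_{(i + k) mod d}
    d = len(a)
    return [sum(a[i] * b[(i + k) % d] for i in range(d)) for k in range(d)]

def hole_score(h, r, t):
    # HolE: compress pairwise head-tail interactions with circular
    # correlation, then match the result against the relation vector.
    return sum(ri * ci for ri, ci in zip(r, circular_correlation(h, t)))

h = [0.3, 0.7, 0.1]
r = [0.5, 0.2, 0.9]
t = [0.6, 0.4, 0.8]

# DistMult cannot distinguish direction: swapping head and tail
# leaves the score unchanged, so only symmetric relations fit.
assert abs(distmult_score(h, r, t) - distmult_score(t, r, h)) < 1e-12

# HolE is direction-sensitive: circular correlation is not commutative.
assert abs(hole_score(h, r, t) - hole_score(t, r, h)) > 1e-3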
The metagraph can be converted to a multipartite flat graph and then processed using embedding models for flat graphs.
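One possible flattening scheme can be sketched as follows: each metavertex is promoted to an ordinary vertex, and auxiliary «contains» edges link it to its member vertices, producing a flat graph with two parts (ordinary vertices and promoted metavertices). This is an illustrative sketch only; the exact transformation used in the cited work may differ:

```python
def flatten_metagraph(vertices, metavertices, edges):
    # Promote each metavertex to an ordinary vertex and add
    # "contains" edges from it to its member vertices, so the
    # result is a flat graph suitable for standard embedding models.
    flat_nodes = set(vertices) | set(metavertices)
    flat_edges = list(edges)
    for mv, members in metavertices.items():
        for member in members:
            flat_edges.append((mv, "contains", member))
    return flat_nodes, flat_edges

nodes, edges = flatten_metagraph(
    vertices={"v1", "v2", "v3"},
    metavertices={"mv1": ["v1", "v2"]},      # mv1 groups v1 and v2
    edges=[("v3", "relates_to", "v1")],
)
assert ("mv1", "contains", "v1") in edges    # nesting became an edge
```

After this step the resulting triples can be fed to any of the flat-graph embedding models discussed above.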
- Evin I.A. Vvedenie s teoriyu slozhnykh setei. Kompyuternye issledovaniya i modelirovanie. 2010. T. 2. № 2. S. 121−141.
- Kuznetsov O.P., Zhilyakova L.Yu. Slozhnye seti i kognitivnye nauki. Sb. nauchnykh trudov XVII Vseros. nauchno-tekhnich. konf. Neiroinformatika-2015. M.: MIFI. 2015. Ch. 1. S. 18.
- Anokhin K.V. Kognitom: gipersetevaya model mozga. Sb. nauchnykh trudov XVII Vseros. nauchno-tekhnich. konf. Neiroinformatika-2015. M.: MIFI. 2015. Ch. 1. S. 14−15.
- Chapela V., Criado R., Moral S., Romance M. Intentional Risk Management through Complex Networks Analysis. SpringerBriefs in Optimization. Springer. 2015.
- Ehrlinger L., Wöß W. Towards a Definition of Knowledge Graphs. Joint Proceedings of the Posters and Demos Track of 12th International Conference on Semantic Systems – SEMANTiCS2016 and 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS16), 2016. Leipzig. Germany. V. 1695.
- Wang Q., Mao Z., Wang B., Guo L. Knowledge Graph Embedding: A Survey of Approaches and Applications. IEEE Transactions on Knowledge and Data Engineering. 2017. V. 29. № 12. P. 2724−2743.
- Bordes A., Usunier N., Garcia-Duran A., Weston J., Yakhnenko O. Translating Embeddings for Modeling Multi-relational Data. Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013. V. 2. P. 2787−2795.
- Lin H., Liu Y., Wang W., Yue Y., Lin Z. Learning Entity and Relation Embeddings for Knowledge Resolution. Procedia Computer Science. 2017. V. 108. P. 345−354.
- Nickel M., Tresp V., Kriegel H. A Three-way Model for Collective Learning on Multi-relational Data. Proceedings of the 28th International Conference on International Conference on Machine Learning. 2011. P. 809−816.
- Yang B., Yih W., He X., Gao J., Deng L. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. 2015. arXiv:1412.6575.
- Nickel M., Rosasco L., Poggio T. Holographic Embeddings of Knowledge Graphs. Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. 2016. P. 1955−1961.
- Chernenkii V.M., Terekhov V.I., Gapanyuk Yu.E. Struktura gibridnoi intellektualnoi informatsionnoi sistemy na osnove metagrafov. Neirokompyutery: razrabotka, primenenie. 2016. № 9. S. 3−14.
- Dunin I.V., Gapanyuk Yu.E., Revunkov G.I. Osobennosti preobrazovaniya metagrafa v model ploskogo grafa. Dinamika slozhnykh sistem – XXI vek. 2018. № 9. S. 47−51.