Machine Learning on Linked Data:

Tensors and their Applications in Graph-Structured Domains

When & Where

The tutorial will take place on Sunday, November 11th, from 2:00pm to 5:30pm in Room Alcott.


Slides

Slides for the talk are available here. They will most likely only work in Chrome (for now, sorry...).


Abstract

Machine learning has become increasingly important in the context of Linked Data, as it is an enabling technology for many important tasks such as link prediction, information retrieval or group detection. The fundamental data structure of Linked Data is a graph. Graphs are also ubiquitous in many other fields of application, such as social networks, bioinformatics or the World Wide Web. Recently, tensor factorizations have emerged as a highly promising approach to machine learning on graph-structured data, showing both scalability and excellent results on benchmark data sets, while matching the triple structure of RDF perfectly. This tutorial will provide an introduction to tensor factorizations and their applications for machine learning on graphs. Using concrete tasks such as link prediction, we will discuss several factorization methods in depth and provide the necessary theoretical background on tensors in general. Emphasis is put on tensor models that are of interest to Linked Data, including models that can factorize large-scale graphs with millions of entities and known facts, and models that can handle the open-world assumption of Linked Data. Furthermore, we will discuss tensor models for temporal and sequential graph data, e.g. to analyze social networks over time.
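
As a rough illustration of how the triple structure of RDF maps onto a third-order tensor, the following minimal sketch builds an adjacency tensor from a handful of triples in plain Python. The triples, the entity and relation names, and the helper `triples_to_tensor` are hypothetical and chosen only for illustration; they are not taken from the tutorial material.

```python
# Minimal sketch: represent (subject, predicate, object) triples as a
# third-order adjacency tensor X, where X[k][i][j] == 1 if and only if the
# triple (entity_i, relation_k, entity_j) is a known fact.
# All names and triples below are hypothetical illustration data.

def triples_to_tensor(triples):
    # Collect every entity (subjects and objects) and every relation.
    entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    relations = sorted({p for _, p, _ in triples})
    e_idx = {e: i for i, e in enumerate(entities)}
    r_idx = {r: k for k, r in enumerate(relations)}

    n, m = len(entities), len(relations)
    # One n x n adjacency matrix ("frontal slice") per relation.
    X = [[[0] * n for _ in range(n)] for _ in range(m)]
    for s, p, o in triples:
        X[r_idx[p]][e_idx[s]][e_idx[o]] = 1
    return X, entities, relations


triples = [
    ("Alice", "knows", "Bob"),
    ("Bob", "knows", "Carol"),
    ("Alice", "worksAt", "ACME"),
]
X, entities, relations = triples_to_tensor(triples)
```

A dense nested list is used here only for readability. For graphs with millions of entities, a sparse representation (for example one sparse matrix per relation) would be the more plausible choice.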

Audience & Agenda

The tutorial is directed at researchers as well as industry professionals and practitioners. It will introduce a novel line of research for machine learning on Linked Data and give an introduction to tensor methods in general. Tensor factorization methods will be discussed by means of their applications to typical machine learning tasks on Linked Data, which will include novel and scalable tensor-based approaches to important practical problems such as link prediction, information retrieval or group detection. The tutorial will cover the following topics:
  • Machine Learning on Graphs and Linked Data (~ 15 Min.)
    • Motivation for machine learning on graphs, e.g. link prediction, entity resolution and disambiguation. Focus on Linked Data / Semantic Web data. Discussion of machine learning on graphs in this context, i.e. prediction of unknown triples, instance matching, ontology alignment. Discussion of associated challenges.
    • Very brief overview of common approaches for machine learning on graphs and relational data
  • Tensors, Graphs and Multilinear Algebra (~ 25 Min.)
    • Introduction to graphs and networks
    • Modeling of graphs as matrices and tensors. Tensors as a natural and very efficient data structure for RDF
    • Introduction to tensors and multilinear algebra in general, i.e. terminology, structure of tensors, multilinear operations on tensors
    • Presentation of common tensor factorizations, their important properties and ways to compute them
  • Tensors in the Multi-Relational Domain (~ 40 Min.)
    • Tensor factorizations for general multi-relational data and applications
    • Collective learning with tensors, RESCAL [16]
    • Bayesian non-parametric machine learning, Bayesian Clustered Tensor Factorization [29]
    • Hierarchical multilinear models [32]
  • Coffee Break

  • Learning with Tensors on Semantic Web Data and Linked Data (~ 40 Min.)
    • Tensor factorizations in the Linked Data / Semantic Web context
    • Large-scale data, scalability
    • Large-scale prediction of unknown triples on YAGO with RESCAL [29]
    • Prediction of RDF triples in knowledge bases with Pairwise Interaction Tensor Factorization [21, 31]
    • Faceted Browsing on RDF data with TripleRank [17]
  • Learning with Tensors on Time-evolving Graphs (~ 40 Min.)
    • Modeling and analyzing graphs and networks that change over time
    • Analyzing communication networks, DEDICOM [18]
    • Item recommendation on sequential data [16]
    • Time-evolving networks, Streaming Tensor Analysis [20]
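
To give a flavour of the factorization models listed in the agenda: RESCAL approximates each relation slice X_k of the adjacency tensor as A R_k Aᵀ, where row i of A is a latent vector for entity i and R_k captures how latent components interact under relation k. The sketch below scores a single triple under that form in plain Python. The factor values are made up for illustration (they are not learned and not taken from the tutorial), and the helper name `rescal_score` is hypothetical.

```python
# Minimal sketch of triple scoring under a RESCAL-style model:
#   score(i, k, j) = a_i^T R_k a_j
# A and R_k hold hand-picked, hypothetical values; in practice they would
# be estimated by fitting the factorization to the observed tensor.

def rescal_score(A, R_k, i, j):
    """Return a_i^T R_k a_j for entity rows i and j of A."""
    a_i, a_j = A[i], A[j]
    return sum(
        a_i[p] * R_k[p][q] * a_j[q]
        for p in range(len(a_i))
        for q in range(len(a_j))
    )


# Three entities embedded in a two-dimensional latent space.
A = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
# Interaction matrix for one relation; it is asymmetric, so the score of
# (i, k, j) can differ from the score of (j, k, i).
R_k = [
    [0.0, 1.0],
    [0.0, 0.0],
]

score_01 = rescal_score(A, R_k, 0, 1)  # entity 0 -> entity 1
score_10 = rescal_score(A, R_k, 1, 0)  # entity 1 -> entity 0
```

Because the same matrix A represents entities in both the subject and the object role, information can propagate across relations, which is one way to read the "collective learning" item in the agenda.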

Presenters

Maximilian Nickel is a Ph.D. student at the Ludwig Maximilian University of Munich, Germany. His research interests are focused on tensor factorizations, statistical relational learning and the Semantic Web. He received a Diploma degree in computer science with honors from the Ludwig Maximilian University of Munich. He is the corresponding author of publications in top-tier conferences such as WWW 2012, NIPS 2010/11 and ICML 2012. He is also involved in the EU FP7 LarKC project and in the THESEUS program, the Internet of Services, funded by the Federal Ministry of Economics and Technology of Germany. His main focus of research is the development of scalable tensor and matrix factorization approaches for Semantic Web data and their application to large data sets.

Volker Tresp received a Diploma degree from the University of Göttingen, Germany, in 1984 and the M.Sc. and Ph.D. degrees from Yale University, New Haven, CT, in 1986 and 1989 respectively. Since 1989 he has been the head of a research team in machine learning at Siemens, Corporate Research and Technology. In 1994 he was a visiting scientist at the Massachusetts Institute of Technology's Center for Biological and Computational Learning. He has filed more than 54 patent applications and was Siemens Inventor of the Year in 1996. He has published more than 100 scientific articles and supervised more than 10 Ph.D. theses. The company Panoratio is a spin-off from his team. He has been involved in all leading programme committees in machine learning. He is coordinating the Siemens effort in the nationally funded project THESEUS for the development of semantic, multimedia and learning technologies and has led the machine learning efforts in the EU FP7 project LarKC (2008-2011) on scalable reasoning and machine learning for the Semantic Web. In 2011 he became a Professor at the Ludwig Maximilian University of Munich. He has presented tutorials at the Machine Learning Summer School 2006 in Canberra, Australia, at ICML 2009 and at ILP 2010. At ESWC 2012, he presented a tutorial titled "Accessing the Semantic Web via Statistical Machine Learning".
