APPLICATIONS OF MINING HETEROGENEOUS INFORMATION NETWORKS Yizhou Sun College of Computer and Information Science Northeastern University [email protected] 2/7/20 Heterogeneous Information Networks Multiple object types and/or multiple link types Movie Studio Venue Paper

Author DBLP Bibliographic Network Actor Movie Director The IMDB Movie Network The Facebook Network 1. Homogeneous networks are Information loss projection of heterogeneous networks! 2. New problems are emerging in heterogeneous networks! Directly Mining information richer heterogeneous networks

2 Outline Why Heterogeneous Information Networks? Entity Recommendation Information Diffusion Ideology Detection Summary 3 Recommendation Paradigm feedback user community useritem feedback recommender system

recommendation product features Collaborative Content-Based Filtering Methods Hybrid Methods E.g., K-Nearest (BalabanovicNeighbor Comm. ACM (Sarwar 97, Zhang WWW01)

SIGIR02) ,,Matrix Content-Based CF (Antonopoulus, IS06) Factorization (Hu ICDM08, IEEE-CS09), External Knowledge CF (MaKoren WSDM11) Probabilistic Model (Hofmann SIGIR03) external knowledge 4 Problem Definition

feedback user implicit user feedback recommender system recommendation hybrid collaborative filtering with information networks information network 5 Hybrid Collaborative Filtering with Networks Utilizing

network relationship information can enhance the recommendation quality However, most of the previous studies only use single type of relationship between users or items (e.g., social network Ma,WSDM11, trust relationship Ester, KDD10, service membership Yuan, RecSys11) 6 The Heterogeneous Information Network View of Recommender System Avatar Aliens Revolution -ary Road

Titanic James Cameron Zoe Saldana Adventure Romance Leonardo Dicaprio Kate Winslet 7

Relationship Heterogeneity Alleviates Data Sparsity A small number of users and items have a large number of ratings # of ratings Collaborative filtering methods suffer from data sparsity issue Most users and items have a small number of ratings # of users or items

Heterogeneous relationships complement each other Users and items with limited feedback can be connected to the network by different types of paths Connect new users or items (cold start) in the 8 Relationship Heterogeneity Based Personalized Recommendation Models Different users may have different behaviors or preferences Two levels of personalization James Cameron fan Aliens 80s Sci-fi fan

Sigourney Weaver fan Different users may be interested in the same movie for different reasons Data level Most recommendation methods use one model for all users and rely on personal feedback to achieve personalization Model level With different

entity relationships, we can learn personalized models for different users to further 9 Preference Propagation-Based Latent Features genre: drama Bob Charlie Naomi Watts

King Kong tag: Oscar Nomination Ralph Fiennes Alice Titanic skyfall Kate Winslet Generate L different meta-path (path types) connecting users and items revolutionary road

Propagate user implicit feedback along each metapath Sam Mendes Calculate latentfeatures for users and items for each meta-path with NMF related method 10 Recommendation Models Observation 1: Different meta-paths may have different importance Global Recommendation Model ranking score

features for user i and item j (1) the q-th meta-path Observation 2: Different users may require different models Personalized Recommendation Model user-cluster similarity L (2) c total soft user clusters 11 Parameter Estimation

Bayesian personalized ranking (Rendle UAI09) Objective function sigmoid function min (3) for each correctly ranked item pair i.e., gave feedback to but not Soft cluster users with NMF + k-means For each user cluster, learn one model with Eq. (3)

Generate personalized model for each user on the fly with Eq. (2) Learning Personalized Recommendation Model 12 Experiment Setup Datasets Comparison methods: Popularity: recommend the most popular items to users Co-click: conditional probabilities between items NMF: non-negative matrix factorization on user

feedback Hybrid-SVM: use Rank-SVM with plain features (utilize both user feedback and information network) 13 Performance Comparison HeteRec personalized recommendation (HeteRec-p) provides the best recommendation results 14 Performance under Different Scenarios p p user

HeteRecp consistently outperform other methods in different scenarios better recommendation results if users provide more feedback better recommendation for users who like less popular items 15 Entity Recommendation in Information Contributions Networks with Implicit User Feedback (RecSys13, WSDM14a) Propose latent representations for users and items by propagating user preferences along different meta-paths Employ Bayesian ranking optimization technique to correctly evaluate recommendation models Further improve recommendation quality by considering user differences at model level and

define personalized recommendation models Two levels of personalization 16 Outline Why Heterogeneous Information Networks? Entity Recommendation Information Diffusion Ideology Detection Summary 17 Information Diffusion in Networks Action of a node is triggered by the actions of their neighbors

18 Linear Threshold Model [Granovetter, 1978] If the weighted activation number of its neighbors is bigger than a pre-specified threshold , the node u is going to be activated In other words 19 Heterogeneous Bibliographic Network Multiple types of objects Multiple types of links

20 Derived Multi-Relational Bibliographic Network Collaboration: Author-Paper-Author Citation: Author-Paper->Paper-Author Sharing Co-authors: Author-Paper-Author-Paper-Author Co-attending venues: Author-Paper-Venue-Paper-Author How to generate these meta-paths ? PathSim: Sun et.al, VLDB11 21 How Topics Are Propagated among Authors? To Apply Existing approaches Select one relation between authors (say, A-P-A)

Use all the relations, but ignore the relation types Do different relation types play m r o f n I ! s s lo n o ati

different roles? Need new models! 22 Two Assumptions for Topic Diffusion in Multi-Relational Networks Assumption 1: Relation independent diffusion Model-level aggregation 23 Assumption 2: Relation interdependent diffusion Relation-level aggregation 24 Two Models under the Two Assumptions

Two multi-relational linear threshold models Model 1: MLTM-M Model-level aggregation Model 2: MLTM-R Relation-level aggregation 25 MLTM-M For each relation type k The activation probability for object i at time t+1: The collective model The final activation probability for object i is an aggregation over all relation types

26 Properties of MLTM-M 27 MLTM-R Aggregate multi-relational network with different weights Treat the activation as in a single-relational network To make sure the activation probability non-negative, weights are required non-negative

28 Properties of MLTM-R 29 How to Evaluate the Two Models? Test on the real action log on multiple topics! } Diffusion model learning from action log MLE estimation over 30 Two Real Datasets DBLP Computer Science

Relation types APA, AP->PA, APAPA, APVPA APS Physics Relation types APA, AP->PA, APAPA, APOPA 31 Topics Selected Select topics with increasing trends 32 Evaluation Methods Global Prediction How many authors are activated at t+1

Error rate = (predicted#/true# + true#/predicted#)-1 Local Prediction Which author is likely to be activated at t+1 AUPR (Area under Precision-Recall Curve) 33 Global Prediction 34 Local Prediction - AUPR 1: Different Relation Play Different Roles in Diffusion Process 2: Relation-Level Aggregation is better than ModelLevel Aggregation

Case Study 36 Prediction Results on social network Diffusion 37 38 WIN! 39 Outline Why Heterogeneous Information Networks? Entity Recommendation

Information Diffusion Ideology Detection Summary 40 Topic-Factorized Ideal Point Estimation Model for Legislative Voting Network (KDD14, Gu, Sun et al.) 41 Background Federal Legislation (bill) The House

Senate Law Bill 1 Bill 2 United Stated Congress Ronald Paul Barack Obama The House Senate Politician Republican

Democrat Ronald Paul Barack Obama liberal conservative 42 Legislative Voting Network 43 Problem Definition Input:

Legislative Network Output: : Ideal Points for Politician : Ideal Points for Bill s on different topics 44 Existing Work 1-dimensional ideal point model (Poole and Rosenthal, 1985; Gerrish and Blei, 2011) High-dimensional ideal point model (Poole and Rosenthal, 1997) Issue-adjusted ideal point model (Gerrish and Blei, 2012)

45 Motivations Voters have different positions on different topics. Topic 1 Topic 2 Topic 3 Topic 4 Traditional matrix factorization method cannot give the meanings for each dimension.

latent factor Topics of bills can influence politicians voting, and the voting behavior can better interpret the topics of bills as well. Topic Model: Health Public Transport

Voting-guided Topic Model: Health Service Health Expenses Public Transport 46 Topic-Factorized IPM ( , )

Entities: Politicians Bills Terms Links: (, ) () Politicians Bills Heterogeneous Voting Network Terms Parameters to maximize the

likelihood of generating two types of links: Ideal points for politicians Ideal points for bills Topic models 47 Text Part Politicians Bills Terms 48

Text Part We model the probability of each word in each document as a mixture of categorical distributions, as in PLSA (Hofmann, 1999) and LDA (Blei et al., 2003) Bill =() =() Topic

Word 49 Voting Part Intuitions: The more similar of the ideal points of u and d, the higher probability of YEA link Politicians Bills Terms

The higher portion a bill belongs to topic k, the higher weight of ideal points on topic k 50 Voting Part 1 2 1 -1 1 1

1 1 1 0 -1 1 1 1 -1

1 Bill 1 1 1 ^ =

=1 Topic Topic 1 0

2 0 Voter Topic Topic 1 2 ^ =

=1 1 1User-Bill 1 1voting -1 matrix 1 0 0 YEA ) ( =1 ) = ( +

NAY ) ( =1 ) =1 ( + { ( | , , , )= (, ) : 0 ( ( =1 ) =1 }

1+ 2 { ( =1 ) = 1 } 1 2 )

51 Combining Two Parts Together The final objective function is a linear combination of the two average log-likelihood functions over the word links and voting links. We also add an regularization term to and to reduce over- fitting. 52 Learning Algorithm An iterative algorithm where ideal points related parameters () and topic model related parameters ()

enhance each other. Step 1: Update given Gradient descent Step 2: Update given Follow the idea of expectation-maximization (EM) algorithm: maximize a lower bound of the objective function in each iteration 53 Learning Algorithm Update : A nonlinear constrained optimization problem. Remove the constraints by a logistic function based transformation:

1 1+ = =1 1 1 1+

if if =1

and update using gradient descent. Update : Since only appears in the topic model part, we use the same updating rule as in PLSA: where 54 Data Description Dataset: U.S. House and Senate roll call data in the years between 1990 and 2013 1,540 legislators 7,162 bills 2,780,453 votes (80% are YEA)

Keep the latest version of a bill if there are multiple versions. Randomly select 90% of the votes as training and 10% as testing. Downloaded from http://thomas.loc.gov/home/rollcallvotes.html 55 Evaluation Measures Root mean square error (RMSE) between the predicted vote score and the ground truth RMSE = Accuracy of correctly predicted votes (using 0.5 as a threshold for the predicted accuracy) Accuracy =

Average log-likelihood of the voting link AvelogL = 56 Experimental Results Training Data set Testing Data set 57 Parameter Study 1 ( , , , , )=( 1 ) ( ) + ( )

2 2 Parameter study on Parameter study on (regularization coefficient) 58 Case Studies Ideal points for three famous politicians: (Republican, Democrat) Ronald Paul (R), Barack Obama (D), Joe Lieberman

(D) Public Transportation Funds Health Expenses Health Service Law Militar y Financial Institution Individual Property Educatio n Foreign Ronald Paul Barack Obama Joe Lieberman

59 Case Studies Scatter plots over selected dimensions: (Republican, Democrat) 60 Case Studies Bill: H_RES_578 109th Congress (2005-2006) It is about supporting the government of Romania to improve the standard health care and well-being of children in Romania. R. Paul YEA

H_RES_578 For Unseen Bill : ( =1 ) = ( + ) Topic Model TF-IPM Experts/ Algorithm

( =1 ) 61 Outline Why Heterogeneous Information Networks? Entity Recommendation Information Diffusion Ideology Detection Summary 62 Summary Heterogeneous Information Networks are networks

with multiple types of objects and links Principles in mining heterogeneous information networks Meta-path-based mining Systematically form new types of relations Relation strength-aware mining Different types of relations have different strengths Relation semantic-aware mining Different types of relations need different modeling 63 Q&A THANK YOU!

64