# predictive.link.probability: Use the RTM to predict whether a link exists between two... In lda: Collapsed Gibbs Sampling Methods for Topic Models

## Description

This function takes a fitted LDA-type model (e.g., LDA or RTM) and makes predictions about the likelihood of a link existing between pairs of documents.

## Usage

    predictive.link.probability(edgelist, document_sums, alpha, beta)

## Arguments

- edgelist: A two-column integer matrix where each row represents an edge on which to make a prediction. An edge is expressed as a pair of 1-indexed integer indices into the columns (i.e., documents) of document_sums (see below).
- document_sums: A K × D matrix where each entry is a number proportional to the probability of seeing a topic (row) conditioned on a document (column). This entry is sometimes denoted θ_{d,k} in the literature; see Details. The document_sums field or the document_expects field from the output of lda.collapsed.gibbs.sampler and rtm.collapsed.gibbs.sampler can be used.
- alpha: The value of the Dirichlet hyperparameter generating the distribution over document_sums. This, in effect, smooths the similarity between documents.
- beta: A numeric vector of regression weights used to determine the similarity between two vectors (see Details). Arguments will be recycled to create a vector of length dim(document_sums)[1], i.e., one weight per topic.
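As a rough illustration of the expected shapes, the following base-R sketch builds toy inputs. Every value here is made up, and nothing in it calls the lda package; in practice document_sums would come from one of the samplers and edgelist from links.as.edgelist.

```r
# Toy inputs; every value below is made up for illustration.
K <- 3  # number of topics
D <- 4  # number of documents

# K x D matrix of topic counts. matrix() fills column by column, so each
# group of three numbers below is one document (one column).
document_sums <- matrix(c(5, 1, 0,
                          4, 2, 0,
                          0, 1, 6,
                          1, 0, 5),
                        nrow = K, ncol = D)

# Two candidate edges: (document 1, document 2) and (document 1, document 3).
edgelist <- matrix(c(1L, 2L,
                     1L, 3L),
                   ncol = 2, byrow = TRUE)

# Dirichlet smoothing value (arbitrary choice for this sketch).
alpha <- 0.1

# A scalar beta is recycled to one regression weight per topic.
beta <- rep(3, length.out = nrow(document_sums))

# Indices must be integers pointing at existing columns of document_sums.
stopifnot(is.integer(edgelist), all(edgelist >= 1L), all(edgelist <= D))
```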

## Details

Whether a link exists between two documents i and j is modeled as a function of the weighted inner product of document_sums[,i] and document_sums[,j]. document_sums is first normalized column-wise, and the inner product is then weighted by beta.
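One way to write this quantity in symbols is given below. Here n_{k,d} stands for document_sums[k, d] and s_{ij} for the weighted inner product; both names are introduced only for this note. The way alpha enters (added to every entry before normalizing) is an assumption based on the description of alpha as a smoothing parameter, not something this page states explicitly.

$$
\bar{\theta}_{d,k} = \frac{n_{k,d} + \alpha}{\sum_{k'=1}^{K} \left( n_{k',d} + \alpha \right)},
\qquad
s_{ij} = \sum_{k=1}^{K} \beta_k \, \bar{\theta}_{i,k} \, \bar{\theta}_{j,k}
$$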

This quantity is then passed to a link probability function. As with rtm.collapsed.gibbs.sampler in this package, only the exponential link probability function is supported. The resulting values are automatically scaled to lie between 0 and 1.
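The steps in this section can be sketched in base R as follows. This is not the package's implementation: link_probability_sketch is a hypothetical helper, the way alpha is applied (added to the counts before normalizing) is an assumption, and the scaling into the unit interval (subtracting an upper bound on the score before exponentiating) is only one plausible choice, since the exact scaling is not specified here.

```r
# Hypothetical helper, not part of the lda package: a base-R sketch of the
# computation described in the Details section.
link_probability_sketch <- function(edgelist, document_sums, alpha, beta) {
  K <- nrow(document_sums)
  beta <- rep(beta, length.out = K)  # recycle to one weight per topic

  # Assumption: alpha smooths the counts before column-wise normalization.
  smoothed <- document_sums + alpha
  theta <- sweep(smoothed, 2, colSums(smoothed), "/")  # each column sums to 1

  # Weighted inner product, one value per row of edgelist. Multiplying a
  # K-row matrix by a length-K vector applies beta[k] to row k.
  from <- theta[, edgelist[, 1], drop = FALSE]
  to   <- theta[, edgelist[, 2], drop = FALSE]
  score <- colSums(from * to * beta)

  # Exponential link. Assumption: the score is at most max(beta, 0), so
  # subtracting that bound keeps the result in (0, 1]. The package's own
  # scaling may differ.
  exp(score - max(beta, 0))
}

# Toy data (made up): documents 1 and 2 share a dominant topic, document 3
# does not.
document_sums <- matrix(c(5, 1, 0,
                          4, 2, 0,
                          0, 1, 6),
                        nrow = 3)
edgelist <- matrix(c(1L, 2L,
                     1L, 3L),
                   ncol = 2, byrow = TRUE)

p <- link_probability_sketch(edgelist, document_sums, alpha = 0.1, beta = 3)
```

With these toy inputs, the pair (1, 2) receives a higher value than the pair (1, 3), since the first two documents concentrate on the same topic.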

## Value

A numeric vector of length dim(edgelist)[1] (one entry per row of edgelist), representing the probability of a link existing between each pair of documents given in the edge list.

## Author(s)

Jonathan Chang (slycoder@gmail.com)

## References

Chang, Jonathan and Blei, David M. Relational Topic Models for Document Networks. Artificial intelligence and statistics. 2009.

## See Also

rtm.collapsed.gibbs.sampler for the format of document_sums. links.as.edgelist produces values for edgelist. predictive.distribution makes predictions about document content instead.

## Examples

    ## See demo.

    ## Not run: demo(rtm)

lda documentation built on May 1, 2019, 10:34 p.m.