genericSummary: Summarize a text


Description

Selects sentences from a text that best describe its topic

Usage


Arguments

text

A character vector with length(text) = 1 specifying the text to be summarized

k

The number of sentences to be used in the summary

split

A character vector specifying which symbols determine the end of a sentence in the document

min

The minimum number of words a sentence must have to be included in the computations

breakdown

If TRUE, the function breakdown is applied to the input

...

Further arguments to be passed on to textmatrix
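The roles of split and min can be illustrated with a small base R sketch. The helper split_sentences, its parameter min_words, and the default values shown are hypothetical and chosen only for illustration; this is not the code used inside genericSummary.

```r
# Hypothetical helper, not part of the package: split a text into sentences
# at the given end-of-sentence symbols and drop sentences that are too short.
# The defaults below are assumptions made for this sketch.
split_sentences <- function(text, split = c(".", "!", "?"), min_words = 5) {
  # Build a regex character class such as "[.!?]" from the split symbols.
  # This simple construction assumes the symbols are not "]", "^", "-" or "\\".
  pattern <- paste0("[", paste(split, collapse = ""), "]")
  sentences <- trimws(strsplit(text, pattern)[[1]])
  sentences <- sentences[nzchar(sentences)]
  # Count whitespace-separated words per sentence and apply the minimum
  n_words <- lengths(strsplit(sentences, "\\s+"))
  sentences[n_words >= min_words]
}

split_sentences("Lions are larger than cats. Really? This is a short test sentence!")
```

In this sketch the one-word sentence "Really" falls below the minimum and is excluded before any further computation.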

Details

Applies the method of Gong & Liu (2001) for generic summarization of a text document D via Latent Semantic Analysis:

  1. Decompose the document D into individual sentences, and use these sentences to form the candidate sentence set S, and set k = 1.

  2. Construct the terms by sentences matrix A for the document D.

  3. Perform the SVD on A to obtain the singular value matrix Σ, and the right singular vector matrix V^t. In the singular vector space, each sentence i is represented by the column vector ψ_i = [v_i1, v_i2, ... , v_ir]^t of V^t.

  4. Select the k'th right singular vector from matrix V^t.

  5. Select the sentence which has the largest index value with the k'th right singular vector, and include it in the summary.

  6. If k reaches the predefined number, terminate the operation; otherwise, increment k by one, and go to Step 4.

(Cited directly from Gong & Liu, 2001, p. 21)
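
The steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name summarize_sketch is hypothetical, the matrix A holds raw term frequencies without weighting, and the sentence for each singular vector is chosen by largest absolute entry, because the sign of a singular vector returned by svd is arbitrary.

```r
# Illustrative sketch of steps 2 to 6; names are hypothetical.
# `sentences` is a character vector (step 1 already done), `k` the summary length.
summarize_sketch <- function(sentences, k) {
  # Tokenize: lower-case, replace everything except a-z and spaces
  tokens <- lapply(tolower(sentences), function(s) {
    w <- strsplit(gsub("[^a-z ]", " ", s), "\\s+")[[1]]
    w[nzchar(w)]
  })
  vocab <- sort(unique(unlist(tokens)))

  # Step 2: terms-by-sentences matrix A (rows = terms, columns = sentences)
  counts <- lapply(tokens, function(w) tabulate(match(w, vocab), nbins = length(vocab)))
  A <- matrix(unlist(counts), nrow = length(vocab))

  # Step 3: SVD of A; column j of dec$v is the j-th right singular vector,
  # with one entry per sentence
  dec <- svd(A)
  stopifnot(k <= ncol(dec$v))

  # Steps 4 to 6: for each of the first k right singular vectors, pick the
  # sentence with the largest absolute entry (assumption: absolute value)
  picked <- vapply(seq_len(k), function(j) which.max(abs(dec$v[, j])), integer(1))
  sentences[picked]
}

summarize_sketch(c("apple apple apple apple", "apple", "pear"), k = 2)
```

Note that nothing in this sketch prevents the same sentence from being selected for two different singular vectors; whether and how duplicates are handled is a design choice left open here.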

Value

A character vector of length k containing the sentences selected for the summary

Author(s)

Fritz Günther

See Also

textmatrix, lsa, svd

Examples

D <- "This is just a test document. It is set up just to throw some random 
sentences in this example. So do not expect it to make much sense. Probably, even 
the summary won't be very meaningful. But this is mainly due to the document not being
meaningful at all. For test purposes, I will also include a sentence in this 
example that is not at all related to the rest of the document. Lions are larger than cats."

genericSummary(D,k=1)

codymarquart/LSAfun2 documentation built on May 13, 2019, 8:48 p.m.