search_new | R Documentation |
Creates a new search object and runs the search in a corpus object. Only 'x' and 'pattern' are required. The other arguments can be left at their default values.
search_new(
x,
pattern,
searchMode = c("content", "fulltext", "fulltext.byTime", "fulltext.byTier"),
searchNormalized = TRUE,
name = "mysearch",
resultid.prefix = "result",
resultid.start = 1,
filterTranscriptNames = NULL,
filterTranscriptIncludeRegEx = NULL,
filterTranscriptExcludeRegEx = NULL,
filterTierNames = NULL,
filterTierIncludeRegEx = NULL,
filterTierExcludeRegEx = NULL,
filterSectionStartsec = NULL,
filterSectionEndsec = NULL,
concordanceMake = TRUE,
concordanceWidth = NULL,
cutSpanBeforesec = 0,
cutSpanAftersec = 0,
runSearch = TRUE
)
x |
Corpus object; the corpus to be searched. |
pattern |
Character string; search pattern as regular expression. |
searchMode |
Character string; takes one of the following values: "content", "fulltext", "fulltext.byTime", "fulltext.byTier". |
searchNormalized |
Logical; if |
name |
Character string; name of the search. Will be used, for example, as name of the sub folder when creating media cuts. |
resultid.prefix |
Character string; search results will be numbered consecutively, and this character string will be placed before the consecutive numbers. |
resultid.start |
Integer; search results will be numbered consecutively, and this is the start number of the identifiers. |
filterTranscriptNames |
Vector of character strings; names of transcripts to be included. |
filterTranscriptIncludeRegEx |
Character string; as regular expression, limit search to certain transcripts matching the expression. |
filterTranscriptExcludeRegEx |
Character string; as regular expression, exclude certain transcripts matching the expression. |
filterTierNames |
Vector of character strings; names of tiers to be included. |
filterTierIncludeRegEx |
Character string; as regular expression, limit search to certain tiers matching the expression. |
filterTierExcludeRegEx |
Character string; as regular expression, exclude certain tiers matching the expression. |
filterSectionStartsec |
Double; start time of region for search. |
filterSectionEndsec |
Double; end time of region for search. |
concordanceMake |
Logical; if |
concordanceWidth |
Integer; number of characters to the left and right of the search hit in the concordance, the default is |
cutSpanBeforesec |
Double; start the media and transcript cut some seconds before the hit to include some context, the default is |
cutSpanAftersec |
Double; end the media and transcript cut some seconds after the hit to include some context, the default is |
runSearch |
Logical; if |
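The filter arguments narrow the search before it runs. The following is a minimal sketch, assuming the example corpus shipped with the package (`examplecorpus`) and an arbitrary time window of 0 to 60 seconds chosen purely for illustration; the object names `search.all` and `search.section` are hypothetical.

```r
library(act)

# Unfiltered search over the example corpus shipped with the package.
search.all <- act::search_new(examplecorpus, pattern = "yo")

# Same pattern, restricted to an illustrative time window (in seconds).
# The window 0-60 is an assumption made for this sketch, not a recommended value.
search.section <- act::search_new(
  examplecorpus,
  pattern = "yo",
  filterSectionStartsec = 0,
  filterSectionEndsec = 60
)

# Compare the number of results with and without the section filter.
search.all@results.nr
search.section@results.nr
```

The transcript and tier filters can be combined with the section filter in the same call; they are omitted here because the transcript and tier names depend on the corpus.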
Search object.
search_run, search_makefilter, search_sub
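A search object can also be created first and run later. The sketch below assumes that `search_run` takes the corpus object as `x` and the search object as `s` and returns the updated search object; check the `search_run` help page before relying on these argument names.

```r
library(act)

# Create the search object without executing the search.
mysearch <- act::search_new(examplecorpus, pattern = "yo", runSearch = FALSE)

# Run the search later. The argument names 'x' and 's' are assumed here.
mysearch <- act::search_run(x = examplecorpus, s = mysearch)

# Number of results after the deferred run.
mysearch@results.nr
```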
library(act)
# Search for the first person singular pronoun in Spanish.
mysearch <- act::search_new(examplecorpus, pattern= "yo")
mysearch
# Search in normalized content vs. original content
mysearch.norm <- act::search_new(examplecorpus, pattern="yo", searchNormalized=TRUE)
mysearch.org <- act::search_new(examplecorpus, pattern="yo", searchNormalized=FALSE)
mysearch.norm@results.nr
mysearch.org@results.nr
# The difference is because during normalization capital letters will be converted
# to small letters. One annotation in the example corpus contains a "yo" with a
# capital letter:
mysearch <- act::search_new(examplecorpus, pattern="yO", searchNormalized=FALSE)
mysearch@results$hit
# Search in full text vs. original content.
# Full text search will find matches across annotations.
# Let's define a regular expression with a certain span.
# Search for the word "no" ('no') followed by "pero" ('but')
# at a distance of 1 to 20 characters.
myRegEx <- "\\bno\\b.{1,20}pero"
mysearch <- act::search_new(examplecorpus, pattern=myRegEx, searchMode="fulltext")
mysearch
mysearch@results$hit
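To see why a span pattern like this one only matches across annotations in full text mode, here is a base R sketch that does not use the package. The two annotation strings are invented for illustration, and joining them with a single space is an assumption; the way the package builds its full text internally may differ.

```r
# Same span pattern as in the example above.
myRegEx <- "\\bno\\b.{1,20}pero"

# Two invented annotations; neither contains both words.
annotations <- c("no lo creo", "pero bueno")

# Searching each annotation separately finds nothing.
any(grepl(myRegEx, annotations, perl = TRUE))

# Joining the annotations into one string (assumed separator: a space)
# lets the pattern span the annotation boundary.
fulltext <- paste(annotations, collapse = " ")
hit <- regmatches(fulltext, regexpr(myRegEx, fulltext, perl = TRUE))
hit

# A gap of more than 20 characters between the two words does not match.
grepl(myRegEx, "no me acuerdo de nada de eso, pero", perl = TRUE)
```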