Man pages for textpress
A Lightweight and Versatile NLP Toolkit

abbreviations               Common Abbreviations for Sentence Splitting
api_huggingface_embeddings  Call Hugging Face API for Embeddings
dot-decode_duckduckgo_urls  Decode DuckDuckGo Redirect URLs
dot-extract_links           Extract links from a search engine result page
dot-get_site                Get Site Content and Extract HTML Elements
dot-insert_highlight        Insert Highlight in Text
dot-process_bing            Process Bing search results
dot-process_duckduckgo      Process DuckDuckGo search results
dot-process_yahoo           Process Yahoo News search results
dot-translate_query         Translate Search Query
extract_date                Extract Date from HTML Content
nlp_build_chunks            Build Chunks for NLP Analysis
nlp_cast_tokens             Convert Token List to Data Frame
nlp_melt_tokens             Tokenize Data Frame by Specified Column(s)
nlp_split_paragraphs        Split Text into Paragraphs
nlp_split_sentences         Split Text into Sentences
nlp_tokenize_text           Tokenize Text Data (mostly) Non-Destructively
sem_nearest_neighbors       Find Nearest Neighbors Based on Cosine Similarity
sem_search_corpus           NLP Search Corpus
standardize_date            Standardize Date Format
textpress-package           textpress: A Lightweight and Versatile NLP Toolkit
web_scrape_urls             Scrape News Data from Various Sources
web_search                  Process search results from multiple search engines
textpress documentation built on Oct. 14, 2024, 5:08 p.m.