sparklyr: R Interface to Apache Spark

R interface to Apache Spark, a fast and general engine for big data processing, see <http://spark.apache.org>. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.
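
The three capabilities named above (connecting, the 'dplyr' back-end, and the machine learning interface) can be sketched roughly as follows. This is a minimal sketch, not taken from the package documentation: it assumes a local Spark installation is already available (for instance via `spark_install()`), uses the built-in `mtcars` data set purely for illustration, and follows the argument names of the 0.5.x API (`response`, `features`), which may differ in other versions.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is installed locally; "local" starts a single-node instance.
sc <- spark_connect(master = "local")

# Copy an R data frame into Spark; the table name "mtcars" is illustrative.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars")

# dplyr verbs are translated to Spark SQL and run in Spark;
# collect() brings the (small) result back into R.
mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg)) %>%
  collect()

# Fit a linear regression with Spark ML (0.5.x-style arguments).
fit <- ml_linear_regression(mtcars_tbl,
                            response = "mpg",
                            features = c("wt", "cyl"))
summary(fit)

# Close the connection when done.
spark_disconnect(sc)
```

Every function used here (`spark_connect`, `copy_to`, `ml_linear_regression`, `spark_disconnect`) appears in the index below.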

Author: Javier Luraschi [aut, cre], Kevin Ushey [aut], JJ Allaire [aut], RStudio [cph], The Apache Software Foundation [aut, cph]
Date of publication: 2017-02-16 22:40:14
Maintainer: Javier Luraschi <javier@rstudio.com>
License: Apache License 2.0 | file LICENSE
Version: 0.5.2
URL: http://spark.rstudio.com

Man pages

compile_package_jars: Compile Scala sources into a Java Archive (jar)

connection_config: Read configuration values for a connection

connection_is_open: Check whether the connection is open

copy_to.spark_connection: Copy an R Data Frame to Spark

DBISparkResult-class: DBI Spark Result.

ensure: Enforce Specific Structure for R Objects

find_scalac: Discover the Scala Compiler

ft_binarizer: Feature Transformation - Binarizer

ft_bucketizer: Feature Transformation - Bucketizer

ft_discrete_cosine_transform: Feature Transformation - Discrete Cosine Transform (DCT)

ft_elementwise_product: Feature Transformation - ElementwiseProduct

ft_index_to_string: Feature Transformation - IndexToString

ft_one_hot_encoder: Feature Transformation - OneHotEncoder

ft_quantile_discretizer: Feature Transformation - QuantileDiscretizer

ft_regex_tokenizer: Feature Transformation - RegexTokenizer

ft_sql_transformer: Feature Transformation - SQLTransformer

ft_string_indexer: Feature Transformation - StringIndexer

ft_tokenizer: Feature Transformation - Tokenizer

ft_vector_assembler: Feature Transformation - VectorAssembler

invoke: Invoke a Method on a JVM Object

invoke_method: Generic call interface for spark shell

livy_config: Create a Spark Configuration for Livy

livy_install: Install Livy

livy_service: Start Livy

ml_als_factorization: Spark ML - Alternating Least Squares (ALS) matrix...

ml_binary_classification_eval: Spark ML - Binary Classification Evaluator

ml_classification_eval: Spark ML - Classification Evaluator

ml_create_dummy_variables: Create Dummy Variables

ml_decision_tree: Spark ML - Decision Trees

ml_generalized_linear_regression: Spark ML - Generalized Linear Regression

ml_gradient_boosted_trees: Spark ML - Gradient-Boosted Tree

ml_kmeans: Spark ML - K-Means Clustering

ml_lda: Spark ML - Latent Dirichlet Allocation

ml_linear_regression: Spark ML - Linear Regression

ml_logistic_regression: Spark ML - Logistic Regression

ml_model: Create an ML Model Object

ml_multilayer_perceptron: Spark ML - Multilayer Perceptron

ml_naive_bayes: Spark ML - Naive-Bayes

ml_one_vs_rest: Spark ML - One vs Rest

ml_options: Options for Spark ML Routines

ml_pca: Spark ML - Principal Components Analysis

ml_prepare_dataframe: Prepare a Spark DataFrame for Spark ML Routines

ml_prepare_inputs: Pre-process the Inputs to a Spark ML Routine

ml_random_forest: Spark ML - Random Forests

ml_saveload: Save / Load a Spark ML Model Fit

ml_survival_regression: Spark ML - Survival Regression

ml_tree_feature_importance: Spark ML - Feature Importance for Tree Models

na.replace: Replace Missing Values in Objects

pipe: Pipe operator

print_jobj: Generic method for print jobj for a connection type

reexports: Objects exported from other packages

register_extension: Register a Package that Implements a Spark Extension

sdf_copy_to: Copy an Object into Spark

sdf_mutate: Mutate a Spark DataFrame

sdf_partition: Partition a Spark Dataframe

sdf_persist: Persist a Spark DataFrame

sdf_predict: Model Predictions with Spark DataFrames

sdf_quantile: Compute (Approximate) Quantiles with a Spark DataFrame

sdf_read_column: Read a Column from a Spark DataFrame

sdf_register: Register a Spark DataFrame

sdf_sample: Randomly Sample Rows from a Spark DataFrame

sdf-saveload: Save / Load a Spark DataFrame

sdf_schema: Read the Schema of a Spark DataFrame

sdf_sort: Sort a Spark DataFrame

sdf_with_unique_id: Add a Unique ID Column to a Spark DataFrame

spark-api: Access the Spark API

spark_compilation_spec: Define a Spark Compilation Specification

spark_compile: Compile Scala sources into a Java Archive

spark_config: Read Spark Configuration

spark_connection: Retrieve the Spark Connection Associated with an R Object

spark-connections: Manage Spark Connections

spark_dataframe: Retrieve a Spark DataFrame

spark_default_compilation_spec: Default Compilation Specification for Spark Extensions

spark_dependency: Define a Spark dependency

spark_home_dir: Find the SPARK_HOME directory for a version of Spark

spark_install: Download and install various versions of Spark

spark_jobj: Retrieve a Spark JVM Object Reference

spark_load_table: Load a Spark Table into a Spark DataFrame.

spark_log: View Entries in the Spark Log

spark_read_csv: Read a CSV file into a Spark DataFrame

spark_read_json: Read a JSON file into a Spark DataFrame

spark_read_parquet: Read a Parquet file into a Spark DataFrame

spark_save_table: Saves a Spark DataFrame as a Spark table

spark_version: Get the Spark Version Associated with a Spark Connection

spark_version_from_home: Get the Spark Version Associated with a Spark Installation

spark_web: Open the Spark web interface

spark_write_csv: Write a Spark DataFrame to a CSV

spark_write_json: Write a Spark DataFrame to a JSON file

spark_write_parquet: Write a Spark DataFrame to a Parquet file

tbl_cache: Cache a Spark Table

tbl_uncache: Uncache a Spark Table

Functions

%>% Man page
compile_package_jars Man page
connection_config Man page
connection_is_open Man page
copy_to Man page
copy_to.spark_connection Man page
DBISparkResult-class Man page
ensure Man page
ensure_scalar_boolean Man page
ensure_scalar_character Man page
ensure_scalar_double Man page
ensure_scalar_integer Man page
find_scalac Man page
ft_binarizer Man page
ft_bucketizer Man page
ft_discrete_cosine_transform Man page
ft_elementwise_product Man page
ft_index_to_string Man page
ft_one_hot_encoder Man page
ft_quantile_discretizer Man page
ft_regex_tokenizer Man page
ft_sql_transformer Man page
ft_string_indexer Man page
ft_tokenizer Man page
ft_vector_assembler Man page
hive_context Man page
invoke Man page
invoke_method Man page
invoke_new Man page
invoke_static Man page
java_context Man page
livy_available_versions Man page
livy_config Man page
livy_home_dir Man page
livy_install Man page
livy_install_dir Man page
livy_installed_versions Man page
livy_service_start Man page
livy_service_stop Man page
ml_als_factorization Man page
ml_binary_classification_eval Man page
ml_classification_eval Man page
ml_create_dummy_variables Man page
ml_decision_tree Man page
ml_generalized_linear_regression Man page
ml_gradient_boosted_trees Man page
ml_kmeans Man page
ml_lda Man page
ml_linear_regression Man page
ml_load Man page
ml_logistic_regression Man page
ml_model Man page
ml_multilayer_perceptron Man page
ml_naive_bayes Man page
ml_one_vs_rest Man page
ml_options Man page
ml_pca Man page
ml_prepare_dataframe Man page
ml_prepare_features Man page
ml_prepare_inputs Man page
ml_prepare_response_features_intercept Man page
ml_random_forest Man page
ml_save Man page
ml_saveload Man page
ml_survival_regression Man page
ml_tree_feature_importance Man page
na.replace Man page
print_jobj Man page
reexports Man page
registered_extensions Man page
register_extension Man page
sdf_copy_to Man page
sdf_import Man page
sdf_load_parquet Man page
sdf_load_table Man page
sdf_mutate Man page
sdf_mutate_ Man page
sdf_partition Man page
sdf_persist Man page
sdf_predict Man page
sdf_quantile Man page
sdf_read_column Man page
sdf_register Man page
sdf_sample Man page
sdf-saveload Man page
sdf_save_parquet Man page
sdf_save_table Man page
sdf_schema Man page
sdf_sort Man page
sdf_with_unique_id Man page
spark-api Man page
spark_available_versions Man page
spark_compilation_spec Man page
spark_compile Man page
spark_config Man page
spark_connect Man page
spark_connection Man page
spark_connection_is_open Man page
spark-connections Man page
spark_context Man page
spark_dataframe Man page
spark_default_compilation_spec Man page
spark_dependency Man page
spark_disconnect Man page
spark_disconnect_all Man page
spark_home_dir Man page
spark_install Man page
spark_install_dir Man page
spark_installed_versions Man page
spark_install_tar Man page
spark_jobj Man page
spark_load_table Man page
spark_log Man page
spark_read_csv Man page
spark_read_json Man page
spark_read_parquet Man page
spark_save_table Man page
spark_session Man page
spark_uninstall Man page
spark_version Man page
spark_version_from_home Man page
spark_web Man page
spark_write_csv Man page
spark_write_json Man page
spark_write_parquet Man page
tbl_cache Man page
tbl_uncache Man page

Files

sparklyr
sparklyr/inst
sparklyr/inst/staticdocs
sparklyr/inst/staticdocs/index.r
sparklyr/inst/conf
sparklyr/inst/conf/config-template.yml
sparklyr/inst/java
sparklyr/inst/java/sparklyr-2.0-2.11.jar
sparklyr/inst/java/spark-csv_2.11-1.3.0.jar
sparklyr/inst/java/sparklyr-1.6-2.10.jar
sparklyr/inst/java/commons-csv-1.1.jar
sparklyr/inst/java/sparklyr-1.5-2.10.jar
sparklyr/inst/java/univocity-parsers-1.5.1.jar
sparklyr/inst/livy
sparklyr/inst/livy/livyutils.scala
sparklyr/inst/livy/stream.scala
sparklyr/inst/livy/utils.scala
sparklyr/inst/livy/tracker.scala
sparklyr/inst/livy/logging.scala
sparklyr/inst/livy/invoke.scala
sparklyr/inst/livy/sqlutils.scala
sparklyr/inst/livy/serializer.scala
sparklyr/inst/extdata
sparklyr/inst/extdata/install_spark.csv
sparklyr/tests
sparklyr/tests/testthat
sparklyr/tests/testthat/test-install-spark.R
sparklyr/tests/testthat/test-read-write.R
sparklyr/tests/testthat/test-dplyr-do.R
sparklyr/tests/testthat/test-ml-linear-regression.R
sparklyr/tests/testthat/test-feature-transformers.R
sparklyr/tests/testthat/test-ml-kmeans.R
sparklyr/tests/testthat/test-ml-generalized-linear-regression.R
sparklyr/tests/testthat/test-ml-saveload.R
sparklyr/tests/testthat/test-naive-bayes.R
sparklyr/tests/testthat/test.csv
sparklyr/tests/testthat/test-serialization.R
sparklyr/tests/testthat/helper-initialize.R
sparklyr/NAMESPACE
sparklyr/NEWS.md
sparklyr/R
sparklyr/R/test_connection.R
sparklyr/R/dplyr_spark_connection.R
sparklyr/R/sdf_saveload.R
sparklyr/R/ml_classification_evaluators.R
sparklyr/R/spark_serialize.R
sparklyr/R/data_csv.R
sparklyr/R/spark_globals.R
sparklyr/R/utils.R
sparklyr/R/connection_spark.R
sparklyr/R/ml_kmeans.R
sparklyr/R/connection_instances.R
sparklyr/R/install_spark.R
sparklyr/R/ml_interface.R
sparklyr/R/ml_backwards_compatibility.R
sparklyr/R/spark_version.R
sparklyr/R/config_spark.R
sparklyr/R/livy_install.R
sparklyr/R/ml_logistic_regression.R
sparklyr/R/spark_dataframe.R
sparklyr/R/dplyr_spark_table.R
sparklyr/R/na_actions.R
sparklyr/R/spark_shell.R
sparklyr/R/dplyr_spark.R
sparklyr/R/mutation.R
sparklyr/R/ml_feature_transformation.R
sparklyr/R/dplyr_sql.R
sparklyr/R/ml_lda.R
sparklyr/R/ml_survival_regression.R
sparklyr/R/data_interface.R
sparklyr/R/livy_invoke.R
sparklyr/R/dbi_spark_transactions.R
sparklyr/R/ml_gradient_boosted_tree.R
sparklyr/R/ml_utils.R
sparklyr/R/spark_jobj.R
sparklyr/R/dplyr_do.R
sparklyr/R/tbl_spark.R
sparklyr/R/spark_compile.R
sparklyr/R/ml_alternating_least_squares.R
sparklyr/R/livy_connection.R
sparklyr/R/precondition.R
sparklyr/R/install_spark_versions.R
sparklyr/R/spark_hive.R
sparklyr/R/sdf_sql.R
sparklyr/R/spark_connection.R
sparklyr/R/data_copy.R
sparklyr/R/reexports.R
sparklyr/R/ml_decision_tree.R
sparklyr/R/ml_pca.R
sparklyr/R/sdf_wrapper.R
sparklyr/R/connection_viewer.R
sparklyr/R/ml_generalized_linear_regression.R
sparklyr/R/ml_saveload.R
sparklyr/R/ml_options.R
sparklyr/R/tables_spark.R
sparklyr/R/ml_random_forest.R
sparklyr/R/ml_multilayer_perceptron.R
sparklyr/R/ml_one_vs_rest.R
sparklyr/R/livy_service.R
sparklyr/R/spark_invoke.R
sparklyr/R/dbi_spark_result.R
sparklyr/R/dbi_spark_table.R
sparklyr/R/livy_sources.R
sparklyr/R/spark_deserialize.R
sparklyr/R/sdf_interface.R
sparklyr/R/spark_magrittr.R
sparklyr/R/dbi_spark_query.R
sparklyr/R/dbi_spark_connection.R
sparklyr/R/dplyr_spark_data.R
sparklyr/R/ml_feature_transformation_utils.R
sparklyr/R/ml_linear_regression.R
sparklyr/R/ml_model_print_methods.R
sparklyr/R/zzz.R
sparklyr/R/imports.R
sparklyr/R/spark_gateway.R
sparklyr/R/formulas.R
sparklyr/R/ml_naive_bayes.R
sparklyr/R/connection_windows.R
sparklyr/R/spark_extensions.R
sparklyr/README.md
sparklyr/MD5
sparklyr/java
sparklyr/java/backend.scala
sparklyr/java/stream.scala
sparklyr/java/utils.scala
sparklyr/java/handler.scala
sparklyr/java/tracker.scala
sparklyr/java/logging.scala
sparklyr/java/invoke.scala
sparklyr/java/sqlutils.scala
sparklyr/java/serializer.scala
sparklyr/DESCRIPTION
sparklyr/man
sparklyr/man/spark_read_json.Rd
sparklyr/man/DBISparkResult-class.Rd
sparklyr/man/sdf_quantile.Rd
sparklyr/man/spark_version.Rd
sparklyr/man/sdf-saveload.Rd
sparklyr/man/ml_generalized_linear_regression.Rd
sparklyr/man/connection_is_open.Rd
sparklyr/man/ml_lda.Rd
sparklyr/man/ft_vector_assembler.Rd
sparklyr/man/ml_linear_regression.Rd
sparklyr/man/register_extension.Rd
sparklyr/man/ml_binary_classification_eval.Rd
sparklyr/man/pipe.Rd
sparklyr/man/ml_random_forest.Rd
sparklyr/man/spark_default_compilation_spec.Rd
sparklyr/man/ml_decision_tree.Rd
sparklyr/man/ml_model.Rd
sparklyr/man/livy_config.Rd
sparklyr/man/spark_jobj.Rd
sparklyr/man/ml_multilayer_perceptron.Rd
sparklyr/man/sdf_sort.Rd
sparklyr/man/livy_service.Rd
sparklyr/man/spark_version_from_home.Rd
sparklyr/man/ft_sql_transformer.Rd
sparklyr/man/ml_kmeans.Rd
sparklyr/man/spark_config.Rd
sparklyr/man/invoke_method.Rd
sparklyr/man/ft_elementwise_product.Rd
sparklyr/man/sdf_sample.Rd
sparklyr/man/compile_package_jars.Rd
sparklyr/man/ft_tokenizer.Rd
sparklyr/man/spark-connections.Rd
sparklyr/man/invoke.Rd
sparklyr/man/spark_log.Rd
sparklyr/man/ml_prepare_inputs.Rd
sparklyr/man/ft_quantile_discretizer.Rd
sparklyr/man/sdf_schema.Rd
sparklyr/man/spark-api.Rd
sparklyr/man/ml_one_vs_rest.Rd
sparklyr/man/spark_write_parquet.Rd
sparklyr/man/ensure.Rd
sparklyr/man/spark_install.Rd
sparklyr/man/connection_config.Rd
sparklyr/man/find_scalac.Rd
sparklyr/man/ft_bucketizer.Rd
sparklyr/man/print_jobj.Rd
sparklyr/man/spark_dependency.Rd
sparklyr/man/sdf_read_column.Rd
sparklyr/man/ml_prepare_dataframe.Rd
sparklyr/man/livy_install.Rd
sparklyr/man/tbl_cache.Rd
sparklyr/man/spark_web.Rd
sparklyr/man/spark_dataframe.Rd
sparklyr/man/ft_one_hot_encoder.Rd
sparklyr/man/ft_regex_tokenizer.Rd
sparklyr/man/spark_read_parquet.Rd
sparklyr/man/sdf_partition.Rd
sparklyr/man/sdf_predict.Rd
sparklyr/man/reexports.Rd
sparklyr/man/copy_to.spark_connection.Rd
sparklyr/man/spark_load_table.Rd
sparklyr/man/spark_write_json.Rd
sparklyr/man/spark_home_dir.Rd
sparklyr/man/ml_survival_regression.Rd
sparklyr/man/sdf_with_unique_id.Rd
sparklyr/man/spark_read_csv.Rd
sparklyr/man/spark_connection.Rd
sparklyr/man/ml_options.Rd
sparklyr/man/spark_write_csv.Rd
sparklyr/man/ft_string_indexer.Rd
sparklyr/man/sdf_register.Rd
sparklyr/man/ml_pca.Rd
sparklyr/man/ml_naive_bayes.Rd
sparklyr/man/ml_create_dummy_variables.Rd
sparklyr/man/ml_logistic_regression.Rd
sparklyr/man/ml_als_factorization.Rd
sparklyr/man/ml_gradient_boosted_trees.Rd
sparklyr/man/ft_discrete_cosine_transform.Rd
sparklyr/man/na.replace.Rd
sparklyr/man/ml_classification_eval.Rd
sparklyr/man/spark_compile.Rd
sparklyr/man/sdf_copy_to.Rd
sparklyr/man/ft_index_to_string.Rd
sparklyr/man/spark_save_table.Rd
sparklyr/man/tbl_uncache.Rd
sparklyr/man/sdf_mutate.Rd
sparklyr/man/ml_tree_feature_importance.Rd
sparklyr/man/ft_binarizer.Rd
sparklyr/man/ml_saveload.Rd
sparklyr/man/spark_compilation_spec.Rd
sparklyr/man/sdf_persist.Rd
sparklyr/LICENSE