sparklyr: R Interface to Apache Spark

Provision, connect and interface to Apache Spark from within R. This package supports connecting to local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end, and provides an interface to Spark's built-in machine learning algorithms.

Author: Javier Luraschi [aut, cre], Kevin Ushey [aut], JJ Allaire [aut], RStudio [cph], The Apache Software Foundation [aut, cph]
Date of publication: 2016-09-24 22:40:18
Maintainer: Javier Luraschi <javier@rstudio.com>
License: Apache License 2.0 | file LICENSE
Version: 0.4

Man pages

compile_package_jars
Compile Scala sources into a Java Archive (jar)
connection_config
Read configuration values for a connection
connection_is_open
Check whether the connection is open
copy_to
Copy a local R data frame to Spark
DBISparkResult-class
DBI Spark Result
ensure
Enforce Specific Structure for R Objects
find_scalac
Discover the Scala Compiler
ft_binarizer
Feature Transformation - Binarizer
ft_bucketizer
Feature Transformation - Bucketizer
ft_discrete_cosine_transform
Feature Transformation - Discrete Cosine Transform (DCT)
ft_elementwise_product
Feature Transformation - ElementwiseProduct
ft_index_to_string
Feature Transformation - IndexToString
ft_one_hot_encoder
Feature Transformation - OneHotEncoder
ft_quantile_discretizer
Feature Transformation - QuantileDiscretizer
ft_sql_transformer
Feature Transformation - SQLTransformer
ft_string_indexer
Feature Transformation - StringIndexer
ft_vector_assembler
Feature Transformation - VectorAssembler
hive_context
Get the HiveContext associated with a connection
invoke
Execute a method on a remote Java object
invoke_method
Generic call interface for spark shell
java_context
Get the JavaSparkContext associated with a connection
ml_als_factorization
Spark ML - Alternating Least Squares (ALS) matrix...
ml_binary_classification_eval
Spark ML - Binary Classification Evaluator
ml_classification_eval
Spark ML - Classification Evaluator
ml_create_dummy_variables
Create Dummy Variables
ml_decision_tree
Spark ML - Decision Trees
ml_generalized_linear_regression
Spark ML - Generalized Linear Regression
ml_gradient_boosted_trees
Spark ML - Gradient-Boosted Tree
ml_kmeans
Spark ML - K-Means Clustering
ml_lda
Spark ML - Latent Dirichlet Allocation
ml_linear_regression
Spark ML - Linear Regression
ml_logistic_regression
Spark ML - Logistic Regression
ml_model
Create an ML Model Object
ml_multilayer_perceptron
Spark ML - Multilayer Perceptron
ml_naive_bayes
Spark ML - Naive-Bayes
ml_one_vs_rest
Spark ML - One vs Rest
ml_options
Provide Options for Spark.ML Routines
ml_pca
Spark ML - Principal Components Analysis
ml_prepare_dataframe
Prepare a Spark DataFrame for Spark ML Routines
ml_prepare_inputs
Pre-process the Inputs to a Spark ML Routine
ml_random_forest
Spark ML - Random Forests
ml_saveload
Save / Load a Spark ML Model Fit
ml_survival_regression
Spark ML - Survival Regression
ml_tree_feature_importance
Spark ML - Feature Importance for Tree Models
na.replace
Replace Missing Values in Objects
pipe
Pipe operator
print_jobj
Generic method for print jobj for a connection type
register_extension
Register a package that implements a Spark extension
sdf_copy_to
Copy an Object into Spark
sdf_mutate
Mutate a Spark DataFrame
sdf_partition
Partition a Spark DataFrame
sdf_predict
Model Predictions with Spark DataFrames
sdf_read_column
Read a Column from a Spark DataFrame
sdf_register
Register a Spark DataFrame
sdf_sample
Randomly Sample Rows from a Spark DataFrame
sdf-saveload
Save / Load a Spark DataFrame
sdf_sort
Sort a Spark DataFrame
sdf_with_unique_id
Add a Unique ID Column to a Spark DataFrame
spark_compilation_spec
Define a Spark Compilation Specification
spark_compile
Compile Scala sources into a Java Archive (jar)
spark_config
Read Spark Configuration
spark_connect
Connect to Spark
spark_connection
Get the spark_connection associated with an object
spark_connection_is_open
Check if a Spark connection is open
spark_context
Get the SparkContext associated with a connection
spark_dataframe
Get the Spark DataFrame associated with an object
spark_default_compilation_spec
Default Compilation Specification for Spark Extensions
spark_dependency
Define a Spark dependency
spark_disconnect
Disconnect from Spark
spark_home_dir
Find the SPARK_HOME directory for a version of Spark
spark_install
Download and install various versions of Spark
spark_jobj
Get the spark_jobj associated with an object
spark_log
Retrieves entries from the Spark log
spark_read_csv
Read a CSV file into a Spark DataFrame
spark_read_json
Read a JSON file into a Spark DataFrame
spark_read_parquet
Read a Parquet file into a Spark DataFrame
spark_session
Get the Spark Session associated with a connection
spark_version
Version of Spark for a connection
spark_web
Open the Spark web interface
spark_write_csv
Write a Spark DataFrame to a CSV
spark_write_json
Write a Spark DataFrame to a JSON file
spark_write_parquet
Write a Spark DataFrame to a Parquet file
tbl_cache
Load a table into memory
tbl_uncache
Unload table from memory

Files in this package

sparklyr
sparklyr/inst
sparklyr/inst/staticdocs
sparklyr/inst/staticdocs/index.r
sparklyr/inst/conf
sparklyr/inst/conf/config-template.yml
sparklyr/inst/java
sparklyr/inst/java/sparklyr-2.0-2.11.jar
sparklyr/inst/java/sparklyr-1.6-2.10.jar
sparklyr/inst/extdata
sparklyr/inst/extdata/install_spark.csv
sparklyr/tests
sparklyr/tests/testthat
sparklyr/tests/testthat/test-install-spark.R
sparklyr/tests/testthat/test-ml-linear-regression.R
sparklyr/tests/testthat/test-ml-kmeans.R
sparklyr/tests/testthat/test-ml-generalized-linear-regression.R
sparklyr/tests/testthat/test-ml-saveload.R
sparklyr/tests/testthat/derby.log
sparklyr/tests/testthat/test-serialization.R
sparklyr/tests/testthat/helper-initialize.R
sparklyr/NAMESPACE
sparklyr/R
sparklyr/R/dplyr_spark_connection.R
sparklyr/R/sdf_saveload.R
sparklyr/R/ml_classification_evaluators.R
sparklyr/R/spark_serialize.R
sparklyr/R/data_csv.R
sparklyr/R/spark_globals.R
sparklyr/R/utils.R
sparklyr/R/connection_spark.R
sparklyr/R/ml_kmeans.R
sparklyr/R/connection_instances.R
sparklyr/R/install_spark.R
sparklyr/R/ml_interface.R
sparklyr/R/ml_backwards_compatibility.R
sparklyr/R/spark_version.R
sparklyr/R/config_spark.R
sparklyr/R/ml_logistic_regression.R
sparklyr/R/spark_dataframe.R
sparklyr/R/dplyr_spark_table.R
sparklyr/R/spark_shell.R
sparklyr/R/dplyr_spark.R
sparklyr/R/mutation.R
sparklyr/R/ml_feature_transformation.R
sparklyr/R/dplyr_sql.R
sparklyr/R/ml_lda.R
sparklyr/R/ml_survival_regression.R
sparklyr/R/data_interface.R
sparklyr/R/dbi_spark_transactions.R
sparklyr/R/ml_gradient_boosted_tree.R
sparklyr/R/ml_utils.R
sparklyr/R/spark_jobj.R
sparklyr/R/spark_compile.R
sparklyr/R/ml_alternating_least_squares.R
sparklyr/R/precondition.R
sparklyr/R/install_spark_versions.R
sparklyr/R/spark_hive.R
sparklyr/R/sdf_sql.R
sparklyr/R/spark_connection.R
sparklyr/R/data_copy.R
sparklyr/R/reexports.R
sparklyr/R/ml_decision_tree.R
sparklyr/R/ml_pca.R
sparklyr/R/sdf_wrapper.R
sparklyr/R/connection_viewer.R
sparklyr/R/ml_generalized_linear_regression.R
sparklyr/R/ml_saveload.R
sparklyr/R/ml_options.R
sparklyr/R/ml_random_forest.R
sparklyr/R/ml_multilayer_perceptron.R
sparklyr/R/ml_one_vs_rest.R
sparklyr/R/spark_invoke.R
sparklyr/R/dbi_spark_result.R
sparklyr/R/dbi_spark_table.R
sparklyr/R/spark_deserialize.R
sparklyr/R/sdf_interface.R
sparklyr/R/spark_magrittr.R
sparklyr/R/dbi_spark_query.R
sparklyr/R/dbi_spark_connection.R
sparklyr/R/dplyr_spark_data.R
sparklyr/R/ml_linear_regression.R
sparklyr/R/ml_model_print_methods.R
sparklyr/R/imports.R
sparklyr/R/formulas.R
sparklyr/R/ml_naive_bayes.R
sparklyr/R/connection_windows.R
sparklyr/R/spark_extensions.R
sparklyr/README.md
sparklyr/MD5
sparklyr/java
sparklyr/java/backend.scala
sparklyr/java/utils.scala
sparklyr/java/handler.scala
sparklyr/java/logging.scala
sparklyr/java/sqlutils.scala
sparklyr/java/serializer.scala
sparklyr/DESCRIPTION
sparklyr/man
sparklyr/man/spark_read_json.Rd
sparklyr/man/DBISparkResult-class.Rd
sparklyr/man/spark_version.Rd
sparklyr/man/sdf-saveload.Rd
sparklyr/man/ml_generalized_linear_regression.Rd
sparklyr/man/connection_is_open.Rd
sparklyr/man/ml_lda.Rd
sparklyr/man/ft_vector_assembler.Rd
sparklyr/man/ml_linear_regression.Rd
sparklyr/man/register_extension.Rd
sparklyr/man/ml_binary_classification_eval.Rd
sparklyr/man/pipe.Rd
sparklyr/man/ml_random_forest.Rd
sparklyr/man/spark_default_compilation_spec.Rd
sparklyr/man/ml_decision_tree.Rd
sparklyr/man/ml_model.Rd
sparklyr/man/spark_jobj.Rd
sparklyr/man/spark_context.Rd
sparklyr/man/ml_multilayer_perceptron.Rd
sparklyr/man/sdf_sort.Rd
sparklyr/man/ft_sql_transformer.Rd
sparklyr/man/ml_kmeans.Rd
sparklyr/man/spark_config.Rd
sparklyr/man/invoke_method.Rd
sparklyr/man/ft_elementwise_product.Rd
sparklyr/man/sdf_sample.Rd
sparklyr/man/compile_package_jars.Rd
sparklyr/man/invoke.Rd
sparklyr/man/spark_log.Rd
sparklyr/man/ml_prepare_inputs.Rd
sparklyr/man/ft_quantile_discretizer.Rd
sparklyr/man/ml_one_vs_rest.Rd
sparklyr/man/spark_write_parquet.Rd
sparklyr/man/ensure.Rd
sparklyr/man/spark_install.Rd
sparklyr/man/connection_config.Rd
sparklyr/man/find_scalac.Rd
sparklyr/man/ft_bucketizer.Rd
sparklyr/man/print_jobj.Rd
sparklyr/man/spark_dependency.Rd
sparklyr/man/sdf_read_column.Rd
sparklyr/man/ml_prepare_dataframe.Rd
sparklyr/man/tbl_cache.Rd
sparklyr/man/spark_web.Rd
sparklyr/man/spark_dataframe.Rd
sparklyr/man/spark_connect.Rd
sparklyr/man/copy_to.Rd
sparklyr/man/ft_one_hot_encoder.Rd
sparklyr/man/spark_read_parquet.Rd
sparklyr/man/sdf_partition.Rd
sparklyr/man/sdf_predict.Rd
sparklyr/man/hive_context.Rd
sparklyr/man/spark_write_json.Rd
sparklyr/man/spark_home_dir.Rd
sparklyr/man/spark_session.Rd
sparklyr/man/ml_survival_regression.Rd
sparklyr/man/sdf_with_unique_id.Rd
sparklyr/man/spark_read_csv.Rd
sparklyr/man/spark_connection.Rd
sparklyr/man/ml_options.Rd
sparklyr/man/spark_connection_is_open.Rd
sparklyr/man/spark_write_csv.Rd
sparklyr/man/ft_string_indexer.Rd
sparklyr/man/sdf_register.Rd
sparklyr/man/ml_pca.Rd
sparklyr/man/ml_naive_bayes.Rd
sparklyr/man/ml_create_dummy_variables.Rd
sparklyr/man/ml_logistic_regression.Rd
sparklyr/man/ml_als_factorization.Rd
sparklyr/man/ml_gradient_boosted_trees.Rd
sparklyr/man/ft_discrete_cosine_transform.Rd
sparklyr/man/na.replace.Rd
sparklyr/man/ml_classification_eval.Rd
sparklyr/man/spark_compile.Rd
sparklyr/man/sdf_copy_to.Rd
sparklyr/man/ft_index_to_string.Rd
sparklyr/man/java_context.Rd
sparklyr/man/spark_disconnect.Rd
sparklyr/man/tbl_uncache.Rd
sparklyr/man/sdf_mutate.Rd
sparklyr/man/ml_tree_feature_importance.Rd
sparklyr/man/ft_binarizer.Rd
sparklyr/man/ml_saveload.Rd
sparklyr/man/spark_compilation_spec.Rd
sparklyr/LICENSE