Avro processing functions defined for Column.
x                   Column to compute on.
...                 Additional argument(s) passed as parser options.
jsonFormatSchema    Character. Avro schema in JSON string format.
from_avro
Converts a binary column of Avro format into its corresponding catalyst value.
The specified schema must match the data being read; otherwise the behavior is undefined:
it may fail or return an arbitrary result.
To deserialize the data with a compatible and evolved schema, the expected Avro schema can be
set via the option avroSchema.
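As a rough sketch of the avroSchema option described above: named arguments after jsonFormatSchema are passed as parser options, so an evolved schema can be supplied alongside the schema the data was written with. The record, its field names, and the helper read_evolved are hypothetical, and calling the helper assumes that SparkR is attached, a session is active, and the Avro module is available.

```r
# Schema the data was written with (hypothetical record with one field).
writer_schema <- paste0(
  '{"type": "record", "name": "User", "fields": [',
  '{"name": "name", "type": "string"}]}'
)

# Evolved, compatible schema: adds an optional field with a default value.
reader_schema <- paste0(
  '{"type": "record", "name": "User", "fields": [',
  '{"name": "name", "type": "string"},',
  '{"name": "email", "type": ["null", "string"], "default": null}]}'
)

# Hypothetical helper: decode the binary Avro column `col_name` of `df`,
# reading data written with writer_schema as reader_schema.
# avroSchema is forwarded to from_avro as a parser option.
read_evolved <- function(df, col_name) {
  select(
    df,
    from_avro(column(col_name), writer_schema, avroSchema = reader_schema)
  )
}
```

Given a SparkDataFrame with a binary column named payload, read_evolved(df_serialized, "payload") would be the corresponding call. Whether two schemas count as compatible is decided by Avro's schema resolution rules, not by this helper.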
to_avro
Converts a column into binary of Avro format.
Avro has been a built-in but external data source module since Spark 2.4. Deploy the application as described in the deployment section of the "Apache Avro Data Source Guide".
from_avro since 3.1.0
to_avro since 3.1.0
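One possible way to make the module available from R is to request it when the session starts. The sketch below assumes the Maven coordinate pattern org.apache.spark:spark-avro_<Scala version>:<Spark version>; the helper names avro_package and start_avro_session are hypothetical, and the version numbers are placeholders that must match your own Spark build.

```r
# Hypothetical helper: build the spark-avro Maven coordinate.
# Both versions are assumptions; check them against your Spark installation.
avro_package <- function(spark_version, scala_version = "2.12") {
  sprintf("org.apache.spark:spark-avro_%s:%s", scala_version, spark_version)
}

# Hypothetical helper: start a SparkR session with the Avro module requested
# through the sparkPackages argument. Assumes library(SparkR) is attached.
start_avro_session <- function(spark_version, scala_version = "2.12") {
  sparkR.session(sparkPackages = avro_package(spark_version, scala_version))
}

avro_package("3.1.1")
```

The last line only builds the coordinate string; start_avro_session("3.1.1") would start the session itself and needs a local Spark installation.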
## Not run:
# Spark replaces the dots in the iris column names with underscores.
df <- createDataFrame(iris)

# Avro schema of the serialized record, as a JSON string.
schema <- paste(
  c(
    '{"type": "record", "namespace": "example.avro", "name": "Iris", "fields": [',
    '{"type": ["double", "null"], "name": "Sepal_Length"},',
    '{"type": ["double", "null"], "name": "Sepal_Width"},',
    '{"type": ["double", "null"], "name": "Petal_Length"},',
    '{"type": ["double", "null"], "name": "Petal_Width"},',
    '{"type": ["string", "null"], "name": "Species"}]}'
  ),
  collapse = "\n"
)

# Pack all columns into a struct and serialize it to Avro binary.
df_serialized <- select(
  df,
  alias(to_avro(alias(struct(column("*")), "fields")), "payload")
)

# Decode the binary payload back into a struct column.
df_deserialized <- select(
  df_serialized,
  from_avro(df_serialized$payload, schema)
)

head(df_deserialized)
## End(Not run)