| text_dataset_from_directory | R Documentation |
Generates a tf.data.Dataset from text files in a directory.
If your directory structure is:
main_directory/
...class_a/
......a_text_1.txt
......a_text_2.txt
...class_b/
......b_text_1.txt
......b_text_2.txt
Then calling text_dataset_from_directory(main_directory, labels='inferred') will return a tf.data.Dataset that yields batches of
texts from the subdirectories class_a and class_b, together with labels
0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).
Only .txt files are supported at this time.
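As a rough illustration of the layout described above, the following sketch builds the same two-class directory tree in a temporary location and then passes it to text_dataset_from_directory(). The file contents are placeholder strings invented for this example, batch_size = 2L is an arbitrary choice, and the keras3 call is only attempted when the package is installed (it also assumes a working backend).

```r
# Build a small two-class directory layout in a temporary location.
# Class and file names mirror the layout shown above; the file contents
# are placeholder strings invented for this sketch.
main_directory <- file.path(tempdir(), "main_directory")
for (class_name in c("class_a", "class_b")) {
  class_dir <- file.path(main_directory, class_name)
  dir.create(class_dir, recursive = TRUE, showWarnings = FALSE)
  prefix <- sub("class_", "", class_name)  # "a" or "b"
  for (i in 1:2) {
    writeLines(
      sprintf("placeholder text %d for %s", i, class_name),
      file.path(class_dir, sprintf("%s_text_%d.txt", prefix, i))
    )
  }
}

# List what was written, relative to main_directory.
created <- list.files(main_directory, recursive = TRUE)
print(created)

# Only attempt the keras3 call when the package is available.
if (requireNamespace("keras3", quietly = TRUE)) {
  ds <- keras3::text_dataset_from_directory(
    main_directory,
    labels = "inferred",
    batch_size = 2L  # arbitrary small batch size for this sketch
  )
  print(ds)
}
```

With labels = "inferred", the labels come from the subdirectory names, so class_a maps to 0 and class_b to 1 as described above.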
text_dataset_from_directory(
  directory,
  labels = "inferred",
  label_mode = "int",
  class_names = NULL,
  batch_size = 32L,
  max_length = NULL,
  shuffle = TRUE,
  seed = NULL,
  validation_split = NULL,
  subset = NULL,
  follow_links = FALSE,
  verbose = TRUE
)
directory |
Directory where the data is located.
If |
labels |
Either |
label_mode |
String describing the encoding of
|
class_names |
Only valid if |
batch_size |
Size of the batches of data.
If |
max_length |
Maximum size of a text string. Texts longer than this will
be truncated to |
shuffle |
Whether to shuffle the data.
If set to |
seed |
Optional random seed for shuffling and transformations. |
validation_split |
Optional float between 0 and 1, fraction of data to reserve for validation. |
subset |
Subset of the data to return.
One of |
follow_links |
Whether to visit subdirectories pointed to by symlinks.
Defaults to |
verbose |
Whether to display information on the number of classes and
number of files found. Defaults to |
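The validation_split, subset and seed arguments are typically used together. The sketch below is one possible way to obtain a training and a validation dataset from the same directory; it assumes the subset values "training" and "validation", a hypothetical directory of ten placeholder files, and an arbitrary seed of 123. Passing the same validation_split and seed to both calls is what keeps the two subsets from overlapping when shuffling is enabled.

```r
# Hypothetical layout: 2 classes x 5 placeholder files each.
data_dir <- file.path(tempdir(), "split_example")
for (class_name in c("class_a", "class_b")) {
  class_dir <- file.path(data_dir, class_name)
  dir.create(class_dir, recursive = TRUE, showWarnings = FALSE)
  for (i in 1:5) {
    writeLines(
      sprintf("placeholder %d", i),
      file.path(class_dir, sprintf("text_%d.txt", i))
    )
  }
}
n_files <- length(list.files(data_dir, recursive = TRUE))

# Only attempt the keras3 calls when the package is available.
if (requireNamespace("keras3", quietly = TRUE)) {
  # Same validation_split and seed in both calls, different subset.
  train_ds <- keras3::text_dataset_from_directory(
    data_dir,
    validation_split = 0.2,
    subset = "training",
    seed = 123L
  )
  val_ds <- keras3::text_dataset_from_directory(
    data_dir,
    validation_split = 0.2,
    subset = "validation",
    seed = 123L
  )
}
```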
A tf.data.Dataset object.
If label_mode is NULL, it yields string tensors of shape
(batch_size,), containing the contents of a batch of text files.
Otherwise, it yields a tuple (texts, labels), where texts
has shape (batch_size,) and labels follows the format described
below.
Rules regarding labels format:
If label_mode is "int", the labels are an int32 tensor of shape (batch_size,).
If label_mode is "binary", the labels are a float32 tensor of 1s and 0s of shape (batch_size, 1).
If label_mode is "categorical", the labels are a float32 tensor of shape (batch_size, num_classes), representing a one-hot encoding of the class index.
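To make the three label formats concrete, here is a small base R sketch of what each encoding looks like for a hypothetical batch of four files from two classes. It only mimics the shapes and values with ordinary R vectors and matrices (R doubles stand in for float32); it is not how the function builds its tensors internally.

```r
# Hypothetical class indices for a batch of four files
# (0 = class_a, 1 = class_b).
class_index <- c(0L, 1L, 1L, 0L)
num_classes <- 2L

# "int": one integer class index per file, shape (batch_size,).
labels_int <- class_index

# "binary": a column of 1s and 0s, shape (batch_size, 1).
labels_binary <- matrix(as.numeric(class_index), ncol = 1)

# "categorical": one-hot rows, shape (batch_size, num_classes).
# Row i of the identity matrix is the one-hot vector for class i - 1.
labels_categorical <- diag(num_classes)[class_index + 1L, , drop = FALSE]

print(labels_int)
print(dim(labels_binary))
print(labels_categorical)
```

Each row of labels_categorical contains a single 1 in the column of its class, which is the one-hot encoding referred to above.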
Other dataset utils:
audio_dataset_from_directory()
image_dataset_from_directory()
split_dataset()
timeseries_dataset_from_array()
Other utils:
audio_dataset_from_directory()
clear_session()
config_disable_interactive_logging()
config_disable_traceback_filtering()
config_enable_interactive_logging()
config_enable_traceback_filtering()
config_is_interactive_logging_enabled()
config_is_traceback_filtering_enabled()
get_file()
get_source_inputs()
image_array_save()
image_dataset_from_directory()
image_from_array()
image_load()
image_smart_resize()
image_to_array()
layer_feature_space()
normalize()
pad_sequences()
set_random_seed()
split_dataset()
timeseries_dataset_from_array()
to_categorical()
zip_lists()
Other preprocessing:
image_dataset_from_directory()
image_smart_resize()
timeseries_dataset_from_array()