EstimateExoLabel: Estimate ExoLabel Disk Consumption

View source: R/ExoLabel.R

EstimateExoLabel    R Documentation

Estimate ExoLabel Disk Consumption

Description

Estimate the total disk consumption for ExoLabel.

Usage

EstimateExoLabel(num_v, avg_degree=1,
              num_edges=num_v*avg_degree,
              node_name_length=8L)

Arguments

num_v

Approximate number of total unique nodes in the network.

avg_degree

Average degree of each node in the network.

num_edges

Approximate total number of edges in the network.

node_name_length

Approximate average length of each node name, in characters.

Details

This function provides a rough estimate of the total disk space required to run ExoLabel on a given input network. Only one of avg_degree and num_edges needs to be specified; if num_edges is omitted, it defaults to num_v*avg_degree. The function prints the estimated size of the original edgelist files, the estimated disk space consumed by ExoLabel, and the approximate ratio of ExoLabel disk space to the original file size.
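The Usage signature gives num_edges a default of num_v*avg_degree, so an edge count can be converted to the matching average degree and back. A minimal sketch of that arithmetic in base R follows; the values and variable names are illustrative only and are not taken from any real network.

```r
# Illustrative values (assumptions, not from a real network).
num_v <- 10000       # approximate number of unique nodes
num_edges <- 50000   # approximate number of edges

# Average degree under the convention implied by the default
# num_edges = num_v * avg_degree in the Usage section.
avg_degree <- num_edges / num_v

# Going the other way reproduces the edge count.
implied_edges <- num_v * avg_degree

print(c(avg_degree = avg_degree, implied_edges = implied_edges))
```

Note that this convention counts edges per node directly, so whichever quantity is easier to measure for a given network can be supplied.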

node_name_length specifies the average length of the node names. Because the names themselves must be stored on disk, their length contributes to the overall size. For relatively short node names (1-16 characters), this has a negligible impact on overall disk consumption.

Value

Returns a vector of length three containing the estimated total edgelist file size, the estimated ExoLabel disk consumption, and the ratio of the two. Both sizes are in bytes.
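Because the sizes are returned in bytes, it can help to convert them before reading them. The sketch below assumes the elements appear in the order listed above and indexes them by position; bytes_to_gib is a hypothetical helper written for this example and is not part of SynExtend.

```r
# Hypothetical helper (not part of SynExtend): convert bytes to GiB.
bytes_to_gib <- function(bytes) bytes / 1024^3

# Only call the estimator if SynExtend is installed and exports it.
has_fn <- requireNamespace("SynExtend", quietly = TRUE) &&
  "EstimateExoLabel" %in% getNamespaceExports("SynExtend")

if (has_fn) {
  est <- SynExtend::EstimateExoLabel(num_v = 100000, avg_degree = 2)
  # Assumed order: edgelist file size, ExoLabel disk usage, ratio.
  print(bytes_to_gib(est[1:2]))  # the two sizes, in GiB
  print(est[3])                  # unitless ratio, left as-is
}
```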

Note

Estimating the average node label size is challenging, and unfortunately it has a relatively large effect on the estimated edgelist file size. This function should be used for rough sizing estimates, not exact values. Errors in the estimated node name length affect the edgelist file estimate more than the ExoLabel disk usage estimate, so users can have higher confidence in the estimated ExoLabel consumption.
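One way to get a rough value for node_name_length is to average the character counts of a small sample of node labels. The sketch below uses made-up labels (sample_names is hypothetical; in practice the sample would come from the edgelist files) and then, if SynExtend is available, runs the estimate at that length and at double that length to see how sensitive the printed figures are.

```r
# Hypothetical sample of node labels (assumption for illustration).
sample_names <- c("geneA_0001", "geneB_0002", "sp1|contig12|orf3", "x9")

# Average label length in characters, rounded up to a whole number.
name_len <- as.integer(ceiling(mean(nchar(sample_names))))
print(name_len)

# Only call the estimator if SynExtend is installed and exports it.
has_fn <- requireNamespace("SynExtend", quietly = TRUE) &&
  "EstimateExoLabel" %in% getNamespaceExports("SynExtend")

if (has_fn) {
  # Compare the estimate at the sampled length and at twice that length.
  for (len in c(name_len, 2L * name_len)) {
    SynExtend::EstimateExoLabel(num_v = 100000, avg_degree = 2,
                                node_name_length = len)
  }
}
```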

Author(s)

Aidan Lakshman <AHL27@pitt.edu>

See Also

ExoLabel

Examples

# 100,000 nodes, average degree 2
EstimateExoLabel(num_v=100000, avg_degree=2)

# 10,000 nodes, 50,000 edges
EstimateExoLabel(num_v=10000, num_edges=50000)

npcooley/SynExtend documentation built on May 17, 2024, 1:50 p.m.