# nn_bce_loss: Binary cross entropy loss

In torch: Tensors and Neural Networks with 'GPU' Acceleration

## Description

Creates a criterion that measures the Binary Cross Entropy between the target and the output.

## Usage

    nn_bce_loss(weight = NULL, reduction = "mean")

## Arguments

• weight (Tensor, optional): a manual rescaling weight given to the loss of each batch element. If given, it has to be a Tensor of size nbatch.

• reduction (string, optional): specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied; 'mean': the sum of the output will be divided by the number of elements in the output; 'sum': the output will be summed. Default: 'mean'. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction.
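As a minimal sketch of how the two arguments might be passed, the snippet below builds the criterion three times with the same per-element weight and a different reduction each time. It assumes the torch package and its backend are installed. The tensors `pred`, `target`, and `w` are made-up illustration values.

```r
# Guarded so the script does nothing when torch (or its backend) is missing.
if (requireNamespace("torch", quietly = TRUE) && torch::torch_is_installed()) {
  library(torch)

  pred   <- torch_tensor(c(0.9, 0.2, 0.6))  # illustrative probabilities in (0, 1)
  target <- torch_tensor(c(1, 0, 1))        # illustrative targets in [0, 1]
  w      <- torch_tensor(c(1, 2, 0.5))      # one weight per batch element

  # Same weight, three different reductions.
  per_element <- nn_bce_loss(weight = w, reduction = "none")(pred, target)
  averaged    <- nn_bce_loss(weight = w, reduction = "mean")(pred, target)
  summed      <- nn_bce_loss(weight = w, reduction = "sum")(pred, target)

  print(per_element)  # one loss value per batch element
  print(averaged)     # single value
  print(summed)       # single value
}
```

With 'none' the result keeps one entry per batch element, which is useful for inspecting how much each weight contributes before reducing.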

## Details

The unreduced (i.e. with reduction set to 'none') loss can be described as:

\ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = - w_n \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log (1 - x_n) \right]
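The per-element formula can be written out directly in base R, without torch. This is only a sketch of the arithmetic: the helper name `bce_unreduced` and the sample vectors are hypothetical, and the clamping described later in this section is not applied.

```r
# Per-element binary cross entropy in base R (illustration only).
# x: predicted probabilities, y: targets in [0, 1], w: per-element weights.
bce_unreduced <- function(x, y, w = rep(1, length(x))) {
  -w * (y * log(x) + (1 - y) * log(1 - x))
}

x <- c(0.9, 0.2, 0.6)  # illustrative predictions
y <- c(1, 0, 1)        # illustrative targets
l <- bce_unreduced(x, y)
print(l)               # one value l_n per element
```

For hard 0/1 targets only one of the two log terms is active per element, so here the values reduce to -log(0.9), -log(0.8), and -log(0.6).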

where N is the batch size. If reduction is not 'none' (default 'mean'), then

\ell(x, y) = \left\{ \begin{array}{ll} \mbox{mean}(L), & \mbox{if reduction} = \mbox{'mean';}\\ \mbox{sum}(L), & \mbox{if reduction} = \mbox{'sum'.} \end{array} \right.
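The reduction step can be sketched in base R as a small dispatch over the three option names. The helper `reduce_loss` and the vector `l_example` are hypothetical names used only for illustration.

```r
# Hypothetical helper: apply the named reduction to a vector of losses.
reduce_loss <- function(l, reduction = c("mean", "sum", "none")) {
  reduction <- match.arg(reduction)  # defaults to "mean", errors on unknown names
  switch(reduction,
         mean = mean(l),  # sum divided by the number of elements
         sum  = sum(l),
         none = l)        # returned unchanged
}

l_example <- c(0.1, 0.2, 0.6)  # illustrative per-element losses
loss_mean <- reduce_loss(l_example, "mean")
loss_sum  <- reduce_loss(l_example, "sum")
loss_none <- reduce_loss(l_example, "none")
```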

This is used for measuring the error of a reconstruction, for example in an auto-encoder. Note that the targets y should be numbers between 0 and 1.

Notice that if x_n is either 0 or 1, one of the log terms would be mathematically undefined in the above loss equation. PyTorch chooses to set \log (0) = -∞, since \lim_{x\to 0} \log (x) = -∞.

However, an infinite term in the loss equation is not desirable for several reasons. First, if either y_n = 0 or (1 - y_n) = 0, then we would be multiplying 0 with infinity. Second, if we have an infinite loss value, then we would also have an infinite term in our gradient, since \lim_{x\to 0} \frac{d}{dx} \log (x) = ∞.
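To make the gradient claim concrete, differentiating the per-element loss l_n from the unreduced formula with respect to x_n gives the following. It is a worked step under the definitions above.

```latex
\frac{\partial l_n}{\partial x_n}
  = - w_n \left[ \frac{y_n}{x_n} - \frac{1 - y_n}{1 - x_n} \right]
```

For y_n > 0 the first term grows without bound as x_n \to 0, and for y_n < 1 the second term does the same as x_n \to 1.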

This would make BCELoss's backward method nonlinear with respect to x_n, and using it for things like linear regression would not be straightforward. To avoid this, BCELoss clamps its log function outputs to be greater than or equal to -100. This way, the loss value is always finite and the backward method is linear.
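The effect of the clamp can be illustrated in base R. This is a sketch of the idea only, not the library's internal code. The helpers `clamped_log` and `bce_clamped` are hypothetical names.

```r
# Floor the log at -100 so that log(0) = -Inf never reaches the loss.
clamped_log <- function(x) pmax(log(x), -100)

# Unweighted per-element BCE using the clamped log.
bce_clamped <- function(x, y) {
  -(y * clamped_log(x) + (1 - y) * clamped_log(1 - x))
}

unclamped_term <- 0 * log(0)            # 0 * -Inf is NaN in R
worst_case     <- bce_clamped(0, 1)     # confidently wrong prediction
exact_match    <- bce_clamped(0, 0)     # confidently right prediction

print(unclamped_term)
print(worst_case)
print(exact_match)
```

The confidently wrong prediction is capped at a loss of 100 instead of infinity, and the 0 times infinity product that would otherwise give NaN becomes 0 times -100.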

## Shape

• Input: (N, *), where * means any number of additional dimensions

• Target: (N, *), same shape as the input

• Output: scalar. If reduction is 'none', then (N, *), same shape as input.
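A quick way to check these shapes is to compare the `$shape` field of the unreduced and reduced outputs. This sketch assumes the torch package and its backend are installed. The sizes 4, 2, and 3 are arbitrary.

```r
# Guarded so the script does nothing when torch (or its backend) is missing.
if (requireNamespace("torch", quietly = TRUE) && torch::torch_is_installed()) {
  library(torch)

  input  <- torch_rand(4, 2, 3)  # (N = 4, * = 2 x 3), values in [0, 1)
  target <- torch_rand(4, 2, 3)  # same shape as the input

  unreduced <- nn_bce_loss(reduction = "none")(input, target)
  reduced   <- nn_bce_loss()(input, target)  # default reduction = "mean"

  print(unreduced$shape)  # matches the input shape
  print(reduced$shape)    # shape of the reduced output
}
```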

## Examples

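    library(torch)

    if (torch_is_installed()) {
      m <- nn_sigmoid()        # maps raw scores to probabilities in (0, 1)
      loss <- nn_bce_loss()    # default reduction = "mean"
      input <- torch_randn(3, requires_grad = TRUE)
      target <- torch_rand(3)  # targets in [0, 1)
      output <- loss(m(input), target)
      output$backward()        # populates input$grad
    }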

torch documentation built on Oct. 7, 2021, 9:22 a.m.