Keras (Part 2): Loss Functions

This chapter introduces keras.losses, the loss functions in Keras.

By function, loss functions fall into three categories:

Probabilistic losses, mainly used for classification

Regression losses, used for regression problems

Hinge losses, also known as "maximum-margin" losses, mainly used for SVM-style classification to maximize the margin of the separating hyperplane

Using loss functions

In compile() and fit()

Usually the loss is passed as the loss argument when the model is compiled.

There are two ways to specify it:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(layers.Dense(64, kernel_initializer='uniform', input_shape=(10,)))
model.add(layers.Activation('softmax'))

loss_fn = keras.losses.SparseCategoricalCrossentropy()
model.compile(loss=loss_fn, optimizer='adam')

# Pass the loss by name: default parameters will be used
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')

Standalone use

Loss functions can also be used on their own: simply pass two tensors as the y_true and y_pred inputs; sample_weight is an optional third argument.

import tensorflow as tf

tf.keras.losses.mean_squared_error(tf.ones((2, 2)), tf.zeros((2, 2)))
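The class-based losses work the same way when called directly; as a small sketch (the weights below are made up for illustration), the optional sample_weight argument weights each sample's contribution:

import tensorflow as tf

mse = tf.keras.losses.MeanSquaredError()
y_true = tf.ones((2, 2))
y_pred = tf.zeros((2, 2))
# Weight the second sample twice as heavily as the first (illustrative values).
print(mse(y_true, y_pred, sample_weight=[1.0, 2.0]).numpy())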

Custom losses

You can also define your own function and pass it as the loss argument. The definition is simple: the function takes two tensors, y_true and y_pred, and returns a tensor of loss values.

def my_loss_fn(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)  # Note the `axis=-1`

model.compile(optimizer='adam', loss=my_loss_fn)
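If the custom loss needs extra parameters, one common pattern (a sketch; the weight factor here is purely hypothetical) is to wrap it in a closure so that compile() still receives a two-argument function:

def make_weighted_mse(weight):
    # `weight` is a hypothetical scaling factor, used only for illustration.
    def weighted_mse(y_true, y_pred):
        return weight * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
    return weighted_mse

model.compile(optimizer='adam', loss=make_weighted_mse(0.5))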

add_loss()

add_loss() is a method provided by tf.keras.layers.Layer. When creating a custom layer, you can call it inside call() to add extra loss terms that will be included in training.

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import Layer

class MyActivityRegularizer(Layer):
    """Layer that creates an activity sparsity regularization loss."""

    def __init__(self, rate=1e-2):
        super(MyActivityRegularizer, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # We use `add_loss` to create a regularization loss
        # that depends on the inputs.
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs

class SparseMLP(Layer):
    """Stack of Linear layers with a sparsity regularization loss."""

    def __init__(self, output_dim):
        super(SparseMLP, self).__init__()
        self.dense_1 = layers.Dense(32, activation=tf.nn.relu)
        self.regularization = MyActivityRegularizer(1e-2)
        self.dense_2 = layers.Dense(output_dim)

    def call(self, inputs):
        x = self.dense_1(inputs)
        x = self.regularization(x)
        return self.dense_2(x)

mlp = SparseMLP(1)
y = mlp(tf.ones((10, 10)))
print(mlp.losses)  # List containing one float32 scalar

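When such a layer is used inside a compiled model, the terms created via add_loss are collected in model.losses and added to the compiled loss automatically during fit(). A minimal sketch (the random data is only for illustration):

import numpy as np
from tensorflow import keras

inputs = keras.Input(shape=(10,))
outputs = SparseMLP(1)(inputs)
model = keras.Model(inputs, outputs)

# The activity-regularization terms from add_loss are added to the
# 'mse' loss automatically during training.
model.compile(optimizer='adam', loss='mse')
model.fit(np.random.random((16, 10)), np.random.random((16, 1)), epochs=1, verbose=0)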

Types of loss functions

Probabilistic losses

For classification problems, cross-entropy is the most commonly used loss function.

BinaryCrossentropy (BCE)

BinaryCrossentropy is the cross-entropy for binary (0/1) labels. Formula:

loss = -mean(y_true * log(y_pred) + (1 - y_true) * log(1 - y_pred))

Cross-entropy measures the distance between two probability distributions: the smaller the cross-entropy, the closer the two distributions are.

For background on how BinaryCrossentropy works, see:

https://towardsdatascience.com/understanding-binary-cross-entropy-log-loss-a-visual-explanation-a3ac6025181a

Example:

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
y_true = [0, 1, 0, 0]
y_pred = [0.1, 1.29, -1, -6.2]  # logits
bce(y_true, y_pred).numpy()

0.32571107

Here, from_logits=True means y_pred contains raw logits in (-inf, inf), whose sign and magnitude indicate how strongly the prediction leans toward class 0 or 1; use from_logits=False when y_pred contains probabilities in [0, 1].
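As a quick check (a sketch reusing the numbers above), passing logits with from_logits=True is equivalent to passing sigmoid(logits) with from_logits=False:

import tensorflow as tf

y_true = [0., 1., 0., 0.]
logits = [0.1, 1.29, -1., -6.2]  # raw scores in (-inf, inf)
probs = tf.sigmoid(logits)       # probabilities in [0, 1]

bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
print(bce_logits(y_true, logits).numpy())  # ~0.3257
print(bce_probs(y_true, probs).numpy())    # ~0.3257 (same up to numerical precision)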

CategoricalCrossentropy (CCE)

Both CategoricalCrossentropy and SparseCategoricalCrossentropy are used for multi-class classification; CategoricalCrossentropy is the one to use when the labels are one-hot encoded.

cce = tf.keras.losses.CategoricalCrossentropy()
y_true = [[0, 1, 0], [0, 0, 1]]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
print(cce(y_true, y_pred).numpy())
y_pred = [[0.05, 0.95, 0], [0.1, 0.1, 0.8]]
print(cce(y_true, y_pred).numpy())

1.1769392
0.13721842

SparseCategoricalCrossentropy (SCCE)

SparseCategoricalCrossentropy is used for multi-class classification with integer labels.

y_true = [1, 2]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
# Using 'auto'/'sum_over_batch_size' reduction type.
scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(scce(y_true, y_pred).numpy())
y_pred = [[0.05, 0.95, 0], [0.1, 0.1, 0.8]]
print(scce(y_true, y_pred).numpy())

1.1769392
0.13721849
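To make the relationship between CCE and SCCE explicit, here is a small sketch showing that CategoricalCrossentropy on one-hot labels and SparseCategoricalCrossentropy on the corresponding integer labels give the same value:

import tensorflow as tf

y_int = [1, 2]  # integer labels
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes=3)
y_pred = [[0.05, 0.95, 0.], [0.1, 0.1, 0.8]]

cce = tf.keras.losses.CategoricalCrossentropy()
scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(cce(y_onehot, y_pred).numpy())  # ~0.1372
print(scce(y_int, y_pred).numpy())    # same value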

Poisson

Poisson loss: loss = y_pred - y_true * log(y_pred)

This is the negative log-likelihood loss when the true labels are assumed to follow a Poisson distribution.

y_true = [[0., 1.], [0., 0.]]
y_pred = [[1., 1.], [0., 0.]]
# Using 'auto'/'sum_over_batch_size' reduction type.
p = tf.keras.losses.Poisson()
print(p(y_true, y_pred).numpy())
y_pred = [[0., 1.], [0., 0.]]
print(p(y_true, y_pred).numpy())

0.49999997
0.24999997
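The first result can be reproduced directly from the formula; this sketch assumes Keras adds its standard small epsilon (about 1e-7) inside the log for numerical stability:

import numpy as np

y_true = np.array([[0., 1.], [0., 0.]])
y_pred = np.array([[1., 1.], [0., 0.]])
eps = 1e-7  # assumed value of the Keras stability epsilon
print(np.mean(y_pred - y_true * np.log(y_pred + eps)))  # ~0.5, matching Poisson() above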

KLDivergence

Kullback-Leibler divergence loss. Formula: loss = y_true * log(y_true / y_pred)

y_true = [[0., 1.], [0., 0.]]
y_pred = [[1., 1.], [0., 0.]]
# Using 'auto'/'sum_over_batch_size' reduction type.
kl = tf.keras.losses.KLDivergence()
print(kl(y_true, y_pred).numpy())
y_pred = [[0., 1.], [0., 0.]]
print(kl(y_true, y_pred).numpy())

-8.059048e-07
0.0
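The slightly negative first result comes from clipping: Keras clips y_true and y_pred into [epsilon, 1] before applying the formula, so each zero label contributes roughly epsilon * log(epsilon). A rough reproduction (the epsilon value is an assumption):

import numpy as np

y_true = np.array([[0., 1.], [0., 0.]])
y_pred = np.array([[1., 1.], [0., 0.]])
eps = 1e-7  # assumed Keras epsilon
yt = np.clip(y_true, eps, 1.0)
yp = np.clip(y_pred, eps, 1.0)
print(np.mean(np.sum(yt * np.log(yt / yp), axis=-1)))  # ~-8.06e-07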

Regression losses

The following are loss functions commonly used in regression problems.

MeanSquaredError (MSE)

The most common loss function, the mean squared error: loss = mean(square(y_true - y_pred))

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mse = tf.keras.losses.MeanSquaredError()
print(mse(y_true, y_pred).numpy())

0.0075336643
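As a sanity check, the same value can be reproduced directly from the formula with NumPy:

import numpy as np

y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
print(np.mean(np.square(y_true - y_pred)))  # ~0.0075337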

MeanAbsoluteError (MAE)

loss = mean(abs(y_true - y_pred))

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mae = tf.keras.losses.MeanAbsoluteError()
print(mae(y_true, y_pred).numpy())

0.053666655

MeanAbsolutePercentageError (MAPE)

loss = 100 * mean(abs((y_true - y_pred) / y_true))

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
mape = tf.keras.losses.MeanAbsolutePercentageError()
print(mape(y_true, y_pred).numpy())

17.416664

MeanSquaredLogarithmicError (MSLE)

loss = mean(square(log(y_true + 1.) - log(y_pred + 1.)))

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
msle = tf.keras.losses.MeanSquaredLogarithmicError()
print(msle(y_true, y_pred).numpy())

0.003985339

CosineSimilarity

loss = -sum(l2_norm(y_true) * l2_norm(y_pred))

Note that the result is a number between -1 and 1. For values between -1 and 0, 0 indicates orthogonality and values closer to -1 indicate greater similarity; values closer to 1 indicate greater dissimilarity.

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
cs = tf.keras.losses.CosineSimilarity()
print(cs(y_true, y_pred).numpy())

-0.9891268
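The value is simply the negative cosine of the angle between the two vectors, which can be checked with NumPy:

import numpy as np

y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
cos = np.dot(y_true, y_pred) / (np.linalg.norm(y_true) * np.linalg.norm(y_pred))
print(-cos)  # ~-0.98913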

Huber

Huber loss is a parameterized loss function for regression; compared with the squared-error loss (MSE), it is more robust to outliers. Formula:

loss = 0.5 * a^2                      if |a| <= δ
loss = 0.5 * δ^2 + δ * (|a| - δ)      if |a| > δ

where a is the residual, y_pred - y_true; in Keras the default is δ = 1.0.

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
huber = tf.keras.losses.Huber()
print(huber(y_true, y_pred).numpy())

0.0037668322
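Since every residual here is below δ = 1, the result reduces to mean(0.5 * a^2); a small NumPy sketch of the piecewise formula:

import numpy as np

delta = 1.0
y_true = np.array([0.8, 0.1, 0.3])
y_pred = np.array([0.81, 0.101, 0.45])
a = y_pred - y_true
manual = np.where(np.abs(a) <= delta,
                  0.5 * np.square(a),
                  0.5 * delta**2 + delta * (np.abs(a) - delta))
print(np.mean(manual))  # ~0.0037668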

LogCosh

logcosh = log((exp(x) + exp(-x)) / 2), where x = y_pred - y_true

y_true = [0.8, 0.1, 0.3]
y_pred = [0.81, 0.101, 0.45]
lc = tf.keras.losses.LogCosh()
print(lc(y_true, y_pred).numpy())

0.0037528474

Hinge losses

Hinge

The Hinge loss is defined as:

loss = maximum(1 - y_true * y_pred, 0)

y_true values are expected to be -1 or 1; if 0/1 labels are provided, Keras converts them to -1/1 internally.

y_true = [[0., 1.], [0., 0.]]
y_pred = [[0.6, 0.4], [0.4, 0.6]]
h = tf.keras.losses.Hinge()
print(h(y_true, y_pred).numpy())

1.3
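To illustrate the label conversion mentioned above, passing the equivalent -1/1 labels directly gives the same result:

import tensorflow as tf

h = tf.keras.losses.Hinge()
y_pred = [[0.6, 0.4], [0.4, 0.6]]
y_01 = [[0., 1.], [0., 0.]]      # binary 0/1 labels, converted to -1/1 internally
y_pm1 = [[-1., 1.], [-1., -1.]]  # the same labels written as -1/1
print(h(y_01, y_pred).numpy())   # 1.3
print(h(y_pm1, y_pred).numpy())  # 1.3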

SquaredHinge

The square of the Hinge loss:

loss = square(maximum(1 - y_true * y_pred, 0))

y_true = [[0., 1.], [0., 0.]]
y_pred = [[0.6, 0.4], [0.4, 0.6]]
h = tf.keras.losses.SquaredHinge()
h(y_true, y_pred).numpy()

1.86

CategoricalHinge

neg = max((1 - y_true) * y_pred)
pos = sum(y_true * y_pred)
loss = max(neg - pos + 1, 0)

import numpy as np
import tensorflow as tf

y_true = np.random.randint(0, 3, size=(2,))
y_true = tf.keras.utils.to_categorical(y_true, num_classes=3)
y_pred = np.random.random(size=(2, 3))
loss = tf.keras.losses.categorical_hinge(y_true, y_pred)
print(loss.numpy())
assert loss.shape == (2,)

# Verify against the formula above.
pos = np.sum(y_true * y_pred, axis=-1)
neg = np.amax((1. - y_true) * y_pred, axis=-1)
assert np.array_equal(loss.numpy(), np.maximum(0., neg - pos + 1.))
