解决Keras TensorFlow 混编中 trainable=False设置无效问题

更新时间：2020年6月29日 08:24 点击：1856

这是最近碰到一个问题，先描述下问题：

首先我有一个训练好的模型(例如vgg16)，我要对这个模型进行一些改变，例如添加一层全连接层，用于种种原因，我只能用TensorFlow来进行模型优化,tf的优化器，默认情况下对所有tf.trainable_variables()进行权值更新，问题就出在这，明明将vgg16的模型设置为trainable=False，但是tf的优化器仍然对vgg16做权值更新

以上就是问题描述，经过谷歌百度等等，终于找到了解决办法，下面我们一点一点的来复原整个问题。

trainable=False 无效

首先，我们导入训练好的模型vgg16，对其设置成trainable=False

from keras.applications import VGG16
import tensorflow as tf
from keras import layers

# 导入模型
base_mode = VGG16(include_top=False)
# 查看可训练的变量
tf.trainable_variables()

[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]

# 设置 trainable=False
# base_mode.trainable = False似乎也是可以的
for layer in base_mode.layers:
  layer.trainable = False

设置好trainable=False后，再次查看可训练的变量，发现并没有变化，也就是说设置无效

# 再次查看可训练的变量
tf.trainable_variables()

[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
 <tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
 <tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
 <tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
 <tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]

解决的办法

解决的办法就是在导入模型的时候建立一个variable_scope，将需要训练的变量放在另一个variable_scope,然后通过tf.get_collection获取需要训练的变量，最后通过tf的优化器中var_list指定需要训练的变量

from keras import models
with tf.variable_scope('base_model'):
  base_model = VGG16(include_top=False, input_shape=(224,224,3))
with tf.variable_scope('xxx'):
  model = models.Sequential()
  model.add(base_model)
  model.add(layers.Flatten())
  model.add(layers.Dense(10))

# 获取需要训练的变量
trainable_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'xxx')
trainable_var

[<tf.Variable 'xxx_2/dense_1/kernel:0' shape=(25088, 10) dtype=float32_ref>,
<tf.Variable 'xxx_2/dense_1/bias:0' shape=(10,) dtype=float32_ref>]

# 定义tf优化器进行训练，这里假设有一个loss
loss = model.output / 2; # 随便定义的，方便演示
train_step = tf.train.AdamOptimizer().minimize(loss, var_list=trainable_var)

总结

在keras与TensorFlow混编中，keras中设置trainable=False对于TensorFlow而言并不起作用

解决的办法就是通过variable_scope对变量进行区分，在通过tf.get_collection来获取需要训练的变量，最后通过tf优化器中var_list指定训练

以上这篇解决Keras TensorFlow 混编中 trainable=False设置无效问题就是小编分享给大家的全部内容了，希望能给大家一个参考，也希望大家多多支持猪先飞。

[!--infotagslink--]

上一篇: Python如何对XML 解析

下一篇: python批量处理多DNS多域名的nslookup解析实现

在Keras中利用np.random.shuffle()打乱数据集实例
这篇文章主要介绍了在Keras中利用np.random.shuffle()打乱数据集实例，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-16
SpringBoot2.x中management.security.enabled=false无效的解决
这篇文章主要介绍了SpringBoot2.x中management.security.enabled=false无效的解决方案，具有很好的参考价值，希望对大家有所帮助。如有错误或未考虑完全的地方，望不吝赐教...2021-07-23
解决tensorflow训练时内存持续增加并占满的问题
今天小编就为大家分享一篇解决tensorflow训练时内存持续增加并占满的问题，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-04-22
解决在keras中使用model.save()函数保存模型失败的问题
这篇文章主要介绍了解决在keras中使用model.save()函数保存模型失败的问题，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-05-21
win10安装tensorflow-gpu1.8.0详细完整步骤
这篇文章主要介绍了win10安装tensorflow-gpu1.8.0详细完整步骤，本文给大家介绍的非常详细，具有一定的参考借鉴价值,需要的朋友可以参考下...2020-04-22
macOS M1(AppleSilicon) 安装TensorFlow环境
苹果为M1芯片的Mac提供了TensorFlow的支持，本文主要介绍了如何给使用M1芯片的macOS安装TensorFlow的环境，感兴趣的可以了解一下...2021-08-13
windows系统Tensorflow2.x简单安装记录(图文)
这篇文章主要介绍了windows系统Tensorflow2.x简单安装记录(图文)，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习学习吧...2021-01-18
解决Keras 中加入lambda层无法正常载入模型问题
这篇文章主要介绍了解决Keras 中加入lambda层无法正常载入模型问题，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-17
keras.layer.input()用法说明
这篇文章主要介绍了keras.layer.input()用法说明，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-17
keras的三种模型实现与区别说明
这篇文章主要介绍了keras的三种模型实现与区别说明，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-07-04
完美解决TensorFlow和Keras大数据量内存溢出的问题
这篇文章主要介绍了完美解决TensorFlow和Keras大数据量内存溢出的问题，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-07-04
Tensorflow读取并输出已保存模型的权重数值方式
今天小编就为大家分享一篇Tensorflow读取并输出已保存模型的权重数值方式，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看不看...2020-04-30
利用keras使用神经网络预测销量操作
这篇文章主要介绍了利用keras使用神经网络预测销量操作，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-07-08
在tensorflow下利用plt画论文中loss,acc等曲线图实例
这篇文章主要介绍了在tensorflow下利用plt画论文中loss,acc等曲线图实例，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-16
Python通过TensorFLow进行线性模型训练原理与实现方法详解
这篇文章主要介绍了Python通过TensorFLow进行线性模型训练原理与实现方法,结合实例形式详细分析了Python通过TensorFLow进行线性模型训练相关概念、算法设计与训练操作技巧,需要的朋友可以参考下...2020-04-27
使用keras实现孪生网络中的权值共享教程
这篇文章主要介绍了使用keras实现孪生网络中的权值共享教程，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-11
详解tf.device()指定tensorflow运行的GPU或CPU设备实现
这篇文章主要介绍了详解tf.device()指定tensorflow运行的GPU或CPU设备实现，文中通过示例代码介绍的非常详细，对大家的学习或者工作具有一定的参考学习价值，需要的朋友们下面随着小编来一起学习学习吧...2021-02-20
tensorflow实现对张量数据的切片操作方式
今天小编就为大家分享一篇tensorflow实现对张量数据的切片操作方式，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-04-22
keras:model.compile损失函数的用法
这篇文章主要介绍了keras:model.compile损失函数的用法，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-07-02
keras输出预测值和真实值方式
这篇文章主要介绍了keras输出预测值和真实值方式，具有很好的参考价值，希望对大家有所帮助。一起跟随小编过来看看吧...2020-06-28

解决Keras TensorFlow 混编中 trainable=False设置无效问题

相关文章

阁下可能感兴趣的内容

推荐阅读