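Installing TensorFlow

The official TensorFlow website has detailed installation instructions.

Installation on Linux is fairly simple; the notes below are about Windows.

Even with the detailed official instructions, you will still run into problems in practice, and more of them if you deviate from the default setup. Here is my installation record.

Installing Python

Be sure to install the 64-bit version. I carelessly installed the 32-bit version and had to uninstall it and reinstall. The interpreter's startup banner shows which build you have ("32 bit" in the output below is the wrong one):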

python -v

….

Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 23 2018, 23:31:17) [MSC v.1916 32 bit (Intel)] on win32
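
Reading the banner works, but the bitness can also be checked from inside Python. The snippet below is a minimal sketch using only the standard library; the helper name python_bitness is my own and does not come from the TensorFlow docs.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64, based on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    bits = python_bitness()
    print("Python %s, %d-bit" % (sys.version.split()[0], bits))
    if bits != 64:
        # The notes above call for a 64-bit build before installing TensorFlow.
        print("Not a 64-bit build: uninstall and install the 64-bit version.")
```

Run it with the same python executable that will later be used to create the virtual environment.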

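Click here to download the 64-bit version of Python 3.6.8

Installing Visual Studio Code

Strongly recommended: you can write and debug code inside the IDE.

The Terminal built into the IDE is also easier to use than the cmd that ships with Windows; the later commands can be run there.

Setting up a virtual environment (the official docs also recommend this approach)
1. First change into the directory where you want to create the virtual environment
2. Run the following commands to create a virtual environment named venv in that directory; all packages installed later go into this directory

pip install virtualenv
virtualenv --system-site-packages -p python ./venv

The command given in the official docs raised an error for me: python3 cannot be used in the command, so use python instead.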

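Run the following commands in order; see the official instructions for details.

# activate the virtual environment
.\venv\Scripts\activate
# upgrade pip
pip install --upgrade pip
# install (or upgrade) tensorflow
pip install --upgrade tensorflow
# packages used by the tests later
pip install matplotlib
pip install opencv-python
pip install tqdm

If import cv2 raises an error, switch to a different version, for example:

pip uninstall opencv-python
pip install opencv-python==3.4.5.20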

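Configuring VScode to use the Python virtual environment
1. In the VScode Extensions panel, search for and install "Python for VScode"
2. Open Settings (bottom-left corner of the window) and search for python.pythonPath
3. Edit the Workspace Settings and point it at the python executable of the virtual environment created above, for example
  
  E:\DL\venv\Scripts\python.exe
  
  After the change, restart VScode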

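Basics

Understanding shape

shape gives the length of a tensor along each of its dimensions. For example, shape=(1,2,3) contains 3 numbers, so the tensor has 3 dimensions.

The first number is 1, so the first dimension has length 1

The second number is 2, so the second dimension has length 2

The third number is 3, so the third dimension has length 3

[x, y]                  # a single dimension of length 2, so the shape is (2,)
[[[x,y,z], [x,y,z]]]    # the shape of this tensor is (1,2,3)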
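
To make the counting rule concrete, here is a small sketch that derives the shape of a nested Python list by walking down its first elements. It uses plain lists rather than TensorFlow tensors so it runs anywhere; shape_of is a hypothetical helper and assumes the list is regular (not ragged).

```python
def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))   # length along the current dimension
        if not nested:
            break                   # an empty list has no deeper level
        nested = nested[0]          # descend into the next dimension
    return tuple(shape)


print(shape_of([7, 8]))                    # (2,)
print(shape_of([[[1, 2, 3], [4, 5, 6]]]))  # (1, 2, 3)
```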

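CNN principles

https://www.jianshu.com/p/fe428f0b32c1

Note on multi-channel convolution: for each kernel, the convolution results of all input channels are summed into 1 feature map, rather than producing one feature map per channel.

For example, a 12x12 image with 3 channels (RGB) has shape 12x12x3; convolving it with 8 kernels of size 3x3 (stride 1, no padding) gives an output of shape 10x10x8.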
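
The 10x10x8 result follows from the usual output-size formula for a convolution without padding. The function below is a sketch of that arithmetic only; conv_output_shape and its parameter names are my own. The number of input channels does not appear in the result because each kernel spans all input channels and sums over them.

```python
def conv_output_shape(height, width, kernel_size, num_kernels, stride=1, padding=0):
    """Output (height, width, channels) of a 2-D convolution layer."""
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    # one output channel (feature map) per kernel
    return (out_h, out_w, num_kernels)


print(conv_output_shape(12, 12, kernel_size=3, num_kernels=8))  # (10, 10, 8)
```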

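tf functions

Functions for defining variables

tf.placeholder

Defines a placeholder

x = tf.placeholder("float", [None, 784])
y_ = tf.placeholder("float", [None, 10])

This defines an input x with 784 features per sample (None leaves the number of samples open). The actual data is not supplied yet; it is fed in when the graph is run.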

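tf.Variable

W = tf.Variable(<initial-value>, name=<optional-name>)

Creates a variable whose initial value is initial-value. The initial value must be specified.

Example:

W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))

This defines 2 variables, the weight matrix and the bias vector; training consists of repeatedly updating these two.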

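tf.get_variable

W = tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None)

Returns an existing variable with this name when the enclosing variable scope has reuse enabled (the shape and dtype must match the existing variable); otherwise creates a new one. Without reuse, asking for a name that already exists raises an error.
An initializer can be passed instead, so an explicit initial value is not required.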
tf.variable_scope

Adds a prefix to the names of the variables created inside it

with tf.variable_scope('mynet'):
    b = tf.Variable(tf.zeros([10]), name='b')

The resulting variable name is: mynet/b

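tf.zeros_like

tf.zeros_like(tensor, dtype=None, name=None, optimize=True)

Creates a tensor with all elements set to zero, with the same shape and type as tensor

tf.truncated_normal

Generates random values from a truncated normal distribution: values more than 2 standard deviations from the mean are discarded and redrawn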

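Functions that operate on variables

tf.matmul

Matrix multiplication

softmax: takes the scores computed from the input x, the weights W and the bias b, and maps them to values in the range [0, 1] that sum to 1, so they can be read as probabilities

y = tf.nn.softmax(tf.matmul(x,W) + b)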
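
What softmax does to a single row of scores can be sketched in plain Python. This is an illustration of the formula, not TensorFlow's implementation; subtracting the maximum score before exponentiating is a common numerical-stability trick that I added, and it does not change the result.

```python
import math


def softmax(scores):
    """Map a list of scores to probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift by the max for stability
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)        # the largest score gets the largest probability
print(sum(probs))   # the probabilities add up to 1 (up to rounding)
```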

tf.reduce_sum

Sum of the elements (over all dimensions when no axis is given)

The code below computes the cross-entropy

cross_entropy = -tf.reduce_sum(y_*tf.log(y))
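
For a single sample the cross-entropy formula can be worked through in plain Python. This is only a sketch of the arithmetic; the eps clipping is my own addition to avoid log(0) and is not part of the TensorFlow line above.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """-sum(y_true * log(y_pred)) for one sample; y_true is one-hot."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0.0, 1.0, 0.0]                        # the correct class is index 1
good = cross_entropy(y_true, [0.05, 0.90, 0.05])
bad = cross_entropy(y_true, [0.70, 0.20, 0.10])
print(good, bad)   # a confident correct prediction gives the smaller loss
```

With a one-hot label, only the predicted probability of the correct class contributes, so the loss is simply -log of that probability.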

tf.cast

cast(x, dtype, name=None)

Converts x to the data type dtype. For example, if x is bool, casting it to float turns it into a sequence of 0s and 1s. The reverse conversion also works.

tf.reduce_mean

Computes the mean
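
These two functions are typically combined to compute accuracy: cast the boolean "prediction was correct" values to floats, then take their mean. A plain-Python sketch of that idea (the helper names are my own, and lists stand in for tensors):

```python
def cast_to_float(values):
    """Turn booleans into 1.0 / 0.0, like casting a bool tensor to float."""
    return [float(v) for v in values]


def mean(values):
    return sum(values) / len(values)


correct_prediction = [True, False, True, True]
as_float = cast_to_float(correct_prediction)
accuracy = mean(as_float)
print(as_float)   # [1.0, 0.0, 1.0, 1.0]
print(accuracy)   # 0.75
```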

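tf.squeeze

squeeze(input,axis=None,name=None,squeeze_dims=None)

Returns a tensor obtained by removing all dimensions of size 1 from input.
axis can be used to choose which size-1 dimensions to remove. Every dimension listed must really have size 1, otherwise an error is raised.

Example:

# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
tf.shape(tf.squeeze(t))   # [2, 3], all size-1 dimensions are removed by default

# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
tf.shape(tf.squeeze(t, [2, 4]))  # [1, 2, 3, 1], indices start at zero, only dimensions 2 and 4 are removed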
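
The effect on the shape can be reproduced with a few lines of plain Python. This sketch only computes the resulting shape (it does not touch any data), and squeezed_shape is a hypothetical helper name.

```python
def squeezed_shape(shape, axis=None):
    """Shape after removing size-1 dimensions (all of them, or only those in axis)."""
    if axis is None:
        return [d for d in shape if d != 1]
    for a in axis:
        if shape[a] != 1:
            raise ValueError("cannot squeeze axis %d of size %d" % (a, shape[a]))
    return [d for i, d in enumerate(shape) if i not in axis]


print(squeezed_shape([1, 2, 1, 3, 1, 1]))          # [2, 3]
print(squeezed_shape([1, 2, 1, 3, 1, 1], [2, 4]))  # [1, 2, 3, 1]
```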

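tf.slice

This function extracts a slice from the input data input

The slice has size size and starts at position begin.
size describes the dimensions of the output tensor: size[i] is the number of elements taken along dimension i.

See https://www.jianshu.com/p/71e6ef6c121b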
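
The begin/size convention is easy to confuse with Python's start/stop slicing, so here is a sketch for the 2-D case using nested lists. slice_2d is a hypothetical helper; it does not handle the special value size[i] = -1.

```python
def slice_2d(matrix, begin, size):
    """Take size[0] rows starting at begin[0] and size[1] columns starting at begin[1]."""
    row0, col0 = begin
    rows, cols = size
    return [row[col0:col0 + cols] for row in matrix[row0:row0 + rows]]


m = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
print(slice_2d(m, begin=[1, 0], size=[2, 3]))  # [[5, 6, 7], [9, 10, 11]]
```

Note that size is a count of elements, not an end index: the output here has shape (2, 3), exactly the value passed as size.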

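tf.concat

tf.concat(values, axis, name='concat')

Where:
values should be a list or tuple of tensors.
axis is the dimension along which to concatenate.
tf.concat returns the concatenated tensor.
For example, if the tensors in the list all have shape (2, 2, 2) and axis is 2, i.e. the third dimension is concatenated, the result has shape (2, 2, 4): the tensors are stacked along that dimension. An example (axis=-1 means the last dimension, which is the same as axis=2 here):

t1 = [[[1, 2], [2, 3]], [[4, 4], [5, 3]]]
t2 = [[[7, 4], [8, 4]], [[2, 10], [15, 11]]]
tf.concat([t1, t2], axis=-1)

The output is

<tf.Tensor 'concat_2:0' shape=(2, 2, 4) dtype=int32>

Calling sess.run() on it gives the actual tensor:

[[[ 1,  2,  7,  4],
  [ 2,  3,  8,  4]],

 [[ 4,  4,  2, 10],
  [ 5,  3, 15, 11]]]

which matches the shape (2, 2, 4).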
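
The same result can be reproduced without TensorFlow, which may help to see what "concatenating along the last dimension" means. The recursive helper below is a sketch for regular nested lists of equal shape; concat_last_axis is my own name.

```python
def concat_last_axis(a, b):
    """Concatenate two equally shaped nested lists along their last dimension."""
    if isinstance(a[0], list):
        # not at the innermost level yet: pair up the sub-lists and recurse
        return [concat_last_axis(x, y) for x, y in zip(a, b)]
    return a + b  # innermost lists are joined end to end


t1 = [[[1, 2], [2, 3]], [[4, 4], [5, 3]]]
t2 = [[[7, 4], [8, 4]], [[2, 10], [15, 11]]]
result = concat_last_axis(t1, t2)
print(result)
```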

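tf.train.string_input_producer

This function takes a list of file names, which the system automatically turns into a file-name queue.

tf.train.string_input_producer also has two important parameters. One is num_epochs, the number of epochs, i.e. how many times the whole file list is passed through. The other is shuffle, which controls whether the order of the files is shuffled within each epoch.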
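
How num_epochs and shuffle interact can be sketched with a plain Python generator. This only imitates the order in which file names would be produced; it is not how TensorFlow implements the queue, and the file names and the seed parameter are made up for the illustration.

```python
import random


def filename_epochs(filenames, num_epochs, shuffle=True, seed=None):
    """Yield every file name once per epoch, optionally shuffled within each epoch."""
    rng = random.Random(seed)
    for _ in range(num_epochs):
        epoch = list(filenames)
        if shuffle:
            rng.shuffle(epoch)   # the order changes, but each epoch still holds every file
        for name in epoch:
            yield name


files = ["a.tfrecords", "b.tfrecords", "c.tfrecords"]  # hypothetical file names
queue = list(filename_epochs(files, num_epochs=2, seed=0))
print(queue)
```

Shuffling happens inside an epoch, so a file from the second epoch never appears before the first epoch has finished.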

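summary

TensorBoard is TensorFlow's visualization tool. When learning TensorFlow it is very helpful for visualizing model training and parameters.

During training, the functions in the tf.summary module record the training process and the parameter distributions so that they can be displayed in TensorBoard.

See: TensorFlow框架(2)之TensorBoard详解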

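TFRecord

TFRecord is the training-data storage format recommended by TensorFlow, and it fits more easily into the architecture of a network application.

A TFRecord file is essentially binary Protobuf data, so it is faster to read and transfer. Each record in a TFRecord file is a serialized tf.train.Example instance.

Another benefit of the tfrecord format is a uniform data structure that hides the underlying layout of the raw data. In a task such as image classification, the raw data consists of many small image files, with the labels encoded as folder names. Operations like shuffling and batching are awkward on such data, and become much easier once all the raw data has been converted into one or a few tfrecord files.

Generating a tfrecord file

For how to convert raw data into the tfrecord format, see the following code snippet: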

import tensorflow as tf

def _bytes_feature(value):
    """Wrap a bytes value (or a list of them) in a tf.train.Feature."""
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))

def _int64_feature(value):
    """Wrap an int (or a list of ints) in a tf.train.Feature."""
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

# train_values and train_labels are assumed to be NumPy arrays loaded elsewhere
# create the TFRecord writer
writer = tf.python_io.TFRecordWriter('csv_train.tfrecords')

for i in range(train_values.shape[0]):
    image_raw = train_values[i].tobytes()
    features = tf.train.Features(feature={
        'image_raw': _bytes_feature(image_raw),
        'label': _int64_feature(int(train_labels[i]))
    })
    # build the Example protobuf and write it out in serialized form
    example = tf.train.Example(features=features)
    writer.write(example.SerializeToString())

writer.close()

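Using a tfrecord file

Define a parsing function that matches the way the file was written

def parse_exmp(serial_exmp):
    # the feature spec must match what was written: one bytes string and one int64
    features = {'image_raw': tf.FixedLenFeature([], tf.string),
                'label': tf.FixedLenFeature([], tf.int64)
    }
    feats = tf.parse_single_example(serial_exmp, features=features)
    # assumes the image arrays were stored as uint8
    image = tf.decode_raw(feats['image_raw'], tf.uint8)
    # convert the image format as needed for your data
    #....
    label = tf.cast(feats['label'], tf.int32)
    return image, label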

Reading tfrecord files with TFRecordDataset

def get_tf_data():
    '''Read the tfrecord data'''
    # tf_filename1, tf_filename2 and batch_size are assumed to be defined elsewhere
    dataset = tf.data.TFRecordDataset([tf_filename1, tf_filename2])
    dataset = dataset.map(parse_exmp)
    dataset = dataset.shuffle(1000)
    dataset = dataset.repeat(1).batch(batch_size)

    iterator = dataset.make_one_shot_iterator()
    one_element = iterator.get_next()
    return one_element

Fetching the actual data with a session

one_element = get_tf_data()
sess = tf.Session()
while True:
    try:
        result = sess.run(one_element)
        print(result[0], result[1])
    except tf.errors.OutOfRangeError:
        # raised once the dataset has been read to the end
        print("end!")
        break

Using tf.data

See: https://zhuanlan.zhihu.com/p/38421397

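Saving and restoring models (Saver)

Saving the parameters of a trained model for later validation or testing is something we do all the time. In tf, model saving is provided by tf.train.Saver().

To save a model, first create a Saver object, for example:

saver=tf.train.Saver()

One argument that is often used when creating the Saver is max_to_keep, which sets how many saved models are kept. The default is 5 (max_to_keep=5), i.e. the 5 most recent models are kept. If you want to save the model after every epoch and keep all of them, set max_to_keep to None or 0, for example:

saver=tf.train.Saver(max_to_keep=0)

Apart from using more disk space this has little practical benefit, so it is not recommended.

If you only want to keep the model from the last epoch, set max_to_keep to 1:

saver=tf.train.Saver(max_to_keep=1)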
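
The retention behaviour described above can be imitated with a bounded queue. This is a plain-Python sketch of the idea only, not of the Saver's internals; kept_checkpoints and the "ckpt-<step>" names are hypothetical.

```python
from collections import deque


def kept_checkpoints(steps, max_to_keep=5):
    """Return the checkpoint names still on disk after saving at every step."""
    limit = max_to_keep if max_to_keep else None  # None or 0 means keep everything
    kept = deque(maxlen=limit)                    # the oldest entries drop out automatically
    for step in steps:
        kept.append("ckpt-%d" % step)
    return list(kept)


steps = range(0, 800, 100)                     # saves at steps 0, 100, ..., 700
print(kept_checkpoints(steps))                 # the 5 most recent
print(kept_checkpoints(steps, max_to_keep=1))  # ['ckpt-700']
print(len(kept_checkpoints(steps, max_to_keep=0)))  # 8
```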

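Once the saver object has been created, the trained model can be saved, for example:

saver.save(sess,'model/mnist.ckpt',global_step=step)

The generated files are written to the model directory, with a file-name prefix built from the arguments as follows.

The first argument is the session. The second argument sets the save path and name. The third argument appends the training step to the model name as a suffix.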
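
The naming rule amounts to joining the save path and the step with a hyphen. A minimal sketch of that rule, where checkpoint_prefix is a hypothetical helper and not part of the Saver API:

```python
def checkpoint_prefix(save_path, global_step=None):
    """Build the checkpoint file-name prefix from the save path and the step."""
    if global_step is None:
        return save_path
    return "%s-%d" % (save_path, global_step)


print(checkpoint_prefix("model/mnist.ckpt", 200))  # model/mnist.ckpt-200
print(checkpoint_prefix("model/mnist.ckpt"))       # model/mnist.ckpt
```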

> saver.save(sess, 'my-model', global_step=0) ==>      filename: 'my-model-0'
> ...
> saver.save(sess, 'my-model', global_step=1000) ==> filename: 'my-model-1000'

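A model is restored with the restore() function, which takes two arguments, restore(sess, save_path), where save_path is the path of the saved model. tf.train.latest_checkpoint() can be used to find the most recently saved model automatically, for example:

```python
model_file=tf.train.latest_checkpoint('model/')
saver.restore(sess,model_file)
```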

PB files

Generating a pb file

The tf.graph_util.convert_variables_to_constants function saves the values of the variables in the computation graph as constants.

import tensorflow as tf

if __name__ == "__main__":
    a = tf.Variable(tf.constant(5.,shape=[1]),name="a")
    b = tf.Variable(tf.constant(6.,shape=[1]),name="b")
    c = a + b  # this op gets the default name 'add'
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    # export the GraphDef part of the current computation graph
    graph_def = tf.get_default_graph().as_graph_def()
    # keep the listed output nodes and store the variable values as constants
    output_graph_def = tf.graph_util.convert_variables_to_constants(sess,graph_def,['add'])
    # write the computation graph to the model file
    with tf.gfile.GFile("model.pb","wb") as model_f:
        model_f.write(output_graph_def.SerializeToString())

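Analyzing a pb file

There are two ways to analyze a pb file with Tensorboard

Method 1:

Restore the computation graph from the pb file
View the structure of the graph in Tensorboard

Method 2:

Use the import_pb_to_tensorboard.py tool from the tools shipped with tensorflow. In my case the Linux version of tensorflow did not install this tool (on Windows it is installed by default); if needed, it can be downloaded from [https://github.com/tensorflow/tensorflow/tree/master/tensorflow/python/tools]

Method 1

Restoring the computation graph from the pb file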

import tensorflow as tf

model = 'model.pb'  # change this to the path of your own pb file
graph = tf.get_default_graph()
graph_def = graph.as_graph_def()
with tf.gfile.GFile(model, 'rb') as f:
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='graph')
# write the graph to the log directory so that Tensorboard can display it
summaryWriter = tf.summary.FileWriter('log/', graph)
summaryWriter.close()

Viewing the computation graph in Tensorboard

Run the following command on the command line to start Tensorboard

# the path is the one the graph was written to in the last step above; change it as needed
tensorboard --logdir log/

Method 2

Use the import_pb_to_tensorboard.py tool from tools

# on the command line
python -m tensorflow.python.tools.import_pb_to_tensorboard --model_dir="your_path/model.pb" --log_dir="your_log_path"
# start tensorboard
tensorboard --logdir="your_log_path"

Or

import sys
import os
from tensorflow.python.tools.import_pb_to_tensorboard import import_to_tensorboard

if __name__ == "__main__":
    model = sys.argv[1]
    log_dir = sys.argv[2]
    import_to_tensorboard(model, log_dir)
    # start tensorboard from the script
    os.system('tensorboard --logdir='+log_dir)

Looking at the source code, the second method is really a wrapper around the first.
The two methods are equivalent; the second is simply more convenient.

Sample code

#coding=UTF-8
import tensorflow as tf
# input_data.py is the MNIST loader script from the TensorFlow tutorials,
# assumed to sit in the same directory as this file
import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

#placeholders, x: input, y_: true labels
x = tf.placeholder("float",[None,784])
y_ = tf.placeholder("float", [None,10])

def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

#for debugging
#batch = mnist.train.next_batch(50)
#x_image = tf.reshape(batch[0], [-1,28,28,1])
#y_ = batch[1]


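#weights and bias of the first convolution layer, kernel size 5x5
#the raw data has only 1 channel
#32 kernels, each extracting a different feature, producing 32 output channels
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
#reshape the input (raw data) into a 4-D tensor
x_image = tf.reshape(x, [-1,28,28,1])
#first layer: convolution -> relu -> pooling
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)


#weights and bias of the second convolution layer, kernel size 5x5
#the previous layer has 32 output channels, so this layer has 32 input channels
#64 kernels, each extracting a different feature, producing 64 output channels
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
#second layer: convolution -> relu -> pooling
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)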

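#weights and bias of the fully connected layer
#the previous layer has 64 output channels, each of size 7x7
#1024 units in the fully connected layer, producing 1024 outputs
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
#flatten the input (output of the second layer) into a 2-D tensor
h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
#compute the output of the fully connected layer
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

#Dropout layer, prevents or reduces overfitting, usually applied to fully connected layers
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

#weights and bias of the output layer
#the previous (fully connected) layer has 1024 outputs
#10 outputs, representing the probabilities of the digits 0-9
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
#compute the output layer: matrix multiplication + softmax normalization
y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)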

#loss (cross-entropy), minimized with the Adam optimizer
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

#accuracy: fraction of samples whose predicted class matches the label
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

#start training
for i in range(20000):
  batch = mnist.train.next_batch(50)

  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

#test results
print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
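
The 7 * 7 * 64 in the fully connected layer comes from following the image size through the network. With padding='SAME' the output size is the input size divided by the stride, rounded up. The sketch below traces that arithmetic in plain Python; same_padding_output is my own helper name.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size of a conv or pool layer that uses padding='SAME'."""
    return int(math.ceil(size / float(stride)))


size = 28
size = same_padding_output(size, 1)  # conv1, stride 1 -> 28
size = same_padding_output(size, 2)  # pool1, stride 2 -> 14
size = same_padding_output(size, 1)  # conv2, stride 1 -> 14
size = same_padding_output(size, 2)  # pool2, stride 2 -> 7
flat = size * size * 64              # 64 channels after the second convolution
print(size, flat)                    # 7 3136
```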

Reference links

I consulted the following articles while learning

TensorFlow框架(1)之Computational Graph详解
TensorFlow框架(2)之TensorBoard详解
TensorFlow框架(3)之MNIST机器学习入门
TensorFlow框架(4)之CNN卷积神经网络详解
