TensorFlow Cheat Sheet
1. Introduction
TensorFlow is Google's open-source, end-to-end machine learning platform. Keras, its high-level API, provides a concise interface for building, training, and deploying models. TensorFlow 2.x enables Eager Execution by default, so tensors can be manipulated as intuitively as NumPy arrays.
import tensorflow as tf
print(tf.__version__) # e.g. 2.16.1
print(tf.executing_eagerly()) # True
2. Installation
# CPU only
pip install tensorflow
# GPU support (CUDA required)
pip install tensorflow[and-cuda]
# Verify GPU
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
The GPU build requires the NVIDIA CUDA Toolkit and cuDNN. Managing the environment with conda or Docker is recommended.
3. Tensor Operations
Creating Tensors
# Constants
a = tf.constant([[1, 2], [3, 4]]) # shape (2,2), dtype int32
b = tf.constant([1.0, 2.0], dtype=tf.float32)
# Zeros / Ones
z = tf.zeros((3, 4)) # 3x4 of 0.0
o = tf.ones((2, 3)) # 2x3 of 1.0
# Random
r = tf.random.normal((3, 3), mean=0, stddev=1)
u = tf.random.uniform((2, 2), minval=0, maxval=10)
# Range
seq = tf.range(0, 10, delta=2) # [0, 2, 4, 6, 8]
# From NumPy
import numpy as np
t = tf.constant(np.array([1, 2, 3])) # NumPy -> Tensor
n = t.numpy() # Tensor -> NumPy
Tensors are immutable multi-dimensional arrays. Use tf.Variable to create mutable tensors (used for model parameters).
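A minimal sketch of tf.Variable usage (the values are illustrative):
v = tf.Variable([1.0, 2.0]) # mutable, tracked as a trainable parameter
v.assign([3.0, 4.0]) # overwrite in place
v.assign_add([1.0, 1.0]) # in-place add -> [4.0, 5.0]
print(v.trainable) # True by default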
Shape Operations
x = tf.constant([[1,2,3],[4,5,6]]) # shape (2,3)
tf.reshape(x, (3, 2)) # reshape to (3,2)
tf.reshape(x, (-1,)) # flatten to (6,)
tf.expand_dims(x, axis=0) # (1,2,3)
tf.squeeze(tf.zeros((1,3,1))) # (3,) remove size-1 dims
# Concat & Stack
a = tf.constant([[1,2]])
b = tf.constant([[3,4]])
tf.concat([a, b], axis=0) # (2,2) along rows
tf.stack([a, b], axis=0) # (2,1,2) new dim
Math Operations
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
tf.matmul(a, b) # matrix multiply
tf.reduce_sum(a) # 10.0 (sum all)
tf.reduce_mean(a, axis=1) # [1.5, 3.5] per row
tf.reduce_max(a, axis=0) # [3.0, 4.0] per col
tf.math.log(a) # element-wise log
tf.nn.softmax(a, axis=1) # softmax per row
GPU Devices
# List available devices
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")
# Place ops on a specific device
with tf.device('/GPU:0'):
    x = tf.random.normal((1000, 1000))
    y = tf.matmul(x, x)
# Memory growth (prevent OOM)
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
TensorFlow automatically uses available GPUs by default. Use tf.device to place ops on a specific device explicitly.
GradientTape
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2 * x + 1 # y = x^2 + 2x + 1
dy_dx = tape.gradient(y, x) # dy/dx = 2x + 2 = 8.0
print(dy_dx) # tf.Tensor(8.0, ...)
GradientTape records the forward computation and differentiates it automatically. It is the core of custom training loops.
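Two tape patterns worth knowing: a persistent tape allows multiple gradient() calls, and constants must be watch()ed explicitly. A minimal sketch:
x = tf.constant(2.0)
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x) # constants are not tracked automatically
    y = x ** 2
    z = y ** 2 # z = x^4
print(tape.gradient(y, x)) # 4.0 (2x at x=2)
print(tape.gradient(z, x)) # 32.0 (4x^3 at x=2)
del tape # free the persistent tape's resources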
4. Building Models with Keras
Sequential API
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.summary()
Sequential suits simple, linearly stacked models. Layers are added in order; each layer's output automatically feeds the next.
Functional API
# Multi-input model
input_text = tf.keras.Input(shape=(100,), name='text_input')
input_meta = tf.keras.Input(shape=(5,), name='meta_input')
x = tf.keras.layers.Dense(64, activation='relu')(input_text)
x = tf.keras.layers.Dropout(0.3)(x)
y = tf.keras.layers.Dense(16, activation='relu')(input_meta)
combined = tf.keras.layers.Concatenate()([x, y])
output = tf.keras.layers.Dense(1, activation='sigmoid')(combined)
model = tf.keras.Model(inputs=[input_text, input_meta], outputs=output)
The Functional API supports multiple inputs/outputs, shared layers, and non-linear topologies (such as residual connections).
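For example, a residual (skip) connection, which Sequential cannot express, takes one extra line with the Functional API (the layer sizes here are illustrative):
inputs = tf.keras.Input(shape=(64,))
x = tf.keras.layers.Dense(64, activation='relu')(inputs)
x = tf.keras.layers.Dense(64)(x)
outputs = tf.keras.layers.Add()([inputs, x]) # skip connection
res_block = tf.keras.Model(inputs, outputs)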
Subclassing
class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(128, activation='relu')
        self.dropout = tf.keras.layers.Dropout(0.3)
        self.dense2 = tf.keras.layers.Dense(10)
    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)
model = MyModel()
Subclassing offers maximum flexibility and suits research and custom forward logic. Note that the training argument controls Dropout/BatchNormalization behavior.
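The flag matters at call time too: fit()/evaluate() set it automatically, but direct calls default to inference behavior. A quick sketch:
x = tf.random.normal((4, 784))
model(x, training=True) # dropout active
model(x, training=False) # dropout disabled
preds = model.predict(x) # predict() always runs in inference mode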
Common Layers
| Layer | Code | Purpose |
|---|---|---|
| Dense | Dense(64, activation='relu') | Fully connected layer |
| Conv2D | Conv2D(32, (3,3), activation='relu') | Image convolution |
| MaxPooling2D | MaxPooling2D(pool_size=(2,2)) | Downsampling |
| LSTM | LSTM(64, return_sequences=True) | Sequence modeling |
| Embedding | Embedding(vocab_size, 128) | Word embeddings |
| Dropout | Dropout(0.3) | Regularization |
| BatchNormalization | BatchNormalization() | Normalization |
| Flatten | Flatten() | Flatten multi-dim tensors |
| GlobalAveragePooling2D | GlobalAveragePooling2D() | Global average pooling |
5. Training
model.compile()
model.compile(
    optimizer='adam', # or tf.keras.optimizers.Adam(1e-3)
    loss='sparse_categorical_crossentropy', # for integer labels
    metrics=['accuracy']
)
model.fit()
history = model.fit(
    x_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
        tf.keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
    ]
)
# Evaluate
loss, acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {acc:.4f}")
# Predict
predictions = model.predict(x_new) # returns NumPy array
Custom Training Loop
optimizer = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
@tf.function # compile to graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
for epoch in range(10):
    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)
    print(f"Epoch {epoch+1}, Loss: {float(loss):.4f}") # float(): tensors don't support format specs
The @tf.function decorator compiles eager code into a computation graph, which can substantially improve performance.
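One caveat: tf.function retraces (recompiles) for each new input shape or dtype. Fixing an input_signature avoids repeated tracing; a minimal sketch, assuming 784-dimensional inputs as in the earlier examples:
@tf.function(input_signature=[tf.TensorSpec(shape=(None, 784), dtype=tf.float32)])
def forward(x):
    return model(x, training=False)
forward(tf.random.normal((8, 784))) # traces once
forward(tf.random.normal((64, 784))) # reuses the same graph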
Callbacks
| Callback | Code | Description |
|---|---|---|
| EarlyStopping | EarlyStopping(patience=5, monitor='val_loss') | Stop training when validation loss stops improving |
| ModelCheckpoint | ModelCheckpoint('best.keras', save_best_only=True) | Save the best model weights |
| TensorBoard | TensorBoard(log_dir='./logs') | Visualize the training process |
| ReduceLROnPlateau | ReduceLROnPlateau(factor=0.5, patience=3) | Automatically decay the learning rate on plateau |
| LearningRateScheduler | LearningRateScheduler(lambda e: 1e-3 * 0.9**e) | Custom learning-rate schedule |
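Custom behavior comes from subclassing tf.keras.callbacks.Callback. A minimal sketch that logs the learning rate each epoch (assumes a constant learning rate, not a schedule object):
class LogLR(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        lr = float(self.model.optimizer.learning_rate)
        print(f"epoch {epoch + 1}: lr={lr:.6f}")
# usage: model.fit(..., callbacks=[LogLR()])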
6. Common Patterns
Image Classification (CNN)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(224,224,3)),
    tf.keras.layers.MaxPooling2D((2,2)),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2,2)),
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])
# Data augmentation
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])
Text Classification (Embedding + LSTM)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 128), # input_length is deprecated in Keras 3
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Transfer Learning
# Use pre-trained MobileNetV2
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False, # remove classification head
    weights='imagenet'
)
base_model.trainable = False # freeze base
model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])
# Fine-tuning: unfreeze last N layers
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False
Available models include ResNet50, VGG16, EfficientNet, InceptionV3, and more, all in tf.keras.applications.
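When moving to the fine-tuning phase, re-compile with a much smaller learning rate so the pre-trained weights are not destroyed. A typical two-phase sketch (train_ds is a placeholder tf.data.Dataset; the rates are illustrative):
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_ds, epochs=5) # phase 1: frozen base
base_model.trainable = True # phase 2: unfreeze (see above)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), # ~100x smaller LR
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_ds, epochs=5)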
tf.data.Dataset
# From tensors
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)
# From directory (images)
dataset = tf.keras.utils.image_dataset_from_directory(
    'data/train/',
    image_size=(224, 224),
    batch_size=32,
    label_mode='categorical'
)
# From CSV
dataset = tf.data.experimental.make_csv_dataset(
    'data.csv', batch_size=32, label_name='target'
)
tf.data provides efficient input pipelines. prefetch(AUTOTUNE) automatically overlaps CPU preprocessing with GPU execution.
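A typical end-to-end ordering is map → cache → shuffle → batch → prefetch. A sketch (preprocess is a hypothetical per-example function):
def preprocess(image, label):
    return tf.cast(image, tf.float32) / 255.0, label # scale to [0, 1]
ds = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
      .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
      .cache() # cache after deterministic preprocessing
      .shuffle(10_000)
      .batch(32)
      .prefetch(tf.data.AUTOTUNE))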
Saving and Loading Models
# Keras format (.keras, recommended)
model.save('model.keras')
loaded = tf.keras.models.load_model('model.keras')
# Weights only (Keras 3 expects the .weights.h5 suffix)
model.save_weights('model.weights.h5')
model.load_weights('model.weights.h5')
# Export an inference-only SavedModel for TF Serving
model.export('export/1/') # Keras 3; tf.saved_model.save(model, 'export/1/') also works
7. Loss Functions
| Task | Loss | Code |
|---|---|---|
| Binary classification | Binary Crossentropy | loss='binary_crossentropy' |
| Multi-class (integer labels) | Sparse Categorical CE | loss='sparse_categorical_crossentropy' |
| Multi-class (one-hot) | Categorical CE | loss='categorical_crossentropy' |
| Regression | MSE | loss='mse' |
| Regression (robust) | Huber | loss=tf.keras.losses.Huber(delta=1.0) |
| Multi-label | Binary CE (sigmoid) | loss='binary_crossentropy' + sigmoid |
| Similarity / contrastive learning | Cosine Similarity | loss=tf.keras.losses.CosineSimilarity() |
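A common pitfall with the crossentropy losses: if the final layer outputs raw logits (no softmax/sigmoid), pass from_logits=True, which is also more numerically stable. A sketch:
# Option A: raw logits + from_logits=True (numerically preferred)
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
# Option B: softmax in the last layer + the string alias
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')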
8. Optimizers
| Optimizer | Code | Typical use |
|---|---|---|
| Adam | tf.keras.optimizers.Adam(learning_rate=1e-3) | General-purpose default |
| SGD | tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9) | Computer vision, careful tuning |
| RMSprop | tf.keras.optimizers.RMSprop(learning_rate=1e-3) | RNNs / sequence models |
| AdamW | tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=0.01) | Transformers / large models |
| Adagrad | tf.keras.optimizers.Adagrad(learning_rate=0.01) | Sparse features / NLP |
9. TensorFlow vs PyTorch
| Feature | TensorFlow / Keras | PyTorch |
|---|---|---|
| Execution model | Eager + @tf.function graph compilation | Eager + torch.compile |
| High-level API | tf.keras (built-in) | Usually paired with Lightning / HuggingFace |
| Deployment | TF Serving, TF Lite, TF.js | TorchServe, ONNX, ExecuTorch |
| Mobile | TF Lite (mature) | ExecuTorch (newer) |
| Browser | TensorFlow.js | ONNX.js / Transformers.js |
| Community trend | Preferred for production deployment | Preferred for research & academia |
| Debugging | Eager mode + TF Debugger | Native Python debugging |
| Distributed training | tf.distribute.Strategy | torch.distributed |
Both frameworks are mature. TensorFlow has the more complete ecosystem for industrial deployment, while PyTorch is more popular in research.
10. Deployment
TF Serving
# Save model in SavedModel format
tf.saved_model.save(model, 'export/model/1/')
# Docker: run TF Serving
# docker pull tensorflow/serving
# docker run -p 8501:8501 \
# --mount type=bind,source=$(pwd)/export/model,target=/models/my_model \
# -e MODEL_NAME=my_model tensorflow/serving
# REST API prediction
import requests, json
data = json.dumps({"instances": x_test[:3].tolist()})
resp = requests.post('http://localhost:8501/v1/models/my_model:predict',
                     data=data, headers={"Content-Type": "application/json"})
print(resp.json()['predictions'])
TF Lite (Mobile)
# Convert to TFLite
converter = tf.lite.TFLiteConverter.from_saved_model('export/model/1/')
converter.optimizations = [tf.lite.Optimize.DEFAULT] # quantize
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
# Inference with TFLite
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
TensorFlow.js (Browser)
# Convert to TF.js format
# pip install tensorflowjs
# tensorflowjs_converter --input_format=tf_saved_model \
# export/model/1/ tfjs_model/
# In browser JavaScript:
# const model = await tf.loadGraphModel('tfjs_model/model.json');
# const input = tf.tensor2d([[...features...]]);
# const prediction = model.predict(input);
# prediction.print();
11. FAQ
What are the main differences between TensorFlow 2.x and 1.x?
TF 2.x uses Eager Execution by default, removes tf.Session, makes Keras the official high-level API, and provides graph compilation via @tf.function. The API is more concise and Pythonic.
When should I use Sequential vs Functional vs Subclassing?
Sequential suits simple linear models; the Functional API suits complex models with multiple inputs/outputs or shared layers; subclassing suits research scenarios that need custom forward logic (e.g. GANs, custom attention).
How do I fix GPU out-of-memory (OOM) errors?
1) Enable memory growth: tf.config.experimental.set_memory_growth(gpu, True); 2) reduce batch_size; 3) use mixed-precision training via tf.keras.mixed_precision (a sketch follows); 4) use gradient accumulation.
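For point 3, enabling mixed precision takes one line plus keeping the output layer in float32 for numerical stability. A minimal sketch:
tf.keras.mixed_precision.set_global_policy('mixed_float16')
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Activation('softmax', dtype='float32'), # keep outputs float32
])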
Is TensorFlow suitable for production deployment?
Yes. TensorFlow has the most complete deployment ecosystem: TF Serving (server-side), TF Lite (mobile/embedded), and TF.js (browser). Google uses TensorFlow internally at scale to serve billions of user requests.
Should I choose TensorFlow or PyTorch?
Choose TensorFlow if you value deployment convenience, mobile support, or running in the browser. Choose PyTorch for academic research, rapid prototyping, or a more flexible debugging experience. Both handle most deep learning tasks well.