TensorFlow Cheat Sheet

1. Introduction

TensorFlow is Google's open-source, end-to-end machine learning platform. Keras, its high-level API, provides a concise interface for building, training, and deploying models. TensorFlow 2.x enables Eager Execution by default, so tensors can be manipulated as intuitively as NumPy arrays.

import tensorflow as tf

print(tf.__version__)          # e.g. 2.16.1
print(tf.executing_eagerly())  # True

2. Installation

# CPU only
pip install tensorflow

# GPU support (CUDA required)
pip install tensorflow[and-cuda]

# Verify GPU
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

The GPU build requires the NVIDIA CUDA Toolkit and cuDNN. Managing the environment with conda or Docker is recommended.

3. Tensor Operations

Creating Tensors

# Constants
a = tf.constant([[1, 2], [3, 4]])              # shape (2,2), dtype int32
b = tf.constant([1.0, 2.0], dtype=tf.float32)

# Zeros / Ones
z = tf.zeros((3, 4))                           # 3x4 of 0.0
o = tf.ones((2, 3))                            # 2x3 of 1.0

# Random
r = tf.random.normal((3, 3), mean=0, stddev=1)
u = tf.random.uniform((2, 2), minval=0, maxval=10)

# Range
seq = tf.range(0, 10, delta=2)                 # [0, 2, 4, 6, 8]

# From NumPy
import numpy as np
t = tf.constant(np.array([1, 2, 3]))           # NumPy -> Tensor
n = t.numpy()                                  # Tensor -> NumPy

Tensors are immutable multi-dimensional arrays. Use tf.Variable to create mutable tensors (e.g. for model parameters).
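A minimal sketch of tf.Variable in contrast to constants:

```python
import tensorflow as tf

# tf.constant is immutable; tf.Variable holds mutable state (model weights).
w = tf.Variable([[1.0, 2.0], [3.0, 4.0]], name="w")
w.assign_add(tf.ones((2, 2)))   # in-place update: now [[2, 3], [4, 5]]
print(w[0, 0].numpy())          # 2.0
# Variables are tracked automatically by GradientTape and Keras layers.
```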

Shape Operations

x = tf.constant([[1, 2, 3], [4, 5, 6]])    # shape (2,3)
tf.reshape(x, (3, 2))                      # reshape to (3,2)
tf.reshape(x, (-1,))                       # flatten to (6,)
tf.expand_dims(x, axis=0)                  # (1,2,3)
tf.squeeze(tf.zeros((1, 3, 1)))            # (3,) remove size-1 dims

# Concat & Stack
a = tf.constant([[1, 2]])
b = tf.constant([[3, 4]])
tf.concat([a, b], axis=0)                  # (2,2) along rows
tf.stack([a, b], axis=0)                   # (2,1,2) new dim

Math Operations

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])
tf.matmul(a, b)                  # matrix multiply
tf.reduce_sum(a)                 # 10.0 (sum all)
tf.reduce_mean(a, axis=1)        # [1.5, 3.5] per row
tf.reduce_max(a, axis=0)         # [3.0, 4.0] per col
tf.math.log(a)                   # element-wise log
tf.nn.softmax(a, axis=1)         # softmax per row

GPU Devices

# List available devices
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")

# Place ops on a specific device
with tf.device('/GPU:0'):
    x = tf.random.normal((1000, 1000))
    y = tf.matmul(x, x)

# Memory growth (prevent OOM; must be set before GPUs are initialized)
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

TensorFlow automatically uses available GPUs by default. Use tf.device to place ops on a specific device explicitly.

GradientTape

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2 * x + 1       # y = x^2 + 2x + 1
dy_dx = tape.gradient(y, x)      # dy/dx = 2x + 2 = 8.0
print(dy_dx)                     # tf.Tensor(8.0, ...)

GradientTape records the forward computation and differentiates it automatically. It is the core of custom training loops.

4. Building Models with Keras

Sequential API

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),   # preferred over input_shape= in Keras 3
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.summary()

Sequential suits simple models that are a linear stack of layers: each layer's output automatically becomes the next layer's input.

Functional API

# Multi-input model
input_text = tf.keras.Input(shape=(100,), name='text_input')
input_meta = tf.keras.Input(shape=(5,), name='meta_input')

x = tf.keras.layers.Dense(64, activation='relu')(input_text)
x = tf.keras.layers.Dropout(0.3)(x)
y = tf.keras.layers.Dense(16, activation='relu')(input_meta)

combined = tf.keras.layers.Concatenate()([x, y])
output = tf.keras.layers.Dense(1, activation='sigmoid')(combined)
model = tf.keras.Model(inputs=[input_text, input_meta], outputs=output)

The Functional API supports multiple inputs/outputs, shared layers, and non-linear topologies (e.g. residual connections).
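As a sketch, here is a hypothetical residual block; the skip connection below is exactly the kind of non-linear topology Sequential cannot express:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(32,))
x = tf.keras.layers.Dense(32, activation='relu')(inputs)
x = tf.keras.layers.Dense(32)(x)
outputs = tf.keras.layers.Add()([inputs, x])  # residual (skip) connection
model = tf.keras.Model(inputs, outputs)
```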

Subclassing

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = tf.keras.layers.Dense(128, activation='relu')
        self.dropout = tf.keras.layers.Dropout(0.3)
        self.dense2 = tf.keras.layers.Dense(10)

    def call(self, inputs, training=False):
        x = self.dense1(inputs)
        x = self.dropout(x, training=training)
        return self.dense2(x)

model = MyModel()

Subclassing offers maximum flexibility and suits research and custom forward logic. Note that the training argument controls Dropout/BatchNormalization behavior.

Common Layers

| Layer | Code | Purpose |
|---|---|---|
| Dense | Dense(64, activation='relu') | Fully connected layer |
| Conv2D | Conv2D(32, (3,3), activation='relu') | Image convolution |
| MaxPooling2D | MaxPooling2D(pool_size=(2,2)) | Downsampling |
| LSTM | LSTM(64, return_sequences=True) | Sequence modeling |
| Embedding | Embedding(vocab_size, 128) | Word embeddings |
| Dropout | Dropout(0.3) | Regularization |
| BatchNormalization | BatchNormalization() | Normalization |
| Flatten | Flatten() | Flatten multi-dim tensors |
| GlobalAveragePooling2D | GlobalAveragePooling2D() | Global average pooling |

5. Training

model.compile()

model.compile(
    optimizer='adam',                        # or tf.keras.optimizers.Adam(1e-3)
    loss='sparse_categorical_crossentropy',  # for integer labels
    metrics=['accuracy']
)

model.fit()

history = model.fit(
    x_train, y_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
        tf.keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
    ]
)

# Evaluate
loss, acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {acc:.4f}")

# Predict
predictions = model.predict(x_new)  # returns NumPy array

Custom Training Loop

optimizer = tf.keras.optimizers.Adam(1e-3)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

@tf.function  # compile to graph for speed
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = model(x, training=True)
        loss = loss_fn(y, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for epoch in range(10):
    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)
    print(f"Epoch {epoch+1}, Loss: {float(loss):.4f}")

The @tf.function decorator compiles eager code into a computation graph, which can substantially improve performance.

Callbacks

| Callback | Code | Description |
|---|---|---|
| EarlyStopping | EarlyStopping(patience=5, monitor='val_loss') | Stop training when validation loss stops improving |
| ModelCheckpoint | ModelCheckpoint('best.keras', save_best_only=True) | Save the best model weights |
| TensorBoard | TensorBoard(log_dir='./logs') | Visualize training progress |
| ReduceLROnPlateau | ReduceLROnPlateau(factor=0.5, patience=3) | Automatically decay the learning rate |
| LearningRateScheduler | LearningRateScheduler(lambda e: 1e-3 * 0.9**e) | Custom learning-rate schedule |

6. Common Patterns

Image Classification (CNN)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),   # preferred over input_shape= in Keras 3
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Data augmentation
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

Text Classification (Embedding + LSTM)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 128),   # input_length was removed in Keras 3
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Transfer Learning

# Use pre-trained MobileNetV2
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,        # remove classification head
    weights='imagenet'
)
base_model.trainable = False  # freeze base

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])

# Fine-tuning: unfreeze last N layers
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

Available models include ResNet50, VGG16, EfficientNet, InceptionV3, and more, all under tf.keras.applications.
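Each model under tf.keras.applications ships a matching preprocess_input; MobileNetV2, for example, expects pixels rescaled to [-1, 1]. A quick sketch:

```python
import numpy as np
import tensorflow as tf

# Raw pixel values in [0, 255]
images = np.random.randint(0, 256, size=(2, 224, 224, 3)).astype('float32')

# MobileNetV2's preprocess_input rescales to [-1, 1]
x = tf.keras.applications.mobilenet_v2.preprocess_input(images)
print(x.min(), x.max())  # both within [-1, 1]
```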

tf.data.Dataset

# From tensors
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.shuffle(1000).batch(32).prefetch(tf.data.AUTOTUNE)

# From directory (images)
dataset = tf.keras.utils.image_dataset_from_directory(
    'data/train/',
    image_size=(224, 224),
    batch_size=32,
    label_mode='categorical'
)

# From CSV
dataset = tf.data.experimental.make_csv_dataset(
    'data.csv', batch_size=32, label_name='target'
)

tf.data provides efficient input pipelines. prefetch(tf.data.AUTOTUNE) automatically overlaps CPU preprocessing with GPU execution.
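A typical pipeline also maps a preprocessing function in parallel; a small self-contained sketch:

```python
import tensorflow as tf

ds = tf.data.Dataset.range(10)
ds = (ds.map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE)  # parallel preprocessing
        .batch(4)
        .prefetch(tf.data.AUTOTUNE))                                # overlap with training

for batch in ds.take(1):
    print(batch.numpy())  # [0 2 4 6]
```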

Saving & Loading Models

# Keras format (recommended; TF 2.16+/Keras 3 requires a .keras or .h5 extension)
model.save('model.keras')
loaded = tf.keras.models.load_model('model.keras')

# Weights only (Keras 3 requires the .weights.h5 suffix)
model.save_weights('model.weights.h5')
model.load_weights('model.weights.h5')

# SavedModel export, e.g. for TF Serving (a directory, not a single file)
tf.saved_model.save(model, 'export/1/')

7. Loss Functions

| Task | Loss | Code |
|---|---|---|
| Binary classification | Binary Crossentropy | loss='binary_crossentropy' |
| Multi-class (integer labels) | Sparse Categorical CE | loss='sparse_categorical_crossentropy' |
| Multi-class (one-hot) | Categorical CE | loss='categorical_crossentropy' |
| Regression | MSE | loss='mse' |
| Regression (robust) | Huber | loss=tf.keras.losses.Huber(delta=1.0) |
| Multi-label | Binary CE (sigmoid) | loss='binary_crossentropy' + sigmoid |
| Contrastive learning | Cosine Similarity | loss=tf.keras.losses.CosineSimilarity() |
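For the multi-label row, a minimal sketch (sizes are illustrative): each class gets an independent sigmoid, and binary cross-entropy is applied per class:

```python
import tensorflow as tf

num_labels = 5  # illustrative
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(num_labels, activation='sigmoid'),  # one sigmoid per label
])
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=[tf.keras.metrics.BinaryAccuracy()])

probs = model(tf.zeros((3, 20)))  # shape (3, 5): independent per-label probabilities
```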

8. Optimizers

| Optimizer | Code | Recommended for |
|---|---|---|
| Adam | tf.keras.optimizers.Adam(learning_rate=1e-3) | General-purpose default |
| SGD | tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9) | Computer vision, careful tuning |
| RMSprop | tf.keras.optimizers.RMSprop(learning_rate=1e-3) | RNNs / sequence models |
| AdamW | tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=0.01) | Transformers / large models |
| Adagrad | tf.keras.optimizers.Adagrad(learning_rate=0.01) | Sparse features / NLP |

9. TensorFlow vs PyTorch

| Feature | TensorFlow / Keras | PyTorch |
|---|---|---|
| Execution mode | Eager + @tf.function graph compilation | Eager + torch.compile |
| High-level API | tf.keras (built-in) | Pairs with Lightning / HuggingFace |
| Deployment | TF Serving, TF Lite, TF.js | TorchServe, ONNX, ExecuTorch |
| Mobile | TF Lite (mature) | ExecuTorch (newer) |
| Browser | TensorFlow.js | ONNX.js / Transformers.js |
| Community trend | Preferred for production deployment | Preferred for research & academia |
| Debugging | Eager mode + TF Debugger | Native Python debugging |
| Distributed training | tf.distribute.Strategy | torch.distributed |

Both frameworks are mature. TensorFlow has a more complete ecosystem for industrial deployment; PyTorch is more popular in research.

10. Deployment

TF Serving

# Save model in SavedModel format
tf.saved_model.save(model, 'export/model/1/')

# Docker: run TF Serving
# docker pull tensorflow/serving
# docker run -p 8501:8501 \
#   --mount type=bind,source=$(pwd)/export/model,target=/models/my_model \
#   -e MODEL_NAME=my_model tensorflow/serving

# REST API prediction
import requests, json
data = json.dumps({"instances": x_test[:3].tolist()})
resp = requests.post('http://localhost:8501/v1/models/my_model:predict',
                     data=data, headers={"Content-Type": "application/json"})
print(resp.json()['predictions'])

TF Lite (Mobile)

# Convert to TFLite
converter = tf.lite.TFLiteConverter.from_saved_model('export/model/1/')
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)

# Inference with TFLite
interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])

TensorFlow.js (Browser)

# Convert to TF.js format
# pip install tensorflowjs
# tensorflowjs_converter --input_format=tf_saved_model \
#   export/model/1/ tfjs_model/

# In browser JavaScript:
# const model = await tf.loadGraphModel('tfjs_model/model.json');
# const input = tf.tensor2d([[...features...]]);
# const prediction = model.predict(input);
# prediction.print();

11. FAQ

What are the main differences between TensorFlow 2.x and 1.x?

TF 2.x uses Eager Execution by default, removes tf.Session, adopts Keras as the official high-level API, and provides graph compilation via @tf.function. The API is leaner and more Pythonic.

When should I use Sequential vs Functional vs Subclassing?

Sequential suits simple linear stacks; the Functional API suits complex models with multiple inputs/outputs or shared layers; subclassing suits research scenarios that need custom forward logic (e.g. GANs, custom attention).

How do I fix GPU out-of-memory (OOM) errors?

1) Enable memory growth: tf.config.experimental.set_memory_growth(gpu, True); 2) reduce batch_size; 3) use mixed-precision training via tf.keras.mixed_precision; 4) use gradient accumulation.
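A minimal mixed-precision sketch (point 3 above); the float16 speedup only materializes on supporting GPUs/TPUs, but the API works anywhere:

```python
import tensorflow as tf

tf.keras.mixed_precision.set_global_policy('mixed_float16')

dense = tf.keras.layers.Dense(8)
y = dense(tf.zeros((2, 4)))
print(y.dtype)             # float16: activations use half precision
print(dense.kernel.dtype)  # float32: weights stay full precision for stability

tf.keras.mixed_precision.set_global_policy('float32')  # restore default
```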

Is TensorFlow suitable for production deployment?

Yes. TensorFlow has the most complete deployment ecosystem: TF Serving (server-side), TF Lite (mobile/embedded), and TF.js (browser). Google uses TensorFlow internally at large scale to serve billions of user requests.

How do I choose between TensorFlow and PyTorch?

Choose TensorFlow if you value deployment convenience, mobile support, or in-browser execution. Choose PyTorch for academic research, rapid prototyping, or a more flexible debugging experience. Both handle most deep-learning tasks well.