# Global Tensor and Internship Summary | OneFlow Study Notes


## Global Tensor

### 2.1 The Basic Guarantee of OneFlow's Distributed Global View

```python
>>> placement1 = flow.placement("cuda", ranks=[0, 1, 2, 3])  # 1D SBP: a flat cluster of 4 GPUs
>>> placement2 = flow.placement("cuda", ranks=[[0, 1], [2, 3]])  # 2D SBP: a 2x2 device grid
```

```python
>>> sbp = (flow.sbp.broadcast, flow.sbp.split(0))
>>> tensor_to_global = tensor.to_global(placement=placement, sbp=sbp)
```
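For intuition, here is a minimal NumPy sketch (not the OneFlow API) of how a 2D SBP signature `(broadcast, split(0))` lays a tensor out on the 2x2 rank grid `[[0, 1], [2, 3]]`: along the first placement axis every row of ranks holds a full copy, while along the second axis the tensor is chunked on dim 0. The array contents are made up purely for illustration.

```python
import numpy as np

# Hypothetical global 4x4 tensor placed on the 2x2 grid
# ranks=[[0, 1], [2, 3]] with sbp=(broadcast, split(0)).
global_tensor = np.arange(16).reshape(4, 4)
rank_grid = [[0, 1], [2, 3]]

local_shards = {}
for i, row in enumerate(rank_grid):      # placement axis 0: broadcast
    for j, rank in enumerate(row):       # placement axis 1: split(0)
        # broadcast -> every grid row sees the same data;
        # split(0)  -> dim 0 is chunked across the ranks of a row.
        local_shards[rank] = np.split(global_tensor, len(row), axis=0)[j]

for rank in sorted(local_shards):
    print(rank, local_shards[rank].shape)
```

Note that ranks 0 and 2 end up with identical shards (the broadcast axis), while ranks 0 and 1 together reassemble the full tensor (the split axis).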

### 2.2 Automatic SBP Conversion

The Boxing mechanism has different strategies, such as all2all, broadcast, reduce-scatter, all-reduce, and all-gather, and each one incurs a different communication cost. Converting the `split(0)` tensor above to broadcast is equivalent to performing an all-gather. This is quite intuitive, and a more detailed explanation, including how the communication cost of each operation is calculated, can be found in Section 3.2 of the OneFlow paper (https://arxiv.org/pdf/2110.15...).
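To make the `split(0)` → broadcast case concrete, here is a pure-NumPy sketch that simulates what Boxing's all-gather achieves (the rank handling is illustrative, not the OneFlow or NCCL API): each rank starts with one chunk of dim 0, and after gathering every rank holds the full tensor, which is exactly the broadcast state.

```python
import numpy as np

world_size = 4
global_tensor = np.arange(8 * 4).reshape(8, 4)

# sbp=split(0): each rank holds one contiguous chunk of dim 0.
local_chunks = np.split(global_tensor, world_size, axis=0)

def all_gather(chunks):
    # Every rank exchanges its chunk with all others and concatenates;
    # afterwards each rank holds the full tensor, i.e. sbp=broadcast.
    gathered = np.concatenate(chunks, axis=0)
    return [gathered.copy() for _ in range(world_size)]

broadcast_copies = all_gather(local_chunks)
print([t.shape for t in broadcast_copies])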

### 2.3 The to_global Method

```python
import oneflow as flow

P0 = flow.placement("cuda", ranks=[0, 1])
P1 = flow.placement("cuda", ranks=[2, 3])
a0_sbp = flow.sbp.split(0)
b0_sbp = flow.sbp.broadcast  # B0 is broadcast so every rank can compute its matmul shard
y0_sbp = flow.sbp.broadcast
b1_sbp = flow.sbp.split(1)

# Stage 1 runs on P0: A0 is split along dim 0, B0 is broadcast.
A0 = flow.randn(4, 5, placement=P0, sbp=a0_sbp)
B0 = flow.randn(5, 8, placement=P0, sbp=b0_sbp)
Y0 = flow.matmul(A0, B0)

# Move the intermediate result to P1, then run stage 2 there.
Y0 = Y0.to_global(placement=P1, sbp=y0_sbp)
B1 = flow.randn(8, 6, placement=P1, sbp=b1_sbp)
Y2 = flow.matmul(Y0, B1)
```
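A quick NumPy shape check of the two-stage pipeline above, simulating `split(0)` over the two ranks of P0 with plain arrays standing in for global tensors (the sharding here is a hand-rolled illustration, not OneFlow semantics):

```python
import numpy as np

rng = np.random.default_rng(0)
A0 = rng.standard_normal((4, 5))
B0 = rng.standard_normal((5, 8))
B1 = rng.standard_normal((8, 6))

# With sbp=split(0), each of the two ranks in P0 holds a (2, 5) shard of A0
# and a full broadcast copy of B0, so local matmuls yield (2, 8) shards of Y0.
shards = [shard @ B0 for shard in np.split(A0, 2, axis=0)]
Y0 = np.concatenate(shards, axis=0)  # logical (4, 8) result
assert np.allclose(Y0, A0 @ B0)      # sharded compute matches the global matmul

Y2 = Y0 @ B1                         # stage 2 on P1: (4, 8) @ (8, 6) -> (4, 6)
print(Y2.shape)
```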

### 2.4 Tracing the GlobalTensor Class Code

OneFlow's Tensor design resembles the bridge pattern: the Tensor base class plays the abstraction role, TensorIf (a subclass of Tensor) serves as the implementor interface, and GlobalTensor and MirroredTensor are the concrete implementors of that interface, as shown in the figure below.
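A minimal Python sketch of that bridge-style layering (the class names follow the text, but the method and constructor signatures are simplified illustrations, not OneFlow's actual declarations):

```python
from abc import ABC, abstractmethod

class Tensor(ABC):
    # Abstraction role: the interface every tensor kind exposes.
    @abstractmethod
    def device_info(self) -> str: ...

class TensorIf(Tensor):
    # Implementor interface: bridges Tensor to concrete tensor kinds.
    pass

class MirroredTensor(TensorIf):
    # Concrete implementor: a plain per-device (local) tensor.
    def device_info(self) -> str:
        return "local tensor on a single device"

class GlobalTensor(TensorIf):
    # Concrete implementor: a tensor carrying placement + SBP semantics.
    def __init__(self, ranks, sbp):
        self.ranks, self.sbp = ranks, sbp

    def device_info(self) -> str:
        return f"global tensor on ranks {self.ranks} with sbp {self.sbp}"

t: Tensor = GlobalTensor(ranks=[0, 1], sbp="split(0)")
print(t.device_info())
```

Callers only ever program against `Tensor`, which is what lets global and local tensors share one user-facing API.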

### 2.5 How to Run Execution Tests for Global Ops

```python
# Imports assumed for a standalone test file; OneFlow's own tests pull the
# autotest helpers in via oneflow.test_utils.automated_test_util.
import unittest
import oneflow as flow
import oneflow.unittest
from oneflow.test_utils.automated_test_util import *


@autotest(n=1, check_graph=False)
def _test_matmul(test_case, placement, x_sbp, y_sbp):
    x = random_tensor(ndim=2, dim0=8, dim1=16).to_global(placement=placement, sbp=x_sbp)
    y = random_tensor(ndim=2, dim0=16, dim1=8).to_global(placement=placement, sbp=y_sbp)
    # autotest compares the returned result against the PyTorch reference.
    return torch.matmul(x, y)


class TestMatMulModule(flow.unittest.TestCase):
    @globaltest
    def test_matmul(test_case):
        for placement in all_placement():
            for x_sbp in all_sbp(placement, max_dim=2):
                for y_sbp in all_sbp(placement, max_dim=2):
                    _test_matmul(test_case, placement, x_sbp, y_sbp)


if __name__ == "__main__":
    unittest.main()
```


## Summary

https://github.com/Oneflow-In...