Metaprogramming

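Taichi provides infrastructure for metaprogramming. Metaprogramming can

  • Unify the development of dimensionality-dependent code, such as 2D/3D physical simulations
  • Improve run-time performance by moving run-time costs to compile time
  • Simplify the development of the Taichi standard library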

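Taichi kernels are lazily instantiated, and a lot of computation can happen at compile time. Every kernel in Taichi is a template kernel, even if it has no template arguments.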
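
To make lazy instantiation concrete, here is a minimal plain-Python sketch. It is not Taichi's actual implementation: the names LazyTemplateKernel, build_norm_squared and compile_count are hypothetical, and an integer dim stands in for a template argument. It only illustrates the idea that a specialization is built on first use and reused afterwards.

```python
class LazyTemplateKernel:
    """Toy model: build one specialization per template argument, on first use."""

    def __init__(self, build):
        self._build = build        # function that "compiles" a specialization
        self._instances = {}       # cache: template argument -> specialized callable
        self.compile_count = 0     # how many specializations were built so far

    def __call__(self, template_arg, *runtime_args):
        if template_arg not in self._instances:
            # First call with this template argument: instantiate now.
            self.compile_count += 1
            self._instances[template_arg] = self._build(template_arg)
        return self._instances[template_arg](*runtime_args)


def build_norm_squared(dim):
    # "Compile time": dim is known here, so the index set is fixed once.
    indices = tuple(range(dim))

    def specialized(v):
        # "Run time": only the arithmetic remains.
        return sum(v[i] * v[i] for i in indices)

    return specialized


norm_squared = LazyTemplateKernel(build_norm_squared)
r1 = norm_squared(2, [3.0, 4.0])       # builds the 2D specialization
r2 = norm_squared(2, [1.0, 2.0])       # reuses the cached 2D specialization
r3 = norm_squared(3, [1.0, 2.0, 2.0])  # builds the 3D specialization
```

In this toy model, three calls trigger only two instantiations, because the second call reuses the specialization built by the first.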

Template metaprogramming

You may use ti.template() as a type hint to pass a tensor as an argument. For example:

@ti.kernel
def copy(x: ti.template(), y: ti.template()):
    for i in x:
        y[i] = x[i]

a = ti.var(ti.f32, 4)
b = ti.var(ti.f32, 4)
c = ti.var(ti.f32, 12)
d = ti.var(ti.f32, 12)
copy(a, b)
copy(c, d)

As the example shows, a single template kernel can be reused across tensors: the same copy works for both the 4-element pair (a, b) and the 12-element pair (c, d).

Dimensionality-independent programming using grouped indices

However, the copy template shown above is not perfect. For example, it can only be used to copy 1D tensors. What if we want to copy 2D tensors? Do we have to write another kernel?

@ti.kernel
def copy2d(x: ti.template(), y: ti.template()):
    for i, j in x:
        y[i, j] = x[i, j]

Not necessarily! Taichi provides the ti.grouped syntax, which lets you pack loop indices into a grouped vector to unify kernels of different dimensionalities. For example:

@ti.kernel
def copy(x: ti.template(), y: ti.template()):
    for I in ti.grouped(x):
        # I is a vector with the same dimensionality as x and data type i32
        # If x is 0D, then I = ti.Vector([]), which is equivalent to `None` when used in x[I]
        # If x is 1D, then I = ti.Vector([i])
        # If x is 2D, then I = ti.Vector([i, j])
        # If x is 3D, then I = ti.Vector([i, j, k])
        # ...
        y[I] = x[I]  # copy x into y, as in the 1D and 2D kernels above

@ti.kernel
def array_op(x: ti.template(), y: ti.template()):
    # if tensor x is 2D:
    for I in ti.grouped(x): # I is simply a 2D vector with data type i32
        y[I + ti.Vector([0, 1])] = I[0] + I[1]

    # then it is equivalent to:
    for i, j in x:
        y[i, j + 1] = i + j
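
The idea behind grouped indices can also be sketched in plain Python, outside Taichi. This is only an analogy under stated assumptions: grouped here is a hypothetical helper built on itertools.product, and dictionaries keyed by index tuples stand in for tensors. It shows how one loop over packed indices can serve any dimensionality.

```python
import itertools


def grouped(shape):
    """Yield every index of an N-D grid as a single tuple, for any N."""
    return itertools.product(*(range(n) for n in shape))


def copy(src, dst, shape):
    # src and dst are dicts keyed by index tuples (a stand-in for tensors).
    for I in grouped(shape):
        dst[I] = src[I]


# The same copy function handles 1D and 2D data.
src_1d = {(i,): float(i) for i in range(4)}
src_2d = {(i, j): float(i * 3 + j) for i in range(2) for j in range(3)}
dst_1d, dst_2d = {}, {}
copy(src_1d, dst_1d, (4,))
copy(src_2d, dst_2d, (2, 3))
```

With an empty shape, this grouped yields exactly one empty tuple, which loosely mirrors the 0D case described in the kernel comments.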

Tensor metadata

Sometimes it is useful to get the data type (tensor.dtype) and shape (tensor.shape) of tensors. These attributes can be accessed in both Taichi- and Python-scopes.

@ti.func
def print_tensor_info(x: ti.template()):
  print('Tensor dimensionality is', len(x.shape))
  for i in ti.static(range(len(x.shape))):
    print('Size along dimension', i, 'is', x.shape[i])
  ti.static_print('Tensor data type is', x.dtype)

See Tensors of scalars for more details.

Note

For sparse tensors, the full domain shape will be returned.

Matrix & vector metadata

Getting the number of matrix columns and rows will allow you to write dimensionality-independent code. For example, this can be used to unify 2D and 3D physical simulators.

matrix.m is the number of columns of a matrix, while matrix.n is the number of rows. Since a vector is treated as a matrix with one column, vector.n is simply the dimensionality of the vector.

@ti.kernel
def foo():
  matrix = ti.Matrix([[1, 2], [3, 4], [5, 6]])
  print(matrix.n)  # 3 (rows)
  print(matrix.m)  # 2 (columns)
  vector = ti.Vector([7, 8, 9])
  print(vector.n)  # 3
  print(vector.m)  # 1
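
The row/column convention can be checked with a small plain-Python sketch. The helpers matrix_shape and as_column are hypothetical and are not part of Taichi; nested lists stand in for matrices.

```python
def matrix_shape(rows):
    """Return (n, m) = (number of rows, number of columns) of a nested list."""
    n = len(rows)
    m = len(rows[0]) if n > 0 else 0
    return n, m


def as_column(values):
    """Represent a vector as an n-by-1 matrix, following the convention above."""
    return [[v] for v in values]


matrix_nm = matrix_shape([[1, 2], [3, 4], [5, 6]])  # three rows, two columns
vector_nm = matrix_shape(as_column([7, 8, 9]))      # three rows, one column
```

Here matrix_nm is (3, 2) and vector_nm is (3, 1), matching the description of n as the row count and m as the column count.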

Compile-time evaluations

Using compile-time evaluation will allow certain computations to happen when kernels are being instantiated. This saves the overhead of those computations at runtime.

  • Use ti.static for compile-time branching (for those coming from C++17, this is the equivalent of if constexpr):
enable_projection = True

@ti.kernel
def static():
  if ti.static(enable_projection): # No runtime overhead
    x[0] = 1
  • Use ti.static for forced loop unrolling:
@ti.kernel
def func():
  for i in ti.static(range(4)):
    print(i)

  # is equivalent to:
  print(0)
  print(1)
  print(2)
  print(3)
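
As a rough plain-Python analogy for compile-time branching (not how Taichi implements ti.static), the flag can be examined once when a function is built, so the resulting function contains no check at all. The name instantiate is hypothetical.

```python
def instantiate(enable_projection):
    """Build a specialized function; the flag is examined once, here."""
    if enable_projection:
        def kernel(x):
            x[0] = 1   # branch kept: the write is part of the kernel
    else:
        def kernel(x):
            pass       # branch dropped: no flag check remains at call time
    return kernel


with_projection = instantiate(True)
without_projection = instantiate(False)

a = [0, 0]
b = [0, 0]
with_projection(a)      # writes a[0]
without_projection(b)   # leaves b untouched
```

Each returned function is fixed at build time: calling it never re-reads the flag, which is the property the ti.static branch provides inside a kernel.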

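When to use for loops with ti.static

There are several reasons to use ti.static with for loops.

  • Loop unrolling for performance.
  • Looping over vector/matrix elements. Indices into matrices must be compile-time constants, while indices into tensors can be run-time variables. For example, if x is a 1D tensor of 3D vectors accessed as x[tensor_index][matrix_index], the first index (tensor_index) can be a variable, but the second index (matrix_index) must be a constant.

For example, code for resetting this tensor of vectors should be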

@ti.kernel
def reset():
  for i in x:
    for j in ti.static(range(x.n)):
      # The inner loop must be unrolled since j is a vector index instead
      # of a global tensor index.
      x[i][j] = 0