Cuda Release News <360p>

Instead of writing raw PTX or using third-party wrappers, you can now write:

import cuda @cuda.kernel def vec_add(a, b, c): idx = cuda.thread_idx.x + cuda.block_idx.x * cuda.block_dim.x if idx < a.size: c[idx] = a[idx] + b[idx] vec_add[blocks, threads](a, b, c) cuda release news

find_package(CUDA REQUIRED) cuda_add_executable(myapp main.cu) New way (CUDA 13+): Instead of writing raw PTX or using third-party