Compute shaders: this example shows creating a buffer, filling it with data, and then reading the data back from the buffer to verify the transfer from CPU to GPU and then from GPU back to CPU.

```js
if (!navigator.gpu) {
  console.log("WebGPU is not supported, or it is disabled in the browser's flags/settings.");
  return;
}

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

// Get a GPU buffer in a mapped state and an arrayBuffer for writing.
const gpuWriteBuffer = device.createBuffer({
  mappedAtCreation: true,
  size: 4,
  usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC
});
const arrayBuffer = gpuWriteBuffer.getMappedRange();

// Write bytes to the buffer.
new Uint8Array(arrayBuffer).set([0, 1, 2, 3]);

// Unmap the buffer so that it can be used later for the copy.
gpuWriteBuffer.unmap();

// Get a GPU buffer for reading in an unmapped state.
const gpuReadBuffer = device.createBuffer({
  mappedAtCreation: false,
  size: 4,
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ
});

// Encode commands for copying buffer to buffer.
const copyEncoder = device.createCommandEncoder();
copyEncoder.copyBufferToBuffer(
  gpuWriteBuffer /* source buffer */,
  0 /* source offset */,
  gpuReadBuffer /* destination buffer */,
  0 /* destination offset */,
  4 /* size */
);

// Submit copy commands.
const copyCommands = copyEncoder.finish();
device.queue.submit([copyCommands]);

// Read buffer.
await gpuReadBuffer.mapAsync(GPUMapMode.READ);
const copyArrayBuffer = gpuReadBuffer.getMappedRange();
console.log(new Uint8Array(copyArrayBuffer));
```

Details on the WGSL compute shader: you call `passEncoder.dispatchWorkgroups(x, y, z)`, which launches a grid of workgroups whose coordinates run from 0 up to (but not including) the values you pass to the dispatch call (for example, x takes the values 0, 1, 2, 3, ... up to the maximum you passed in, and likewise for y and z). Inside the shader, each invocation reads its position in that grid through `global_id.x`, `global_id.y`, and `global_id.z`. You don't want a single compute shader invocation doing all the work; instead, each invocation uses its global ID to look up the part of the calculation it is responsible for, so the workload is shared across the GPU. Since every invocation has access to the same input matrices and the same output buffer, you only have to keep track of which invocations perform which operations.
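As a minimal sketch of that mapping (illustrative only, not part of the matrix example below; the storage buffer `data` and the element count `n` are assumptions), a 1D dispatch of `Math.ceil(n / 64)` workgroups of size 64 gives each invocation exactly one array element, selected by `global_id.x`:

```js
// Illustrative only: a 1D "double every element" shader, embedded as a WGSL string
// the same way the matrix example below embeds its shader. `data` is a hypothetical
// storage buffer bound at @group(0) @binding(0).
const doubleShader = `
  @group(0) @binding(0) var<storage, read_write> data : array<f32>;

  @compute @workgroup_size(64)
  fn main(@builtin(global_invocation_id) global_id : vec3<u32>) {
    // The dispatch may launch more invocations than there are elements, so guard.
    if (global_id.x >= arrayLength(&data)) {
      return;
    }
    // Each invocation handles the one element selected by its global ID.
    data[global_id.x] = data[global_id.x] * 2.0;
  }
`;

// With n elements, dispatch enough 64-wide workgroups to cover them all:
// passEncoder.dispatchWorkgroups(Math.ceil(n / 64));
```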
Next, create a compute pipeline that will do the work. Simple matrix multiplication using WebGPU:

```js
if (!navigator.gpu) {
  console.log("WebGPU is not supported, or it is disabled in the browser's flags/settings.");
  return;
}

const adapter = await navigator.gpu.requestAdapter();
const device = await adapter.requestDevice();

// First matrix: 2 rows x 4 columns, preceded by its dimensions.
const firstMatrix = new Float32Array([
  2 /* rows */, 4 /* columns */,
  1, 2, 3, 4,
  5, 6, 7, 8
]);
console.log('matrix a', firstMatrix.slice(2));

const bufferFirstMatrix = device.createBuffer({
  mappedAtCreation: true,
  size: firstMatrix.byteLength,
  usage: GPUBufferUsage.STORAGE
});
const arrayBufferFirstMatrix = bufferFirstMatrix.getMappedRange();
new Float32Array(arrayBufferFirstMatrix).set(firstMatrix);
bufferFirstMatrix.unmap();

// Second matrix: 4 rows x 2 columns, preceded by its dimensions.
const secondMatrix = new Float32Array([
  4 /* rows */, 2 /* columns */,
  1, 2,
  3, 4,
  5, 6,
  7, 8
]);
console.log('matrix b', secondMatrix.slice(2));

const bufferSecondMatrix = device.createBuffer({
  mappedAtCreation: true,
  size: secondMatrix.byteLength,
  usage: GPUBufferUsage.STORAGE
});
const arrayBufferSecondMatrix = bufferSecondMatrix.getMappedRange();
new Float32Array(arrayBufferSecondMatrix).set(secondMatrix);
bufferSecondMatrix.unmap();

// Result matrix: 2-float dimensions header plus (rows of A) x (columns of B) values.
const resultMatrixBufferSize =
  Float32Array.BYTES_PER_ELEMENT * (2 + firstMatrix[0] * secondMatrix[1]);
const resultMatrixBuffer = device.createBuffer({
  size: resultMatrixBufferSize,
  usage: GPUBufferUsage.STORAGE | GPUBufferUsage.COPY_SRC
});

// Bind group layout and bind group.
const bindGroupLayout = device.createBindGroupLayout({
  entries: [
    {
      binding: 0,
      visibility: GPUShaderStage.COMPUTE,
      buffer: { type: "read-only-storage" }
    },
    {
      binding: 1,
      visibility: GPUShaderStage.COMPUTE,
      buffer: { type: "read-only-storage" }
    },
    {
      binding: 2,
      visibility: GPUShaderStage.COMPUTE,
      buffer: { type: "storage" }
    }
  ]
});

const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,
  entries: [
    { binding: 0, resource: { buffer: bufferFirstMatrix } },
    { binding: 1, resource: { buffer: bufferSecondMatrix } },
    { binding: 2, resource: { buffer: resultMatrixBuffer } }
  ]
});

// Compute shader code.
const computeShader = `
  struct Matrix {
    size : vec2<f32>,
    numbers : array<f32>,
  }

  @group(0) @binding(0) var<storage, read> firstMatrix : Matrix;
  @group(0) @binding(1) var<storage, read> secondMatrix : Matrix;
  @group(0) @binding(2) var<storage, read_write> resultMatrix : Matrix;

  @compute @workgroup_size(8, 8)
  fn main(@builtin(global_invocation_id) global_id : vec3<u32>) {
    // Guard against out-of-bounds workgroup sizes.
    if (global_id.x >= u32(firstMatrix.size.x) || global_id.y >= u32(secondMatrix.size.y)) {
      return;
    }

    resultMatrix.size = vec2<f32>(firstMatrix.size.x, secondMatrix.size.y);

    let resultCell = vec2<u32>(global_id.x, global_id.y);
    var result = 0.0;
    for (var i = 0u; i < u32(firstMatrix.size.y); i = i + 1u) {
      let a = i + resultCell.x * u32(firstMatrix.size.y);
      let b = resultCell.y + i * u32(secondMatrix.size.y);
      result = result + firstMatrix.numbers[a] * secondMatrix.numbers[b];
    }

    let index = resultCell.y + resultCell.x * u32(secondMatrix.size.y);
    resultMatrix.numbers[index] = result;
  }
`;

// Pipeline setup.
const computePipeline = device.createComputePipeline({
  layout: device.createPipelineLayout({
    bindGroupLayouts: [bindGroupLayout]
  }),
  compute: {
    module: device.createShaderModule({
      code: computeShader
    }),
    entryPoint: "main"
  }
});

// Commands submission.
const commandEncoder = device.createCommandEncoder();

const passEncoder = commandEncoder.beginComputePass();
passEncoder.setPipeline(computePipeline);
passEncoder.setBindGroup(0, bindGroup);
const x = Math.ceil(firstMatrix[0] / 8); // X dimension of the grid of workgroups to dispatch.
const y = Math.ceil(secondMatrix[1] / 8); // Y dimension of the grid of workgroups to dispatch.
passEncoder.dispatchWorkgroups(x, y);
passEncoder.end();

// Get a GPU buffer for reading in an unmapped state.
const readBuffer = device.createBuffer({
  size: resultMatrixBufferSize,
  usage: GPUBufferUsage.COPY_DST | GPUBufferUsage.MAP_READ
});

// Encode commands for copying buffer to buffer.
commandEncoder.copyBufferToBuffer(
  resultMatrixBuffer /* source buffer */,
  0 /* source offset */,
  readBuffer /* destination buffer */,
  0 /* destination offset */,
  resultMatrixBufferSize /* size */
);

// Submit GPU commands.
const gpuCommands = commandEncoder.finish();
device.queue.submit([gpuCommands]);

// Read buffer.
await readBuffer.mapAsync(GPUMapMode.READ);
const arrayBuffer = readBuffer.getMappedRange();
console.log('result a x b', new Float32Array(arrayBuffer));
```
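The logged result is a flat `Float32Array` whose first two values are the output dimensions and whose remaining values are the product in row-major order. A small helper like the one below (hypothetical; `toRows` is not part of the example above) reshapes it into rows for easier checking:

```js
// Hypothetical helper: split the flat result (2-float size header + values) into rows.
function toRows(resultArray) {
  const rows = resultArray[0];
  const cols = resultArray[1];
  const out = [];
  for (let r = 0; r < rows; r++) {
    out.push(Array.from(resultArray.slice(2 + r * cols, 2 + (r + 1) * cols)));
  }
  return out;
}

// For the 2x4 and 4x2 matrices above this logs [[50, 60], [114, 140]].
console.log(toRows(new Float32Array(arrayBuffer)));
```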