Some more work on upcycling second-hand PS2s into cheap fluxus machines. The next step is to embrace the vector units: strange little processors for doing things with points, lines and colours extremely fast.
This is quite a daunting task for various reasons. Not only can you run floating point and integer calculations in parallel, but the VUs also have instruction pipelining, where a calculation can take a number of instructions to finish. The advantage is that you can interleave other work while you wait for results to complete, and so make extremely fast programs, but it takes some time to get your head around.
Luckily it’s made a whole lot easier by a free software tool called OpenVCL. This lets you write fairly straightforward assembler, and it at least makes sure the code should run correctly by spacing the instructions out for you. Future versions may also start optimising the code automatically.
This is my first very basic renderer, which simply applies the world->screen transform to each vertex in an object. Its job is to load data put in its memory by the CPU, process it, and send it to the “graphics synthesizer” to draw the Gouraud shaded triangles. It’s not only much faster than doing the same job on the CPU, it also leaves the CPU free for other processing (such as running a Scheme interpreter).
    .syntax new
    .name vu1_unlit
    .vu
    .init_vf_all
    .init_vi_all

    --enter
    --endenter

        ; load the matrix row by row
        lq      world_screen_row0, 0(vi00)
        lq      world_screen_row1, 1(vi00)
        lq      world_screen_row2, 2(vi00)
        lq      world_screen_row3, 3(vi00)

        ; load the params and set the addresses for
        ; the giftag and vertex data
        lq      params, 4(vi00)
        iaddiu  giftag_addr, vi00, 5
        iaddiu  vertex_data, vi00, 6

        ; move the vertex count to an integer
        ; register so we can loop over it
        mtir    vertex_index, params[x]

    vertex_loop:
        ; load the colour (just increments vertex_data)
        lqi     colour, (vertex_data++)
        ; load vertex position
        lq      vertex, 0(vertex_data)

        ; apply the transformation
        mul     acc, world_screen_row0, vertex[x]
        madd    acc, world_screen_row1, vertex[y]
        madd    acc, world_screen_row2, vertex[z]
        madd    vertex, world_screen_row3, vertex[w]
        div     q, vf00[w], vertex[w]
        mul.xyz vertex, vertex, q

        ; convert to fixed point
        ftoi4   vertex, vertex

        ; overwrite the old vertex with the transformed one
        sqi     vertex, (vertex_data++)

        ; decrement and loop
        iaddi   vertex_index, vertex_index, -1
        ibne    vertex_index, vi00, vertex_loop

        ; send to gs
        xgkick  giftag_addr

    --exit
    --endexit
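For comparison, here is a rough sketch in C of what the body of vertex_loop does to a single vertex. It is a reading aid rather than code from the renderer: the names vec4, ivec4 and project_vertex are made up for illustration, and it assumes that ftoi4 scales by 16 (4 fractional bits) and truncates toward zero.

```c
#include <stdint.h>

/* Hypothetical stand-ins for one 128-bit VU register. */
typedef struct { float x, y, z, w; } vec4;
typedef struct { int32_t x, y, z, w; } ivec4;

/* Mirrors the mul/madd chain: each matrix row is scaled by one
   component of the vertex and the four results are summed. */
static vec4 transform(const vec4 m[4], vec4 v)
{
    vec4 r;
    r.x = m[0].x * v.x + m[1].x * v.y + m[2].x * v.z + m[3].x * v.w;
    r.y = m[0].y * v.x + m[1].y * v.y + m[2].y * v.z + m[3].y * v.w;
    r.z = m[0].z * v.x + m[1].z * v.y + m[2].z * v.z + m[3].z * v.w;
    r.w = m[0].w * v.x + m[1].w * v.y + m[2].w * v.z + m[3].w * v.w;
    return r;
}

/* Transform, perspective divide, then convert to fixed point. */
static ivec4 project_vertex(const vec4 m[4], vec4 v)
{
    vec4 t = transform(m, v);

    /* div q, vf00[w], vertex[w]: vf00[w] is the constant 1.0 */
    float q = 1.0f / t.w;

    /* mul.xyz only touches x, y and z; w is left alone */
    t.x *= q;
    t.y *= q;
    t.z *= q;

    /* ftoi4 on all four fields (assumed: multiply by 16, truncate) */
    ivec4 out;
    out.x = (int32_t)(t.x * 16.0f);
    out.y = (int32_t)(t.y * 16.0f);
    out.z = (int32_t)(t.z * 16.0f);
    out.w = (int32_t)(t.w * 16.0f);
    return out;
}
```

The CPU version has to do these steps one vertex at a time, whereas the VU does each four-component multiply-add in a single instruction and can overlap the divide with other work.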