SPO600 - Project Stage 1

This blog post is for the SPO600 class I am taking at Seneca College. It covers stage 1 of our project, detailed here.

To briefly explain the objective: we are looking for an open-source software package that uses "Single Instruction, Multiple Data" (SIMD) processing but does not yet include SVE/SVE2 optimizations. Once we have selected a package, the next stages are to implement SVE2 optimizations where applicable and to work towards getting those optimizations officially added to the package.

My Selection

I have selected Realtime Math as my subject for these optimizations. This package initially jumped out at me due to its "game-development" tag, as gaming is a great interest of mine. After investigating the codebase, I believe it is a strong candidate: the package performs SIMD operations and already supports the ARMv7-A and ARM64 architectures. Adding SVE2 support to prepare for ARMv9 seems like a natural next step for this package.

The Strategy

We have been given three options for implementing SVE2 optimizations: autovectorization, inline assembler, or SVE2 intrinsics. After searching the codebase for a good jumping-off point, my strategy is clear. Where the package implements its SIMD operations, it already uses ARM NEON and ARM64 NEON intrinsics. This makes my decision very simple: I will follow the existing pattern, use SVE2 intrinsics, and see how that goes!
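To give a feel for the intrinsics approach, here is a minimal sketch of an element-wise add. It is not code from Realtime Math: `add_arrays` is a hypothetical helper I made up for illustration. The vector path uses base SVE intrinsics from `arm_sve.h` (these are also available on SVE2 hardware) and is guarded by the `__ARM_FEATURE_SVE` macro, which I am assuming the compiler defines when built with something like `-march=armv8-a+sve2`. On any other target the scalar fallback is compiled instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

// Hypothetical helper (not part of Realtime Math): out[i] = a[i] + b[i].
inline void add_arrays(const float* a, const float* b, float* out, std::size_t count)
{
    assert(count == 0 || (a != nullptr && b != nullptr && out != nullptr));

#if defined(__ARM_FEATURE_SVE)
    // Vector-length-agnostic loop: svcntw() is the number of 32-bit lanes
    // per vector, which is only known at run time.
    const std::uint64_t n = count;
    for (std::uint64_t i = 0; i < n; i += svcntw())
    {
        // The predicate enables only the lanes where i + lane < n,
        // so no separate scalar loop is needed for the tail.
        const svbool_t pg = svwhilelt_b32_u64(i, n);
        const svfloat32_t va = svld1_f32(pg, a + i);
        const svfloat32_t vb = svld1_f32(pg, b + i);
        svst1_f32(pg, out + i, svadd_f32_x(pg, va, vb));
    }
#else
    // Scalar fallback for targets without SVE.
    for (std::size_t i = 0; i < count; ++i)
    {
        out[i] = a[i] + b[i];
    }
#endif
}
```

The main difference from fixed-width NEON code is that nothing here assumes four floats per vector. The loop steps by whatever width the hardware reports, and the predicate handles the leftover elements.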

This post is merely an introduction to my subject for these optimizations and a general idea of my plan of action. In my next post on the topic I will go into more detail on where specifically these changes can be made, and what extra work will go into integrating these optimizations smoothly and safely.
