There are a few ways to do this on a Teensy.
The first, and simplest, is to write a whole port:
https://forum.pjrc.com/threads/17532-Tu ... #post21228
This way you can set 8 pins at the same time. The write itself executes extremely fast (within a microsecond), so doing it back to back for 2 different ports should not be an issue. Just make sure the clock is always the last thing you write, so all the data is already in place when the cartridge reads it.
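To show the ordering, here is a rough sketch. It is not Teensy code as written: dataPort and clockPort are ordinary variables standing in for the real port output registers, and CLOCK_BIT and writeByteThenClock are names made up for illustration. I am also assuming the clock sits on bit 0 of the second port and is latched on the rising edge. On the actual board you would replace the two variables with the port registers described in the linked thread.

```cpp
#include <cstdint>

// Stand-ins for two memory-mapped 8-bit output port registers.
// Assumption: on real hardware these would be the chip's port registers.
static volatile uint8_t dataPort = 0;
static volatile uint8_t clockPort = 0;

// Assumption: the clock line is bit 0 of the second port.
constexpr uint8_t CLOCK_BIT = 0x01;

// Put one byte on the data port, then pulse the clock.
inline void writeByteThenClock(uint8_t value) {
    // One store changes all 8 data pins at the same time.
    dataPort = value;
    // The clock goes high only after the data is already on the pins.
    clockPort = static_cast<uint8_t>(clockPort | CLOCK_BIT);
    // Bring the clock back low, ready for the next byte.
    clockPort = static_cast<uint8_t>(clockPort & ~CLOCK_BIT);
}
```

Calling writeByteThenClock(0xA5) puts 0xA5 on the data pins first and only then toggles the clock, which is the order that matters here.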
Another way, better but much more difficult, is DMA, which is not sensitive to interrupts and other external problems. It still only writes one port at a time, but the delay between the 2 writes is even smaller (<500ns). I do have a program that uses DMA on a Teensy that I can share, but it will not be that simple to use, so if the above is adequate, I would try that first.