aaudio: We PlayDevice first and WaitDevice after; reduce semaphore count by 1.
Previously, we would WaitDevice first, but that would feed a silent buffer
to AAudio upfront, introducing latency. When this change was made, the
semaphore count should have been adjusted, since we're waiting on one less
buffer.
Fixes #12882.