That would be an internal limitation of the system beep function in kernel32.dll. I don't think it lets you have more than one call to the function running at a time. I had thought of doing this myself early on, out of curiosity: I was trying to play chords, and I had several textboxes playing at the same time on multiple threads, but it still seemed to play them in sequential order. Perhaps the next thing to try would be parallel processing across CPU cores, by fooling around with affinity. I previously developed an application that reads the affinity bits to determine which CPU cores are in use and how many are available; it's basically a bunch of switches. You might have them set to something like 0001, 1111, 0101, 1100, etc. Each bit tells you whether a given core is being used or not. I would arrange it so that each beep called from kernel32.dll uses one of 0001, 0010, 0100, 1000, and you may have better luck!
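To make the "bunch of switches" idea concrete, here is a minimal sketch of reading an affinity bitmask and building the one-core-per-beep masks (0001, 0010, 0100, 1000). The function names `cores_from_mask` and `one_core_masks` are made up for illustration, and the sketch only does the bit arithmetic; it does not set any thread's affinity or call kernel32.dll.

```python
def cores_from_mask(mask: int) -> list[int]:
    """Return the zero-based indices of the cores whose bit is set in the mask."""
    cores = []
    index = 0
    while mask:
        if mask & 1:          # lowest bit set means this core is switched on
            cores.append(index)
        mask >>= 1            # move on to the next core's bit
        index += 1
    return cores


def one_core_masks(core_count: int) -> list[int]:
    """Return one mask per core, each with exactly one bit set."""
    return [1 << i for i in range(core_count)]


# One mask per beep on a 4-core machine (hypothetical assignment).
for mask in one_core_masks(4):
    print(f"{mask:04b} -> core {cores_from_mask(mask)[0]}")
```

Each mask from `one_core_masks` would be the value handed to whatever affinity call your language exposes, one per beep thread.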
Since threading has its limitations as well, I think this would be the best bet, but it means you are limited by how many cores your computer has available, whether that's 2, 4, 6, or even 8, for example. I currently have 4 to test with.
I had noticed while playing beeps fast enough that, even at the current speed, it would sometimes lag in finishing and disposing of the last call to create a beep at a certain frequency, so there must be something in kernel32.dll that queues the calls, meaning we may or may not have better luck calling it from different CPU cores.
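One way to check whether the calls really are being queued is to time several of them started at once: if the total is roughly the sum of the durations, they ran one after another; if it is roughly a single duration, they overlapped. Below is a rough sketch of that measurement. `queued_tone` and `parallel_tone` are hypothetical stand-ins that just sleep (the lock imitates a queue), so the harness runs anywhere; they are not the real beep.

```python
import threading
import time


def time_concurrent_calls(call, thread_count):
    """Start thread_count threads that each run call(); return total elapsed seconds."""
    threads = [threading.Thread(target=call) for _ in range(thread_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


_queue_lock = threading.Lock()  # imitates something that lets one call through at a time


def parallel_tone(duration=0.05):
    """Stand-in for a call that can overlap with others."""
    time.sleep(duration)


def queued_tone(duration=0.05):
    """Stand-in for a call that is serialized behind a shared lock."""
    with _queue_lock:
        time.sleep(duration)


if __name__ == "__main__":
    print("overlapping:", round(time_concurrent_calls(parallel_tone, 3), 3))
    print("queued:     ", round(time_concurrent_calls(queued_tone, 3), 3))
```

On Windows you could pass something like `lambda: winsound.Beep(440, 200)` as `call` instead of the stand-ins and see which of the two patterns the timing resembles; I have not assumed either result here.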