Network bandwidth: voice

Bandwidth is needed for more things than just the data you send yourself. The XNA Framework handles voice automatically, but every time you speak into the headset, we have to send that data out over the wire.

The voice stream is heavily compressed, using ~500 bytes per second, and only when you are actually talking.

By default, all players can talk to all others. Consider a 16 player game, where one player is talking to the other 15:

  • 500 bytes * 15 = 7,500 bytes, or about 7.3 kilobytes per second

Yikes! Remember we only have 8 kilobytes per second in total. We’ve nearly used the whole thing up, even before sending any actual game data.
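To see how quickly this adds up for other player counts, here is a rough back-of-the-envelope sketch. It is plain Python, not XNA code, and the function names are made up for illustration. It assumes the 500 bytes per second figure above, treats the 8k budget as 8 * 1024 bytes per second, and ignores any packet header overhead.

```python
# Back-of-the-envelope voice bandwidth estimate (plain Python, not XNA code).
# Assumptions: 500 bytes/s of voice per listener, an 8 KB/s total budget with
# 1 KB = 1024 bytes, and no packet header overhead.
VOICE_BYTES_PER_SECOND = 500
BUDGET_BYTES_PER_SECOND = 8 * 1024


def voice_upload_cost(listeners: int) -> int:
    """Bytes per second one talking player uploads to reach `listeners` peers."""
    return VOICE_BYTES_PER_SECOND * listeners


def budget_fraction(listeners: int) -> float:
    """Share of the total upload budget consumed by voice alone."""
    return voice_upload_cost(listeners) / BUDGET_BYTES_PER_SECOND


if __name__ == "__main__":
    # 15 = everyone in a 16 player game, 7 = an 8 player team, 3 = nearest few.
    for listeners in (15, 7, 3):
        cost = voice_upload_cost(listeners)
        print(f"{listeners:2d} listeners: {cost} bytes/s "
              f"({cost / 1024:.1f} KB/s, {budget_fraction(listeners):.0%} of budget)")
```

Under these assumptions, talking to 3 listeners costs a fifth of what talking to 15 does, which is why cutting the listener count is the main lever.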

How can you survive this deadly attack of the killer voice bandwidth gremlins?

  • Limit your game to a smaller number of players.
  • Or use LocalNetworkGamer.EnableSendVoice to limit who can talk to whom:
    • Only talk to players on your team.
    • Only talk to people who are near you in the world. But avoid changing this too often! EnableSendVoice must itself send network data to coordinate the new settings. If you change it often, this could end up costing more than you saved.
    • In MotoGP, we let each of our 16 players talk to all the other 15 when in the lobby (we didn’t have much other data to send then), but while racing they could only talk to the 3 closest players.
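The "nearest few players" idea, combined with the warning about not changing the settings too often, could be sketched roughly like this. It is plain Python rather than XNA code, and it is not how MotoGP actually did it. Every name here (`nearest_players`, `VoiceTargets`, `set_send_voice`) is hypothetical. In a real game the `set_send_voice` callback would presumably wrap a call to LocalNetworkGamer.EnableSendVoice. The key point is that the callback only fires for players whose setting actually changes.

```python
import math
from typing import Callable, Dict, Set, Tuple

Position = Tuple[float, float]


def nearest_players(me: Position, others: Dict[str, Position], count: int) -> Set[str]:
    """Return the ids of the `count` players closest to `me`."""
    ranked = sorted(others, key=lambda pid: math.dist(me, others[pid]))
    return set(ranked[:count])


class VoiceTargets:
    """Tracks who we send voice to, and only reports actual changes."""

    def __init__(self, set_send_voice: Callable[[str, bool], None],
                 initially_enabled: Set[str]):
        # set_send_voice is a hypothetical stand-in for the real voice API.
        self._set_send_voice = set_send_voice
        # By default everyone can talk to everyone, so start with all enabled.
        self._enabled = set(initially_enabled)

    def update(self, wanted: Set[str]) -> int:
        """Apply the new target set. Returns how many settings changed."""
        to_disable = self._enabled - wanted
        to_enable = wanted - self._enabled
        for pid in sorted(to_disable):
            self._set_send_voice(pid, False)
        for pid in sorted(to_enable):
            self._set_send_voice(pid, True)
        self._enabled = set(wanted)
        return len(to_disable) + len(to_enable)


if __name__ == "__main__":
    others = {"a": (1.0, 0.0), "b": (2.0, 0.0), "c": (3.0, 0.0), "d": (10.0, 0.0)}
    targets = VoiceTargets(lambda pid, on: print(pid, on), set(others))
    changed = targets.update(nearest_players((0.0, 0.0), others, 3))
    print("settings changed:", changed)
```

Calling `update` at a low rate (say once a second rather than every frame) would be one way to keep the coordination traffic down, though the right interval is a judgment call for each game.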

Comments (8)

  1. radioact1ve says:

    I never knew bandwidth was so troublesome (hence the lack of experience).

    I’m guessing this is not just an XNA thing. How is this handled in, say, Big Team Battle in Halo?

  2. CGomez says:

    My guess is the way Halo handles it in Big Team Battle is by using the Push to Talk option.

    1) Once the game starts you can only talk to the other 7 players on your team.  As Shawn has pointed out, this is still potentially a lot of bandwidth, and with what I’ve seen Halo 3 do, it needs that bandwidth.

    2) You have to Push To Talk just to transmit.  Once you click up on your dpad, they effectively use LocalNetworkGamer.EnableSendVoice to let you talk and shut it off when you are done.  Since people aren’t constantly hitting their up directional button, bandwidth is saved.

  3. radioact1ve says:

    Sounds good. In fact, sounds a little obvious. 8)


  4. Caleb Jares says:

    So if you have 4 gamers on one Xbox, can you combine the voice? Because I assume they have to share 8 kilobytes per second of bandwidth, and sending a possible 2000 bytes per second to 12 other players exceeds their bandwidth!

  5. Hi Shawn

    I have run into big problems adding voice to my game. The bandwidth used is much higher than 500 bytes/sec. Even with only two players talking to each other, NetworkSession.BytesPerSecondSent averages 2500 bytes/sec. So I've written a simple chat application and uploaded it to my website; it connects players and measures the bandwidth used:

    I am really desperate, since I optimized my game for 8 players to use no more than 5 kB/s on average, and now voice chat adds up to 7 kB/s for only 4 players in a team.

    So it would be awesome if you could check out my demo app, which illustrates that voice consumes a lot more than 500 B/s. I hope that I am wrong, because I can't do without voice in my game!

    Please help!

  6. ShawnHargreaves says:

    The voice data itself is 500 bytes per second, but if you are not sending any other data at the same time, there will also be a packet header cost to sending this voice info.

    Voice can be merged into existing game packets, so you would not usually pay a header cost for adding voice if your game is already sending a packet stream between these machines.

  7. Hi Shawn, thanks for your reply. It's strange that I observe the exact same bandwidth increase in my game, where I send data every 4th frame. Is the voice data sent within the XNA header, which adds these 51 bytes per packet? Does it matter whether I send the data reliably or as pure UDP (SendDataOptions.None)? Moreover, I observed different bandwidth increases on different machines. For example, when using two laptops with my example app, only 1.06 kB/s is sent, and when I use my desktop PC with a headset it's about 2.5 kB/s. The very strange thing is that on the connections that consume these 2.5 kB/s, I do not even hear the voice on the other PC, even though HasVoice is true, IsMutedByLocalUser is false, and the received bandwidth also shows these 2.5 kB/s sent by the other node. Might it be that the XNA voice system is not yet technically mature?

    Best regards,


  8. Sorry, it's me once more 😉 I extended my small test app with a constant stream that sends 10 bytes (10 bools) each frame at 60fps. That results in roughly 3.69 kB/s = (10[byte]+51[header-byte])*60[frames]. Adding voice still adds 2.5 kB/s on top of the 3.69 kB/s base. How can this be explained? Thanks for your efforts.