Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for TURN server and allow configurable STUN server #149

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

sjoerd108
Copy link
Contributor

What is this?

CrewLink uses peer to peer connections for most of its traffic between clients. This is great because it keeps the voice server's load to a minimum. That said, creating peer to peer connections isn't that easy on the modern internet due to NAT, firewalls and strict network admins. Unless you are on a local LAN, chances are you need a way to figure out your public IP address and port to establish a peer to peer connection. That's where STUN comes in and currently CrewLink is hard-coded to use one of Google's STUN servers. STUN works in most cases, but some people find themselves behind a strict NAT/firewall that doesn't allow peer to peer connections some way or another. This is where TURN comes in. TURN acts as a relay between peer to peer clients, bypassing the NAT entirely.

CrewLink currently does not support configuring a TURN or custom STUN server. This pull request aims to address that. The existing libraries CrewLink uses already support using these servers, so all it introduces is the required UI to make it user-configurable. It does this by adding an "Advanced Settings" tab near the bottom of the settings menu that is collapsed by default. People hosting their own servers opting to add a TURN/STUN server can then instruct users to configure these settings.

Why?

I'm a big fan of CrewLink and enjoy playing Among Us with it with many people. Some people are, sadly, unable to use it due to their ISP/network admins making it hard to establish peer to peer connections. Testing on a custom build that supports a TURN server has dropped the amount of people not being able establish a connection with others to 0 in the places I've tested. I believe most people that can set up a crewlink server are also able to set something up like Coturn greatly increasing the people they can play with.

@ottomated
Copy link
Owner

As-is, this adds too much stress to the user. Please reconfigure this so the voice server can optionally send a TURN server over websocket, to make the entire process automatic.

@sjoerd108
Copy link
Contributor Author

I see. I had some small concerns that it may be too easy to gain access to the TURN server if it was implemented like that, but I suppose one can rotate the TURN keys fairly regularly to solve that.

I'll update this PR and get another PR ready for CrewLink-server.

@sjoerd108 sjoerd108 marked this pull request as draft December 6, 2020 00:14
@sjoerd108
Copy link
Contributor Author

I've implemented the requested changes and opened a PR for CrewLink-server as well. Both are ready for review.

src/renderer/Voice.tsx Outdated Show resolved Hide resolved
@MattJustMatt
Copy link

LGTM

@marss5
Copy link

marss5 commented Dec 7, 2020

hey, is this problem fixed now.. or?

@ChaoticWeevil
Copy link

I think I am having an issue because of this

@sjoerd108
Copy link
Contributor Author

I think I am having an issue because of this

If you are testing this as a dev build, could you please include some logs and specify how you've configured the peer-config on the server? This isn't in the official release yet.

@mja00
Copy link

mja00 commented Dec 8, 2020

I think I am having an issue because of this

If you are testing this as a dev build, could you please include some logs and specify how you've configured the peer-config on the server? This isn't in the official release yet.

I know I had some issues. The first game in a lobby ran great, with no issues whatsoever. Once the game ended and you were on the victory/defeat screen issues started popping up. About 50% of the time at least 1 person wasn't able to hear someone else, and the person they couldn't hear wasn't able to hear them back. This looked to be from the developer console to be just a connection refused from RTCPeerConnection. Seems there is also DOMExecution, and signaling errors after a game ends. About 20% of the time about half of the lobby would lose the ability to hear or speak. I grabbed the logs from one of the test sessions and it can be found here.

The peer-config is set to force the relay, use Google as the STUN server, and then a custom TURN server setup using coturn on a DigitalOcean droplet. I can confirm the TURN server works with both Trickle ICE and through the game since the first lobby works and my droplet's bandwidth spikes. If you'd like the info for the TURN server I can also provide that.

@sjoerd108
Copy link
Contributor Author

@mja00 Thanks for this, I'll give those logs a good look. I haven't had those issues during testing of my own, but I'll see what I can do to reproduce them. Without giving sensitive info about your TURN server away, could you specify what, if anything, you changed about Coturn's config (aside from credentials), I'd like to replicate the setup if possible.

@mja00
Copy link

mja00 commented Dec 8, 2020

@sjoerd108 Sure, my Coturn config is as follows

server-name=turn.mja00.com
realm=turn.mja00.com
cert=/etc/letsencrypt/live/turn.mja00.com/cert.pem
pkey=/etc/letsencrypt/live/turn.mja00.com/privkey.pem
fingerprint
listening-ip=0.0.0.0
listening-port=3478
min-port=10000
max-port=20000
log-file=/var/log/turnserver.log
verbose
user=fakeuser:fakecreds
lt-cred-mech

With turn:turn.mja00.com:3478 set as the turn server in peer-config. Looks like turn:turn.mja00.com:5349 also works as an IP.

@sjoerd108
Copy link
Contributor Author

@mja00 Thanks. I'll replicate those settings and try to do another test run with a full/almost full lobby. I wasn't running it with TLS as WebRTC already encrypts the traffic, but I'll try adding some certs though I doubt it'll make the difference.

@mja00
Copy link

mja00 commented Dec 8, 2020

@sjoerd108 No problem, TLS probably doesn't make any difference and it honestly might be weirdness with a basic DO droplet. I have some different hardware I can spin a TURN server up on if needed.

@MattJustMatt
Copy link

MattJustMatt commented Dec 8, 2020

I'd just like to add, this issue about connections degrading after the first game I believe isn't unique to this specific implementation (I think it has a deeper cause). I also have a separate branch with relay-only for my private games and was also running into this. I think error handling (for error events from simple-peer) and some improvements to connection flow are the solution.

@sjoerd108
Copy link
Contributor Author

@mja00 I have to side with @MattJustMatt here. We did two test runs, one with this version, replicating your TURN server config and the latest official version (1.1.5). When running the TURN version we saw several instances of ERR_DATA_CHANNEL being thrown, the connection did not die though. With the latest official build we not only saw ERR_DATA_CHANNEL but also "Cannot signal after peer is destroyed". After the latter error happened one participant indeed stopped being heard. I'm fairly certain if either version was being used long enough both of these errors could occur.

I believe it's safe to say the errors you are seeing are unrelated to this feature.

@mja00
Copy link

mja00 commented Dec 8, 2020

@sjoerd108 Glad to hear it wasn't the feature then, looks like I'm gonna have to understand the codebase and delve into some NodeJS to fix some of those errors. 😛 I appreciate checking out the errors and confirming it wasn't the feature.

@NewYearNewPhil
Copy link

NewYearNewPhil commented Dec 12, 2020

I see. I had some small concerns that it may be too easy to gain access to the TURN server if it was implemented like that, but I suppose one can rotate the TURN keys fairly regularly to solve that.

I'll update this PR and get another PR ready for CrewLink-server.

This could be solved using the Short-Term Credential Mechanism if the STUN server was included as part of CrewLink-server (e.g. via node-turn). The CrewLink-server could add users at runtime using their id & a generated password to allow them to authenticate against the STUN server, removing the user as they disconnect.
Would also save hosts the hassle of having a seperate process running to provide proper sound quality to players behind strict NAT/firewall.

@sjoerd108
Copy link
Contributor Author

That's a good point, it'd solve the problem of secrecy surrounding TURN credentials. I did have a look at using Node TURN my only concern is that it wouldn't perform that well compared to a native implementation. Though for small servers that shouldn't be a problem.

@tystuyfzand
Copy link

tystuyfzand commented Dec 13, 2020

This would be a great feature, and it's pretty simple to do a TURN server with authentication backed by the socket.io ids, I actually did one in Go. It's actually pretty vital to ensure everyone is able to use CrewLink without issues.

That's a good point, it'd solve the problem of secrecy surrounding TURN credentials. I did have a look at using Node TURN my only concern is that it wouldn't perform that well compared to a native implementation. Though for small servers that shouldn't be a problem.

It seems like this would be a community-only feature, so a built in TURN server in Node would be plenty for the 10-20 people a regular server would have on it, though the bigger servers would probably need something better. I think coturn has a REST API to create credentials, so maybe a docker stack/docker compose file for deploying a TURN server as well as a simple integration into the node version? A pluggable backend/setting to either run a built-in TURN server or use the API would be cool.

@sjoerd108
Copy link
Contributor Author

@tystuyfzand Docker would solve it quite nicely, yeah. I'm going to look into at least doing the built-in node-turn thing as it would indeed greatly increase the amount of people being able to play. This''ll be mostly server-side but I'll make both PRs drafts for now.

@sjoerd108 sjoerd108 marked this pull request as draft December 13, 2020 23:55
@tystuyfzand
Copy link

@sjoerd108 I'd say this side of the PR is complete, as it adds a very simple way to modify the ICE configuration. I think this should be ready to pull, minus any review on the styling/whatever - it'd let the server/community side evolve and make the app even better.

@sjoerd108
Copy link
Contributor Author

@tystuyfzand I think I may introduce some changes to the peer-config from the server, which may affect validation on this side.

@Flyy-y
Copy link

Flyy-y commented Dec 15, 2020

Hey, any news on the PR ? This features looks absolutely awesome !

@sjoerd108
Copy link
Contributor Author

@Flyy-y It's coming along quite nicely. Should have an update ready for review not too long from now.

Copy link

@davidenwang davidenwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm - this fixed all our among us problems

IVLIVS-III added a commit to IVLIVS-III/CrewLink that referenced this pull request Dec 30, 2020
@sjoerd108 sjoerd108 marked this pull request as ready for review January 1, 2021 13:51
@sjoerd108
Copy link
Contributor Author

@pengi
Copy link

pengi commented Jan 2, 2021

Earlier we had a lot of peer-to-peer connection issues when playing, so I set up a coturn server, and been using a custom build of this branch for a couple of rounds, and this TURN patch seems to solve all of the issues we had related to connection.

Would love to see this patch end up in the official release

@mudassirzulfiqar
Copy link

Please Create new Release after merging this

@billyjbryant
Copy link

So I've been reading through this PR and I love this enhancement to CrewLink. The one thing I notice, is that there is no retry if the Peer to Peer connection fails. It would be useful for the server to provide a list of STUN/TURN servers and the client go through them until it finds one it can connect to with no issues.

Ideally, it would be preferred to use an open STUN server like the current Google provided STUN server that is set as the default; followed by a list of one or more additional STUN or TURN servers the client can attempt to connect to.

Having said that, I haven't specifically tested this PR yet, so if this is functionally how it works today, hooray and thanks for all of your hard work.
If it is not, please consider adding a retry with exponential backoff and timeout so that you might be able to cycle through a list of servers instead of using just one.

@sjoerd108
Copy link
Contributor Author

@billyjbryant This is pretty much already here. If the server does not send a client peer config then the client will opt the use one of Google's stun servers by default. The server, too, will send Google's STUN server by default if no peer behaviour is specified. As for retry logic, that is something that has been implemented and merged in PR #382.

@amgno
Copy link

amgno commented Jan 12, 2021

I would like to ask, if it's not a problem, if there is a way to use Crewlink without Peer to Peer. Because, for all of my friends, here in italy we have a Strict NAT type.
So It would be really helpfull to have a "main" server that handle all the connections without using Peer to peer.

If there is a way to use it then let me know, there's no problem for me to setup a server and such.

@pengi
Copy link

pengi commented Jan 12, 2021

I would like to ask, if it's not a problem, if there is a way to use Crewlink without Peer to Peer. Because, for all of my friends, here in italy we have a Strict NAT type. [...]

That's the problem we have here too. Some NAT gateways can track connections and some don't. And since the initiator of the connection is the last person to reload (or if it was the other way around? At least it depends on loading order), the connection problems is really erratic. So having the server not behind NAT (I'm using a smaller paid VPS from a hosting service) really helps.

So this patch will, given a crewlink server with matching patch and proper opt-in configuration, make it possible to communicate with just one server, called TURN server. However, it will still be one audio stream between each player, just transmitted though the server. So I was looking at some traffic graphs for the TURN setup I'm using, and for a game of 8 people (I think) was about 3.2Mbps traffic in each direction on the server.

I overheard a short discussion on ottomateds twitch stream when he was discussing this issue, and yes, it will be require a significant increase in server requirements to use a TURN server, and would therefore not be suitable for publicly hosted servers. But it is possible to use a private server where a group of players can get together and share their cost if a TURN server is requiered.

@psukys
Copy link

psukys commented Jan 14, 2021

can use of TURN server be toggleable, so that public server remains TURNless, and privates can enable it?

@nounours12
Copy link

Please add this

@pengi
Copy link

pengi commented Feb 12, 2021

A few days ago, I group I usually played with had the official 2.0.1 with lot of connection and sound issues, probably related to UDP packet loss.

Today, just a few hours ago, part of the group was playing again, using 2.0.1 with this patch, and I would say it worked really well, and the only issues we had was some minor issues not related to connections.

So once more, I've seen that it works and the need for it to be usable.

Copy link

@KillaBoi KillaBoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code looks clean and program works as intended and actually improves the overall stability and connectivity of people behind strict NAT type and carrier grade networks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.