If you're on Linux, you most likely need the H.264 FFmpeg codec package and things like the Cisco OpenH264 plugin for Firefox. Also, try unblocking WebRTC if you block it.
Hmm, what I was thinking of was that everyone seemed to be competing with each other to click random UI elements on the virtual browser as rapidly as possible, I assume because there wasn't a way to speak with one another / coordinate. Or maybe I just missed the chat area?
Yeah we kept the MVP lean and didn't add chat or audio/video calling. Our goal (which we're not meeting) is to keep the Three.js example line count below 200.
The problem with VP8 is that it's almost never hardware accelerated, on either the encoding or the decoding side, so it's undesirable. Also, the software encoder x264 is quite far ahead of libvpx's VP8 encoder in terms of the encoding cost vs bitrate vs quality tradeoffs in our testing. We really hope AV1 solves all these issues once and for all in the years to come.
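(If you want to check the hardware-acceleration point on your own machine: the browser's MediaCapabilities API reports a `powerEfficient` flag, which in practice is a rough proxy for hardware decoding. The helper below is just an illustrative sketch, not from the project being discussed.)

```typescript
// Sketch: build a decoding configuration and ask the browser whether
// decoding it would be power efficient (usually = hardware accelerated).

function decodeConfig(contentType: string) {
  return {
    type: "media-source" as const,
    video: {
      contentType, // e.g. 'video/webm; codecs="vp8"'
      width: 1280,
      height: 720,
      bitrate: 2_000_000,
      framerate: 30,
    },
  };
}

// Browser-only usage (navigator.mediaCapabilities doesn't exist in Node):
//   const h264 = await navigator.mediaCapabilities.decodingInfo(
//     decodeConfig('video/mp4; codecs="avc1.42E01E"'));
//   const vp8 = await navigator.mediaCapabilities.decodingInfo(
//     decodeConfig('video/webm; codecs="vp8"'));
//   console.log("H.264 power efficient:", h264.powerEfficient);
//   console.log("VP8 power efficient:", vp8.powerEfficient);
```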
I guess you will need to make some compromises to make this work reliably.
My personal experience with WebRTC was: "One can be glad if one gets anything working at all across different environments". 0 stars, would not touch again (or only with a very very long pole; everybody has his price… ;-)).
The other thing that came to my mind is: Why do all that on the server? Seems costly.
One could build a "browser in browser" and share it (partly) P2P through WebRTC's screen-sharing feature. The "browser in browser" would be needed to make the thing interactive, since screen sharing transmits only a video. You would need to capture mouse and keyboard on the webpage within the "virtual browser" (and transmit them through an additional WebRTC stream). Capturing inputs outside of the browser is not possible, afaik, with WebRTC.
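To make the input-forwarding part concrete, here's a minimal sketch; a data channel is one natural way to do that "additional WebRTC stream". Every name here (`pc`, `inputChannel`, `videoEl`) is an assumption, not from any real project.

```typescript
// Hypothetical sketch: forward local mouse/keyboard input to the shared
// "browser in browser" over a WebRTC data channel.

type MouseMessage = { kind: "mouse"; x: number; y: number; button: number };
type KeyMessage = { kind: "key"; code: string; down: boolean };
type InputMessage = MouseMessage | KeyMessage;

// Normalize coordinates to [0, 1] so the remote side can map the click
// onto its own viewport regardless of the local video element's size.
function encodeMouse(
  clientX: number, clientY: number,
  width: number, height: number, button: number,
): MouseMessage {
  return { kind: "mouse", x: clientX / width, y: clientY / height, button };
}

function encodeKey(code: string, down: boolean): KeyMessage {
  return { kind: "key", code, down };
}

// Browser-only wiring (sketch):
//   const inputChannel = pc.createDataChannel("input");
//   videoEl.addEventListener("mousedown", (e) => {
//     const r = videoEl.getBoundingClientRect();
//     inputChannel.send(JSON.stringify(encodeMouse(
//       e.clientX - r.left, e.clientY - r.top, r.width, r.height, e.button)));
//   });
//   window.addEventListener("keydown", (e) =>
//     inputChannel.send(JSON.stringify(encodeKey(e.code, true))));
```

The receiving side would parse each message and replay it into the virtual browser (e.g. via CDP or xdotool, depending on how the remote end is built).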
Thanks! Actually it's neither: we run full Chromium rather than CEF, and mediasoup for WebRTC. We have looked into webrtcbin as well, but it wasn't the best fit for us.
Interesting! I've been using webrtcbin for some production work, but that's been broadcasting low-latency, internally used video with no user input going back. I've seen demos of user input with webrtcbin, but have never done it in practice. If you can, do you mind sharing why webrtcbin wasn't a good fit for you (or why mediasoup was better)?
webrtcbin is nice for one-to-one, but we were looking for something that's a better fit for one-to-many, which is where mediasoup came in. That said, mediasoup is quite complex from a maintainability perspective, since it's designed to handle several use cases, so we plan to build our own WebRTC streaming solution in the future, designed with only one-to-many in mind.