Show HN: Talk – A free group video call app with screen sharing (github.com/vasanthv)
310 points by vasanthv on Sept 5, 2020 | 99 comments


Is this different from or better than Jitsi Meet? Not intending to criticize the effort, just curious if I should switch my go-to choice of video conferencing software.

https://meet.jit.si/


Jitsi is built to run with a server managing the call. This is peer to peer, it appears.


It still needs a server to relay the WebRTC peer discovery stuff, at least judging by server.js?


Setting up a Jitsi server is non-trivial; this is simple (git clone / cd talk / npm install / npm start). And, if you don't want to trust the programme, you can do this from a dedicated unprivileged user.


I thought so as well. Ended up rolling my own and encountering too many sync and error-handling issues.

Ended up deciding to spend the week configuring jitsi, but all it took was 1 hour to install and configure the server and I was done


Was this using Ubuntu or whatever it is they recommend or ``manually''?


Ubuntu, just apt install or whatever it was

The development of the frontend is a bit messy, but you'll manage


Same here. We use Jitsi inside our app for online learning. It works well.


Getting Jitsi running is actually trivial (depending on your definition of trivial). I went from zero to fully running in about 30 minutes on a fresh Ubuntu server when I wanted to check something one time. It might even be dockerized, but my memory isn't 100% on that.


Maybe on Ubuntu. I'm talking about the ``manual'' installation to which everyone not using Ubuntu must resort.


If you don't trust the app, don't run it. You'll be giving it access to video, voice and internet at minimum.


It’s always “just npm install.” IMO that’s no different than curl | sh; it’s easy to copy/paste but not simple.


You can't avoid that if you want to punch through NAT like all of us have to do.

Or can you?


It's not even that. Well, it is, but it's two separate things: you need a publicly-addressable server to host this code, for all the peers to connect to in order to discover each other and exchange session information. You could also host this locally and use something like ngrok to provide the publicly-addressable endpoint.

The NAT hole punching is done by the STUN servers listed in script.js. They appear to be public third-party STUN servers, so that could be a vector for a malicious actor. There are also third-party TURN servers listed, which will relay media in the case that NAT traversal fails. That should be ok too, but could also be another attack vector.


And it seems to contain hardcoded usernames and password for those STUN/TURN servers too...


huh, interesting. I’m sure there’s a use to this extra layer of “obscurity” after the sibling comment.


Don't need NAT with IPv6


I have 2 routers: one blocks incoming connections on IPv6 by default, and the other gives you the choice between drop and reject for incoming connections.


More than half of the world still can't use IPv6 (including my whole country) due to ISPs not implementing it, so relying on that is not a good choice for now.


But it's still used nonetheless (even if less than with IPv4).


Jitsi not being p2p is an advantage in my book. I'd imagine p2p group video calling is subpar.


IIRC Jitsi is P2P if you have 2 people.


And configurable to allow more


Looks like you can use it with mobile without downloading an app, so there's that


You can do that with jitsi as well. I've done it.


Ah, so you can! I feel like that wasn't always there before. Very cool.


Granted, you do have to find a tiny link hidden below a huge "Download the app" button. Kind of an antipattern if you ask me.


Another fully featured group video chat: https://talky.io

It's based on this open source app https://github.com/simplewebrtc/simplewebrtc-talky-sample-ap...

They also run https://www.simplewebrtc.com which is an SDK for building custom WebRTC apps. They also provide TURN and SFU servers.


This appears to be client side only (it can even be statically deployed) - the project in the submission is a client + server that handles signalling.

And I don't quite understand this bit:

> To get started, you will first need to edit public/index.html to set your API key.

So, for any service using talky, I can just steal the API key rather than subscribing? I mean, it's in the index file, not even protected by login, but sent to all clients (and bots..)?


Another open source, WebRTC-based chat - https://brie.fi/ng

Also open source. Discussion: https://news.ycombinator.com/item?id=23523830


Is this hosted anywhere?

I created https://omen.tv/ just last weekend. Similar in that it's powered by WebRTC, but it's designed for casting your screen (e.g. Jackbox Games) to other people's TVs.

My greatest annoyances with WebRTC were:

1. WebRTC requires a STUN server, and despite the spec initially supporting default ICE servers, it has since been pulled out into an extension because browser vendors don't want to provide servers[1]. There are free STUN servers (Google etc.) but...

2. Double NAT clients. STUN is inevitably going to discover double NAT clients. The only way to connect these clients is through a proxy. Specifically a TURN server. Unlike STUN servers, these are not light-weight, and I can totally understand why browser vendors don't offer them by default. So inevitably any WebRTC use still requires a self-hosted TURN server.

3. Inconsistent access to streams across browsers. In particular the getDisplayMedia[2] API provides poor availability of the audio stream. Courtesy of Apple doing Apple things, I don't believe it's even possible to implement this API on macOS as there's no audio loopback device. I worked around this for my use-case by installing BlackHole[3] (which is great software!) However, the loopback device appears as an input e.g. like a microphone, so isn't technically the display audio.

For this all to be smooth I think all these pain points need to be addressed.

Issues 1 and 2 require NAT to be kicked to the curb, come on IPv6! Issue 3 requires Apple to expose a loopback device and browsers to implement support. These aren't insurmountable issues, but they're unfortunately out of our hands as day-to-day devs.

[1] https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConn...

[2] https://developer.mozilla.org/en-US/docs/Web/API/MediaDevice...

[3] https://github.com/ExistentialAudio/BlackHole


The client-side script.js [1] appears to have a hard coded list of well known STUN/TURN endpoints, some with credentials.

The identity and access management is non-existent at this point of the project’s roadmap. A client needs the hostname of the Talk Node.js server along with a unique “room” name to connect, as far as I can tell after a quick glance of server.js [2].

[1] https://github.com/vasanthv/talk/blob/master/www/script.js

[2] https://github.com/vasanthv/talk/blob/master/server.js


I was playing around to create something similar and these problems were EXACTLY what I bumped into. STUN is a must for discovery, TURN is a must if both people are behind NATs, Apple support is meh...

So, to sum up, for a webRTC p2p app you need to:

1-3) The things you mentioned.

4) Code your own signaling server to transmit events like "call", "hung up", "room" tracking, etc.
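
For point 4, a rough idea of how little a signaling server has to do. This is only a sketch in Python with no network layer: `SignalingHub`, the message shapes, and the room name are all made up for illustration and are not taken from Talk's server.js. It just tracks rooms and forwards opaque offer/answer/ICE messages between peers.

```python
from collections import defaultdict
from typing import Callable, Dict

# Hypothetical in-memory signaling hub; names are illustrative only.
Send = Callable[[dict], None]


class SignalingHub:
    def __init__(self) -> None:
        # room name -> {peer id -> callback that delivers a message to that peer}
        self.rooms: Dict[str, Dict[str, Send]] = defaultdict(dict)

    def join(self, room: str, peer_id: str, send: Send) -> None:
        # Tell existing peers about the newcomer so they can start an offer.
        for other_send in self.rooms[room].values():
            other_send({"type": "peer-joined", "from": peer_id})
        self.rooms[room][peer_id] = send

    def relay(self, room: str, sender: str, target: str, payload: dict) -> None:
        # Forward an opaque offer/answer/ICE candidate to one peer in the room.
        send = self.rooms[room].get(target)
        if send is not None:
            send({**payload, "from": sender})

    def leave(self, room: str, peer_id: str) -> None:
        # Drop the peer and notify whoever is still in the room.
        self.rooms[room].pop(peer_id, None)
        for other_send in self.rooms[room].values():
            other_send({"type": "peer-left", "from": peer_id})


inbox_a, inbox_b = [], []
hub = SignalingHub()
hub.join("demo", "a", inbox_a.append)
hub.join("demo", "b", inbox_b.append)
hub.relay("demo", "b", "a", {"type": "offer", "sdp": "<sdp placeholder>"})
hub.leave("demo", "b")
```

In a real deployment the `send` callbacks would be WebSocket (or socket.io) connections; the hub never needs to understand the SDP it forwards.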


> TURN is a must if both people are behind NATs

This is not what double NAT means. If both people are behind single NAT, STUN is enough.

I think it is also enough if one person (or both) are behind more than one regular NAT, which is what double-nat usually means.

Where STUN doesn't work is symmetric NAT [1], where different destination servers will receive a packet with different source ports, even if the original source port was the same.

It also doesn't work in a few, lesser-used NAT types [2].

[1] https://en.wikipedia.org/wiki/Network_address_translation#Sy...

[2] https://en.wikipedia.org/wiki/STUN#Limitations


Ad 2) this is false, double NAT doesn't automatically mean you need a proxy. It still depends on NAT type, it just increases the chances that one of them is a "hard one", to speak in terms of a recent post about this problem:

https://news.ycombinator.com/item?id=24241105


I'm unaware of any NAT setup that allows double NAT'd clients to directly communicate with each other. Quite happy to learn different if you'll point me toward a specific NAT setup that'd allow this.

Just off the top of my head, the only way I could see this working is if the NATs themselves aren't transparent, and offer an API to manually map address/port pairs.


Friendly NATs map, for UDP, e.g. (source address, source port) to (one of our outside addresses, port). Anyone who sends packets back to that outside address gets packets to you.

So, you bind 192.168.100.55:5555 and send a packet to a STUN server 1.2.3.4:3478. This creates a mapping between, 192.168.100.55:5555 and an outside port, we'll call it 81.82.83.84:12345. And the STUN server tells you back, in a reply, "you're 81.82.83.84:12345".

Then your friend sends a packet from his private address to 81.82.83.84:12345. It reaches your NAT, and is translated to 192.168.100.55:5555 and you receive it. Replying to this packet lets you send packets to him.
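
To make the above concrete, here is a toy Python model of the port mapping only (filtering rules are deliberately ignored). The `Nat` class and the friend's address are invented for illustration; the other addresses are the ones from the example. The `symmetric` flag shows the unfriendly case where the mapping also depends on the destination.

```python
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]


class Nat:
    """Toy UDP NAT. Filtering is ignored; only port mapping is modeled."""

    def __init__(self, public_ip: str, first_port: int, symmetric: bool = False) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.symmetric = symmetric
        self.out: Dict[tuple, Addr] = {}   # mapping key -> public (ip, port)
        self.back: Dict[Addr, Addr] = {}   # public (ip, port) -> private (ip, port)

    def send(self, src: Addr, dst: Addr) -> Addr:
        """Translate an outgoing packet; return the source the destination sees."""
        # A symmetric NAT keys the mapping on the destination as well.
        key = (src, dst) if self.symmetric else (src,)
        if key not in self.out:
            public = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[key] = public
            self.back[public] = src
        return self.out[key]

    def receive(self, public: Addr) -> Optional[Addr]:
        """Translate an incoming packet; None means no mapping exists."""
        return self.back.get(public)


me = ("192.168.100.55", 5555)
stun = ("1.2.3.4", 3478)
friend = ("203.0.113.7", 40000)  # illustrative address, not from the thread

cone = Nat("81.82.83.84", 12345)
reflexive = cone.send(me, stun)          # what the STUN server reports back
same_for_friend = cone.send(me, friend)  # the same mapping is reused
delivered_to = cone.receive(reflexive)

sym = Nat("81.82.83.84", 12345, symmetric=True)
sym_stun = sym.send(me, stun)
sym_friend = sym.send(me, friend)        # a new port the friend was never told about
```

With the friendly NAT the address learned from STUN is the one the friend can use; with the symmetric one it is only valid for talking to the STUN server.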


Yeah, my apologies. You're indeed correct.

I was incorrectly assuming that all NATs dynamically assign a new external port per destination IP-port tuple, as this is what I've observed in practice.

However, it looks as though many believe this only occurs for Symmetric NATs, and never Cone NATs. It seems the common terminology just isn't nuanced enough to explain what's really going on. Wikipedia[1] actually has an interesting paragraph regarding terminology:

> Many NAT implementations combine these types, and it is, therefore, better to refer to specific individual NAT behaviors instead of using the Cone/Symmetric terminology. RFC 4787 attempts to alleviate confusion by introducing standardized terminology for observed behaviors. [...] Specifically, most NATs combine symmetric NAT for outgoing connections with static port mapping, where incoming packets addressed to the external address and port are redirected to a specific internal address and port.

The CGNATs I've encountered, which perhaps aren't representative, are mobile network CGNATs. By observation they were behaving like Restricted (either Address or Port) Cone NATs for inbound packets, but like Symmetric NATs for outbound packets. Hence, the source of my confusion in believing all Cone NATs exhibited this behaviour.

EDIT: I'm admittedly poor with the terminology as it's been years since I read the RFCs. I was only now able to express the above after doing a lot of refresher reading. I was coming at this based on experience implementing custom UDP P2P protocols, where I mostly just care about the worst case scenario; and had evidently assumed it more common than it is.

[1] https://en.wikipedia.org/wiki/Network_address_translation#Me...


I don't understand how double NAT would cause a problem: you send a UDP packet out to someone on the public Internet (maybe STUN) and it will punch two holes; assuming you don't have low quality symmetric NAT, packets sent to the outer NAT on the egress port will be translated to the egress port on the inner NAT which will translate back to the original port on the client. Why do you think this would fail?
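
The same argument as a small Python sketch, assuming both layers are the non-symmetric kind. `ConeNat` and every address here are made up for illustration; the point is only that each layer reverses its own mapping on the way back in.

```python
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]


class ConeNat:
    """Toy endpoint-independent NAT: one public port per private (ip, port)."""

    def __init__(self, public_ip: str, first_port: int) -> None:
        self.public_ip, self.next_port = public_ip, first_port
        self.out: Dict[Addr, Addr] = {}
        self.back: Dict[Addr, Addr] = {}

    def outbound(self, src: Addr) -> Addr:
        # Create the mapping on first use, then keep reusing it.
        if src not in self.out:
            pub = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[src], self.back[pub] = pub, src
        return self.out[src]

    def inbound(self, dst: Addr) -> Optional[Addr]:
        return self.back.get(dst)


# All addresses are invented for the example.
client = ("192.168.1.10", 5555)
home = ConeNat("100.64.0.9", 20000)    # home router; its WAN side sits in CGN space
cgn = ConeNat("198.51.100.1", 30000)   # carrier-grade NAT

# One outbound packet to a STUN server punches a hole in both layers.
seen_by_stun = cgn.outbound(home.outbound(client))

# A peer replies to the address STUN reported; each layer undoes its mapping.
after_cgn = cgn.inbound(seen_by_stun)
after_home = home.inbound(after_cgn)
```

If either layer were symmetric, `seen_by_stun` would not be the address a different peer's packets need to hit, which is where TURN comes in.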


Why would it not allow this? Read the article I linked, it explains why double NAT is not really different.

A specific NAT setup would be my home connection, or pretty much anyone using the same ISP. It's CGN and then another NAT at home. STUN works fine with that.


Can someone tell me why I want NAT kicked to the curb? I like the idea that you get slightly less info from me behind NAT. I have 14-15 devices but, if I understand correctly, from outside my NAT they look like one device. I like that honestly, though maybe I shouldn't. That it's harder to do peer to peer is partly the point.


One and two read as if STUN and TURN are specific technologies built for WebRTC, but as far as I know, those are just the standard technologies for handling P2P scenarios with consumer-level usability in mind.

I think the biggest issue is that often you have to set up a STUN/TURN server in an extra environment. In the XMPP world, ejabberd has come to the point where it brings most things you need today in a single package (XMPP, HTTPS, Let's Encrypt, STUN/TURN, ...). That way you add about 5 lines to a config file and get STUN and TURN with little effort. I like that simplicity, and if you don't, you are still free to use an external STUN/TURN server.


> One and two read as if STUN and TURN are specific technologies built for WebRTC, but as far as I know, those are just the standard technologies for handling P2P scenarios with consumer-level usability in mind.

My apologies if it reads that way. They are indeed standardised web technologies used in a whole host of environments. For example, Valve run STUN servers for their (now deprecated, but very much still in use) Steam P2P APIs; which are backed by a custom UDP protocol.


For your points 1 and 2... you need those servers to be hosted away from your end points. ICE will not work otherwise and they are ultimately legit servers that someone has to work to keep running or pay for.

The STUN one is easy really, but the TURN servers need to be able to proxy traffic. Twilio provides STUN/TURN servers and they are fairly cheap. I am sure there are others.


Since most IPv6 routers I have seen have default firewall rules similar to NAT (allow outgoing, and incoming only if part of a valid session), STUN servers would probably still be required, wouldn't they?


Not to my knowledge. With IPv6 you know each other's public address, and can thus "hole punch" directly. The problem with NAT is you don't even know the IP address you're attempting to establish a connection to.


You are right, I forgot about this. Though, as always, you need to obtain your peer's IP through a side channel (also the case without a firewall). If your firewall isn't too picky about answering blocked requests and doesn't meddle with source ports, punching a hole should be quite straightforward (one side just has to coordinate with the other to pretend the firewall didn't block the first incoming packet, right?).
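
A toy Python model of that hole punch, assuming a firewall that only tracks outbound (local, remote) flows and leaves ports alone. `StatefulFirewall` and the documentation-prefix addresses are hypothetical; real firewalls also have timeouts and may answer blocked packets with rejects.

```python
from typing import Set, Tuple

Addr = Tuple[str, int]


class StatefulFirewall:
    """Toy firewall: inbound UDP passes only if it matches an earlier outbound flow."""

    def __init__(self) -> None:
        self.flows: Set[Tuple[Addr, Addr]] = set()  # (local, remote) seen outbound

    def outbound(self, local: Addr, remote: Addr) -> None:
        # Sending a packet records the flow so replies are let back in.
        self.flows.add((local, remote))

    def inbound_allowed(self, remote: Addr, local: Addr) -> bool:
        return (local, remote) in self.flows


# Documentation-prefix addresses, chosen for illustration.
alice = ("2001:db8::a", 5000)
bob = ("2001:db8::b", 6000)
fw_alice, fw_bob = StatefulFirewall(), StatefulFirewall()

# Before any outbound traffic, Bob's packet is dropped at Alice's firewall.
blocked_first = fw_alice.inbound_allowed(bob, alice)

# Both sides learn each other's address via the side channel, then send.
fw_alice.outbound(alice, bob)
fw_bob.outbound(bob, alice)

alice_accepts = fw_alice.inbound_allowed(bob, alice)
bob_accepts = fw_bob.inbound_allowed(alice, bob)
```

No STUN is needed in this model because the addresses are already public; only the side channel and the simultaneous send remain.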


Yeah; and this is why IPv6 is a bit evil and CGNAT+PMP (not that any clients are smart enough to do PMP correctly, which is stupid; and sadly not enough Internet is behind true CGNAT) is epic: you lose nothing (due to the explicit port mapping control) and get what amounts to free anonymity (unlike IPv6, which tattoos an identifier on you as if that's a feature).


Well, what you say is true to some extent, but there are some privacy extensions... And I don't think you should rely on having an obfuscated/shared IP for anonymity. There are better tools for this, like Tor, I2P, Freenet, etc.

IPv6 was designed back in the 90s, when we had far fewer concerns about tracking, privacy and anonymity. But IPv4 is even more ancient, and, make no mistake, CGNAT is not designed to make you more anonymous, so you could be fooled by a false sense of security.

Maybe we need to sell IPv6 as a way to track users to boost its adoption? And develop those privacy-conscious networks on top of it? Regardless, IPv4 and NAT are not consumer-friendly when it comes to self-hosting and p2p, so as long as these are the norm, software silos will be at an advantage.


I've been trying to find something like omen.tv forEVER!!! WebRTC has existed for a while now but no one bothered making something like this.


I enjoyed sharing my entire full screen with omen.tv and watching infinity -- just like pointing a video camera at a mirror.


it is hosted here:

http://talk.vasanthv.com


STUN servers? I thought ipv4 was dead by now...


I thought low hanging fruits have all been collected by now.


The JS code is under 500 lines so this is at least simple and auditable.

I didn't see anything about encryption based on a cursory read. Does WebRTC have some built in or is this unencrypted?


WebRTC requires DTLS (the UDP version of TLS) for the media streams; it doesn't even allow unencrypted.

The signaling for call setup / maintenance / teardown can be whatever, though. It looks like this uses socket.io with HTTPS long-polling. So you need a publicly-addressable IP (or a proxy service like ngrok) in order to host this yourself.

Assuming you host it yourself, all clients should have encrypted signaling with your server, and encrypted voice/video packets between each other. If someone else hosts it for you, they can see your signaling traffic, but -- unless they change the code to forward the media stream elsewhere in order to MitM it -- your media should still be encrypted.

However, this does make use of public STUN servers to help with NAT/firewall hole punching, which will leak your IP addresses and possibly make it so an adversary can figure out the IPs of the people who were talking to each other, based on the timing of connections. There are also public TURN servers listed which will act as media relays if the NAT traversal fails. Usually the TURN servers are dumb packet forwarders and won't terminate the DTLS sessions, so that should be fine, but I don't remember how the protocol works well enough (it's been a good 7 years since I was knee-deep in this stuff) to say that for certain.


DTLS is only the first part of the encryption. The actual media is transferred over SRTP, which is encrypted RTP. DTLS provides the keying material to derive keys for SRTP.

DTLS is however fully used for SCTP, which is the protocol for data channels alongside the media streams.


Yes, I'm aware (I've implemented a WebRTC stack before); I didn't think diving into the finer details of SRTP would be necessary at this point.


Amazing! I think this is the better open-source alternative to Google Meet and Zoom. Of course, this may not be filled with all the features offered by them, but this is more than enough for meeting 6-8 people at a time.


Is there a video call app with screen sharing that doesn't use WebRTC?

I often host online lectures and with screen sharing, it always uses up 100% of my CPU resources.


>>> quality of the call is inversely proportional to the number of people on the call

An interesting experiment would be to run it locally on a high-speed LAN with LOTS of participants. What's the limit to what the browser can handle?

Thanks for building, vasanthv ;)


Is this pretty similar to zipcall.io? I've used it a few times and was surprised with how well it worked for video calls and screen sharing.

Only problem I had with it was one client that couldn't get his mic working. I'm assuming it was a browser permissions issue but I called him on Signal instead rather than trying to track it down remotely.


I am surprised to see a video call and screensharing app built with fewer than 500 lines of JS code.

Did not know that.


WebRTC is amazing, until you start getting users on iPads. The complaints of the Firefox users are easy to ignore (there are so few of them, after all), but the iPad users are too numerous (especially within my company).


here's a similar webrtc app, but free of node.js dependency https://ba.net/screen-share-party


I don't know anything about STUN or TURN servers. I saw some credentials in the script.js file. Is it dummy or is it okay to make this public?


The original ICE credential APIs make approximately zero sense. Basically the clients need the password, so there's nothing stopping them redistributing it, making the whole thing rather pointless.

There's now support for third-party auth (i.e. oauth). It's not really fool-proof on its own, however you can at least then disable access to those who abuse the system. However, for this to work you need to have an oauth provider i.e. sign-in, which may be non-desirable.


> credential: 'd0ntuseme'

Might just be an example.


Couldn't the double NAT STUN/TURN thing be eliminated if the solution was modified to utilize Wireguard on all the endpoints?


I thought WebRTC was the coolest thing since sliced bread until I ran into the whole TURN and STUN thing. I feel catfished.


Those are not specific to WebRTC though. They are needed protocols if you want a chance to traverse NATs. They're used by WebRTC and other VOIP stacks too.


Yep, same here. Especially when one takes a step back, loads a h323 softphone that, unlike browsers on linux, uses hw encode/decode, and suddenly my laptop doesn't want to fry itself.


What are TURN and STUN?


Ways for punching through NAT. STUN lets you discover NAT addresses and the like to try and establish a connection with a port on the WAN IP, or some other connection mechanism. TURN is effectively a proxy that can be used by both sides to establish a connection through NAT.
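
For the curious, the STUN part is a very small protocol. Below is a Python sketch of building a Binding Request and decoding the IPv4 XOR-MAPPED-ADDRESS attribute from a reply, following the RFC 5389 wire format. Nothing is sent over the network here: `fake_success_response` is a stand-in I made up so the parser has something to chew on, and the address is just an example.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value from RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request() -> bytes:
    # 20-byte header: type, body length (0), magic cookie, 96-bit transaction ID.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + os.urandom(12)


def parse_xor_mapped_address(response: bytes):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        # Family byte 0x01 means IPv4.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_success_response(request: bytes, ip: str, port: int) -> bytes:
    # Hypothetical stand-in for a real server: echoes the transaction ID.
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01,
                       port ^ (MAGIC_COOKIE >> 16), xaddr)
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + request[8:20] + attr


request = build_binding_request()
reply = fake_success_response(request, "81.82.83.84", 12345)
mapped = parse_xor_mapped_address(reply)
```

A real client would send `request` over UDP to a STUN server (port 3478 by convention) and check that the transaction ID in the reply matches before trusting it.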



To share just one screen (mostly for my mom and me) I created https://screenshare.43z.one/ in just a few lines of client side js.


Nice! The WebRTC samples tend to be a bit convoluted. This sample project is a great demonstration of simplicity. Going to use this for some data channel work.


It would be great to have a similar solution that also packaged your own STUN/TURN setup. Maybe there is someone who did that for Jitsi Meet?


OpenVidu.io packs everything needed: an all-in-one Docker image that you can deploy and use to build any videoconference project upon it (or use the ready made Call app that is provided as a real-world project example)

Disclaimer: I am a coworker of the people who write OpenVidu. Check it out!


Is there anything which doesn't use webrtc?


How does the identity management work here? How do you know that you are talking to who you think you are talking to?


Can't we just manage this on layer 8?


There are lots of free options. We use zoom because it has the best and most consistent quality.


From a tech-org perspective I've had the opposite experience. Zoom crashes, behaves weirdly, has issues on some people's machines but not others, does weird javascripty things to hide the "join by web" link which we get complaints about, etc.

From a non-technical person's perspective, my piano teacher was using Zoom but ended up having to switch to Jitsi Meet because (anecdotally) the audio quality was better and more consistent.


It’s not anecdotal, Zoom applies significant audio filtering and “enhancement” as well as selectively attenuating “non-primary” speakers.


My casual observations of it were anecdotal, that is. Glad to see someone else has noticed similar behaviour.


Is there a way for multiple people to simultaneously share their screen with each other?


Could you add some screenshots on the README?


With up to how many participants do P2P WebRTC video calls work well in browsers?


Right from the first paragraph of the README:

> The sweet number is somewhere around 6 to 8 people in an average high-speed connection.
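
A back-of-the-envelope way to see why, assuming a full mesh where every peer sends its own stream to every other peer. The 1000 kbps per stream figure is purely an assumption for illustration, not something from the README.

```python
def mesh_load(participants: int, kbps_per_stream: int = 1000) -> dict:
    """Connection and upload counts for a full-mesh call."""
    others = participants - 1
    return {
        # Each pair of peers shares one connection: n * (n - 1) / 2.
        "total_connections": participants * others // 2,
        # Every peer encodes and uploads one copy per other participant.
        "streams_sent_per_peer": others,
        "upload_kbps_per_peer": others * kbps_per_stream,
    }


for n in (2, 4, 8):
    print(n, mesh_load(n))
```

Per-peer upload grows linearly and the total connection count grows quadratically, which is roughly why server-based designs take over for larger calls.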


Doesn’t look too secure.


Quite the contrary; if you host this yourself, remove the TURN server list, and host your own STUN server, it's much more secure than pretty much any of the third-party-hosted services out there. Assuming you trust your browser, that is.

Obviously I haven't audited the code, but it's pretty small, and wouldn't be hard to thoroughly audit.

It's surprisingly light on dependencies, just pulling in express as the webapp server, and socket.io for the call signaling. Personally I'd probably roll my own signaling over websocket to avoid that dependency (socket.io is an awful protocol, though for generally understandable reasons), but that would likely more than double the amount of code the author would've had to write.


Why is socket.io awful?


I was curious too, since I'm generally fond of socket.io. I've written my own small library around bare WebSocket API, and there are many aspects to consider, like fallbacks and reconnection. In my opinion, socket.io provides a simple and smallish interface that takes care of most needs.

Here's an article I found, with opinions on the good and the bad.

https://dzone.com/articles/socketio-the-good-the-bad-and-the... (2018)


The interface is nice (from Javascript; it's impossible to write something type-safe in any language where you like to care about type safety); the protocol and implementation itself is horrifying. Some of the protocol's handshaking gymnastics was necessary 15 years ago when you couldn't rely on there being a standards-conforming Websocket implementation, but those days are long gone.

Documentation is sparse, and when you find some docs, half the time it isn't clear which (incompatible) version of the protocol they're talking about.

Source: I was working on writing an interoperable server implementation of socket.io a couple years ago, and it was way more work than it should have been.


Could you elaborate?


Great. Can I has hw encoding/decoding on linux, please? I don't need another software, I need my laptop not to boil.


That's not the developer of this app's fault.


Stop giving away great software for free. Sell it! :)

(P.S.: Didn't check out the source code. It's just a general sentiment of mine lately.)



