The problems described in the blog post are particularly acute for IoT in general, especially if you want your device to run on batteries or you have a limited data budget.
> Therefore, when you try to continue talking to the server over a previously established session, it will not recognize you. This means you’ll have to re-establish the session, which typically involves expensive cryptographic operations and sending a handful of messages back and forth before actually delivering the data you were interested in sending originally.
The blog post mentions Session IDs as a solution, but these require servers to be stateful, which can be challenging in most deployments. An alternative is Session Tickets (https://datatracker.ietf.org/doc/html/rfc5077), but these may cause issues when offloading networking to another device, such as a cellular modem, as their implementation may be non-standard or faulty.
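For illustration, the client side of resumption is only a couple of calls in a library such as mbed TLS. This is a minimal sketch, not the blog's implementation: it assumes the library was built with session/ticket support, that `ssl` is an already-configured context, that the first full handshake succeeded, and that re-establishing the underlying transport is handled elsewhere.

```c
/* Minimal client-side sketch of (D)TLS session resumption with mbed TLS.
 * Works with either Session IDs or Session Tickets, whichever the server
 * offered during the first handshake. Error handling is trimmed. */
#include <mbedtls/ssl.h>

static mbedtls_ssl_session saved_session;

/* Call once after the first successful handshake. */
int save_session(mbedtls_ssl_context *ssl)
{
    mbedtls_ssl_session_init(&saved_session);
    return mbedtls_ssl_get_session(ssl, &saved_session); /* copies the ID or ticket */
}

/* Call before reconnecting: the next handshake is abbreviated if the
 * server still accepts the stored ID/ticket. */
int resume_session(mbedtls_ssl_context *ssl)
{
    int ret = mbedtls_ssl_session_reset(ssl);        /* prepare for a new handshake */
    if (ret != 0) {
        return ret;
    }
    ret = mbedtls_ssl_set_session(ssl, &saved_session);
    if (ret != 0) {
        return ret;
    }
    return mbedtls_ssl_handshake(ssl);               /* abbreviated if resumption is accepted */
}
```

Whether the abbreviated handshake actually happens still depends on the server keeping its session cache (Session IDs) or its ticket key (Session Tickets) around, which is exactly the operational catch mentioned above.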
incoming rant
These issues could be mitigated—or even solved—by using mature software platforms like Zephyr RTOS and their native networking stacks, which receive more frequent updates than traditional “all-in-one-package” SoCs. However, many corporations choose to save a few dollars on hardware at the expense of incurring thousands in software engineering costs and bug-hunting. It is often seen as more cost-effective to purchase a cellular modem with an internal MCU rather than a separate cellular modem and a host MCU to run the networking stack. It is one of the many reasons why many IoT devices are utter garbage.
Author here -- thanks for engaging in the discussion! You won't find any pushback from us on using Zephyr -- we are contributors, the firmware example in the post is using it (or Nordic's NCS distribution of it), and we offer free Zephyr training [0] every month :)
[0]: https://training.golioth.io/
> It is often seen as more cost-effective to purchase a cellular modem with an internal MCU rather than a separate cellular modem and a host MCU to run the networking stack.
This one isn't just cost--the compliance restrictions that the cellular carriers place on you are idiotic.
The big one we bumped into is "must allow carrier-initiated firmware updates with no restrictions on scheduling", which translates to "the carrier will eat your battery often and without warning".
In addition, many IoT devices may not call home more than once every couple of months. And the carrier will happily roll out tower firmware that kills their ability to call home.
If I use a module with my own firmware, the modem folks will simply point fingers at me. If I use a module with integrated SoC and firmware and it gets updeathed, I get the "joy" of yelling at the cellular module manufacturer.
(I had the wonderful experience of watching a cellular IoT project go gradually dead over 3 days as the carrier rolled out an "upgrade" across its system. We got a front seat as the module manufacturer was screaming bloody murder at the carrier who simply did "We Don't Care. We Don't Have To. We're the Phone Company.")
Your experience is bizarre. Was it Verizon (or AT&T/T-Mobile), and did you use a Cat-1bis/Cat-M/NB-IoT or higher class modem?
Normally US carriers require Firmware-Over-The-Air (FOTA) capability for the modem firmware. This is not the case for deployments outside of the United States, to my knowledge.
It would be interesting to hear more about your story!
Sounds like you need a more watertight contract and more lawyers.
A cellular carrier has a legal budget whose rounding errors are likely larger than your company's total worth.
Good luck litigating.
All the lawyers in the world won't protect you from paying the penalties in the contract you signed when you overtly and deliberately failed to deliver.
What makes a separate cellular modem better than an internal cellular modem? Is it because software updates are available for the separate modems?
I am evaluating some Nordic semiconductor parts for a project. They seem to have an internal modem but Nordic uses zephyr. Any thoughts?
> What makes a separate cellular modem better than an internal cellular modem?
The US 3G shutdown required some rather expensive and unexpected upgrades. Vendors signed long-term contracts with 3G providers, and then "someone" was on the hook, to replace something, when the 3G vendors terminated their contracts prematurely.
The deeper the modem was integrated into a product, the harder it was to change. The shallower the modem was integrated, the easier it was to change.
For example:
One of my cars just lost its internet connectivity, and the automaker never offered any way to fix it. (I didn't care, I only used Android Auto in that car.)
My employer (IOT) sent out free chips to our customers. They had to arrange someone to go do a site visit and swap a chip while on a phone call with us. We're small and early enough that it wasn't a big deal.
My solar panel vendor wanted me to $pend big buck$ on a new smart meter and refused to honor their warranty. I told them to run a cable to the ethernet port in my meter.
Car manufacturers should abandon developing their own entertainment systems and instead collaborate with Apple (CarPlay) and Google (Android Auto) to improve integration and support a wider range of use cases. Unfortunately, they seem to revive these efforts every 4-6 years (e.g. Mercedes Benz)
The only feature I need to control remotely in my car is preheating during winter—I wonder how they could achieve that without using cellular connectivity, as paying a subscription for such a service would make it less attractive to me.
I like the Tesla car computer a lot more than Android Auto. I just wish I could install 3rd party apps.
Part of the reason is there's no delay waiting for my phone to connect, and no random disconnections while driving.
There are issues with infotainment systems in older vehicles supporting Android Auto and Apple CarPlay. This is primarily because car manufacturers (or subcontractors) aren't good at integrating these systems. Often they are tight on budget too!
Personally, I’m not a fan of Tesla’s car computer. My main annoyance is the lack of physical buttons. All I want from my car is reliable navigation, seamless audio playback, and easy mobile call handling.
> What makes a separate cellular modem better than an internal cellular modem?
When using a separate cellular modem, you can connect it to your MCU via either a USB or UART interface. In IoT applications, UART is the more common choice. Then you can multiplex the interface with CMUX, allowing simultaneous use of AT commands and PPP.
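As a rough sketch of what that bring-up looks like from the host side (the device path and the bare-bones AT+CMUX parameters are assumptions; your modem's AT manual is authoritative), the host only has to switch the UART into multiplexed mode before handing it to a mux driver:

```c
/* Illustrative only: switch a modem's UART into CMUX (3GPP TS 27.010)
 * multiplexed mode so an AT channel and a PPP channel can coexist. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/ttyUSB2", O_RDWR | O_NOCTTY);   /* hypothetical port */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct termios tio;
    tcgetattr(fd, &tio);
    cfmakeraw(&tio);                  /* raw 8N1, no echo, no line editing */
    cfsetspeed(&tio, B115200);
    tcsetattr(fd, TCSANOW, &tio);

    const char *cmd = "AT+CMUX=0\r";  /* basic mode; subset/speed left at defaults */
    write(fd, cmd, strlen(cmd));

    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf) - 1); /* expect "OK" before muxing starts */
    if (n > 0) {
        buf[n] = '\0';
        printf("modem replied: %s\n", buf);
    }

    close(fd);
    return 0;
}
```

After the OK, the raw UART carries TS 27.010 frames, and a mux implementation (Linux's n_gsm line discipline, Zephyr's CMUX module, or a vendor library) exposes the virtual channels that AT commands and PPP then run over.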
With frameworks like lwIP or Zephyr supporting PPP, you can get your network running really quickly and have full control over the networking and crypto stacks. With Zephyr you also get POSIX-compliant sockets, which allow you to leverage existing protocol implementations.
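To make that concrete: once PPP is up, application code is just ordinary BSD sockets. The snippet below is a generic sketch (placeholder address and port) that should build the same way against Zephyr with its POSIX/sockets options enabled, lwIP's sockets API, or plain Linux:

```c
/* Plain BSD-sockets UDP sender: the point is that this code does not care
 * whether the link underneath is PPP over a cellular modem or Ethernet. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sock < 0) {
        perror("socket");
        return 1;
    }

    struct sockaddr_in server = {
        .sin_family = AF_INET,
        .sin_port = htons(5683),                 /* CoAP's default port, for example */
    };
    inet_pton(AF_INET, "192.0.2.10", &server.sin_addr); /* TEST-NET-1 placeholder */

    const char payload[] = "sensor reading";
    if (sendto(sock, payload, sizeof(payload) - 1, 0,
               (struct sockaddr *)&server, sizeof(server)) < 0) {
        perror("sendto");
    }

    close(sock);
    return 0;
}
```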
In contrast, using an SoC often requires reintegrating the entire network stack, as they typically do not support POSIX sockets. I've worked on SoCs that only support TLS 1.1, and the vendor refused to upgrade it, as doing so would require them to re-certify their modem. Switching to a different SoC can mean repeating this process from scratch, as different vendors implement their own solutions (sometimes even the same vendor will have different implementations for different modems).
> I am evaluating some Nordic semiconductor parts for a project. They seem to have an internal modem but Nordic uses zephyr. Any thoughts?
It runs on Zephyr RTOS and can be built as a standalone modem (https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/appli...). You can either use Zephyr's native networking stack or offload it to the modem while retaining the same API. This means you get access to all of Zephyr’s protocol implementations without additional effort. The design makes it feel as though you have a completely independent MCU connected to an external modem.
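As a hedged sketch of what that looks like in code (the sec tag value is made up, and credential provisioning works differently between Zephyr's native TLS stack and the nRF91's modem key store), the application only ever sees Zephyr's socket API, whether the stack is native or offloaded:

```c
/* Sketch of Zephyr's socket-level (D)TLS API, which is what makes
 * "native stack or offloaded to the modem, same code" possible.
 * Assumes a credential is already provisioned under sec tag 42 and that
 * the relevant TLS-socket Kconfig options are enabled for the build. */
#include <errno.h>
#include <zephyr/kernel.h>
#include <zephyr/net/socket.h>
#include <zephyr/net/tls_credentials.h>

#define APP_SEC_TAG 42  /* example tag, not a value from the docs */

int send_secure_reading(const struct sockaddr *server, socklen_t server_len)
{
    int sock = zsock_socket(AF_INET, SOCK_DGRAM, IPPROTO_DTLS_1_2);
    if (sock < 0) {
        return -errno;
    }

    /* Tell the socket which provisioned credential(s) to use for DTLS. */
    sec_tag_t tags[] = { APP_SEC_TAG };
    zsock_setsockopt(sock, SOL_TLS, TLS_SEC_TAG_LIST, tags, sizeof(tags));

    /* The DTLS handshake runs during connect() -- in the native stack or
     * inside the modem, depending on how the build is configured. */
    if (zsock_connect(sock, server, server_len) < 0) {
        zsock_close(sock);
        return -errno;
    }

    const char payload[] = "sensor reading";
    ssize_t sent = zsock_send(sock, payload, sizeof(payload) - 1, 0);

    zsock_close(sock);
    return (sent < 0) ? -EIO : 0;
}
```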
That said, it does have quirks and bugs, particularly when offloading networking. It also has relatively limited resources, with only 256 kB of RAM and 1 MB of flash storage.
Overall, it is the best SoC I’ve worked with, but it is still an SoC. Whether it suits your project depends on your specific use case. If you plan to integrate additional radios in the future (e.g., UWB, BLE, Wi-Fi), I’d recommend using a separate MCU if your budget allows. This will provide significantly more flexibility. Otherwise it is definitely one of the better SoCs currently on the market, to my knowledge.
PS. It only supports Cat-M (& NB-IoT but I'm going to skip over it intentionally!) which is not globally supported, so you should make sure the region you want to deploy in supports that technology.
On one hand, licensing requirements and regulation often mean that modems are locked down in terms of firmware updates, reference documentation, source, and capabilities. This often translates into a larger "black box" area, and one embedded inside your SoC instead of physically separate and connected over a serial bus.
On the other, on-chip modems often (not sure about those Nordics) have DMA.
Cellular modems go nonfunctional/obsolete much faster than other systems. 3G is almost entirely gone worldwide. 4G is still around, but providers are already reducing how much their towers dedicate to it. The standards body is working on 6G; who knows when that will come and push out older stuff.
In the case of my car I don't care - I have never found a use for the cellular connectivity it has (if any). However there are lots of other devices where the cellular connectivity is important and users will want to upgrade the modem to keep it working. If cellular connectivity is just a marketing bullet point nobody cares about, then integrated is good enough, but if your device really isn't useful without the modem, make that modem replaceable for reasonably cheap.
4G should survive better than 2G and 3G, because the 5G standard allows for mixed-mode deployments where the coordination channel runs as 4G, and the slots can be 4G or 5G depending on what the client device is capable of. Running the coordination channel with 5G encoding could be a little more efficient, but it's not a big loss compared to running a minimum-size 2G/3G allocation.
Generally the modem would support EDGE/E-GPRS (2G) and 3G simultaneously. In Europe, EDGE/E-GPRS is still quite popular and unlikely to be sunset in the next 5 years. It is a pity that many manufacturers tightly integrated the modem into their PCB instead of creating a separate "communication box" - especially as some use cases have the budget for it.
Nevertheless, with modern cloud providers, moving the state into the OS/networking layer is still easier to scale. You don't need to write your own services to handle it.
You don’t want to redo the handshake (which, as far as I know, is identical to TLS) every time you send a packet using DTLS. Therefore, you still need to retain state information. In my opinion, using DTLS (UDP) is considerably more nuanced than using TLS (TCP) in an embedded IoT context.
Zephyr RTOS[1] is an RTOS supported by the Linux foundation. It has similar structures to Linux for device configuration like a device tree and tries to emulate POSIXish APIs. I think some embedded people are put off by this configuration structure.
Notably, Zephyr RTOS is the basis of Nordic Semiconductor's SDK[2]. Nordic is a major manufacturer of cellular and wireless MCUs.
The mere existence of Tailscale should give a hint that NAT is only a speedbump and not any protection whatsoever. It protects you against nothing. Every method that Tailscale uses to traverse NAT can be in isolation used by any other piece of software. For more info about that you can read the following article.
What people really want is a firewall, and since NAT acts as a firewall, they confuse it with that.
My university has a public IP for every computer, but you could still only connect to the servers, not random computers, from the outside. Because they had a firewall.
What ordinary people (as opposed to IT departments) really want is a firewall that can't be accidentally disabled by pushing an overly permissive firewall rule.
NAT/port forwarding, for all their faults, make it rather difficult to write rules allowing traffic to a machine you didn't intend to expose to the world.
Yeah but the average person wouldn't know to set up a firewall (and can't count on their ISP to have their best interests at heart.) Therefore the general public benefits from the degree of protection that NAT provides.
Then just enable the firewall by default, or don't even provide a way to disable it unless the user enters "developer/advanced/Pro (tm)" mode. None of these are valid excuses for NAT.
"not any protection whatsoever" is way too strong a statement. NAT does raise the bar to exploiting a random smart lightbulb in your house significantly higher.
The big distinction is that for Tailscale both endpoints know they want to talk to each other, and that both have Internet access. That's not the usual case firewalls are designed for.
Tailscale doesn't strictly need NAT traversal. They can run only their DERP servers and still continue to work. If your firewall tries to block two devices from communicating and yet allows both devices internet access, you have already lost.
Sounds like you like the idea of a stateful firewall, and good news: There are stateful firewalls for IPv6!
They have all the upsides of NATs (i.e. an option to block inbound connections by default), with none of the downsides (they preserve port numbers, can be implemented statelessly, they greatly simplify cooperative firewall traversal, you can allow inbound connections for some hosts).
I found it weird that IPv6 folks are so against NAT as a cultural thing when it works perfectly well on IPv6. The two aren't fundamentally incompatible.
I could have all of my servers in public subnets and give them all public IP addresses, but I still prefer to put everything I can in private. Not only does the firewall not allow traffic in, but you can't even route to them. It now becomes really hard to accidentally grant more access than you intended.
I would hazard that most devices on the internet are in the boat of wanting to talk to the internet but not being reachable from it.
Yea IPv6 folks are indeed against NAT philosophically because it's considered one of the big mistakes of IPv4.
There is a distinction between being publicly addressable and publicly routable. You can have the former without having the latter.
If you want more private addresses, IPv6 has a solution too: use ULAs and not GUAs. Design your internal network so it has mostly ULAs for application servers, database servers and the like, except for the reverse proxy having both publicly accessible GUAs as well as ULAs for talking to the rest of the network.
I personally use ULAs and GUAs concurrently on my network, because I have a residential ISP where my GUA prefix is not fixed.
I'm not opposed to anyone voluntarily using a NAT at all. I just hate it when somebody makes that decision for me, and that unfortunately still happens all the time.
If it's a well-reasoned decision, sure, but I do suspect that more often than not it's a lack of knowledge about alternatives that makes people still opt for NATs, and that just makes me sad on top of being annoyed with the inconvenience of having to tunnel when a direct connection seems so close at hand.
> I would hazard that most devices on the internet are in the boat of wanting to talk to the internet but not being reachable from it.
I highly doubt that. One big example is VoIP: Incredibly common these days, yet so much of it is going through centralized relays, and often for absolutely no technical reason.
For a personal network where you decide to use NAT on ipv6? Sure, go ahead.
Being forced into a CGNAT on ipv6 is just a dick move though. And I believe that's the kinda NAT that has coloured the opinions of most NAT for ipv6 detractors.
If you don't want that, then complain about the lack of such a configuration option and configure your firewall so that they can't. But don't cheer on something that breaks functionality that others might want (especially if it doesn't actually achieve your own goals reliably).
But to your point about not cheering on NAT, well I will because I see NAT as useful tool.
It is not an opinion well aligned with the preferences of the IETF. But the purist model of transparent end-to-end networking has never sat well with me. It’s just not a thing we want.
So because you don't want it means that nobody can get it?
Because that's what happens if you advocate for NAT by default. Conversely, with "IPv6 + inbound-blocking-firewall on CPEs by default", everybody gets the same behavior by default, and people that want something else can get that instead.
A telematics tracker in a vehicle that logistics companies use (e.g. Amazon, FedEx) is also considered an IoT device. I don't believe that the author is talking about Smart Home appliances exclusively.
Same, but ISTR we had some Cisco gear where NAT (or PAT as they insisted; yes, I know the difference; no, no one cared) had a different license or hardware requirement or something from firewall rules.
I agree until I discover I'm doing something where I want to access/change that device. It is really nice when I'm returning home early that I can change my thermostat out of vacation mode. I've often wished I had a way to tell if I left a door unlocked.
Security and privacy are of course critical to all this, but the concept of the internet itself is not wrong.
That's what a VPN is for. Every router I've had in the past decade has had support for running a VPN server so you can have one running 24/7 without any additional hardware. Even my retired elderly parents run a VPN server on their home router.
I hope you're not implying that allowing IoT devices access to the internet is not a massive security vulnerability. IoT devices are notoriously insecure and poorly maintained. I'd much rather have LAN-only IoT devices and an internet-accessible VPN server than letting IoT devices access the internet.
But theirs uses certificates (the router UI generates the openvpn client config files with the certificate embedded inside it) and no, they do not understand the security implications between those two choices.
Mine is a wireguard VPN with both the pub/priv keypair and PSK.
I'm implying that a VPN is not a "silver bullet." In particular since they don't understand the model, are stuck with a vendor implementation, and probably never update their router firmware.
There's a reason IoT vendors try to do this all "in device."
What I want from platforms, and what I fought for at one time at Google with no success, is a platform API that provides applications a way to schedule packets for when the radio turns on.
The mobile platforms in particular continue to assume that you can live with their high level HTTP stuff, and it's just not good enough. The non-mobile platforms largely don't even approach the problem at all.
Indeed, that sounds like an obvious feature. Hard to believe it hasn't been implemented! I'd love to have that feature on Linux desktop/laptops. I think you could make lots of applications behave a whole lot better.
The arguments from folks on the "architecture review boards" were that multiple connections are always bad and that developers can't be trusted. I'm willing to accept that they did get beat up quite a bit over power at various points, when at times applications were a big part of the problem. That said, this is also a gross misunderstanding of the problem and overall solution space, as well as very much gatekeeping.
Dump the entire queue to a google server (confirm receipt and shut down the connection) and then have the google server forward all the data to its destinations?
Seems to me that would be the lowest power, lowest developer trust, lowest number of connections, maximum gatekeeping method.
This article should clarify at the start whether TCP or UDP is under consideration. NAT idle timeouts for both are typically very different. RFC 5382 [0] specifies no less than 2 hours and 4 minutes for TCP. RFC 4787 [1] specifies no less than 2 minutes for UDP. Towards the end of the article it becomes clear that it's UDP.
The example diagrams also incorrectly show port numbers exceeding 65535. The port fields in TCP and UDP headers are 16 bits [2].
I have worked on a device with this exact same "send a tiny sensor reading every 30 minutes" use case, and this has not been my experience at all. We can run an STM32 and a few sensors at single-digit microamps; add an LCD display and a few other niceties and it's one or two dozen. Simply turning on a modem takes hundreds of microamps, if not milliamps. In my experience it has always been better for power consumption to completely shut down the modem and start from scratch each time [1] - which means you're paying to start a new session every time anyway. Now I'll agree it's still inefficient to start up a full TLS session, and a protocol like in the post will have its uses, but I wouldn't blame it on NAT.
[1] Doing this of course kills any chance at server-to-device comms, you can only ever apply changes when the device next checks in. This does cause us complaints from time to time, especially for those with longer intervals.
Power Saving Mode (PSM), a power-saving mechanism in LTE, was specifically designed to address such issues. It allows the device to inform the eNB (base station) that it will be offline for a certain period while ensuring it periodically wakes up to perform a Tracking Area Update (TAU), preventing the loss of registration. This concept is similar to Session Tickets or Session IDs in (D)TLS—or at least, that’s how I like to think about it. However, there are no guarantees that the operator will support this feature or that they will support the report-in period that you want!
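For reference, requesting PSM is a single AT command. The sketch below is illustrative only: the timer bit-strings are my reading of 3GPP TS 27.007 / TS 24.008 (roughly "TAU every 5 hours, stay reachable for 20 s"), and as noted above the network may grant different values or reject the request entirely, so the granted values reported by the modem must always be checked.

```c
/* Sketch: requesting LTE Power Saving Mode with AT+CPSMS. The two quoted
 * strings encode "requested periodic TAU" (T3412 ext) and "requested active
 * time" (T3324) as 3 bits of multiplier + 5 bits of value. */
#include <string.h>
#include <unistd.h>

/* `at_fd` is an already-opened, already-configured AT command channel. */
int request_psm(int at_fd)
{
    /* "00100101": 001 -> units of 1 hour, 00101 -> 5  => TAU period ~5 h
     * "00001010": 000 -> units of 2 s,    01010 -> 10 => active time ~20 s */
    const char *cmd = "AT+CPSMS=1,,,\"00100101\",\"00001010\"\r";
    ssize_t n = write(at_fd, cmd, strlen(cmd));
    return (n == (ssize_t)strlen(cmd)) ? 0 : -1;
}
```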
Maintaining an active session for communication between the endpoint and the edge device is highly power-intensive. Even with (e)DRX, the average power consumption remains significantly higher than in sleep mode. Moreover, the vast majority of devices do not need to frequently ping a management server, as configuration and firmware updates are typically rare in most IoT deployments.
Great pointer! My sibling post in this thread references a few other blog entries where we have detailed using eDRX and similar low power modes alongside Connection IDs. I agree that many devices don't need to be immediately responsive to cloud to device communication, and checking in for firmware updates on the order of days is acceptable in many cases.
One way to get around this in cases where devices need to be fairly responsive to cloud to device communication (on the order of minutes) but in practice infrequently receive updates is using something like eDRX with long sleep periods alongside SMS. The cloud service will not be able to talk to the device directly after the NAT entry is evicted (typically a few minutes), but it can use SMS to notify the device that the server has new information for it. On the next eDRX check in, the SMS message will be present, then the device can ping the server, and if using Connection IDs, can pull down the new data without having to establish a new session.
Is "Non-IP Data Delivery" (basically SMS but for raw data packets, bound to a pre-defined application server) already a thing in practice?
In theory, you get all the power saving that the cellular network stack has to offer without having to maintain a connection. While at the protocol layer NIDD is handled almost like an SMS (paging, connectionless), it is not routed through a telephony core (which is what makes SMS so sloooow). The base station / core will forward it directly to your predefined application server.
It has been heavily advertised, but its support is inconsistent. If you are deploying devices across multiple regions, you likely want them to function the same way everywhere.
802.11 supports the same thing. A STA (client) can tell an AP that it'll be going away for some time, and the AP will queue all traffic for the STA until it actively reports back. Broadcast traffic can also be synchronized to particular intervals (but low power devices are usually not interested in that anyway for efficiency reasons).
I have very little experience with Wi-Fi, as the industry I worked in relied almost exclusively on cellular networks. However, I wonder how many Wi-Fi routers actually support this functionality in practice - as queueing traffic means you need to cache it somewhere.
Author of this post here -- thanks for sharing your experience! One thing I'll agree with immediately is that if you can afford to power down hardware that is almost always going to be your best option (see a previous post on this topic [0]). I believe the NAT post also calls this out, though I believe I could have gone further to disambiguate "sleeping" and "turning off":
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic), but for many low power use cases, being able to sleep for an extended period of time is more important than being able to immediately push data to devices.
(edit: there was originally an unfortunate typo here where the paragraph read "less important" rather than "more important")
Depending on the device and the server, powering down the modem does not necessarily mean that a session has to be started from scratch when it is powered on again. In fact, this is one of the benefits of the DTLS Connection ID strategy. A cellular device, for example, could wake up the next time in a completely different location, connect to a new base station, be assigned a fresh IP address, and continue communication with the server without having to perform a full handshake.
In reality, there is a spectrum of low power options with modems. We have written about many of them, including a post [1] that followed this one and describes using extended discontinuous reception (eDRX) [2] with DTLS Connection IDs and analyzing power consumption.
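For anyone curious what the Connection ID setup looks like on the device side, here is a minimal client sketch with mbed TLS (assuming a build with MBEDTLS_SSL_DTLS_CONNECTION_ID; the two calls are collected in one function for readability, but they belong at different points of the normal bring-up, as the comments note). It is a sketch of the general technique, not the exact firmware from the post.

```c
/* Enabling DTLS 1.2 Connection IDs (RFC 9146) on an mbed TLS client.
 * With an empty "own" CID, server->client records stay untagged while
 * client->server records carry the server's CID -- which is what lets the
 * server keep recognising the device after its NAT mapping (and therefore
 * its apparent source IP/port) changes. */
#include <mbedtls/ssl.h>

int enable_connection_id(mbedtls_ssl_config *conf, mbedtls_ssl_context *ssl)
{
    /* 1) On the config, before mbedtls_ssl_setup(): advertise CID support.
     *    Length 0 means we don't need the server to tag records sent to us. */
    int ret = mbedtls_ssl_conf_cid(conf, 0, MBEDTLS_SSL_UNEXPECTED_CID_IGNORE);
    if (ret != 0) {
        return ret;
    }

    /* 2) On the context, after mbedtls_ssl_setup() but before the handshake:
     *    enable the extension with an empty own CID. */
    return mbedtls_ssl_set_cid(ssl, MBEDTLS_SSL_CID_ENABLED, NULL, 0);
}
```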
An IPv6 router with a stateful firewall blocking incoming connections could have just the same issues with timeouts, I'd imagine. Switching to IPv6 doesn't just mean that anyone can make a P2P connection to anyone else (even STUN needs a third-party server to coordinate the two peers).
(D)TLS session resumption (I'm not sure if their "Connection IDs" are that or something similar) seems like the most foolproof solution to this scenario, assuming that the remote host can support it.
Not if the end user isn't in control of the firewall. (And if they were, then they could just forward dedicated ports for the devices they need.) It might not be as bad as the CGNAT situation, but there are plenty of big WANs that can't be reconfigured at will.
I have firewalls in v4 and v6 networks which don’t do any natting (well other than some 6-4 between them). They track sessions for security purposes, and they time them out for both security and memory reasons.
>An IPv6 router with a stateful firewall blocking incoming connections could have just the same issues with timeouts, I'd imagine.
You'd be surprised... PCP (Port Control Protocol), implemented by large vendors such as Cisco and Apple, is able to punch through a firewall for up to 24 hours in a single session.
And also so many other things. ARP goes away, DHCP goes away -- yet people reinvented DHCP anyway and did it wrong (IMHO).
I'm of the opinion that IPv6 changed some small things just enough that people have to learn new stuff -- and somewhere along the way, people also forgot that NAT is not a firewall.
Great question that I don't think the other commenters really answered. Sadly I don't really know "why" either. Second-system effect?
Instead of making IPv6 just "IPv4 with a 128-bit instead of 32-bit address space", the designers changed a bunch of things. NDP is new and different from ARP. ICMPv6 is extended and different. MLD instead of IGMP for multicast group membership. DHCP is extended and has a bunch of other alternatives, like SLAAC. IPsec is mandatory now.
All the new IPv6 protocols seem more secure, more powerful, and more efficient. So I guess that is "why". But I bet in an alternative history where backwards compatibility and the upgrade process were prioritized IPv6 would be a lot more widespread.
In general, the IPv6 stack is not supported on many smaller MCUs. It is not that options like FreeRTOS can't support the stack, but rather that the resource constraints pose a challenge.
That being said, most modern SoCs are competitively priced... and will boot Linux just fine for under $5/part. =3
Some of the new IoT protocols build on ipv6 natively.
The resource overhead is minimal for modern MCUs. Dropping DHCP and ARP can save a lot of resources too. Also, I have MCUs with more RAM than my first PC.
Disabling IPv6 in ESP-IDF can save about 40KiB of flash and 2KiB of RAM. Not enough that I'd do it by default, but I've hit the limit of some of my hobbyist ESP32s to the point where I've disabled modules to cram my code in there.
Disabling IPv4 saves 25KiB of flash and less than a KiB of RAM. If you're down to the last kilobytes, disabling IPv6 makes more sense than disabling IPv4. Both options are choices of last resort, though.
Eliminating NAT makes virtual networking much hairier. E.g., my desktop is currently connected to a big enterprise network that keeps track of all devices and will only allocate one IP address per MAC address. That leaves no IPs for any VMs on my desktop to use, so they must go through NAT if they want to communicate with the outside world.
Based on your description, you're using NAT as a means to bypass network restrictions. It's a great hack that will do the trick wonderfully, but if that's the officially proposed solution then the network is not set up for what it's actually used for.
Contrary to popular belief, IPv6 does have NAT, it's just stupid and unnecessary in most use cases. There are even different kinds of NAT; there's the "swap the public prefix with a private one" NAT (useful for shit ISPs that rotate your IPv6 prefix to make you pay extra for a static one, so your addresses don't keep changing), or the IPv4-like "masquerade all traffic when forwarding" one that'll work in your weird enterprise network.
With the latter option, you can set up an IPv6 with DHCPv6 on the upstream and whatever combination of SLAAC or DHCP you want on the VM side, and set up NAT using guides like this (possibly outdated iptables-based) one: https://forums.raspberrypi.com/viewtopic.php?t=298878
> Based on your description, you're using NAT as a means to bypass network restrictions.
The threat model on this big enterprise WAN isn't "an authorized user creates a VM on an existing machine", but rather "some rando plugs something in with an Ethernet cable and goes to town on the intranet services". It has a whole web portal where you can log in and register your devices. So it's not like anything's really being bypassed here.
Anyway, I still prefer NAT for VMs over most passthrough schemes, which I've found unreliable even on IPv4 networks. What the outside network doesn't know about can't hurt it.
> The threat model on this big enterprise WAN isn't "an authorized user creates a VM on an existing machine", but rather "some rando plugs something in with an Ethernet cable and goes to town on the intranet services". It has a whole web portal where you can log in and register your devices. So it's not like anything's really being bypassed here.
Isn't that what the MAC filtering is for? I don't see the point of stopping a single registered MAC on a single port from having multiple IPs.
I might be misremembering some of the details (in fact, given that the web portal works, I think it might allow new devices to have local IPs, but bounce all outbound packets until they're registered), but however it works exactly, it does not like passthrough networking on VMs in practice.
(If I had to guess, just letting any machine have a million new IPs for the firewall to track has its own issues, so you'd end up with policies upon policies.)
In any case, sometimes middleboxes just don't behave precisely how we want them to, and that's why I'm skeptical of the typical IPv6 position of "a flat /64 network (or something emulating one) is all you'll ever need".
"Doctor it hurts when I". Keep track of all devices, and alloc a static /64 per (host) MAC.
And even in the face of a supposed malicious network admin like yours, you can still adopt IPv6, and use the IPv6 private network ranges for the VMs, and then do NAT like it's 1983, same as you would have otherwise done with IPv4, except that all other uses cases get easier, while this one use case remains as hard as it was before.
If they're not routed directly through the network card but use a virtual switch instead, the network upstream only sees the server MAC. If they're directly hooked into the network card they can have their own MACs, but that approach can make it hard to communicate between host and VM.
Why would it not? There's even a third MAC address field in the 802.11 headers for just that scenario, as far as I remember (i.e. to distinguish between "802.11 source/destination" and "Ethernet-level actual source/destination").
You actually need to use special 4addr/WDS frames with 4 addresses[1]. However no one accepts these by default so it's unusable for a Laptop that you want to connect to random wlan networks.
Hm, maybe I'm mixing up something here, but all I can say is that it works like this on my machine: I have a VM running in bridged mode, I'm connected to my home router using Wi-Fi, and I can see the virtual MAC address of the VM on the router management interface.
Possibly this is just at the DHCP level, though, and what the router sees just comes from my host MAC address? A quick packet trace suggests that.
Few cellular modems commonly used in IoT support IPv6, and not all mobile network operators provide an IPv6 address. Since cellular connectivity plays a major role in the industry, IPv6 cannot be used as a blanket solution to this problem.
Either it would or it wouldn’t help in most cases, but the absence of consideration of it at all weakens the article’s arguments from a carrier perspective. IPv6 adoption was at 90% by US mobile carriers a couple years ago, and the US is not known for its telco infrastructure investment; so, while using IPv6 may not be a uniform cure for their issues, the article’s total focus on legacy IPv4 NAT issues is in stark contrast to its availability carrier-side in one of the weakest examples available. China regulates that IPv6 be supported and enabled by default on all hardware sold for use in-country since last year, and telcos have six months left until a first-stage IPv4 new-hardware prohibition goes into effect later this year, so the assumption that most cellular modems don’t support IPv6 seems unlikely as well given their regulatory climate. This deserves more research or at least an explanation of why such was not done for the initial release of the paper.
There are still countries with zero IPv6 on mobile, e.g. New Zealand. All three mobile carriers refuse to deploy IPv6 on mobile despite two of them doing so on their fixed line networks. Do agree with your point tho IPv6 should be considered a possible solution.
On modern firewalls/routers, NAT is only one of the cooks in the network kitchen raining on this author's parade. Stateful Packet Inspection has timeouts too!
It seems brave to let an IoT device talk to someone over an unencrypted WAN though. They're often of pretty varying software quality and rarely updated.
If you really want IoT wifi devices, put them on a separate wifi, and only let them talk to a local device that you can keep up to date. Assume they're vulnerable to local attacks over wifi and act accordingly, e.g. don't give the IoT wifi access to your other devices beyond to that controller, and definitely not to the wider internet.
If they're closed source, assume they're already compromised from the factory
The issue in IoT is that most people expect to be able to control their IoT (e.g. Smart Home) devices from outside of their network. This requires you to have a central server these devices communicate with or a similar deployment on-premises with a public IP address.
I've always wondered how it is economically feasible to run these central services without a monthly subscription. If you stop selling devices you'll go under fairly quickly.
I have all my "Tuya" IoT devices on a separate network, isolated from my personal intranet, and the cloud servers these devices talk to are in China. I pay nothing to "Tuya" for their cloud service, never have. If the "Tuya" service ever goes down, my home automation is also down, and that sucks. I'm trying to replace it all with "Tasmota" devices now, but it's not quite as easy or as cheap to do.
and somehow they are still in business, and popular.
If you do care about security, keeping your home-automation within your own control is probably the only sane path. Homeassistant and similar open source things like openhab are pretty good if a bit fiddly, and a wireguard vpn like tailscale a fairly practical way to access it when away from home.
NAT breaks the central idea of IP (i.e. packet switching with stateless intermediary nodes). All of its problems are essentially downstream from there.
What you can do is port forwarding. You have a bunch of devices behind a 1:N NAT, so they share one IP address. For specific services on those devices, you can pair dedicated ports with this IP address, binding them to internal IP:port pairs.
It's not a perfect solution for every scenario, and requires configuration, but there it is.
This is how people on residential lines run web servers, mail servers, ... they map TCP ports like 443, 80, 22 and 25 on their router to go to specific internal hosts.
With CG-NAT this doesn't work. Multiple customers are sharing the same IP address, all of which are sitting behind a NAT. Further the internet gateway is a NAT sitting behind the CG-NAT. And if you prefer to use a nice Mesh WiFi router, well that's a third NAT layer.
Common suggestions I've heard:
"Use a VPN"
I tried to buy a computer from Apple directly. They detected the VPN and wouldn't let me purchase it. I turned off the VPN and the purchase worked. I then got a call from Apple asking me if I really did intend to buy the computer.
"Use Tailscale"
On my setup, tailscale can't really navigate the NAT setup correctly. I have a 200Mb downlink and I will often only get a 10th of that through tailscale. Also the routing table no longer works as routing is handled by a myriad of netfilter rules -- which may or may not conflict with docker on the same box.
"Use Ubiquiti (or other networking gear) to get rid of the last NAT"
Gah. I didn't need to perform a huge cash outlay 10 years ago and rewiring the house to do the same thing I'm trying to do today.
I don't have my own external IP address (even if dynamic) + I want my devices to be spontaneously contactable from the network without polling anything from the inside = does not compute
An ad for... the IETF? All of the firmware discussed in this post is open source, and we even contributed DTLS Connection ID server side support to a popular open source library [0] so other folks can stand up secure cloud services for low power devices. Sure, we sell a product, but our broader mission is making the increasing number of internet connected devices more secure and reliable. When sharing knowledge and technology is in service of that mission, we do not hesitate to do so.
Also, considering the state of IoT security, it's probably not a great idea to have everything accessible anyway. But that's a slightly different problem to solve.
It’s not going to solve a stateful firewall timing out.
You try to continue a tcp session that’s timed out on my firewall and the packets will be dropped.
This applies a fair amount to me when I suspend my laptop: my ssh session will drop as both the server and the firewalls drop the session while it sits there peacefully. When it comes back, the TCP packets get sent into the void.
Meanwhile my WireGuard connection, which runs through two separate IPv4 NATs, works just fine, as it doesn’t rely on sessions or a server timing out a socket.
Why not let through all IPv6 traffic? It seems improbable that your device is going to get caught up in an IPv6 scan; the address space is too vast - I'm assuming here.
Just have relatively secure devices using IPv6 - not shipping with standard default admin passwords would nearly be enough if there aren't obvious vulnerabilities.
Isn't this the whole point of Thread? You have a low-power mesh network inside a deployment (house, office, factory, whatever), and it's the border router which does the "communicate with the internet/LAN/wider network" piece. These are typically plugged in (so no worries about having to be low power) and have plenty of memory (so no worries about having to drop established routes like a router).
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic), but for many low power use cases, being able to sleep for an extended period of time is less important than being able to immediately push data to devices.
NAT was introduced by a private company called Network Translation Inc. and successfully broke efforts to migrate off IPv4 (which was supposed to be EOLed by 1990), and it permanently broke the "network of hosts" into an asymmetric one of servers and clients.
Note that we had a solution for address exhaustion by 1991, but it was just "good" and not "perfect", and worst of all it used the hated OSI protocol stack (TUBA - TCP & UDP on top of OSI CLNS, also known as IPv9). It even had at least two usable implementations at the time it was proposed (for SunOS and Cisco IOS).
> permanently broke the "network of hosts" into an asymmetric one of servers and clients.
That was inevitable and can't reasonably be blamed on NAT. As but a few examples: ISPs arbitrarily break things unless you pay them extra. Stateful firewalls are a good thing. Even on my local LAN I can't reliably SSH into my laptop because it's on WiFi and does funky power-saving stuff with the chipset, I guess. My phone is far worse than my laptop in that regard.
Setting up a reliable and widely reachable server requires deliberate effort regardless of the existence of NAT.
There's some work predating it, but it was very much undeployed before PIX let the genie out, because even the early NAT work notes it might be problematic even in the short term.
It's partly technical (BSD Sockets being a bad API that pushes low-level protocol details into applications) and partly business - vendors didn't want to do the work to upgrade software and hardware, especially with the advent of CEF and similar hardware routing options. And by the 1990s the government-led standardisation efforts that gave us widespread Ethernet and IPv4 got axed, and efforts to make vendors update if only for federal contracts died in waiver hell.
The other kinds of problems grew from there over time.
The thing I like about NAT is that it is essentially an ISP side stateful firewall. I would migrate to the ipv6 globally addressed mode immediately if my ISP had a checkbox "disallow all incoming connections".
ISP side? Hopefully not, or rather, only if you're behind one of those awful CG-NATs (and I'm not aware of any that let you actually configure port forwarding, although my knowledge here might be outdated; fortunately I haven't had to deal with one in a long time). Otherwise, it's usually your CPE doing the NATting.
It sounds like you want an ISP-provided stateful firewall though, upstream of your (metered, slow) connection, which I'd agree would be a great feature to have!
You're right, I was thinking of CGNAT. My ISP definitely does it (they have far more users than IP allocations) and my router does NAT as well, so I guess I have a double NAT.
I don't know networking all that well. In my mind, I have 50 devices connected to my router behind NAT. My Mac, My Apple TV, my iPhone, My PC, My Linux Box, My partner's versions of all of those. My video games. Etc
From outside there's 1 IP address. With IPv6, every device would get its own address outside. Why do I want that? That sounds less private to me. Am I misunderstanding something? Lots of traffic on one IP address sounds more obfuscated than all separate.
With IPv6, every device has multiple IP addresses. One or more addresses that are rotated* to prevent you from being tracked easily, and one that's derived from your device's MAC address so you can make your devices easily accessible from WAN by opening ports in your firewall if you want to.
You could disable the rotating addresses, or disable MAC-based ones by using DHCP, but there's usually no point.
As for why you would want something like that: a whole bunch of software and hardware breaks because of NAT. Consumer NAT has some monkey patching inside of it rewriting some protocols to make them work again (which also allowed random websites to open arbitrary ports to arbitrary addresses in some Linux routers a while back, because NAT overrules firewall settings to work) but there are still limitations.
For instance, if you're having issues with your Nintendo Switch, Nintendo will tell you to forward every single port to your Switch (https://en-americas-support.nintendo.com/app/answers/detail/..., hope that IP address doesn't get reassigned to an unpatched device later). Multiple Xbox consoles behind the same NAT requires tricking them into super-restricted-NAT mode to work, or enabling UPnP which allows devices to open ports in your firewall without any authentication.
NAT just kind of sucks. IPv6 wasn't ready for deployment when NAT gained popularity, but all of the reasonable problems have been solved over a decade ago.
*=default rotation happens daily, but your OS may allow you to pick a shorter duration. I've found out the hard way that setting this to five minutes will fill up Linux' route table real fast after a few days.
No, it doesn't. At least the last time I checked, unless you go out of your way to implement a non-standard configuration, IPv6 is a disaster for personal privacy for the typical multi-user household.
Then again, the "typical" multi-user household is likely logged in to most things via SSO with Google or Facebook and probably has approximately zero fingerprinting mitigations in use so perhaps it isn't worth worrying about?
If you aren't the typical household then given 2^64 addresses and a Linux box serving as a router you've got quite a few options available. Including various creative reinventions of NAT that don't break basic functionality.
> IPv6 is a disaster for personal privacy for the typical multi-user household
Why? With privacy extensions (which are normally enabled for user devices), then all someone can do is look at the prefix. This is identical to looking at the IPv4 address in a NAT setup, and it hasn't been that much of a privacy disaster.
> This is identical to looking at the IPv4 address in a NAT setup
It is not identical unless the OS uses a new IP for every new outbound connection. I believe that would qualify as a (very) nonstandard configuration.
> it hasn't been that much of a privacy disaster.
Indeed, it was tongue in cheek which is why I went on to point out SSO. The reality is most people aren't willing to sacrifice convenience to retain even a shred of privacy.
If you are one of the few who care then you can implement one of the many possible non-standard solutions.
Even disregarding fingerprinting, a single household doesn't have enough traffic from separate devices/users to the same servers to really matter from a privacy standpoint.
If my PC uses the same IP as my partner's to talk to Google, it hardly matters for our privacy if they mix up the attribution of traffic between the two of us.
Speak for yourself. I also don't want it to be readily apparent how many different devices I have, or when I'm using which one, or how many people are in the household, or when who is home.
Granted any service that I consistently interact with is likely to be able to figure out at least some of that information if they put in some effort. But I don't want to be freely providing a complete picture for zero effort.
Creepy data aggregator stories pop up on the HN front page regularly so hopefully I don't need to explain why I feel this way.
Yeah, I mean, I share those concerns in general, but my efforts are mostly centered around aggressive ad/tracker-blocking (moderate DNS-level blocking at the network level, more aggressive at the device level + browser-level blocking) and the avoidance of non-privacy-focused services, e.g. avoiding the popular social networks entirely, and using privacy-supporting pay-for services.
Using the same IP for all of my devices, for me, generally falls into the same bucket of anti-fingerprinting techniques that are used by the Tor Browser like letterboxed resolution that I don't find practical for general use. If I want to actually prevent fingerprinting by IP, resolution, etc. then I'll actually use the Tor Browser.
It depends what you're trying to defend against. The rotation hinders associating an address with a particular device. If someone looks at the network prefix to see if people are in the same household, then that's exactly the same as looking at the IPv4 address to determine the same thing.
> From outside there's 1 IP address. With IPv6, every device would get its own address outside. Why do I want that? That sounds less private to me. Am I misunderstanding something? Lots of traffic on one IP address sounds more obfuscated than all separate.
Having recently enabled IPv6 for my home network, the "why" was that a) IPv6 to IPv6 connections are nominally more efficient than those that have to traverse NAT and b) it enables connectivity to/from IPv6-only internet devices.
The privacy upsides of a single IPv4 IP for a household are, to me, more marginal than the above benefits.
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic),
This is the key problem. I think it would be best solved by NAT devices having some way of probing their timeout policy, and then notifying the endpoints when they expire a mapping.
One way to do that might be for a client to deliberately send a packet to the server with an insufficient TTL.
The reply (via ICMP) could then contain fields specifying the timeout in minutes of the mapping, together with some generation number specifying if the mapping table has been cleared due to reboot or overflowed since last queried.
There might even be a way to request a specific mapping become long lived - perhaps for months or years.
The benefit of all this is that client devices can do push notifications from servers or P2P notifications, all with no polling - allowing for example coin cell devices to last for months or years (with appropriate WiFi protocol improvements too)
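A sketch of the sending side of that probe follows, with a placeholder server address. To be clear, the ICMP reply carrying NAT-timeout fields is the idea proposed above, not something routers do today, so this only shows how an endpoint could address a packet to its server while making sure it dies at (or just past) the NAT hop.

```c
/* Deliberately short-TTL UDP probe: the NAT sees an outbound packet and
 * creates/refreshes its mapping, while a router past it answers with ICMP
 * Time Exceeded -- which, per the proposal above, would be extended to
 * describe the mapping's timeout and a table-generation number. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (sock < 0) {
        perror("socket");
        return 1;
    }

    int ttl = 2; /* enough hops to cross the home NAT, not enough to reach the server */
    setsockopt(sock, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));

    struct sockaddr_in server = {
        .sin_family = AF_INET,
        .sin_port = htons(5683),
    };
    inet_pton(AF_INET, "192.0.2.10", &server.sin_addr); /* placeholder address */

    const char probe[] = "nat-probe";
    sendto(sock, probe, sizeof(probe) - 1, 0,
           (struct sockaddr *)&server, sizeof(server));

    close(sock);
    return 0;
}
```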
People have been complaining about NAT for decades. It's time to STFU. It's a well-known problem with a bunch of well-known solutions.
If you don't like NAT you could go IPv6.
But really, why do you need to talk to your device? If it's just reporting in NAT is irrelevant. If you want to do device management just write something into your protocol to check for updates/commands and deal with it on a periodic basis. You can even do that on startup, so you can tell the customer to power cycle the device. It's unlikely that any IoT device needs instant updates, so long periodic updates are probably fine.
There are plenty of IoT devices that people want to execute commands on (anything remotely controlled, basically). Polling for commands on a periodic basis introduces lag into that process which is irritating. Furthermore, polling at a frequent interval can end up using a lot of power as well versus waiting in a receive-only mode for an incoming command.
The alternative to polling is unfortunately polling, which is what the article is about.
You can avoid polling for messages, but you have to send packets outbound regularly in order to maintain a NAT mapping & connection, so that the external side can send messages inward.
The latency is overcome this way, so latency is a solvable problem, but this need to constantly wake up a radio every <30s in order to keep a NAT session alive is a significant power draw.
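Concretely, the status quo workaround looks like the loop below (placeholder server address; the 25-second interval is a common rule of thumb for UDP mappings, not anything the NAT guarantees), and every iteration is a radio wake-up:

```c
/* Tiny application-level keepalive that refreshes a UDP NAT mapping so the
 * server can still reach the device. Each send wakes the radio, which is
 * the battery cost being discussed above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define KEEPALIVE_INTERVAL_S 25   /* assumed safe margin under typical NAT UDP timeouts */

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

    struct sockaddr_in server = {
        .sin_family = AF_INET,
        .sin_port = htons(5683),
    };
    inet_pton(AF_INET, "192.0.2.10", &server.sin_addr); /* placeholder server */
    connect(sock, (struct sockaddr *)&server, sizeof(server));

    for (;;) {
        /* A single byte is enough to refresh the mapping. */
        send(sock, "", 1, 0);
        sleep(KEEPALIVE_INTERVAL_S);
    }

    /* not reached */
    close(sock);
    return 0;
}
```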
In theory you might be able to avoid this with NAT-PMP / UPnP however their deployments are inconsistent and their server side implementations are extremely buggy.
The problem(-s) described in the blog post are really acute for IoT in general, especially if you want your device to run on batteries or you have a limited data budget.
> Therefore, when you try to continue talking to the server over a previously established session, it will not recognize you. This means you’ll have to re-establish the session, which typically involves expensive cryptographic operations and sending a handful of messages back and forth before actually delivering the data you were interested in sending originally.
The blog post mentions Session IDs as a solution, but these require servers to be stateful, which can be challenging in most deployments. An alternative is Session Tickets (https://datatracker.ietf.org/doc/html/rfc5077), but these may cause issues when offloading networking to another device, such as a cellular modem, as their implementation may be non-standard or faulty.
incoming rant
These issues could be mitigated—or even solved—by using mature software platforms like Zephyr RTOS and their native networking stacks, which receive more frequent updates than traditional “all-in-one-package” SoCs. However, many corporations choose to save a few dollars on hardware at the expense of incurring thousands in software engineering costs and bug-hunting. It is often seen as more cost-effective to purchase a cellular modem with an internal MCU rather than a separate cellular modem and a host MCU to run the networking stack. It is one of the many reasons why many IoT devices are utter garbage.
Author here -- thanks for engaging in the discussion! You won't find any pushback from us on using Zephyr -- we are contributors, the firmware example in the post is using it (or Nordic's NCS distribution of it), and we offer free Zephyr training [0] every month :)
[0]: https://training.golioth.io/
> It is often seen as more cost-effective to purchase a cellular modem with an internal MCU rather than a separate cellular modem and a host MCU to run the networking stack.
This one isn't just cost--the compliance restrictions that the cellular carriers place on you are idiotic.
The big one we bumped into is "must allow allow carrier initiated firmware updates with no restrictions on scheduling" which translates to "the carrier will eat your battery often and without warning".
In addition, many IoT devices may not call home more than once every couple of months. And the carrier will happily roll out tower firmware that will kill those being able to call home.
If I use a module with my own firmware, the modem folks will simply point fingers at me. If I use a module with integrated SoC and firmware and it gets updeathed, I get the "joy" of yelling at the cellular module manufacturer.
(I had the wonderful experience of watching a cellular IoT project go gradually dead over 3 days as the carrier rolled out an "upgrade" across its system. We got a front seat as the module manufacturer was screaming bloody murder at the carrier who simply did "We Don't Care. We Don't Have To. We're the Phone Company.")
Your experience is bizzare. Was it Verizon (or AT&T/T-Mobile) and did you use a Cat-1bis/Cat-M/NB-IoT or higher class modem?
Normally US carriers require Firmware-Over-The-Air (FOTA) capability for the modem firmware. This is not the case for deployments outside of the United States, to my knowledge.
It would be interesting to hear more about your story!
Sounds like you need a more watertight contract and more lawyers.
A cellular carrier has a legal budget whose rounding errors are likely larger than your company's total worth.
Good luck litigating.
All the lawyers in the world won't protect you from paying the penalties in the contract you signed when you overtly and deliberately failed to deliver.
What makes a separate cellular modem better than an internal cellular modem? Is it because software updates are available for the separate modems?
I am evaluating some Nordic semiconductor parts for a project. They seem to have an internal modem but Nordic uses zephyr. Any thoughts?
> What makes a separate cellular modem better than an internal cellular modem?
The US 3G shutdown required some rather expensive and unexpected upgrades. Vendors signed long-term contracts with 3G providers, and then "someone" was on the hook, to replace something, when the 3G vendors terminated their contracts prematurely.
The deeper the modem was integrated into a product, the harder it was to change. The shallower the modem was integrated, the easier it was to change.
For example:
One of my cars just lost its internet connectivity, and the automaker never offered any way to fix it. (I didn't care, I only used Android Auto in that car.)
My employer (IOT) sent out free chips to our customers. They had to arrange someone to go do a site visit and swap a chip while on a phone call with us. We're small and early enough that it wasn't a big deal.
My solar panel vendor wanted me to $pend big buck$ on a new smart meter and refused to honor their warranty. I told them to run a cable to the ethernet port in my meter.
Car manufacturers should abandon developing their own entertainment systems and instead collaborate with Apple (CarPlay) and Google (Android Auto) to improve integration and support a wider range of use cases. Unfortunately, they seem to revive these in-house efforts every 4-6 years (e.g. Mercedes-Benz).
The only feature I need to control remotely in my car is preheating during winter—I wonder how they could achieve that without cellular connectivity, as paying a subscription for such a service would make it less attractive to me.
I like the Tesla car computer a lot more than Android Auto. I just wish I could install 3rd party apps.
Part of the reason is there's no delay waiting for my phone to connect, and no random disconnections while driving.
There are issues with infotainment systems in older vehicles supporting Android Auto and Apple CarPlay. This is primarily because car manufacturers (or their subcontractors) aren't good at integrating these systems. Often they are tight on budget, too!
Personally, I’m not a fan of Tesla’s car computer. My main annoyance is the lack of physical buttons. All I want from my car is reliable navigation, seamless audio playback, and easy mobile call handling.
> What makes a separate cellular modem better than an internal cellular modem?
When using a separate cellular modem, you can connect it to your MCU via either a USB or UART interface. In IoT applications, UART is the more common choice. Then you can multiplex the interface with CMUX, allowing simultaneous use of AT commands and PPP.
With frameworks like lwIP or Zephyr supporting PPP, you can get the network up and running quickly and retain full control over the networking and crypto stacks. With Zephyr you also get POSIX-compliant sockets, which lets you leverage existing protocol implementations.
In contrast, using an SoC often requires reintegrating the entire network stack, as they typically do not support POSIX sockets. I've worked on SoCs that only support TLS 1.1, and the vendor refused to upgrade it because that would require re-certifying their modem. Switching to a different SoC can mean repeating this process from scratch, as different vendors implement their own solutions (sometimes the same vendor even has different implementations for different modems).
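To make the "POSIX-compliant sockets" point concrete, here is a minimal sketch of what application code looks like once PPP (or any network interface) is up and the platform exposes a BSD/POSIX socket layer, as Zephyr and lwIP's sockets API do. The server address, port, and payload below are placeholders I made up for illustration, not anything from the article.

    /*
     * Minimal sketch: with a POSIX-style socket layer, firmware code looks
     * like ordinary host code. Address, port, and payload are placeholders.
     */
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        const char payload[] = "temp=21.5";          /* hypothetical sensor reading */

        int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (sock < 0) {
            perror("socket");
            return 1;
        }

        struct sockaddr_in server = {
            .sin_family = AF_INET,
            .sin_port = htons(5683),                 /* CoAP's default port, for example */
        };
        inet_pton(AF_INET, "192.0.2.10", &server.sin_addr);   /* documentation address */

        /* One datagram out; on Zephyr the same call works whether the stack is
         * native (PPP to an external modem) or offloaded to modem firmware. */
        if (sendto(sock, payload, strlen(payload), 0,
                   (struct sockaddr *)&server, sizeof(server)) < 0) {
            perror("sendto");
        }

        close(sock);
        return 0;
    }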
> I am evaluating some Nordic semiconductor parts for a project. They seem to have an internal modem but Nordic uses zephyr. Any thoughts?
It runs on Zephyr RTOS and can be built as a standalone modem (https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/appli...). You can either use Zephyr's native networking stack or offload it to the modem while retaining the same API. This means you get access to all of Zephyr’s protocol implementations without additional effort. The design makes it feel as though you have a completely independent MCU connected to an external modem.
That said, it does have quirks and bugs, particularly when offloading networking. It also has relatively limited resources, with only 256 kB of RAM and 1 MB of flash storage.
Overall, it is the best SoC I’ve worked with, but it is still an SoC. Whether it suits your project depends on your specific use case. If you plan to integrate additional radios in the future (e.g., UWB, BLE, Wi-Fi), I’d recommend using a separate MCU if your budget allows; this will provide significantly more flexibility. Otherwise it is definitely one of the better SoCs currently on the market, to my knowledge.
PS. It only supports Cat-M (& NB-IoT but I'm going to skip over it intentionally!) which is not globally supported, so you should make sure the region you want to deploy in supports that technology.
Security.
On one hand, licensing requirements and regulation often mean that modems are locked down in terms of firmware updates, reference documentation, source, and capabilities. This often translates into a larger "black box" area, and one embedded inside your SoC instead of physically separate and connected over a serial bus.
On the other, on-chip modems often (not sure about those Nordics) have DMA.
The combination of those two is scary.
> The combination of those two is scary.
Could you clarify what you mean by this?
Cellular modems go nonfunctional/obsolete much faster than other systems. 3G is almost entirely gone worldwide. 4G is still around, but providers are already reducing how much of their tower capacity they dedicate to it. The standards body is working on 6G; who knows when that will come and push out older stuff.
In the case of my car I don't care - I have never found a use for the cellular connectivity it has (if any). However, there are lots of other devices where the cellular connectivity is important and users will want to upgrade the modem to keep it working. If cellular connectivity is just a marketing bullet point nobody cares about, then integrated is good enough; but if your device really isn't useful without the modem, make that modem replaceable for somewhat cheap.
4G should survive better than 2G and 3G, because the 5G standard allows for mixed-mode deployments where the coordination channel runs as 4G, and the slots can be 4G or 5G depending on what the client device is capable of. Running the coordination channel with 5G encoding could be a little more efficient, but it's not a big loss compared to running a minimum-size 2G/3G allocation.
EDGE and E-GPRS are going to survive for at least another 5-10 years, although this statement might not be globally accurate.
Generally the modem would support EDGE/E-GPRS (2G) and 3G simultaneously. In Europe, EDGE/E-GPRS is still quite popular and unlikely to be sunset in the next 5 years. It is a pity that many manufacturers tightly integrated the modem into their PCB instead of creating a separate "communication box" - especially as some use cases have the budget for it.
> The blog post mentions Session IDs as a solution, but these require servers to be stateful, which can be challenging in most deployments.
Doesn't this just move the 'state' into the operating system, or networking layer, in the form of an active TCP connection?
You can use Session Tickets with UDP too.
Nevertheless, with modern cloud providers, moving the state into the OS/networking layer is still easier to scale. You don't need to write your own services to handle it.
Is DTLS a workaround for the session issue? Haven't had much experience with it myself but it does cut down some of the statefulness.
You don’t want to redo the handshake (which, as far as I know, is identical to TLS) every time you send a packet using DTLS. Therefore, you still need to retain state information. In my opinion, using DTLS (UDP) is considerably more nuanced than using TLS (TCP) in an embedded IoT context.
The post details the use of CoAP over DTLS, employing Connection IDs.
[flagged]
Could you elaborate for someone who is unfamiliar?
Zephyr RTOS[1] is an RTOS supported by the Linux Foundation. It has structures similar to Linux for device configuration, like a device tree, and tries to emulate POSIX-ish APIs. I think some embedded people are put off by this configuration structure.
Notably, Zephyr RTOS is the basis of Nordic Semiconductor's SDK[2]. Nordic is a major manufacturer of cellular and wireless MCUs.
[1]:https://www.zephyrproject.org/ [2]:https://docs.nordicsemi.com/bundle/ncs-latest/page/nrf/index...
Edit: GP seems to have a history of posting 1 line hot takes with no elaboration
[dead]
You prefer ThreadX?
Is it used by the army of Israel against civilians? Because I can't think of any other circumstance that would justify such a strong condemnation.
That NAT is a problem presumes that we actually want our IoT devices reaching out to the out-of-intranet zone.
NAT gets the blame, and the intranet as a concept is generally a big corp term.
But I prefer my IoT devices not to need to reach out of my network. For me, NAT is an unwitting ally in the fight against such nonsense.
The mere existence of Tailscale should give a hint that NAT is only a speedbump and not any protection whatsoever. It protects you against nothing. Every method that Tailscale uses to traverse NAT can be in isolation used by any other piece of software. For more info about that you can read the following article.
https://tailscale.com/blog/how-nat-traversal-works
What people really want is a firewall, and since NAT acts as a firewall, they confuse it with that.
My university has a public IP for every computer, but you could still only connect to the servers, not random computers, from the outside. Because they had a firewall.
What ordinary people (as opposed to IT departments) really want is firewall that can't be accidentally disabled by pushing an overly permissive firewall rule.
NAT/port forwarding, for all their faults, make it rather difficult to write rules allowing traffic to a machine you didn't intend to expose to the world.
Consumer routers have very similar UI for managing an IPv6 firewall as IPv4 NAT port forwarding.
This is not in any way a benefit of NAT.
Then... make the firewall UI so that you can't accidentally push an overly permissive firewall rule?
Just because NAT accidentally achieves some good outcomes doesn't in any way imply that said good outcomes are somehow exclusive to NAT.
Yeah but the average person wouldn't know to set up a firewall (and can't count on their ISP to have their best interests at heart.) Therefore the general public benefits from the degree of protection that NAT provides.
Almost 50% of internet traffic is IPv6.
Obviously, those average people have a suitable firewall provided by default on their routers.
I think the vast majority of that is from phones?
It will vary by country, but for example all but one of the large broadband ISPs in the UK use IPv6.
Do they?
Then just enable the firewall by default, or don't even provide a way to disable it unless the user enters "developer/advanced/Pro (tm)" mode. None of these are valid excuses for NAT.
"not any protection whatsoever" is way too strong a statement. NAT does raise the bar to exploiting a random smart lightbulb in your house significantly higher.
The big distinction is that for Tailscale both endpoints know they want to talk to each other, and that both have Internet access. That's not the usual case firewalls are designed for.
Tailscale doesn't strictly need NAT traversal. They can run only their DERP servers and still continue to work. If your firewall tries to block two devices from communicating and yet allows both devices internet access, you have already lost.
Sounds like you like the idea of a stateful firewall, and good news: There are stateful firewalls for IPv6!
They have all the upsides of NATs (i.e. an option to block inbound connections by default), with none of the downsides (they preserve port numbers, can be implemented statelessly, they greatly simplify cooperative firewall traversal, you can allow inbound connections for some hosts).
I found it weird that IPv6 folks are so against NAT as a cultural thing when it works perfectly well on IPv6. The two aren't fundamentally incompatible.
I could have all of my servers in public subnets and give them all public IP addresses, but I still prefer to put everything I can in private. Not only does the firewall not allow traffic in, but you can't even route to them. It now becomes really hard to accidentally grant more access than you intended.
I would hazard that most devices on the internet are in the boat of wanting to talk to the internet but not be reachable from it.
Yea IPv6 folks are indeed against NAT philosophically because it's considered one of the big mistakes of IPv4.
There is a distinction between being publicly addressable and publicly routable. You can have the former without having the latter.
If you want more private addresses, IPv6 has a solution too: use ULAs and not GUAs. Design your internal network so it has mostly ULAs for application servers, database servers and the like, except for the reverse proxy having both publicly accessible GUAs as well as ULAs for talking to the rest of the network.
I personally use ULAs and GUAs concurrently on my network, because I have a residential ISP where my GUA prefix is not fixed.
I'm not opposed to anyone voluntarily using a NAT at all. I just hate it when somebody makes that decision for me, and that unfortunately still happens all the time.
If it's a well-reasoned decision, sure, but I do suspect that more often than not it's a lack of knowledge about alternatives that makes people still opt for NATs, and that just makes me sad on top of being annoyed with the inconvenience of having to tunnel when a direct connection seems so close at hand.
> I would hazard that most devices on the internet are in the boat of wanting to talk to the internet but not be reachable from it.
I highly doubt that. One big example is VoIP: Incredibly common these days, yet so much of it is going through centralized relays, and often for absolutely no technical reason.
How do you feel about NPT?
For a personal network where you decide to use NAT on IPv6? Sure, go ahead.
Being forced into CGNAT on IPv6 is just a dick move, though. And I believe that's the kind of NAT that has coloured the opinions of most detractors of NAT on IPv6.
If you don't want that, then complain about the lack of such a configuration option and configure your firewall so that they can't. But don't cheer on something that breaks functionality others might want (especially if it doesn't actually achieve your own goals reliably).
Oh, I do that too.
But to your point about not cheering on NAT, well I will because I see NAT as useful tool.
It is not an opinion well aligned with the preferences of the IETF. But the purist model of transparent end-to-end networking has never sat well with me. It’s just not a thing we want.
So because you don't want it means that nobody can get it?
Because that's what happens if you advocate for NAT by default. Conversely, with "IPv6 + inbound-blocking-firewall on CPEs by default", everybody gets the same behavior by default, and people that want something else can get that instead.
A telematics tracker in a vehicle that logistics companies (e.g. Amazon, FedEx) use is also considered an IoT device. I don't believe the author is talking about Smart Home appliances exclusively.
What NAT are you using that doesn’t have a firewall? I haven’t personally used one of those since the ‘90s.
The first NAT I used in the middle 90's was IP Masquerading in the Linux kernel, by Pauline Middelink. That had a firewall.
Same, but ISTR we had some Cisco gear where NAT (or PAT, as they insisted; yes, I know the difference; no, no one cared) had a different license or hardware requirement or something separate from the firewall rules.
I agree until I discover I'm doing something where I want to access/change that device. It is really nice when I'm returning home early that I can change my thermostat out of vacation mode. I've often wished I had a way to tell if I left a door unlocked.
Security and privacy is of course critical to all this, but the concept of internet itself is not wrong.
That's what a VPN is for. Every router I've had in the past decade has had support for running a VPN server so you can have one running 24/7 without any additional hardware. Even my retired elderly parents run a VPN server on their home router.
> retired elderly parents run a VPN
Does that VPN use certificates or a pre shared key? Do they understand the different security implications between those two choices?
I hope you're not implying that allowing IoT devices access to the internet is not a massive security vulnerability. IoT devices are notoriously insecure and poorly maintained. I'd much rather have LAN-only IoT devices and an internet-accessible VPN server than letting IoT devices access the internet.
But theirs uses certificates (the router UI generates the openvpn client config files with the certificate embedded inside it) and no, they do not understand the security implications between those two choices.
Mine is a WireGuard VPN with both the pub/priv keypair and a PSK.
I'm implying that a VPN is not a "silver bullet." In particular since they don't understand the model, are stuck with a vendor implementation, and probably never update their router firmware.
There's a reason IoT vendors try to do this all "in device."
Especially if the data is unencrypted and only authenticated by source ip and a long lived token-like thing
What I want from platforms, and I fought for at one time in Google with no success: a platform API that provides applications a way to schedule packets when the radio turns on.
The mobile platforms in particular continue to assume that you can live with their high level HTTP stuff, and it's just not good enough. The non-mobile platforms largely don't even approach the problem at all.
Indeed, that sounds like an obvious feature. Hard to believe it hasn't been implemented! I'd love to have that feature on Linux desktop/laptops. I think you could make lots of applications behave a whole lot better.
The argument from folks on the "architecture review boards" was that multiple connections are always bad and that developers can't be trusted. I'm willing to accept that they did get beaten up quite a bit over power at various points, when applications were at times a big part of the problem. That said, this is also a gross misunderstanding of the problem and the overall solution space, as well as very much gatekeeping.
Dump the entire queue to a google server (confirm receipt and shut down the connection) and then have the google server forward all the data to its destinations?
Seems to me that would be the lowest power, lowest developer trust, lowest number of connections, maximum gatekeeping method.
This article should clarify at the start whether TCP or UDP is under consideration. NAT idle timeouts for both are typically very different. RFC 5382 [0] specifies no less than 2 hours and 4 minutes for TCP. RFC 4787 [1] specifies no less than 2 minutes for UDP. Towards the end of the article it becomes clear that it's UDP.
The example diagrams also incorrectly show port numbers exceeding 65535. The port fields in TCP and UDP headers are 16 bits [2].
[0]: https://www.rfc-editor.org/rfc/rfc5382 [1]: https://www.rfc-editor.org/rfc/rfc4787 [2]: https://textbook.cs161.org/network/transport.html
That's interesting. Do NATs in the wild tend to be spec-compliant with their timeouts?
I have worked on a device with this exact same "send a tiny sensor reading every 30 minutes" use case, and this has not been my experience at all. We can run an STM32 and a few sensors at single-digit microamps; add an LCD display and a few other niceties and it's one or two dozen. Simply turning on a modem takes hundreds of microamps, if not milliamps. In my experience it's always been better for power consumption to completely shut down the modem and start from scratch each time [1] - which means you're paying to start a new session every time anyway. Now, I'll agree it's still inefficient to start up a full TLS session, and a protocol like the one in the post will have its uses, but I wouldn't blame it on NAT.
[1] Doing this of course kills any chance at server-to-device comms, you can only ever apply changes when the device next checks in. This does cause us complaints from time to time, especially for those with longer intervals.
Power Saving Mode (PSM), a power-saving mechanism in LTE, was specifically designed to address such issues. It allows the device to inform the eNB (base station) that it will be offline for a certain period while ensuring it periodically wakes up to perform a Tracking Area Update (TAU), preventing the loss of registration. This concept is similar to Session Tickets or Session IDs in (D)TLS—or at least, that’s how I like to think about it. However, there are no guarantees that the operator will support this feature or that they will support the report-in period that you want!
Maintaining an active session for communication between the endpoint and the edge device is highly power-intensive. Even with (e)DRX, the average power consumption remains significantly higher than in sleep mode. Moreover, the vast majority of devices do not need to frequently ping a management server, as configuration and firmware updates are typically rare in most IoT deployments.
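For reference, the PSM request described a couple of paragraphs up is normally issued with the standard 3GPP AT+CPSMS command (TS 27.007). A rough sketch of what that request looks like is below; modem_send_at() is a hypothetical stand-in for whatever AT transport the modem driver provides, the timer values are purely illustrative, and the network is free to grant different timers or reject the request entirely.

    /*
     * Hypothetical sketch of requesting LTE Power Saving Mode via AT+CPSMS
     * (3GPP TS 27.007). modem_send_at() is a stand-in, not a real API.
     */
    #include <stdio.h>

    /* Placeholder for the platform's AT command transport. */
    static int modem_send_at(const char *cmd)
    {
        printf("-> %s\n", cmd);   /* would be written to the modem UART in practice */
        return 0;
    }

    int main(void)
    {
        /*
         * Timer encodings per 3GPP TS 24.008 (GPRS Timer 3 / GPRS Timer 2):
         * bits 8-6 select the unit, bits 5-1 the multiplier.
         *
         *   "00000110" requested periodic TAU: unit 000 = 10 min, value 6 -> 60 min
         *   "00000000" requested active time:  unit 000 = 2 s,   value 0 -> 0 s
         *
         * These values are purely illustrative; the network decides what you get.
         */
        modem_send_at("AT+CPSMS=1,,,\"00000110\",\"00000000\"");
        return 0;
    }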
Great pointer! My sibling post in this thread references a few other blog entries where we have detailed using eDRX and similar low power modes alongside Connection IDs. I agree that many devices don't need to be immediately responsive to cloud to device communication, and checking in for firmware updates on the order of days is acceptable in many cases.
One way to get around this in cases where devices need to be fairly responsive to cloud to device communication (on the order of minutes) but in practice infrequently receive updates is using something like eDRX with long sleep periods alongside SMS. The cloud service will not be able to talk to the device directly after the NAT entry is evicted (typically a few minutes), but it can use SMS to notify the device that the server has new information for it. On the next eDRX check in, the SMS message will be present, then the device can ping the server, and if using Connection IDs, can pull down the new data without having to establish a new session.
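A hypothetical sketch of that eDRX + SMS "nudge" flow is below, with every function stubbed out (none of these are a real SDK API); the point is only the control loop: sleep through long eDRX cycles, check for a queued SMS on wake, and only then poll the server over the existing Connection ID session.

    /*
     * Hypothetical sketch of the eDRX + SMS nudge pattern. All functions are
     * stand-in stubs, not a real modem or DTLS API.
     */
    #include <stdbool.h>
    #include <stdio.h>
    #include <unistd.h>

    static void modem_edrx_wait_cycle(void) { sleep(1); }     /* stub: one eDRX cycle */
    static bool modem_sms_pending(void)     { return false; } /* stub: no SMS queued */
    static void dtls_cid_poll_server(void)
    {
        puts("poll server over existing DTLS Connection ID session");
    }

    int main(void)
    {
        for (int cycle = 0; cycle < 3; cycle++) {   /* bounded only for the sketch */
            modem_edrx_wait_cycle();                /* radio sleeps for most of this */

            if (modem_sms_pending()) {
                /* The cloud couldn't reach us directly (the NAT entry is long
                 * gone), so it sent an SMS instead. The device initiates contact;
                 * with a Connection ID there is no new handshake even if the IP
                 * address changed in the meantime. */
                dtls_cid_poll_server();
            }
        }
        return 0;
    }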
Is "Non-IP Data Delivery" (basically SMS but for raw data packets, bound to a pre-defined application server) already a thing in practice?
In theory, you get all the power saving that the cellular network stack has to offer without having to maintain a connection. While at the protocol layer NIDD is handled almost like an SMS (paging, connectionless), it is not routed through a telephony core (which is part of what makes SMS so sloooow); the base station / core will forward it directly to your predefined application server.
It has been heavily advertised, but its support is inconsistent. If you are deploying devices across multiple regions, you likely want them to function the same way everywhere.
802.11 supports the same thing. A STA (client) can tell an AP that it'll be going away for some time, and the AP will queue all traffic for the STA until it actively reports back. Broadcast traffic can also be synchronized to particular intervals (but low power devices are usually not interested in that anyway for efficiency reasons).
I have very little experience with Wi-Fi, as the industry I worked in relied almost exclusively on cellular networks. However, I wonder how many Wi-Fi routers actually support this functionality in practice - queueing traffic means you need to cache it somewhere.
Author of this post here -- thanks for sharing your experience! One thing I'll agree with immediately is that if you can afford to power down hardware that is almost always going to be your best option (see a previous post on this topic [0]). I believe the NAT post also calls this out, though I believe I could have gone further to disambiguate "sleeping" and "turning off":
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic), but for many low power use cases, being able to sleep for an extended period of time is more important than being able to immediately push data to devices.
(edit: there was originally an unfortunate typo here where the paragraph read "less important" rather than "more important")
Depending on the device and the server, powering down the modem does not necessarily mean that a session has to be started from scratch when it is powered on again. In fact, this is one of the benefits of the DTLS Connection ID strategy. A cellular device, for example, could wake up the next time in a completely different location, connect to a new base station, be assigned a fresh IP address, and continue communication with the server without having to perform a full handshake.
In reality, there is a spectrum of low power options with modems. We have written about many of them, including a post [1] that followed this one and describes using extended discontinuous reception (eDRX) [2] with DTLS Connection IDs and analyzing power consumption.
[0]: https://blog.golioth.io/power-optimization-recommendations/ [1]: https://blog.golioth.io/turn-off-subsystems-remotely-to-redu... [2]: https://www.everythingrf.com/community/what-is-edrx
Interestingly, IPv6 is not listed as the solution.
An IPv6 router with a stateful firewall blocking incoming connections could have just the same issues with timeouts, I'd imagine. Switching to IPv6 doesn't just mean that anyone can make a P2P connection to anyone else (even STUN needs a third-party server to coordinate the two peers).
(D)TLS session resumption (I'm not sure if their "Connection IDs" are that or something similar) seems like the most foolproof solution to this scenario, assuming that the remote host can support it.
But it'd be trivial to tell the firewall to exempt the device from that timeout, unlike with NAT, where you pretty much have to expire sessions to avoid running out of memory.
Not if the end user isn't in control of the firewall. (And if they were, then they could just forward dedicated ports for the devices they need.) It might not be as bad as the CGNAT situation, but there are plenty of big WANs that can't be reconfigured at will.
I have firewalls in v4 and v6 networks which don’t do any natting (well other than some 6-4 between them). They track sessions for security purposes, and they time them out for both security and memory reasons.
>An IPv6 router with a stateful firewall blocking incoming connections could have just the same issues with timeouts, I'd imagine.
You'd be surprised... PCP (Port Control Protocol), implemented by large vendors such as Cisco and Apple, can punch through a firewall for up to 24 hours in a single session.
https://github.com/Self-Hosting-Group/wiki/wiki/Port-Mapping...
The weird fear around it is crazy. It’s mostly just bigger IPs and it makes so much complexity and ugly hacks like NAT go away.
And also so many other things. ARP goes away, DHCP goes away -- yet people reinvented DHCP anyway and did it wrong (IMHO).
I'm of the opinion that IPv6 changed some small things just enough that people have to learn new stuff -- and somewhere along the way, people also forgot that NAT is not a firewall.
IPv6 is its own complexity.
Why?
Great question that I don't think the other commenters really answered. Sadly I don't really know "why" either. Second-system effect?
Instead of making IPv6 just "IPv4 with 128bit instead of 32bit address space" the designers changed a bunch of things. NDP is new and different than ARP. ICMPv6 is extended and different. MLD instead of IGMP for multicast group membership. DHCP is extended and has a bunch of other alternatives, like SLAAC. IPsec is mandatory now.
All the new IPv6 protocols seem more secure, more powerful, and more efficient. So I guess that is "why". But I bet in an alternative history where backwards compatibility and the upgrade process were prioritized IPv6 would be a lot more widespread.
In general, the IPv6 stack is not supported on many smaller MCUs. It is not that options like FreeRTOS can't support the stack, but rather that the resource constraints pose a challenge.
That being said, most modern SoCs are competitively priced... and will boot Linux just fine for under $5/part. =3
Some of the new IoT protocols build on ipv6 natively.
The resource overhead is minimal for modern MCUs. Dropping DHCP and ARP can save a lot of resources too. Also, I have MCUs with more RAM than my first PC.
Disabling IPv6 in ESP-IDF can save about 40KiB of flash and 2KiB of RAM. Not enough that I'd do it by default, but I've hit the limit of some of my hobbyist ESP32s to the point where I've disabled modules to cram my code in there.
Disabling IPv4 saves 25KiB of flash and less than a KiB of RAM. If you're down to the last kilobytes, disabling IPv6 makes more sense than disabling IPv4. Both options are choices of last resort, though.
A dual-core 240 MHz CPU with hardware-paged flash and SPI RAM expansion support is not "small" in my opinion.
However, I am a fan of their price point for hobby projects. Best regards =3
Eliminating NAT makes virtual networking much hairier. E.g., my desktop is currently connected to a big enterprise network that keeps track of all devices and will only allocate one IP address per MAC address. That leaves no IPs for any VMs on my desktop to use, so they must go through NAT if they want to communicate with the outside world.
Based on your description, you're using NAT as a means to bypass network restrictions. It's a great hack that will do the trick wonderfully, but if that's the officially proposed solution then the network is not set up for what it's actually used for.
Contrary to popular belief, IPv6 does have NAT, it's just stupid and unnecessary in most use cases. There are even different kinds of NAT; there's the "swap the public prefix with a private one" NAT (useful for shit ISPs that rotate your IPv6 prefix to make you pay extra for a static one, so your addresses don't keep changing), or the IPv4-like "masquerade all traffic when forwarding" one that'll work in your weird enterprise network.
With the latter option, you can set up IPv6 with DHCPv6 on the upstream and whatever combination of SLAAC or DHCP you want on the VM side, and set up NAT using guides like this (possibly outdated, iptables-based) one: https://forums.raspberrypi.com/viewtopic.php?t=298878
> Based on your description, you're using NAT as a means to bypass network restrictions.
The threat model on this big enterprise WAN isn't "an authorized user creates a VM on an existing machine", but rather "some rando plugs something in with an Ethernet cable and goes to town on the intranet services". It has a whole web portal where you can log in and register your devices. So it's not like anything's really being bypassed here.
Anyway, I still prefer NAT for VMs over most passthrough schemes, which I've found unreliable even on IPv4 networks. What the outside network doesn't know about can't hurt it.
> The threat model on this big enterprise WAN isn't "an authorized user creates a VM on an existing machine", but rather "some rando plugs something in with an Ethernet cable and goes to town on the intranet services". It has a whole web portal where you can log in and register your devices. So it's not like anything's really being bypassed here.
Isn't that what the MAC filtering is for? I don't see the point of stopping a single registered MAC on a single port from having multiple IPs.
I might be misremembering some of the details (in fact, given that the web portal works, I think it might allow new devices to have local IPs, but bounce all outbound packets until they're registered), but however it works exactly, it does not like passthrough networking on VMs in practice.
(If I had to guess, just letting any machine have a million new IPs for the firewall to track has its own issues, so you'd end up with policies upon policies.)
In any case, sometimes middleboxes just don't behave precisely how we want them to, and that's why I'm skeptical of the typical IPv6 position of "a flat /64 network (or something emulating one) is all you'll ever need".
The firewall shouldn't have to track IPs, just the connections it would be tracking no matter what.
And sure a million IPs could cause problems, but that's not a good reason to set the limit to 1.
> I'm skeptical of the typical IPv6 position of "a flat /64 network (or something emulating one) is all you'll ever need".
That's not the position. You can have as many networks as you want. Connect them with routers like you would on IPv4.
If IT doesn’t want random things plugging into the network? That’s what 802.1x is for
On VirtualBox you can set up IPv6 NAT with a few clicks, btw.
Are you talking about eliminating NAT on IPv4?
IPv6, set up the normal way, does not allocate IPs. Devices can and will have several at once and their VMs can easily get some too.
The enterprise corrupted IPv6 and disabled SLAAC so devices have to use whatever DHCPv6 gives them.
How does that work for Android devices? I heard they don't support DHCPv6-provided addresses?
The Android problem only occurs in an IPv6-only enterprise network which is very rare. Otherwise Android can use IPv4.
That's definitely a limitation of your network. I don't see how IPv6 can shoulder any responsibility here.
"Doctor it hurts when I". Keep track of all devices, and alloc a static /64 per (host) MAC.
And even in the face of a supposedly malicious network admin like yours, you can still adopt IPv6, use the IPv6 private network ranges for the VMs, and then do NAT like it's 1983, same as you would have otherwise done with IPv4 -- except that all the other use cases get easier, while this one use case remains as hard as it was before.
You can do local NAT. Most VM hosts will do it. You just don’t need NAT everywhere anymore.
Shouldn't the VMs have their own MACs?
If they're not routed directly through the network card but use a virtual switch instead, the network upstream only sees the server MAC. If they're directly hooked into the network card they can have their own MACs, but that approach can make it hard to communicate between host and VM.
Also, multiple MACs do not work over WLAN.
Why would it not? There's even a third MAC address field in the 802.11 headers for just that scenario, as far as I remember (i.e. to distinguish between "802.11 source/destination" and "Ethernet-level actual source/destination").
You actually need to use special 4addr/WDS frames with 4 addresses[1]. However no one accepts these by default so it's unusable for a Laptop that you want to connect to random wlan networks.
[1] https://lore.kernel.org/netdev/1258465585.3682.7.camel@johan...
Hm, maybe I'm mixing up something here, but all I can say is that it works like this on my machine: I have a VM running in bridged mode, I'm connected to my home router using Wi-Fi, and I can see the virtual MAC address of the VM on the router management interface.
Possibly this is just at the DHCP level, though, and what the router sees just comes from my host MAC address? A quick packet trace suggests that.
It was 3 months ago: https://news.ycombinator.com/item?id=41884515
Few cellular modems commonly used in IoT support IPv6, and not all mobile network operators provide an IPv6 address. Since cellular connectivity plays a major role in the industry, IPv6 cannot be used as a blanket solution to this problem.
Either it would or it wouldn’t help in most cases, but the absence of consideration of it at all weakens the article’s arguments from a carrier perspective. IPv6 adoption was at 90% by US mobile carriers a couple years ago, and the US is not known for its telco infrastructure investment; so, while using IPv6 may not be a uniform cure for their issues, the article’s total focus on legacy IPv4 NAT issues is in stark contrast to its availability carrier-side in one of the weakest examples available. China regulates that IPv6 be supported and enabled by default on all hardware sold for use in-country since last year, and telcos have six months left until a first-stage IPv4 new-hardware prohibition goes into effect later this year, so the assumption that most cellular modems don’t support IPv6 seems unlikely as well given their regulatory climate. This deserves more research or at least an explanation of why such was not done for the initial release of the paper.
There are still countries with zero IPv6 on mobile, e.g. New Zealand. All three mobile carriers refuse to deploy IPv6 on mobile despite two of them doing so on their fixed line networks. Do agree with your point tho IPv6 should be considered a possible solution.
On modern firewalls/routers, NAT is only one of the cooks in the network kitchen raining on this author's parade. Stateful Packet Inspection has timeouts too!
But for TCP, you don't even need to be stateful just to prevent inbound connections, which is a huge win over NATs.
UDP still needs state tracking, unfortunately.
It seems brave to let an IoT device talk to someone over an unencrypted wan though. They're often of pretty varying software quality and rarely updated.
If you really want IoT wifi devices, put them on a separate wifi, and only let them talk to a local device that you can keep up to date. Assume they're vulnerable to local attacks over wifi and act accordingly, e.g. don't give the IoT wifi access to your other devices beyond to that controller, and definitely not to the wider internet.
If they're closed source, assume they're already compromised from the factory
The issue in IoT is that most people expect to be able to control their IoT (e.g. Smart Home) devices from outside of their network. This requires you to have a central server these devices communicate with or a similar deployment on-premises with a public IP address.
I've always wondered how it is economically feasible to run these central services without a monthly subscription. If you stop selling devices you'll go under fairly quickly.
I have all my "Tuya" IoT devices on a separate network, isolated from my personal intranet, and the cloud servers these devices talk to are in China. I pay nothing to "Tuya" for their cloud service, never have. If the "Tuya" service ever goes down, my home automation is also down, and that sucks. I'm trying to replace it all with "Tasmota" devices now, but it's not quite as easy or as cheap to do.
Yeah, sadly, people care about convenience and not security. So you get things like this:
https://www.malwarebytes.com/blog/news/2024/04/ring-agrees-t...
and somehow they are still in business, and popular.
If you do care about security, keeping your home automation within your own control is probably the only sane path. Home Assistant and similar open-source options like openHAB are pretty good, if a bit fiddly, and a WireGuard-based VPN like Tailscale is a fairly practical way to access it when away from home.
> If you do care about security, keeping your home-automation within your own control is probably the only sane path.
It is also very expensive, so only a minority of informed users would select such a route. Not to mention that it is not trivial to set up.
To my knowledge, the only company with decent security practices is Ubiquiti.
What the article describes is actually fairly standard in the IoT industry. I'm surprised there have been so few lawsuits.
NAT breaks the central idea of IP (i.e. packet switching with stateless intermediary nodes). All of its problems are essentially downstream from there.
What you can do is port forwarding. You have a bunch of devices behind a 1:N NAT, so they share one IP address. For specific services on those devices, you can pair dedicated ports with this IP address, binding them to internal IP:port pairs.
It's not a perfect solution for every scenario, and requires configuration, but there it is.
This is how people on residential lines run web servers, mail servers, ... they map TCP ports like 443, 80, 22 and 25 on their router to go to specific internal hosts.
Doh!
With CG-NAT this doesn't work. Multiple customers are sharing the same IP address, all of them sitting behind a NAT. Further, the internet gateway is a NAT sitting behind the CG-NAT. And if you prefer to use a nice mesh Wi-Fi router, well, that's a third NAT layer.
Common suggestions I've heard:
"Use a VPN"
I tried to buy a computer from Apple directly. They detected the VPN and wouldn't let me purchase it. I turned off the VPN and the purchase worked. I then got a call from Apple asking me if I really did intend to buy the computer.
"Use Tailscale"
On my setup, Tailscale can't really navigate the NAT setup correctly. I have a 200 Mb downlink and will often only get a tenth of that through Tailscale. Also, the routing table no longer works, as routing is handled by a myriad of netfilter rules -- which may or may not conflict with Docker on the same box.
"Use Ubiquiti (or other networking gear) to get rid of the last NAT"
Gah. I didn't need to perform a huge cash outlay 10 years ago and rewiring the house to do the same thing I'm trying to do today.
I don't have my own external IP address (even if dynamic) + I want my devices to be spontaneously contactable from the network without polling anything from the inside = does not compute
Right and that's what breaks the internet.
True in the general sense, but don't do this for IoT devices. Always remember that the S in IoT stands for Security
This has poorly considered generalizations and reads like an ad.
An ad for... the IETF? All of the firmware discussed in this post is open source, and we even contributed DTLS Connection ID server side support to a popular open source library [0] so other folks can stand up secure cloud services for low power devices. Sure, we sell a product, but our broader mission is making the increasing number of internet connected devices more secure and reliable. When sharing knowledge and technology is in service of that mission, we do not hesitate to do so.
[0]: https://blog.golioth.io/golioth-announces-connection-id-supp...
Wasn't IPv6 supposed to solve all this?
I don't understand why that stalled.
Also, considering the state of IoT security, it's probably not a great idea to have everything accessible anyway. But that's a slightly different problem to solve.
It’s not going to solve a stateful firewall timing out.
You try to continue a tcp session that’s timed out on my firewall and the packets will be dropped.
This applies a fair amount to me when I suspend my laptop: my SSH session will drop as both the server and the firewalls drop the session while it sits there peacefully. When it comes back, the TCP packets get sent into the void.
Meanwhile, my WireGuard connection, which runs through two separate IPv4 NATs, works just fine, as it doesn't rely on sessions or a server timing out a socket.
NAT is irrelevant to the problem.
Why not let through all IPv6 traffic? It seems improbable that your device is going to get caught up in an IPv6 scan; the address space is too vast - I'm assuming here.
Just have relatively secure devices using IPv6 - not shipping with standard default admin passwords would nearly be enough if there aren't obvious vulnerabilities.
Why are we still making IoT devices that talk to anything other than a local hub? They don't need to care about NAT.
Isn't this the whole point of Thread? You have a low power mesh network for inside a deployment (house, office, factory, whatever) and it's the border router which does the "communicate with the internet/lan/wider network" piece. These are typically plugged in (so no worries about having to be low power) have plenty of memory (so no worries about having to drop established routes like a router).
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic), but for many low power use cases, being able to sleep for an extended period of time is less important than being able to immediately push data to devices.
Is "less important" a typo?
Yes, thank you for calling out. It should read "more important". This has been corrected.
NAT was a good solution that the IETF came up with; we wouldn't be able to have the internet at our scale without it.
NAT was introduced by a private company called Network Translation Inc., and it successfully broke efforts to migrate off IPv4 (which was supposed to be EOLed by 1990) and permanently broke the "network of hosts" into an asymmetric one of servers and clients.
Note that we had a solution for address exhaustion by 1991, but it was just "good" and not "perfect" and worst of all it used the hated OSI protocol stack (TUBA - TCP & UDP on top of OSI CLNS - also known as IPv9). It even had at least two usable implementations at the time it was proposed (for SunOS and Cisco IOS)
> permanently broke the "network of hosts" into asymmetric one of servers and clients.
That was inevitable and can't reasonably be blamed on NAT. As but a few examples. ISPs arbitrarily break things unless you pay them extra. Stateful firewalls are a good thing. Even on my local LAN I can't reliably SSH into my laptop because it's on WiFi and does funky power saving stuff with the chipset I guess. My phone is far worse than my laptop in that regard.
Setting up a reliable and widely reachable server requires deliberate effort regardless of the existence of NAT.
First product, sure, but do the early RFCs and discussions predate their work?
There's some work predating it, but it was very much undeployed before the PIX let the genie out, because even the early NAT work notes it might be problematic even in the short term.
People would have resisted TUBA the same ways they're resisting IPv6 now. It's not a technical problem.
It's partly technical (BSD Sockets being a bad API that pushes low-level protocol details into applications) and partly business - vendors didn't want to do the work to upgrade software and hardware, especially with the advent of CEF and similar hardware routing options. And by the 1990s the government-led standardisation efforts that gave us widespread Ethernet and IPv4 got axed, and efforts to make vendors update, if only for federal contracts, died in waiver hell.
The other kinds of problems grew from there over time.
The thing I like about NAT is that it is essentially an ISP side stateful firewall. I would migrate to the ipv6 globally addressed mode immediately if my ISP had a checkbox "disallow all incoming connections".
> I would migrate to the ipv6 globally addressed mode immediately if my ISP had a checkbox "disallow all incoming connections".
Does your router not already do that by default?
As another commenter already clarified, I meant CGNAT on the ISP side. I don't believe any ISP currently offers an equivalent firewall on their side.
ISP side? Hopefully not - or rather, only if you're behind one of those awful CG-NATs (and I'm not aware of any that let you actually configure port forwarding, although my knowledge here might be outdated; fortunately I haven't had to deal with one in a long time). Otherwise, it's usually your CPE doing the NATting.
It sounds like you want an ISP-provided stateful firewall though, upstream of your (metered, slow) connection, which I'd agree would be a great feature to have!
It's funny. They'll block things I don't want them to block (email, http server, ...) but not unsolicited inbound IPv6 connections.
Your router almost certainly has that option.
Of course, it's probably the default everywhere, but with NAT the traffic never reaches me in the first place.
NAT is done on your router. There is no difference with an IPv6 firewall, except that it doesn't do NAT.
Are you thinking about CGNAT which is done by the ISP? That results in double NAT which causes problems.
You're right, I was thinking of CGNAT. My ISP definitely does it (they have far more users than IP allocations) and my router does NAT as well, so I guess I have a double NAT.
Interesting. My ISP passes both IPv4 and IPv6 inbound, expecting you to block them yourself.
Unless IPv6 were to be actually adopted as it was introduced
I don't know networking all that well. In my mind, I have 50 devices connected to my router behind NAT. My Mac, My Apple TV, my iPhone, My PC, My Linux Box, My partner's versions of all of those. My video games. Etc
From outside there's 1 IP address. With IPv6, every device would get its own address outside. Why do I want that? That sounds less private to me. Am I misunderstanding something? Lots of traffic on one IP address sounds more obfuscated than all separate.
With IPv6, every device has multiple IP addresses. One or more addresses that are rotated* to prevent you from being tracked easily, and one that's derived from your device's MAC address so you can make your devices easily accessible from WAN by opening ports in your firewall if you want to.
You could disable the rotating addresses, or disable MAC-based ones by using DHCP, but there's usually no point.
As for why you would want something like that: a whole bunch of software and hardware breaks because of NAT. Consumer NAT has some monkey patching inside of it rewriting some protocols to make them work again (which also allowed random websites to open arbitrary ports to arbitrary addresses in some Linux routers a while back, because NAT overrules firewall settings to work) but there are still limitations.
For instance, if you're having issues with your Nintendo Switch, Nintendo will tell you to forward every single port to your Switch (https://en-americas-support.nintendo.com/app/answers/detail/..., hope that IP address doesn't get reassigned to an unpatched device later). Multiple Xbox consoles behind the same NAT requires tricking them into super-restricted-NAT mode to work, or enabling UPnP which allows devices to open ports in your firewall without any authentication.
NAT just kind of sucks. IPv6 wasn't ready for deployment when NAT gained popularity, but all of the reasonable problems have been solved over a decade ago.
*=default rotation happens daily, but your OS may allow you to pick a shorter duration. I've found out the hard way that setting this to five minutes will fill up Linux' route table real fast after a few days.
Does it matter if they rotate if you use prefix delegation with standard size?
No, it doesn't. At least the last time I checked unless you go out of your way to implement a non-standard configuration IPv6 is a disaster for personal privacy for the typical multi-user household.
Then again, the "typical" multi-user household is likely logged in to most things via SSO with Google or Facebook and probably has approximately zero fingerprinting mitigations in use so perhaps it isn't worth worrying about?
If you aren't the typical household then given 2^64 addresses and a Linux box serving as a router you've got quite a few options available. Including various creative reinventions of NAT that don't break basic functionality.
> IPv6 is a disaster for personal privacy for the typical multi-user household
Why? With privacy extensions (which are normally enabled for user devices), then all someone can do is look at the prefix. This is identical to looking at the IPv4 address in a NAT setup, and it hasn't been that much of a privacy disaster.
As I see it, nothing is lost on that front.
> This is identical to looking at the IPv4 address in a NAT setup
It is not identical unless the OS uses a new IP for every new outbound connection. I believe that would qualify as a (very) nonstandard configuration.
> it hasn't been that much of a privacy disaster.
Indeed, it was tongue in cheek which is why I went on to point out SSO. The reality is most people aren't willing to sacrifice convenience to retain even a shred of privacy.
If you are one of the few who care then you can implement one of the many possible non-standard solutions.
Even disregarding fingerprinting, a single household doesn't have enough traffic from separate devices/users to the same servers to really matter from a privacy standpoint.
If my PC uses the same IP as my partner's to talk to Google, it hardly matters for our privacy if they mix up the attribution of traffic between the two of us.
Speak for yourself. I also don't want it to be readily apparent how many different devices I have, or when I'm using which one, or how many people are in the household, or when who is home.
Granted any service that I consistently interact with is likely to be able to figure out at least some of that information if they put in some effort. But I don't want to be freely providing a complete picture for zero effort.
Creepy data aggregator stories pop up on the HN front page regularly so hopefully I don't need to explain why I feel this way.
Yeah, I mean, I share those concerns in general, but my efforts are mostly centered around aggressive ad/tracker-blocking (moderate DNS-level blocking at the network level, more aggressive at the device level + browser-level blocking) and the avoidance of non-privacy-focused services, e.g. avoiding the popular social networks entirely, and using privacy-supporting pay-for services.
Using the same IP for all of my devices, for me, generally falls into the same bucket of anti-fingerprinting techniques that are used by the Tor Browser like letterboxed resolution that I don't find practical for general use. If I want to actually prevent fingerprinting by IP, resolution, etc. then I'll actually use the Tor Browser.
It depends what you're trying to defend against. The rotation hinders associating an address with a particular device. If someone looks at the network prefix to see if people are in the same household, then that's exactly the same as looking at the IPv4 address to determine the same thing.
> From outside there's 1 IP address. With IPv6, every device would get it's own address outside. Why do I want that? That sounds less private to me. Am I mis-understanding something? Lots of traffic on one IP address sounds more obfuscated than all separate.
Having recently enabled IPv6 for my home network, the "why" was that a) IPv6 to IPv6 connections are nominally more efficient than those that have to traverse NAT and b) it enables connectivity to/from IPv6-only internet devices.
The privacy upsides of a single IPv4 IP for a household are, to me, more marginal than the above benefits.
I'm pretty sure the IETF fought against NAT.
Why not just have the server send a token to the client that the client includes in the next request?
Because by that point the TCP session is already broken.
UDP?
> This doesn’t solve the issue of cloud to device traffic being dropped after NAT timeout (check back for another post on that topic),
This is the key problem. I think it would be best solved by NAT devices having some way of probing their timeout policy, and then notifying the endpoints when they expire a mapping.
One way to do that might be for a client to deliberately send a packet to the server with an insufficient TTL.
The reply (via ICMP) could then contain fields specifying the timeout in minutes of the mapping, together with some generation number specifying if the mapping table has been cleared due to reboot or overflowed since last queried.
There might even be a way to request a specific mapping become long lived - perhaps for months or years.
The benefit of all this is that client devices can receive push notifications from servers or peers, all with no polling - allowing, for example, coin-cell devices to last for months or years (with appropriate Wi-Fi protocol improvements too).
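Only the probe half of this idea exists today; a NAT box that reports its mapping timeout over ICMP is the proposal above, not a deployed protocol. A minimal sketch of the probe side, using the standard IP_TTL socket option so the datagram expires at the first hop (addresses and the TTL value are illustrative):

    /*
     * Sketch of the probe half of the idea above: send a datagram whose TTL is
     * low enough to expire at the NAT box. Only the low-TTL send is standard
     * today; timeout information in the ICMP reply is hypothetical.
     */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (sock < 0) {
            perror("socket");
            return 1;
        }

        int ttl = 1;   /* expire at the first hop, typically the home NAT/router */
        if (setsockopt(sock, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) < 0) {
            perror("setsockopt");
        }

        struct sockaddr_in server = {
            .sin_family = AF_INET,
            .sin_port = htons(5683),
        };
        inet_pton(AF_INET, "192.0.2.10", &server.sin_addr);   /* documentation address */

        const char probe[] = "probe";
        sendto(sock, probe, sizeof(probe) - 1, 0,
               (struct sockaddr *)&server, sizeof(server));

        /* A real implementation would listen for the resulting ICMP Time
         * Exceeded (e.g. via a raw socket or IP_RECVERR) and parse the
         * hypothetical timeout fields proposed above. */
        close(sock);
        return 0;
    }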
People have been complaining about NAT for decades. It's time to STFU. It's a well-known problem with a bunch of well-known solutions.
If you don't like NAT you could go IPv6.
But really, why do you need to talk to your device? If it's just reporting in NAT is irrelevant. If you want to do device management just write something into your protocol to check for updates/commands and deal with it on a periodic basis. You can even do that on startup, so you can tell the customer to power cycle the device. It's unlikely that any IoT device needs instant updates, so long periodic updates are probably fine.
There are plenty of IoT devices that people want to execute commands on (anything remotely controlled, basically). Polling for commands on a periodic basis introduces lag into that process which is irritating. Furthermore, polling at a frequent interval can end up using a lot of power as well versus waiting in a receive-only mode for an incoming command.
The alternative to polling is unfortunately polling, which is what the article is about.
You can avoid polling for messages, but you have to send packets outbound regularly in order to maintain a NAT mapping & connection, so that the external side can send messages inward.
Latency is solvable this way, but the need to constantly wake the radio every <30 s to keep the NAT mapping alive is a significant power draw.
In theory you might be able to avoid this with NAT-PMP / UPnP however their deployments are inconsistent and their server side implementations are extremely buggy.
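For completeness, the keepalive pattern described above boils down to something like the sketch below: keep one UDP socket (so the NAT keeps seeing the same inside address and port) and send a tiny datagram before the gateway's idle timer fires. The interval, server address, and port are assumptions; RFC 4787 asks for a UDP idle timeout of no less than 2 minutes, but deployed gateways vary, so firmware usually makes the interval configurable.

    /*
     * Minimal UDP keepalive sketch. Interval, address, and port are assumptions.
     */
    #include <arpa/inet.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define KEEPALIVE_INTERVAL_S 25   /* illustrative; below common short NAT timeouts */

    int main(void)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        if (sock < 0) {
            perror("socket");
            return 1;
        }

        struct sockaddr_in server = {
            .sin_family = AF_INET,
            .sin_port = htons(5683),              /* e.g. a CoAP server */
        };
        inet_pton(AF_INET, "192.0.2.10", &server.sin_addr);   /* documentation address */

        /* connect() fixes the local port for the socket's lifetime, which is
         * the mapping the NAT has to keep alive. */
        if (connect(sock, (struct sockaddr *)&server, sizeof(server)) < 0) {
            perror("connect");
            close(sock);
            return 1;
        }

        for (;;) {
            /* A one-byte payload is enough to refresh the mapping; real firmware
             * would piggyback on application traffic whenever there is any. */
            if (send(sock, "k", 1, 0) < 0) {
                perror("send");
            }
            sleep(KEEPALIVE_INTERVAL_S);   /* the radio wakes up each time: the power cost */
        }
    }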