yoshuaw 8 months ago

The Azure Upstream team has been working on a really fast hypervisor library written in Rust for the past three years. It does less than you'd conventionally do with hypervisors, but in turn it can start VMs around 2 orders of magnitude faster (around 1-2ms/VM).

I think this is really cool, and the library was just released on GitHub for anyone to try. I’m happy I got to help them write their announcement post — and I figured this might be interesting for folks here!

  • dangoodmanUT 8 months ago

    Do you think requiring people to use their packages is too limiting for widespread usage? Seems like you're forced to use Rust or C atm.

    This seems like a limitation that sits in a somewhat unusable place: For something simple and platform-specific (e.g. an HTTP transform) we can just use JS where the boot time perf makes up for the execution perf, and for something more serious like a full-fledged API 120ms should be more than enough time (and we can preemptively scale as long as we're not already at 0).

    • yoshuaw 8 months ago

      The way to think about Hyperlight is as a security substrate intended to host application runtimes. You’re right that the Hyperlight API only supports C and Rust today, but you can use that to, for example, load Python or JS runtimes which can then execute those languages natively.

      But we might be able to do even better than that by leveraging Wasm Components [1] and WASI 0.2 [2]. Using a VM guest based on Wasmtime, suddenly it becomes possible to run functions written in any language that can compile to Wasm Components — all using standard tooling and interfaces.

      I believe the team has a prototype VM guest based on Wasmtime working, but they still need a little more time before it’s ready to be published. Stay tuned for future announcements?

      [1]: https://component-model.bytecodealliance.org/introduction.ht...

      [2]: https://wasi.dev

      • 0x457 8 months ago

        We went full circle - I remember using a Python wrapper to run Rust on AWS Lambda.

fwsgonzo 8 months ago

Looks like my TinyKVM project, except it runs specialized programs instead of regular ELFs? TinyKVM also runs functions, with a fast execution timeout. I proved that without I/O you can essentially run KVM programs with native performance, and sometimes more due to automatic hugepages. I measured LLMs to run at 99.7% native speed using e.g. Mistral 7B. For example, the STREAM memory benchmark doesn't use hugepages by default, and so the terminal version runs slower than the TinyKVM version due to hugepage-tables, but of course runs at the same speed once you modify the benchmark to use the same advantage. However, it does require modifying the program.

See: https://ieeexplore.ieee.org/document/10475832

I also implemented VM resets using page-table rewrites and CoW memory sharing, so that no memory is shared across different requests. This can be implemented as tail-latency in a cache.

I ended up adding support for most languages. All the systems languages, Go, V8, LuaJIT etc. Go was by far the most annoying to support as it uses signals.
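
To illustrate the reset idea above: a minimal user-space sketch of copy-on-write at page granularity. This is not TinyKVM's actual implementation, which operates on hardware page tables; the `GuestMemory` type, its methods, and the 4 KiB page size are all hypothetical names and assumptions chosen for illustration. Every instance shares one read-only base image, a write copies the touched page into a private map, and a reset simply drops the private pages so nothing written by one request is visible to the next.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Assumed page size for this sketch.
const PAGE_SIZE: usize = 4096;
type Page = [u8; PAGE_SIZE];

/// Hypothetical model of one guest's memory: a shared, read-only base image
/// plus private copies of only the pages this guest has written to.
struct GuestMemory {
    base: Arc<Vec<Page>>,
    dirty: HashMap<usize, Box<Page>>,
}

impl GuestMemory {
    fn new(base: Arc<Vec<Page>>) -> Self {
        Self { base, dirty: HashMap::new() }
    }

    /// Reads prefer the private copy and fall back to the shared base image.
    /// Panics if `addr` lies outside the base image.
    fn read(&self, addr: usize) -> u8 {
        let (page, offset) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        match self.dirty.get(&page) {
            Some(private) => private[offset],
            None => self.base[page][offset],
        }
    }

    /// The first write to a page copies it from the base image (copy-on-write);
    /// later writes to the same page reuse the private copy.
    fn write(&mut self, addr: usize, value: u8) {
        let (page, offset) = (addr / PAGE_SIZE, addr % PAGE_SIZE);
        let base = &self.base;
        let private = self
            .dirty
            .entry(page)
            .or_insert_with(|| Box::new(base[page]));
        private[offset] = value;
    }

    /// Resetting drops every private page, so the next request sees the
    /// pristine base image again.
    fn reset(&mut self) {
        self.dirty.clear();
    }
}

fn main() {
    let base = Arc::new(vec![[0u8; PAGE_SIZE]; 4]);
    let mut vm = GuestMemory::new(Arc::clone(&base));

    vm.write(5000, 42);
    assert_eq!(vm.read(5000), 42);
    // The shared base image is untouched by the guest's write.
    assert_eq!(base[5000 / PAGE_SIZE][5000 % PAGE_SIZE], 0);

    vm.reset();
    assert_eq!(vm.read(5000), 0);
}
```

The cost of a reset in this model is proportional to the number of pages written, not to the size of the guest, which is the property that makes per-request resets cheap.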

  • generalizations 8 months ago

    I don't have access to that paper - and when I looked for TinyKVM all I found was the rpi-based project that uses the other definition of KVM. Is your project online somewhere? Or is it proprietary?

  • zekrioca 8 months ago

    Wouldn’t containers provide the same effect as TinyKVM?

    • fwsgonzo 8 months ago

      Yes, if you don't need sandboxing then containers are just way easier to deal with. Although I wish they didn't use so much space.

      • zekrioca 8 months ago

        Why couldn’t one mathematically recreate the limitations of a VM through a namespace by means of SELinux?

        • kevincox 8 months ago

          Because the Linux kernel is incredibly complicated and shouldn't be trusted as a strong security boundary. A simple hypervisor like this has far fewer vulnerabilities, so it is an easier boundary to trust. They are in very different security tiers.

          I would summarize it as: containers are good for mostly trusted isolation (teams within a company, purchased software), VMs are good for general untrusted software (different tenants in a cloud provider), and separate physical hardware is for scenarios where attacks are likely (military, known malicious code). Of course use cases are very fuzzy, but I wouldn't run fully untrusted code in the same kernel as anything of value.

  • intelVISA 8 months ago

    Nice project, yeah this looks like a hobbled (in true MS fashion) version of TinyKVM!

    Were you inspired by IncludeOS, Mirage, or similar?

    • fwsgonzo 8 months ago

      I wrote the IncludeOS kernel bits

generalizations 8 months ago

> These micro VMs operate without a kernel or operating system, keeping overhead low. Instead, guests are built specifically for Hyperlight using the Hyperlight Guest library, which provides a controlled set of APIs that facilitate interaction between host and guest

Sounds like this is closer to a chroot/unikernel than a "micro VM" - a slightly more firewalled chroot without most of the os libs, or a unikernel without the kernel. Pretty sure it's not a "virtual machine" though.

Only pointing this out because these sorts of containers/unikernels/vms exist on a spectrum, and each type carries its own strengths and limitations; calling this by the wrong name associates it with the wrong set of tradeoffs.

  • wmf 8 months ago

    I guess if it uses CR3 it's a "process" and if it uses VMLAUNCH it's a "VM".

    • generalizations 8 months ago

      Heh. Going by that delineation we end up with very VM-ish containers and (now) very container-ish VMs. Though this seems like it's even more stripped down than a unikernel - which would also be a "VM" here.

  • 0cf8612b2e1e 8 months ago

    I thought a chroot was not considered a real security boundary?

    • ronsor 8 months ago

      Chroot is a real security boundary as long as you use it properly. That said, namespaces on Linux are much superior at this point, so I can only recommend using `chroot` for POSIX compliance.

      • derefr 8 months ago

        chroot is great for all sorts of things, but they're not security-related.

        A lot of tools expect to do things to "your system" at absolute paths. chroot lets those tools operate against an explicitly wired-up, semi-virtualized simulacrum of your system, designed to pass through just the parts of those operations you want to your real host, while routing the rest of the effects into a "rootfs in a can" that you're either building up or will immediately throw away.

        Think: debootstrap; or pivot_root; or mounting your rootfs to fix your GRUB config and re-run update-grub from your initramfs rescue shell.

    • kevincox 8 months ago

      Yes. Anything that shares a kernel is a very weak security boundary as the kernel is complex and vulnerabilities are regularly discovered.

  • 0x457 8 months ago

    Okay, so every name in this post makes sense; it's just that some of these words started to mean very different things over time.

    This is not a hypervisor or a VM manager. It's a library that lets you run a C or Rust function[1] inside a VM managed by the platform hypervisor[2].

    [1]: As in Function (computer programming), not an AWS Lambda.

    [2]: KVM or mshv on Linux, and Windows Hypervisor on Windows

oneplane 8 months ago

So in essence, this is somewhere between a unikernel+firecracker combo and a WASM module, but using VT.

apitman 8 months ago

Don't see any mention of firecracker, which is the first thing I think of in this space. Anyone have a TL;DR comparison?

  • eyberg 8 months ago

    Firecracker can run ordinary Linux/GPOS VMs and unikernels.

    Unikernels can run inside of firecracker.

    Unikernels are focused on single applications whereas general purpose operating systems are focused on multiple applications.

    This is focused on running functions embedded inside a host program. So it is fairly different than other things out there and in a class of its own.

    • ATechGuy 8 months ago

      > each function request to have its own hypervisor for protection.

      They are talking about isolating serverless functions, not host program functions. In that sense, it is exactly what Firecracker does for Lambda functions.

      • eyberg 8 months ago

        Firecracker boots up a runtime that has a full-blown operating system in it - Lambda just happens to call a known program with a known function. In that sense, sure, it provides similar functionality, but it's really quite different. That's not what Fly uses Firecracker for, for instance.

        Qemu/firecracker are in the same space - this is different.

        These are most definitely in a different boat as you embed the guest functions inside the host program and then you register those functions. Taken from the readme:

        > The host can call functions implemented and exposed by the guest (known as guest functions).

        > Once running, the guest can call functions implemented and exposed by the host (known as host functions).

        This is more in the 'safe plugin' type of space. As with most things in this space - the best way to learn about them is to simply try it out.

        • stogot 8 months ago

          > The host can call functions implemented and exposed by the guest (known as guest functions).

          Can you explain this a bit more? Why/when would a developer want to do this? What’s the advantage over firecracker?

          • dboreham 8 months ago

            It's faster (shorter start time).

spai2 8 months ago

How does the micro VM's guest API talk to the host process? Does the communication between the two have to go through the hypervisor?

spankalee 8 months ago

They mention that most guests are expected to run code in a VM/interpreter... I wonder if they have a build of V8 or JSC for their environment?

  • yoshuaw 8 months ago

    I believe the team has a working build of JerryScript [1] to test out the C bindings, but I’m not sure that will be released.

    My understanding is that work on the Wasmtime VM guest is ongoing, which will enable Hyperlight to run the StarlingMonkey engine [2]. This is a WebAssembly build of Firefox’s SpiderMonkey engine which was donated by Fastly to the Bytecode Alliance.

    That said though, I agree it would be great to see runtimes like V8 and JSC run directly on Hyperlight. There are good reasons why people might prefer those over StarlingMonkey (compat comes to mind), and it would be neat to see how much faster they could start compared to conventional VM deployments.

    [1]: https://jerryscript.net/

    [2]: https://github.com/bytecodealliance/StarlingMonkey

u8080 8 months ago

So in general this is a kludge to implement app isolation via "VM", because existing CPU architectures suck at isolating code?

sim7c00 8 months ago

I wondered how it worked in Rust, but the guest entrypoint>init>main is wrapped in an unsafe block, as are a lot of the other low-level operations it does. Interesting stuff.

7e 8 months ago

Use CHERI for this?
