It's a thin HTTP/2 and HTTP/3 tunneling protocol for TCP, UDP, and ICMP traffic.
It should be easy to write an independent implementation based on this specification provided you already have an HTTP/2 or HTTP/3 library. Pretty neat!
More or less; it's built on top of it, with UDP and ICMP support added.
When writing the server and the client, a lot of time is consumed by additional features, not by implementing the spec itself. For instance, in order to be truly stealthy we have to make sure that the client looks *exactly* like Chromium on the outside, and then maintain this similarity as Chromium's TLS implementation changes from version to version. Or here's another example: on the server side we need anti-probing protection to make it harder to detect what the server does.
We support both H2 and H3, and this is necessary: QUIC is not bad, but there are places where it either does not work at all or works too slowly.
And one more thing: even though the code and spec are only being published now, we've been using TrustTunnel for a long time; we started before CONNECT_UDP became a thing.
We’re considering switching to it though (or having an option to use it) just to make the server compatible with more clients.
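For reference, a standard CONNECT-UDP request (RFC 9298) looks roughly like this on the wire; the target host and port are encoded in the path using the RFC's well-known URI template (the host, port, and authority here are just placeholders):

    :method = CONNECT
    :protocol = connect-udp
    :scheme = https
    :path = /.well-known/masque/udp/192.0.2.6/443/
    :authority = proxy.example.org
    capsule-protocol = ?1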
Ah, so you resolve domains before applying the routes to the profile, I see. As per the spec, network extensions are not allowed to reroute traffic outside the tunnel: destinations set in the tunnel network settings must be routed inside the tunnel. This means that users have to know their domains upfront; the app cannot do this dynamically, if only to comply with Apple's rules.
Actually, no, we don't resolve them. We scan the incoming ClientHello before making a decision on where to route the connection. If the connection should be bypassed, we make the connection ourselves and proxy the traffic. Implementing it that way requires having a TCP stack right in the client.
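To illustrate (a hedged sketch, not TrustTunnel's actual code): extracting the SNI host name from a fully buffered ClientHello comes down to walking the TLS record header, the handshake header, and the extension list. A real implementation must also cope with records split across TCP segments.

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Hypothetical sketch: pull the SNI host name out of a fully
     * buffered TLS ClientHello record. Offsets follow the TLS 1.2/1.3
     * ClientHello layout. Returns 0 on success, -1 otherwise. */
    static int extract_sni(const uint8_t *buf, size_t len,
                           char *out, size_t out_len) {
        size_t pos;
        if (len < 5 || buf[0] != 0x16) return -1;         /* not a handshake record */
        pos = 5;                                          /* skip record header */
        if (pos + 4 > len || buf[pos] != 0x01) return -1; /* not a ClientHello */
        pos += 4 + 2 + 32;                                /* hs header, version, random */
        if (pos + 1 > len) return -1;
        pos += 1 + buf[pos];                              /* session_id */
        if (pos + 2 > len) return -1;
        pos += 2 + ((buf[pos] << 8) | buf[pos + 1]);      /* cipher_suites */
        if (pos + 1 > len) return -1;
        pos += 1 + buf[pos];                              /* compression_methods */
        if (pos + 2 > len) return -1;
        size_t ext_end = pos + 2 + ((buf[pos] << 8) | buf[pos + 1]);
        pos += 2;
        if (ext_end > len) return -1;
        while (pos + 4 <= ext_end) {                      /* walk the extensions */
            uint16_t type = (buf[pos] << 8) | buf[pos + 1];
            uint16_t elen = (buf[pos + 2] << 8) | buf[pos + 3];
            pos += 4;
            if (type == 0) {                              /* server_name extension */
                /* list length (2), name_type (1), name length (2), name */
                if (pos + 5 > ext_end || buf[pos + 2] != 0) return -1;
                size_t nlen = (buf[pos + 3] << 8) | buf[pos + 4];
                if (pos + 5 + nlen > ext_end || nlen + 1 > out_len) return -1;
                memcpy(out, buf + pos + 5, nlen);
                out[nlen] = '\0';
                return 0;
            }
            pos += elen;
        }
        return -1;                                        /* no SNI present */
    }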
In Linux (Wayland) you can copy text from the terminal without pressing Ctrl+C at all: just select the text. To paste it in another window, press the middle mouse button.
This is called the Primary Selection and is separate from the Clipboard (Ctrl+C/Ctrl+V). IMO the Primary Selection is more convenient than the Clipboard.
That's an X11 thing that Wayland had to reimplement because it's so convenient. The problem is pasting into the terminal something that another program copied to the clipboard: that's Ctrl+Shift+V.
I thought about remapping copy and paste to their own keys, possibly a single one. Maybe on the number pad, which I never use. Or remapping ctrl-c.
There's always Ctrl+Insert for copy and Shift+Insert for paste. I know some laptops lack an Insert key, which is terrible, but for keyboards with an Insert key the Ctrl/Shift+Insert combos are useful at times.
Especially because one does not have to press three keys with the same hand, which is not nice to the tendons. I think I did that for a while, some time ago, then forgot about it. Thanks.
Yeah, I know. I missed this for the first couple of days after the change to Windows, but it didn't take much before I forgot about it. (Anyway, I keep using Linux at home.)
But that doesn't go into clipboard history. And it severely restricts what you can do between copying and pasting in general; most importantly, it makes replace (i.e., select + (implicit delete +) paste) a pain, since any intermediate selection before pasting destroys your Primary Selection. And if you realize you did that, recopying means manually reselecting the text, or using the otherwise-never-used Ctrl+Insert, instead of just repeating the same old Ctrl+C as you always do with the clipboard in any sane application.
It's still a nice option, of course (especially in terminals, where the proper copy/paste shortcuts are nerfed and selecting text for editing annoyingly isn't a thing), but it's far from a replacement for the proper clipboard.
Speaking for myself (although I suspect this is true of many others), I haven't used a mouse in well over a decade. To be clear, I am in the terminal all the time. So this is not a universal solution.
There's some secret sauce there that I don't know if I'm allowed to talk about yet, so I'll just address the existing tech that we didn't use: most options either didn't have a good enough license, cost too much, or would take a TON of ramp-up and expertise we don't currently have to manage and maintain. Generally speaking, building our own stuff allows us to fully control it.
Entirely programmable storage has so far allowed us to try a few different things to make things efficient and give us the features we want. We've been able to try different dedup methods, copy-on-write styles, different compression methods and types, different sharding strategies... and all that is just a start. We can easily and quickly create a new experimental storage backend and see exactly how pg performs with it side by side with other backends.
We're a Kubernetes shop, and we have our own CSI plugin, so we can also transparently run a pg HA pair with one pg server using EBS and the other running on our new storage layer, and easily bounce between storage types with nothing but a switchover event.
> would take a TON of ramp-up and expertise we don't currently have to manage and maintain
But you think you have resources to maintain a distributed strongly-consistent replicating block store?
The edge cases in RBD are literally why Ceph takes expertise to manage! Things like failure while recovering from failure, all while trying to maintain performance, are inherently tricky.
I was struck by how similar this seems to Ceph/RADOS/RBD. The way they implemented snapshotted block storage on top sounds more or less exactly like how RBD is implemented on top of RADOS in Ceph.
One of the problems with Ceph is that it doesn't operate at the highest possible throughput or the lowest possible latency point.
DAOS seemed promising a couple of years ago, but in terms of popularity it seems to be stuck: no Ubuntu packages, no widespread deployment, and Optane got killed.
Yet the NVMe + metadata approach seemed promising.
Would love to see more databases fork it to do what you need from it.
Or if folks have looked at it and decided not to do it, an analysis of why would be super interesting.
I'm the presenter of the talk, but not an io_uring kernel developer or security expert.
The io_uring implementation is complex, and the number of lines of code is non-trivial. On the other hand, as the code matures and the number of bugs being reported falls, the trade-off between functionality gained and the risk of security issues changes. More people will decide to use io_uring as time passes. People already rely on much larger and more complex subsystems, like the network stack or file systems.
QEMU is a Software Freedom Conservancy member project like Git, OpenWRT, and many others. You can donate through the Conservancy link you posted and mention which project you wish to support.
io_uring is available from RHEL 9.3 onward. The catch is that it's disabled by default and needs to be enabled at runtime via the "kernel.io_uring_disabled" sysctl.
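A small probe (a sketch, assuming a glibc system) can tell "disabled by policy" apart from "not supported": when the sysctl blocks io_uring, io_uring_setup(2) fails with EPERM, while kernels without io_uring return ENOSYS.

    #include <errno.h>
    #include <linux/io_uring.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Probe io_uring availability with a tiny ring. EPERM indicates a
     * policy block (kernel.io_uring_disabled); ENOSYS means the kernel
     * has no io_uring at all. */
    int main(void) {
        struct io_uring_params p;
        memset(&p, 0, sizeof(p));
        long fd = syscall(__NR_io_uring_setup, 4, &p);
        if (fd >= 0) {
            printf("io_uring is available\n");
            close((int)fd);
        } else if (errno == EPERM) {
            printf("io_uring is disabled by policy\n");
        } else {
            printf("io_uring unavailable: %s\n", strerror(errno));
        }
        return 0;
    }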
The Linux RWF_DSYNC flag sets the Force Unit Access (FUA) bit in write requests. This can be used instead of fdatasync(2) in some cases. It only syncs a specific write request instead of flushing the entire disk write cache.
It looks plausible: XFS's xfs_dio_write_end_io() updates the on-disk file size. Do you have a link to documentation that confirms this is true for Linux or POSIX filesystems?
Edit: POSIX 1003.1-2017 defines fdatasync(2) behavior in 3.384 Synchronized I/O Data Integrity Completion, where it says "For write, when the operation has been completed or diagnosed if unsuccessful. The write is complete only when the data specified in the write request is successfully transferred and all file system information required to retrieve the data is successfully transferred".
So I think POSIX does guarantee that a write at the end of the file with O_DSYNC, or followed by fdatasync(2) (and therefore Linux RWF_DSYNC), is sufficient. Thank you for pointing out that RWF_DSYNC is sufficient for appends, vlovich123!
Not really: RWF_DSYNC is equivalent to open(2) with O_DSYNC when writing, which is equivalent to write(2) followed by fdatasync(2), and:
fdatasync() is similar to fsync(), but does not flush modified
metadata unless that metadata is needed in order to allow a
subsequent data retrieval to be correctly handled. For example,
changes to st_atime or st_mtime (respectively, time of last access
and time of last modification; see inode(7)) do not require
flushing because they are not necessary for a subsequent data read
to be handled correctly. On the other hand, a change to the file
size (st_size, as made by say ftruncate(2)), would require a
metadata flush.
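To make the append case concrete, here's a minimal sketch using pwritev2(2) with RWF_DSYNC; if the reading above is right, the single flagged write is durable, including the file-size update, with no separate fdatasync(2) call (the file name is just a placeholder):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/uio.h>
    #include <unistd.h>

    /* Append one record durably with a single flagged write: RWF_DSYNC
     * gives this write O_DSYNC semantics, so the data and the file-size
     * metadata update are on stable storage when pwritev2() returns. */
    int main(void) {
        int fd = open("log.dat", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char msg[] = "record\n";
        struct iovec iov = { .iov_base = (void *)msg, .iov_len = sizeof(msg) - 1 };

        /* offset -1: use (and update) the current file offset */
        if (pwritev2(fd, &iov, 1, -1, RWF_DSYNC) < 0) {
            perror("pwritev2");
            return 1;
        }
        close(fd);
        return 0;
    }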
The article mentions getting SPEC CPU running, but doesn't share performance or scalability results (now that the CPU can decode twice as many instructions, etc.). Can someone who has been following the research in this area share some results?