More

Gankra · on Feb 1, 2023

Do you mean this kind of dep? https://rust-lang.github.io/rfcs/3028-cargo-binary-dependenc...

That's an interesting thought, I'm not sure I've ever seen someone employ that as a pattern. Actually no wait, I thought cargo bin-deps specifically gave the developer no way to manually invoke it (i.e. there's no equivalent functionality to npm's npx)? Without that, what use would the dependency be?

jph · on Feb 1, 2023

Yes, where that link describes "[dev-dependencies]" and "[build-dependencies]".

I use the dev-dependencies section often, and the build-dependencies rarely. And anyone here, please correct my understanding if there's a better way to do what I'm describing.

For me, these sections are an easy way to be explicit with collaborators that the project needs the deps installed in order to work on the project, and does not want/need the deps to be compiled into the release.

My #1 use case is to have a collaborator start with a git clone, then run "cargo build", and have cargo download and cache everything that's needed to work on the project and release it. My #2 use case is to deal with semver, so a project can be explicit that it needs a greater cargo-dist version in the future.

To your point about cargo bin deps akin to npx, yes, your understanding matches mine i.e. not available yet. I do advocate for that feature to be added because it's helpful for local environments. Cargo does offer "default-run" which shows there an awareness of a developer preferring specific local exectuables-- maybe Cargo can/will add more like that for deps?

Gankra · on Feb 1, 2023

The "way-too-quickstart" is the minimal example: https://github.com/axodotdev/cargo-dist#way-too-quick-start

A key feature of cargo-dist is that

cargo dist init --ci=github

should simply set everything up for an arbitrary* Rust workspace and you just check in the results.

* Not sufficiently tested for all the wild workspaces you can build, but a "boring" one with a bunch of libraries supporting one binary should work -- that's what cargo-dist's own workspace is, and it's self-hosting without any configuration. Any time I want to update the bootstrap dist I just install that version of cargo-dist on my machine and run `cargo dist generate-ci github --installer=...` to completely overwrite the ci with the latest impl.

Gankra · on Feb 1, 2023

Just to elaborate on this a bit: as discussed in the Concepts section of the docs[0] the core of cargo-dist absolutely supports workspaces with multiple binaries, and will chunk them out into their own distinct logical applications and provide builds/metadata for all them.

However this isn't fully supported by the actual github CI integration yet[1], as I haven't implemented proper support for detecting that you're only trying to publish a new version of only one of the applications (or none of them!), and it doesn't properly merge the release notes if you're trying to publish multiple ones at once.

I never build workspaces like that so I'm waiting for someone who does to chime in with the behaviour they want (since there's lots of defensible choices and I only have so many waking hours to implement stuff).

[0]: https://github.com/axodotdev/cargo-dist/#concepts [1]: https://github.com/axodotdev/cargo-dist/issues/69

MuffinFlavored · on Feb 1, 2023

Cool, thanks.

It creates a ~100 line release.yml file

1. create GitHub release

2. upload artifacts

3. upload manifest

4. publish GitHub release

I like it, it's opinionated, but I don't know how much it will catch on. If somebody needs to maintain a 100 line GitHub Actions release YAML file, in my experience you typically want to understand everything in it in case you need to adjust it for future needs/as they grow.

It's well done though. Curious to see how much adoption it picks up.

Gankra · on Feb 1, 2023

Correction on this someone else sent me:

The check of interest is for a Mark Of The Web[0] flag that Windows includes in file system metadata. The builtin unzipping utility just faithfully propagates this flag to the files it unpacks. Other utilities like 7zip are unlikely to do this propagation (effectively clearing it).

But yeah either way it has nothing to do with code signing!

[0]: https://nolongerset.com/mark-of-the-web-details/

jakub_g · on Feb 1, 2023

If someone's interested with more details about MoTW, EricLaw (long time MSFT engineer at IE and Edge teams) got you covered:

https://textslashplain.com/2016/04/04/downloads-and-the-mark...

https://textslashplain.com/2022/12/02/mark-of-the-web-additi...

Zababa · on Feb 2, 2023

Interesting, thank you for sharing!

chatmasta · on Feb 1, 2023

macOS has a similar feature with Gatekeeper, which bit me when preparing a Pyinstaller binary for Mac. The flag doesn't get added when you download a file with curl, but it does when you download it through a web browser, which can cause difficult to debug issues with binaries downloaded from GitHub releases.

You can remove this flag with the xattr command:

    xattr -d com.apple.quarantine the_quarantined_binary

I wrote up the details of this in a PR [0] where I last dealt with it.

[0] https://github.com/splitgraph/sgr/pull/656

maldev · on Feb 1, 2023

This is actually pretty similar. The OS has an alternative data stream(An idea they stole from Mac), and they list what site a exe was downloaded on, or if it came from somewhere else. Others incorrectly called it a flag, when it works by having two different file data streams for a single file, one is the default one.

So for example, a single file can actually contain two different "files"(File data).

So, foo.exe, actually will effectively open the file foo.exe:DEFAULT. You could also add a piece of malware to the foo file in place of a datastream. So foo.exe is legit, but if you open foo.exe:MALWARE , it will open up the malware datastream.

So tldr, how Windows does this, it when you get a file from a third party source(Internet, USB Drive, etc), it adds a new datastream in the form of a textfile. And the textfile contains info about the source. Namely, a number for location it came from(3? for web), and then some more info.

chatmasta · on Feb 2, 2023

Thanks for the details! Judging by your username, I assume you know this area well :)

Most surprising to me on Mac was that the "flag" (I'm not sure that's the right term here either) was preserved on files extracted from a tarball downloaded from the internet. Although I think this also required extracting it via Finder (GUI) and did not apply when using the tar command - I can't remember exactly.

Gankra · on Jan 30, 2023

They are exactly the same except for when they're not.

(On 64-bit) Rust very naively has two 64-bit integers for the strong and weak count, Swift packs them into only one. Swift also packs in several extra flags for various things [0].

These flags mean that retain/release (increment/decrement) is actually an atomic compare-and-swap instead of a fetch-add. Allegedly performance issues with this were fixed by the hardware team, just, optimizing CASes better.

Swift also has to interop with ObjC "weak" pointers which have move constructors because their address is registered with a global map which is used to null them out when all strong counts go away, but I don't think this changes the design much when not using them.

Swift ARC is built into the language and a huge amount of the compiler's energy is dedicated to optimizing it. This is why it's part of the calling convention (+1/+0), why there are special getter/setter modes with different ARC semantics, why many stdlib functions are annotated with "this has such-and-such semantics" and so on.

Swift ARC is also very pervasive, as basic collections are all ARC-based CoW, all classes are ARC, and I think existentials and implicit boxes also go through ARC for uniformity? You can in principle avoid ARC completely by restricting yourself to value types (structs/primitives) but this is complicated by polymorphic generics and resilient compilation necessitating some dynamic allocations.

ARC is also why historically Swift gave itself fairly extreme leniency on running destructors "early" based on actual use [1]. Eliminating a useless +1 can be the difference between O(n) and O(n^2) once CoW gets involved!

By contrast in Rust it's "just" a library type which you have to clone/drop (increment/decrement) manually. It doesn't do anything particularly special, but it's very predictable. The existence of borrows in Rust lets you manually do +0 semantics without having to rely on the compiler noticing the optimization opportunity, although you do need to convince the borrow checker it's correct.

[0]: https://github.com/apple/swift/blob/3b00177f768b630a8f7a1135...

[1]: https://forums.swift.org/t/a-roadmap-for-improving-swift-per...

Gankra · on Jan 30, 2023

The biggest problem is that if you have a big """zero-cost-abstraction""" blob like iterator adaptors -- `Map<Filter<Fold<ArrayIter<MyType>>>>` -- and a single drop of Resilient Type is in there (i.e. MyType is resilient) then the whole thing gets polymorphically compiled and the compiler won't boil away any of the things that are supposed to be "zero cost".

How much you get burned by this kind of thing really depends on how you design APIs and where the hotspots are. Like if the big iterator blob is only ever for like 10 items, whatever. If the iterator blob is iterated inside the dylib where it's not resilient and can be inlined away, whatever.

Gankra · on Jan 30, 2023

The greatest trick Swift's developers ever performed was convincing the world that it wasn't just a better UX layer on top of COM

pjmlp · on Jan 30, 2023

IBM did it first with SOM, but unfortunely it went the way it did.

Gankra · on Dec 1, 2022

there's almost no bounds checking in rust code before the optimizer even looks at it because we use iterators and not goofy manually indexed for loops that are begging you to make a typo that crashes your code :)

kllrnohj · on Dec 1, 2022

Yeah but idiomatic modern C++ is also using iterators and even before that there's no bounds checking to eliminate in the first place since operator[] is unchecked so the optimizer can't be struggling to eliminate it since it's not there.

The question isn't "does Rust have bad bounds checking optimizations" but rather "what is this mythical heavily-bounds-checked C code that the compiler can't optimize away?"

Gankra · on Dec 1, 2022

No the claim is always that Rust "must" be slower than C/C++ because it has pervasive bounds checking for array indexing.

Then people insist on wanting to replace every x[i] in prod with x.get_unchecked(i) only to learn that, not only was that indexing not slowing the code down (the branch is perfectly predictable in a correct program!), but actually any difference is so in the noise that the random perturbation is worse (or that the asserts were actually adding extra facts for more profitable optimizations in llvm).

There is definitely specific hot loops with weird access patterns where it can be high impact but those are the exception, not the rule, as the Android team demonstrated.

Gankra · on Oct 26, 2022

The null pointer optimization is documented in many places, here's one official instance I published in 2015: https://doc.rust-lang.org/nomicon/repr-rust.html

There are many random details of a language that are "important" and they get documented all over the place. If it was all under "THE REFERENCE" and was a 500 page PDF, would the information actually be more discoverable? How would you even know to look up what you don't know? Are you going to read the reference cover to cover before writing hello world?

mjw1007 · on Oct 26, 2022

If the people maintaining a language produce a complete and correct description, then thousands of other people around the world can use it to provide helpful and correct blog posts, Stack Overflow answers, and so on.

But those thousands of people, however enthusiastic, can't produce authoritative documentation, because that needs the maintainers' "sign-off".

Gankra · on Oct 26, 2022

I am one of the people who has written a lot of Rust's docs, including one of the early attempts at "specifying" details of the language: the Rustonomicon.

Rust's documentation is taken very seriously and often praised! But we focus on documenting actual APIs (with examples that we actually run in CI to make sure they don't break!) (and links to the actual implementation to dig deeper!).

We also gave everyone builtin tools to write their own docs to the same standard and automatically build and host them. Example: https://docs.rs/tracing/0.1.37/tracing/

What Rust has eternally lagged on is properly documenting little fiddly nubs like "this is an integer literal expression which decays..." and "the grammar for a pattern expression is...".

For whatever reason some people think these details are of THE UTMOST IMPORTANCE, instead of just being a thing you try out and see what happens. Or a thing you discover by reading existing code, books, examples, posts, etc.

As a language Rust is extremely tuned for "fuck around and find out". I can rattle off some spooky interactions in Rust but I always end up concluding with "... and in practice this doesn't really matter because if you run afoul of it you just get a compilation error (or a lint)".

In this regard knowing fiddly details about Rust... tends to not matter. Unless you're writing Unsafe Code. Then you read the book I wrote, and eventually you drill down the rabbit hole deep enough to learn that "oh actually C doesn't even know what this means. UHHH... WELL. FINE?"

mjw1007 · on Oct 26, 2022

I agree entirely with you that Rust, today, is well-adapted to people who like to learn by the "fuck around and find out" route.

But it often feels as if the Rust community doesn't believe that people who prefer to learn by reading accurate documentation exist.

I commend the following page to you to get an idea of the mindset involved: https://blog.nelhage.com/post/computers-can-be-understood/

(No doubt it will gladden your heart to know that I updated the Reference to document the somewhat surprising ways in which integer literal expressions decay earlier this year.)

simplotek · on Oct 26, 2022

> Rust's documentation is taken very seriously and often praised! But we focus on documenting actual APIs (with examples that we actually run in CI to make sure they don't break!) (and links to the actual implementation to dig deeper!).

Your confusing two radically different types of documentation.

Specifications, such as the ISO standards for C and C++, are the source of truth of what determines what the language actually is.

Writing tutorials with example code ran in some implementation might help people onboard onto it but it does not, in any way, shape, or form, specify what's the language.

It's like comparing blueprints of a construction site with random videos and photos of the same building.

Sharlin · on Oct 26, 2022

I don't think Gankra is confusing them as much as asserting that a full specification should almost never be required reading for a user of a language, just like you don't need the detailed blueprints of a building to live or work there, or even to do simple repairs or redecoration.

A detailed specification only matters if you're trying to construct another building to the same spec (write another compiler), want to do heavy renovation work (implement a major new feature on rustc), or if the technical details are insufficiently "abstracted out" (say, there are exposed live wires that you have to know to avoid in daily use – cf. undefined behavior in C and C++).

simplotek · on Oct 26, 2022

> I don't think Gankra is confusing them as much as asserting that a full specification should almost never be required reading for a user of a language (...)

My point is that this line of argument makes no sense.

It's like claiming that dictionaries and English grammar books are not needed because kids learning their ABCs almost never require reading those to learn how to speak English.

Rust standardization opponents insist in these moot strawmen arguments. They completely miss the whole point of a standard. Standards are critical because how things work need to be specified in precise terms that are set in stone. Standards are read to gain deeper understanding of a language. They are used to clarify corner cases and obscure issues and behavior. You do not need a ISO standard because you want to write a "hello world". You need an ISO standard because you are a mature professional that understands that "it works on my machine" is not an acceptable answer to any question you have on how a programming language supposedly works.

tialaramex · on Oct 27, 2022

> It's like claiming that dictionaries and English grammar books are not needed because kids learning their ABCs almost never require reading those to learn how to speak English.

I don't think this makes the case you were hoping for?

Both Dictionaries and actual "English grammar books" (not pedantry written by non-experts like Lynne Truss) are descriptive because English is a natural language. I don't own the full Cambridge Grammar of the English Language but here's an extract from the Student's Introduction which I do own. (The Introduction is intended to be suitable as an undergraduate text)

"... it isn't sensible to call a construction grammatically incorrect when people whose status as fully competent speakers of the standard language is unassailable use it nearly all the time".

We don't use these books to teach children or ESL (English as a 2nd language) learners. My employer runs "pre-sessionals" which are short language courses for overseas students, they have a conditional visa, they fly in a month or two before term starts and they take our intensive ESL courses, if they do OK they're signed off as able to successfully use English, they start their degree course (say, Electronics) and the visa conditions check off, if they can't hack it the visa conditions terminate their stay and they are sent back. This allows us to recruit foreign students who could, if they apply themselves, be successful - without them needing to somehow find local English tutors in their home country, squeeze in learning a foreign language and convince immigration they've learned enough English to follow their degree course.

It is not the goal of pre-sessionals to teach a Chinese teenager to say "It is I" rather than "It's me" because that's a weird hyper-formal English. It is the goal that they should understand phrases their tutors and peers might use, which are often going to be informal or even somewhat non-standard, and be able to express themselves in both formal and informal contexts. They should be able to read say an Economist article, write a 1000 word summary of the article and talk about it in conversation.

Books like CGEL and dictionaries represent our understanding of how a very complicated system is actually used by real people. They are not guides to how English should work, such a document would be very silly.

> Standards are critical because how things work need to be specified in precise terms that are set in stone.

This is a mythologized version of what's going on. If a language is spending huge amounts of money and personal effort on the ISO process I can understand they'd want to project this myth, but it isn't true.

For C++ in particular - the most obvious comparison from the ISO standard languages to Rust - huge swathes of important stuff is "ill-formed, no diagnostic required" in the standard. If your C++ program falls foul of any such clauses, anywhere, even once in some obscure sub-routine you don't use, too bad your entire C++ program has no defined meaning and the compiler is not expected to report this as an error or warn you about the trouble you're in.

Let's use my favourite modern example. Suppose we've got a std::vector of float, and we sort that using std::sort but in some cases, due to an oversight, one element of the vector can be NaN. In C++ 20 Our entire program is undefined and anything might happen because of clauses introduced by the C++ 20 Concepts feature. Our program probably works fine in practice, but no thanks to your "critical" standard.

torstenvl · on Oct 27, 2022

> Rust is extremely tuned for "fuck around and find out"

Yes. And the issue is, that's not an acceptable stance for a language seeking to be a systems programming language.

Gwypaas · on Oct 27, 2022

If what you've created is unsound, then that is a compiler bug. Or you depend on something using unsafe which is unsound. Therefore you should, in both cases, appropriately report it.

torstenvl · on Oct 28, 2022

There is literally no such thing as a compiler bug in a language defined by "whatever the compiler does." The compiler is always, tautologically, correct.

GolDDranks · on Nov 1, 2022

You are being pedantic here; this stance doesn't survive contact with the real world. The fact that Rust's language reference is not complete/up to date, doesn't mean that Rust is defined _just_ by "whatever the compiler does." There are community-wide accepted documents (the reference, Rustonomicon, RFCs, and more informally, Language/Compiler team meeting notes/comments that reflect their consensus about the semantics) how Rust _ought_ to work.

tga_d · on Oct 29, 2022

By that reasoning specifications can't have bugs either. Sure, by some extreme prescriptive understanding of what a compiler or specification is, there are no bugs, but if a cryptography specification allows attacks in the threat model or rustc segfaults, all but the most asinine parties would agree there's a bug -- what's the point of asserting otherwise?

Gankra · on June 23, 2022

Going into this project I was very high on the idea of "different architectures should get completely independent stackwalker backend implementations" (although they call in to some agnostic machinery for CFI/symbols/registers). There are lots of platform-specific hacks and making everything super abstract is a big mess!

But geez I have been burned a few too many times on the parts that are the same between some arches accidentally diverging because I forgot to copy-paste between them. :(

(breakpad also has this approach and you can reaaaallly see the pain of this approach as every stackwalker has gotten wildly inconsistent TLC so some have tons of fancy machinery and some are super barebones. Makes it hard to tell if the divergence is intentional or just an artifact of independent code.)