Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Why bother, when Dagger caches everything automatically?

The fear with needing to run `npm ci` (or better, `pnpm install`) before running dagger is on the amount of time required to get this step to run. Sure, in the early days, trying out toy examples, when the only dependencies are from dagger upstream, very little time at all. But what happens when I start pulling more and more dependencies from the Node ecosystem to build the Dagger pipeline? Your documentation includes examples like pulling in `@google-cloud/run` as a dependency: https://docs.dagger.io/620941/github-google-cloud#step-3-cre... and similar for Azure: https://docs.dagger.io/620301/azure-pipelines-container-inst... . The more dependencies brought in - the longer `npm ci` is going to take on GitHub Actions. And it's pretty predictable that, in a complicated pipeline, the list of dependencies is going to get pretty big - at least a dependency per infrastructure provider we use, plus inevitably all the random Node dependencies that work their way into any Node project, like eslint, dotenv, prettier, testing dependencies... I think I have a reasonable fear that `npm ci` just for the Dagger pipeline will hit multiple minutes, and then developers who expect linting and similar short-run jobs to finish within 30 seconds are going to wonder why they're dealing with this overhead.

It's worth noting that one of Concourse's problems was, even with webhooks setup for GitHub to notify Concourse to begin a build, Concourse's design required it to dump the contents of the webhook and query the GitHub API for the same information (whether there were new commits) before starting a pipeline and cloning the repository (see: https://github.com/concourse/concourse/issues/2240 ). And that was for a CI/CD system where, for all YAML's faults, for sure one of its strengths is that it doesn't require running `npm ci`, with all its associated slowness. So please take it on faith that, if even a relatively small source of latency like that was felt in Concourse, for sure the latency from running `npm ci` will be felt, and Dagger's users (DevOps) will be put in an uncomfortable place where they need to defend the choice of Dagger from their users (developers) who go home and build a toy example on AlternateCI which runs what they need much faster.

> I will concede that Dagger’s clustering capabilities are not great yet

Herein my argument. It's not that I'm not convinced that building pipelines in a general-purpose programming language is a better approach compared to YAML, it's that building pipelines is tightly coupled with the infrastructure that runs the pipelines. One aspect of that is scaling up compute to meet the requirements dictated by the pipeline. But another aspect is that `npm ci` should not be run before submitting the pipeline code to Dagger, but after submitting the pipeline code to Dagger. Dagger should be responsible for running `npm ci`, just like Concourse was responsible for doing all the interpolation of the `((var))` syntax (i.e. you didn't need to run some kind of templating before submitting the YAML to Concourse). If Dagger is responsible for running `npm ci` (really, `pnpm install`), then it can maintain its own local pnpm store / pipeline dependency caching, which would be much faster, and overcome any shortcomings in the caching system of GitHub Actions or whatever else is triggering it.



> I think I have a reasonable fear that `npm ci` just for the Dagger pipeline will hit multiple minutes, and then developers who expect linting and similar short-run jobs to finish within 30 seconds are going to wonder why they're dealing with this overhead.

I was going to reply that you misunderstand how Dagger works: that your pipeline logic typically requires very few dependencies, if any, since it can already have dagger build, download and execute any container it needs - with free caching. So your “npm ci”, although technically not free, is a drop in the bucket and easily offset by the 2x or even 5x speedup that is typical when daggerizing an existing pipeline.

All the above is true… But I realize that the documentation you link to contradicts it. Reading these guides I understand your fear. Just know that these guides are, in that respect, giving you the wrong idea. In a more representative example, instead of importing a GCP client library, the script would have dagger run a GCP client in a container, with the appropriate input flags, config files and, if needed, artifacts. We will fix those guides accordingly, thank you for bringing this to my attention.

Also, soon Dagger will do exactly what you were expecting: it will execute the “npm ci” itself, in a container. So whether that script’s overhead is 3 seconds in the typical scenario I described to you, or a few minutes in the scary scenario you described to me: either way it will get cached, and either way it will have no host dependency on npm or any other language tooling: just the dagger CLI.


> We will fix those guides accordingly, thank you for bringing this to my attention.

Great :D

I read a lot more of the documentation this morning, in general I really like what I see, but the clustering bit is the key missing point for me at the moment. My current employer produces an Electron desktop application - we need to run 90% of the tests on Linux (most cost-effective), and 5% each on a Windows and macOS machine. I see the Dagger CLI runs on both macOS and Windows, but the architecture as it stands today expects that the Dagger pipeline will run all its workloads on the same VM. If I want to run pipeline tasks on other VMs, I can do it from Dagger, treating the Dagger Engine as the orchestration layer which makes some kind of an IaaS or PaaS call to schedule the workload on a separate macOS or Windows VM, outside the Dagger architecture. But... today I expect that workload scheduling to happen within the CI/CD architecture (it's trivially expressible within GitHub Actions, particularly as GitHub Actions hosts both Windows and macOS runners), and needing to hoist it outside the CI/CD architecture in order to use Dagger is a needless complication.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: