I hate that I have to say this, but be aware that one cannot "pip install ansible==2.13" (https://pypi.org/project/ansible/#history) since they took over the "ansible" name to mean the "ansible distribution", which includes the project you linked to as well as a bazillion separately versioned galaxy dependencies
Is pip an end user's package manager or is it a developer's dependency manager?
This is why language-specific tools always gave me pause. They're focused on supporting the language and not the end user. I can no longer happily install programs from an array of languages using a single common tool. I have all this annoying overhead to memorize instead:
- Mapping which program comes from what language
- How to install things using that language's package/dependency manager
- How to access those programs, if they aren't installed to a common location
I hear you, and we do run into that _a lot_ in triaging SO reports since some folks "apt-get install ansible", others "sudo pip install", and rarest of them all "python -m venv && pip install" and the end result matters a great deal as far as "sure, the thing was installed, but WHERE and was it installed completely?"
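When one of those reports comes in, the first sanity check I ask for is roughly the following (a sketch; the exact output shapes vary by version and install method):

  pip show ansible ansible-core   # the community meta-package vs. the actual engine
  ansible --version               # prints "ansible [core 2.x.y]" plus config file and module search path
  which -a ansible                # how many copies are on PATH, and where they live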
I used "pip install" because it was shorthand, but the problem also manifests with "brew install ansible": at this moment the formula says "5.7.1" but has a totally wrong `head "https://github.com/ansible/ansible.git", branch: "devel"`
I believe it's likely just a nomenclature problem; had they left "ansible" pointing at GH/ansible/ansible and called the new thing "superansible" (or whatever) it would have been so much easier to reason about and wouldn't have needed a damn near secret side-repo named "ansible-build-data" in which they would hide the actual, no kidding, git tags
The first time someone had to create a GH release named "these are not the releases you're looking for" would have been a great opportunity to acknowledge how off the rails things had gone: https://github.com/ansible/ansible/releases/tag/v2.2.1.0-0.3...
Related to that, I consider it almost purposefully misleading of them to leave the PyPI "source" link pointed at GH/ansible/ansible since that is _for sure_ not what is going to be pip installed for those version numbers. It'd be like pointing the ansible PyPI at the Jinja2 GH repo -- yes, that is one of the things that pip installing will put on disk, but where did the rest of it come from?
I hope you don't mind me stealing your comment to plug my side project, Judo. It was created in 2016, out of my frustration with Ansible. Judging by the state of affairs, that motivation continues to be relevant.
That's one of the design decisions: there should ideally be only one "obvious" way of achieving a certain goal. Per-invocation parameters can already be passed via environment variables ("-e FOO" to pass from your local environment; "-e BAR=42" to set explicitly); I'm also considering using envdir (or a similar scheme) to set per-host parameters (https://github.com/rollcat/judo/issues/11).
I could allow specifying command line arguments somehow with the -s flag, but that would complicate parsing, quoting/unquoting, etc. I would have to teach the user about quoting rules, and probably shoot myself in the foot at least once in the process. I try to actively avoid creating hard problems by solving an equivalent, easier problem.
If this is not clear from the readme, I'll take this as a bug report, and try to improve the readme :)
When would you ever want to install just the core without any plugins? It seems like what they have now, 'pip install ansible' and you get a working distribution, is exactly what users expect.
It's just unintuitive that the package that provides the "ansible" binary isn't ansible. And then it gets even more confusing with the title of this post. It should really be "ansible-core 2.13".
I guess that depends on one's definition of "users," since a *lot* of ansible troubleshooting involves looking at the sources for stuff due to the lack of defensive programming
But, "looking at the sources" is now extremely complicated since the "user" reports "I installed ansible==5.7.1" -- so which URL do I go to in order to see why "{{ lookup('something') }}" has IndexError-ed?
For the kind of product that Ansible is, yes, users will care. Tooling ecosystems are just like managing a programming language and all its dependencies. It's as if you pulled github.com/golang/go and got only the Go compiler but none of the standard library.
Your analogy has some holes. Go doesn't direct end users to pull from github and `git pull` is not expected to manage dependencies.
The root of this problem is whether pip is an end user's package manager, where a meta package makes sense, or is it a developer's tool, where you want to manage finer grained dependencies?
Actually, it does if you already have Go installed: a new version[1] can be installed with `go install`. In that way modules act like a system version of pip. I do see your point though; at the least it's confusing to unpack.
But that's the opposite of what's being argued here. Their claim is that pulling down the package 'ansible' should only give you the ansible binary, not the ansible binary plus all the plugins and stuff to use the tool and ecosystem.
Yes, users should definitely care. Knowing the version is important for identifying which features you can use and for replicating errors for debugging, while knowing your dependency tree is important for keeping your build compact and minimizing exposure to security vulnerabilities.
I’m a fan of Ansible in general and this is not a world-ending mistake, but should be corrected to avoid confusion.
Fun fact: playbook library actions don't have to be in Python; they can be in golang or shell or whatever, so long as they accept JSON on the input and emit JSON on the output
I believe that's true of all the plugins; I've just personally tried it with actions and haven't with the other pluggable ones
No, all other plugins must be written in Python (they are imported into the engine); modules are executed externally on the remote and can be in ANY language, scripting or compiled.
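For instance, a module can be a plain shell script; here's a rough sketch (hypothetical library/hello.sh, and it assumes jq exists on the target):

  #!/bin/sh
  # WANT_JSON  <- this marker makes Ansible hand us a JSON args file
  # minimal non-Python module: args file path arrives as $1, answer goes to stdout as JSON
  NAME=$(jq -r '.name // "world"' "$1")
  printf '{"changed": false, "msg": "hello %s"}\n' "$NAME"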
It sounds like you want simple and reliable. Any Ansible-alternative that you find is going to fail your primary requirement because it won't be as widely adopted, documented, and used as Ansible is. I'm not defending YAML but you might be letting perfect be the enemy of the good.
Old school: a simple bash script (< 10 lines), tar.gz files with your whole distribution; extract to a directory with the version in its name, then rely on a symlink that chooses which version to use.
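Something in that spirit (all names and paths invented for illustration):

  #!/bin/sh
  # versioned extract plus symlink flip; rollback = point the link back at the old release
  set -eu
  VERSION="$1"                                        # e.g. 1.2.3
  BASE=/opt/myapp
  mkdir -p "$BASE/releases/$VERSION"
  tar -xzf "myapp-$VERSION.tar.gz" -C "$BASE/releases/$VERSION"
  ln -sfn "$BASE/releases/$VERSION" "$BASE/current"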
I'm shocked that no one has said Terraform yet. It has its own declarative DSL, which some people complain about (because people complain about everything), but it works well for what it's intended to do.
Providers can be created for anything with an API, from the major cloud providers to k8s to anything else.
No agent is required: it just writes state to a file, and then it diffs that file against the actual state every time it runs. (In practice, you'll probably want to put that state in a remote location like an S3 bucket, but that's very easy to do. And if you're the only one using it, you can just save it locally, which is the default behavior.)
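For the S3 case it's roughly one `terraform init` with partial backend config (bucket/key/region values made up; assumes the config declares an empty `backend "s3" {}` block):

  # point the state at S3 once, then plan/apply as usual
  terraform init \
    -backend-config="bucket=my-tf-state" \
    -backend-config="key=prod/terraform.tfstate" \
    -backend-config="region=us-east-1"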
Depending on your use case for Ansible, it could be a very good fit.
Terraform is great for certain tasks, but even they advise against using it for local execution. Whatever you use it for, you really need a provider, and their module system isn't very intuitive either. Ansible has/had potential, but it sucks in a lot of ways too. Unfortunately, as much as I dislike certain aspects, it really is the best generic automation tool available at the moment.
Yeah, if you need to manage individual servers/VMs, it's not a great fit. I've used cloud-init files to configure EC2 instances on startup with things like packages and SSH keys, and that works pretty well if you can treat those servers as if they're immutable. But if you need to get in there and run something, it's not quite a replacement for Ansible.
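(For the boot-time part, cloud-init will also happily run plain shell user data; a rough sketch, with the package and key as placeholders:)

  #!/bin/bash
  # runs once at first boot via cloud-init; the instance is treated as immutable afterwards
  apt-get update && apt-get install -y nginx
  install -d -m 700 -o ubuntu -g ubuntu /home/ubuntu/.ssh
  echo 'ssh-ed25519 AAAA... ops@example' >> /home/ubuntu/.ssh/authorized_keys
  chown ubuntu:ubuntu /home/ubuntu/.ssh/authorized_keys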
Yet people use it daily to manage 10k-100k+ servers/devices.
The term "scaling" has very different meanings depending on context and how one product scales is very different from another.
You could set up a context that favors push vs pull and vice versa; you can also see different products scaling well or not depending on slight variations in context and implementation.
I am highly dubious that it would be possible to manage 100k servers, especially to do interactions with large numbers at a time. The way Tower collects results in a thread pool, assuming success, simply does not work at any scale. I tried and tried. I fixed many bugs and got to about 4000 hosts before changing to another platform.
If almost every server is reliable, I am sure it would work fine. That is not going to happen at scale.
Terraform does the job but it's pretty dirty and unreliable. I've had so many cases where a plan looks all great, PR gets approved and merged, and then something happens during the apply causing it to fail because all validation is done in the cloud API, not in the provider code.
Yeah, we had a PR bomb recently in a way that "plan" could have *trivially* caught had it used any of the "Get*" APIs from the cloud provider to ask about the current situ.
I appreciate that "the map is not the terrain," and that "plan" is speculating about a future configuration of the world, but come on -- if "terraform plan" is going to require _live credentials_ to run, and then only use those to enumerate the active regions, what are we even doing here?!
Shouldn't a `terraform plan` tell you that? If not, then the state of the infra vs what's in the terraform state is different. I've had issues with version changes in the past and needing to update state files and all that malarkey.
No, that's kind of my point. Terraform looks sexy and declarative on the surface but it's really just turning HCL into cloud API calls where the actual logic happens. Once you've got a few hundred lines the wheels start falling off. If it were truly declarative it wouldn't need to store what it knows about the existing infrastructure in a tfstate file.
Tform started off as a cool idea with good principles and over time has morphed into a shitty scripting language for managing multi cloud infra without clickops.
I'll do you one better: it's turning HCL into *an opaque golang intermediary*[1] of cloud API calls
It's like a game of telephone where every new participant in the chain is one more place to have "let me help you" turn into "what the hell was that?"
1 = and that's not even getting into the tire fire of the providers being either some Internet rando or an already overloaded team trying to have PRs make it through and out to release. I believe the recent "we're not reviewing PRs anymore, exhausted" was just scoped to the hashicorp/terraform repo specifically, but it could very easily also apply to every code-gen shim that sits between TF and the underlying cloud SDK
You'll find a lot of places use Terraform and a config management tool. Terraform is great to build out cloud infra (not just instances but load balancers, object storage, etc), but when it comes to maintaining system configuration and application state, it's less optimal.
Could you explain how this is different from Ansible? That webpage says pyinfra is "Integrated with Docker, Vagrant & Ansible out of the box", so I would think the use cases aren't identical, otherwise they wouldn't bother integrating it with tools that do the exact same thing.
Can anyone compare pyinfra to these other projects?
The main difference is that pyinfra is a library, so you can use all the power of a general purpose programming language (functions, data structures, variables, modules, loops, conditions, calling APIs, debugging, etc.).
Ansible uses YAML configuration files, and basically reinvents and tries to shoehorn a lot of the above into YAML. Due to the limitations of YAML, it's extremely verbose, requires a lot of copy-paste, and quickly becomes unmaintainable.
Customising and adding functionality is also a lot more difficult and time consuming with Ansible.
Ansible is also extremely slow due to the way it works, and can take ages to run even if nothing has changed on the target device. The slow feedback loop and the lack of debugging can make even small changes painful, even on localhost. On remote hosts, it's even worse.
pyinfra's only dependency on the remote target is a shell, whereas Ansible requires Python.
pyinfra cons: smaller community, fewer modules, and it doesn't have extensive hardware support (e.g. routers/switches/firewalls). Though usually that's not a big problem, because it's easy enough to add any missing functionality.
Another vote for PyInfra here, I'm actually porting all our Ansible modules over to it right now and it's an absolute joy compared to the mare I had writing the Ansible modules in the first place. I can actually debug things!
Plus, if you haven't written perl using the last five or so years' best practices, you'll almost certainly have a very confused idea of how the language works when wielded appropriately.
Old-style scripting perl easily becomes write-only line noise, sure.
Sane application perl is really quite pleasant - and it's pretty much the only common dynamic language that can give you a compile-time error if you screw up a variable name (ES6's let has basically the same semantics as perl's my but sadly the errors are still runtime rather than compile time).
(one may argue that typescript counts but the experience IMO still isn't the same)
Nix, CDK, and (to a lesser extent) Puppet are good alternatives, depending on exactly what you're trying to do. Nix is fantastic for reproducibility on individual machines and not borking the rest of the system when used on a non-NixOS distro. CDK is good infra-as-code for AWS. And Puppet is not-as-horrible-as-Ansible and can be used without an agent, despite all the documentation seeming to push you toward using one.
We are currently evaluating Nix as a replacement for our internal build system; our main product is some kind of Linux distro.
Nix has real bad onboarding and documentation, the creators hopped off to some other, shinier side projects (flakes), and the rest of the community is struggling to manage the immense maintenance cost of keeping the packages up-to-date and integrated. And the external consultants we hired for it are, like, hardcore believers who always tell you how great Nix will be in the future, but completely disregard our needs, like building offline in a separated network. Which is officially supported but unusable due to bugs.
Disclosure: I love Nix ecosystem, but I acknowledge the fact that it's perfect for my use cases which are very different from those of a business.
> Nix has real bad onboarding and documentation
Could not agree more. I found that diving into it head first actually did yield good results in the end but it was really not quick.
> some other, shinier side projects (flakes)
I would argue that flakes are an absolute necessity and a logical continuation of the ecosystem's development rather than a side project. Flakes allow you to truly describe the desired state of the system from one place and properly pin all inputs in place.
What makes you think flakes are a shinier side project? In my opinion, flakes are the One-True-Nix at this point. They simplify a large swath of issues with using Nix, and every day that sees flake adoption improve is a net-win for Nix usage overall.
It's nothing to do with a world religion; it's correcting what I see as a flawed view that flakes are somehow something unrelated to Nix and the issues involved with Nix, and that time spent on flakes is time wasted from a Nix consumer's POV. I'm telling you that not only are flakes a main project for the Nix developers, but they serve to completely upend how Nix is used. As I said, flakes directly solve several issues of Nix-without-flakes. This is a huge benefit for the Nix consumer.
Flakes are not a side project; they add to the Nix model, they don't supersede it. A flake is basically a lock file for a package, making it easy for anyone, even a separate repository not related to nixpkgs at all, to reproduce said build with compatible versions.
Shell scripts have so many nasty edge cases and quirks. Quoting and `find -print0` aren't enough to not get bitten. And doing the equivalent shell work in Python can be kind of exhausting and maybe prone to different errors.
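Even the "right" way needs a fair bit of ceremony just to loop over filenames safely, e.g. (bash, rough sketch):

  # naive version: word-splitting mangles "my config.conf"
  for f in $(find . -name '*.conf'); do echo "$f"; done

  # safer: NUL-delimited names fed into a read loop
  find . -name '*.conf' -print0 | while IFS= read -r -d '' f; do
    echo "$f"
  done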
If I remember correctly, Puppet required installing an agent. To add insult to injury, it wasn't available on the RHEL (clone) ISO, and right now it's also missing from Fedora EPEL.
I'm moving toward thinking that Ansible is simple for simple things but complex for complex things. More and more often I find "configuration drift" in our servers' configuration, due to the fact that Ansible encourages managing by parts, not the system as a whole. And it is very rarely run automatically over all nodes, say every 15 minutes.
I'm not sure what you mean by parts and not a whole system. It's defined via host inventories which, by definition, are a whole system. You obviously include roles and further tasks for those definitions.
You'd use cloud-init, as the other poster mentions, to initialise, or use Ansible in SSH mode with the inventory coming from your cloud provider or just an ini/YAML list curated by yourself, then run it regularly from Jenkins/Rundeck/cron/whatever. In that case cloud-init configures your Ansible run user and SSH keys so you can kick it off from the main box, and perhaps register the node or use tags in the cloud provider; there are loads of ways.
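The regular run can be as dumb as a cron entry on the control box (paths and inventory names invented):

  # re-apply the whole thing every 15 minutes so drift gets corrected instead of accumulating
  */15 * * * *  cd /opt/infra && ansible-playbook -i inventories/prod site.yml >> /var/log/ansible-run.log 2>&1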
> I'm not sure what you mean by parts and not a whole system.
From what I see around, work with Ansible is organized via multiple playbooks, e.g.:
* playbooks/nginx.yaml
* playbooks/kafka.yaml
* playbooks/monitoring.yaml
each of them configuring only a subpart of the whole system. Some of them may have overlapping functions, like changing sysctls. Thus each enforces the state of a subset of services, and if, for example, after the Nginx playbook you logically need to run the Monitoring playbook, it can be forgotten/skipped -> configuration drift grows.
Isn't that the point of things like cloud-init and other system tools? Ansible may not be great at deploying, but the whole purpose would be to catch the drift using Ansible.
I recently rewrote the infrastructure setup of a startup. They had Chef infra-as-code before, written for CentOS 7. Since CentOS is dead and I took the opportunity to move to Debian, it wasn't worth it to keep working with Chef, so I rewrote it in Ansible.
And it was fantastic. Maybe because Chef is so awful and convoluted my reference point is skewed. But I loved every moment.
It was simple, the documentation is great, and the extended packages are great (I set up a highly available Redis Sentinel infra with one ansible-galaxy import and a ~20-line playbook).
Yeah, Kubernetes and Docker are the cool kids. But Ansible is killing it if you still live in a world of VMs.
It's the small things. Why do I have to write more than one task and register intermediate variables if I want to execute a command and see its output?
Executing a task while collecting its output is a different operation from a task that displays arbitrary output generated by a command.
Imagine two shell commands:
# foo=$(cat bar)    (run the command and capture its output)
# echo "$foo"       (a second, separate step to display it)
Yes, it's the little things we need to understand, else we'll be confused by false enumeration. Ansible excels at ensuring one (idempotent) task is indeed just one task.
Try the '-v' flag for the verbosity you're looking for from one task, as it's intentionally hidden from the default output.
I don't understand the first part of your comment; why do you compare different commands? My complaint is that Ansible runs an external command and always internally collects its output, but doesn't display it.
Thank you for the "-v". I've run ansible with various amounts of -v in the past but I didn't notice that it does display stdout_lines with it!
Another part of ansible that I hate is that it encourages bad development practices.
Anyone can write a working playbook and put it in some git repo. If you are not actively fighting against it, you might end up with several git repos, hundreds of playbooks, and no easy way to understand what code applies to what servers.
Configuration management solutions that feature an agent on managed machines and don't do everything over SSH encourage people to organize their efforts in one repo, because you cannot manually apply your config to a server without pushing it to git.
This I disagree with a lot. Ansible does have a lot of unfinished or half-baked playbooks out there. But there is solid documentation which makes it pretty clear how to properly design things. I have a feeling people just don't read it.
For work-related use cases, I find it very unlikely you would ever pull in a git repo or playbook from Galaxy. Coming from Puppet and Chef, I always thought Puppet was in a bad state, but then I realized it is 2022 and Ansible has exactly zero roles for LDAP that work, or even resemble a thing that would ever make sense to run in a real installation.
Also, if we're talking about what code would run on which server, I do not see where other configuration management tools would perform better. Have you seen the humongous mess one needs to make in Puppet and Hiera when one wants to build a configuration that is multi-OS and multi-arch (like just a simple Debian/Ubuntu plus x86_64/aarch64 scenario, not even Windows)? Or even what it would be like if one also manages firmware blobs there?
If I may be so frank... configuration management is a shit show on every available solution. And I am saying that as someone actively being a part of these ecosystems and contributing.
I know the feeling. We use ansible extensively in our company. While it is probably the best tool at the moment to get the job done, it is not a pleasant tool for larger/complex setups.
Like you said, it's the small things. The things that annoy me most are probably variable scoping and precedence rules. Scoping is practically non-existent; a variable declared in one role can be freely accessed in another, which can easily lead to clashes when your variable names are too generic.
> it is not a pleasant tool for larger/complex setups.
I'd say it still is. People forget you can code your own modules in whatever language you want and just expose a nice declarative interface through Ansible.
Is there a competing config-management tool that behaves more like you want? I'm just curious because I haven't used the others in anger enough to know how they solve the problem you described
I am a fan of SaltStack. With SaltStack approach, you don't have to write additional yaml to see the output of external command's execution.
You see / don't see the output of commands based on logging settings in your client (salt-call).
# EDIT: people pointed out that running ansible with additional -v's will indeed print the commands' output. I don't understand how I'd missed that... I've certainly run ansible with -v a lot.
And my favorite part of SaltStack is that you can use jinja in any place of your states
You can use jinja expressions inline in your Ansible tasks' YAML if you want to as well. Whether that's a good idea or a mess to read is in the eye of the beholder. Personally I do it only as a last resort, but it has been nice to have occasionally, rather than some clunky multiple-task alternative.
If you need to see the output of commands and tasks in ansible, select a different output callback. Where I want something rudimentary, I use `ANSIBLE_STDOUT_CALLBACK=json` and a tool like github.com/okapia/ansible-json-monitor to parse the json.
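Without that tool, a rough jq one-liner does the job too (playbook name invented, and the filter is approximate since the exact result shape depends on the callback version):

  # dump full task results (including command stdout) as JSON and slice out what matters
  ANSIBLE_STDOUT_CALLBACK=json ansible-playbook site.yml \
    | jq '.plays[].tasks[].hosts[] | {cmd, stdout}'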
Is there any sane reason why in all standard modules the keyword “src” has 3 letters but “dest” has 4 letters instead of “dst”, which is pretty standard everywhere? Drives me crazy every time.
For example, "ansible 5.7.1" is actually this: https://github.com/ansible-community/ansible-build-data/blob... which includes ansible-core 2.12.5
It makes even talking about it hard