The big difference, and the advantage, is that on TensorBoard you can't group runs by hyperparams or easily divide them into subplots while keeping all the research info in front of you.
Aim does that, and it also aggregates groups of runs to reduce dimensionality and make comparison easy.
It has proper search by hyperparams (and everything else tracked/collected, really), all in one panel where you can compare 100s of experiments at a time.
Loading many runs with very long names on TB makes it super slow to analyze them. Tbh, that has been a motivation for building this open-source tool: to have something efficient and beautiful :)
We are working on a new paradigm for interacting with ML training runs. A lot of the effort is now focused on very efficient experiment comparison capabilities - we're talking about 1000s of runs. There are lots of challenges on both the UI and the backend. Loading TB (or any other tool, really) with lots of experiments is super slow, and the tool becomes useless.
There's also no way to do effective comparison of runs by hyperparams or other metadata in TensorBoard or MLFlow - quite basic capabilities.
What is the new paradigm and how does it differ from the existing paradigms?
And what do you mean by "no way to do effective comparison of runs by hyperparams or other metadata on the tensorboard or MLFlow"? If you mean "you can't compare or sort a list of runs by hyperparameter or minimum loss or whatever" then MLFlow can certainly do that, so I think I'm misunderstanding.
Any comments on Losswise or W&B?
And do you have a plan for monetization or governance?
Sorry for all the questions! I have complaints about all the existing solutions, so I'm excited to see a new effort.
Re comparison: we have always wanted a free, open-source, self-hosted tool that would let us group metrics/runs by hyperparams, experiment context (train, val, test, ...) and any other adjacent info about the training runs - aggregate groups of metrics, give them different styles, divide them into subplots, search through the runs easily (without regexps on super-long names), etc. As far as I've checked, no such features are built into those tools. This is a huge motivation behind Aim.
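To give a rough flavor of the grouping and aggregation described above, here's a minimal sketch in plain Python (no Aim dependency; the run records and field names are made up for illustration): runs are grouped by a hyperparameter and each group is collapsed to one number for comparison.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: hyperparams plus one tracked metric each.
runs = [
    {"lr": 0.01,  "subset": "train", "final_loss": 0.42},
    {"lr": 0.01,  "subset": "val",   "final_loss": 0.55},
    {"lr": 0.001, "subset": "train", "final_loss": 0.31},
    {"lr": 0.001, "subset": "val",   "final_loss": 0.38},
]

# Group runs by a hyperparameter (here: learning rate)...
groups = defaultdict(list)
for run in runs:
    groups[run["lr"]].append(run)

# ...then aggregate each group to a single value, so 100s of runs
# collapse into a handful of comparable numbers.
summary = {lr: mean(r["final_loss"] for r in rs) for lr, rs in groups.items()}
print(summary)
```

This is only a toy version of the idea; Aim does the grouping/aggregation in the UI, across any tracked metadata, not just one field.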
Probably the closest to this is W&B, but it's not open source and doesn't let you see the full context of the runs while comparing them (that's a separate module).
Haven't used Losswise tbh.
We are trying to build a way to compare 1000s of ML training runs at the same time while still keeping the full info (context) of the runs available.
This is what I meant by "new paradigm".
(It turns out this is a fun problem :) ).
We have been working on Aim for just a few months (there are 3 of us) and it's in very early stages. Most of the ideas we have aren't shipped yet.
But it's already very useful for many RL researchers who run lots of experiments that are sensitive to hyperparameters. Aim seems to be able to handle them.
Have you checked out the live demo from the README?
Hope this info is useful and makes sense. Would be awesome to connect.
I would love to learn more about your use-cases and needs in these tools. My twitter is @gevorg_s.
We are building Aim as a new paradigm for interacting with and organizing ML training runs. The project is really just a few months old.
It's focused on comparing 1000s of experiments really effectively, in minutes. MLFlow and TensorBoard don't have these capabilities, which has motivated us to work on Aim.
It's especially valuable for hyperparam-sensitive tasks such as RL.
Yea, spot on. Right now the UI runs locally and reads the data from `.aim`.
We have seen users deploy it for their team, and we have also seen it run standalone in a local environment.
Once training has run, the logs are saved in `.aim`. We are using a format that is ~50% more memory-efficient than the TB logs - and searchable (see the aimrecords repo).
Hi, Aim is fully open source and self-hosted (at the moment - just like tensorboard).
Open source version of wandb would be one way to put it.
But we are building a new way of interacting with ML training runs that lets researchers compare lots of them (1000s) in a really short period of time while having full access to the context of the experiments. This is a super early version, and lots more work needs to be done.
We have implemented a pythonic search for querying experiments that is easy to use.
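To show what "pythonic search" means conceptually (this is a dependency-free sketch, not Aim's actual query API; the field names are made up), the idea is to filter runs with a Python expression over their metadata instead of regexing long run names:

```python
# Hypothetical run metadata; field names are for illustration only.
runs = [
    {"experiment": "baseline", "hparams": {"lr": 0.01,  "batch": 32}, "best_acc": 0.71},
    {"experiment": "tuned",    "hparams": {"lr": 0.001, "batch": 64}, "best_acc": 0.83},
    {"experiment": "tuned",    "hparams": {"lr": 0.003, "batch": 64}, "best_acc": 0.79},
]

def search(runs, predicate):
    """Return the runs for which an arbitrary Python predicate holds."""
    return [r for r in runs if predicate(r)]

# A readable Python condition replaces a regexp over a long run name
# like "tuned_lr0.001_batch64_seed42_v3".
hits = search(runs, lambda r: r["hparams"]["lr"] < 0.005 and r["best_acc"] > 0.8)
print([r["experiment"] for r in hits])  # prints ['tuned']
```

The real tool evaluates such queries against everything tracked per run, but the principle is the same: query by structured metadata, not by name.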
Hopefully this sheds more light on the work we are doing.
Hi all, I am one of the co-authors of this project and will try to answer all the Qs here.
Was just forwarded this link - one of the community members must have posted it.
Interesting effort. Does it snapshot the state of source code at the time an experiment is run? Does it do it without requiring a git commit? I believe the Replicate experiment tracking tool does this.
We have gotten similar requests a couple of times and it's in the pipeline. Currently we're focused on the comparison of 1000s of metrics/training runs - a serious challenge on both the UI and the storage end.
Inviting you to the Aim [slack channel](https://slack.aimstack.io/).
We would love to learn more about such use cases and why they are important.