The big difference, and the advantage, is that on TensorBoard you can't group runs by hyperparams or easily divide them into subplots while keeping all the research info in front of you.
Aim does that, and it also aggregates groups of runs to reduce dimensionality and make comparison easy.
It has proper search by hyperparams (and everything else tracked/collected, really), all in one panel where you can compare 100s of experiments at a time.
Loading many runs with very long names on TB makes it super slow to analyze them. Tbh, that has been a motivation for building this open-source tool: to have something efficient and beautiful :)
We are working on a new paradigm for interacting with ML training runs. A lot of the effort is now focused on very efficient experiment comparison capabilities - we're talking about 1000s of runs. There are lots of challenges on both the UI and the backend. Loading TB (or any other tool, really) with lots of experiments is super slow, and the tool becomes useless.
There's also no way to do effective comparison of runs by hyperparams or other metadata in TensorBoard or MLFlow - quite basic capabilities.
What is the new paradigm and how does it differ from the existing paradigms?
And what do you mean by "no way to do effective comparison of runs by hyperparams or other metadata on the tensorboard or MLFlow"? If you mean "you can't compare or sort a list of runs by hyperparameter or minimum loss or whatever" then MLFlow can certainly do that, so I think I'm misunderstanding.
Any comments on Losswise or W&B?
And do you have a plan for monetization or governance?
Sorry for all the questions! I have complaints about all the existing solutions, so I'm excited to see a new effort.
Re comparison: we have always wanted a free, open-source, self-hosted tool that would let us group metrics/runs by hyperparams, experiment context (train, val, test, ...) and any other adjacent info about the training runs - aggregate groups of metrics, give them different styles, divide them into subplots, search through the runs easily (without regexps on super-long names), etc. As far as I've checked, no such features are built into those tools. This is a huge motivation behind Aim.
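To give a rough flavor of the grouping and aggregation described above, here's a minimal sketch in plain Python (no Aim dependency; the run records and field names are made up for illustration): runs are grouped by a hyperparameter and each group is collapsed to one number for comparison.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: hyperparams plus one tracked metric each.
runs = [
    {"lr": 0.01,  "subset": "train", "final_loss": 0.42},
    {"lr": 0.01,  "subset": "val",   "final_loss": 0.55},
    {"lr": 0.001, "subset": "train", "final_loss": 0.31},
    {"lr": 0.001, "subset": "val",   "final_loss": 0.38},
]

# Group runs by a hyperparameter (here: learning rate)...
groups = defaultdict(list)
for run in runs:
    groups[run["lr"]].append(run)

# ...then aggregate each group to a single value, so 100s of runs
# collapse into a handful of comparable numbers.
summary = {lr: mean(r["final_loss"] for r in rs) for lr, rs in groups.items()}
print(summary)
```

This is only a toy version of the idea; Aim does the grouping/aggregation in the UI, across any tracked metadata, not just one field.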
Probably the closest to this is W&B, but it's not open source and doesn't let you see the full context of the runs while comparing them (that's a separate module).
Haven't used Losswise tbh.
We are trying to build a way to compare 1000s of ML training runs at the same time while still keeping the full info (context) of the runs available.
This is what I meant by "new paradigm".
(It turns out this is a fun problem :) ).
We have been working on Aim for just a few months (there are 3 of us) and it's in very early stages. Most of the ideas we have aren't shipped yet.
But it's already very useful for many RL researchers who run lots of experiments that are sensitive to hyperparameters. Aim seems to be able to handle them.
Have you checked out the live demo from the README?
Hope this info is useful and makes sense. Would be awesome to connect.
I would love to learn more about your use-cases and needs in these tools. My twitter is @gevorg_s.
We are building Aim as a new paradigm for interacting with and organizing ML training runs. The project is really just a few months old.
It's focused on comparing 1000s of experiments really effectively, in minutes. MLFlow and TensorBoard don't have these capabilities, which has motivated us to work on Aim.
It's especially valuable for hyperparam-sensitive tasks such as RL.
Yea, spot on. Right now the UI runs locally and reads the data from `.aim`.
We have seen users deploy it for their team, and we have also seen it run standalone in a local environment.
Once training has run, the logs are saved in `.aim`. We are using a format that is ~50% more memory-efficient than the TB logs - and searchable (see the aimrecords repo).
Hi, Aim is fully open source and self-hosted (at the moment - just like tensorboard).
Open source version of wandb would be one way to put it.
But we are building a new way of interacting with ML training runs that lets researchers compare lots of them (1000s) in a really short period of time while having full access to the context of the experiments. This is a super early version, and lots more work needs to be done.
We have implemented a pythonic search for querying experiments that is easy to use.
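To show what "pythonic search" means conceptually (this is a dependency-free sketch, not Aim's actual query API; the field names are made up), the idea is to filter runs with a Python expression over their metadata instead of regexing long run names:

```python
# Hypothetical run metadata; field names are for illustration only.
runs = [
    {"experiment": "baseline", "hparams": {"lr": 0.01,  "batch": 32}, "best_acc": 0.71},
    {"experiment": "tuned",    "hparams": {"lr": 0.001, "batch": 64}, "best_acc": 0.83},
    {"experiment": "tuned",    "hparams": {"lr": 0.003, "batch": 64}, "best_acc": 0.79},
]

def search(runs, predicate):
    """Return the runs for which an arbitrary Python predicate holds."""
    return [r for r in runs if predicate(r)]

# A readable Python condition replaces a regexp over a long run name
# like "tuned_lr0.001_batch64_seed42_v3".
hits = search(runs, lambda r: r["hparams"]["lr"] < 0.005 and r["best_acc"] > 0.8)
print([r["experiment"] for r in hits])  # prints ['tuned']
```

The real tool evaluates such queries against everything tracked per run, but the principle is the same: query by structured metadata, not by name.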
Hopefully this sheds more light on the work we are doing.
Hi all, I am one of the co-authors of this project and will try to answer all the Qs here.
Was just forwarded this link - one of the community members must have posted it.
Interesting effort. Does it snapshot the state of source code at the time an experiment is run? Does it do it without requiring a git commit? I believe the Replicate experiment tracking tool does this.
We have gotten similar requests a couple of times and it's in the pipeline. Currently we're focused on the comparison of 1000s of metrics/training runs - a serious challenge on both the UI and the storage end.
Inviting you to the Aim [slack channel](https://slack.aimstack.io/).
We would love to learn more about such use cases and why they are important.