I use Emacs for most of my work related to coding and technical writing.
I've been running phind-v2-codellama and openhermes using ollama and gptel, as well as GitHub's Copilot. I like how you can send an arbitrary region to an LLM and ask questions about it. Of course the UX is at an early stage, but just imagine if a foundation model could take all the context (i.e. your orgmode files and open file buffers) and use tools like LSP.
I just stopped worrying and succumbed to https://github.com/emacs-evil/evil.
Now I mostly just fiddle with orgmode configs to generate nice-looking HTML and PDFs.
I've been using Spack for a while to manage my machine learning package dependencies. It lets me quickly spin up projects with complex dependencies (my current environment has 329 packages built ...). It's pretty easy to use with containers, and it makes evaluating and migrating to different PyTorch/CUDA versions easy.
Congratulations on the launch! Best wishes!
Would absolutely love to dive into it soon.
Here are some high level questions:
- How does it handle failure of individual tasks in the pipeline?
- What if the underlying jobs (e.g. training or dataset extraction or metrics evaluation) need to run outside the k8s cluster (e.g. running bare-metal, slurm, sagemaker, or even a separate k8s cluster)?
- How does caching work if multiple pipelines can share common components (e.g. dataset extraction)?
> - How does it handle failure of individual tasks in the pipeline?
At this time there is no handling of failures (Sematic is 6 weeks old :). In the near future we will add fault-tolerance mechanisms: retries and try/except.
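To illustrate the retry semantics mentioned above, here is a minimal generic sketch of a retry wrapper with backoff. This is not Sematic's actual API; `with_retries` and its parameters are illustrative names.

```python
import time

def with_retries(fn, max_attempts=3, backoff_seconds=1.0):
    """Call fn(), retrying on any exception with linear backoff.

    A generic sketch of retry-style fault tolerance, not Sematic's API.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                # Wait a bit longer after each failed attempt.
                time.sleep(backoff_seconds * attempt)
    # All attempts failed: surface the last error to the caller.
    raise last_error
```

A real orchestrator would typically also let you restrict which exception types are retried and cap the total wait time.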
> - What if the underlying jobs need to run outside the k8s cluster?
You are free to launch jobs on third-party platforms from one of your pipeline steps. This is a pretty common pattern, for instance launching a Spark job, or a training job on a dedicated GPU cluster. In this case, the pipeline step that launches the job (the Sematic function) needs to wait for the third-party job to complete, or pass a reference to the job to a downstream step that will do the waiting.
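The submit-and-wait pattern described above can be sketched as follows. `submit_job` and `get_job_status` are hypothetical client calls standing in for whatever third-party API you are targeting (Spark, Slurm, SageMaker, etc.); they are not part of Sematic.

```python
import time

# Hypothetical client calls standing in for a third-party platform's API.
# In a real pipeline these would come from e.g. a Spark or Slurm client.
def submit_job(config):
    return "job-0"

def get_job_status(job_id):
    return "SUCCEEDED"

def run_external_training(config, poll_seconds=30):
    """A pipeline step that hands work off to an external scheduler
    and blocks until the job reaches a terminal state."""
    job_id = submit_job(config)
    while True:
        status = get_job_status(job_id)
        if status in ("SUCCEEDED", "FAILED"):
            break
        time.sleep(poll_seconds)
    if status == "FAILED":
        raise RuntimeError(f"external job {job_id} failed")
    # Return a reference to the job so downstream steps can locate its outputs.
    return job_id
```

Alternatively, as noted above, the step can return the job reference immediately and let a downstream step do the polling.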
> - How does caching work?
At this time there is no caching (as mentioned, Sematic is very new :). We will implement memoization soon. What you can do is run a data processing pipeline separately and then use the generated dataset as input to other pipelines.
This is a pretty common pattern: having a number of sub-pipelines (e.g. a data processing loop, a train/eval loop, a testing/metrics loop, etc.) that you can run independently, but that you can also put together in an end-to-end pipeline for automation. Sematic lets you nest pipelines in arbitrary ways, and each sub-pipeline can still have its own entry point for independent execution.
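The nesting pattern above can be sketched with plain functions. The step bodies are placeholders and this is not Sematic's decorator API; the point is only that each sub-pipeline is independently callable while the end-to-end pipeline composes them.

```python
def process_data(raw_path):
    """Sub-pipeline: turn raw data into a dataset (placeholder logic)."""
    return {"source": raw_path, "rows": 100}

def train_and_eval(dataset):
    """Sub-pipeline: train on a dataset and report a metric (placeholder)."""
    return {"model": "m-0", "accuracy": 0.9, "rows_used": dataset["rows"]}

def end_to_end(raw_path):
    """End-to-end pipeline composed by nesting the sub-pipelines.

    process_data and train_and_eval remain usable as standalone entry
    points, e.g. to reuse a previously generated dataset elsewhere.
    """
    dataset = process_data(raw_path)
    return train_and_eval(dataset)
```

Running `process_data` on its own and feeding its output into other pipelines is exactly the interim caching workaround described above.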
Is there a plan for tighter integration with k8s, potentially in a multi-cluster/federated setting? It's a lot easier to get buy-in for Ray adoption from infra teams when k8s is the centralized compute substrate.
Great work and kudos to the Ray team!
It's definitely a fresh take with a lot of lessons learned from previous generations (e.g. Spark).
There are a few nice features I wish Ray would eventually get to.
On the user experience side, it would be nice to have task-level logs: it's often easier for users to reason at the task level, especially when the task is a facade that triggers other complicated library/subprocess calls.