The author mentioned AlphaGo and Alpha Zero without mentioning OpenAI gym and Op... | Hacker News

Hacker Newsnew | past | comments | ask | show | jobs | submit

		paradite 11 months ago \| parent \| context \| favorite \| on: Google is winning on every AI front The author mentioned AlphaGo and Alpha Zero without mentioning OpenAI gym and OpenAI Five. Those products show OpenAI was innovating and leading in RL at that stage around 2017 to 2019. https://github.com/openai/gym https://en.wikipedia.org/wiki/OpenAI_Five

bitpush 11 months ago [–]

This is the first I'm hearing about it.

paradite 11 months ago | [–]

I forgot to mention that OpenAI also invented PPO, which is the default algorithm that everyone uses for RL since 2017: https://en.wikipedia.org/wiki/Proximal_policy_optimization

DeepSeek's GRPO is also just a minor variant of PPO.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact