I don't understand why bot owners can't just run a full Windows 11 VM with Google Chrome and graphics acceleration.

You can probably run 50 of those simultaneously if you use memory page deduplication, and with a decent CPU+GPU you ought to be able to render 50 pages a second. That's 1 cent per thousand page loads on AWS. Damn cheap.
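As a rough check of that arithmetic, here is a small sketch of the hourly machine cost implied by 50 page loads per second at 1 cent per thousand loads. The function name is made up for illustration, and no actual AWS price is used; the numbers are only the ones quoted above.

```python
def implied_hourly_cost(pages_per_second: float, dollars_per_thousand_pages: float) -> float:
    """Hourly machine cost implied by a throughput and a per-thousand-page price."""
    pages_per_hour = pages_per_second * 3600
    return pages_per_hour / 1000 * dollars_per_thousand_pages


# Figures from the comment: 50 pages/s at $0.01 per 1000 page loads.
hourly = implied_hourly_cost(50, 0.01)
print(f"implied machine cost: ${hourly:.2f}/hour")
```

So the 1 cent figure only holds if the whole CPU+GPU machine costs about $1.80 per hour or less and actually sustains 50 pages per second.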



There are myriad providers competing to offer this, nicely packaged with all the accoutrements (IP rotation, location spoofing, language settings, prebuilt parsers, etc.) behind an easy-to-use API.

Honestly, it is a very healthy, competitive market with reasonably low switching costs, which drives prices down. These circumstances make rolling your own a tough sell.


They do, but the fact that they have to do this means there are fewer bots because it's less economical to go to such lengths, compared to something much less complex (which is orders of magnitude cheaper).


There are scraping subreddits.

If you browse them, you will see that bot writers get very annoyed when they can't scrape a site with a headless browser.

You can do what you suggested, but with Linux VMs/containers. Windows is too heavy: each VM will cost you 4 GB of RAM.


The reason to use Windows is that anti-bot tech is going to be a lot stricter if Linux is detected...


I’m in those. Running under xvfb with headless=false still works great.
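For anyone unfamiliar with the setup, a minimal sketch of what this might look like: wrap the scraper process in `xvfb-run` so a headed browser gets a virtual X display. The script name `scrape.py` and the screen size are hypothetical, and the script itself would still need to launch its browser with headless mode turned off in whatever automation library it uses.

```python
import shlex


def xvfb_command(script: str, width: int = 1920, height: int = 1080, depth: int = 24) -> list[str]:
    """Build an xvfb-run command line that runs `script` on a virtual display."""
    screen = f"-screen 0 {width}x{height}x{depth}"
    # -a picks a free display number; --server-args is passed through to Xvfb.
    return ["xvfb-run", "-a", f"--server-args={screen}", "python3", script]


cmd = xvfb_command("scrape.py")  # scrape.py is a placeholder name
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

The command is only built and printed here, so nothing is launched until you pass it to `subprocess.run` yourself.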


If you know of a simple way to run a Windows 11 VM with good graphics acceleration (no GPU passthrough), please contact me.


I assume your concern with GPU passthrough is that each VM needs a whole GPU? You can use GPU-PV to split your GPU between VM instances. Then the main bottleneck becomes how thin you split out your VRAM.

More info here:

https://web.archive.org/web/20231107182321/https://mu0.cc/20...

https://youtu.be/XLLcc29EZ_8?t=570

https://github.com/jamesstringer90/Easy-GPU-PV


Wouldn't VirtualBox's or VMware's paravirtual GPUs be a better fit for this use case? Unfortunately, the offerings with qemu/libvirt still lag VMware's by a lot.


I know those offer virtual GPUs, but I am unfamiliar with any paravirtual GPU offerings from VMware or VirtualBox. The virtual GPUs are much more limited in performance and graphics API support.


284 VMs on 296 GB of RAM with deduplication enabled, on a 128-core host with a 32Q vGPU.
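To put those numbers in perspective, a quick back-of-the-envelope sketch, assuming "284" means 284 VMs sharing 296 GB of host RAM, and using the 4 GB per Windows VM figure mentioned elsewhere in the thread as the no-deduplication baseline. Both are assumptions about what the figures mean, not measurements.

```python
def ram_per_vm_gb(total_ram_gb: float, vm_count: int) -> float:
    """Average host RAM consumed per VM."""
    return total_ram_gb / vm_count


VM_COUNT = 284          # assumed to be the number of VMs
HOST_RAM_GB = 296       # RAM in use with deduplication enabled
BASELINE_GB_PER_VM = 4  # per-VM figure quoted elsewhere in the thread

dedup_per_vm = ram_per_vm_gb(HOST_RAM_GB, VM_COUNT)
naive_total_gb = VM_COUNT * BASELINE_GB_PER_VM
savings_ratio = naive_total_gb / HOST_RAM_GB

print(f"per VM with dedup: {dedup_per_vm:.2f} GB")
print(f"without dedup:     {naive_total_gb} GB total")
print(f"reduction factor:  {savings_ratio:.1f}x")
```

Under those assumptions, deduplication brings the footprint to roughly 1 GB per VM, a bit under a quarter of the 4 GB baseline.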


I am reasonably sure that these kinds of fingerprints can detect whether the browser is inside a VM.


… yup?

I mean, you missed the minigame of preventing Chrome from signaling that it’s being driven programmatically (webdriver, etc.) and tipping your hand, but … yup?



