Hacker News | mintzworld's comments

Oh, the irony of using SeleniumBase to make Playwright stealthy!


Even a single script that performs actions too quickly on a website can trigger anti-bot measures, even if the bot isn't detected directly.
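A common way to avoid that is to pace actions with randomized pauses. Here is a minimal sketch of the idea; `human_pause`, `base`, and `jitter` are hypothetical names and values chosen for illustration, not part of SeleniumBase or any anti-bot vendor's documentation, and the right delays depend on the site:

```python
import random
import time


def human_pause(base=1.0, jitter=0.5, rng=random, sleep=time.sleep):
    """Sleep for a randomized interval and return the delay that was used.

    base and jitter are illustrative numbers (seconds), not tuned thresholds.
    """
    # Pick a delay in [base - jitter, base + jitter], never below zero.
    delay = max(0.0, base + rng.uniform(-jitter, jitter))
    sleep(delay)
    return delay
```

Calling `human_pause()` between clicks or page loads spreads actions out unevenly, which is the point; passing `rng` and `sleep` in makes it easy to test without actually waiting.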


I'm not denying that, I'm saying it's not a difficult challenge to solve when you compare it to the others I mentioned.


On the topic of scraping grocery sites, here's an example of bypassing bot-detection on Albertsons: https://github.com/seleniumbase/SeleniumBase/blob/master/exa... (A demo of that is in https://www.youtube.com/watch?v=Mr90iQmNsKM)


It seems weird to me that that works. When I scroll elements into view (and do similar things) in other code, I use a random scroll speed to simulate human behavior, but SeleniumBase evidently doesn't.

Maybe I am just too paranoid.


SeleniumBase CDP Mode uses `DOM.scrollIntoViewIfNeeded` (https://chromedevtools.github.io/devtools-protocol/tot/DOM/#...), so it only scrolls when elements are offscreen, rather than always scrolling. This reduces the number of scrolls needed. Also, it seems that most anti-bot services are not looking at scrolling as a way of identifying users.
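For anyone curious what that looks like at the protocol level, here is a rough sketch that only builds the JSON request for `DOM.scrollIntoViewIfNeeded`. The helper name `scroll_into_view_command` is made up for this example, the `nodeId` is assumed to come from an earlier DOM query, and sending the message over the DevTools WebSocket is left out. This is not how SeleniumBase implements it internally, just the shape of the CDP call:

```python
import json
from itertools import count

# Each CDP request needs a unique integer id so the reply can be matched to it.
_message_ids = count(1)


def scroll_into_view_command(node_id):
    """Build the JSON text for a DOM.scrollIntoViewIfNeeded request.

    node_id is assumed to have been obtained from a previous DOM lookup.
    """
    return json.dumps({
        "id": next(_message_ids),
        "method": "DOM.scrollIntoViewIfNeeded",
        "params": {"nodeId": node_id},
    })
```

The browser decides whether any scrolling is required, so the script sends the same command whether or not the element is already visible.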


PyAutoGUI is the optimal tool for clicking things inside of closed shadow-root elements, which are hidden from JavaScript. You can use CDP for clicking other elements.


The "CDP Mode" used for bypassing CAPTCHAs and bot-detection has its own ReadMe within SeleniumBase: https://github.com/seleniumbase/SeleniumBase/blob/master/exa... And there's a recent YouTube video with live demos: https://www.youtube.com/watch?v=Mr90iQmNsKM


SeleniumBase is free, open-source, can bypass CAPTCHAs with a few lines of code, and it works from the free tier of GitHub Actions.


It can't bypass all CAPTCHAs, and that's what I'm talking about.


According to live demos seen in https://www.youtube.com/watch?v=Mr90iQmNsKM, it'll bypass Cloudflare, Akamai, Shape Security, DataDome, Incapsula, Kasada, and PerimeterX.


Okay, and? DeathByCaptcha can bypass all of those + all other CAPTCHAs.

Write a ton of code or just roll in a solving service API. Easy decision: you save a ton of time + get to scraping faster.


I feel like what you're saying is you have a vested interest in the services you mentioned with all of this scope creep to your OG argument.


With SeleniumBase, you can bypass CAPTCHAs with one line of code: `sb.uc_gui_click_captcha()`


Okay, but it doesn't solve all CAPTCHAs, and a solving service does with a few more lines of code.

Can your script even do Google reCAPTCHA and hCaptcha? What about the CAPTCHA from Dread? (Ain't no way it can.)

There is no need to bypass them when you can just solve them.


There's a reCAPTCHA on the Pokemon website. This SeleniumBase example bypassed it: https://github.com/seleniumbase/SeleniumBase/blob/master/exa...


> There is no need to bypass them when you can just solve them.

There is no need to solve them when you can just bypass them.


The point is you can't bypass them all, but you CAN solve them all.


Why pay to solve CAPTCHAs when SeleniumBase can bypass them for free? SeleniumBase can also "solve" CAPTCHAs (such as Cloudflare via click).


It's like you're not even reading what he wrote.


"Service workers can only be used over HTTPS, or on localhost for development" - That's a known Gatsby issue: https://github.com/gatsbyjs/gatsby/issues/10369


Some websites have a Same-Origin policy that won't allow rendering in an iframe. The CORS proxy lets you bypass the restriction.
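In practice, the block usually comes from response headers: `X-Frame-Options`, or a `frame-ancestors` directive in `Content-Security-Policy`. Here is a simplified sketch of how you might check for them; `blocks_framing` is a hypothetical helper, it takes a plain dict of headers, and it treats any `frame-ancestors` list without `*` as blocking, which is stricter than a real browser would be:

```python
def blocks_framing(headers):
    """Return True if the response headers ask browsers not to frame the page.

    Simplified: any frame-ancestors directive that does not contain "*"
    is treated as blocking, even though specific origins may be allowed.
    """
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}

    if normalized.get("x-frame-options", "").strip() in ("deny", "sameorigin"):
        return True

    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0] == "frame-ancestors":
            return "*" not in parts[1:]
    return False
```

For example, `blocks_framing({"X-Frame-Options": "DENY"})` reports that the page refuses to be framed, while a response with neither header does not.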


But this is a tool for development. So if you're loading up a site for development, don't you work on that site and can control the Same-Origin policy?


It's possible to be testing a live version of your own site when you can't easily change the Same-Origin policy for just yourself.


Not sure if people still do the "upload PHP file automatically to production server via FTP on save" way of development. Most web developers I know run servers locally for development, where you can easily set your own headers and/or meta tags.


There are a few websites that will detect unusual behavior from browsers living inside of browsers, although it does seem to work for most websites.


Selenium to the rescue!

