Hacker News

I'm surprised (and I could be wrong) that no one has made a Chrome extension that just controls a page and exposes the output to localhost for consumption as an API. It would be similar to using Chrome WebDriver, but without the setup.
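To make the idea concrete, here is a rough sketch of just the localhost half, using only the Python standard library. It assumes a hypothetical extension that POSTs whatever it extracted from the page as a JSON object to `http://127.0.0.1:8765`; the port, the `PageStore` name, and the snapshot shape are all made up for illustration, and the extension itself is not shown.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PageStore:
    """Thread-safe holder for the latest snapshot pushed by the (hypothetical) extension."""

    def __init__(self):
        self._lock = threading.Lock()
        self._snapshot = {}

    def update(self, raw_body: bytes) -> dict:
        # Invalid UTF-8 and invalid JSON both raise subclasses of ValueError.
        snapshot = json.loads(raw_body.decode("utf-8"))
        if not isinstance(snapshot, dict):
            raise ValueError("snapshot must be a JSON object")
        with self._lock:
            self._snapshot = snapshot
        return snapshot

    def as_json(self) -> bytes:
        with self._lock:
            return json.dumps(self._snapshot, sort_keys=True).encode("utf-8")


def make_handler(store: PageStore):
    class Handler(BaseHTTPRequestHandler):
        def _reply(self, status: int, body: bytes) -> None:
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def do_POST(self):
            # The extension pushes a fresh snapshot of the page.
            length = int(self.headers.get("Content-Length", 0))
            try:
                store.update(self.rfile.read(length))
            except ValueError:
                self._reply(400, b'{"error": "invalid snapshot"}')
                return
            self._reply(200, b'{"ok": true}')

        def do_GET(self):
            # Any local client reads the latest snapshot back as JSON.
            self._reply(200, store.as_json())

        def log_message(self, *args):
            pass  # keep the console quiet

    return Handler


def serve(port: int = 8765) -> None:
    """Blocks forever; bound to loopback only so nothing is exposed off-machine."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(PageStore()))
    server.serve_forever()
```

Calling `serve()` and then hitting `http://127.0.0.1:8765/` with any HTTP client would return the last snapshot. The hard part (driving the page and deciding what to extract) still lives in the extension, which is exactly the piece this sketch leaves out.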


It's not a browser extension, but controlling the actual browser without using webdriver is already a thing.

https://github.com/autoscrape-labs/pydoll


Isn't that basically what browser-use is?


I kind of agree and kind of don't. You could say HTTP+DOM is the API, so we're already there. But it lacks structure and explicit regularity (in part because it's meant for human consumption, not programming). And if you were to describe the whole protocol (including CSS and JS, since they can change the ordering, and even the content, of what's shown), it's far more complicated than the equivalent distilled representation.
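As a toy illustration of the gap between the markup and a distilled representation, here is a sketch that reduces a page to a flat list of headings and links using Python's standard `html.parser`. The `Distiller` class and the record shape are my own assumptions, not any standard, and it only sees static HTML, so anything CSS or JS would reorder or inject is invisible to it, which is the complication described above.

```python
from html.parser import HTMLParser


class Distiller(HTMLParser):
    """Collects headings and links, ignoring layout and styling.

    Simplification: a link nested inside a heading replaces the heading record.
    """

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.records = []
        self._open = None  # record currently being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._open = {"kind": "heading", "level": int(tag[1]), "text": ""}
        elif tag == "a":
            self._open = {"kind": "link", "href": dict(attrs).get("href"), "text": ""}

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None and (tag in self.HEADINGS or tag == "a"):
            # Collapse runs of whitespace left over from the markup.
            self._open["text"] = " ".join(self._open["text"].split())
            self.records.append(self._open)
            self._open = None


def distill(html: str) -> list:
    parser = Distiller()
    parser.feed(html)
    parser.close()
    return parser.records


page = (
    '<div class="hero"><h1 style="color:red">  Pricing </h1>'
    '<p>Blah</p><a href="/buy">Buy <b>now</b></a></div>'
)
for record in distill(page):
    print(record)
```

The wrapper `div`, the inline style, and the paragraph all disappear, leaving one heading record and one link record. That is the kind of regular structure a program wants, and the kind the raw DOM does not promise.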

There are efforts going back at least fifteen years to extract ontologies from natural language [0] and HTML structure [1].

[0]: https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&d... (2010) [PDF]

[1]: https://doi.org/10.1016/j.dss.2009.02.011 (2009)



