
The publisher can control how much content is exposed via RSS (typically just the lede), whereas when third-party news aggregators present scraped content, the user never needs to visit the origin site.


The publisher can also control how much is shared with third-party aggregators, either through robots.txt or a paywall.
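For the robots.txt route, the idea is to disallow a specific aggregator's crawler while leaving everything else open. A minimal sketch (the user-agent name here is illustrative, not a real crawler):

```
# Hypothetical robots.txt: block one aggregator's crawler site-wide,
# allow all other crawlers. "ExampleAggregatorBot" is a made-up name.
User-agent: ExampleAggregatorBot
Disallow: /

User-agent: *
Disallow:
```

Note that robots.txt is purely advisory; it only works against crawlers that choose to honor it.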

Which has been the case since search engines became a thing.


That isn't the same at all. A publisher cannot use robots.txt, much less a paywall, to mark which part of the text may be shared in syndication.


A paywall can. The page displays the snippet the publication is allowing to be shared, while the paywall hides the rest. I believe this is what a few of the bigger US newspapers are doing right now.
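The mechanism described above can be sketched server-side: the snippet the publisher allows to circulate is always returned, while the body is served only to subscribers. This is a hypothetical illustration, not any particular newspaper's implementation; the field names and function are invented:

```python
# Hypothetical hard-paywall rendering: the shareable snippet (lede) is
# always public, the article body is returned only to subscribers.

def render_article(article, is_subscriber):
    """Return the HTML fragments a given reader is entitled to see."""
    parts = [article["lede"]]          # the snippet the publisher allows to be shared
    if is_subscriber:
        parts.append(article["body"])  # full text, behind the paywall
    else:
        parts.append("<div class='paywall'>Subscribe to keep reading.</div>")
    return "\n".join(parts)

article = {
    "lede": "<p>Opening paragraph of the story.</p>",
    "body": "<p>The rest of the reporting.</p>",
}
print(render_article(article, is_subscriber=False))
```

Because the body never leaves the server for non-subscribers, a scraper sees exactly what an anonymous reader sees: the snippet plus the paywall notice.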


OK, but that would require regular readers to have credentials for the paywall. I understood the discussion to be about scraping publicly accessible sites.



