Tesla said it didn’t have key data in a fatal crash. Then a hacker found it.
Source: www.washingtonpost.com
Posted by mesa@piefed.social to Technology@lemmy.world · English · 4 days ago · 26 comments
Cross-posted to: technology@beehaw.org

MonkderVierte@lemmy.zip · 3 days ago
How does archive get the unpaywalled version? I don’t think they pay the subscription for every single tabloid out there? Asking for a friend.

stoly@lemmy.world · 3 days ago
The paywall is JavaScript but the content is still in plaintext below. The crawlers don’t read the JavaScript.
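
To illustrate the claim above, here is a minimal sketch in Python. It assumes the article text really is present in the served HTML, which the next reply suggests is not always the case. The `SAMPLE_HTML` string, `TextExtractor`, and `visible_text` are made up for the example and do not come from any real site or tool. The point is only that a client which parses markup without running scripts still sees whatever text is in the markup.

```python
from html.parser import HTMLParser

# Hypothetical page: the article text is in the markup, and a script
# (which this parser never executes) would hide it in a browser.
SAMPLE_HTML = """
<html><body>
<article><p>First paragraph.</p><p>Second paragraph.</p></article>
<script>document.querySelector('article').style.display = 'none';</script>
</body></html>
"""


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of script/style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html):
    """Return the text a non-JavaScript client would find in the markup."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(visible_text(SAMPLE_HTML))
```

If a site only sends the first paragraph in the initial HTML, the same parser would return only that paragraph, which matches what the reply below reports.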

MonkderVierte@lemmy.zip · 3 days ago
Disabling 3rd-party js has no paywall, but only the first paragraph too. Crawlers get full access?

AnarchistArtificer@slrpnk.net · 3 days ago
I think they use the same thing that web crawlers use. If Google’s crawler couldn’t access the content of the page (or could only access a limited amount of content), it would likely rank far lower in search results.

MonkderVierte@lemmy.zip · 3 days ago
Btw, how come there is no search engine where you can sort and filter how you want instead of how they want? (except self-hosted I mean) Pornhub has better searchability than, uh, all search sites I know.