malcolmhere's comments

You can get Markdown nowadays too, at least using this Python wrapper:

https://aws-samples.github.io/amazon-textract-textractor/not...

It's very consistent, though pricey.


Many shelters in the US are run by religious organizations and discriminate against gay people, too. This is particularly a problem because LGBTQ kids make up a sizeable chunk of homeless youth (think: kids getting kicked out by parents).


[flagged]


You really can't imagine why a person kicked out by their religious parents for being gay would be apprehensive about entering a homeless shelter operated by that same religious group?


That would be quite the coincidence... and also assumes there is only one shelter with free beds.


People who have never faced consequences for their actions rarely consider the possible consequences of their actions


I've written similar code for investment banks, to extract financial reporting data from PDFs. It's shocking to think how much of the financial world runs on this kind of tin-cans-on-a-piece-of-string solution.


Do these PDFs even get printed?


If you're in Oakland, you can get haircuts for $15 at Lee's Barbershop in Chinatown: https://www.yelp.com/biz/lees-barber-shop-oakland

Probably not the point, but there's generally no wait, little small-talk, and he'll clean up loose hairs with an industrial vacuum


This is a recurring problem in the animal feed industry. Easiest explanation for the phenomenon: the small particles fall through the gaps between the large particles.


Anyone who complains about wind turbines being unsightly should be forced to live next to an oil refinery :-(


This is such garbage. If you click through on this guy's "research", you'll find that he objects to Google's "Go Vote Today" reminder as politically unfair because:

- It increases voter turnout across the board (which favors Democrats marginally)
- Google users tend to be younger (and hence more left-leaning)

Google has many questionable practices, but simply reminding people when it is an election day is not one of them


What if Google used all the data they collected and used machine learning to show the voting reminder only to users who align ideologically with Google?

That’s a big what-if, of course. I’m not even sure if it would be a violation of any privacy law, because that’s not selling your data to 3rd parties.


Even if Google is theoretically capable of showing the voting reminder only to users who would vote for a Democratic candidate, that doesn’t mean that they did that. Dr. Epstein doesn’t present any evidence (even anecdotal) that the reminders were not shown to Republicans.


Exactly! This is the interesting question to ask!

And you should expect this to take a much less straightforward form than outright malicious intent.

Machine learning stuff, to us mere mortals, is a black box that seems to output what we desire. However, we don't need a single black box to completely hide the effect from everyone involved. It can simply be spread out over many processes without anyone noticing the sum of countless, by themselves very sensible, pro-Google decisions made by Google and its employees.

Say on a fuzzy edge some human or automation has to decide if a website gets banned from the index or not. Since they use adsense or google analytics it becomes possible to contact someone running the site or even, long before the offense, present them with a machine generated report suggesting decline of quality.

The google hating webmaster doesn't want all that tracking crap on his site and therefore he might be punished by not having the warning in advance. It doesn't seem harmful, it seems perfectly reasonable, if however you create 1000 such effects the meaningless 0.1% adds up to 100%.

Say we take a user profile that used to frequent sites now banned from the search index. If that profile moves to a new website, does that merit close monitoring? If the AI is a black box you don't even get to ask the question.

It all seems to be fair and logical, but the sum of those 2 would boil down to closely monitoring websites not using analytics. We would praise the AI for noticing it so soon and so efficiently.

And so things further escalate without anyone connecting the dots.

To make it more sinister:

We have a system of pro-consumerism entities that collectively indoctrinates us to enjoy consumerism. We don't know how efficient or effective that system is. But all we need to know is that we can improve it.

How hard is it really to plant a seed, water it and eat the tomatoes? How hard is it to mix flour, yeast and water then shove it in the oven? It sure seems like an infinitely complex task to me, infinite as in: I'll never do it. That apathy is great news for consumerism. It's behavior that deserves to be praised and rewarded by the entities benefiting from it. There is no evil man pulling levers behind a curtain. It's just the way things are.


Companies sponsor campaigns and publicly support politicians all the time. Personally I think campaign contributions should be way more limited, but I'm not sure what is special about Google here.


What if reddit did this? Or Amazon? Or literally any popular web site. Is this reason to break up all of them?


Imagine if a newspaper endorsed a candidate?


Except it would be more subtle. Google would not be telling you who to vote for because they already have an idea who you'll be voting for, based on machine learning results. If they think you would vote for someone disadvantageous to Google, they would not show the voting reminder.


It sounds a bit like this degree of subtlety:

https://i.imgur.com/OkVMmQi.jpg


It's more than a big what-if: it's also not something Epstein claims has happened.


To play devil's advocate: What role does a for-profit advertising company have in encouraging people to vote? And what are their motives in doing so?

Disclaimer: I haven't read the pdf (only responding to your comment) and don't really hold a strong position on this question, other than being sceptical of Google's participation in any political process.


So, I was the person who created the "where do i vote" for Google originally and ran it for years.

My motive was "Where do i vote" is basically the top query on Google on election day, and people would get crappy answers. In fact, it was actually Ginny's idea (she's very civic minded), and she needed an engineer to help, and I was the only engineer in the DC office.

So i said "how hard could it be" (famous last words) and did it. 2 SWEs in geo got dragged in along the way because they thought it was cool (eventually we just staffed a team on our own).

Michael Geary, who is on HN somewhere, did all the JS.

Along the way, i spent my time and energy creating the voting information project (with pew charitable trusts), and open standards for sharing the data necessary to answer this question, after discovering what a proprietary crap hole basic civic data like this is.

So there you go, now you know the motives.

In fact, if you ask some of the early data partners (until i could get critical mass in opening the data), you will discover we were in fact the only ones they pretty much ever had who asked to have all personally identifying info, etc. stripped from data sources before they were given to us.

They found it quite funny, because this kind of data is actually a big business owned by large political operative companies that have tentacles in various states. The notion that someone didn't want to know the people associated with the address records was hilarious to them.

These databases are large lists of who lives where, their political affiliation, voting history, and various political districts they belong to.

We want addresses and districts only.


I would make a distinction between curating voting information and displaying it once a user searches for that kind of information versus running a "Go Vote" banner (google doodle) on the home search page without a user having made any inquiry on that subject. The latter is what is being complained about in point 2 (I've now looked at the pdf).


I wouldn't. Pretty much everywhere tries to get the vote out. Most workplaces send out email reminders, etc. Every single person in the US gets a card in the mail reminding them to vote.

Literally everyone is trying to remind people to vote. This is a good thing.

Anyone complaining about anyone trying to remind people to vote is just ridiculous.


It depends on whether you think everyone should be encouraged to vote or whether you think only people who are 'informed voters' should vote (people who have followed the political debate and are informed on policy positions, etc). The latter don't need to be encouraged; they are self-motivated.

Again, I'm not taking either position myself but those are two different political positions which exist. Google is choosing one and thus taking a political, pro-active stance.


> only people who are 'informed voters' should vote

That is not a valid position to have and thus doesn't need to be considered.

Voting is the best way to figure out how everyone feels on things. As annoying as it is that party lines give free passes to incumbents, that is a problem better solved via other methods, such as term limits.

Even if someone is going to vote opposite of me, I would rather they express that opinion than have the vote not reflect their voice.

The solution to uninformed voters is for people to become informed, not to keep people from going to the polls.


This particular question was answered at the founding of our country, and doesn't need to be answered again. Also - attempts to do so since have mostly been naked racism/sexism[1]

Calling it a "political, pro-active stance" is, well, ridiculous.

I guess you can try to label everything, but i don't think you are going to find a lot of support for your attempt.

[1] I'm speaking about poll taxes, poll tests, etc.


To devil's advocate your devil's advocacy: let's just assume Google is encouraging people to vote because they know that more democrats will see that message, that they want democrats to win, and that's the only reason for them doing so.

What is wrong with that, and how would you propose to fix it via a mechanism that didn't also e.g. outlaw FOX News?


Can FOX News (or any other media organization) do a campaign with the same impact as Google?

If they can, then this is just a random guy blowing things out of proportion and there's nothing useful there. If they can't, and Google can indeed change people's minds at a level unparalleled by any other organization, then you go looking for why and you either take that power away, or make sure it's available to everybody.


To play devil's advocate to your advocacy of the devil's advocate, what if Exxon or a right-leaning org offered discounts to people who "promise to vote Republican". Nothing binding, simply an unenforceable promise (gentleman's or lady's agreement) to vote for a party in exchange for a 20% discount on your fuel.


> what if Exxon or a right-leaning org offered discounts to people who "promise to vote Republican"

Vote selling is election fraud, which is a crime. Anyone who took those discounts would go to jail. Sounds fair to me.

I don't see how "Please vote" is remotely comparable to vote buying. Can you elaborate?


A better example would be what if Google gave out 'I Voted!' awards that, upon proof that you voted, gave you access to special Google features.


OK, I guess. But... they don't. So what's the argument here?


This is a fascinating rabbit hole to go down.

When you consider some of the recent decisions about what constitutes "speech," it raises the important question of whether a private company influencing its customers' voting preferences would actually be protected.

How much different is it for a company to use its speech to influence a politician directly (donations, lobbying, PACs, etc.) vs. stating its political opinions on its own private "property" potentially to influence politics indirectly?

In physical space, is it legal to:

* Have a sign promoting a social position in the window of your private store? ("Say no to drugs!")
* Have a sign promoting a political position? ("Say no to the Iraq war!")
* Have a sign promoting a candidate? ("I like Ike!")
* Tell customers who to vote for? ("Have a nice day, vote for JFK!")
* Tell only customers that "look a certain way" who to vote for? ("As someone in a wheelchair, you should vote for FDR!")

Now convert all this to the online world with banner ads, user targeting, and personalization. Isn't it just free speech at scale?


it's the scale that may tip the balance from voicing your opinion to mass-manipulation. i am not saying that online speech is mass-manipulation, but it has the potential to be, whereas the other has not. and because of this potential they need to be considered separately.

fwiw, an online "go vote" is not manipulation if it's not targeted even if the audience is an uneven demographic, but "i like ike" is. and while google search may have a younger demographic, they certainly don't intentionally limit that, since they want everyone to use google search.


The same role they have in offering free email, or a free calendar app. People wanted a vote-reminder application, so Google made one to increase engagement on its platform.

If for-profit companies are Constitutionally allowed to make political donations, why would reminding people to vote be any different?


It's like changing a company's logo to a rainbow flag during pride month. It has basically zero cost and gains a minor increase in public perception as being a company that stands with "the people." (Telling people to vote has always been pretty popular, and I remember my friends sharing posts of their favourite celebrities telling people to vote.)


I don't think there is anything problematic in simply encouraging people to vote, even if you tend to reach certain groups better. A higher voter turnout should be something to strive for; it's a sign of a healthier democracy. The current approach to discourage more voters of the opposite party to not go voting then your own get discouraged seems more than just a bit problematic.

Put differently, what harm is done by getting people to go vote when you are not influencing them to vote in a certain fashion? I would argue that only the opposite is problematic: discouraging people from voting.

Google's motivation shouldn't matter as long as they are not influencing who you are going to vote for.

But that's another topic on its own. I find it hard to argue why Google should be singled out in a country where even news organizations endorse candidates. If they are too powerful, dismember them instead of making a lex Google.


> to discourage more voters of the opposite party to not go voting then your own

"Vote rigging" would be the term.


It's a good business practice because it's good branding.

Companies intentionally try to associate their brand with other things seen as positive -- pride month, the Olympics, the environment, promoting diversity, etc.

Google does this with their home-page doodles. Encouraging people to vote is another feel-good brand association.

There's nothing nefarious here like Google promoting a specific political agenda, that's overthinking it. It simply improves their brand image which makes people more likely to trust and use and promote their products. It's the profit motive at work... which, to answer your question, is why a for-profit advertising company does it.


It's a marketing opportunity to encourage people to use their services to find polling stations and other related info.


I agree. It's a Pandora's box I don't want to open.


We also allowed anyone to embed our voting location widget (it was open source too), which he missed


I worked on a product like this for a long while, so I can appreciate how hard the problem is you are trying to solve. Some things I realized along the way:

* Price capture needs to happen in a headless browser (e.g. PhantomJS), rather than just capturing the HTML with a GET. Too many sites use JavaScript to make raw HTML analysis feasible.

* You can get > 50% of the pricing information with fairly simple matching on the class/id value in the HTML tag. But you need a headless browser to make sure the tag is visible. And since most product pages contain multiple prices, you need some heuristic to determine the relevant price. Oh, and watch out for "reduced from" prices too (e.g. "Old Price: $50, New Price: $35").

* It doesn't hurt to be able to override the general heuristic on a domain-by-domain basis, saved me a lot of headaches.

* You need to be honest with yourself about how reliable the price capture algorithm is, and build up a regression database of known good pages, so when you change the algorithm, nothing else breaks. Also, you need to keep ahead of site redesigns!

* Product URLs tend to look messy, but tend not to change very often, if at all. I was worried about retailers e.g. changing product identifiers, but changing URLs hurts their SEO, so they don't do it. You will find "zombie" products, though - things which appear to be still on sale, but aren't linked anywhere on the site. Deciding when a product is sold out is tricky.

* The best user experience presents the items the user is watching as a "shopping basket". (I took design cues from Pinterest.) For a really slick experience, you should pick out the product name and image (Facebook meta-data helps here) and include them in your "pinned" products.

* Cutting-and-pasting URLs is a hassle. Consider writing a browser extension or a bookmarklet - users don't like to have the browsing flow interrupted by having to click across tabs. Having the price capture done inline on the page really impresses people.
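To make the price-selection points above a bit more concrete, here is a rough Python sketch of that kind of heuristic. It is not the code from the product described here: `Candidate`, `pick_price`, the hint words and the regexes are all hypothetical names and choices for illustration, and it assumes a headless browser has already produced each element's text, class/id string and visibility.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical patterns: a dollar amount, and wording that marks a superseded price.
PRICE_RE = re.compile(r"\$\s*(\d[\d,]*(?:\.\d{2})?)")
OLD_PRICE_RE = re.compile(r"\b(old|was|reduced from|list price)\b")


@dataclass
class Candidate:
    text: str      # visible text of the element
    attrs: str     # class/id values joined into one string
    visible: bool  # visibility as reported by the headless browser


Override = Callable[[List[Candidate]], Optional[float]]


def pick_price(
    candidates: List[Candidate],
    domain: str = "",
    overrides: Optional[Dict[str, Override]] = None,
) -> Optional[float]:
    """Return the most plausible current price, or None if nothing matches."""
    # A per-domain override takes precedence over the general heuristic.
    override = (overrides or {}).get(domain)
    if override is not None:
        return override(candidates)

    best = None  # (score, price) of the best candidate seen so far
    for c in candidates:
        if not c.visible:
            continue
        match = PRICE_RE.search(c.text)
        if match is None:
            continue
        # Skip "reduced from" style prices, judged by text and class/id.
        if OLD_PRICE_RE.search(f"{c.text} {c.attrs}".lower()):
            continue
        # Prefer elements whose class/id mentions "price".
        score = 1 if "price" in c.attrs.lower() else 0
        if best is None or score > best[0]:
            best = (score, float(match.group(1).replace(",", "")))
    return best[1] if best else None


sample = [
    Candidate("Old Price: $50", "price-old", True),
    Candidate("New Price: $35", "product-price", True),
    Candidate("$5 shipping", "shipping", True),
    Candidate("$20", "price", False),
]
result = pick_price(sample)
print(result)  # 35.0
```

A set of saved `sample` lists with their known good prices would double as the regression database mentioned above: rerun `pick_price` over all of them whenever the heuristic changes.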

Best of luck with this! I'm yet to see someone solve this problem well, and I eventually moved on to other things after losing a lot of my hair. :-)


Really interesting feedback. Thanks!


- We don't have the breadth of Safelight's material (early days I guess), but the areas we cover, we do a much better job. It was our frustration with this kind of training material that inspired us to make Hacksplaining in the first place:

https://www.youtube.com/watch?v=jkQgVO993W8

Our exercises are interactive, rather than passive, and focus on specific ways to fix code, rather than abstract concepts. Compare with what we have on SQL injection:

https://www.hacksplaining.com/exercises/sql-injection

We started with the question "what are the essential things we would want our development team to know?" and then figured out the most compelling way to teach about them.

- We are talking to a couple of firms that we could partner with to help establish credibility. It's a bit of a Catch-22 selling this kind of training material - people buy your product on the basis of who your existing customers are, to some extent. Finding an established player to work with would really give us a leg up.

- Most companies reluctantly pay for security training, precisely because so much of it is onsite and expensive. Making security training mandatory for developers is a good policy for a CTO of a large company (particularly if they have been hacked recently), but it's generally impractical to send everyone out for a 5-day course. We hope engaging, online material can fill that niche.


Hey, I just worked through the SQL injection course, and I wouldn't use Chase's logo for your fake banking application, or any major company's logo for your insecure sites.

