camkego's comments

The real cherry on top is that the link in the blog post by the Microsoft senior product manager goes to a Kaggle dataset page claiming the dataset is CC0: Public Domain.

https://www.kaggle.com/datasets/shubhammaindola/harry-potter...

More than just using the data, it seems that linking to a copy that claims the dataset is public domain would be problematic copyright-wise.

Also interesting: this blog post has been up since November of 2024. It is very surprising to me that Microsoft hasn't taken it down yet.


Wow, that is a great catch. I looked at the Kaggle page. It has been up for two years. From the hamburger menu (top right), I tried: Report Dataset. When I click the button "Report illegal content", I am redirected to a Google page (huh?): https://support.google.com/legal/troubleshooter/1114905?prod...

When I try to fill out the questionnaire, my request is rejected with this message:

    We understand that you are not legally authorized to file a copyright complaint on behalf of the copyright owner.

    In accordance with applicable copyright laws, we only accept copyright complaints from copyright owners or their authorized representatives. If you have legal questions about copyright law, please consult your own legal counsel.

    We are sorry we cannot assist you further.
Hysterical. What a farce. That data set is pure theft.

Allowing third parties to open copyright complaints on behalf of the copyright owner opens a massive can of worms and is incredibly ripe for abuse.

i'm not sure why you think it's a farce though, not allowing third parties to file complaints

(e.g. see youtube, where this is (used to be?) poorly enforced, it's a mess)


I thought their process was just a checkbox that said "trust me bro i own this" basically

Kaggle is part of Google.

Welp, somebody certainly noticed now.

> it seems linking to a copy that claims the dataset is public domain, would be problematic copyright-wise.

Would it? Sounds to me like the blame lies on the person uploading the dataset under that license, unless there is some reasonable person standard applied here like 'everyone knows Harry Potter, and thus they should know it is obviously not CC0'


> unless there is some reasonable person standard applied here like 'everyone knows Harry Potter, and thus they should know it is obviously not CC0'

Yes, there's an expectation that you put in some minimum amount of effort. The license issue here is not subtle: the Kaggle page says they just downloaded the eBooks and converted them to txt. The author is clearly familiar enough with HP to know that it's not old enough to be public domain, and the Kaggle page makes it pretty clear that they didn't get some kind of special permission.

If you want to get more specific on the legal side then copyright infringement does not require that you _knew_ you were infringing on the copyright, it's still infringement either way and you can be made to pay damages. It's entirely on you to verify the license.


> unless there is some reasonable person standard applied here like 'everyone knows Harry Potter, and thus they should know it is obviously not CC0'

Why wouldn't that apply?


I'm not a copyright expert, and if you told me that Harry Potter was public domain then I'd probably be a bit surprised but wouldn't think it's crazy. The first book came out 30 years ago after all. On further research, the copyright laws are way more aggressive than that (a bit too much if you ask me), but 30 years doesn't seem quick. Patents expire after 20 years.

It would be incredibly naive to assume that a moneymaker like that is PD.

Sherlock Holmes is public domain and there are still shows being announced

New Sherlock Holmes works are copyrighted. Not by Conan Doyle...

I find this fascinating, as I keep observing that there are pretty widespread differences between what people believe copyright does and what the law actually says.

The Berne Convention (author's life + 50 years) is the baseline for the copyright laws in most countries. Many countries have a longer copyright period than Berne.

https://en.wikipedia.org/wiki/List_of_copyright_duration_by_...


I think even people who don't care about how broken the copyright system is understand intuitively that huge commercial properties that are contemporaneous with themselves are protected. They don't need to know any details to know that these properties belong to massive companies and aren't free for the taking.

How many people think they can rip off Disney characters even if they don't know how much Disney lobbied to extend their ownership? People can observe that no one but Disney gets to use them and understand, even if not consciously, that those are Disney's to use.

^ Probably poorly written without time to proof cause time constraint.


It is a media franchise for children, and there are many elements, and trademarks in addition to copyrights. I think most fans understand the bright line that stops them copying an entire book or film work, unless their dad has a Roku at home.

But there are over 34,000 images uploaded to the Fandom.com site alone. There are character bios and generous quotes from films and books. Countless fans are using elements in memes and avatars and social media posts.

Fan-fiction abounds, where the characters and scenarios are endlessly remixed and mashed up with other fandoms.

Quidditch... simulated... is a collegiate sport, but they had to rename it.

Even on the official Wizarding World site, you can make custom downloadable stuff. Not long ago, you could freely download wallpapers. You can get free clips and trailers on any video site.

News outlets had a difficult time explaining the "Public Domain" status of Mickey Mouse and Betty Boop with the new years. Because Mickey Mouse and Betty Boop, the characters, aren't the things which are copyrighted, and the characters' status didn't change with the new year.

I would bet that the typefaces in the official books have their own copyrights, and the book binding processes are patented.


Copyright infringement is a strict liability tort in the US. Willful infringement can result in harsher penalties, but being mistaken about the copyright status is not a valid defense.

I don't know if you're trying to say that, in the realm of tort law, it is only strict liability, or if you are saying that copyright infringement is only a tort. If it's the latter, it's completely untrue, as there are criminal copyright infringement statutes.

The article author and the uploader should _BOTH_ be sentient enough to engage brain and not just ignore it because they feel "it's an abstract concept I'd not get in trouble for when not working in the US or EU".

It’s been a while since I’ve done my Applied Computational Math Sciences degree, but I still appreciate seeing mathematically oriented posts like this on HN!

Most of my recent posts are about math, since it is the only matter that matters.

https://news.ycombinator.com/submitted?id=tzury


Didn't a HN reader/poster get locked out of his Apple accounts because of late payments or some other type of issue with an Apple Card credit card?

If this can happen, I don't plan on ever getting one.


I had thousands of dollars of charges I didn’t make (but billed electronically via Apple as if my card was attached to someone else’s iCloud account) appear, exceeding my limit and locking my entire family out of the shared Apple subscriptions (e.g. no music on their devices but strangely music on mine). GS could remotely lock your Apple account at will it seemed.

Called, charges reversed, 1 month later they were all reinstated, no reason. Called, charges reversed again, 1 month later they were all reinstated and my Apple Card was cancelled by GS. Since I still had it linked as my payment method w Apple this again locked my whole family out of subscriptions. GS gave no explanation for why the account was closed, but it was after they reversed the charges a second time. Balance was $0, account closed, no recourse. Then 1 month later all the reversed charges were reinstated and on the now locked account and I had no recourse but to pay GS’s ransom because it was a closed account that would be reported if I didn’t pay.

I have CCs from a number of banks and it was by far the most ridiculous consumer experience. No wonder GS wants to exit the consumer market because they are terrible at it. Chase is better, but it’s no AmEx. Sad that it’s not AmEx.

Now I’m wondering if I can get a new account under Chase because I’m definitely never calling GS again.


You clearly never experienced Chase's anti-fraud division, which will, without warning, close all of your Chase accounts because you used a credit card at the wrong store at the wrong time and set off whatever fraud score they use. And the CSR will not help you, as the anti-fraud system is treated as god.


Funny you mention that. I did open a joint Chase account back when I was getting married and first order of business was paying for all the wedding expenses. We went to QR, Mexico for a weekend getaway prior to the madness and because I logged into the Chase account from Mexico they not only locked the bank account, but completely closed it with no way to reopen. We had to both go to the Chase branch at which it was opened with passports and even then the manager said Chase would mail me the money in 3-6 weeks, 2-5 weeks after the wedding. I politely told him I was not leaving the building until he handed me the cash in an envelope and after some time he finally caved and did it after making some phone calls.

Most bizarre experience. Never had a bank account with Chase ever again after that incident.


A client of mine pushed her business expenses through a Chase account. She banked with them too.

She traveled to the “wrong” country. A country that is not and was not embargoed or sanctioned. A country that the US is/was on good terms with. She had been there and used her Chase card previously. She didn’t do anything out of the norm.

Chase closed all her accounts with no notice while she was traveling and refused to provide a reason. I told her to sit on my invoices while she scrambled and got things sorted out.

It took months. They refused to send her the balance of her asset accounts until she threatened to sue them and air the calls she recorded.

I have kept a healthy distance from relying on Chase ever since.


This?

https://news.ycombinator.com/item?id=44021792

https://news.ycombinator.com/item?id=26310817

But:

"Allegedly unrelated to the apple card, your appleid just gets locked if you don't fulfill a trade-in."


Apple Store gift card (redeemable for apps music etc purchased in Apple ecosystem ) iirc


If you walk into the head office of Qualcomm (in Sorrento Valley, San Diego, CA) and you see the "Patent Wall" in the entrance covered with almost 1400 patents, it's kind of hard not to wonder just how open Arduino will be.


Wow that's tacky.


Maybe if the sign up process encouraged people to send videos (screen-side and user-side could be useful also), of their sign-up and usage experience, the teams responsible for user experience could make some real progress. I guess the question is, who cares, or who is responsible in the organization?


Now that I have seen groklaw.net/about-us page, I have seen it all.

Here is the new Groklaw mission statement:

Our Mission

Our mission is simple: to guide you toward safe, rewarding, and responsible crypto gambling experiences. We believe in transparency, player protection, and giving you the tools to make informed choices — whether you want massive Bitcoin bonuses, ultra-fast withdrawals, or niche altcoin gaming.


Just my two cents: as an end-user choosing an OS to use on an N150 to do static web hosting, I would sure like to know if those features make a meaningful difference.

But I also understand that looking at that might have been beyond the scope of the article.


Is this on YouTube?



The article says "I think the biggest factor is that any rewrite of an existing codebase is going to yield better results than the original codebase.".

Yeah, sorry, but no, ask some long-term developers about how this often goes.


It depends on the codebase. If the code base deserves to be a case study in how not to do programming, then a rewrite will definitely yield better results.

I once encountered this situation with C# code written by an undergraduate, rewrote it from scratch in C++, and got a better result. In hindsight, the result would have been even better in C, since I spent about 80% of my time fighting with C++ to try to use every language feature possible. I had just graduated from college, and my code, while better, did a number of things wrong too (although far fewer, to my credit). Looking back, I think less is more when it comes to language features.

I actually am currently maintaining that codebase at a health care startup (I left shortly after it was founded and rejoined not that long ago). I am incrementally rewriting it to use a C subset of C++ whenever I need to make a change to it. At some point, I expect to compile it as C and put C++ behind me.


Data structures like maps and vectors from the standard library are still incredibly useful and make a fantastic addition to C if your focus is on POD types, though if real time performance with heap cohesion is a problem then you’re right to go pure C.


Hi author of the article here.

I've been a software developer for nearly 2 decades at this point, contributed to several rewrites and oversaw several rewrites of legacy software.

From my experience I can assure you that rewriting a legacy codebase to modern C++ will yield a better and safer codebase overall.

There are multiple factors that contribute to this, one of which is what I refer to as "lessons learnt": if you have a stable team of developers maintaining a legacy codebase, they will know where the problematic areas are and will be able to avoid re-creating them in a rewrite.

An additional factor to consider is that a lot of legacy C++ codebases cannot be upgraded to use modern language features like smart pointers. The value smart pointers provide in a full rewrite cannot be overstated.
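To make the smart pointer point concrete, here is a minimal sketch, not taken from the article or any real codebase. `Buffer`, `g_live_buffers`, `legacy_size`, `modern_size`, and `demo` are all hypothetical names invented for illustration. It contrasts a raw `new`/`delete` function that leaks on an early return with a `std::unique_ptr` version that cleans up on every path.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter of live Buffer objects, for illustration only.
int g_live_buffers = 0;

// Hypothetical resource type standing in for anything that owns memory.
struct Buffer {
    explicit Buffer(std::size_t n) : data(n, 0) { ++g_live_buffers; }
    ~Buffer() { --g_live_buffers; }
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::vector<unsigned char> data;
};

// Legacy style: the caller must remember to delete on every exit path.
std::size_t legacy_size(std::size_t n, bool fail_early) {
    Buffer* buf = new Buffer(n);
    if (fail_early) {
        return 0;  // bug: buf is never deleted on this path
    }
    std::size_t size = buf->data.size();
    delete buf;
    return size;
}

// Modern style: unique_ptr destroys the Buffer on every exit path.
std::size_t modern_size(std::size_t n, bool fail_early) {
    auto buf = std::make_unique<Buffer>(n);
    if (fail_early) {
        return 0;  // buf is released automatically here
    }
    return buf->data.size();
}

// Small self-check: after both modern calls, no Buffer remains alive.
void demo() {
    modern_size(8, true);
    modern_size(8, false);
    assert(g_live_buffers == 0);
}
```

Calling `legacy_size(8, true)` would leave one `Buffer` alive, which is the class of bug a rewrite with smart pointers is meant to rule out.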

Then there's also a factor that is a bit anecdotal: I find that there are fewer C++ devs in general than there were 15 years ago, but those that stayed / survived are generally better and more experienced, with very few enthusiastic juniors coming in.

I'm sorry you did not enjoy the article, but thank you for giving it your time and reading it. That part I really appreciate.


I enjoyed the article, and as a longtime developer I certainly relate to being heads down on a problem, only to step away for a walk or a breather and realize I can maybe avoid solving the immediate problem altogether.

I also don’t think it’s possible to focus at 100% on a detailed, complex problem and concurrently question whether there is a better path or a way to avoid the current problem. Sometimes you just need to switch modes between focusing on the details in the weeds and popping back up to ask whether this even has to be completed at all.

