I have personally gotten a lot of mileage from just writing the compute-heavy parts of my code in C++ and exposing them to Python with a tool like pybind11 [1] or NumpyEigen [2]. I find tools like Numba and Cython to be more trouble than they're worth.
I prototype in Python or whatever; then, if the project survives to market and has legs, I either buy more hardware or rewrite the expensive parts in C++.
That reduces calendar time, risk, and cost. And I'm likely to make better decisions once the code and the market are better understood, after the prototype has been tested under real-world conditions and the requirements have changed (like they always seem to do).
+1 for pybind11. I wrote Python bindings using pybind11 for two C++-based simulators: MOOSE and Smoldyn. It was surprisingly easy to use given how badly the Python C API and C++ tooling suck. Though you do have to build binary wheels separately for every Python version and platform.
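A minimal sketch of this pattern (the `fastmath` module name and the `HAVE_PYBIND11` guard are my own; `PYBIND11_MODULE` and the `stl.h` converters are real pybind11 features):

```cpp
// The hot loop lives in plain C++ and can be tested natively.
#include <numeric>
#include <vector>

double dot(const std::vector<double>& a, const std::vector<double>& b) {
    // Compute-heavy kernel: a simple dot product as a stand-in.
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// When built with pybind11 available, expose the kernel to Python.
// The guard keeps this file compilable as ordinary C++ as well.
#ifdef HAVE_PYBIND11
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>  // automatic std::vector <-> Python list conversion

PYBIND11_MODULE(fastmath, m) {
    m.def("dot", &dot, "Dot product computed in C++");
}
#endif
```

Once compiled as an extension module (pybind11's docs show the `c++ -O3 -shared -fPIC $(python -m pybind11 --includes)` incantation), the Python side is just `import fastmath; fastmath.dot([1, 2], [3, 4])`.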
I grew up in Stanstead. I have fond memories of story time as a child in the library, borrowing movies and comic books, and playing Age of Empires 2 with my best friend on the two shared computers in the front room.
There’s also a street in the town (aptly named Canusa St.) which is half in the US and half in Canada. Interestingly, the houses on one side have flags reminding you where you are, while there are no flags on the other. Figuring out which side is left as an exercise to the reader ;)
Yes! My mother-in-law grew up in Stanstead, and we’ve spent a lot of time there with her side of the family (and lived and worked in Sherbrooke for a couple years, though we are Australian and are in Australia now).
(Also note that in Quebec province flying a Canadian flag can be a bit of a political statement… it seemed easier to me to be carefully neutral on such things).
I would say likely Ontario, and specifically the Ottawa capital region. Though, since the Convoy, waving a Canadian flag has taken on a Conservative/right-wing/populist connotation that I think is still around today, so it may be shifting in the direction of Alberta and Saskatchewan.
> But even putting aside the fact that claiming someone else's writing as one's own is wrong, the value in survey papers is in how they re-frame the field. A survey paper that just copies directly from the prior paper hasn't contributed anything new to the field that couldn't be obtained from a list of references.
Good survey papers can be important contributions in their own right (e.g. [1]). A good survey should contextualize works within a subject area with respect to each other and identify high level trends/ideas in that subject. These connections are not only useful for learning a topic, but also for positioning novel work or identifying under-researched areas to focus on.
If the authors felt that one of the papers they plagiarized concisely expressed what they wanted to say, they could simply quote and cite that work. Otherwise, it could be construed that the authors are claiming to be the ones drawing the conclusions they wrote. Moreover, from the article, the survey in question seems to be pretty egregiously plagiarizing, which deserves to be called out/shamed.
> But even putting aside the fact that claiming someone else's writing as one's own is wrong, the value in survey papers is in how they re-frame the field. A survey paper that just copies directly from the prior paper hasn't contributed anything new to the field that couldn't be obtained from a list of references.
Whether or not a survey paper is "good" is irrelevant here. Yes, a survey paper that just lists other papers may be a bad survey paper, but it does nothing wrong as long as it cites the original papers, which this one does. A bad survey paper may not get published in a journal; that's what peer review is for. But there is nothing wrong with publishing it openly on the web.
And there is still value in aggregating other papers, even if it's just a list with descriptions. That's why these "awesome-XX" GitHub repos are so popular. Time to hunt them down?
If you look at the plagiarized language in the article, it seems as if the BM paper authors are claiming contributions. Credit is a major currency in research, and it's important to give it where it is due. If someone did this with one of my papers, I'd be quite upset.
For example (emphasis mine):
> The risks of data memorization, for example, the ability to extract sensitive data such as valid phone numbers and IRC usernames, are highlighted by Carlini et al. [41]. While their paper identifies 604 samples that GPT-2 emitted from its training set, we show that over 1% of the data most models emit is memorized training data. In computer vision, memorization of training data has been studied from various angles for both discriminative and generative models. Deduplicating training data does not hurt perplexity: models trained on deduplicated datasets have no worse perplexity compared to baseline models trained on the original datasets. In some cases, deduplication reduces perplexity by up to 10%. Further, because recent LMs are typically limited to training for just a few epochs
Yes, I agree that's bad but looks like sloppy copy and pasting as opposed to intentional plagiarism to claim contributions. Would it have been okay if they said "they" instead of "we"?
I want to echo the other comments here that low-expense-ratio (<0.25%) funds from Vanguard, Fidelity, Schwab, etc. are all great, stable investments.
My time horizon is longer than 5 years, and I buy broad market index funds split up as follows: 55% US large cap (e.g. VIIIX, VTSAX, SWTSX), 15% US mid cap (e.g. VMCPX), 10% US small cap (e.g. VSCPX), and 20% international (e.g. VTSNX, SWISX, VXUS).
I also highly recommend dollar cost averaging, i.e. buying a fixed dollar amount of your portfolio at fixed intervals. I have my bank do this automatically every 2 weeks. The benefits of dollar cost averaging are that (1) it takes the emotion out of investing, and (2) over a long time window, more of your assets will be purchased at low prices than at high prices (because you're buying a fixed dollar amount every N days, you buy fewer shares when prices are high and more when prices are low).
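The arithmetic behind point (2) is easy to check directly; the prices and budget below are made up for illustration:

```python
# Fixed-dollar purchases at alternating hypothetical prices: $100 each period.
prices = [10.0, 20.0, 10.0, 20.0]   # illustrative share prices per period
budget = 100.0                       # dollars invested each period

shares = sum(budget / p for p in prices)   # fixed dollars buy more when cheap
invested = budget * len(prices)

avg_cost = invested / shares               # your average cost per share
mean_price = sum(prices) / len(prices)     # simple average of market prices

# avg_cost is the harmonic mean of the prices, which never exceeds the
# arithmetic mean: fixed-dollar buying skews purchases toward low prices.
print(f"avg cost {avg_cost:.2f} vs mean price {mean_price:.2f}")
```

With these numbers you pay about $13.33 per share on average against a mean market price of $15.00, because twice as many shares were bought at $10 as at $20.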
I was a software engineer briefly before starting grad school. During that time, I found I didn't have the time to sit down and learn about topics that interested me. I also wanted to be in research-y roles where I could build things that were more experimental and less well understood.
During my PhD, I got to spend time learning, and attending talks/seminars/conferences. Gaining deeper background knowledge in my field as well as learning how to quickly evaluate and explore new ideas gave me the tools to have the type of job I wanted. I'm a research scientist at an industrial lab now and quite enjoy it.
That being said, I agree with the grandparent post that doing a PhD can be a grueling experience. I had to carry the bulk of the work for many of the papers I submitted. If I took a day off, nobody would pick up the slack. Tight deadlines meant the only way to succeed was putting in long hours. My advisors were also spread very thin, so it was difficult to get much time with them. There were times when I felt very alone. This was a really stark contrast to how collaborative engineering in industry was, and I don't think I ever fully adjusted to it. My current job feels like a happy middle ground: I publish papers alongside other people and we split the work.
It’s not the site you’re looking for, but I found https://poolside.fm recently and it’s become one of those quirky corners of the internet that I have come to enjoy. I definitely miss the days of discovering weird specialty sites, and poolside gave me a bit of that new site discovery rush (also the music is great).
No mention of conda for environment management? In my experience, conda is by far the best tool for this, especially when using packages that have non-Python dependencies.
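For instance, a conda `environment.yml` can pin non-Python packages right alongside Python ones; the package choices below are just illustrative:

```yaml
# environment.yml -- illustrative; swap in your own packages
name: myproject
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - ffmpeg            # a non-Python dependency conda installs and manages
  - pip
  - pip:
      - some-pure-python-package   # hypothetical; installed via the pip fallback
```

Then `conda env create -f environment.yml` builds the whole environment, compiled libraries included, without touching the system package manager.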
Hi, it's the author here. You're right, conda is fantastic, and in fact you can use pyenv/pyenv-virtualenv with conda. [1] The article was already getting long, and the goal was a survey from 42,000 feet built around the example that was created.
I also own https://stonks.money and am looking for good ideas for what to do with it