There seems to be a missing question in the FAQ: Why is Fortran worth saving? What are the advantages of writing code in Fortran? Why not just allow the ports to C++ to happen?
There might well be good answers to these; but it seems that if you want to make Fortran cool again, you have to provide a vision for what makes it special.
> What are the advantages of writing code in Fortran? Why not just allow the ports to C++ to happen?
A physicist or engineer already familiar with Matlab or Python can pretty much pick up Fortran in a few days to a week, become productive, and the compiler will generally generate fast code. Now try teaching C++ to a domain specialist who doesn't have much spare time to study programming language syntax and quirks, because their main motivation is working in their domain.
If the scientific code were in C++, some scientists would probably never become fluent enough, and the workflow would probably drift towards a division of labor in which scientists prototype their ideas in Matlab or Python and separate technical programming staff implement them in the production code. The turnaround time from idea to results would be days instead of hours.
I see two reasons why C++ is not a suitable language for domain specialists in numerical computing.
1. Programming in C++ requires a certain discipline to understand what is going on, for example to avoid accidentally making useless copies of huge arrays. In Fortran you do not need to learn such a discipline, because everything is explicit.
2. Fortran natively supports multidimensional numeric arrays. In C++, not really; you need to use "libraries" whose evolution is independent of that of the language and which may become unsupported. A few years ago, everybody said to use "blitz"; somewhat later it was "eigen"; now it is probably something new. The C++ numerical code that you write today will probably be obsolete in a few decades. Yet the Fortran code will be alright. It is "eternal".
Eigen is still the best-in-class C++ library for linear algebra as far as I know. It's been around for a while, is still actively developed, and I don't think there's a risk of it becoming obsolete anytime soon.
That being said, the learning curve for someone not familiar with C++ to write code using Eigen is much higher than writing the same code in Fortran. If you're not already a C++ programmer, you just want to do some math, and Numpy/Matlab are too slow or constrained, then Fortran is a solid choice.
> That being said, the learning curve for someone not familiar with C++ to write code using Eigen is much higher than writing the same code in Fortran.
Moreover, the algorithms implemented inside Eigen are hidden behind dozens of onion-like layers. Once you peel all these layers you find code such as this:
I can read and write C++ and I teach numerical linear algebra for a living, but mother of god, this SVD implementation is horrendously unreadable to me. Most of the lines of code are stupid bureaucracy necessary just to add and multiply a few vectors. The same algorithm in linear algebra textbooks requires about fifteen lines of pseudo-code, and similarly for a fortran implementation.
Yeah, libraries that make extensive use of templates are basically dark magic only comprehensible to C++ experts (see also: almost everything in Boost).
The positive side of this is that it enables nice interfaces for the library, and makes a lot of the abstractions basically "free" (since the cost is paid at compile time rather than via pointer indirection at runtime). But it definitely makes those libraries inappropriate for students who want to look at an understandable implementation of the algorithm.
The main advantage of this approach is that you can use any C++ compiler and the library (in this case Eigen) will work.
The disadvantage of this approach is that the code is incomprehensible to average scientists; you must spend a lot of effort to become a C++ expert.
The advantage of Fortran is that the code is simple and comprehensible to the average scientist, and yet fast, because the effort goes into the compiler itself. I would also argue it is easier to implement an optimization pass in the compiler than to implement optimizations at the template level in C++.
The disadvantage is that you need a good Fortran compiler. If your compiler can't run your code on, say, a GPU, then suddenly there is no good path forward. In C++, there is always a way forward via template metaprogramming: perhaps ugly, but at least it gets the job done.
> ...makes a lot of the abstractions basically "free"
Maybe it is just me, but I do not really see the need for "abstractions" in a linear algebra library. Numbers are already an abstraction, you do not want to hide them! Ok, maybe you want to choose between "float" and "double", but this does not merit all that overcomplication.
The abstractions inside Eigen are there to enable powerful compile time optimisations, such as compiling the addition of
three vectors `Eigen::VectorXf x = a + b + c;` in a single loop instead of naively allocating a temporary to evaluate `Eigen::VectorXf tmp = b + c;` then evaluating `Eigen::VectorXf x = a + tmp;`.
This logic is omnipresent in Eigen and is crucial to its performance. It is also used to have specialized versions of some decompositions when the matrix's size is known at compile time.
Because you don't have to learn yet another set of functions in a language that makes it stupidly easy to blow off your own foot, and yet you still have all the advantages of those systems in a language that is actually _designed_ for calculating formulas (hence the name FORTRAN, i.e. FORmula TRANslation).
Again, should your users understand the code you present, or should you use the language you feel more comfortable with?
Reading FORTRAN formulas is a pain (just reading, for example, a parallel prime-factorization function written in Fortran takes 10-15 minutes to understand).
Hence, I see a large advantage for OpenMP & C++ here.
An important advantage that doesn't get mentioned a lot is that modern Fortran (imo f90+) is much simpler to write and reason about than C++, precisely because it limits what you can actually do in the language. That opens up the possibility for engineers and scientists to contribute to writing HPC code. If everything were in C++, the ability to contribute would be limited to a smaller number of capable programmers, and that's where Fortran still has a reasonable niche.
I'm going to take this in good faith and assume that you are thinking of old school FORTRAN77. Modern Fortran (90 and onwards) is a very different language in many ways whilst maintaining high backwards compatibility. You should check out modern versions of the language and code examples.
Fortran is a language more suited to numerical computing than C++. Modern Fortran is simpler, more maintainable, and more easily optimised than C++. Yes, one could pick a subset of C/C++ and write fast numerical code, but it would require more work and more discipline than using modern Fortran.
Modern Fortran is really nice when it comes to matrix arithmetic. I really wish it were more widespread and there were more libraries for loading data into the language. Having written a lot of CV/robotics-related code in Python and C++, I find that there are often coding errors which you may not catch until runtime. For instance, a matrix dimension mismatch will not be caught in OpenCV or in Python until you actually execute the code. On the other hand, these things can be hard-coded into Fortran 2003 and the compiler will perform the checks before you shoot yourself in the foot.
A few years ago, Scott Meyers gave a keynote at the D Language Conference. It was titled "The Last Thing D Needs". The answer given at the end was, "Someone like me". It's a bad sign if your language is so complex that it needs someone like him to write books on how to not massively screw things up.
Fortran does not need Scott Meyers. And that's a good thing, because those who program in Fortran are typically not full-time software developers; they are individuals with other things to do who would never invest the time to write "proper" C++.
One of the LFortran authors here. You have a very good point. We are going to publish a new blog post answering this exact question.
In the meantime, others in this thread have already pretty much hit all the main points about why Fortran as a language is still a very good choice for numerical computing. It's the tooling that must improve.
Because modern Fortran is an extremely fast, vectorized and highly optimized numerical computing language. Like MATLAB, but much faster.
C++ is a messy, unsafe, and not particularly elegant systems programming language, which can be used for numerical computing, but it is not a good idea.
This looks very interesting! I hope that maybe this project can include a “refactoring parser”, that is, for each token in the parse tree, store where in the source code file that token originated from. Whitespace and comments would also have to be included, so that the round trip (source file -> AST -> source file) is possible without loss. I’ve long wanted to write some refactoring tools for modern Fortran, but not having such a parser is a big blocker; there’s only so much you can do with regexes.
One of the LFortran authors here. Yes, we already have an open issue for exactly this: https://gitlab.com/lfortran/lfortran/issues/42. As you can read there, it's actually not as easy as I first thought. But I think it's very much worth pursuing.
For example, Python's AST is not round-trippable, and that forces people to write alternative parsers such as Parso (https://github.com/davidhalter/parso).
ANTLR should allow parsing all the whitespace. But the issue is how to represent it in the AST; see the issue for more details. One could have an AST and a parser just for this round-trippable application, but that defeats the purpose. The goal should be to have this be part of the compiler, so that one can trust that it parses things correctly.
Right now I am concentrating my efforts on finishing gfortran compatibility (see the roadmap at https://lfortran.org) and on getting first users. Once we get further along, we will tackle this problem. I am hopeful there might be a way to do this.
Does Antlr support an annotated parse tree that would allow for round-tripping? I didn't see anything in the documentation, at a quick glance. Even then, formulating a grammar (especially one that also catches whitespace and comments) seems like a pretty daunting task.
Catching whitespace and comments is no problem in Antlr[1]. I'm not exactly sure what you mean by round-tripping. In terms of how daunting it is, it's much easier than trying to use Flex/Bison or lex/yacc, that's for sure.
1: At the bottom of the grammar, there are rules to intentionally skip over whitespace and comments, but it could have just as easily captured them and parsed them further. https://github.com/antlr/grammars-v4/blob/master/c/C.g4
What I mean by "round-tripping" is that I would have to be able to read a .f90 file into an AST structure, then write out the tree to a new file, and end up with the exact same file content as the original. This would be a prerequisite for a refactoring tool, which has to preserve comments and code formatting.
Using Fortran in Jupyter can be extremely handy for learning certain algorithms by interactive exploration. Despite all the nice and modern alternatives, Fortran is so popular in the HPC world that generations of students will still have to learn it.
I am one of the LFortran authors. We just open-sourced it recently. Do you have any recommendations on whether we should create a mailing list, or rather use something like Discourse? Or both?
Right now if you have any questions, you can open an issue at our GitLab repository:
Gfortran is very nice, but I recommend a recent version (at least ver >= 7) rather than old versions (e.g. ver 4.4, which may still be the default on CentOS 6 etc.). For performance, I guess Intel Fortran is probably better (but it also depends on the calculations).
Good to know, thanks! I've used software compiled with the Intel compiler, but have only written some toy programs in GFortran and it seems fast enough. I like the revival idea behind LFortran though. Reading scientific C++ is much more difficult than Fortran.
Can you please point me to some documentation or examples where Intel Fortran is used to offload to, say, NVIDIA GPUs? I don't think it supports it yet.
PGI and IBM support CUDA Fortran; other compilers don't. PGI/Flang has experimental support for offloading "do concurrent". Some compilers also support the OpenMP/OpenACC pragmas and GPU offloading. One can also call CUDA C from any Fortran compiler via iso_c_binding. None of this is ideal.
I think it's still a research problem how to best integrate GPU support into the language itself. CUDA Fortran is nice; one thing we want to do down the road is implement CUDA Fortran and allow LFortran to transform code it understands into a Fortran dialect that all Fortran compilers can compile, so that one can use any Fortran compiler and still use CUDA Fortran.
The other path is to parallelize "do concurrent", but part of this also means extending Fortran to allow specifying the array layout, as Kokkos (https://github.com/kokkos/kokkos) allows.
That is one of the biggest problems with Fortran: people want to be assured that their code will run on modern hardware. When using, e.g., C++ and Kokkos, there is some assurance that the code will run.
AFAIK Julia uses a patched LLVM for its PTX output, which I think all languages should do to work towards a common optimization platform. Also, CuArrays uses multiple higher-level NVIDIA libraries, like cuBLAS and cuDNN.
The goals look similar to me, so it's worth taking a look at them.
Yes, I was planning to start with what Numba (http://numba.pydata.org/) is doing; they also use the LLVM PTX backend.
There is a really promising new project by Chris Lattner (the original author of LLVM) called MLIR: https://github.com/tensorflow/mlir. That might be the best intermediate representation that all the compilers (Julia, Fortran, ...) could target.
No worries. Yes, ultimately down the road in a couple of years, if there is some agreed-upon way of extending Fortran to handle GPUs well, the best way is to get it into the Fortran standard itself; that way all compilers will eventually support it. I recently became a member of the Fortran Standards Committee, so when the time is right, I will try to help on this front. Right now it's too early; first we need to implement the new capabilities in some compilers and get some experience and agreement among users. My own first goal is to get LFortran polished enough to get first users.