There seems to be a missing question in the FAQ: Why is Fortran worth saving? What are the advantages of writing code in Fortran? Why not just allow the ports to C++ to happen?
There might well be good answers to these; but it seems that if you want to make Fortran cool again, you have to provide a vision for what makes it special.
> What are the advantages of writing code in Fortran? Why not just allow the ports to C++ to happen?
A physicist or engineer already familiar with Matlab or Python can pretty much pick up Fortran in a few days to a week, become productive, and the compiler will generally generate fast code. Now try teaching C++ to a domain specialist who doesn't have much spare time to study programming language syntax and quirks, because their main motivation is working in their domain.
If the scientific code were in C++, some scientists would probably never become fluent enough, and the workflow would probably drift towards a division of labor in which scientists prototype their ideas in Matlab or Python and separate technical programming staff implement them in the production code. The turnaround time from idea to results would be days instead of hours.
I see two reasons why C++ is not a suitable language for domain specialists in numerical computing.
1. Programming in C++ requires a certain discipline to understand what is going on, for example to avoid accidentally making useless copies of huge arrays. In Fortran you do not need to learn such a discipline, because everything is explicit.
2. Fortran natively supports multidimensional numeric arrays. In C++, not really; you need to use "libraries" whose evolution is independent of that of the language and which may become unsupported. A few years ago, everybody said to use "blitz"; somewhat later it was "eigen"; now it is probably something new. The C++ numerical code that you write today will probably be obsolete in a few decades. Yet the Fortran code will be alright. It is "eternal".
Eigen is still the best-in-class C++ library for linear algebra as far as I know. It's been around for a while, is still actively developed, and I don't think there's a risk of it becoming obsolete anytime soon.
That being said, the learning curve for someone not familiar with C++ to write code using Eigen is much higher than writing the same code in Fortran. If you're not already a C++ programmer, you just want to do some math, and Numpy/Matlab are too slow or constrained, then Fortran is a solid choice.
> That being said, the learning curve for someone not familiar with C++ to write code using Eigen is much higher than writing the same code in Fortran.
Moreover, the algorithms implemented inside Eigen are hidden behind dozens of onion-like layers. Once you peel all these layers you find code such as this:
I can read and write C++ and I teach numerical linear algebra for a living, but mother of god, this SVD implementation is horrendously unreadable to me. Most of the lines of code are stupid bureaucracy necessary just to add and multiply a few vectors. The same algorithm in linear algebra textbooks requires about fifteen lines of pseudo-code, and similarly for a fortran implementation.
Yeah, libraries that make extensive use of templates are basically dark magic only comprehensible to C++ experts (see also: almost everything in Boost).
The positive side of this is that it enables nice interfaces for the library, and makes a lot of the abstractions basically "free" (since the cost is paid at compile time rather than via pointer indirection at runtime). But it definitely makes those libraries inappropriate for students who want to look at an understandable implementation of the algorithm.
The main advantage of this approach is that you can use any C++ compiler and the library (in this case Eigen) will work.
The disadvantage of this approach is that the code is incomprehensible to average scientists; you must spend a lot of effort to become a C++ expert.
The advantage of Fortran is that the code is simple and comprehensible to the average scientist, and yet fast, because the effort goes into the compiler itself. I would also argue it is easier to implement an optimization pass in the compiler than to implement optimizations at the template level in C++.
The disadvantage is that you need a good Fortran compiler. If your compiler can't run your code on, say, a GPU, then suddenly there is no good path forward. In C++, there is always a way forward via template metaprogramming: perhaps ugly, but at least it gets the job done.
> ...makes a lot of the abstractions basically "free"
Maybe it is just me, but I do not really see the need for "abstractions" in a linear algebra library. Numbers are already an abstraction, you do not want to hide them! Ok, maybe you want to choose between "float" and "double", but this does not merit all that overcomplication.
The abstractions inside Eigen are there to enable powerful compile time optimisations, such as compiling the addition of
three vectors `Eigen::VectorXf x = a + b + c;` in a single loop instead of naively allocating a temporary to evaluate `Eigen::VectorXf tmp = b + c;` then evaluating `Eigen::VectorXf x = a + tmp;`.
This logic is omnipresent in Eigen and is crucial to its performance. It is also used to have specialized versions of some decompositions when the matrix's size is known at compile time.
Because you don't have to learn yet another set of functions in a language that makes it stupidly easy to blow off your own foot, and yet you still have all the advantages of those systems in a language that is actually _designed_ for calculating formulas (hence the name FORTRAN, i.e. FORmula TRANslation).
Again, should your users understand the code you present, or should you use the language you feel more comfortable with?
Reading FORTRAN formulas is a pain (just reading, for example, a parallel prime-factorization function written in Fortran takes 10-15 minutes to understand).
Hence, I see a large advantage for OpenMP & C++ here.
An important advantage that doesn't get mentioned a lot is that modern Fortran (imo f90+) is much simpler to write and reason about than C++, precisely because it limits what you can actually do in the language. That opens up the possibility for engineers and scientists to contribute to writing HPC code. If everything were in C++, the ability to contribute would be limited to a smaller number of capable programmers, and that's where Fortran still has a reasonable niche.
I'm going to take this in good faith and assume that you are thinking of old school FORTRAN77. Modern Fortran (90 and onwards) is a very different language in many ways whilst maintaining high backwards compatibility. You should check out modern versions of the language and code examples.
Fortran is a language more suited to numerical computing than C++. Modern Fortran is simpler, more maintainable, and more easily optimised than C++. Yes, one could pick a subset of C/C++ and write fast numerical code, but it would require more work and more discipline than using modern Fortran.
Modern Fortran is really nice when it comes to matrix arithmetic. I really wish it were more widespread and there were more libraries for loading data into the language. Having written a lot of CV/robotics-related code in Python and C++, I find that there are often coding errors which you may not catch until runtime. For instance, a matrix dimension mismatch will not be caught in OpenCV or in Python until you actually execute the code. On the other hand, these things can be hard-coded into Fortran 2003 and the compiler will perform the checks before you shoot yourself in the foot.
A few years ago, Scott Meyers gave a keynote at the D Language Conference. It was titled "The Last Thing D Needs". The answer given at the end was, "Someone like me". It's a bad sign if your language is so complex that it needs someone like him to write books on how to not massively screw things up.
Fortran does not need Scott Meyers. And that's a good thing, because those who program in Fortran are typically not full-time software developers; they are individuals with other things to do who would never invest the time to write "proper" C++.
One of the LFortran authors here. You have a very good point. We are going to publish a new blog post answering this exact question.
In the meantime, others in this thread have already pretty much hit all the main points about why Fortran as a language is still a very good choice for numerical computing. It's the tooling that must improve.
Because modern Fortran is an extremely fast, vectorized and highly optimized numerical computing language. Like MATLAB, but much faster.
C++ is a messy, unsafe, and not particularly elegant systems programming language, which can be used for numerical computing, but it is not a good idea.
This looks very interesting! I hope that maybe this project can include a “refactoring parser”, that is, for each token in the parse tree, store where in the source code file that token originated from. Whitespace and comments would also have to be included, so that the round trip (source file -> AST -> source file) is possible without loss. I’ve long wanted to write some refactoring tools for modern Fortran, but not having such a parser is a big blocker; there’s only so much you can do with regexes.
One of the LFortran authors here. Yes, we already have an open issue for exactly this: https://gitlab.com/lfortran/lfortran/issues/42. As you can read there, it's actually not as easy as I first thought. But I think it's very much worth pursuing.
For example, Python's AST is not round-trippable, and that forces people to write alternative parsers such as Parso (https://github.com/davidhalter/parso).
ANTLR should allow parsing all the whitespace. But the issue is how to represent it in the AST; see the issue for more details. One could have an AST and a parser just for this round-trippable application, but that defeats the purpose. The goal should be to have this be part of the compiler, so that one can trust that it parses things correctly.
Right now I am concentrating my efforts on finishing gfortran compatibility (see the roadmap at https://lfortran.org) and on getting first users. Once we get further along, we will tackle this problem. I am hopeful there might be a way to do this.
Does Antlr support an annotated parse tree that would allow for round-tripping? I didn't see anything in the documentation, at a quick glance. Even then, formulating a grammar (especially one that also catches whitespace and comments) seems like a pretty daunting task.
Catching whitespace and comments is no problem in Antlr[1]. I'm not exactly sure what you mean by round-tripping. In terms of how daunting it is, it's much easier than trying to use Flex/Bison or lex/yacc, that's for sure.
1: At the bottom of the grammar, there are rules to intentionally skip over whitespace and comments, but it could have just as easily captured them and parsed them further. https://github.com/antlr/grammars-v4/blob/master/c/C.g4
What I mean by "round-tripping" is that I would have to be able to read a .f90 file into an AST structure, then write out the tree to a new file, and end up with the exact same file content as the original. This would be a prerequisite for a refactoring tool, which has to preserve comments and code formatting.
Using Fortran in Jupyter can be extremely handy for learning certain algorithms by interactive exploration. Despite all the nice and modern alternatives, Fortran is so popular in the HPC world that generations of students will still have to learn it.
I am one of the LFortran authors. We just open-sourced it recently. Do you have any recommendations on whether we should create a mailing list, or rather use something like Discourse? Or both?
Right now if you have any questions, you can open an issue at our GitLab repository:
Gfortran is very nice, but I recommend a recent version (at least ver >= 7) rather than old versions (e.g. ver 4.4, which may still be the default on CentOS 6 etc.). For performance, I guess Intel Fortran is probably better (but it also depends on the calculations).
Good to know, thanks! I've used software compiled with the Intel compiler, but have only written some toy programs in GFortran and it seems fast enough. I like the revival idea behind LFortran though. Reading scientific C++ is much more difficult than Fortran.
Can you please point me to some documentation or examples where Intel Fortran is used to offload to, say, NVIDIA GPUs? I don't think it supports it yet.
PGI and IBM support CUDA Fortran; other compilers don't. PGI/Flang has experimental support for offloading "do concurrent". Some compilers also support the OpenMP/OpenACC pragmas and GPU offloading. One can also call CUDA C from any Fortran compiler via iso_c_binding. None of this is ideal.
I think it's still a research problem how to best integrate GPU support into the language itself. CUDA Fortran is nice; one thing we want to do down the road is implement CUDA Fortran and allow LFortran to transform code it understands into a Fortran dialect that all Fortran compilers can compile, so that one can use any Fortran compiler and still use CUDA Fortran.
The other path is to parallelize "do concurrent", but part of this also means extending Fortran to allow specifying the array layout, as Kokkos (https://github.com/kokkos/kokkos) allows.
That is one of the biggest problems with Fortran: people want to be assured that their code will run on modern hardware. When using, e.g., C++ and Kokkos, there is some assurance that the code will run.
AFAIK Julia uses a patched LLVM for its PTX output, which I think all languages should do to work towards a common optimization platform. Also, CuArrays uses multiple higher-level NVIDIA libraries, like cuBLAS and cuDNN.
The goals look similar to me, so it's worth taking a look at them.
Yes, I was planning to start with what Numba (http://numba.pydata.org/) is doing; they also use the LLVM PTX backend.
There is a really promising new project by Chris Lattner (the original author of LLVM) called MLIR: https://github.com/tensorflow/mlir. That might be the best intermediate representation that all the compilers (Julia, Fortran, ...) could target.
No worries. Yes, ultimately down the road in a couple of years, if there is some agreed-upon way of extending Fortran to handle GPUs well, the best way is to get it into the Fortran standard itself; that way all compilers will eventually support it. I recently became a member of the Fortran Standards Committee, so when the time is right, I will try to help on this front. Right now it's too early; first we need to implement the new capabilities in some compilers and get some experience and agreement among users. My own first goal is to get LFortran polished enough to get first users.