Uses all available CPUs even with -nt 1 when built with OpenBLAS
If fastcodeml is compiled with OpenBLAS, it uses all available CPUs irrespective of the -nt option.
Using all CPUs is the default behavior of OpenBLAS. That is especially problematic since
- OpenBLAS is the default BLAS implementation in Ubuntu
- Most CPUs have hyper-threading, which doesn't give a real advantage. Instead, the huge number of threads makes things significantly slower.
One workaround is to run fastcodeml with OMP_NUM_THREADS=1, but this only helps when a single thread is wanted.
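
For anyone who wants to script the workaround, here is a minimal sketch in Python. It only sets OMP_NUM_THREADS for the child process and leaves the rest of the environment alone. The helper names (`blas_limited_env`, `run_limited`) and the input file names in the commented-out call are hypothetical placeholders, not part of fastcodeml.

```python
import os
import subprocess


def blas_limited_env(num_threads=1, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set.

    base_env defaults to the current process environment; the input
    mapping is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


def run_limited(command, num_threads=1):
    """Run `command` (a list of arguments) with the restricted environment."""
    return subprocess.run(command, env=blas_limited_env(num_threads), check=True)


# Hypothetical invocation; the tree and alignment file names are placeholders.
# run_limited(["fastcodeml", "-nt", "1", "tree.nwk", "alignment.phy"])
```

Setting the variable per process rather than exporting it globally keeps other OpenMP programs on the machine unaffected.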
I consider this bug important because it prevents most users from using the parallel version of fastcodeml.