I think it would be a good idea to implement a kind of thread scheduler that provides one thread per core (or two if hyperthreading is enabled).
If a calculation can be parallelized, such as computing matrix determinants or evaluating integrals, the scheduler should be fed individual objects that each perform their share of the work independently.
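To make the idea more concrete, here is a rough sketch in Python of what I have in mind, not a finished design. The names `IntegralTask` and `parallel_integral` are made up for illustration, and the midpoint rule is just a stand-in for whatever the real calculation would be. The pool gets one worker per logical core as reported by `os.cpu_count()`, which already counts hyperthreads.

```python
import math
import os
from concurrent.futures import ThreadPoolExecutor


class IntegralTask:
    """Hypothetical task object: integrates f over [a, b] with the midpoint rule."""

    def __init__(self, f, a, b, steps):
        self.f, self.a, self.b, self.steps = f, a, b, steps

    def run(self):
        h = (self.b - self.a) / self.steps
        return h * sum(self.f(self.a + (i + 0.5) * h) for i in range(self.steps))


def parallel_integral(f, a, b, steps_per_task=10_000, workers=None):
    # Assumption: one worker per logical core (hyperthreads included).
    workers = workers or os.cpu_count() or 1
    width = (b - a) / workers
    # Split [a, b] into equal sub-intervals, one independent task object each.
    tasks = [
        IntegralTask(f, a + i * width, a + (i + 1) * width, steps_per_task)
        for i in range(workers)
    ]
    # Feed the task objects to the pool and add up the partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(IntegralTask.run, tasks))
```

Calling `parallel_integral(math.sin, 0.0, math.pi)` should give a value close to 2. One caveat: in standard CPython the global interpreter lock keeps pure-Python CPU-bound tasks from actually running at the same time, so this only shows the structure. For a real speedup the same pattern would need `ProcessPoolExecutor` or a language with native threads.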
Thoughts?