If you start naively without any library that avoids the problem then memory access is the problem. Have a look at how much effort is needed to avoid the problem, for example with blocking algorithms.
Mathematicians love a good puzzle. Even something as abstract as multiplying matrices (two-dimensional tables of numbers) can feel like a game when you try to find the most efficient way to do it.