I wonder about the reasons behind reward hacking (by which I mean both reward tampering and reward gaming, also known as specification gaming). It seems to me that reward hacking will always happen whenever it is easier than fulfilling the actual base objective.
So, if an agent is searching (in other words, optimizing) for the best solution to a given objective, then reward hacking will always be a viable option. In the cases where an agent does not reward hack, that is simply because not hacking is easier than hacking the given reward function.
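
To make this intuition concrete, here is a minimal toy sketch in Python. Everything in it is a hypothetical illustration of my claim, not a model of any real training setup: the `Strategy` class, the effort numbers, and the assumption that the agent maximizes the reward it observes and breaks ties by choosing the lowest-effort strategy are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    proxy_reward: float            # what the reward function actually pays out
    effort: float                  # assumed cost of finding and executing the strategy
    fulfills_base_objective: bool  # whether the intended task really gets done


def choose(strategies):
    """Pick the strategy with the highest proxy reward, preferring lower effort on ties.

    Assumption: the agent only sees proxy_reward and effort, never
    fulfills_base_objective.
    """
    return max(strategies, key=lambda s: (s.proxy_reward, -s.effort))


honest = Strategy(name="solve the task", proxy_reward=1.0, effort=5.0,
                  fulfills_base_objective=True)
cheap_hack = Strategy(name="exploit the reward function", proxy_reward=1.0, effort=2.0,
                      fulfills_base_objective=False)
costly_hack = Strategy(name="exploit the reward function", proxy_reward=1.0, effort=9.0,
                       fulfills_base_objective=False)

# Same reward either way, so the relative effort alone decides the outcome.
easy_case = choose([honest, cheap_hack])
hard_case = choose([honest, costly_hack])

print(easy_case.name)
print(hard_case.name)
```

In this toy setting the agent hacks exactly when the hack is cheaper than the honest solution, and behaves as intended only when hacking is the harder route. The sketch deliberately ignores cases where the hack pays more reward than the honest solution, which would make hacking attractive even at higher effort.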