If a developer wants 60fps on a VR system pushing stereoscopic channels, the GPU has to render 120 frames per second (60fps per eye). So there's already a reason to double the number of frames rendered per second, even without increasing the perceived frame rate.
As for 120fps rendering on a single-channel 4K display, my guess is that 8K and higher displays will be available before there are GPU setups that can render 120fps at native 4K. That raises the question of whether a developer or player would prioritize 60fps @ 4K or 120fps @ 1080p with current GPU setups.
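
To put rough numbers on that trade-off, here is a quick back-of-the-envelope sketch. It assumes 4K means 3840x2160 (UHD) and 1080p means 1920x1080, and it treats raw pixels per second as a crude proxy for GPU load, which real workloads only loosely follow. The function and variable names are made up for illustration.

```python
# Back-of-the-envelope pixel throughput for the configurations discussed above.
# Assumptions: "4K" = 3840x2160 (UHD), "1080p" = 1920x1080, and pixels per
# second is only a rough stand-in for actual rendering cost.

RESOLUTIONS = {"1080p": (1920, 1080), "4K": (3840, 2160)}


def pixels_per_second(resolution, fps, views=1):
    """Pixels rendered per second; views=2 models one image per eye (stereo)."""
    width, height = RESOLUTIONS[resolution]
    return width * height * fps * views


def frame_budget_ms(fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    configs = [
        ("4K", 60, 1),     # 60fps @ 4K, single channel
        ("1080p", 120, 1), # 120fps @ 1080p, single channel
        ("4K", 120, 1),    # 120fps @ 4K, single channel
        ("1080p", 60, 2),  # 60fps per eye @ 1080p, stereo
    ]
    for resolution, fps, views in configs:
        rate = pixels_per_second(resolution, fps, views)
        print(f"{resolution} @ {fps}fps x{views} view(s): "
              f"{rate / 1e6:.1f} Mpixels/s, "
              f"{frame_budget_ms(fps):.2f} ms per frame")
```

Under these assumptions, 60fps @ 4K pushes about twice as many pixels per second as 120fps @ 1080p, and 120fps @ 4K about four times as many. Stereo at 60fps per eye comes out the same as a single channel at 120fps for the same resolution, which is the doubling mentioned for VR.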