If you go the monitor route, you are going to have to treat the acoustics too, unless you have some really expensive monitors and happen to have a naturally near-ideal room. Combined with the fact that even with cheaper monitors you need at least a pair, this adds up to a higher cost than headphones. Besides this there are the practical aspects: for instance, you might want to play loud in the middle of the night, which speakers make difficult, and you need the skills to set up your room properly too. All of this means it takes more energy and money to end up with a great solution, but it is definitely something you should aim for.
With headphones, on the other hand, if you live in a relatively big city you can just walk into a store, try out various expensive headphones and handpick the ones that fit your ears. When you come home and start mixing, you get the same sound you heard in the store, because the room acoustics do not play a role. All of this comes at a lower cost, with the possibility of working remotely and in the middle of the night as well. So for anyone who wants to learn mixing and mastering, headphones are definitely going to help on that journey.
I have experience with both near-fields and headphones, and I find it ideal if you can use both. Still, great monitors in great acoustics are the winner, because you can have more than two sets active at the same time. With open-back headphones you can pretty much only use two; beyond that it gets practically difficult. When I monitor on headphones, I commonly work with open-back headphones and alternate between two sets at a time (one set around the neck and one set in front of the ears). They become like small speakers, and in my experience it works.
If you decide on the monitor route, I do think you should spend quite a lot of money on both the speakers and the acoustics; everything else is basically a waste of money. Also don't forget that the audio interface has a certain frequency response of its own, so when you are on a budget it is best to reference-check on several entirely separate setups. It's also great if you incorporate some in-ear buds, because they can reveal certain things about the dynamics due to how close they sit to the ears.