I've got a bit of knowledge of military radio systems. The military is not especially fond of ground-based repeater systems, especially in any environment where they would need to be mobile, partially for the reasons you mention. They're bulky, finicky and take time to set up. More importantly though, losing one to enemy action or just bad luck instantly results in chaos, as troops a) realize it immediately and react by moving to established contingencies, b) realize it but decide the established contingency is a bad idea for whatever reason and do their own thing, or c) fail to realize the repeater is dead and instead spend 30 minutes swearing at their radios. You typically end up with 3 or more groups of people who can talk to the other people in their group... and nobody else, and it takes forever to get all the ducks back in a row.
On VHF in particular, but also to some extent on UHF, cavities are really the best bang for your buck in almost all scenarios.
In theory you could motorize the adjustments on a typical VHF or lower-UHF 'can'-style duplexer and build a streaming system that sends the data from a VNA over a network connection to a remote operator, who could then adjust things accordingly. But that's going to cost thousands (or tens of thousands) of dollars, it sounds like a huge pain in the ass, and it adds more things to the chain that can break. It's way cheaper to just pay a tech to go to the site.
Another option would be to have several duplexers, each tuned to a specific frequency pair, and a coaxial switch to move them in and out of the chain. If you only need 3 or 4 pairs, this would probably still be cheaper, easier, and more reliable than the remote tuning system above.
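For what it's worth, the control logic for a switched duplexer bank is about as simple as it gets. Here's a rough Python sketch, assuming a hypothetical bank of three duplexers on numbered switch ports. The frequency pairs, the port numbers, and the `select_port` name are all made up for illustration, and the part that actually drives the switch hardware is left out entirely.

```python
from typing import Dict, Optional, Tuple

# Hypothetical bank: coax switch port -> (repeater TX MHz, repeater RX MHz)
# that the duplexer on that port has been tuned for. Values are illustrative.
DUPLEXER_BANK: Dict[int, Tuple[float, float]] = {
    1: (146.940, 146.340),
    2: (147.120, 147.720),
    3: (145.230, 144.630),
}

# How far off the requested pair may be and still count as a match (assumed).
TOLERANCE_MHZ = 0.005


def select_port(tx_mhz: float, rx_mhz: float,
                bank: Dict[int, Tuple[float, float]] = DUPLEXER_BANK) -> Optional[int]:
    """Return the switch port whose duplexer matches the pair, or None."""
    for port, (tx, rx) in bank.items():
        if abs(tx - tx_mhz) <= TOLERANCE_MHZ and abs(rx - rx_mhz) <= TOLERANCE_MHZ:
            return port
    # No pre-tuned duplexer for this pair: refuse rather than key up
    # through cavities tuned for something else.
    return None


if __name__ == "__main__":
    print(select_port(147.120, 147.720))  # a pair that is in the bank
    print(select_port(146.000, 146.600))  # a pair that is not
```

The point of returning `None` for an unknown pair is that the repeater only ever gets agility across the pairs somebody has already tuned a set of cans for. Anything else still means a site visit.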
Generally speaking though, if frequency agility is paramount, cross-band repeat (as used on nearly all hamsats) is far and away the least expensive and least hassle on the repeater end. It just creates a bigger pain in the ass on the user side. With an inexpensive diplexer, you can even use a single antenna for two radios.
You can make the cross-band invisible to the user by having two different locations geographically separated from each other, one for the transmitter, one for the receiver, and using an RF link on a different band (or VoIP) to send received audio to the transmitter site. This way, as far as the user is aware, both frequencies in the pair are on the same band, but there's no issue with desense on the receiver. The challenge then becomes creating a roughly symmetrical RX and TX coverage area, otherwise you may have areas where the user can be heard but can't hear, or can hear the repeater fine but can't get into it. Multiple receive sites with a voting system (a voter that picks whichever site is hearing the user best) can be a workaround for that.
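To make the multi-site idea a bit more concrete: each receive site reports how well it is hearing the user, and something central picks the best one and passes that audio to the transmitter. Here's a minimal Python sketch of that selection step. The `SiteReport` fields, the SNR threshold, and the site names are assumptions for illustration, not how any particular commercial unit does it.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SiteReport:
    site: str            # hypothetical receive-site name
    squelch_open: bool   # is the site hearing a signal at all?
    snr_db: float        # signal quality estimate reported by the site


def pick_site(reports: Sequence[SiteReport],
              min_snr_db: float = 6.0) -> Optional[str]:
    """Return the name of the site with the best usable signal, or None."""
    usable = [r for r in reports if r.squelch_open and r.snr_db >= min_snr_db]
    if not usable:
        return None  # nobody hears the user well enough to repeat
    return max(usable, key=lambda r: r.snr_db).site


if __name__ == "__main__":
    reports = [
        SiteReport("north", True, 18.5),
        SiteReport("south", True, 9.0),
        SiteReport("ridge", False, 0.0),
    ]
    print(pick_site(reports))
```

A real system would re-evaluate this continuously during a transmission and would need some hysteresis so it doesn't flap between two sites with similar signal quality. None of that is shown here.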