You should be able to aggregate them without any problem, though performance can be spotty, especially latency. That's really more of a problem on the input side, so in your case aggregating ought to work well.
If you're using Logic and aren't trying to get more outputs with the VRM, you can skip aggregating: just select the Apogee (Duet/One) as the input device and the VRM as the output device.
If you're looking for a headphone processor for listening to music that doesn't require extra hardware, check out Audiofile's Fidelia + FHX, which I helped design.