Re-identifying People in Video via Learned Temporal Attention and Multi-modal Foundation Models | IEEE Conference Publication | IEEE Xplore