Translating Images into Maps
Unlike previous approaches, we treat the transformation to BEV as an image-to-world translation problem, where the objective is to learn an alignment between vertical scan lines in the image and polar rays in BEV.
Transformers are well suited to the image-to-BEV transformation problem, as they can reason about the interdependence between objects, depths and the lighting of the scene to achieve a globally consistent representation.
Input: image, intrinsic matrix.
Output: semantic BEV maps for static and dynamic classes.
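To make the role of the intrinsic matrix concrete, here is a minimal sketch of how one image column relates to one polar ray in BEV. It assumes an ideal pinhole camera with no skew and no lens distortion; the function names and the intrinsics values are made up for illustration and are not taken from the paper.

```python
import math

def column_to_azimuth(u, fx, cx):
    """Azimuth (radians) of the BEV ray seen by image column u.

    Pinhole model: u = fx * X / Z + cx, so X / Z = (u - cx) / fx.
    Positive angles lie to the right of the optical axis.
    """
    return math.atan2(u - cx, fx)

def polar_to_cartesian(r, theta):
    """Convert a polar BEV cell (range r, azimuth theta) to (x lateral, z forward)."""
    return r * math.sin(theta), r * math.cos(theta)

# Hypothetical intrinsics: focal length and principal point in pixels.
fx, cx = 1000.0, 640.0

# Column 1640 is 1000 px right of the principal point, so its ray is at 45 degrees.
theta = column_to_azimuth(1640.0, fx, cx)

# A cell 10 m along that ray, expressed in Cartesian BEV coordinates.
x, z = polar_to_cartesian(10.0, theta)
```

Every pixel in a column shares the same azimuth, which is why a whole vertical scan line can be paired with a single polar ray; what remains unknown is the depth along that ray.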
Method
Treat the one-to-one correspondence between each vertical scan line and its associated polar ray as a seq2seq translation problem.
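A rough sketch of the seq2seq idea for a single column: each depth bin on the polar ray acts as a query that attends over the pixels of the matching image column. This is plain scaled dot-product attention on toy numbers, not the paper's architecture; learned projections, positional encodings and multiple heads are all left out, and the names `cross_attention`, `column` and `ray_queries` are my own.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention.

    queries: one vector per depth bin on the polar ray.
    keys, values: one vector per pixel in the image column.
    Returns the ray features and the attention weights.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the column features for this depth bin.
        outputs.append([sum(wi * v[j] for wi, v in zip(w, values))
                        for j in range(len(values[0]))])
    return outputs, weights

# Toy column with 3 pixels and 2-dimensional features (assumed values).
column = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy queries for 2 depth bins along the ray (assumed values).
ray_queries = [[4.0, 0.0], [0.0, 4.0]]

ray_feats, attn = cross_attention(ray_queries, column, column)
```

Each row of `attn` is a distribution over the column's pixels, i.e. a soft assignment of image heights to one depth along the ray. Running this independently for every column gives one polar ray per column.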
??? There's actually an Easter egg ???
Inter-plane attention
I'm not sure the paper is written clearly, or whether this is the final version.
Published on 2021-10-17 10:54