Barbershop [Kor]
Zhu et al. / Barbershop: GAN-based Image Compositing using Segmentation Masks / SIGGRAPH Asia 2021
Image compositing combines several images into a single, realistic-looking image. One approach gathers features from multiple images and blends them into a new image. Thanks to recent advances in GANs, research on this kind of image synthesis is producing results, but differences in lighting, geometry, and partial occlusion between the images make blending difficult. Blending or editing face images is especially hard because a face contains many image patches with very different characteristics, such as hair, eyes, and teeth.
Recent GAN-based face image editing falls broadly into two approaches. The first manipulates the latent space of a trained network (usually StyleGAN [3,4]). It is effective for changing global attributes of an image such as gender, expression, skin color, and pose. The second uses a conditional GAN architecture and feeds the desired attribute change in as an input. Among methods that change hairstyle, those that use a conditional GAN and take information about the region to be synthesized as input produce excellent results. To handle regions that disappear, for example when long hair becomes short hair and the background is exposed, existing methods use a pre-trained inpainting network. In that case, however, the outputs of the image synthesis network and the inpainting network differ in quality, so the seam between them often looks unnatural or shows unwanted artifacts.
To solve these problems, this paper uses GAN inversion so that only a single network is involved, which yields composite images of better quality.
Generative models such as StyleGAN [3,4] can now generate diverse, high-quality face images, and several studies have used them to composite features from multiple images. These studies mainly rely on GAN inversion: a reference face image is first mapped to a latent code of a pre-trained generator, and a new latent code representing the desired composite is then found by optimization, which yields a high-quality face image. However, the results vary widely depending on how the latent code is found, and combining the latent codes of images with different characteristics has been difficult because of the spatial correlation between features.
This paper aims to find, with the help of a segmentation map, the latent code of a pre-trained generative model that represents a better composite image.
Images generated by StyleGAN [3,4] are of remarkably high quality, and manipulating StyleGAN's latent space changes attributes of the generated image, such as pose or gender, in a natural way. However, every image StyleGAN generates is synthetic rather than real, so these manipulations apply to a real image only if StyleGAN can represent that image. GAN-inversion research therefore looks for the latent code with which a pre-trained StyleGAN reproduces a given real image. There are broadly two approaches. The embed approach computes gradients with respect to the real target image and directly optimizes the latent code, or the input to the mapping network. The project approach trains an encoder that converts a real image into a code in StyleGAN's latent space. Embed methods, such as I2S [5] and the procedure used by the StyleGAN authors, reproduce the target image with high quality, but because they run an optimization for every image, representing a single image takes a long time, much like training. Encoder-based methods (pSp [6], e4e [7]) need only one feed-forward pass and are therefore fast, but they produce lower-quality images than the embed approach.
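To make the optimization-based embed approach concrete, here is a minimal sketch. It assumes a toy linear "generator" `G(w) = A·w` in place of StyleGAN and a plain squared-error loss in place of a perceptual loss; the names `generate`, `embed`, and the matrix `A` are hypothetical and chosen only for illustration.

```python
# Toy stand-in for a pre-trained generator: G(w) = A @ w (an assumption,
# not StyleGAN). A stays fixed, like frozen generator weights.
A = [[1.0, 0.5],
     [0.0, 2.0],
     [1.5, -1.0]]

def generate(w):
    """Map a 2-D latent code to a 3-pixel 'image'."""
    return [sum(a * wi for a, wi in zip(row, w)) for row in A]

def embed(target, steps=500, lr=0.05):
    """Optimization-based inversion: gradient descent on the latent code."""
    w = [0.0, 0.0]
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(w), target)]
        # Gradient of sum(residual^2) with respect to w is 2 * A^T residual.
        grad = [2.0 * sum(A[i][j] * residual[i] for i in range(len(A)))
                for j in range(len(w))]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

target = generate([0.7, -0.3])   # a "real" image the generator can express
w_hat = embed(target)
error = sum((g - t) ** 2 for g, t in zip(generate(w_hat), target))
```

Every new image needs its own optimization loop like this one, which is why the embed approach is slow compared with a single encoder forward pass.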
Recent studies have shown that a latent space produced by a mapping network, as in StyleGAN, carries a great deal of information, and that manipulating it changes the features of the generated image. Methods that control an image this way either use a pre-trained StyleGAN or use an image-to-image translation architecture of similar structure that receives the desired information as input. Methods that manipulate the latent space of a pre-trained StyleGAN include StyleFlow [8], which reflects attributes in the latent code, and StyleCLIP [9], which manipulates the latent space from a text input. Methods that control the input of an image-to-image translation architecture include SPADE [10] and SEAN [11], which take a segmentation map as input and convert it into an image, and StarGAN-v2 [12], which takes the attribute to change from a reference image and changes the style of the whole image.
Barbershop, the paper reviewed here, uses the embed approach for the sake of quality. It also uses segmentation maps: real images are embedded into a pre-trained StyleGAN with a loss function that targets a different image for each region. This makes it possible to composite an image whose regions have different styles.
The paper uses segmentation maps to embed real images into a pre-trained StyleGAN with a loss function that targets a different image for each region.
This paper uses segmentation maps for image compositing, so the shape of the composite image is determined by the shape of the segmentation map that is used. The core technique is GAN inversion, which embeds a real image into the StyleGAN latent space. The paper uses StyleGANv2 [4] as the generator and II2S [13] as the embedding method. On top of II2S, it proposes a new latent space, the $FS$ space, with latent code $C = (F, S)$, to preserve the detail of the result image. StyleGANv2 uses 18 latent codes: $F$ is the output feature map of the 8th style block of StyleGANv2, and the remaining 10 latent codes form $S$, called the appearance code. II2S is therefore used to find the value of $C$ that represents a real image. Using these components, images are composited in the following order.
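The split of the 18 latent codes into a structure part and an appearance part can be sketched as a small data structure. This is an illustration only: `FSCode`, `to_fs`, and `toy_first_blocks` are hypothetical names, and the "first 8 style blocks" are replaced by a toy function rather than a real StyleGANv2 generator.

```python
from dataclasses import dataclass
from typing import List

NUM_CODES = 18       # latent codes used by StyleGANv2, per the text
NUM_STRUCTURE = 8    # blocks whose output feature map becomes F

@dataclass
class FSCode:
    F: List[List[float]]   # structure: spatial feature map (toy 2-D grid here)
    S: List[List[float]]   # appearance: the remaining latent codes

def to_fs(w_plus, first_blocks):
    """Run the first 8 codes through the early blocks, keep the last 10 as S."""
    assert len(w_plus) == NUM_CODES
    F = first_blocks(w_plus[:NUM_STRUCTURE])
    S = [list(code) for code in w_plus[NUM_STRUCTURE:]]
    return FSCode(F=F, S=S)

def toy_first_blocks(codes):
    """Hypothetical stand-in for the first 8 style blocks: a 2x2 'feature map'."""
    total = sum(sum(c) for c in codes)
    return [[total, total], [total, total]]

w_plus = [[float(i)] * 4 for i in range(NUM_CODES)]   # 18 toy codes of width 4
code = to_fs(w_plus, toy_first_blocks)
```

Because `F` is a spatial map rather than a vector, it can later be edited per region, which a flat list of 18 vectors does not allow.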
1. Compute the segmentation maps of the reference images used for the style change.
2. Arrange the resulting segmentation maps to build the target segmentation map.
3. Align the reference images to the target segmentation map.
4. Embed the aligned images to find the code $C$ of each image.
5. Find the blended code $C$, which mixes the appearance and structure of the images, using a masked-appearance loss function whose target image differs for each region of the target segmentation map.
Generating the target segmentation map is simple. As in Figure 1, the desired region is extracted from each image and the regions are stacked to form the target segmentation map. If the regions do not line up and leave many empty areas, the gaps are filled with a simple method that fills regions relative to the center of the image, as in Figure 2.
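The construction can be sketched on a tiny label grid. The label values, the function names, and in particular the fill rule (an empty pixel copies its neighbour nearer the centre column) are assumptions made for illustration; the paper's actual hole-filling heuristic may differ.

```python
EMPTY = -1
BACKGROUND, FACE, HAIR = 0, 1, 2

def compose_target(identity_seg, hair_seg):
    """Take hair from hair_seg and every other label from identity_seg."""
    target = []
    for id_row, hair_row in zip(identity_seg, hair_seg):
        row = []
        for id_label, hair_label in zip(id_row, hair_row):
            if hair_label == HAIR:
                row.append(HAIR)
            elif id_label != HAIR:
                row.append(id_label)
            else:
                row.append(EMPTY)   # old hair removed, nothing covers it yet
        target.append(row)
    return target

def fill_from_center(target):
    """Assumed heuristic: walk outward from the centre column and copy inward neighbours."""
    filled = [row[:] for row in target]
    for row in filled:
        center = len(row) // 2
        if row[center] == EMPTY:
            assigned = [(abs(x - center), lab) for x, lab in enumerate(row) if lab != EMPTY]
            row[center] = min(assigned)[1] if assigned else BACKGROUND
        order = list(range(center + 1, len(row))) + list(range(center - 1, -1, -1))
        for x in order:
            if row[x] == EMPTY:
                row[x] = row[x - 1] if x > center else row[x + 1]
    return filled

# Long hair (identity image) replaced by short hair (hair reference).
identity_seg = [[0, 2, 2, 2, 0],
                [2, 2, 1, 2, 2],
                [2, 1, 1, 1, 2],
                [2, 1, 1, 1, 2]]
hair_seg = [[0, 2, 2, 2, 0],
            [0, 2, 1, 2, 0],
            [0, 1, 1, 1, 0],
            [0, 1, 1, 1, 0]]
target = compose_target(identity_seg, hair_seg)
filled = fill_from_center(target)
```

The empty pixels appear exactly where the long hair used to be, and which label should fill them is a design choice.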
To composite the images, the reference images are first aligned to the target segmentation map. Alignment consists of two steps. First, for each reference image, a StyleGAN embedding method is used to find the latent code that reconstructs that image by minimizing a reconstruction loss.
The latent space used for this embedding is the $FS$ space, which consists of the feature map $F$ of the 8th block of StyleGAN and the 10 latent codes $S$ fed into the remaining blocks. The reconstruction code found this way is then used to find the aligned code. According to the authors, the reason for not searching directly for a latent code that matches the target segmentation map is to preserve the detail in $F$, which carries a lot of spatial information. To find the latent code that represents the aligned reference image, two losses are used: a masked style loss and a cross-entropy loss on the segmentation output.
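Written out, the two alignment losses described here can be combined as in the sketch below. The notation is an assumption made for illustration and may differ from the paper's exact formulation: $G$ is the generator, $w$ the latent code being optimized, $M$ the target segmentation map, $Z_k$ the $k$-th reference image, $I_k(\cdot)$ the binary mask of region $k$ in its argument, $K_\ell$ the Gram matrix of layer-$\ell$ features, $\odot$ the element-wise product, and $\lambda_s$ a weighting factor.

$$
L_{align}(w) = \mathrm{XEnt}\big(M, \mathrm{Segment}(G(w))\big) + \lambda_s L_s(w)
$$

$$
L_s(w) = \sum_{\ell} \left\| K_\ell\big(I_k(G(w)) \odot G(w)\big) - K_\ell\big(I_k(Z_k) \odot Z_k\big) \right\|^2
$$

The cross-entropy term pulls the layout of the generated image toward the target segmentation map, while the style term keeps the appearance of region $k$ close to that of the reference image.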
$I_{k}$ is the mask that extracts only the region taken from each reference image, so $I_{k}(Z_{k}) \odot Z_{k}$ denotes that region extracted from the image. The style loss is computed on Gram matrices. The cross-entropy loss compares the segmentation map of the generated image with the target segmentation map.
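A masked style loss built on Gram matrices can be sketched as follows. The features here are tiny hand-written lists (one list of pixel values per channel) rather than activations of a real feature network, and the function names are hypothetical; a practical implementation would also normalize the Gram matrices and sum over several layers.

```python
def gram(features):
    """Gram matrix of channel-by-pixel features: K[c][d] = sum_p f_c[p] * f_d[p]."""
    return [[sum(a * b for a, b in zip(fc, fd)) for fd in features] for fc in features]

def apply_mask(features, mask):
    """Zero out every pixel outside the region (mask is 0/1 per pixel)."""
    return [[v * m for v, m in zip(channel, mask)] for channel in features]

def masked_style_loss(feat_generated, mask_generated, feat_reference, mask_reference):
    """Squared difference between the Gram matrices of the two masked regions."""
    k_gen = gram(apply_mask(feat_generated, mask_generated))
    k_ref = gram(apply_mask(feat_reference, mask_reference))
    return sum((g - r) ** 2
               for row_g, row_r in zip(k_gen, k_ref)
               for g, r in zip(row_g, row_r))

# Same hair "texture", but located on the left in the reference
# and on the right in the generated image.
feat_reference = [[1, 2, 0, 0], [3, 1, 0, 0]]
mask_reference = [1, 1, 0, 0]
feat_generated = [[9, 9, 1, 2], [9, 9, 3, 1]]
mask_generated = [0, 0, 1, 1]

loss_same_texture = masked_style_loss(feat_generated, mask_generated,
                                      feat_reference, mask_reference)
loss_unmasked = masked_style_loss(feat_generated, [1, 1, 1, 1],
                                  feat_reference, mask_reference)
```

The Gram matrix sums over pixel positions, so it discards where the texture sits. That is why it suits alignment, where the region moves but its appearance should stay the same.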
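Based on the aligned latent code found this way, a new aligned structure code $F$ is computed. Where the region of the embedded image overlaps the target mask, the code is taken from the reconstructed $F$. Elsewhere it is taken from the image embedded with the aligned latent code.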
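This rule can be written as a per-pixel blend. The symbols below are assumed notation for illustration: $F_k^{rec}$ is the structure code from the reconstruction step, $w_k^{align}$ the aligned latent code, $G_8(\cdot)$ the output of the 8th style block, and $\alpha_k$ a binary mask, resized to the resolution of $F$, that is 1 where region $k$ of the reference image overlaps region $k$ of the target mask.

$$
F_k^{align} = \alpha_k \odot F_k^{rec} + (1 - \alpha_k) \odot G_8\big(w_k^{align}\big)
$$

Where $\alpha_k = 1$ the detailed reconstruction is kept unchanged, and only the remaining pixels fall back to the less detailed but correctly aligned features.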
The aligned codes found for each image are then used to composite the final image. To find the code $C$ of the final composite, its structure code $F$ is built by combining, for each region of the target mask, the aligned $F$ code of the image assigned to that region. The appearance code $S$, which expresses the remaining features, is obtained by computing an LPIPS loss per mask region and finding the weights applied to each image's code.
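The two blending operations can be sketched as follows. This is a simplification under stated assumptions: the region masks are taken to partition the grid, each image gets a single scalar weight, and the weights are simply given and normalized here, whereas the method described above finds them by minimizing a masked LPIPS loss. All function names are hypothetical.

```python
def blend_structure(aligned_maps, region_masks):
    """Per pixel, copy the aligned F of whichever image owns that region."""
    height, width = len(aligned_maps[0]), len(aligned_maps[0][0])
    return [[sum(m[y][x] * f[y][x] for f, m in zip(aligned_maps, region_masks))
             for x in range(width)] for y in range(height)]

def blend_appearance(appearance_codes, raw_weights):
    """Convex combination of appearance codes: weights are >= 0 and sum to 1."""
    total = sum(raw_weights)
    u = [w / total for w in raw_weights]
    length = len(appearance_codes[0])
    return [sum(uk * s[i] for uk, s in zip(u, appearance_codes)) for i in range(length)]

# Two reference images: image 0 owns the top row, image 1 the bottom row.
f_aligned = [[[1.0, 2.0], [3.0, 4.0]],
             [[5.0, 6.0], [7.0, 8.0]]]
masks = [[[1, 1], [0, 0]],
         [[0, 0], [1, 1]]]
s_codes = [[0.0, 2.0], [4.0, 6.0]]

f_blend = blend_structure(f_aligned, masks)
s_blend = blend_appearance(s_codes, [1.0, 3.0])
```

Structure is switched hard per region, while appearance is mixed smoothly. Mixing appearance codes lets a single global code produce consistent lighting and color across regions that come from different photos.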
The paper uses 120 images of size 1024x1024 that were embedded with II2S. Compositing experiments are run on 198 image pairs built from them. Embedding uses 400 iterations to find the reconstruction code, 100 iterations to find the aligned code, and 600 iterations to find the blending weights of the codes.
For quantitative evaluation, the method is compared with existing methods on RMSE, PSNR, SSIM, perceptual similarity, LPIPS, and FID. The baseline is the proposed method without the alignment step, embedding into the existing latent space instead of the $FS$ space. In addition, in an image-quality evaluation with 396 participants, the results were preferred 378:18 over those of LOHO [15] and 381:14 over those of MichiGAN [16].
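Two of the reconstruction metrics listed here, RMSE and PSNR, are simple enough to compute by hand. The sketch below uses four made-up 8-bit pixel values, not data from the paper.

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equally sized pixel lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the original."""
    err = rmse(a, b)
    return math.inf if err == 0 else 20.0 * math.log10(max_value / err)

original = [52, 55, 61, 59]
reconstructed = [54, 55, 60, 57]

print(rmse(original, reconstructed))            # 1.5
print(round(psnr(original, reconstructed), 2))  # 44.61
```

RMSE and PSNR measure pixel-level fidelity only, which is why perceptual metrics such as LPIPS and FID are reported alongside them.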
These results show that Barbershop composites images far better than the other existing methods. It also produces excellent results for face swapping, not only for hairstyle transfer.
This paper composites images using a pre-trained generative model and segmentation masks. In particular, it proposes a new latent space for embedding, the $FS$ space. On top of it, the method first specifies a target mask and finds aligned codes that align all reference images to that mask, then applies an embedding method based on a masked style loss that reflects the style of each reference image region. Together these produce high-quality image compositing results.
Using the latent space of a GAN is useful.
조영주 (Youngjoo Jo)
KAIST AI
[github](https://github.com/run-youngjoo)
1. Zhu, Peihao, et al. "Barbershop: GAN-based image compositing using segmentation masks." ACM Transactions on Graphics (TOG) 40.6 (2021): 1-13.
2. Official GitHub repository: https://github.com/ZPdesu/Barbershop
3. Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
4. Karras, Tero, et al. "Analyzing and improving the image quality of stylegan." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
5. Abdal, Rameen, Yipeng Qin, and Peter Wonka. "Image2stylegan: How to embed images into the stylegan latent space?." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
6. Richardson, Elad, et al. "Encoding in style: a stylegan encoder for image-to-image translation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
7. Tov, Omer, et al. "Designing an encoder for stylegan image manipulation." ACM Transactions on Graphics (TOG) 40.4 (2021): 1-14.
8. Abdal, Rameen, et al. "Styleflow: Attribute-conditioned exploration of stylegan-generated images using conditional continuous normalizing flows." ACM Transactions on Graphics (TOG) 40.3 (2021): 1-21.
9. Patashnik, Or, et al. "Styleclip: Text-driven manipulation of stylegan imagery." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.
10. Park, Taesung, et al. "Semantic image synthesis with spatially-adaptive normalization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
11. Zhu, Peihao, et al. "Sean: Image synthesis with semantic region-adaptive normalization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
12. Choi, Yunjey, et al. "Stargan v2: Diverse image synthesis for multiple domains." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
13. Zhu, Peihao, et al. "Improved stylegan embedding: Where are the good latents?." arXiv preprint arXiv:2012.09036 (2020).