HyperGAN [Kor]

Ratzlaff et al. / HyperGAN - A Generative Model for Diverse, Performant Neural Networks / ICML 2019

English version of this article is available.

1. Problem definition

HyperGAN์€ ์‹ ๊ฒฝ๋ง ๋งค๊ฐœ ๋ณ€์ˆ˜์˜ ๋ถ„ํฌ๋ฅผ ํ•™์Šตํ•˜๊ธฐ ์œ„ํ•œ ์ƒ์„ฑ ๋ชจ๋ธ์ด๋‹ค. ํŠนํžˆ, ์ปจ๋ณผ๋ฃจ์…˜ ํ•„ํ„ฐ์˜ ๋ณ€์ˆ˜๊ฐ’๋“ค์€ latent ์ธต๊ณผ ํ˜ผํ•ฉ(Mixer) ์ธต์œผ๋กœ ์ƒ์„ฑ๋œ๋‹ค.

(Figure: overall HyperGAN architecture)

It is well known that deep neural networks can be trained successfully from different random initializations, and ensembles of deep networks have been shown to offer better performance and robustness. In Bayesian deep learning, a central goal is to learn a posterior distribution over the network parameters, and dropout is commonly used as a Bayesian approximation; MC dropout, for example, was proposed to estimate model uncertainty. However, applying dropout to every layer can hurt the fit to the data, and the resulting models remain confined to the region of model space reachable from a single initialization.
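For reference, MC dropout estimates uncertainty by keeping dropout active at test time and averaging several stochastic forward passes. The sketch below is a generic illustration under assumed settings (the two-layer MLP and the number of passes T are arbitrary choices), not part of HyperGAN itself.

```python
# Minimal MC-dropout sketch: keep dropout active at test time and average T passes.
# The two-layer MLP and T=20 are illustrative assumptions only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                      nn.Dropout(p=0.5),
                      nn.Linear(256, 10))

def mc_dropout_predict(x, T=20):
    model.train()                                  # keep dropout stochastic at "test" time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(T)])
    return probs.mean(0), probs.std(0)             # predictive mean and per-class spread

mean, std = mc_dropout_predict(torch.randn(4, 784))
print(mean.shape, std.shape)                       # torch.Size([4, 10]) twice
```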

๋˜ ๋‹ค๋ฅธ ํฅ๋ฏธ๋กœ์šด ๋ฐฉํ–ฅ์œผ๋กœ, ๋Œ€์ƒ(target) ์‹ ๊ฒฝ๋ง์— ๋Œ€ํ•œ ๋งค๊ฐœ ๋ณ€์ˆ˜๋ฅผ ์ถœ๋ ฅํ•˜๋Š” ํ•˜์ดํผ ๋„คํŠธ์›Œํฌ๋ผ๋Š” ๋ถ„์•ผ๊ฐ€ ์—ฐ๊ตฌ๋˜๊ณ  ์žˆ๋‹ค. ํ•˜์ดํผ๋„คํŠธ์›Œํฌ์™€ ๋Œ€์ƒ ๋„คํŠธ์›Œํฌ๋Š” ๊ณต๋™์œผ๋กœ ํ›ˆ๋ จ๋˜๋Š” ๋‹จ์ผ ๋ชจ๋ธ์„ ํ˜•์„ฑํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด์ „์˜ ํ•˜์ดํผ๋„คํŠธ์›Œํฌ๋Š” ์‚ฌํ›„๋ถ„ํฌ๋ฅผ ๋งŒ๋“ค๊ธฐ ์œ„ํ•ด normalizing flow์— ์˜์กดํ–ˆ๊ณ , ์ด๋Š” ๋ชจ๋ธ ๋ณ€์ˆ˜์˜ ํ™•์žฅ์„ฑ์„ ์ œํ•œํ–ˆ๋‹ค.

This work explores an approach that generates all parameters of a neural network at once, without assuming a fixed noise model or a particular functional form for the generating function. Instead of a normalizing flow, the authors use a GAN. The resulting method yields more diverse models than training from multiple random initializations (ensembles) or earlier variational approaches.

Idea

HyperGAN์€ ๋ณ€์ˆ˜๋ฅผ ์ง์ ‘ ๋ชจ๋ธ๋งํ•˜๊ธฐ ์œ„ํ•ด GAN์˜์ ‘๊ทผ๋ฒ•์„ ํ™œ์šฉํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๋ฅผ ์œ„ํ•ด์„œ๋Š” ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ๋กœ ํ›ˆ๋ จ๋œ ๋งŽ์€ ๋ชจ๋ธ ๋งค๊ฐœ ๋ณ€์ˆ˜ ์„ธํŠธ๊ฐ€ ํ•„์š”ํ•˜๋‹ค. (image๋ฅผ ์ƒ์„ฑํ•ด๋‚ด๋Š” GAN์„ ์œ„ํ•ด์„œ real image๊ฐ€ ํ•„์š”ํ•œ ๊ฒƒ ์ฒ˜๋Ÿผ). ๊ทธ๋ž˜์„œ ์ €์ž๋“ค์€ ๋‹ค๋ฅธ ์ ‘๊ทผ ๋ฐฉ์‹์„ ์ทจํ•ด์„œ, ์ง์ ‘ ๋Œ€์ƒ ๋ชจ๋ธ์˜ supervised ํ•™์Šต ๋ชฉํ‘œ๋ฅผ ์ตœ์ ํ•œ๋‹ค. ์ด ๋ฐฉ๋ฒ•์€ normalzing flow๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค ์œ ์—ฐํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ ๊ฐ ๊ณ„์ธต์˜ ๋งค๊ฐœ ๋ณ€์ˆ˜๊ฐ€ ๋ณ‘๋ ฌ๋กœ ์ƒ์„ฑ๋˜๊ธฐ ๋•Œ๋ฌธ์— ๊ณ„์‚ฐ์ ์œผ๋กœ ํšจ์œจ์ ์ด๋‹ค. ๋˜ํ•œ ๋งŽ์€ ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œ์ผœ์•ผ ํ•˜๋Š” ์•™์ƒ๋ธ” ๋ชจ๋ธ๊ณผ ๋น„๊ตํ–ˆ์„ ๋•Œ ๊ณ„์‚ฐ์ ์ด๊ณ  ๋ฉ”๋ชจ๋ฆฌ ํšจ์œจ์ ์ด๋‹ค.

3. Method

Introduction ์„น์…˜์˜ ์œ„ ๊ทธ๋ฆผ์€ HyperGAN์˜ ๊ตฌ์กฐ๋ฅผ ๋ณด์—ฌ์ค€๋‹ค. ํ‘œ์ค€ GAN๊ณผ๋Š” ๋‹ฌ๋ฆฌ, ์ €์ž๋“ค์€ s ~ S๋ฅผ ํ˜ผํ•ฉ ์ž ์žฌ ๊ณต๊ฐ„ Z์— ๋งคํ•‘ํ•˜๋Š” fully connected ๋„คํŠธ์›Œํฌ์ธ Mixer Q๋ฅผ ์ œ์•ˆํ•œ๋‹ค. ๋ฏน์„œ๋Š” ํ•œ ๊ณ„์ธต์˜ ์ถœ๋ ฅ์ด ๋‹ค์Œ ๊ณ„์ธต์— ๋Œ€ํ•œ ์ž…๋ ฅ์ด ํ•„์š”ํ•˜๋ฏ€๋กœ ๋„คํŠธ์›Œํฌ ๊ณ„์ธต ๊ฐ„์˜ ๊ฐ€์ค‘์น˜ ๋งค๊ฐœ๋ณ€์ˆ˜๊ฐ€ ๊ฐ•ํ•˜๊ฒŒ ์ƒ๊ด€๋˜์–ด์•ผ ํ•œ๋‹ค๋Š” ๊ด€์ฐฐ์— ์˜ํ•ด ์ œ์•ˆ๋˜์—ˆ๋‹ค. ํ˜ผํ•ฉ ์ž ์žฌ ๊ณต๊ฐ„ Q(z|s)์—์„œ Nd์ฐจ์› ํ˜ผํ•ฉ ์ž ์žฌ ๋ฒกํ„ฐ๋ฅผ ์ƒ์„ฑํ•˜๋ฉฐ, ์ด๋Š” ๋ชจ๋‘ ์ƒ๊ด€๊ด€๊ณ„๊ฐ€ ์žˆ๋‹ค(correlated). ์ž ์žฌ ๋ฒกํ„ฐ๋Š” ๊ฐ๊ฐ d์ฐจ์› ๋ฒกํ„ฐ๊ฐ€ ๋˜๋Š” N ๋ ˆ์ด์–ด ์ž„๋ฒ ๋”ฉ์œผ๋กœ ๋ถ„ํ• ๋œ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ N ๋ณ‘๋ ฌ ์ƒ์„ฑ๊ธฐ๋Š” ๊ฐ N ๊ณ„์ธต์— ๋Œ€ํ•œ ๋งค๊ฐœ ๋ณ€์ˆ˜๋ฅผ ์ƒ์„ฑํ•œ๋‹ค. ์ด๋Ÿฌํ•œ ๋ฐฉ์‹์€ ๋งค๊ฐœ๋ณ€์ˆ˜์˜ ๊ทน๋„๋กœ ๋†’์€ ์ฐจ์› ๊ณต๊ฐ„์ด ํ˜„์žฌ ์—ฌ๋Ÿฌ ์ž ์žฌ ๋ฒกํ„ฐ์— ์™„์ „ํžˆ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๋Š” ๋Œ€์‹  ๋ณ„๋„๋กœ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ๋ฉ”๋ชจ๋ฆฌ ํšจ์œจ์ ์ด๋‹ค.

์ด์ œ ์ƒˆ ๋ชจ๋ธ์ด ํ•™์Šต ์„ธํŠธ์—์„œ ํ‰๊ฐ€๋˜๊ณ  ์ƒ์„ฑ๋œ ํŒŒ๋ผ๋ฏธํ„ฐ๊ฐ€ ์†์‹ค L์— ๋Œ€ํ•ด ์ตœ์ ํ™”๋œ๋‹ค.

(Equation: supervised objective evaluated with the generated parameters)
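In the notation used in this review (mixer Q, generators G, target network F with loss L over training pairs (x, y)), the objective can be written roughly as below; this is a paraphrase, and the paper's exact notation may differ.

$$
\min_{Q,\,G} \;\; \mathbb{E}_{s \sim S}\;\mathbb{E}_{(x,y)}
\Big[\, \mathcal{L}\big( F(x;\, G(Q(s))),\; y \big) \,\Big]
$$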

However, the codes drawn from Q(z|s) can collapse under this purely maximum-likelihood training (mode collapse). To prevent this, the authors add an adversarial constraint on the mixed latent space so that it does not drift too far from a high-entropy prior P. The HyperGAN objective is then:

(Equation: HyperGAN objective with the adversarial regularization term)
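A plausible rendering of this regularized objective, where D is a distance between distributions and β is a weighting coefficient (the symbol β is assumed here):

$$
\min_{Q,\,G} \;\; \mathbb{E}_{s \sim S}\;\mathbb{E}_{(x,y)}
\Big[\, \mathcal{L}\big( F(x;\, G(Q(s))),\; y \big) \,\Big]
\;+\; \beta\, D\big( Q(s),\, P \big)
$$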

D can be any distance function between two distributions. Here, a discriminator network trained with an adversarial loss is used to approximate that distance:

(Equation: adversarial loss used to approximate the distance D)
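One standard way to realize this term is a GAN-style minimax game between the mixer and an auxiliary discriminator D_φ over latent codes (D_φ is an assumed symbol, not notation from the review above), pushing Q(s) toward the prior P:

$$
\min_{Q}\;\max_{D_\phi} \;\;
\mathbb{E}_{p \sim P}\big[\log D_\phi(p)\big]
\;+\;
\mathbb{E}_{s \sim S}\big[\log\big(1 - D_\phi(Q(s))\big)\big]
$$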

๊ณ ์ฐจ์› ๊ณต๊ฐ„์—์„œ๋Š” ํŒ๋ณ„๊ธฐ๋ฅผ ๋ฐฐ์šฐ๊ธฐ ์–ด๋ ต๊ณ  ๊ทธ๋Ÿฌํ•œ ๋งค๊ฐœ ๋ณ€์ˆ˜์—๋Š” (์ด๋ฏธ์ง€์™€ ๋‹ฌ๋ฆฌ) ๊ตฌ์กฐ๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— ์ž ์žฌ ๊ณต๊ฐ„์—์„œ๋Š” ์ •๊ทœํ™”๋ฅผ ํ†ตํ•ด ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•œ๋‹ค. (์ž์„ธํ•œ ๋ฐฉ์‹์€ ๋…ผ๋ฌธ์— ์–ธ๊ธ‰๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.)

4. Experiment & Result

Experimental setup

  • Training and evaluating classifiers on MNIST and CIFAR-10

  • Learning the variance of a simple 1D dataset

  • Anomaly detection of out-of-distribution examples

    • Models trained on MNIST / tested on notMNIST

    • Models trained on 5 CIFAR-10 classes / tested on the remaining classes

  • Baselines

    • APD (Wang et al., 2018), MNF (Louizos & Welling, 2016), MC Dropout (Gal & Ghahramani, 2016)

Result

Classification results

(Table: classification accuracy on MNIST and CIFAR-10)

Anomaly detection results

(Table: anomaly detection results on out-of-distribution data)

Ablation Study

์ฒซ์งธ, ๋ชฉ์ ์—์„œ ์ •๊ทœํ™” ๋ถ€๋ถ„์ธ D(Q), P๋ฅผ ์ œ๊ฑฐํ•˜๋ฉด ๋„คํŠธ์›Œํฌ์˜ ๋‹ค์–‘์„ฑ์ด ๊ฐ์†Œํ•œ๋‹ค. ์ด๋ฅผ ํ™•์ธํ•˜๊ธฐ ์œ„ํ•ด ์ €์ž๋“ค์€ 100๊ฐœ์˜ weight ์ƒ˜ํ”Œ์˜ L2 norm์„ ์ธก์ •ํ•˜๊ณ  ํ‘œ์ค€ ํŽธ์ฐจ๋ฅผ ํ‰๊ท ์œผ๋กœ ๋‚˜๋ˆˆ๋‹ค. ๋˜ํ•œ, ์ €์ž๋“ค์€ ์‹œ๊ฐ„์ด ์ง€๋‚จ์— ๋”ฐ๋ผ ๋‹ค์–‘์„ฑ์ด ๊ฐ์†Œํ•œ๋‹ค๋Š” ๊ฒƒ์„ ํ™•์ธํ•˜๊ณ  ํ•™์Šต์˜ ์กฐ๊ธฐ ์ค‘๋‹จ์„ ์ œ์•ˆํ•œ๋‹ค(early stopping). ๋‹ค์Œ์œผ๋กœ ์ €์ž๋“ค์€ ๋ฏน์„œ Q๋ฅผ ์ œ๊ฑฐํ•œ๋‹ค. ์ •ํ™•์„ฑ์€ ์œ ์ง€๋˜์ง€๋งŒ ๋‹ค์–‘์„ฑ์€ ํฌ๊ฒŒ ์ €ํ•˜๋œ๋‹ค. ๋ฏน์„œ๊ฐ€ ์—†์œผ๋ฉด ์œ ํšจํ•œ ์ตœ์ ํ™”๋ฅผ ์ฐพ๊ธฐ ์–ด๋ ต๋‹ค๋Š” ๊ฐ€์„ค๋„ ์„ธ์› ๋Š”๋ฐ, ๋ฏน์„œ๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด ๋‹ค๋ฅธ ๊ณ„์ธต์˜ ๋งค๊ฐœ ๋ณ€์ˆ˜๋“ค ์‚ฌ์ด์— ๋‚ด์žฌ๋œ ์ƒ๊ด€๊ด€๊ณ„๊ฐ€ ์ตœ์ ํ™”๋ฅผ ๋” ์‰ฝ๊ฒŒ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค๊ณ  ์ฃผ์žฅํ•œ๋‹ค.

5. Conclusion

๊ฒฐ๋ก ์ ์œผ๋กœ HyperGAN์€ ๋งค์šฐ ๊ฐ•๋ ฅํ•˜๊ณ  ์‹ ๋ขฐํ•  ์ˆ˜ ์žˆ๋Š” ์•™์ƒ๋ธ” ๋ชจ๋ธ์„ ๊ตฌ์ถ•ํ•˜๊ธฐ ์œ„ํ•œ ํ›Œ๋ฅญํ•œ ๋ฐฉ์‹์ด๋‹ค. ๋ฏน์„œ ๋„คํŠธ์›Œํฌ ๋ฐ ์ •๊ทœํ™” ์šฉ์–ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋“œ ๋ถ•๊ดด(mode collapse) ์—†์ด GAN ๋ฐฉ์‹์œผ๋กœ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด ์ž‘์—…์€ MNIST ๋ฐ CIFAR10๊ณผ ๊ฐ™์€ ์ž‘์€ ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ๊ฐ€์ง„ ์†Œ๊ทœ๋ชจ ๋Œ€์ƒ ๋„คํŠธ์›Œํฌ๋กœ ๊ตฌ์ถ•๋˜์–ด ๊ฐ„๋‹จํ•œ ๋ถ„๋ฅ˜ ์ž‘์—…๋งŒ์„ ์ˆ˜ํ–‰ํ•œ๋‹ค๋Š” ๋‹จ์ ์ด ์žˆ๋‹ค. ResNets์™€ ๊ฐ™์€ ๋Œ€๊ทœ๋ชจ ๋„คํŠธ์›Œํฌ์—์„œ ๋” ํฐ ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ๋‹ค๋ฉด ๋” ํฅ๋ฏธ๋กœ์šธ ๊ฒƒ์ด๋‹ค.

Take home message

ํ•˜์ดํผ๋„คํŠธ์›Œํฌ(Hypernetworks)๋ฅผ GAN๋ฐฉ์‹์œผ๋กœ ํ•™์Šต์‹œ์ผœ์„œ ํšจ๊ณผ์ ์ธ ๋ฒ ์ด์ง€์•ˆ ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ(bayesian neural networks) ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

Author / Reviewer information

Author

ํ˜•์ค€ํ•˜ (Junha Hyung)

  • M.S. student, KAIST Graduate School of AI

  • Research Area: Computer Vision

  • sharpeeee@kaist.ac.kr

Reviewer

  1. Korean name (English name): Affiliation / Contact information

  2. Korean name (English name): Affiliation / Contact information

  3. ...

Reference & Additional materials

[1] Ha, D., Dai, A. M., and Le, Q. V. Hypernetworks. CoRR.

[2] Henning, C., von Oswald, J., Sacramento, J., Surace, S. C., Pfister, J.-P., and Grewe, B. F. Approximating the predictive distribution via adversarially-trained hypernetworks.

[3] Krueger, D., Huang, C.-W., Islam, R., Turner, R., Lacoste, A., and Courville, A. Bayesian Hypernetworks.

[4] Lorraine, J. and Duvenaud, D. Stochastic hyperparameter optimization through hypernetworks. CoRR.

[5] Pawlowski, N., Brock, A., Lee, M. C., Rajchl, M., and Glocker, B. Implicit weight uncertainty in neural networks.
