Kim et al. / Multi-Task Neural Processes / ICLR 2022


Multi-Task Neural Processes [Kor]

1. Problem definition

Neural Processes (NPs) are a family of meta-learning methods that model distributions over functions (i.e., stochastic processes). By treating each function realized from an underlying stochastic process as a task, NPs can adapt to unseen tasks through functional inference. Thanks to this property, they have been applied to diverse domains such as image regression, image classification, and time-series regression. In this paper, the authors extend neural processes to the multi-task setting, where the environment consists of correlated tasks realized from multiple stochastic processes. This setting matters because much real-world data expresses multiple correlated functions: medical or weather data, for instance, describe a patient or a region through several correlated attributes. Since existing neural-process methods neither handle a set of functions jointly nor capture the correlation between them, extending neural processes to the multi-task setting is a meaningful contribution.
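
To make the setup concrete, here is a minimal sketch (all names and data are hypothetical, not from the paper's code) of a toy multi-task dataset in which several correlated functions share an underlying signal and each task is observed only at its own subset of inputs:

```python
import numpy as np

rng = np.random.default_rng(0)

n_points = 50
x = np.linspace(-3, 3, n_points)   # candidate input locations
shared = np.sin(x)                 # latent signal common to all tasks

# Three correlated tasks: scaled versions of the shared signal plus noise.
tasks = [a * shared + 0.1 * rng.standard_normal(n_points) for a in (1.0, -0.5, 2.0)]

# Incomplete observations: each task keeps a different random subset of inputs,
# so the tasks do not share common sample locations (asynchronous sampling).
contexts = []
for y in tasks:
    keep = rng.random(n_points) < 0.3
    contexts.append((x[keep], y[keep]))

for t, (xc, yc) in enumerate(contexts):
    print(f"task {t}: {len(xc)} context points")
```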

2. Motivation

Related work

๋‹ค์ค‘ ํƒœ์Šคํฌ ํ•™์Šต์„ ์œ„ํ•œ ํ™•์œจ ํ”„๋กœ์„ธ์Šค ๋‹ค์ค‘ ํƒœ์Šคํฌ ํ•™์Šต์„ ํƒ€์ผ“์œผ๋กœ ํ•˜๋Š” ๊ธฐ์กด์˜ ํ™•๋ฅ  ํ”„๋กœ์„ธ์Šค ๊ธฐ๋ฐ˜์˜ ๋ชจ๋ธ๋กœ๋Š” ๋Œ€ํ‘œ์ ์œผ๋กœ Multi-Output Gaussian processes (MOGPs)๊ฐ€ ์žˆ๋Š”๋ฐ ์ด๋Š” ๊ธฐ์กด์˜ Gaussian ํ”„๋กœ์„ธ์Šค๋ฅผ ํ™•์žฅํ•˜์—ฌ ๋‹ค์ค‘ ํƒœ์Šคํฌ๋ฅผ ์ถ”๋ก  ํ•˜๊ณ  ๋ถˆ์™„์ „ํ•œ ๋ฐ์ดํ„ฐ๋„ ํ™œ์šฉํ•  ์ˆ˜ ์žˆ๋‹ค๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. ํ•˜์ง€๋งŒ ์ •ํ™•ํ•œ ์˜ˆ์ธก ์„ฑ๋Šฅ์„ ์œ„ํ•ด์„œ๋Š” ๋งŽ์€ ๊ด€์ฐฐ ๊ฐ’์ด ํ•„์š”ํ•œ ํ•œ๊ณ„๊ฐ€ ์žˆ๋‹ค. ์ตœ๊ทผ์˜ ๋ฐฉ๋ฒ•๋ก  ์ค‘์—๋Š” Gaussian ํ”„๋กœ์„ธ์Šค์™€ ๋ฉ”ํƒ€ํ•™์Šต ๊ธฐ๋ฒˆ์„ ๊ฒฐํ•ฉํ•œ ๋ฐฉ๋ฒ•๋ก ์ด ์žˆ์ง€๋งŒ ์ด๋Š” ๋‹ค์ค‘ ํ•™์Šต ํ™˜๊ฒฝ์„ ๊ณ ๋ คํ•˜์ง€๋Š” ์•Š์•˜๋‹ค. Conditional Neural Adaptive Processes (CNAPs)๋Š” ๋‹ค์–‘ํ•œ ์…‹์˜ ํด๋ž˜์Šค๋ฅผ ๊ณ ๋ คํ•˜๋Š” general ํ•œ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์„ ์ œ์•ˆํ–ˆ์ง€๋งŒ NP์™€ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ๊ฐ ํƒœ์Šคํฌ์— ๋Œ€ํ•ด ๋…๋ฆฝ์ ์ธ ์ถ”๋ก ๋งŒ ๊ฐ€๋Šฅํ•˜๊ณ  ์ถ”๋ก  ์‹œ์— ํƒœ์Šคํฌ ๊ฐ„์˜ ์ƒ๊ด€ ์ •๋ณด๋ฅผ explicitํ•˜๊ฒŒ ๊ณ ๋ คํ•˜์ง€ ๋ชปํ•œ๋‹ค๋Š” ํ•œ๊ณ„๊ฐ€ ์กด์žฌํ•œ๋‹ค.

Hierarchical neural-process models: Attentive Neural Processes (ANPs) integrate an attention mechanism into the deterministic path so that additional context information can be gathered for each target example, which improves performance and prevents underfitting. In a similar vein, other work extends the graphical model of NPs into a hierarchical structure by introducing local latent variables that capture example-specific stochasticity.

Idea

One challenge in the scenario of jointly learning a set of functions and the correlation between tasks is that observations can be incomplete. For example, when multi-modal signals are collected from multiple sensors, the sensors may have asynchronous sampling rates; in other words, the functions may not share common sample locations. To maximize the use of such incomplete data, the authors argue that an ideal model should be able to learn by associating multiple functions observed at different inputs. Existing multi-output Gaussian-process methods can infer multiple functions from incomplete observations in this way, but their complexity generally grows with the size of the data, requiring additional approximation techniques to keep inference tractable. (Their performance also depends heavily on whether a suitable kernel can be chosen.)

์ด์— ๋Œ€ํ•ด ์ €์ž๋“ค์€ ๋ถˆ์ถฉ๋ถ„ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๊ฐ€์ง€๊ณ  ๋‹ค์ค‘ ํƒœ์Šคํฌ๋ฅผ ๊ณต๋™ ๋ชจ๋ธ๋งํ•  ์ˆ˜ ์žˆ๋Š” Multi-task neural processes (MTNPs)๋ฅผ ์ œ์•ˆํ•˜์˜€๋‹ค. ์ฒซ ๋ฒˆ์งธ๋กœ, ๋ถˆ์ถฉ๋ถ„ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค๋ฃจ๊ณ  ํ•จ์ˆ˜๋“ค์„ ํ†ตํ•ด ๊ณต๋™ ์ถ”๋ก ์„ ํ•˜๊ธฐ ์œ„ํ•œ ๋‹ค์ค‘ ํ•จ์ˆ˜ ๊ณต๊ฐ„์„ ๋””์ž์ธํ•˜์˜€๊ณ  ํ†ตํ•ฉ๋œ ํ•จ์ˆ˜ ๊ณต๊ฐ„์—์„œ ํ™•๋ฅ  ํ”„๋กœ์„ธ์Šค๋ฅผ ์ด๋ก ์ ์œผ๋กœ ์œ ๋„ํ•˜๊ธฐ ์œ„ํ•œ ์ž ์žฌ ๋ณ€์ˆ˜ ๋ชจ๋ธ (Latent variable model)์„ ์ •์˜ํ•˜์˜€๋‹ค. ์ด ๋•Œ, ํƒœ์Šคํฌ ๊ฐ„์˜ ์ƒ๊ด€ ๊ด€๊ณ„ ํ™œ์šฉ์„ ์œ„ํ•ด์„œ ์ž ์žฌ๋ณ€์ˆ˜ ๋ชจ๋ธ์„ ๊ณ„์ธต์ ์œผ๋กœ ๊ตฌ์„ฑํ•˜์˜€๋Š”๋ฐ ์ด๋Š” ๋ชจ๋“  ํƒœ์Šคํฌ์˜ ์ •๋ณด๋ฅผ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ 1) global latent variable๊ณผ ๊ฐ๊ฐ์˜ ํ…Œ์Šคํฌ์— ์ง‘์ค‘๋œ ์ •๋ณด๋ฅผ ํ™•๋ณดํ•˜๊ธฐ ์œ„ํ•œ 2) task-specific latent variable๋กœ ๋˜์–ด์žˆ๋‹ค. ์ œ์•ˆ๋œ ๋ชจ๋ธ์€ ๋˜ํ•œ ๊ธฐ์กด์˜ neural processes๊ฐ€ ๋ณด์—ฌ์ฃผ๋Š” ์žฅ์ ๋“ค(flexible adaptation, scalable inferece, uncertainty-aware prediction)์„ ์—ฌ์ „ํžˆ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค.

3. Method

An intuitive way to apply neural processes to multiple tasks is to assume independence between tasks and define an independent NP on each function space $(\mathcal{Y}^1)^\mathcal{X}, \ldots, (\mathcal{Y}^T)^\mathcal{X}$; the authors name this Single-Task Neural Processes (STNPs, Figure 1 (a)). Among the independent latent variables $v^1, v^2, \ldots, v^T$, each $v^t$ represents one task $f^t$.

$$p(Y_D^{1:T}|X_D, C) = \prod_{t=1}^{T} \int p(Y_D^t|X_D, v^t)\, p(v^t|C^t)\, dv^t.$$

By conditioning on the task-specific data $C^t$, an STNP can handle incomplete observations (contexts). However, it can only model the marginal distribution of each task, ignoring the complex cross-task correlations present in the joint distribution over tasks.

Another alternative is to combine the output spaces into a product space $\mathcal{Y}^{1:T} = \prod_{t\in\tau}\mathcal{Y}^t$ and define a single NP on the function space $(\mathcal{Y}^{1:T})^\mathcal{X}$. In this case a single latent variable $z$ jointly covers all $T$ tasks, and the authors name this Joint-Task Neural Processes (JTNPs, Figure 1 (b)).

$$p(Y_D^{1:T}|X_D, C) = \int p(Y_D^{1:T}|X_D, z)\, p(z|C)\, dz.$$

Through the latent variable $z$, a JTNP can capture correlated information across all tasks. The problem, however, is that it strictly requires complete context and target observations during both training and inference.
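
The structural difference between the two baselines can be seen in a small sketch (toy encoder/decoder stand-ins, not the paper's implementation): STNP samples one latent per task from that task's context alone, while JTNP samples a single latent from a complete context in which every task must be labeled at every input.

```python
import torch

def encode(cx, cy):
    """Toy amortized posterior: permutation-invariant pooling -> (mean, std)."""
    h = torch.cat([cx, cy], dim=-1).mean(dim=0)
    return h, torch.ones_like(h)

def decode(latent, tx):
    """Toy predictive mean for each target input given one latent sample."""
    return tx * latent.sum()

def stnp_predict(contexts, tx):
    # p(Y^{1:T}|X,C) = prod_t \int p(Y^t|X,v^t) p(v^t|C^t) dv^t
    preds = []
    for cx, cy in contexts:                  # one independent latent v^t per task
        mu, std = encode(cx, cy)
        v_t = mu + std * torch.randn_like(std)
        preds.append(decode(v_t, tx))
    return preds                             # no information flows across tasks

def jtnp_predict(cx, cy_all, tx):
    # cy_all: (N, T) -- every task must be observed at every context input.
    mu, std = encode(cx, cy_all)
    z = mu + std * torch.randn_like(std)     # one latent z shared by all tasks
    return decode(z, tx)                     # a real JTNP decoder would emit all T outputs
```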

Multi-Task Neural Processes

์œ„์—์„œ ์–ธ๊ธ‰๋œ ๋ฌธ์ œ (์™„์ „ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ํ•„์š”๋กœ ํ•˜๋Š”)๋ฅผ ๊ทน๋ณตํ•˜๊ธฐ ์œ„ํ•ด์„œ ์ €์ž๋“ค์€ ๊ธฐ์กด์˜ JTNP์˜ ํ˜•ํƒœ๋ฅผ ์žฌ๊ณต์‹ํ™” ํ•˜์—ฌ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ํ‘œํ˜„ํ•œ๋‹ค: $h: \mathcal{X} \times \mathcal{\tau} \rightarrow \bigcup_{t\in\tau}\mathcal{Y}^t$. ์ด๋Ÿฌํ•œ union form์„ ์‚ฌ์šฉํ•จ์œผ๋กœ์จ ์–ด๋–ค ๋ถ€๋ถ„์ ์ธ ์ถœ๋ ฅ ๊ฐ’์˜ set๋„ ${y_i^t}_{t\in\tau}$ ๋‹ค๋ฅธ ์ž…๋ ฅ ํฌ์ธํŠธ $(x_i, t),t\in\tau_i$์—์„œ ํƒ€๋‹นํ•œ ๊ฐ’์ด ๋˜๊ธฐ ๋•Œ๋ฌธ์— ๋ถˆ์ถฉ๋ถ„ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ฒŒ ๋œ๋‹ค.

As shown in Figure 1 (c), they define a hierarchical latent variable model: a global latent variable $z$ uses the entire context $C$ to capture the stochastic factors shared across tasks, while the stochastic factors dedicated to each task are captured by a task-specific latent variable $v^t$ inferred from $C^t$ and $z$, as follows.

$$p(Y_D^{1:T}|X_D, C) = \int\!\!\int \left[ \prod_{t=1}^T p(Y_D^{t}|X_D^t, v^t)\, p(v^t|z, C^t) \right] p(z|C)\, dv^{1:T}\, dz.$$

Here $v^{1:T} := (v^1, \ldots, v^T)$, and conditional independence is assumed for $p(Y_D^t|X_D^t, v^t)$.

To summarize, by sharing $z$ across all of $v^{1:T}$, the model captures cross-task correlation and can exploit it efficiently. Moreover, the global latent variable $z$ lets the model make full use of incomplete data, because 1) $z$ is inferred from the entire context $\bigcup_{t\in\tau}C^t$, and 2) each task-specific latent variable $v^t$ is also conditioned on $z$, so each function $f^t$ induced by $v^t$ can leverage not only the current task's observations $C^t$ but also the observations $C^{t'}$ of the other tasks.

For training and inference, the authors approximate the conditional prior and the generative model with an encoder $q_\phi$ and a decoder $p_\theta$. Since the likelihood above, $p(Y_D^{1:T}|X_D, C) = \int\!\!\int [\prod_{t=1}^T p(Y_D^{t}|X_D^t, v^t) p(v^t|z, C^t)] p(z|C)\, dv^{1:T}\, dz$, is intractable, the model is trained by maximizing a variational lower bound.

$$\log p_\theta(Y_D^{1:T}|X_D^{1:T}, C) \geq \mathbb{E}_{q_\phi(z|D)}\!\left[\sum_{t=1}^T \mathbb{E}_{q_\phi(v^t|z,D^t)}\!\left[\log p_\theta(Y_D^t|X_D^t, v^t)\right] - D_{KL}\!\left(q_\phi(v^t|z, D^t)\,\|\,q_\phi(v^t|z, C^t)\right)\right] - D_{KL}\!\left(q_\phi(z|D)\,\|\,q_\phi(z|C)\right)$$
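
As a rough sketch of how this bound can be computed (Gaussian posteriors assumed; the function and variable names are hypothetical, not the authors' code):

```python
import torch
from torch.distributions import Normal, kl_divergence

def mtnp_elbo(q_z_D, q_z_C, q_v_D, q_v_C, log_liks):
    """Monte-Carlo estimate of the MTNP lower bound for one context/target split.

    q_z_D, q_z_C : Normal posteriors over the global latent z given D and C
    q_v_D, q_v_C : per-task Normal posteriors over v^t (already conditioned on z)
    log_liks     : per-task log p(Y^t_D | X^t_D, v^t) evaluated at sampled v^t
    """
    elbo = -kl_divergence(q_z_D, q_z_C).sum()                  # global KL term
    for ll, qd, qc in zip(log_liks, q_v_D, q_v_C):
        elbo = elbo + ll.sum() - kl_divergence(qd, qc).sum()   # per-task terms
    return elbo

# Toy usage with made-up distributions (3 tasks, 8-dim latents, 10 targets):
z_D = Normal(torch.zeros(8), torch.ones(8))
z_C = Normal(torch.zeros(8), 2 * torch.ones(8))
v_D = [Normal(torch.zeros(8), torch.ones(8)) for _ in range(3)]
v_C = [Normal(torch.zeros(8), torch.ones(8)) for _ in range(3)]
lls = [torch.randn(10) for _ in range(3)]
loss = -mtnp_elbo(z_D, z_C, v_D, v_C, lls)   # maximize the bound = minimize -ELBO
```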

๋…ผ๋ฌธ์—์„œ๋Š” ๊ธฐ์กด์˜ Attention Neural Process (ANP) ๋ชจ๋ธ ๊ตฌ์กฐ๋ฅผ ํ™œ์šฉํ•˜์—ฌ implementation์„ ์ง„ํ–‰ํ•˜์˜€๊ณ  ๋ชจ๋ธ์˜ ๊ตฌ์กฐ๋Š” ์œ„์˜ Figure 2์™€ ๊ฐ™๋‹ค.

4. Experiment & Result

Experimental setup

๋ฐ์ดํ„ฐ์…‹ ์ €์ž๋“ค์€ ์ด ์„ธ๊ฐœ์˜ ๋ฐ์ดํ„ฐ ์…‹ (synthetic & real-world ๋ฐ์ดํ„ฐ์…‹)์œผ๋กœ MTNP๋ฅผ ๊ฒ€์ฆํ•˜์˜€๊ณ  ๋ชจ๋“  ์‹คํ—˜์—์„œ context ๋ฐ์ดํ„ฐ๋Š” ๋ถˆ์ถฉ๋ถ„ํ•˜๊ฒŒ ๊ตฌ์„ฑํ•œ ํ›„ ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๋‹ค.

๋ฒ ์ด์Šค๋ผ์ธ ๋ชจ๋ธ๊ณผ ํ•™์Šต ํ™˜๊ฒฝ MTNP ๋ชจ๋ธ์˜ ๋น„๊ต๊ตฐ์œผ๋กœ ์ €์ž๋“ค์ด ๋ฐฉ๋ฒ•๋ก ์—์„œ ์–ธ๊ธ‰ํ•œ STNP์™€ JTNP ๋ชจ๋ธ์„ ANP๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์„ค๊ณ„ํ•˜์—ฌ ๊ตฌ์„ฑํ•˜์˜€๋‹ค. JTNP ๋ชจ๋ธ์€ ๋ถˆ์™„์ •ํ•œ ๋ฐ์ดํ„ฐ๋ฅผ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์—†๊ธฐ์— missing label์€ STNP๋ฅผ ํ†ตํ•ด imputation์„ ์ง„ํ–‰ํ•˜์˜€๋‹ค. 1D regreesion task์—์„œ๋Š” ์ถ”๊ฐ€์ ์œผ๋กœ ๋‘ ๊ฐœ์˜ Multi-output Gaussian processes ๋ฒ ์ด์Šค ๋ผ์ธ ๋ชจ๋ธ (CSM, MOSM)๊ณผ ๋‘ ๊ฐœ์˜ ๋ฉ”ํƒ€ ํ•™์Šต ๋ฒ ์ด์Šค๋ผ์ธ ๋ชจ๋ธ (MAML, Reptile)๊ณผ ์„ฑ๋Šฅ์„ ๋น„๊ตํ•˜์˜€๋‹ค.

๊ฒ€์ฆ ๋ฉ”ํŠธ๋ฆญ Regression ํƒœ์Šคํฌ์—์„œ๋Š” mean squared error (MSE)๋กœ ์„ฑ๋Šฅ ์ธก์ •์„ ํ•˜์˜€๊ณ  image completion ํ…Œ์Šคํฌ์—์„œ๋Š” pseudo-lbael๊ณผ prediction ๊ฐ’์˜ error๋ฅผ MSE์™€ mIoU๋กœ ์ธก์ •ํ•˜์˜€๋‹ค.

Result

์ด ์„ธ ๊ฐœ์˜ ๋ฐ์ดํ„ฐ ์…‹์œผ๋กœ ์ฃผ์š” ์‹คํ—˜๊ณผ ablation ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜์˜€๊ณ  ๋Œ€ํ‘œ์ ์œผ๋กœ ๋‚ ์”จ ๋ฐ์ดํ„ฐ๋ฅผ ํ™œ์šฉํ•œ 1D ์‹œ๊ณ„์—ด regression ํƒœ์Šคํฌ ๊ฒฐ๊ณผ๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์„ค๋ช…์„ ์ง„ํ–‰ํ•˜๊ฒ ๋‹ค.

The dataset for this experiment consists of weather records collected over 258 days from 266 cities, with a total of 12 weather-related attributes (max temperature, min temperature, humidity, cloud cover, etc.). Table 2 shows the quantitative results: the proposed MTNP model outperforms the baselines in both accuracy and uncertainty estimation, demonstrating that it generalizes effectively to real data. Figure 4 further shows that MTNP performs effective knowledge transfer between tasks in the incomplete-data setting: in panel (a), when observations are scarce the uncertainty is high and the NLL is large, but as additional observations (Cloud) arrive, knowledge transfer takes effect and prediction quality improves.

5. Conclusion

์ œ์‹œ๋œ ํ™•๋ฅ  ํ”„๋กœ์„ธ์Šค ๊ธฐ๋ฐ˜์˜ MTNP์€ ๋ถˆ์ถฉ๋ถ„ํ•œ ๋ฐ์ดํ„ฐ ํ™˜๊ฒฝ์—์„œ ๋‹ค์ค‘ ํ•จ์ˆ˜๋ฅผ ํšจ๊ณผ์ ์œผ๋กœ ์ถ”๋ก ํ•  ์ˆ˜ ์žˆ๊ฒŒ ๊ณ ์•ˆ๋˜์—ˆ๊ณ  ๋‹ค์–‘ํ•˜๊ฒŒ ๋””์ž์ธ๋œ ์‹คํ—˜์„ ํ†ตํ•ด ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์„ ์ž…์ฆํ•˜์˜€๋‹ค. Large scale ๋ฐ์ดํ„ฐ ์…‹ ํ™˜๊ฒฝ์—์„œ ์„ฑ๋Šฅ์„ ๊ฒ€์ฆ์ด ์ข‹์€ ์—ฐ๊ตฌ ๋ฐฉํ–ฅ์ด ๋  ๊ฒƒ์ด๋ผ ์ƒ๊ฐ๋˜๊ณ  ๊ด€์ฐฐ๋˜์ง€ ์•Š์€ ๊ณต๊ฐ„์— ๋Œ€ํ•ด ์ผ๋ฐ˜ํ™”๋ฅผ ์ง„ํ–‰ํ•˜๋Š” ๋ฐฉํ–ฅ๋„ ๋ชจ๋ธ์˜ ๋ฒ”์šฉ์„ฑ์„ ํ–ฅ์ƒ ์‹œํ‚ค๋Š”๋ฐ ๋„์›€์ด ๋  ๊ฒƒ์ด๋ผ ์ƒ๊ฐ๋œ๋‹ค.

Take home message (์˜ค๋Š˜์˜ ๊ตํ›ˆ)

Neural Processes (NPs) are awesome.

Kudos to the researchers for their hard work.

Author / Reviewer information

Author

  • ํ—ˆ์ž์šฑ

  • School of Computing

  • jayheo@kaist.ac.kr

Reviewer

  1. Korean name (English name): Affiliation / Contact information

  2. Korean name (English name): Affiliation / Contact information

  3. ...

Reference & Additional materials

  1. Kim, Donggyun, et al. "Multi-Task Processes." arXiv preprint arXiv:2110.14953 (2021).

  2. Caruana, Rich. "Multitask Learning." Machine Learning 28.1 (1997): 41-75.

  3. Fortuin, Vincent, Heiko Strathmann, and Gunnar Rätsch. "Meta-Learning Mean Functions for Gaussian Processes." arXiv preprint arXiv:1901.08098 (2019).

  4. Bateni, Peyman, Raghav Goyal, Vaden Masrani, Frank Wood, and Leonid Sigal. "Improved Few-Shot Visual Classification." In CVPR, 2020.

  5. Finn, Chelsea, Pieter Abbeel, and Sergey Levine. "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks." In ICML, 2017.

  6. Garnelo, Marta, et al. "Conditional Neural Processes." In ICML, 2018.

  7. Itô, Kiyosi, et al. An Introduction to Probability Theory. Cambridge University Press, 1984.

Figure 1: Graphical models of stochastic processes for multiple functions
Figure 2: Architecture of the neural network model for MTNP
Figure 3: Experimental results for 1D regression task