@sikang99
Last active November 18, 2024 01:07
Large Behavior Model

Large Behavior Model (LBM)

  • TRI : Toyota Research Institute

Articles

Information

  • OpenVLA (Open Vision-Language-Action Model)
  • VIMA (General Robot Manipulation with Multimodal Prompts)
  • RT-1-X
  • LM-Nav
  • Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
  • Drake - Model-Based Design and Verification for Robotics

Slides

Papers

Open Source

@sikang99
Author

Diffusion Policy, under research at Toyota Research Institute (TRI), is a new way for robots to learn behaviors. It lets them acquire complex skills more efficiently and more consistently than earlier approaches. The technique centers on learning motions from human demonstrations, so a robot can pick up a new skill simply by being given new demonstration data.

What is Diffusion Policy?

Diffusion Policy is a generative-AI approach built on a conditional diffusion model. It is used to learn a robot's behavior policy, specifically its visuomotor policy, which maps visual observations to motor actions. According to TRI, the approach trains faster and more stably than earlier robot learning methods, and it performs well even on tasks involving deformable objects and materials such as cloth or liquids.

How it works

Diffusion Policy starts from a random initial action trajectory and refines it step by step, conditioned on the robot's observations, until it becomes a trajectory that accomplishes the task. The process is based on stochastic sampling: each iteration removes a little more noise, so the actions become progressively more precise. This lets a robot learn a new behavior quickly from a small number of demonstrations.
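The refinement loop described above can be sketched in a few lines. This is a toy illustration, not TRI's implementation: `toy_noise_predictor` is a hypothetical stand-in for the trained noise-prediction network, and the 1-D action space, step size, and noise schedule are all assumptions chosen for readability.

```python
import random


def toy_noise_predictor(obs, actions, k):
    """Hypothetical stand-in for a learned network eps(obs, actions, k).

    A real policy would be trained on demonstrations. Here the "ideal"
    trajectory is simply a straight line from start to goal, and the
    predicted noise is the offset from that line.
    """
    start, goal = obs
    horizon = len(actions)
    target = [start + (goal - start) * (t + 1) / horizon for t in range(horizon)]
    return [a - g for a, g in zip(actions, target)]


def sample_actions(obs, horizon=8, num_steps=50, step_size=0.2, seed=0):
    """Refine a random action sequence into a usable one, conditioned on obs."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise.
    actions = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
    for k in range(num_steps, 0, -1):
        eps = toy_noise_predictor(obs, actions, k)
        # Injected noise shrinks linearly and is exactly zero at the final step.
        sigma = 0.1 * (k - 1) / num_steps
        actions = [
            a - step_size * e + sigma * rng.gauss(0.0, 1.0)
            for a, e in zip(actions, eps)
        ]
    return actions


if __name__ == "__main__":
    # Observation: current position 0.0, goal position 1.0 (illustrative only).
    print(sample_actions((0.0, 1.0)))
```

Different seeds give slightly different trajectories that all end up near the same path, which is the sense in which the sampling is stochastic yet converges.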

Key features

1.	Learns from varied demonstrations: behaviors can be taught naturally through multimodal demonstrations (for example, with visual and tactile feedback).
2.	Suited to high-dimensional action spaces: the policy predicts a sequence of future actions rather than a single step, so the robot can plan over time and behave stably even in unpredictable environments.
3.	High training stability: it trains more stably and faster than conventional reinforcement learning and behavior cloning techniques.
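The second point, planning over time, is often realised as receding-horizon control: the policy predicts a short sequence of future actions, the robot executes only the first few, and then it replans from the new state. The sketch below illustrates that loop under stated assumptions. `plan_actions` is a hypothetical placeholder for a learned policy, and the 1-D position, horizon lengths, and disturbance hook are invented for illustration.

```python
def plan_actions(position, goal, horizon):
    """Hypothetical planner standing in for a learned policy.

    Returns `horizon` equal position deltas that would reach the goal
    if all of them were executed without disturbance.
    """
    step = (goal - position) / horizon
    return [step] * horizon


def run_receding_horizon(start, goal, horizon=8, execute=4,
                         max_replans=20, disturbance=None, tol=1e-3):
    """Execute only the first `execute` actions of each plan, then replan."""
    position = start
    replans = 0
    for i in range(max_replans):
        if abs(goal - position) < tol:
            break  # close enough to the goal
        plan = plan_actions(position, goal, horizon)
        for delta in plan[:execute]:
            position += delta
        if disturbance is not None:
            # An external push the plan did not anticipate.
            position += disturbance(i)
        replans += 1
    return position, replans


if __name__ == "__main__":
    # Push the robot off course once, after the third plan segment.
    push = lambda i: 0.3 if i == 2 else 0.0
    print(run_receding_horizon(0.0, 1.0, disturbance=push))
```

Because each plan is recomputed from the current state, the unexpected push is corrected on the next replan instead of accumulating.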

Applications and goals

TRI has already used this technique to teach robots more than 60 complex skills (for example, pouring liquids, using tools, and manipulating deformable objects), and it aims to teach 1,000 skills by the end of 2024. These skills are designed mainly so that robots can assist people with their tasks and be useful in everyday home environments.

Research results

TRI developed Diffusion Policy in collaboration with researchers at Columbia University. Across 12 robot manipulation benchmark tasks it showed an average performance improvement of 46.9% over prior methods. The work opens up the possibility of robots acting in a human-like way in more complex and varied situations, and it could change the paradigm of the robotics industry.

The results were also presented at the 2023 Robotics: Science and Systems (RSS) conference, and further technical details are available on TRI's official website and in the published paper. With this work, TRI is moving toward robots that interact with people and acquire new skills faster and more efficiently.

The technique is expected to play a major role in building Large Behavior Models (LBMs) for robots. Much as large language models (LLMs) did for natural language processing, LBMs could greatly expand the range of behaviors a robot can perform.

@sikang99
Author

Few projects currently offer a Large Behavior Model (LBM) as open source, but there are open-source projects that take a similar approach. In particular, research on large vision-language-action models for robot control is very active.
1. OpenVLA (Open Vision-Language-Action Model):
OpenVLA is an open-source vision-language-action (VLA) model designed for a wide range of robot manipulation tasks. The model has 7 billion parameters and was trained on the large-scale Open X-Embodiment dataset (more than 970,000 robot demonstrations). It can be used across many robot platforms and adapts to varied visual environments and language instructions. The model is available as open source on HuggingFace, together with easy-to-use code. It outperforms the closed model RT-2-X and adapts quickly to new task settings.
2. VIMA (General Robot Manipulation with Multimodal Prompts):
VIMA is an open-source model that performs robot manipulation tasks from multimodal prompts. It accepts several kinds of input (for example, images and text instructions) and turns them into actions for the robot. The code is open source and based on PyTorch, so researchers can access and use it easily.
3. RT-1-X:
The RT-1-X series, developed by Google Robotics, controls robot behavior using large vision-language models. It was trained on large datasets such as Open X-Embodiment, but it is not provided as open source. Models that take a similar approach have been released as open source, however, and are being actively studied in the research community.

Beyond these, GitHub hosts several open-source projects that combine robots with LLMs (large language models). For example, the "LM-Nav" project uses large pre-trained models to carry out robot navigation tasks, and its code is also open source.

If you want to build on an open-source project, OpenVLA currently looks like the most promising choice, with strong potential for general-purpose use across different robot environments. It is also worth evaluating models such as VIMA to see whether they can be applied to a range of manipulation tasks.
