A love letter to Pi | Lucas Meijer

Transcript

0:01 · [clears throat] Hi everyone, my name is Lucas Meijer.

0:05 · I spent most of my working life on game engines and games.

0:13 · One of them we managed to make quite popular; it’s called Unity.

0:17 · And it turns out the robots can do programming.

0:26 · I don’t know about you, but I did not really have this on my bingo card.

0:31 · I’m still sort of trying to cope with this.

0:36 · I am playing around with new ideas for what a game engine might look like if you make it for agents, or for humans and agents, instead of only for humans.

0:48 · I host what I call a coping session, bi-monthly, for brownfield programmers who decided that they want to be really good at all this AI stuff, but who realize that it’s actually not so easy. Actually, who here is really good at using the AI agents?

1:11 · One?

1:12 · Mr. Dunning-Kruger here.

1:15 · [laughter] So, for the rest of us, [laughter] my question is: are you at stage nine?

1:23 · No, I am absolutely not at stage nine, and I have a lot of opinions on the stage nine people. Let me begin by saying that my advice is to not chase every shiny new tool and to only solve the problem you actually have, which is probably not at stage nine. That said, here are my shiny tools.

1:47 · All right. I like to use Codex. I like to joke that it’s like Claude Code, but for programming. Maybe Claude is someone you’d invite to your birthday and Codex is this sort of autistic German. But if you want to write software, I’m really going for Codex.

2:14 · All right.

2:14 · [laughter] This is basically a snapshot of all the things I learned that work for me and don’t work for me. One of the things that works for me is embracing HTML as an output form factor. You know, the 1980s were cool, but coding agents in these small black-and-white boxes? That is crazy.

2:37 · So, I like to do my prompts kind of like this. I try to find a repo on the Monumental GitHub. I ask for something like “do a deep-dive analysis,” and then a lot of the prompts I just end with “present your work as a single HTML slide deck.” And if you do that in a coding agent, you get something like this.

2:58 · With, like, this... How do you call this?

3:00 · Like an index on the left. I just find this such a much more pleasant way to consume large amounts of information. It’s also much more pleasant to skip over stuff than when you read it from a terminal. There’s no reason we should, you know, go back to that part of the ’80s. Making your code base agent friendly.

3:28 · This is a game. Who knows this game?

3:31 · Some old-timers here. It’s called Marble Madness. It’s from the ’90s. I like to pretend that the marble is your coding agent and the level is your repo. And it’s your job to make this marble roll down your repo very conveniently. However, there are all sorts of hazards: the marble could fall off a cliff, for instance, if your AGENTS.md instructions are incomplete or wrong.

4:05 · If the build system that you use has been spewing out 500 warnings for the last two years and you tell everyone, “Yeah, yeah, yeah, you know, ignore that,” the agent’s going to go off track with that. Your job is basically to change the repo so that the ball will roll down smoothly.

4:30 · I like to ask this question: what would have helped the agent reach its goal faster?

4:37 · And I think the only way to really do that is to just read the whole transcript. You read the whole transcript, you see all the tool calls, you see all the things it does, and then you’re like, “Why did it go there? Why did it go there?”

4:51 · And you just make changes accordingly, right? Actually, let me show you how I like to do that. I’m here; this is my favorite agent. I like to do /share to create these, and as you’ve noticed, I like my HTML.

5:13 · I just go through the whole thing, I read all the tool calls, and I figure out how it was performing. Like anything in software development, if you’ve done it a few times it starts to feel boring. Of course, you can use AI to help you with it. You can say something like: analyze the previous session.

5:37 · Find places where the agent went in a wrong direction only to later figure out the right way. Oh yeah, and make me some recommendations on what I could have added to the repo so that it would not have made that mistake.
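
The first half of that prompt can also be roughed out mechanically. This is a minimal sketch, assuming the session has already been exported as a list of tool calls with a success flag. The `ToolCall` record, its fields, and the sample commands are hypothetical illustrations, not Pi's transcript format, and judging why a detour happened still needs a person or a model.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    """One step of an agent session (hypothetical shape, for illustration only)."""
    index: int
    tool: str
    command: str
    ok: bool


def find_detours(calls):
    """Pair each failed call with the next successful call of the same tool.

    Each pair marks a spot where the agent went the wrong way first and only
    later found something that worked.
    """
    detours = []
    pending = {}  # tool name -> earliest failed call not yet recovered
    for call in calls:
        if not call.ok:
            pending.setdefault(call.tool, call)
        elif call.tool in pending:
            detours.append((pending.pop(call.tool), call))
    return detours


calls = [
    ToolCall(0, "build", "build @Mac", ok=False),
    ToolCall(1, "read", "cat buildsystem/targets.py", ok=True),
    ToolCall(2, "build", "build :Mac", ok=True),
]
for failed, fixed in find_detours(calls):
    print(f"step {failed.index} failed ({failed.command}); "
          f"step {fixed.index} worked ({fixed.command})")
```

Everything between the failed step and the recovering step is what the detour cost, and that stretch is the place to look for a missing or wrong instruction in the repo.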

5:54 · Obviously, I ask it to report that stuff in HTML for me. This is one that I was working on the other day. It gives some info on how the session was going, and then here it had some frictions. This one was funny to me because it was on the game engine that I’m doing some work on.

6:17 · The documentation said that when you call the build command, you have to do @Mac if you want to do a Mac build. And then it turns out the agent tried that and it was wrong.

6:28 · And then it had to read the whole source code of the build system to figure out, “No, it’s colon Mac.” I did change that a few weeks ago; I just missed a spot in the docs. All right, so by fixing that, you can actually make a continuous process of making your code base the most beautiful Marble Madness level, because this repeated process is actually what you need, right?
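
A missed spot like that can be caught by a small check over the docs. This is a sketch under assumptions: the target names `Mac` and `All`, the file paths, and the idea that docs are available as plain strings are all hypothetical, and the real build system's syntax may differ.

```python
import re

# Old-style "@Target" references. The lookbehind skips things like e-mail
# addresses, where the "@" follows a word character.
STALE_TARGET = re.compile(r"(?<!\w)@(Mac|All)\b")


def find_stale_targets(docs):
    """Return (path, line_number, match) for every stale reference.

    docs maps a file path to that file's text.
    """
    hits = []
    for path, text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in STALE_TARGET.finditer(line):
                hits.append((path, lineno, match.group(0)))
    return hits


docs = {
    "docs/build.md": "Run `build :Mac` for a Mac build.\nFor CI use `build @All`.\n",
    "README.md": "Questions? Mail dev@Mac.example.\n",
}
for path, lineno, text in find_stale_targets(docs):
    print(f"{path}:{lineno}: stale target syntax {text}")
```

Running a check like this after each syntax change is one cheap way to keep the docs from sending the agent off the cliff again.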

6:55 · It’s super easy to have clankers write MD files, and you just get more and more and more, and it’s like, you know, this Google Drive... how many docs are in the Monumental drive?

7:10 · Notion. In the Notion, I’m assuming it’s hard to find things, just by previous experience. Let me... yeah, I’m going to show this. I like to use this program. It’s called Super code. It’s like Conductor, but I like it much better. Maybe we can go into why later. And I showed the share. Let me go back to this topic, actually.

7:47 · Evaluating agent work. I think the mental shift that made the biggest difference for me is thinking about how you’re actually going to evaluate the agent’s work, right? You give it a task, it goes off for half an hour, and then at some point it’s done, right?

8:06 · Now what?

8:07 · What are you going to do? Are you going to read the whole transcript? Are you going to read the summary? Are you going to read the source code? Are you going to play the game? Open the website? What is it that you’re going to do?

8:18 · And my advice is to ask yourself this question before you send the agent on its task, because when you ask yourself the question beforehand, you can actually put the answer in the prompt. And it turns out that agents really like to know ahead of time how you’re going to evaluate the results, because it also gives them a lot of clarity on when they are done and when they’re not done.
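
Putting the evaluation plan in the prompt can be as simple as a template. This is a minimal sketch: the function name, the wording of the "done" rule, and the sample task are assumptions for illustration, and only the closing slide-deck sentence comes from the talk.

```python
def build_prompt(task, evaluation_steps,
                 deliverable="Present your work as a single HTML slide deck."):
    """Compose a prompt that states up front how the result will be judged."""
    lines = [task.strip(), "", "How I will evaluate the result:"]
    lines += [f"{i}. {step}" for i, step in enumerate(evaluation_steps, start=1)]
    lines += ["", "You are done only when every step above would pass.", "", deliverable]
    return "\n".join(lines)


print(build_prompt(
    "Build a website where photos can be dragged onto an animation timeline.",
    [
        "I will watch a recorded video that demonstrates every feature.",
        "I will look at screenshots of each screen.",
    ],
))
```

The numbered list doubles as a checklist for the agent and for the person reviewing afterwards.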

8:45 · This is actually also a great tip for humans.

8:49 · [laughter] They also like to know how you’re going to evaluate them, and when they are done and when they’re not done.

8:55 · [laughter] Actually, almost [clears throat] everything that turns out to be true for these agents is almost all true for humans, too. Okay, so you put it in the beginning. You tell it how you will evaluate. And one of the things that I like... actually, let me go back to Super code. Yeah.

9:16 · So, on the Steve Yegge stage nine, there’s agent swarms and agents starting agents and all these things. I don’t know what those guys are smoking. I’m not having any of that, but you might see on the left side here that I do have 10 or 12 of these things open, right? But there are no spinners there, because all of these agent work streams that I have are waiting for me.

9:42 · Right? So, in this new world, I am absolutely the bottleneck. And if an agent does an hour of work, it’s actually a ton of work to evaluate it. It could take 15 minutes, depending on the quality and how happy you are with it.

10:01 · So, if I’m the bottleneck of my whole little software factory here, and the evaluation of the work is the bottleneck, let’s try to have the agent do more of it. I like to call these evaluation packs. Let’s make the agents do a lot of work to present a beautiful package for you that makes it really efficient to evaluate.

10:30 · For instance, to prepare for this talk, I one-shotted a website where you can have photos, drag them onto a timeline for an animation, and then set up transitions between them.

10:46 · The way I would ask to evaluate that is to say: well, why don’t you record a video where you open the website that you just made and demonstrate to me all the different features by moving the mouse and doing all these things, make a recording of that, and then obviously show it to me in a single-page slide deck together with a bunch of other stuff. And it turns out you can just ask that.

11:20 · And I get back from my one-shot this evaluation pack, which makes it easier to evaluate what it’s done and also to quickly judge whether I like it. It makes this video and I can hit play. I’m not sure how well the video works on this screen-sharing thing, but here it is going through the different photos. It’s renaming them. It’s adding them to the timeline.

11:47 · It’s changing transitions. By making it do the work to make my evaluation life easy, I get to spend less time on the evaluation and I can get more of these individual work streams done. Additionally, it also helps the agent to not cheat, right? It’s super easy for the agent to say, “Yeah, wrote some code and I think we’re good.”

12:15 · If you force it to make a video, it has to open the site in Chrome or whatever, and it will run all the JavaScript. If there’s an error there, it will find it, and if it fails to actually do these commands, it will notice and kick itself into a loop to try to fix that.

12:35 · All right, so that is... oh, yeah, and because the video is actually hard for the agent to read itself, I also always ask it for a bunch of screenshots, because you can force it to actually read the image files and use the model’s... how do you call that, Rick? The image... multimodal capabilities. Yeah, multimodal capabilities, to actually look at the picture and see what is wrong.
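
The packaging step of an evaluation pack is small enough to sketch. This version assumes the screenshots already exist as PNG bytes and inlines them as data URIs so the result is one file. The function name and page layout are hypothetical; in the talk the agent writes the page itself, and the video is left out here.

```python
import base64
import html


def build_evaluation_pack(title, summary, screenshots):
    """Return one self-contained HTML page.

    screenshots is a list of (caption, png_bytes) pairs. Each image is inlined
    as a base64 data URI, so the page needs no files next to it.
    """
    figures = []
    for caption, png_bytes in screenshots:
        encoded = base64.b64encode(png_bytes).decode("ascii")
        figures.append(
            "<figure>"
            f'<img alt="{html.escape(caption, quote=True)}" '
            f'src="data:image/png;base64,{encoded}">'
            f"<figcaption>{html.escape(caption)}</figcaption>"
            "</figure>"
        )
    return (
        "<!doctype html>\n"
        f'<html><head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body><h1>{html.escape(title)}</h1>\n"
        f"<p>{html.escape(summary)}</p>\n"
        + "\n".join(figures)
        + "\n</body></html>\n"
    )


page = build_evaluation_pack(
    "Photo timeline: evaluation pack",
    "Drag & drop, renaming and transitions were exercised.",
    [("Timeline with three photos", b"\x89PNG placeholder bytes")],
)
print(len(page), "characters of HTML")
```

A single file is easy to open, skim, and archive next to the session it came from.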

13:01 · All right. Let me speed through this. Here we are. All right, so then finally, I am in this rabbit hole of Pi. Pi is a coding agent. The rest of this talk is my love letter to Pi. Pi is not Claude Code. It is not Codex. You can use whatever model you want. And I love it because it’s hackable.

13:33 · You know, we just did a poll and it turns out none of you know what we’re doing with coding agents. I also have no idea what we’re doing with coding agents. The Claude Code guys have no idea. The Codex guys have no idea.

13:48 · We’re so early. We have no idea what an ergonomic AI assistant actually looks like, and that’s why in this phase it’s so important to just try endless things, throw them against the wall, and see what sticks, right? So, being able to experiment with what works for you as a person and what works for your project is super important.

14:16 · What is not amazing is waiting for some Claude Code guys in San Francisco to come up with a workflow that is sort of middle-of-the-road for everyone and that might or might not work for you. And that is why I love Pi: it allows you to do these things. It also has precise context management, and I hope to show some of these things.

14:38 · Let me start off with the context management. So, I was rehearsing this thing in the back and I have all this stuff in my session here. Can you guys read that? I’m going to do /tree, my Pi feature. And you get an overview of your entire context over here. And you’ll notice that it’s not linear, but that I’ve actually gone into different branches.

15:08 · Since the bottom here was all a bit nonsense, I’m going to go back to this one. I’m going to hit enter. And then for now, I’m going to say no summary. What this does is bring me back to the beginning of the context. I get all that context back. This is the repo of this sort of game engine experiment, so let me ask: run the fighter game for me.

15:38 · Let’s hope Codex is not down.

15:49 · Ooh.

15:50 · Uh-oh.

15:52 · [laughter] Let’s try that again.

16:05 · That looks better.

16:10 · All right, so, fighter game. Nothing particularly exciting about it.

16:17 · All right.

16:19 · I’m sorry, I lost my train of thought.

16:22 · Where was I?

16:23 · So, yeah, I went back with /tree to the beginning of the context. I actually use /tree all the time, because very often you will go into a side quest that turns out to be a dead end, right? You know, let’s go into a side quest with a dead end: ask me five questions about my beef chili.

16:52 · All right. Slight detour. Bear with me. I’m going to do /answer here. /answer is an extension that some random dude on the internet made. This is not part of Pi.

17:08 · It is by someone who thought: you know what really sucks about coding agents?

17:13 · That when you ask them to interview you about your plan, you get 20 questions that you have to type answers to one by one. Like, “Oh yeah, the answer to 16 is yes, and then the answer to the other one... oh, yeah, scroll back.”

17:25 · “Oh, yeah, 17. Yeah, 17 is no.” So he figured, what if we just had a special UI for that?

17:31 · And he wrote an extension for it called answer. What it does is take the previous message, send it to some cheap LLM to extract all the questions, and then present them to you in this nice UI.

17:43 · What kind of beef?

17:45 · [unclear]

17:49 · Very. And never beans, you animal. And then when you are done with it, it just turns it into a user message like that.
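
The shape of that flow (pull the questions out of the previous message, collect answers, send them back as one user message) can be sketched without the UI. Note the swap: the extension described in the talk uses a cheap LLM for extraction, while this sketch uses a regular expression over numbered lines. The function names and message format are hypothetical and are not the extension's code.

```python
import re

# A numbered line that ends in a question mark, e.g. "3. How spicy?" or "3) How spicy?"
QUESTION = re.compile(r"^\s*(\d+)[.)]\s+(.*\?)\s*$")


def extract_questions(message):
    """Return the numbered questions found in an assistant message, in order."""
    return [m.group(2) for line in message.splitlines() if (m := QUESTION.match(line))]


def to_user_message(questions, answers):
    """Turn question/answer pairs into a single user message."""
    if len(questions) != len(answers):
        raise ValueError("need exactly one answer per question")
    return "\n".join(
        f"{i}. {q} -> {a}" for i, (q, a) in enumerate(zip(questions, answers), start=1)
    )


previous = (
    "Happy to help. A few questions first:\n"
    "1. What kind of beef?\n"
    "2. How spicy?\n"
    "3) Beans or no beans?\n"
)
questions = extract_questions(previous)
print(to_user_message(questions, ["(your answer)", "Very", "Never beans"]))
```

A regular expression only works when the questions are neatly numbered, which is presumably why the real extension hands the extraction to a model.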

18:01 · I really like that as an example of how you can just notice a problem that you actually have, like answering these plan questions, instead of a problem that you don’t have, which is, you know, how do I keep 20 agent swarms busy?

18:18 · All right, now that we went into a useless side quest, I can finally show what I like /tree for.

18:26 · Because you pay for all these side quests in your context, right? It’s very normal for people to, what’s it called, anthropomorphize their coding agent, like it’s some human you’re supposed to talk to. No. It’s not a human, and you’re also not really supposed to talk to it like that. It is a token-producing machine that can help you get to good code, but you shouldn’t pretend it’s a conversation. Very often the right thing to do is... you should never tell it, “No, I didn’t want it like that.”

18:58 · “I wanted it like that.” When it made something I didn’t like, I’d use /tree, go back, and then ask in a different way. Don’t argue with these things. If you do argue with them, or if you say, “Okay, forget about it.”

19:19 · “Let’s just continue with this other feature.” You continuously pay for this side quest to be in your context. You pay for it in tokens and you pay for it in intelligence, because once your context gets longer, once you get into the 50, 60% range, you enter this dumb zone and the model gets more stupid. So it’s really important.

19:43 · I’m not sure if you can see it down here, but most coding agents these days show you how far you are into the context window. If I see that go above 50, I get very nervous. I try to always make sure I am below 50.
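
The rule of thumb is plain arithmetic. This is a tiny sketch, with the 50% threshold taken from the talk and the token counts and function names made up for illustration.

```python
def context_used_pct(tokens_used, window_tokens):
    """Percentage of the context window that is already filled."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return 100.0 * tokens_used / window_tokens


def in_dumb_zone(tokens_used, window_tokens, threshold_pct=50.0):
    """True once usage is above the threshold where quality is said to drop."""
    return context_used_pct(tokens_used, window_tokens) > threshold_pct


# Hypothetical numbers: 120k tokens used in a 200k-token window is 60%.
print(context_used_pct(120_000, 200_000), in_dumb_zone(120_000, 200_000))
```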

20:01 · So when I go into a side quest like this, and then later it turns out that the discussion about beef is maybe not beneficial to the rest of my demo, I’m going to go back here. And here I can choose. I can say no summary, and it will just throw it away.

20:20 · I’m probably going to do that, because for the rest of the demo the agent won’t get better performance if it knows my beef preferences. If it was a side quest where you did something on your project, and it’s actually useful to know that you tried it and that it failed because blah, then you would pick summary and it would put a summary in there.
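
The branching idea behind /tree can be modeled as a tree of messages where the active context is the path from the root to the current node. This is a toy model under assumptions: the `Node` class, `go_back`, and the `[summary]` marker are hypothetical and say nothing about how Pi stores sessions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One message in the session tree."""
    text: str
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)

    def add(self, text):
        child = Node(text, parent=self)
        self.children.append(child)
        return child


def context(leaf):
    """Messages the model would see: the path from the root down to leaf."""
    path = []
    node = leaf
    while node is not None:
        path.append(node.text)
        node = node.parent
    return list(reversed(path))


def go_back(target, abandoned_leaf, summarize=None):
    """Resume from target, leaving the abandoned branch out of the context.

    With no summarize function the side quest is simply dropped. With one,
    a single summary message about the abandoned branch is added instead.
    """
    if summarize is None:
        return target
    side_quest = context(abandoned_leaf)[len(context(target)):]
    return target.add("[summary] " + summarize(side_quest))


root = Node("system prompt")
game = root.add("run the fighter game for me")
chili = game.add("ask me five questions about my beef chili").add("(answers)")

print(context(go_back(game, chili)))
print(context(go_back(game, chili, lambda msgs: f"{len(msgs)} chili messages, dead end")))
```

In both cases the abandoned branch is still in the tree, so nothing is lost. It just stops costing tokens on every later turn.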

20:40 · So you have much more intention and control over your context than most other coding agents provide. I think they’re also catching up; I’m not quite sure. But having this control is super important. All right. I think I have one or two more things.

21:02 · So, one of the cool things about this game engine project I have is that I have full local CI. I’m going to make a change to arena, which is my main memory allocator that gets used everywhere. And now I’m going to ask, “Hey, do a full CI run using S A all, with a colon, not with an at symbol.”

21:37 · And one of the things that was annoying me about these coding agents is that when you have these long-running tool calls, you have such poor visibility into what’s going on, right? When it does, like, ripgrep, it’s not a problem because it just finishes immediately. But the... oh, this is not going well.

21:56 · Did it ever show the thing?

21:57 · Yes. Yeah.

21:59 · Did it just run very quickly?

22:01 · [clears throat] Oh, here we go.

22:05 · Oh, that’s not good. Oh, I think my build system is so good that it already has this one cached. I’m going to do it like this.

22:14 · Here, try again.

22:18 · I really disliked that you had such poor visibility. So I figured it would be cool if we could actually run these commands in a virtual terminal and have my coding agent show it to me while it’s working, so that when it’s sitting there for 10 minutes, which is sometimes fine and sometimes means it’s stuck on something, I could see it.
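
The visibility half of that idea can be sketched with a plain pipe. Note the swap: this is not a virtual terminal, so it does not handle colors, spinners, or alt-screen programs. It only streams a command's output line by line with elapsed time. The function name is hypothetical, and the demo command is a stand-in for a long build.

```python
import subprocess
import sys
import time


def run_streaming(cmd, on_line=print):
    """Run cmd and hand each output line to on_line as soon as it arrives.

    Returns the process exit code. stderr is merged into stdout so the
    lines stay in the order the tool produced them.
    """
    start = time.monotonic()
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    ) as proc:
        for line in proc.stdout:
            elapsed = time.monotonic() - start
            on_line(f"[{elapsed:6.1f}s] {line.rstrip()}")
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a long-running build or CI command.
    code = run_streaming([sys.executable, "-c", "print('compiling'); print('linking')"])
    print("exit code:", code)
```

Many tools buffer their output when they are not attached to a terminal, so lines can arrive in bursts through a pipe. That is one reason a real pseudo-terminal gives a better live view.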

22:45 · So I made this this morning, and I thought it was a cool example of my new sort of mental model: how am I going to write a feature like this?

22:57 · It’s cool that Pi lets you do it in the first place. Yay, Pi.

23:03 · But when you write a feature like that, how are you going to close the loop? How are you going to make the agent know if the thing it wrote worked? And how am I going to check?

23:15 · And for this one, I asked it to write a single-page HTML file. Here, this one. I asked it, “Hey, write me a bunch of different test programs that do different things with the terminal,” you know, alt screen, all these spinners, different colors, different repainting, sending a hundred screens, all these test cases.

23:38 · And then I asked it, “Okay, well, run them through the code, make screenshots every few seconds, turn them into an animated GIF, and put the animated GIF in my HTML file so that my job of reviewing it is easy,” right? “And then run my virtual terminal code path against Pi’s own code path over there, so that I can quickly see that they match.”

24:07 · It has some alt screen code, some synchronization code. It tries curl, it tries FFmpeg. So it’s another example of putting the onus on the agent to do the work, both for the agent’s benefit, making sure that it actually works, and for my benefit, making it easy to get confidence that this thing actually works. Oh. Ah. All right, that is a bummer. Let me do one final demo.

24:41 · It’s probably too late. I wanted to do this at the beginning, to show you that the thing that’s probably coolest is that it can write extensions for itself, because it’s a coding agent that has access to its own docs. So let’s try: make a quick extension so that when the agent starts, I can play Doom in an overlay. And we’ll see if it is quick enough for this thing to work.

25:12 · This particular form of software, where software finishes [snorts] and extends and configures itself while it is running on the user’s machine, I find mind-blowing, and it’s amazing for an agent. It’s also, I think, amazing for a potential game engine. And I actually like to call this Barbapapa software. Barbapapa is this cartoon from the ’70s where you have all these characters and they go on adventures.

25:51 · And when they go on an adventure, they are able to shape-shift into whatever form is required for the adventure at hand.

26:00 · Like these.

26:02 · And software could be like this. Instead of one programmer writing it once and thousands of people using it, a programmer writes a sort of base for the software, and then when you use it, it morphs itself into whatever shape is good for you and for the problem you have right now. So Pi can do that. Can it also... oh. Ooh.

26:36 · Ooh, it’s ready. Let’s try. So, another cool thing with Pi: you can do /reload. It has hot reload for the code changes that it makes to itself. I have no idea if this is going to actually... oh. Oh.

26:52 · Oh.

26:54 · Oh.

26:55 · Yeah.

26:56 · Here we go.

26:58 · Here we go. Here we go. All right, I’m going to end on that.

27:01 · Thanks, everyone.

27:03 · [applause]
