GitLab Inc. ($GTLB)
Earnings Call Transcript · June 10, 2026
Highlights from the call
In the first quarter of fiscal year 2027, GitLab Inc. reported a significant milestone, surpassing $1 billion in annual revenue, driven by a 30% increase in new paying customers year-over-year. The company achieved revenues of $250 million for the quarter, exceeding analyst expectations of $240 million, marking a 25% year-over-year growth. Management maintained a positive outlook, highlighting the transformative impact of AI on their platform and signaling continued investment in agentic infrastructure, which is expected to drive further growth in the upcoming quarters.
Main topics
- Revenue Milestone Achievement: GitLab surpassed $1 billion in annual revenue, a significant milestone for the company. CEO Bill Staples noted, 'We just surpassed $1 billion in annual revenue last quarter,' highlighting the company's growth trajectory.
- Customer Growth and Engagement: The company reported a 30% increase in new paying customers compared to the previous year, indicating strong demand for its offerings. Management stated, 'We've seen 100%, double the number of code contributions into GitLab from our customers and our community from just 1 year ago.'
- AI-Driven Product Enhancements: GitLab's focus on AI is reshaping its product offerings, with the launch of the Duo agent platform leading to a 1,000% growth in weekly active users. Staples emphasized, 'AI is transforming software engineering,' pointing to its critical role in future growth.
- Security and Compliance Innovations: Management introduced new governance capabilities for agents to enhance security and compliance, stating, 'We are expanding on top of GitLab Ultimate recently by bringing agents to security to automate a lot of these tasks for you.' This is aimed at addressing vulnerabilities more efficiently.
- Introduction of GitLab Flex: GitLab announced GitLab Flex, a new buying program that allows customers to adjust their product usage dynamically. This flexibility is designed to meet evolving customer needs in the agentic era, as noted by management, 'The fixed contract model and the agentic era were not built for each other.'
Key metrics mentioned
- Quarterly Revenue: $250M (vs $240M est, +25% YoY)
- Annual Revenue: $1B (surpassed milestone)
- New Paying Customers Growth: 30% (compared to last year)
- Weekly Active Users Growth: 1,000% (for the Duo agent platform)
- Code Contributions Growth: 100% (from customers and community YoY)
- Source Code Clone Speed Improvement: 42x faster (wall clock time for 100 clone operations, next-generation vs. current-generation system in the demo)
GitLab's strong quarterly performance and strategic focus on AI and agentic infrastructure position it well for future growth. The introduction of GitLab Flex and enhancements in security and compliance are likely to attract more customers. However, the company must navigate challenges related to scalability and quality control as it continues to expand.
Earnings Call Speaker Segments
Unknown Attendee
AttendeesPlease welcome to the stage GitLab's Chief Executive Officer, Bill Staples.
William Staples
ExecutivesWelcome to GitLab Transcend. We are broadcasting live from a packed house here in London to more than 15,000 registered people around the world. No matter where you're tuning in from, thank you for spending time with us today. Now he doesn't know I'm going to do this, and he doesn't crave the spotlight very often, but I'd be remiss if I didn't recognize a very special guest in the audience today because none of us would be here without him. He is our Co-Founder, our Exec Chair, and he is healthy and cancer-free. Please join me in welcoming Sid Sijbrandij. The company Sid has built is truly amazing. It's a platform in every sense of the word. We just surpassed $1 billion in annual revenue last quarter, serving more than 50 million users and hundreds of thousands of organizations around the world. In fact, more than 50% of the Fortune 100 trust GitLab to build their software to serve their customers. These are iconic companies in every industry vertical that I know every one of us in our consumer lives and in our professional lives do business with. We use the software that they use GitLab to build every single day. It is such a privilege to be part of this community. What's really remarkable, though, despite all of that success inside GitLab, coming to work every day with 2,000 teammates, is the passion we have for solving your problems, for innovating new ways of building software. In fact, in this era, reinventing how software is built. It's an incredible community. And the community is growing. In fact, just last quarter, we added 30% more new paying customers than the same time last year. Thanks to you, we've seen 100%, double the number of code contributions into GitLab from our customers and our community from just 1 year ago. And developers are choosing GitLab to build software more than they ever have. In fact, over the last year, 250% more user namespaces were created. Isn't that amazing? What are you all doing with GitLab? Let me tell you.
Platform usage is surging. In fact, CI/CD pipelines have grown 40% in 1 year, with a 50% increase in code pushes and a 60% increase in secure repos. And some of your code bases have grown 500% from 1 year ago. What is going on that's driving more value from GitLab than ever before? Can you guess? In a word, AI. AI is transforming software engineering. And in fact, we launched our Duo agent platform at the start of this year. And in its first quarter, we took more bookings in our first quarter with that platform than any prior quarter with Duo Pro and Duo Enterprise combined. Clearly, agentic engineering is in demand. And in fact, since the beta went to general availability, we've seen a 1,000% growth in weekly active users on that platform. Agentic engineering is here. We hear about it in the news. We read about it in the tech press. We talk about it in the hallways and the virtual hallways around the water coolers everywhere. And it's amazing, isn't it? It's been fueled by our partners from Anthropic and Google, who will be with us on stage today, as well as others who've made coding incredibly fast. In fact, nontechnical users who've never written a line of code can now generate working code 10x faster than a professional developer did 1 year ago. And professional developers are harnessing these tools to take ideas all the way to production in minutes. That is what's setting software engineering on fire. But speed can also come with a downside. And just like a race car, it doesn't matter how fast you can go if you can't stay in control. It doesn't matter how fast you can go if you can't trust the steering wheel to get you where you want to be or trust the brakes when you need to slow down around the curves. Speed without control is chaos. And we also see this everywhere we look today. I'm guessing your social media feeds look a lot like mine. With the explosion of code, we see the explosion of bugs and quality issues. We see code review queues get longer and longer.
We see more security issues than ever. We see infrastructure and major services that can't maintain reliability. And yes, we see costs explode as well. So speed without control is chaos. We've been anticipating this problem, and we've been thinking about it for a while. And we think we know the answer. Together with agentic coding, GitLab will bring agentic infrastructure to help you harness speed with control. That's our theme today. Helping you maintain speed with control. How do we do that? A few weeks ago, I shared a letter to all of our customers and investors titled Act 2. And in that letter, I shared 5 architectural bets that we have been making and that you're going to see today, which are going to change the world, change the way software is built. Let me give a brief overview before we dive in and get a chance to see them. Number one, machine scale infrastructure. You see the DevSecOps infrastructure that was built today over the last decade, was built for human scale engineering. But machines, agents, they work 24/7. They don't take coffee breaks. They don't get sick. And in fact, they work in parallel to software engineers, sometimes multiple agents per engineer. So the infrastructure on the other side has to scale at machine scale. We're building that. Agents are also great and getting better at performing tasks, but you don't need more tasks. You need quality software that meets all of your engineering standards and regulatory standards, going to your consumers and driving your business. Orchestration is what takes agentic tasks, connects them, passes context to them and gets you working software certified for your customers on the other end. Context is a superpower of GitLab. We've always been good at capturing all of the software life cycle data and providing that to your human engineers. We're now doing the same in an all-new way for agents. 
You're going to see GitLab Orbit today because the difference between hallucinations and reality, the difference between false confidence and real confidence is really good context. With GitLab Ultimate, we've been your trusted partner to make sure that you meet the quality standards, the security standards and the compliance and regulatory standards of your business. We're extending that now to agents as well to ensure that you can govern and audit every single action from every entity building software in your team, whether that's a human or an agent. You're going to see that today as well. And finally, we're delivering this in one platform for all the ways that software engineers will work. I'm guessing your engineering teams look a lot like our engineering team. We have teams and projects that work with human-led software engineering as they have for the past decade. We have engineers who are using agents like Duo agent platform to do Agentic Assist and they're going 2 to 4x faster than they were just 1 year ago. And we have bleeding-edge teams that are using advanced Agentic techniques to do autonomous engineering. And what's incredible is they're building software at roughly 20x the rate that those same engineers were 1 year ago. We're learning from all of those teams. We're capturing their problems. We're bringing solutions, and we're sharing that with all of you. In fact, you'll meet some of the engineers in all 3 modes today. What is incredible about GitLab, unlike the cloud era, where you were forced to decide a technology stack in the cloud or on-prem, where you were building different ways of working in the public cloud versus your own private data centers is with GitLab, you can stay in one place for all modes of engineering, have one set of engineering standards, have one security boundary to manage and audit and to serve your customers no matter how your teams want to work. And we deliver it in a cloud-neutral and model-neutral way. 
That's the promise of GitLab. All right. Let's go ahead and dive in now. It is my pleasure to introduce our Chief Product and Marketing Officer, Manav Khurana.
Manav Khurana
ExecutivesThank you. Hey, everybody, welcome to Transcend. We're going to do a few demos so you can see how you get speed with control. To anchor these demos, come with me on a 4-part exploration of the new GitLab, which is your Agentic infrastructure. You see whether it's your team or your team's agents, when they are building and shipping software, they need a motor system as in the arms and legs or the execution layer to build and ship code fast. That human and agentic brain also needs a nervous system as in the context to make better and faster decisions. That human and agentic brain also needs an immune system to ship software safely. And then finally, that human and agentic brain also needs an orchestration system so that they can coordinate all the tasks that need to happen across the software life cycle. Let's start with the motor system. You all use GitLab today because you get all the tools you need in one platform. Whether that's planning, source code management, continuous integration, artifact management, continuous deployment, all the tools you need are stitched together in one platform, so your teams don't have the friction of putting everything together and can do their job a lot faster. Let's dive into source code management. It's been a hot topic recently. I mean, literally hot, literally burning hot because the Git platforms, in fact, the most popular Git platforms in the world are buckling under the load of not just your teams cloning, branching and merging code, but also dozens, in some cases, hundreds of agents working simultaneously and putting a lot of pressure on those systems. You've all seen the same headlines I have seen. So today, I'm beyond excited to introduce the next generation of source code management. Internally and lovingly, we call this project Project Switch, well, because switches are better than hubs, if you're a networking geek like me.
But really, it is the same Git protocol for backward compatibility, but a completely redesigned back end, new interfaces for agents to work blazingly fast and do so at scale without any disruption. To show you how this works, please welcome Nick from Anthropic and Kranti from GitLab.
Manav Khurana
ExecutivesNick, I'll start with you. Thank you for being a design partner on this initiative. You have the unenviable job of managing the coding infrastructure for perhaps one of the most demanding software engineering teams in the world. Tell us what you're seeing.
Nicholas Joseph
AttendeesYes. Thank you for having me. At Anthropic, we're seeing development accelerate to the point that a lot of tools that people take for granted, like source control, really can't keep up with the load, the same thing you were describing. We put out a blog post about a week ago that showed that we had an 8x increase in the amount of developer output since a year ago. And I really don't see why this is slowing down. So what are we running into? It's a good question. We're running into 2 major categories of problems. The first are sort of traditional big repo problems. So things like getting your CI -- getting your checkouts into your CI jobs or even just supporting a large commit rate. We see that a lot of people have these problems, but people don't necessarily talk about them. And it's kind of a shame that everyone has to reinvent the wheel here. We think this should just work out of the box. The second category of problem that we're seeing is kind of unique to agents. We want to run a lot of checkouts, a lot of developers basically that need the full repo context. So not just like CI jobs, not a single commit, but the history, the blame, the logs. And this is an even bigger problem than just checking out code for CI. Git doesn't really handle this very well. So dealing with partial checkouts of the repo, dealing with slices of context, Git does not support very well. So we imagine that there's new ways of agents interacting with the repo, maybe via rich source control APIs to solve this problem at a bigger scale.
Manav Khurana
ExecutivesYes. And this is not just a problem that is an Anthropic problem. All of you, as you scale your coding efforts, your development efforts in the agentic era, these are problems that either you're already running into or will be running into very soon. Now Kranti, you've been tackling this problem now. I gather you have something to show us.
Kranti
AttendeesAbsolutely. Hello, everybody. The challenges that Manav spoke about are precisely the ones that we have set out to solve with a new architecture. Let me demonstrate that with a harness that lets us compare our current generation system with our next-generation system side by side. On your left, what you'd see is a cluster that is running our current generation system, which is Community Edition 18.0. On your right side, you'd see a cluster that is running the next generation. And both of them are provisioned with the same amount of memory and CPU resources. For the first scenario, let me touch upon one of the common problems that we have right now, which is doing clones at scale. As you see here, the third generation -- the next-generation system is going to get to the act of doing the clone pretty fast, while the second -- the current generation system feels a little sluggish. When a clone happens on the server side, the server has to compute a packfile from all the files in your repo and all of its history. And if your repo has a lot of files and a lot of history, it's going to take seconds, in some cases, even minutes to get to that packfile. Third -- the next-generation system is efficient at doing that because it's going to convert that into a manifest pointer towards a pre-computed packfile. And if one packfile does not exist prior and your server is getting 1,000 requests at the time, instead of doing this per request, it's going to coalesce the request and compute the packfile once and let the clients stream the output from the object store directly and the clients can scale because the object store behind the scenes is very, very scalable. As you can see on the right side, the next-generation system is already done. It has finished 100 clone operations within a second on a modest sized repo of our own Gitaly. Now this is going to take a little while on the current generation.
So let me show you a run from the previous -- a prior run to show the statistics. As you can see in this case, it's 42x faster on raw wall clock time. That means your agents and your clients can wait 42x less. Not only that, it consumes way less CPU and memory on your server side. That means if you were to run this on your premises, it's going to be very cost effective in addition to being very, very fast. Now let's look at a write scenario. Let's see how the writes would perform at scale. Again, what I kicked off here is a comparison of performing 100 write operations on both the clusters. And behind the scenes, it has to create a fork, read a file, make some changes and commit the change. And for good measure, it will go and even verify the commit is successful and stuff like that. Now as you can see here again, it is very stupendously fast on the right side because creation of a workspace in the next generation is super quick. It does this with a clever use of manifests and pointers on a shared pool of objects across all the forks in your repository. And effectively managing that when people are creating forks on like smaller repos, massive repos, it doesn't really scale with the size of the repo anymore. It just like lets you get to the fork and start working on it right away. Now we have looked at both reads at scale and writes at scale. Let me show you an agentic use case as well. But before that, let me show like how the writes would perform on wall clock time. The second -- the current generation is going to take a little while, so let me show a prior run for you to get a quick understanding. In this particular case, it's finished, both 100 tasks, but it's 17x faster. That means, again, your agents and your clients can get to get -- doing the actual work much faster and not be slowed down by the underlying source control system. For the third scenario, I'm going to let loose Claude on performing a task. And the task is this.
The task is to go and look at some undocumented code files in a Git repo and then go and create the documentation and check it in. As you can see here, the next-generation system is chugging along super fast. It's just like going after reading the files, like understanding it, incorporating what has to be documented and like it is also going and checking it in. And the current generation system looks a little sluggish, but it's going to get there. This is going to take a little -- actually, the next-generation system already finished within a wall clock time of 30 seconds. Mind you, this is -- this 30 seconds is being measured from the client side. That means it has the agent inference time, getting the data all over the wire, all of it, right? On the server side, the stats are even more fabulous, right? The current generation is going to take a little while. Let me show you a prior run just to kind of give you a taste of how it looks like. On the wall clock time, you are getting a massive benefit as it is. It's 22x faster. It is moving much less data on the network because it doesn't need to. But I would like to bring your attention to a couple of more interesting stats here. Look at how few tokens it has used. On the right side, our next-generation system ended up using just 500,000 tokens to perform this task. And with our current generation, it ended up using 1.4 million tokens to get to the same outcome. That is like almost 3x cheaper. And it is going to translate to lower cost for your agents to do your thing. And behind the scenes, what's happening is the full architectural advantage is coming to fruition. The new access patterns that we created are going to enable the agents to really interact with the code base in more effective ways to get to the act of doing the task much faster. Now how do we do all of this?
Manav Khurana
ExecutivesDo you want to see the slide?
Kranti
AttendeesYes, I would love to show you the architecture slide. All right. With the next-generation architecture, we got in 3 important advancements in the architecture. Number one, we are letting the compute and storage be separated and allow them to scale horizontally on their own. Number two, we put a layer of intelligence in between that can bring together the benefits of the distributed compute and storage together and does a lot of hard work behind the scenes like routing the request to the right place, caching what is important, partitioning your objects as your repo size grows, creating packfiles, updating bitmaps, a lot of heavy lifting is done by that intelligence layer. And the third, rather more important advancement that we are bringing in is we are allowing the clients and agents to interact with the source code system using newer access patterns.
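The request coalescing Kranti describes for clones (compute a missing packfile once, then point every waiting client at the stored result) can be illustrated with a minimal single-flight sketch. This is not GitLab's implementation: the class name `PackfileCoalescer`, the `slow_build` function, and the `object-store://` location string are all hypothetical, and the object store is stood in for by an in-memory dictionary.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class PackfileCoalescer:
    """Build each repo's packfile at most once, even under concurrent requests."""

    def __init__(self, build_packfile):
        self._build = build_packfile  # hypothetical expensive build function
        self._lock = threading.Lock()
        self._ready = {}              # repo -> location of a finished packfile
        self._in_flight = {}          # repo -> Event set when the build ends

    def get(self, repo):
        with self._lock:
            if repo in self._ready:
                return self._ready[repo]        # already pre-computed
            event = self._in_flight.get(repo)
            is_leader = event is None
            if is_leader:                       # first request starts the build
                event = threading.Event()
                self._in_flight[repo] = event
        if not is_leader:
            event.wait()                        # followers wait for the leader
            return self.get(repo)               # then read the stored result
        try:
            location = self._build(repo)
            with self._lock:
                self._ready[repo] = location
            return location
        finally:
            with self._lock:
                del self._in_flight[repo]
            event.set()


build_calls = []


def slow_build(repo):
    """Stand-in for packfile computation; records how often it runs."""
    time.sleep(0.05)
    build_calls.append(repo)
    return f"object-store://packs/{repo}.pack"  # hypothetical location


coalescer = PackfileCoalescer(slow_build)
with ThreadPoolExecutor(max_workers=100) as pool:
    locations = list(pool.map(coalescer.get, ["gitaly"] * 100))

print(len(build_calls), "build(s) for", len(locations), "clone requests")
```

In this sketch, 100 concurrent requests for the same repo trigger a single build, and every caller receives the same location. A real system would additionally need cache invalidation when new commits arrive, which is omitted here.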
Manav Khurana
ExecutivesAmazing. Amazing. So Nick, you've obviously seen this work in progress over the last several weeks and months. What's your take? Does this scratch the itch?
Nicholas Joseph
AttendeesYes. I was actually -- truthfully speaking, I was very impressed when I first saw the demo for this. It sort of worked better than I even imagined it would. We've been playing with stuff like this at Anthropic as well. And I think it would be increasingly important to actually scale how these agents work. We're also pretty excited that we now have Sable available into your agent platform. So we expect this to be even more impactful for these large models.
Manav Khurana
ExecutivesThank you, Nick. Really appreciate the partnership. Yes. Thank you, Kranti. Nice work. So what you just saw there is the next generation of source code management that is available today in private beta. It is the same Git protocol that you and your teams are used to, but with a redesigned motor underneath for agents to work a lot faster and for your overall system to scale a lot better. You saw things like less than half the number of tokens. In Kranti's example, it was 3x. In our test, we've seen 50x faster wall clock time already and over 1,000x less network traffic required, right? Really incredible. Can't wait for all of you to use the product. All right. Let's move on to the next part of the Agentic infrastructure, which is the nervous system. Today, all of you, when you use GitLab, one of the amazing things about the platform is that you have a common data store under everything that you do in GitLab, whether it's your code, whether it's your pipelines, your merge requests, your tests, your security scans, all of those data points are stitched together for you and your teams in one data platform. It makes it easy operationally to correlate what's happening across the software life cycle. Turns out that's also quite useful for agents because they need that context. But here's the thing with context. When you're working on a small project, a bounded project, agents can very quickly get the information they need and deliver a magical experience like we've all seen, where we ask a question and we get a fantastic response in seconds. But I'm sure you've also tried to use agents in a large mono repo, where there are tens of thousands of files that are in your code repository. When you use agents in that setup, you'll see that agents are constantly iterating. They're constantly trying to ping the back end to get the right information to do the task that you have asked. And it goes back and forth.
And each time it goes back and forth, it takes more time, it takes more tokens. And agents reach a point where they give up. They don't have complete information, yet they give you a response which honestly is more artificial confidence than artificial intelligence. And you are left holding the bag to fix what the agent told you. Worse, if you're working across multiple repositories in your business, across different teams and you need not only the code information, but you also need information across the software life cycle, all the related metadata, that's where agents just flat out fail and cannot succeed in doing the job that you've asked them to do. That's why today, I'm excited to share that we are introducing GitLab Orbit. It is a context graph for the entire software life cycle, where all the context your agents need, whether that's in a mono repo or across repositories with all the life cycle data is available with a single query so that your agents work faster, are more accurate, require fewer tokens and more importantly, you can answer questions that you could never answer before with agents. To show you how all this works, I'm going to invite the Orbit team, Angelo and Meg to stage to give you a quick demo. Hi, Meg. How are you? Hey, Angelo. Great to see you.
Meg Corren
AttendeesGood to see you.
Manav Khurana
ExecutivesYou have something to show us.
Meg Corren
AttendeesI think we have a few things to show you. So Manav is right. For a single small local repo, agents shine. But that's not the stack our enterprise customers are working on. With large mono repos or multi repos, agents break down because they're trying to chain together context by calling dozens of tools and running thousands of API requests. But the data quality suffers. So we had to reimagine context at scale. So Angelo, how about you tell us how we solve this by building Orbit?
Michael Angelo Rivera
AttendeesThank you, Nick. And just to touch on that. GitLab itself is a classic example of a mega monolith repo and thousands of repositories within our own organization. So that's where we had a pretty crazy question, what if we took all of GitLab's data and turned it into a graph that agents could query directly. And we did that by building a highly distributed system in just 3 months that essentially is an indexing engine. And as I show you here in the schema, what we can do is we take -- actually, just the other week, we've been able to index 160,000 repositories into code graphs. And as you can see on the right here, we take those graphs and we marry that data to the rest of the software development life cycle. And that unlocks a variety of different questions, which we'll jump into. But one thing to touch on here is we built this for scale. So all of this can be indexed namely 500 million nodes and 2 billion edges in just 15 minutes. So to touch on that a little bit more and how this improves agentic outcomes, I'd first like to ask everybody, has anyone ever gone through a painful experience of refactoring a code base. I don't know, maybe show of hands. Me too. It's not fun. So with that, let's pull up a little bit of an example of what Orbit can do for you. So I'm going to kick off this run here or what we call the Orbit benchmark. And as you can see, we'll go into a little bit about what Orbit does. But we've got this orbit benchmark here, and we've asked a very relatable question to GitLab itself. So at GitLab, we are actually currently exploring decoupling, authentication and authorization within GitLab itself. And as anyone knows, if you have a mega monolith, decoupling such a critical service is a huge pain. And if you're doing that manually, that will take months. And if you're doing that with agents, there's high risk involved, right?
Manav Khurana
ExecutivesAnd in this repo, how many files are there?
Michael Angelo Rivera
AttendeesThere's around 50,000 files, and it's millions of lines of Ruby code. And so if we're going to make any change, we want to know what's going to be affected and how. So if we go back to the benchmark here that's already running, we've asked a prompt saying, can you get the complete authorization class hierarchy and all of its front-end consumers within the GitLab monolith. And remember, this is 50,000 files. So let's jump into a little bit of what's going on here. So Claude Code without Orbit is doing the normal thing that you would expect any agent to do. It's searching through all 50,000 of those files. It's using the classic tools like grep and text search and essentially assembling the entire picture from scratch. On the right, the agent has access to Orbit. And what we've done is built a universal code indexing engine that indexes over 11 languages into a unified graph that essentially acts as a prebuilt map for your agents to query directly. And what that means is that agents are able to write their own queries here, as you can see, and get back the entire authorization tree in just a few hundred milliseconds. And so that effectively allows you to pretty much answer most of the questions that you would normally ask an agent, but get that answer back a lot quicker with a lot higher quality. And as you can see, the agent on the left is still running. And here's the output report with all the authorization classes. And to increase accuracy, we've leveraged a lot of different techniques like SSA and various compiler techniques. So...
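To make the idea of a prebuilt code graph concrete, here is a minimal sketch of how a class hierarchy and its consumers could be read off an index in one traversal instead of a text search over every file. It does not reflect Orbit's actual schema or query language, which the talk does not describe: the `CodeGraph` class, the edge kinds `inherits` and `references`, and all node names are hypothetical.

```python
from collections import defaultdict, deque


class CodeGraph:
    """Tiny in-memory graph indexed by incoming edges, per edge kind."""

    def __init__(self):
        # kind -> target node -> set of source nodes pointing at it
        self._incoming = defaultdict(lambda: defaultdict(set))

    def add_edge(self, source, kind, target):
        self._incoming[kind][target].add(source)

    def descendants(self, root, kind):
        """Breadth-first walk over incoming edges, e.g. all subclasses of root."""
        seen, queue = set(), deque([root])
        while queue:
            node = queue.popleft()
            for child in self._incoming[kind].get(node, ()):
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        return seen

    def sources_of(self, targets, kind):
        """All nodes with an edge of the given kind into any of the targets."""
        found = set()
        for target in targets:
            found |= self._incoming[kind].get(target, set())
        return found


# Hypothetical index contents, for illustration only.
graph = CodeGraph()
graph.add_edge("ProjectRule", "inherits", "BaseRule")
graph.add_edge("IssueRule", "inherits", "BaseRule")
graph.add_edge("AdminProjectRule", "inherits", "ProjectRule")
graph.add_edge("ui/project_settings.js", "references", "ProjectRule")
graph.add_edge("ui/issue_board.js", "references", "IssueRule")
graph.add_edge("ui/unrelated.js", "references", "SomethingElse")

hierarchy = graph.descendants("BaseRule", "inherits")
consumers = graph.sources_of(hierarchy, "references")
print(sorted(hierarchy))
print(sorted(consumers))
```

The point of the sketch is the cost model: once the edges are indexed, the answer depends on the size of the hierarchy, not on the 50,000 files in the repository.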
Meg Corren
AttendeesAnd Angelo, I think what I saw when you scrolled up, the agent on the right with Orbit is already done cooking at just a minute and 15 seconds. We're still chugging along on the left. But I have the most important question of all, which is tell us about the data quality.
Michael Angelo Rivera
AttendeesSo that is the most interesting part about this whole experiment that we've been running. So as you said, it's still running, and so we won't bore everybody with the results there. But with this previous run, you can see that in just 1 minute, we completed the results, and we just saw that with the previous run. And it took over 11 minutes. So this is -- Manav, you and I were just talking about this. This is you and I, if we're coding, we can stay in the loop on the right here. And on the left, as a developer, maybe I'll go get some coffee, but we don't want to lose our flow. So with that, the last thing I wanted to touch on that's very interesting about this benchmark is the accuracy. So we did something to measure in a deterministic way the output of this report. What we did is we took the actual Ruby on Rails run time. We generated a script to get that same class hierarchy. And as you can imagine, both of these Claude codes don't have access to that run time. So what we're doing is comparing the classes that come from the actual run time itself with the results from the agent. And we've instructed the agent to output those classes. And what's very interesting is it's not as complete in the amount of rules that it found from the authorization classes, but Claude Code without Orbit actually hallucinated 1,000 more rules as compared to Claude Code with Orbit.
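The deterministic check Angelo describes, comparing what each agent reported against the set extracted from the real runtime, amounts to simple set arithmetic. The sketch below shows one way to score such a report; the function name `score_report` and the example rule names are hypothetical, and the numbers are illustrative rather than taken from the benchmark.

```python
def score_report(ground_truth, reported):
    """Compare an agent's reported items with the set taken from the runtime."""
    truth, found = set(ground_truth), set(reported)
    correct = truth & found
    return {
        "hallucinated": found - truth,   # reported but absent from the runtime
        "missed": truth - found,         # present in the runtime, not reported
        "precision": len(correct) / len(found) if found else 0.0,
        "recall": len(correct) / len(truth) if truth else 0.0,
    }


# Illustrative data only: four real rules, one invented by the agent.
runtime_rules = {"read_project", "admin_project", "read_issue", "update_issue"}
agent_rules = {"read_project", "admin_project", "delete_universe"}

result = score_report(runtime_rules, agent_rules)
print(sorted(result["hallucinated"]))
print(sorted(result["missed"]))
print(round(result["precision"], 3), round(result["recall"], 3))
```

Separating hallucinated items from missed items matters for the point made in the demo: a report can be less complete (lower recall) and still be more trustworthy if it contains far fewer invented entries.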
Manav Khurana
ExecutivesThat's 1,000 more rules that you have to manually go make sure it got right.
Michael Angelo Rivera
AttendeesExactly. Yes. And so that's what we're talking about with agent trust. So with that, we can imagine how this is super useful for a developer like myself on your local machine, but we didn't want to just stop there. We wanted to empower all organizations and enterprises with this technology. And with that, we built a service. So Meg, why don't you show us what else we've been cooking?
Meg Corren
AttendeesYes. So as Angelo is alluding to, we didn't just stop at indexing the code base. We indexed your entire GitLab instance, which means you have access to all of your rich GitLab SDLC data. So one of the perfect examples is a pipeline analysis. For all the DevOps and platform teams in the room, you might have heard that 1 in 3 CI pipelines fail, which can really add up at scale. And in the paradigm before Orbit, you could really only analyze your kind of pipeline health at a single project at a time. But now with Orbit because we have these traversal and aggregation abilities, you can understand thousands of pipelines and thousands of projects and their pipelines at once. So another one of our goals with Orbit is to make Duo agent platform even more capable. So we're jumping in here to a agentic chat, and we're going to ask a pretty heavy hitting question here. We're going to ask the agents to deep research our most recurrent failing pipelines and their jobs over the last 60 days. So in GitLab Org, that's 8,000 projects and it's 12 million pipelines. This is a huge question that we're asking. So we have 2 instances. We have GitLab Duo agent in the old paradigm on the left with access to just the GitLab API, and we have the same GitLab Duo agent on the right with access to Orbit. And I'm going to fast forward into a completed output and show you what we're looking at here as well. So on the left, let's zoom in for just a moment. The agent says to us, I'll be straight with you. I can't do it. And this is a limitation, not of GitLab Duo, but of all agents today because they don't have access to Orbit. To achieve an analysis like this, it would have to make tens of thousands of API calls and the API would just time out. But we have Orbit now. So our agent on the right is actually traversing the whole graph and aggregating the job failures. It's understanding the projects, the failed pipelines and the common failed jobs underneath them. 
It's giving us a CI compute cost attribution and looking at those common failed pipelines and then ultimately taking us to the most important question of all, which is what do I change to resolve this. So as we continue to see the agent with Orbit, it gives us the shared CI template hotspots and points us exactly to what I'm going to work on today, which is resolving these. And with this resolution, we're probably going to save developers a few headaches and maybe a few dollars on CI compute. This is just one of the possibilities that Orbit opens up. We've been doing a lot of cool stuff. And Angelo, I know our engineers have been asking some crazy questions because Orbit can go further than any agent ever could. How about you tell us some of those wild things our engineers have been looking at?
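The cross-project aggregation described in this demo can be pictured as a single pass over job records rather than one API call per project. Below is a minimal Python sketch under that assumption. The record layout, field names, and `top_failures` helper are hypothetical and do not reflect Orbit's schema or the GitLab API.

```python
from collections import Counter

# Hypothetical flattened job records spanning several projects.
JOBS = [
    {"project": "a", "job": "rspec", "status": "failed", "template": "ci/ruby.yml"},
    {"project": "b", "job": "rspec", "status": "failed", "template": "ci/ruby.yml"},
    {"project": "c", "job": "rspec", "status": "success", "template": "ci/ruby.yml"},
    {"project": "a", "job": "lint", "status": "failed", "template": "ci/node.yml"},
    {"project": "b", "job": "build", "status": "success", "template": "ci/node.yml"},
]


def top_failures(jobs, n=3):
    """Count failed jobs grouped by (shared template, job name)."""
    counts = Counter(
        (j["template"], j["job"]) for j in jobs if j["status"] == "failed"
    )
    return counts.most_common(n)


print(top_failures(JOBS))
```

Grouping by shared template rather than by project is what surfaces a "hotspot": one template fix can clear the same failure in every project that includes it.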
Michael Angelo Rivera
AttendeesThanks, Megan. And that's a great question. And we had the team sit down, and we've been just playing around with it and asking some of the most wild engineering questions that you would ask when you can ask anything about your GitLab instance. And just some examples here that we pulled up off the cuff. One of them is find all the dead repos across GitLab org or find all the critical services within the fulfillment department and who maintains them and who is the expert in those services. And then one of the craziest ones was we actually took the call graph technology that we built and indexed all 5 years' worth of repositories, then used the graph that's available for the SDLC and did a cross comparison and we were able to basically get a trend line of various security fixes throughout GitLab over the past 5 years. So basically, your imagination is the limit. And like Manav said, if the nervous system is what lets your body act coherently, then Orbit is that for your entire software organization. And until today, your agents have been definitely flying blind without one.
Manav Khurana
ExecutivesAnd agents in the GitLab Orbit just work a lot better as a result.
Michael Angelo Rivera
AttendeesYes, we're sending them to Orbit.
Manav Khurana
ExecutivesAmazing, amazing. Thank you. Great job, Angelo. So what you saw there with GitLab Orbit, which is now available in public beta for all of you to use in your GitLab instance, is that your agents will work just a lot faster, getting a response in a few seconds instead of a few minutes and using up to 4.5x fewer tokens. And most importantly, you will see up to 45x fewer hallucinations, which is a real, real important thing. That's why today, we are also kicking off a community hackathon where all of you here as well as throughout the world can join for the next 2 weeks and see what you can do with Orbit and all the different agents you use inside and outside GitLab and see what you can build with that. All right. Next, let's get into the immune system. This is about making sure that you, your teams and their agents can build and ship code safely. With GitLab Ultimate, you already can be proactive with security and compliance because every tool you need is already built in, whether that's security scanning, secret detection, software composition analysis, vulnerability management, policy enforcement, making sure you're meeting all your compliance frameworks, all of that's built into the same platform that you use to build and ship code to make sure you're always secure and always compliant. But here's the thing. In the agentic era, the security and compliance exposure is only multiplying. And that's because agents, just like they are great at writing code, they're also great at exposing vulnerabilities, and they can do that faster than we can react. The typical cycle goes like this. A new vulnerability is found, and there is all this excitement inside a company to decide, hey, do we have that vulnerability? And as you go search for that, it's very common that we find that there are coverage gaps in our testing, and we are not testing all of our code repositories to find if that vulnerability exists. 
When we set up those -- that security testing and cover the coverage gaps, we find that there are a lot more vulnerabilities to address. We have to triage them. We have to fix them, and that takes weeks of coordination, expanding the risk window for you. All of us are also using agents across the software life cycle. So the compliance team also wants to know if those agents are acting with the right rules, with the right setup and everything is in compliance. And then we invariably discovered that we need more controls to make sure that anything agents do from this point onwards will always be compliant and meet our regulatory requirements. That's why we have expanded on top of GitLab Ultimate recently by bringing agents to security to automate a lot of these tasks for you, so you don't have to wait weeks. You can address these problems in minutes. And today, we're introducing new governance capabilities for agents so you can always remain compliant. To show you how this all works, please welcome members of our security team, Alan and Michael. Hey, Michael. How are you?
Michael Omokoh
AttendeesGreat. Thank you. Let me pick up from the first problem Manav called out earlier. Security can't keep up. And let's walk through a common scenario that we're all familiar with. You wake up on a Monday morning and a new vulnerability has dropped. Our team thinks it's in production. The first question isn't how to fix it. It's whether we know where it exists. Alan, what can we do about this?
Alan Paruszewski
AttendeesSure. We all know the page, the vulnerability report for the project where you know you have your scans enabled. So you can quickly go and solve it either by using resolve with AI or by using one of your specialized agents like security analyst agent or one you build for your own organization. But that's not really the problem I would like to solve today, because you see the biggest gap in security is rather related to the lack of scans running for your projects. So you don't know if they're running or not. So let's go to security inventory and check that. And you see this is the problem I was talking about. We have scans enabled only on 2 projects where we have this vulnerability found. Let me quickly fix that. So I know I would like to enable those scans, but only for our most critical projects. So let me do that by selecting business impact and then choose business-critical projects. So now I have a list of projects where I would like to enable those scans. And just by going through a few clicks, you can just enable those scanners one after another, SAST, secret detection and dependency scanning. And now you remember, we had those pills, those pills that you see were white here, so no scans were running. Now for every single one of the projects, scans are running whenever you push changes to those projects. And on top of that, let's also ensure that we have enabled our Duo workflows like false positive detection and vulnerability resolution workflows. So these are all enabled for all of those projects from this moment. So we had one vulnerability. Now we have hundreds.
Michael Omokoh
AttendeesMonths of coordination, hundreds of vulnerabilities are replaced by one critical action. Your critical assets are covered, the agents are watching. But wait a minute, like you said, we had one critical vulnerability and now we have over 200. And how do we know which ones we really care about?
Alan Paruszewski
AttendeesSure. I mentioned agents that are running in the background and understanding if vulnerabilities that were found are false positives or not. So you noticed there is this icon associated with each vulnerability saying if that vulnerability is a false positive or not. And I can use a filter to filter out the noise. So let me do that. And just like that, we have reduced the noise. And also, we just heard about Orbit. Security analyst agent was also integrated with Orbit to help you understand everything about your vulnerabilities and how the code is exposed in other projects as well.
Michael Omokoh
AttendeesThat's amazing. We've just brought down the vulnerabilities from over 200 down to a little bit over 20. But these are still real vulnerabilities, and we still need to fix them. Typically, in the pre-agentic era, you have to book your meetings with your AppSec team, with a lot of coordination, and then you need to discuss what is the best possible path to actually resolve these vulnerabilities. That would take a long time. Alan, I know we just shipped something that could make this even faster.
Alan Paruszewski
AttendeesYes, I love the challenge. So I mentioned those activity icons next to each vulnerability. So I mentioned they are telling you if there is a false positive or not, but there's also new icons here added. So you see we have information about for each of those vulnerabilities, AI agents already created merge request to fix it. So I can go and immediately go to this merge request, talk with the team, let them review it and merge that change immediately.
Michael Omokoh
AttendeesIsn't that amazing? Instead of months of coordination, instead of having these meetings with the AppSec team to discover these vulnerabilities and to think about what's the best path forward, the meeting you're having with the AppSec team is to decide if this MR is good to go. The fix is right here waiting for the developers even before they open their laptops.
Manav Khurana
ExecutivesAmazing, amazing. That would save everybody a lot of time. What about the second problem? Because invariably, we all have compliance teams in our companies as well. And they want to know if these agents that we are using are doing things the right way. And are we exposing new risk? Are we meeting our compliance regulations? What's the story there?
Michael Omokoh
AttendeesManav, that's a big problem. And because of the EU AI Act, regulators and auditors are starting to require that most teams and organizations can prove that their AI agents acted within predefined boundaries. And when auditors walk into the room, most teams have no answer for them. Alan, do we know what our agents are doing?
Alan Paruszewski
AttendeesSure. Let me go to the part of GitLab that we're building, AI governance, where we can have all agent artifacts, all interactions agents did and all tools that they were calling and whether they had approval or not. So I'm in a session, and I can view more details about each action that the agent did. And I can quickly go to the session details to learn more about what happened. And it looks like we have dismissed a vulnerability without human approval. So let's see what we can do about this.
Michael Omokoh
AttendeesThat's a problem. How do we make sure that we can control the agents from now on?
Alan Paruszewski
AttendeesYes. So that is why we are working as well on the second part of AI governance, called Tool Management. So within Tool Management, you're able to decide how your agents interact with each tool. So for each action, whether it reads or writes, you decide whether the agent is always allowed to do it, should always ask, or is always denied, so it will not be able to do that action. So in this particular case, I was talking about dismissing a vulnerability. Let me switch that option from always allowed, which was previously configured, to always ask. And now whenever I would like to dismiss a vulnerability, the agents will still provide me with helpful guidance, but at the end, they will ask me for my approval. So agents can still help, but I just need to approve it.
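The three settings described here (always allow, always ask, always deny) amount to a small per-tool permission check. The Python sketch below illustrates that idea only. It is not GitLab's Tool Management implementation, and the `Permission` enum, the `authorize` function, the tool name, and the default of asking for unknown tools are all assumptions.

```python
from enum import Enum


class Permission(Enum):
    ALWAYS_ALLOW = "always_allow"
    ALWAYS_ASK = "always_ask"
    ALWAYS_DENY = "always_deny"


def authorize(policy, tool, human_approves=None):
    """Return True if the agent may run the tool under the given policy.

    Unknown tools fall back to ALWAYS_ASK (an assumption made for this sketch).
    """
    permission = policy.get(tool, Permission.ALWAYS_ASK)
    if permission is Permission.ALWAYS_ALLOW:
        return True
    if permission is Permission.ALWAYS_DENY:
        return False
    # ALWAYS_ASK: proceed only when a human callback explicitly approves.
    return bool(human_approves is not None and human_approves(tool))


# Hypothetical policy mirroring the demo: start permissive, then tighten.
policy = {"dismiss_vulnerability": Permission.ALWAYS_ALLOW}
print(authorize(policy, "dismiss_vulnerability"))
policy["dismiss_vulnerability"] = Permission.ALWAYS_ASK
print(authorize(policy, "dismiss_vulnerability"))
```

With the policy switched to "always ask", the same call succeeds only when an approval callback is supplied and returns true, which matches the behavior shown in the demo: the agent still prepares the action, but a human signs off.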
Michael Omokoh
AttendeesThat's awesome. And let's see what we just did. We caught the violation, fixed the policy and proved it worked, all within one platform, GitLab. This is exactly what the EU AI Act is asking for. And auditors want this today.
Alan Paruszewski
AttendeesYes. And that is just the start. The security policy store that we're working on will also include more capabilities, like allowing you to decide about triggers, rules and actions that should be taken based on the situations that are happening in your code, either related to AI, security or compliance. So we fixed the coverage in seconds, then we had remediations in minutes, and compliance is already built into GitLab.
Michael Omokoh
AttendeesAwesome. Speed without control is more risk. GitLab gives you both, native security with governed agents. And let's recap a little bit. First, we brought agents to security to improve security coverage, detect false positives and accelerate resolutions. Part 1, done. Part 2, we brought governance to agents. We both saw what agents can do and the risks of doing that. I want you to be able to control what those agents can do in the future. Over to you, Manav.
Manav Khurana
ExecutivesGreat. Nice job, team. Very good.
Alan Paruszewski
AttendeesThank you.
Manav Khurana
ExecutivesCan't wait for you all to try the new security for -- new agents for security and the new governance for agents. All right. Let's move on to the last part of the Agentic infrastructure, which is the orchestration system. In January this year, we introduced in general availability Duo agent platform that Bill had referenced earlier. Duo agent platform brings agents, specialized agents and Agentic workflows for you and your team at every stage of the software life cycle, so you and your team can be a lot more productive. Since GA in January, we've been busy making Duo agent platform even better. Now when you go log into GitLab, you'll see several specialist agents available for you that are task-tuned to handle specific goals for you and your team right out of the box, whether that is helping you plan what you want to work on next or fix security issues like what Alan and Michael had just shown and many others. You'll also see built-in agentic workflows that automate complex tasks by chaining agents together in a predetermined workflow that we know works. For example, you can now with one invocation, with one click or one CLI command, go from an issue to working software and agents take care of everything else in the middle and many other agentic flows that are now available out of the box. Also recently, we have introduced several agentic triggers. So you and your team can automatically invoke agents when new code is introduced or when things happen in your environment. And we've also introduced several manual triggers across different surfaces that you and your team work in, not only in GitLab, but in your IDE, in your CLI and many other places that your team works. So let's see the power of Duo agent platform. And please welcome Shekhar, our distinguished engineer. Hey, Shekhar. How are you?
Shekhar Patnaik
ExecutivesIt's great to be here. So Manav spoke about the platform. What I want to talk to you about today is how our customer and our internal developers are using the platforms. So I typically start my day by looking at my backlog. So I've a little work item assigned to me in the backlog, which talks about adding product search functionality to the home page. Now I can use the UI to do this, but I prefer the surfaces that I use. So I like to use my IDE, I like to use my CLI.
Manav Khurana
ExecutivesShekhar, maybe we can get the demo up on the screen first. There we go.
Shekhar Patnaik
ExecutivesOh, there we go.
Manav Khurana
ExecutivesYes, yes.
Shekhar Patnaik
ExecutivesSo what I'm going to do is I'm going to quickly copy this issue, and this issue is well written, by the way. It's been written by Duo planner. So it's got all the materials I need to actually start iterating on this. So I'm going to go ahead and copy this, switch to my terminal. And then in the terminal, I'm going to go invoke the new Duo CLI.
Manav Khurana
ExecutivesThe new Duo CLI now available.
Shekhar Patnaik
ExecutivesAnd all I want to do is go ahead and ask it to implement this issue. So it's going to go ahead and -- because it has organizational context, it has got -- it's got all the rules that we need, it's going to go ahead and actually start implementing this issue. But in the interest of time, it's going to take a bit of time. I'm going to go ahead and switch back and show you an issue that I actually implemented this morning. So this is an issue I had asked Duo CLI to implement. So what it did, it went ahead and created an MR. And it's exactly what we expect. It made a number of commits. It went ahead and made the changes that I needed, right? It went ahead and tested the MR as well to make sure it worked. But what's really interesting is that as soon as the MR was created, automatically, Duo Code review, went ahead and actually started reviewing this MR. So it went and it went ahead and made several recommendations. So it said, hey, your styles are not in order. You can make a few changes as far as hard coding is concerned. You can go ahead and actually make some security changes as well. And it's doing this because it understand my organizational context. It's doing this because it saw the review instructions that we've given it. And we can have these review instructions at various levels. We can have it as a project level, we can have it at the group level so that's applicable to all your projects within that group. And that's really powerful. And that's really powerful because you can provide exactly what you want from an organizational perspective. You can do things like provide it exactly with CSS refactoring rules you want. You can provide the style of code you want, you provide all the security things that are important to you, and it will do that. And these review instructions can be applicable to certain files. And we've been working hard on making the code review agent as good as it can be. 
So we've been working hard on trying to make this code review agent as good as it can be, and we've been making constant improvements. And now with our own benchmark...
Manav Khurana
ExecutivesYou may have to go back to your podium, Shekhar.
Shekhar Patnaik
ExecutivesSo in our own benchmark as well as third-party benchmarks, we have made tremendous improvements. So now we are top 5 in the Martian code bench, and we're extremely proud of that. So now I'm going to switch back. And I could have gone ahead and actually looked at these changes. But what I'm going to do instead is I am going to ask Duo developer to implement these changes. So I don't need to do anything. I just go ahead and ask Duo developer, can you please make these changes based on these recommendations, which are great. And Duo developer actually went ahead and did this. So it went and implemented the security fixes. It went ahead and implemented all the style changes that I needed it to. It went ahead and did all these things for me. And that's really the power of automation. And we've been doing a lot in terms of automation. So as Manav mentioned, we now have triggers. The triggers, based on different GitLab events, can automatically invoke the agents that you have in your catalog. And the triggers are really useful. So for example, when a review is mentioned, it can go ahead and invoke an agent. When there's a merge conflict, it can go ahead and invoke an agent to actually fix the merge conflict. And my favorite trigger is the pipeline events trigger. And the pipeline event trigger, what it does, it goes ahead and fixes the pipeline. Now as a dev, it's been a constant source of annoyance that whenever a pipeline fails, I have to switch context. I have to break out of my flow. I have to go and look at what the pipeline is doing. I need to go and push a new change, figure out the logs, all of that. And that takes me away from the flow. It stops me from what I'm doing right now, and I need to then switch context, right? So the pipeline event trigger is something we've rolled out internally and our teams love it, right? It's automatically going in. And every time there's a failed pipeline, it goes and fixes it automatically. 
So this is an example of a real project. So this is our CLI project. And the pipeline fix trigger here is going ahead and running an agent, which goes and says, hey, there's a race condition here. I'm going to go and fix that race condition. Or in this case, it looks at it and realizes it's a flaky test and it knows that because of the organization context and says, this is a flaky test, I'm just going to restart the pipeline to fix this. And that's really powerful. So to recap, I could have done all of these things, right? I know how to fix code. I know how to fix the pipeline, look at the logs, all of that. But every time I do so, it takes me away from the work that I want to do. And this is where the Duo Agent Platform seamlessly fits in and slots in. It is able to fit into your existing workflow and help you automate the parts that you are interested in automating.
Manav Khurana
ExecutivesAnd Shekhar, it's not just you and your flow because it's very common that if I'm writing code, I'll ask somebody else to review it. So I'm taking them out of their flow. But if I'm running into a pipeline problem, it's common for me to go ask the DevOps or a lead engineer to help me fix that pipeline. I'm taking them out of their flow as well, right? So this is all about giving you the right productivity for you and your teams to do what they do best.
Shekhar Patnaik
ExecutivesExactly. And there's a cascading effect. And so Duo Agent Platform lets you do -- automate the way you want it to, and it gives you speed with control. Back to you.
Manav Khurana
ExecutivesAmazing, nice work Shekhar.
Shekhar Patnaik
ExecutivesThank you.
Manav Khurana
ExecutivesWhat Shekhar shared there was really about helping you and your teams be a lot more productive. We've been looking at how our early customers over the last several months have been using Duo Agent Platform. And here are the top 5 use cases that we see across our entire customer base with Duo Agent Platform. And some of these ROI numbers are staggering. For example, with code review, we've seen our customers on a per person basis, save a minimum of 20 minutes because agents are doing the code review for them as opposed to they themselves doing that code review. And when you take that and the labor cost of doing that code review and the fact that a code review only costs $0.25 per run, that's 100x ROI. The rest of the ROI numbers are calculated similarly, and I can't wait if you haven't used Duo Agent Platform already, check it out, see how these ROI numbers play out for you in your particular environment. All right. Let's move on to something else. Now this is not a technical challenge. I want to share with you a new commercial challenge that's showing up in the agentic era. You see the way you buy software, any software and GitLab for that matter as well, you buy software through fixed contracts. But in the agentic era, what you need is constantly changing. And your fixed contracts force you to define what you need for the next year, in some cases, multiple years up ahead. Within the agentic era, I've heard many of you say, hey, I may need more people in my company access GitLab because I want to give product managers and designers access to GitLab so they can contribute and code and help with the various projects that we are doing. I've heard many of you say that you don't know how much AI usage you will have a few months from now or a couple of years from now because AI technology is evolving. How your teams use AI is changing, how each person is enabled to use AI is changing. 
It's really hard to predict how much AI usage and therefore, how many credits you need. With all of the new innovation that we just introduced today and many more coming in the months, they're also going to be built on a credit basis. And it's going to be really hard for you to predict how much you should budget for the credits you need for those new capabilities. The net is that the fixed contract model and the agentic era were not built for each other. That's why today, we're introducing GitLab Flex. It is a new buying program that allows you to commit once, just like you do today and then decide how you use the dollars you spend on GitLab on which product and how much of which product at any time. You can make that change as your needs change. To show you how Flex works, please welcome the Flex team, Courtney and Jerome, to give you a quick demo.
Jerome Z. Ng
ExecutivesHey, everyone. I'm Jerome, Director of Engineering.
Courtney Meddaugh
ExecutivesAnd I'm Courtney, Group Product Manager.
Jerome Z. Ng
ExecutivesSo Courtney, for this demo, how about I be ACME Inc.'s billing account manager? That way, you can do most of the talking, and I'll just click the buttons.
Courtney Meddaugh
ExecutivesThat sounds good. But hey, don't undersell it. You did build most of those buttons.
Jerome Z. Ng
ExecutivesSo at ACME, we are on a Flex contract. We've signed a $1.2 million annual commitment, and we're currently 6 months in. Let's take a look at our setup. We currently have 500 ultimate seats along with 50,000 Duo Agent Platform credits, both of which are reserved.
Courtney Meddaugh
ExecutivesThat part is part of what makes Flex really powerful. For the products where ACME has a good sense of what they'll need, they can reserve spend upfront and lock in a volume discount. How does that sound?
Jerome Z. Ng
ExecutivesThat sounds great. I love saving money.
Courtney Meddaugh
ExecutivesFor the products where maybe you're not quite as sure about your spend, you can enable per use and pay as you go, drawing from the same pre-committed pool. So Flex offers you both, discounted economics on what's predictable and the flexibility to spin up new things as your needs evolve. Speaking of which, how are ACME's needs evolving, Jerome?
Jerome Z. Ng
ExecutivesLet's take a look at the seat side first. So for seats, we have 500 ultimate seats reserved, and we're using pretty much that amount. The credit side tells a slightly different story. For Duo Agent Platform, we've reserved 50,000 credits, but we've already exceeded our allocation. Teams are really leaning into Duo Agent Platform and AI usage is growing quickly.
Courtney Meddaugh
ExecutivesHey, that's a great problem to have. And that's the kind of split that a lot of customers find themselves in mid-contract. The shape of what you committed to in January is not necessarily how you're pacing come June. Let's take another example. A contracting team rolls off a project and suddenly, the 50 seats they were using last month are no longer needed next month. So now finance is asking, we have all of this budget locked up in unused seats, but we're getting so many requests for additional AI spend. What do we do? Under a normal contract, nothing. You would wait until renewal when you could readjust. But since ACME is on Flex, maybe we can see how that would work.
Jerome Z. Ng
ExecutivesYes. So on Flex, I can change our upcoming months reservations. And here, you can see I've got 500 seats right now. Let's bump that down to, say, 450 to account for the 50 contractors that are rolling off. For the Duo Agent Platform credits, let's increase this from 50,000 up to, say, 60,000 just to account for the increased AI usage. You can see that the overall commitment has stayed the same, but it's been reshaped to match our needs.
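The reshaping shown here works because the reserved dollars stay constant while the mix changes. The Python sketch below checks that arithmetic under loudly hypothetical unit prices. The demo does not disclose a rate card, so `SEAT_PRICE_CENTS` and `CREDIT_PRICE_CENTS` are invented values chosen so that 50 seats cost the same as 10,000 credits.

```python
# Hypothetical monthly unit prices in cents (NOT from the talk or any rate card).
SEAT_PRICE_CENTS = 10_000   # assumed: $100 per seat per month
CREDIT_PRICE_CENTS = 50     # assumed: $0.50 per credit


def reserved_spend_cents(seats, credits):
    """Monthly reserved spend for a given mix of seats and credits."""
    return seats * SEAT_PRICE_CENTS + credits * CREDIT_PRICE_CENTS


before = reserved_spend_cents(seats=500, credits=50_000)
after = reserved_spend_cents(seats=450, credits=60_000)

# The reallocation is commitment-neutral only if both mixes cost the same.
print(before == after, before / 100)
```

Under different real prices the same trade would not necessarily balance exactly, so any difference would presumably come out of, or return to, the remaining committed pool.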
Courtney Meddaugh
ExecutivesWow, that seems really simple. But I have to ask, what about budgeting guardrails? I know that's top of mind for a lot of customers.
Jerome Z. Ng
ExecutivesSo our usage caps actually live here as well. So I can set a usage cap of, say, 70,000 credits. This is slightly above our reservation amount, but it puts a ceiling in case usage spikes.
Courtney Meddaugh
ExecutivesOkay. Great. So in finance, you can get predictability at the contract level. And in 18.11, GitLab added per-user credit controls, which means that GitLab admins can allocate additional credits to power users while making sure no one user blows through your entire AI budget.
Jerome Z. Ng
ExecutivesAnd as you all saw today, we've launched GitLab Orbit. So let's give that a try as well. So I'll allocate some credits there, say 5,000 credits.
Courtney Meddaugh
ExecutivesJerome, 5,000 credits? Did you not see the Orbit demo? Let's bump it up a bit.
Jerome Z. Ng
ExecutivesOkay. Okay. Let's do 10,000 credits.
Courtney Meddaugh
ExecutivesBetter.
Jerome Z. Ng
ExecutivesSo with Flex, Orbit lives on the same rate card. I can just lock it in and the commercial side is all handled.
Courtney Meddaugh
ExecutivesSo ACME's commitment didn't have to change. Their overall contract didn't have to change. But what they're getting from GitLab evolves real time with their needs, without Jerome having to go through another procurement cycle to get there.
Jerome Z. Ng
ExecutivesSo that's Flex, the buying program that evolves with your needs.
Courtney Meddaugh
ExecutivesWith Flex, you commit once and then adjust as your year unfolds. You get volume discounts on what you know and flexibility on what you don't. And whether you're running GitLab in our multi-tenant cloud, in a self-managed instance or in a dedicated tenant, GitLab Flex is available today. Customers can now request orders and your sales rep is eager to get you on board.
Manav Khurana
ExecutivesAmazing. Nice work. Good job. All right. So as Courtney mentioned, you can use Flex today, whether you are a new customer, you have an existing contract or have an upcoming renewal. If you go to that URL, you can request a quote and move your contract to Flex today and take advantage of everything that we talked about. All right. So let's recap what we saw today. GitLab, the DevSecOps platform you know is now the agentic infrastructure. The motor system got a lot better with the next generation of Git built for machine scale. The nervous system now has Orbit, so your agents work better, faster, cheaper, but more importantly, you can answer questions that you never could before. The immune system now brings agents to security and governance to agents, so you can stay compliant. You now have Duo Agent Platform that has gotten a lot better with new agents, new agentic flows and new triggers available to you. And then finally, you have Flex where you can commit once and shape your GitLab usage as things change for you. That is the new GitLab. Thank you. And now to show you how all this innovation turns into value for you, our customers, please welcome our Chief Customer Officer, Sherrod Patching.
Sherrod Patching
ExecutivesThank you. Well, we've just shown you some of the latest innovations from agentic infrastructure in action. Agent actions that are able to go through next-generation source code management with rich context, all with the level of visibility and governance that you need. I'd like to tell you a little bit more about a study that was led recently by Forrester, the Total Economic Impact study on the Duo Agent Platform. We're revealing these results today, and I am thrilled to tell you about what we found. 40% faster time to remediation. You saw some of what we talked about today, being able to bring context and potential remediation into developer flow and a 40% faster time on average. 80% faster time for developer onboarding. I know many of you in this room look at time to first commit as one of your key metrics. And whether it's a new developer coming on to your team or whether changing applications, being able to find context there within flow, we saw an average of 80% faster time. And overall, we saw a 400% return on investment for these customers. They were expecting somewhere between 20% to 40%. But as a result of the agent platform, we're excited to tell you that they saw 400%. So joining me on stage today, I'd like to tell you a little bit more about a customer story, showing you this in action. I'd like to welcome on to stage Mercedes-Benz. Mercedes is one of our customers that was able to actually transform how they think about the software development life cycle using GitLab. And they were able to see across thousands of developers the ability to actually go from what was the previous implementation all the way through to the net new generation using software development life cycle in a highly regulated company with GitLab. So to tell you more about this, I'd like to welcome on to stage Bastian from Mercedes-Benz. Welcome, Bastian. Thank you for coming. All right. Have a seat here.
Bastian Stahmer
AttendeesThanks. Thanks for having me.
Sherrod Patching
ExecutivesAll right. So I have a few questions for you. So you recently launched the CLA, C-Class and GLC on MB.OS, Mercedes' in-house vehicle operating system. What does that innovation approach look like? And how is GitLab helping your 20,000-plus engineers on that journey?
Bastian Stahmer
AttendeesWell, first of all, you're right, we launched the CLA last year, followed by the C-Class and the GLC and all run our latest version of MB.OS, our in-house developed operating system. Not so long ago, we had several dozens of suppliers who equipped us with control units and the software and integrating all of them was quite difficult, but we moved towards fewer control units with more capable ECUs where we develop a bigger part of the software. So we control the crucial parts when it comes to autonomous driving, infotainment, powertrains, et cetera. But of course, this comes with some challenges. So different to maybe web or app development, embedded software development is kind of quirky. So we have to deal with embedded tool chains, some tools when it comes to SAST and DAST, which were never meant to run in the CI, but rather on Windows. But luckily with tools like Fleeting Runners, we can run also those tools at scale, running several million jobs regularly, moving several petabytes of data of artifacts on the platform. Well, we run the platform in different flavors. Where we have a greater degree of flexibility, when it comes to our back end and web services for the connected vehicle, we can use GitLab Dedicated to use all the benefits of the SaaS product. But where privacy is key and we want the highest level of control of our data, also being present in different regions of the world where required, we go with GitLab self-hosted to have this greater degree of control. But when we started in-house software development, it wasn't always like that. When we started, we had different instances of Jenkins, Bamboo, Bitbucket, Azure DevOps, et cetera. And over the last years, we brought this all together to GitLab because GitLab gives us this unified user experience. We can govern in one place, can share best practices across all those domains, and we're pretty happy now having more than 20,000 users on our platforms.
Sherrod Patching
ExecutivesAwesome. I love that. And the different flexibility that you're able to bring in that consolidation over time. I think I've been there since the beginning of the journey with you. It's been fun to see. Okay. Like every technology leader here, you were navigating AI for software engineering. But with automotive software, there comes a level of safety, accountability and review burdens that most application teams don't ever face. So how are you approaching the use of agentic AI for your software delivery?
Bastian Stahmer
AttendeesWell, we are doing a lot of different approaches. I think the key is an agent can only be as good as the context and semantics, which are fed to them. And therefore, I'm also super excited about what we saw already. Context and semantics in our domain besides the source code, of course, itself means the functional requirements, but also the nonfunctional ones, safety constraints, architecture patterns, et cetera. So key is that we move this data out of proprietary tool silos we maybe had in the past, but make it accessible within Git or in graphs so that the agents can operate on that to have the right input. But at the same time, also the validation of what we do in CI becomes even more crucial that we tighten the loop where we see what the agents did, but also agents ideally can correct and that's also what we saw just a couple of minutes ago. And what we really also like is that with GitLab, we have this flexibility. We can use the GitLab Duo feature set, but we can also connect other harnesses like Claude Code or so because, I mean, it is still a young thing. We're exploring a lot of options and having this flexibility without overcommitting into one locked ecosystem is a great advantage. For Duo itself, I can only echo what we also just saw: chat and code reviews are amongst the most loved features on our end.
Sherrod Patching
ExecutivesFantastic. Thank you. And I know we've been talking a little bit before this also on just the context that Orbit will bring and the ability to be able to make the decisions within the MR and the time savings you'll see there.
Bastian Stahmer
AttendeesExactly.
Sherrod Patching
ExecutivesGreat. Okay. You have an internal framework on AI native engineering. As we launch Orbit public beta today, how does the richer SDLC context fit into that picture?
Bastian Stahmer
AttendeesWell, I think it fits perfectly. I mean, with AI, we can get at such a high pace. But also as we heard already, we need speed with control. And we have automotive regulatory standards like ASPICE which require traceability and human accountability. And accountability is especially important for us. I mean, we put people in our cars who trust their life to our cars. So we, as Mercedes, have to stay accountable that the software is safe and sound. And I think mastering this human in the loop approach, letting agents run freely where they can. But then again, having the humans in the loop, reviewing the results, approving, giving consent to what the agent did is right, I think this is key. And there, we see GitLab is very well set up as this control plane where we integrate the software and take that accountability.
Sherrod Patching
ExecutivesAwesome. Thank you. All right. Last but not least, a favorite question. How are you thinking about measuring whether or not AI is actually improving productivity?
Bastian Stahmer
AttendeesYes. Well, I think, first of all, all the metrics we did in the past in measuring productivity like DORA metrics, et cetera, are more valid than ever because ultimately, AI is a means to an end in becoming more productive. But also at the same time, we have to justify the spendings, of course. And I think everybody who uses it knows there are good and bad patterns, how to use AI. So what we would like to see is a good integration of the DORA metrics or productivity metrics with AI consumption and insights how our users are using that, seeing best practices, where are we efficient? And I think you have also a nice guest coming up who will dive into that. Looking forward.
Sherrod Patching
ExecutivesYes, yes. We'll show him in a moment, too. We're excited to have Gene here. Great. Well, Bastian, that is the last of my questions. Thank you so much for joining us today. We're thrilled to have Bastian on stage.
Bastian Stahmer
AttendeesThank you.
Sherrod Patching
ExecutivesAll right. So as you can tell, GitLab has the potential to transform not just software development, but also to what you see on the road and the experience that you have. I love the fact that Mercedes is just a fantastic example of speed with control. There's one more element as you think about what needs to really happen for rich software development life cycle to be true for customers like Mercedes and others. And that is the -- is our ecosystem partners and their presence in bringing all of our customers from agentic -- essentially agentic testing all the way through to agentic engineering. So one of those key partners for us is Google Cloud. And here to tell you more, I'd like to welcome back on the stage, Bill and Daniel Rood from Google Cloud.
William Staples
ExecutivesAll right. Daniel, thank you so much for joining us today. Google and GitLab have been partnering for years.
Daniel Rood
AttendeesWe have.
William Staples
ExecutivesIn fact, it's, I think, maybe the best kept secret. I don't know if people realize, but thousands of customers benefit from the partnership every day because gitlab.com runs on Google.
Daniel Rood
AttendeesIt does.
William Staples
ExecutivesAnd I know many of our strategic customers also choose Google as their infrastructure -- cloud infrastructure provider for their self-managed instances. So tell us a little bit more about what new options we're introducing today.
Daniel Rood
AttendeesWell, I'm really excited to announce that for providers, GitLab certified managed providers, there's now an option to deploy GitLab on Google Cloud with sovereign deployment options in EMEA. And I think this is really important, especially for regulated organizations as if you think about those workloads that need to be compliant within certain regions or against certain regulations, we now offer you the controls in order to do so and for you to be compliant with your auditors. So I think that is really big news for today.
William Staples
ExecutivesCustomers are going to be really excited about that new option, get out of the toil of managing your own instance on your own infrastructure and take advantage of Google's managed service providers, amazing. Now you're not only a cloud infrastructure provider, you're also a model provider, and we've been proud to offer both Gemma and Gemini model support inside Duo Agent Platform. What else do you have for us on that front?
Daniel Rood
AttendeesYes. So maybe as an introduction, so if you think about the Google AI family, there's a number of models that we offer for your customers. We have Gemini Flash, which is our workhorse model for your thousands of times a day workflows, and it's cost efficient, it's token efficient, and it's still a state-of-the-art model. Then we have Gemini Pro. Pro is really a powerful model for your most complex workloads. And then equally exciting is our open-weight model, Gemma 4. And Gemma is an excellent model for those who want to run these capable models on the edge, maybe even on device or in air-gapped solutions. Now all of these models come together in Gemini Enterprise Agent Platform, which is tightly integrated with Duo Agent Platform. And so we are offering that today. Now the news for today is that we announced just a few weeks ago, Gemini 3.5 Flash, which is now also available in Duo Agent Platform.
William Staples
ExecutivesAwesome. And I know Gemma 4 as well. Gemma 4 for our self-hosted customers in air-gapped environments, because we have many regulated customers who are required to be air-gapped, is going to be a really powerful option as well. So thank you. Speaking of the Gemini Enterprise Agent Platform, we were talking earlier about cost and how everyone is talking about the cost of AI and the ROI and being able to understand where the cost is going. Duo Agent Platform provides the visibility into the cost of agents running within GitLab and the use cases that are under action. What does Google provide on that front?
Daniel Rood
AttendeesYes. So if you think about the partnership, there's probably a couple of elements there. So first of all, we talked about cost-efficient and token-efficient models. So like a Flash or Gemma 4, they will help you make the right decisions for your AI workload. So I think that's an important one. For those who have a commitment with Google Cloud, through the Google Cloud Marketplace, you're now able to draw down against that commitment with GitLab. So I think that is really important because what that also does is it gives you the flexibility of not entering into new budget cycles. It gives you one view of your cost all the way from GitLab platform to inference and infrastructure with one bill, the other bill and also one view of everything you're doing. And what is really important there is a lot of the tech leaders here in the room and online as well as I'm sure your CFOs, they all are interested to understand how much are we actually spending on AI this quarter or even how much value are we getting out of our AI usage. Those are all questions that we can now start to answer with that setup. So I think that is really important. And so if you think about as a customer of GitLab, this go-to-market integration between the Duo Agent Platform and Gemini Enterprise Agent Platform as part of the marketplace, you now can manage your AI workloads end-to-end.
William Staples
ExecutivesAmazing. So really, we've talked about 3 things today: GitLab as a managed service on Google Cloud, now available through managed service partners, Gemini 3.5 and Gemma 4 in GitLab Duo Agent Platform available today and use your Google commitments to buy GitLab licenses and credits. An amazing partnership. It continues to get better and better every single year.
Daniel Rood
AttendeesWe're excited.
William Staples
ExecutivesThank you so much, Daniel.
Daniel Rood
AttendeesThank you very much, Bill. And definitely also really excited about all the things that are coming up in our road map in the coming weeks and months.
William Staples
ExecutivesLook forward to it. Thanks again, Daniel. All right. The agentic engineering is here. Hopefully, you're starting to see how we're bringing agentic infrastructure together with coding agents to deliver speed with control. You've heard amazing customer stories from Mercedes already and more to come about the proven ROI of what we do, both in GitLab and now with agents and Duo Agent Platform and amazing partnerships like with Google and earlier Anthropic. It's an incredible time to be part of GitLab. I hope you're starting to see our new mission in action, which is to unlock every team to ship trusted software at the speed of imagination. I think the most exciting part of Transcend though is often not all of the technology, the amazing demos. But what I hear most often from customers is they love hearing from our partners and our customers about how they use GitLab and the benefits they're seeing. And so it's my pleasure to welcome back on stage, Sherrod and our panelists... To continue.
Sherrod Patching
ExecutivesAll right. Well, welcome. Thank you for coming. So I'd like to introduce you to our panelists. This is Ryan Harvey, he's Head of AI Engineering at Compare the Market. Matteo Figus, he's AI Engineering Manager at AWS. Mans Booijink, Operations Manager at Cube, I think I got it right.
Mans Booijink
AttendeesYes, you did.
Sherrod Patching
ExecutivesAnd Gene Kim, researcher and author of The Phoenix Project, DevOps Handbook, and most recently co-author of Vibe Coding with Steve Yegge. Welcome. Thank you all for coming. Okay. We're going to start with one, and we're going to have each of you answer this. We'll just go in order, and then we'll go from there, I think. In fact, no, Gene, I'll start with you. So to start. When you think about software innovation in your organization or in those you advise, what has changed the most in the last 12 to 18 months because of AI?
Gene Kim
AttendeesOh my gosh. What hasn't changed? I mean this morning is probably a proof of that. I mean, so having studied high-performing technology organizations for 27 years, I've had a lot of fun in my career, but I think like so many of you, I've never had as much fun as I'm having right now. And it's so strange where we're entering this era, where coding for planning purposes is becoming free and instantaneous. Well, I don't know about free, but I mean, it's certainly pretty close to instantaneous. And that means like every process we've created is like now wildly insufficient, budgeting, procurement, prioritization, getting access to customers. And so -- we heard from Nick from Anthropic mentioning how their teams are generating 8x more output than a year ago. But some of you might say it's just the frontier AI labs, but then we heard from the -- Angelo from the Orbit team saying that in 1 month -- I talked to him this morning, they made 3,000 merge requests. That's like tens of thousands of commits in a month. And some of you will be excited by that, some of you will be scared by it, and some of you will just say, "Oh, that's slop." But I think as leaders, we all have to get ready for this era where that's going to be, I think, increasingly commonplace.
Sherrod Patching
ExecutivesYes, I agree. Thank you. Matteo, maybe we'll have you go next. What are you broadly seeing with AWS?
Matteo Figus
AttendeesSure. So for me, when I speak with enterprises, I've seen that in the last couple of years, the economics of coding kind of flipped. It used to take maybe 1 year -- 1 day to kind of build the feature and maybe a few hours of code reviews to actually get it to a mergeable state. Today, we see that with AI, we can actually have code produced in minutes, but maybe still needing hours to actually steer it back to actually get to a mergeable state. What I see a lot changing in the last year or so in the enterprise, developers are learning that context engineering, elaborating their intents better and using AI to prioritize helps getting a result faster, that actually looks closer to our intended outcome. And this is basically, for me, a signal that models are rewarding discipline rather than speed.
Sherrod Patching
ExecutivesFantastic. Thank you. And Ryan, what about Compare the Market?
Ryan Harvey
AttendeesSo I think for me, the most obvious change, especially for a group of engineers, is where people are spending their time. So you all have seen, you're able to produce and write code significantly faster than we were a couple of years ago. And that has kind of forced people's time to move either side of the code writing process in the delivery pipeline. So people are spending more of their time writing and refining specifications and how work will be conducted, and then also reviewing the output, right? So this is kind of interesting because the bit that we really like about engineering is the code writing in part, right, but it's kind of forced us to go either side of that. And so for us, that's looking at how we address the sort of changing nature of the role and the impact that, that has on a whole bunch of people on what they find valuable in what they're doing.
Sherrod Patching
ExecutivesYes, of course. Thank you. And Mans, what about you at Cube?
Mans Booijink
AttendeesYes, from my perspective, what I see, it's not about what is changing, but how fast this is changing all right now. So like 1 to 2 years ago, we were talking about how are we going to implement code suggestion kind of tools into our software development life cycle. And right now, we are running multiple agents in every stage of our development life cycle within GitLab, also within Claude. So I think it's not only a developer tool or tools anymore, but it's more like a full organizational shift where we are in right now, where we see that the way we think about even building software is completely changing.
Sherrod Patching
ExecutivesYes, completely. I think we have a case study that came out just today with you.
Mans Booijink
AttendeesYes, yes.
Sherrod Patching
ExecutivesSome exciting stories there. Ryan, next question is for you. Compare the Market has been doing some of the most concrete work we've seen on how context changes AI outcomes and software engineering. So tell us more about what problems you look to solve and what you learned in the process.
Ryan Harvey
AttendeesOkay. Cool. So we saw the same as everyone else, that the volume of code that we were outputting was growing significantly. And that, again, forces the sort of impact on people's time into the code review process. So we have an agent that does code review on every single change we have, and that's in our GitLab pipelines. And one of the things for us that was -- we were really interested in is how can we arm the person who's still doing the code review with as much context and meaningful feedback on the code change that's being suggested. So the agent was performing these code reviews, and we wanted to make sure it had as deep context and meaningful impact as possible. And so the team -- by the way, a couple of them are in the audience. So you can have a chat with [ Marina ] and [indiscernible], they're awesome. We wanted to have a look at how do we make that -- the code review that the human gets from the agent as meaningful as possible. And so we did a study within the team on how we could do that. And we took a bunch of different approaches. The first was using Knowledge Graph in Orbit with an agent, and the other one was using an agent with a RAG tool and the other one was using an agent just on its own, no tools. And what was really interesting for us was, firstly, using Orbit, using Knowledge Graph, significantly outperformed anything else. So kudos to you, guys. But we get -- yes, the -- sort of evaluations we've done showed that comment accuracy was 21% higher than any other option. Most interestingly for us was that using an agent on its own with no tools outperformed an agent with RAG. And sort of digging into this was quite interesting because we found that using a RAG tool with the agent, we were dragging in semantically similar code into the context window, but that was causing a little bit of confusion within the agents. So that's why we see that underperformance for RAG. But for us, it's been quite a significant unlock.
We're getting through, let's say, 1,000 MRs per week. And if you're shaving an hour off of review time because you're providing significant context to the human, that adds up over a [ quarter ].
Sherrod Patching
ExecutivesYes, fantastic. And then maybe just as a quick follow-up. So speed with control, we've been talking about the speed of government -- governance. How do you think about that one?
Ryan Harvey
AttendeesYes, that is interesting. So we were actually talking about this last night, too. The position for us is quite interesting that coming from a sort of regulated government industry, that we find ourselves not in the position where we're having to kind of adapt to AI and tag risk and governance on to the side because we've kind of operated in that space, we find ourselves in a fairly advantageous position that the guardrails and sort of risk and compliance checks that we perform as part of normal pre-agentic software delivery are kind of already there. Sure they're changing, sure they'll adapt. And the risks that we addressed, there will be new ones that we haven't seen before. But the way we think about software delivery is that those things are already part of what we do, predeployment going through our pipelines. So yes, things are changing. The nature of how we do deployments and how we think about delivery will change, but we're in a fairly good position where we can go fast because we already have those guardrails in place.
Sherrod Patching
ExecutivesThat's great. And the discipline is there.
Ryan Harvey
AttendeesYes.
Sherrod Patching
ExecutivesFantastic. Gene, this one's for you. So we talked about some of the study that Compare the Market did, and the surprising result around agents with no context being better performing than that with RAG. So maybe, as you think about -- someone who has studied -- as yourself who has studied the system problems for decades, what do these findings tell you overall?
Gene Kim
AttendeesOh my gosh. I mean, I think one of the things -- I mean I love the work that Ryan and team did at Compare the Market. And I think one is like just a fantastic primary research about what makes these tools more productive. Secondly, is we're all learning together, and it's an incredible opportunity in an era where nobody knows where -- no one knows what the new practice will actually look like. Here's an opportunity to actually define those patterns, and I think Ryan and team absolutely did that. And the fact that they did that in a regulatory -- regulated environment, I mean, I think it's fantastic. And reminds me about what happened with DevOps in 2010, where organizations, especially in the regulated industries, they were just so scared to even say the word CI/CD or DevOps because they were afraid that the regulators would crack down on them. And so as we got the case studies like Capital One, one of the largest card issuers in the United States, I mean, we've sort of normalized that and say it's actually better if you're faster and more secure, more in control. So I'm eager for when we get those case studies down, where we can actually make it safe for organizations to say what they're really doing. And I ran a conference called the Enterprise AI Summit in April where we had Block -- I'm [indiscernible] Block, Netflix, healthcare -- Skypoint healthcare, where companies in regulated spaces are actually sharing that they're working on regulated mission-critical code, right, using AI. So that's an exciting time to be in the game.
Sherrod Patching
ExecutivesIt very much is. And maybe one more for you. So you've written extensively about how organizations create flow. How does context fit into that picture?
Gene Kim
AttendeesOh, I think it's everything. And I guess the thing that really amazed and amused me this morning is that -- like how intolerable it is when something takes 2 minutes. Like, oh my gosh, 2 minutes. That used to be considered fast. But like now, if you have to wait for -- to get information from a repo and it takes 2 minutes, it is intolerable. And so it's just an exciting time where -- like what does it take to get not just the right context, but get it quickly. It's just exhilarating.
Sherrod Patching
ExecutivesAbsolutely. Thank you. Mans, this one's for you. We've heard about context quality, and we've been talking about that now. You described a model where Claude handles generation and GitLab does the orchestration of everything around and across the software development life cycle. So maybe can you walk us through that use case and the impact that richer context had for you?
Mans Booijink
AttendeesYes, of course. So I think like context and quality of the context is the most important thing in our software development. At Cube, we have been running GitLab for over 8 years right now. So our full software development life cycle is managed within GitLab. So from like issue creation to the actual deployment, it's all within the GitLab environment. So when we started adopting AI over 2 to 3 years ago, it wasn't our question, where is our context at, but more, how are we going to implement AI agents within our existing GitLab environment. So we are doing that in 2 different flows right now. One of them is that we are using Claude Code as our daily coding agent for our developers, and we connect that through the MCP and API connection with GitLab. So yes, we keep in control of our software development life cycle. And when we are doing the actual building of the software, we pull the context from GitLab into our Claude coding agent. There, we are building the software, putting it back into our software development life cycle within GitLab, and that's how we are currently building our software with our teams. Besides that, we are also using the Duo Agent Platform, where we are building custom agents within GitLab. For example, when we want to have an agent which is gathering context before the actual development starts, we are implementing that in the first stage of our GitLab flow to get the context in our issue before it gets into our Claude Code development environment. So yes, what we see is GitLab's orchestrating everything for us, and we are looking in Claude Code now to do our actual development work. But for example, it can also be another coding agent in the near future so...
Sherrod Patching
ExecutivesAwesome. Thank you. I know a number of our customers are interested to hear more about how these work together. So thank you for sharing. Maybe one more for you. So as you're shipping faster, what does that mean for your business and for your customers?
Mans Booijink
AttendeesYes, what we see is that we can ship a lot faster, but also the quality is increasing. We are delivering more and higher-quality software. So -- and security is also getting better and better. So from there on, we can deliver faster, for example, prototypes, where we, earlier, needed for like months to weeks to develop the first prototype. We are now ready in days to weeks, so we can show them the value that we can deliver with our software. What comes with that is that we see a shift from like the hourly-based software development, where we are shifting to more like value-based software delivery, because it's not only about the hours anymore that the developer spends to develop the software, but it's also about the AI cost, the agents that you're running. So yes, we are figuring that out how we are going to make that shift as a company.
Sherrod Patching
ExecutivesGreat. Thank you.
Gene Kim
AttendeesIf I can just add one more thing about that. I mean what I'm really looking forward to, as the DORA metrics have come up, and that's something I worked on about a decade ago. And I'm really looking forward to the day when we have set -- a set of metrics that can actually share the -- like what you just talked about, like you can conjure up software from scratch in an hour. Right now just talking about merge request and pull -- and lead times just doesn't -- it's like such an incomplete expression of the magic that's happening right now. So we're not there yet, but I look forward to that happening soon.
Sherrod Patching
ExecutivesOh, yes, me too. Thank you. Matteo, okay, from the AWS side, how are enterprise customers defining business success with agentic software engineering? So maybe use cases and how these drive investments.
Matteo Figus
AttendeesSure. So what we see with our enterprise customers is a shift between -- from individual productivity to group productivity. I think, over the last couple of years, every developer started using these tools and recorded some productivity when coding, and everyone around them, other roles, tech and nontech roles, started doing the same. But we saw in the enterprise that in order to achieve greater outcomes, sometimes it is necessary to align tools and technology with people and processes and, in general, kind of rituals and ceremonies. With people and processes, I mostly mean the evolution of roles. We see in many enterprises, some of the responsibilities that used to define clear boundaries of role responsibilities kind of becoming a little bit blurrier. For example, engineers taking over some of the product management kind of duties because this is bringing more efficiency when interacting with AI, just as an example. And when we think about rituals and ceremonies, we see sometimes smaller teams working on shorter sprints in order to actually work a little bit more efficiently. So when I think about use cases, given this is common for all the enterprises, I would say one prominent one is brownfield modernization. In the enterprises, there is a lot of legacy, sometimes spanning multiple [ repos ] that have implicit dependencies and a lot of tribal knowledge. When AI can help us understand this code, we can actually immediately see some of the ROI because we can see maybe moving from releasing a change that used to take months, now maybe going live in weeks or days. I would say the second use case that is quite emerging, obviously, we spoke about it today, is infusing every step of the SDLC with AI. So not only coding but also code reviews, everything that is related to operations and orchestration of going live into production. I would say the third use case is also something that goes beyond developer.
And maybe it goes and touches in other phases of the SDLC that maybe relate to, for example, product managers or design or user experience experts. So for example, using AI to do data analysis, analyzing customer signals, maybe doing user testing with synthetic personas rather than real personas or complementary to real user testing, the prototyping even. So this is probably something that, again, is quite prominent in the enterprise because we see it is easier to connect that to some kind of ROI.
Sherrod Patching
ExecutivesFantastic. Thank you. All right. Gene, I'm conscious we have 3 minutes left. So I'm going to ask you another question. I'm going to do a quick whip around at the end. So in your research, what separates organizations that make AI work at a systems level from the ones that are stuck in just deploying tools?
Gene Kim
AttendeesOh my goodness. I'll actually quote someone who was part of the dev productivity teams at Amazon for the software builder experience. And there was a cross population study as they try to implement Andy Jassy's edict of like everyone has to use AI. And he said, when you studied like 15 teams about who really excelled, he says it was like really 3 things. It was understand -- AI fluency, how good are they at AI? How much have they practiced? Two is like do I understand where the bottlenecks are? And the third one I thought was really intriguing was like the quality of the leader, right, is a leader focusing on improvement, making time to get better at their craft? And that just really resonated with me. So I think that definitely distinguishes and resonates with me.
Sherrod Patching
ExecutivesAwesome. Thank you. All right. We have got to do a quick whip around with this final question, 30 seconds each. So software engineering is changing fast. So to close out, what is a wild prediction from each of you on what comes next? Ryan, I'll start with you.
Ryan Harvey
AttendeesThis might be unpopular. I think natural language is the only programming language you're going to need to know. We see this internally, teams who are spending their time refining specifications are significantly more productive than those who don't.
Sherrod Patching
ExecutivesOkay. Thank you, Matteo.
Matteo Figus
AttendeesSo everything is changing super fast. I would say my prediction is that this year, next year, every AI engineer will be raising agents and nurturing them. And my prediction is that this will require more technical skills rather than not.
Sherrod Patching
ExecutivesAwesome. Mans, what's it for you?
Mans Booijink
AttendeesYes. I think every company will have full agentic teams, but also agents which are managing their own budgets, hiring other agents when needed, scaling up when needed, scaling down if needed. And from there on, humans are only setting direction, giving the goal and the right context to get to that goal.
Sherrod Patching
ExecutivesGreat. Thank you. And Gene, take us home.
Gene Kim
AttendeesI guess I agree with everyone, and I think we're seeing -- we're starting to see glimpses of a world where everybody codes, right, it's not just developers, where marketing people code, UX, design, CFOs, CEOs. And so I think it means like we're going to 100x the number of developers we have on the planet. And so you do the math, like there's 20 million now times 100, it's about 2.8 billion developers. So it's like about 1/3 of the world population. That feels right to me. So all those graphs that Manav showed this morning of like the growth rate is like, "Oh, get ready, more is coming."
Sherrod Patching
ExecutivesJust the beginning.
Gene Kim
AttendeesYes.
Sherrod Patching
ExecutivesAwesome. I love that. I love we finished on predictions. Up next, we have the research agency from Stanford to join us. So I'd like to welcome to the stage as we come...
Simon Obstbaum
AttendeesHi, everyone. Thank you for having me. Excited to be here at Transcend. So you've listened to what's going on a little bit from practitioners. Now I want to connect that a bit with the research study we're conducting at Stanford. I'm with Stanford SWEPR. It stands for Software Engineering Productivity Research. It's quite a mouthpiece. So a quick round of introduction. So I'm a researcher with SWEPR since 2020. I have an industry background. I'm a CTO at a fintech, a neobank. We do AI-based lending. I'm also the former CTO at Crunchyroll, a video streaming platform, and I was in charge of engineering when they went through a very rapid growth and this inspired some of the research that we're doing. I got connected to the people at Stanford. One of them is Yegor. You see him here. Unfortunately, he couldn't be here today, but together we kicked this off. Just to give you some context, we've been doing this quite a while. Our research has been shared by Elon Musk. It was notably the piece on ghost engineers. Marc Andreessen. We do various events and also the mainstream media picked up on our publishings. And, obviously, we also submit to the major AI and software engineering conferences. So we publish a bunch of papers each year, and all the research is ongoing. So before we dive into that, I will have to explain you a little bit on the methodology that we're using. Basically, how do you even measure software engineering productivity? We didn't know that when we started. I mean there were things like counting commits, counting PRs, counting lines of code. None of that seems something that is really a good way to measure software engineering productivity. So we were, kind of, trying many different ways to figure out what could work. And what seemed to work is actually an expert panel that looks at code written by the engineers. They give their feedback. We ask them questions on implementation time, quality, maintainability, complexity. 
And then that was the first surprise here in the study. The experts were in very high agreement. And if you guys have been in engineering meetings, it's really hard to get engineers to agree on anything. So for us, that was a big surprise. It was exceptional. We used something called the intra-class correlation coefficient to calculate the agreement. And then, okay, so we found a way to actually measure it. Now can we do that at scale? And in order to do that, we try to train a model that would replicate the expert panel so that we could look at thousands of commits in very little time. So currently, we have hundreds of companies enrolled in the studies. I haven't updated that number in a bit. I think we're north of 200,000 engineers that were analyzed. The beauty is we can go back. And I think roughly, this represents depending on how you calculate, maybe even 1% of the software engineering population. Now you have to assume that this is not a perfect way to look into productivity but better than other metrics. And all the following slides are kind of based in that methodology. And if you assume that now we have a way to measure productivity consistently, let's take a look at what AI is actually doing. So we'll have a couple of sections. So we'll be looking into how AI benefits are unevenly distributed, how structured practices are important. We'll look at a real company case study, and we'll give you some benchmarks on AI spending because it seems a lot of people have question marks on what's the appropriate budget allocation and some organizational implications. So the first finding here really is that like AI is not really delivering benefits to everyone at the same time. So what I can show you here is that like we picked 46 teams that use AI. And when we started this, we had also a control group of 46 teams that didn't use AI. So while we kept doing this, our control group fell apart at one point in 2025 because there were no more teams not using AI. 
So we kind of had to extrapolate from the control group in the beginning. But initially, we had like a 4.8% difference. And now the recent update really is 59% difference in output. So -- and the gap is really -- it's gotten wider. And maybe some of you saw that Fable is out, the Mythos-class model. So let's see if we see another spike in the gap. Now if we look at more recent data, then this becomes even more dramatic, right? Like so the bottom quartile teams get almost no benefit still. The top quartile teams often double productivity, the same technology, very different outcomes. So we see a very strong power law effect. The key takeaway here is access to AI is not the differentiator. So if that's what you're using to measure, you've got to start looking into who is successful at using it. Now if we take that from the team level and also take it down to the individual level, you see kind of like the same effects. So heavy users outperform the light users, but team effects matter even more than individual effects, right? Like so even if you have a top performer in the team, they will likely be slowed down by the laggards in the team and they won't be able to deliver to their full value and possibility. So that means AI productivity is a team phenomenon. And in the same time, this also changes who succeeds inside of organizations. Now throughout our measurement, when we looked at who is advancing through the performance quartile. So what you see here is Q4, that's the top performing quartile. Q1 is the bottom performing quartile. So we have not seen a lot of movement in the past. So the p-value was pretty stable, 0.70. But now the rank stability, it fell to 0.45. So since AI -- and we haven't really seen that in any period, no matter what the change was, whether it was going remote, whether any kind of transitions we had in the past, the rank stability was never that low throughout the study. 
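The talk does not say how rank stability is computed. One plausible reading, assumed here, is a rank correlation between engineers' output rankings in two consecutive periods: a value near 1 means people keep their relative positions, and a lower value means more movement between quartiles. The sketch below uses Spearman's coefficient; the function names and the sample outputs are hypothetical and are not taken from the study.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values so ties get the same rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical output scores for the same five engineers in two periods.
before = [10, 20, 30, 40, 50]
after_stable = [12, 19, 33, 41, 48]    # same ordering as before
after_shuffled = [40, 20, 50, 10, 30]  # ordering largely reshuffled

print(spearman(before, after_stable))
print(spearman(before, after_shuffled))
```

Under this assumed definition, the drop from 0.70 to 0.45 described in the talk would correspond to rankings in one period becoming a noticeably weaker predictor of rankings in the next.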
So surprisingly, we see actually a lot of movement upwards from the bottom quartile to the top quartile. And it's interesting in a way because we think -- and we're hypothesizing here and based on interviews that we conducted is that we see that maybe very senior engineers that were doing supporting functions, supporting task code reviews and whatnot. Now they get time to delegate these to a model to an agent and they can contribute and they're outperforming everyone, yes. But then, the takeaway is AI is changing the skills that matter, and that's why we see those changes. But people ask us like, okay, so if we give people AI, then what happens? Does more AI usage deliver better results? And the answer is not necessarily. So we see that the successful teams, they work in a clean engineering environment and cleaner environments achieve larger AI productivity gains, which is not really surprising. For me, as an engineer, I'm somewhat moderately offended by that because what organizations haven't done for the human engineers, they're now doing for their LLMs. But yes, -- so the key point is the environment quality matters. And the reason is that clean environments allow AI to operate more autonomously. And you can see that also when you look at the task composition and the environment cleanliness, when you look how things come together, there is a tipping point where if you fall below that threshold, then it's just the agents can't really deliver good results. So AI just amplifies the environment in which it operates, yes. So let's keep moving. How should companies then measure whether AI is actually delivering value. So I think what we could do is -- and ideally, we would look at business outcomes. So ultimately, I think that would be what would be best. But it's a very noisy signal because there's just too many confounders. So in absence of an ability to properly measure that and correlate that, we recommend looking at the engineering outcomes. 
That's a relatively clean signal. And then that gives you a pretty clean framework. So you can start using -- measuring the AI usage, you start measuring the AI outcomes, you connect the 2 and you avoid jumping directly to revenue conclusions essentially. And there are several practical ways to measure the AI adoption. So I was hitting on that a little bit earlier. So there was an access-based way of doing it. It's essentially like companies that are in the rollout phase. They want to make sure everybody can access it, everybody can use it. But in the end, it's not ideal, right? Like what we saw is that access and usage telemetry is the gold standard. So while people having access is good, it's better than not having access, but because of the discrepancy showed on the earlier slides, you need to double down on the people that really are killing it. We can actually look at that retroactively in our study because we have the git history. And that's that. Then the next thing is, okay, how do we measure the engineering outcomes? And here is how we think about that. So essentially, the primary metric that we're using in the study is the engineering output as per the expert panel and per the machine learning algorithm that we've trained. And what we use as guardrails is rework, refactoring, quality tech at risk. And also the DORA metrics are super important in terms of measuring flow efficiency. Happiness metrics are useful to check in on your team, but not necessarily as a productivity metric. So the takeaway here is you can try to maximize the output while keeping the guardrails healthy. And with that being said, I think we're ready to move to the next finding here. So what we did and we submitted this paper to ASE, the conference is in October. We have peer reviews in. They're very favorable. We have defined 4 levels. 
So ad hoc prompting, rules and project context, task-specific agents and orchestrated multi-agent workflows, which where we saw is kind of like the best results are delivered. And the point here is that the AI maturity leaves artifacts that we can analyze. And so we built a classifier to identify these artifacts and do an analysis of what's going on there and how your engineers and teams are talking to their agents. So we analyzed hundreds of repositories. We used embedding, paths and content. And we use that to detect the actual AI maturity signals and got really strong validation results in the sense that we saw strong clustering effects that tie into a higher performance, higher output. And that maturity has a measurable impact on quality as well. So when you look at it, repositories with no structure, they suffer a lot more degradation in terms of quality as you keep using your agent and cognitive complexity for the engineers keeps increasing, static warnings go up. So no matter how you slice and dice it, it's not a good idea without proper harnessing, proper instrumentation and tooling and essentially structure protects your quality. So when we look at individual developers, we see the exact same thing. We see that PR throughput is dramatically improved. Duplication is decreased and revert rates are improved. So there is -- yes, there is only benefits that we see here. So there is simple documentation, context practice, those really create outsized return. There is no measurable net negative effect on that. So everything you do in that direction, you will have some gains. Now in order to validate that, we can look at an actual case study of a company that did that. So it's a real enterprise example. We track output quality and churn together. And the company really started at a below average output. And so at one point, the CTO said, like we want to 2x everything, yes. We had the CTO mandate, nothing much changed in the metrics initially. 
What really started to change things was, first of all, like the rollout, the adoption. So access was actually -- it didn't matter. They adopted AI in May 2025. They started in the 25th percentile. Productivity doubled. They move towards the 60th percentile now. So the large gains are achievable, right, even at 600 engineers. And -- but productivity gains alone, they don't really matter if the quality collapses. When you look here at the quality analysis of the company, so before AI, they were more or less stable. They were -- sometimes it was going down, then it was going up again. When they focus on it, you probably know how it is. You have to ship something quickly, it degrades, then you spend a little time in optimizing, improving it, it goes up again. But then in the beginning of the AI adoption journey, you see a steep cliff when quality went down. So with proper tooling and instrumentation, they essentially were able to reverse the trend. Productivity remains high while quality stabilized. And recently, in the last few months, they have even achieved an improvement in quality in their agentic workflow. So quality degradation is, in fact, manageable at this point. And the next metric is also giving you an important story. So the churn rate, essentially, we cluster rework and refactoring in that churn rate metric. So that one is also down, yes. So AI improved also execution quality, not just speed. And yes, so understanding what causes these changes is now really the next challenge. What we are trying to do now is we're trying to put a pilot program together that -- where we kind of tie all the engineering metrics and they show correlations. The correlations are not causes. So the real drivers here may be meetings, calendar load. And we don't really know yet or don't understand yet how these things tie together. So this is kind of for us, the next step in our study to really find the causes and not just the patterns. 
And if anyone here in the room is interested in joining that effort, feel free to reach out, but let's switch gears from productivity to investment. So everybody is wondering also like what is an appropriate amount of money for me to spend. So at Stanford, we launched the AI Spend Index, where we have -- we've gotten consent to publish some of the spend from companies in our study. We track AI spend per developer. There's benchmark organizations. And what we see, the high-performing companies, they spend significantly more relative to the ones that are in the lower-performing quartiles. So underinvestment can become a competitive disadvantage. However, when you look at that, so the benchmark, it kind of becomes more valuable when you can look at your peer and industry data, it's similar concept to levels.fyi. You can compare against your peers, contribute data and essentially, you can unlock more visibility. So the value is really where you sit in your cohort and how you do relative to them. So you can compare based on a few things. I'm just -- you can open up the website and see where you sit and hopefully contribute also some data back if you find it useful. And yes, so I think this is now the science part that is pretty well understood. We're entering a bit conjecture territory here because we're really trying to tie everything back together. And what we see often is that like AI speeds up the individuals, but a lot of organizations that we have in our study, they actually fail to capture the gains, and we want to get a better understanding on why that is. So we have a -- we see that like lower success correlates very highly with size. So the hypothesis here is that enterprises spend a lot of time on internal alignment. So it's -- in principle, people know what they could do or should do, but then you need a lot of time to get everybody who needs to be bought in, bought in and get that done. 
So it's a lot of meetings, decks, approvals, politics and AI doesn't automatically eliminate these costs or improve it. So a lot of productivity gains get absorbed by coordination and network complexity explains why. So when you look at it, a very simple thing. We published on that like a few years back already. So we see as the number of nodes go up, like the communication overhead just becomes crazy. That's why single-threaded ownership tends to become so important. But every time now, it's amplified by AI. So while you could move faster, you kind of like lose the wins or the benefits in alignment. So this is also like the reason why start-ups tend to benefit more than enterprises. And then you really have a challenge with regards to organizational design. And what we see is AI native companies are really operating fundamentally differently relative to the classical traditional organizations that we have in the study, like they're really built to capture 24/7 engineering, right? Like so it's nonstop. You got your agent running. It's not just another tool. I think they're scaling through compute, there's persistent organizational knowledge and compounding capability improvements. So what we see is they tend to have found a good way in order to remove any blockers where a human decision is kind of -- would be the slowdown, yes. And with that being said, so if you're interested in enrolling your company in our research programs, there is a number of ways to reach out. You can participate in the research, you can contribute to the AI spend. You can help us explore more causal discovery, discuss company-specific opportunities. We're also open to that. And yes, so I think hopefully, this was relevant to you. In our view, the organizations that learn fastest how to measure and operationalize AI, they will capture the majority of the gains. Thank you so much. That's it from my side.
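The remark that communication overhead grows sharply as the number of nodes goes up can be illustrated with the usual pairwise-channel count, n(n-1)/2. The talk does not state which formula the researchers used, so this is an assumed model, and the function name is hypothetical. The team sizes are arbitrary, except 600, which matches the engineer count in the case study.

```python
def pairwise_channels(people: int) -> int:
    """Number of distinct two-person communication links among `people` nodes."""
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


for size in (5, 10, 50, 600):
    # Prints 10, 45, 1225 and 179700 links respectively.
    print(size, pairwise_channels(size))
```

Headcount grows linearly while potential links grow roughly with its square, which is one way to read the point that coordination absorbs individual productivity gains in larger organizations.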
Sherrod Patching
ExecutivesThank you, Simon. Thank you for sharing the research with us. Well, this brings us to the end of today. So whether you're tuning in online, thank you for joining us. To the developers in the room and online as well, don't forget the developer show and also the hands-on lab, the Duo Agent platform, that'll be running in just 90 minutes. And for those of you that are here, we look forward to seeing you at the next Transcend. Thank you for coming. [Break]
Colleen Lake
AttendeesHello, hello everyone. I'm Colleen Lake, coming at you for The Developer Show live from Transcend. That's right. We came across the pond. Now if you've been following along at home, you know that Transcend has been busy. And so this is kind of what I like to call the postgame coverage. For all of you sports fans in the developer show audience, I think that might just be me. I know my audience so well. All right. So today, we've unveiled some pretty cool stuff. First of all, Orbit, then Flex Pricing and Governance. Now I know the developers here really just mostly want to talk about Orbit. So that's what we're going to be doing for the next hour. Now what it's going to look like is I have 4 guests joining me today. First up, we have William Arias. William is a SaaS Developer Advocate here at GitLab, and he has been going deep into Orbit for the past few months. He is also a data scientist, and he really likes digging into data. Next up, we have a customer of ours, Felix Becker from Deutsche Bahn. Felix is an AI advocate within Deutsche Bahn and a long-time customer and user of GitLab. So I'm very excited there. Now with Felix, we'll mostly be talking about AI development in general because I don't know about you, but ever since, say, November of last year, all I have been thinking about is how AI has changed development in a way I didn't think it would this quickly and what that means. So we'll be talking about the AI development that Felix has done and what we kind of think it's going to look like in the future, the problems, all of that. Next up, we have Aakriti Gupta, a Senior Engineer from GitLab. She has been here for about 7 years now, and she does back-end development. Now the thing about Aakriti is she has no involvement with the Orbit team whatsoever, but she's a very good sport. 
So a few weeks ago, we had her just try out Orbit for the first time, and she's going to talk about her experiences there and what she would suggest to you if you're doing the same. Now again, she does not work on Orbit. She just wanted to try it out. Okay, I might have nudged her in that direction, but she, again, very good sport. And finally, since I know that we really got to play a little risky here, we're going to have a live demo with Orbit from Angelo Rivera, who is the Engineering Lead of Orbit, and we will be taking the questions live from you from the audience. Now since I'm not completely insane, you can submit those questions now, and we will look through them and pick a couple. But we will be answering your questions live at the end. All right. Now let's get started. Let's bring up our first guest, William.
William Galindez Arias
AttendeesHello Colleen.
Colleen Lake
AttendeesHi, William. How are you doing today?
William Galindez Arias
AttendeesHello, everyone.
Colleen Lake
AttendeesNow William, you've been digging into Orbit quite a bit. And I've said the word Orbit maybe 40, 50 times now in the 3.5 minutes I've been talking. But for those of us joining for the first time, can you tell me what is Orbit?
William Galindez Arias
AttendeesYes. So I will start by sharing my journey when I start playing with Orbit. And I have to say that what Orbit is in a business like answer is that it's a service that significantly enhances the context that your agents will use to generate answers. It can take all your GitLab data, index it and make it available as a graph that you can query. So that's what Orbit is. And this part of this journey and that the concept I have as well, I want to share my screen and then show you and share with you the journey I went through when I was starting using Orbit and the context graph. So here in my screen, I hope you can see that this is what many of the developers today, they probably are using when they are dealing with agents. So we have a query or that you can as an end user, you can ask or other agents can query to each other. And then what this will happen is that we are using agents that they are backed by an LLM and this LLM will retrieve information, but this information or this data is coming from a model that already has the weights frozen after training. And this will use all of the sequence to generate a response. So this is what most of us when we are using today agents for coding are using are dealing with. But now the contrast with Orbit or using a knowledge graph to enhance the context is that when we use this, what we are doing now is that the back end of the agents becomes not only the frozen weights of the model, but also the context graph, which has a set of entities and relationships that are factual and that they are built after indexing data, in this case, the data that resides in the GitLab platform. And every time that a query or that an agent needs to retrieve an answer is not only going to use the weights of the model, but it's also going to go through this graph and extract those entities and relationships and give me a grounded response. Does that make sense?
Colleen Lake
AttendeesOkay. So what I'm getting from that is Orbit is based in your current reality and your current system rather than historic data that's generated from memory. Is that correct?
William Galindez Arias
AttendeesSo in this case, what it is doing is taking the data that is part of your GitLab as a platform from the data layer of GitLab and indexing it and building all of these relationships that are factual that are the ones that I want my agent to traverse or to read from to generate a grounded response. And also it's using the weights of the model to understand the query that I put in natural language that could be as an end user or from agent to agent. So we can give more context to this by showing how Orbit looks like in the 3 different views, the web UI and also I can show some of the evaluations.
Colleen Lake
AttendeesYes. And could you also tell me why does this exist? What problem is it solving here?
William Galindez Arias
AttendeesYes. So the problem that it solves and the reason that you as a developer should care about enhancing your development workflow by using the context graph is that you can see in this sequence that I have here that what we want is that every time that there is a query that we will use the power of the LLMs to understand this question. But I also want that the reasoning that the LLMs or the agents will use is based on the known entities and relationships that come from the knowledge graph that will enhance the context that they will use. So at the end of the day, the problem that it solves that is a classic problem that comes with the LLMs is that there is a high risk of hallucinations. And what this is doing is reducing the likelihood that my agent will hallucinate something because it's not going to this frozen set of weights or is not trying to do text matching from a huge data lake, but what it's trying to do is traversing a graph from known entities that were calculated beforehand. So this, at the end of the day, results that you as a developer can have more confidence that whatever your agents are doing is grounded in some data structure that is deterministic and that it will make experiments that you will be doing with this or when interacting with this technology more repeatable, which is one of the main issues today with these systems. So this brings grounded responses and it makes it more predictable at the end of the day.
Colleen Lake
AttendeesOkay. So it's grounded in reality and makes it a lot more accurate and less likely to hallucinate or really just kind of lie to you. It tells the truth.
William Galindez Arias
AttendeesYes.
Colleen Lake
AttendeesAnd it tells not just the truth, but the relevant truth to you.
William Galindez Arias
AttendeesExactly. And that's one of the benefits. Also, as we will see, it requires fewer tokens. It is faster, and so on. So this also means that it will be cheaper, which is also a big topic today when it comes to AI budget.
Colleen Lake
AttendeesYes, because I think every company nowadays has an AI budget or at least every technology company. And we all want to stretch it as far as we can.
William Galindez Arias
AttendeesYes. Okay. So this is a GitLab project, GitLab Orbit. As you can see, this is a repository with a set of groups that has thousands of events. There are merge requests, there are issues. There are lots of things going on here. And what we will do with Orbit is that we can take from this view, you can see that this is an agent that it has in its back-end Orbit. And what I was doing was asking to search merge requests that are fixing open vulnerabilities in this group. And this group is made of a lot of projects. So why am I choosing this one in specific? Because the security issues are the ones that make headlines. And if I want to make sure that I want to have a real state of what are the -- what is the state of the different vulnerabilities that are open in this set of groups and the merge requests that are addressing it, I want to get accurate answers. I want that the correctness of this answer is high. So this is something where I don't want hallucinations. This is something where I need that whatever the agent is giving me is grounded and that is reflecting the reality of my project. So as you can see here, this is kind of a web UI view where I have this Orbit agent. I ask this prompt, I ask this query. But how does it look like in, let's say, in the middle between this UI and the back end? Here, you can see that we have a query editor, and this query editor also provides certain set of templates. And this one, I have this one that I'm asking for merge requests that are fixing open vulnerabilities. And when I execute that query, it builds this graph, and this is showing me here, you can see is that I have this deterministic view where here this dot, this node is a vulnerability and everything that I can see that is related to this relationship that goes from this node is showing me all of the open merge requests that are addressing it. 
So this agent on the -- here on the chat, when it's trying to answer this question is not going through lots of pages trying to do text matching. What it's doing is reading and traversing this graph that has this deterministic relationship there and is grounding the response based on this data structure. So this view is giving me that peace of mind that I can tell that, yes, this answer was -- sorry, this prompt was answered given these data structure that is in the back end. And I am asking a question that is about security. So I don't want hallucinations in this context.
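The idea of answering from explicit entities and relationships rather than from text matching can be sketched with a tiny in-memory graph. This is an illustration only and not Orbit's actual data model or query language: the `ContextGraph` class, the `fixed_by` relation, the state values and the sample node ids are all hypothetical.

```python
from collections import defaultdict


class ContextGraph:
    """Minimal typed graph: nodes carry properties, edges carry a relation name."""

    def __init__(self):
        self.nodes = {}
        self.edges = defaultdict(list)  # (source_id, relation) -> [target_id, ...]

    def add_node(self, node_id, kind, **props):
        self.nodes[node_id] = {"kind": kind, **props}

    def add_edge(self, source, relation, target):
        self.edges[(source, relation)].append(target)

    def neighbors(self, source, relation):
        """Follow one relation from a node and return the target nodes with ids."""
        return [{"id": t, **self.nodes[t]} for t in self.edges.get((source, relation), [])]


def open_fixes(graph):
    """For each open vulnerability, list the still-open merge requests that fix it."""
    result = {}
    for node_id, node in graph.nodes.items():
        if node["kind"] == "vulnerability" and node["state"] == "open":
            result[node_id] = [
                mr["id"]
                for mr in graph.neighbors(node_id, "fixed_by")
                if mr["state"] == "opened"
            ]
    return result


# Hypothetical sample data.
graph = ContextGraph()
graph.add_node("vuln-1", "vulnerability", state="open")
graph.add_node("vuln-2", "vulnerability", state="resolved")
graph.add_node("mr-10", "merge_request", state="opened")
graph.add_node("mr-11", "merge_request", state="merged")
graph.add_node("mr-12", "merge_request", state="opened")
graph.add_edge("vuln-1", "fixed_by", "mr-10")
graph.add_edge("vuln-1", "fixed_by", "mr-11")
graph.add_edge("vuln-2", "fixed_by", "mr-12")

print(open_fixes(graph))
```

Because the answer is read off stored edges, the same query over the same graph returns the same result each time, which is the repeatability property described above.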
Colleen Lake
AttendeesYou definitely do not want hallucinations in security context or at least I don't. I don't know about our audience, but that's one of the places where I would most like to avoid hallucinations.
William Galindez Arias
AttendeesYes.
Colleen Lake
AttendeesNow can you tell me how else you're using Orbit? What have you learned from testing it in complex environments?
William Galindez Arias
AttendeesYes. So this was a very interesting part for me as a data scientist. So one of the things that we were playing with, and here on my screen, I hope you can see, we have built with the team, and I thank the part of the team that built this, the Orbit Observatory, where what I was doing was playing with a different set of prompts where I wanted to stress-test the capabilities and put into practice the theory about knowledge graphs and how they enhance the context of agents. So how this context graph can help me. So you see that this prompt here is asking that across the GitLab org group, which you saw is a very large...
Colleen Lake
AttendeesVery large.
William Galindez Arias
AttendeesProject. What are the -- which are the users that have authored the most merge requests in the last 30 days. So when we run this prompt, you see 2 things I can show here. So first, we have different views. One, where we have the Orbit the service, the one that is running in the cloud, where what I'm doing is asking this question to an agent that doesn't have Orbit in the back end and another agent that has it. And the winner is clear. And here also for the people that are not only -- as the developers, we also sure care about the business angle of all of this. But you can see that it's cheaper, it's faster, and it consumes fewer tokens.
Colleen Lake
AttendeesThose all sound like good things to me.
William Galindez Arias
AttendeesYes. And those are things that our managers would like to hear. But for me also as a developer, what I would like is that it's more accurate and that it is giving me a grounded response. So all of this, we can see in this table that when I ran this prompt, it went through all of this, and I can have this comparison where, for the one in the cloud, we see that it timed out. And why did this happen? Because the type of question that I'm asking is a question that is very friendly for anything that is graph shaped. So this is touching different boundaries and domains from GitLab, the platform. This needs to go to one part where there are merge requests, then to another part where there are authors, where there is code. And all of these hops are things that the Orbit service is aggregating and precalculating for us. So when the agent needs to answer this, it doesn't have to go to a different set of -- it just goes to one point that runs this query, provides this graph and gives you the answer. What we see now on the screen is that when we only use the API endpoints, what it has to do is call many different endpoints and then try to aggregate this. And at this point, what the agent did was say, I cannot do this, I'll time out because it's too much work. So this is one of those examples where we see that these types of prompts or queries, the ones that we have many times in the day-to-day when we want to understand a code base or answer these questions, are the ones that are most relevant. Does it make sense so far?
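[Editor's illustration] The question in this part of the demo, "which users authored the most merge requests in the last 30 days", becomes a single aggregation once merge requests and their authors are already joined in one place. The sketch below assumes such pre-joined records; the field names `author` and `created_at`, the sample data and the function name are hypothetical and are not Orbit's or GitLab's actual data model.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_mr_authors(merge_requests, now, days=30, limit=3):
    """Return (author, count) pairs for merge requests created in the window."""
    cutoff = now - timedelta(days=days)
    counts = Counter(
        mr["author"] for mr in merge_requests if mr["created_at"] >= cutoff
    )
    return counts.most_common(limit)


# Hypothetical pre-joined records, as one aggregated query might return them.
NOW = datetime(2026, 6, 10)
SAMPLE = [
    {"author": "alice", "created_at": datetime(2026, 6, 1)},
    {"author": "alice", "created_at": datetime(2026, 5, 20)},
    {"author": "bob", "created_at": datetime(2026, 6, 5)},
    {"author": "carol", "created_at": datetime(2026, 3, 1)},  # outside window
]

ranking = top_mr_authors(SAMPLE, NOW)
```

Without the pre-joined view, a client would have to page through merge requests project by project and merge the results itself, which is the many-endpoint path that timed out in the comparison.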
Colleen Lake
AttendeesThat makes a lot of sense to me. And I really like what you said about it using less tokens being cheaper, but also that does not matter unless it is accurate because cheap and fewer tokens is great, but accuracy is above everything. I don't want something cheap that makes my life a lot worse. I want something that simplifies and helps me day to day.
William Galindez Arias
AttendeesYes, precisely. Because this is another thing that as a developer, I care about it's not only about being faster, but it's also that I am moving faster, but in the correct way.
Colleen Lake
AttendeesSo the point of the tool is to enable developers.
William Galindez Arias
AttendeesExactly. And here on my screen, you can see now that in this evaluation tool, we were not only evaluating what is in the cloud, but also how it looks when I am using Orbit locally on my computer. So all of this has been mechanical automation where I'm using Claude Code and it's running the same prompt with Orbit and without Orbit. And we've seen that in these 2 cases, Orbit was the winner across all the dimensions that we mentioned before. And one last thing I want to share is that this evaluation will make more sense when we try and test it through different use cases. So you see that in this one, we were also asking what are the top 10 projects organized by critical vulnerabilities. This is one of those where I want to make sure that I get an accurate answer.
Colleen Lake
AttendeesYes.
William Galindez Arias
AttendeesYes. And also the one that we showed before, the authors of merge requests with failed CI jobs. This is another thing that, let's say, now changing hats as a platform engineer, I would like to understand quickly: where are those CI jobs failing, and to be able to diagnose and create a plan for that. This will take a lot of time if I want to do it by going to different API endpoints. This is another use case that I found very useful when I have a precomputed graph that I can just query and get these answers.
Colleen Lake
AttendeesWow. Now that's a fantastic demo. Thank you so much for that, William. And I have one last question for you. And I do understand that we have some audience questions. If we have time at the end, I'll come back to them. But I had one other question for you, William, right now before we move on to our next guest. And that is, if someone from the audience wanted to bring Orbit back to the team and run an experiment, like if they want to do that right now today, where would you suggest that they point Orbit at first?
William Galindez Arias
AttendeesOkay. So based on the evaluations I ran, my advice would be to go to one of those large projects that you have where maybe you are planning some refactor, because this is an excellent use case for measuring the impact of what will happen if you change the name of a class. But also, you can go to a smaller project, even though this shows lots of benefits when we're using a large code base. In a small project, what this does is allow me to also ask those compound questions, those questions that cross boundaries across the platform, that no single resource or API endpoint can answer. So even if it's a small project, but I need to combine different domains from GitLab, the product, the platform, I can also use it. So if you turn on indexing for a very large project, you will see the benefits, but even if it's small, just go for it because you will see that it helps you ask difficult questions, complex questions, with more accurate answers and with all the benefits that we mentioned when it comes to cost and time.
Colleen Lake
AttendeesOkay. Are there any projects that you would suggest steering clear of for now?
William Galindez Arias
AttendeesComing in.
Colleen Lake
AttendeesAre there any specific projects that you would suggest not starting with, but maybe building towards?
William Galindez Arias
AttendeesNo, at the moment, not. We need to -- I would like to evaluate more and then I can come back with a better answer to that. But for now, just go for a large project and a small project because the benefits are very visible in both cases.
Colleen Lake
AttendeesAll right. Great. Well, thank you so much, William, for joining us today.
William Galindez Arias
AttendeesThank you.
Colleen Lake
AttendeesAll right. Next up, we have Felix Becker from Deutsche Bahn. Felix, thank you so much for joining us. I'm going to close the computer. Hello. How are you doing today?
Felix Becker
Attendees[indiscernible] Lots of exciting news that you announced and everyone, hi in the stream.
Colleen Lake
AttendeesNow can you tell me a little bit about what you do at Deutsche Bahn, both as a company as a whole and you as an individual?
Felix Becker
AttendeesSure. We drive standards and harmonization through platforms and development. And one of our platform is the developer experience platform. And my role is being a platform manager to find the right products for the platform and also in governance leads so that we have the right things in place to define our standards and our policies.
Colleen Lake
AttendeesBecause it is a highly regulated industry you're in because of that...
Felix Becker
AttendeesIt is. Yes.
Colleen Lake
AttendeesWhich as we would want it to be.
Felix Becker
AttendeesYes. There are life on the line if the things go off the track. So it's really important that we are responsible and have a high reliability and high quality in our software products.
Colleen Lake
AttendeesNow, we're here today to mostly talk about AI. I've heard a lot of people go back and forth about use of AI in highly regulated industries. And I know it's a little bit different. Can you talk about your journey with AI?
Felix Becker
AttendeesYes, sure. We started about 2 years ago, and we started like everyone with the client-side agents that helped us with code completion, and our journey was twofold. We had a lot of regulatory, compliance and data protection work to do, and we also introduced the tools in an internal hackathon so that people could actually use them. And basically bringing AI into the company is a joint effort. So you don't work as a one- or two-man team. You have legal included, you have data protection and security included. And also, we have a good connection to our labor unions. And they were all really forward-thinking and open-minded, so that we got this technology into our company.
Colleen Lake
AttendeesThat's fantastic. Now it's moving so fast. And you live in a very -- or you work in a very complex environment where your data lives. What does that look like? I know that you face some very unique -- well, unique but weirdly universal problems. What makes it kind of hard to bring AI into your environment?
Felix Becker
AttendeesYes. To give you a sense, we have around 10,000 developers. We have 70,000 repositories. We run about 2 million pipeline runs in a month. And with this whole set of developers and lots of enthusiasm in that space, we have to make sure that we have like the right security in place. We worked with strong partners in that area, and we built up the knowledge that it actually needed to take the benefits of that technology.
Colleen Lake
AttendeesOkay. And what are the benefits to you? Because I've seen a lot of back and forth from people. Now it's no secret that AI creates a lot of code. And now the code is actually pretty good, which we could not say 3 years ago. It is production-ready code. But with a lot of additional code, does it help your productivity? Does it decrease it in some places? Or is it a mixture?
Felix Becker
AttendeesThat's a really good question. What we see so far is, I guess, it was around November last year when the models made a huge change. So the quality went up tremendously, and we now see a chance that we really get good quality code. But on the other hand, we get a lot more code than expected. So the thing is, the traditional workflows and the systems still work as if we had less code, but all of a sudden, we get a lot more code. And for that, we have to build the systems around that as well, to prepare not only the CI/CD pipelines, but also the workflows with the people. We see more merge requests coming in that we need to review. We see a lot more reviewing. So the jobs are changing, the responsibilities are changing, and we still have to make sure that everything has high quality.
Colleen Lake
AttendeesYes. Being a programmer involves a lot more reading and writing than it used to.
Felix Becker
AttendeesExactly.
Colleen Lake
AttendeesI was at an open source conference a couple of weeks ago, and that was the one thing that people were talking about, which is the influx of AI development is great in so many ways, but it can be a big stress on maintainers, especially many of which are volunteers because there's just so much to review now. And now while AI is fantastic at reviewing, you do also want a second set of eyes, a human in the loop in many cases.
Felix Becker
AttendeesYes, sure. Open source is a whole different game. I think the chances of being a contributor now with the help of AI are very high because all of a sudden, you can write code faster, you can understand a code base very well, but I want to ask everybody to be more responsible with the code that they submit to an open source project because the maintainer still has to be accountable for that. Speaking of accountability, that's the same thing with us in our company. Developers are still accountable for what they write. And therefore, we follow the EU AI Act, and we still have humans in the loop. We might see that changing in the far future or not so far future. But for now, it's very important for us and we still review and we still have seniors looking at the code.
Colleen Lake
AttendeesYes. And what you put in matters so greatly here. Now you and I have talked about this a little bit off camera, but you mentioned that in the past, you've had big differences in quality of code based on what you've put in. Could you talk about some of the problems you've seen or faced in the past and also how to avoid that?
Felix Becker
AttendeesYes, sure. So in the beginning, we -- first of all, it's very important to understand the models and understand the tooling. So when you start off and just think I can type something out in human language and expect good results, then you have to go through a learning curve. So you really have to understand the tooling. And therefore, we saw a small dip at the beginning, until everybody was on the same page. We doubled down on AI this year. So we introduced a large program where we enabled a lot of people, not only in software development, but also on other sides of things. But bringing the knowledge in and using the models intentionally, asking them to do smaller things and making sure the results are reviewable, that's where the effects got better for us. And of course, there are the better models and the technology in general, which is improving pretty rapidly.
Colleen Lake
AttendeesYes. The technology is improving as is our ability to interact with it.
Felix Becker
AttendeesExactly.
Colleen Lake
AttendeesYou mentioned an AI enablement program and not just developers. I'm very interested in that. What does that mean? How are you enabling nondevelopers at Deutsche Bahn?
Felix Becker
AttendeesYes. We have basically a product and stream where we find out what does it mean to use AI, not only within development, but also sometimes we call them white collar workers or something. So we think about working with BahnGPT, that's our internal chat application or with Copilot within the Microsoft space and how we not leave behind the workers that actually have important things to do, but do not work only with code. So we do a lot of how can you leverage prompting for optimizing your own work, how can we work with our customers or end customers basically to bring in AI into the products that we see outside of the Deutsche Bahn.
Colleen Lake
AttendeesOkay. Now that's very interesting. Do you see major differences between other white-collar users creating code with AI versus programmers?
Felix Becker
AttendeesYes, sure. We also have like very curious interested people that join the coding space or the arena, I would say. And they experiment. So we are in a phase where we're experimenting a lot. We have like programs where we especially look in how we would work with agents and the team, how would change our way of working together. And yes, this brings a lot of change within the company. And right now, we see the time where we have to build guardrails and security around that so that we can basically industrialize the things that we have in proof of concept so that we make it more reliable and usable for everyone.
Colleen Lake
AttendeesWhich is what we all definitely want.
Felix Becker
AttendeesAnd this is the job of the platform for the future, not only developer experience, but also agent experience.
Colleen Lake
AttendeesYes. And what does your actual day-to-day look like with coding? How much code are you writing yourself versus prompting? And can we also talk a little bit about the difference between coding with agents and vibe coding some stuff? What is the difference there?
Felix Becker
AttendeesI mean that's a really important question, and this is something that's mainly misunderstood. Vibe coding is for me prototyping. It's just bringing an idea to life, giving some people a chance to click on it, not having paper prototypes, but making it actually tangible and usable, and working together with stakeholders to get first feedback. So we go away from paper prototypes and Figma drawings into something that's realistic in the real world and is usable. And at the same time, we need an understanding that this is not production code. And we see a lot of expectation because vibe-coded things come together really fast, over the weekend, over 2 or 3 days. We have something really -- that looks really good, but this is not production code for us. So what we are thinking about is bringing in a path for what it takes to bring something that is vibe coded into production. And for this, we have a lot of supporting technology in the pipelines and in the scanners that we use so that we can make sure that things are of good quality and reliable. And we also have a lot of principles internally on what it takes to have good quality software.
Colleen Lake
AttendeesSounds great to hear.
Felix Becker
AttendeesYes.
Colleen Lake
AttendeesIt seems like you thought a lot about this.
Felix Becker
AttendeesYes, we thought a lot of this, but we are continuously thinking about it. And this is so great to be in a conversation with you to learn more in Transcends and bring things home and adapt our thinking and our workflow.
Colleen Lake
AttendeesIt's great to have you here. Yes. Now I have 2 more questions for you, and we have only a couple of minutes left before our next guest. So we might have to speed run these. So the first one -- I believe in us. I believe in us. The first one is, what is the gap right now between what AI tools promise and what they actually deliver in production?
Felix Becker
AttendeesYes. That's a tough one for the short.
Colleen Lake
AttendeesI know. I believe in you, I believe in you. You've got this, you've got this.
Felix Becker
AttendeesSo I mean it's all about using the AI intentionally and have guardrails around it and have good disciplines. If you're good in agile practices, if you're good in DevOps and the practices around that, you will benefit from AI as well because AI for me is a multiplier. And therefore, teams who are good at these practices can benefit more from AI than others.
Colleen Lake
AttendeesAll right. Now my last question for you. We've talked a lot about how AI coding looks -- has looked incredibly different basically since last November. What do you want in a perfect world? What would it look like for you? What strides would we make in the next 6 months?
Felix Becker
AttendeesGood question again. What I see, as everybody else sees for sure, is that we move from the client into a server-side agent scenario where agents run on the server side, run more autonomously, and basically work on their own and make decisions on their own. What I would like to see is that companies like you talk more about the second week, not only the first day and the second day, but the second week. So we usually do not bootstrap from greenfield projects. We have large code bases, and we have to be very responsible in making that change, and having tooling support and more best practices for working in the second week would be fine. So for us, it's not all easy demos and greenfield. So my wish would be that companies like you focus more on being ready for the second week.
Colleen Lake
AttendeesThat's what we've been really working to do. And because especially, you're right, it's very rare to have a greenfield project. You can have one for a demo, but usually, you're working within an existing environment and the tooling needs to reflect that. And not to say Orbit Orbit Orbit, but that is one thing Orbit does great. Well, thank you so much for joining us, Felix. Our next guest will be coming up now, but thank you, Felix.
Felix Becker
AttendeesThank you for having me.
Colleen Lake
AttendeesAll right. Next up, we have one and only, Aakriti. Hello. How are you doing?
Aakriti Gupta
AttendeesI'm good. How are you?
Colleen Lake
AttendeesI'm good. I always love when I have these little microphones. I feel like a member of One Direction. I hope I'm Niall. Basically now, Aakriti, you are a senior staff -- senior back-end engineer, sorry. It's been a morning, guys. You are a senior back-end engineer here at GitLab, and you've been here for 7 years. And again, I want to stress this to the audience, you did not work on Orbit.
Aakriti Gupta
AttendeesI did not.
Colleen Lake
AttendeesCan you tell me about what you do at GitLab?
Aakriti Gupta
AttendeesRight. I'm-- if I get back to [indiscernible], she said, right now, I work in the Tenant Scale team. So I take care of things like groups and projects. So you look up groups and project page, things are slow, it's my job to fix. If things are buggy, it's my job to fix.
Colleen Lake
AttendeesYou're the fix there.
Aakriti Gupta
AttendeesYes.
Colleen Lake
AttendeesLady fixit. Amazing. Now a few weeks ago, you used Orbit for the first time. Can you tell me about your experience there and what project you used it on?
Aakriti Gupta
AttendeesRight. I used it on the GitLab project, the one monorepo we have. The first thing I remember of the experience is that it was intuitive and easy to understand what it was doing. It was indexing everything, right? And very simple documentation to follow. I could get it set up in very few minutes. I was actually surprised how quickly it indexes the code. And I tried to understand, okay, should I do this remotely or should I do it on my machine? And I tried both. And there is a slight difference between the 2. So if you do it on your machine, it's only indexing the code in your repo. If you're doing it on a GitLab instance, it's going to take into account your merge requests, issues and everything. So the first experience was, wow, this is quick. It is indexing everything. That was the first idea I had.
Colleen Lake
AttendeesIndexing everything?
Aakriti Gupta
AttendeesYes.
Colleen Lake
AttendeesThat's a lot of things.
Aakriti Gupta
AttendeesIt is.
Colleen Lake
AttendeesWere there any things that when you started using it, that kind of tripped you up?
Aakriti Gupta
AttendeesLet's say -- so just -- I think it was just a difference between remote and local that I took some time to understand. And I was looking for examples of what do I use it for, but I'm glad I just started playing with it on my own and put up my own use cases and start discovering it, but nothing really strange.
Colleen Lake
AttendeesWhat was your favorite thing you used it for?
Aakriti Gupta
AttendeesFavorite. Where do I start?
Colleen Lake
AttendeesYou tell me?
Aakriti Gupta
AttendeesI think it was assessing a large piece of back-end code to see where it could be refactored. We built a framework some years ago for Geo replication where you could replicate any data type in GitLab. So any developer from GitLab could come in and use that, not just the team that built it. And I had worked with this code for a long time. But then for quite some time, I've been off that team, so I didn't. So I knew some of it. I don't know all of it. And I went in assessing that big piece of code. What is the setup like? Is it modular? How are the classes related? How does the logic flow? Where is the authentication? Where can there be improvements? What can I refactor? So I was impressed by what a good summary I could get using Orbit, and not just the summary of the code, not just the relations within the code, but also where the architectural decisions were and what those architectural decisions were. How did this piece of code grow over time? Then I started asking it more questions, like which team is working on it, or who are the developers I should talk to about this, who are the most active ones, for example? What is the team's priority right now, right? These kinds of things plus the context Orbit could have on the code was really powerful.
Colleen Lake
AttendeesOkay. So a lot of the developers who are watching this are similar to you in that they're very experienced developers, but they've never touched this tool. And I mean, they haven't heard of it until about 2 hours ago. So that's understandable. What would you suggest how would the team set this up?
Aakriti Gupta
AttendeesGo through the documentation, exactly the steps. It's very simple to get started. The only thing is to decide whether you want it remotely. For that you would need some authorization from whoever manages your groups, to decide what projects this is activated for. Or if you do it directly on your repo, it's just on your machine. So there's really not much to do.
Colleen Lake
AttendeesOkay. Great. And what would you recommend for a good first use case for someone?
Aakriti Gupta
AttendeesOkay. First use case, if you are jumping into a new project or if you're new to a team or a company and you are diving right into it, usually, somebody would give you access, you get Git clone, you look at some documentation, that someone handed to you, starting out a little bit. Somebody tells you this is legacy code, don't worry about it. And then somebody helps you a little bit. But by the time you get to actually contributing to that repo, takes a long time, a lot of steps -- and a lot of times hear things like legacy, legacy. Don't touch it. Nobody knows about it. But legacy code is really just code that nobody is handling right now or all the people who had context on it have left the team. That is legacy code, right? So it's a really good tool to get into code you've never seen before or a legacy code or asking questions of it. That I think is my favorite use case. When I first started out in tech, I worked in consulting and one job that I kept doing over and over was going to a company right before the one person who knew the code retired. So we could touch it and like document it. And it seems like this is a very good use case for that and I think developers would be very thankful for it in a way, I am thankful.
Colleen Lake
AttendeesI am very thankful...
Aakriti Gupta
AttendeesIt can plug the gaps in documentation, for example. It can tell you where to plug the gap, and it can also -- it also just has that knowledge.
Colleen Lake
AttendeesShowcase clients.
Aakriti Gupta
AttendeesIt does. It does. And you can ask questions of it that you can't find in the documentation for example.
Colleen Lake
AttendeesOn onboarding, I find myself sometimes very shy about asking too many questions, like I think, oh, no, if I ask these people 25 questions in a row, they're going to think that this is just a hat rack up here and empty inside. And so with -- maybe that's just my own confidence issues, but I do think that a lot of people find it a lot easier to just figure it out ourselves or go through Orbit or another tool to just dig in there and find the answers.
Aakriti Gupta
AttendeesThat is true. I think it's very different if you give your code base to a person who's just joined your team or there's an experienced engineer with a lot of context on the project, on the priorities of the team, how the team functions the process, everything, and they can sit down with you and introduce things to you or help you get to your first commit. And the thing about asking questions you should. I think it's great to ask questions.
Colleen Lake
AttendeesI do ask questions and the team is in the audience, and they can affirm that with you. Yes, that's true. There's a level of question you can ask.
Aakriti Gupta
AttendeesThat is true. And well these agents don't feel bad if you ask a lot of questions.
Colleen Lake
AttendeesGreat. If I ask an agent 45 questions, well, okay, my usage tokens. But if I ask them all at once. Yes. That will be something.
Aakriti Gupta
AttendeesYes. Maybe humans get a little bit impatient, but for an agent it's not a problem, especially because it can give you answers. Or if it can't, it's just going to say, well, my resources are limited. This is what I have, and I can give you an answer. On that front, actually, I really liked that when I started using Orbit in my IDE -- the interesting bit was when I asked it a technical question, it could choose whether to use the standard tools it had or to use the knowledge graph from Orbit. So it could choose between the 2. I don't have to say, use Orbit for this task. So I don't have to think about what back end or what tools should go into it. It does.
Colleen Lake
AttendeesWow. That's very useful. Now the main benefit of Orbit is, of course, the overall context that it gives. How have you seen it improve the accuracy or help your development?
Aakriti Gupta
AttendeesThat's interesting. So I took 2 problems to it. One was about refactoring in general, the example that I already shared. And the other was interesting. It's a question I would generally take to an agent. I would say, okay, I'm considering adding this method here for calculating the checks. And it came back and said, there's already a method that does something similar, but I see what add-on functionality you're trying to give it. You can do this here. And generally, what I would do with that agent is say, okay, then go implement it, create the merge request and then we talk. But here, it went a step further. Not only did it say, okay, this is the method you should edit. It found the places where the method was used, which is also what any other agent would do. But on top of it, it could find several other places, 23 in that example, 23 pieces of the code that were going to be affected by that method even though the method was not directly called from there. So it saved me a few cycles of a broken pipeline where I find, okay, this is not working because of my code change. And it also saved me the hassle of potentially refactoring all those places just to run that line of code more efficiently. So that is, I think, a step better than what agents have been doing so far, which is what excites me about Orbit.
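[Editor's illustration] Finding code that is affected by a method without calling it directly amounts to walking a call graph backwards from the changed method. The sketch below shows one common way to do that with a breadth-first search; the call graph, the method names and the function name are hypothetical, and nothing here is taken from Orbit's implementation.

```python
from collections import deque


def affected_callers(changed, calls):
    """Split everything that reaches `changed` into direct and indirect callers.

    `calls` maps each caller name to the set of names it calls.
    """
    # Invert the graph so we can walk from callee to caller.
    reverse = {}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, set()).add(caller)

    direct = set(reverse.get(changed, ())) - {changed}
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for caller in reverse.get(current, ()):
            if caller != changed and caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return direct, seen - direct


# Hypothetical call graph for illustration only.
CALLS = {
    "verify_blob": {"calculate_checksum"},
    "sync_job": {"verify_blob"},
    "admin_report": {"sync_job"},
    "unrelated_task": {"format_date"},
}

direct, indirect = affected_callers("calculate_checksum", CALLS)
```

A plain text search for the method name would surface only the direct caller; the indirect set is the kind of result the speaker describes as saving a few broken pipeline runs.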
Colleen Lake
AttendeesFantastic. Now -- we've got a lot of developers in the audience right now. How would you recommend they use Orbit?
Aakriti Gupta
AttendeesRight. Two things here. First, I recommend using it on your GitLab instance because it comes not just with the knowledge of your code base. It comes with knowledge of what your team works like. Why were certain decisions taken? Was something deprioritized earlier? And why was that done? Or why is the code the way it is? You can go only so far with git blame, for example, you can't query commit messages. So you need more context. You need one agent that is empowered with all this context. And the other thing I would say is use it for everything. Don't just use agents as you were aging using -- sorry, as you were using, let's say, in March 2026. This is June. This is different. Go for it.
Colleen Lake
AttendeesEmpower agents...
Aakriti Gupta
AttendeesYes, that too. Empower your agent with that knowledge, and you will see a difference in accuracy, which is really important right now and just fewer cycles of going through your code and pipelines.
Colleen Lake
AttendeesFantastic. Now is there anything else you want to say to our audience about Orbit or about development in general right now?
Aakriti Gupta
AttendeesSame thing for both is what I use in my daily work life as well is push your agents, keep pushing them, keep extracting more and more, see how far it can go before it breaks, before the tokens run out or before it says, I don't know any more or I've hit the API limit, just keep pushing it, pushing and see how far it can go and use Orbit especially for bigger projects. And I don't just mean technically complex. I mean for product managers or engineering tech leads, if you've been sitting on doing a major refactoring and thinking, I have to go look into what will it take, how many cycles of work, how many people, this is not that important. Go in, these are the projects that bring you value, but you can do it for -- with fewer resources. Now is the time to go in, especially projects that include work across several teams. We've had those at GitLab before, where it took several milestones just because we had to communicate with so many different teams. This is something that can be done really well with one agent that has context on teams as well. So I really recommend pushing it to wherever its limits are, keep trying a diversity of projects.
Colleen Lake
AttendeesFantastic. Well, thank you so much for joining us.
Aakriti Gupta
AttendeesThank you, Colleen. Lovely.
Colleen Lake
AttendeesLovely, as always. All right. Next up, we've got Angelo Rivera, the engineering lead for Orbit, the tool that you have heard, I don't know, me say maybe 9 million times today already. Hello. How are you doing, Angelo? Orbit.
Michael Angelo Rivera
AttendeesOrbit Orbit Orbit...
Colleen Lake
AttendeesOrbit Orbit Orbit
Michael Angelo Rivera
AttendeesToo much orbit for sure.
Colleen Lake
AttendeesI don't think that's a thing. Come on.
Michael Angelo Rivera
AttendeesNice to see you.
Colleen Lake
AttendeesNice to see you...
Michael Angelo Rivera
AttendeesI just saw you 10 minutes ago.
Colleen Lake
AttendeesDon't break the illusion. Also as the clock says, I've clearly been here for 48 minutes. So, come on. All right. Now Angelo, I know you have a demo for us that you're going to answer some of the questions that we have from the audience. But first, I actually have a question earlier that one of the audience members said. And I'm going to read it to you because I think that's a really good question that we weren't able to get with William, but that you will be able to answer. And that is, since Orbit is basically creating a graph of my repo augmented with metadata from GitLab and is also capable of the parsing of 12 programming languages, can I use it for analyzing the architecture of my code/application?
Michael Angelo Rivera
AttendeesYes, definitely. I mean that's one of the main things. So one thing that was really cool about what the team built is we kind of built it for ourselves, right? And our service at GitLab actually was one of the first services at GitLab to be as decoupled from the monolith as possible. And so when you're doing cross-service architecture, you have to know all the integration points. And of course, there's like RPC communication and internal endpoints and all that stuff. And so as soon as we got Orbit wired, we immediately started dogfooding it ourselves. So we started asking all the various architecture questions. And of course, you need like all the issues, MRs and then on top of that, all the code. So I could even show you that too while we wait for some of the audience questions, but I don't want to bore you too much with it.
Colleen Lake
AttendeesAll right. Well, let's do -- I believe we have some audience questions. Let me pull those up.
Michael Angelo Rivera
AttendeesShould I get plugged in? Maybe...
Colleen Lake
AttendeesYes, please.
Michael Angelo Rivera
AttendeesI don't know if we're going to be -- so we should do the demo first.
Colleen Lake
AttendeesAll right. Now everybody in chat, just remember, make your sacrifices to the demo gods now, so this works.
Michael Angelo Rivera
AttendeesOkay. Should we pull up some of the questions?
Colleen Lake
AttendeesAll right. Yes. So do you have that first question there?
Michael Angelo Rivera
AttendeesSo there was one question that I thought that was -- there were 2 questions that I thought were really good from the audience. One of them was -- and we were checking to see beforehand.
Colleen Lake
AttendeesYes. Angelo has been on his laptop in the second row this whole time, just [indiscernible] in case you all were wondering.
Michael Angelo Rivera
AttendeesAnd let's -- so I picked 2 of them that I thought were really good. So let's do the first one. All right. Let's do it. Can I tell you what the first one is. So -- and I tried to come up with a cool prompt. They were asking basically in their company, there's a lot of times where sometimes someone will create a project and then basically just leave the pipeline running, even though there's like no one using the project after a long time, and that could cost like a lot of money after a certain amount of months, if not years. So I was like, what if we have the same thing. So I made like a little prompt. I made like kind of a crazy...
Colleen Lake
AttendeesLet's see.
Michael Angelo Rivera
AttendeesLet's go ahead and run it here. You can see all the other tabs that we had up for the demos earlier in Transcend.
Colleen Lake
AttendeesThat's nothing. You could still read words on there. You know it's bad when you can't read the word...
Michael Angelo Rivera
AttendeesWords right now.
Colleen Lake
AttendeesDon't tell all my secrets. Come on. No, I can look at that. Yes. Also, I can still see icons. You know it's bad when you can't see the icons. That's when my problem has gone so far.
Michael Angelo Rivera
AttendeesYes. I try to use group tabs, but it gets pretty bad. So I made one little prompt right here saying, use Orbit to find the scheduled pipelines that keep failing on dead projects. And then I added like a few other filters here, any project that's basically abandoned, and then rank them by the longest running and show each project path for that. And this is in GitLab. So we'll see how it goes here. It might take a while for it to run, but maybe we can walk through Orbit. I don't know how far William went with the UI here, but what are you saying?
Colleen Lake
AttendeesYes. So go ahead. Walk us through.
Michael Angelo Rivera
AttendeesAnd while we do that, I want to kick off the other one because, yes, these things take a while, especially because we're using -- they just launched Fable on Duo agent platform, which is awesome.
Colleen Lake
AttendeesWhen did they launch that?
Michael Angelo Rivera
AttendeesLiterally like the hour that it came out. This is really cool to play with it, especially with Orbit because you probably can ask even crazier questions now. So I'll let it do its thing. All right.
Colleen Lake
AttendeesAngelo, I know that we've mostly been showing Orbit today in the website version of GitLab, but where else is it available?
Michael Angelo Rivera
AttendeesSo the one thing that we did is we are making this available to any coding agent. Of course, the specific product stuff will be outlined later on, but you can use it with pretty much every coding agent. So I actually was going to pull up VS Code here. And so let's pull up VS Code here. And I'm just going to run it here on the root because this is all touching the API. And let's go here. So I already ran this question once just to make sure it runs correctly, but...
Colleen Lake
Attendees[indiscernible] Yes, but not that much.
Michael Angelo Rivera
AttendeesYes. So I'll just write the prompt. It was really funny, and I'll give the context of why I picked this one. So they're asking, could you find all of the MRs that have been rejected basically? And I thought that was really funny because I'm working with the team a lot, and I try to help out as much as I can. And they get mad at me because sometimes I'll [indiscernible] some MRs to them and...
Colleen Lake
AttendeesI'm [indiscernible] that guy.
Michael Angelo Rivera
AttendeesYes, they'll get mad at me saying, Angelo, clean up your agent slop. And so I thought this would be a really funny one to see how many MRs got rejected from.
Colleen Lake
AttendeesSo how many times your name in particular [indiscernible] .
Michael Angelo Rivera
AttendeesExactly. So I'm going to just say something like, my team sometimes rejects my MRs because they say I'm violating an agent slop. My team is, shout out to the team, of course, Michael Tsai, Michael Uen, Jean-Gabriel, let me get this right, and Dmitry and [indiscernible]. And then I'll say, can you find all MRs in the past couple of months that have been rejected by the team? And wait, find all my MRs in the past couple of months that have been rejected by the team. So let's kick that off.
Colleen Lake
AttendeesAnd drop your guesses in the chat for how many you think it's going to be. Over under 500 -- over under 10.
Michael Angelo Rivera
AttendeesOkay. All right. Let's see. That's a good one. That's a good one. So let's go back and see.
Colleen Lake
AttendeesThe first one.
Michael Angelo Rivera
AttendeesFirst one ended. Okay. So that was pretty quick. So here are some random projects in GitLab where we can actually break that down. Let me just make sure this is running. I forgot the #1 thing I needed to tell it to do. So let's re-prompt.
Colleen Lake
AttendeesWe only have 3 minutes left to type that.
Michael Angelo Rivera
AttendeesYes. So basically, we made the skill called Orbit, and you just install it once and then it has the full API. So I'm just going to say, use only Orbit. So now it will use the Orbit skill. And so that's already, of course, built into the Duo Agent Platform here. And so yes, we found pretty much all of the projects. And you can see this monitoring project has had over the past 6 months, 7,000 failed runs. And the last time anybody touched it was 18 months ago. And it's a good number of projects here. And you can see like this is super useful. You can just go in. And of course, we're going to go in and use this tool to do things we couldn't do before, which is clean up stuff like this. So I thought that was pretty interesting. Hopefully, this will run within the actual time frame that we want. But happy to chat while this is running, or we can maybe talk a little bit about the queries here that it ran.
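The kind of report described here, abandoned projects whose scheduled pipelines keep failing, comes down to a filter plus a ranking. The sketch below is only an illustration under stated assumptions: the `ProjectStats` record, the project paths, and all sample values are invented, and "longest running" is interpreted as longest time since the last activity. It does not use Orbit or any GitLab API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProjectStats:
    path: str                   # hypothetical project path
    last_activity: date         # last time anybody touched the project
    failed_scheduled_runs: int  # failed scheduled pipeline runs in the reporting window


def abandoned_with_failures(projects, today, idle_days=365, min_failures=1):
    """Keep projects idle for at least `idle_days` that still have failing scheduled runs."""
    stale = [
        p for p in projects
        if (today - p.last_activity).days >= idle_days
        and p.failed_scheduled_runs >= min_failures
    ]
    # Rank by how long the project has been idle, longest first.
    return sorted(stale, key=lambda p: p.last_activity)


# Invented sample data for illustration only.
projects = [
    ProjectStats("example/monitoring", date(2024, 12, 10), 7000),
    ProjectStats("example/active-app", date(2026, 6, 1), 12),
    ProjectStats("example/old-docs", date(2023, 1, 15), 0),
    ProjectStats("example/legacy-sync", date(2024, 3, 2), 450),
]

report = abandoned_with_failures(projects, today=date(2026, 6, 10))
for p in report:
    print(p.path, p.last_activity.isoformat(), p.failed_scheduled_runs)
```

The idle threshold and the minimum failure count are the two knobs a cleanup policy would most likely tune.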
Colleen Lake
AttendeesYes. Let's talk about the queries. We have 2 minutes left, though. So we'll talk about the queries. We'll see if it finishes. And I have one -- I have 2 things to remind the audience of before wrapping up. So talk fast. Marg, I'll talk fast.
Michael Angelo Rivera
AttendeesOkay. I can't. So if you look in here, it's pretty cool. The agent is able to write its own queries. So that's the one thing that maybe hasn't been explained in depth. Definitely watch the video that the team did. It goes super in depth. But what's really cool is because it's a schema and it's an ontology, it can do aggregation and filters across pretty much every dimension across GitLab. And that's how you're able to do these cool reports. So anyways, no more about my boring stuff.
Colleen Lake
AttendeesI think that's amazing. All right. So thank you all for joining, and we should have this up on the screen just to see how it's doing while we're wrapping up. Thank you, everyone, for joining us here today, both at Transcend and for the Developer Show. Now if you want to try Orbit out as I really want to dive into it even more now, go on to the GitLab docs and just dive right in. They're very straightforward, and it's a great way to get started. And if you really want to show off your skills and play with it, we do have an active hackathon right now on Devpost, so check that out too. And as always, if you have any questions, go to forum.gitlab.com and say, hi. Thank you all. And how is our demo going? We have 30 seconds left.
Michael Angelo Rivera
AttendeesStill running. We're still running up the previous one, and we'll...
Colleen Lake
AttendeesI'm like. Angelo, how many times did you say did your MRs get rejected? This -- I think it's good.
Michael Angelo Rivera
AttendeesMaybe too much...
Colleen Lake
Attendees2 months. You're blowing your team too much with AI.
Michael Angelo Rivera
AttendeesYes, there's too much here. Well, maybe it's here. Maybe they'll come back with something. All right. Well, so it's a little bit slower.
Colleen Lake
AttendeesAnd we've got it. Thank you all. Angelo will post the real answer on LinkedIn and online. Thank you so much for joining us today. And it looks like we've got it almost done now. Fingers crossed.
Michael Angelo Rivera
AttendeesIf not, we'll pull up the old one. Models will do this.
Colleen Lake
AttendeesModels will do this. We did try Fable for this one, but here's what it would look like.
Michael Angelo Rivera
AttendeesI'll pull up -- here we go, that's not I can't find it.
Colleen Lake
AttendeesThat's okay. Thank you so much, Angelo. Thank you, everyone, for joining. We'll finish right now. All right. That was terrible.
Michael Friedrich
AttendeesAI and what you can do. I have a couple of slides for the preparation so you can participate hands-on. This is me if you want to connect on social media as [indiscernible], but it's much more about you today. You need a working gitlab.com account to follow the workshop exercises if you want to participate hands-on. Or if you just want to watch and maybe practice later on, you can also just sign up and continue following the stream wherever you're watching. The session is being recorded, and we will be providing a recording afterwards, same for the slides. There will also be a Q&A session at the end, but please also feel free to ask a question any time in the Q&A section, and I think chat might also be open. Also if you're stuck, so if any exercises don't work for whatever reason, please let us know so we can unblock you while we are doing the workshop. From the setup perspective, there is a code that will be redeemed for our Learners platform. This grants you an Ultimate license, also access to GitLab Duo platform and credits, GitLab runners and also experimental features and so on. You will have access to the workshop environment 7 days after, so you can continue practicing, finish the exercises and later on also move or transfer the project into your own GitLab environment, whether it's on gitlab.com, self-managed or on Dedicated. Now the most important part is you need to redeem the code. And I think it's, yes, it's in chat already. And I just copy paste that over. Can I do that? No, for some reason, it doesn't work for me. I will ask my colleagues in the background to do that for me right now. But you need that code, go over to the GitLab cloud and redeem that. So in essence, it will look like this. You need to enter the GitLab virtual redemption code, copy paste that in, submit the code, and it should look like this. 
So when the redemption code is successful, you need to click on the Open [indiscernible] workspace button over here, and then it will take you to a new project overview. The ID here is randomly generated, but the most important part is seeing the provisioned group and project over here. That being said, hopefully, this is working for everyone and it's provisioning and testing our environment right now. When you see the project, one important step for this workshop is to set up your default namespace. You might be using a different namespace already. But for this exercise, we specifically need a Duo default namespace being set. So on the right-hand side, you can navigate into your preferences by clicking on the icon and selecting preferences. And within the user preferences in the behavior section, please select the default Duo namespace, which should be GitLab Learn Labs. This is a preparation so that later on, any credits or any agentic chat that will be leveraging that namespace works out of the box and you don't have to worry about any errors. That being said, let's jump right into the workshop, which I have over here. I may need to zoom in a little, though I'm already quite zoomed in. Maybe let's collapse the sidebar here. The project itself is a Python application with [indiscernible] content and the runbook for following the exercises for this workshop, so everything that I do now can also be accessed in the runbooks here. Click on runbooks and then the-workshop.md. These are the instructions that we will be following in the next couple of minutes. And essentially, let's quickly go in. What we will do today is we will be working exclusively in the browser. So no IDE, no terminal required, and we work on the swag shop here in this workshop. Like I said, it's a Python application with a front end. And there are some issues with that. So we want to analyze the project and prioritize any work that's available. 
There are broken pipelines that we need to fix. We want to implement a new feature and also accelerate code review. Last but not least, there will also be security remediation and resolution. So we are touching the top 5 use cases here, specifically around the Duo use cases. So like ask anything, from a design pattern and algorithm to debugging, how to implement something, or even create issues and leverage custom agents or specialized agents for that, and then go deep into fixing pipelines, implementing features, doing code reviews or automated code reviews with help from AI and agentic AI, and then later on, vulnerability resolution, which also includes the false positive detection, which you might have seen in the keynote demos earlier today. So there's also a hands-on exercise for that. Also as a matter of time for this workshop, I could probably go 3 hours and talk about everything. We will be touching on the majority of the use cases, but everything that's marked as a bonus or as an optional exercise, I will ask you to do async as a homework exercise, and we focus on the learning exercises. Now what makes Duo platform different? We actually work directly on the platform here. We have access to work items like issues and epics, merge requests, CI/CD pipelines and even more data from the background like metrics data, vulnerabilities and whatnot. So we don't have to do any context switching. Every action that we perform, as we will see in a bit, needs a human in the loop. So we need to approve that, verify that and also verify whether it can be applied, merged, run and so on. The learning curve is important. 
So we will start with agentic chat and then continue to dive deeper into foundational flows and also custom agents, which are often an interesting way to explore. Everything that agents do on the platform is also transparent, and you can observe and trace the agents, the behavior, the reasoning, the tool execution. And we will be using sessions, Duo Agent Platform sessions, in that regard and also practice that and how to verify that on the platform. Terminology is listed here. Agentic mode, like agentic chat on the right-hand side, is anything that agents might autonomously execute to fetch context, better understand the user question and get up to speed faster than a chat prompt could ever be. A quick win is a low-effort, high-impact improvement. So we will be looking at quick wins that we can solve in this project. Another term that I will be using throughout the workshop is flow, which is an automated sequence of AI-driven actions. So it could be multiple agents in one flow of specific steps in order to solve a problem which needs more reasoning, for example, implementing a feature or fixing a CI/CD pipeline. Last but not least, human in the loop means you as a human need to approve things before agentic AI is able to execute them. Okay. The workshop comes preseeded with configuration, but we also need to do a little work from the settings perspective. So this is as raw as you would start with GitLab Duo Agent Platform in your own environment. One thing we need to do is check the Duo settings, the prerequisites here, before we do anything else. So we need to enable GitLab Duo in the settings. And for that, the settings are over here, and let's go into general, and I'm pressing Command now to open a new tab. So we try to keep the runbook, the instructions for this workshop, open in this tab and then switch into the actual exercises now. 
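The two terms defined here, a flow as an automated sequence of actions and human in the loop as an approval before each action runs, can be pictured with a small sketch. This is an illustration only: `run_flow`, the step names, and the approval callback are assumptions made up for this example and do not represent the GitLab Duo Agent Platform API.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], str]]


def run_flow(steps: List[Step], approve: Callable[[str], bool]) -> List[str]:
    """Run steps in order, asking for approval before each one.

    The flow stops at the first step the human rejects, so nothing
    after a rejected step is executed.
    """
    results = []
    for description, action in steps:
        if not approve(description):
            break
        results.append(action())
    return results


# Hypothetical steps, loosely modeled on the actions approved in the workshop.
steps = [
    ("create issue", lambda: "issue created"),
    ("push commit", lambda: "commit pushed"),
    ("open merge request", lambda: "merge request opened"),
]

# A stand-in for the human: approves everything except pushing a commit.
results = run_flow(steps, approve=lambda description: description != "push commit")
print(results)
```

In an interactive setting the `approve` callback would prompt the user; a plain function is used here so the sketch runs unattended.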
If you're working on a large screen, you could also open 2 windows and work in parallel. But for this live stream, I need to switch between tabs in the browser right now. Now go into GitLab Duo and check that GitLab Duo is enabled. We allow flow execution, so we can later on use the CI/CD fix pipeline flow and others, which is a foundational flow. We want tool approvals for sessions. And the things we need to enable are these for false positive detection, secrets detection and also vulnerability resolution. So these 3 are important to have enabled for the later vulnerability remediation examples and exercises, and then press save. So this is step 1 of the preparation for this workshop and your own environment to enable these agent platform features. This is part of more high-level governance. So you can also enable that on a group level setting or instance level and verify that only specific agents and flows are available where they should be available. The next thing we want to do is, we've heard about GitLab Orbit today being in public beta. And in this workshop, we also want to make use of that. So it's important that Orbit as a knowledge graph indexes your project. It might have been created just now, so the indexing might just be kicked off and in progress. But what we need to do is click on the profile icon on the right-hand side here. And in our preferences, so the preferences here, let's open it in a new tab. That didn't work in a new [indiscernible] it down. There is the behavior section next to the GitLab namespace. So there should be GitLab Learn Labs. You can see GitLab Orbit and GitLab Duo. We want to use it. And we also want to make use of it in agentic chat, so basically tick all the boxes. So we ensure that GitLab Duo has Orbit available, and that's it. And once we've done that, click on save changes and navigate back into the runbook. Okay. Sorry. 
So right now, we are here. The next step is we want to trigger a failing pipeline in order to then use the fix pipeline flow, and we can do that in the pipelines over here. So on the menu, navigate to build, pipelines, open that and click on new pipeline. We don't need any inputs and no variables. Great. Click on new pipeline. And then we can see the CI/CD pipelines have kicked off. The stages are test, build, security, deploy and also cleanup. This will take a while. We don't have to worry about this. Let's focus on the other things required to run the workshop. Agentic chat is available in this workshop. So on the right-hand side, you can access agentic chat every time. If you close the panel, you can always open it up by clicking on the pen and paper icon. I don't know what the exact term for that is, but click on new chat and you can then use it. From a model perspective, for whatever reason, it's using a different model from earlier tests. You can search here for Sonnet 4.6, and we want to use the default, which is Vertex. I think by default, if you created a new account right now, you should have that, but you can certainly switch. For this workshop, Claude Sonnet 4.6 has been tested. So if you change the model, you might have a different experience. And yes, my advice is don't change the default model for now. Okay. There are a lot of things to discover here. We will talk about the different agents available in agentic chat in a bit. For now, know that there are foundational agents and also custom agents available, and we will make use of them in the next minutes. The last thing we need to prepare is actual work items and merge requests, and while this looks like a little too much, it also shows that the exercises will have code blocks. We can copy the code blocks using this small hover icon over here. 
And in order to prepare issues and merge requests for this workshop, we use instructions for agentic chat, which tell it to read a specific specification from the repository and then create GitLab issues. And I will show you what this exactly means by just walking through the exercise. So let's copy that and then paste that into agentic chat and then send that with the prompt, or you can also press enter. And the idea here is to let agentic chat generate or create the issues and merge requests. This could also be helpful if you, for example, have an epic with a specification and ask agentic chat to break it down into smaller iterations, issues, tasks, whatnot. The thing that now happens is agentic chat wants to create an issue. So it calls the tool for creating that and asks for our approval. We can read the description it has generated. And if we want to dive deep down into it, we can also see the raw request that's being sent. So now let's approve that. And the result should be 3 issues and 3 merge requests. So in the end, it created issue #2. Then it wants to create a commit and push, which is a preparation for the merge request. Let's approve that again. And we just continue with the creation. Sometimes it can take a little bit depending on what we're asking or how many people are pressing approve right now on the platform. And you don't have to wait until I press approve. The exercise is to approve the creation of all issues. If it gets stuck or for whatever reason doesn't continue, you can also write in chat, please continue, or hey, are you doing okay? Can you please help me finish the problem? And it might then do an inventory and recreate that. I saw that when I was practicing for this workshop last week. So again, agentic AI is sometimes a little unpredictable, but it's your partner, and you can make it work together. 
The other thing you will also see here is that it takes actions. For example, it reads the README.md into context in order to better understand it. And the reason why we do this exercise is so that we can actually look at issues and merge requests we want to fix, review and inspect for vulnerabilities in a bit. The other thing to know is what is important from the instructions here. Yes, if anything gets stuck, you could type to read the current context or also start a new chat and say, I did this specific exercise before, please do an inventory and help me continue. This is a different tool, a different command, creating a merge request. So we see the target branch, the branch, and it also automatically assigns labels for us, which is nice. And I can do all that just by using agentic chat right now. There is no different work item, issue, or merge request required. Okay. Now it's processing the final one with a note. And I hope this is working for everyone. I do see a question in chat. No credits remain available in agentic chat. Please check the default namespace in your user preferences. This should be GitLab Learn Labs. I think it's in chat. #4 has been created and we want to create a new commit and branch, and I think this should be the final request that we want to see. And usually, when I practice this, everything is fast, and now it's a little slow. So sorry for the inconvenience. What else can I share while we are waiting? Maybe I can just ask things. Okay. So I was doing that in the background and now I was a little impatient and just got confirmation. Here we go. We have the GitLab merge request, which is about fixing improved text contrast. The thing you can also see is it creates the merge request and renders the URL, and it will add a note to the issue and merge request, in this specific case, a merge request note. Okay. And it is rendering that. And if this becomes a little unreadable, you can also maximize the chat panel in order to see that. Okay. 
We have the issues created with the merge requests created. Now that this has been created, we want to actually leverage the issues and merge requests by asking GitLab Orbit about the relationships that have been added to the merge requests and issues. Therefore, we need to open a specific agentic chat. So click on the pen and paper icon and select Orbit as an agent. And this opens a new agentic chat with a specialized agent. And then let's copy the instructions here. So it asks whether we have merge requests with running or failed pipelines. Paste that into chat, send and see how it executes. The difference you can see here is it's taking the action to query GitLab Orbit directly. So it analyzes the comments, aims to get the graph schema and also the DSL, and then queries the knowledge graph or GitLab Orbit directly. It doesn't have to go the route using the REST API or GraphQL in the background, but rather gets the answer immediately. So we have a failed pipeline for the Flake8 line length. And there is also a running pipeline, which started 1 minute ago or 2 minutes ago. Would you like me to dig into why the pipeline failed? Yes, we could totally do that, but let's keep that for later for an additional analysis. I hope I'm not talking too fast and that everyone can follow along. If I do speak too fast, feel free to slow me down and ask a question in chat. Specifically for the custom agent, we want to make use of a custom agent that's called the daily Compass agent, which provides us insights on what needs our attention, what could be a quick win and so on. We need to enable that agent in our project. 
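The question asked here, which merge requests have a running or failed pipeline, is a relationship lookup over linked entities. The sketch below only illustrates that idea with plain Python data. The IDs, titles, and the `mrs_with_pipeline_status` helper are invented, and this is not the Orbit DSL or schema, which the transcript does not show.

```python
# Invented sample data: merge requests, pipelines, and the edges linking them.
MERGE_REQUESTS = {
    1: "Improve text contrast",
    2: "Fix line length",
    3: "Update documentation",
}
PIPELINES = {
    101: "running",
    102: "failed",
    103: "success",
}
# Each pair is a (merge_request_id, pipeline_id) relationship.
MR_HAS_PIPELINE = [(1, 101), (2, 102), (3, 103)]


def mrs_with_pipeline_status(statuses):
    """Return (mr_id, title, pipeline_status) for merge requests whose pipeline matches."""
    return sorted(
        (mr_id, MERGE_REQUESTS[mr_id], PIPELINES[pipeline_id])
        for mr_id, pipeline_id in MR_HAS_PIPELINE
        if PIPELINES[pipeline_id] in statuses
    )


needs_attention = mrs_with_pipeline_status({"running", "failed"})
for mr_id, title, status in needs_attention:
    print(f"!{mr_id} {title}: {status}")
```

Answering the same question through separate list endpoints would typically take one call per entity type plus a client-side join, which is the extra route the speaker says the direct graph query avoids.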
In our workshop, we can navigate into AI and agents, and let's open that in a new tab where we can see the default agents that are enabled, the foundational agents, but we can also access the AI catalog here and then search for the daily Compass agent, which is a custom agent maintained by the Duo Labs Group. It uses specific tools and a system prompt where we can see the quick win, and for now let's add that to our project. Yes, everyone will be able to use that. And we can see there's the latest version, okay? So this is also something that you get with custom agents. There's versioning involved, and we can update to the latest version. Yes. So it's similar to, for example, CI/CD components that provide new updates. You can pin custom agents to an older version, but you can also certainly manage that. And the agent itself, we will be using in agentic chat, but it's important that it's enabled in your project. So when we go back into the project menu here, we can see that the daily Compass agent is enabled. Also stay hydrated. If you don't have a glass of water or juice or something around, get something and take a sip in between. For now, the workshop is in good shape. We do have Orbit enabled and tested that it works. We have the daily Compass agent, and now it's actually time to get to work and start with the first use case. So a little more of a deep dive into agentic chat and the specific use cases. One way to get started with the project is to understand the project structure, what to improve, what needs attention right now, anything that comes to mind. This is especially true if you, for example, inherited the project from a different team or the teammate who created the project left the company 10 years ago, which could be the case with some Fortran code or some code that I have never seen in my career, like COBOL or any other older things, but also certainly projects where no documentation exists yet. 
For this specific workshop and swag workshop, there is documentation available, but certainly agentic AI can be helpful to understand that. So we want to copy the prompt, or we should actually follow the instructions first. Let's have a new chat. So again, select Duo as the default and then copy the instructions and paste them in here and execute, or let agentic chat analyze the environment. You will also see it's adding more context here. So it's referencing the current open page, but also agents.md and chat rules, so custom instructions, which help the agents navigate the repository faster. And we will do a deep dive on that in a bit. But for now, let's focus on the analysis. So we have -- let's make this a little bit bigger. We have: the code base is clean and well organized. The main areas are code duplication, data quality issues in the DPY, which is, I think, the database probably, and structural gaps. So there's duplicated cart logic, there are wrong categories, okay? It also knows about dead code. So there were functions defined but never called, and there are also other things. So it provides us a summary of the recommendations and also the potential effort to fix. Okay. That's good. What should we do next? Yes, we actually want to put this into action and say, let's create an issue for that, or let's create issues for these types of things. So let's copy the second prompt here and instruct it to create issues based on the recommendations. Waiting for it to execute is similar to waiting for CI/CD pipelines to compile; if you know XKCD 303, this is sometimes true. But every time I call out agentic AI, it comes back. So that's a good thing. It creates the issue, let's approve that, and it created the work item for us. Now we want to open that issue. So let's click, or press Command and click, and open that in a new tab, and we can see it extracted the things it found directly into an issue. 
The next action item can be working on the fixes, or we could also say, please break it up or promote it into an Epic and then split it into specific reports and so on. But let's see what the instructions tell us to do right now. I will leave it open, okay. Next thing to discover is the AI catalog. So before we actually use a specific agent now, let's go back into the AI catalog, and we have quite a few instructions here. To get into that, and some things I've mentioned before, okay. Anyhow, let's follow the instructions. Let's go back into agents, open that in a new tab, and I will close some of the remaining tabs. With the agents here, you can see the foundational agents that are available, which we will be using in a few minutes, and the AI catalog. When we inspect, for example, search for the planner agent, you can see that this is managed by GitLab. There is also a specific label on it that highlights that this is an upstream, GitLab-maintained agent. It was updated 2 days ago. So it's current, the latest version, and this is the current release. We have a description and also documentation being linked here so we can learn about this. The visibility is public. The type is foundational in this case. And the agent has specific tool permissions, which are mostly read-only in this case, except that it can create a work item and also update them, but, for example, it cannot delete a work item. And the work item in this case is issue, Epic and different other types. MCP servers would allow this custom agent to communicate with external tools or external context in that way. Nothing has been configured yet, and there is a system prompt, which we can see and which we will test in a bit. This is relatively long. So you could use that as an example for your own custom agent development, but this is a foundational agent that's available for us. The benefit here is that the AI catalog is a central place to manage agents and flows within the GitLab Duo platform. 
And you can also verify which projects or which groups are allowed to use these agents and flows, but also share that, maintain that over time, provide bug fixes and even more. And it's a central repository or marketplace for anything that's required in your environment. Okay. Next one. We want to focus on quick wins and use a custom agent for that. So we previously enabled the daily Compass agent in the project, and now we want to use that. So let's create a new chat on the right-hand side here and then search or scroll down, depending on what you want to do. What is the name? Okay, let's double check. Did I make a mistake in my agents? No, this is here. Can I open a new chat? There it is. Probably I should have pressed refresh. Okay. Let's open a new chat, and the Duo agent -- the agentic mode is active, and we want to use this specific prompt. So give me the quick wins for this project. I'm lazy, so I will just copy-paste that, not type that. Give me the quick wins. While this is navigating the project, we can see it searches the merge requests, the work items, and also vulnerabilities. And this workshop has a failed merge request pipeline. This is labeled blocked. Okay. Let's make this bigger again since it's rendering a little more text. And there is also a way, if you make the window even bigger, you can have a wide version of chat next to a wide version of GitLab in the same window. But yes, this is the full screen. Quick win [indiscernible], a comment by -- this is me, I call this out explicitly, but it should be your user if you see that. Depending on the output, agentic AI is not predictable. So it might look a little different in your responses here. There are also maintainability improvements to fix those specific tasks we identified earlier, and there is a review needed for the merge requests that we created. 
So it provides a recommendation, unblock the team by fixing the pipeline, and it even comes with a proposal of what we want to do next in the workshop. But let's quickly go back. Give me the link to the merge request... priority one. Okay, let's try that. Is this priority one? Maybe it's... I guess, okay. This is priority from an analysis perspective. It renders the link. So we can click on that and open it in a new tab. And it's the one which needs to be fixed. Okay. This is good. What are the next steps? Yes, the next step is as expected. We need to fix the pipeline. And in order to do that, let's follow the instructions here in the workshop as is. We have the merge request open. The merge request has a pipeline tab at the top here. Click on that, and you can see the stages where it failed. So the stage is failing with two jobs. We don't want to navigate into the job itself, or both jobs, and then run AI on the log, but rather get help fixing it from this level. And there, we have the Fix pipeline with Duo button, which we will click on. And you can see every flow that runs in the background -- sorry, creates a new session. This will take a bit as it runs in the background on the GitLab Runner infrastructure. And we will come back and see how Agentic AI and GitLab Duo will fix that. Let's quickly check that I'm following the exercises as proposed. Yes, for the sessions themselves, this opened automatically. So you can see here on the right-hand side, you can access it directly using the GitLab Duo sessions icon. And you can see that previous sessions were running here. The most important part around sessions is you can see what the agents are doing. So the agent reasoning here is it's fetching context and continues. Okay. Yes. And sorry, I'm losing my voice now. You can also access sessions on the left-hand side here. This gives you an insight on -- I will talk slowly. The session is running. 
And then it will also attempt to create a fix. And yes, while we wait for that, we should actually be continuing with the dark mode opportunity, which Compass surfaced as a bonus idea. Okay, let's go back into the GitLab Duo chat history. And the chat history is on the right-hand side here. For some reason, I used a different agent. We can see the bonus idea. I'm not sure which part I'm hitting now, but let's refresh maybe. Okay. Pressing refresh brought in the part agent into the current context. It seems like I have a good hand for finding bugs while doing live workshops. Okay. We have been doing this, the quick wins. We want to fix the pipeline. The pipeline is running, and we are now here in this specific context. Okay. We have the comment. Next step. And it instructs us to use a [voice chip] and create an issue with that. Okay. Sounds good. The next step is to actually understand the context that's available in this project. So the agents.md that's added into the project here helps the agent navigate and understand the context faster, which includes the tech stack, the deployment and a little more than that. So let's open that. On the left-hand side, we have the agents.md, opened in a new tab again. This is a full-blown example now, but you can start by defining the development style guide, a summary of the tech stack, like a Flask application for the back end and templates for the front end, but also specify how it's deployed in production and what the important targets are here. And most importantly, how to build and run the project. Specifically, it is a Python project. So it needs a virtual environment. And these are the instructions to help agents determine the exact same environment as, for example, CI/CD would be using: lint commands, test commands, and so on. 
And specific design patterns, project structure, style guide and so on. So this is helpful for any agents, whether they're on the platform, in an IDE or on the terminal, to better understand the code base and the project. The important thing to know about agents.md is that this gets loaded into every session by default. So it might consume too much of the context window. There are alternative ways developing right now with Agentic AI, which are skills that can be loaded on demand by agents when they deem they need a specific skill, for building a Python application, for instance, or generating documentation. This doesn't pollute the context window by default, and it can be an alternative. The other thing to mention and highlight is that the code review flow, which we will practice in a few minutes, doesn't read agents.md right now. This is an open feature request, but it uses its own customization file where we can provide specific instructions that matter for code review. Okay. So this is an example. There are more examples available in the documentation. Next one. The next use case is implementing a feature using the developer flow, specifically the dark mode feature that we have been looking at. And let's follow the instructions here: open Plan and work items. So we have Plan and work items over here, and then create a new chat with the instruction to create an issue for the dark mode. I could have clicked here, but essentially open Agentic chat and create an issue for the dark -- there's a question in chat, which I didn't see. The user asks, "When I create an issue through Agentic chat, does it use my identity?" So it's using a service account, but also a composite identity. In this regard, it's my identity. So I act as the issue creator. So when we create this now and open it up, we should see that I'm the author, and I'm also the assignee in this case. Practice it with a teammate and follow up on this. 
But essentially, yes, it's me as the author here for the dark mode feature, which we could totally now start working on in our local development environment, the repository, get started with Code, Codex, whatever tooling you're using, or even the CLI with the platform, but we can also do that directly on the platform. And one way to follow up on that is the developer flow. So the developer flow is more or less an issue-to-merge-request automation, which can be helpful to get started with a proof of concept, but also specifically with an issue that implements a new feature, generates architecture documentation like Markdown files, Mermaid charts, anything that's currently missing, and adds tests. I've been a developer myself for like 20 years now. And the most hated thing for me is writing tests. I don't know what you think about that, but this can also be something that gets automated with the developer flow and can be helpful. For the dark mode, we want to practice that now. So scroll down the issue. And at the bottom here, you have Implement with GitLab Duo, and click on that. Again, same behavior as the previous fix pipeline workflow. It kicks off a session, which logs the agent reasoning for the analysis and so on. We can also click on the session information on top to get more insights. It logs the user that triggers the session or the flow, and we can also see the job ID. So I could go even deeper and inspect the CI/CD infrastructure where this is being run, in a sandboxed environment. This is totally deep down, not meant for or not important to the normal user. But if you want to look into why something isn't working, you can do that. It's also important to know that you can run that in your own infrastructure. So there is a specific tag needed for GitLab Runner. And then you can isolate the flow execution on a dedicated runner platform and use that together with the existing GitLab platform. 
So there is nothing new to learn from an environment perspective or maintenance perspective for self-hosting the GitLab Duo platform, except for maybe running LLMs, but the Agentic runtime uses a well-known infrastructure and framework. Now this is starting, and you can see the developer agent lists the directory, reads the files, has a good understanding of the code base and creates files and continues. This might take a couple more minutes, so we can put that in the background and come back to it once it's finished. And by finished, I mean it's creating a merge request, which then triggers CI/CD and code review, which we can jump into. Now while we wait for this to finish, we can have a look into the fix pipeline session. And for that, let's go into AI and sessions. And we can see the fix pipeline session has finished, which is great. It's trying to understand why it failed, a lot of reasoning in there. And there we can see it created a fix and then pushed the fix into the merge request. And it also added a comment to the merge request, okay? What are the -- no, the merge request isn't directly here, but we can access the merge request from Code, Merge requests, at the top here. Okay. I've analyzed this and created a fix. Okay. This is a behavior change I wasn't expecting. But when it determines it cannot push the fixes directly into this merge request, it will create a follow-up merge request, which is sort of a stacked merge request, if you will. So it's not targeting the main branch, but rather the Git branch of the original merge request, and I can actually merge that into the merge request just now. Sometimes it commits directly into the other merge request. So yes, this is something we actually want to approve and merge. Our approval is optional. Everything is green. Let me quickly chat and say, "Get ready, merge into the target branch." Okay. Let's merge that in. Why is it failing? This is [indiscernible]. 
It's a draft. I should follow the instructions. First, we need to mark that as ready. Merging is now -- okay. It introduces some vulnerabilities. Please ignore them for now. We will use them in a bit. Okay. So in the original merge request, we can see the merge, and now the pipelines have kicked off again. And then we should see that the merge request has been fixed once the pipeline continues. And actually, we can see Flake8 and so on are green now. This is what we saw earlier. That being said, let's move on. Since there are a couple more exercises, we can use the planner agent, which is a foundational agent, to help us with planning, create issue estimates, set due dates and so on. The way to handle that is to create a new chat and then specifically select the planner agent over here and copy the prompt directly here, which asks which work items are missing estimates, due dates, and assignees. And now we get an overview of specific gaps for the planning. Let's make this a little bit bigger and review the recommendations. For example, we don't have a milestone yet. Let's go back to the runbook and then check how we can act on those gaps. So for example, I could say, please assign all unassigned issues to me, which we can do over here. And it should fire an action. It's actually looking up the user ID. Yes, this is me. I cannot remember my ID, unlike my ICQ number. But yes, it's now executing the assignment. Let's approve that. It fires them off in parallel. And you should see the same for your user name and an estimate. Would you like me to suggest story point values based on that? Okay. No, not now, but thanks. I'll follow up in that conversation. But we want to look into another foundational agent, which is the CI expert agent, which is currently in beta. And this is helpful to analyze the current pipeline, figure out if there are inefficiencies and get a better understanding of how this actually works. 
In my career, I've seen GitLab CI/CD pipelines, like 10 years ago, which were created and then never touched again, so like "never change a running system." And everyone was living with a run time of 1 hour. But this is an opportunity to maybe optimize that and analyze it in the beginning. So let's create a new chat, sorry, I was too fast, with the CI expert, in beta, as a foundational agent, and then copy-paste the CI prompt here to analyze the pipeline. And while this is running, there is another exercise, for example, to do a pipeline tour. So get an understanding of what each stage of the pipeline does. And security investments, okay. Let's make that a little bit bigger again while it's running, and there's certainly a lot of things to fix and do. This is interesting. Interesting in a way that I didn't see that yesterday when I practiced the workshop. So sometimes it comes up with more pressing ideas or identifies something that should be fixed. The pip cache key, yes, the parallel jobs share the same cache, which can be a problem if one job installs Python dependencies and another one does so again, creating a collision and a non-reproducible environment. This is something that should be fixed. Obviously, we don't want to fix it by ourselves, but continue the conversation and then ask Agentic chat to do that for us. And use a lighter image, okay? Because we're using a default Python image, we could use an optimized container image that we build and maintain using the GitLab container registry in the background, and other optimizations, okay? There's also an estimated risk in there and the summary table. This explanation is in beta right now. So let us know what your use cases are in that regard. Moving on, another agent to practice is the data analyst agent. And this comes in handy if you want to get an insight on, for example, the merge request throughput or open issues in a project. 
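The pip cache collision mentioned above for the parallel jobs can be illustrated with a small sketch. This is not the workshop's pipeline configuration; the function name `pip_cache_key` and the key layout are assumptions made for illustration. The idea is only that a cache key derived from the job name, the interpreter version and the dependency list keeps jobs with different dependencies from overwriting each other's cache.

```python
import hashlib


def pip_cache_key(job_name: str, python_version: str, requirements_text: str) -> str:
    """Build a cache key that differs per job, interpreter and dependency set.

    Hypothetical helper: the key format is an assumption, not a GitLab convention.
    """
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()[:12]
    return f"pip-{job_name}-py{python_version}-{digest}"


# Two parallel jobs with different dependencies get different keys,
# so neither job restores or overwrites the other's cached packages.
lint_key = pip_cache_key("lint", "3.12", "flake8==7.0.0\n")
test_key = pip_cache_key("test", "3.12", "pytest==8.0.0\nflask==3.0.0\n")

# The same inputs always produce the same key, so the cache stays reusable.
repeat_key = pip_cache_key("lint", "3.12", "flake8==7.0.0\n")
```

In a real pipeline the equivalent would be expressed in the CI configuration rather than in Python; the sketch only shows why a single shared key is the source of the collision.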
The great thing is that in the background, it's using the GitLab query language, and it can also output that. So let's practice that by creating a new chat, using the data analyst agent and copying the prompt. And the ask is to get a GitLab query language view, which I can then embed in my wiki, but I could also embed it in issues and Epics to have a live view on the data and not just a static Markdown output, which I always need to update later on. And this is great, so I don't need to learn the query language itself, but it knows that from its trained context or the context in the background from the GitLab documentation and the scope. Okay. I remember that it has inline rendering for that. So you can immediately see this is a live fetch. So it's like a macro markup language. If you're familiar with Confluence macros, this is a similar implementation to fetch live data from the platform. We get an overview of merge request activity, where in this case, I authored the merge request, but everything that was authored by the flows is actually attributed to that specific service account. I think I didn't explain it in an earlier step. Okay. Now let's dive into the fourth use case, and I know we are running a little bit out of time. So let's focus on code review. There are some merge requests that might require code reviews. By default, the flow for code review does not trigger automatically for merge requests in this project. If we want to enable that, and let's do that for a second, in the settings, we want to navigate into merge requests. Let's open it in a new tab. There are the code review settings at some point. I should remember that. Code review, and then enable the code reviews. So every merge request which is not a draft automatically triggers a code review. We didn't create any merge requests yet, but we can also request a review manually. Let's see which one we should use from the instructions. Open the merge request about the implementation of product search and filtering. 
I think I've seen that before in the keynotes today. Let's go here into Code and Merge requests and open this specific merge request, and I could either edit the reviewers here, or the other way, which I feel confident with, is request review and then say GitLab Duo, which is a quick action in GitLab, and this then requests a review. Same pattern as before, it kicks off a flow, which logs everything in the session. So if we navigate on the right-hand side into the sessions here, we can see the code review session is being triggered and we can follow along. Again, here are the details. And what happens in the background is it starts a specialized flow, which takes into account not just the merge request description and title, but also the differences and the repository context and specific other references from the software development life cycle (SDLC) context, discussions, like if there is an ongoing review already happening, in order to make informed code review suggestions. It will also comment with suggestions or with remarks. And while it is doing that, let's quickly have a look into the actual instructions. We have the code repository, let's open that. And within the tree, we do have .gitlab and the mr_review_instructions.yml. This is a YAML format file, and there is also documentation for that with the instructions. And for example, for CSS, you can match on files or file paths and directories and then provide the instructions for review. So specific style guides, examples to catch even -- this is a lot of front-end code. I'm not super confident with front-end engineering, but this sounds good. The Python/Flask back-end security and best practices, code quality using PEP 8 compliance, Flask-specific things, performance, so like which database queries should be optimized or look for specific query problems, memory leaks and whatnot, tests and a whole lot more. 
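The idea of matching files or file paths and then providing review instructions for them can be sketched as follows. This is not the format of the review instructions file and not how the code review flow is implemented; the rule list, the patterns and the function `instructions_for` are hypothetical and only illustrate glob-style matching of a changed file against instruction groups.

```python
from fnmatch import fnmatchcase

# Hypothetical rule set: each group has glob patterns and the guidance to apply.
REVIEW_RULES = [
    {
        "name": "CSS style guide",
        "patterns": ["*.css"],
        "instructions": "Check naming, accessibility labels and placeholders.",
    },
    {
        "name": "Python/Flask back end",
        "patterns": ["*.py"],
        "instructions": "Check PEP 8 compliance, query safety and performance.",
    },
    {
        "name": "Tests",
        "patterns": ["tests/*"],
        "instructions": "Check that new behavior is covered by tests.",
    },
]


def instructions_for(path: str) -> list[str]:
    """Return the names of all rule groups whose patterns match the path."""
    return [
        rule["name"]
        for rule in REVIEW_RULES
        if any(fnmatchcase(path, pattern) for pattern in rule["patterns"])
    ]


css_groups = instructions_for("static/css/site.css")
app_groups = instructions_for("app.py")
test_groups = instructions_for("tests/test_app.py")
doc_groups = instructions_for("README.md")
```

A file can fall into more than one group, as the test file does here, which is one reason to keep each group's instructions short and specific.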
In the GitLab documentation, there are also a lot more examples for different programming languages. So I think I added plenty of these recently, which will give you a quick start on the specific things. Now let's see if this code review has finished. No, not yet. It's still reasoning through things. Okay. Let's check what else we can do while we are waiting for that. I mean, in reality, you shouldn't be waiting, but rather keep it in the background and focus on the next important task on your agenda. And when it's automated, you don't even need to do any handholding for those reviews because they happen automatically, similar to a pipeline that's being kicked off. Manage, Activity is something we can check out. Let's open that from the Manage menu. And we can see now that GitLab Duo, or the code reviewer, has commented with feedback, but we can also see the user activity. And there is also a developer who started implementing a feature. So this is using the service accounts and composite identity here. The fix CI/CD pipeline flow has run, and even more than that. So this is a small insight into governance and compliance to see what the agents are doing. Now that I know that there is feedback, you can see it's following the instructions, such as a placeholder is missing, a label's missing, an accessibility label. Okay, and a whole lot more. So we can add those suggestions to [indiscernible], apply them and then continue the review. It's interesting. It follows an exception. It's also following good practices on the style guide for Python. And finally, after looking at the code review use case, the last use case we want to look into today is vulnerabilities, and not just analyzing them, but understanding whether one has an impact on the product, on the application, or if it's a false positive, and then also getting Agentic help for implementing a merge request and implementing a fix. The first thing to do is to inspect the vulnerabilities for this project. 
So let's go into Secure and Vulnerability report, and open that in a new tab. And over here, you can see there were some critical vulnerabilities, high and so on. And we need to filter them: all statuses, still detected, and Advanced SAST, okay. So click that away. And the status is all statuses. You need to click outside of that dropdown to actually mark that. This is a known [indiscernible] problem. Activity is still detected. And then we need to look into -- we actually don't want to look into anything. Let's open a new chat, and in this case, use the security analyst agent and copy the prompt: list all critical and high vulnerabilities with their locations. Okay, let's make this a little bit bigger because there's a lot of information coming in. Okay, secret detection and so on. What else can we look into? Okay, we could also use the Orbit agent as a secondary step, so create a new chat with Orbit and then repeat that question, which vulnerabilities are detected and which scanner found them. Again, the difference here is it uses Orbit, gets the schema and fetches the information directly, and it explores the project and vulnerabilities and breaks down the information. So it doesn't use any REST APIs or anything else in the background, which would take more time or even more LLM round trips and cause more cost. And we have an insight in a similar fashion. Okay. Next step, we want to triage everything. So let's go back into our chat screen with the security analyst and say, triage all the SAST vulnerabilities, confirm real risks, dismiss false positives, do not dismiss vulnerabilities used in tests, and create work items for confirmed findings that need remediation, okay? This is a lot to unpack. First, confirm everything which is real, dismiss the false positives, keep everything that's used for testing, and then also create a work item for anything that needs remediation, okay? 
So it reads all the vulnerabilities and the source code and more context, and has a complete picture. Okay. Dependency scanning found a pip vulnerability, so that needs an update. And for dismissing the vulnerabilities, well, this is not a confirmed version. Now it's dismissing everything that's intentional. So I don't need to select that over here and figure out what is currently right or wrong. This is an automated action for that specifically. Okay. While this is thinking, what else should we do? Okay. False positive detection. And we got stuck again. Sometimes I'm a little impatient. Okay. While this is thinking, let's look into false positive detection. Let's go into the vulnerability report again. And this time, we want to filter; copy-paste the filter over here, scanner:sast, Advanced SAST, into the URL as a quick cheat. There is no activity. The medium-severity active debug code finding in app.py. Active debug code, there it is. And yes, this runs the Flask web server with the debug flag. So in a developer environment, you get more output on what's happening, like listing or logging requests and whatnot. The exercise here is to determine whether this is a false positive or not. So the step we need to take is to click here and not resolve, but check for false positive. This triggers a flow in the background. So we have another session running. So if we access the sessions here, we should see there's a false positive action running. And yes, it will also take a minute or so to continue that. While we are waiting for that in the background, the next exercise is about fixing a vulnerability. And this time, it's about an SQL injection. So for example, the web application reads a user input directly and puts that in an SQL query. And the next day, you're wondering why the database is deleted, or it might be sold on the darknet because someone created a dump with some SQL injections, and there are scanners available that help detect that. 
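The SQL injection pattern described above, user input placed directly into an SQL query, can be shown with a minimal sketch. This is not the workshop application's code; the table, the function names and the payload are made up for illustration, and it uses Python's built-in sqlite3 module with an in-memory database. The unsafe variant builds the query by string formatting, the safe variant passes the input as a bound parameter.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up products table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)", [("sticker",), ("t-shirt",)]
    )
    return conn


def search_unsafe(conn: sqlite3.Connection, term: str) -> list[str]:
    # Vulnerable: the user input becomes part of the SQL text itself.
    query = f"SELECT name FROM products WHERE name = '{term}' ORDER BY id"
    return [row[0] for row in conn.execute(query)]


def search_safe(conn: sqlite3.Connection, term: str) -> list[str]:
    # Safe: the input is bound as a parameter and is never parsed as SQL.
    query = "SELECT name FROM products WHERE name = ? ORDER BY id"
    return [row[0] for row in conn.execute(query, (term,))]


conn = make_db()
payload = "x' OR '1'='1"

leaked = search_unsafe(conn, payload)   # the injected OR clause matches every row
blocked = search_safe(conn, payload)    # the payload is treated as a plain string
normal = search_safe(conn, "sticker")   # regular lookups still work
```

With the payload above, the unsafe query returns every product while the parameterized one returns nothing, which is the kind of difference a fix for this class of finding typically aims for.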
And to actually practice that, let's follow the instructions: Secure, Vulnerability report. And we want to modify the URL [indiscernible], and I need to cheat on which line; it is 213. Okay, 213 is the proper one. There we can see that Duo is verifying a false positive. So we always have that insight into what's going on in the background, which is nice. Here is the improper neutralization. And do we need to explain it? No, we want to resolve it. No, first, we explain it, then we resolve it. We can click explain here, which opens an Agentic chat session that we can follow along. It reads the vulnerability into context and also the source code, explains the vulnerability, how an attacker can exploit it and how to fix it. Yes, I could totally continue in this chat session. But the other thing we can do is leverage resolve with Agentic AI, click that. And again, we can see a new session has been started for fixing that, leveraging Agentic AI in the background. So essentially, it's running a flow, which analyzes the source code, creates a fix, creates a merge request, which then kicks off the CI/CD and security scanning again. So you can immediately verify that the vulnerability is gone in that created merge request. And while we wait for that one, we can actually go back to the false positive detection, which is running over here, and we can also just access that in the session. Let's scroll to the bottom. We can see the reasoning and the analysis, and then it updates the explanation specifically in the vulnerability report. So when we press refresh, we can see there is an AI false positive confidence score, 15% in this case. This is not a false positive, so we should fix that. And it provides a lot more insight and a risk assessment to remove that. So if you put that in production, the debug mode should be disabled. It should even be removed from the parameter. This is a lot of information. We can also remove the flag if we want. 
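One common way to address the active debug code finding is to stop hard-coding the debug flag and make it opt-in per environment. The sketch below is an assumption about how that could look, not the fix the flow generated; the environment variable name `APP_DEBUG` and the helper `debug_enabled` are hypothetical. It uses only the standard library so it runs without Flask installed, and the commented line shows where the value would be passed to Flask's `app.run`.

```python
import os
from collections.abc import Mapping


def debug_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when APP_DEBUG is explicitly set to a truthy value.

    APP_DEBUG is a hypothetical variable name chosen for this sketch.
    Anything unset or unrecognized counts as "debug off".
    """
    return environ.get("APP_DEBUG", "").strip().lower() in {"1", "true", "yes"}


# In the application entry point, assuming `app` is the Flask instance:
# app.run(debug=debug_enabled())

production_default = debug_enabled({})                 # nothing set: debug stays off
developer_opt_in = debug_enabled({"APP_DEBUG": "true"})
explicit_off = debug_enabled({"APP_DEBUG": "0"})
```

Defaulting to off means a production deployment that never sets the variable cannot end up running the debug server by accident.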
But yes, the agent helped us understand whether it's a false positive or not. And this is not just true for this specific vulnerability, but certainly for any that's open, and it helps triage a backlog of vulnerabilities, like, I don't know, maybe 100, maybe 1,000, depending on how big the code base is. And you don't need to sit by that and review them one by one, but rather use the GitLab Duo platform and the security vulnerability remediation capabilities for that. Now the vulnerability resolution flow was running. Is it this one? Yes, press refresh. The session is still running. Yes, it's still going on. So I would ask you, as a matter of time, to verify the fix by yourself. There are follow-up exercises or follow-up thoughts around how to implement monitoring, alerting and audit logging for security vulnerabilities, which metrics to confirm, anything to check. Yes, that's about it, and some optional exercises from a playtime perspective. So there's so much more you can do with the GitLab Duo platform which doesn't fit in those 90 minutes, like adding MCP and so on. If you want to see more of that, we do have more workshops coming up, more live sessions. If you're in Germany, Austria or Switzerland, I will be at the [indiscernible] next week in Frankfurt and then the week after in Berlin, doing live hands-on demos and hopefully not losing my voice. But that said, this brings us to the end of our workshop. Here it is. If you want to transfer the project -- so when you transfer, you lose access to the LearnLab. So right now, probably continue in the project and finish the exercises. But there is an instruction in the transfer.md, which helps you move everything to your own environment if you want to. 
And if you have any further questions, let us know, either here, or you can join our community forum, Discord, social media, meet us in person, ask sales, your account executive, or ask our support team. We are here to help you adopt the GitLab Duo platform. Yes. And thanks for joining today and participating, and I hope you have a great rest of your week, fully empowered now with what you have learned today. Thank you.