Synopsys, Inc. (SNPS) Earnings Call Transcript & Summary
March 28, 2024
Earnings Call Speaker Segments
Kevin Collins
Hello, everybody, and thank you for attending today's webinar, the 2024 guide to open source security and risks. My name is Kevin Collins, and I'm here with Mike McGuire from Synopsys, and he'll be taking us through the presentation. Before we begin, we'd like to go over some housekeeping items. First off, by attending this presentation, the data will be shared with BrightTALK, and Synopsys may contact you about products and services that you may be interested in. You may unsubscribe at any time. [Operator Instructions] We'll also be sending a follow-up with the webinar slides as well as the recording. Finally, there's an attachments tab. We've included a number of additional resources on that tab, including the 2024 open source security and risk report. That's the report that this webinar is actually based on. So with that, I will pass it over to Mike to get us started.
Michael McGuire
All right. Thank you, Kevin. Thanks to everybody for joining us here today. I'm here to talk about open source in its current state today, its usage across the entire software industry, the entire software world, and exactly what that means for you as a development team or a security team or perhaps even as a consumer of software. So here's what we'll cover today. Real quick, why are we focusing so much on open source? We'll talk about the prevalence of open source across applications across every single industry and dive into the risk associated with unmanaged open source usage. We'll wrap it up with some best practices and suggestions for teams that are using open source and are looking for ways to track and manage the open source and reduce any risk that it might bring in, any supply chain risk that it might bring in as well. So before we talk about findings, I want to talk about the data sources that we use for the open source security and risk analysis report. I'll refer to it as the OSSRA report from here on out to avoid being tongue-tied. But like I said, I want to go over the data sources because, obviously, you shouldn't believe everything you read here on the internet unless it comes from Synopsys, of course. So I'm doing your due diligence here for you to show you where we get our data. And hopefully, that helps my viewers here understand where this data comes from and how that impacts the software that they use and build today. So within Synopsys, we have product offerings, but we also have consulting services and audit services. And the audit services are going to be the ones where our data comes from. We call them Black Duck audits. So that is the services side of our Black Duck software supply chain -- I'm sorry, Black Duck software composition analysis tool. And what this audit team does is it's usually going to be activated in the context of a merger and acquisition of a company, most often a technology company.
So company A is acquiring company B, then we'll be asked to come in, usually hired by the acquiring company. So we'll be hired by company A to come in and audit the software, technology and code bases owned by company B. We're a third-party auditor, and we don't provide any IP to the acquiring company because, of course, the deal isn't done yet. So we'll do that audit and we'll type up reports that define the level of risk associated with that target code base and the target applications, and we'll send that back to our clients in the acquiring company so that they can understand the risk that they are going to be subject to should they proceed with the deal. Most of the time they do. They just want to understand before they do proceed with the deal. So again, we're that trusted third party so that there is no transfer of IP before the deal is done. Now when we do these audits, we collect information, anonymous information, of course. And we're able to scramble it up and whip it up into a fashion that we put into an annual report, the OSSRA report, and publish it out and push it out to a broader market so that everybody can understand how open source impacts our daily lives and our applications and our jobs today. Now we look at code across every single industry. This year, we actually looked at 1,067 code bases across every industry, down a little bit from last year. Last year, we looked at 1,703 code bases, but with the state of the economy, tech transactions are down a little bit. They're recovering. So next year, you'll probably see that number go back up. But again, it's just the state of the economy, and we did look at over 1,000 code bases. So we feel as though our findings are pretty significant. You can see the distribution of the industries that we audited code for over here in the table, and the overwhelmingly most common industry is going to be enterprise software and Software as a Service.
The reason for this is pure speculation, but I think it's pretty accurate speculation. And that's the fact that we are moving towards this reality that every company is a software company. So you no longer have to be a software vendor to be considered a software company. You just need to be any company. Think about your bank, think about your favorite grocery store or your favorite e-commerce site or your insurance company. They all use software to enable their customers and enable their business. And a lot of that software does get moved around, which is why we're seeing such a heavy distribution of those applications throughout our audits. This number down here, the 88% of code bases with risk assessments, those are going to be the code bases that we focus on today during this talk. So 88% of those 1,067 code bases actually underwent a risk assessment. So those clients wanted to understand what kind of security, compliance and quality risk this software is subject to, whereas the remaining few were just interested in what the collection of assets looked like and perhaps how it was formatted. But again, most of these code bases gave us a good picture of risk. So enough about that, let's start talking about findings. But why are we focused so much on open source findings? We wrote this whole report on open source software. Now the reason for that is pretty simple. As a lot of folks on this call or this webinar probably understand, software supply chain security is very much top of mind right now. And you can't talk about software supply chain security without talking about open source software. And you'll see why as I go through this presentation. But, as a teaser, open source software makes up the vast majority of modern applications today. So it represents the biggest chunk of the software supply chain.
And that means the biggest chunk of the software supply chain is being developed outside of the control and outside of the visibility of the organizations that are actually taking it and using it to build finished applications that they will ship and that will be used as is. You can see here some prime examples of software supply chain threats, risks or attacks: Log4j, OpenSSL, Apache Struts, which was the Equifax -- news from the Equifax breach, curl very recently. And then you see SolarWinds here. Now you probably already know, SolarWinds had nothing to do with open source software, but it was very much a software supply chain incident. But the point I'm trying to drive home here is most of these, the vast majority, are going to have something to do with open source software. And that is why we are so focused on open source software here at Synopsys and, of course, with this report. So what are some of our findings? Well, first of all, every industry relies on open source. Some more than others, but there is a heavy, heavy reliance on open source across every industry, at least those that we looked at. 96% of the code bases that we analyzed contain some sort of open source. And the exact definition of that would be that they contain at least one open source component, but it's very, very rarely just one because 77% of all the code -- of the code in the code bases originated from open source. Now the way I'd like to articulate that last number is if you take all of the code that we analyzed and put it in one big giant file, 77% of that code originated from an open source project. So just over 20% of that was actually written in-house, was actually proprietary or written by a contractor. So you can see here, open source saturation is very, very significant. So most code bases are going to have more than that minimum of one open source component to be included in this top right number, this 96% number here.
So no matter who you're talking to, if they build software, they're most likely using open source. Now open source software is going to flow into an application through many different avenues. So there's the easiest, the lowest-hanging fruit here, and that is package managers. This is what package managers are made to do. So those like npm, Maven, pip for Python, these are used to bring in dependencies -- open source dependencies on an intentional basis. So should a developer need a specific library that they don't want to have to recreate because they know there are open source libraries available for their use, they're usually going to find it. And the package manager will facilitate bringing that into their application. And then, of course, bringing in any secondary or transitive dependencies that are needed to make that direct dependency work. So these are all the dependencies that are intentionally brought into an application. But that's only one piece of the puzzle. There's always going to be dependencies that are not found in package managers. So these package managers are going to have an artifact like a pom.xml file or a package.json file that is going to list your dependencies. It's like a table of contents. So you can track them that way. There are some complications to that, but I won't go too far down that. But there's always going to be dependencies that are not found in package managers. Now the most obvious example is going to be if you're using C and C++. The way that open source is used within applications written in these languages is a bit more nuanced. You can have open source linked in through header files from some sort of outside directory or you can have open source included directly in a project directory. But again, there's really no standardization around package managers when it comes to these languages, which are still very, very popular, especially in some industries.
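As a rough illustration of the "table of contents" idea, here is a minimal Python sketch that reads the declared dependencies out of a package.json manifest. The manifest contents and the function name `declared_dependencies` are made up for the example; only the `dependencies` and `devDependencies` keys are standard package.json fields.

```python
import json

# Hypothetical package.json contents, inlined so the example is self-contained.
MANIFEST = """
{
  "name": "example-app",
  "dependencies": {"jquery": "^3.4.1", "express": "^4.18.2"},
  "devDependencies": {"jest": "^29.0.0"}
}
"""


def declared_dependencies(manifest_text):
    """Return {name: version_range} for every dependency the manifest declares."""
    manifest = json.loads(manifest_text)
    declared = {}
    # Both sections are optional in a real manifest, so default to empty.
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return declared


deps = declared_dependencies(MANIFEST)
for name, version_range in sorted(deps.items()):
    print(name, version_range)
```

A listing like this only covers what was declared on purpose. It says nothing about transitive dependencies, vendored C or C++ code, or copied snippets, which is the gap described above.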
You even see container images, which can be hard to analyze, and they're very heavily used, especially those base images that are pulled off a container registry. It's very difficult to see what open source is being brought in. But you can almost guarantee that there's open source included in these images. Then there's binaries, libraries, artifacts and containers. These are those components that a development team or a build team or a DevOps team does not have any access to source code for or does not have any access to the build environment for. So it's very difficult to understand what open source dependencies are included in these artifacts and components. And then finally, code snippets. So I cut my teeth as a software developer back in the day. And if I needed some sort of small algorithm or something bigger, like a quadtree, I would simply Google it, go on to Stack Overflow, and I would copy the source code and paste it into my application. So I would not actually depend on a package manager to bring in the libraries and the transitive dependencies. I didn't need all of that. I just needed a little bit of code. But little did I know that software, that code, was still intellectual property. And I brought it in without the license context. So I could unknowingly be violating intellectual property rules or laws. And I have no idea what those laws are because I don't know what the license is. Now if, when I was writing software, there was GitHub Copilot or ChatGPT, I'd be using those to do that for me. So those are available today. So that is a very, very popular avenue for developers to do just what I did. But again, it opens up another stream for open source to flow into an application, untracked, unmanaged, but still introducing a significant amount of risk to the organization that's leveraging it. Now all of this, all these avenues for open source to flow into an application, is going to lead to a lot of dependencies in applications.
On average, there's 526 open source components or open source dependencies per modern application. But wait, that probably doesn't sound right, does it? If you go and talk to your development team and ask them, hey, are we using open source, they're most likely going to say yes. But if you ask them how much they're using, they'll probably say they're using, I don't know, 10, 20, maybe 100 components, but 526 dependencies is going to lead to a lot of head scratching. But the fact is every dependency is a bit like feeding a stray cat. You feed one, you bring one in, and they're all going to follow, right? These dependencies have their own dependencies. With this image here on the left, you can actually see this tiny little green dot, low PDF, in the very middle. Everything that's attached to it is a dependency of low PDF. Low PDF needs these dependencies to work correctly. I couldn't even fit them all on the slide. It would be very difficult to see the green dot if I zoomed out all the way. But if I brought in that one component, I'd be bringing in all of these components, which are transitive dependencies. So when I do that, and I do that 10, 20, 100 times, and I'm bringing in transitive dependencies every time, you can see how we get to this 526 components on average. Now this might not seem like much of an issue upfront. You're probably thinking, so what? Memory is very, very cheap now. Software as a Service is king. So if I'm writing software, I'm deploying it on my own systems. I don't have to worry about shipping it off or distributing it and having a customer complain to me about the size of the applications I'm shipping off, right? This is my problem. I want to bring in all these dependencies and deal with it then when I can. So who really cares about bloated applications? And the answer to that is that forgotten and unmanaged dependencies are going to lead to several areas of risk.
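The stray-cat effect can be sketched in a few lines of Python. The graph below is entirely hypothetical (the package names are invented), and the breadth-first walk is just one simple way to collect everything a single application ends up pulling in.

```python
from collections import deque

# Hypothetical dependency graph: each key depends directly on the listed names.
GRAPH = {
    "app": ["pdf-tool", "web-framework"],
    "pdf-tool": ["font-parser", "compression"],
    "web-framework": ["router", "compression"],
    "font-parser": ["byte-utils"],
    "compression": ["byte-utils"],
    "router": [],
    "byte-utils": [],
}


def all_dependencies(graph, root):
    """Return every direct and transitive dependency reachable from root."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # shared dependencies are only counted once
        seen.add(name)
        queue.extend(graph.get(name, []))
    return seen


direct = GRAPH["app"]
total = all_dependencies(GRAPH, "app")
print(len(direct), "direct dependencies,", len(total), "in total")
```

Even in this toy graph, two intentional choices turn into six components. Repeating that across tens of direct dependencies is how an application arrives at a count in the hundreds.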
There's going to be operational risk, as we call it, or health risk, and then there's going to be security risk. But first, operational risk. These are our findings in the latest report. 91% of the code bases that we assessed for risk contained components that were 10 versions or more behind the most current version of the component. So that means components are not getting updated. With all these components, 526 components per application on average, it's very difficult to stay on top of every single upgrade. Now these upgrades or these patches are issued for a reason. A lot of times, they're actually fixing security vulnerabilities. And other times, they're just giving you features and function enhancements. So you'd want to stay on top of these. But we're finding now that companies aren't doing a very good job of doing this. They're also not doing a good job of cleaning out dependencies that no longer have support. Almost half the code bases that we analyzed had components that had no new development in the last 2 years. So there's no community support for these components that they are depending on. So it's tech debt in the best-case scenario. In the worst-case scenario, there's nobody to find and fix, to find and patch, or even to find and disclose vulnerabilities in a responsible manner. Now you might have a threat actor going and trying to do that, but they're not known to do responsible disclosures. They're known to usually exploit that without telling anybody. So that's just the operational risk or the component health risk. But this also leads to security risk. I like to call operational risk a gateway to security risk. And you can see here that almost 3/4 of the code bases that we analyzed had some sort of high-risk vulnerability. 84% had some sort of vulnerability, but I'm not too focused on that number. And the reason for that is because vulnerabilities are always going to happen. Reaching a state of 0 vulnerabilities is almost impossible.
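The two health signals mentioned here, how many versions behind a component is and whether it has seen any development in the last 2 years, can be approximated with very little code. This is a minimal Python sketch under stated assumptions: the release list, the dates, and both function names are hypothetical, and "2 years" is approximated as 730 days.

```python
from datetime import date, timedelta


def versions_behind(releases, in_use):
    """Count releases newer than the one in use (releases ordered oldest to newest)."""
    return len(releases) - 1 - releases.index(in_use)


def is_stale(last_commit, today, max_age_days=730):
    """True when the project has had no new development within max_age_days."""
    return today - last_commit > timedelta(days=max_age_days)


# Hypothetical release history: 1.0 through 1.12, with 1.1 still in use.
releases = [f"1.{minor}" for minor in range(13)]
behind = versions_behind(releases, "1.1")
print("versions behind:", behind, "| flag:", behind >= 10)
print("stale:", is_stale(date(2021, 1, 15), date(2024, 3, 28)))
```

The hard part in practice is not the arithmetic but having an accurate inventory and release history for every one of the hundreds of components in the first place.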
It would be a very lofty goal or unrealistic expectation. It's just part of the business that we're in. There's always going to be bugs and a lot of bugs, lead to vulnerabilities, but it's that 74% of high-risk vulnerabilities that we're concerned with. And the fact is, most of those vulnerabilities are going to have some sort of fix. So why are they still lingering around. A figure that we did find that's not pictured on the slide, is that the average age of the vulnerabilities that we identified is 2.8 years old. So again, if there's a fix, if these vulnerabilities have been discovered and disclosed, why are they lingering around for so long? We even found 9 code bases that still contain vulnerable versions of Apache Log4j, which was disclosed years ago. It's a very severe vulnerability and it's very, very easy to exploit. I think we also even saw the SEC coming down some requirements about actually patching it within your applications, but we're not seeing -- or we're still seeing organizations failing to do that. Now this all comes down to the inability to manage the vast number of dependencies that an application is relying on. I don't think there's an organization out there that knows they have these high-risk vulnerabilities that knows they have a vulnerable version of Apache Log4j and just willfully doesn't fix it, I think it's lack of visibility. They don't know that these vulnerabilities and these components are impacting their applications. So they just don't know enough to go fix it, right? You can't fix it if you don't know that it exists. To drive -- dive down in this point a little bit more, we actually did a vulnerability zoom in. So we do a top 10 list of the most commonly found vulnerabilities in the code bases that we analyze. And one of them here, CVE-2020-11022 was found in 1/3 of the code bases that we analyzed. So this is a very, very common vulnerability that we found. 
It's a high severity vulnerability, and it's got a fix available. You can see the fix was made available before this was even disclosed publicly, which is very, very common. But again, there's 1/3 of the applications that we analyzed that still don't have this fixed. So you sort of have to think about that. Now you could argue that teams maybe can't upgrade this version because it would break some downstream part of the application. But the impacted versions of jQuery are 1.2 to 3.5, right? Those are the versions impacted by this particular vulnerability. But we're on version 3.7 now. So the argument that teams simply can't upgrade without breaking their applications, I think that's probably an unrealistic argument. I know I'm arguing with myself here. But I want to try to anticipate what people might be thinking, to be fair. The more likely explanation for this is the visibility issues that I'm mentioning. Now you can also note the CWE here, which is the Common Weakness Enumeration. It's essentially just a way to classify the type of vulnerability. For this one, it's CWE-80. You see that's improper neutralization. It's basically cross-site scripting, right? Cross-site scripting can be the exploit that leads -- or cross-site scripting can be the method that leads to the exploit of this particular vulnerability. Now every CWE maps up to a pillar CWE, so more like a parent CWE, if you will. And CWE-80 maps to CWE-707, which covers a more broad set of the same types of issues like cross-site scripting or improper data sanitization, if you will. Now these types of issues, they are very easy to make, very, very easy to exploit, but they can be really, really hard to spot. But in turn, they should be very easy to fix. So why do we see it here so commonly? And why do we see it in 8 of the top 10 vulnerabilities that we find across these code bases?
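Once a team knows which jQuery version an application actually ships, checking it against the affected range is mechanical. Here is a minimal Python sketch, assuming the range runs from 1.2 inclusive up to but not including 3.5.0, which is how the public advisory for CVE-2020-11022 is worded. The function names are hypothetical, and pre-release tags such as "-rc1" are not handled.

```python
def parse_version(text):
    """Turn '3.4.1' into (3, 4, 1), padding short versions like '3.5' to three parts."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def affected_by_cve_2020_11022(jquery_version):
    """Assumed affected range: 1.2.0 <= version < 3.5.0."""
    return (1, 2, 0) <= parse_version(jquery_version) < (3, 5, 0)


for version in ("1.12.4", "3.4.1", "3.5.0", "3.7.1"):
    status = "affected" if affected_by_cve_2020_11022(version) else "not affected"
    print(version, status)
```

The check itself is trivial, which supports the point being made: the obstacle is usually not knowing that an old copy of the library is present at all.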
You can see here, these are the top 10 vulnerabilities, and for 8 of them, CWE-707 is the pillar CWE. And it all comes down to visibility. And this is a common issue that occurs very often in the very common JavaScript library, jQuery. Now jQuery is not something that developers are going to interact with every single day. It's something that runs a little bit in the background, and libraries can be loaded in blindly. So I may need one small function or one small method in jQuery, but in order to bring that in, I'm going to be bringing in an entire library and all of its dependencies. So you can see applications starting to get bloated here with jQuery. And all of these jQuery libraries come in blindly and they're not being used, they're not being tracked. Or it's just simply hard to track all of them that are in my applications. So we just see more and more unpatched vulnerabilities associated with it pile up. This doesn't make the issue any less severe, though. Just because jQuery runs in the background, this actually leads to a lot of opportunity for threat actors to exploit applications. And jQuery, which is, of course, a JavaScript library, means that we're most likely talking about web applications here. So the most common vulnerabilities we're finding are in the most common and most popular open source libraries that exist on all of the open source forges and repos. And that's for the most popular language, and that most popular language is used to build web applications. So you see all these vulnerabilities, all these easy to exploit vulnerabilities, floating around in web applications. And when you think web applications, you think network connected, which means an attacker or would-be attacker doesn't have to have very special access or privileges in order to gain access or exploit these types of vulnerabilities, right? They're connected to the web, and that's what they can use to do the exploit.
So my -- if I put on my tinfoil hat here for a minute, my question is, are web applications a major concern, right? Do we see some sort of disaster looming? Are we going to see a lot more Apache Log4js in the near future? Are we going to see a lot of these types of vulnerabilities getting exploited or ramping up in the future? And you can bet that's something that we're going to keep our eye on in the future iterations of our OSSRA report. So make sure you tune in next year and the year after that and the year after that to see if my tinfoil hat conspiracy theory actually holds true. Now, it's not as interesting and doesn't make as many headlines as security issues, but open source IP obligations are still very, very much overlooked. Now if you forgot about this even being a thing while I talked about security issues, then that just goes to prove the point. So as an overview, open source might be free in the monetary sense, but it's not free in the intellectual property sense. This is still the intellectual property of somebody somewhere. And in order to use it, you have to be given explicit permission. You have to be given a license. So we're still finding conflicts with this fact, right? 31% of the total code bases that we analyzed were using open source with no license or a custom license. And you might think, well, there's no license, what's the issue? And I'll pose an analogy. If you're driving your car and you get pulled over by the police and they ask you for your license, you certainly wouldn't look up at them and say, "Well, nobody told me that I couldn't drive." Right? You have to be given explicit permission via license to be told that you can operate a motor vehicle. The same thing applies to open source software. You have to be given a license, and it tells you that you can use the software, and it tells you how you can use the software. Can you use it commercially? Are you allowed to alter the software?
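The "no license or custom license" finding amounts to a simple triage over a component inventory. The Python sketch below is only an illustration: the inventory, the approved list, and the function name `license_findings` are hypothetical, and the license strings are SPDX-style identifiers used as examples rather than a recommended policy.

```python
# Hypothetical policy: licenses the organization has already approved for use.
APPROVED = {"MIT", "Apache-2.0", "BSD-3-Clause"}

# Hypothetical inventory: component name mapped to its declared license (or None).
INVENTORY = {
    "left-helper": "MIT",
    "net-client": "Apache-2.0",
    "copied-snippet": None,            # pasted code with no license context
    "vendor-widget": "Custom-Vendor",  # custom terms that need a human to read
    "fast-parser": "GPL-3.0-only",
}


def license_findings(inventory, approved):
    """Sort components into ok / no-license / needs-review buckets."""
    findings = {"ok": [], "no_license": [], "needs_review": []}
    for name, license_id in sorted(inventory.items()):
        if license_id is None:
            findings["no_license"].append(name)
        elif license_id in approved:
            findings["ok"].append(name)
        else:
            findings["needs_review"].append(name)
    return findings


for bucket, names in license_findings(INVENTORY, APPROVED).items():
    print(bucket, names)
```

A component with no license lands in its own bucket on purpose, following the driving analogy: the absence of a license is not permission.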
There are thousands and thousands and thousands of different types of licenses with different types of obligations. And they can be a little bit difficult to track and understand and comply with, which is why we're finding these conflicts, which is why we're finding over half of the code bases we analyzed had some sort of -- were in some sort of conflict with the licenses associated with the open source that they're using. Now adding to this complexity are additional methods for open source to be included in applications without some sort of license context. So AI, as I mentioned before, is adding to this complication. So if I'm a developer and I need some sort of algorithm, then I might use an AI coding assistant to write this software for me. Now at Synopsys, we recognized that this could create an issue. So we played around a little bit. We did just that. We had GitHub Copilot create us a few blocks of code. And we used our snippet analysis to scan that code, and it matched it back to an open source component that it originated from, and that open source component actually had a restrictive GPL license. So had we actually used that software and shipped with it, and had it been revealed that we were using that software, then we would have been liable for violating intellectual property laws and violating license obligations associated with that open source component. But again, you bring this in, there's no license context. It's very difficult to even know that it was brought in in the first place. So I think that's why we're seeing so many issues here today. The good news is, and my last bit on license, the good news is we're actually seeing a slight decrease in license conflicts in commercial applications. And I think the reason for that is quite simple. As every company becomes a software company, as we push hard through the digital transformation, we see a lot of applications being not distributed, but being hosted or offered in a Software as a Service deployment model.
And a lot of restrictive licenses have language that only applies to distributed software. And technically, if you're going with a SaaS model, you're not distributing the software. So these licenses, their terms don't take that into account. So these teams can use components that they wouldn't have been able to use before, should they be actually shipping software, and they won't be in conflict with those licenses. So that's my assumption of why we're seeing lower numbers there. So enough of the doom and gloom, because open source software is a central building block of modern applications, of all commercial applications today. As you see, most applications use it. So I would never sit here and do the doom and gloom and tell you that you need to avoid using open source software because that's just simply not the case. I'd actually highly recommend you use open source software. I think it helps get products to market faster. And I think fighting it is just fighting reality. And reality will come back to bite you eventually. So instead, you want to learn to live with it in harmony. So how can you use open source software without compromising on security or license compliance? How can you use it without compromising the quality of the code in your code bases? How can you avoid being a number in our next OSSRA report or avoid being a statistic? Now this becomes more of a multi-phase process here. The first thing that we recommend is establishing visibility with software supply chain analysis. Now these colorful bars here should look familiar because these are associated with those many avenues of open source introduction into applications that I went over in the beginning of this presentation. And not all of these are going to be discovered the same way. You need different types of analysis in order to identify these dependencies as they flow into an application. So there's different ways to do this, but this is how we do it here at Synopsys, right?
We have a dependency analysis that just interrogates the artifacts of a package manager and adds those to an open source bill of materials. We have a signature or codeprint analysis, which is really good at scanning C and C++ applications and file directories to identify open source being used in those without the use of a package manager. We have binary analysis that can actually look at artifacts and built artifacts, post-build, without any access to source code, without any access to build systems, and identify open source components contained in those. And then that snippet analysis that we mentioned, where I used the example, that can actually identify just a few lines of code that originated from open source projects and match them back to that project and establish license guidelines. Now these scan types, I wouldn't recommend using all of these at the same time because some of them are going to be more heavyweight. They're going to be more resource intensive than the other ones. Dependency analysis, it's pretty easy. It's pretty lightweight. It's just reading some files. Snippet analysis, as you would imagine, is a lot more heavyweight, where we're reading line by line, character by character, actual source code and comparing it back to our database of open source software. So it's important to actually build dependency analysis and open source discovery directly into the software development life cycle, and this is how you actually meet at the intersection of accuracy and efficiency. So perhaps you do a dependency analysis during the coding phase with an IDE plug-in so that you can catch these issues as far left in the SDLC as possible, catch as many issues as far left as possible, so that you can avoid rework and showstoppers later down the line. And then during a daily build or a weekly build or a nightly build, you could start doing the more heavyweight types of scans.
Not to mention, a lot of these scans have to be done post build anyway, like a binary analysis, so you can analyze artifacts. And they also plug in. They fit very well with tools that are farther right in the development process. Like let's say, a binary repository like JFrog Artifactory or Sonatype Nexus, right? That's where they work well. But at the end of the day, once you tune your analysis, then you can have this full dependency inventory, you have this full visibility of your application dependencies without sacrificing the velocity at which your development team can build software and you can get it out the door, right, get it out the door and do it quicker than your competitors. Because at the end of the day, your job is to build software. Security is sort of a byproduct. So you don't want it to get into the way. Now once you have this inventory of open source dependencies, you can't forget that you have to find some way to connect it back to risk. So you can use different tools to identify dependencies, maybe if you have a pretty tight grasp on which open source dependencies are allowed to be used. You don't even need tooling to do that. Maybe you want to do it manually, but you still need a way to associate all those dependencies back to potential areas of risk. You need to be able to associate those dependencies with known vulnerabilities as you're adding it to your application, but also continuously monitor that particular component for newly disclosed vulnerabilities, right? So everybody who pulled Apache Log4j into their application, it was free and clean when they did it. But all of a sudden, one day, it contained one of the worst most exploitable vulnerabilities that we've seen in a long time. Same with component health, you're going to want some metrics that show you who's responsible for this open source dependency, how many developers are actually actively contributing to it? Who's the project maintainer? 
Is this an honest developer who has good intentions? Or is it a group of threat actors who are trying to fool you into putting this malicious dependency into your application? And then finally, what are the licenses and copyrights associated with this application? What does it mean? What do we have to do? What are we obligated to do, what are we allowed to do so that we don't -- we're not subject to costly litigation further down the line? And finally, as -- because software supply chain security is so top of mind, it's important to perpetuate supply chain visibility. So if you are a software builder, you're probably using software either from open source or maybe even commercial vendors. So either way, it's important to bring in any software bills of materials that you might get from those vendors. It's important to bring in any software bill of materials that you might get from an open source provider, or it's important to maybe even build your own SBOMs because you don't get an SBOM from your suppliers or open source. But at the end of the day, you have to focus very much on SBOM management. You have to be able to dynamically generate an SBOM for everything that you're creating. And you have to be able to store that SBOM in a manner that you can make it actionable. You have to act on the visibility that SBOM gives you, and you have to be able to understand, when a new vulnerability is disclosed for Apache Log4j, which of your applications it's going to impact and what you have to do about it. And finally, you have to be able to ship SBOMs to your customers if you are a software vendor. Or if you're just an enterprise software creator, you have to be able to ship SBOMs just to your security team, your application security team or your threat analysis team, whatever it may be. 
But either way, the visibility that you generate with your open source risk management policies and procedures, you have to be able to perpetuate that visibility, and SBOM is very quickly becoming the standard to do just that, right? It can be something used by internal teams to identify and mitigate risk, but it can also be something that you can ship out to external teams or customers so that they understand what's going into the software that you're building, and then you can establish trust with those customers as well. So with that being said, I'm going to stop here. I'm going to hand it back to Kevin, and we'll go through some questions if there are any. So I appreciate everybody's time today. And Kevin, all yours.
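The point above about storing SBOMs so they can be acted on when a new vulnerability is disclosed can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the `SBOMS` inventory, the application names, the component versions and the `impacted_applications` helper are all hypothetical, the "fixed in" threshold is simply a value the caller passes in, and real version strings are often messier than dotted integers. It is not how Black Duck or any particular SCA tool works.

```python
from typing import Dict, List, Tuple

# Hypothetical stored SBOMs: application name -> list of (component, version).
# In practice these would be parsed from SPDX or CycloneDX documents.
SBOMS: Dict[str, List[Tuple[str, str]]] = {
    "billing-service": [("log4j-core", "2.14.1"), ("jackson-databind", "2.13.0")],
    "web-portal": [("log4j-core", "2.17.1"), ("spring-core", "5.3.20")],
    "reporting-job": [("commons-io", "2.11.0")],
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a simple dotted version such as '2.14.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def impacted_applications(
    sboms: Dict[str, List[Tuple[str, str]]], component: str, fixed_in: str
) -> List[str]:
    """Return applications that ship `component` at a version below `fixed_in`."""
    impacted = []
    for app, components in sboms.items():
        for name, version in components:
            if name == component and parse_version(version) < parse_version(fixed_in):
                impacted.append(app)
                break  # one match is enough to mark the application
    return sorted(impacted)


if __name__ == "__main__":
    # Illustrative query: which applications carry log4j-core older than 2.15.0?
    print(impacted_applications(SBOMS, "log4j-core", "2.15.0"))
```

The lookup is only as good as the inventory behind it, which is why the SBOM has to be regenerated and stored for every build rather than produced once.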
Unknown Analyst
analystThank you so much, Mike. Appreciate that. Great presentation as always. [Operator Instructions] By attending this presentation, your data will be shared with BrightTALK, and Synopsys may contact you about products and services that you may be interested in, and you can unsubscribe at any time. So let's move along with the questions. So question number one is, I think, a pretty easy one. You mentioned that this is an annual report. Have you seen -- besides some of the data that you covered, have you seen any general trends in open source use over the years? Are there more or fewer issues around kind of compliance and security? Is that stuff trending up or down?
Michael McGuire
executiveSure. So I would say there's a few different numbers that we can think about. So the first one is the number of code bases that we find that are using open source. That number seems to be pretty stagnant at this point in time. So year-over-year, we'll see an increase of a couple of percent or a decrease of a couple of percent. But with varying data sources, varying code bases, we'll call that the same. So we're not seeing that number move much anymore. Where we are seeing, I would call it, over the last 4, 5, 6 years, a steady increase is what I refer to as open source saturation. So that is the percentage of a code base that is made up purely of open source. I think we're seeing that slightly increase. Now is that because open sources are -- I'm sorry, is that because organizations are depending more and more on open source? It could be. But if you look at the numbers that we're seeing about the maintenance of open source inside of applications and the number of components included in an individual application, I think that number is increasing because teams aren't doing a very good job of cleaning up open source dependencies that they're not using. Again, I don't think it's a low priority for them. I just don't think that they have the tools in place in order to fully track all those dependencies so that they can rip out the ones that they're not using anymore. From the security standpoint, yes, we are seeing an increase in high-severity vulnerabilities. Again, I think that's directly related to the number of open source components, which is again directly related to the fact that these organizations don't know they have these dependencies, so they don't know that they need to patch these high-severity vulnerabilities. When it comes down to license, I think I went over that a little bit. We're starting -- we're seeing a decrease in conflicts a bit over the last few years. 
And again, I just think it's because of the way that software is being -- it's being deployed or distributed or shipped, if you will.
Unknown Analyst
analystAll right. So just probably tied into that last question. If there's no push system to get notified, what's the best way to learn about open source vulnerability?
Michael McGuire
executiveProbably a software composition analysis tool is going to be the best answer there because, if you work backwards, you can't know about an open source vulnerability if you don't know that you're going to be -- that you're being impacted by it. And the only way to know if you're impacted by an open source vulnerability is to know which of your applications are depending on a particular open source project. So SCA is going to give you -- answer all those questions. It's going to tell you whether or not you're using this dependency or this project and, if you are, which of your applications are impacted by this dependency, and then it will alert you in whichever channel you need of this vulnerability once it's disclosed. So it could send you an e-mail, it could file a ticket in Azure DevOps or in Jira, it could send you a Slack message. And it's going to do this usually within 4 hours of a vulnerability being disclosed. So tooling like that, automation like that is going to be your best answer here.
Unknown Analyst
analystAnd is it a similar answer for, say, trying to identify open source projects you're using within your software that haven't had any development in the last x number of years?
Michael McGuire
executiveYes, it's similar. I would say the best thing to do with that, the best approach to that is policy. So you can set policy with a software composition analysis tool on -- well, depending on the SCA tool, you can set policy based on pretty much any metric that can be identified with the metadata associated with an open source dependency. So if an open source dependency hasn't had any development activity in the past 12 months, or you're trying to bring in a dependency that is 5 or more versions behind the most recent version, you can just say no. You can hit the red light, decline to bring that in or break a build or whatever it may be. Or if you don't want to have a hard stop, you can flag it, right? You can flag it so that it's backlogged for the next maintenance sprint, for example, and tell the development teams that at some point in time, and you can even set a time threshold, they have to update that or at least give some sort of justification as to why they cannot update it.
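The two example rules in this answer (no development activity in the past 12 months, or 5 or more versions behind the latest release) and the choice between a hard stop and a flag can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the policy engine of any SCA product: `ComponentMetadata`, `evaluate_policy`, the field names and the default thresholds are all hypothetical, and 12 months is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass
class ComponentMetadata:
    """Hypothetical metadata an SCA tool might attach to a dependency."""
    name: str
    last_commit: date      # date of the most recent development activity
    versions_behind: int   # releases between the pinned and the latest version


def evaluate_policy(
    component: ComponentMetadata,
    today: date,
    hard_stop: bool = True,
    max_inactive_days: int = 365,   # assumption: 12 months approximated as 365 days
    max_versions_behind: int = 5,
) -> Tuple[str, List[str]]:
    """Return ('allow' | 'block' | 'flag', reasons) for one dependency."""
    reasons = []
    if (today - component.last_commit).days > max_inactive_days:
        reasons.append("no development activity in the past 12 months")
    if component.versions_behind >= max_versions_behind:
        reasons.append(f"{component.versions_behind} versions behind the latest release")
    if not reasons:
        return "allow", []
    # A hard stop breaks the build; otherwise the finding is flagged for the backlog.
    return ("block" if hard_stop else "flag"), reasons


if __name__ == "__main__":
    stale = ComponentMetadata("old-lib", date(2022, 1, 1), versions_behind=6)
    print(evaluate_policy(stale, today=date(2024, 3, 28), hard_stop=False))
```

In a build pipeline, a "block" result would typically map to a non-zero exit code, while a "flag" result would open a ticket with a due date for the next maintenance sprint.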
Unknown Analyst
analystAll right. So AI has been a hot topic as of late. How is the use of AI coding tools impacting the risks of using open source?
Michael McGuire
executiveSo there's 2 risks. There's the security risk and the license compliance risk. Now to start, I would say, treat AI-generated code like code written by a junior engineer. Before we get the pitchforks out, just remember, as I said, I was an engineer at one point in time, I cut my teeth as an engineer. So I was a junior engineer. So I can completely understand why that's the best approach. And that's because I did go out and copy and paste code into our applications. And I did not paste in the license headers. I didn't even look for them. I didn't even really understand that open source software had IP obligations to begin with. So that's the license compliance risk, right? Just because you're using only part of a component doesn't mean that you don't have to comply with the IP obligations associated with the project that it came from. But there's also the security aspect. And that's really the security aspect of any source code. So a junior engineer might not really understand how data flows through an application as well as, say, a seasoned or senior engineer or an architect. And AI-generated code is probably going to do the same thing. It doesn't understand the entire context of your entire application. It doesn't understand if you have data sanitization in place. It's just going to bring in what you ask it to bring in, and it could be subject to security weaknesses like that. So you definitely want to do some sort of code review, whether it be a manual review or an automated static analysis, a static application security test, on that software or that source code as well.
Unknown Analyst
analystAll right. Do you know how Synopsys leverages the OpenSSF security scorecard?
Michael McGuire
executiveYes. So that's a good question. So that's actually all going to come down to the health, the reliability and the security, maybe even reputation of an open source component. The OpenSSF scorecard is a fantastic reference. I was looking into it not too long ago as well. Now our SCA tool would be the one. But we don't leverage the OpenSSF scorecard, and that's because we have our team of researchers, they're actually based out of Belfast. They're our cybersecurity research center. And for every open source component that we add to our knowledge base, which is massive, we do some sort of automated review or, a lot of times, we do a human review from this team on those components. And with that, we actually attach the metadata to that component and give it to our users if it impacts their BOM. So they are getting a lot of the metrics that are offered by the OpenSSF scorecard, but we're sort of putting it in our own proprietary fashion, if you will. So you do get a risk score, we call it operational risk. You do get a risk score based on the same metrics that the OpenSSF scorecard would use, but it's one that we put together. And then what you can do is actually assign policy around that risk score. So if you opened up, for example, Black Duck SCA, you'd see 3 areas of risk: security risk, license risk and operational risk. If that operational risk is high, then maybe you have some sort of issue in there, like the component hasn't been updated recently, it's got a history of self-sabotage by the project maintainer and so on. So I guess that's a long answer. The shorter answer is we're not using the OpenSSF scorecard, but we are following the same exact concepts and doing our own research behind fulfilling those metrics.
Unknown Analyst
analystAll right. Let's do one more question, and then we'll wrap it up. There's been a lot of talk about kind of software supply chain security and SBOM. What's the best way to generate an SBOM? And is there a common type of format?
Michael McGuire
executiveYes. So the best way you're going to generate an SBOM is by building some sort of SBOM generation tool into your software development life cycle. And the reason for this is because -- it's sort of twofold. And that is, firstly, some automated process is going to be able to identify dependencies that need to be added to an SBOM more accurately and efficiently than any human can. And it also enables you to do this SBOM generation at a spot in the SDLC that makes it most accurate. So if you do it too far left, then you risk missing dependencies that actually make it into an application or including dependencies that never do make it into an application. If you do it way too far right in the SDLC, then you might be missing dependencies that are listed in a package manager that you were not able to resolve with some sort of post-build binary analysis. So we always suggest doing it at the build phase because that's when you have the best visibility and that's when you can do it the quickest as part of the build. Now the second part of this, I said it's twofold. The second benefit of this is that when you do it here, you get to do it on a build-by-build basis or a release-by-release basis. So now you have a dynamic SBOM that accurately reflects every application that you're building and shipping out without you having to do much extra legwork because, again, it's just built into the process. And then you can automatically send it to some database or some FTP, where your stakeholders can actually go and grab it securely and know that it wasn't tampered with. So automating it is the best part. Is there a standard format for it? I think that was the second part of the question, or a standard type. Yes, there's a few. The most common are going to be SPDX and CycloneDX. There's one more, SWID, or S-W-I-D. I haven't run into that much, but it's out there, but you're going to see most commonly the former two that I mentioned.
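As a rough picture of what generating an SBOM as part of the build can look like, the Python sketch below turns a resolved dependency list into a CycloneDX-style JSON document. It is a simplified assumption-laden illustration, not a conformant generator: `build_sbom` and the sample dependency list are hypothetical, only a handful of top-level CycloneDX fields are filled in, the spec version shown is one possible choice, the package URLs assume PyPI packages without any name normalization, and the output is not validated against the official schema.

```python
import json
from typing import Dict


def build_sbom(resolved: Dict[str, str]) -> dict:
    """Build a minimal CycloneDX-style document from {package: version}.

    `resolved` stands in for the dependency set the build actually resolved.
    """
    components = [
        {
            "type": "library",
            "name": name,
            "version": version,
            # Package URL; assumes PyPI packages for this illustration.
            "purl": f"pkg:pypi/{name}@{version}",
        }
        for name, version in sorted(resolved.items())
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
    }


if __name__ == "__main__":
    # Hypothetical output of a dependency resolution step in the build.
    sbom = build_sbom({"requests": "2.31.0", "urllib3": "2.0.7"})
    print(json.dumps(sbom, indent=2))
```

Running a step like this on every build or release is what keeps the SBOM in sync with what is actually shipped; in practice, an established generator for SPDX or CycloneDX would replace the hand-rolled function.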
Unknown Analyst
analystAll right. Well, thank you, Mike, for joining us today, and thanks to all of our online attendees for joining us as well. As a reminder, we did record this webinar and you will receive an e-mail shortly with a link to the recorded version. You'll also have access to all of the additional resources that are on the attachments tab. So thanks again for joining us, and we look forward to seeing you on future webinars. Take care, everybody.
Michael McGuire
executiveThank you.