An Introduction to Software Composition Analysis

Educational Foundations | Reading time: 5 minutes

Open Source Components and Risk

These days, software is rarely built from scratch. Development teams rely on third-party and open source components to ship code and innovate faster without reinventing the wheel. This reliance on open source software has drastically reduced the share of first-party code, or code written by your own development team, in an application. According to the Sonatype State of the Software Supply Chain report, it’s common for 90% of an application’s code to be open source software.

These components are called dependencies, and each dependency introduces potential risk in the form of security vulnerabilities, license problems, or quality issues. Many dependencies rely in turn on other open source components, known as transitive dependencies, which makes it harder to identify the original source of a risk. By using these third-party components in their applications, organizations assume responsibility for code that their teams did not write. The risk that open source software poses to a project can be managed through Software Composition Analysis, or SCA.
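To make the idea of transitive dependencies concrete, here is a minimal sketch that walks a small dependency graph and reports how an application ends up pulling in a component it never declared directly. The graph, the package names, and the `dependency_path` helper are all hypothetical and exist only for illustration; real package managers resolve these graphs from manifest and lock files.

```python
from collections import deque

# Hypothetical dependency graph: each key depends directly on the listed packages.
GRAPH = {
    "my-app": ["web-framework", "logging-lib"],
    "web-framework": ["template-engine", "http-parser"],
    "logging-lib": [],
    "template-engine": ["string-utils"],
    "http-parser": [],
    "string-utils": [],
}


def dependency_path(graph, root, target):
    """Return the shortest dependency chain from root to target, or None."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


def transitive_dependencies(graph, root):
    """Return every package reachable from root that is not a direct dependency."""
    direct = set(graph.get(root, []))
    reachable = set()
    stack = list(direct)
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(graph.get(node, []))
    return reachable - direct


if __name__ == "__main__":
    # Suppose a risk is reported in "string-utils": which chain introduced it?
    print(" -> ".join(dependency_path(GRAPH, "my-app", "string-utils")))
    print(sorted(transitive_dependencies(GRAPH, "my-app")))
```

In this toy graph the application declares only two dependencies, yet three more arrive transitively, and the chain returned by `dependency_path` shows which direct dependency is responsible for each one.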

What is Software Composition Analysis?

Software Composition Analysis (SCA) is the process of determining the specific open source software components that make up an application and the risks associated with those components.

In short, SCA is about looking at all the components in a project and determining the potential risk from those components. Software composition analysis is carried out with tools that find components and identify the risk they bring to your applications. These tools can be automated to monitor components across the entire Software Development Lifecycle (SDLC).

How Do SCA Tools Work?

Software Composition Analysis tools take an application, identify the components in that application, and then identify any problems or risks with those components. Here’s a high-level overview of how SCA tools achieve this.

  1. An application is sent to the SCA tool for analysis.
  2. The tool identifies all the dependencies and third-party components in the application and produces a Software Bill of Materials (SBOM). This is done in one of three ways.
    • Manifest Scanning - The SCA tool generates a list of dependencies using the application’s build manifest files, such as package-lock.json for JavaScript projects. This is best used when scanning applications without the final build artifacts or when scanning from a source control management (SCM) system.
    • Binary Scanning - The SCA tool examines the build artifacts and identifies the open source components using binary fingerprinting. This identifies only the packages included in the final build of your application, which reduces false positives and catches third-party software added to your application in a non-standard way. Not every software composition analysis tool is capable of binary scanning. Because it assesses the actual artifacts released to production, binary scanning generally gives a more accurate picture than manifest scanning.
    • A Combined Approach - Some tools, like Nexus Lifecycle, use a combination of binary scanning and manifest scanning to give more precise results.
  3. The SBOM is then checked against a variety of public and private databases for security vulnerabilities, license information, and other potential sources of risk. Not all SCA tools provide data of the same quality. Access to proprietary data and vulnerability research is a major source of value for most commercial SCA tools, because many vulnerabilities are not publicly disclosed or assigned a CVE identifier. Other component issues, such as quality issues or license information, might not be publicly tracked.
  4. The SCA tool returns a list of vulnerabilities, license information, and other component metadata. This data is compared against the organization’s governance policies to generate a list of policy violations prioritized by the threat score assigned to each policy. Ideally, the tool will also provide remediation information and a software bill of materials.
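The four steps above can be sketched in a few lines of Python. This is only an illustration of the manifest scanning flow under stated assumptions, not how Nexus Lifecycle or any particular SCA tool is implemented. The lock file contents, the package names, the advisory IDs, the severity values, and the policy names are all invented for the example, and a real tool would query maintained vulnerability and license databases rather than an in-memory dictionary.

```python
import json

# Hypothetical, trimmed package-lock.json using the "packages" layout of npm lock files.
LOCKFILE = """
{
  "name": "demo-app",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo-app", "version": "1.0.0"},
    "node_modules/example-http": {"version": "1.4.2", "license": "MIT"},
    "node_modules/example-http/node_modules/example-parser": {"version": "2.0.1", "license": "MIT"},
    "node_modules/@example/copyleft-lib": {"version": "3.1.0", "license": "GPL-3.0"}
  }
}
"""

# Invented advisory data standing in for a vulnerability database (step 3).
ADVISORIES = {
    ("example-parser", "2.0.1"): [{"id": "EXAMPLE-0001", "severity": 9.1}],
}

# Invented governance policy (step 4).
BANNED_LICENSES = {"GPL-3.0"}
SEVERITY_THRESHOLD = 7.0
LICENSE_THREAT = 5.0


def build_sbom(lockfile_text):
    """Step 2: list every component recorded in the lock file."""
    lock = json.loads(lockfile_text)
    components = []
    for path, meta in lock.get("packages", {}).items():
        if path == "":  # the root entry is the application itself
            continue
        # Nested installs look like "node_modules/a/node_modules/b"; keep "b".
        name = path.rpartition("node_modules/")[2]
        components.append({
            "name": name,
            "version": meta.get("version", ""),
            "license": meta.get("license"),
        })
    return sorted(components, key=lambda c: (c["name"], c["version"]))


def evaluate(sbom):
    """Steps 3 and 4: look up each component and apply the policy."""
    violations = []
    for comp in sbom:
        label = f'{comp["name"]}@{comp["version"]}'
        for adv in ADVISORIES.get((comp["name"], comp["version"]), []):
            if adv["severity"] >= SEVERITY_THRESHOLD:
                violations.append({"component": label, "policy": "security-high",
                                   "threat": adv["severity"], "detail": adv["id"]})
        if comp["license"] in BANNED_LICENSES:
            violations.append({"component": label, "policy": "license-banned",
                               "threat": LICENSE_THREAT, "detail": comp["license"]})
    # Highest threat first, so the most urgent violation tops the report.
    return sorted(violations, key=lambda v: v["threat"], reverse=True)


if __name__ == "__main__":
    for violation in evaluate(build_sbom(LOCKFILE)):
        print(violation)
```

Note that the nested `example-parser` entry is a transitive dependency: it appears in the lock file even though the application never declares it, which is why scanning the lock file rather than only the top-level manifest matters.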

Benefits of SCA Tools

Identifying every component and every risk from components is a daunting task. The State of the Software Supply Chain Report says that “development teams use an average of 135 software components.” While it’s theoretically possible for humans to monitor components for new vulnerabilities, license changes, and other potential issues, SCA is a task much better handled by machines. SCA tools are fast, accurate, and provide benefits beyond risk identification.

Automated SCA tools allow teams to ship higher quality code faster and take a proactive approach to risk management. By identifying risks across the software development lifecycle, software composition analysis tools can help your organization “shift left” or move security considerations earlier in the development process. Developers can use the information from a software composition analysis tool to select more secure components early in the development process, resulting in more secure code. This speeds up development time by preventing rework during security reviews. If a development team needs to use a component that has known risk elements, these flaws are known when the component is first introduced. This lets organizations understand their application’s risk and ensure they’re using the component in a safe way.

Next Steps

Check out the links below to learn more about SCA tools, including Sonatype’s Nexus Lifecycle.

Talk to Us

Have more questions or comments? Learn more at [help.sonatype.com](https://help.sonatype.com/), join us in the Sonatype Community, and view our course catalog at learn.sonatype.com.

And visit my.sonatype.com for all things Sonatype.