Understanding Vulnerability Data

IQ Server | Reading time: 7 minutes

What is Sonatype Vulnerability Data?

Sonatype creates its data using a proprietary, automated vulnerability detection system that monitors, aggregates, and correlates publicly available information and applies machine learning to it. We gather data from various sources including the National Vulnerability Database, website security advisories, email lists, GitHub events from all open source projects, blogs, OWASP, OSS Index, Twitter, and customer reports. We have evaluated many paid-for services and found the quality and precision of their data to be of limited value, which drove our decision to build an intelligent, automated vulnerability detection system. The Sonatype Data Research team is not in the business of simply aggregating public security-related feeds; we create the precise data we use.

Sonatype, like other software composition analysis (SCA) vendors, pulls data from a variety of sources, including:

  • National Vulnerability Database
  • Various public vulnerability feeds
  • Proprietary vulnerability feeds (for example, identifying vulnerabilities in open source code stored in code management platforms such as GitHub)

Unfortunately, not all security data is created equal, and some of the data from the above sources, specifically the NVD and public feeds, is incomplete. Often the incomplete data is missing vulnerabilities, and automation alone is not sufficient to identify the missing information. As a result, this data must be highly curated by Sonatype’s research teams to fill in the gaps and improve accuracy.

How Does Sonatype Provide High Quality Data?

There are two considerations for data quality: (1) the content of the security advisory and (2) the precision of associating that content with the correct artifact. Automated decisions require extremely precise artifact identification and a corresponding association of security information. Without accurate identification and association, the rate of false positives is high. We recently conducted a study of 6000 of the most popular Java components and found that the name-based security association algorithms used by every tool other than Sonatype resulted in:

  • 4500 correct non-issue identifications
  • 1034 true positives
  • 5330 false positives when the advisory-identified CPE was part of the component name
  • 2969 false negatives when the advisory-identified CPE was not in the component name

False positives incur unnecessary research and upgrade costs. False negatives leave you at risk because nothing indicates that you may be affected. Sonatype uses a combination of automated identification and human research that eliminates false positives and false negatives. This saves the research time otherwise spent disproving false positives and the rework time spent on upgrades that were not required.
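To put the study's counts in perspective, the short sketch below derives precision and recall for the name-based approach. Treating the four reported counts as a single confusion matrix is an assumption made here for illustration; the study itself does not report these ratios.

```python
# Counts quoted from the study of name-based association described above.
true_positives = 1034
false_positives = 5330
false_negatives = 2969

# Precision: share of reported issues that were real.
precision = true_positives / (true_positives + false_positives)

# Recall: share of real issues that were actually reported.
recall = true_positives / (true_positives + false_negatives)

print(f"precision: {precision:.3f}")
print(f"recall: {recall:.3f}")
```

Under that assumption, roughly one in six reported issues is real, and roughly three in four real issues go unreported.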

What Data Does Sonatype Provide?

  • The source of the advisory: Sonatype Security Research or the National Vulnerability Database
  • The severity of the issue: the CVSS score, the scoring system version, and the source of the score
  • The Common Weakness Enumeration (CWE)
  • The exact description from the advisory
  • A detailed explanation of the advisory risk and the attack vector (because the advisory description is often very poor)
  • How to determine if you are vulnerable
  • A recommendation on how to fix or work-around the issue
  • The root cause of the issue; the exact class and vulnerable version range that was found in your code
  • Publicly known attack vectors or exploits; additional resources that describe the exact issue
  • Hygiene data focusing on the project’s release frequency, popularity, vulnerability remediation times, developer staffing, and other performance attributes.
  • Release Integrity, an early warning system that uses artificial intelligence and machine learning to automatically identify and block next-gen software supply chain attacks that rely on typosquatting and malicious code injection.
  • Breaking Changes, enabling developers to instantly see which component version upgrades will require the least effort with the fewest breaking changes.
  • Transitive Solver, which provides comprehensive remediation advice for both direct and transitive dependencies, all without violating policies or failing builds.

How is a Vulnerability Score / Severity Calculated?

Sonatype uses the Common Vulnerability Scoring System (CVSS) to score vulnerabilities.

If a vulnerability identifier is prefixed with SONATYPE, then the vulnerability severity is its CVSS version 3 score.

If a vulnerability identifier is prefixed with CVE, then the vulnerability severity is its CVSS version 3 score. If a version 3 score is not available, the version 2 score is used instead.
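The selection rule above can be sketched as a small function. This is an illustration of the rule as described, not Sonatype's implementation: the function name `select_severity`, the case-insensitive prefix check, and the sample identifiers and scores are all hypothetical.

```python
from typing import Optional, Tuple


def select_severity(
    vuln_id: str,
    cvss_v3: Optional[float],
    cvss_v2: Optional[float] = None,
) -> Tuple[Optional[float], str]:
    """Return (score, CVSS version) following the prefix rule described above."""
    # Assumption: the prefix is compared case-insensitively.
    prefix = vuln_id.upper()
    if prefix.startswith("SONATYPE"):
        # SONATYPE-prefixed identifiers use the CVSS version 3 score.
        return cvss_v3, "3"
    if prefix.startswith("CVE"):
        # CVE-prefixed identifiers prefer version 3 and fall back to version 2.
        if cvss_v3 is not None:
            return cvss_v3, "3"
        return cvss_v2, "2"
    raise ValueError(f"unrecognized identifier prefix: {vuln_id!r}")


# Hypothetical identifiers and scores, for illustration only.
print(select_severity("SONATYPE-0000-0001", 7.5))
print(select_severity("CVE-0000-0001", None, 5.0))
```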

Where are the Source Components?

Component binaries come from popular repositories like Central, NuGet.org, npmjs.org, Fedora EPEL, and PyPI. We will also ingest components directly from GitHub, and other project download sites, when nominated by customers.

Binary repositories provide the ability to extract information like declared licenses, popularity, and release history. Additional component metadata comes from a variety of sources including direct research.

When is Vulnerability Data Available?

Sonatype Data Services are continuously updated, allowing the most recent data to be visible the instant a Nexus Lifecycle analysis occurs. This is true for both newly published components and newly discovered security issues. We have two processing queues for security vulnerabilities to ensure immediate availability of security data to our customers:

  • Fast-Track: Our automated vulnerability detection systems process various data sources each day. Upon discovery of an issue, a researcher ensures that an appropriate component was identified, that a one-line summary exists, and that the vulnerable version range matches any available advisories. The Fast-Track process generally makes newly discovered vulnerabilities available in 1-3 days, depending on the severity of the issue.

  • Deep Dive: After the Fast-Track process is complete, issues are selected to undergo the Deep Dive process based on the popularity of the component in question and the age and severity of the vulnerability. During the Deep Dive process, issues undergo source code analysis to confirm that the vulnerable version range is accurate, and detailed explanations, detections, and recommendations are added. The Deep Dive process may change the implicated components, CVSS score, and versions as we validate and correct the data provided by the initial Fast-Track process. Deep Dive generally takes 5+ days, but will take less time for issues deemed critical.

How do I Access Vulnerability Information?

From the Updated Policy Report

Sonatype-enriched vulnerability data is available from the IQ Server Application Composition Report.

Component Details Page

In the Application Composition Report, select the violation that you are investigating to open the Component Details Page. Within the Component Details Page, select the Policy Violations tab to view the list of violations for this component:

Component Details Page example

Click on a security policy violation to open the Violation Details tile, then scroll down to view the vulnerability information:

Violation Details tile example

Vulnerabilities List

In addition to the Component Details Page, you can also view vulnerability information by accessing the Vulnerability List. From the Application Composition report, go to the Options menu and select View Vulnerabilities.

example list of vulnerabilities

This list gives you an overview of all security vulnerabilities that triggered policy violations. The table shows all detected security vulnerabilities on each component. Vulnerabilities are sorted in decreasing order of severity, with waived or grandfathered violations appearing at the bottom. Note that the policy threat level for waived and grandfathered violations is shown as zero and that security vulnerabilities that haven’t triggered a policy violation are not displayed in this view.
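The ordering and filtering rules described above can be sketched as follows. The `VulnerabilityRow` type, its field names, and the sample rows are hypothetical, and sorting on the CVSS severity value (rather than the policy threat level) is an assumption; this is not how IQ Server is implemented.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class VulnerabilityRow:
    vuln_id: str
    severity: float                 # CVSS score
    threat_level: int               # policy threat level
    triggered_violation: bool = True
    waived_or_grandfathered: bool = False


def vulnerability_list_view(rows: List[VulnerabilityRow]) -> List[VulnerabilityRow]:
    # Vulnerabilities that did not trigger a policy violation are not displayed.
    visible = [r for r in rows if r.triggered_violation]
    # Waived or grandfathered violations are shown with a threat level of zero.
    shown = [
        replace(r, threat_level=0) if r.waived_or_grandfathered else r
        for r in visible
    ]
    # Active violations first, by decreasing severity; waived ones at the bottom.
    return sorted(shown, key=lambda r: (r.waived_or_grandfathered, -r.severity))


# Hypothetical rows, for illustration only.
rows = [
    VulnerabilityRow("VULN-A", 5.3, 7),
    VulnerabilityRow("VULN-B", 9.8, 10, waived_or_grandfathered=True),
    VulnerabilityRow("VULN-C", 7.5, 9),
    VulnerabilityRow("VULN-D", 8.1, 9, triggered_violation=False),
]
view = vulnerability_list_view(rows)
for row in view:
    print(row.vuln_id, row.severity, row.threat_level)
```

Using a two-part sort key keeps the waived group at the bottom regardless of how severe its entries are.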

Vulnerability Lookup

From the Vulnerabilities list, you can access detailed information on a given vulnerability by clicking the Vuln ID. This takes you to the corresponding Vulnerability Lookup page.

example Vulnerability Lookup page

Keep in mind that the Vulnerability Lookup page can be accessed at any time by selecting Vulnerability Search from the IQ Server toolbar.

From the Legacy Report

Sonatype-enriched vulnerability data is available from the IQ Server Application Composition Report. Select the Security Issues tab and then select the problem code you’re investigating:

example legacy report with the Security Issues tab highlighted

Then view the detailed Vulnerability Information:

example vulnerability information

You can also access this information from the Vulnerabilities tab of the Component Information Panel.

How do I Use Vulnerability Information?

The important thing to remember is that seeing security vulnerabilities in an application evaluation should prompt further investigation.

For example, if a component upgrade is recommended, use the Component Details Page to identify a recommended version that is both non-vulnerable and popular:

example remediation suggestions

If the new version has the same API as the previous version, run your unit and integration tests. If everything passes, the upgrade remediates the policy violation.

Talk to Us

Have more questions or comments? Learn more at help.sonatype.com, join us in the Sonatype Community, and view our course catalog at learn.sonatype.com.

And visit my.sonatype.com for all things Sonatype.