Understanding Vulnerability Data

What is Sonatype Vulnerability Data?

Sonatype creates its data using a proprietary, automated vulnerability detection system that monitors, aggregates, and correlates publicly available information and applies machine learning to it. We gather data from various sources, including the National Vulnerability Database, website security advisories, email lists, GitHub events from all open source projects, blogs, OWASP, OSS Index, Twitter, and customer reports. We have evaluated many paid-for services and found the quality and precision of their data to be of limited value, which drove our decision to build an intelligent, automated vulnerability detection system. The Sonatype Data Research team is not in the business of simply aggregating public security-related feeds; we create the precise data we use.

How Does Sonatype Provide High Quality Data?

There are two considerations for data quality: (1) the content of the security advisory and (2) the precision of associating that content with the correct artifact. Automated decisions require extremely precise artifact identification and a corresponding association of security information. Without accurate identification and association, the rate of false positives is high. We recently conducted a study of 6000 of the most popular Java components and found that the name-based security association algorithms used by every tool other than Sonatype resulted in:

  • 4500 correct non-issue identifications
  • 1034 true positives
  • 5330 false positives when the CPE (Common Platform Enumeration) identified by the advisory was part of the component name
  • 2969 false negatives when the CPE identified by the advisory was not in the component name

False positives incur unnecessary research and upgrade costs. False negatives leave you exposed because nothing indicates that you may be at risk. Sonatype uses a combination of automated identification and human research that eliminates false positives and negatives. This saves the research time otherwise spent disproving false positives and the rework time spent on upgrades that are not required.
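To put the study figures in perspective, the counts can be turned into precision and recall. The sketch below assumes the four reported counts can be read together as one confusion matrix. The source does not state this, and the false positives and false negatives were observed under different conditions, so treat the result as a rough illustration only.

```python
# Counts reported for name-based matching in the study described above.
TRUE_NEGATIVES = 4500    # correct non-issue identifications
TRUE_POSITIVES = 1034
FALSE_POSITIVES = 5330
FALSE_NEGATIVES = 2969


def precision(tp: int, fp: int) -> float:
    """Share of reported issues that are real."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of real issues that are actually reported."""
    return tp / (tp + fn)


if __name__ == "__main__":
    print(f"precision: {precision(TRUE_POSITIVES, FALSE_POSITIVES):.1%}")  # 16.2%
    print(f"recall:    {recall(TRUE_POSITIVES, FALSE_NEGATIVES):.1%}")     # 25.8%
```

Under that assumption, roughly one in six flagged items is a real issue, and about one in four real issues is flagged at all.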

What Data Does Sonatype Provide?

  • The source of the advisory: Sonatype Security Research or the National Vulnerability Database
  • The severity of the issue: the CVSS score, the scoring system version, and the source of the score
  • The Common Weakness Enumeration (CWE)
  • The exact description from the advisory
  • A detailed explanation of the advisory risk and the attack vector (because the advisory description is often very poor)
  • How to determine if you are vulnerable
  • A recommendation on how to fix or work around the issue
  • The root cause of the issue: the exact class and vulnerable version range that was found in your code
  • Publicly known attack vectors or exploits; additional resources that describe the exact issue
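If you want to carry these fields through your own tooling, one way is a small record type. The following is a minimal sketch only: the class name, field names, and types are hypothetical and do not reflect the actual IQ Server data model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VulnerabilityRecord:
    """Hypothetical container mirroring the fields listed above."""
    identifier: str            # e.g. an ID prefixed with CVE or SONATYPE
    source: str                # Sonatype Security Research or the National Vulnerability Database
    cvss_score: float          # severity score
    cvss_version: str          # scoring system version
    cvss_source: str           # who produced the score
    advisory_description: str  # exact description from the advisory
    explanation: str           # detailed risk and attack-vector explanation
    detection: str             # how to determine if you are vulnerable
    recommendation: str        # fix or workaround
    root_cause: str            # vulnerable class
    vulnerable_versions: str   # vulnerable version range
    cwe: Optional[str] = None  # Common Weakness Enumeration, if known
    references: List[str] = field(default_factory=list)  # exploits, further resources


# All values below are placeholders, not real advisory data.
example = VulnerabilityRecord(
    identifier="SONATYPE-0000-0000",
    source="Sonatype Security Research",
    cvss_score=0.0,
    cvss_version="3",
    cvss_source="placeholder",
    advisory_description="placeholder",
    explanation="placeholder",
    detection="placeholder",
    recommendation="placeholder",
    root_cause="placeholder",
    vulnerable_versions="placeholder",
)
print(example.identifier, example.cvss_version)
```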

How is a Vulnerability Score / Severity Calculated?

Sonatype uses the Common Vulnerability Scoring System (CVSS) to score vulnerabilities.

If a vulnerability identifier is prefixed with SONATYPE, then the vulnerability severity is its CVSS version 3 score.

If a vulnerability identifier is prefixed with CVE, then the vulnerability severity is its CVSS version 2 score.

Sonatype plans to ultimately migrate all CVSS version 2 scores to version 3. If a version 3 score is not available, the score will remain version 2.
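The prefix rule above can be expressed as a small lookup. This is a sketch under stated assumptions: the function name is hypothetical, the match is case-sensitive, and it encodes only the rule as described here, which may change as scores migrate to version 3.

```python
def cvss_version_for(identifier: str) -> int:
    """Return the CVSS major version behind the severity of a vulnerability ID.

    Hypothetical helper: encodes only the prefix rule described in the text.
    """
    if identifier.startswith("SONATYPE"):
        return 3
    if identifier.startswith("CVE"):
        return 2
    raise ValueError(f"unrecognized vulnerability identifier: {identifier!r}")


if __name__ == "__main__":
    # Placeholder identifiers, not real advisories.
    for vuln_id in ("SONATYPE-0000-0000", "CVE-0000-0000"):
        print(vuln_id, "-> CVSS version", cvss_version_for(vuln_id))
```

Raising an error for unknown prefixes, rather than guessing a version, keeps a mislabeled identifier from silently receiving the wrong severity scale.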

Where are the Source Components?

Component binaries come from popular repositories like Central, NuGet.org, npmjs.org, Fedora EPEL, and PyPI. We also ingest components directly from GitHub and other project download sites when customers nominate them.

Binary repositories provide the ability to extract information like declared licenses, popularity, and release history. Additional component metadata comes from a variety of sources including direct research.

When is Vulnerability Data Available?

Sonatype Data Services are continuously updated, allowing the most recent data to be visible the instant a Nexus Lifecycle analysis occurs. This is true for both newly published components and newly discovered security issues. We have two processing queues for security vulnerabilities to ensure immediate availability of security data to our customers:

  • Fast Track - Our automated vulnerability detection systems process the various data sources each day. Upon issue discovery, the issue is validated by a researcher to ensure the correct component was identified, a brief issue description exists, and the vulnerable version range matches the advisory. This process generally makes newly discovered vulnerabilities available in 1-3 days depending on severity of the issue.

  • Deep Dive - The deep dive queue is a more methodical approach to ensure the issue has a clear explanation and fix recommendation. This is also where the source code of each issue is investigated to ensure the vulnerable version range is accurate. This process generally takes 5+ days, but can take less for issues deemed critical.

How do I Access Vulnerability Information?

Sonatype-enriched vulnerability data is available from the IQ Server Application Composition Report. Select the Security Issues tab, and then select the problem code you’re investigating to view the detailed Vulnerability Information.

You can also access this information from the Vulnerabilities tab of the Component Information Panel.

How do I Use Vulnerability Information?

The important thing to remember is that security vulnerabilities found when evaluating your application should prompt further investigation.

For example, if a component upgrade is recommended, use the Component Information Panel (CIP) to identify a recommended non-vulnerable, popular version.

If the new version has the same API as the previous version, run your unit and integration tests and confirm that everything passes to remediate the policy violation.