What We Learned When We Researched Open Source Vulnerabilities in 7 Popular Coding Languages

Programming languages are a lot like people: each has quirks that make it what it is, and each is a little different in its own way.

While we’re not defined by our flaws and particularities, they are a part of us, shaping how we interact with the world around us. The same holds true for programming languages when we think about how different kinds of vulnerabilities raise their heads in the languages we love.

To shed light on this from the perspective of people who work in open source security, we looked at how seven of the most popular programming languages (C, Java, JavaScript, Python, Ruby, PHP, and C++) have fared over the past ten years when it comes to open source vulnerabilities, and gathered some insights from the data.

Our research is based on WhiteSource’s extensive database of reported open source security vulnerabilities, which continuously aggregates information from multiple sources such as the National Vulnerability Database (NVD), security advisories, GitHub issue trackers, and the issue trackers of other popular open source projects.

Is One Programming Language More Secure Than The Rest? 

Going in, some of us set out to prove that one programming language reigns secure above all the rest. But sorting through the data, we found that there is a reason all seven of these languages are so popular: each one has its own set of issues and advantages when it comes to security.

Image 1.jpg

In our report, we did a deep dive into the number of vulnerabilities per language, and also looked at vulnerability types and severity for each of the seven languages. Here, we’ll take a closer look at two favorites, C and JavaScript, to learn more about which open source issues coders should look out for and how to address them.

C Has the Most Known Open Source Vulnerabilities: So What?

The winner of the Most-Open-Source-Vulnerabilities category is C. According to our database, over the past 10 years C has accumulated the highest number of reported open source vulnerabilities, accounting for almost 50% of all open source vulnerabilities reported across the seven languages.

Does this mean that C is the most insecure programming language? Not at all. 

C has been in use for nearly 50 years, and has the largest volume of code behind it. It also underpins major infrastructure components that our software products rely on, like the Linux kernel, OpenSSL, and many more. Being this central to development, and in use for so long, means that there are many eyes reviewing the code and many hands reporting and fixing bugs over a long period of time. That is one of the reasons C has such a high count of known open source vulnerabilities.

C Known Security Vulnerabilities per project:

Image 2.jpg

To learn more about the programming languages and the best practices for using them securely, we also looked at each language’s most common vulnerability types, or CWEs.

Image 3.jpg

The most common CWEs for the other languages were related to web and web services: Cross-Site Scripting (XSS); Input Validation; Permissions, Privileges, and Access Control; and Information Leak / Disclosure. The vulnerability type that topped the list for C was Buffer Errors (CWE-119). C stands out here because these and other related CWEs are largely ruled out in managed languages, which check memory accesses on the programmer’s behalf, and C is not one of them. C gives programmers more power, and with it more room to go wrong, which makes these types of errors typical of C.
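To make the buffer error category concrete, here is a minimal sketch of the kind of bounds check that C leaves to the programmer. The helper name copy_bounded is hypothetical and used for illustration only; it is not taken from our report or from any particular library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * The classic CWE-119 pattern is an unchecked copy such as
 *     strcpy(dst, src);
 * which keeps writing past the end of dst whenever src is longer
 * than the destination buffer.
 *
 * copy_bounded checks the length first. It returns 0 on success and
 * -1 if src (plus its terminating '\0') does not fit in dst_size
 * bytes, in which case dst is left as an empty string.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }

    size_t len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';          /* refuse rather than truncate or overflow */
        return -1;
    }

    memcpy(dst, src, len + 1);  /* len + 1 copies the terminator too */
    return 0;
}
```

With a buffer declared as char buf[8], a call like copy_bounded(buf, sizeof buf, "short") succeeds, while an 8-character input is rejected because there is no room left for the terminator. Rejecting oversized input instead of silently truncating it is a design choice made for this sketch; other code may reasonably prefer truncation.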

JavaScript: Home to a Large and Loud Open Source Community

Some languages just feel the urge to be different from the rest, and JavaScript is no exception, by making itself, well, an exception.

While information about open source vulnerabilities is generally spread out among different resources (the NVD, issue trackers, and so on, as discussed above), JavaScript goes one or two steps further. According to our findings, 31.5% of JavaScript vulnerabilities aren’t reported to the NVD. Instead, the community prefers to publish its findings in security advisories and issue trackers within its own ecosystem. This means that vulnerabilities can easily slip beneath the radar if you don’t know where to look.

Image 4.jpg

JavaScript is also in a league of its own when it comes to CWEs that aren’t prominent elsewhere. Our data showed that the top CWE here was Cryptographic Issues, followed by Path Traversal. Neither of these CWEs was significant in any of the other languages.

image 5.jpg

This is yet another example of how if you dig deep enough, the numbers can tell a very interesting story. A lot of the Crypto and Path Traversal issues can be traced to a relatively small number of researchers, who found them in unpopular or even dead packages. 

To understand more about these unusual CVEs, we looked into the npm packages, and learned that while well over half (61%) of the JavaScript vulnerabilities there are Path Traversal and Crypto issues, 70% of those packages had fewer than 2,000 downloads in 2018 and are hardly ever used, maintained, or supported.

The increased use of a new generation of automated tools to discover certain types of easier-to-find vulnerabilities is one of the reasons behind this anomaly. When we look at the years these vulnerabilities spiked, we see that nearly all of the Cryptographic Issues (CWE-310) were found in 2016, and the vast majority of the Path Traversal issues (CWE-22) were found in 2017. This supports our claim that the ease of locating CWE-310 and CWE-22 vulnerabilities in JavaScript is behind the unusually high number of these types of CWEs.

Bottom Line: Is Your Favorite Language the Most Secure? 

The simple answer is yes. And no. And it depends. Obviously, developers won’t choose a language based on our study, and that’s a good thing. Each language is its own beast and needs to be treated as such, with respect, a lot of love, and a bit of humility. 

Simply looking at the total number of known open source vulnerabilities will teach us very little, if anything. As this study shows, some of the languages with the most vulnerabilities are also those that are the most integral to how we build our products. It’s not like we are going to stop using the Linux kernel just because it is written in C. On the other hand, looking at the data to learn about each language’s strong and weak points, what to look out for, and where to double down on security is a great practice.
