Why Use Memory Safe Languages?

A variety of solutions have been proposed to address vulnerabilities caused by memory-safety defects. Why are memory safe languages the right answer?

Introduction

Memory safety issues are the leading cause of exploitable software defects. The exact reported figures vary by vendor and code base, but the numbers are typically in the 60-70% range.

Memory safety has even attracted the attention of the U.S. federal government, with reports from the Cybersecurity and Infrastructure Security Agency (CISA) stressing the importance of addressing memory safety defects in software.

Worse, memory safety defects have been the cause of the worst sorts of exploitable bugs: Denial of Service (DoS) and Remote Code Execution (RCE) attacks, in which an attacker can run arbitrary code of their choice on a victim’s computer.

Memory safety?

“Memory safety” encompasses a range of interrelated issues in some programming languages (most notoriously, but by no means exclusively, C) that allow memory to be read or overwritten inadvertently by a running program. This can lead to the program crashing, or to its execution being altered in ways the programmer did not intend.

Some examples of memory safety issues are:

  • Out-of-bounds memory reads and writes
  • Use-after-free
  • Invalid pointer dereferences (null or dangling pointers)

If a program writes to a memory location that the operating system hasn’t mapped into its address space, a crash will usually occur (this might result in a DoS condition). If a program can be tricked into, say, overwriting the return address saved on the stack for the currently executing function, an attacker may be able to redirect execution to arbitrary instructions of their choice (RCE).
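To make this concrete, here is a minimal sketch in Rust (chosen as a representative memory-safe language; the buffer size and index are arbitrary illustrations) showing how a memory-safe language turns an out-of-bounds access into a checked condition rather than silent corruption:

```rust
fn main() {
    // A fixed-size buffer, analogous to a stack array in C.
    let buf = [0u8; 8];

    // In C, writing to buf[12] would silently touch adjacent stack
    // memory (possibly a saved return address). In a memory-safe
    // language, the access is bounds-checked.
    let index = 12;
    match buf.get(index) {
        Some(value) => println!("buf[{index}] = {value}"),
        None => println!(
            "index {index} is out of bounds for a buffer of length {}",
            buf.len()
        ),
    }
}
```

Indexing directly with `buf[index]` would instead panic at runtime with an out-of-bounds error; either way, the adjacent memory is never touched.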

Examples

The underlying problems behind memory safety issues have been known in the software engineering community for more than 50 years, but for a long time these problems remained largely theoretical as a security issue, and were more commonly encountered in routine software defects (i.e., internally triggered program crashes or data corruption) than in externally triggered, overt attacks.

The first major memory safety security incident was undoubtedly the Morris Worm, in 1988. Among other attack vectors, it exploited a (quite typical for the time) buffer overflow vulnerability in the finger network service, caused by writing past the end of a fixed-size buffer on the stack. On the nascent Internet, it reportedly infected as many as 6,000 machines (although the real number was likely in the low thousands).

In 1996, the seminal paper Smashing The Stack For Fun And Profit by Aleph One brought awareness of these attack techniques to a new generation of programmers.

Seven years later, in 2003, the SQL Slammer worm was unleashed. Like the Morris Worm, it exploited a buffer overflow vulnerability, this time in Microsoft SQL Server, infecting 75,000 hosts in just minutes.

In 2017, the WannaCry ransomware campaign spread using an NSA-developed exploit (EternalBlue) that, at its core, used an integer overflow and a buffer overflow bug to achieve Remote Code Execution.

In 2023, Citizen Lab reported that a buffer overflow vulnerability was used to deploy NSO Group’s Pegasus spyware.

Responses

If these problems have been known for 50 years, why are we still dealing with them? Most fundamentally, the hardware we use (CPUs and memory architecture) makes these vulnerabilities possible. Close behind that: a staggering amount of software has been written in C-family languages over the past 40 years, and this software isn’t going anywhere any time soon.

There are, broadly speaking, six approaches to mitigating memory safety issues:

  1. Hardware solutions, such as CHERI and ARM’s Memory Tagging Extension (MTE). Less practically: alternatives to the von Neumann architecture
  2. OS-level runtime mitigations, such as Address Space Layout Randomization, non-executable pages, etc.
  3. Formal verification methods
  4. Dynamic and Static analysis tools used with memory-unsafe languages and code bases
  5. Safe subsets of unsafe languages
  6. Languages that offer memory-safety guarantees as part of the language specification

Among these, there is considerable overlap. For instance, you could view the Rust borrow checker as an integrated static analysis solution, and non-executable pages generally rely on CPU support to work.

Hardware solutions will likely continue to be adopted, with Google adding support for ARM MTE back in 2023 and Apple announcing Memory Integrity Enforcement in September 2025.

While not the focus of this post, formal verification methods are also seeing increased use, although the proofs continue to be extremely labor intensive to generate.

Memory safe languages

How do memory-safe languages provide safety guarantees? Generally, in one of two ways: garbage collection or compile-time verification.

Garbage-collected languages like Java, C#, and Go achieve memory safety by managing memory allocation and deallocation automatically at runtime. The garbage collector tracks object references and periodically reclaims memory that is no longer reachable, eliminating entire classes of bugs like use-after-free and double-free errors. These languages also typically perform bounds checking on array accesses and null checking on pointer dereferences, catching potential violations at runtime.

The garbage collection approach has proven highly successful in many domains. Go, in particular, has gained significant traction in infrastructure development, powering projects like Docker and Kubernetes. Its garbage collector has been optimized for low latency, with recent versions achieving pause times under 500 microseconds for most applications.

However, garbage collection comes with inherent trade-offs. The unpredictability of garbage collection pauses can be problematic for systems with strict latency requirements. Memory overhead is higher than manual memory management due to the need for metadata and headroom for the collector to operate efficiently.

Compile-time verification, exemplified by Rust, takes a very different approach. Rust’s ownership system and borrow checker enforce memory safety rules at compile time without runtime overhead. Every value has a single owner, and the compiler tracks borrowing (references) to ensure that no value is accessed after being freed and that data races are impossible. This zero-cost abstraction approach means that memory safety comes without runtime performance penalties.
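A small sketch of these ownership rules (function and variable names are illustrative, not from any particular code base): each value has a single owner, a move invalidates the old binding, and borrows let the owner keep the value.

```rust
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped (freed) here; the compiler guarantees no later use

fn first_byte(s: &str) -> u8 {
    s.as_bytes()[0] // bounds-checked: panics rather than reading out of range
}

fn main() {
    let s = String::from("hello");

    // Ownership of the String moves into `consume`.
    let len = consume(s);
    // println!("{s}"); // would not compile: use of moved value `s`

    // A shared borrow (&t) leaves ownership with `t`.
    let t = String::from("world");
    let first = first_byte(&t);
    println!("len = {len}, first byte of {t:?} = {first}");
}
```

The commented-out line is exactly the kind of use-after-free pattern that compiles in C but is rejected at compile time in Rust, with no runtime cost.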

Rust’s model has proven particularly attractive for systems programming scenarios where performance predictability is crucial. The language has been adopted for critical components in Firefox, the Windows kernel, the Linux kernel, and Android. Amazon has reported that since beginning to write new services in Rust, the proportion of production issues caused by memory safety has dropped significantly. Microsoft has similarly announced initiatives to rewrite critical Windows components in Rust.

So, what’s the downside? New Rust programmers sometimes talk of “fighting the borrow checker” to get their code to compile. Many programs with unsafe memory access patterns and data races that could easily be written in C (and that may or may not work some or all of the time) simply will not compile when written in Rust. While the extra diligence pays off in lower defect rates, the learning curve can be steep and ramp-up time for new Rust programmers can be longer than in other languages. Additionally, the borrow checker isn’t perfect: there are some perfectly safe access patterns that it cannot verify as safe and therefore will not allow. Work is underway to enhance the borrow checker.

With the recent focus on memory safety, one might think that ‘memory safe languages’ are new. That is not the case. In fact, the second-oldest high-level language still in use (Lisp) is memory-safe. Common high-level and scripting languages are generally memory-safe as well, including Python and JavaScript. Newer memory-safe languages, like Go and Rust, are notable for providing memory safety that is usable in domains where, for performance or other reasons, C or C++ would traditionally have been used.

Why not add memory safety to existing languages/systems?

People have tried. Runtime techniques like Address Space Layout Randomization and Stack Canaries have made exploiting bugs more difficult, but attackers have managed to develop new techniques (such as Return Oriented Programming) to overcome these mitigations.

Approaches like adopting a “safe subset” of an unsafe language or using add-on analysis tools like Valgrind have also helped, but the problem with these approaches is a) they are voluntary and b) they require discipline and consistency across a project’s entire dependency chain. More simply, these methods don’t scale. As Microsoft observed:

“While many experienced programmers can write correct systems-level code, it’s clear that no matter the amount of mitigations put in place, it is near impossible to write memory-safe code using traditional systems-level programming languages at scale.”

Economic considerations might dictate reusing an existing code base written in an unsafe language, and for a system that already exists, careful application of static/dynamic analysis tools, hardware and OS/runtime mitigations, and disciplined coding practices may be the only feasible path.

But for new projects, memory safe languages represent a fork in the road. To use C or C++ is to bet that your programmers are all significantly above average and can collectively avoid the problems that have affected almost every system written in these languages. You might be right, but even so, why fight with one hand tied behind your back? Why not use a tool that eliminates entire defect classes so that your above-average programmers can focus on the features and capabilities that make your system unique? The design trade-offs in C were likely appropriate for the machines available in 1972, but in 2025, where hardware resources abound, extensive compile- and run-time checking is easily afforded. Manual memory management and a lack of runtime bounds checking are a profound deficit that requires unreasonable effort to overcome. It simply does not make good sense to start a new project in C or C++ today.

Marcus Butler


Systems programmer and managing partner at Third Act Systems.