The dominance of boring languages for large scale systems (and some other areas)

If you spend a lot of time on Twitter, HN, and Reddit, it’s easy to forget how much software is written in languages that are considered to be quite boring. If I look at the list of things I’m personally impressed with (things like Spanner, BigTable, Colossus, etc.), it’s basically all C++, with almost all of the knockoffs in Java. When I think for a minute, the list of software written in C, C++, and Java is really pretty long. Among the transitive closure of things I use and the libraries and infrastructure used by those things, those three languages are ahead by a country mile, with PHP, Ruby, and Python rounding out the top 6. JavaScript should be in there somewhere if I throw in front-end stuff, but it’s so ubiquitous that making a list seems a bit pointless.

These lists are long enough that I’m going to break them down into some arbitrary sublists. As is often the case, these aren’t really nice orthogonal categories and should be tags, but here we are. In the lists below, apps are categorized under “Backend” based on the main language used on the backend of a webapp. The other categories are pretty straightforward, even if their definitions are a bit idiosyncratic and perhaps overly broad.

C

Operating Systems

Linux, including variants like KindleOS
BSD
Darwin (with C++)
Plan 9
Windows (kernel in C, with some C++ elsewhere)

Platforms/Infrastructure

Memcached
nginx
Apache
DB2
PostgreSQL
Redis
Varnish
HAProxy

Desktop Apps

git
GIMP (with Perl)
VLC
QEMU
OpenGL
FFmpeg
Most GNU userland tools
Most BSD userland tools
AFL
Emacs
Vim

C++

Operating Systems

BeOS/Haiku

Platforms/Infrastructure

GFS
Colossus
Ceph
Dremel
Chubby
BigTable
Spanner
MySQL
ZeroMQ
ScyllaDB
MongoDB
Mesos
JVM
.NET

Backend Apps

Google Search
PayPal

Desktop Apps

Chrome
MS Office
LibreOffice (with Java)
Evernote (originally in C#, converted to C++)
Firefox
Opera
Visual Studio (with C#)
Photoshop, Illustrator, InDesign, etc.
gcc
llvm/clang
Winamp
Z3
Most AAA games
Most pro audio and video production apps

Elsewhere

Also see this list and some of the links here.

Java

Platforms/Infrastructure

Hadoop
HDFS
Zookeeper
Presto
Cassandra
Elasticsearch
Lucene
Tomcat
Jetty

Backend Apps

Gmail
LinkedIn
eBay
Most of Netflix
A large fraction of Amazon services

Desktop Apps

Eclipse
JetBrains IDEs
SmartGit
Calibre
Minecraft

VHDL/Verilog

I’m not even going to make a list because basically every major microprocessor, NIC, switch, etc. is made in either VHDL or Verilog. For existing projects, you might say that this is because you have a large team that’s familiar with some boring language, but I’ve worked on greenfield hardware/software co-design for deep learning and networking virtualization, both with teams that were hired from scratch for the project, and we still used Verilog, despite one of the teams having one of the larger collections of Bluespec-proficient hardware engineers anywhere outside of Arvind’s group at MIT.

Conclusion

I’m not really sure why the vast majority of the types of systems I’m interested in (platforms/infra) are written in boring languages, but I’m reminded of Sutton’s response when asked why he robbed banks: “because that’s where the money is”. Why do I work in boring languages? Because that’s what the people I want to work with use, and what the systems I want to work on are written in. I find the MLs much more pleasant to use than most managed languages in use today, and if I were king, I would make F# the default managed language. But if I take a job writing a managed language for a backend position, I’m overwhelmingly likely to use Java. If I take a job writing a non-GC language, Rust would be nice, but I’m overwhelmingly likely to end up writing C or C++.

If my choices were to land on a random project writing Rust, or a really compelling project writing C++, I’d choose the compelling project. YMMV.

Please suggest other software that you think belongs on this list; it doesn’t have to be software that I personally use. Also, does anyone know what EC2, S3, and Redshift are written in? I suspect C++, but I couldn’t find a solid citation for that.

Thanks to Leah Hanson, James Porter, Waldemar Q, Nat Welch, Arjun Sreedharan, Rafa Escalante, @matt_dz, Bartlomiej Filipek, Josiah Irwin, and Presto for additions to this list. Also, thanks to Matt Godbolt, Leah Hanson, and Josiah Irwin for spotting typos.