Six Hot Languages Programmers Should Learn

I just saw the crappiest bit of developer click-bait come through my feed. It purported to be a list of the six best languages for programmers to learn, including C++, JavaScript, PHP and hell… I can't complete the list, it's so inane. I figured I would write a few words about why programming languages matter and why Java and C++ probably shouldn't be on anyone's "best" list.

First, let's talk about why programming languages matter. At the simplest level programming languages allow humans to tell computing machines what to do. But if you work with other programmers for a while, you realize that programming languages have a social function as well. Programming languages tell other programmers what you thought you wanted to say to a machine. It sounds subtle, but it's an important difference. If I'm only writing programs to communicate to a machine, why do I need comments? Donald Knuth went off on this back in the 80s and 90s; just google "Donald Knuth Literate Programming." And if you want to argue with the godfather of modern programming, that's fine. That's not the path I'm going to take.

But as interesting as Knuth's work on "Software as Literature" is, it's not entirely obvious in a lot of Knuth's writings that software is also used to communicate models of the world. If you want to read something about this, consider reading Papert's "Mindstorms: Children, Computers and Powerful Ideas." Papert was mostly talking about kids, but much of what he says about how kids learn is directly applicable to communicating complex models between members of a software development team.

If you don't want to wait for Amazon to deliver a copy of Mindstorms to your front door, you can read Papert's paper: "Teaching Children Thinking." CSAIL has very kindly uploaded a PDF of the paper and it's available as a free download. It's not a long paper. It talks about how they taught programming to children in the late 60s, but trust me, it has direct applicability to modern software teams (and their users.) Rather than keep you in suspense, let me pick out the bit I think is most important. On page 4-1, Papert says:

"…I propose creating an environment in which the child will become highly involved in the experiences of a kind to provide rich soil for the growth of intuitions and concepts for dealing with thinking, learning, playing and so on."

In other words, Papert doesn't want to teach kids to code, he wants to provide an environment kids use to explore new methods of problem solving. The paper doesn't go into great detail about how the LOGO team did that. You might think he's advocating ignoring teaching an API or details of the LOGO language, but you would be wrong. I think what he's saying here is teaching APIs and details of programming language syntax are not nearly as important as providing a vehicle for kids to build models about the world and the problem they're trying to solve.

And that's the important part of programming languages: they allow programmers to construct models of the real world; to abstract out the important bits of reality and PLAY with them. If you don't like the word "play," then substitute "experiment." If you don't think there's experimentation going on every day by programmers, then you've probably never tried to use libboost. (Also, get a copy of Homo Ludens and read it. Play is very important.)

So the core message I'm trying to communicate here is: Programming languages carry representational models of the problem domain between programmers. If a programming language provides a mechanism to communicate a concept succinctly, it is more likely to be understood clearly by the programmer you're trying to communicate with.

Larry Wall is credited with saying something like "Programming languages differ not in what they make possible, but in what they make easy." The bit I would add on to the end of this is that clearly and succinctly communicating the intent of the programmer and the structure of the problem domain are things that programming languages should try to make easy.
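As a small illustration of "what they make easy," here is the same check written twice in Python: once by hand and once as a pattern. This is only a sketch. The function names and the date-shaped example are made up for illustration and do not come from any particular code base.

```python
import re


def looks_like_date_by_hand(text):
    """Check for the shape DDDD-DD-DD (ASCII digits) one character at a time."""
    if len(text) != 10:
        return False
    for i, ch in enumerate(text):
        if i in (4, 7):
            # Positions 4 and 7 must be the separators.
            if ch != "-":
                return False
        elif not ("0" <= ch <= "9"):
            return False
    return True


# The same intent, stated as a pattern instead of as a procedure.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def looks_like_date_by_pattern(text):
    """Check for the same shape by matching the whole string against DATE_SHAPE."""
    return DATE_SHAPE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["2015-06-01", "2015/06/01", "15-06-01", "2015-06-011"]:
        print(sample, looks_like_date_by_hand(sample), looks_like_date_by_pattern(sample))
```

Both functions accept and reject the same strings, but a reader can take in the second one's intent at a glance. Neither checks that the date is a real calendar date; they only check its shape.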

Programmers should certainly learn a number of languages, and if you want to learn a language for purely vocational reasons, there's absolutely nothing wrong with that. My message here is simply: do not let vocational concerns solely dictate which programming languages you investigate; you absolutely need to investigate different techniques to model problems. Many programming languages that introduce new conceptual models are not commercially popular; ignore them at your peril.

The Six Hot Languages Programmers Should Learn

So I mentioned I was going to provide my own list of languages programmers should learn. And since I've bored you with a bunch of theory, let me leave you with some examples. Here goes:

  1. Regular Expressions & BNF
    1. "Regular Expressions" aren't a language, you say? Okay, sure, regular expressions and Backus-Naur Form aren't Turing-complete, but they embody a powerful concept: specifying a pattern. If you use regular expressions in any modern language, what's happening under the hood is that the system compiles the expression into a little program, conceptually a finite state automaton, whose purpose is to tell you if (and where) it finds a match between the expression and some input data.
    2. When I worked at Amazon, one of my favourite things was to point out to people how the regular expressions they were using led to pathological results. Google "Pathological Regular Expressions" for more info.
    3. The concept of pattern matching is fundamental to many software engineering tasks. If you're going to be a gigging coder, you REALLY need to master Regular Expressions as well as tools like Lex and Yacc (or Flex and Bison if you're a GNU person.)
  2. SNOBOL / SPITBOL
    1. What the hell is SNOBOL?
    2. Here's the Wikipedia article, it's as good an intro as any: SNOBOL.
    3. Long story short: SNOBOL is an old-school programming language focused on matching and manipulating strings. I'm including it on this list because I *always* talk about SNOBOL after discussing Regular Expressions. SNOBOL allows the programmer to easily construct patterns that perform a type of matching that's not limited to "Regular Languages."
    4. I'm not going to get into the details of the different types of grammars; there's enough material there to fill an entire upper-division computer science course. But you might have heard people mention you're not supposed to use regular expressions to parse HTML files. The reason is that HTML's arbitrarily nested tags don't form a Regular Language, so an HTML parser needs more machinery than a Regular Expression matcher has. Learning SNOBOL and how it differs from Regular Expressions is a great start down the road of understanding programming language grammar and the structure of common data formats.
  3. Self ( or the bits of JavaScript that deal with prototypal inheritance )
    1. Never heard of Self? No biggie. No one else has, either.
    2. But both Java and JavaScript count Self as a first class influence. On one hand, a lot of what we know about Just-In-Time compilers in virtual machines comes directly from the research project at Sun Labs that also led to Self. Much of this hard-won knowledge was later poured into Java in the early days to make it a decently performant runtime environment. On the other hand, Self pioneered the concept of Prototypal Inheritance.
    3. You can actually download a modern version of Self from www.selflanguage.org. I recommend looking at it. It's crazy. It will bend your mind. But at the end of the experience, you'll understand why JavaScript was perfectly fine without classes.
  4. Prolog
    1. Prolog is one of those languages that a lot of people have heard about, but no one's actually used.
    2. This is sad because Prolog is so different from modern, imperative or object-oriented languages.
    3. In Prolog, you type out a series of facts and a series of logical relationships between facts and classes of facts. You can then give the system a sample statement and it will use chronological backtracking to try to prove whether or not the statement is true. Ignoring the syntax, a typical Prolog program goes something like this:
    4. - Tom is a cat.
    5. - All cats are animals.
    6. - Is Tom an animal?
    7. And the system responds "True" - This is a simple example, but you can keep building larger and larger "programs" and ask the system more important and interesting questions. For instance, modeling a travelling salesman problem in Prolog probably takes fewer than 20 lines of code.
    8. Before the (free) SWI Prolog became the dominant implementation on Linux, Borland had a DOS based Turbo Prolog system. I still have a copy of it and often run it in DosBox. (But it's no longer supported, so you should probably go with SWI Prolog unless you're a RetroComputing freak like me.)
  5. Excel
    1. I'm not talking about Excel Macros, I'm talking about the way every spreadsheet known to man detects changes and propagates those changes across the entire sheet. This is an example of Dataflow or Reactive Programming.
    2. Dataflow is a paradigm that's largely ignored even though it has direct applicability to web frameworks and all kinds of GUIs. You're probably going to be okay professionally if all you know about change propagation is "Model - View - Whatever," but there's a lot more out there and when you start looking at distributed systems, reading up on dataflow models will give you a leg up when people start talking about things like "eventual consistency." And let's be honest, the people who yammer on endlessly about eventual consistency are the ones you really need to smack down with a good reference to a change propagation model from the 1950s.
  6. Verilog
    1. I know. You're wondering why the heck I put a Hardware Description Language on a list of programming languages. That's because it's not a list of hot programming languages, it's a list of languages programmers should know. Yes, I'm being a little bit pedantic. But it's okay.
    2. Even if you only do software, you should look at Verilog. You don't need to be an expert and you don't have to buy an FPGA or fab your own chip. But it's very useful to understand what the hardware community is doing with programming languages. Verilog is a language that allows users to declare the operation of hardware. Sound familiar?
    3. The main reason I'm putting Verilog on this list is to remind programmers that source code isn't what machines execute, it is merely a representation of the program. It's a little subtle, but if you start thinking in terms of "the thing versus the representation of the thing," it may lead you to a better understanding of the thing.
    4. And besides, programming FPGAs is fun. Go learn some Verilog (or even VHDL if you're twisted in that way.)
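To make the Prolog item a bit more concrete, here is a rough Python sketch of the "Tom is a cat" example from the list above. It is not Prolog and it leaves out unification entirely; the names `FACTS`, `RULES` and `prove` are invented for this illustration. What it does show is the basic shape: a set of facts, a set of rules, and a search that tries one rule, gives up, and tries the next.

```python
# Facts are (predicate, subject) pairs: "tom is a cat", "rex is a dog".
FACTS = {("cat", "tom"), ("dog", "rex")}

# Rules are (head, body) pairs read as "X is a <head> if X is a <body>".
RULES = [
    ("animal", "cat"),           # All cats are animals.
    ("animal", "dog"),           # All dogs are animals.
    ("living_thing", "animal"),  # All animals are living things.
]


def prove(predicate, subject, depth=0, max_depth=10):
    """Return True if predicate(subject) follows from FACTS and RULES."""
    if (predicate, subject) in FACTS:
        return True
    if depth >= max_depth:
        # Crude guard against rules that refer back to themselves.
        return False
    # Try each rule whose head matches the goal. If the body cannot be
    # proven, fall through to the next candidate rule (backtracking).
    for head, body in RULES:
        if head == predicate and prove(body, subject, depth + 1, max_depth):
            return True
    return False


if __name__ == "__main__":
    print(prove("animal", "tom"))        # Is Tom an animal?
    print(prove("living_thing", "rex"))  # Is Rex a living thing?
    print(prove("animal", "garfield"))   # Nothing is known about Garfield.
```

The point is not that you should write this in Python. In Prolog the facts and rules *are* the program, and the search loop comes for free.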

Why I Did Not Mention Java

So you notice I didn't mention Java, C++, PHP or Python? That's because these languages are essentially the same language. Sure, the syntax for each is different and namespace management in PHP is sort of non-existent. But the way you model problems is virtually identical: there are data structures containing a representation of some data. You build a model by building different data structures and relating them, mostly using small functions that know how to insert a reference from one data structure to another. Often these functions live in classes and they're declared in different files, so you have to go digging for the exact subclass that contains the function, so using a text editor like VI or Emacs is sort of problematic and you wind up using an Integrated Development Environment that's sub-optimally documented and whose interface changes every two years.

Welcome to the next 30 years of your career.

Java, et al. are perfectly good vocational languages. If you learn Java or Python or JavaScript, you'll probably be able to find a job pretty easily. But if you have an inquiring mind, the job will make you want to pull your hair out because you'll spend your day trying to convince your co-workers that the Initialization Fiasco is an anti-pattern.

Learning the languages listed above will set you on the path to thinking about modeling problems differently. I'm not going to encourage you to build a new system out of Scheme or Prolog, but you probably should understand what these languages make easy (and why.) You should know how Dataflow works; you may not want to use an esoteric language that directly supports it, but maybe there's a Python library out there already that makes it much easier than rolling your own. And you won't know to look for it unless you know it exists.
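If you want a feel for what spreadsheet-style change propagation looks like before you go hunting for a library, here is a minimal sketch in plain Python. The `Cell` class and its methods are hypothetical, written only for this example. The propagation is deliberately naive: it recomputes dependents depth-first and may recompute the same cell more than once, where a real spreadsheet engine would typically order the work more carefully.

```python
class Cell:
    """A spreadsheet-like cell: either a plain value or a formula over other cells."""

    def __init__(self, value=None, formula=None, inputs=()):
        self._formula = formula
        self._inputs = tuple(inputs)
        self._dependents = []
        # Register with every input so changes can be pushed to this cell.
        for cell in self._inputs:
            cell._dependents.append(self)
        self.value = value if formula is None else self._compute()

    def _compute(self):
        return self._formula(*(cell.value for cell in self._inputs))

    def set(self, value):
        """Assign a new value to a plain cell and push the change downstream."""
        if self._formula is not None:
            raise ValueError("cannot assign to a formula cell")
        self.value = value
        self._propagate()

    def _propagate(self):
        # Naive depth-first propagation: recompute each dependent, then its dependents.
        for dependent in self._dependents:
            dependent.value = dependent._compute()
            dependent._propagate()


if __name__ == "__main__":
    price = Cell(10)
    quantity = Cell(3)
    subtotal = Cell(formula=lambda p, q: p * q, inputs=(price, quantity))
    tax = Cell(formula=lambda s: s // 10, inputs=(subtotal,))
    total = Cell(formula=lambda s, t: s + t, inputs=(subtotal, tax))

    print(total.value)  # computed from the initial values
    quantity.set(5)     # nobody calls "recalculate"; the change flows through
    print(total.value)
```

Notice that the code that changes `quantity` knows nothing about `subtotal`, `tax` or `total`. That separation is the part worth carrying over into GUI and web framework work.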