Language vs Implementation vs Platform


Reading through questions on Stack Overflow and elsewhere, I often see confusion about an aspect of programming that confused me a great deal when I was starting out. I am talking about the confusion between a particular programming language (such as C), an implementation of it (such as GCC) and a platform for which a particular implementation exists (such as Windows, Linux or bare-metal ARM). This writeup is meant to examine why the confusion exists, and to serve as a reasonable explanation that answers can point to.

Experienced developers will probably not find too much interesting about this, but hopefully this can be enlightening to some readers.

Understanding the confusion

Confusing language with implementation, or in some cases implementation with a library, is easy. Yet it is hard to even be aware that you are confused. Generally, beginning programmers are introduced to whatever they are learning by just one name. They’re learning C or Java, or C#. A university course will use that name, books and tutorials will use that name, and so on. So it’s very easy to think that you’re learning C and everything you do in your learning exercises is C.

To make the confusion even more common, a significant number of learners will be learning something for which there is one dominant implementation, so implementation-specific or platform-specific questions are rare. One such example from some 20 years ago might be Visual Basic, an extremely popular language among beginning programmers from about 1997 and for a number of years thereafter. Visual Basic was essentially synonymous with its Microsoft implementation and the Windows platform.

The Big Three terms

So let’s go through the Big Three terms here and try to define what is what.

  • Language - this is C, C++, Java, Python and so on. A programming language is largely a theoretical concept: it is designed by someone (a committee in many cases), and there are documents describing it. In some very prominent cases, the language is actually an international standard. For example, the C language is defined in the standard ISO/IEC 9899; the 2011 revision (C11) was current at the time of writing, and the previous revision from 1999 (C99) still holds major significance. Not all languages have a formally accepted standard or specification. Python is one popular language that has no formal standard, but the Python Language Reference serves essentially the same purpose.

  • Implementation - this is the actual software that implements a particular programming language. Typically, an implementation would consist mainly of a compiler (or interpreter) and an implementation of the language’s standard library. For Python, the most popular implementation is CPython (to the point where, at one time, Python the language was often considered to be defined by this implementation). For C#, the most popular implementation is Microsoft’s Visual C#, but other noteworthy implementations exist, such as Mono.

  • Platform - this is, roughly, the combination of hardware and operating system (if any) that an implementation runs on. Windows running on an x86-64 CPU is one platform, Linux running on the same CPU is another platform, and embedded development deals with many different platforms on a constant basis.

All three come into play for whatever you’re programming, whether you think about it or not. The implementation is the software that actually runs (or compiles) your code, and the platform inevitably affects how the implementation does certain things.
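As a rough illustration of how the three terms show up in practice, the sketch below asks the compiler what it knows about each of them through predefined macros. The function names are made up for this example, and the lists of compilers and operating systems are deliberately incomplete; __STDC_VERSION__ comes from the C standard itself, while the other macros are conventions of the respective implementations and platforms.

```c
/* Sketch only: the function names are hypothetical, and the macro checks
   cover just a few common implementations and platforms. */

/* Language: the revision of the C standard the compiler claims to follow,
   e.g. 199901L for C99 or 201112L for C11; 0 if the macro is not defined. */
long language_revision(void)
{
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}

/* Implementation: which compiler is building this code.
   Clang is checked first because it also defines __GNUC__. */
const char *implementation_name(void)
{
#if defined(__clang__)
    return "Clang";
#elif defined(__GNUC__)
    return "GCC (or a GCC-compatible compiler)";
#elif defined(_MSC_VER)
    return "Microsoft Visual C/C++";
#else
    return "unknown implementation";
#endif
}

/* Platform: which operating system the code is being built for. */
const char *platform_name(void)
{
#if defined(_WIN32)
    return "Windows";
#elif defined(__linux__)
    return "Linux";
#elif defined(__APPLE__)
    return "Apple (macOS or iOS)";
#else
    return "unknown platform";
#endif
}
```

Printing the three results from main on two different machines, or with two different compilers on the same machine, should make it visible that the same source file can meet a different implementation and platform each time it is built.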

When this matters

One reasonably common question where these differences and terms matter would be “How do I use sleep in C?”. There is nothing wrong with that question whatsoever: the person asking it wants to pause the execution of their program for a brief period of time. This is absolutely fine. The problem is, there is no sleep function in C! Strictly with respect to C the programming language, the answer is just that: there’s no sleep() in the language. In other words, we could say that sleep() is not a standard C function, unlike, for example, the cosine function cos().

But how do you use sleep in C, then? Well, that depends on the platform and the implementation! On any platform that adheres to the POSIX standard (outside of embedded microcontrollers, a platform is probably POSIX-compliant unless it’s MS Windows), there will be a sleep() function provided by the platform, declared in unistd.h. Programming for Linux, macOS or another POSIX-compliant system, you include that header and use sleep(), which takes a duration in seconds. On Windows, you’d instead include windows.h and use the Sleep() function (note the uppercase S), which takes a duration in milliseconds. This means that “how to use sleep in C?” is an incomplete question and cannot be answered without knowing what platform it concerns.
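One common way to deal with this in a single source file is to let the preprocessor pick the platform's function. The following is a minimal sketch under the assumption that only Windows and POSIX systems need to be supported; pause_seconds is a name invented for this example, not a function from any library.

```c
#ifdef _WIN32
#include <windows.h>   /* Sleep(), duration in milliseconds */
#else
#include <unistd.h>    /* sleep(), duration in seconds */
#endif

/* Hypothetical helper: pause the program for a whole number of seconds.
   Returns the number of seconds left unslept. POSIX sleep() reports this
   when it is interrupted by a signal; Sleep() on Windows returns nothing,
   so the Windows branch always returns 0. */
unsigned int pause_seconds(unsigned int seconds)
{
#ifdef _WIN32
    Sleep((DWORD)seconds * 1000u);
    return 0u;
#else
    return sleep(seconds);
#endif
}
```

A call such as pause_seconds(2) then behaves the same from the caller's point of view on both kinds of platform, while the platform-specific part stays in one place. On a platform that is neither Windows nor POSIX, this sketch would simply fail to compile, which is arguably the honest outcome.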

Another common scenario for these distinctions to matter is in the case of non-standard implementation-specific behavior. An implementation of a language could define some extra language features that are not present in the language standard. Consider the following switch statement:

switch (apollo) {
    case 1:
        printf("Crew perished in a launch rehearsal.\n");
        break;
    case 2 ... 3:   /* shares the branch of the next range */
    case 4 ... 6:
        printf("Unmanned test flight.\n");
        break;
    case 7 ... 10:
        printf("Successful manned flight.\n");
        break;
    case 11 ... 12:
        printf("Moon landing.\n");
        break;
    case 13:
        printf("Aborted Moon landing.\n");
        break;
    case 14 ... 17:
        printf("Moon landing.\n");
        break;
    default:
        printf("No such mission.\n");
        break;
}

Is that valid C code? Not according to the C standard, no. However, GCC implements an extended version of the C language, referred to as GNU C. In GNU C, it’s possible to use case ranges so that case 14 ... 17 is valid, even though that is not present in standard ISO C. This has several important implications in practice, one of them being that the code would likely not compile with a different C compiler.
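For comparison, here is one way the same mapping might be written using only standard ISO C, so that it should compile with any conforming compiler. This is a sketch: describe_apollo is a hypothetical helper, it returns the message instead of printing it, and the grouping of mission numbers follows the switch statement above as I read it.

```c
/* Hypothetical helper: maps an Apollo mission number to a description
   using plain comparisons instead of GNU C case ranges. */
const char *describe_apollo(int apollo)
{
    if (apollo == 1)
        return "Crew perished in a launch rehearsal.";
    if (apollo >= 2 && apollo <= 6)
        return "Unmanned test flight.";
    if (apollo >= 7 && apollo <= 10)
        return "Successful manned flight.";
    if (apollo == 11 || apollo == 12)
        return "Moon landing.";
    if (apollo == 13)
        return "Aborted Moon landing.";
    if (apollo >= 14 && apollo <= 17)
        return "Moon landing.";
    return "No such mission.";
}
```

The trade-off is a little more typing in exchange for code that does not depend on one implementation's extensions. If you are unsure whether your own code relies on such extensions, GCC's -pedantic option asks the compiler to warn about constructs that are not part of ISO C.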

The bane of conio.h

Another aspect greatly contributing to the confusion is that too many universities around the world (especially in certain parts of the world) teach introductory programming with tools and materials that can be 20 years old or more. A particularly prevalent case seems to be teaching C using Borland Turbo C (or C++), a compiler that targets MS-DOS. Such teaching will often include exercises that refer to the conio.h header file for functions such as getch and clrscr, causing great confusion among students when they try to apply their skills on a modern software platform and cannot even get the code to compile.

The reason is of course that conio.h is not a standard C or C++ header so its existence depends on the implementation (it’s still available in Microsoft Visual Studio). I am in any case of the opinion that it’s an error for most introductory programming courses or tutorials to omit the distinction between the programming language itself and the various implementations and platforms.
