Philosophy

Why another language?

Industry coding languages like Python, Java, and C++ are mostly for professionals. Python is often considered the simplest to learn, but as one long-time instructor put it, “even Python has ‘Gotchas’”. Ex: Some syntax is non-intuitive to learners, like reading integers. Ex: The lack of static typing can yield hard-to-debug errors. We considered subsetting Python, but knew the needed departures could make transitioning to real Python confusing.
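A small Python sketch of the "reading integers" gotcha mentioned above. It is an illustration chosen for this section, not an example from the quoted instructor. Python's input() returns text, so the learner must remember an explicit int() conversion. The function names are hypothetical, and the input text is passed in as parameters so the sketch runs without keyboard input:

```python
def add_without_conversion(first_text, second_text):
    # input() hands back text, so "+" joins the two strings.
    return first_text + second_text


def add_with_conversion(first_text, second_text):
    # The learner must remember to wrap each value in int().
    return int(first_text) + int(second_text)


print(add_without_conversion("3", "4"))  # 34 (text), not 7
print(add_with_conversion("3", "4"))     # 7
```

Nothing fails in the first version; the program simply prints the wrong answer, which is what makes the mistake hard for a learner to find.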

Languages like Scratch, Snap, and Alice help attract people to computing. Many instructors want something with a more serious feel for college students, and/or that leads more directly into industry languages.

Raptor is a flowchart language for learners, but we felt a newer, HTML5-based approach was needed. Plus, a unified flowchart / code language helps lead into an industry language.

Coral was designed with equivalent code and flowchart versions, unlike existing languages. Coral’s educational simulator was designed hand-in-hand with the language.

Coral was designed specifically for learning core programming concepts: input/output, variables, assignments, expressions, branches, loops, functions, and arrays. Once those concepts are learned, students are ready to transition to an industry language (if desired). Coral is intended as a step in learning, not a language for producing real applications (though who knows where it might lead).

Declarations

Data types required, all declarations at top, auto-initialized

One of our goals is to teach new programmers to think precisely — a key benefit of learning programming for non-computing-majors. Part of that goal is thinking precisely about data, such as whether an integer or a floating-point number is appropriate, or whether calculations are doing integer or floating-point arithmetic. Learners should know whether an exact comparison (for integers) or approximate comparison (for floating-point) should be done. They should know how and when conversions occur between those types. These are not bothersome details; the precise thought to get such matters right is a key benefit of learning programming. We thus do not go the way of some languages and let the type be derived from the assignment or treat all numbers as floating-point unless otherwise stated, and we especially don’t let a variable’s type change during execution. In programming-language lingo, Coral is “strongly-typed” and is “statically-typed”.
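The distinctions above (integer vs. floating-point arithmetic, exact vs. approximate comparison) can be seen in a few lines. This is a Python sketch for illustration only; it is not Coral code, and the specific numbers are our own choice:

```python
import math

# Integer arithmetic vs. floating-point arithmetic on the same operands.
print(7 // 2)   # 3   (integer division discards the remainder)
print(7 / 2)    # 3.5 (floating-point division)

# Exact comparison is safe for integers...
print(3 + 4 == 7)                    # True
# ...but not for floating-point values, which are stored approximately.
print(0.1 + 0.2 == 0.3)              # False
print(math.isclose(0.1 + 0.2, 0.3))  # True
```

A learner who has been asked to choose a type up front has a reason to expect the fourth line's result; one who has never thought about types usually does not.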

One could argue that such type details slow learners. We indeed favor ease of learning, but imprecise thinking about data types and calculations can lead to problems later. We felt data types are foundational to programming, and that glossing over types is perilous. Our research comparing learners of Python vs. C++ in 20 university-level intro courses seems to support our position.

Coral requires all variables to be declared at the top, in contrast to the trend in industry languages. We believe the simpler mental model for the learner is “I need to create some variables, and then my statements can use those variables.” When students are taught declarations can appear between statements, we see more mistakes, like declaring a variable inside a loop such that the data keeps getting reset on each iteration. Declaring variables late is an optimization that experienced programmers easily manage, but we don’t believe that optimization is best for learners.
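The loop mistake described above can be sketched as follows. Python is used only for illustration and has no declarations, so the first assignment to total stands in for a declaration; the function names are hypothetical:

```python
def total_with_late_declaration(values):
    for value in values:
        total = 0              # mistake: created inside the loop, so it resets
        total = total + value
    return total


def total_with_declaration_first(values):
    total = 0                  # created once, before any statement uses it
    for value in values:
        total = total + value
    return total


print(total_with_late_declaration([2, 3, 5]))   # 5 (only the last value survives)
print(total_with_declaration_first([2, 3, 5]))  # 10
```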

Likewise, we don’t allow variables to be explicitly initialized in the declaration. In languages that allow such initialization, we see learners confuse the act of declaring (reserving memory) with the act of assigning (putting a value into memory). Ex: A common learner error is to list the data type in front of every assignment. Also, we believe every variable should be auto-initialized, so Coral auto-initializes integers with 0 and floats with 0.0. Coral has no such thing as an “undefined” value in a variable, which is a strange concept for a learner: The variable is a real thing and thus must have a value, so what is that value?

The above decisions harmonize with the use of a graphical simulator. When a function starts, all the function’s variables appear in the graphical memory, each with a value of 0 or 0.0. There are no “?” or garbage values. There is no growing/shrinking of the function’s variables as the function’s statements execute.

Restating, the above choices emphasize a simple mental model for learners:

  • Declare all variables first, then write statements that use those variables.

Data types: Integer and float

Coral only supports integer and floating-point data types for variables (currently). Coral’s goal is for students to learn core programming concepts — input/output, variables, assignments, expressions, branches, loops, functions, and arrays — which can be learned with just integers. And that can be interesting, even to non-computing majors, who realize that writing small programs to do data analysis or simple calculations is more likely in their careers than creating video games, for example.

We considered only supporting integers, but added floating-point because many desirable program examples use measured rather than counted data, such as temperatures or heights. We also felt that choosing the correct data type is foundational in programming, and to enable a choice we needed at least two types.

We considered using the word “real” for the floating-point type, since students are more familiar with that word than with float. However, we felt “real” was potentially misleading, since another foundational concept in programming is that real numbers can’t be perfectly represented in limited bits. As such, we preferred the concept of “floating-point numbers”. That term is too long, and thus we went with “float”, which is used in many industry languages. We realize that term isn’t ideal, but we made a tradeoff.

We chose to spell out “integer”, because the horizontal savings of using just “int” were minor and we sought to avoid introducing an unfamiliar abbreviation.

No string type

Variables can only be declared as type integer or float. Programs can be more interesting with strings, but the core programming concepts can be learned without strings.

A string type introduces several complexities. To be interesting, strings need to be manipulated or searched, but how? We could define various functions like Find(), Append(), Length(), etc., but that’s a lot for a beginner. We could allow access to individual elements, but that requires also defining a character data type, and introducing some sort of array or function-based element access. These issues are quite challenging for learners, and we’ve had better experience not dwelling on such issues when teaching. Learners have enough trouble with just integers.

Sizing is also an issue: Would we require a pre-defined fixed size for a string (another concept to learn — estimating the max size expected), or would we dynamically modify a string’s size (and how is that happening — magic?)? In the former case, what happens if an assignment exceeds a string’s size?

We developed the simulator hand-in-hand with the language, and a string type introduces complexities for the simulator as well. The concept of a variable being “a location in memory” gets tricky with strings. If we showed the entire string in one memory location, that would likely annoy many instructors (including ourselves) for being too far from reality. But if we showed each character in a separate memory location, we not only might confuse the student (why is one variable occupying multiple locations?), but we’d use a lot of vertical space, making the memory hard to view.

Coral supports outputting of string literals, which doesn’t involve any of the above complexities. With such string literals, a program’s output is more readable.

We might add a string (and character) type later, but they would be an advanced topic.

Input / output

Input / output is the most basic operation in a programming language, but involves non-intuitive syntax in most industry languages. Since input / output is one of the first things a learner experiences, we leaned heavily towards an intuitive syntax.

To put a string to the output, we use: Put “Hello” to output. For a variable: Put x to output. The statement looks just like the English. No need to introduce function call syntax like: Print(“Hello”). While simple to programmers, that syntax is strange to a learner. An instructor either has to explain the syntax (which students likely won’t follow), or just say “Trust me” (which makes students feel slightly out of control). Neither is a very good way to start the learning of programming. We thus erred on the side of using the simplest, most intuitive output statement syntax we could devise, to get students off to a good start. In fact, our syntax looks like common pseudocode, which Coral is partly intended to replace.

Likewise, to get a value from input, we use: x = Get next input. We intentionally use assignment “x =” because x is being assigned, so an input statement is consistent with assignment statements in that both assign a variable with a value. And we specifically use the word “next” to make clear that each statement is moving from one input value to the next.

We use the words put and get. Many books and instructors use synonyms for output words, like print, write, put, or even use the word “output” as both a verb and a noun: the same book might say “x is printed to output”, then later “y is output”, “z is printed”, “u is written out”, or even “w is output to the screen” (introducing “screen” as a synonym for output). Programmers easily navigate those terms, but learners can get confused. Thus, we strive to consistently say “Put to output”, using put as the only action verb, and using output as a noun. Likewise, input words like get, read, and input are often used loosely. We strive to consistently use get as the verb and input as a noun.

For a newline, we decided to use the special two-character sequence \n found in many languages, as in: Put “Hello\n” to output. That may be a little tricky for students, but we felt better than alternatives like a statement variation (like: Put x and newline to output) or a special object (like: Put newline to output).

Statements

Coral allows one statement per line. Many teachers impose that restriction anyway to lessen student confusion, to yield more readable code, and to reduce bugs. Separators like semicolons are thus unnecessary. The mental model is simple: A statement is a line; a line is a statement. (Blank lines are OK.) Stepped simulation proceeds naturally, one line at a time.

Coral requires fixed three-space indents for sub-statements of items like if-else, loops, or functions. The indent is fixed, because otherwise students often make terrible indenting choices, like no indenting, inconsistent indenting, or too much indenting. Why three? Two is too few to easily see the indenting, while three seemed sufficient. Although four is common, by conserving horizontal space, code is less likely to wrap or have to be scrolled, in our simulator, in figures, etc.

For simplicity, Coral has only the // kind of comment, and each comment must be on a line by itself. Those requirements eliminate a choice: Place comment above line vs. at line’s end. For learners, every choice is a potential source of confusion. Multi-line comments can be accomplished using several single-line comments.

With just one statement per line and required indenting of sub-statements, braces are unnecessary, eliminating a common source of bugs in learners’ code, and reducing the constructs that must be learned.

Loops

while and for (no do-while)

Coral has two kinds of loops: while and for. Though we strive to avoid unsubstantial choices for learners, these two kinds of loops seem foundational. A key decision of a programmer is whether to use a while loop (if the number of iterations is unknown) or a for loop (if the number of iterations is known). However, Coral does not have a do-while loop. Such loops can be written using a while loop (and often are in practice). We felt the benefit of fewer constructs outweighed the cleaner do-while loop benefit for some programs. Yes, we realize the for loop can also be written using a while loop, but felt the benefit of teaching while vs. for (a foundational concept) outweighed the benefit of having one less construct.
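One common way to write a do-while loop using only a while loop is to write the body once before the loop. The sketch below is in Python (which also lacks do-while) purely for illustration; the digit-counting task and the function name are our own choices, not from Coral's documentation:

```python
def count_digits(number):
    # Assumes number >= 0. Even 0 has one digit, so the body must run
    # at least once, which is the usual reason to reach for do-while.
    digits = 0

    # First pass of the body, written out before the loop.
    digits = digits + 1
    number = number // 10

    # Remaining passes, guarded by the loop condition.
    while number > 0:
        digits = digits + 1
        number = number // 10
    return digits


print(count_digits(0))     # 1
print(count_digits(1234))  # 4
```

The cost is the repeated body, which is the "cleaner do-while" benefit given up in the tradeoff above.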

For loop syntax

Because Coral’s code and flowchart versions are equivalent, we chose to use a for loop syntax that makes the equivalence explicit. In a flowchart, only a while loop directly exists: A decision node with a branch that points back to that node. Thus, we use the for loop syntax of C, C++, and Java, having three parts: a part before the decision node for initialization, a part that is the decision node’s expression, and a part at the branch’s end for update, as in: “for i = 0; i < 10; i = i + 1”. That for loop is not as elegant as some forms like “for i in 0 to 9”. This decision was a tradeoff; we felt the direct conversion to flowcharts outweighed the benefit of a simpler for loop syntax. We needed a separator for the three parts. Here, we went with the semicolon, just like C, C++, and Java.
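The mapping from the three parts of the for loop to a flowchart-style while loop can be sketched as below. This is Python used for illustration, not Coral; the function names and the choice to collect the visited values in a list are assumptions made so the two forms can be compared:

```python
def collect_with_while():
    visited = []
    i = 0                  # part 1: initialization, before the decision node
    while i < 10:          # part 2: the decision node's expression
        visited.append(i)  # loop body
        i = i + 1          # part 3: update, at the end of the branch
    return visited


def collect_with_range():
    # The more compact "for i in 0 to 9" style of loop.
    return [i for i in range(10)]


print(collect_with_while() == collect_with_range())  # True
```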

Functions

Defining an intuitive function declaration syntax was challenging. Coral’s syntax reads left-to-right. A function named MyFunc has some parameters and returns a value, so: Function MyFunc(integer x, integer y) returns integer z.

In most languages, return statements have some issues. A return statement anywhere but at a function’s end is in some ways like a go-to statement, so many instructors and programmers restrict to only one return statement at the function’s end. Coral enforces that restriction by having “falling off the end” be the only way a function returns.

We felt the concept of a return variable was simpler than a return statement. Above, the final value of z is what the function returns. No distinct return statement is necessary. And the user need not declare a local variable solely for the purpose of holding the value to be returned. This approach was inspired by MATLAB.
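The return-variable idea can be approximated in a language that does have return statements by following a strict discipline: one result variable, assigned as needed, and a single exit at the end. The Python sketch below is only an analogy (Python still needs the final return line, which Coral does not); the body of my_func is a hypothetical example:

```python
def my_func(x, y):
    z = 0          # the result variable, starting at 0 as in Coral
    if x > y:
        z = x
    else:
        z = y
    return z       # the only exit: the final value of z is the result


print(my_func(3, 8))  # 8
print(my_func(9, 2))  # 9
```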

For a function that returns nothing, we originally considered the word “void” that is common in other languages. But that word is a bit strange to learners, with students often asking what void means, or thinking void is a data type. Our answer is typically “Returning void means the function returns nothing”. Well, we just decided the language should use the more direct wording: Function MyFunc() returns nothing.

Like many languages, Coral starts in a Main function. However, we don’t want to burden learners with function syntax at the start: if an instructor tries to explain, the learner gets lost, but saying “Trust me” is disconcerting as well. Thus, we decided that for a simple program with only statements, the Main function is implicit. Later, when a user defines their own function, the Main function must be declared explicitly too. This makes initial learning easy, while supporting functions down the road.

Arrays

Arrays are foundational, so we felt they should be included. For declaration syntax, the data type is on the left so all data types line up. The array indication is next, to denote multiple items of that type, with the number in parentheses rather than brackets to avoid confusion with element accesses. Last is the name, like non-array declarations. Thus an array declaration is: integer array(5) myList.

The declaration uses parentheses, which could be confused with function syntax. We considered other symbols, but decided to stick with parentheses, since the size is indeed a sort of parameter, and we felt there wouldn’t be much confusion.

Accesses use brackets [ ], distinct from the declaration’s parentheses ( ).

An array’s size is directly accessible, as in: myList.size. We introduced a dot syntax to access the size, which seemed acceptable since size is like an attribute of an array. In the simulator, we show that an array has a memory location for the size along with locations for the elements, making clear to learners that the size is just like another variable.
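A rough model of that memory view, written in Python for illustration. The (label, value) representation, the function name, and the placement of the size before the elements are all assumptions; the text above only says the size has its own location alongside the elements:

```python
def memory_view(name, size):
    # Hypothetical model of the simulator's memory: a list of (label, value) pairs.
    locations = [(name + ".size", size)]   # the size occupies its own location
    for index in range(size):
        # Each integer element is auto-initialized to 0.
        locations.append((name + "[" + str(index) + "]", 0))
    return locations


for label, value in memory_view("myList", 3):
    print(label, value)
```

Even a three-element array takes four rows in this model, which hints at why large arrays crowd the simulator's memory display.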

Arrays use a lot of vertical space in the simulator’s graphical memory, so instructors may wish to use small arrays.

Code

We use the terms code and coding quite a bit. To us, code is a textual representation of a program, while a flowchart is a graphical representation. Coral is one language for programming, having two equivalent versions: code and flowchart.

We note that code and coding are becoming popular terms (e.g., code.org, Hour of Code, etc.) — in computing, terminology evolves.

Compiled

Coral is a “compiled” language — the entire program is checked for correct syntax, before execution begins. Our simulator does the compiling, and reports errors. The word “compile” is not used, though, but rather “Enter execution”, which switches the simulator from edit mode to execution mode. The compiled approach provides a simple mental model for learners: When entering execution, the tool ensures my program has correct syntax, then executes my program. The approach also allows for aggressive error checking; our goal is for the simulator’s compiler to thoroughly check for errors and to provide highly-descriptive error messages.
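The check-everything-first model can be sketched with Python's built-in compile() and exec(). This is an analogy only and says nothing about how the Coral simulator is implemented; check_then_run and the output list are hypothetical names:

```python
def check_then_run(source):
    output = []
    try:
        # The whole program is checked for syntax before any statement runs.
        code = compile(source, "<learner program>", "exec")
    except SyntaxError as err:
        return ("error", err.lineno, output)
    exec(code, {"output": output})
    return ("ok", None, output)


good = "output.append(1)\noutput.append(2)\n"
bad = "output.append(1)\nx = = 3\n"   # syntax error on line 2

print(check_then_run(good))  # ('ok', None, [1, 2])
print(check_then_run(bad))   # ('error', 2, [])
```

In the second call the first line is valid, yet it never executes: the error on line 2 is reported before execution begins, so the output stays empty.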

We put “compiled” in quotes because currently Coral is only processed by our simulator; no standalone executable file is generated (of course, one could be in the future). Nevertheless, the process is similar to the normal compilation flow, and in fact the simulator is indeed compiling Coral into an intermediate format.

Tradeoffs

Language design is largely about tradeoffs among metrics. We designed Coral for college students to learn core programming concepts, with some learners transitioning to an industry language (C, C++, Java, Python, C#, JavaScript, etc.). Thus our metrics were quite different from those of other languages. We strove to balance:

  • Super-easy initial learning of core concepts with minimal barriers
  • Consistency among the code and flowchart versions
  • Minimizing constructs and unsubstantial choices
  • Easy viewing in an educational simulator
  • Prevention of common learner errors; detecting errors during “compile time”
  • Simple transition to an industry language