Ballerina Swan Lake: 10 Compelling Language Characteristics for Cloud Native Programming

Sep 15, 2021 21 min read

Key Takeaways

Newer, purpose-built programming languages can provide novel ways of doing things that can be radically more efficient than what we're used to. If leveraged correctly, using the right language with the right abstractions for your problem itself is a competitive advantage for business.
Whether it's with Domain-specific Languages (DSLs) or templates, integration remains too difficult. Just recently, this benchmark report indicated that many digital transformation projects failed or were challenged due to the inherently complex nature of integration.
Ballerina is an open-source programming language purpose-built for cloud native programming and integration. The language has been designed to make it easier to use, combine and create network services and thereby allows you to integrate distributed applications seamlessly.
The Ballerina language has come a long way with significant improvements since the 1.0 release in 2019. The latest Swan Lake release further simplifies building and deploying cloud native apps through a network-aware and flexible type system, constructs for developing services and APIs (including REST, GraphQL, and gRPC), a sequence diagrammatic syntax, JSON support, and built-in concurrency among many other capabilities which will be explored in this article.

Background to Ballerina

Integration can be considered a type of programming, and it has very often been turned into a visual exercise, with different techniques that try to simplify and abstract away the complexity. DSLs have become quite popular because they provide the right abstractions for programming, but they come with limitations: more often than not, integration developers have to use regular code to address parts of the problem. Furthermore, integration programming practices have become isolated, such that developers programming with the chosen integration tool must develop the rest of their applications with another tool or programming language. Visual representation nevertheless remains important for observing flows and interactions between endpoints. On top of all that, with cloud native engineering, integration systems are now running in containers, and applications are created using microservices that are distributed across a large number of nodes.

Wouldn’t it be extremely useful to have a language that provided both the capabilities to integrate with code and tools for visualization, so that we can address the limitations of DSLs while preserving software engineering best practices? This thinking is precisely what led to the Ballerina project. The goal was to create a modern programming language that combines the best of programming languages, integration technology, and cloud native computing into a textual and graphical language optimized for integration and with mainstream potential.

Let’s go through the list below to explore the key characteristics of the Ballerina Language that have been either introduced or enhanced with the Swan Lake Beta release and thereby understand why Ballerina is a great choice for creating network services and distributed application integration.

Key Characteristics of Ballerina

1. Ballerina Addresses Most Use Cases of Scripting Languages with the Robustness and Scalability of Application Languages

You can think of languages being on a spectrum from scripting languages like Perl or Awk to system languages like Rust or C with application languages like Go or Java in the middle. Programs can be written with one-line scripts all the way up to writing millions of lines of code. Ballerina resides on the spectrum between scripting languages and application languages.

Ballerina has unique features that make it particularly worthwhile for smaller programs. Most other scripting languages that are designed for smaller programs have significant differences from Ballerina in that they are dynamically typed and they don't have the unique scalability and robustness features that Ballerina has. Problems in the pre-cloud era that you could solve with other scripting languages are still relevant problems. Except now, network services are involved; robustness is now more important than ever. With standard scripting languages, a 50-line program tends to become an unmaintainable 1000-line program a few years later, and this doesn’t scale. Ballerina can be used to solve problems addressed with scripting language programs but it's much more scalable, more robust, and more suitable for the cloud. Scripting languages also typically don't have any visual components, but Ballerina does.

2. Ballerina is Data Oriented and NOT Object Oriented

In network interactions, the object-oriented approach bundles data with the code, which is not the most optimal way to send data across widely distributed networks of microservices and APIs.

In the pre-cloud era, your APIs were function calls to libraries in the path, and you could pass objects in the call. But you can't do this when your APIs are in the cloud. You want to be able to send data over the network that's independent of code because you don't want to expose your code. Although Java RMI and CORBA tried to support distributed objects, that approach works only when you have tight coupling and both ends belong to the same party. Distributed objects don't work in the loosely coupled cloud. Ballerina emphasizes plain data that is independent of any code used to process the data. While Ballerina provides objects for internal interfaces, it is not an object-oriented language.

With more services in the cloud, a developer is given the added responsibility of working with networked resources in their code. The programming language itself must aid in this operation. That’s why Ballerina comes with a network-friendly type system with powerful features to handle data on the wire.

JSON is the ‘lingua franca’ in Ballerina. The data types in Ballerina are very close to JSON, and a fundamental set of data types which are numbers, strings, maps, and arrays map one-to-one to JSON. Ballerina’s plain in-memory data values are pretty much in-memory JSON. This allows a JSON payload from the wire to come immediately into the language and be operated on without transformation or serialization.

import ballerina/io;
import ballerina/lang.value;

json j = { "x": 1, "y": 2 };

// Returns the string that represents j in JSON format.
string s = j.toJsonString();

// Parses a string in the JSON format and returns the value that 
// it represents.
json j2 = check value:fromJsonString(s);

// Allows null for JSON compatibility.
json j3 = null;

public function main() {
    io:println(s);
    io:println(j2);
}
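
As a further sketch of how a JSON payload can move into typed code, the example below converts a JSON string into a record using `cloneWithType` from the `lang.value` library. The `Point` type and the `parsePoint` function are hypothetical names introduced here for illustration; they are not part of the original example.

```ballerina
import ballerina/io;
import ballerina/lang.value;

// Hypothetical record type describing the expected shape of the payload.
type Point record {
    int x;
    int y;
};

// Parses a JSON string and converts the result into a typed Point.
// Both steps can fail, so the function returns Point|error.
function parsePoint(string s) returns Point|error {
    json j = check value:fromJsonString(s);
    // cloneWithType builds a new value of the target type, or returns
    // an error if the JSON does not match the record's structure.
    return j.cloneWithType(Point);
}

public function main() returns error? {
    Point p = check parsePoint("{\"x\": 1, \"y\": 2}");
    io:println(p.x + p.y);
}
```

The conversion is explicit, so a payload that lacks `x` or `y` surfaces as an ordinary `error` value rather than as a failure deeper in the program.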

3. Ballerina has a Flexible Type System

The type system of a programming language allows you to describe how your bits fit together and has evolved beyond catching a class of errors—that's now only a small part of what a type system does for you. It’s also a big part of providing a good IDE experience.

Scripting languages have dynamic typing, and application languages have traditional static types, as in C++ or Java. As discussed earlier, Ballerina sits between scripting and application languages: it targets scripting-style use cases but comes with features from application languages, including a static type system. In a statically-typed language, type compatibility is checked at compile-time. Statically-typed languages are generally more robust to refactoring, easier to debug, and aid in creating better language tooling.

Even though Ballerina’s type system is static, it's much more flexible than the types you get in application languages. The type system of the Ballerina language is primarily structural with added support for nominal typing. This means that the type compatibility is identified by considering the structure of the value rather than just relying on the name of the type. This is different from languages like Java, C++, and C# that have nominal type systems, in which the type compatibility is bound by the name of the actual type. You can say as much or as little about the structure as you need to. And with that, we get simplicity and flexibility by accepting that some things won't be caught at compile time.

To elaborate, you can say it’s similar to the way the structure is defined in an XML schema. If there's a change or deviation in the XML payload that the program receives, it still processes what it can recognize. It’s not so strict as to cause a failure if something changes in the payload that isn't recognized. Ballerina’s type system works both as a schema language to describe network data as well as a type system for the program working on values in memory.
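
A minimal sketch of structural typing, assuming two hypothetical record types (`Person` and `Employee`) that are declared independently of each other: because `Employee` has every field that `Person` requires, an `Employee` value can be passed where a `Person` is expected, even though neither type mentions the other.

```ballerina
import ballerina/io;

// Hypothetical types for illustration. Both are open records, so
// values may carry additional fields beyond the ones declared here.
type Person record {
    string name;
};

type Employee record {
    string name;
    int salary;
};

// Only says as much about the structure as it needs: a `name` field.
function greet(Person p) returns string {
    return "Hello, " + p.name;
}

public function main() {
    Employee e = {name: "Jane", salary: 200};
    // Accepted at compile time: compatibility is decided by the
    // structure of the types, not by their names.
    io:println(greet(e));
}
```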

Read more about Ballerina’s flexible type system here.

4. Ballerina Comes with Powerful Features to Work with Network Data

Ballerina also comes with a set of language features for working on data, out of which the integrated query feature stands out. This feature allows you to query the data using an SQL-like syntax as shown below. Query expressions contain a set of clauses similar to SQL to process the data. They must start with the from clause, and they can perform various operations such as filter, join, sort, limit, and projection.

import ballerina/io;
 
type Employee record {
    string firstName;
    string lastName;
    decimal salary;
};
 
public function main() {
    Employee[] employees = [
        {firstName: "Rachel", lastName: "Green", salary: 3000.00},
        {firstName: "Monica", lastName: "Geller", salary: 4000.00},
        {firstName: "Phoebe", lastName: "Buffay", salary: 2000.00},
        {firstName: "Ross", lastName: "Geller", salary: 6000.00},
        {firstName: "Chandler", lastName: "Bing", salary: 8000.00},
        {firstName: "Joey", lastName: "Tribbiani", salary: 10000.00}
    ];
    
    // Query-like expressions for list comprehensions start with from
    // and end with select.
    // The order by clause sorts members in employees based on the last name.
    Employee[] sorted = from var e in employees
                         order by e.lastName ascending 
                         select e;
    io:println(sorted);
}
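
The other clauses mentioned above can be combined in the same expression. The following sketch filters, sorts, limits, and projects in one query; the `Staff` type and the `topEarners` function are hypothetical names used only for this illustration.

```ballerina
import ballerina/io;

// Hypothetical record type for this sketch.
type Staff record {
    string name;
    int salary;
};

// Returns the names of up to `count` staff members earning at least
// `minSalary`, highest salary first.
function topEarners(Staff[] staff, int minSalary, int count) returns string[] {
    return from var s in staff
           where s.salary >= minSalary      // filter
           order by s.salary descending     // sort
           limit count                      // limit
           select s.name;                   // projection
}

public function main() {
    Staff[] staff = [
        {name: "Rachel", salary: 3000},
        {name: "Monica", salary: 4000},
        {name: "Phoebe", salary: 2000},
        {name: "Ross", salary: 6000}
    ];
    // Selects Ross and Monica, in that order.
    io:println(topEarners(staff, 3000, 2));
}
```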

There’s also a Table data type which makes it easy to work with relational and tabular data. The code below creates a table with Employee type members, where each member is uniquely identified using their name field. The main function retrieves the Employee with the key value John and performs a salary increase for each employee.


import ballerina/io;

type Employee record {
    readonly string name;
    int salary;
};

// Creates a table with Employee type members, where each
// member is uniquely identified using their name field.
table<Employee> key(name) t = table [
    { name: "John", salary: 100 },
    { name: "Jane", salary: 200 }
];

function increaseSalary(int n) {
    // Iterates over the rows of t in the specified order.
    foreach Employee e in t {
        e.salary += n;
    }
}

public function main() {
    // Retrieves Employee with key value `John`.
    Employee? e = t["John"];
    io:println(e);

    increaseSalary(100);
    io:println(t);
}

Read more about writing integrated queries in Ballerina here.

Moreover, Ballerina provides in-built XML support with functionality similar to XQuery with an XML navigation mechanism like XPath. This is especially useful for people who work heavily with XML but don't want to use an XML-specific language because they work with a variety of data formats these days.

import ballerina/io;

public function main() returns error? {
    xml x1 = xml `<name>Sherlock Holmes</name>`;
    xml:Element x2 = 
        xml `<details>
                <author>Sir Arthur Conan Doyle</author>
                <language>English</language>
            </details>`;

    // `+` does concatenation.
    xml x3 = x1 + x2;

    io:println(x3);

    xml x4 = xml `<name>Sherlock Holmes</name><details>
                        <author>Sir Arthur Conan Doyle</author>
                        <language>English</language>
                  </details>`;
    // `==` does deep equals.
    boolean eq = x3 == x4;

    io:println(eq);

    // `foreach` iterates over each item.
    foreach var item in x4 {
        io:println(item);
    }

    // `x[i]` gives i-th item (empty sequence if none).
    io:println(x3[0]);

    // `x.id` accesses required attribute named `id`:
    // result is `error` if there is no such attribute
    // or if `x` is not a singleton.
    xml x5 = xml `<para id="greeting">Hello</para>`;
    string id = check x5.id;

    io:println(id);

    // `x?.id` accesses optional attribute named `id`:
    // result is `()` if there is no such attribute.
    string? name = check x5?.name;

    io:println(name is ());

    // Mutate an element using `e.setChildren(x)`.
    x2.setChildren(xml `<language>French</language>`);

    io:println(x2);
    io:println(x3);
}

Among other capabilities to handle different types of data seamlessly, there’s also a decimal data type in Ballerina. These are decimal floating-point numbers designed for business needs, e.g., to represent prices. Most languages default to binary floating point, which cannot represent many decimal fractions (such as 0.1) exactly. When a value needs more digits than the format allows, the leftover ones are dropped: the number is rounded, which introduces precision errors. The real world runs on decimal numbers and this is why we think it’s a powerful capability to have.

import ballerina/io;

// The `decimal` type represents the set of 128-bit IEEE 754R
// decimal floating point numbers.
decimal nanos = 1d/1000000000d;

function floatSurprise() {
    float f = 100.10 - 0.01;
    io:println(f);
}

public function main() {
    floatSurprise();
    io:println(nanos);
}
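
To make the difference concrete, the following sketch compares the same sum in `float` and in `decimal`. The function names are hypothetical and exist only for this illustration.

```ballerina
import ballerina/io;

// In binary floating point, 0.1 and 0.2 are only approximations,
// so their sum is not exactly equal to 0.3.
function floatSumIsExact() returns boolean {
    float a = 0.1;
    float b = 0.2;
    return a + b == 0.3;
}

// In decimal floating point, 0.1, 0.2, and 0.3 are all represented
// exactly, so the comparison holds.
function decimalSumIsExact() returns boolean {
    decimal a = 0.1;
    decimal b = 0.2;
    return a + b == 0.3d;
}

public function main() {
    io:println(floatSumIsExact());   // false
    io:println(decimalSumIsExact()); // true
}
```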

5. Ballerina is Inherently Concurrent and Provides In-built Safety for Concurrency

Concurrency is a fundamental requirement in the cloud because your network operations have high latency. Scripting languages generally don’t handle concurrency well. Scripting languages like JavaScript typically use async functions, which are slightly better than callbacks (but not a lot). With Ballerina, you have a simpler programming model that provides the advantages of async functions but is more straightforward and intuitive to work with.

In Ballerina, the main concurrency concept is a strand, which is similar to a goroutine in Go. A Ballerina program is executed on one or more threads. A thread may run on a separate core simultaneously with other threads, or may be pre-emptively multitasked with other threads onto a single core. Each thread can be divided into one or more strands, which are language-managed, logical threads of control. From the programmer’s perspective, a strand does look like an OS thread, but it is not: it is cheaper and more lightweight. Strands can be scheduled on separate OS threads. Go follows a similar approach but not many programming languages do. In fact, most dynamic scripting languages offer only limited support for parallelism. For example, the reference Python implementation (CPython) has a global interpreter lock, so its threads don't really execute Python code in parallel.

A function in Ballerina can have named "workers" that each run on a new strand concurrently with the function's default worker and other named workers as shown below:

import ballerina/io;

public function main() {
    // Code before any named workers is executed before named 
    // workers start.
    io:println("Initializing");
    final string greeting = "Hello";

    // Named workers run concurrently with the function's default
    // worker and other named workers.
    worker A {
        // Variables declared before all named workers and 
        // function parameters are accessible in named workers.
        io:println(greeting + " from worker A");
    }

    worker B {
        io:println(greeting + " from worker B");
    }

    io:println(greeting + " from function worker");
}
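
Workers can also exchange values. The sketch below, which is an illustrative assumption rather than part of the original example, has two named workers each compute a square and send it to the function's default worker using the `->` send and `<-` receive actions. The function name `sumOfSquares` is hypothetical.

```ballerina
import ballerina/io;

function sumOfSquares(int a, int b) returns int {
    worker A {
        int sq = a * a;
        // `-> function` sends a message to the function's default worker.
        sq -> function;
    }

    worker B {
        int sq = b * b;
        sq -> function;
    }

    // `<- W` receives a message from worker `W`.
    int x = <- A;
    int y = <- B;
    return x + y;
}

public function main() {
    io:println(sumOfSquares(3, 4)); // 25
}
```

Each send is paired with a receive that names its peer, which is part of what lets the interactions between workers be drawn as messages between lifelines.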

Ballerina also allows strands to share mutable state. Typically, combining concurrency and shared mutable state can create data races and give incorrect results. This is one reason why dynamic languages don't usually expose threads. Ballerina, however, comes with a nice solution to this problem and ensures concurrency safety via cooperative multitasking. No two strands belonging to the same thread can run simultaneously. Instead, Ballerina cooperatively (not preemptively) multitasks all strands onto a single thread to avoid locking problems. This is similar to asynchronous functions where everything runs on a single thread but without a complicated programming model. A strand enables cooperative multitasking by yielding. When a strand yields at a specific “yield point”, the runtime scheduler can suspend execution of the strand, and switch its thread to executing another strand.

We can also determine when we can run strands in parallel—an annotation can be used to make a strand run on a separate thread. This is because Ballerina’s unique type system makes it possible to determine when services have locked enough to be able to safely use multiple threads to handle incoming requests in parallel. While this may not seem to provide a massive amount of parallel executions, it’s enough to make effective use of common cloud instance types.

import ballerina/io;

public function main() {
    // Each named worker has a "strand" (logical thread of
    // control) and execution switches between strands only at
    // specific "yield" points.
    worker A {
        io:println("In worker A");
    }

    // An annotation can be used to make a strand run on a
    // separate thread.
    @strand {
        thread: "any"
    }
    worker B {
        io:println("In worker B");
    }

    io:println("In function worker");
}

6. Ballerina has an Intrinsic Graphical View

Handling concurrency and network interactions is an inherent part of writing a cloud program. In Ballerina, every program is a sequence diagram that illustrates distributed and concurrent interactions automatically. A function in a Ballerina program has equivalent representations both in textual syntax and as a sequence diagram. You can switch between the two views seamlessly. Ballerina’s unique graphical view wasn’t an afterthought. In fact, it has been designed deeply into the language in order to provide real insight with respect to a function’s network interactions and its use of concurrency. A sequence diagram is the kind of diagram that works best for that.

To explain, Ballerina’s named workers (discussed in point 5) and other function-level concurrency features depict concurrency, and language abstractions for clients and services depict network interactions. The vertical lines, also known as lifelines, represent workers and remote endpoints. A remote endpoint is a client object, and it contains remote methods that represent outbound interactions with a remote system. Ballerina has a distinct syntax for remote method calls, whereas most languages don't distinguish remote calls from regular function calls. The horizontal lines represent the messages sent from a function's worker to another worker or from a function's worker to a remote endpoint. Ballerina easily distinguishes these key aspects and shows the user a high level view of them without the user having to do anything. This is only possible because of the way the graphical elements have been designed into the language from the start. You don’t get this in any other language!

The Ballerina VSCode plugin can generate a sequence diagram dynamically from the source code. To start generating a sequence diagram from the above Ballerina code, download the VSCode plugin and launch the graphical viewer.

7. Ballerina is Cloud Native with a Simple Model for Producing and Consuming Services, and Deploying Code to Cloud

Along with a network-aware type system, Ballerina comes with fundamental syntax abstraction for working with network services. The language also includes built-in support to deploy Ballerina applications on the cloud using Docker and Kubernetes.

Service Objects to Produce Services

Ballerina accommodates the concept of a service, and a service can be written in just three or four lines of Ballerina code. Services in Ballerina are powered by three things working together: applications, listeners and libraries. The application defines service objects and attaches them to listeners. Listeners are provided by libraries; for example, there’s a listener for each protocol (HTTP/GraphQL etc.), which is provided by a library. The listener receives network input and then makes a call to the application to find service objects. Service objects support two interface styles:

Remote methods - named by verbs, and support RPC style
Resources - named by method (e.g. GET) + noun, and support RESTful style (used for HTTP and GraphQL)

Because Ballerina’s service approach is coupled with its unique wire-oriented type system, you can generate an interface description from the Ballerina code. This can be an OpenAPI or a GraphQL specification. So you can write regular Ballerina service objects and then generate client code from that description. The combination of these features enables cloud integration to work smoothly.

import ballerina/http;
 
service on new http:Listener(9090) {
  resource function get greeting(string name) returns string { 
    return "Hello, " + name; 
  } 
}
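
As a slightly fuller sketch of the resource style, the service below is attached to a base path, takes a path parameter, and returns a record that the HTTP library is expected to serialize as JSON. The `/hello` base path, the `Greeting` type, and the `buildGreeting` helper are assumptions introduced for illustration.

```ballerina
import ballerina/http;

// Hypothetical response type; the closed record keeps the
// payload shape explicit.
type Greeting record {|
    string message;
|};

// Plain function holding the logic, independent of the transport.
function buildGreeting(string name) returns Greeting {
    return {message: "Hello, " + name};
}

service /hello on new http:Listener(9090) {
    // Resource = method (get) + noun (greeting), with `name`
    // taken from the path: GET /hello/greeting/{name}
    resource function get greeting/[string name]() returns Greeting {
        return buildGreeting(name);
    }
}
```

Keeping the logic in a plain function is a design choice made here so that it can be exercised without starting a listener.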

Client Objects to Consume Remote Services

The outbound network interactions are represented by client objects. Clients have remote methods that represent outbound interactions with a remote system. The client object is one of the syntax elements that allows us to draw the sequence diagram.

import ballerina/email;

public function main() returns error? {
  email:SmtpClient sc
    = check new("smtp.example.com",
                "user123@example.com",
                "passwd123");
  // `->` marks a remote method call on the client object.
  check sc->sendMessage({
     to: "contact@ballerina.io",
     subject: "Ballerina",
     body: "Ballerina is pretty awesome!"
  });
}

Code to Cloud

Ballerina supports generating Docker and Kubernetes artifacts from code without any additional configuration. This simplifies the experience of developing and deploying Ballerina code in the cloud. Code to cloud builds the containers and required artifacts by deriving the required values from the code. See this example for more details.

To deploy your code into different cloud platforms, such as AWS and Microsoft Azure, annotations on service objects are used to enable easy cloud deployment, as shown in the code snippet below. The Ballerina compiler can generate artifacts, such as Dockerfiles, Docker images, Kubernetes YAML files, and serverless functions.

For example, Ballerina functions can be deployed in Azure by annotating a Ballerina function with @azure_functions:Function.

import ballerinax/azure_functions as af;

// HTTP request/response with no authentication
@af:Function
public function hello(@af:HTTPTrigger { authLevel: "anonymous" } string payload) returns @af:HTTPOutput string|error {
    return "Hello, " + payload + "!";
}

8. Explicit Control Flow for Errors

The approach to dealing with errors has a pervasive impact on language design and usage. It affects everything about the language. When you're dealing with a network, errors are a normal part of doing business, especially when considering the eight fallacies of distributed computing. Many pre-cloud languages, such as Java, JavaScript, and TypeScript, use exceptions as a way of dealing with errors. But not every language follows that design. Languages such as Go and Rust don't have exceptions at all.

With exceptions, the control flow is implicit and the code is harder to understand and maintain. When things go wrong, just conveniently throwing an exception makes everything go completely haywire. To make error handling work, you have to be able to look at the program and understand where an error could take place and how the flow of control is going to change when it does. So, there's a fairly strong trend now to eliminate exceptions and go back to a much simpler approach where errors are explicit and use the normal control flow on error. This approach can be found in Go, Rust, and Swift. Ballerina follows the same approach and allows the developer to use the error data type with explicit error control flow.

import ballerina/io;

// Converts bytes to a string and then to an int.
function intFromBytes(byte[] bytes) returns int|error {

    string|error ret = string:fromBytes(bytes);

    // The is operator can be used to distinguish errors
    // from other values.
    if ret is error {

        return ret;
    } else {
        return int:fromString(ret);
    }
}

// The main function can return an error.
public function main() returns error? {

    int|error res = intFromBytes([104, 101, 108, 108, 111]);    
    if res is error {
        // The `check` expression is shorthand for this pattern of
        // checking if a value is an error and returning that value.
        return res;

    } else {
        io:println("result: ", res);
    }
}
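
The same logic can be written more compactly with the `check` expression mentioned in the comment above. This is a sketch of the equivalent shorthand, reusing the `intFromBytes` name from the example; the input bytes are chosen here so that the conversion succeeds.

```ballerina
import ballerina/io;

// Converts bytes to a string and then to an int.
function intFromBytes(byte[] bytes) returns int|error {
    // If fromBytes yields an error, `check` returns it from this
    // function immediately; otherwise it evaluates to the string.
    string s = check string:fromBytes(bytes);
    return int:fromString(s);
}

public function main() returns error? {
    // 52 and 50 are the byte values of the characters '4' and '2'.
    int res = check intFromBytes([52, 50]);
    io:println("result: ", res);
}
```

The control flow stays explicit: every point where an error can leave the function is marked with `check` or appears in a `return`.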

9. Transactions as a Language Feature

Writing Ballerina programs that use transactions is quite straightforward because transactions are a language feature. What Ballerina offers isn't transactional memory but fundamentally language support for delimiting transactions. This way you’re always guaranteed that a transaction is bracketed by a begin and either a commit or a rollback.

A running instance of a Ballerina program includes a transaction manager. This may run in the same process as the Ballerina program or in a separate process (it should not be connected over an unreliable network). The transaction manager maintains a mapping from each strand to a stack of transactions (or, in a distributed context, transaction branches). When a strand's transaction stack is non-empty, we say it is in transaction mode; the topmost transaction on a strand's transaction stack is the current transaction for that strand.

import ballerina/io;

public function main() returns error? {
    // Compile-time guarantees that transactions are bracketed with
    // begin and commit or rollback. Transaction statement begins
    // a new transaction and executes a block.
    transaction {
        doStage1();
        doStage2();

        // Committing a transaction must be done explicitly using a commit   
        // statement and it may cause an error.
        check commit;

    }
}

function doStage1() {
    io:println("Stage1 completed");
}

function doStage2() {
    io:println("Stage2 completed");
}

Transactions in Ballerina also compose with its network interaction features, i.e., clients and services, to support distributed transactions. A user can create transactional flows between a client and service by declaring resources/remote methods of a service and remote methods of a client object as transactional.

10. Many Features are Familiar and Ballerina is ‘Batteries Included’

The proliferation of languages does indicate that individuals are willing to learn new languages. But enterprises seem reluctant to adopt a new language because they worry that they won't be able to hire people who are familiar with that language. It’s important to highlight that while Ballerina provides better ways to do things, it also comes with a subset of features that are familiar to a programmer of a C-family language that are enough to get developers started in a couple of hours or less.

Popular C-family languages (C, C++, Java, JavaScript, C#, TypeScript) have a lot in common, and Ballerina leverages this by doing a lot of things in the same way. If you are a programmer with some amount of programming experience with any of the C-family languages, coding with Ballerina is going to be rather straightforward. And in addition to the powerful language features, Ballerina is ‘batteries included’, which means that the language comes with a rich standard library (with libraries for network data, messaging and communication protocols), a package management system, structured documentation, a testing framework, and extensions/plug-ins for popular IDEs (notably Visual Studio Code) among other tools to support the language.

Conclusion

While Ballerina has all the general-purpose functionality of a modern programming language, its niche is that it uniquely provides language features that make it easier to use, combine and create network services for the cloud. Developers can now build resilient, secure, and highly performant services that address the fallacies of distributed computing and integrate them to create cloud native applications by simply using a programming language that is specialized to do just that.

For a great, quick introduction to creating and consuming HTTP services in Ballerina, take a look at this screencast. And if you learn better from examples, a vast collection of code examples that highlight important Ballerina features and concepts can be found here.

For an in-depth introductory explanation on the language features of Ballerina Swan Lake, watch this video series by James Clark, Ballerina’s Chief Language Designer. You can also check out his blog posts for more context on the design principles of Ballerina.

About the Author

Dakshitha Ratnayake is currently working at WSO2 as the Program Manager for Ballerina. With a background in Software Engineering, she has over 10 years of experience in the roles of Software Engineer, Solutions Architect, and Technology Evangelist at WSO2. Throughout, she has been a technology advocate for WSO2 in the areas of API Management, Enterprise Application Integration, Identity and Access Management, Microservices Architecture, Event-Driven Architecture, and Cloud Native Programming, while also maintaining technical relationships with different parties to translate business requirements and communicate technical strategy.
