BT

Facilitating the Spread of Knowledge and Innovation in Professional Software Development

Write for InfoQ

Topics

Choose your language

InfoQ Homepage Articles A Comprehensive Look at F# 4.1

A Comprehensive Look at F# 4.1

Bookmarks

Key Takeaways

  • Struct Tuples, Records, and Discriminated Unions take the performance spotlight.
  • ByRef returns when performance you need to go even faster.
  • Caller Information attributes make logging easier.
  • Many unused keywords are no longer reserved.
  • Optional parameters now work correctly.

Semantic versioning can be misleading. While F# 4.1 is backwards compatible with F# 4.0, it is by no means a minor release.

With both Microsoft and the larger community contributing, several more features were introduced since the F# 4.1 preview in the areas of performance, interoperability, and convenience.

Performance

The headliner for this release is the ability to use structs, also known as value types, instead of reference types. The ability to stack-allocate values, and embed them inside other objects, can have huge performance ramifications when used correctly.

Struct Tuples

At the top of the list are Struct Tuples. Tuples are very important to idiomatic code in F# and other functional programming languages.

A major criticism of F#’s implementation, known as System.Tuple, was that it is a reference type.

This means potentially expensive memory allocation is needed each and every time a tuple is created. Being immutable objects, that can happen quite frequently.

This was solved in .NET by the introduction of the ValueTuple type. Also used by VB and C#, this value type will improve performance in scenarios where memory pressure and GC cycles are an issue.

Care has to be used, however, as repeatedly copying ValueTuples more than 16 bytes in size may introduce other performance penalties.

In F#, you can use the struct annotation to declare a struct tuple instead of a normal tuple. The resulting type works similarly to a normal tuple, but is not compatible so switching is a breaking change.  

    let origin = struct (0,0)
    let f (struct (x,y)) = x+y

If you are using this for performance reasons, testing is important. Because tuples are so often used in F#, the compiler has special optimizations for them that may eliminate the tuple entirely.

This optimization may not necessarily apply to a struct tuple.

That said, Arbil wrote in the original proposal “according to my tests short struct tuples perform up to 25x faster than default ones if the cost of garbage collecting is taken into account”.

A possible extension to this feature was something called “structness inference”. Consider this line:

let (x0,y0) = origin

In F# this would be a compiler error because origin is a struct tuple and the expression (x0,y0) represents a reference tuple. Had structness inference been implemented, then the stuct keyword would have been implied.

Since is it a compiler error, this feature may be implemented in a future version without introducing a breaking change. However, there are numerous sematic and compiler repercussions, so it is by no means guaranteed to happen.

Struct Records

Another important concept in F# programing is the use of records types. Records types are in many ways like tuples such as being immutable and of a fixed size. The main difference is that each field in a record has a distinct name, unlike a tuple where only the position is actually meaningful.

Generally speaking, a library developer is going to want to use records instead of tuples for public APIs, as the named fields are going to be much easier for the application developer to understand.

Unfortunately records suffer from the same problem as Tuples, namely they are always reference types. Or rather, they were. Based partially on the struct tuple work, contributor Will Smith (a.k.a TIHan) created Struct Records.

To indicate a type should be a struct record instead of a normal reference style record, you must use the [<Struct>] attribute. You may be wondering why this doesn’t use the struct keyword. Dsyme writes,

Yes, the attribute is needed, and @TIHan is correct that this is one reason it's always been there. Using the attribute is the preferred way if indicating structness for nominal type definitions.

There's also a basic principle here that F# only uses "let", "module" and "type" as top level declarations (and "exception" and "extern" but they are rarely used). We don't use new keyword-led declarations for the panoply of different kinds of nominal types.

Warning: Struct records are not compatible with F# 4.0. There is a flaw in that compiler which causes it to see struct records as reference types instead of value types. Do not use this feature if your library may be consumed by someone using the older compiler.

Struct Discriminated Unions

Continuing the theme of structs in F#, we see the introduction of Struct Discriminated Unions. A discriminated union is essentially equivalent to a union type in other languages such as C++, but with some extra syntactical tricks. For example, you can effectively define new types within a discriminated union in the form of “case identifier”.

type Shape =
    | Rectangle of width : float * length : float
    | Circle of radius : float
    | Prism of width : float * float * height : float

In this example, the union Shape has three sub-types: Rectangle, Circle, and Prism. These only exist in the context of a Shape, and a given instance of a Shape can only contain one of the three.

If you are unfamiliar with the F# syntax, individual fields in type declarations are separated by an asterisk (*). So Rectangle has two fields, Circle one, and Prism three (one of which isn’t named).

When a given case identifier has more than one field, it is implemented as a tuple. Which brings us back to the purpose of this feature; to allow discriminated unions to be implemented as value types instead of a reference types.

Warning: As with struct records, the F# 4.0 compiler will not correctly interpret struct discriminated unions.

Support for byref returns

C# 7 added a new feature called ref locals. This allows for a safe pointer to a value. That value could be in inside an object, a by-ref parameter, or in some cases on the stack. Consider this simple example,

var a = new int[ ] {1, 2, 3};
ref int x = ref a[0];
x = 10; //the array a is now {10, 2, 3}
int x_value = x //dereference the value

The same code in F# would look like this

let a = [| 1; 2; 3; |]
let x = & a.[0]
x <- 10
let x_value : int = x //dereference the value

In the announcement and RFC for this feature, it incorrectly stated that F# already had support for ref locals via Reference Cells. This is understandable because the syntax looks like C#’s ref local:

let y = ref a.[0]
y := 20
let y_value : int = !y //dereference the value

However, once you look at the source code for a reference cell it becomes clear that it is really just a way to box a mutable value.

public sealed class FSharpRef<T> : IEquatable<FSharpRef<T>>, IStructuralEquatable, IComparable<FSharpRef<T>>, IComparable, IStructuralComparable
    {
        public T contents@;
        public FSharpRef(T contents)
        {
            this.contents@ = contents;
        }
        //interface implementations omitted
        public T Value
        {
            get { return this.contents@; }
            set { this.contents = value; }
        }
    }

So in this example, the variable named y isn’t really a reference to an element in the array a. Rather, y is merely a copy stored in an FSharpRef<int> object. This would lead to confusion if not for the fact that the syntax for ref locals wasn’t so different from ref cells.

Interoperability

Another aspect of F# 4.1 is an emphasis on ensuring that F# code interacts well with libraries written in other languages. And with .NET having deep hooks for C, COM, and dynamic languages, this means more than just consuming C# libraries.

Memory Pinning with the Fixed Keyword

This feature only matters to developers who are calling into C libraries from F#. If you pass a data structure to a C library, and that library needs to hold onto the structure, you can run into some serious problems. Unlike the .NET languages, C doesn’t expect a garbage collector to come behind it and move everything around.

The solution to this is to “pin” an object in memory. This prevents the GC from moving it; though one has to be careful not to abuse this as it can have a negative impact on memory usage.

This is done in F# using the fixed keyword in conjunction with the use keyword. This may be confusing to some programmers, as the use keyword is traditionally used for IDisposable objects much like C#’s using keyword. In this case, the use keyword just provides scope for the associated variable and a guarantee that the memory will be unpinned outside of that scope.

Caller Information

In .NET, Caller Information is provided via the use of optional parameters decorated with the attribute CallerFilePath, CallerLineNumber, or CallerMemberName. This is primarily used for logging, but you’ll also see it in other scenarios such as supporting property change notifications in WPF/XAML applications.

There isn’t much to say about this feature in F#. According to the RFC, implementing this wasn’t optional as it is needed for F# to conform to .NET standards.

Correctly Working Optional Parameters

Simply put, .NET style optional parameters in F# didn’t work. Theoretically you could put a [<Optional;DefaultParameterValue<(...)>] on a parameter and get the same optional parameter behavior that you see in VB or C#. However, F# 4.0 and earlier didn’t compile the DefaultParameterValue attribute correctly. This meant that it was ignored in all languages.

A related problem is that while F# can consume compiled optional, default arguments from other libraries, it couldn’t consume them from code within the same assembly. This only affected .NET style optional parameters, F# style optional parameters worked as expected.

Both bugs were solved in RFC FS-1027 - Complete Optional and DefaultParameterValue Implementation.

While primarily an interoperability issue, there are potentially significant performance ramifications as well. .NET style optional parameters are essentially free. The compiler simply passes in a constant specified by DefaultParameterValue attribute.

If you use F# style optional parameters instead, each optional argument supplied needs to be wrapped in a FSharpOption<T> object. Thus, if a method has 5 optional parameters, and you supply values for 3 of them, then you need to make 3 memory allocations.

So while it is significantly more verbose, .NET style optional parameters will give you better performance than F# style optional parameters.

Another consideration is interoperability. F# style optional parameters are not understood by languages such as VB and C#. So those languages are going to have to manually wrap parameters in FSharpOption<T> objects or pass nulls for missing values.

Moving on, there is also the issue of documentation. The default value for F# style optional parameters are not exposed in the public API. In fact, F# doesn’t really have the concept of a default value. (This is somewhat odd considering its inspiration, OCaml, does.) Instead, there is an optional design pattern idiomatic code should follow, but is in no way obligated to.

Reflection in .NET Core, .NET Standard 1.5, and Portable Class Libraries

Formally referred to as Enable FSharp.Reflection functionality on Portable 78, 259 and .NET Standard 1.5 profiles, this feature solves a longstanding and unnecessary limitation when using F# on limited platforms. From the RFC,

The functionality FSharpValue.MakeRecord and other similar methods are missing from FSharp.Core reflection support for Profile78, 259 and .NET Core (i.e. .NET Standard 1.5). This is because the signatures of these use the BindingFlags type and that type is not available in those profiles.

This functionality is really part of the basic F# programming model. This is a frustrating problem because the BindingFlags is really only used to support BindingFlags.NonPublic, and could always just as well have always been a Boolean flag.

This RFC is to make this functionality available, especially on .NET Core but also on the portable profiles.

Unlike the other RFCs, this was deemed simple enough that no discussion period was granted.

Convenience

No major release would be complete without some features that, while not strictly necessary, make the developer’s job easier.

Underscores in Numeric Literals

Have you ever seen the number 1000000 and found yourself counting zeros to determine what it actually means? If so, you’ll probably find this feature to be useful. With F# 4.1, you can now write 1_000_000 instead.

Coincidentally, the ability to put underscores in numeric literals was also added in C# this year.

Mutually Referential Types and Modules Within the Same File

F# has a design problem that developers who primarily use VB, C#, Java, etc. probably didn’t even know was possible: the inability to reference one type or module from another.

In F#, projects are compiled file-by-file rather than all at once. This means a given type or module normally cannot reference another type or module unless it appears earlier in the compilation order. In practical terms, this means you cannot have mutual references between two types.

For example, say you have a LinkedList class and a Node class. If Node appears before LinkedList in the compilation order, then LinkedList can have code that refers to its collection of Node objects. However, since Node is first, it can’t have a property pointing back to the LinkedList that owns it.

In earlier versions of F#, this restriction can be lifted by creating a “recursive scope” using the rec keyword. This very limited though; it can only be used for a group of functions or a group of types in the same file. You cannot, for example, have a type and a module refer to each other. Nor can you have an exception contain a reference to a type capable of throwing the exception.

With the so-called Mutually Referential Types and Modules Within the Same File feature, this restriction has been relaxed somewhat. You can now use the rec keyword at the namespace or module level, allowing an entire file to be mutually referential, if it is limited to a single namespace.

The reason for this keyword is philosophical rather than technical. In idiomatic F#, there is a belief that cyclical dependencies lead to spaghetti code. In order to reduce this, they intentionally make it difficult to introduce cyclical dependencies. To be fair, this often does result in easier to understand dependency chains. But the downside is you get very large files when cyclical dependencies are necessary.

It should be noted that philosophy isn’t the only reason that F# doesn’t allow mutually referential types to span multiple files. From the RFC:

For a type-inferred, Hindley-Milner-style language, it is technically essentially impossible to provide incremental checking for single files within an assembly where all files are mutually referential. This means that visualIDE tooling performance would be worse when editing large assemblies when this switch is enabled.

Modules and Types with the Same Name

One annoyance in VB and C# is you occasionally find yourself wanting to have a type and a module (C# static class) with the same name. This usually comes up when designing extension methods or other utility libraries that work on a given class.

F# solves this by automatically adding the suffix “Module” to the module’s name if there is a type with the same name. Previously this required the use of the [<CompilationRepresentation(CompilationRepresentationFlags.ModuleSuffix)>] attribute, but now module renaming is implied.

One does have to question the logic of this design. Say you publish a library with a module named Foo. Then later you add a type named Foo. This will cause the compiler to automatically rename the module from Foo to FooModule, a breaking change. Since this will now happen without the attribute being present, there is no warning.

Gauthier Segay responds,

Do you think "new F# devs" would be actively designing their code as to have a type A and a module A?

If they faced a similar need, they'll reshape a bit their business code without big impact on frontend code IMHO.

I think this feature is mostly for ML seasoned devs, and those devs who want to expose API for C# will do so with the great features that make F# to work well with C# idiomatic code.

Unreserved Keywords

When F# was created, a lot of keywords were reserved for future use, especially ones found in other ML-based languages. After 12 years of development, they have concluded many of the reserved words are never going to be used.

  • atomic - this was related to the fad for transactional memory circa 2006. In F# this would now be a library-defined computation expression
  • constructor - the F# community is happy with 'new' to introduce constructors
  • eager - this is no longer needed, it was initially designed to be "let eager" to match a potential "let lazy"
  • functor  - If F# added parameterized modules, we would use "module M(args) = ..."
  • measure  - There is no specific reason to reserve this these days, the [<Measure>] attribute suffices
  • method  - the F# community is happy with 'member' to introduce methods
  • object  - there is no need to reserve this
  • recursive  - F# is happy using "rec"
  • volatile - There is no specific reason to reserve this these days, the [<Volatile>] attribute suffices

If you find you need to use an identifier that is still reserved as a keyword, you can surround it using double back-ticks (``private``). This works like brackets in VB ([Public]) or the @ symbol in C# (@private).

API Changes

There were also several API changes. Here are a couple of the more notable ones.

Result type

In many functional programming languages, functions that have a problem can return an error value instead of throwing an exception. In F#, this was occasionally done using the Option type, but that has a serious problem. Option has no way of telling you why an operation has failed. It can only say whether a value exists.

To work around this, F# developers could create their own discriminated unions that return either a result value or a detailed error value. That leads to inconsistencies across different libraries, so in F# 4.1 we see the introduction of an official result type.

 /// <summary>Helper type for error handling without exceptions.</summary>
    [<StructuralEquality; StructuralComparison>]
    [<CompiledName("FSharpResult`2")>]
    [<Struct>]
    type Result<'T,'TError> = 
      /// Represents an OK or a Successful result. The code succeeded with a value of 'T.
      | Ok of 'T 
      /// Represents an Error or a Failure. The code failed with a value of 'TError representing what went wrong.
      | Error of 'Terror

 As you can see, this uses the new struct discriminated union feature.

Implement IReadOnlyCollection<'T> in list<'T>)

Normally we wouldn’t be covering such a minor API change, but this one has some interesting ramifications. For the core .NET languages, there is a separation between the compiler and the framework classes. This means you can usually target an older version of the .NET runtime using the latest C# or VB compiler.

In F# the rules are a little different. Since F# was meant to be backwards compatible with ML and OCaml, at least enough to make porting easy, it has its own set of framework classes that need to be shipped with the compiler. Or rather, its own set of framework libraries.

With this feature, it wasn’t enough to just add the missing interface implementation. They also had to gate it with conditional compilation directives so they could continue building .NET Portable Class Libraries where that interface doesn’t exist.

A side-effect of adding the IReadOnlyCollection<'T> interface was that it broke JSON.NET. Though quickly fixed, it prompt JSON.NET’s creator, James Newton-King, to ask,

How come FSharpList<T> doesn't have a constructor that takes an IEnumerable<T>? If it had one then it would automatically work with Json.NET

The answer lies in how FSharpList<T>, a.k.a. list<'T>, is defined. Rather than being a normal class, it is a discriminated union (see above) consisting of either nothing or a value and another FSharpList<T>. So basically it is a linked list, without a wrapper class, compatible with F#’s pattern matching syntax.

Because of this design, FSharpList<T> is not allowed to have a constructor. Also, each node in the linked list is a separate IReadOnlyCollection<'T> with its own count that ignores any item coming before it in the list. This is an O(n) operation, something that concerned the developers when implementing the interface.

Contrast this with .NET’s LinkedList<T> class, which is a wrapper for LinkedListNode<T>. Since almost everything is done through the wrapper, LinkedList<T> can offer constructors and maintain metadata such as the current count.

F# 4.2

Features are already in development for F# 4.2 such as a ToString override for discriminated unions and records. You can watch the progress on the F# 4.2 folder of the fslang-design GitHub repository.

About the Author

Jonathan Allen got his start working on MIS projects for a health clinic in the late 90's, bringing them up from Access and Excel to an enterprise solution by degrees. After spending five years writing automated trading systems for the financial sector, he became a consultant on a variety of projects including the UI for a robotic warehouse, the middle tier for cancer research software, and the big data needs of a major real estate insurance company. In his free time he enjoys studying and writing about martial arts from the 16th century.

Rate this Article

Adoption
Style

BT