
Swift Memory Ownership Manifesto

by Sergio De Simone on Mar 03, 2017. Estimated reading time: 2 minutes


According to Chris Lattner, Swift creator and Swift team lead before moving to Tesla, defining a Rust/Cyclone-inspired memory ownership model is one of the main goals for Swift 4. Now that Swift 4 has entered its phase 2, the Swift team has published a manifesto detailing how Swift memory ownership could work.

Swift’s compiler already implements its own opaque ownership model, based on automatic reference counting (ARC), to decide when ownership should be transferred. In some cases the right decision is obvious; in others the compiler can make the wrong assumption and copy unnecessarily. In a nutshell, Swift’s new memory ownership model aims to give developers control over when memory is copied. Its main objective is to overcome the drawbacks of the current approach, which combines reference counting with copy-on-write: the overhead and occasionally unpredictable performance of reference counting, and the general need to allocate memory on the heap so that values can be copied at any time.
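The copy-on-write behavior mentioned above can be sketched by hand. This is a minimal illustration, not how the standard library implements its collections: `Storage` and `Stack` are hypothetical names, and only `isKnownUniquelyReferenced` is a real standard-library function. It shows a value type whose buffer lives on the heap, is shared through reference counting, and is copied only when a shared buffer is about to be mutated.

```swift
// Hypothetical heap-allocated buffer; instances are reference counted by ARC.
final class Storage {
    var items: [Int]
    init(items: [Int]) { self.items = items }
}

// Hypothetical value type that implements copy-on-write by hand.
struct Stack {
    private var storage = Storage(items: [])

    init() {}

    var items: [Int] { storage.items }

    mutating func push(_ item: Int) {
        // Copy the buffer only if another value still shares it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(items: storage.items)
        }
        storage.items.append(item)
    }
}

var first = Stack()
first.push(1)

var second = first   // No buffer copy yet: both values share one Storage.
second.push(2)       // The buffer is shared, so it is copied before mutation.

print(first.items)   // [1]
print(second.items)  // [1, 2]
```

The assignment `second = first` costs only a reference count increment, but the buffer itself must sit on the heap so that it can be shared and copied later, which is the trade-off the manifesto wants to let developers opt out of.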

Such drawbacks, while not generally a problem for application programming, can be undesirable for system programming in those cases where specific performance guarantees are required. Additionally, the benefits of a more flexible memory management model can be significant even for application programming when trying to optimize specific bottlenecks. Based on this, Swift’s new memory ownership model will be opt-in, so only developers who require that finer-grained control will incur the cost of its added complexity compared to ARC.

One change that will affect all Swift developers, though, is the “Law of exclusivity”, which will not be opt-in. This law will enforce that a variable cannot be accessed simultaneously in conflicting ways, such as when it is passed as an inout argument to two different parameters of the same call, or when a method receives a callback that accesses the same variable that the method was called on. Both cases are currently allowed in Swift, and ruling them out will affect all developers. Moreover, since the Law of exclusivity has an impact on the language ABI, because it changes the guarantees made for parameters, it will be one of the first features to be adopted.
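The two conflicting patterns described above can be sketched as follows. The names `transfer`, `Counter`, and `runExclusivityDemo` are hypothetical and chosen only for illustration. The conflicting calls are left commented out, on the assumption that a toolchain enforcing exclusivity would reject them; the surrounding code shows one way to restructure each case so the accesses no longer overlap.

```swift
// Hypothetical helper with two inout parameters.
func transfer(_ amount: Int, from source: inout Int, to destination: inout Int) {
    source -= amount
    destination += amount
}

// Hypothetical type whose mutating method takes a callback.
struct Counter {
    var value = 0

    mutating func update(using body: () -> Int) {
        value += body()
    }
}

func runExclusivityDemo() -> (checking: Int, savings: Int, counter: Int) {
    var checking = 100
    var savings = 20

    // Fine: the two inout arguments refer to different variables.
    transfer(30, from: &checking, to: &savings)

    // Conflict: the same variable would be passed as both inout arguments.
    // transfer(30, from: &checking, to: &checking)

    var counter = Counter()

    // Conflict: the callback would read `counter` while `update` is
    // still modifying it.
    // counter.update { counter.value + 1 }

    // Fine: read the value first, then let the callback use the copy.
    let snapshot = counter.value
    counter.update { snapshot + 1 }

    return (checking, savings, counter.value)
}

let result = runExclusivityDemo()
print(result.checking, result.savings, result.counter)  // 70 50 1
```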

In addition to the “Law of exclusivity”, Swift will introduce new annotations and language features to control the propagation of shared values and to express types that cannot be implicitly copied. These three mechanisms – exclusivity, explicit control of shared values propagation, and non-copyable types – work together to make it possible for the compiler to optimize the code more aggressively, according to the Manifesto’s authors.

Succinctly, the high-level picture of the new Swift ownership model could be summarized as follows:

  • The compiler will flag all non-exclusive uses of inout, whether explicit or implicit, as explained above.
  • Developers will be able to declare whether a variable is owned or shared, to avoid reference counting and unnecessary copying/destroying when entering/leaving lexical scopes.
  • Developers will be able to declare moveonly (i.e., non-copyable) types, which the compiler will not copy and which cannot be used to create additional references. moveonly types will have move semantics and are considered an advanced feature, so by default all types will be copyable.
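The last two points can be sketched in code, with an important caveat: the manifesto's `owned`, `shared`, and `moveonly` spellings were proposals and are not used here. The sketch below assumes a Swift 5.9 or later toolchain, where a noncopyable type is written `~Copyable` and parameter ownership is written `borrowing` or `consuming`. `Token`, `inspect`, `redeem`, and `runOwnershipDemo` are hypothetical names.

```swift
// Hypothetical noncopyable type (assumes Swift 5.9 or later).
struct Token: ~Copyable {
    let id: Int
}

// Borrows the token: the caller keeps ownership and no copy is made.
func inspect(_ token: borrowing Token) -> Int {
    token.id
}

// Takes ownership: the caller may not use the token afterwards.
func redeem(_ token: consuming Token) -> Int {
    token.id
}

func runOwnershipDemo() -> (inspected: Int, redeemed: Int) {
    let token = Token(id: 7)
    let inspected = inspect(token)   // Shared access; `token` stays valid.
    let redeemed = redeem(token)     // Ownership moves into `redeem`.

    // let copy = token              // Rejected: `token` was consumed above.
    return (inspected, redeemed)
}

let outcome = runOwnershipDemo()
print(outcome.inspected, outcome.redeemed)
```

Because `Token` cannot be copied, the compiler has to track exactly one owner at every point, which is what lets it skip the reference counting and defensive copies that a copyable type would need.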

The Manifesto itself provides a long and detailed analysis of everything that defining such an ownership model entails for Swift, and its details are not necessarily final yet. A shorter, though quite dense, summary of its key points has been published by Swift developer Alexis Beingessner.


Re: Meanwhile, in the Java world by Cameron Purdy

I'm pretty sure that when Java came out, I was programming on a Pentium 66 workstation with an impressive 64MB RAM (that's "MB" not "GB") that cost $6000, so the largest possible heap was probably about 32MB, and the single-threaded CPU cycles-per-second matched the memory size.

Five years later, the typical Java enterprise application was running on the Sun 4xx series (for roughly $80,000), with 4x 400MHz CPUs and 4GB RAM, so already a 10:1 (order of magnitude) increase in RAM:CPU from a single-threaded cycles-per-second perspective.

My three-year-old workstation today has 64GB RAM and a six-core (12-SMT) CPU at somewhere between 2-3GHz, so a 20:1 RAM:CPU ratio. A lot of enterprise applications are running on several-year-old Dell or HP servers with 1TB of RAM and roughly a 2GHz CPU, so about a 500:1 RAM:CPU ratio, with the Dell m1000e servers with a 2.13GHz CPU and 3TB RAM running at a 1500:1 RAM:CPU ratio. That's not even counting the high-end 32TB Sun SPARC and IBM Power servers, which tack on yet another order-of-magnitude increase.

tl;dr - the RAM:CPU-speed ratio has gone from roughly 1:1 to roughly 1000:1 during Java's lifespan, and that continues to grow.

The challenge that the JVM has with GC is that the heap is conceptually a single shared space for read/write data; in other words, there is neither compartmentalization of the heap (e.g. by thread or by core), nor is there any guarantee that anything stored in the heap won't be changed at any point in time that user code is running. There is no harder GC problem than trying to GC a single giant data structure being concurrently whacked away on by any number of threads. That Java works reasonably well at all in large-heap systems is a testament to the incredible feats that the various JVM engineers have pulled off.

The next generation of commodity 4-CPU "enterprise" servers will clock in at 100+ cores (i.e. 200+ SMT threads) and 16TB RAM. If the JVM supports it, people will be running 10TB heaps. I'm going to go out on a limb and suggest that GC pauses could become an issue.

As for reference counting, it doesn't stand a chance at that scale either. The problem is simply a natural consequence of (1) viewing the store as a shared store, and (2) allowing absolutely every item in the shared store to be mutated by anyone, at any time. As long as those two design assumptions remain, the problem will remain.

Peace,

Cameron.

Re: Meanwhile, in the Java world by Gil Tene

I applaud the various language-level designs and memory models that aim to improve the ease of building concurrent programs, reduce their complexity for the writer, improve their readability and reasoning, and their testability and correctness. Those are worthy motivations for such designs.

But GC "cost" should not be used as a reason/excuse IMO, and neither should GC pauses. The actual cost of dealing with a shared heap (with anyone being able to access and mutate at any time) has been demonstrated to be low and sustainable at any size, and the need for GC pauses that have anything to do with the size or complexity of the heap contents, or with the rate of operating on those contents, has similarly been demonstrated to be nil. Building languages or memory models based on the assumption that 20-year-old GC mechanisms that still perform GC work in Stop-The-World pauses are "as good as you can do" or "represent inherent limitations" (in either cost or pause behaviors) is no different than designing with the assumption that LLVM or other compiler and optimization frameworks will never exist.
In my view, designing an environment or a language to work with such "trivial" GC mechanisms is just as silly as designing it to "run fast without needing optimizing compilers", or "execute fast without having to rely on constant propagation or inlining based optimizations" because those things are too hard to build well. We don't need to "spare the GC" in our designs any more than we need to "spare the interpreter". Runtimes and compilers can and should be built to make such issues irrelevant and allow clean, readable, maintainable code to be built at scale.

GC is a solved problem. Including pauses. At scales from megabytes to terabytes. And I expect it to only get better, with more and more implementations that actually embrace what it takes to tackle it (e.g. embracing concurrent compaction, viewing any work done in a pause as "bad" or "wrong"). Java already has one (with more coming), and I see a lot of promise in the newer collector mechanisms being developed for all sorts of runtimes.

Swift by Alexey Porubay

Which is better, Swift or Objective-C? www.cleveroad.com/blog/swift-vs-objective-c--wh... Check out my thoughts on that.
