UniTask v2 — Zero Allocation async/await for Unity, with Asynchronous LINQ

Yoshifumi Kawai
10 min read · Jun 11, 2020

I previously published UniTask, an async/await library for Unity. I have now rewritten all of the code and released a new version.

GitHub — Cysharp/UniTask

In UniTask v2, almost everything is zero-allocation thanks to a thorough rewrite of the code (technical details follow). In addition to significant performance improvements, v2 introduces asynchronous sequences and Asynchronous LINQ. Await extensions for external assets such as DOTween and Addressables are also built in for added convenience.

Before I talk about v2, here’s a recap. async/await is a feature that has been part of C# since 5.0. It lets you write asynchronous code as if it were synchronous code, instead of handling it with callback chains or coroutines. Return values and exceptions are both handled naturally. With callbacks alone, complex processing leads to deeply nested code, and inner exceptions do not propagate outward, which makes error handling difficult.

In Unity, coroutines, which are built on yield return (generators), can flatten the nesting to some extent. However, because of the constraints of the syntax they cannot return a value or propagate an error, so they end up being combined with delegates.

Coroutines flatten the nesting slightly, but:

  • Callback process still remains
  • Can’t use try-catch-finally because of the yield syntax
  • Allocation of the lambda and the coroutine itself
  • Difficult to cancel (a coroutine only stops; Dispose is not called)
  • Impossible to control multiple coroutines (serial/parallel processing)
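To make these points concrete, here is a minimal sketch of the callback style that coroutines push you toward. It uses a plain C# iterator and a tiny hand-written driver instead of Unity's StartCoroutine so that it is self-contained. FetchCoroutine, RunToEnd and the fake three-frame wait are hypothetical names invented for illustration, not Unity or UniTask APIs.

```csharp
using System;
using System.Collections;

public static class CoroutineSketch
{
    // An iterator cannot return a result or throw to its caller in a useful way,
    // so both the value and the error are handed back through delegates.
    public static IEnumerator FetchCoroutine(string url, Action<string> onResult, Action<Exception> onError)
    {
        // Pretend the request takes three frames to finish.
        for (var frame = 0; frame < 3; frame++)
        {
            yield return null; // wait one frame
        }

        // A yield cannot sit inside a try block that has a catch clause,
        // so a failure is reported through a callback instead of being thrown.
        if (string.IsNullOrEmpty(url))
        {
            onError(new ArgumentException("url is empty"));
            yield break;
        }

        onResult("response from " + url);
    }

    // Stand-in for the engine: calls MoveNext once per simulated frame
    // and returns how many frames the coroutine waited.
    public static int RunToEnd(IEnumerator coroutine)
    {
        var frames = 0;
        while (coroutine.MoveNext())
        {
            frames++;
        }
        return frames;
    }
}
```

Note that the caller has to pass lambdas that capture local variables to receive the result, which is where the extra allocations in the list above come from.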

async/await solves these problems with language-level support: asynchronous code can be written almost identically to synchronous code.

However, the Unity framework itself doesn’t have much async/await support. UniTask provides…

  • await support for each of Unity’s AsyncOperations
  • PlayerLoop-based switching (Yield, Delay, DelayFrame, etc.) that provides all the functionality of a coroutine on UniTask
  • await support for MonoBehaviour events and uGUI events
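The three kinds of support listed above can be sketched roughly as follows. This is an illustration that assumes the UniTask v2 package is installed in a Unity project. The asset name "foo", the okButton parameter and the class name are placeholders I made up.

```csharp
using System;
using System.Threading;
using Cysharp.Threading.Tasks;
using UnityEngine;
using UnityEngine.UI;

public static class UniTaskBasicsSketch
{
    public static async UniTask<string> DemoAsync(Button okButton, CancellationToken token)
    {
        // 1. Await a Unity AsyncOperation (here a ResourceRequest) directly.
        var asset = await Resources.LoadAsync<TextAsset>("foo");

        // 2. PlayerLoop-based waits that cover what coroutines are usually used for.
        await UniTask.Yield();                                  // resume on the next Update
        await UniTask.DelayFrame(10, cancellationToken: token); // wait ten frames
        await UniTask.Delay(TimeSpan.FromSeconds(1), cancellationToken: token);

        // 3. Await a uGUI event: continues after the button is clicked once.
        await okButton.OnClickAsync(token);

        // The return value flows back to the caller like in any async method.
        return (asset as TextAsset)?.text;
    }
}
```

Unlike the coroutine version, the result is a normal return value and any exception can be caught by the caller with try-catch.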

I’ve implemented a custom struct-based UniTask type with a dedicated AsyncMethodBuilder. By bypassing .NET’s Task and dropping ExecutionContext/SynchronizationContext, which are unnecessary in Unity, it achieves Unity-optimized performance.

With UniTask, Unity can take full advantage of the power of async/await.

Things have changed in the two years since the first release. .NET Core 3.1 has been released, .NET 5 is on the way, and the runtime has been rewritten. C# 8.0 is arriving in Unity as well. So, while keeping the elements above, I have completely revised the API and added:

  • Zero-allocation of the entire async method to further improve performance
  • Asynchronous LINQ(UniTaskAsyncEnumerable, Channel, AsyncReactiveProperty)
  • Increased PlayerLoop timing (new LastPostLateUpdate will have the same effect as WaitForEndOfFrame)
  • Support for external assets such as Addressables and DOTween

Zero allocation in particular reduces GC pressure even with heavy use of async/await, so you can expect a significant performance improvement.

In addition, I have adjusted the behavior to be similar to .NET Core’s ValueTask/IValueTaskSource (e.g. Delay starts when it is called, just as with Task, and awaiting twice throws an exception). This gives you the performance benefits of UniTask while reducing the learning gap, because the behavior is aligned with the standard.

With support for C# 8.0 starting in Unity 2020.2.0a12, asynchronous stream syntax is now available. UniTask v2 supports asynchronous streams through the newly added UniTaskAsyncEnumerable.

Even in Unity 2018, 2019 and 2020.1, where C# 8.0 is not supported, almost identical processing is possible in combination with the built-in Asynchronous LINQ. And since it is LINQ, all standard LINQ query operators can be applied to asynchronous streams. For example, a Where filter on a button-click asynchronous stream can run a handler on every second click.
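A minimal sketch of such a filter, assuming UniTask v2 with its Linq namespace is available. The okButton parameter and the class name are placeholders.

```csharp
using System.Threading;
using Cysharp.Threading.Tasks;
using Cysharp.Threading.Tasks.Linq;
using UnityEngine;
using UnityEngine.UI;

public static class EverySecondClickSketch
{
    public static UniTask RunAsync(Button okButton, CancellationToken token)
    {
        return okButton.OnClickAsAsyncEnumerable()
            // The index is zero-based, so this lets the 1st, 3rd, 5th... click through.
            .Where((_, index) => index % 2 == 0)
            .ForEachAsync(_ =>
            {
                Debug.Log("Handled click");
            }, token);
    }
}
```

The returned UniTask completes when the stream ends or the token is cancelled, so the caller decides how long the subscription lives.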

In addition to button clicks, it provides a rich set of asynchronous stream factories that integrate with many parts of Unity, so you can write all kinds of processing depending on your ingenuity.

Principle of AsyncStateMachine and Zero Allocation

While there are many new features, such as asynchronous stream support, the biggest feature of UniTask v2 is a significant performance improvement.

Compared with the standard Task-based implementation of async/await, UniTask v2 is far more careful about where allocations occur and how they are suppressed. Let’s break down the structure, taking a relatively simple asynchronous method, LoadAssetFromCacheAsync, as an example.

If a previously loaded value is in the cache, it is returned; otherwise the asset is loaded asynchronously. At compile time, the await is translated into calls to GetAwaiter -> IsCompleted / GetResult / UnsafeOnCompleted.

This is an optimization that avoids the cost of creating, registering and calling a callback when no callback is needed (e.g. LoadAssetFromCacheAsync itself returns immediately if the value is already cached).
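As a rough illustration, the method described above and a hand-expanded version of its single await might look like this. Only the name LoadAssetFromCacheAsync comes from the text. The Loader class, the LoadExpanded helper and the fake LoadAssetAsync are assumptions made so the sketch is self-contained.

```csharp
using System;
using System.Threading.Tasks;

public class Loader
{
    object cache;

    // Return the cached value if there is one, otherwise load asynchronously.
    public async Task<object> LoadAssetFromCacheAsync(string address)
    {
        if (cache == null)
        {
            cache = await LoadAssetAsync(address);
        }
        return cache;
    }

    // Roughly what "cache = await LoadAssetAsync(address)" expands to,
    // with the rest of the method represented by the continuation delegate.
    public void LoadExpanded(string address, Action<object> continuation)
    {
        var awaiter = LoadAssetAsync(address).GetAwaiter();
        if (awaiter.IsCompleted)
        {
            // Fast path: no callback is created or registered.
            cache = awaiter.GetResult();
            continuation(cache);
        }
        else
        {
            // Slow path: resume later, then fetch the result.
            awaiter.UnsafeOnCompleted(() =>
            {
                cache = awaiter.GetResult();
                continuation(cache);
            });
        }
    }

    // Hypothetical stand-in for real asset loading.
    async Task<object> LoadAssetAsync(string address)
    {
        await Task.Delay(10);
        return "asset:" + address;
    }
}
```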

Methods declared as async are converted by the compiler into a state machine. Its MoveNext method is what gets registered as the await callback.

The state machine implementation is a bit long. You can think of the method as being split at each await, with the state advancing from one piece to the next. It is a little hard to read because everything goes through the builder described below. If the awaiter’s IsCompleted is true, GetResult is called immediately. If it is false, the state machine registers its own MoveNext with UnsafeOnCompleted. MoveNext is then called again when the asynchronous operation completes, and this time it calls GetResult.
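The following is a hand-written approximation of that state machine, not actual compiler output. It follows the same shape (a stub that starts the machine, a MoveNext that switches on a state field, and calls into AsyncTaskMethodBuilder), but the names LoweredLoader and StateMachine and the state numbering are my own simplifications.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public class LoweredLoader
{
    object cache;

    // The async method body is replaced by a stub that starts the state machine.
    public Task<object> LoadAssetFromCacheAsync(string address)
    {
        var stateMachine = new StateMachine
        {
            builder = AsyncTaskMethodBuilder<object>.Create(),
            self = this,
            address = address,
            state = -1, // -1: not started, 0: at the await, -2: finished
        };
        stateMachine.builder.Start(ref stateMachine); // runs MoveNext once
        return stateMachine.builder.Task;
    }

    // Hypothetical stand-in for real asset loading.
    async Task<object> LoadAssetAsync(string address)
    {
        await Task.Delay(10);
        return "asset:" + address;
    }

    struct StateMachine : IAsyncStateMachine
    {
        public AsyncTaskMethodBuilder<object> builder;
        public LoweredLoader self;
        public string address;
        public int state;
        TaskAwaiter<object> awaiter;

        public void MoveNext()
        {
            object result;
            try
            {
                if (state == -1 && self.cache == null)
                {
                    awaiter = self.LoadAssetAsync(address).GetAwaiter();
                    state = 0;
                    if (!awaiter.IsCompleted)
                    {
                        // Register this MoveNext as the continuation and leave.
                        builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                        return;
                    }
                }
                if (state == 0)
                {
                    // Reached either synchronously or when the callback resumes us.
                    self.cache = awaiter.GetResult();
                }
                result = self.cache;
            }
            catch (Exception ex)
            {
                state = -2;
                builder.SetException(ex);
                return;
            }
            state = -2;
            builder.SetResult(result);
        }

        public void SetStateMachine(IAsyncStateMachine stateMachine)
        {
            builder.SetStateMachine(stateMachine);
        }
    }
}
```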

The last piece is AsyncTaskMethodBuilder. It is not a compiler-generated class; it is the builder class that corresponds to Task. The original source is long, so I’ll only outline what it does.

The builder makes the initial call (Start), provides the Task that becomes the return value, registers the callback (AwaitUnsafeOnCompleted) and sets the result (SetResult/SetException).

An await chain is similar to a callback chain, but if you write the callback chain by hand you cannot avoid allocating a closure for each lambda. With async/await, the compiler generates a single delegate that does all the work, which reduces allocations. Thanks to these mechanisms, writing with async/await is more efficient than writing the callbacks by hand.

Now we have all the pieces in place. async/await with Task is basically very good, because it includes detailed optimizations that take advantage of compiler generation, but there are some problems.

In terms of memory allocation, the following four allocations occur in the worst case:

  • Task Allocation
  • Boxing the AsyncStateMachine
  • Allocation of the Runner encapsulating the AsyncStateMachine
  • Allocation of delegate for MoveNext

If you declare the return type as Task, a Task is always allocated, even when the value can be returned immediately. To deal with this problem, .NET Standard 2.1 introduced the ValueTask type. However, if a callback is needed, the Task allocation is still there, and so is the boxing of the AsyncStateMachine, because the state machine has to be placed on the heap.

UniTask solves these problems with a custom AsyncMethodBuilder, which has been supported since C# 7.0.

UniTask (a value type) removes the allocation when a value is returned immediately. A strongly typed Runner (which causes no boxing) is integrated with the Task-like return value (RunnerPromise). Furthermore, it is rented from an object pool and returned to the pool when the call completes (GetResult). This completely removes the Task- and state-machine-related allocations.

As a limitation, no UniTask object can be awaited twice, because it is automatically returned to the pool when the await completes.

This constraint is the same as ValueTask/IValueTaskSource.

The following operations should never be performed on a ValueTask instance:

* Awaiting the instance multiple times.
* Calling AsTask multiple times.
* Using .Result or .GetAwaiter().GetResult() when the operation hasn’t yet completed, or using them multiple times.
* Using more than one of these techniques to consume the instance.

If you do any of the above, the results are undefined.

Although there are some inconveniences, aggressive pooling is possible thanks to this restriction.
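To show why consume-once semantics and pooling go together, here is a deliberately simplified sketch. It is not UniTask's actual implementation. PooledPromise, PromiseHandle and the version-token check are hypothetical names of mine, and the pool is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// A reusable result holder that goes back to the pool as soon as it is read.
public sealed class PooledPromise<T>
{
    static readonly Stack<PooledPromise<T>> pool = new Stack<PooledPromise<T>>();

    T result;
    short version;

    public short Version => version;

    public static PooledPromise<T> Rent()
    {
        return pool.Count > 0 ? pool.Pop() : new PooledPromise<T>();
    }

    public void SetResult(T value)
    {
        result = value;
    }

    public T GetResult(short token)
    {
        // A stale token means this handle was already consumed and the object
        // may now belong to someone else.
        if (token != version)
        {
            throw new InvalidOperationException("This handle was already consumed.");
        }

        var value = result;
        result = default(T);
        version++;       // invalidates every outstanding handle
        pool.Push(this); // ready for reuse
        return value;
    }
}

// The cheap value-type handle that callers hold, analogous to a task struct.
public readonly struct PromiseHandle<T>
{
    readonly PooledPromise<T> promise;
    readonly short token;

    public PromiseHandle(PooledPromise<T> promise)
    {
        this.promise = promise;
        this.token = promise.Version;
    }

    public T GetResult()
    {
        return promise.GetResult(token);
    }
}
```

Because the object is recycled at the moment the result is read, a second read has nothing valid to observe, which is exactly the restriction described above.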

Note that this implementation of zero-allocation async/await is similar to the one introduced in Async ValueTask Pooling in .NET 5. With UniTask v2 you do not have to wait for a runtime update in the distant future; it is available in Unity right now.

If you monitor with the profiler in the Unity Editor or in a Development Build, you will still see allocations. This is because the AsyncStateMachine generated by the C# compiler is a class in debug builds. In release builds it is a struct, and there is no allocation.

The pool size is unlimited by default, but you can set the maximum size with TaskPool.SetMaxPoolSize and retrieve the number of items currently cached with TaskPool.GetCacheSizeInfo. GC has a bigger impact in Unity than in .NET Core, so you should pool actively, but it may be better to tune the size for some applications.
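A short sketch of how these two calls might be used, assuming UniTask v2 in a Unity project. The limit of 100 is an arbitrary example value and the class name is a placeholder.

```csharp
using Cysharp.Threading.Tasks;
using UnityEngine;

public static class TaskPoolTuningSketch
{
    public static void Configure()
    {
        // Limit how many pooled objects are kept (100 is only an example).
        TaskPool.SetMaxPoolSize(100);
    }

    public static void DumpCacheSizes()
    {
        // Each entry pairs a pooled type with the number of cached instances.
        foreach (var (type, size) in TaskPool.GetCacheSizeInfo())
        {
            Debug.Log(type.Name + ": " + size);
        }
    }
}
```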

Coroutine and PlayerLoop

One of the key features that distinguishes UniTask from Task is that it does not use SynchronizationContext (or ExecutionContext) at all.

In Unity, an awaited Task automatically returns to the main thread (via UnitySynchronizationContext). This is convenient at first glance, but it adds overhead and is unnecessary in Unity, because Unity’s asynchronous operations (AsyncOperation) run in the engine layer (C++) and have already returned to the main thread by the time they reach the scripting layer (C#).

So I cut the SynchronizationContext to make it lighter.

One more thing: a SynchronizationContext has only one place to come back to, whereas in Unity there are many situations in which the position in the execution order needs fine control. For example, coroutines often use WaitForEndOfFrame or WaitForFixedUpdate to adjust where in the execution order they resume.

So instead of a single SynchronizationContext, UniTask now allows you to manually specify points in the execution sequence to return to.
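For instance, choosing the resume point explicitly could look like the sketch below, assuming UniTask v2 in a Unity project. The class and method names are placeholders.

```csharp
using Cysharp.Threading.Tasks;
using UnityEngine;

public static class TimingSketch
{
    public static async UniTask RunAsync()
    {
        // Resume in the FixedUpdate phase, where a coroutine would use WaitForFixedUpdate.
        await UniTask.Yield(PlayerLoopTiming.FixedUpdate);
        Debug.Log("Resumed at FixedUpdate timing");

        // Resume at the very end of the frame, where a coroutine would use WaitForEndOfFrame.
        await UniTask.Yield(PlayerLoopTiming.LastPostLateUpdate);
        Debug.Log("Resumed at LastPostLateUpdate timing");
    }
}
```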

In current versions of Unity, the standard PlayerLoop mechanism lets you modify all of the event functions. UniTask injects itself at the beginning and end of each phase, for a total of 14 locations to choose from.

The full list is long, so let’s look only at Update.

MonoBehaviour’s Update runs in ScriptRunBehaviourUpdate, coroutines (yield return null) run in ScriptRunDelayedDynamicFrameRate, and UnitySynchronizationContext is run by the PlayerLoop in ScriptRunDelayedTasks.

Seen this way, there is nothing special about Unity’s coroutines. The PlayerLoop (ScriptRunDelayedDynamicFrameRate) drives the IEnumerator and simply calls MoveNext every frame. UniTask’s custom loop is not much different from the coroutine loop.

However, Unity’s coroutine is an old-fashioned mechanism, and because of the limitations of yield return and its high overhead, it is not ideal.

UniTask can be used as an alternative to coroutines even outside asynchronous processing: it does not have those limitations, and it performs well. Therefore, I believe that replacing coroutines with UniTask is a practical choice.

You can also find out which PlayerLoop the script is currently running in by checking the stack trace in the Debug.Log.

If it is running in UniTask’s PlayerLoop, the PlayerLoop timing appears second from the bottom of the stack trace (in this case, PlayerLoopTiming.PreLateUpdate).

Asynchronous LINQ

With the introduction of C# 8.0 support in Unity 2020.2.0a12, asynchronous streams are now possible. For example, a single await foreach statement can replace Update()!

That said, C# 8.0 is still too early for most projects. In C# 7.3 environments the ForEachAsync method works in almost the same way, so we’ll use it for practical purposes.
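Both styles can be sketched side by side as follows, assuming UniTask v2 with its Linq namespace. The first method additionally assumes a Unity version with C# 8.0 support. The class and method names are placeholders.

```csharp
using System.Threading;
using Cysharp.Threading.Tasks;
using Cysharp.Threading.Tasks.Linq;
using UnityEngine;

public static class EveryFrameSketch
{
    // C# 8.0: the loop body runs once per frame, like Update().
    public static async UniTask WithAwaitForeach(CancellationToken token)
    {
        await foreach (var _ in UniTaskAsyncEnumerable.EveryUpdate().WithCancellation(token))
        {
            Debug.Log("Update() " + Time.frameCount);
        }
    }

    // C# 7.3: the same loop expressed with ForEachAsync.
    public static UniTask WithForEachAsync(CancellationToken token)
    {
        return UniTaskAsyncEnumerable.EveryUpdate().ForEachAsync(_ =>
        {
            Debug.Log("Update() " + Time.frameCount);
        }, token);
    }
}
```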

UniTaskAsyncEnumerable also has LINQ operators, just like LINQ on `IEnumerable` or Rx on `IObservable`.

Asynchronous LINQ provides all the standard LINQ query operators for asynchronous streams. For example, a Where filter on a button-click asynchronous stream can run a handler on every second click.

It is similar to UniRx (Reactive Extensions), but Rx is a push-type asynchronous stream, whereas UniTaskAsyncEnumerable is a pull-type asynchronous stream.

There is also a Channel, modeled on .NET Core’s System.Threading.Channels. It is like a Go channel, but designed for async/await. Channel.Reader.ReadAllAsync returns IUniTaskAsyncEnumerable, so it can be consumed with await foreach or Asynchronous LINQ.

For example, when combined with Publish, an operator unique to UniTask, it can be turned into a Pub/Sub implementation.
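One way such a Pub/Sub broker might be put together is sketched below, assuming UniTask v2's Channel and Linq APIs. AsyncMessageBroker is an illustrative class name, not a type that ships with the library.

```csharp
using System;
using Cysharp.Threading.Tasks;
using Cysharp.Threading.Tasks.Linq;

public class AsyncMessageBroker<T> : IDisposable
{
    readonly Channel<T> channel;
    readonly IConnectableUniTaskAsyncEnumerable<T> multicastSource;
    readonly IDisposable connection;

    public AsyncMessageBroker()
    {
        channel = Channel.CreateSingleConsumerUnbounded<T>();

        // Publish turns the single-consumer reader into a source
        // that several subscribers can share.
        multicastSource = channel.Reader.ReadAllAsync().Publish();
        connection = multicastSource.Connect();
    }

    // Producer side: push a value into the channel.
    public void Publish(T value)
    {
        channel.Writer.TryWrite(value);
    }

    // Consumer side: every subscriber enumerates the shared stream.
    public IUniTaskAsyncEnumerable<T> Subscribe()
    {
        return multicastSource;
    }

    public void Dispose()
    {
        channel.Writer.TryComplete();
        connection.Dispose();
    }
}
```

A subscriber would then consume it with something like Subscribe().ForEachAsync(...) or await foreach.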

Enhanced await support

By default, UniTask includes await support for AsyncOperation, ResourceRequest, AssetBundleRequest, AssetBundleCreateRequest and UnityWebRequestAsyncOperation.

I’ve organized them into three patterns.

In addition to awaiting directly, you can call the WithCancellation method to get cancellation support. Its return value is a UniTask, so it can be used with WhenAll for parallel processing.

ToUniTask is a more advanced option than WithCancellation. It accepts a progress callback, the PlayerLoop timing to run on, and a CancellationToken.
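The three patterns (direct await, WithCancellation, ToUniTask) might look like this for a ResourceRequest. This sketch assumes UniTask v2 in a Unity project, and the asset names and class name are placeholders.

```csharp
using System.Threading;
using Cysharp.Threading.Tasks;
using UnityEngine;

public static class AwaitPatternsSketch
{
    public static async UniTask RunAsync(CancellationToken token)
    {
        // Pattern 1: await the AsyncOperation directly.
        var a = await Resources.LoadAsync<TextAsset>("foo");

        // Pattern 2: the same await, but cancellable.
        var b = await Resources.LoadAsync<TextAsset>("bar").WithCancellation(token);

        // Pattern 3: progress callback, PlayerLoop timing and cancellation together.
        var c = await Resources.LoadAsync<TextAsset>("baz").ToUniTask(
            progress: Progress.Create<float>(x => Debug.Log(x)),
            timing: PlayerLoopTiming.Update,
            cancellationToken: token);

        // WithCancellation returns a UniTask, so two loads can run in parallel.
        var (d, e) = await UniTask.WhenAll(
            Resources.LoadAsync<TextAsset>("foo").WithCancellation(token),
            Resources.LoadAsync<TextAsset>("bar").WithCancellation(token));

        Debug.Log($"{a} {b} {c} {d} {e}");
    }
}
```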

In addition, support for DOTween and Addressables has been added for external assets. For example, DOTween tweens can be awaited directly.
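A sketch of what awaiting tweens might look like. It assumes that DOTween is installed and that UniTask's DOTween integration is enabled in the project (as far as I know this requires the UNITASK_DOTWEEN_SUPPORT scripting define). The target values, durations and class name are arbitrary.

```csharp
using System.Threading;
using Cysharp.Threading.Tasks;
using DG.Tweening;
using UnityEngine;

public static class TweenSketch
{
    public static async UniTask RunAsync(Transform transform, CancellationToken token)
    {
        // Sequential: the second tween starts after the first one finishes.
        await transform.DOMoveX(2f, 1f);
        await transform.DOMoveZ(5f, 1f);

        // Parallel, with cancellation: both tweens run at once.
        await UniTask.WhenAll(
            transform.DOMoveX(10f, 3f).WithCancellation(token),
            transform.DOScale(10f, 3f).WithCancellation(token));
    }
}
```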

And all UniTasks can be monitored for usage with the UniTaskTracker.

This makes it easier to prevent memory leaks.

.NET Core

New in UniTask v2, a .NET Core version of UniTask is available on NuGet. It has higher performance than the standard Task/ValueTask, but it is less convenient to use because it ignores ExecutionContext/SynchronizationContext.

Because it ignores the ExecutionContext, AsyncLocal does not work either. If you want to use it, we recommend that you understand the limitations and use it only in targeted places.

If you want to use UniTask internally and expose ValueTask as the external API, that is also possible.
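One possible shape for this, assuming the .NET Core build of UniTask provides a conversion from UniTask<T> to ValueTask<T> and a C# 8.0 compiler for the static local function. The class name and the delay are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Cysharp.Threading.Tasks;

public class ValueTaskFacadeSketch
{
    // Public surface: callers only ever see ValueTask<int>.
    public ValueTask<int> DoAsync(int x, int y)
    {
        return Core(x, y);

        // Internal implementation: built with UniTask's method builder.
        static async UniTask<int> Core(int x, int y)
        {
            await Task.Delay(TimeSpan.FromMilliseconds(10));
            return x + y;
        }
    }
}
```

Keeping UniTask out of the public signature means consumers do not need to know about its consume-once and context limitations.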

The .NET Core version is primarily intended to serve as a shared interface when sharing code with Unity (e.g. for Cysharp/MagicOnion, a realtime framework for Unity and .NET Core).

If you want something like UniTask’s WhenAll for ValueTask, take a look at Cysharp/ValueTaskSupplement.

Conclusion

It has been two years since the first release of UniTask. Many games have adopted it, but there are still many people who do not understand, or misunderstand, async/await.

UniTask v2 is the perfect solution for Unity. I hope that many people will now discover the power of C# and async/await.

Written by Yoshifumi Kawai

a.k.a. neuecc. Creator of UniRx, UniTask, MessagePack for C#, MagicOnion etc. Microsoft MVP for C#. CEO/CTO of Cysharp Inc. Live and work in Tokyo, Japan.
