NativeMemoryArray — A library that takes full advantage of the .NET 6 API to handle huge data of over 2GB

Yoshifumi Kawai
4 min read · Dec 23, 2021

There is one major limitation in C# when dealing with arrays, especially byte[]. The upper limit of a one-dimensional array is 0x7FFFFFC7 (2,147,483,591), which is a bit smaller than int.MaxValue.

This limit differs between .NET 6 and other target frameworks. We will discuss that in detail later.

This roughly 2GB ceiling comes from Length being an int. Nowadays, however, we often deal with very large data, such as 4K/8K video, large data sets for deep learning, and huge point clouds from 3D scanning, so the 2GB constraint is a real problem. Using Span<T> or Memory<T> does not lift it either, because their Length is also an int.

When you run into this limitation, you can either switch to streaming processing or use pointers, but neither is easy to work with, and they don’t necessarily replace the operations you used to do in memory.

Therefore, I’ve created an array backed by native memory, which exceeds the 2GB constraint and integrates closely with newer APIs like IBufferWriter<T>, ReadOnlySequence<T>, RandomAccess.Write/Read, System.IO.Pipelines, etc.

.NET 6 is recommended, but .NET Standard 2.0 and later are supported, so it also works with the Unity game engine.

You can load huge data via .NET 6's new RandomAccess API.
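As a rough sketch of that idea: the helper below reads a whole file into native memory with a single scatter read. FileLoader and LoadAsync are hypothetical names for illustration, and the NativeMemoryArray members used (the long-length constructor, AsMemoryList) are as I understand them from the library's README, so treat this as an outline rather than the definitive usage.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class FileLoader
{
    // Hypothetical helper: reads the whole file at `path` into native memory.
    // The caller owns the returned array and must Dispose it.
    public static async Task<NativeMemoryArray<byte>> LoadAsync(string path)
    {
        using var handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read,
            options: FileOptions.Asynchronous);

        long length = RandomAccess.GetLength(handle);
        var buffer = new NativeMemoryArray<byte>(length);
        try
        {
            // AsMemoryList() exposes the buffer as a list of Memory<byte> chunks,
            // which is the shape the scatter overload of ReadAsync accepts.
            await RandomAccess.ReadAsync(handle, buffer.AsMemoryList(), 0);
            return buffer;
        }
        catch
        {
            buffer.Dispose(); // do not leak native memory on failure
            throw;
        }
    }
}
```

Because the length is a long, the same code path works whether the file is 4 bytes or well over 2GB, as long as the memory is available.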

You can also work with a Stream via IBufferWriter<T> or IEnumerable<Memory<T>>.
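For example, writing the whole buffer out to a Stream can be sketched as a loop over Memory<byte> chunks. StreamWriterDemo and WriteToAsync are illustrative names, and AsMemorySequence() is used here as I understand it from the library's README.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class StreamWriterDemo
{
    // Hypothetical helper: copies the buffer to `destination` chunk by chunk.
    public static async Task WriteToAsync(
        NativeMemoryArray<byte> buffer,
        Stream destination,
        CancellationToken cancellationToken = default)
    {
        // Each chunk is a Memory<byte> of at most int.MaxValue elements,
        // so it can be handed to the Stream API directly.
        foreach (var chunk in buffer.AsMemorySequence())
        {
            await destination.WriteAsync(chunk, cancellationToken);
        }
    }
}
```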

You can also extract and process a slice as Span<T>, or use ref T this[long index] for indexer access and pointer extraction.
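A small sketch of what that looks like in practice. SliceDemo is a made-up name, and the member signatures (AsSpan(long start, int length), GetPinnableReference()) are assumed from the library's README.

```csharp
using System;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class SliceDemo
{
    public static long Run()
    {
        using var array = new NativeMemoryArray<int>(1024);

        // GetPinnableReference() returns a ref to the first element.
        ref int first = ref array.GetPinnableReference();
        first = 10;

        // The indexer takes a long and returns ref T, so elements are updated in place.
        array[1L] = 20;

        // start is a long, length is an int because Span<T>.Length is an int.
        Span<int> head = array.AsSpan(0, 4);
        head[2] = 30;
        head[3] = 40;

        long sum = 0;
        foreach (var value in head)
        {
            sum += value;
        }
        return sum; // 10 + 20 + 30 + 40
    }
}
```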

Of course, .NET Standard 2.0/2.1 is also supported, and it also works with Unity Game Engine.

NativeMemoryArray<T>

NativeMemoryArray provides only one simple class, Cysharp.Collections.NativeMemoryArray<T>. It has a where T : unmanaged constraint, so T can only be a struct that contains no reference types.
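To illustrate the constraint, here is a minimal sketch. Point3 and UnmanagedDemo are hypothetical types invented for this example.

```csharp
using System;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

// Hypothetical element type: only value-type fields, so it satisfies `unmanaged`.
public struct Point3
{
    public float X, Y, Z;
}

// A struct like the following would be rejected by the compiler as T,
// because string is a reference type:
// public struct Named { public string Name; }

public static class UnmanagedDemo
{
    public static float Run()
    {
        using var points = new NativeMemoryArray<Point3>(1000);

        // The ref-returning indexer allows whole-struct assignment in place.
        points[5] = new Point3 { X = 1.5f, Y = 2f, Z = 3f };
        return points[5].X;
    }
}
```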

The difference from Span<T> is that NativeMemoryArray<T> itself is a class, so it can be stored in a field. This means that, unlike Span<T>, it can have a long lifetime. Since you can take a slice as Memory<T>, you can also pass it to async methods. Also, the length of a Span<T> is limited to int.MaxValue (roughly 2GB), whereas NativeMemoryArray<T> can be larger than that.
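A sketch of both points, holding the array in a field and passing a Memory<T> slice to an async call. FrameStore and its members are hypothetical, and AsMemory(long start, int length) is assumed from the library's README.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

// Hypothetical owner type: the array lives in a field, which Span<T> cannot do.
public sealed class FrameStore : IDisposable
{
    private readonly NativeMemoryArray<byte> frames;

    public FrameStore(long size) => frames = new NativeMemoryArray<byte>(size);

    public byte this[long index] => frames[index];

    // Reads up to the first 4096 bytes from `source` into the head of the buffer.
    public async Task<int> FillHeadAsync(Stream source)
    {
        int length = (int)Math.Min(frames.Length, 4096);
        Memory<byte> head = frames.AsMemory(0, length); // Memory<T> can cross an await
        return await source.ReadAsync(head);
    }

    public void Dispose() => frames.Dispose();
}
```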

The main advantages are as follows:

  • Allocates from native memory, so it does not use the C# heap.
  • There is no 2GB limit; any length can be allocated as long as memory allows.
  • Can pass directly via IBufferWriter<T> to MessagePackSerializer, System.Text.Json.Utf8JsonWriter, System.IO.Pipelines, etc.
  • Can pass directly via ReadOnlySequence<T> to Utf8JsonReader, System.IO.Pipelines, etc.
  • Can pass huge data directly via IReadOnlyList<(ReadOnly)Memory<T>> to RandomAccess (Scatter/Gather API).

Its key feature is that it integrates seamlessly with other APIs through a variety of conversions:

  • Slice — AsSpan(), AsMemory()
  • IBufferWriter<T> — CreateBufferWriter()
  • foreach — AsSpanSequence, AsMemorySequence, GetEnumerator
  • pointer — ref T this[], GetPinnableReference()
  • ReadOnlySequence<T> — AsReadOnlySequence()
  • IReadOnlyList<Memory<T>> — AsMemoryList()
  • IReadOnlyList<ReadOnlyMemory<T>> — AsReadOnlyMemoryList()
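As one example of these conversions, the foreach route can be sketched like this: the array is walked as a sequence of Span<T> chunks, each no longer than the given chunk size. ChunkDemo is an illustrative name, and the AsSpanSequence(int chunkSize) signature is assumed from the library's README.

```csharp
using System;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class ChunkDemo
{
    // Hypothetical helper: sums every byte, visiting the array chunk by chunk.
    public static long Sum(NativeMemoryArray<byte> array, int chunkSize)
    {
        long total = 0;
        foreach (var chunk in array.AsSpanSequence(chunkSize))
        {
            // Each chunk is an ordinary Span<byte>, so normal span code applies.
            foreach (var b in chunk)
            {
                total += b;
            }
        }
        return total;
    }
}
```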

The IBufferWriter<T> can be passed directly to MessagePackSerializer.Serialize and similar APIs, or used in cases such as reading from a Stream, where memory is requested and filled chunk by chunk from the beginning.
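The Stream case can be sketched roughly as follows. StreamReaderDemo and ReadFromAsync are hypothetical names, and the sketch assumes the stream's content fits into the buffer. I have not spelled out what the writer does once the buffer is completely full.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class StreamReaderDemo
{
    // Hypothetical helper: fills `buffer` from the start with the content of `source`.
    // Assumption: the stream holds no more data than the buffer can take.
    public static async Task<long> ReadFromAsync(
        NativeMemoryArray<byte> buffer,
        Stream source,
        CancellationToken cancellationToken = default)
    {
        IBufferWriter<byte> writer = buffer.CreateBufferWriter();
        long total = 0;

        int read;
        // Ask the writer for the next free region, let the stream fill it,
        // then tell the writer how much was actually written.
        while ((read = await source.ReadAsync(writer.GetMemory(), cancellationToken)) != 0)
        {
            writer.Advance(read);
            total += read;
        }
        return total;
    }
}
```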

The ReadOnlySequence<T> can be passed directly to MessagePackSerializer.Deserialize, and SequenceReader is useful for processing large data in a streaming fashion.
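A minimal sketch of the SequenceReader side: the buffer is treated as a run of little-endian 32-bit integers and summed. SequenceDemo is a made-up name. SequenceReader<T> and TryReadLittleEndian come from System.Buffers, while AsReadOnlySequence() is the conversion listed above.

```csharp
using System;
using System.Buffers;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class SequenceDemo
{
    // Hypothetical helper: sums the buffer as consecutive little-endian Int32 values.
    public static long SumInt32LittleEndian(NativeMemoryArray<byte> array)
    {
        var reader = new SequenceReader<byte>(array.AsReadOnlySequence());
        long sum = 0;

        // TryReadLittleEndian handles values that straddle a segment boundary
        // and returns false once fewer than four bytes remain.
        while (reader.TryReadLittleEndian(out int value))
        {
            sum += value;
        }
        return sum;
    }
}
```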

AsMemoryList() and AsReadOnlyMemoryList() are convenient data structures for the RandomAccess.Read/Write API.
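On the write side, a gather write can be sketched like this. FileSaver and SaveAsync are hypothetical names, and AsReadOnlyMemoryList() is used as I understand it from the library's README.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Cysharp.Collections; // NuGet package: NativeMemoryArray

public static class FileSaver
{
    // Hypothetical helper: writes the whole buffer to `path` in one gather write.
    public static async Task SaveAsync(NativeMemoryArray<byte> buffer, string path)
    {
        using var handle = File.OpenHandle(path, FileMode.Create, FileAccess.Write,
            options: FileOptions.Asynchronous);

        // The write overload takes IReadOnlyList<ReadOnlyMemory<byte>>,
        // which is what AsReadOnlyMemoryList() returns.
        await RandomAccess.WriteAsync(handle, buffer.AsReadOnlyMemoryList(), 0);
    }
}
```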

Array.MaxLength

By the way, before .NET 6, the limit on the number of array elements was different for byte arrays (arrays of single-byte structures) and for other arrays. To quote the documentation for System.Array:

The array size is limited to a total of 4 billion elements, and to a maximum index of 0X7FEFFFFF in any given dimension (0X7FFFFFC7 for byte arrays and arrays of single-byte structures).

However, .NET 6 adds the Array.MaxLength property, which returns a single constant, 0X7FFFFFC7, regardless of element type.
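A quick way to check this on .NET 6 or later, using only the base class library (ArrayLimit is just an illustrative name):

```csharp
using System;

public static class ArrayLimit
{
    // How far Array.MaxLength falls short of int.MaxValue.
    public static int GapToIntMax() => int.MaxValue - Array.MaxLength;

    public static void Print()
    {
        Console.WriteLine(Array.MaxLength);               // 2147483591
        Console.WriteLine(Array.MaxLength == 0x7FFFFFC7); // True
        Console.WriteLine(GapToIntMax());                 // 56
    }
}
```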

This change was introduced and discussed in this dotnet/runtime PR (#43301).

Arrays have been around for a long time, so int Length is OK, but I kind of wish that the Length of Span<T> and Memory<T> were long.

According to a 2016 document on the design of Span’s API, there were several candidates, but the final choice was int Length, following arrays.

How about nuint for Span<T>/Memory<T>.Length, now that C# 9.0 has added nint and nuint, which did not exist at the time of the 2016 discussion?

In fact, when NativeMemoryArray was first developed, it used nuint Length, but nuint was hard to use in APIs like AsSpan(nuint start, nuint length): usually you want to pass an int or long as is. So I finally decided to unify everything on long.

Anyway, please give it a try. Even if you don’t need huge arrays, allocating in native memory puts less pressure on the managed heap, so you may be able to write programs that perform better.

Yoshifumi Kawai

a.k.a. neuecc. Creator of UniRx, UniTask, MessagePack for C#, MagicOnion etc. Microsoft MVP for C#. CEO/CTO of Cysharp Inc. Live and work in Tokyo, Japan.