NativeMemoryArray — A library that takes full advantage of the .NET 6 API to handle huge data of over 2GB
There is one major limitation in C# when dealing with arrays, especially byte arrays: the upper limit of a one-dimensional array is 0x7FFFFFC7 (2,147,483,591), which is slightly smaller than int.MaxValue.
This limit differs between .NET 6 and other target frameworks; we will discuss that in detail later.
This ceiling of roughly 2GB comes from Length being an int. Nowadays, however, we often deal with very large data, such as 4K/8K video, large data sets for deep learning, and huge point clouds from 3D scanning, so the 2GB constraint is a real obstacle. The constraint does not go away even if Span<T> or Memory<T> is used (because their Length is also an int).
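To make the arithmetic concrete, here is a minimal sketch that counts how many int-sized views are needed to cover a buffer whose length is a long. The ChunkMath class and its method name are hypothetical, introduced only for illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of any library.
public static class ChunkMath
{
    // Number of views of at most `maxViewLength` elements needed
    // to cover `totalLength` elements.
    public static long ViewsNeeded(long totalLength, int maxViewLength = int.MaxValue)
    {
        if (totalLength < 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
        if (maxViewLength <= 0) throw new ArgumentOutOfRangeException(nameof(maxViewLength));

        // Ceiling division, written so it cannot overflow near long.MaxValue.
        return totalLength / maxViewLength + (totalLength % maxViewLength == 0 ? 0 : 1);
    }
}
```

For example, ChunkMath.ViewsNeeded(8L * 1024 * 1024 * 1024) is 5: four full int.MaxValue-sized views still leave 4 bytes uncovered.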
When you run into this limit, you can either switch to streaming processing or use pointers, but neither is easy to work with, and they do not necessarily replace the operations you used to do in memory.
Therefore, I’ve created an array backed by native memory, which exceeds the 2GB limit and integrates closely with newer APIs such as IBufferWriter<T>, ReadOnlySequence<T>, RandomAccess.Write/Read, System.IO.Pipelines, etc.
GitHub - Cysharp/NativeMemoryArray: Utilized native-memory backed array for .NET and Unity - over…
.NET 6 is recommended, but the library supports .NET Standard 2.0 and later, so it also works with the Unity game engine.
You can load huge data via .NET 6's new RandomAccess API.
You can also work with a Stream via IBufferWriter<T> or IEnumerable<Memory<T>>.
You can also extract and process slices as Span<T>, or use ref T this[long index] for indexer access and pointer extraction.
Of course, .NET Standard 2.0/2.1 is also supported, and it also works with Unity Game Engine.
NativeMemoryArray provides only a single, simple class: Cysharp.Collections.NativeMemoryArray<T>. It has a where T : unmanaged constraint, so you can only use structs that do not contain reference types.
The difference from Span<T> is that NativeMemoryArray<T> itself is a class, so it can be placed in a field. This means that, unlike Span<T>, it can be given a long lifetime. Since you can take a slice as Memory<T>, you can also pass it into async methods. Also, the length of Span<T> is limited to int.MaxValue (roughly 2GB), whereas NativeMemoryArray<T> can be larger than that.
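To illustrate the idea of a class that owns native memory and exposes a long-based indexer, here is a deliberately tiny sketch. TinyNativeBuffer is a hypothetical type written for this example; it is byte-only, has no finalizer, and is not how NativeMemoryArray<T> is actually implemented.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical illustration type; not part of NativeMemoryArray.
public sealed class TinyNativeBuffer : IDisposable
{
    private nint _pointer;          // start of the native allocation
    public long Length { get; }     // long, so it is not capped at int.MaxValue

    public TinyNativeBuffer(long length)
    {
        if (length < 0) throw new ArgumentOutOfRangeException(nameof(length));
        Length = length;
        // Allocated outside the managed heap; contents are NOT zero-initialized.
        _pointer = Marshal.AllocHGlobal((nint)length);
    }

    public byte this[long index]
    {
        get => Marshal.ReadByte(Address(index));
        set => Marshal.WriteByte(Address(index), value);
    }

    private nint Address(long index)
    {
        if (_pointer == 0) throw new ObjectDisposedException(nameof(TinyNativeBuffer));
        if ((ulong)index >= (ulong)Length) throw new IndexOutOfRangeException();
        return _pointer + (nint)index;
    }

    public void Dispose()
    {
        if (_pointer != 0)
        {
            Marshal.FreeHGlobal(_pointer);
            _pointer = 0;
        }
    }
}
```

Because it is an ordinary class, an instance can be stored in a field or captured by an async method, which a Span<T> cannot.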
The main advantages are as follows:
- Allocates from native memory, so it does not use the C# heap.
- There is no 2GB limit; any length can be allocated as long as memory allows.
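- Can pass directly via IBufferWriter<T> to serializers such as MessagePackSerializer.Serialize.
- Can pass directly via ReadOnlySequence<T> to MessagePackSerializer.Deserialize and similar APIs.
- Can pass huge data directly via IReadOnlyList<Memory<T>> or IReadOnlyList<ReadOnlyMemory<T>> to RandomAccess.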
A key feature is that it integrates seamlessly with other APIs through versatile conversions:
- Slice — AsSpan<T>(), AsMemory<T>()
- IBufferWriter<T> — CreateBufferWriter()
- foreach — AsSpanSequence, AsMemorySequence, GetEnumerator
- pointer — ref T this[long index], GetPinnableReference()
- ReadOnlySequence<T> — AsReadOnlySequence()
- IReadOnlyList<Memory<T>> — AsMemoryList()
- IReadOnlyList<ReadOnlyMemory<T>> — AsReadOnlyMemoryList()
IBufferWriter<T> can be passed directly to MessagePackSerializer.Serialize and similar APIs, or used in cases such as reading from a Stream, where data is retrieved and written chunk by chunk from the beginning.
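The chunk-by-chunk pattern can be sketched against the IBufferWriter<byte> interface alone. The BufferWriterDemo class below is hypothetical. Any writer can be passed in, and the assumption is that a writer obtained from NativeMemoryArray would be used the same way.

```csharp
using System;
using System.Buffers;
using System.IO;

// Hypothetical helper for illustration.
public static class BufferWriterDemo
{
    // Copies everything from `stream` into `writer`, one chunk at a time,
    // and returns the number of bytes copied.
    public static long ReadAll(Stream stream, IBufferWriter<byte> writer)
    {
        long total = 0;
        int read;
        // GetSpan() asks the writer for its next writable region;
        // Advance() tells it how much of that region was actually filled.
        while ((read = stream.Read(writer.GetSpan())) != 0)
        {
            writer.Advance(read);
            total += read;
        }
        return total;
    }
}
```

With the BCL's ArrayBufferWriter<byte> standing in as the destination, BufferWriterDemo.ReadAll(stream, writer) leaves the copied bytes available in writer.WrittenSpan.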
ReadOnlySequence<T> can be passed directly to MessagePackSerializer.Deserialize, and SequenceReader is useful for processing large data via streaming.
IReadOnlyList<Memory<T>> and IReadOnlyList<ReadOnlyMemory<T>> are convenient data structures for the RandomAccess Read/Write APIs.
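As a rough sketch of the RandomAccess side, the .NET 6 overloads below accept a list of buffers and write or fill them in order. ScatterGatherDemo is a hypothetical class, and plain byte arrays stand in for the chunk lists that would come from a native-memory-backed array.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.Win32.SafeHandles;

// Hypothetical helper for illustration; requires .NET 6 or later.
public static class ScatterGatherDemo
{
    // Writes all chunks back to back, starting at file offset 0.
    public static void WriteChunks(string path, IReadOnlyList<ReadOnlyMemory<byte>> chunks)
    {
        using SafeFileHandle handle = File.OpenHandle(path, FileMode.Create, FileAccess.Write);
        RandomAccess.Write(handle, chunks, 0);
    }

    // Fills the chunks in order from file offset 0 and returns the bytes read.
    public static long ReadChunks(string path, IReadOnlyList<Memory<byte>> chunks)
    {
        using SafeFileHandle handle = File.OpenHandle(path, FileMode.Open, FileAccess.Read);
        return RandomAccess.Read(handle, chunks, 0);
    }
}
```

Because each chunk is at most int.MaxValue bytes but the list can hold many chunks, one call can cover far more than 2GB.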
By the way, before .NET 6, the limit on the number of array elements differed between byte arrays (arrays of 1-byte structures) and other arrays. To quote the documentation for System.Array:
The array size is limited to a total of 4 billion elements, and to a maximum index of 0X7FEFFFFF in any given dimension (0X7FFFFFC7 for byte arrays and arrays of single-byte structures).
However, .NET 6 adds the Array.MaxLength property, which returns a single constant: 0X7FFFFFC7.
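A quick way to check this on your own runtime is shown below. The ArrayLimit wrapper class is hypothetical; only Array.MaxLength itself is a .NET 6 API.

```csharp
using System;

// Hypothetical wrapper for illustration; requires .NET 6 or later.
public static class ArrayLimit
{
    // Maximum number of elements an array may contain.
    public static int Max => Array.MaxLength;

    // How far that limit falls short of int.MaxValue.
    public static int GapToIntMax => int.MaxValue - Array.MaxLength;
}
```

With the 0X7FFFFFC7 value quoted above, ArrayLimit.Max is 2,147,483,591 and ArrayLimit.GapToIntMax is 56.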
This change was introduced and discussed in dotnet/runtime PR #43301.
Arrays have been around for a long time, so an int Length is understandable for them, but I do wish that the Length of Span<T> and Memory<T> were long.
According to the 2016 documentation on how to design Span’s API, there were several candidates, but the result was an int Length, following the array.
How about a nuint Length for Span<T>/Memory<T>, now that nint and nuint have been added in C# 9.0? They did not exist at the time of the 2016 discussion.
However, NativeMemoryArray was first developed with a nuint Length, and nuint turned out to be hard to use in APIs like AsSpan(nuint start, nuint length), because callers usually want to pass an int or long as is. So I finally decided to standardize on long.
Anyway, please give it a try. Even if you don’t need huge arrays, allocating in native memory puts less pressure on the heap, so you may be able to create programs with better performance.