Package org.apache.datasketches.memory

This package provides high-performance primitive and primitive-array access to direct (native) off-heap memory and memory-mapped file resources, as well as consistent views into ByteBuffers and on-heap primitive arrays. It can be used as a more comprehensive and flexible replacement for ByteBuffer.

In addition, this package provides:

  • Two different access APIs: read-only Memory and WritableMemory for absolute offset access, and read-only Buffer and WritableBuffer for relative positional access (similar to ByteBuffer).
  • Clean separation of Read-only API from Writable API, which makes writable versus read-only resources detectable at compile time.
  • Conversion from Writable to read-only is just an upcast (a plain assignment), so no additional objects are created. For example:
         WritableMemory wMem = ...
         Memory mem = wMem;
     
  • AutoCloseable for the external resources that require it, which enables compile-time checks for non-closed resources.
  • Immediate invalidation of all downstream references of an AutoCloseable resource when that resource is closed, either manually or by the JVM. This virtually eliminates the possibility of accidentally writing into the memory space previously owned by a closed resource.
  • Improved performance over the prior Memory implementation.
  • Cleaner internal architecture, which will make it easier to extend in the future.
  • No external dependencies, which makes it simple to install in virtually any Java environment.
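
The compile-time separation of read-only and writable access described in the list above can be sketched with two small interfaces. This is a minimal illustration only: ReadOnlyView, WritableView and HeapView are hypothetical names invented for this sketch, backed by a java.nio.ByteBuffer, and are not this package's Memory or WritableMemory types.

```java
import java.nio.ByteBuffer;

final class ViewSeparationSketch {

  /** Hypothetical read-only view: exposes getters only. */
  interface ReadOnlyView {
    long getLong(int offsetBytes);
  }

  /** Hypothetical writable view: inherits the getters and adds setters. */
  interface WritableView extends ReadOnlyView {
    void putLong(int offsetBytes, long value);
  }

  /** Hypothetical heap-backed implementation, using a ByteBuffer for storage. */
  static final class HeapView implements WritableView {
    private final ByteBuffer bytes;

    HeapView(int capacityBytes) {
      bytes = ByteBuffer.allocate(capacityBytes);
    }

    @Override
    public long getLong(int offsetBytes) {
      return bytes.getLong(offsetBytes);
    }

    @Override
    public void putLong(int offsetBytes, long value) {
      bytes.putLong(offsetBytes, value);
    }
  }

  /** A consumer that only reads declares the read-only type in its signature. */
  static long sumFirstTwo(ReadOnlyView view) {
    // view.putLong(0, 1L); // would not compile: ReadOnlyView has no putLong
    return view.getLong(0) + view.getLong(8);
  }

  static long demo() {
    WritableView writable = new HeapView(16);
    writable.putLong(0, 40L);
    writable.putLong(8, 2L);
    ReadOnlyView readOnly = writable; // plain assignment to the supertype, no new object
    return sumFirstTwo(readOnly);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Because the writable type extends the read-only type, handing a writable resource to read-only code needs no wrapper object, and an attempted write through the read-only type is rejected by the compiler rather than at run time.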

More specifically, this package provides access to four different types of resources using two different access APIs. These resources are contiguous blobs of bytes that provide at least byte-level read and write access. The four resources are:

  • Direct (a.k.a. Native) off-heap memory allocated by the user.
  • Memory-mapped files, both writable and read-only.
  • ByteBuffers, both heap-based and direct, writable and read-only.
  • Heap-based primitive arrays, which can be accessed as writable or read-only.

The two different access APIs are:

  • Memory, WritableMemory: Absolute offset addressing into a resource.
  • Buffer, WritableBuffer: Position relative addressing into a resource.

In addition, all combinations of access APIs and backing resources can be accessed via multibyte primitive methods (e.g. getLong(...), getLongArray(...), putLong(...), putLongArray(...)) as either ByteOrder.BIG_ENDIAN or ByteOrder.LITTLE_ENDIAN.
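
The difference between absolute and position-relative addressing, and the effect of byte order, can be seen with the JDK's own java.nio.ByteBuffer. The sketch below deliberately uses ByteBuffer as a stand-in so that it runs without this package on the classpath; the class and method names are made up for the illustration, and this package's own method signatures may differ.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class AddressingSketch {

  /** Absolute access: the caller supplies the byte offset and no position is involved. */
  static long absoluteRoundTrip(ByteOrder order) {
    ByteBuffer bytes = ByteBuffer.allocate(16).order(order);
    bytes.putLong(8, 0x0102030405060708L); // write at byte offset 8
    return bytes.getLong(8);               // read back from the same offset
  }

  /** Relative access: each call advances an internal position by the primitive's size. */
  static int positionAfterTwoLongs() {
    ByteBuffer bytes = ByteBuffer.allocate(16);
    bytes.putLong(1L); // position moves from 0 to 8
    bytes.putLong(2L); // position moves from 8 to 16
    return bytes.position();
  }

  /** Byte order changes how a multibyte value is laid out in the underlying bytes. */
  static byte firstStoredByte(ByteOrder order) {
    ByteBuffer bytes = ByteBuffer.allocate(8).order(order);
    bytes.putLong(0, 0x0102030405060708L);
    return bytes.get(0); // most significant byte for BIG_ENDIAN, least for LITTLE_ENDIAN
  }

  public static void main(String[] args) {
    System.out.println(positionAfterTwoLongs());
    System.out.println(firstStoredByte(ByteOrder.BIG_ENDIAN));
    System.out.println(firstStoredByte(ByteOrder.LITTLE_ENDIAN));
  }
}
```

A value written and read with the same byte order always round-trips; the order matters only when the bytes are shared with another reader, such as a file or a native library.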

The resources don't know or care about the access APIs, and the access APIs don't really know or care what resource they are accessing.

An access API is joined with a resource either with a static factory method or in combination with a Handle, which is used exclusively for resources that are external to the JVM, such as allocation of direct memory and memory-mapped files.

The role of a Handle is to hold onto the reference of a resource that is outside the control of the JVM. The resource is obtained from the handle with get().

When a handle is extended for an AutoCloseable resource and then joined with an access API it becomes an implementation handle. There are 3 implementation handles:

As long as the implementation handle is valid the JVM will not attempt to close the resource.

An implementation handle implements AutoCloseable, which also enables compile-time checks for non-closed resources. If a Handle is acquired in a try-with-resources (TWR) block, its associated resource will be automatically closed by the JVM at the end of the block. The resource can also be explicitly closed by the user by calling Handle.close().

     //Using try-with-resources block:
     File file = ...
     try (WritableMapHandle handle = WritableMemory.map(file)) {
       WritableMemory wMem = handle.get();
       doWork(wMem); // read and write to memory mapped file.
     }

     //Using explicit close():
     WritableMapHandle handle = WritableMemory.map(file);
     WritableMemory wMem = handle.get();
     doWork(wMem); // read and write to memory mapped file.
     handle.close();
 

Where it is desirable to pass ownership of the resource (and the close() responsibility), the TWR block cannot be used. Instead:

     WritableMapHandle handle = WritableMemory.map(file);
     doWorkAndClose(handle); //passes the handle to the object that closes the resource.
 

Whatever part of your process is responsible for allocating a resource external to the JVM must be responsible for closing it or making sure it gets closed. Since only the implementation Handles implement AutoCloseable, you must not let go of the handle reference until you are done with its associated resource.

As mentioned above, there are two ways to do this:

  • Use a try-with-resources block. At the end of the block, the JVM will automatically close the resource.
  • If you need to pass an external resource, pass the implementation resource handle, not the access API. This means you are also passing the responsibility to close the resource. If different parts of your code hold references to the same handle, whichever one closes it first will invalidate the resource for all the others, so be careful. As long as at least one reference to the handle is still held and the resource has not been closed, the resource will remain valid. If you drop all references to all handles, the JVM will eventually close the resource, making it invalid, but it is possible that you might run out of memory first. Depending on this is a bad idea and could be a serious, hard-to-find bug.
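
The relationship between a handle, the view obtained from it, and invalidation on close can be sketched as follows. This is an illustration under stated assumptions, not this package's implementation: Handle and View here are hypothetical classes backed by a heap array, and the shared validity flag is simply one way such invalidation could be arranged.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class HandleSketch {

  /** Hypothetical view whose validity is tied to a flag owned by its handle. */
  static final class View {
    private final long[] data;
    private final AtomicBoolean valid;

    View(long[] data, AtomicBoolean valid) {
      this.data = data;
      this.valid = valid;
    }

    long get(int index) {
      checkValid();
      return data[index];
    }

    void put(int index, long value) {
      checkValid();
      data[index] = value;
    }

    private void checkValid() {
      if (!valid.get()) {
        throw new IllegalStateException("resource has been closed");
      }
    }
  }

  /** Hypothetical handle: owns the resource and invalidates every view on close. */
  static final class Handle implements AutoCloseable {
    private final AtomicBoolean valid = new AtomicBoolean(true);
    private final View view;

    Handle(int capacityLongs) {
      view = new View(new long[capacityLongs], valid);
    }

    View get() {
      return view;
    }

    @Override
    public void close() {
      valid.set(false); // every view sharing this flag becomes unusable
    }
  }

  /** Leaks a view out of a try-with-resources block; the handle is closed on exit. */
  static View writeInsideTwr() {
    try (Handle handle = new Handle(4)) {
      View view = handle.get();
      view.put(0, 7L); // valid while the handle is open
      return view;
    }
  }

  /** Returns true if touching the leaked view after close is rejected. */
  static boolean useAfterCloseIsRejected() {
    View stale = writeInsideTwr();
    try {
      stale.get(0);
      return false;
    } catch (IllegalStateException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(useAfterCloseIsRejected());
  }
}
```

The sketch also shows why the handle, not the view, is the thing to pass when ownership moves: only the handle can close the resource, and closing it affects every holder of a view.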

Moving back and forth between Memory and Buffer:

    Memory mem = ...
    Buffer buf = mem.asBuffer();
    ...
    Memory mem2 = buf.asMemory();
    ...
 

Hierarchical memory regions can be easily created:

     WritableMemory wMem = ...
     WritableMemory wReg = wMem.writableRegion(offset, length); //OR
     Memory reg = wMem.region(offset, length);
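
The idea that a region is a view sharing bytes with its parent, with offsets measured from the start of the region, can be demonstrated with a JDK-only analog. The sketch assumes java.nio.ByteBuffer as a stand-in, and its region helper is a hypothetical function written for this illustration, not this package's region(...) or writableRegion(...) method.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

final class RegionSketch {

  /** Returns a view of bytes [offset, offset + length) that shares the parent's storage. */
  static ByteBuffer region(ByteBuffer parent, int offset, int length) {
    ByteBuffer dup = parent.duplicate(); // leaves the parent's position and limit untouched
    dup.limit(offset + length);
    dup.position(offset);
    return dup.slice(); // offset 0 of the slice is 'offset' in the parent
  }

  /** Writes through a region and reads the value back through the parent. */
  static long writeThroughRegion() {
    ByteBuffer parent = ByteBuffer.allocate(32);
    ByteBuffer reg = region(parent, 8, 16);
    reg.putLong(0, 99L);      // offset 0 of the region
    return parent.getLong(8); // is offset 8 of the parent
  }

  /** Returns true if a read-only region rejects a write. */
  static boolean readOnlyRegionRejectsWrites() {
    ByteBuffer parent = ByteBuffer.allocate(32);
    ByteBuffer reg = region(parent, 8, 16).asReadOnlyBuffer();
    try {
      reg.putLong(0, 1L);
      return false;
    } catch (ReadOnlyBufferException expected) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(writeThroughRegion());
    System.out.println(readOnlyRegionRejectsWrites());
  }
}
```

Note one difference from the design described in this document: with ByteBuffer a write to a read-only view fails only at run time, whereas separate read-only and writable types allow the compiler to reject it.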
 

With asserts enabled in the JVM, all methods are checked for bounds and use-after-close violations.

Author:
Lee Rhodes