TieredMemDB is a Redis branch that takes full advantage of both DRAM and Intel Optane Persistent Memory (PMEM). It is fully compatible with Redis and supports all of its data structures and features. The main idea is to use the large capacity of PMEM to store user data and the speed of DRAM for latency-sensitive structures. You can also define a DRAM to PMEM ratio, which the application monitors and maintains automatically. This lets you adapt memory utilization to your hardware configuration.
The source code of TieredMemDB can be found in this GitHub repository.
TieredMemDB requires Linux kernel 5.1 or higher. This version introduced the KMEM DAX feature, which is used to expose a PMEM device as system RAM.
Information on how to configure KMEM DAX is available in this blog post.
For automatic recognition of the KMEM DAX NUMA node, libdaxctl-devel (v66 or later) must be installed on your system.
TieredMemDB sources are available on GitHub.
Libmemkind is used as a submodule, so it needs to be initialized with:
% git submodule init
% git submodule update
Building with Memkind as an allocator is done by:
% make MALLOC=memkind
The application has additional configuration parameters related to the handling of the two types of memory.
The main one is the Memory Allocation Policy. It changes how heap memory is allocated by zmalloc() function calls: allocations can target DRAM, Persistent Memory, or both. In general, bigger allocations should be stored in Persistent Memory, which provides a higher capacity, while smaller and more frequently used allocations should be stored in DRAM.
Memory Allocation Policy is defined in redis.conf or via command line argument:
Parameters used when THRESHOLD policy is selected:
When the THRESHOLD policy is selected, the application checks the static-threshold parameter. Allocations smaller than this threshold go to DRAM. Allocations equal to or bigger than this threshold go to Persistent Memory. The parameter can be modified while the application is running using CONFIG SET.
Minimum allocation size, measured in bytes, that goes to Persistent Memory
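The threshold rule described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the actual zmalloc() code path; the names MemoryKind and choose_memory and the threshold value of 64 bytes are hypothetical.

```python
from enum import Enum


class MemoryKind(Enum):
    DRAM = "dram"
    PMEM = "pmem"


def choose_memory(size_bytes: int, static_threshold: int) -> MemoryKind:
    """Pick a memory kind for one allocation (hypothetical helper).

    Sizes below the threshold go to DRAM; sizes equal to or above it
    go to Persistent Memory, as the THRESHOLD policy text describes.
    """
    if size_bytes < 0:
        raise ValueError("allocation size must be non-negative")
    if size_bytes < static_threshold:
        return MemoryKind.DRAM
    return MemoryKind.PMEM


# Illustrative threshold only; not a documented default.
small = choose_memory(63, static_threshold=64)
large = choose_memory(64, static_threshold=64)
```

Note that the boundary case is inclusive on the PMEM side: an allocation of exactly static-threshold bytes is placed in Persistent Memory.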
Parameters used when RATIO policy is selected:
The application allocates part of the data in DRAM and part in Persistent Memory, based on the value of an internal dynamic threshold and dram-pmem-ratio. It monitors DRAM and Persistent Memory utilization and increases or decreases the internal dynamic threshold to achieve the expected dram-pmem-ratio.
The syntax of dram-pmem-ratio directive is the following:
Expected proportion of memory placement between DRAM and Persistent Memory. The real DRAM:PMEM ratio depends on the workload and its variability. dram_value and pmem_value are values from the range <1,INT_MAX>
Place 25% of all memory in DRAM and 75% in Persistent Memory
dram-pmem-ratio 1 3
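As a quick check of the arithmetic, the two values of the directive can be converted into target shares of total memory. The sketch below is an illustration in Python; target_shares is a hypothetical helper, and INT_MAX is assumed to be the 32-bit signed maximum, which may differ on your platform.

```python
INT_MAX = 2**31 - 1  # assumption: 32-bit signed int


def target_shares(dram_value: int, pmem_value: int) -> tuple[float, float]:
    """Convert 'dram-pmem-ratio <dram_value> <pmem_value>' into fractions.

    Returns (dram_share, pmem_share), which always sum to 1.
    """
    for value in (dram_value, pmem_value):
        if not 1 <= value <= INT_MAX:
            raise ValueError("ratio values must be in the range <1,INT_MAX>")
    total = dram_value + pmem_value
    return dram_value / total, pmem_value / total


# dram-pmem-ratio 1 3: one part DRAM, three parts PMEM
dram_share, pmem_share = target_shares(1, 3)  # (0.25, 0.75)
```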
The internal dynamic threshold has a lower limit defined by dynamic-threshold-min and an upper limit defined by dynamic-threshold-max. Both should be adapted to the size of the objects handled by the database.
Initial value of dynamic threshold
Minimum value of dynamic threshold. A lower value for this parameter may be required when the database handles a large number of small objects and the target ratio is set to allocate mainly from PMEM, e.g. when the majority of objects are smaller than 64 bytes and the DRAM:PMEM ratio is set to 1:8.
Maximum value of dynamic threshold
An additional parameter determines how often the application should monitor the current ratio and adjust the internal dynamic threshold.
DRAM/PMEM ratio period measured in milliseconds
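The periodic adjustment can be pictured as a simple feedback step that runs once per monitoring period. The sketch below is an assumption-laden illustration in Python, not the algorithm TieredMemDB actually implements: the fixed additive step, the function name adjust_threshold, and all numeric values are hypothetical. It only shows the direction of the correction and the clamping to dynamic-threshold-min and dynamic-threshold-max.

```python
def adjust_threshold(threshold: int, used_dram: int, used_pmem: int,
                     dram_value: int, pmem_value: int,
                     threshold_min: int, threshold_max: int,
                     step: int = 8) -> int:
    """Return the dynamic threshold for the next period (hypothetical rule).

    Allocations smaller than the threshold go to DRAM, so lowering the
    threshold shifts allocations to PMEM and raising it shifts them to DRAM.
    """
    # Cross-multiply to compare used_dram/used_pmem with dram_value/pmem_value
    # without dividing (used_pmem may be zero).
    actual = used_dram * pmem_value
    target = used_pmem * dram_value
    if actual > target:
        threshold -= step  # too much DRAM in use: send more to PMEM
    elif actual < target:
        threshold += step  # too little DRAM in use: keep more in DRAM
    # Keep the result inside the configured limits.
    return max(threshold_min, min(threshold_max, threshold))


# Illustrative values: DRAM is over-used for a 1:3 target, so the
# threshold drops from 64 to 56.
new_threshold = adjust_threshold(64, used_dram=900, used_pmem=100,
                                 dram_value=1, pmem_value=3,
                                 threshold_min=24, threshold_max=10000)
```

The clamp is what makes the minimum and maximum limits matter: once the threshold reaches a limit, the measured ratio can no longer be corrected in that direction, which is why the limits should match the object sizes in the database.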
When the Memory Allocation Policy is set to a value other than dram-only, some memory will be allocated on one or more PMEM nodes. The Persistent Memory Variant specifies how the memory allocator chooses a PMEM NUMA node and what happens when that node runs out of free space.
Persistent Memory Variant is defined in redis.conf or via command line argument:
If there is no free memory left in the swap space, an “Out of Memory” error occurs, regardless of the Persistent Memory Variant value. By default, TieredMemDB uses the “single” variant.
Use all available PMEM NUMA nodes
The main hashtable may be one large allocation. When the database consists mainly of a large number of small keys and values, the allocation of the main hashtable significantly affects the PMEM/DRAM ratio. On the other hand, the hashtable is read and written very often, which is a reason to place it in DRAM. The following option lets you choose where to allocate the hashtable, depending on your needs.
Keep hashtable structure on DRAM or on PMEM
With the INFO MEMORY command you can monitor the parameters that the application uses to maintain the DRAM/PMEM ratio. The following parameters have been added:
127.0.0.1:6379> info memory
# Memory
…
pmem_threshold:30
used_memory_dram:865144
used_memory_dram_human:844.87K
used_memory_pmem:41216
used_memory_pmem_human:40.25K
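To compare the reported usage with the configured ratio, the added fields can be parsed from the INFO MEMORY text. The sketch below is a minimal Python illustration that works on a copy of the sample output above; parse_info is a hypothetical helper, and in practice the text would come from a Redis client of your choice.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse 'key:value' lines of INFO output into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, section headers and anything without a colon.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


sample = """pmem_threshold:30
used_memory_dram:865144
used_memory_dram_human:844.87K
used_memory_pmem:41216
used_memory_pmem_human:40.25K"""

fields = parse_info(sample)
dram = int(fields["used_memory_dram"])
pmem = int(fields["used_memory_pmem"])
# Fraction of tracked memory that currently resides in DRAM.
dram_share = dram / (dram + pmem)
print(f"threshold={fields['pmem_threshold']} DRAM share={dram_share:.1%}")
```

In this sample almost all memory is still in DRAM, which is what you would expect from a nearly empty instance where few allocations reach the threshold.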
The Memkind allocator is crucial for the TieredMemDB application. It is an extensible heap manager built on top of jemalloc that enables control of memory characteristics and partitioning of the heap between kinds of memory. Memkind supports a kind of memory based on the KMEM DAX mechanism, in which memory from the device is exposed as an additional NUMA node.
Memkind is a general-purpose allocator, but for an application like Redis it is also tuned by passing specific parameters during the “configure” step: