A Simple Cache System in Rust

In this article, we will see how to build a basic yet efficient Least Recently Used (LRU) cache system in Rust, complete with an eviction…

A Simple Cache System in Rust

In this article, we will see how to build a basic yet efficient Least Recently Used (LRU) cache system in Rust, complete with an eviction policy and customizable eviction callbacks.

Starting with the foundational concepts of caching and the significance of the LRU strategy, we progressively enhance a basic cache implementation.

We introduce advanced features like efficient tracking of item usage, concurrency support, and the integration of eviction callbacks to handle resource cleanup upon item removal.

Let’s dive deep in! 🦀

The Basics

Implementing a basic cache system in Rust involves creating a struct to represent the cache, which typically includes a hashmap for storing key-value pairs and possibly some mechanism for tracking the age or usage of items for eviction purposes (e.g., for a Least Recently Used (LRU) cache).

For simplicity, I’ll show you how to implement a basic cache system without any eviction policy. This cache will store key-value pairs and won’t automatically remove old or less frequently used items.

use std::collections::HashMap; 
use std::hash::Hash; 
 
// Define the Cache struct with a generic type that must implement the Eq and Hash traits. 
pub struct Cache<K, V> where K: Eq + Hash { 
    // The underlying storage for the cache. 
    storage: HashMap<K, V>, 
} 
 
impl<K, V> Cache<K, V> where K: Eq + Hash + Clone { 
    // Create a new, empty cache. 
    pub fn new() -> Self { 
        Cache { 
            storage: HashMap::new(), 
        } 
    } 
    
    // Insert a key-value pair into the cache. 
    pub fn set(&mut self, key: K, value: V) { 
        self.storage.insert(key, value); 
    } 
     
    // Retrieve a value from the cache by key, returning an Option. 
    pub fn get(&self, key: &K) -> Option<&V> { 
        self.storage.get(key) 
    } 
} 
 
fn main() { 
    // Example usage of the cache. 
    let mut cache = Cache::new(); 
 
    // Insert some key-value pairs. 
    cache.set("key1", "value1"); 
    cache.set("key2", "value2"); 
 
    // Retrieve and print a value. 
    if let Some(value) = cache.get(&"key1") { 
        println!("Found value: {}", value); 
    } else { 
        println!("Value not found"); 
    } 
 
    // Try to retrieve a value that doesn't exist. 
    match cache.get(&"key3") { 
        Some(value) => println!("Found value: {}", value), 
        None => println!("Value not found"), 
    } 
}

This code defines a Cache struct that uses a HashMap for storage. The Cache struct has methods for creating a new cache (new), inserting a key-value pair (set), and retrieving a value by key (get). The main function demonstrates how to use the cache to store and retrieve values.

Implementing an eviction policy

To implement a cache system with an eviction policy in Rust, we’ll use the Least Recently Used (LRU) policy as an example. An LRU cache evicts the least recently accessed item when the cache reaches its capacity to make room for new items.

We’ll enhance the previous example by adding the capacity for the cache and tracking the usage of items using a combination of a HashMap and a VecDeque (a double-ended queue). The VecDeque will track the order of item usage, with the most recently used items at the front and the least recently used items at the back.

use std::collections::{HashMap, VecDeque}; 
use std::hash::Hash; 
 
pub struct LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    capacity: usize, 
    storage: HashMap<K, V>, 
    usage_order: VecDeque<K>, 
} 
 
impl<K, V> LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    pub fn new(capacity: usize) -> Self { 
        assert!(capacity > 0, "Cache capacity must be greater than 0"); 
        LRUCache { 
            capacity, 
            storage: HashMap::new(), 
            usage_order: VecDeque::new(), 
        } 
    } 
 
    pub fn set(&mut self, key: K, value: V) { 
        // Insert or update the value for the key. 
        self.storage.insert(key.clone(), value); 
        // Move this key to the front of the usage order to mark it as recently used. 
        self.update_usage(&key); 
        // If the cache exceeds its capacity, remove the least recently used item. 
        if self.storage.len() > self.capacity { 
            if let Some(least_recently_used) = self.usage_order.pop_back() { 
                self.storage.remove(&least_recently_used); 
            } 
        } 
    } 
 
    pub fn get(&mut self, key: &K) -> Option<&V> { 
        if self.storage.contains_key(key) { 
            // Move this key to the front of the usage order. 
            self.update_usage(key); 
            self.storage.get(key) 
        } else { 
            None 
        } 
    } 
 
    fn update_usage(&mut self, key: &K) { 
        // Remove the key if it already exists in the usage_order. 
        self.usage_order.retain(|existing_key| existing_key != key); 
        // Insert the key at the front to mark it as recently used. 
        self.usage_order.push_front(key.clone()); 
    } 
} 
 
fn main() { 
    // Example usage of the LRU cache. 
    let mut cache = LRUCache::new(2); 
    cache.set("key1", "value1"); 
    cache.set("key2", "value2"); 
    println!("Retrieved: {:?}", cache.get(&"key1")); // Should update key1's position 
    cache.set("key3", "value3"); // This should evict key2 
    // Trying to retrieve key2 should yield None since it was evicted. 
    match cache.get(&"key2") { 
        Some(value) => println!("Retrieved: {:?}", value), 
        None => println!("Key2 was evicted"), 
    } 
}

This implementation defines an LRUCache struct that includes the cache's capacity, a HashMap for storage, and a VecDeque to track the usage order of the keys. The set method inserts or updates a key-value pair and ensures the cache does not exceed its specified capacity by evicting the least recently used item if necessary. The get method retrieves a value by its key and updates the usage order to reflect the recent access. The update_usage method manages the position of keys in the VecDeque to maintain the correct usage order.

This is a basic implementation and might need optimizations or modifications based on specific requirements, such as thread safety in concurrent environments.

Here’s a more advanced example implementing some of these concepts, focusing on the doubly linked list for efficient tracking:

use std::collections::HashMap; 
use std::rc::{Rc, Weak}; 
use std::cell::RefCell; 
 
struct Node<K, V> { 
    key: K, 
    value: V, 
    prev: Option<Weak<RefCell<Node<K, V>>>>, 
    next: Option<Rc<RefCell<Node<K, V>>>>, 
} 
impl<K, V> Node<K, V> { 
    fn new(key: K, value: V) -> Rc<RefCell<Self>> { 
        Rc::new(RefCell::new(Node { 
            key, 
            value, 
            prev: None, 
            next: None, 
        })) 
    } 
} 
 
pub struct LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    capacity: usize, 
    map: HashMap<K, Rc<RefCell<Node<K, V>>>>, 
    head: Option<Rc<RefCell<Node<K, V>>>>, 
    tail: Option<Rc<RefCell<Node<K, V>>>>, 
} 
impl<K, V> LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    pub fn new(capacity: usize) -> Self { 
        assert!(capacity > 0, "Cache capacity must be greater than 0"); 
        LRUCache { 
            capacity, 
            map: HashMap::new(), 
            head: None, 
            tail: None, 
        } 
    } 
 
    pub fn set(&mut self, key: K, value: V) { 
        let node = if let Some(node) = self.map.get(&key) { 
            let mut node_ref = node.borrow_mut(); 
            node_ref.value = value; 
            node.clone() 
        } else { 
            let new_node = Node::new(key.clone(), value); 
            self.map.insert(key, new_node.clone()); 
            new_node 
        }; 
        self.attach(node); 
         
        if self.map.len() > self.capacity { 
            if let Some(tail) = self.tail.take() { 
                let tail_key = &tail.borrow().key; 
                self.map.remove(tail_key); 
                self.tail = tail.borrow_mut().prev.take().and_then(|prev_weak| { 
                    let prev = prev_weak.upgrade().unwrap(); 
                    prev.borrow_mut().next = None; 
                    Some(prev) 
                }); 
            } 
        } 
    } 
 
    pub fn get(&mut self, key: &K) -> Option<V> { 
        self.map.get(key).and_then(|node| { 
            let val = node.borrow().value.clone(); 
            self.attach(node.clone()); 
            Some(val) 
        }) 
    } 
 
    fn attach(&mut self, node: Rc<RefCell<Node<K, V>>>) { 
        if let Some(ref head) = self.head { 
            head.borrow_mut().prev = Some(Rc::downgrade(&node)); 
        } 
        node.borrow_mut().next = self.head.take(); 
        self.head = Some(node.clone()); 
        if self.tail.is_none() { 
            self.tail = Some(node); 
        } 
    } 
} 
 
fn main() { 
    let mut cache = LRUCache::new(2); 
    cache.set("key1", "value1"); 
    cache.set("key2", "value2"); 
    println!("Retrieved: {:?}", cache.get(&"key1")); // Access key1, making it most recently used 
    cache.set("key3", "value3"); // Evicts key2 as it is the least recently used 
    match cache.get(&"key2") { 
        Some(value) => println!("Retrieved: {:?}", value), 
        None => println!("Key2 was evicted"), // Expected outcome 
    } 
}

This advanced example introduces a custom doubly linked list for tracking the usage order, significantly improving efficiency over the VecDeque approach. Each node in the list contains a key-value pair, and the list maintains pointers to the most and least recently used items. When an item is accessed or added, it's moved to the front of the list. If the cache exceeds its capacity, the item at the end of the list (the least recently used item) is removed.

Incorporating eviction callbacks

To incorporate eviction callbacks into the LRU cache implementation, you can modify the Node structure to include an optional callback function that gets called when a node is evicted. In Rust, you can use Box<dyn Fn()> to store a closure that takes no parameters and returns nothing, which is suitable for our eviction callback.

Here’s an updated version of the LRU cache that includes eviction callbacks:

use std::collections::HashMap; 
use std::rc::{Rc, Weak}; 
use std::cell::RefCell; 
 
struct Node<K, V> { 
    key: K, 
    value: V, 
    prev: Option<Weak<RefCell<Node<K, V>>>>, 
    next: Option<Rc<RefCell<Node<K, V>>>>, 
    on_evict: Option<Box<dyn Fn()>>, // Eviction callback 
} 
impl<K, V> Node<K, V> { 
    fn new(key: K, value: V, on_evict: Option<Box<dyn Fn()>>) -> Rc<RefCell<Self>> { 
        Rc::new(RefCell::new(Node { 
            key, 
            value, 
            prev: None, 
            next: None, 
            on_evict, 
        })) 
    } 
} 
 
pub struct LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    capacity: usize, 
    map: HashMap<K, Rc<RefCell<Node<K, V>>>>, 
    head: Option<Rc<RefCell<Node<K, V>>>>, 
    tail: Option<Rc<RefCell<Node<K, V>>>>, 
} 
impl<K, V> LRUCache<K, V> 
where 
    K: Eq + Hash + Clone, 
{ 
    pub fn new(capacity: usize) -> Self { 
        assert!(capacity > 0, "Cache capacity must be greater than 0"); 
        LRUCache { 
            capacity, 
            map: HashMap::new(), 
            head: None, 
            tail: None, 
        } 
    } 
 
    pub fn set(&mut self, key: K, value: V, on_evict: Option<Box<dyn Fn()>>) { 
        let node = if let Some(node) = self.map.get(&key) { 
            let mut node_ref = node.borrow_mut(); 
            node_ref.value = value; 
            node.clone() 
        } else { 
            let new_node = Node::new(key.clone(), value, on_evict); 
            self.map.insert(key, new_node.clone()); 
            new_node 
        }; 
        self.attach(node); 
         
        if self.map.len() > self.capacity { 
            if let Some(tail) = self.tail.take() { 
                let tail_key = &tail.borrow().key; 
                if let Some(callback) = &tail.borrow().on_evict { 
                    callback(); // Call the eviction callback 
                } 
                self.map.remove(tail_key); 
                self.tail = tail.borrow_mut().prev.take().and_then(|prev_weak| { 
                    let prev = prev_weak.upgrade().unwrap(); 
                    prev.borrow_mut().next = None; 
                    Some(prev) 
                }); 
            } 
        } 
    } 
 
    pub fn get(&mut self, key: &K) -> Option<V> { 
        self.map.get(key).and_then(|node| { 
            let val = node.borrow().value.clone(); 
            self.attach(node.clone()); 
            Some(val) 
        }) 
    } 
 
    fn attach(&mut self, node: Rc<RefCell<Node<K, V>>>) { 
        if let Some(ref head) = self.head { 
            head.borrow_mut().prev = Some(Rc::downgrade(&node)); 
        } 
        node.borrow_mut().next = self.head.take(); 
        self.head = Some(node.clone()); 
        if self.tail.is_none() { 
            self.tail = Some(node); 
        } 
    } 
} 
 
fn main() { 
    let mut cache = LRUCache::new(2); 
    // Set with eviction callback 
    cache.set("key1", "value1", Some(Box::new(|| println!("Evicting key1")))); 
    cache.set("key2", "value2", Some(Box::new(|| println!("Evicting key2")))); 
    println!("Retrieved: {:?}", cache.get(&"key1")); // Access key1, making it most recently used 
    cache.set("key3", "value3", Some(Box::new(|| println!("Evicting key3 or key2")))); // Evicts key2 as it is the least recently used 
    match cache.get(&"key2") { 
        Some(value) => println!("Retrieved: {:?}", value), 
        None => println!("Key2 was evicted"), // Expected outcome 
    } 
}

In this version, the Node struct includes an on_evict field that holds an optional eviction callback. The set method accepts an additional parameter, on_evict, which is a closure that gets executed when the node is evicted. When the cache exceeds its capacity and needs to evict the least recently used item, it checks if there's an eviction callback for the node being removed and executes it if present.

🚀 Explore More by Luis Soares

📚 Learning Hub: Expand your knowledge in various tech domains, including Rust, Software Development, Cloud Computing, Cyber Security, Blockchain, and Linux, through my extensive resource collection:

  • Hands-On Tutorials with GitHub Repos: Gain practical skills across different technologies with step-by-step tutorials, complemented by dedicated GitHub repositories. Access Tutorials
  • In-Depth Guides & Articles: Deep dive into core concepts of Rust, Software Development, Cloud Computing, and more, with detailed guides and articles filled with practical examples. Read More
  • E-Books Collection: Enhance your understanding of various tech fields with a series of free e-Books, including titles like “Mastering Rust Ownership” and “Application Security Guide” Download eBook
  • Project Showcases: Discover a range of fully functional projects across different domains, such as an API Gateway, Blockchain Network, Cyber Security Tools, Cloud Services, and more. View Projects
  • LinkedIn Newsletter: Stay ahead in the fast-evolving tech landscape with regular updates and insights on Rust, Software Development, and emerging technologies by subscribing to my newsletter on LinkedIn. Subscribe Here

🔗 Connect with Me:

  • Medium: Read my articles on Medium and give claps if you find them helpful. It motivates me to keep writing and sharing Rust content. Follow on Medium
  • Personal Blog: Discover more on my personal blog, a hub for all my Rust-related content. Visit Blog
  • LinkedIn: Join my professional network for more insightful discussions and updates. Connect on LinkedIn
  • Twitter: Follow me on Twitter for quick updates and thoughts on Rust programming. Follow on Twitter

Wanna talk? Leave a comment or drop me a message!

All the best,

Luis Soares
luis.soares@linux.com

Senior Software Engineer | Cloud Engineer | SRE | Tech Lead | Rust | Golang | Java | ML AI & Statistics | Web3 & Blockchain

Read more