Memory-Mapped I/O in Rust

Memory-mapped I/O is especially helpful when working with big files, because the operating system loads only the parts you actually touch into memory. You can read and change file data as if it were a regular byte slice, which can speed things up, especially for apps that need to jump around a lot in a file.

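It’s also handy for letting different programs share data: with a shared mapping, writes become visible to other processes that map the same file, and the operating system writes modified pages back to the file on its own schedule (call flush when you need the data on disk at a specific point). Plus, because the mapping is served straight from the page cache, it avoids copying data into a separate user-space buffer the way ordinary read calls do.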

Let’s dive into how to work with Memory-Mapped I/O in Rust in practice through hands-on examples.

Example 1: Reading from a Memory-Mapped File

This example demonstrates how to open an existing file, memory-map it, and read data from it.

use memmap2::Mmap; 
use std::fs::File; 
use std::io; 
 
fn main() -> io::Result<()> { 
    // Open an existing file 
    let file = File::open("example.dat")?; 
     
    // Memory-map the file for reading 
    let mmap = unsafe { Mmap::map(&file)? }; 
     
    // Read some data from the memory-mapped file
    // Here, we'll print up to the first 10 bytes (fewer if the file is shorter)
    let data = &mmap[..mmap.len().min(10)];

    println!("First 10 bytes: {:?}", data);
     
    Ok(()) 
}
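
Printing a byte slice with {:?} gives a list of decimal numbers, which gets hard to read quickly. As a small sketch, here is a hypothetical helper (hex_preview is my own name, not part of memmap2) that formats the first few bytes as hex. It takes a plain &[u8], so you could pass it &mmap[..] from the example above.

```rust
/// Formats up to `max_len` leading bytes of `data` as space-separated hex.
/// `data` is any byte slice; a `memmap2::Mmap` derefs to one.
fn hex_preview(data: &[u8], max_len: usize) -> String {
    data.iter()
        .take(max_len) // never reads past the end of a short file
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Because it uses take instead of slicing, it does not panic when the file is shorter than the requested preview.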

Example 2: Using Memory-Mapped Files for Inter-Process Communication (IPC)

Memory-mapped files can be used for efficient data sharing between processes. This example sets up a simple IPC mechanism where one process writes data to a shared memory area, and another process reads it.

Writer Process

use memmap2::MmapMut; 
use std::fs::OpenOptions; 
use std::io;
 
fn main() -> io::Result<()> { 
    let file_path = "shared.dat"; 
 
    let message = b"IPC using mmap in Rust!"; 
 
    let file = OpenOptions::new() 
        .read(true) 
        .write(true) 
        .create(true) 
        .open(file_path)?; 
 
    file.set_len(message.len() as u64)?; 
 
    let mut mmap = unsafe { MmapMut::map_mut(&file)? }; 
 
    mmap[..message.len()].copy_from_slice(message); 
 
    mmap.flush()?; 
 
    println!("Message written to shared memory."); 
 
    Ok(()) 
}

Reader Process

use memmap2::Mmap; 
use std::fs::File; 
use std::io; 
 
fn main() -> io::Result<()> { 
    let file_path = "shared.dat"; 
 
    let file = File::open(file_path)?; 
 
    let mmap = unsafe { Mmap::map(&file)? }; 
 
    let message = &mmap[..]; 
 
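    // The writer stored text, so print it as (lossy) UTF-8 instead of raw byte values
    println!("Read from shared memory: {}", String::from_utf8_lossy(message));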
 
    Ok(()) 
}
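
In the reader above, the message length is implied by the file size. If the shared file is larger than a single message, the reader needs another way to know how many bytes are valid. One common approach is a length prefix. The sketch below assumes a layout of my own choosing (a 4-byte little-endian length followed by the payload), and encode_message and decode_message are hypothetical helpers. They work on plain byte slices, so they could be applied to &mut mmap[..] and &mmap[..]. This does not add any synchronization between the two processes; that would still need to be handled separately.

```rust
const HEADER_LEN: usize = 4;

/// Writes `msg` into `buf` as a 4-byte little-endian length plus payload.
/// Returns the total number of bytes written, or `None` if it does not fit.
fn encode_message(buf: &mut [u8], msg: &[u8]) -> Option<usize> {
    let total = HEADER_LEN.checked_add(msg.len())?;
    if total > buf.len() || msg.len() > u32::MAX as usize {
        return None;
    }
    buf[..HEADER_LEN].copy_from_slice(&(msg.len() as u32).to_le_bytes());
    buf[HEADER_LEN..total].copy_from_slice(msg);
    Some(total)
}

/// Reads the length prefix and returns the payload it describes,
/// or `None` if the buffer is too short for the header or the payload.
fn decode_message(buf: &[u8]) -> Option<&[u8]> {
    let header = buf.get(..HEADER_LEN)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    buf.get(HEADER_LEN..HEADER_LEN.checked_add(len)?)
}
```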

Example 3: Modifying a Memory-Mapped File

This example shows how to modify part of a memory-mapped file, demonstrating an in-place update.

use memmap2::MmapMut; 
use std::fs::OpenOptions; 
use std::io; 
 
fn main() -> io::Result<()> { 
    let file_path = "example.dat"; 
     
    let file = OpenOptions::new() 
        .read(true) 
        .write(true) 
        .open(file_path)?; 
    let mut mmap = unsafe { MmapMut::map_mut(&file)? }; 
     
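    // Modify a portion of the mapped memory
    let new_data = b"Rust";

    // Indexing past the end of the mapping panics, so check the length first
    if mmap.len() < new_data.len() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "file is smaller than the data to write",
        ));
    }
    mmap[..new_data.len()].copy_from_slice(new_data);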
     
    mmap.flush()?; 
     
    println!("Memory-mapped file updated."); 
     
    Ok(()) 
}

Notes:

  • The unsafe block is necessary because the compiler cannot guarantee that the underlying file stays unchanged while it is mapped. If another process (or other code in your own program) modifies or truncates the file, reads and writes through the map can lead to undefined behavior.
  • Always ensure that the file size is adequate for the operations you intend to perform. Indexing past the end of the mapping panics, just like any out-of-bounds slice access, and touching pages of a file that was truncated after mapping typically kills the process with SIGBUS on Unix-like systems.
  • Remember to handle errors and edge cases in a real-world application, such as a missing file when opening it for reading, or modifications that would exceed the file size.
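
To make the bounds advice above concrete, here is a minimal sketch of a checked write. write_at is a hypothetical helper of mine, not a memmap2 API. It takes any mutable byte slice (for example &mut mmap[..]) and returns an error instead of panicking when the write would not fit.

```rust
use std::io;

/// Copies `data` into `buf` starting at `offset`, or returns an error
/// if the write would run past the end of `buf`.
fn write_at(buf: &mut [u8], offset: usize, data: &[u8]) -> io::Result<()> {
    let end = offset
        .checked_add(data.len()) // guard against usize overflow
        .filter(|&end| end <= buf.len())
        .ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidInput, "write exceeds mapped length")
        })?;
    buf[offset..end].copy_from_slice(data);
    Ok(())
}
```

A failed call leaves the buffer untouched, since the range is validated before any byte is copied.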

Example 4: Concurrently Reading from a Memory-Mapped File

Memory-mapped files can be efficiently used by multiple threads to read data concurrently due to the way the operating system handles page caching. Here’s an example that demonstrates concurrent reads:

use memmap2::Mmap; 
use std::fs::File; 
use std::io; 
use std::sync::Arc; 
use std::thread; 
 
fn main() -> io::Result<()> { 
    let file = File::open("example.dat")?; 
     
    let mmap = unsafe { Mmap::map(&file)? }; 
     
    let mmap_arc = Arc::new(mmap); 
     
    let mut handles = vec![]; 
     
    for _ in 0..4 {  // Create 4 threads 
        let mmap_clone = Arc::clone(&mmap_arc); 
        let handle = thread::spawn(move || { 
            // Each thread reads from the memory-mapped file 
            let data = &mmap_clone[..mmap_clone.len().min(10)];  // Example: read up to the first 10 bytes
            println!("Thread read: {:?}", data); 
        }); 
 
        handles.push(handle); 
    } 
 
    for handle in handles { 
        handle.join().unwrap(); 
    } 
 
    Ok(()) 
}
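
In the example above every thread reads the same 10 bytes. In practice you would more likely give each thread its own region. Here is a sketch of that idea using std::thread::scope (available since Rust 1.63), which lets threads borrow the slice directly, so no Arc is needed. parallel_sum is a hypothetical function, and the byte sum is just a stand-in for real per-chunk work. It accepts any &[u8], such as &mmap[..].

```rust
use std::thread;

/// Splits `data` into roughly equal chunks and sums the bytes of each
/// chunk on its own thread, then adds up the partial results.
fn parallel_sum(data: &[u8], workers: usize) -> u64 {
    if data.is_empty() {
        return 0;
    }
    // Use at least one worker, and no more workers than bytes.
    let workers = workers.clamp(1, data.len());
    let chunk_len = (data.len() + workers - 1) / workers; // ceiling division

    thread::scope(|s| {
        // Spawn every worker first, then join them all.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|&b| u64::from(b)).sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<u64>()
    })
}
```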

Example 5: Using Memory-Mapped Files for Efficient Large Data Manipulation

Memory-mapped files are particularly useful for working with large data sets, as they allow for random access without loading the entire file into memory. Here’s an example that manipulates a large file:

use memmap2::MmapMut; 
use std::fs::OpenOptions; 
use std::io; 
 
fn manipulate_large_file(file_path: &str) -> io::Result<()> { 
    let file = OpenOptions::new() 
        .read(true) 
        .write(true) 
        .open(file_path)?; 
 
    let mut mmap = unsafe { MmapMut::map_mut(&file)? }; 
     
    // Example manipulation: zero out every other byte in a large file 
    for i in (0..mmap.len()).step_by(2) { 
        mmap[i] = 0; 
    } 
     
    mmap.flush()?; // Ensure changes are written back to the file 
     
    Ok(()) 
} 
 
fn main() -> io::Result<()> { 
    let file_path = "large_example.dat"; 
    manipulate_large_file(file_path) 
}
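
The loop above walks the whole file, but the random-access benefit shows up more clearly when you fetch individual records by index. As a sketch, assume the file is a flat array of 4-byte little-endian integers (that layout and the read_record name are assumptions for illustration). Only the pages holding the requested record need to be brought into memory.

```rust
const RECORD_SIZE: usize = 4;

/// Returns the `index`-th 4-byte little-endian record from `data`,
/// or `None` if that record is not fully inside the slice.
fn read_record(data: &[u8], index: usize) -> Option<u32> {
    let start = index.checked_mul(RECORD_SIZE)?;
    let end = start.checked_add(RECORD_SIZE)?;
    let bytes = data.get(start..end)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```

With a mapping, a call such as read_record(&mmap, 1_000_000) would jump straight to that offset without reading anything before it.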

Example 6: Memory-Mapped Files for Ring Buffers

Memory-mapped files can be used to implement a ring buffer (circular buffer), which is useful for scenarios where a fixed-size buffer is continuously written to and read from, such as logging systems or stream processing.

use memmap2::MmapMut; 
use std::fs::OpenOptions;
use std::io; 
 
struct RingBuffer { 
    mmap: MmapMut, 
    capacity: usize, 
    head: usize, 
    tail: usize, 
} 
 
impl RingBuffer { 
 
    fn new(file_path: &str, size: usize) -> io::Result<Self> { 
 
        let file = OpenOptions::new() 
            .read(true) 
            .write(true) 
            .create(true) 
            .open(file_path)?; 
 
        file.set_len(size as u64)?; 
 
        let mmap = unsafe { MmapMut::map_mut(&file)? }; 
 
        Ok(Self { 
            mmap, 
            capacity: size, 
            head: 0, 
            tail: 0, 
        }) 
    } 
 
    fn write(&mut self, data: &[u8]) { 
        for &byte in data { 
            self.mmap[self.head] = byte; 
            self.head = (self.head + 1) % self.capacity; 
            if self.head == self.tail { 
                self.tail = (self.tail + 1) % self.capacity; // Overwrite oldest data 
            } 
        } 
    } 
 
    // Additional methods for reading, seeking, etc., can be added here 
} 
 
fn main() -> io::Result<()> { 
    let mut ring_buffer = RingBuffer::new("ring_buffer.dat", 1024)?; 
    ring_buffer.write(b"Hello, Ring Buffer!"); 
    Ok(()) 
}
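
The ring buffer above only has a write method. Here is one possible sketch of the reading side. To keep it self-contained it is backed by a plain &mut [u8] rather than an MmapMut (you could hand it &mut mmap[..]), and it tracks the number of unread bytes in a separate len field so that a full buffer and an empty one are not confused. ByteRing and its methods are hypothetical names.

```rust
struct ByteRing<'a> {
    buf: &'a mut [u8],
    head: usize, // next position to write
    tail: usize, // oldest unread byte
    len: usize,  // number of unread bytes
}

impl<'a> ByteRing<'a> {
    fn new(buf: &'a mut [u8]) -> Self {
        assert!(!buf.is_empty(), "ring buffer needs a non-empty backing slice");
        ByteRing { buf, head: 0, tail: 0, len: 0 }
    }

    /// Writes `data`, overwriting the oldest bytes once the buffer is full.
    fn write(&mut self, data: &[u8]) {
        let cap = self.buf.len();
        for &byte in data {
            self.buf[self.head] = byte;
            self.head = (self.head + 1) % cap;
            if self.len == cap {
                self.tail = (self.tail + 1) % cap; // drop the oldest byte
            } else {
                self.len += 1;
            }
        }
    }

    /// Appends all unread bytes to `out`, oldest first, and returns how many were read.
    fn read(&mut self, out: &mut Vec<u8>) -> usize {
        let cap = self.buf.len();
        let n = self.len;
        for _ in 0..n {
            out.push(self.buf[self.tail]);
            self.tail = (self.tail + 1) % cap;
        }
        self.len = 0;
        n
    }
}
```

Note that head, tail, and len live only in the struct here. To survive a restart or to be shared with another process, they would have to be stored in the mapped file as well.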

Example 7: Implementing a Simple Database with Memory-Mapped Files

Memory-mapped files can be used to implement a simple key-value store, leveraging the efficiency of direct memory access for both reads and writes. Here’s a basic example:

use memmap2::MmapMut;
use std::collections::HashMap;
use std::fs::OpenOptions;
use std::io;

struct SimpleDB {
    mmap: MmapMut,
    index: HashMap<String, (usize, usize)>, // Key to (offset, length)
    next_offset: usize, // Where the next value will be written
}

impl SimpleDB {
    fn new(file_path: &str, size: usize) -> io::Result<Self> {
        let file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .open(file_path)?;
        file.set_len(size as u64)?;
        let mmap = unsafe { MmapMut::map_mut(&file)? };
        Ok(Self {
            mmap,
            index: HashMap::new(),
            next_offset: 0,
        })
    }

    fn insert(&mut self, key: &str, value: &[u8]) -> io::Result<()> {
        let offset = self.next_offset; // Append after the last written value
        let length = value.len();
        // The mapping has a fixed length, so that is the capacity to check against
        if offset + length > self.mmap.len() {
            return Err(io::Error::new(io::ErrorKind::OutOfMemory, "Database is full"));
        }
        self.mmap[offset..offset + length].copy_from_slice(value);
        self.index.insert(key.to_string(), (offset, length));
        self.next_offset = offset + length;
        Ok(())
    }

    fn get(&self, key: &str) -> Option<&[u8]> {
        self.index.get(key).map(|&(offset, length)| &self.mmap[offset..offset + length])
    }
}
 
fn main() -> io::Result<()> { 
    let mut db = SimpleDB::new("simple_db.dat", 1024 * 1024)?; // 1 MB database 
 
    db.insert("hello", b"world")?; 
 
    db.insert("foo", b"bar")?; 
 
    if let Some(value) = db.get("hello") { 
        println!("Value for 'hello': {:?}", value); 
    } 
 
    Ok(()) 
}
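
One limitation of the store above is that the index lives only in a HashMap, so it is gone when the process exits even though the values are still in the file. A possible way around that is to write the key next to each value and rebuild the index by scanning the file on startup. The sketch below assumes a record layout of my own (2-byte key length, 4-byte value length, key bytes, value bytes, all little-endian) and treats a key length of 0 as the end of the log, which matches a freshly zero-filled file. All function names are hypothetical, and the functions take plain byte slices so they can be tried without a mapping.

```rust
use std::collections::HashMap;

const HEADER_LEN: usize = 6; // 2 bytes key length + 4 bytes value length

/// Appends one record at `offset` and returns the offset just past it,
/// or `None` if the key is empty, too long, or the record does not fit.
fn append_record(buf: &mut [u8], offset: usize, key: &str, value: &[u8]) -> Option<usize> {
    if key.is_empty() || key.len() > u16::MAX as usize || value.len() > u32::MAX as usize {
        return None;
    }
    let key_start = offset.checked_add(HEADER_LEN)?;
    let val_start = key_start.checked_add(key.len())?;
    let end = val_start.checked_add(value.len())?;
    if end > buf.len() {
        return None;
    }
    buf[offset..offset + 2].copy_from_slice(&(key.len() as u16).to_le_bytes());
    buf[offset + 2..key_start].copy_from_slice(&(value.len() as u32).to_le_bytes());
    buf[key_start..val_start].copy_from_slice(key.as_bytes());
    buf[val_start..end].copy_from_slice(value);
    Some(end)
}

/// Parses the record at `pos`; returns (key, value offset, value length).
fn parse_record(buf: &[u8], pos: usize) -> Option<(String, usize, usize)> {
    let header = buf.get(pos..pos.checked_add(HEADER_LEN)?)?;
    let key_len = u16::from_le_bytes([header[0], header[1]]) as usize;
    let val_len = u32::from_le_bytes([header[2], header[3], header[4], header[5]]) as usize;
    if key_len == 0 {
        return None; // zeroed space marks the end of the log
    }
    let key_start = pos + HEADER_LEN;
    let val_start = key_start.checked_add(key_len)?;
    let end = val_start.checked_add(val_len)?;
    let key = std::str::from_utf8(buf.get(key_start..val_start)?).ok()?;
    buf.get(val_start..end)?; // the value must lie fully inside the buffer
    Some((key.to_string(), val_start, val_len))
}

/// Scans the log from the start and returns the rebuilt index together
/// with the offset where the next record should be appended.
fn rebuild_index(buf: &[u8]) -> (HashMap<String, (usize, usize)>, usize) {
    let mut index = HashMap::new();
    let mut pos = 0;
    while let Some((key, val_start, val_len)) = parse_record(buf, pos) {
        pos = val_start + val_len;
        index.insert(key, (val_start, val_len)); // later records win
    }
    (index, pos)
}
```

The trade-off is a linear scan at startup in exchange for not having to keep a second on-disk structure in sync.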

Example 8: Memory-Mapped File for Real-Time Data Processing

Memory-mapped files can be a good fit for latency-sensitive data processing, such as financial tick data analysis, because each update is a plain memory write rather than a system call. This example shows how you might set up a memory-mapped file for such a use case:

use memmap2::MmapMut; 
use std::fs::OpenOptions; 
use std::io; 
use std::time::{Duration, Instant}; 
 
fn process_real_time_data(file_path: &str) -> io::Result<()> { 
    let file = OpenOptions::new() 
        .read(true) 
        .write(true) 
        .create(true) 
        .open(file_path)?; 
 
    file.set_len(1024 * 1024)?; // 1 MB 
 
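    let mut mmap = unsafe { MmapMut::map_mut(&file)? }; // `mut` is needed to write through the mapping

    let start = Instant::now();

    while start.elapsed() < Duration::from_secs(10) { // Process for 10 seconds
        let timestamp = start.elapsed().as_micros() as u32;
        let data = timestamp.to_ne_bytes(); // Example data: microseconds elapsed since start

        // Write data to a specific location, e.g., beginning of the mmap
        mmap[0..data.len()].copy_from_slice(&data);

        // Simulate real-time data processing by sleeping for a short duration
        std::thread::sleep(Duration::from_micros(1));
    }

    mmap.flush()?; // Write the final state back to the file
    Ok(())
}

fn main() -> io::Result<()> {
    // "realtime.dat" is a placeholder file name for this example
    process_real_time_data("realtime.dat")
}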
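
Two details of the example are worth a second look if another program is going to read the value. A u32 of microseconds wraps around after roughly 71 minutes, and to_ne_bytes uses the byte order of the machine that wrote it. As a sketch under those observations, here is a hypothetical pair of helpers that store a u64 in little-endian order at the start of a byte slice (for example &mut mmap[..]). They do not make the write atomic with respect to a concurrent reader.

```rust
/// Stores `micros` as 8 little-endian bytes at the start of `buf`.
/// Returns false if `buf` is shorter than 8 bytes.
fn write_timestamp(buf: &mut [u8], micros: u64) -> bool {
    match buf.get_mut(..8) {
        Some(slot) => {
            slot.copy_from_slice(&micros.to_le_bytes());
            true
        }
        None => false,
    }
}

/// Reads back a timestamp written by `write_timestamp`.
fn read_timestamp(buf: &[u8]) -> Option<u64> {
    let slot = buf.get(..8)?;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(slot);
    Some(u64::from_le_bytes(bytes))
}
```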

All the best,

Luis Soares
luis.soares@linux.com
