Building a VM Instruction Set in Rust

Luis Soares

17 Jan 2024 — 7 min read

In this tutorial, we’ll build a basic Virtual Machine (VM) in Rust. The goal isn’t just to write code; it’s to understand the core concepts of virtualization and instruction sets, and to implement them in a practical, hands-on way.

By the end of this tutorial, you will have a deeper understanding of VMs and a working Rust application that simulates a simple VM.

What is a Virtual Machine?

A Virtual Machine is a software emulation of a physical computer. It’s an abstraction layer that runs between the hardware and the operating system or applications, allowing multiple operating systems to coexist on the same physical hardware or enabling software to run in a consistent environment regardless of the underlying hardware.

VMs are widely used for various purposes, from running different operating systems on a single physical machine (like Windows on a Mac) to providing isolated environments for software development and testing.

What are Instruction Sets?

An instruction set is a group of commands that a VM or a processor can execute. These instructions can range from simple arithmetic operations (like addition and subtraction) to complex operations involving memory management and I/O handling. The richness and efficiency of an instruction set play a crucial role in determining the performance and capabilities of a VM or a CPU.
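To make the idea concrete, here is a minimal sketch of an instruction set with just two commands, plus a loop that executes them. The names `Op` and `execute` are illustrative only and are not part of the VM built later in this tutorial.

```rust
// A hypothetical two-instruction set for a tiny stack machine.
#[derive(Debug, Clone, Copy)]
enum Op {
    Push(i32), // place a constant on the stack
    Add,       // pop two values and push their sum
}

// Execute the instructions in order and return the final stack.
fn execute(program: &[Op]) -> Vec<i32> {
    let mut stack = Vec::new();
    for op in program {
        match *op {
            Op::Push(value) => stack.push(value),
            Op::Add => {
                let rhs = stack.pop().expect("stack underflow");
                let lhs = stack.pop().expect("stack underflow");
                stack.push(lhs + rhs);
            }
        }
    }
    stack
}
```

Running `execute(&[Op::Push(2), Op::Push(3), Op::Add])` leaves a single value, the sum, on the stack. Every richer instruction set follows the same pattern: a list of commands and a loop that decides what each one does.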

Examples of Virtual Machines

A well-known example of a VM is the Java Virtual Machine (JVM), which allows Java applications to run on any device with a JVM installed, irrespective of the underlying hardware and operating system. This “write once, run anywhere” capability is a significant advantage of using VMs. The VM we build below is of this kind: it executes programs written in its own instruction set rather than emulating a whole computer.

Implementing the VM in Rust

Step 1: Setting Up the Rust Environment

Ensure Rust is installed on your system. Create a new Rust project using Cargo:

cargo new my_virtual_machine 
cd my_virtual_machine

Step 2: Defining the Instruction Set

Start by defining the instructions your VM will support:

#[derive(Clone)] 
enum Operand { 
    Value(i32), 
    Var(String), 
} 
 
#[derive(Clone)] 
enum Instruction { 
    Push(i32), 
    Add(Operand, Operand), 
    Sub(Operand, Operand), 
    Mul(Operand, Operand), 
    Div(Operand, Operand), 
    Print, 
    Set(String, i32), 
    Get(String), 
    Input(String), 
    If(Vec<Instruction>, Vec<Instruction>), 
    Else(Vec<Instruction>), 
}
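Before any parser exists, a program for this instruction set can be written by hand as a `Vec<Instruction>`. The sketch below repeats the two enums so that it stands alone; the extra `Debug` and `PartialEq` derives and the `sample_program` helper are assumptions added for illustration.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Operand {
    Value(i32),
    Var(String),
}

#[derive(Clone, Debug, PartialEq)]
enum Instruction {
    Push(i32),
    Add(Operand, Operand),
    Sub(Operand, Operand),
    Mul(Operand, Operand),
    Div(Operand, Operand),
    Print,
    Set(String, i32),
    Get(String),
    Input(String),
    If(Vec<Instruction>, Vec<Instruction>),
    Else(Vec<Instruction>),
}

// Hypothetical helper: the in-memory form of
//   SET x 4
//   ADD Var("x") 6
//   PRINT
fn sample_program() -> Vec<Instruction> {
    vec![
        Instruction::Set("x".to_string(), 4),
        Instruction::Add(Operand::Var("x".to_string()), Operand::Value(6)),
        Instruction::Print,
    ]
}
```

Note that `Add`, `Sub`, `Mul`, and `Div` carry their operands with them, so an operand can be either a literal or a variable name that is looked up at run time.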

Step 3: Building the VM Structure

Create a struct to represent the VM, which includes a stack for operands and a hashmap for variables:

use std::collections::HashMap;

struct VM {
    stack: Vec<i32>,            // operand stack
    vars: HashMap<String, i32>, // named variables
}
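The two fields play different roles: `vars` holds named values, while `stack` holds whatever the program is currently working with. A rough sketch of how they interact is shown below; `store_and_load` is a hypothetical helper, not part of the tutorial's VM.

```rust
use std::collections::HashMap;

struct VM {
    stack: Vec<i32>,
    vars: HashMap<String, i32>,
}

// Hypothetical helper: store a variable (like SET), then copy it onto
// the stack (like GET) and return the new top of the stack.
fn store_and_load(vm: &mut VM, name: &str, value: i32) -> Option<i32> {
    vm.vars.insert(name.to_string(), value);
    let loaded = *vm.vars.get(name)?;
    vm.stack.push(loaded);
    vm.stack.last().copied()
}
```

Reading a variable copies its value, so the entry in `vars` stays available for later instructions.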

Step 4: Implementing the Instruction Logic

Implement the logic to execute each instruction:

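    // These methods go inside `impl VM { ... }`. They rely on
    // `use std::fs::File;` and `use std::io::{self, BufRead};` at the top of the file.
    fn new() -> VM {
        VM {
            stack: Vec::new(),
            vars: HashMap::new(),
        }
    }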
 
    fn get_operand_value(&self, operand: &Operand) -> i32 { 
        match operand { 
            Operand::Value(val) => *val, 
            Operand::Var(var_name) => *self.vars.get(var_name) 
                .expect("Variable not found"), 
        } 
    } 
 
    fn run(&mut self, program: Vec<Instruction>, path: &str) { 
        let mut pc = 0; // Program counter 
        while pc < program.len() { 
            match &program[pc] { 
                //PUSH 
                Instruction::Push(val) => self.stack.push(*val), 
 
                //ADDITION 
                Instruction::Add(op1, op2) => { 
                    let val1 = self.get_operand_value(op1); 
                    let val2 = self.get_operand_value(op2); 
                    self.stack.push(val1 + val2); 
                }, 
 
                //SUBTRACTION
                Instruction::Sub(op1, op2) => { 
                    let val1 = self.get_operand_value(op1); 
                    let val2 = self.get_operand_value(op2); 
                    self.stack.push(val1 - val2); 
                }, 
 
                //MULTIPLICATION 
                Instruction::Mul(op1, op2) => { 
                    let val1 = self.get_operand_value(op1); 
                    let val2 = self.get_operand_value(op2); 
                    self.stack.push(val1 * val2); 
                }, 
 
                //DIVISION 
                Instruction::Div(op1, op2) => { 
                    let val1 = self.get_operand_value(op1); 
                    let val2 = self.get_operand_value(op2); 
                    if val2 == 0 { 
                        panic!("Division by zero"); 
                    } 
                    self.stack.push(val1 / val2); 
                }, 
 
                //PRINT 
                Instruction::Print => { 
                    if let Some(top) = self.stack.last() { 
                        println!("{}", top); 
                    } else { 
                        println!("Stack is empty"); 
                    } 
                }, 
 
                //SET VARIABLE 
                Instruction::Set(var_name, value) => { 
                    self.vars.insert(var_name.clone(), *value); 
                }, 
 
                //GET VARIABLE 
                Instruction::Get(var_name) => { 
                    if let Some(&value) = self.vars.get(var_name) { 
                        self.stack.push(value); 
                    } else { 
                        panic!("Undefined variable: {}", var_name); 
                    } 
                }, 
 
                //GET USER INPUT from the command line 
                Instruction::Input(var_name) => { 
                    let mut input = String::new(); 
                    io::stdin().read_line(&mut input).expect("Failed to read line"); 
                    let value = input.trim().parse::<i32>().expect("Invalid input"); 
                    self.vars.insert(var_name.clone(), value); 
                }, 
 
                //PROCESS IF instructions
                Instruction::If(if_block, else_block) => {
                    // The condition is the value on top of the stack (it is not popped)
                    if let Some(&top) = self.stack.last() {
                        if top != 0 {
                            // Non-zero: execute the IF block
                            self.run(if_block.clone(), path);
                        } else if !else_block.is_empty() {
                            // Zero: execute the ELSE block, if there is one
                            self.run(else_block.clone(), path);
                        }
                    } else {
                        panic!("Stack is empty");
                    }
                },
 
                //Process the ELSE block 
                Instruction::Else(else_block) => { 
                    // This is only executed if the 'if' condition was not met, 
                    // so we don't need to check the stack again. 
                    self.run(else_block.to_vec(), path); // Pass path as an argument 
                }, 
            } 
            pc += 1; 
        } 
    }
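The `Div` arm above guards against a zero divisor, but `i32::MIN / -1` also panics in Rust because the result does not fit in an `i32`. One possible alternative, sketched here under the assumption that you would rather report an error than panic, uses the standard `checked_div` method. The `safe_div` name is hypothetical.

```rust
// Hypothetical alternative to the body of the `Div` arm: failures are
// returned as values instead of panicking.
fn safe_div(dividend: i32, divisor: i32) -> Result<i32, String> {
    // `checked_div` returns `None` for a zero divisor and for overflow.
    dividend
        .checked_div(divisor)
        .ok_or_else(|| format!("cannot divide {} by {}", dividend, divisor))
}
```

The VM could push the `Ok` value onto the stack and decide separately how to surface the `Err` case, for example by stopping the program with a message.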

Step 5: Parsing Instructions from a File

Implement functionality to load and parse instructions from a file. This requires reading a file line by line and converting each line into an Instruction:

fn load_program(reader: &mut io::BufReader<File>) -> io::Result<Vec<Instruction>> { 
        let mut program = Vec::new(); 
 
        // Read all lines into a vector 
        let lines: Vec<String> = reader.lines().collect::<Result<_, _>>()?; 
 
        // Temporary storage for IF/ELSE blocks 
        let mut if_block = Vec::new(); 
        let mut else_block = Vec::new(); 
        let mut in_if_block = false; 
        let mut in_else_block = false; 
 
        for line in lines.iter() { 
            let parts: Vec<&str> = line.split_whitespace().collect(); 
 
            // Handle the start of an IF block 
            if parts.get(0) == Some(&"IF") { 
                in_if_block = true; 
                in_else_block = false; 
                continue; 
            } 
 
            // Handle the start of an ELSE block 
            if parts.get(0) == Some(&"ELSE") { 
                in_else_block = true; 
                in_if_block = false; 
                continue; 
            } 
 
            // Handle the end of an IF/ELSE construct
            if parts.get(0) == Some(&"ENDIF") && (in_if_block || in_else_block) {
                // Emit one If instruction carrying both branches
                // (the ELSE branch is empty when the program has no ELSE)
                program.push(Instruction::If(if_block.clone(), else_block.clone()));
                if_block.clear();
                else_block.clear();
                in_if_block = false;
                in_else_block = false;
                continue;
            }

            // Inside an IF or ELSE block: collect the instruction into that block
            if in_if_block || in_else_block {
                let block = if in_if_block { &mut if_block } else { &mut else_block };
                block.extend(parse_instruction(line));
                continue;
            }
 
            // Parse other instructions 
            let instruction = parse_instruction(line); 
            program.extend(instruction); 
        } 
 
        Ok(program) 
    }
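The block handling is easier to follow when separated from file I/O. The sketch below groups the lines of an in-memory program into plain lines and IF constructs using a small state machine. `Node`, `State`, and `group_blocks` are hypothetical names, the lines are kept as strings rather than parsed into `Instruction` values, and nested IF blocks are not supported.

```rust
// Hypothetical node type: a plain line, or an IF with both of its branches.
#[derive(Debug, PartialEq)]
enum Node {
    Line(String),
    If { then_lines: Vec<String>, else_lines: Vec<String> },
}

// Where the grouping loop currently is.
#[derive(Clone, Copy)]
enum State {
    TopLevel,
    InIf,
    InElse,
}

fn group_blocks(source: &str) -> Vec<Node> {
    let mut nodes = Vec::new();
    let mut then_lines = Vec::new();
    let mut else_lines = Vec::new();
    let mut state = State::TopLevel;

    for line in source.lines().map(str::trim).filter(|l| !l.is_empty()) {
        match (line, state) {
            ("IF", State::TopLevel) => state = State::InIf,
            ("ELSE", State::InIf) => state = State::InElse,
            ("ENDIF", State::InIf) | ("ENDIF", State::InElse) => {
                // Emit one node holding both branches and reset the buffers.
                nodes.push(Node::If {
                    then_lines: std::mem::take(&mut then_lines),
                    else_lines: std::mem::take(&mut else_lines),
                });
                state = State::TopLevel;
            }
            (_, State::TopLevel) => nodes.push(Node::Line(line.to_string())),
            (_, State::InIf) => then_lines.push(line.to_string()),
            (_, State::InElse) => else_lines.push(line.to_string()),
        }
    }
    nodes
}
```

Keeping both branches in a single node means the decision about which branch to run can be made in one place at execution time.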

Implement additional parsing methods to improve code readability:

fn parse_operand(op_str: &str) -> Operand { 
    if let Ok(val) = op_str.parse::<i32>() { 
        Operand::Value(val) 
    } else { 
        Operand::Var(op_str.to_string()) 
    } 
} 
 
fn extract_var_name(operand: &str) -> &str { 
    operand.trim_start_matches("Var(\"").trim_end_matches("\")") 
} 
 
fn parse_instruction(line: &str) -> Vec<Instruction> { 
    let parts: Vec<&str> = line.split_whitespace().collect(); 
    match parts.as_slice() { 
        ["PUSH", num] => vec![Instruction::Push(num.parse::<i32>().expect("Invalid number"))], 
        ["ADD", op1, op2] => { 
            let operand1 = parse_operand(extract_var_name(op1)); 
            let operand2 = parse_operand(extract_var_name(op2)); 
            vec![Instruction::Add(operand1, operand2)] 
        }, 
        ["SUB", op1, op2] => { 
            let operand1 = parse_operand(extract_var_name(op1)); 
            let operand2 = parse_operand(extract_var_name(op2)); 
            vec![Instruction::Sub(operand1, operand2)] 
        }, 
        ["MUL", op1, op2] => { 
            let operand1 = parse_operand(extract_var_name(op1)); 
            let operand2 = parse_operand(extract_var_name(op2)); 
            vec![Instruction::Mul(operand1, operand2)] 
        }, 
        ["DIV", op1, op2] => { 
            let operand1 = parse_operand(extract_var_name(op1)); 
            let operand2 = parse_operand(extract_var_name(op2)); 
            vec![Instruction::Div(operand1, operand2)] 
        }, 
        ["PRINT"] => vec![Instruction::Print], 
        ["SET", var_name, value] => { 
            let value = value.parse::<i32>().expect("Invalid number"); 
            vec![Instruction::Set(var_name.to_string(), value)] 
        }, 
        ["GET", var_name] => vec![Instruction::Get(var_name.to_string())], 
        ["Input", var_name] => vec![Instruction::Input(var_name.to_string())], 
        _ => vec![], 
    } 
} 
 
// Function to create a BufReader and call VM::load_program 
fn load_program_and_run(file_path: &str) -> Result<(), Box<dyn std::error::Error>> { 
    let file = match File::open(file_path) { 
        Ok(file) => file, 
        Err(e) => { 
            eprintln!("Failed to open file: {}", e); 
            return Err(Box::new(e)); // Return an error 
        } 
    }; 
    let mut reader = io::BufReader::new(file); 
 
    // Create a VM instance 
    let mut vm = VM::new(); 
 
    // Load and run the program 
    match VM::load_program(&mut reader) { 
        Ok(program) => { 
            vm.run(program, file_path); // Just call run without expecting a Result 
            // Handle any other necessary logic here if needed 
        } 
        Err(e) => { 
            eprintln!("Failed to load program: {}", e); 
            return Err(Box::new(e)); // Return an error 
        } 
    } 
 
    Ok(()) // Return Ok to indicate success 
}
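To see what `extract_var_name` and `parse_operand` do together, the sketch below repeats both functions and adds a small wrapper. The wrapper `operand_from_token` and the extra `Debug` and `PartialEq` derives are assumptions added for illustration.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Operand {
    Value(i32),
    Var(String),
}

// Anything that parses as an i32 is a literal; everything else is a variable name.
fn parse_operand(op_str: &str) -> Operand {
    if let Ok(val) = op_str.parse::<i32>() {
        Operand::Value(val)
    } else {
        Operand::Var(op_str.to_string())
    }
}

// Strips the `Var("` prefix and `")` suffix when they are present.
fn extract_var_name(operand: &str) -> &str {
    operand.trim_start_matches("Var(\"").trim_end_matches("\")")
}

// Hypothetical wrapper combining the two steps used by `parse_instruction`.
fn operand_from_token(token: &str) -> Operand {
    parse_operand(extract_var_name(token))
}
```

With these definitions, `Var("x")` and a bare `x` both become `Operand::Var("x")`, while `42` and `-3` become `Operand::Value`. Because `parse_instruction` splits on whitespace, an operand must not contain spaces.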

Step 6: Handling Command Line Arguments

Modify the main function to take the file path as a command line argument:

use std::env;
use std::process;

fn main() {
    let args: Vec<String> = env::args().collect(); 
    if args.len() != 2 { 
        eprintln!("Usage: {} <program_file.rm>", args[0]); 
        process::exit(1); 
    } 
 
    let file_path = &args[1]; 
 
    match load_program_and_run(file_path) { 
        Ok(_) => { 
            println!("Program executed successfully."); 
        } 
        Err(e) => { 
            eprintln!("Error: {}", e); 
            process::exit(1); 
        } 
    } 
}
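The argument check can also be pulled into a small function, which makes it easy to test without launching the binary. This is only a sketch: `program_path` and the fallback executable name `vm` are hypothetical.

```rust
// Hypothetical helper: returns the program path, or a usage message
// when the argument count is wrong.
fn program_path(args: &[String]) -> Result<&str, String> {
    match args {
        // Exactly two entries: the executable name and the program file.
        [_, path] => Ok(path.as_str()),
        _ => {
            // Fall back to a placeholder name if the list is empty.
            let exe = args.first().map(String::as_str).unwrap_or("vm");
            Err(format!("Usage: {} <program_file.rm>", exe))
        }
    }
}
```

In `main`, the collected `env::args()` vector could be passed to this function and the `Err` message printed with `eprintln!` before exiting.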

Step 7: Testing the VM

Create a text file with a series of instructions and use it to test your VM. For example:

Input y 
GET y 
Input x 
GET x 
ADD Var("x") Var("y") 
PRINT 
IF 
    GET x 
    PRINT 
ELSE 
    GET y 
    PRINT 
ENDIF

This should:

Ask for user input from the command line
Store it in a variable y
Ask for a new user input from the command line
Store it in a variable x
Add the two variables' values
Print the top of the stack, which will contain the result of the Addition
Evaluate the top of the stack: if the value is non-zero, execute the IF block, which prints the value of the x variable. Otherwise, execute the ELSE block, which prints the value of the y variable.
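The stack effects of this sample program can be replayed by hand for fixed inputs. The sketch below does so with a plain `Vec<i32>`, following the `If` arm in `run`, which takes the IF branch when the top of the stack is non-zero. The `trace` function is hypothetical and stands in for the real VM.

```rust
// Replays the sample program for given inputs and returns the values
// that the two PRINT instructions would show, in order.
fn trace(x: i32, y: i32) -> Vec<i32> {
    let mut stack = Vec::new();
    let mut printed = Vec::new();

    stack.push(y);     // Input y, GET y
    stack.push(x);     // Input x, GET x
    stack.push(x + y); // ADD Var("x") Var("y") pushes the sum
    printed.push(*stack.last().unwrap()); // PRINT shows the sum

    // IF inspects the top of the stack without popping it.
    let condition = *stack.last().unwrap();
    if condition != 0 {
        stack.push(x); // IF branch: GET x
    } else {
        stack.push(y); // ELSE branch: GET y
    }
    printed.push(*stack.last().unwrap()); // PRINT inside the chosen branch
    printed
}
```

For example, with `x = 3` and `y = 2` the sum is 5, so the IF branch runs and prints `x`. With `x = -2` and `y = 2` the sum is 0, so the ELSE branch runs and prints `y`.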

Play around with the VM by combining operators, variables, and so on!

This tutorial has guided you through creating a simple VM in Rust, demonstrating core concepts like instruction sets and VM operation.

Keep exploring and experimenting, and you’ll find that the world of VMs offers endless opportunities for learning and innovation.

You can find the complete implementation in my GitHub repository: https://github.com/luishsr/rustvm.

All the best,

Luis Soares
luis.soares@linux.com