Exploring Hash functions in Rust: Fowler–Noll–Vo (FNV), SipHash, and beyond

Hash functions are fundamental in numerous computing scenarios because they map arbitrary-sized data to fixed-size values. In this article, we’ll delve into the workings of two well-known hash functions, Fowler–Noll–Vo (FNV) and SipHash, focusing mainly on how they are used in the Rust programming language, and then look briefly at newer options such as ahash and BLAKE3.

Fowler–Noll–Vo (FNV) Hash Function:

How it works at a low level:

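FNV starts from a fixed constant, the offset basis, and then folds in the input one byte at a time. For each byte, it multiplies the running hash by a fixed prime (the FNV prime) and XORs in the byte, with all arithmetic wrapping at the hash width (32 or 64 bits, for example). The value left after the last byte is the final hash.

There are two main variants, FNV-1 and FNV-1a, which differ only in the order of the multiply and XOR steps.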

FNV-1:

hash = ((hash × FNV_prime) XOR octet)

FNV-1a:

hash = ((hash XOR octet) × FNV_prime)
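To make the two formulas concrete, here is a minimal sketch of both variants at 64 bits. The constants are the published 64-bit FNV offset basis and prime; the function names are just illustrative choices for this article, not part of any crate.

```rust
/// 64-bit FNV offset basis (the initial hash value).
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
/// 64-bit FNV prime.
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1: multiply first, then XOR in the byte.
pub fn fnv1_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &octet| {
        // wrapping_mul keeps the arithmetic modulo 2^64
        hash.wrapping_mul(FNV_PRIME) ^ u64::from(octet)
    })
}

/// FNV-1a: XOR in the byte first, then multiply.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &octet| {
        (hash ^ u64::from(octet)).wrapping_mul(FNV_PRIME)
    })
}
```

For the single byte "a", fnv1_64 returns 0xaf63bd4c8601b7be and fnv1a_64 returns 0xaf63dc4c8601ec8c, and an empty input returns the offset basis unchanged. FNV-1a is usually the preferred variant because XORing first lets the multiplication spread each input byte across the whole hash.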

Rust Implementation (Using the fnv crate):

use fnv::FnvHashMap;

fn main() {
   let mut map = FnvHashMap::default();
   map.insert("key", "value");
   println!("{:?}", map.get("key"));
}

Use Cases:

  • Quick hashing of small in-memory keys, such as integers and short strings.
  • Situations where cryptographic strength isn’t mandatory but speed is crucial, like hash tables.

SipHash

How it works at a low level:

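SipHash is a keyed pseudorandom function designed to protect against hash-flooding denial-of-service attacks. It is cryptographically strong for that purpose, but it is not a general-purpose cryptographic hash like SHA-2. At a low level, it keeps four 64-bit words of internal state, initialised from a 128-bit secret key (handled as two 64-bit halves), and mixes them with a small add-rotate-XOR permutation called a SipRound.

The algorithm processes the message in blocks of 64 bits. For each block, it XORs the block into the state, runs a fixed number of SipRounds, and XORs the block in again. A final, larger batch of SipRounds then produces the 64-bit output. The variant name gives the two round counts: SipHash-2-4 runs 2 rounds per block and 4 at the end.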
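The description above can be sketched in plain Rust as follows. This is an illustrative SipHash-2-4 written for this article, not the code the standard library uses; the function names are my own, and for real workloads you should rely on a maintained implementation.

```rust
/// One SipRound: add, rotate and XOR over the four state words.
fn sip_round(v: &mut [u64; 4]) {
    v[0] = v[0].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(13);
    v[1] ^= v[0];
    v[0] = v[0].rotate_left(32);
    v[2] = v[2].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(16);
    v[3] ^= v[2];
    v[0] = v[0].wrapping_add(v[3]);
    v[3] = v[3].rotate_left(21);
    v[3] ^= v[0];
    v[2] = v[2].wrapping_add(v[1]);
    v[1] = v[1].rotate_left(17);
    v[1] ^= v[2];
    v[2] = v[2].rotate_left(32);
}

/// Reads up to 8 bytes as a little-endian integer.
fn read_le(bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .enumerate()
        .fold(0u64, |acc, (i, &b)| acc | (u64::from(b) << (8 * i)))
}

/// XORs one 64-bit block into the state around `rounds` SipRounds.
fn absorb(v: &mut [u64; 4], block: u64, rounds: usize) {
    v[3] ^= block;
    for _ in 0..rounds {
        sip_round(v);
    }
    v[0] ^= block;
}

/// SipHash-2-4 of `msg` under the 128-bit key (k0, k1).
pub fn siphash_2_4(k0: u64, k1: u64, msg: &[u8]) -> u64 {
    // Initial state: the key halves XORed with fixed constants.
    let mut v = [
        k0 ^ 0x736f_6d65_7073_6575,
        k1 ^ 0x646f_7261_6e64_6f6d,
        k0 ^ 0x6c79_6765_6e65_7261,
        k1 ^ 0x7465_6462_7974_6573,
    ];

    // Full 64-bit blocks: 2 rounds each.
    let mut chunks = msg.chunks_exact(8);
    for chunk in &mut chunks {
        absorb(&mut v, read_le(chunk), 2);
    }

    // Last block: leftover bytes plus the message length in the top byte.
    let last = read_le(chunks.remainder()) | ((msg.len() as u64) << 56);
    absorb(&mut v, last, 2);

    // Finalisation: 4 rounds, then fold the state into 64 bits.
    v[2] ^= 0xff;
    for _ in 0..4 {
        sip_round(&mut v);
    }
    v[0] ^ v[1] ^ v[2] ^ v[3]
}
```

With the key bytes 00 to 0f (k0 = 0x0706050403020100, k1 = 0x0f0e0d0c0b0a0908) and the 15-byte message 00 to 0e, this returns 0xa129ca6149be45e5, the test vector given in the SipHash paper. Note how little work happens per block: the security comes from the secret key, not from heavy mixing.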

Rust Implementation:

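Rust’s standard HashMap uses SipHash by default. More precisely, the current implementation is SipHash-1-3 with randomly generated keys, but the documentation leaves the exact algorithm unspecified, so it may change in a future release.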

use std::collections::HashMap;

fn main() {
   let mut map = HashMap::new();
   map.insert("key", "value");
   println!("{:?}", map.get("key"));
}

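If you need to compute hashes directly, avoid std::hash::SipHasher: it has been deprecated since Rust 1.13, and SipHasher13 is not part of the stable public API. The supported route is std::collections::hash_map::DefaultHasher, the hasher that HashMap uses by default. Its algorithm is unspecified and may change between releases, so don’t store these values or send them over the network. If you need a specific SipHash variant, use a dedicated crate such as siphasher.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn main() {
   // Every hasher created with DefaultHasher::new() starts from the same state.
   let mut hasher = DefaultHasher::new();
   "hello world".hash(&mut hasher);
   let hash = hasher.finish();
   println!("Hash is: {}", hash);
}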

Use Cases:

  • Safeguard against DoS attacks that exploit hash functions.
  • General-purpose hashing with a good balance of speed and security.
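The flooding protection comes from the secret key, and you can observe it with the standard library alone. The sketch below hashes the same string under two RandomState values, each of which carries its own random keys. The helper names hash_with and keyed_demo are made up for this example.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hashes `key` with the keys held by `state`.
pub fn hash_with<S: BuildHasher>(state: &S, key: &str) -> u64 {
    let mut hasher = state.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Returns (hash under A, hash under A again, hash under B).
pub fn keyed_demo(key: &str) -> (u64, u64, u64) {
    let a = RandomState::new(); // one set of random keys
    let b = RandomState::new(); // a different set of random keys
    (hash_with(&a, key), hash_with(&a, key), hash_with(&b, key))
}
```

The first two values always match, because they use the same keys. The third will almost certainly differ, and all three change from one run of the program to the next. An attacker who cannot see the keys cannot prepare a list of colliding inputs ahead of time, which is exactly what an unkeyed hash cannot offer.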

Performance Trade-offs:

Which hash function fits best depends on the context you’re deploying it in, and Rust makes it easy to swap one hasher for another.

1. FNV:

  • Speed: FNV’s main advantage is its speed, especially for short keys, since it performs only one XOR and one multiplication per input byte.
  • Predictability: Its simplicity, however, can be a downside. FNV has no secret key, so an attacker who knows you’re using it can compute colliding inputs in advance, slowing down operations in data structures like hash maps.

2. SipHash:

  • Security: SipHash’s design focuses on protection against hash-flooding attacks. This is crucial for general-purpose scenarios where the input might be adversarial.
  • Speed Trade-off: While SipHash is fast, it’s typically slower than FNV, especially for short keys. However, its robustness often makes up for this modest cost in performance.

Picking the Right Hash Function in Rust:

1. Evaluate Your Threat Model: If you’re designing a system exposed to untrusted inputs (e.g., a public web service), SipHash is safer due to its resistance to hash DoS attacks.

2. Consider Your Data: For scenarios where your keys are known to be short and non-adversarial (like certain in-memory operations), FNV’s speed might be beneficial.

3. Understand Rust’s Defaults: Rust’s HashMap uses SipHash by default because it provides a good balance for general use cases. However, understanding why and when to opt for an alternative is crucial for performance-critical applications.

Extending the Landscape of Hashing in Rust:

Beyond FNV and SipHash, Rust’s ecosystem offers a variety of hashing algorithms to fit different contexts. Being an expressive language emphasising performance and safety, Rust allows developers to leverage its robust type system, ownership model, and vast library ecosystem to implement and utilise hashing effectively.

New Entrants in Rust’s Hashing Ecosystem:

1. ahash: This is a high-speed (but non-cryptographic) hashing algorithm designed explicitly for Rust’s HashMap. It’s considerably faster than SipHash in many scenarios and is a good choice when performance is paramount and cryptographic strength isn’t necessary.

2. blake3: An evolution of BLAKE2, BLAKE3 is a cryptographic hash function whose authors report it to be faster than MD5, SHA-1, SHA-2, SHA-3, and BLAKE2. The Rust implementation takes full advantage of SIMD instructions, which makes it a good fit where you need both throughput and cryptographic security, such as content hashing and integrity checks, rather than as a HashMap hasher.

Integrating Custom Hashers in Rust:

Rust allows developers to define custom hashers and integrate them seamlessly with standard library data structures.

For example, to use ahash with Rust’s HashMap:

use std::collections::HashMap;
use ahash::AHasher;
use std::hash::BuildHasherDefault;

// A HashMap whose hasher is built through AHasher's Default implementation.
type AHashMap<K, V> = HashMap<K, V, BuildHasherDefault<AHasher>>;

fn main() {
   let mut map: AHashMap<&str, &str> = AHashMap::default();
   map.insert("key", "value");
   println!("{:?}", map.get("key"));
}
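The same pattern works for a hasher you write yourself. As a sketch, here is a hand-rolled 64-bit FNV-1a hasher plugged into HashMap using only the standard library. The type names (Fnv1aHasher, Fnv1aHashMap) and the helper build_example_map are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Hypothetical hasher for this example: 64-bit FNV-1a.
pub struct Fnv1aHasher(u64);

impl Default for Fnv1aHasher {
    fn default() -> Self {
        // Start from the 64-bit FNV offset basis.
        Fnv1aHasher(0xcbf2_9ce4_8422_2325)
    }
}

impl Hasher for Fnv1aHasher {
    /// Called (possibly several times) with the bytes of the key.
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            // Multiply by the 64-bit FNV prime, wrapping at 2^64.
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    /// Returns the hash of everything written so far.
    fn finish(&self) -> u64 {
        self.0
    }
}

/// BuildHasherDefault creates a fresh hasher through `Default` for each key.
pub type Fnv1aBuildHasher = BuildHasherDefault<Fnv1aHasher>;
pub type Fnv1aHashMap<K, V> = HashMap<K, V, Fnv1aBuildHasher>;

/// Builds a small map that uses the custom hasher.
pub fn build_example_map() -> Fnv1aHashMap<&'static str, u32> {
    let mut map = Fnv1aHashMap::default();
    map.insert("one", 1);
    map.insert("two", 2);
    map
}
```

Only two trait methods are required, write and finish, and the map is created with default() rather than new(), because new() is only defined for the standard RandomState hasher. Keep in mind that this hasher is unkeyed, so it is only appropriate for keys you trust.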

Benchmarks and Comparisons:

When selecting a hashing algorithm, it’s essential to run benchmarks that reflect your own use case. While Rust’s ecosystem often provides benchmarks comparing different hash functions, real-world performance can vary based on data patterns, system architecture, and workload characteristics.

Tools like criterion-rs can help you conduct precise benchmarks in Rust, ensuring you make an informed decision.
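Before setting up a full benchmark suite, a rough first measurement is possible with the standard library alone. The sketch below times any hasher that implements Hasher and Default over a list of keys. The names time_hasher, time_std_default and sample_keys are made up for this example, and a single timed pass like this is far noisier than what a proper benchmarking tool reports.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hashes every key once with a fresh `H` and returns the elapsed time
/// plus an XOR of all hashes, so the work cannot be optimised away.
pub fn time_hasher<H: Hasher + Default>(keys: &[String]) -> (Duration, u64) {
    let start = Instant::now();
    let mut checksum = 0u64;
    for key in keys {
        let mut hasher = H::default();
        black_box(key).hash(&mut hasher);
        checksum ^= hasher.finish();
    }
    (start.elapsed(), black_box(checksum))
}

/// Convenience wrapper that measures the standard library's hasher.
pub fn time_std_default(keys: &[String]) -> (Duration, u64) {
    time_hasher::<DefaultHasher>(keys)
}

/// Hypothetical workload: `count` short string keys.
pub fn sample_keys(count: usize) -> Vec<String> {
    (0..count).map(|i| format!("key-{}", i)).collect()
}
```

Call time_std_default(&sample_keys(100_000)) in a release build, then call time_hasher with another hasher type on the same keys and compare the durations. Replace sample_keys with keys that look like your real data, since key length and distribution change the picture considerably.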

Best Practices:

  1. Regularly Update Dependencies: Hashing algorithms can have vulnerabilities. Regularly updating your Rust packages ensures you benefit from the latest security patches.
  2. Balance Performance and Security: While a hasher like FNV is lightning-fast, it might not be suitable for adversarial contexts. Always consider the security implications of your choice.
  3. Leverage Rust’s Type System: Rust allows you to create type aliases and newtypes. Use them to make your code more expressive and ensure the correct hasher is used in the proper context.
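The third practice can be sketched with two type aliases and a newtype. All the names below (UntrustedMap, InternalMap, SessionTable) are hypothetical, and DefaultHasher is only a stand-in for whichever fast hasher you pick for trusted keys.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// For keys that arrive from untrusted sources: randomly keyed hasher.
pub type UntrustedMap<K, V> = HashMap<K, V, RandomState>;

/// For keys generated inside the program. `DefaultHasher` is a stand-in
/// here; swap in the fast hasher of your choice.
pub type InternalMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;

/// Hypothetical newtype: the map is private, so callers cannot
/// build a SessionTable around a different hasher.
pub struct SessionTable(UntrustedMap<String, u64>);

impl SessionTable {
    pub fn new() -> Self {
        SessionTable(UntrustedMap::default())
    }

    /// Stores the user id for a session token.
    pub fn insert(&mut self, token: &str, user_id: u64) {
        self.0.insert(token.to_owned(), user_id);
    }

    /// Looks up the user id for a session token, if any.
    pub fn user_for(&self, token: &str) -> Option<u64> {
        self.0.get(token).copied()
    }
}
```

The aliases document the intent at every use site, and because the hasher is part of the map’s type, passing an InternalMap where an UntrustedMap is expected fails to compile. The newtype goes one step further by hiding the map entirely.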

In Conclusion

Hashing is an intricate domain with trade-offs spanning performance, security, and domain-specific needs. With its focus on zero-cost abstractions and memory safety, Rust offers developers an expansive toolbox for hashing. By understanding the nuances of available hash functions, keeping up with the evolving ecosystem, and leveraging Rust’s unique features, developers can craft systems that are both performant and secure.

Stay tuned, and happy coding!

Check out more articles about Rust in my Rust Programming Library!

Visit my Blog for more articles, news, and software engineering stuff!

Follow me on Medium, LinkedIn, and Twitter.

All the best,

CTO | Senior Software Engineer | Tech Lead | AWS Solutions Architect | Rust | Golang | Java | ML AI & Statistics | Web3 & Blockchain
