手写哈希表：哈希函数、冲突解决与 LRU Cache 实战

哈希表（Hash Table），也叫散列表，是计算机科学中最重要的数据结构之一。它通过哈希函数将键（Key）映射到数组的某个索引位置，从而实现平均 O(1) 时间复杂度的插入、查找和删除。

JavaScript 的 Object 和 Map 本质上都是哈希表的实现。理解哈希表的底层原理，能帮助你写出更高效的代码，并在面试中游刃有余。

本文将手写实现 HashMap 和 HashSet，深入探讨哈希函数设计、冲突解决策略，并通过 LRU Cache 综合案例将所学知识融会贯通。

哈希表的核心思想

哈希表的核心可以用一句话概括：用空间换时间。

想象一个数组，我们希望根据某个键快速找到对应的值。最直接的方法是遍历数组，时间复杂度为 O(n)。如果键是整数且范围不大，可以直接用键作为数组索引，实现 O(1) 访问。但现实中键可能是字符串、对象等任意类型，范围也可能极大。

哈希函数的作用就是将任意类型的键转换为固定范围的整数索引，让我们能用数组实现 O(1) 的访问。

plaintext

键(Key) ──→ 哈希函数 ──→ 索引(Index) ──→ 数组中的位置
"name"        hash()          3            buckets[3]

哈希函数设计

一个好的哈希函数应该满足两个条件：

确定性：相同的键总是产生相同的哈希值
均匀分布：不同的键尽可能均匀地分布到各个桶中，减少冲突

简单的字符串哈希函数

function hashString(key, bucketSize) {
  let hash = 0;

  for (let i = 0; i < key.length; i++) {
    // charCodeAt 获取字符的 UTF-16 编码
    // 使用质数 31 作为乘数，减少哈希碰撞
    hash = (hash * 31 + key.charCodeAt(i)) % bucketSize;
  }

  return Math.abs(hash); // 确保非负
}

console.log(hashString('hello', 16)); // 某个 0-15 的整数
console.log(hashString('world', 16)); // 另一个 0-15 的整数

通用哈希函数

对于非字符串键，可以先将键转为字符串，再应用哈希：

function hashCode(key, bucketSize) {
  if (typeof key === 'number') {
    return Math.abs(key) % bucketSize;
  }

  const str = String(key);
  let hash = 0;

  for (let i = 0; i < str.length; i++) {
    const char = str.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // 转为 32 位整数
  }

  return Math.abs(hash) % bucketSize;
}

冲突解决策略

无论哈希函数设计得多好，**冲突（Collision）**总是不可避免的——两个不同的键可能映射到同一个索引。解决冲突有两种经典策略。

策略一：链地址法（Separate Chaining）

每个桶（bucket）不再存储单个值，而是存储一个链表（或数组）。当多个键哈希到同一索引时，将它们链接在一起。

plaintext

索引 0: null
索引 1: ["abc" -> 100] -> ["def" -> 200]
索引 2: ["ghi" -> 300]
索引 3: null

查找时先定位到桶，再在链表中遍历查找。如果哈希分布均匀，每个链表的长度会很小，接近 O(1)。

策略二：开放地址法（Open Addressing）

发生冲突时，按照某种探测序列在数组中寻找下一个空位。常见的探测方式：

探测方式	公式	特点
线性探测	`h(k, i) = (hash(k) + i) % size`	容易产生聚集
二次探测	`h(k, i) = (hash(k) + i²) % size`	减少聚集，但可能探测不到空位
双重哈希	`h(k, i) = (hash1(k) + i * hash2(k)) % size`	分布最均匀

本文使用链地址法实现，因为它实现简单、扩容灵活，且支持超过桶数量的元素。

手写 HashMap

class HashMap {
  constructor(initialCapacity = 16) {
    this.buckets = new Array(initialCapacity).fill(null).map(() => []);
    this.size = 0;
    this.capacity = initialCapacity;
    this.loadFactorThreshold = 0.75; // 负载因子阈值
  }

  // 哈希函数
  _hash(key) {
    if (typeof key === 'number') {
      return Math.abs(key) % this.capacity;
    }

    const str = String(key);
    let hash = 0;

    for (let i = 0; i < str.length; i++) {
      hash = (hash * 31 + str.charCodeAt(i)) % this.capacity;
    }

    return Math.abs(hash);
  }

  // 设置键值对
  set(key, value) {
    const index = this._hash(key);
    const bucket = this.buckets[index];

    // 检查是否已存在该键，存在则更新
    for (const entry of bucket) {
      if (entry.key === key) {
        entry.value = value;
        return;
      }
    }

    // 添加新键值对
    bucket.push({ key, value });
    this.size++;

    // 检查是否需要扩容
    if (this.size / this.capacity > this.loadFactorThreshold) {
      this._resize(this.capacity * 2);
    }
  }

  // 根据键获取值
  get(key) {
    const index = this._hash(key);
    const bucket = this.buckets[index];

    for (const entry of bucket) {
      if (entry.key === key) {
        return entry.value;
      }
    }

    return undefined;
  }

  // 删除键值对
  delete(key) {
    const index = this._hash(key);
    const bucket = this.buckets[index];

    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i].key === key) {
        bucket.splice(i, 1);
        this.size--;
        return true;
      }
    }

    return false;
  }

  // 检查键是否存在
  has(key) {
    return this.get(key) !== undefined;
  }

  // 获取所有键
  keys() {
    const keys = [];
    for (const bucket of this.buckets) {
      for (const entry of bucket) {
        keys.push(entry.key);
      }
    }
    return keys;
  }

  // 获取所有值
  values() {
    const values = [];
    for (const bucket of this.buckets) {
      for (const entry of bucket) {
        values.push(entry.value);
      }
    }
    return values;
  }

  // 清空哈希表
  clear() {
    this.buckets = new Array(this.capacity).fill(null).map(() => []);
    this.size = 0;
  }

  // 扩容：重新计算所有键的哈希值
  _resize(newCapacity) {
    const oldBuckets = this.buckets;
    this.capacity = newCapacity;
    this.buckets = new Array(newCapacity).fill(null).map(() => []);
    this.size = 0;

    for (const bucket of oldBuckets) {
      for (const entry of bucket) {
        this.set(entry.key, entry.value);
      }
    }
  }
}

使用示例

const map = new HashMap();

map.set('name', 'Alice');
map.set('age', 25);
map.set('city', 'Beijing');

console.log(map.get('name'));   // Alice
console.log(map.get('age'));    // 25
console.log(map.has('city'));   // true

map.set('age', 26);             // 更新
console.log(map.get('age'));    // 26

map.delete('city');
console.log(map.has('city'));   // false

console.log(map.keys());        // ['name', 'age']
console.log(map.values());      // ['Alice', 26]

手写 HashSet

HashSet 是只存储键、不存储值的集合结构，底层就是值被忽略的 HashMap。

class HashSet {
  constructor(initialCapacity = 16) {
    this.map = new HashMap(initialCapacity);
  }

  add(value) {
    this.map.set(value, true); // 值用布尔标记即可
  }

  has(value) {
    return this.map.has(value);
  }

  delete(value) {
    return this.map.delete(value);
  }

  get size() {
    return this.map.size;
  }

  clear() {
    this.map.clear();
  }

  values() {
    return this.map.keys();
  }
}

// 使用示例
const set = new HashSet();
set.add(1);
set.add(2);
set.add(1); // 重复添加无效

console.log(set.has(1));    // true
console.log(set.has(3));    // false
console.log(set.size);      // 2
console.log(set.values());  // [1, 2]

负载因子与扩容机制

负载因子（Load Factor） = 元素数量 / 桶数量

负载因子越高，冲突概率越大，链表越长，性能越差。当负载因子超过阈值（通常为 0.75）时，需要扩容：

创建一个新的大小为原来两倍的桶数组
遍历旧桶中的所有元素
用新的容量重新计算哈希值，放入新桶

扩容的时间复杂度是 O(n)，但均摊到每次插入操作中仍然是 O(1)。这是均摊分析的经典案例。

实战：LRU Cache

**LRU（Least Recently Used）**缓存是一种常用的缓存淘汰策略：当缓存满时，优先淘汰最久未使用的数据。

LRU Cache 需要同时满足两个需求：

O(1) 访问：通过哈希表实现
O(1) 更新访问顺序：通过双向链表实现

class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.cache = new Map(); // 哈希表：key -> node

    // 双向链表节点
    this.head = { key: null, value: null }; // 伪头节点（最近使用）
    this.tail = { key: null, value: null }; // 伪尾节点（最久未使用）
    this.head.next = this.tail;
    this.tail.prev = this.head;
  }

  // 获取值，并移动到链表头部（标记为最近使用）
  get(key) {
    if (!this.cache.has(key)) {
      return -1;
    }

    const node = this.cache.get(key);
    this._moveToHead(node);
    return node.value;
  }

  // 设置键值对
  put(key, value) {
    if (this.cache.has(key)) {
      // 更新现有值
      const node = this.cache.get(key);
      node.value = value;
      this._moveToHead(node);
      return;
    }

    // 创建新节点
    const newNode = { key, value };
    this.cache.set(key, newNode);
    this._addToHead(newNode);

    // 超出容量，淘汰尾部节点
    if (this.cache.size > this.capacity) {
      const removed = this._removeTail();
      this.cache.delete(removed.key);
    }
  }

  // 在链表头部添加节点
  _addToHead(node) {
    node.prev = this.head;
    node.next = this.head.next;
    this.head.next.prev = node;
    this.head.next = node;
  }

  // 移除节点
  _removeNode(node) {
    node.prev.next = node.next;
    node.next.prev = node.prev;
  }

  // 将节点移动到头部
  _moveToHead(node) {
    this._removeNode(node);
    this._addToHead(node);
  }

  // 移除尾部节点（最久未使用）
  _removeTail() {
    const node = this.tail.prev;
    this._removeNode(node);
    return node;
  }
}

使用示例

const cache = new LRUCache(2);

cache.put(1, 'A'); // cache: [1:A]
cache.put(2, 'B'); // cache: [2:B, 1:A]

console.log(cache.get(1)); // 返回 'A'，cache: [1:A, 2:B]

cache.put(3, 'C'); // cache 满，淘汰 2，cache: [3:C, 1:A]

console.log(cache.get(2)); // 返回 -1（已淘汰）

cache.put(4, 'D'); // 淘汰 1，cache: [4:D, 3:C]

console.log(cache.get(1)); // 返回 -1
console.log(cache.get(3)); // 返回 'C'
console.log(cache.get(4)); // 返回 'D'

LRU Cache 是 LeetCode 第 146 题，也是面试中极常考的一道题。其核心思路就是哈希表 + 双向链表的组合：哈希表提供 O(1) 的键查找，双向链表维护元素的访问顺序。

总结

概念	说明
哈希函数	将任意键映射为固定范围的整数索引
冲突	不同键映射到同一索引，不可避免
链地址法	每个桶存储链表，冲突元素链接在一起
开放地址法	冲突时按探测序列寻找下一个空位
负载因子	元素数/桶数，超过阈值需要扩容
扩容	重建更大的桶数组，重新哈希所有元素

哈希表的 O(1) 时间复杂度是均摊的，依赖良好的哈希函数和合适的负载因子。理解这些底层原理，能帮助你在使用 JavaScript 的 Map 和 Set 时做出更明智的选择，也能在面试中从容应对手写哈希表的要求。

LRU Cache 作为哈希表与双向链表结合的经典案例，值得反复练习，直至能手写出完整的正确代码。

手写哈希表：哈希函数、冲突解决与 LRU Cache 实战

手写哈希表：哈希函数、冲突解决与 LRU Cache 实战

哈希表的核心思想

哈希函数设计

简单的字符串哈希函数

通用哈希函数

冲突解决策略

策略一：链地址法（Separate Chaining）

策略二：开放地址法（Open Addressing）

手写 HashMap

使用示例

手写 HashSet

负载因子与扩容机制

实战：LRU Cache

使用示例

总结

评论

Related Articles

手写哈希表：哈希函数、冲突解决与 LRU Cache 实战

手写图结构：邻接表、遍历算法与最短路径

JavaScript 对象与原型