Using Tries to Store Word Lists in RAM

July 10th, 2012

Filed under: Game Development | Be the first to comment!

One task every word game must perform is determining whether or not the word the player creates is a valid word. To check if a word is a valid, the game needs a word list and check if the word the player creates is in the list. How do you store the word list so you can check it quickly?

One answer is to use a trie. A trie is a data structure that indexes and searches for words.

The following diagram shows a trie for the words ace, aced, aces, acre, acres, act, acted, acting, and acts:

TrieExampleCropped

 

The silver nodes represent the end of a word. Notice the memory efficiency of using a trie. 15 nodes can represent nine words. Storing the nine words separately entails storing 38 characters in memory. Imagine a dictionary containing tens of thousands of words and you can see how much memory you can save by storing a word list as a trie.

Traversing a trie is also fast. The speed of searching for a word in a trie depends on the length of the word, not the number of words in the list. This means you can search for a word quickly in a list that contains tens of thousands of words, which is important in a word game.

Implementing a Trie

Implementing a trie involves creating two data structures: trie node and trie. A trie node stores the following data:

  • The letter.
  • Whether or not the node is the end of a word.
  • A list of child nodes. If you’re using a C-based language, the list would be a pointer to a trie node.

You must perform the following actions on a trie node: add a child and find a child. Adding a child adds a trie node to the list of child nodes. Finding a child involves going through each node in the child list and checking that the letter of the node matches the desired letter.

A trie consists of a list of trie nodes. In a C-based language the list would be a pointer to a trie node data structure.

You must perform the following actions on a trie: add a word and find a word. To add a word start at the root of the tree and go through each letter in the word. For each letter check if it is already in the trie. If it is in the trie, make that node the current node and move on to the next letter. If the letter is not in the trie, add a node for the letter to the trie.

To find a word, go through the letters in the word and drill down the trie until you get to the last letter. If you reach a NULL node before you get to the end of the trie, the word does not exist in the trie. If you get to the end of the word, check that the current node is the end of a word. If it is, the word was found.

Source Code Implementations

You may have noticed this post is light on source code. That is because you can find many source code implementations of tries on the Internet. I will link to three of them here.

The code in the C++ implementation link worked well for me so I decided to link to it instead of posting my code.


Leave a Reply

Your email address will not be published. Required fields are marked *