This script is not the most elegant since I only had 2, 3 days to do it (really it's a dump of chunk of codes) but it works so hoorah! Basically what the script does is:
- Read a file (*.txt) and determine frequency of each characters and total bits.
- Build a Huffman tree based on previous frequency table. Each node keeps a character & its frequency, path (eg. 011, 0011, 01110 etc), vertex flag (whether it's a left or right child) and visit flag (whether it has been visited or not - for node searching later). Note though, not every node keeps a character, only leaf nodes do (Huffman tree properties). Sorting algorithm to build priority queue to work with when building the tree is selection sorting algorithm.
- Follow every root-to-leaf path to get the bits for each character. This script uses breadth-first search algorithm to look for nodes inside the tree.
- Write compression information in a new file (compressed one) but this script only write the bits sequence into the file.
The file here: huffman.cpp (C++)