Wednesday, January 14, 2009

Reusable CRC-32 hash implementation (with source) as seen in the free, multi-threaded, command-line file hashing utility ComputeFileHashes (also with source)

Delay's Blog - Trust, but verify [Free tool (and source code) for computing commonly used hash codes!]

“…

Popular hash functions in use today are MD5 and SHA-1, with CRC-32 rapidly losing favor. Speaking in very broad terms, one might say that the quality of CRC-32 is "not good", MD5 is "good", and SHA-1 is "very good". For now, that is; research is always under way that could render any of these algorithms useless tomorrow... (For more information about the weaknesses of each algorithm, refer to the links above.)

In order for published checksums to be useful, the user needs an easy way to calculate them. I looked around a bit and didn't find a lot of free tools for computing these popular hash functions that I was comfortable with, so I wrote my own using .NET. Here it is:

One of the things that was important to me when writing ComputeFileHashes was performance. Nobody likes to wait, and I'm probably even less patient than the average bear. One of the things I wanted my program to do was take advantage of multi-processing and the multi-core CPUs that are so prevalent these days. So ComputeFileHashes runs the three hash functions in parallel with each other and with loading the next bytes of the file. Theoretically, this can take advantage of four different cores - though in practice my limited testing suggests there's just not enough work to saturate them all. :)

…”
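To get a rough feel for the pipeline described in that excerpt, here is a minimal Python sketch. It is not the .NET source from the linked post; the names `compute_file_hashes` and `Crc32`, and the 1 MiB chunk size, are my own illustrative assumptions. It updates the three hashes on worker threads while the main thread reads the next chunk, which is the overlap the excerpt describes.

```python
import hashlib
import zlib
from concurrent.futures import ThreadPoolExecutor, wait


class Crc32:
    """Hypothetical wrapper giving zlib.crc32 the same update() shape as hashlib objects."""

    def __init__(self):
        self.value = 0

    def update(self, data):
        # zlib.crc32 accepts a running value, so the checksum can be built chunk by chunk.
        self.value = zlib.crc32(data, self.value)

    def hexdigest(self):
        return format(self.value & 0xFFFFFFFF, "08x")


def compute_file_hashes(path, chunk_size=1 << 20):
    """Return CRC-32, MD5, and SHA-1 hex digests for the file at `path`."""
    hashers = {"crc32": Crc32(), "md5": hashlib.md5(), "sha1": hashlib.sha1()}
    with open(path, "rb") as f, ThreadPoolExecutor(max_workers=len(hashers)) as pool:
        chunk = f.read(chunk_size)
        while chunk:
            # Start all three updates on the current chunk...
            pending = [pool.submit(h.update, chunk) for h in hashers.values()]
            # ...and load the next chunk while they run.
            next_chunk = f.read(chunk_size)
            # Each hasher must finish this chunk before it sees the next one.
            wait(pending)
            for future in pending:
                future.result()  # re-raise any exception from a worker
            chunk = next_chunk
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Calling `compute_file_hashes("some_file.bin")` (a placeholder path) returns a dictionary with the keys `crc32`, `md5`, and `sha1`. How much real parallelism this buys in Python depends on the interpreter releasing its global lock during large hash updates, so treat it as an illustration of the structure, not a benchmark of the original tool.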

Delay's Blog - Free hash [A reusable CRC-32 HashAlgorithm implementation for .NET]

“In the notes for yesterday's release of the ComputeFileHashes tool (and source code), I mentioned that I'd written my own .NET HashAlgorithm class to compute CRC-32 hash values. The complete implementation can be found below and should behave just like every other HashAlgorithm subclass (ex: MD5 or SHA1). The code here is based on the CRC-32 reference implementation provided in Annex D of the PNG specification and pretty much "just worked". It implements the necessary Initialize, HashCore, and HashFinal methods as well as the technically optional (but practically necessary) Hash and HashSize properties. There's no test code to speak of, though it's worth pointing out that I've run tens of gigabytes of data through my ComputeFileHashes tool and have verified the correctness of the computed CRC-32 value for each test file. :)

Without further ado:

…”

Since hashes are a significant part of my work world, related articles usually catch my eye. And since these two included source code, I thought I’d capture them for future reference (and in case you might find them interesting too ;)

 

Related Past Post XRef:
Have some HASHes with your Shell – HashTab, the File Hash Explorer Shell Extension (a new “Must Have” Shell Extension?)
File Hash Generator Explorer Shell Extension - MD5 a file via right-click in Windows Explorer...
MD5 Hash SQL Server Extended Stored Procedure
Do we really need to say goodbye to MD5’s? There are 340,282,366,920,938,463,463,374,607,431,768,211,456 reasons there’s maybe no rush…
MD5 Collisions
