30 thg 10, 2010

Representing Arrays as Streams of Bits

NGUON: http://www.codeguru.com/cpp/cpp/cpp_mfc/bits/article.php/c4087/Representing-Arrays-as-Streams-of-Bits.htm

  • 1
Environment: MFC
I wrote this class when I needed to extract separate bits of data from an array, one by one. After some calculations, I found I needed to put modified bits in another array of data. I think this class may be useful, so here it is.
This class operates with data located in the CByteArray object. So, the common steps to use CBitStream are:
  1. Declare a CByteArray object.
  2. Declare a CBitStream object, supplying a reference to the newly created CByteArray object to the constructor of the first one.
  3. Enjoy. :)
To set/get the next bit, you can use either overloaded shifting operators or SetBit()/GetBit() functions. Also, you can navigate inside the array of data. You can set a "pointer" to any bit inside your array.
Also, you can choose the order in which bits of every byte will be processed. The default is a "left-to-right" order. In this order, the bits affected are: 7th, 6th, 5th, 4th, 3rd, 2nd, 1st, and then the 0 bit. You can define the RIGHTTOLEFTFILL identifier to change the order to right-to-left, in which bits are accessed: 0, 1st, 2nd, 3rd, 4th, 5th, 6th, and then 7th. I prefer the left-to-right order because it is easier to understand. Look to the picture below. a) is the left-to-right order (default) and b) is the right-to-left order.
CByteArray array;
CBitStream stream(array);

stream<<0<<1<<0<<0<<1<<0<<0<<1;
// The first byte now consists of 01001001 bits, so it equals 73
// (0x49). It's true when you are using the left-to-right order.
stream.SetBit(0);
stream.SetBit(1);
stream.SetBit(0);
stream.SetBit(0);
stream.SetBit(1);
stream.SetBit(0);
stream.SetBit(0);
stream.SetBit(1);
// The second byte is now the same as the first.

Downloads

Download demo project - 10 Kb
Download source - 3 Kb

A Compressed Bitset Class

A Compressed Bitset Class

  • 1

Introduction

The Standard Template Library bitset class is a useful class suitable for most bitset operations, but if you are dealing with lots of large bitsets, it may be that the stl bitset class is missing a trick.
In many applications using bitsets, you will find that a typical bitset is either sparsely populated with bits, or conversely almost fully set. In these cases, a large bitset can use a lot of memory for what is actually being represented.
To overcome this, we have developed the CMJBitset class as a plug-in replacement for bitset. The CMJBitset classm depending on compilation optionsm may take as little as 7 bytes to represent a bitset of any size, assuming all the bits are set or reset. In comparision, a 1024 bitset will take 128 bytes. In essence, theCMJBitset operates by run length encoding a bitset if the bitset is either almost all set/reset, but otherwise uses the STL bitset class. We will refer to the different encodings as normalsparse, andfull.

Using the Code

General usage

Using the CMJBitset class should be simple. Replace bitset<N> in your code with CMJBitset<N> and include the CMJBitset.h header file. Because CMJBitset presents the same interface as bitset, this should be enough to get things working.

Loading/Saving the bitset

The class would not give a fraction of its potential benefit if you were unable to save and load the bitset to and from file. Hence, these member functions:
bool load(FILE *)
and
bool save(FILE *)
are provided; but perhaps more to the point, all members of the class are public, enabling you to load and save at will.

Compilation Options

If you examine MJBitset.h, there are a number of compilation options.

For debugging

This is used by the example project to test the CMJBitset class. I would also recommend that when trying out the CMJBitset class you do some testing with this as well. As of the current date, a "full coverage" test has not been carried out on the class; therefore, bugs are a definite possibility.
//
// When DEBUG_CMMJBITSET is defined, a shadow bitset is kept and
// validated against operations. This is obviously very slow
// uses loads of memory and defeats any purpose in using this
// class and hence should never be used outside of a debug session.
//
//
//#define DEBUG_CMJBITSET

The maximum bitset size you need to deal with

The default is for a maximum of 0xffff.
//
// data representation type for bitsets <= 256
// CMJBITSET_USE_CHAR should be defined
//

//#define CMJBITSET_USE_CHAR
#define CMJBITSET_USE_SHORT
//#defineCMJBITSET_USE_LONG

The Example Project

The example project is a simple win32 console application that attempts to regress the CMJBitset by creating a succession of bitmaps with random population levels and performing a reasonably comprehensive set of operations on the bitsets.
You easily may alter the regression test to compare performance between CMJBitset and bitset.

Points of Interest

Performance

The performance of the CMJBitset class is quite hard to quantify. For bitsets that it does not encode because they are neither full ror sparse, the CMJBitset class is clearly several times slower thanbitset.
If you run the example regression test and make it use sparse bitsets, the times will still be a little faster than bitset. But, the comparitive speed will dramatically increase once the operations cease to be completely in memory either due to paging or loading/saving to files.
We have tested the class in the WhereWasI product and seen dramatic reductions in memory usages and very significant speed increases.
It is also notable that some operations with CMJBitset are dramatically faster than bitsetflip(), for example, simply needs to invert the full and sparse representation types.
At the top of the source file, there are a number of compilation options that affect the way CMJBitsetallocates memory; the key option is set by default and reserves space within the class for bitsets with up to 4 bits set/unset. This often avoids the overhead of time and memory of using malloc/free, and in our examples is a definite performance bonus. Your mileage may vary and you may want to tune performance by changing some of these values.

History

  • June 14th, 2004 - 1.00 Released.
  • June 18th, 2004 - 1.01 Released. - Contains a number of optimisations
  • June 22nd, 2004 - 1.02 Released. - Fixes compilation bug with VC7.0. Fixes operator precedence bug that would effect normal bitset performance

Downloads

  • cmjbitset_demo.zip

  • cmjbitset_src.zip
    • 1

  • No comments.
    (You must be signed in to comment on an article. Not a member? Click here to register)