Reading bits from a byte with Python

I have instructions describing the structure of a binary file, and I’m trying to build a parser to extract information from it. I was doing all right until I came across the following:

Start with a DWORD Size = 0. You’re going to reconstruct the size by getting packs of 7 bits:

  1. Get a byte.

  2. Add the first 7 bits of this byte to Size.

  3. Check bit 7 (the last bit) of this byte. If it’s on, go back to 1. to process the next byte.

To resume, if Size < 128 then it will occupy only 1 byte, else if Size < 16384 it will occupy only 2 bytes and so on…

What I’m confused about is what it means to "get bits from a byte", and to "check the last bit of the byte". This is the way I’ve been reading bytes from the file:

    from struct import *
    #..... some other blocks of code
    self.standard = {"DWORD": 4, "WORD": 2, "BYTE": 1, "TEXT11": 1, "TEXT12": 2}
    st = self.standard
    size = 0
    data = unpack("b", f.read(st["BYTE"]))
    #how to get bits???
    if size < 128:
        #use st["TEXT11"]
    elif size < 16384:
        #use st["TEXT12"]
2 Answer(s)

What I’m confused about is what it means to "get bits from a byte"

You do that using bit operations. For example, to get the first (lower) 7 bits of a byte, use

byte & 127 

Or, equivalently,

byte & 0x7f 

Or

byte & 0b1111111 

In your case, byte would be the first and only member of the tuple data.
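As a quick check, here is a minimal sketch showing that the three masks are the same number, and how the byte can be pulled out of the tuple that struct.unpack returns. The sample byte 0xAC and the "B" (unsigned) format character are assumptions chosen for illustration; they are not taken from the question's file.

```python
import struct

raw = b"\xac"                      # hypothetical sample byte: 0b10101100
(byte,) = struct.unpack("B", raw)  # "B" = unsigned char, so byte is in 0..255

low7 = byte & 0x7F                 # keep bits 0..6, drop bit 7
same = (byte & 127, byte & 0x7F, byte & 0b1111111)
print(low7, same)
```

The tuple unpacking `(byte,) = ...` is just a compact way of writing `byte = data[0]`.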

To get the last bit (bit 7), mask it with & 0x80 and, if you want the result as 0 or 1, shift it down into position with >> 7. In your case, since you only need to check whether the bit is set, the shift isn’t necessary: byte & 0x80 is nonzero exactly when bit 7 is on.
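Putting the mask and the bit-7 check together, the loop from the quoted instructions might look roughly like the sketch below. It rests on one assumption the instructions do not state: that the first byte carries the least significant 7 bits, so each later group is shifted 7 bits further left. If the file stores the most significant group first, the accumulation line would need to change. The name read_size is hypothetical, and io.BytesIO only stands in for the real file object.

```python
import io

def read_size(stream):
    """Decode a size stored as 7-bit groups, one group per byte.

    Assumption: the first byte holds the least significant 7 bits
    (the quoted instructions do not state the order).
    """
    size = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended in the middle of a size field")
        byte = chunk[0]                  # indexing bytes gives an int in Python 3
        size |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not byte & 0x80:              # bit 7 clear: this was the last byte
            return size
        shift += 7

print(read_size(io.BytesIO(b"\x7f")))      # one byte, bit 7 clear
print(read_size(io.BytesIO(b"\x80\x01")))  # two bytes, bit 7 set on the first
```

Under that assumption, any value below 128 fits in one byte and any value below 16384 fits in two, which agrees with the summary in the instructions.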


The confusion may come from how an integer is represented in binary. For example, the number 171 corresponds to this bit pattern (1 byte):

val = 0b10101011  # (bit configuration)
print(val)        # -> 171 (integer value)

Now you can use a bit mask to let only one of those bits through (the literal is written with the most significant bit on the left):

print(val & 0b00000001)  # -> only the first bit passes, so it prints 1
print(val & 0b10000000)  # -> only the last bit passes, so it prints 128
print(val & 0b00000100)  # -> prints 0 because val does not have a 1 in the third position

Then, to check whether the seventh bit (counting from 1 at the right, so bit 6 when counting from 0) is 1, mask it and shift the result down:

print((val & 0b01000000) >> 6)
# val    = 0b10101011
#             ^
# mask   = 0b01000000
# result = 0b00000000 -> 0 (integer)
# shift  =    ^123456 -> 0b0

The bit shift (>> operator) moves the masked bit down to the lowest position, so the result is 0 or 1 rather than 0 or the value of the mask.

For example, if you want the second bit:

print((val & 0b00000010) >> 1)
# val    = 0b10101011
#                  ^
# mask   = 0b00000010
# result = 0b00000010 -> 2 (integer)
# shift  =         ^1 -> 0b1 -> 1 (integer)
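The same mask-and-shift idea can be wrapped in a small helper so that the position does not have to be written out as a binary literal each time. This is only a sketch: get_bit is a hypothetical name, positions are counted from 0 at the least significant bit, and it shifts first and then masks with 1, which gives the same result as masking first and shifting afterwards.

```python
def get_bit(value, position):
    """Return bit `position` of value (0 = least significant) as 0 or 1."""
    return (value >> position) & 1

val = 0b10101011
bits = [get_bit(val, i) for i in range(8)]
print(bits)  # least significant bit first
```

With this numbering, get_bit(val, 1) corresponds to the "second bit" example above and get_bit(val, 6) to the "seventh bit" example.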
Answered on July 16, 2020.