Convert Binary Strings (ASCII) To Binary File

I have several large files (3-6 GB) of ASCII '1' and '0' characters, and I would like to convert them to simple binary files. Newlines are not important and should be discarded.

Solution 1:

How about this bash command?

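tr -d '\n' < test.bin | perl -pe '$_=pack"B*",$_' > true_binary.txt

'tr' deletes all newline characters, and the perl command packs every 8 characters of '0' and '1' into one byte. Note that perl is run without -l here, because -l would append a newline (a stray 0x0a byte) to the output.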
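
As a rough illustration of what the pack "B*" step does, here is a minimal Python sketch. The function name pack_bits is hypothetical, and the zero padding of a trailing partial byte is my reading of Perl's documented behaviour, so treat it as an assumption.

```python
def pack_bits(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, most significant bit first."""
    # Assumption: a trailing partial byte is padded with zero bits on the right.
    padded = bits + "0" * (-len(bits) % 8)
    # Each group of 8 characters is parsed as a base-2 number, giving one byte.
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


print(pack_bits("0111000110000000").hex(" "))  # 71 80
```

This holds the whole bit string in memory, so it is only meant to show the mapping, not to process multi-gigabyte files.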


Solution 2:

I don't know if this would solve the question, but how about this:

with open('ascii.txt', 'r') as file_ascii, open('binary.txt', 'wb') as file_bin:
    file_bin.write(bytes(''.join(file_ascii.read().split()), 'utf-8'))

Or, to overwrite the file:

with open('ascii.txt', 'r') as f:
    binary = bytes(''.join(f.read().split()), 'utf-8')

with open('ascii.txt', 'wb') as f:
    f.write(binary)

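Short, but note that this only strips whitespace: each '0' or '1' is still written as one ASCII byte, so the bits are not packed eight to a byte.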
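
For files of several GB, reading everything at once may not be practical. The following is a sketch of a chunked Python approach that does pack eight bits per byte. The names convert_stream and NOT_BITS, the default chunk size, and the right-hand zero padding of a trailing partial byte are all assumptions of mine, not part of the answer above.

```python
import io

# Every byte value except ASCII '0' and '1'; used to drop newlines and other noise.
NOT_BITS = bytes(b for b in range(256) if b not in b"01")


def convert_stream(src, dst, chunk_size=1 << 20):
    """Read ASCII '0'/'1' text from src and write packed bytes to dst.

    src and dst are file objects opened in binary mode.
    """
    pending = b""  # bits left over from the previous chunk (fewer than 8)
    while chunk := src.read(chunk_size):
        bits = pending + chunk.translate(None, NOT_BITS)
        usable = len(bits) - len(bits) % 8
        if usable:
            # Parse the whole run as one base-2 number and emit it big-endian.
            dst.write(int(bits[:usable], 2).to_bytes(usable // 8, "big"))
        pending = bits[usable:]
    if pending:
        # Assumption: pad a trailing partial byte with zero bits on the right.
        dst.write(int(pending.ljust(8, b"0"), 2).to_bytes(1, "big"))


# In-memory demonstration; for real files, open both in binary mode ('rb' / 'wb').
demo_out = io.BytesIO()
convert_stream(io.BytesIO(b"01110001\n10000000\n"), demo_out)
print(demo_out.getvalue().hex(" "))  # 71 80
```

Carrying the leftover bits from one chunk to the next keeps byte boundaries correct even when a chunk ends in the middle of a group of eight.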


Solution 3:

We could build a shell-only solution.
First, we transform the 1's and 0's into a stream of 8-character lines:

$ { cat test.bin | tr -cd '01' | fold -b8; echo; }
01110001
10000000
10100010
00001001
00011111
…
…
10011110
00010010
10010011
11010011
10010000
10011111
11100110

That's 560/8 lines, or 70 lines, which should translate to 70 bytes.
It should be said that these are not all ASCII characters: values above decimal 127 (hex 7f) are outside ASCII. I am interpreting them as byte values (unsigned, 0 to 255).

Then we can read each line and translate it first to decimal with "$((2#$a))" so that the shell understands it, then to a hex escape with printf '\\x%x', so that the final printf '%b' "…" can turn that escape into the corresponding byte:

$ { cat infile | tr -cd '01' | fold -b8; echo; } | 
    while read a; do printf '%b' "$(printf '\\x%x' "$((2#$a))")"; done 
q��     J�P�cP�XO�!u���(Έ�큅a���OoU�f[G�X2���Ȁ3����Ӑ��

Of course, the characters printed are (most probably) an incorrect interpretation of the byte values in whatever locale the user has set. A hex output may be more useful (depending on your needs):

$ { cat infile | tr -cd '01' | fold -b8; echo; } | 
    while read a; do printf '%b' "$(printf '\\x%x' "$((2#$a))")"; done |
        od -vAn -tx1c

  71  80  a2  09  1f  4a  82  50  e2  63  50  dc  22  08  00  58
   q 200 242  \t 037   J 202   P 342   c   P 334   "  \b  \0   X
  4f  c4  21  04  17  75  f1  f8  e6  28  ce  88  7f  07  ef  ed
   O 304   ! 004 027   u 361 370 346   ( 316 210 177  \a 357 355
  81  85  61  01  b1  00  10  f4  16  82  11  4f  6f  55  e3  82
 201 205   a 001 261  \0 020 364 026 202 021   O   o   U 343 202
  66  5b  47  f7  58  32  d5  f7  d6  00  c8  80  33  96  9c  9e
   f   [   G 367   X   2 325 367 326  \0 310 200   3 226 234 236
  12  93  d3  90  9f  e6
 022 223 323 220 237 346
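
As a quick cross-check of the dump above, the first five 8-character lines shown earlier can be converted by hand. This is only a Python sketch of the same per-line base-2 conversion, with variable names chosen for illustration.

```python
# First five lines of the fold output shown above.
lines = ["01110001", "10000000", "10100010", "00001001", "00011111"]

# Parse each line as a base-2 number, as "$((2#$a))" does in the shell loop.
data = bytes(int(line, 2) for line in lines)

print(data.hex(" "))  # 71 80 a2 09 1f
```

The result matches the first five bytes of the od output.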

Note that the same structure could be used for the file test_XY_encoded.txt, mapping X to 0 and Y to 1 first:

$ { cat infile | tr 'XY' '01' | tr -cd '01' | fold -b8; echo; } | 
    while read a; do printf '%b' "$(printf '\\x%x' "$((2#$a))")"; done | 
        od -vAn -tx1c

  71  80  a2  09  1f  4a  82  50  e2  63  50  dc  22  08  00  58
   q 200 242  \t 037   J 202   P 342   c   P 334   "  \b  \0   X
  4f  c4  21  04  17  75  f1  f8  e6  28  ce  88  7f  07  ef  ed
   O 304   ! 004 027   u 361 370 346   ( 316 210 177  \a 357 355
  81  85  61  01  b1  00  10  f4  16  82  11  4f  6f  55  e3  82
 201 205   a 001 261  \0 020 364 026 202 021   O   o   U 343 202
  66  5b  47  f7  58  32  d5  f7  d6  00  c8  80  33  96  9c  9e
   f   [   G 367   X   2 325 367 326  \0 310 200   3 226 234 236
  12  93  d3  90  9f  e6
 022 223 323 220 237 346
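
The same X/Y mapping can be sketched in Python with a translation table. The names XY_TO_BITS, NOT_XY01 and xy_to_bytes are hypothetical, and the sketch assumes the number of bits is a multiple of 8.

```python
# Map 'X' -> '0' and 'Y' -> '1', mirroring tr 'XY' '01'.
XY_TO_BITS = bytes.maketrans(b"XY", b"01")

# Everything that is not X, Y, 0 or 1 is dropped, mirroring tr -cd '01' after the mapping.
NOT_XY01 = bytes(b for b in range(256) if b not in b"XY01")


def xy_to_bytes(text: bytes) -> bytes:
    """Convert X/Y-encoded text to packed bytes, 8 symbols per byte."""
    bits = text.translate(XY_TO_BITS, NOT_XY01)
    # Assumption: len(bits) is a multiple of 8; a short final group would be
    # read as a small number rather than padded on the right.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(xy_to_bytes(b"XYYYXXXY\nYXXXXXXX\n").hex(" "))  # 71 80
```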