Reading binary files from PCs
on "other endian" machines

Written by Paul Bourke
March 1991


This document briefly describes the byte swapping required when a binary file created on a DOS/Windows machine is to be read on a computer which has its bytes ordered the other way.
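
Whether any swapping is needed depends on the byte order of the machine doing the reading. As a rough sketch (IsLittleEndian is a made-up name, not part of the original routines), the byte order of the host can be checked at run time by storing the value 1 and looking at the byte at the lowest address:

```c
#include <assert.h>
#include <string.h>

/*
   Hypothetical helper, not from the original article.
   Returns 1 if this machine stores the least significant byte first
   (little endian), 0 otherwise.
*/
int IsLittleEndian(void)
{
   unsigned short probe = 1;
   unsigned char first;

   assert(sizeof(probe) >= 2);   /* a 1 byte short would tell us nothing */
   memcpy(&first, &probe, 1);    /* the byte at the lowest address */
   return first == 1;
}
```

A reader of DOS/Windows files would then only swap bytes when IsLittleEndian() returns 0.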

There are various datatypes which may be read. The simplest is characters, where no byte swapping is required. The next simplest is an unsigned short integer represented by 2 bytes. If the two bytes are read sequentially and the file was written on a big endian machine, then the integer value is 256*byte1+byte2. If the integer was written on a little endian machine such as a DOS/Windows computer, then the value is 256*byte2+byte1.
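
The two formulas can be sketched directly in C. The function names here are illustrative only and do not appear in the original source. Because the value is built arithmetically from the individual bytes, the result does not depend on the byte order of the machine running the code.

```c
#include <assert.h>
#include <stddef.h>

/*
   Illustrative helpers, names are assumptions.
   b points at the two bytes in the order they were read from the file.
*/

/* File written on a big endian machine: first byte is most significant */
unsigned int BytesToUShortBig(const unsigned char *b)
{
   assert(b != NULL);
   return 256u * b[0] + b[1];
}

/* File written on a little endian machine: second byte is most significant */
unsigned int BytesToUShortLittle(const unsigned char *b)
{
   assert(b != NULL);
   return 256u * b[1] + b[0];
}
```

For example, the bytes 0x12 0x34 give 0x1234 when treated as big endian and 0x3412 when treated as little endian.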

While this approach can be used for unsigned shorts, ints, and longs, and can be easily modified for signed versions of the same, it is rather difficult for real numbers (floats and double precision numbers). Fortunately the standard IEEE numerical format is used almost exclusively nowadays, so the bytes making up a particular number can simply be swapped around appropriately in memory. This does assume that the particular numerical type is the same size on both machines, the machine that wrote the file and the machine reading it. The usual sizes are 2 bytes for short integers, 4 bytes for long integers, 4 bytes for floats and 8 bytes for doubles, although a long is 8 bytes on many 64 bit systems, so it is worth checking with sizeof().
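
As a sketch of how the arithmetic approach extends to wider and to signed integers, the following builds a 4 byte unsigned value and a 2 byte signed value from little endian bytes. The function names are made up for this example, and the signed version assumes the file stores negative numbers in two's complement form, which the original text does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helpers, names are assumptions */

/* 4 byte unsigned integer from little endian bytes */
unsigned long BytesToULongLittle(const unsigned char *b)
{
   assert(b != NULL);
   return ((unsigned long)b[3] << 24) | ((unsigned long)b[2] << 16) |
          ((unsigned long)b[1] << 8)  |  (unsigned long)b[0];
}

/* 2 byte signed integer from little endian bytes (two's complement assumed) */
long BytesToShortLittle(const unsigned char *b)
{
   long v;

   assert(b != NULL);
   v = 256L * b[1] + b[0];
   if (v >= 32768L)      /* top bit set, so the value is negative */
      v -= 65536L;
   return v;
}
```

The bytes 0xFE 0xFF therefore decode to -2, while 0x78 0x56 0x34 0x12 decode to 0x12345678.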

In summary, to read 2 byte integers (signed or unsigned) one reads the 2 bytes as normal, eg: using fread(), and then swaps the 2 bytes in memory. It turns out that for long integers, floats and doubles the requirement is the same: reverse the bytes as they appear in memory. See the source below for more details.
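
Since every case comes down to reversing the bytes in memory, the idea can also be sketched as a single size independent routine. ReverseBytes and ReadSwapped are hypothetical names used only for this illustration; they are not the routines listed in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative sketch, names are assumptions */

/* Reverse the order of the n bytes starting at ptr, in place */
void ReverseBytes(void *ptr, size_t n)
{
   unsigned char *cptr = (unsigned char *)ptr, tmp;
   size_t i;

   assert(ptr != NULL);
   for (i = 0; i < n / 2; i++) {
      tmp = cptr[i];
      cptr[i] = cptr[n - 1 - i];
      cptr[n - 1 - i] = tmp;
   }
}

/* Read one item of n bytes and reverse it; returns 1 on success, 0 on failure */
int ReadSwapped(FILE *fptr, void *ptr, size_t n)
{
   if (fread(ptr, n, 1, fptr) != 1)
      return 0;
   ReverseBytes(ptr, n);
   return 1;
}
```

A float would be read with ReadSwapped(fptr,&f,4) and a double with ReadSwapped(fptr,&d,8), under the same size assumptions as above.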

Source code

Some routines illustrating the methods required to do the byte swapping for various numerical types.

#include <stdio.h>

#ifndef TRUE
#define TRUE  1
#define FALSE 0
#endif

/*
   Read a short integer, swapping the bytes
   Returns TRUE on success, FALSE if the read fails
*/
int ReadShortInt(FILE *fptr,short int *n)
{
   unsigned char *cptr,tmp;

   if (fread(n,2,1,fptr) != 1)
      return(FALSE);
   cptr = (unsigned char *)n;
   tmp = cptr[0];
   cptr[0] = cptr[1];
   cptr[1] = tmp;

   return(TRUE);
}

/*
   Read an integer, swapping the bytes
*/
int ReadInt(FILE *fptr,int *n)
{
   unsigned char *cptr,tmp;

   if (fread(n,4,1,fptr) != 1)
      return(FALSE);
   cptr = (unsigned char *)n;
   tmp = cptr[0];
   cptr[0] = cptr[3];
   cptr[3] = tmp;
   tmp = cptr[1];
   cptr[1] = cptr[2];
   cptr[2] = tmp;

   return(TRUE);
}

/*
   Read a floating point number
   Assume IEEE format
*/
int ReadFloat(FILE *fptr,float *n)
{
   unsigned char *cptr,tmp;

   if (fread(n,4,1,fptr) != 1)
      return(FALSE);
   cptr = (unsigned char *)n;
   tmp = cptr[0];
   cptr[0] = cptr[3];
   cptr[3] = tmp;
   tmp = cptr[1];
   cptr[1] = cptr[2];
   cptr[2] = tmp;

   return(TRUE);
}

/*
   Read a double precision number
   Assume IEEE
*/
int ReadDouble(FILE *fptr,double *n)
{
   unsigned char *cptr,tmp;

   if (fread(n,8,1,fptr) != 1)
      return(FALSE);

   cptr = (unsigned char *)n;
   tmp = cptr[0];
   cptr[0] = cptr[7];
   cptr[7] = tmp;
   tmp = cptr[1];
   cptr[1] = cptr[6];
   cptr[6] = tmp;
   tmp = cptr[2];
   cptr[2] = cptr[5];
   cptr[5] = tmp;
   tmp = cptr[3];
   cptr[3] = cptr[4];
   cptr[4] = tmp;

   return(TRUE);
}

Macros

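An alternative for all but doubles is to use these cute macros, so that the swapping is done inline. Note that the FIX_ macros modify their argument in place, so it must be a variable rather than an expression, and that FIX_INT assumes a 4 byte int.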

#define SWAP_2(x) ( (((x) & 0xff) << 8) | ((unsigned short)(x) >> 8) )
#define SWAP_4(x) ( ((x) << 24) | \
         (((x) << 8) & 0x00ff0000) | \
         (((x) >> 8) & 0x0000ff00) | \
         ((x) >> 24) )
#define FIX_SHORT(x) (*(unsigned short *)&(x) = SWAP_2(*(unsigned short *)&(x)))
#define FIX_INT(x)   (*(unsigned int *)&(x)   = SWAP_4(*(unsigned int *)&(x)))
#define FIX_FLOAT(x) FIX_INT(x)
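
The macros above stop short of doubles. One possible way to cover them, sketched here with made-up names (Swap8 and FixDouble are not from the original), is to copy the 8 bytes into a 64 bit unsigned integer, reverse them with shifts and masks, and copy them back. This assumes a C99 compiler providing <stdint.h> and an 8 byte IEEE double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sketch, names are assumptions */

/* Reverse the 8 bytes of a 64 bit unsigned integer */
uint64_t Swap8(uint64_t x)
{
   /* swap the two 4 byte halves, then 2 byte pairs, then single bytes */
   x = ((x & 0x00000000ffffffffULL) << 32) | (x >> 32);
   x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
   x = ((x & 0x00ff00ff00ff00ffULL) << 8)  | ((x >> 8)  & 0x00ff00ff00ff00ffULL);
   return x;
}

/* Byte swap a double in place; assumes an 8 byte double */
void FixDouble(double *d)
{
   uint64_t bits;

   assert(d != NULL && sizeof(double) == sizeof(bits));
   memcpy(&bits, d, sizeof(bits));   /* copy out the raw bytes */
   bits = Swap8(bits);
   memcpy(d, &bits, sizeof(bits));   /* copy the reversed bytes back */
}
```

Going through memcpy() rather than a pointer cast avoids reinterpreting a double through an integer pointer, at the cost of no longer being a single expression.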

Strategies for developers

There are three basic strategies open to software developers when choosing how to create endian-independent data files and the associated software.