Similarly, there is a family of functions, ntoh (network to host), used to read data off the network. You need these to make sure you are correctly interpreting the network data in the host's format. You need to know the type of data you are receiving to decode it properly, and the conversion functions are:
htons()--"Host to Network Short"
htonl()--"Host to Network Long"
ntohs()--"Network to Host Short"
ntohl()--"Network to Host Long"

The other approach is to include a magic number, such as 0xFEFF, before every piece of data. If you read the magic number and it is 0xFEFF, it means the data is in the same format as your machine, and all is well.
Function  Purpose
ntohs     Convert a 16-bit quantity from network byte order to host byte order (Big-Endian to Little-Endian on a Little-Endian host).
ntohl     Convert a 32-bit quantity from network byte order to host byte order (Big-Endian to Little-Endian on a Little-Endian host).
htons     Convert a 16-bit quantity from host byte order to network byte order (Little-Endian to Big-Endian on a Little-Endian host).
htonl     Convert a 32-bit quantity from host byte order to network byte order (Little-Endian to Big-Endian on a Little-Endian host).
If the processor on which the TCP/IP stack is to be run is itself also Big-Endian, each of the four macros (i.e. ntohs, ntohl, htons, htonl) will be defined to do nothing and there will be no run-time performance impact. If, however, the processor is Little-Endian, the macros will reorder the bytes appropriately. These macros are routinely called when building and parsing network packets and when socket connections are created. Serious run-time performance penalties occur when using TCP/IP on a Little-Endian processor. For that reason, it may be unwise to select a Little-Endian processor for use in a device, such as a router or gateway, with an abundance of network functionality. (Excerpt from reference [1]).
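To see the "do nothing" versus "reorder the bytes" behaviour for yourself, here is a minimal Python sketch. It assumes Python's standard socket module, which exposes conversions with the same four names; the value 8080 is just an illustrative 16-bit number.

```python
import socket
import sys

port = 8080  # 0x1F90, an illustrative 16-bit value

# Host order to network (big-endian) order.
wire = socket.htons(port)

if sys.byteorder == "big":
    # Host order already matches network order, so nothing changes.
    print("big-endian host: htons is a no-op ->", hex(wire))
else:
    # The two bytes are swapped: 0x1F90 becomes 0x901F.
    print("little-endian host: htons swaps bytes ->", hex(wire))

# Converting back recovers the original value on either kind of host.
restored = socket.ntohs(wire)
print(restored == port)
```

Whichever branch runs on your machine, the round trip through htons and ntohs gives back the value you started with.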
Endianness is so simple, and yet I still confuse myself with Big vs Little endian.
In a one-byte data encoding (ASCII), endianness does not matter. But when we use two or more bytes to represent a character, we need to agree on whether to store them left to right or vice versa.
Endianness is also referred to as the NUXI problem. Imagine the word UNIX stored in two 2-byte words. In a Big-Endian system, it would be stored as UNIX. In a Little-Endian system, it would be stored as NUXI.
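The NUXI effect is easy to reproduce. The sketch below assumes Python's standard struct module and treats the four characters as two 16-bit words, then writes those same words back with each byte order.

```python
import struct

text = b"UNIX"

# Treat the four characters as two 16-bit words: 0x554E ("UN") and 0x4958 ("IX").
words = struct.unpack(">2H", text)

# Store the same two words with big-endian and little-endian byte order.
big_endian_bytes = struct.pack(">2H", *words)
little_endian_bytes = struct.pack("<2H", *words)

print(big_endian_bytes)     # b'UNIX'
print(little_endian_bytes)  # b'NUXI'
```

Note that the words stay in the same order; only the bytes inside each word are swapped.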
Big Endian is how we read English, left to right; hence the high-order byte is stored at position 0. Consider the 32-bit number 0xDEADBEEF.
Big-Endian: The most significant byte is stored at the lowest byte address.
Little-endian: Least significant byte is stored at the lowest byte address.
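A quick way to check these two definitions against 0xDEADBEEF is to lay the bytes out by address. This is only a sketch using Python's built-in int.to_bytes; the "address" here is simply the index into the byte string.

```python
value = 0xDEADBEEF

big = value.to_bytes(4, "big")        # bytes DE AD BE EF
little = value.to_bytes(4, "little")  # bytes EF BE AD DE

for name, layout in (("big-endian", big), ("little-endian", little)):
    # Show each byte next to its offset from the lowest address.
    cells = "  ".join(f"[{addr}]={byte:02X}" for addr, byte in enumerate(layout))
    print(f"{name:13} {cells}")
```

At address 0 the big-endian layout holds DE (the most significant byte), while the little-endian layout holds EF (the least significant byte).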
Solution 1: Use a common format
It is important to use hton before sending data, even if you are big-endian. Your program may be so popular it is compiled on different machines, and you want your code to be portable (don't you?).
Remember that a single byte is a single byte, and order does not matter. The conversion functions are declared in winsock2.h on Windows (and in arpa/inet.h on POSIX systems). They are defined for TCP/IP, so all machines that support TCP/IP networking have them available. They store the data in 'network byte order', which is big-endian.
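As a rough illustration of the "common format" idea, the sketch below packs a small header in network byte order using Python's struct module, where the "!" prefix selects network (big-endian) order. The header layout (a 16-bit message type followed by a 32-bit length) and the function names are made up for this example.

```python
import struct

# Hypothetical wire layout: 16-bit message type, then 32-bit payload length.
HEADER = struct.Struct("!HI")  # "!" means network (big-endian) byte order


def encode_header(msg_type, length):
    """Serialize the header the same way on every host."""
    return HEADER.pack(msg_type, length)


def decode_header(data):
    """Parse a header produced by encode_header on any host."""
    return HEADER.unpack(data)


packet = encode_header(1, 0x01020304)
print(packet.hex(" "))        # 00 01 01 02 03 04
print(decode_header(packet))  # (1, 16909060)
```

Because both sides agree on the wire order, neither one needs to know what kind of machine the other is.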
If the processor on which the TCP/IP stack is to be run is itself also Big-Endian, each of the four macros (i.e. ntohs, ntohl, htons, htonl) is defined to do nothing.
One additional problem with the host-to-network APIs is that they are unable to manipulate 64-bit data elements.
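One common workaround is to build a 64-bit conversion out of two 32-bit ones. The sketch below does that on top of Python's socket.htonl; the names hton64 and ntoh64 are hypothetical helpers for this example, not part of the standard host-to-network API.

```python
import socket
import sys


def hton64(value):
    """Hypothetical helper: 64-bit host-to-network conversion built on htonl."""
    if sys.byteorder == "big":
        return value  # host order already matches network order
    high = value >> 32
    low = value & 0xFFFFFFFF
    # Byte-swap each 32-bit half, then exchange the two halves.
    return (socket.htonl(low) << 32) | socket.htonl(high)


def ntoh64(value):
    """Hypothetical helper: the swap is its own inverse."""
    return hton64(value)


original = 0x0102030405060708
converted = hton64(original)
print(hex(converted))
print(ntoh64(converted) == original)  # True
```

On a little-endian host the converted value prints as 0x807060504030201; on a big-endian host it is unchanged.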
Solution 2: Use a Byte Order Mark (BOM)
If you read the magic number and it is 0xFFFE (it is backwards), it means the data was written in a format different from your own. You'll have to translate it.
BOM adds overhead to all data that is transmitted. Even if you are only sending 2 bytes of data, you need to include a 2-byte BOM. Ouch!
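Here is a minimal sketch of the magic-number check, assuming a 2-byte mark of 0xFEFF in front of a 2-byte value. The function names write_u16 and read_u16 are invented for this example, and the writer takes an explicit byte order only so that both cases can be tried on one machine.

```python
import sys

BOM = 0xFEFF
SWAPPED_BOM = 0xFFFE


def write_u16(value, byteorder=sys.byteorder):
    """Prefix a 16-bit value with the mark, both in the writer's byte order."""
    return BOM.to_bytes(2, byteorder) + value.to_bytes(2, byteorder)


def read_u16(data):
    """Read the mark in native order and translate the value if it is backwards."""
    native = sys.byteorder
    other = "little" if native == "big" else "big"
    mark = int.from_bytes(data[:2], native)
    if mark == BOM:
        order = native  # same format as this machine, all is well
    elif mark == SWAPPED_BOM:
        order = other   # written by the opposite kind of machine
    else:
        raise ValueError("missing byte order mark")
    return int.from_bytes(data[2:4], order)


print(hex(read_u16(write_u16(0x1234, "big"))))     # 0x1234
print(hex(read_u16(write_u16(0x1234, "little"))))  # 0x1234
print(len(write_u16(0x1234)))                      # 4
```

The last line shows the overhead: 2 bytes of data cost 4 bytes on the wire.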
Unicode uses a BOM when storing multi-byte data (some Unicode character encodings can have 2, 3 or even 4 bytes per character). XML avoids this mess by storing data in UTF-8 by default, which stores Unicode information one byte at a time.
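You can watch the Unicode BOM at work with Python's standard codecs module. This sketch encodes the same short string as UTF-16 in both byte orders and lets the generic "utf-16" decoder sort it out from the BOM; the string "hi" is only an example.

```python
import codecs

text = "hi"

# The same text in UTF-16 with each byte order, each prefixed by its BOM.
big = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
little = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

print(big.hex(" "))     # fe ff 00 68 00 69
print(little.hex(" "))  # ff fe 68 00 69 00

# The generic "utf-16" decoder reads the BOM and picks the byte order.
decoded_big = big.decode("utf-16")
decoded_little = little.decode("utf-16")

# UTF-8 is a stream of single bytes, so byte order never comes into play.
utf8 = text.encode("utf-8")
print(utf8.hex(" "))    # 68 69
```

Both UTF-16 byte strings decode to the same text, and the UTF-8 version needs no BOM at all.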
Why are there endian issues at all? Can't we just get along?
Each byte-order system has its advantages. Little-endian machines let you read the lowest-byte first, without reading the others. You can check whether a number is odd or even (last bit is 0) very easily, which is cool if you're into that kind of thing. Big-endian systems store data in memory the same way we humans think about data (left-to-right), which makes low-level debugging easier.
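The odd-or-even trick can be sketched in a few lines. The function names below are hypothetical; the point is only which byte of the buffer you have to look at in each layout.

```python
def is_odd_little_endian(buffer):
    """Parity of a little-endian integer: only the byte at offset 0 is needed."""
    return (buffer[0] & 1) == 1


def is_odd_big_endian(buffer):
    """Parity of a big-endian integer: the low byte sits at the last offset."""
    return (buffer[-1] & 1) == 1


value = 0xDEADBEEF  # odd, because its lowest byte 0xEF ends in a 1 bit

print(is_odd_little_endian(value.to_bytes(4, "little")))  # True
print(is_odd_big_endian(value.to_bytes(4, "big")))        # True
```

With a little-endian layout the answer is available from the very first byte, no matter how wide the number is.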
Resources -