The mini-Sockets API is designed to be as close as possible to the BSD Sockets API and still allow a small footprint. The primary differences are that passive connections are accomplished with a single call, m_listen(), rather than the BSD bind()-listen()-accept() sequence, and the BSD select() call is replaced with a callback mechanism.
To avoid name collisions at link time, the socket API routine names are all prefixed with "m_" to ensure uniqueness. These can be mapped to the standard socket names via macros or used as-is. Throughout this manual the m_ syntax is used. However, the following calls are functionally equivalent to the corresponding ("m_"-less) BSD sockets calls: m_socket(), m_connect(), m_send(), m_recv(), and m_close(). These represent all the sockets calls traditionally used to implement TCP client applications.
The mini-Sockets API does not support any protocol besides TCP. UDP in the mini-IP stack is accessed via the lightweight API described in the previous UDP section. All InterNiche UDP-based applications (such as DHCP, SNMP, and TFTP) are capable of operating on the lightweight API and do not require Sockets.
Server applications (applications which primarily perform passive opens) are supported via a simple callback mechanism. The m_listen() call accepts a pointer to a C routine (the "callback") as one of its parameters, and that routine is called whenever a TCP connection is formed with the listening TCP port. The callback routine may also be used as a mechanism for receiving data and notification of TCP connection events such as errors and closures. Client applications may use this feature by passing an optional callback routine to m_connect().
One feature of the callback mechanism is that it provides the potential to implement a mini-Sockets API program that is fully blocking on multiple sockets. On conventional BSD Sockets this is implemented with the select() call, which passes a list of sockets awaiting events (usually connection or data receipt) to the TCP layer. The select() call returns when any of the selected events for the passed sockets occurs. The callback mechanism allows the TCP layer to directly call the application code when the event occurs rather than waking a thread which is blocked inside the select() logic. In addition to the smaller code footprint, this model allows faster response to incoming TCP data and fewer threads required for the application.
Another advantage of the mini-Sockets API is the potential for moving data to and from the application without copying it. This is referred to as a "zero-copy" API. Applications that use the zero-copy features can run faster and with less memory overhead. The disadvantages of zero-copy are that it is harder to program and is incompatible with standard BSD sockets.
The most common problem with mini-Sockets programming is mishandling the buffers that the application must allocate from and return to the TCP layer. Once the application loses track of one of these packet buffers and fails to return it to the stack’s control, the buffer is no longer usable. This is referred to as a packet leak and is quite similar to leaking memory from the system’s memory heap.
Authors of zero-copy applications must fully understand when the buffers are owned by their application and when they are returned to the TCP layer, and carefully design applications to avoid losing track of these buffers. For this reason we recommend that the more conventional m_send() and m_recv() calls be used except in cases where large amounts of data must be moved over a TCP connection quickly.
Sockets, both BSD and mini, have a specific life cycle. They are created, connected, used for I/O, and finally destroyed. Neither BSD sockets nor mini-Sockets provide a facility for a socket that has disconnected to be reconnected and reused. The following tables show the expected sequence of socket calls for clients and servers on both BSD sockets and mini-Sockets.
For client applications:
Mini-Sockets | BSD Sockets |
---|---|
m_socket() | socket() |
m_connect() | connect() |
m_recv() and/or m_send() - or - tcp_send() and/or tcp_recv() - (zero-copy I/O) | recv() and/or send() |
m_close() | close() |
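The client sequence in the table above can be sketched as follows. This is only an illustration: the stub routines form a trivial loopback so the fragment compiles outside the stack, the host's <netinet/in.h> stands in for the stack's own sockaddr_in definition, the callback signature behind M_CALLBACK is an assumption, and echo_once() is a hypothetical helper name. Passing NULL as the callback relies on the callback being optional for m_connect().

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>   /* host stand-in for the stack's sockaddr_in */
#include <arpa/inet.h>

/* ---- Hypothetical stubs: a loopback "stack" so the sketch compiles ---- */
typedef struct msocket { int connected; char buf[64]; unsigned len; } *M_SOCK;
#define M_CALLBACK(name) int (*name)(int code, M_SOCK so, void *data) /* assumed */

static struct msocket stub_sock;

static M_SOCK m_socket(void)
{
    memset(&stub_sock, 0, sizeof stub_sock);
    return &stub_sock;
}
static int m_connect(M_SOCK so, struct sockaddr_in *sin, M_CALLBACK(cb))
{
    (void)sin; (void)cb;
    so->connected = 1;
    return 0;
}
static int m_send(M_SOCK so, char *data, unsigned datalen)
{
    if (datalen > sizeof so->buf) datalen = sizeof so->buf;
    memcpy(so->buf, data, datalen);
    so->len = datalen;
    return (int)datalen;
}
static int m_recv(M_SOCK so, char *data, unsigned length)
{
    unsigned n = so->len < length ? so->len : length;
    memcpy(data, so->buf, n);
    so->len = 0;
    return (int)n;
}
static int m_close(M_SOCK so) { so->connected = 0; return 0; }

/* ---- Client life cycle: create, connect, I/O, destroy ---- */
static int echo_once(struct sockaddr_in *peer, char *msg, unsigned msglen,
                     char *reply, unsigned replymax)
{
    int got;
    M_SOCK so = m_socket();                      /* 1. create */
    if (so == NULL)
        return -1;
    if (m_connect(so, peer, NULL) != 0) {        /* 2. connect (blocking) */
        m_close(so);
        return -1;
    }
    if (m_send(so, msg, msglen) != (int)msglen) {/* 3. I/O */
        m_close(so);
        return -1;
    }
    got = m_recv(so, reply, replymax);
    m_close(so);                                 /* 4. destroy; never reuse so */
    return got;
}
```

Because a disconnected socket cannot be reconnected, a client that needs a second connection would call m_socket() again rather than keep the old descriptor.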
For server applications:
Mini-Sockets | BSD Sockets |
---|---|
(n/a - merged with listen) | socket() |
(n/a - merged with listen) | bind() |
m_listen() | listen() |
(n/a - handled via callback) | accept() |
m_recv() and/or m_send() - or - tcp_send() and/or tcp_recv() - (zero-copy I/O) | recv() and/or send() |
m_close() | close() |
Several data types are used throughout the mini-Sockets API. These are briefly described here:
M_SOCK is the data type of a mini-Sockets socket descriptor. In the current implementation it is a pointer to the mini-socket structure; however, it is not good programming form to access the structure members directly. This should be done via the API calls.
Mini-Sockets often use a "callback" routine - a pointer to a C routine which is passed to the TCP layer when the socket is created or connected, and which the TCP layer thereafter uses to deliver data or event notifications to the application. M_CALLBACK is the data type of a pointer to such a C routine. Using a predefined type for the routine helps ensure that all such routines comply with the TCP layer's design and avoids programming errors.
Callback routines allow for fast response and good performance; however, since they are called in the context of the TCP/IP network thread, they should be treated in a manner similar to an interrupt service routine (ISR). Specifically, they should never block. On non-blocking sockets they may make calls to send data. Callbacks for data delivery should typically move the passed data PACKET into a local queue and wake a thread that will handle the received data later.
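A data-delivery callback of the kind described above might look like the sketch below. Several details are assumptions rather than facts from this manual: the callback signature (code, socket, data pointer), the M_RXDATA event code, the numeric value of M_OPENED, the m_data/m_len layout of the stub netbuf, and the return convention (0 means the application accepted the packet, nonzero means it declined). The worker_wake flag is a stand-in for whatever wake primitive the target RTOS provides.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Assumed declarations (the real ones come from the stack's headers) ---- */
typedef struct msocket *M_SOCK;
typedef struct netbuf { char *m_data; unsigned m_len; } *PACKET;
#define M_OPENED 1   /* named in the manual; numeric value assumed */
#define M_RXDATA 2   /* hypothetical event code for "data arrived" */

/* ---- Application-side queue, filled by the callback ---- */
#define RXQ_SIZE 4                            /* power of two */
static PACKET rxq[RXQ_SIZE];
static volatile unsigned rxq_head, rxq_tail;  /* head: consumer, tail: producer */
static volatile int worker_wake;              /* stand-in for an RTOS wake call */

/* Runs in the network thread: no blocking, no lengthy work. */
static int app_callback(int code, M_SOCK so, void *data)
{
    (void)so;
    if (code == M_RXDATA) {
        if (rxq_tail - rxq_head == RXQ_SIZE)
            return -1;                 /* queue full: decline the packet */
        rxq[rxq_tail % RXQ_SIZE] = (PACKET)data;
        rxq_tail++;
    }
    worker_wake = 1;                   /* let the worker thread do the rest */
    return 0;
}

/* Called from the worker thread; returns NULL when the queue is empty. */
static PACKET app_dequeue(void)
{
    PACKET pkt;
    if (rxq_head == rxq_tail)
        return NULL;
    pkt = rxq[rxq_head % RXQ_SIZE];
    rxq_head++;
    return pkt;
}
```

The worker thread that drains the queue would be responsible for returning each packet to the stack with tcp_pktfree() once it has processed the data.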
A PACKET is a pointer to a netbuf structure, which carries information about the buffer's size and current state, and a queue link. These structures are created and managed by the IP/TCP code. They can be accessed by applications to facilitate zero-copy data I/O.
sockaddr_in is essentially the same as the sockaddr_in type in conventional BSD sockets. It is a small structure containing the addressing information for a TCP endpoint - specifically the IP address and port number.
This section describes the mini-Sockets calls in detail.
Name
m_socket()
Syntax
M_SOCK m_socket(void);
Parameters
None
Description
Allocates a socket structure for an active connection. Sockets for passive TCP connections should be allocated and connected via the m_listen() routine.
This routine does not start the active connection process. That should be done via a call to the m_connect() routine.
Note that sockets created with this routine default to blocking mode. They can be set to non-blocking at any later time via calls to m_ioctl().
Returns
Returns a newly created M_SOCK if OK, else NULL if out of resources.
Name
m_connect()
Syntax
int m_connect(M_SOCK so, struct sockaddr_in * sin, M_CALLBACK(name));
Description
Starts the active connection process. If the socket has been set to non-blocking and no problems are detected, it returns immediately with EINPROGRESS. On blocking sockets it waits until the socket is fully connected or times out before returning.
As with the BSD connect() call, the sockaddr_in structure should be filled in with an IP address and port number for the connection. Wildcard connection parameters are not supported for active connects.
The M_CALLBACK parameter is a pointer to a routine that will be called when a non-blocking socket connects. The callback code should generally just wake up an application thread to handle the new connection.
Parameters
M_SOCK so /* socket to start connect on */
sockaddr_in * sin /* structure with port and address to connect to */
M_CALLBACK(name) /* pointer to callback routine for NB connects */
Returns
Zero if the socket connect completed successfully, EINPROGRESS if the socket is non-blocking and the connection was initiated successfully, or one of the BSD error codes.
Common error codes are:
ETIMEDOUT | timeout waiting for connect |
ENOMEM | out of heap memory |
EISCONN | socket is already connected |
Name
m_listen()
Syntax
M_SOCK m_listen (struct sockaddr_in* sin, M_CALLBACK(name), int * error);
Parameters
sockaddr_in * sin /* local port/foreign IP address */
M_CALLBACK(name) /* callback routine when connect occurs */
int * error /* OUT - error return */
Description
Start a listen on the port and IP address passed in the sockaddr_in structure. The IP address may be 0 (zero) if connections are to be accepted from any IP address. This is referred to as a "wildcard" IP address.
The listen is implemented by creating a partially filled in M_SOCK and tcpcb. The socket is returned to the caller for passing to later calls, like m_close(), but will never actually become a working connection. Any passive connection which succeeds on this socket causes the callback routine to be called with a code of M_OPENED and passed a new, connected, socket.
The application may terminate the listen by passing the returned socket to m_close().
Returns
Returns the newly created listening socket if success, else returns INVALID_SOCKET, and the passed error holder is set to one of the BSD error codes.
Common errors are:
ENOMEM | out of heap memory |
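A passive open with a callback could be sketched as below. The stub m_listen() and stub_incoming() are hypothetical stand-ins that simulate one inbound connection so the fragment compiles outside the stack. The callback signature, the numeric value of M_OPENED, the definition of INVALID_SOCKET, and the use of network byte order for the port are all assumptions; start_server() and server_callback() are illustrative names.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>   /* host stand-in for the stack's sockaddr_in */
#include <arpa/inet.h>

/* ---- Hypothetical stubs standing in for the stack ---- */
typedef struct msocket { int listening; } *M_SOCK;
#define M_CALLBACK(name) int (*name)(int code, M_SOCK so, void *data) /* assumed */
#define M_OPENED 1                    /* value assumed */
#define INVALID_SOCKET ((M_SOCK)0)    /* definition assumed */

static struct msocket listener, child;
static M_CALLBACK(stub_cb);           /* callback saved by the stub */

static M_SOCK m_listen(struct sockaddr_in *sin, M_CALLBACK(cb), int *error)
{
    (void)sin;
    *error = 0;
    stub_cb = cb;
    listener.listening = 1;
    return &listener;
}
/* Simulates the stack completing a passive connection. */
static void stub_incoming(void)
{
    if (stub_cb)
        stub_cb(M_OPENED, &child, NULL);
}

/* ---- Application code ---- */
static int connections;
static M_SOCK last_conn;

static int server_callback(int code, M_SOCK so, void *data)
{
    (void)data;
    if (code == M_OPENED) {     /* so is the new, connected socket */
        last_conn = so;
        connections++;
    }
    return 0;
}

static M_SOCK start_server(unsigned short port, int *error)
{
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof sin);
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = 0;    /* wildcard: accept from any IP address */
    return m_listen(&sin, server_callback, error);
}
```

The socket returned by start_server() is only a handle for ending the listen with m_close(); data I/O happens on the socket passed to the callback.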
Name
tcp_send()
Syntax
int tcp_send(M_SOCK so, PACKET pkt);
Description
Send a packet allocated via tcp_pktalloc(). User should have filled in the data to be sent at pkt->m_data and set length in pkt->m_len.
An OK (0) return means the data is queued for sending and that pkt is now the responsibility of the stack. An error return means that pkt has NOT been queued or freed and is still owned by the caller.
Parameters
M_SOCK so /* socket to send on. Must be open */
PACKET pkt /* filled in data packet to send. */
Returns
0 if OK or BSD error code.
Common errors are:
ENOTCONN | socket is not connected |
EWOULDBLOCK | non-blocking socket is congested and is not able to buffer the packet for send |
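The ownership rule for tcp_send() lends itself to a small wrapper that cannot leak a packet. The sketch below is illustrative only: the stub pool, the app_owned counter, and the netbuf layout are hypothetical stand-ins, send_zero_copy() is an invented helper name, and returning ENOMEM when allocation fails is a choice made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* ---- Hypothetical stubs; app_owned counts packets the application holds ---- */
typedef struct msocket { int connected; } *M_SOCK;
typedef struct netbuf {
    char *m_data; unsigned m_len; char store[128]; int used;
} *PACKET;

static struct netbuf pool[2];
static int app_owned;

static PACKET tcp_pktalloc(int datasize)
{
    int i;
    if (datasize < 0 || (unsigned)datasize > sizeof pool[0].store)
        return NULL;
    for (i = 0; i < 2; i++) {
        if (!pool[i].used) {
            pool[i].used = 1;
            pool[i].m_data = pool[i].store;
            pool[i].m_len = 0;
            app_owned++;
            return &pool[i];
        }
    }
    return NULL;
}
static void tcp_pktfree(PACKET p) { p->used = 0; app_owned--; }
static int tcp_send(M_SOCK so, PACKET pkt)
{
    if (!so->connected)
        return ENOTCONN;          /* caller keeps pkt */
    pkt->used = 0;                /* stack takes pkt and "sends" it */
    app_owned--;
    return 0;
}

/* ---- Zero-copy send that never leaks a packet ---- */
static int send_zero_copy(M_SOCK so, const char *data, unsigned len)
{
    int err;
    PACKET pkt = tcp_pktalloc((int)len);
    if (pkt == NULL)
        return ENOMEM;            /* nothing allocated, nothing to free */
    memcpy(pkt->m_data, data, len);
    pkt->m_len = len;
    err = tcp_send(so, pkt);
    if (err != 0)
        tcp_pktfree(pkt);         /* error: pkt is still ours, give it back */
    return err;                   /* success: pkt now belongs to the stack */
}
```

Keeping allocation, send, and the error-path free in one routine means there is exactly one place where ownership changes hands.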
Name
tcp_recv()
Syntax
PACKET tcp_recv(M_SOCK so);
Parameters
M_SOCK so /* socket on which to receive */
Description
Returns the next packet of received data on the passed socket. The calling application is responsible for returning pkt to the free queue via tcp_pktfree(). In the returned packet, pkt->m_data points to the data and pkt->m_len is the length of the data.
Returns
pkt | if one is ready |
NULL | if no packet is ready and socket is in non-blocking mode |
EWOULDBLOCK | if no packet is ready and socket is in blocking mode |
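A zero-copy receive might be structured like the sketch below, which reads the data in place and then returns the buffer. The stub tcp_recv() and tcp_pktfree(), the netbuf layout, and the helper name consume_packet() are hypothetical; only the NULL "nothing ready" case is handled here.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Hypothetical stubs standing in for the stack ---- */
typedef struct netbuf { char *m_data; unsigned m_len; int freed; } *PACKET;
typedef struct msocket { PACKET ready; } *M_SOCK;

static PACKET tcp_recv(M_SOCK so)
{
    PACKET p = so->ready;   /* hand over the queued packet, if any */
    so->ready = NULL;
    return p;
}
static void tcp_pktfree(PACKET p) { p->freed = 1; }

/* Sum the bytes of the next packet in place, then return the buffer.
 * Returns -1 when no packet is ready. */
static long consume_packet(M_SOCK so)
{
    long sum = 0;
    unsigned i;
    PACKET pkt = tcp_recv(so);
    if (pkt == NULL)
        return -1;
    for (i = 0; i < pkt->m_len; i++)
        sum += (unsigned char)pkt->m_data[i];
    tcp_pktfree(pkt);       /* the application owned pkt until this call */
    return sum;             /* pkt must not be touched after the free */
}
```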
Name
m_ioctl()
Syntax
int m_ioctl(M_SOCK so, int option, void * data);
Parameters
M_SOCK so /* socket on which to set option */
int option /* one of the SO_ options */
void * data /* option parameters, defined by option */
Description
Implement selected SO_ options from socket.h. Most often this routine simply sets a state bit for the socket or TCP connection and returns. This one routine maintains both the so->so_options (socket options) and tp->t_state (TCP connection) masks. As such, it replaces both the BSD setsockopt() call and the sockets portions of the BSD system ioctl() call.
Supported m_ioctl() options are described below. The data parameter is assumed to be a pointer to an int unless otherwise noted.
SO_NBIO | sets blocking or non-blocking mode depending on the value of *data. When *data is 0, the socket becomes blocking. If data points to a non-zero value, the socket becomes non-blocking. |
SO_NONBLOCK | sets the socket into non-blocking mode. data is ignored. |
SO_BIO | sets the state of the socket to blocking. data is ignored. |
SO_DEBUG | enables or disables optional debugging code. If *data is zero then debugging mode is disabled, else it is enabled. |
SO_LINGER | similar to BSD linger. When called with a data value of zero, this causes the socket to send a reset when it is closed (via a later call to m_close()) rather than the FIN handshake. data is number of seconds. |
Returns
The return value is 0 (zero) if the operation was OK, else EOPNOTSUPP if the option is unsupported.
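Switching a socket between blocking and non-blocking mode with SO_NBIO could be wrapped as shown below. The stub m_ioctl() and the numeric value given to SO_NBIO are placeholders so the fragment compiles on its own; the real definitions come from the stack's headers. set_nonblocking() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ---- Hypothetical stubs standing in for the stack ---- */
typedef struct msocket { int nonblocking; } *M_SOCK;
#define SO_NBIO 0x1014     /* placeholder value; use the stack's socket.h */

static int m_ioctl(M_SOCK so, int option, void *data)
{
    if (option != SO_NBIO)
        return EOPNOTSUPP;             /* only one option in this stub */
    so->nonblocking = (*(int *)data != 0);
    return 0;
}

/* Switch a socket to non-blocking (enable != 0) or blocking (enable == 0). */
static int set_nonblocking(M_SOCK so, int enable)
{
    int flag = enable ? 1 : 0;         /* data must point to an int */
    return m_ioctl(so, SO_NBIO, &flag);
}
```

Since sockets from m_socket() start out blocking, a client that wants an EINPROGRESS-style connect would call a helper like this before m_connect().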
Name
m_close()
Syntax
int m_close(M_SOCK so);
Parameters
M_SOCK so /* socket to close & delete */
Description
Closes the passed socket. The socket structure’s memory MAY be freed at this point if SO_LINGER was set with a data value of zero or the socket was already in the process of closing. The socket should not be referenced again after it is passed to m_close().
Returns
Returns 0 (zero) if OK, else one of the BSD error codes.
A common error code is EINVAL, which indicates the passed parameter did not appear to be a current socket.
Name
tcp_pktalloc()
Syntax
PACKET tcp_pktalloc(int datasize);
Parameters
int datasize /* size of TCP data for packet */
Description
Allocate a packet for sending TCP data. The pkt->nb_prot member of the returned PACKET structure points to a buffer big enough for the data size passed.
The maximum size of a buffer that can be allocated is limited to the number set in the global variable bigbufsize at initialization time. This is typically the network MTU minus header sizes. If you want to send larger data blocks in a single operation, then use the simpler (though usually slower) m_send() call.
Returns
A pointer to a PACKET, or NULL if a big enough packet was not available in the stack’s internal free queues.
Name
tcp_pktfree()
Syntax
void tcp_pktfree(PACKET p);
Description
tcp_pktfree frees a packet allocated by (presumably) tcp_pktalloc() or passed to the application by a callback. This is a simple wrapper around pk_free() to lock and unlock the free-queue resource.
Parameters
PACKET p /* the packet to be returned to the Protocol stack */
Returns
No value is returned. If the passed packet is already in a free queue, has been corrupted, or does not appear to be a valid packet, a dtrap() may be generated by the debugging logic.
Name
m_send()
Syntax
int m_send(M_SOCK so, char * data, unsigned datalen);
Parameters
M_SOCK so
char * data
unsigned datalen
Description
A work-alike for the BSD sockets send() call.
The data pointed to by the parameter data is sent on the socket so. Up to datalen bytes are sent. On blocking sockets, this call will block until all the data is sent or the socket has an error. On non-blocking sockets, data is sent and/or queued for future sending until all the data is sent or no more can be buffered in the internal buffers.
Returns
The number of bytes actually sent (which may be less than datalen on a non-blocking socket), or -1 if error. If a -1 is returned the BSD error code may be read with the macro tcp_errno(so).
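On a non-blocking socket the caller has to cope with short counts. The sketch below shows one way to do that; the stub m_send() accepts a limited number of bytes and then fails with EWOULDBLOCK, which is an assumption about how the real call reports a full buffer. send_some() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>

/* ---- Hypothetical stubs: a socket with limited buffer room ---- */
typedef struct msocket { unsigned room; unsigned total; int error; } *M_SOCK;
#define tcp_errno(so) ((so)->error)

static int m_send(M_SOCK so, char *data, unsigned datalen)
{
    unsigned n = datalen < so->room ? datalen : so->room;
    (void)data;
    if (n == 0) {
        so->error = EWOULDBLOCK;   /* assumed behavior when nothing fits */
        return -1;
    }
    so->room -= n;
    so->total += n;
    return (int)n;
}

/* Push as much data as the socket accepts right now.
 * Returns the byte count accepted (possibly short), or -1 on a hard error. */
static int send_some(M_SOCK so, char *data, unsigned len)
{
    unsigned done = 0;
    while (done < len) {
        int n = m_send(so, data + done, len - done);
        if (n < 0) {
            if (tcp_errno(so) == EWOULDBLOCK)
                break;             /* buffers full: retry the rest later */
            return -1;             /* broken socket */
        }
        if (n == 0)
            break;                 /* defensive: avoid spinning */
        done += (unsigned)n;
    }
    return (int)done;
}
```

The caller keeps track of the unsent remainder and calls the helper again once the socket has drained.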
Name
m_recv()
Syntax
int m_recv(M_SOCK so, char * data, unsigned length);
Parameters
M_SOCK so
char * data
unsigned length
Description
A work-alike for the BSD sockets recv() call.
Receives data from the TCP layer into the application buffer pointed to by the parameter data. Receives up to length bytes. On blocking sockets it will block until at least some data is ready in the data buffer. On non-blocking sockets it will always return immediately.
Returns
Returns the number of bytes actually read, or -1 if error. If a non-blocking socket has no data ready, it returns -1 and sets a socket error of EWOULDBLOCK so that applications can distinguish between broken sockets and healthy sockets which simply have no data ready.
If the call returned a -1 because the socket was disconnected then the error code will be ESHUTDOWN.
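The distinction between "no data yet" and "disconnected" can be captured in a small polling helper. In the sketch below the stub m_recv() and the tcp_errno() macro are hypothetical stand-ins driven by scripted input, and the poll_result enum and poll_socket() are illustrative names; the errno values come from the host's <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* ---- Hypothetical stubs: a socket with scripted input ---- */
typedef struct msocket { const char *pending; int closed; int error; } *M_SOCK;
#define tcp_errno(so) ((so)->error)

static int m_recv(M_SOCK so, char *data, unsigned length)
{
    unsigned n;
    if (so->pending == NULL || *so->pending == '\0') {
        so->error = so->closed ? ESHUTDOWN : EWOULDBLOCK;
        return -1;
    }
    n = (unsigned)strlen(so->pending);
    if (n > length)
        n = length;
    memcpy(data, so->pending, n);
    so->pending += n;
    return (int)n;
}

enum poll_result { POLL_DATA, POLL_IDLE, POLL_CLOSED, POLL_ERROR };

/* One non-blocking poll: classify the outcome of m_recv(). */
static enum poll_result poll_socket(M_SOCK so, char *buf, unsigned max, int *got)
{
    int n = m_recv(so, buf, max);
    *got = 0;
    if (n >= 0) {
        *got = n;
        return POLL_DATA;
    }
    switch (tcp_errno(so)) {
    case EWOULDBLOCK: return POLL_IDLE;    /* healthy, nothing to read yet */
    case ESHUTDOWN:   return POLL_CLOSED;  /* socket was disconnected */
    default:          return POLL_ERROR;   /* any other BSD error code */
    }
}
```

On POLL_CLOSED or POLL_ERROR the application would normally pass the socket to m_close() and stop referencing it.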