[Document on ruby-doc.org] ruby_doc
Ruby's standard library includes Socket for lower-layer network programming with protocols such as TCP and UDP. There are also libraries for application-layer protocols such as HTTP, FTP and TELNET, but they are not covered in this tutorial.
According to [Wikipedia] wikipedia_socket, the term Internet socket is used as a name for an application programming interface (API) for the TCP/IP protocol stack, usually provided by the operating system.
From the OS's point of view, when an application creates a socket, the socket is referenced by a socket number. The OS forwards the payload of each incoming IP packet to the corresponding application by extracting the socket address information from the IP and transport protocol headers.
A socket address is the combination of an IP address and a port into a single identity. An Internet socket is characterized by a unique combination of the following:
- Local socket address: Local IP address and port number
- Remote socket address: Only for established TCP sockets. This is necessary since a TCP server may serve several clients concurrently. The server creates one socket for each client, and these sockets share the same local socket address.
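As a small illustration of what a socket address is, the sketch below packs an IP address and a port into the binary sockaddr structure and unpacks it again. The helper name `roundtrip_sockaddr`, the address 127.0.0.1 and the port 2200 are only example choices, not part of the Socket library.

```ruby
require 'socket'

# Pack a port and an IP address into a binary sockaddr string,
# then unpack it back into its two parts.
# (roundtrip_sockaddr is a hypothetical helper used for illustration.)
def roundtrip_sockaddr(ip, port)
  packed = Socket.pack_sockaddr_in(port, ip)
  Socket.unpack_sockaddr_in(packed) # returns [port, ip]
end

port, ip = roundtrip_sockaddr('127.0.0.1', 2200)
puts "socket address = #{ip}:#{port}"
```

Note that both calls put the port first and the IP address second.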
From the discussion above, we can conclude that:
- In the case of TCP, each socket (identified by local IP, local port number, remote IP and remote port number) is assigned a socket number, and the OS forwards the payload of the IP packets to the application that owns that socket number.
- In the case of UDP, each socket (identified by local IP and local port number) is assigned a socket number, and the OS forwards payloads in the same way.
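The TCP case can be observed directly. The following is a minimal sketch, assuming a loopback server on an OS-chosen port (port 0) and a hypothetical helper named `accepted_port_pairs`: it accepts two clients and reports the local and remote port of each accepted socket. The local port should be the same for both, while the remote ports differ.

```ruby
require 'socket'

# Accept two clients on one listening socket and return the
# [local port, remote port] pair of each accepted socket.
# (accepted_port_pairs is a hypothetical helper used for illustration.)
def accepted_port_pairs
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  port = server.local_address.ip_port
  clients = Array.new(2) { TCPSocket.new('127.0.0.1', port) }
  accepted = Array.new(2) { server.accept }
  accepted.map { |s| [s.local_address.ip_port, s.remote_address.ip_port] }
ensure
  (clients.to_a + accepted.to_a + [server]).compact.each(&:close)
end

accepted_port_pairs.each do |local, remote|
  puts "local port #{local} <-> remote port #{remote}"
end
```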
The class TCPSocket supports the connection-based, reliable Transmission Control Protocol. For all the states of a TCP connection, refer to the TCP State Diagram. To establish a TCP connection, we only need to care about three states: listen, established and closed:
A TCP server creates a socket in the listening state; the socket waits for connection attempts from client programs. For a listening TCP socket, netstat may show the remote address as 0.0.0.0 and the remote port number as 0.
A TCP server may serve several clients concurrently by creating a child process for each client and establishing a TCP connection between the child process and the client. A unique dedicated socket is created for each connection. These sockets are in the established state once a socket-to-socket virtual connection or virtual circuit (VC), also known as a TCP session, is established with the remote socket, providing a duplex byte stream.
![TCP connection flow] tcp_flow
The flow above shows the C function calls made when establishing a TCP connection.
Now let's write a simple TCP echo server by following the flow above.
require 'socket'
# Create the server using the Socket class
# AF_INET: using IP protocol
# SOCK_STREAM: using TCP
server_socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
# Set the socket address (127.0.0.1 keeps the address IPv4, matching AF_INET)
sockaddr = Socket.pack_sockaddr_in( 2200, '127.0.0.1' )
# Bind the socket address to the server socket
server_socket.bind( sockaddr )
# Put the socket into the listening state, with a backlog of up to 5 pending connections
server_socket.listen(5)
# Block until a client connects
# accept returns the new client socket object and the remote socket address
client, client_sockaddr = server_socket.accept
# Receive up to 20 bytes using the recvfrom method on the new client socket
data = client.recvfrom( 20 )[0].chomp
puts "I only received 20 bytes '#{data}'"
# Send back the data
client.puts "You said: #{data}"
# Wait 1 second, then close the client and server sockets
sleep 1
client.close
server_socket.close
After running the code above, you can connect to the server with telnet:
$ telnet localhost 2200
Connected to localhost.
Escape character is '^]'.
hello
You said: hello
On the server side, it will display:
$ ruby echo.rb
I only received 20 bytes 'hello'
Following the TCP connection flow with only the Socket class takes a lot of work, but it helps us understand how a TCP connection is established. To simplify the code, we can use TCPServer and TCPSocket:
server code:
require 'socket'
# TCPServer.new creates the socket, binds it and starts listening
server_socket = TCPServer.new 'localhost', 2200
# TCPServer#accept returns only the new client socket
client = server_socket.accept
data = client.recvfrom( 20 )[0].chomp
puts "I only received 20 bytes '#{data}'"
client.puts "You said: #{data}"
sleep 1
server_socket.close
client code:
require 'socket'
c=TCPSocket.new 'localhost', 2200
c.puts "hello"
puts c.recv(100)
c.close
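Because TCP is a byte stream, `recvfrom( 20 )` returns at most 20 bytes and can cut a longer message short. A minimal sketch of a line-oriented alternative follows, using `gets` to read up to the newline. To keep it runnable as one script, the server runs in a background thread on an OS-chosen port; the helper name `echo_once` is hypothetical.

```ruby
require 'socket'

# Start a one-shot echo server in a background thread and return
# the reply that a client receives for the given line.
# (echo_once is a hypothetical helper used for illustration.)
def echo_once(line)
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  port = server.local_address.ip_port

  server_thread = Thread.new do
    client = server.accept
    data = client.gets.chomp # read up to the newline, however long the line is
    client.puts "You said: #{data}"
    client.close
  end

  c = TCPSocket.new('127.0.0.1', port)
  c.puts line
  reply = c.gets.chomp
  c.close
  server_thread.join
  reply
ensure
  server.close if server
end

puts echo_once('hello, this line is longer than twenty bytes')
```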
There are two ways to handle multiple connections. The first is to create a thread whenever a new client connects and to handle the communication with that client inside the new thread.
require "socket"
echo_server = TCPServer.new('localhost', 2200)
loop do
  # Create a new thread for each connection.
  Thread.start(echo_server.accept) do |client|
    puts "#{client.peeraddr[2]}:#{client.peeraddr[1]} is connected"
    loop do
      data = client.recvfrom( 20 )[0].chomp
      if data == "exit"
        client.puts "bye!"
        puts "#{client.peeraddr[2]}:#{client.peeraddr[1]} is disconnected"
        break
      end
      puts "I only received 20 bytes '#{data}' from #{client.peeraddr[2]}:#{client.peeraddr[1]}"
      client.puts "You said: #{data}"
    end
    client.close
  end
end
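To see what the thread-per-connection approach buys, here is a small self-contained sketch under some assumptions: a loopback server on an OS-chosen port, line-based reads with `gets`, and a hypothetical helper named `reply_while_other_client_idle`. A first client connects and stays silent, yet a second client still gets its reply, which would not happen if the server handled one connection at a time.

```ruby
require 'socket'

# Serve each connection in its own thread and return the reply received by
# a second client while a first client is connected but silent.
# (reply_while_other_client_idle is a hypothetical helper for illustration.)
def reply_while_other_client_idle
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  port = server.local_address.ip_port

  acceptor = Thread.new do
    loop do
      Thread.start(server.accept) do |client|
        # gets returns nil once the client closes its side
        while (line = client.gets)
          client.puts "You said: #{line.chomp}"
        end
        client.close
      end
    end
  end

  idle = TCPSocket.new('127.0.0.1', port) # connects first, sends nothing
  busy = TCPSocket.new('127.0.0.1', port)
  busy.puts 'hello'
  reply = busy.gets.chomp
  [idle, busy].each(&:close)
  reply
ensure
  acceptor.kill if acceptor
  server.close if server
end

puts reply_while_other_client_idle
```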
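The second way is the Kernel::select method. Unfortunately, the ruby doc for this method is broken. select returns an array of three arrays: the sockets ready for reading, the sockets ready for writing, and the sockets with pending exceptions. We pass only a read list, so we can iterate over the first array and handle each socket for which a read event was raised.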
require "socket"
echo_server = TCPServer.new('localhost', 2200)
socket_list = Array.new
socket_list.push echo_server
loop do
  # Block until at least one socket is ready for reading;
  # return_array[0] holds the sockets that raised a read event.
  return_array = select( socket_list, nil, nil, nil )
  return_array[0].each do |res|
    if res == echo_server
      # The listening socket is readable: a new client is waiting
      client = res.accept
      socket_list.push client
      puts "#{client.peeraddr[2]}:#{client.peeraddr[1]} is connected"
    else
      data = res.recvfrom( 20 )[0].chomp
      if data == "exit"
        res.puts "bye!"
        puts "#{res.peeraddr[2]}:#{res.peeraddr[1]} is disconnected"
        res.close
        socket_list.delete res
      else
        puts "I only received 20 bytes '#{data}' from #{res.peeraddr[2]}:#{res.peeraddr[1]}"
        res.puts "You said: #{data}"
      end
    end
  end
end
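The shape of the value returned by select can be checked with a short sketch. It assumes a loopback connection on an OS-chosen port and a hypothetical helper named `select_before_and_after`. It uses a timeout so that the first call returns instead of blocking; with a timeout, select returns nil when nothing becomes ready.

```ruby
require 'socket'

# Call select on a connected socket before and after its peer writes.
# Returns [result_before, sizes_of_the_three_arrays_after].
# (select_before_and_after is a hypothetical helper for illustration.)
def select_before_and_after
  server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
  writer = TCPSocket.new('127.0.0.1', server.local_address.ip_port)
  reader = server.accept

  before = IO.select([reader], nil, nil, 0.1) # nothing to read yet: times out
  writer.puts 'ping'
  after = IO.select([reader], nil, nil, 1)    # reader is now ready for reading
  [before, after && after.map(&:size)]
ensure
  [writer, reader, server].compact.each(&:close)
end

before, after_sizes = select_before_and_after
puts "before write: #{before.inspect}"
puts "after write, array sizes (read, write, error): #{after_sizes.inspect}"
```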
The main difference between TCP and UDP is that UDP is not a reliable protocol. A UDP socket cannot be in an established state, since UDP is connectionless. Therefore, netstat does not show the state of a UDP socket. A UDP server does not create a new child process for every concurrently served client; instead, the same process handles incoming data packets from all remote clients sequentially through the same socket. This implies that UDP sockets are not identified by the remote address, but only by the local address, although each message has an associated remote address. The flow is shown below:
![UDP connection flow] udp_flow
Now let's write a simple UDP echo server by following the flow above.
server code:
require 'socket'
# SOCK_DGRAM: using UDP
server_socket = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
# Set the socket address (127.0.0.1 keeps the address IPv4, matching AF_INET)
sockaddr = Socket.pack_sockaddr_in( 2200, '127.0.0.1' )
# Bind the socket address to the server socket
server_socket.bind( sockaddr )
# UDP has no listening state, so we just receive the data:
reply, from = server_socket.recvfrom( 20, 0 )
puts "received #{reply} from #{Socket.unpack_sockaddr_in(from).reverse.join(':')}"
server_socket.send("you said: #{reply}", 0, from)
server_socket.close
client code:
require 'socket'
client_socket = Socket.new(Socket::AF_INET, Socket::SOCK_DGRAM, 0)
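sockaddr = Socket.pack_sockaddr_in( 2200, '127.0.0.1' )
# For UDP, connect only sets the default destination address
client_socket.connect(sockaddr)
# send writes the message as exactly one datagram
client_socket.send("hello", 0)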
reply, from = client_socket.recvfrom( 20, 0 )
puts reply
client_socket.close
We can also use the UDPSocket class to simplify the code above.
server code:
require 'socket'
server_socket = UDPSocket.new
server_socket.bind "", 2200
reply, from = server_socket.recvfrom( 20, 0 )
puts "received #{reply} from #{from[2]}:#{from[1]}"
server_socket.send("you said: #{reply}", 0, from[2],from[1])
server_socket.close
client code:
require 'socket'
client_socket = UDPSocket.new
client_socket.connect("localhost", 2200)
# send writes the message as exactly one datagram
client_socket.send("hello", 0)
reply, from = client_socket.recvfrom( 20, 0 )
puts reply
client_socket.close
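The point that a single UDP socket serves every client, with each message carrying its own remote address, can be sketched in one script. This is only an illustration under some assumptions: everything runs over the loopback interface (where datagram loss is unlikely), the port is chosen by the OS, and the helper name `udp_echo_two_clients` is hypothetical.

```ruby
require 'socket'

# One UDP server socket answers two different client sockets.
# Returns the replies and the number of distinct remote ports seen.
# (udp_echo_two_clients is a hypothetical helper for illustration.)
def udp_echo_two_clients
  server = UDPSocket.new
  server.bind('127.0.0.1', 0) # port 0: let the OS pick a free port
  port = server.local_address.ip_port

  clients = Array.new(2) { UDPSocket.new }
  clients.each_with_index { |c, i| c.send("hello #{i}", 0, '127.0.0.1', port) }

  remote_ports = []
  2.times do
    msg, from = server.recvfrom(20) # from is [family, port, host, ip]
    remote_ports << from[1]
    # Reply to whoever sent this particular datagram
    server.send("you said: #{msg}", 0, from[3], from[1])
  end

  replies = clients.map { |c| c.recvfrom(40)[0] }
  [replies, remote_ports.uniq.size]
ensure
  ([server] + clients.to_a).compact.each(&:close)
end

replies, distinct_ports = udp_echo_two_clients
puts replies
puts "distinct remote ports seen by the server: #{distinct_ports}"
```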
In Ruby MRI, Socket is implemented in C, and the source code is located at ext/socket. It is hard for me to understand the Ruby classes by reading the C code. Fortunately, another Ruby implementation solves this problem: Rubinius is written in Ruby more than any other implementation. Its socket.rb shows the class inheritance hierarchy:

- BasicSocket: The base class of all socket classes.
  - Socket: This class provides the basic utilities for other classes. The Socket::Foreign module and the Socket::ListenAndAccept module are the most important modules.
  - UNIXSocket: This class uses the AF_UNIX address family.
    - UNIXServer: Facility to build a UNIX socket server.
  - IPSocket: This class uses the AF_INET address family.
    - TCPSocket: For TCP sockets.
      - TCPServer: Facility to build a TCP socket server.
    - UDPSocket: For UDP sockets.
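The hierarchy can also be checked from Ruby itself, in whichever implementation you are running. The sketch below walks up the superclass chain of a few socket classes; the helper name `socket_chain` is hypothetical, and the walk stops at IO, which is the parent of BasicSocket.

```ruby
require 'socket'

# Collect a class and its superclasses, stopping before IO.
# (socket_chain is a hypothetical helper used for illustration.)
def socket_chain(klass)
  chain = []
  while klass && klass != IO
    chain << klass
    klass = klass.superclass
  end
  chain
end

[TCPServer, UDPSocket, Socket].each do |klass|
  puts socket_chain(klass).join(' < ')
end
```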
A debugger is also available if you want to see how Socket works: just insert "debugger" at the place you are interested in. At the time of writing, Rubinius 1.0 has been released, but unfortunately the debugger is not available in this version, so you may need to check out a previous version such as 0.9 to use it.
This is very nice, I'm curious why you said this:
How is the documentation for Kernel#select broken?