Socks 5: Continuing Sending a Token

In this installment, we’ll first go over the answers to the questions from the previous installment, in which we introduced working, but less-than-perfect code.

Below is a diff chunk that repairs an error in the main function (question 5 from the previous installment). I will, from time to time, post chunks like these rather than complete code. These chunks should be easy enough to read for most, once you get used to them: a line starting with a space is unchanged from the previous version, a line starting with a + is new, a line starting with a – is removed. The line with the @ sign indicates where in the file the chunk was (lines 17 through 24) and where it is now (lines 14 through 23). This is the “unified diff format”, which is what I will consistently use when posting diff chunks, a.k.a. patches.

@@ -17,7 +14,9 @@ int main()
        {
                boost::thread::yield();
                done = client.done() || server.done();
-       } while (!client.done() || !server.done());
+       } while (!done);
+       client_thread.join();
+       server_thread.join();
 
        return 0;
 }

In the original code, neither one of the threads was “joined” explicitly, which is bad form. They would eventually be joined by the thread object’s destructors, but it is bad form nonetheless to not handle thread life-times explicitly. Hence, the two new lines with the calls to join.

Also, in the original code, the while() clause of the main loop called the two done methods again, rather than rely on the done variable. Both the server and the client will eventually die once the done flag is set,
so as soon as it is set we can start joining the two threads. Note, however, that in this new version, the condition is slightly changed: it no longer requires that both the server and the client are done, but only that one of them is done – which should kill the other. This relies on both the client and the server being well-behaved in the sense that when the done flag is set, they will both quit. Therefore, this code is still a bit fragile.

Question 1: “change the copied code between server and client to shared code between the two”

In order to do this, we need to move the code from both the server and the client to something they can both share. One good candidate would be a new class that can handle all of the socket-related operations for us. Those operations would have to include

  • RAII
  • creating new connections
  • anything server-related (bind, listen, accept)

but the same class should not allow outside code to mess with its innards while still allowing “synchronous multiplexing” (remember that?) using select.

Here’s what such a class could look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
namespace Details
{
	class Socket
	{
	public :
		Socket();
		Socket(int socket);
		~Socket();
 
		Socket & operator=(int socket);
		void swap(Socket & other);
 
		void set(fd_set * fds, int & highest) const;
		bool isSet(const fd_set * fds) const;
 
		void connect(boost::uint32_t address, boost::uint16_t port);
		void bind(boost::uint32_t address, boost::uint16_t port);
		void listen(unsigned long backlog_queue_size = 0);
		int accept();
		void close();
 
		ssize_t send(const std::vector< char > & data);
		ssize_t recv(std::vector< char > & data);
 
	private :
		Socket(const Socket &);
		Socket & operator=(const Socket &);
 
		int socket_;
	};
}

Now, there are quite a few little caveats in this header that you might want to be on the look-out for. Note, for example, that this class is neither copy-constructible nor assignable, because both the copy constructor and the assignment operator are private (and presumably not implemented) – see lines 26 and 27. But wait! Now look at line 10. Didn’t I say that the class wasn’t assignable? Then what is that assignment operator doing there – and why is there a swap method for this class?

Also note that there are two constructors: one default constructor and one that takes an integer argument.

Then there’s lines 13 and 14, where we mess around with fd_sets. That’ll certainly help when playing with select, won’t it?

Enough with the caveats already: let’s have a look at the underlying code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
namespace Details
{
	Socket::Socket()
		: socket_(::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP))
	{
		if (socket_ == -1)
		{
			throw std::bad_alloc(); // failed to create socket - we might want to add some more data to the exception here
		}
		else
		{ /* all is well */ }
	}
 
	Socket::Socket(int socket)
		: socket_(socket)
	{ /* no-op */ }
 
	Socket::~Socket()
	{
		if (socket_ != -1)
		{
			::close(socket_);
		}
		else
		{ /* no-op */ }
	}
 
 
	Socket & Socket::operator=(int socket)
	{
		Socket rhs(socket);
		swap(rhs);
		return *this;
	}
 
	void Socket::swap(Socket & other)
	{
		std::swap(socket_, other.socket_);
	}
 
	void Socket::set(fd_set * fds, int & highest) const
	{
		FD_SET(socket_, fds);
		if (socket_ >= highest)
		{
			highest = socket_ + 1;
		}
		else
		{ /* OK for now */ }
	}
 
	bool Socket::isSet(const fd_set * fds) const
	{
		return (FD_ISSET(socket_, fds) != 0);
	}
 
	void Socket::connect(boost::uint32_t address, boost::uint16_t port)
	{
		sockaddr_in in_address;
		memset(&in_address, 0, sizeof(in_address));
		in_address.sin_family = AF_INET;
		in_address.sin_port = htons(port);
		in_address.sin_addr.s_addr = htonl(address);
		if (::connect(socket_, (sockaddr*)&in_address, sizeof(in_address)) != 0)
		{
			throw std::runtime_error("failed to make the connection!");
		}
		else
		{ /* all is well */ }
	}
 
	void Socket::bind(boost::uint32_t address, boost::uint16_t port)
	{
		sockaddr_in in_address;
		memset(&in_address, 0, sizeof(in_address));
		in_address.sin_family = AF_INET;
		in_address.sin_port = htons(port);
		in_address.sin_addr.s_addr = htonl(address);
		if (::bind(socket_, (sockaddr*)&in_address, sizeof(in_address)) != 0)
		{
			throw std::runtime_error("failed to make the connection!");
		}
		else
		{ /* all is well */ }
	}
 
	void Socket::listen(unsigned long backlog_queue_size/* = 0*/)
	{
		if (::listen(socket_, backlog_queue_size) != 0)
		{
			throw std::runtime_error("Failed to listen on socket");
		}
		else
		{ /* all is well */ }
	}
 
	int Socket::accept()
	{
		return ::accept(socket_, 0, 0);
	}
 
	void Socket::close()
	{
		if (socket_ != -1)
		{
			::close(socket_);
			socket_ = -1;
		}
		else
		{ /* nothing to close */ }
	}
 
	ssize_t Socket::send(const std::vector< char > & data)
	{
		return ::send(socket_, &data[0], data.size(), 0);
	}
 
	ssize_t Socket::recv(std::vector< char > & data)
	{
		return ::recv(socket_, &data[0], data.size(), 0);
	}
}

Well, this confirms that there is no copy constructor and no run-of-the-mill assignment operator. That is good, because that means that once the class has taken ownership of a file descriptor, it will at least make sure it doesn’t get copied around. An alternative would have been to use reference counting or some other such mechanism, but that would have been both overkill and messy – neither of which are good options.

The two constructors we saw earlier are very different from each other: the first creates a TCP socket (IPv4 address family, stream socket using the TCP protocol) whereas the second doesn’t actually do anything, other than copying the socket FD to take ownership of it. Note that the FD may well be -1, indicating that the socket isn’t valid (which would mean it shouldn’t be closed). Luckily, the destructor takes care of that possibility. (Actually, there’s no luck involved).

The assignment operator takes ownership of a socket FD. Just for good measure, I threw in the swap trick, which we will see more of later.

The set and isSet methods are in deed used for select, as we will see shortly. The code is basically copied from the server in this case. Speaking of code copied from the server, we also have bind, listen and accept. The connect function is copied from the client.

Let’s have a look at some of the changes this implies in the client code:

@@ -1,11 +1,10 @@
 #include "Client.h"
-#include <sys/socket.h>
-#include <netinet/in.h>
-#include <unistd.h>
 #include <stdexcept>
 #include "../../lib/rfc1961/Token.h"
+#include "Details/Socket.h"
 
 using namespace Vlinder::Chausette::RFC1961;
+using Details::Socket;
 
 Client::Client(bool & done, unsigned short port)
 	: done_(done)

In this first chunk, you can see that we’ve removed the includes we no longer need and replaced them by the include of the Socket class’ header, and a using declaration telling the compiler what we mean when we say Socket. Then, there follows a big chunk of code being removed, replaced by only three lines of code:

@@ -15,49 +14,11 @@ Client::Client(bool & done, unsigned short port)
 
 Client & Client::operator()()
 {
-	int socket(::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP));
-	if (socket == -1)
-	{
-		done_ = true;
-		std::cerr << "Failed to create socket" << std::endl;
-		throw std::runtime_error("Failed to create socket");
-	}
-	else
-	{ /* all is well */ }
-	struct SocketGuard
-	{
-		SocketGuard(int & s)
-			: s_(s)
-		{ /* no-op */ }
-
-		~SocketGuard()
-		{
-			if (s_ != -1)
-			{
-				close(s_);
-			}
-			else
-			{ /* Not something I can close */ }
-		}
-
-		int & s_;
-	} socket_guard(socket);
-	sockaddr_in address;
-	memset(&address, 0, sizeof(address));
-	address.sin_family = AF_INET;
-	address.sin_port = htons(port_);
-	address.sin_addr.s_addr = htonl(0x7f000001);
-	if (::connect(socket, (sockaddr*)&address, sizeof(address)) != 0)
-	{
-		done_ = true;
-		std::cerr << "Failed to make the connection!" << std::endl;
-		throw std::runtime_error("failed to make the connection!");
-	}
-	else
-	{ /* all is well */ }
 	Token token_to_send(Token::failure__);
 	std::vector< char > data(serialize(token_to_send));
-	ssize_t sent(::send(socket, &data[0], data.size(), 0));
+	Socket socket;
+	socket.connect(0x7f000001, port_);
+	ssize_t sent(socket.send(data));
 	if (sent <= 0)
 	{
 		std::cerr << "Failed to send";

Of course, the code is actually still there – in the Socket.cpp file – but its intricacies are hidden away now, and neatly packed into the methods that both tell us what they do, and do only what they should – i.e. connect connects and send sends.

Let’s take a look at what happened to Server.cpp:

@@ -1,10 +1,11 @@
 #include "Server.h"
 #include <boost/thread.hpp>
-#include <sys/socket.h>
-#include <netinet/in.h>
+#include <sys/time.h>
 #include "../../lib/rfc1961/Token.h"
+#include "Details/Socket.h"
 
 using namespace Vlinder::Chausette::RFC1961;
+using Details::Socket;
 
 #define SERVER_TIMEOUT	500	// ms
 
@@ -23,59 +24,13 @@ Server & Server::operator()()
 		error__,
 	};
 
-	int socket(::socket(AF_INET, SOCK_STREAM, IPPROTO_TCP));
-	if (socket == -1)
-	{
-		done_ = true;
-		std::cerr << "Failed to create socket" << std::endl;
-		throw std::runtime_error("Failed to create socket");
-	}
-	else
-	{ /* all is well */ }
-	struct SocketGuard
-	{
-		SocketGuard(int & s)
-			: s_(s)
-		{ /* no-op */ }
-
-		~SocketGuard()
-		{
-			if (s_ != -1)
-			{
-				close(s_);
-			}
-			else
-			{ /* Not something I can close */ }
-		}
-
-		int & s_;
-	} socket_guard(socket);
-	sockaddr_in address;
-	memset(&address, 0, sizeof(address));
-	address.sin_family = AF_INET;
-	address.sin_port = htons(port_);
-	address.sin_addr.s_addr = htonl(0x7f000001);
-	if (::bind(socket, (sockaddr*)&address, sizeof(address)) != 0)
-	{
-		done_ = true;
-		std::cerr << "Failed to bind socket" << std::endl;
-		throw std::runtime_error("Failed to bind socket");
-	}
-	else
-	{ /* all is well */ }
-	if (::listen(socket, 0) != 0)
-	{
-		done_ = true;
-		std::cerr << "Failed to listen on socket" << std::endl;
-		throw std::runtime_error("Failed to listen on socket");
-	}
-	else
-	{ /* all is well */ }
+	Socket socket;
+	socket.bind(0x7f000001, port_);
+	socket.listen();
 
 	State state(listening__);
 	fd_set read_fds;
-	int client_socket(-1);
-	SocketGuard client_socket_gaurd(client_socket);
+	Socket client_socket(-1);
 	std::vector< char > buffer;
 	buffer.reserve(Token::max_token_size__);
 	while (!done_)

The first two chunks are pretty much the same as for the client: remove a lot of code that is no longer needed here. That is followed, however by a slightly more interesting piece of code, which shows us how the set and isSet members of our new Socket class are used:

@@ -89,8 +44,9 @@ Server & Server::operator()()
 				timeout.tv_sec = SERVER_TIMEOUT / 1000;
 				timeout.tv_usec = (SERVER_TIMEOUT % 1000) * 1000;
 				FD_ZERO(&read_fds);
-				FD_SET(socket, &read_fds);
-				int select_result(select(socket + 1, &read_fds, 0, 0, &timeout));
+				int highest(-1);
+				socket.set(&read_fds, highest);
+				int select_result(select(highest, &read_fds, 0, 0, &timeout));
 				if (select_result < 0)
 				{
 					state = error__;
@@ -99,7 +55,7 @@ Server & Server::operator()()
 				{ /* nothing to do in this case: we timed out */ }
 				else
 				{
-					if (FD_ISSET(socket, &read_fds))
+					if (socket.isSet(&read_fds))
 					{
 						state = accepting__;
 					}

As you can see, the file descriptor from the Socket class’ instance is never revealed here, but the proper bit is set in the fd_set and the highest variable, which is needed for the UNIX version of select is set as needed.

The rest of the code is really just a simplification that we can make thanks to the new Socket class.

@@ -110,20 +66,13 @@ Server & Server::operator()()
 			break;
 		case accepting__ :
 			std::clog << "Accepting a new connection" << std::endl;
-			client_socket = accept(socket, 0, 0);
+			client_socket = socket.accept();
 			state = reading__;
 			break;
 		case reading__ :
 			std::clog << "Reading from connection" << std::endl;
-			if (client_socket == -1)
-			{
-				state = error__;
-				break;
-			}
-			else
-			{ /* all is well - carry on */ }
 			buffer.resize(Token::max_token_size__);
-			ssize_t received(recv(client_socket, &buffer[0], buffer.size(), 0));
+			ssize_t received(client_socket.recv(buffer));
 			if (received <= 0)
 			{ /* nothing to do here */ }
 			else
@@ -133,8 +82,7 @@ Server & Server::operator()()
 				Token received_token(deserialize(buffer));
 				std::cout << received_token << std::endl; // dump it to the console
 			}
-			close(client_socket);
-			client_socket = -1;
+			client_socket.close();
 			state = listening__;
 			break;
 		case error__ :

This goes to show that a really rather minor change in the code, moving a bit of code out of two classes into a single class, can go a long way toward cleaning up the code and can have a radical impact on the structure of the code itself – which will become even more evident as we progress to the next question.

Question 2: “make sure that once a connection is accepted the server doesn’t block until the client writes”

Some of you will have tried this by adding an intermediary state in the (primitive) state machine between reading and receiving. Although that is a feasible approach, it is not necessary as select is well able to handle more than one socket, so in stead of doing that, we’ll just use the call to select we already have, and do some actual multiplexing. This also means that once a new connection is accepted, in stead of going to the reading__ state, we go back to the listening__ state:

@@ -46,10 +47,40 @@ Server & Server::operator()()
 				FD_ZERO(&read_fds);
 				int highest(-1);
 				socket.set(&read_fds, highest);
+				if (client_socket)
+				{
+					client_socket.set(&read_fds, highest);
+				}
+				else
+				{ /* nothing in the socket */ }
 				int select_result(select(highest, &read_fds, 0, 0, &timeout));
 				if (select_result < 0)
 				{
-					state = error__;
+					// This may be just the client having hung up or some similar error.
+					// Check for that case and, if it is so, just close the client socket.
+					FD_ZERO(&read_fds);
+					highest = -1;
+					if (client_socket)
+					{
+						client_socket.set(&read_fds, highest);
+						timeout.tv_sec = 0;
+						timeout.tv_usec = 1; // very short poll - I could have used poll here as well, but I prefer to be consistent for now
+
+						select_result = select(highest, &read_fds, 0, 0, &timeout);
+						if ((select_result < 0) && (errno == EBADF))
+						{
+							client_socket.close();
+						}
+						else
+						{
+							// the problem was not with the client socket, so assume it is more serious
+							state = error__;
+						}
+					}
+					else
+					{	// certainly not the client socket
+						state = error__;
+					}
 				}
 				else if (select_result == 0)
 				{ /* nothing to do in this case: we timed out */ }
@@ -59,6 +90,10 @@ Server & Server::operator()()
 					{
 						state = accepting__;
 					}
+					else if (client_socket && client_socket.isSet(&read_fds))
+					{
+						state = reading__;
+					}
 					else
 					{ /* dunno what happened - there should be no such thing as a spurious wake-up here.. */ }
 				}
@@ -67,7 +102,7 @@ Server & Server::operator()()
 		case accepting__ :
 			std::clog << "Accepting a new connection" << std::endl;
 			client_socket = socket.accept();
-			state = reading__;
+			state = listening__;
 			break;
 		case reading__ :
 			std::clog << "Reading from connection" << std::endl;

You’ve perhaps noticed the following bit of code:

50
51
52
53
54
55
if (client_socket)
{
	client_socket.set(&read_fds, highest);
}
else
{ /* nothing in the socket */ }

and you may have wondered how that works, seeing as client_socket is not any kind of integer or boolean. The answer is that Socket now overloads the cast-to-bool operator, which is declared like this:

@@ -9,6 +9,8 @@ namespace Details
 	class Socket
 	{
 	public :
+		typedef int Socket::* Bool_;
+
 		Socket();
 		Socket(int socket);
 		~Socket();
@@ -16,6 +18,8 @@ namespace Details
 		Socket & operator=(int socket);
 		void swap(Socket & other);
 
+		operator Bool_ () const;
+
 		void set(fd_set * fds, int & highest) const;
 		bool isSet(const fd_set * fds) const;

and defined like this:

@@ -44,8 +45,13 @@ namespace Details
 		std::swap(socket_, other.socket_);
 	}
 
+	Socket::operator Socket::Bool_ () const
+	{
+		return (socket_ != -1) ? &Socket::socket_ : 0;
+	}
+
 	void Socket::set(fd_set * fds, int & highest) const
 	{
 		FD_SET(socket_, fds);
 		if (socket_ >= highest)
 		{

Note the typedef that defines the Bool_ type as a pointer-to-member-of-type-int-in-class-Socket. The reason for this rather convoluted way of going about things is to make the implicit conversion to a boolean type well-behaved. In fact, implicitly converting directly to bool can have some very nasty side-effects – but I’m sure I’ll get to those in some other post. Note, though, that we now return a pointer to the member socket_ which could conceivably be used to break our class’ encapsulation – I thought it was a bit overkill to add a used_for_cast_to_bool_ member (which I will sometimes do if I’m in a paranoid mood).

Note, by the way, that error-handling still lacks some grace in this version.

Question 3: “break the server’s code up into pieces”

In order to do this, the sockets we work with, as well as the server’s state, become members of the class so the different pieces can still work with it. To avoid unnecessary dependencies, we only use a forward declaration of Socket in the Server header, which forces us to do dynamic allocation for both sockets – and to add a destructor – but which is a good practice to avoid bloated compile times. Due to this, though, question 7 (enforce, in the code, the fact that we don’t want instances of the Server and Client classes to be copied) becomes all the more important.

See the patch for the code.

We’ll get to the other questions – and a few new ones – in the next installment (otherwise, I’ll be talking into my microphone to record this podcast four way too long).

About rlc

Software Analyst in embedded systems and C++, C and VHDL developer, I specialize in security, communications protocols and time synchronization, and am interested in concurrency, generic meta-programming and functional programming and their practical applications. I take a pragmatic approach to project management, focusing on the management of risk and scope. I have over two decades of experience as a software professional and a background in science.
This entry was posted in C++ for the self-taught and tagged . Bookmark the permalink.