Objective
Socket programming is a very interesting area of software
development. However, with little or no experience it
can be difficult to get a program up and running. In
this tutorial I'll do my best to walk step by step
through the development of a very simple Web server.
When finished, you should have a general understanding
of HTTP and of socket programming with TCP.
This tutorial is presented
in Java. For anyone interested in putting this to work
and gaining some more valuable experience, I'd recommend
(once you've got a grasp on how the code works, and the
concepts of HTTP and socket programming) implementing
this in C#. It'd be a good exercise in using the FCL and
a good first project for C# socket
programming.
Our Web Server
The goals for the Web server we'll design here are:
- Handle multiple HTTP requests
- Accept and parse the HTTP request
- Get the requested file from the server
- Create an HTTP response message, including the requested file
- Send the response to the client
Remember, "the server" is simply the
machine which is running our program (the WebServer
program). Valid files for our Web server will be HTML,
JPEG, GIF, and plain text. This should be enough. If
you're hungry for more, you can extend the server's
functionality and file support.
Brief background on HTTP
The basic idea is this:
When a user requests a web page (for example, clicks a
link) the browser sends an HTTP request message for each
object to the server (which we'll be implementing).
"The objects" here would consist of any HTML files, JPEG
or GIF files, or any others associated with the
requested page. The server receives each request and
responds with an HTTP response message that contains
the requested object. In HTTP 1.0 (used here) a new TCP
connection is created for each request, and the response
is sent back over that connection, after which the
connection is closed. The structure of HTTP 'request'
and 'response' messages is important to the
implementation of our web server. The server will be
sent requests, and will respond with appropriate
response messages. We'll cover the details of these
HTTP messages in a moment. For now, I'll quickly explain
sockets, and then we'll get to some
coding.
Sockets
A socket is
the interface between the application layer and the
transport layer (if that makes any sense). In less
technical terms, a socket can be thought of as a
mailbox. A process delivers information to, and receives
information from, its socket. Since this tutorial is
presented in Java, you will see how to create and use
sockets using the Java class java.net.Socket.
Excuse the extremely brief "discussion" on
sockets, but the concept should become clearer in the
code. So let's get to that.
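import java.io.*;
import java.net.*;
import java.util.*;

public final class WebServer {
    public static void main(String[] args) throws Exception {
        // Port to listen on; any free port from 1024 to 65535 will do.
        int port = 5306;
        ServerSocket listenSocket = new ServerSocket(port);

        // Wait for connections forever, handing each one to its own thread.
        while (true) {
            // accept() blocks until a client connects.
            HttpRequest request = new HttpRequest(listenSocket.accept());
            Thread thread = new Thread(request);
            thread.start();
        }
    }
}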
You'll notice something called a
ServerSocket, which is a little different from
a plain Socket. A ServerSocket listens on a specified
port. When a connection request comes in on that port,
the accept() method of the ServerSocket establishes a
new TCP connection with the client (the computer sending
the request) and returns a new Socket which can be used
to communicate with the client over that TCP connection.
All information written to that socket is sent (over TCP)
to the client, and all information sent by the client to
the server is picked up at that socket.
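To see accept() hand back a working Socket, here is a minimal sketch that plays both client and server inside one process over the loopback interface. The class name AcceptDemo and the method roundTrip are made up for this illustration and are not part of the tutorial's server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; not part of the tutorial's WebServer.
final class AcceptDemo {

    // Opens a listener, connects to it from the same process, and returns
    // the line read from the Socket that accept() hands back.
    static String roundTrip(String message) throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket listener =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                     new Socket(listener.getInetAddress(), listener.getLocalPort());
             Socket serverSide = listener.accept()) {

            // Whatever the client writes into its socket...
            PrintWriter out = new PrintWriter(client.getOutputStream(), true);
            out.println(message);

            // ...is picked up on the socket returned by accept().
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(serverSide.getInputStream()));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("hello over TCP"));
    }
}
```

The client can connect before accept() is called because the operating system queues the finished connection until the server asks for it.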
The
port on which the ServerSocket is listening is up to
you. It can be any number from 1024 to 65,535. Port
numbers from 0 to 1023 are reserved for certain
other application protocols (HTTP, FTP, SMTP, TELNET,
etc.) and are therefore restricted. These are called
well-known port numbers.
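If you let the user pick the port, a small range check can catch mistakes early. This is only a sketch of the rule described above; PortCheck and isUsableServerPort are hypothetical names chosen for the example.

```java
// Hypothetical helper; the names are made up for this illustration.
final class PortCheck {
    static final int FIRST_UNRESERVED_PORT = 1024;
    static final int MAX_PORT = 65535; // port numbers are 16-bit values

    // True if the port is outside the well-known range and still a valid port.
    static boolean isUsableServerPort(int port) {
        return port >= FIRST_UNRESERVED_PORT && port <= MAX_PORT;
    }

    public static void main(String[] args) {
        int[] candidates = {80, 1023, 1024, 5306, 65535, 65536};
        for (int port : candidates) {
            System.out.println(port + " usable: " + isUsableServerPort(port));
        }
    }
}
```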
So execution of the server goes as
follows: First, a ServerSocket is created, and set to
listen on a specified port. The server then enters an
infinite loop, listening for new requests for
connections. When it receives a request, it creates a
new HttpRequest with a reference to the
associated Socket, processes the request (on a separate
thread of execution), and sends a response. The response
will be generated and sent from within the class
HttpRequest. So all that is left is to define the class
HttpRequest.
So let's do it! Here's how it
begins:
final class HttpRequest implements Runnable {
    // Line terminator used by HTTP: carriage return + line feed.
    final static String CRLF = "\r\n";
    Socket socket;

    public HttpRequest(Socket socket) throws Exception {
        this.socket = socket;
    }

    // Entry point for the thread started by WebServer.
    public void run() {
        try {
            processRequest();
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    private void processRequest() throws Exception {
        // Streams for reading from and writing to the client socket.
        InputStream is = this.socket.getInputStream();
        DataOutputStream os = new DataOutputStream(this.socket.getOutputStream());
        BufferedReader br = new BufferedReader(new InputStreamReader(is));

        // The first line of an HTTP request is the request line.
        String requestLine = br.readLine();
We save a reference to the client socket
as a member of the class. The CRLF is simply a carriage
return followed by a line feed, which will come in handy
later. The run() method simply calls processRequest().
This is what lets us process our HTTP requests on separate
threads of execution. (Yep, that's all it takes to make
the server multi-threaded!) The heart of this class is
the processRequest() method.
Only half the method is shown here, because
there is some information on HTTP request
messages and HTTP response messages that
you'll have to know before examining the rest of this
method. But for now, you see the creation of input and
output streams on the client socket. br can now
be used to read from the socket, and os can be
used to write to the socket. Remember, all information
the client sends is sent to our socket, where we
pick it up (read it), and all information we want to
send to the client is sent through the socket,
where we drop it off (write it). The last line of code
here uses br to read the first line of the
request message from the socket. Before we move on,
we've got to talk a little bit about request and
response messages.
HTTP Messages
HTTP messages are
written in ordinary ASCII text that any human being
can read. They consist of an initial request line (for
request messages) or status line (for response messages),
followed by several header lines. In the case of
response messages, the body (consisting of all the
data of the requested object) is the last part of the
message. A closer look at these message types
follows.
HTTP Request Message
GET /somedir/page.html HTTP/1.0
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/4.0
Accept-language: fr
(extra carriage return, line feed)
Above is an example of a typical HTTP request
message. Something very similar to this would be
generated and sent to the server on the appropriate port
(80 for HTTP, though in this example it would be
whatever port you set your ServerSocket to listen on)
when you click on a link for a webpage. In this example,
the line of importance is the request line, the first
line. This line could start with one of several methods,
including GET, POST, and HEAD. For our purposes, we need
only focus on GET request messages. The request line
specifies the file requested. It is the server's
responsibility to bundle that file into a response
message and send it back to the client.
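As a rough sketch of how a request line breaks into its three parts, the example below reads one from an in-memory stream that stands in for the socket. RequestLineDemo and parseRequestLine are hypothetical names, and the check for exactly three tokens is an assumption of this sketch rather than something the tutorial's server does.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.StringTokenizer;

// Hypothetical demo class; the in-memory stream stands in for a socket.
final class RequestLineDemo {

    // Reads the first line of the request and returns {method, path, version}.
    static String[] parseRequestLine(InputStream in) throws IOException {
        BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.US_ASCII));
        String requestLine = br.readLine();
        if (requestLine == null) {
            throw new IOException("empty request");
        }
        StringTokenizer tokens = new StringTokenizer(requestLine);
        if (tokens.countTokens() != 3) {
            throw new IOException("malformed request line: " + requestLine);
        }
        return new String[] {
            tokens.nextToken(), tokens.nextToken(), tokens.nextToken()
        };
    }

    public static void main(String[] args) throws IOException {
        String request = "GET /somedir/page.html HTTP/1.0\r\n"
                + "Host: www.someschool.edu\r\n"
                + "\r\n";
        String[] parts = parseRequestLine(new ByteArrayInputStream(
                request.getBytes(StandardCharsets.US_ASCII)));
        System.out.println("method:  " + parts[0]);
        System.out.println("path:    " + parts[1]);
        System.out.println("version: " + parts[2]);
    }
}
```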
HTTP Response Message
HTTP/1.0 200 OK
Connection: close
Date: Thu, 06 Aug 1998 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Mon, 22 Jun 1998 09:23:25 GMT
Content-Length: 6821
Content-Type: text/html

(data data data data data data ...)
The (data data data data data data
...) represents the entity body and is the
meat of the message. This is the requested object that
is being sent back to the client. The first line is the
status line, which contains the HTTP version, a status
code, and a short status message. The status code for
"OK" is 200. Several other status messages could appear
here, each with its own status code. Some common
examples:
- 200 OK
- 404 Not Found
- 400 Bad Request
- 505 HTTP Version Not Supported
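One way to keep status codes and their messages together is a small lookup table. The sketch below covers only the four codes listed above; StatusLines and statusLine are names invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only knows the four codes listed in the text.
final class StatusLines {
    static final String CRLF = "\r\n";

    private static final Map<Integer, String> REASONS = new LinkedHashMap<>();
    static {
        REASONS.put(200, "OK");
        REASONS.put(400, "Bad Request");
        REASONS.put(404, "Not Found");
        REASONS.put(505, "HTTP Version Not Supported");
    }

    // Builds a complete HTTP/1.0 status line, including the trailing CRLF.
    static String statusLine(int code) {
        String reason = REASONS.get(code);
        if (reason == null) {
            throw new IllegalArgumentException("unknown status code: " + code);
        }
        return "HTTP/1.0 " + code + " " + reason + CRLF;
    }

    public static void main(String[] args) {
        for (int code : REASONS.keySet()) {
            System.out.print(statusLine(code));
        }
    }
}
```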
The next
few lines are the header lines. The only header
line we will use in our WebServer is the
Content-Type header, which specifies the type of
the object being sent. In our server the possible types
are:
- text/html
- image/jpeg
- image/gif
- text/plain
So the response message our
server will send will consist of a status line, followed
by a Content-Type line, a blank line, and then the
requested object. The code below examines the received
request line, generates an appropriate response message,
and sends it to the client.
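        // The client closed the connection without sending a request.
        if (requestLine == null) {
            socket.close();
            return;
        }

        // The request line looks like "GET /path/file.html HTTP/1.0".
        StringTokenizer tokens = new StringTokenizer(requestLine);
        tokens.nextToken(); // skip the method, assumed to be GET
        String fileName = tokens.nextToken();

        // Drop the leading slash so the name is relative to the current directory.
        if (fileName.startsWith("/"))
            fileName = fileName.substring(1);

        // Try to open the requested file.
        FileInputStream fis = null;
        boolean fileExists = true;
        try {
            fis = new FileInputStream(fileName);
        } catch (FileNotFoundException e) {
            fileExists = false;
        }

        // Build the status line, the Content-type header and, for errors, a body.
        String statusLine = null;
        String contentTypeLine = null;
        String entityBody = null;
        if (fileExists) {
            statusLine = "HTTP/1.0 200 OK" + CRLF;
            contentTypeLine = "Content-type: " + contentType(fileName) + CRLF;
        } else {
            statusLine = "HTTP/1.0 404 Not Found" + CRLF;
            contentTypeLine = "Content-type: text/html" + CRLF;
            entityBody = "<html><body>Not Found</body></html>";
        }

        // Send the status line, the header line, and the blank line that ends the headers.
        os.writeBytes(statusLine);
        os.writeBytes(contentTypeLine);
        os.writeBytes(CRLF);

        // Send the entity body: the file itself, or the error page.
        if (fileExists) {
            sendBytes(fis, os);
            fis.close();
        } else {
            os.writeBytes(entityBody);
        }

        // HTTP 1.0: close the connection once the response has been sent.
        os.close();
        br.close();
        socket.close();
    }

    // Maps a file extension to the MIME type sent in the Content-type header.
    private String contentType(String fileName) {
        if (fileName.endsWith(".htm") || fileName.endsWith(".html"))
            return "text/html";
        else if (fileName.endsWith(".jpg") || fileName.endsWith(".jpeg"))
            return "image/jpeg";
        else if (fileName.endsWith(".gif"))
            return "image/gif";
        else if (fileName.endsWith(".txt"))
            return "text/plain";
        else
            return "application/octet-stream";
    }

    // Copies the file to the output stream in 1 KB chunks.
    private static void sendBytes(FileInputStream fis, OutputStream os) throws Exception {
        byte[] buffer = new byte[1024];
        int bytes = 0;
        while ((bytes = fis.read(buffer)) != -1)
            os.write(buffer, 0, bytes);
    }
}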
The two methods contentType() and
sendBytes() are just helper methods, and should
be self-explanatory. What you need to concentrate on is
the implementation of the processRequest()
method.
The fileName could also be a pathname to a
file in another directory. In this implementation,
file names are resolved relative to the current
directory. This detail is up to you. You could prepend
a directory such as "C:\Web\page\" to each
requested file name; all objects/pathnames would then be
searched for in that directory. Your choice. Also,
this is platform dependent, for the simple matter of the
slashes used in pathnames. (A solution to this problem
is beyond the scope of this tutorial, but not terribly
difficult.)
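For readers who want to try serving files from a fixed directory, here is one possible sketch using java.nio.file, which takes care of the platform's path separator. DocumentRoot and resolve are hypothetical names, and returning null for paths that escape the root (such as "/../secret.txt") is a design choice of this sketch, not something the tutorial's server does.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; not part of the tutorial's HttpRequest class.
final class DocumentRoot {

    // Maps a request path such as "/images/logo.gif" to a file under root.
    // Returns null when the path would end up outside root.
    static Path resolve(Path root, String requestPath) {
        String relative = requestPath.startsWith("/")
                ? requestPath.substring(1)
                : requestPath;
        Path base = root.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the check.
        Path candidate = base.resolve(relative).normalize();
        return candidate.startsWith(base) ? candidate : null;
    }

    public static void main(String[] args) {
        Path root = Paths.get("web"); // assumed directory name
        System.out.println(resolve(root, "/index.html"));
        System.out.println(resolve(root, "/../secret.txt"));
    }
}
```

The returned Path can be turned into the file name the tutorial code expects with toString(), or passed to new FileInputStream(path.toFile()).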
Telioses! That's it,
you're done. To try it out, just run this puppy on your
machine. Then fire up your favorite browser and direct
it to:
http://hostname:port#/filename
where port# is the port you
have your ServerSocket listening on. Make sure
"filename" is in the appropriate directory, and that it
is one of the supported file types. hostname
is the name of the machine on which you are running the
WebServer, such as ws13.ug.cs.sunysb.edu or
129.49.238.126.
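If you'd rather see the raw response than a rendered page, a few lines of Java can play the browser's part. This is a minimal sketch, assuming the server closes the connection after responding (as HTTP 1.0 servers do); SimpleHttpClient and fetch are made-up names, and the defaults in main (localhost, 5306, /index.html) are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical test client; not part of the tutorial's server code.
final class SimpleHttpClient {

    // Sends a bare HTTP/1.0 GET and returns everything the server sends back.
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            // Request line followed by the blank line that ends the request.
            String request = "GET " + path + " HTTP/1.0\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read until the server closes the connection.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                response.write(buffer, 0, n);
            }
            // ISO-8859-1 maps every byte to one char, so binary bodies survive.
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5306;
        String path = args.length > 2 ? args[2] : "/index.html";
        System.out.println(fetch(host, port, path));
    }
}
```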
Conclusion
Hopefully,
you're not staring at your monitor wondering what you
just read. I tried to keep things as clear and concise
as possible. Now that I've finished this tutorial, I'll
get the C# code for this up here ASAP, but I recommend
you use this tutorial as a roadmap and write the C#
code yourself. I'll post both Java and C# code samples
in the Tools
section.