intro

CGI has been a staple of the web for a long time. In the 90s, it drove many of the web's complex applications thanks to its accessibility; programmers didn't have to handle connections, encryption, multiplexing or any of that complex protocol stuff. They just had a list of environment variables to extract the headers from, and they read the body from standard input. The protocol let programmers write freely and worry less about the language they were using: if it supported short-lived processes (execute, then exit), then it supported HTTP.

Whilst CGI was in many ways beneficial and probably led to where the web is now, it had two downsides:

  1. The overhead was high
  2. The throughput was low

In essence, it wasn't performant. But what could the web do? It relied on CGI for much of its popular web applications; people couldn't simply let go of such a useful and simple protocol. This is where FastCGI comes in.

fastcgi

FastCGI has the same purpose as CGI, which is to abstract the HTTP protocol away from the application, but with the intent of achieving high performance. How does FastCGI do this? Instead of short-lived processes, it uses long-lived ones. All connections go through a socket that the server (the application) has created and is listening on; all reads from that socket read the body of the request, and all writes are sent back to the HTTP client.

The performance gain wasn't only theoretical; it actually worked. FastCGI spread like wildfire, being implemented in languages like C, C++, Lua, Java and even shell scripts. Not only that, but it accidentally gave birth to PHP, which later became the web's most used language. PHP never implemented HTTP and instead depended on FastCGI.

In this day and age, we have essentially reached the pinnacle of the web. We have high-speed connections that are 10x faster than, say, those of the 90s, we have a wide range of languages to choose from, and we can finally support features like real-time streaming, which was unheard of in the 90s. Not only that, but we can generate the user interface itself instead of just generating dynamic pages, allowing for immersive web pages that feel and act like real software installed on your computer.

good software design

One area is still missing, though, and that is good software design. Web applications have become worse as time has passed; their designs are copied from one company to another, and there is no creativity in the web anymore; everything is so sanitized and standardized. As a web developer, I want to experiment with new ideas and new languages, but I am left at a handicap because what is popular is what works.

Think about it: When was the last time you heard about a web project written in Smalltalk, Eiffel, Tcl, or even D? Almost never, right? Of course, just because a language exists doesn’t mean you should use it, but you should have the ability to use it.

This is what cl-scgi aims to address. I love Common Lisp, but its web libraries are of piss-poor quality. It has HTTP libraries that only support version 1.1, which isn't acceptable because HTTP/2 is much more performant. However, HTTP/2's specification is 150% the length of HTTP/1.1's, hinting at an underlying complexity in any implementation.

Similarly, FastCGI is a lot more complex than it needs to be. Both Simple CGI (SCGI) and FastCGI provide virtually the same functionality, but SCGI is that much simpler to implement.

implementing scgi

If you want to implement SCGI, you should know how to do the following in the language you’re dealing with:

  1. creating sockets and closing them
  2. listening for connections and accepting them
  3. sending to and reading from connections
  4. parsing an SCGI request

That's literally all; check out the specification if you don't believe me. The barebones build of cl-scgi (without any helper or deprecated functions) stands at about 100-200 lines of readable code, whilst the reference FastCGI implementation (libfcgi) is at ~3000 lines.
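To give a feel for step 4, here is a rough sketch of parsing the netstring-encoded header block that every SCGI request starts with. This is not cl-scgi's actual implementation; the function name and return convention are illustrative only:

(defun parse-scgi-head (request)
  "Parse the netstring header block at the start of REQUEST (a string).
Returns a hash table of headers plus the index where the body begins."
  (let* ((nul (code-char 0))
         (colon (position #\: request))
         (head-len (parse-integer request :end colon))
         (start (1+ colon))
         (end (+ start head-len))
         (headers (make-hash-table :test #'equal)))
    ;; the header block is a run of NUL-terminated key/value pairs
    (loop with pos = start
          while (< pos end)
          do (let* ((key-end (position nul request :start pos))
                    (val-end (position nul request :start (1+ key-end))))
               (setf (gethash (subseq request pos key-end) headers)
                     (subseq request (1+ key-end) val-end))
               (setf pos (1+ val-end))))
    ;; a comma terminates the netstring; the body follows it
    (values headers (1+ end))))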

Simpler code is better and more performant code.

using cl-scgi

Using cl-scgi is quite simple: just invoke tcp-server or unix-server with your callback and you're done. Here's an example:

(tcp-server
 "127.0.0.1"
 6970
 (lambda (head len stream)
   (declare (type (vector (unsigned-byte 8)) head))
   (declare (integer len))
   (let ((headers (cl-scgi:parse-headers (babel:octets-to-string head)))
         (body (cl-scgi:read-until-content-length len stream)))
     ;; print headers as a list
     (print (alexandria:hash-table-plist headers))
     ;; print the request body as a string
     (print (babel:octets-to-string body))
     ;; respond with "asd" and no extra headers
     (cl-scgi:response-string (make-hash-table) "asd" stream)
     (force-output stream))))

Now, if you link your favorite web server (nginx for me) with SCGI, point it at 127.0.0.1:6970 and make a request, it should print out the following:

# i have nginx set up so that all scgi requests
# go to 127.0.0.1:6970
curl http://localhost:8080/scgi
# prints: asd
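For reference, the nginx side of this setup might look something like the following. This is a sketch; the /scgi location and the scgi_params include are assumptions about your configuration, not something cl-scgi mandates:

# route /scgi to the cl-scgi server from the example above
location /scgi {
    include scgi_params;      # pass the standard CGI variables
    scgi_pass 127.0.0.1:6970; # the address tcp-server listens on
}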

helper functions

Besides tcp-server and unix-server, cl-scgi has other helper functions that make it easier to use cl-scgi as a web server.

request-parsing methods

Let's start with the request-parsing functions. You have:

  1. parse-request-from-stream: takes in a stream and returns the headers as a vector of (unsigned-byte 8) bytes, plus a number: the count of bytes that should be in the body.
  2. parse-request: takes in a whole request as a (vector (unsigned-byte 8)) of bytes and returns the header bytes and the body bytes.
  3. parse-request-with-headers: takes in a whole request as a (vector (unsigned-byte 8)) of bytes and returns the headers as a hash map and the body bytes.
  4. parse-request-as-string: takes in a whole request as a (vector (unsigned-byte 8)) of bytes and returns the headers as a hash map and the body as a string.

All of the functions above return multiple values (via (values)) rather than a list; you can easily convert them to a list using (multiple-value-list). Do note that parse-request, parse-request-with-headers and parse-request-as-string are all deprecated and will be removed in future releases.

You can create equivalents of these functions yourself through babel:octets-to-string and cl-scgi:parse-headers, as shown below.
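For example, a rough equivalent of the deprecated parse-request-as-string, assembled from the non-deprecated pieces (a sketch; stream is assumed to be a binary stream positioned at the start of an SCGI request):

(multiple-value-bind (head-bytes content-length)
    (cl-scgi:parse-request-from-stream stream)
  (let ((headers (cl-scgi:parse-headers (babel:octets-to-string head-bytes)))
        (body (cl-scgi:read-until-content-length content-length stream)))
    ;; headers is a hash map, body a byte vector; decode it to a string
    (values headers (babel:octets-to-string body))))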

byte-reading methods

There are two:

  1. read-until-eof: takes in a stream and reads until EOF is encountered, then returns a vector of the bytes read.
  2. read-until-content-length: takes in a content length (a number) and a stream, reads until that many bytes have been consumed, then returns them as a vector.
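Which one you want depends on whether the web server told you the body's length. A sketch, reusing the len and stream parameters from the callback example earlier:

;; with a known content length (the usual SCGI case)
(cl-scgi:read-until-content-length len stream)

;; without one, drain the connection instead
(cl-scgi:read-until-eof stream)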

response-writing methods

There are three functions:

  1. write-bytes: the same as write-sequence without the start and end keywords; it also works with streams that only support write-byte.
  2. format-headers: takes in a hash table, turns it into HTTP-compatible headers and returns them as a vector of bytes.
  3. response-string: takes in a header hash map, a body string and a stream, and writes all of those to the stream in a way that's compatible with the SCGI and HTTP protocols.
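For instance, a response with explicit headers could look like the sketch below. The header names and values are illustrative, and stream is assumed to be the connection stream your callback received:

;; build a header map and let response-string handle the formatting
(let ((headers (make-hash-table :test #'equal)))
  (setf (gethash "Status" headers) "200 OK")
  (setf (gethash "Content-Type" headers) "text/html")
  (cl-scgi:response-string headers "<h1>hello</h1>" stream))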

cancelling methods

One thing I failed to mention is that requests can technically be invalid. A request sent by a faulty web server could throw an error, or the callback could have bad parameters. Such an error can be handled through one of four options:

  1. STOP: stops accepting connections altogether.
  2. CONTINUE-CLOSE: continues accepting connections but closes this one.
  3. CONTINUE-EXEC: continues accepting connections but also executes the global continue-callback.
  4. CONTINUE-EVAL: continues accepting connections but also executes a user-prompted lambda.

With option 3, the programmer can bind continue-callback from outside the package to specify a custom callback. This callback receives only a stream and is executed once option 3 is selected.

Do note that option 3 will not be offered if continue-callback is nil (the default).
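Here is a sketch of wiring up option 3. It assumes continue-callback is exported under exactly that name (the precise symbol may differ; check the package exports):

;; log invalid requests and keep serving; the library closes the
;; connection itself, so we don't call close here
(setf cl-scgi:continue-callback
      (lambda (stream)
        (declare (ignore stream))
        (format t "~&received an invalid request; continuing~%")))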

multi-threaded behavior

Both unix-server and tcp-server can become multi-threaded by setting the :multi-threaded key to t.

maximum accepted connections

Both unix-server and tcp-server take a key that limits the number of connections queued for acceptance; it's named :backlog. To accept more or fewer connections than the default (128), simply set it to the desired amount.
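A sketch combining the keys from this section and the last; my-callback is a placeholder for your own handler:

;; my-callback is hypothetical; any (head len stream) lambda works
(cl-scgi:tcp-server "127.0.0.1" 6970 #'my-callback
                    :multi-threaded t ; handle connections in threads
                    :backlog 256)     ; queue up to 256 pending connections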

connections are closed by default

Never close connections on your own, because connections are closed by the library itself by default. It's fine to use finish-output or force-output, but not close.

final note

This is my first library, so expect bugs and undocumented areas. It is in alpha status with few unit tests, and it has not been tested on any Common Lisp implementation besides SBCL.