cl-scgi: scgi in common lisp
intro⌗
CGI has been a staple of the web for quite a long time. In the 90s, it drove many of the web’s complex applications thanks to its accessibility: programmers didn’t have to handle connections, encryption, multiplexing or any of that complex protocol stuff. They just had a list of environment variables to extract the headers from, and they got the body from standard input. The protocol let programmers write freely and worry less about the language they were using: if a language supported short-lived processes (run, then exit), it supported HTTP.
Whilst CGI was beneficial in many ways and probably led the web to where it is now, it has two downsides:
- The overhead was high
- The throughput was low
In essence, it wasn’t performant. But what could the web do? It relied on CGI for many of its popular web applications, and people couldn’t simply let go of such a useful and simple protocol. This is where FastCGI comes in.
fastcgi⌗
FastCGI has the same purpose as CGI, which is to abstract away the HTTP protocol from the application, but with the intent of achieving high performance. How would FastCGI do this? Well, instead of short-lived processes, it used long-lived processes. All connections would go through a socket that the server (the application) has created and is listening on; reads from that socket yield the request, and writes to it are relayed to the HTTP client.
The performance gain wasn’t only theoretical; it actually worked. Afterwards, FastCGI spread like wildfire, being implemented in languages like C, C++, Lua, Java and even shell scripts. It also became a common way to deploy PHP, which later became one of the web’s most used server-side languages: PHP deployments typically don’t speak HTTP to the outside world themselves and instead depend on FastCGI (or a web server module).
In this day and age, we have essentially reached the pinnacle of the web. We have high speed connections that are many times faster than those of the 90s, we have a wide range of languages to choose from, and we can finally support features like real-time streaming, which was unheard of in the 90s. Not only that, but we can generate the user interface instead of generating dynamic pages, allowing for immersive web pages that feel and act like real software installed on your computer.
good software design⌗
One thing is still missing, though, and that is good software design. Web applications have gotten worse over time: their designs are copied from one company to another and there is no creativity on the web anymore; everything is so sanitized and standardized. As a web developer, I want to experiment with new ideas and new languages, but I am left at a handicap because what is popular is what works.
Think about it: When was the last time you heard about a web project written in Smalltalk, Eiffel, Tcl, or even D? Almost never, right? Of course, just because a language exists doesn’t mean you should use it, but you should have the ability to use it.
This is what cl-scgi aims to be. I love Common Lisp, but its web libraries are of piss poor quality. Its HTTP libraries only support version 1.1, which isn’t acceptable because 2.0 is much more performant. However, the HTTP 2.0 specification is about 150% the size of the HTTP 1.1 specification, hinting at the underlying complexity of an implementation.
Similarly, FastCGI is a lot more complex than it needs to be. Simple CGI (SCGI) and FastCGI provide virtually the same functionality, but SCGI is that much simpler to implement.
implementing scgi⌗
If you want to implement SCGI, you should know how to do the following in the language you’re dealing with:
- creating sockets and closing them
- listening for connections and accepting them
- sending to and reading from connections
- parsing an SCGI request
That’s literally all; check out the specification if you don’t believe me. The barebones build (without any helper or deprecated functions) of cl-scgi stands at about 100-200 lines of readable code, whilst a FastCGI implementation (libfcgi) is at ~3000 lines of code.
Simpler code tends to be better and more performant code.
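To give a feel for how small the parsing step is, here is a minimal sketch that uses only standard Common Lisp. It is not the code cl-scgi ships: parse-scgi-request and octets-to-ascii are names I made up for illustration, the decoding assumes plain ASCII headers, and error handling for malformed input is mostly left out. It relies on the request layout from the SCGI specification: a netstring of NUL-separated header names and values, a comma, then the body.

```lisp
;; Sketch only: PARSE-SCGI-REQUEST and OCTETS-TO-ASCII are hypothetical
;; helpers, not part of cl-scgi.

(defun octets-to-ascii (octets)
  "Decode OCTETS as a string, assuming every byte is an ASCII code."
  (map 'string #'code-char octets))

(defun parse-scgi-request (octets)
  "Parse a complete SCGI request held in the byte vector OCTETS.
Returns two values: an alist of (name . value) headers and the body bytes."
  (let* ((colon (position (char-code #\:) octets))
         ;; The netstring length prefix is a decimal number before the colon.
         (header-length (parse-integer (octets-to-ascii (subseq octets 0 colon))))
         (header-start (1+ colon))
         (header-end (+ header-start header-length))
         (headers '()))
    ;; A netstring always ends with a comma right after its payload.
    (unless (eql (aref octets header-end) (char-code #\,))
      (error "Malformed SCGI request: missing comma after headers"))
    ;; Headers are NAME NUL VALUE NUL pairs.
    (loop with pos = header-start
          while (< pos header-end)
          do (let* ((name-end (position 0 octets :start pos :end header-end))
                    (value-end (position 0 octets :start (1+ name-end)
                                                  :end header-end)))
               (push (cons (octets-to-ascii (subseq octets pos name-end))
                           (octets-to-ascii (subseq octets (1+ name-end) value-end)))
                     headers)
               (setf pos (1+ value-end))))
    (setf headers (nreverse headers))
    ;; CONTENT_LENGTH tells us how many body bytes follow the comma.
    (let ((content-length
            (parse-integer (cdr (assoc "CONTENT_LENGTH" headers :test #'string=))))
          (body-start (1+ header-end)))
      (values headers
              (subseq octets body-start (+ body-start content-length))))))
```

A real server would read from the socket incrementally instead of holding the whole request in memory, but the parsing logic stays about this size.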
using cl-scgi⌗
Using cl-scgi is quite simple: just invoke tcp-server or unix-server with your callback and you’re done. Here’s an example:
(cl-scgi:tcp-server
 "127.0.0.1"
 6970
 (lambda (head len stream)
   ;; head: raw header bytes; len: body length in bytes
   (declare (type (vector (unsigned-byte 8)) head))
   (declare (integer len))
   (let ((headers (cl-scgi:parse-headers (babel:octets-to-string head)))
         (body (cl-scgi:read-until-content-length len stream)))
     ;; print headers as a property list
     (print (alexandria:hash-table-plist headers))
     ;; print the request body as a string
     (print (babel:octets-to-string body))
     ;; reply with an empty header table and the body "asd"
     (cl-scgi:response-string (make-hash-table) "asd" stream)
     (force-output stream))))
Now, if you link your favorite web server (nginx for me) with SCGI, point it to 127.0.0.1:6970 and do a request, it should print out the following:
# i have nginx set up so that all scgi requests
# go to 127.0.0.1:6970
curl http://localhost:8080/scgi
# prints: asd
helper functions⌗
Besides tcp-server and unix-server, cl-scgi has other helper functions that help out when using cl-scgi as a web server.
request-parsing methods⌗
Let’s start with the request parsing functions. You have:
- parse-request-from-stream: a function that takes in a stream and returns a vector of headers in (unsigned-byte 8) form and a number, the number representing the amount of bytes that should be in the body.
- parse-request: a function that takes in a request in the form of (vector (unsigned-byte 8)) or bytes and returns the header and body bytes.
- parse-request-with-headers: a function that takes in a request in the form of (vector (unsigned-byte 8)) or bytes and returns the header as a hash map and the body bytes.
- parse-request-as-string: a function that takes in a request in the form of (vector (unsigned-byte 8)) or bytes and returns the header as a hash map and the body as a string.
All of the functions above return multiple values (via values), not a list. You can easily convert them to a list using multiple-value-list. Do note that parse-request, parse-request-with-headers and parse-request-as-string are all deprecated and will be removed in future releases.
You can build the equivalent of these functions with babel:octets-to-string and cl-scgi:parse-headers.
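If multiple values are new to you, here is a small standard Common Lisp sketch of the two usual ways to consume them. fake-parse is a made-up stand-in for any function that returns a header structure and a body through values; it is not part of cl-scgi.

```lisp
;; FAKE-PARSE is hypothetical: it only imitates a parser that returns
;; two values, the headers and the body.
(defun fake-parse ()
  (values '(("SCGI" . "1")) "hello"))

;; Bind each returned value to its own variable.
(multiple-value-bind (headers body) (fake-parse)
  (format t "~D header(s), body: ~A~%" (length headers) body))

;; Or collect every returned value into a single list.
(multiple-value-list (fake-parse))
```

Binding with multiple-value-bind avoids allocating a list, so it is usually the nicer choice inside a request callback.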
byte-reading methods⌗
There are 2:
- read-until-eof: takes in a stream and reads until EOF is encountered, then returns a vector of the bytes.
- read-until-content-length: takes in a content-length (number) and a stream and reads until the content-length is reached, then returns a vector of the bytes.
response-writing methods⌗
There are 3 functions:
- write-bytes: the same as write-sequence without the start and end keywords, and works with streams that only support write-byte.
- format-headers: takes in a hash-table, turns it into HTTP-compatible headers and returns it as a vector of bytes.
- response-string: takes in a header hash-map, a body string and a stream and writes all of those to the stream in a way that’s compatible with the SCGI and HTTP protocol.
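As a rough idea of what turning a hash-table into headers involves, here is a sketch in standard Common Lisp. It is an assumption about the general shape, not cl-scgi’s actual format-headers: format-headers-sketch and string-to-ascii-octets are hypothetical names, and the encoding step only handles ASCII.

```lisp
;; Sketch only: both functions are hypothetical, not part of cl-scgi.

(defun format-headers-sketch (headers)
  "Render the hash table HEADERS as CRLF-terminated header lines,
followed by the empty line that ends the header block."
  (with-output-to-string (out)
    (maphash (lambda (name value)
               (format out "~A: ~A~C~C" name value #\Return #\Linefeed))
             headers)
    (format out "~C~C" #\Return #\Linefeed)))

(defun string-to-ascii-octets (string)
  "Encode STRING as a byte vector, assuming it only contains ASCII."
  (map '(vector (unsigned-byte 8)) #'char-code string))

;; Usage: build one header and encode the whole block as bytes.
(let ((headers (make-hash-table :test #'equal)))
  (setf (gethash "Content-Type" headers) "text/plain")
  (string-to-ascii-octets (format-headers-sketch headers)))
```

Note that maphash visits entries in no guaranteed order, which is fine for headers in general but worth knowing if you compare output in tests.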
cancelling methods⌗
One thing I failed to mention is that requests can technically be invalid. A request sent by a faulty server could throw an error, or the callback could be given bad parameters. That error, however, can be handled through one of four options:
- STOP: stops accepting connections altogether.
- CONTINUE-CLOSE: continues accepting connections but closes this one.
- CONTINUE-EXEC: continues accepting connections but also executes the global continue-callback.
- CONTINUE-EVAL: continues accepting connections but also executes a user-prompted lambda.
With option 3, the programmer can bind continue-callback from outside the package to specify a custom callback. This callback only receives a stream and is executed once option 3 is selected.
Do note that option 3 is not offered if continue-callback is set to nil (the default value).
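Options like these map naturally onto Common Lisp restarts. The sketch below shows the general mechanism in standard Common Lisp, under the assumption that the options behave like ordinary restarts; handle-connection and its return values are invented for illustration and are not how cl-scgi is written.

```lisp
;; Sketch only: HANDLE-CONNECTION is hypothetical and just imitates a
;; server loop body that offers two of the options as restarts.
(defun handle-connection (id callback)
  "Run CALLBACK for connection ID, offering restarts if it signals an error."
  (restart-case (funcall callback id)
    (continue-close ()
      :report "Close this connection and keep accepting new ones."
      (list :closed id))
    (stop ()
      :report "Stop accepting connections altogether."
      (list :stopped id))))

;; Instead of choosing in the debugger, a handler can pick a restart
;; programmatically whenever the callback signals an error.
(handler-bind ((error (lambda (condition)
                        (declare (ignore condition))
                        (invoke-restart 'continue-close))))
  (handle-connection 1 (lambda (id)
                         (declare (ignore id))
                         (error "bad request"))))
;; returns (:CLOSED 1)
```

Without the handler-bind, the same error would land in the debugger with both restarts listed, which is the interactive way of choosing an option.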
multi-threaded behavior⌗
Both unix-server and tcp-server can become multi-threaded by setting the multi-threaded key to t.
maximum accepted connections⌗
There is a key in both unix-server and tcp-server that limits the number of connections being accepted; it is named :backlog. To accept more or fewer connections than the default (128), simply set it to the desired amount.
connections are closed by-default⌗
Never close connections on your own, because connections are closed by the library itself by default. It’s fine to use finish-output or force-output, but not close.
final note⌗
This is my first library, so expect bugs and undocumented areas. It is in alpha status without many unix tests, and it has not been tested on any Common Lisp implementation besides SBCL.