building cern httpd on modern linux
the final release of the first webserver… from 1996
mood: triumphant
tags: software, c
so while i was writing another post i decided to see how hard it is to build and run cern httpd, the very first webserver. well, ok, this is the linux version, and v3.0A, so not the first – but the original is written for nextstep, and getting it to compile would likely be a much larger project.
this turned out to be a surprisingly easy process! i did this a couple years ago on an ancient linux (that i had lying around anyway) and it took multiple hours of battling dependency hell, mostly because it was ancient linux and i was not yet good at linux. this time around, it was like 30 minutes of tweaking c until it worked.
if you'd like to see the end result, you can clone it:
git clone https://genderphas.ing/projects/cern-httpd/
cd cern-httpd
make
but if you don't want to skip past my fun, then read on.
why?
this all started because, in an upcoming post, i make the assertion that hosting a website yourself has never been easy for laypeople, contrary to something someone said as an analogy in a video about something else.
but then i realized on reflection that i didn't actually know if that was true! so i figured i'd find the oldest webserver (from prior knowledge, from wikipedia, cern's httpd from 1991 – it was part of tim berners-lee's whole package when he invented the web) and try running it, to see how bad it really is.
as it turns out: not too terrible! pretty unfriendly for laypeople, of course, as was the tradition at the time, but for me, with my experience in webdev and sysadmin, really not that bad.
but this post isn't about using cern httpd – it's about building it on a modern toolchain.
how?
i run arch linux. that means as of this writing (see the published date) i have:
$ gcc --version
gcc (GCC) 14.2.1 20240910
$ clang --version
clang version 18.1.8
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /sbin
that's one of the upsides of arch – all my software is always maximally up to date. on the downside, i'm about to compile c from 1996. my experience with historical code was preparing me for a lot of pain.
first, we need to collect our dark materials:
mkdir ./cern-httpd; cd ./cern-httpd
curl -# https://www.w3.org/Daemon/httpd/w3c-httpd-3.0A.tar.gz | tar xz
from there, we're expected to run make to build. on my box that immediately fails, because it calls ./BUILD, which is written in csh, which doesn't exist on my box, and likely doesn't exist on yours either.
you could install csh if you wanted (tip: search for "tcsh" too) but luckily, there's already a ./BUILD.SH written for posix shells. we can point the makefile at that instead and–
./BUILD.SH: line 21: [: too many arguments
okay. not a great sign. but it's an easy fix (delete the stray -s), and then we can address the compile error that actually dominates the screen:
/usr/include/string.h:380:14: error: conflicting types for ‘strcasestr’; have ‘char *(const char *, const char *)’
../../Library/Implementation/HTString.h:33:15: note: previous declaration of ‘strcasestr’ with type ‘char *(char *, char *)’
the two type signatures (char *(const char *, const char *) versus char *(char *, char *)) tell me that strcasestr is missing const qualifiers, and the wrong declaration is the one in HTString.h, because /usr/include/ holds the system includes i'm trying to be compatible with.
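for reference, here's a minimal sketch of what the const-correct shape looks like. the name ht_strcasestr and the body are assumptions made up for illustration (the real fix only touches the declaration in HTString.h and whatever definition goes with it); the point is just that both parameters are const char *, matching the system header.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for the library's strcasestr, renamed so it
 * can't clash with the libc one. same signature as <string.h> uses:
 * both arguments are const char *. */
char *ht_strcasestr(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);

    /* an empty needle matches at the start, like strstr */
    if (n == 0)
        return (char *)haystack;

    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        /* compare case-insensitively, stopping at the end of either string */
        while (i < n && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return (char *)haystack;
    }
    return NULL;
}
```

the cast on the return value is the usual price of this signature: the function promises not to modify its inputs, but hands back a plain char * like the libc version does.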
toss in a couple of const qualifiers, and that compiles! unfortunately,
../../Library/Implementation/HTAccess.c: In function ‘HTLoadDocument’:
../../Library/Implementation/HTAccess.c:664:9: error: implicit declaration of function ‘time’ [-Wimplicit-function-declaration]
664 | time(&theTime);
| ^~~~
../../Library/Implementation/HTAccess.c:50:1: note: ‘time’ is defined in header ‘<time.h>’; this is probably fixable by adding ‘#include <time.h>’
49 | #include "HTError.h"
+++ |+#include <time.h>
50 |
../../Library/Implementation/HTAccess.c:666:13: error: implicit declaration of function ‘ctime’ [-Wimplicit-function-declaration]
666 | ctime(&theTime),
| ^~~~~
../../Library/Implementation/HTAccess.c:666:13: note: ‘ctime’ is defined in header ‘<time.h>’; this is probably fixable by adding ‘#include <time.h>’
ok! well, that's still not bad, let's just add the <time.h> header–
ok, look, you can guess, reader. if you're following along, you're now going to spend 5+ minutes just adding headers, occasionally tweaking a function signature to match, and noticing some… interesting… warnings. i'm not transcribing it all for you.
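to give a sense of what each of those fixes amounts to, here's a small sketch of the time/ctime case from the error above. the function name and message text are made up for illustration; only the time(&theTime) and ctime(&theTime) calls mirror what the compiler output shows. the whole fix is the #include line.

```c
#include <stdio.h>
#include <time.h>   /* declares time() and ctime(); this include is the fix */

/* hypothetical helper: writes a "loaded at <timestamp>" line into buf
 * and returns what snprintf returns (the length it wanted to write). */
int format_timestamp(char *buf, size_t len)
{
    time_t theTime;

    time(&theTime);
    /* ctime() output already ends in a newline */
    return snprintf(buf, len, "loaded at %s", ctime(&theTime));
}
```

code from 1996 could get away with calling undeclared functions; a current gcc treats that as an error, which is why the same kind of one-line fix keeps coming up.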
one tip, though: run make clobber all instead of just make, since the makefile doesn't always pick up changed files like it should.
instead, let's jump ahead to when we've got all the headers together. the thing compiles! but it doesn't link:
/home/user/clone/cern-httpd/Daemon/linux/../../Daemon/Implementation/HTPasswd.c:104:(.text+0x1bc): undefined reference to `crypt'
/home/user/clone/cern-httpd/Daemon/linux/../../Daemon/Implementation/HTPasswd.c:168:(.text+0x334): undefined reference to `crypt'
/home/user/clone/cern-httpd/Library/linux/../../Library/Implementation/HTTCP.c:150:(.text+0xd): undefined reference to `sys_nerr'
/sbin/ld: /home/user/clone/cern-httpd/Library/linux/../../Library/Implementation/HTTCP.c:150:(.text+0x29): undefined reference to `sys_errlist'
reading carefully, this tells us that crypt, sys_nerr, and sys_errlist are missing, and need to be linked. the first is a simple fix: add -lcrypt to the LFLAGS in the makefile for your system.
the other two are a bit harder. luckily we can find their documentation online, under man perror, which tells us they're… deprecated, then that they were removed in glibc 2.32. heck.
carefully reading what they do, though, it becomes clear that strerror is, if not a straight-up identical replacement, at least close enough for our purposes. the big hint is this line, especially the second half:
The use of sys_errlist[] is nowadays deprecated; use strerror instead.
there's a few ways you could fix this. i just ripped out the extern sys_nerr/extern sys_errlist declarations, then replaced their usage with strerror, demolishing all the compile guards along the way.
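as a rough sketch of that swap: the helper name below is hypothetical, and the "old" pattern in the comment is the common pre-strerror idiom rather than a verbatim quote from HTTCP.c.

```c
#include <errno.h>
#include <string.h>

/* hypothetical helper showing the replacement.
 *
 * the old-style idiom (assumed here, not copied from the source) was:
 *     errnum < sys_nerr ? sys_errlist[errnum] : "Unknown error"
 * strerror() does the bounds check itself and returns a message for
 * unknown codes too, so the whole expression collapses to one call. */
const char *ht_error_string(int errnum)
{
    return strerror(errnum);
}
```

one caveat: strerror may return a pointer to a static buffer, so it isn't guaranteed to be thread-safe. whether that matters for this server is an open question; strerror_r exists if it does.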
and just like that – it links! and you can run it! and it was surprisingly easy to get there!
fun fact: i did all this against gcc at first, since that's what's linked as cc on my box. when i swapped it out with clang, it immediately just worked. a pleasant surprise, since occasionally gcc's quirks get in the way of compatibility.
what now?
i don't plan to work on this any more in the near future, but that doesn't mean there's not some more progress that could be made.
for one thing, it has a lot of concerning warnings. for example, all of the warnings about sprintf output buffers being too small. are those real issues? false positives? uninvestigated! and there are plenty more warnings besides, even before enabling -pedantic and -std=c23, let alone building with asan or fuzzing.
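for anyone who does dig into the sprintf warnings, the usual remedy where one turns out to be real is snprintf plus a truncation check. a minimal sketch, with a made-up function name and format string (nothing here is taken from the httpd source):

```c
#include <stdio.h>

/* hypothetical helper: formats a status line into buf.
 * returns 0 on success, -1 if the text did not fit (or on a format error).
 * unlike sprintf, snprintf never writes more than len bytes. */
int ht_format_status(char *buf, size_t len, int code, const char *reason)
{
    int n = snprintf(buf, len, "HTTP/1.0 %d %s", code, reason);

    /* snprintf returns the length it wanted to write, so n >= len
     * means the output was truncated */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

the catch is that every call site then has to decide what to do on truncation, which is probably why this is a project and not a search-and-replace.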
if you're looking to pick this up, that's where i'd start. or don't, i'm not your dad. take it in a totally different direction! have fun with it! and then tell me about it!
whatever you do, though – don't run this thing in prod. not on a server you expect to maintain control of.