Building a newsletter archive


Bullish is a stock market newsletter that I built to keep myself informed about the market's ups and downs and to help with my investments.

One of the features on my radar was an archive, both to help with long-tail SEO and to showcase previous editions.

The final code turned out to be pretty small, but it took some trial and error to get to the solution I'll detail below.

Bullish is built on what is nowadays referred to as the JAMstack, meaning it's a static website: no backend, just plain old HTML and JavaScript. It's a cool new way to build modern products without worrying about infrastructure beforehand.

To build an archive, first and foremost, we need to decide where to store the content, which in this case is each email sent. The obvious solution is S3. You can overengineer this as much as you like, but nothing will beat S3 for simple file storage.

The next challenge was to figure out how to route requests coming from https://bullish.email/archive to S3 while keeping the URLs clean, like https://bullish.email/archive/2020-10/nasdaq-lost-1-57-today.html. For that, you need some sort of reverse proxy.

Bullish’s website runs on Netlify, and they have an elegant solution for this baked into their product.

Netlify supports proxy redirects, and that’s what I needed. All I had to do was specify a netlify.toml with the rules, and everything pretty much worked on the first try.
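As a rough sketch of what such rules can look like (not the actual file from Bullish), a proxy redirect in netlify.toml is a regular redirect with status 200, so the visitor's URL stays the same while Netlify fetches the content from the target. The bucket host below is a placeholder I made up, and the exact rules depend on how the bucket is exposed.

```toml
# Sketch only: "example-archive-bucket" is a placeholder, not the real bucket.

# Serve the archive entry point from the index.html at the bucket root.
[[redirects]]
  from = "/archive"
  to = "https://example-archive-bucket.s3.amazonaws.com/index.html"
  status = 200

# Proxy everything under /archive/ to the matching object in the bucket.
# ":splat" is replaced with whatever the "*" matched.
[[redirects]]
  from = "/archive/*"
  to = "https://example-archive-bucket.s3.amazonaws.com/:splat"
  status = 200
```

The status = 200 part is what turns a redirect into a proxy; with 301 or 302 the browser would be sent to the S3 URL instead.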

With the two biggest challenges out of the way, I focused on wiring up the code that uploads each email edition to S3 after it goes out, plus a rake task that updates the archive every day, triggered by GitHub Actions as a scheduled event.
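For illustration, a scheduled GitHub Actions workflow along these lines could run the rake task once a day. This is a sketch under assumptions: the task name archive:update, the cron time, and the secret names are all hypothetical and not taken from the Bullish repository.

```yaml
# .github/workflows/archive.yml (hypothetical file name)
name: Update archive

on:
  schedule:
    # Every day at 06:00 UTC; the time is an arbitrary choice.
    - cron: "0 6 * * *"

jobs:
  update-archive:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true
      # "archive:update" is a made-up task name for this example.
      - run: bundle exec rake archive:update
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```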

The rake task that updates the archive index loops through all the files in S3 for the current month and feeds them into an HTML template. The template uses the Mustache template engine to spit out a static index.html file, which is uploaded back to S3 to become https://bullish.email/archive.
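To make the loop a bit more concrete, here is a minimal sketch of the step that turns a list of S3 object keys into the data for one month's index. The helper name index_view and the sample keys are made up for this example, and listing the bucket and uploading the result are left out. The resulting hash is the kind of data that could then be handed to a Mustache template, for example through the mustache gem's Mustache.render(template, view).

```ruby
# Hypothetical helper: build the data for one month's index page
# from a flat list of S3 object keys.
def index_view(keys, month)
  editions = keys
    .select { |key| key.start_with?("#{month}/") && key.end_with?(".html") }
    .reject { |key| key == "#{month}/index.html" } # skip the index itself
    .sort
    .map do |key|
      { url: "/archive/#{key}", name: File.basename(key, ".html") }
    end

  { month: month, editions: editions }
end

# Sample keys, invented for this example.
keys = [
  "index.html",
  "2020-09/some-older-edition.html",
  "2020-10/index.html",
  "2020-10/nasdaq-lost-1-57-today.html",
]

view = index_view(keys, "2020-10")
puts view[:editions].map { |edition| edition[:url] }
```

Keeping this step free of any S3 or template calls makes it easy to check on its own before wiring it into the rake task.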

Inside S3, the archive bucket has one folder per month, named in the format year-month, like 2020-10 and 2020-09. Each folder holds all the emails sent that month, plus an index.html with a list of all the files in that particular folder.
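As an illustration of that layout, the object key for an edition can be derived from its send date and subject line. This is only a sketch: the helper names, the slug rules, the sample subject line, and the date are my assumptions here, chosen so the result matches the example URL above.

```ruby
require "date"

# Hypothetical helper: turn a subject line into a URL-friendly slug.
# Runs of anything other than a-z and 0-9 collapse into a single dash.
def slugify(title)
  title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
end

# Build the S3 object key for an edition: "<year>-<month>/<slug>.html".
def archive_key(sent_on, title)
  "#{sent_on.strftime('%Y-%m')}/#{slugify(title)}.html"
end

# The date and subject line are invented for this example.
puts archive_key(Date.new(2020, 10, 6), "Nasdaq lost 1.57% today")
```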

Another index.html sits in the root of the bucket, which is essentially a copy of the index file from the current month's folder, and that becomes the entry point for /archive.

While the rake task updates the archive index, it also updates the directory, a page served at /archive/directory that picks up each past month's folder, let's say 2020-09. So /archive points to the current month’s index, while /archive/directory has a list of all previous months, like /archive/2020-09 and so on.
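The list of months for the directory page can be worked out from the object keys alone. A minimal sketch, assuming the year-month folder naming described earlier; previous_months is a made-up helper name and the keys are sample data.

```ruby
# Hypothetical helper: collect the month folders found in the keys and
# return every month before the current one, newest first.
def previous_months(keys, current_month)
  keys
    .map { |key| key[%r{\A(\d{4}-\d{2})/}, 1] } # "2020-09/x.html" -> "2020-09"
    .compact                                    # drop keys outside a month folder
    .uniq
    .select { |month| month < current_month }   # year-month strings sort correctly
    .sort
    .reverse
end

# Sample keys, invented for this example.
keys = [
  "index.html",
  "2020-08/a.html",
  "2020-09/a.html",
  "2020-09/b.html",
  "2020-10/c.html",
]

puts previous_months(keys, "2020-10").map { |month| "/archive/#{month}" }
```

Plain string comparison is enough here because zero-padded year-month names sort in chronological order.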

Another neat thing I did was to inject a popup snippet in the email file before it gets uploaded to S3. If somebody happens to land on an archive URL from a previous edition, they'll get an upsell to subscribe to Bullish in there.
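A simple way to do this kind of injection is to insert the snippet right before the closing body tag of the email's HTML. The sketch below assumes that approach; the snippet markup and the inject_popup name are placeholders, since the real popup code isn't shown here.

```ruby
# Placeholder markup; the real popup snippet is not part of this post.
POPUP_SNIPPET = '<script src="/popup.js" defer></script>'

# Insert the snippet just before the first closing body tag, or append it
# to the end when the HTML has no body tag at all.
def inject_popup(html, snippet = POPUP_SNIPPET)
  if html =~ %r{</body>}i
    # Block form, so the snippet is inserted literally, with no
    # backslash sequences interpreted.
    html.sub(%r{</body>}i) { |closing| "#{snippet}\n#{closing}" }
  else
    html + "\n" + snippet
  end
end

email_html = "<html><body><p>Today's market recap</p></body></html>"
puts inject_popup(email_html)
```

Because the injection happens only on the copy that goes to S3, the email that subscribers receive stays untouched.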

And this concludes the last big-ticket item from the initial feature set I had planned in the back of my head when I started this project.

Now it's time to chill, regroup and decide what's next.

Cheers.