Deploying a Jekyll Blog to Amazon S3 Without .html Extensions
Deploying Jekyll sites to Amazon S3 is an efficient, performant, and very cheap way to host mostly-static websites. One thing that bothered me, though, is that by default Jekyll outputs your posts with a .html extension, so requests to S3 must include the extension as well.
For example:
    $ curl -I http://example.com/blog/my-post.html
    HTTP/1.1 200 OK

    $ curl -I http://example.com/blog/my-post
    HTTP/1.1 404 Not Found
Since I personally dislike having unnecessary extras in my URLs, I always avoid file extensions like .html on my sites. This poses a problem with S3, because its built-in request rewrite tools are cumbersome and not very flexible.
So, the simplest way to resolve this issue is to remove the file extension, right? Well, I couldn’t find any documentation on customizing (or removing) the file extension of Jekyll posts. It seems that .html is the only supported output extension for posts, so something custom is required.
After browsing around for plugins that might provide this, I finally decided to do it the old-fashioned way and add a step to my deploy script that renames each post file to remove the suffix.
Here’s a look at how that works:
    # Remove the .html extension from all blog posts for sexy URLs
    for filename in "$DEPLOY_DIR"/blog/*.html; do
        if [ "$filename" != "$DEPLOY_DIR/blog/index.html" ]; then
            original="$filename"

            # Get the filename without the path/extension
            filename=$(basename "$filename")
            filename="${filename%.*}"

            # Move it (quoted so paths containing spaces survive)
            mv "$original" "$DEPLOY_DIR/blog/$filename"
        fi
    done
Just a simple loop over the .html files in the /blog directory, ignoring index.html, and stripping the extension.
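The same loop can also be wrapped in a function, which makes it easy to try against a scratch directory before pointing it at a real build. This is only a sketch: the function name strip_html_extensions is made up for illustration, and it adds a guard for the case where no .html files exist, which the loop above does not handle.

```shell
#!/bin/bash

# strip_html_extensions DIR  (hypothetical helper, not part of the original script)
# Renames every DIR/*.html to the same name without the extension,
# leaving DIR/index.html untouched.
strip_html_extensions() {
    local dir="$1" path name
    for path in "$dir"/*.html; do
        # If nothing matched, the glob is left as a literal string; skip it
        [ -e "$path" ] || continue

        # Strip the directory and the .html suffix
        name=$(basename "$path" .html)

        # Keep the directory index as index.html
        [ "$name" = "index" ] && continue

        mv -- "$path" "$dir/$name"
    done
}
```

In the deploy script it would be called as strip_html_extensions "$DEPLOY_DIR/blog".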
If you’re curious, here’s what the full deploy script looks like:
    #!/bin/bash
    #
    # Cleans and deploys the project to S3.
    #
    # Usage:
    #   ./deploy.sh <ACCESS_KEY> <SECRET_KEY>

    # Initialize some vars
    export AWS_ACCESS_KEY_ID="$1"
    export AWS_SECRET_ACCESS_KEY="$2"
    export AWS_DEFAULT_REGION="us-east-1"
    export BUCKET="kylewbanks.com"
    export DEPLOY_DIR=".deploy"

    # Build jekyll
    jekyll build

    # Copy the site directory to a temporary location so that modifications we make
    # don't get overwritten by the Jekyll server that is potentially running
    mkdir -p "$DEPLOY_DIR"
    cp -a _site/. "$DEPLOY_DIR"

    # Remove the .html extension from all blog posts for sexy URLs
    for filename in "$DEPLOY_DIR"/blog/*.html; do
        if [ "$filename" != "$DEPLOY_DIR/blog/index.html" ]; then
            original="$filename"

            # Get the filename without the path/extension
            filename=$(basename "$filename")
            filename="${filename%.*}"

            # Move it (quoted so paths containing spaces survive)
            mv "$original" "$DEPLOY_DIR/blog/$filename"
        fi
    done

    # Now upload to s3, deleting any items that no longer exist
    aws s3 sync --delete "$DEPLOY_DIR" "s3://$BUCKET"

    # Finally, upload the blog directory specifically to force the content-type
    aws s3 cp "$DEPLOY_DIR/blog" "s3://$BUCKET/blog" --recursive --content-type "text/html"

    # Cleanup
    rm -r "$DEPLOY_DIR"
You’ll notice that the second-to-last command re-uploads the /blog directory to explicitly set the content type. This was required because, without the .html file extension, the content-type guessing done during upload becomes unreliable, and you may or may not get the proper content-type.
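If you want to see which files are affected before uploading, one option is to list everything in the deploy directory that has no extension at all. This is a rough sketch, and the function name list_extensionless_files is hypothetical. It treats any file whose name contains no dot as extensionless.

```shell
#!/bin/bash

# list_extensionless_files DIR  (hypothetical helper, not part of the original script)
# Prints the path of every regular file under DIR whose name contains no dot.
# These are the files whose content type cannot be inferred from an extension.
list_extensionless_files() {
    find "$1" -type f ! -name '*.*'
}
```

Running list_extensionless_files "$DEPLOY_DIR" after the rename step should print only the renamed posts under /blog. Anything else in the output would also need its content type set explicitly.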
Running this script from the root of your Jekyll project will upload the entire site to S3, with clean URLs for all blog posts.