The serverless vendor lock-in
You've learned about serverless but want to avoid getting locked in? Already got burned and promised yourself it wouldn't happen again? Come aboard!
So, you've learned about serverless, maybe from Yan Cui (aka theburningmonk) and his "Yubl's road to serverless" series (if you haven't, go have a look at these links), and you might now be wondering about cloud / vendor lock-in.
Cloud lock-in
Wikipedia has a great article on it. Basically, in a serverless project, it's the mere fact that your AWS Lambda function (which can now run for up to 15 minutes) can't and won't run as-is on Google Cloud Functions (which finally left beta in July 2018). That Amazon's S3 and Google's Cloud Storage don't share the same inner workings and APIs. That DynamoDB (AWS's only 'serverless', as in 'don't manage anything', database offering until recently; we now have Aurora Serverless, with its own challenges) is very special, with some very specific requirements, and definitely doesn't have much in common with Google's Datastore or Firestore.
It simply means that migrating your serverless services from one provider to another will take some work, sometimes lots of it, and that you might end up stuck with whoever you picked at the beginning. And let's not even get into multi-provider deployments / redundancy.
Thing is, my associate and I want to build a product our customers will be able to run on whatever provider's infrastructure they want.
So, how do we avoid getting locked in in the first place?
Interfaces and protocols to the rescue!
If you've had the pleasure of using a language other than JavaScript, you've probably encountered interfaces (in TypeScript or Java), or protocols as they're sometimes called (in Swift or Clojure), and might even have had the joy of switching database implementations without a hitch thanks to protocols (a team did it some time ago, from Core Data to Realm, but if this message is still here it means I couldn't find the source). See where I'm going?
Since this is about FaaS, and all of them support Node.js, I wrote the project I'll use as an example (my yet-to-be-released WebComics Reader app's serverless backend, available on GitHub) in TypeScript. The core concept remains the same and will apply whatever the language (as long as it supports interfaces / protocols, unlike JavaScript).
So, what will we use interfaces for? They'll help us decouple our code and split it into a common, shared codebase and provider-specific codebases (think AWS, GCP, FaunaDB, Cloudflare Workers, ...).
The Core
Our webcomics backend will receive webhooks whenever a new strip is released, and store it. It will then return the stored results upon clients' requests. So, what should we keep in here?
Models
Our model will thus be quite simple, containing only some basic properties:
export class Comic {
    comicName: string;
    id: string;
    title: string;
    imgUrl: string;
    url: string;
    content: string;
    publishedDate: string;

    constructor(comicName: string, id: string, title: string, imgUrl: string, url: string, content: string, publishedDate: string) {
        this.comicName = comicName;
        this.id = id;
        this.title = title;
        this.imgUrl = imgUrl;
        this.url = url;
        this.content = content;
        this.publishedDate = publishedDate;
    }

    // Each concrete comic (XKCD, ...) overrides this with its own parser.
    public static parse(comicName: string, strip: object): Comic {
        throw new Error("Cannot call parse on abstract Comic Class");
    }
}
Common logic
What common logic do we have? Event handlers, and... parsers, for the HTTP event (aka webhook)! Here's the webhook event handler:
export function handlePost(comicName: string, strip: object, provider: ComicsProvider): Promise<void> {
    // Pick the comic-specific class, let it parse the payload, then store the result.
    const comicType = comicFrom(comicName);
    const comic = comicType.parse(comicName, strip);
    return provider.post(comic);
}

function comicFrom(comicName: string) {
    switch (comicName) {
        case ComicNames.xkcd: return Comics.XKCD;
        case ComicNames.questionnableContent: return Comics.QuestionnableContent;
        default: throw new Error("Unknown or unavailable Comic: " + comicName);
    }
}
See that provider parameter? Its type is an interface, but we'll get back to it later. Also, here's the parser for a new XKCD strip's event, for example (done through an extension of the abstract Comic class):
export class XKCD extends Comic {
    // Group 2 captures the numeric strip id from a URL such as xkcd.com/<id>.
    static idRegex = /(xkcd\.com\/)(\d+)/i;
    // Group 2 captures the alt text, stopping at the first closing quote.
    static contentRegex = /(alt=")([^"]*)(")/i;

    public static parse(comicName: string, strip: any): Comic {
        try {
            const idRegexResult = XKCD.idRegex.exec(strip.url);
            if (idRegexResult === null || idRegexResult.length < 3) {
                throw new Error("Id regex failed for comic: " + comicName);
            }
            const id = idRegexResult[2];
            const title = strip.title;
            const imageUrl = strip.image;
            const url = strip.url;
            const contentRegexResult = XKCD.contentRegex.exec(strip.content);
            if (contentRegexResult === null || contentRegexResult.length < 3) {
                throw new Error("Content regex failed for comic: " + comicName);
            }
            const content = contentRegexResult[2];
            const publishedDate = strip.published;
            return new Comic(comicName, id, title, imageUrl, url, content, publishedDate);
        } catch (error) {
            // Log the precise cause, but only expose a generic error to the caller.
            console.log("Error: ", error);
            throw new Error("Invalid XKCD Comic");
        }
    }
}
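To make the parsing step concrete, here is a minimal sketch of what the two regular expressions pull out of a webhook payload. The payload values are made up for illustration (only the field names url, title, image, content and published come from the parser), the secondGroup helper is hypothetical, and the patterns are a slightly tightened variant of the parser's: the dot is escaped, at least one digit is required, and the alt text stops at the first closing quote.

```typescript
// Hypothetical webhook payload: the field names match what the parser reads,
// the values are invented for this example.
const samplePayload = {
  title: "Example strip",
  url: "https://xkcd.com/1234/",
  image: "https://imgs.xkcd.com/comics/example.png",
  content: '<img src="https://imgs.xkcd.com/comics/example.png" alt="Some hover text" />',
  published: "2018-11-05T00:00:00Z",
};

// Group 1 is the domain prefix, group 2 is the numeric strip id.
const stripIdRegex = /(xkcd\.com\/)(\d+)/i;
// Group 2 is the text between alt=" and the next double quote.
const altTextRegex = /(alt=")([^"]*)(")/i;

// Hypothetical helper: returns the second capture group or throws if there is no match.
function secondGroup(regex: RegExp, input: string): string {
  const result = regex.exec(input);
  if (result === null || result.length < 3) {
    throw new Error("No match for " + regex.source);
  }
  return result[2];
}

const stripId = secondGroup(stripIdRegex, samplePayload.url);
const altText = secondGroup(altTextRegex, samplePayload.content);
console.log(stripId, altText);
```

Anything that doesn't look like an XKCD URL makes secondGroup throw, which is the case the parser turns into its "Invalid XKCD Comic" error.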
And Interfaces!
Yes, the interfaces definitely should be in your core, shared codebase. Otherwise, they'd be useless. Here's ours!
export interface ComicsProvider {
    getLatestStripFor(comicName: string): Promise<Comic>;
    getStripsFor(comicName: string): Promise<Comic[]>;
    post(comic: Comic): Promise<void>;
}
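To show what conforming to this interface looks like without any cloud service involved, here is a minimal sketch of an in-memory provider, the kind of thing you could plug into the core for local runs or tests. InMemoryProvider is a hypothetical name that isn't part of the project, the Comic class and the interface are trimmed copies so the sketch stands alone, and it assumes strips are posted in publication order (so the last one stored is the latest).

```typescript
// Trimmed copies of the article's model and interface, so the sketch stands alone.
class Comic {
  constructor(
    public comicName: string,
    public id: string,
    public title: string,
    public imgUrl: string,
    public url: string,
    public content: string,
    public publishedDate: string,
  ) {}
}

interface ComicsProvider {
  getLatestStripFor(comicName: string): Promise<Comic>;
  getStripsFor(comicName: string): Promise<Comic[]>;
  post(comic: Comic): Promise<void>;
}

// Hypothetical provider that keeps everything in a Map, keyed by comic name.
// Assumption: strips are posted in publication order.
class InMemoryProvider implements ComicsProvider {
  private readonly strips = new Map<string, Comic[]>();

  async post(comic: Comic): Promise<void> {
    const existing = this.strips.get(comic.comicName) ?? [];
    this.strips.set(comic.comicName, [...existing, comic]);
  }

  async getStripsFor(comicName: string): Promise<Comic[]> {
    // Return a copy so callers cannot mutate the internal state.
    return [...(this.strips.get(comicName) ?? [])];
  }

  async getLatestStripFor(comicName: string): Promise<Comic> {
    const all = await this.getStripsFor(comicName);
    if (all.length === 0) {
      throw new Error("No strip stored for comic: " + comicName);
    }
    return all[all.length - 1];
  }
}
```

The core only ever sees the ComicsProvider type, so it has no way of telling this apart from a DynamoDB-backed implementation.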
AWS Implementation
So, what does this leave to our implementation? Well, not much really: mapping the event from the provider's specific format to yours, reporting errors (in this case to CloudWatch logs via console.log), and calling the right function from your core! Don't believe me? Here's the whole AWS Lambda function for the post event:
const vandium = require("vandium")
import { APIGatewayEvent } from "aws-lambda"
import { handlePost as newStrip } from "webcomics-reader-webservices"
import { Provider } from "../providers/dynamoDB"

const provider = new Provider()

// tslint:disable: no-unsafe-any
exports.httpHandler = vandium.api()
    .protection()
    .POST(handlePost)
    // tslint:enable: no-unsafe-any
    .onError((err: Error) => {
        console.log("Error: " + err)
        return err
    })

async function handlePost(event: APIGatewayEvent) {
    if (event.pathParameters === null || event.pathParameters === undefined) throw new Error("Invalid Parameters")
    try {
        const comicName = event.pathParameters.comicName
        // For some reason, @types/aws-lambda believes API Gateway event bodies are strings...
        const strip = event.body as unknown as object
        await newStrip(comicName, strip, provider)
        return { result: "Done" }
    } catch (error) {
        console.log("Error: ", error)
        throw new Error("Failed")
    }
}
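For comparison, here is a rough sketch of what the same entry point could look like behind an Express-style (req, res) HTTP function, which is the shape Google Cloud Functions uses for HTTP triggers. Nothing here comes from the project: the HttpRequest and HttpResponse types are minimal stand-ins declared locally, makeHttpHandler is a hypothetical helper, the core's handlePost is passed in as a parameter, and taking the comic name from the last path segment is an assumption.

```typescript
// Minimal, hypothetical shapes for an Express-style request and response.
// Only the members this sketch needs are declared.
interface HttpRequest {
  method: string;
  path: string; // e.g. "/xkcd"
  body: unknown; // assumed to be already parsed from JSON by the platform
}

interface HttpResponse {
  status(code: number): HttpResponse;
  json(payload: object): void;
}

// Signature of the core's handlePost, with the provider type left generic.
type NewStrip<P> = (comicName: string, strip: object, provider: P) => Promise<void>;

// Builds the provider-specific entry point around the shared core function.
function makeHttpHandler<P>(newStrip: NewStrip<P>, provider: P) {
  return async (req: HttpRequest, res: HttpResponse): Promise<void> => {
    if (req.method !== "POST") {
      res.status(405).json({ error: "Method not allowed" });
      return;
    }
    // Assumption: the comic name is the last non-empty path segment.
    const comicName = req.path.split("/").filter(Boolean).pop();
    const body = req.body;
    if (comicName === undefined || typeof body !== "object" || body === null) {
      res.status(400).json({ error: "Invalid Parameters" });
      return;
    }
    try {
      // Same call as in the Lambda version: only the event mapping differs.
      await newStrip(comicName, body, provider);
      res.status(200).json({ result: "Done" });
    } catch (error) {
      console.log("Error: ", error);
      res.status(500).json({ error: "Failed" });
    }
  };
}
```

In a real deployment you would pass the core's handlePost and a provider backed by, say, Datastore or Firestore, then export the returned function as the HTTP entry point. The core itself stays untouched.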
Conclusion
So, here's how interfaces can be used to greatly reduce lock-in risks and make switching providers mega-super-duper easier. But that's not all. Using interfaces, you can swap a DB for S3, or Google Cloud Datastore for a third-party API. The only constraint? Wrap these services correctly so that they conform to your interface, and make sure your interface is based on your service's needs, and not on your provider's constraints and limits. Yes, I'm looking at you, DynamoDB.
If you want to see the project in full, it's available on GitHub (and pretty much a work in progress).
Also, here's some music for you to blast on your stereo on your way to freedom