I’m architecting a product (for my employer Digital Pi) which is hosted within the Amazon Web Services (AWS) environment. We need to create a microservice that sends all mail for the system through a centralized Marketo instance. Marketo is a marketing automation platform, and in the parlance of Marketo what I will be doing is triggering a campaign to run against the leads I give it. Marketo has a REST API that can be used to accomplish this task.
This mail service will be called from a variety of places inside our AWS setup. Rather than set up a server, put nginx on it with a REST API built in NodeJS, and then forever deal with the operational overhead (security, backups, etc.), I thought I'd bite the bullet and set up a Lambda function to send mail. I've decided to do it in .NET Core, since I love .NET and am forced to program in NodeJS all the time.
First, I built a command line program to do the following (a rough sketch of the Marketo calls follows the list):
- Read in a JSON document which describes the mail to send and includes the tokens to pass to the Marketo campaign trigger. I'll start with an embedded resource, but later this will be picked up from S3 when the Lambda function is triggered.
- Query the Marketo API via REST to get the Lead IDs associated with my 1..n email targets
- If necessary, create leads via the Marketo API for any emails not already in the system
- Trigger the campaign via the Marketo API for all of the leads
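In rough strokes, that flow looks like this. This is a minimal sketch against Marketo's REST endpoints, not my production code; the base URL and client credentials are placeholders, error handling is omitted, and step 3 is elided:

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

// Minimal sketch of the Marketo REST round trip; placeholders throughout.
static async Task TriggerCampaignAsync(string baseUrl, string clientId,
    string clientSecret, int campaignId, string[] emails)
{
    using (var http = new HttpClient())
    {
        // 1. Get an OAuth access token from the identity endpoint.
        var tokenDoc = JObject.Parse(await http.GetStringAsync(
            $"{baseUrl}/identity/oauth/token?grant_type=client_credentials" +
            $"&client_id={clientId}&client_secret={clientSecret}"));
        var token = (string)tokenDoc["access_token"];

        // 2. Look up lead IDs for the target email addresses.
        var leads = JObject.Parse(await http.GetStringAsync(
            $"{baseUrl}/rest/v1/leads.json?filterType=email" +
            $"&filterValues={Uri.EscapeDataString(string.Join(",", emails))}" +
            $"&access_token={token}"));

        // 3. (Elided) POST /rest/v1/leads.json with action "createOnly" for
        //    any email that came back without a lead id.

        // 4. Trigger the campaign for all of the lead ids; the my.* tokens
        //    go in the "input" body alongside the leads.
        var input = new JObject(new JProperty("input", new JObject(
            new JProperty("leads", new JArray(
                leads["result"].Select(r => new JObject(
                    new JProperty("id", r["id"]))))))));
        await http.PostAsync(
            $"{baseUrl}/rest/v1/campaigns/{campaignId}/trigger.json" +
            $"?access_token={token}",
            new StringContent(input.ToString(), Encoding.UTF8,
                "application/json"));
    }
}
```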
In our system, emails will generally be sent to a single email address, but there may be a case with notifications where we want to send the same email notification to multiple people in one shot; the code difference is somewhat trivial, so I want to keep my options open. (Marketo has a 300-emails-per-POST limitation, so eventually I'll have to refactor this code if we send notifications, but that's somewhat of an unknown at the moment and you have to draw the line somewhere.)
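If that refactor does come, it will probably amount to simple chunking of the recipient list, something like this hypothetical helper:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for the eventual refactor: split recipients into
// Marketo-sized batches (300 per POST) and trigger each batch separately.
static IEnumerable<List<string>> ChunkEmails(IList<string> emails, int size = 300)
{
    return emails
        .Select((email, index) => new { email, index })
        .GroupBy(x => x.index / size)
        .Select(g => g.Select(x => x.email).ToList());
}
```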
After getting that code framed out and working, it was time to set up the Lambda function and call the same code from Lambda. The Lambda function will be triggered by an S3 PUT event for now (the various other pieces of our application will PUT a file to a specific bucket), but it's fairly easy to set up REST-style endpoints with Amazon's API Gateway, and we may add that later. It would just mean another method in the class and another trigger, which seems pretty great to me.
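If we do add that, my guess at the shape is something like this extra handler (assuming the Amazon.Lambda.APIGatewayEvents package; untested):

```csharp
using Amazon.Lambda.APIGatewayEvents;
using Amazon.Lambda.Core;

// Hypothetical second entry point for an API Gateway trigger: both routes
// would funnel into the same Mailer.Mail call.
public APIGatewayProxyResponse ApiFunctionHandler(
    APIGatewayProxyRequest request, ILambdaContext context)
{
    Mailer.Mail(request.Body); // the request body carries the same JSON document
    return new APIGatewayProxyResponse { StatusCode = 200 };
}
```

The actual S3-triggered handler looks like this: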
```csharp
public async Task<string> FunctionHandler(S3Event evnt, ILambdaContext context)
{
    var s3Event = evnt.Records?[0].S3;
    if (s3Event == null)
    {
        return null;
    }

    Console.WriteLine("Handler: Attempting mail send");
    try
    {
        Console.WriteLine($"Handler: getting object {s3Event.Object.Key} from bucket {s3Event.Bucket.Name}.");
        var response = await this.S3Client.GetObjectAsync(s3Event.Bucket.Name, s3Event.Object.Key);
        using (var stream = response.ResponseStream)
        using (var tr = new StreamReader(stream))
        {
            var s3Document = tr.ReadToEnd();
            Console.WriteLine("Handler: Got Data from Key");
            Mailer.Mail(s3Document);
            Console.WriteLine("Handler: Mail processed successfully");
            return null;
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
        Console.WriteLine(e.StackTrace);
        Console.WriteLine("Handler: FAILED to process mail successfully");
        throw;
    }
}
```
This method gets called by the Lambda runtime with an `ILambdaContext`, which has things on it like `MemoryLimitInMB` and `RemainingTime`. It also carries an `ILambdaLogger` with methods for logging, but I found by accident that `Console.WriteLine` goes to the log as well, so I just use that (which means my code works the same when run from the command line via the `Program.Main` entry point). The bulk of the actual mail work is in the `Mailer.Mail` static method.
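A trivial illustration of the two logging routes, `context` being the `ILambdaContext` handed to the handler:

```csharp
// The "official" route, via the context's logger...
context.Logger.LogLine($"Memory limit: {context.MemoryLimitInMB} MB, " +
                       $"time remaining: {context.RemainingTime}");

// ...and plain Console.WriteLine, which also lands in the log and works
// identically when running from the command line.
Console.WriteLine("Handler: this goes to the same log");
```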
When this code executes in the Lambda environment, the class constructor sets up an `S3Client` that is assigned to a class property before this function runs. That `S3Client` picks up AWS credentials from the IAM role associated with the function, and its region is set to the region where the Lambda function executes.
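That wiring follows the stock pattern from the AWS Lambda S3 blueprint, which looks roughly like this (a sketch; the parameterless constructor is the one Lambda uses):

```csharp
using Amazon.S3;

public class Function
{
    IAmazonS3 S3Client { get; set; }

    // Default constructor, used when Lambda invokes the function: the client
    // gets credentials from the function's IAM role and the region from the
    // execution environment.
    public Function()
    {
        S3Client = new AmazonS3Client();
    }

    // Constructor used from tests or the command line, where we supply a
    // preconfigured client instead.
    public Function(IAmazonS3 s3Client)
    {
        this.S3Client = s3Client;
    }
}
```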
The Lambda function relies on some Amazon packages; here is the `project.json` file:
{ "version": "1.0.0-*", "buildOptions": { "debugType": "portable", "emitEntryPoint": true, "outputName": "OmegaEmailService" }, "dependencies": { "Microsoft.NETCore.App": { "type": "platform", "version": "1.0.1" }, "Npgsql": "3.1.9", "Newtonsoft.Json": "9.0.1", "System.Text.Encodings.Web": "4.3.0", "Amazon.Lambda.Core": "1.0.0", "Amazon.Lambda.Serialization.Json": "1.0.0", "Amazon.Lambda.Tools": { "type": "build", "version": "1.0.2-preview1" }, "Amazon.Lambda.S3Events": "1.0.0" }, "tools": { "Amazon.Lambda.Tools": "1.0.2-preview1" }, "frameworks": { "netcoreapp1.0": { "imports": "dnxcore50" } } }
I decided to upload the Lambda package via the AWS Console because the AWS Visual Studio plugin doesn't work correctly: no roles show up on the second page of the AWS Publish wizard, so you can't actually deploy. The tools look cool, but they don't work at the moment, so for now my deployment process is "Publish" from Visual Studio to a Publish directory, zip that directory, upload the zip on the AWS site, and set a bunch of options. After it's been deployed once, the function refreshes just by uploading the zip again. Magic and containers and yay.
Let's talk about that VPC option. At first I thought, well, since I run the entire product in its own VPC, I guess I'll put this function there too. This had a number of surprising side effects. The code couldn't talk to S3 to get the object (after being triggered by said object), and better yet it returned no error, which is always great. It timed out after 15 seconds, so I upped the timeout to 60 seconds and the same thing happened. After some Googling, I discovered there's something in the VPC setup called Endpoints, which seems to exist solely to allow things inside your VPC to reach S3.
I added an Endpoint…
…Lo and behold, I was able to get the contents of the S3 object from the bucket when the function was triggered. Then I instantly ran into another problem: the function calls out to Marketo's API over the internet, and a Lambda function inside a VPC doesn't have outbound internet access unless you set up a NAT gateway on your VPC. Since my VPC routes through a normal Internet Gateway and reconfiguring that was beyond the scope of this exercise (plus a NAT gateway runs about $30/month, which seemed steep), I decided to run the Lambda function outside of the VPC, and then it was able to reach the Marketo REST API.
I felt comfortable deploying outside the VPC for now because originally the Lambda function was going to look up REST credentials in a database on RDS, but for some reason Npgsql does not load or work properly inside Lambda, even though it works fine under Linux elsewhere. I wanted to get to proof of concept, so I hardcoded the REST credentials for now.**
Then it worked. Success.
I do have one very weird bug in my code that I think may be a bug in the LINQ implementation or a comparer somewhere in .NET Core. The email documents we shove into S3 look like this (shortened):
{ "campaignId": 1461, "to": [ "kk@example.com", "kk+test1@example.com" ], "tokens": [ { "name": "my.EmailSubject", "value": "Password Recovery" } ] }
The function sends the email addresses to Marketo, and gets back Lead IDs that look like this:
{ "requestId": "12b04#158fb2f1e35", "result": [ { "id": 1013663, "email": "kk@example.com" }, { "id": 1023537, "email": "kk+test1@example.com" } ], "success": true }
Both of these JSON documents are parsed by Json.NET into classes where the email fields are plain strings in both cases; the shapes look something like this (class names here are illustrative, not necessarily what's in my code):
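```csharp
using System.Collections.Generic;

// Shapes the two JSON documents deserialize into; property names mirror
// the JSON, and the emails are plain strings in both classes.
public class MailDocument
{
    public int campaignId { get; set; }
    public List<string> to { get; set; }
    public List<Token> tokens { get; set; }
}

public class Token
{
    public string name { get; set; }
    public string value { get; set; }
}

public class LeadResponse
{
    public string requestId { get; set; }
    public List<Lead> result { get; set; }
    public bool success { get; set; }
}

public class Lead
{
    public int id { get; set; }
    public string email { get; set; }
}
```

Then, I have code like this: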
```csharp
var result = Leads.GetLeadsByEmailAddress(restCredentials, token, o.to).Result;
var returnedLeads = result.result;

var matchedLeads = o.to.Where(to => returnedLeads.Any(rl => rl.email == to)).Select(r => r);
var unmatchedLeads = o.to.Where(to => returnedLeads.Any(rl => rl.email != to)).Select(r => r);

Console.WriteLine($"Mailer: There are {matchedLeads.Count()} matched leads");
Console.WriteLine($"Mailer: There are {unmatchedLeads.Count()} unmatched leads");
```
I am aware that that code can (and will) be much better, but I was surprised by the result. On my computer here I get:
```
There are 2 matched leads
There are 0 unmatched leads
```
In the Lambda function I get:
```
02:35:19 There are 2 matched leads
02:35:19 There are 2 unmatched leads
```
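For what it's worth, one thing jumps out on re-reading: those two `Where` clauses aren't logical complements, since `Any(rl => rl.email != to)` is true whenever any returned lead has a different email, and with two distinct leads that's always. That would explain the Lambda numbers, though not the local ones. A complementary pair would be:

```csharp
// An address is "unmatched" only when no returned lead carries that email;
// note All with a negated comparison, instead of Any.
var matchedLeads = o.to.Where(to => returnedLeads.Any(rl => rl.email == to));
var unmatchedLeads = o.to.Where(to => returnedLeads.All(rl => rl.email != to));
```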
I'll have to figure out why the two runs disagree, but at least the function is deployed and working. The debugging environment on AWS Lambda is not great, and there are no local emulators for it like there are for Azure App Service and the like, so debugging comes down to the old routine of writing things to a log and then studying the log. There were multiple instances where the function just terminated with no errors in the log, which is also not ideal for debugging.
The logging display in the AWS Console is excellent. Here's a portion of it; you can open up the log entries if they're long, and it even does pretty-printing when it detects JSON:
Overall this exercise was a success, especially since it’s one less server to have to deal with. We have a few more microservices to put together and I will definitely be doing them this way, especially if I can figure out what’s going on with Npgsql.
** The Npgsql error is: "The type initializer for 'Npgsql.TypeHandlerRegistry' threw an exception. System.Reflection.ReflectionTypeLoadException: Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information." It seems to be a problem that some people have on some systems and not others, so I haven't delved into it yet. There's an issue on GitHub that might be related, but that runtime.unix.* namespace skeeves me out.
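When I do dig in, the obvious first step is what the message itself suggests: dump `LoaderExceptions`. A sketch that reads the property via reflection, on the assumption that the exception type may not be directly referenceable on netcoreapp1.0:

```csharp
using System;
using System.Reflection;

// Pull LoaderExceptions off a ReflectionTypeLoadException without naming
// the type directly, in case it isn't referenceable on this runtime.
static void DumpLoaderExceptions(Exception ex)
{
    var prop = ex.GetType().GetTypeInfo().GetDeclaredProperty("LoaderExceptions");
    var loaderExceptions = prop?.GetValue(ex) as Exception[];
    if (loaderExceptions == null) return;
    foreach (var le in loaderExceptions)
        Console.WriteLine($"Loader exception: {le.Message}");
}
```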