Automating Database Backups: Simplifying Operations with Scheduled Lambda Functions

Automating database backups is a critical aspect of software development, ensuring data integrity and resilience against unforeseen circumstances. While conventional methods like AWS RDS backups or storing backups on S3 are widely utilized, they may not always meet specific client requirements. Recently, I encountered a client with unique needs, prompting me to explore alternative solutions tailored to their preferences.
The journey began with researching and experimenting to develop a bespoke solution that seamlessly integrates with the client’s infrastructure. The outcome? A cron-scheduled Lambda function designed to handle database backups and securely upload them to a server via SFTP (SSH File Transfer Protocol).
The project is a Serverless Framework project, housing a Lambda function triggered by an AWS EventBridge Rule. This rule governs the execution schedule defined in the serverless.yml configuration file. At its core, the Lambda function utilizes tools like mysqldump for MySQL database backups and SFTP for secure file transfer.
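Concretely, the schedule wiring in serverless.yml looks roughly like the sketch below; the handler path and interval are illustrative, not the repository’s exact values:

```yaml
functions:
  dbBackupHandler:
    handler: handler.dbBackupHandler  # illustrative handler path
    events:
      # The Serverless Framework creates an EventBridge rule from this entry;
      # adjust the interval to your backup cadence.
      - schedule: rate(1 day)
```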
Prerequisites
Before diving into setting up the project, ensure you have the following prerequisites:
- Node.js and npm installed on your machine.
- Familiarity with the Serverless Framework.
Setup
Setting up the project is a breeze. Simply follow these steps:
1. Clone the repository:
git clone https://github.com/vahiwe/serverless-aws-lambda-scheduled-cron.git
cd serverless-aws-lambda-scheduled-cron
2. Install dependencies:
npm install
Configuration
Configuration of the project involves a few key steps:
1. Create a Parameter Store parameter: Set up a parameter named `lambda-ssh-key` with the value being the private key used to connect to the server.
2. Set environment variables: Configure the Lambda function with the necessary environment variables:
- `DB_ENDPOINT`: The database endpoint to connect to.
- `DB_NAME`: The database name.
- `DB_USER`: The database user.
- `DB_PASS`: The database password.
- `DB_PORT`: The database port.
- `SFTP_HOST`: The SFTP host.
- `SFTP_USER`: The SFTP user.
- `NUM_BACKUPS`: The number of backups to keep on the server.
3. Remember not to expose these variables in `serverless.yml` for security reasons.
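Since the function depends on all of these variables, a small startup guard can fail fast with a clear message when one is missing. The helper below is a sketch, not code from the repository:

```javascript
// Hypothetical startup guard for the Lambda handler:
// list the variables the function expects, and report any that are unset.
const REQUIRED_VARS = [
  "DB_ENDPOINT", "DB_NAME", "DB_USER", "DB_PASS",
  "DB_PORT", "SFTP_HOST", "SFTP_USER", "NUM_BACKUPS",
];

// Return the names from `required` that are missing or empty in `env`.
function missingEnv(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name]);
}

// Inside the handler, one might then do:
// const missing = missingEnv(process.env);
// if (missing.length) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```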
Additional Considerations
Several additional points are worth noting:
- The `tools` folder contains the necessary executables (`mysqldump` and `gzip`) for Linux. You can replace these with different versions if needed.
- The executables are built for the `x64` architecture. Adjust as necessary for other architectures.
- Libraries required by `mysqldump` (`libcrypto.so.1.0.0` and `libssl.so.1.0.0`) are located in the `tools/lib` folder. Replace them as needed.
Schedule Event Types
The project offers flexibility in scheduling backups using either `rate` or `cron` expressions. Here's how they work:
- Rate expressions: Define schedules based on fixed intervals, using the syntax `rate(value unit)`. For example, `rate(1 minute)` triggers the function every minute.
- Cron expressions: Define schedules using cron-like syntax: `cron(Minutes Hours Day-of-month Month Day-of-week Year)`.
Detailed syntax is available in the AWS documentation.
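Either form attaches to the function's events in serverless.yml; the intervals below are illustrative:

```yaml
functions:
  dbBackupHandler:
    events:
      - schedule: rate(2 hours)        # every two hours
      # or, as an EventBridge cron expression (note the required ? field):
      # - schedule: cron(0 3 * * ? *)  # 03:00 UTC every day
```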
Usage
After deploying the function, you can configure backup retention according to your requirements. By limiting the number of backups kept on the server (via `NUM_BACKUPS`), the function removes redundant backups, optimizing storage utilization and streamlining backup management.
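The retention logic amounts to sorting the backups on the server and deleting all but the newest `NUM_BACKUPS`. A minimal sketch of that selection, assuming backup file names carry sortable (ISO-like) timestamps:

```javascript
// Hypothetical pruning helper: given the backup file names found on the server,
// return the ones to delete so that only the `numBackups` most recent remain.
function backupsToDelete(fileNames, numBackups) {
  // ISO-like timestamps sort correctly as plain strings, oldest first.
  const sorted = [...fileNames].sort();
  return sorted.slice(0, Math.max(0, sorted.length - numBackups));
}
```

With an SFTP client such as `ssh2-sftp-client`, the handler could list the remote backup directory, pass the names through this helper, and delete the returned entries.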
Deployment
To deploy the function, ensure you have Serverless Framework installed and AWS credentials configured. Then, execute:
serverless deploy
Local Invocation
Test your function locally using:
serverless invoke --function dbBackupHandler
Logs
View logs using:
serverless logs --function dbBackupHandler --tail
Cleanup
Remove resources using:
serverless remove
Exploring Alternative Solutions
While this approach provides a tailored solution for niche cases, it’s important to explore alternatives that might better suit your needs. The most commonly employed method is RDS automated backups. One can also push the backups from the Lambda function to S3 instead of to a server.
RDS Automated Backups
Amazon RDS offers a convenient automated backup feature that takes regular snapshots of your database instances. These snapshots are automatically retained for a specified period, allowing for easy point-in-time recovery and eliminating the need for manual backup management. If your database is hosted on RDS and your backup requirements align with its capabilities, leveraging RDS automated backups can be a straightforward and efficient solution.
Pushing Backups to Amazon S3
Another approach similar to the current solution is to use Lambda functions to create backups and push them directly to Amazon S3. By leveraging Lambda’s integration with S3 and AWS SDKs, you can automate the backup process and store backups securely in scalable S3 buckets. This method offers flexibility in terms of storage management and allows for seamless integration with other AWS services.
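As a sketch of that variant, the handler could stream the compressed dump to S3 with the AWS SDK. The key-naming helper and bucket layout below are assumptions for illustration, not part of the original project:

```javascript
// Hypothetical key layout: one prefix per database, timestamped objects.
function backupKey(dbName, date) {
  const stamp = date.toISOString().replace(/[:.]/g, "-");
  return `backups/${dbName}/${stamp}.sql.gz`;
}

// Upload a backup stream to S3 (sketch; assumes @aws-sdk/client-s3 is installed
// and AWS credentials are available to the Lambda's execution role).
async function uploadBackup(bucket, dbName, bodyStream) {
  // Lazy require so the sketch stays loadable without the SDK installed.
  const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
  const s3 = new S3Client({});
  await s3.send(new PutObjectCommand({
    Bucket: bucket,
    Key: backupKey(dbName, new Date()),
    Body: bodyStream,
  }));
}
```

An S3 lifecycle rule on the `backups/` prefix could then replace the `NUM_BACKUPS` pruning logic entirely.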
Conclusion
The project was an exciting journey rooted in understanding and meeting our client’s needs. While we initially explored more conventional options like AWS RDS backups or S3 storage, it quickly became apparent that our client had a unique vision. Embracing this vision led us to develop a tailored solution perfectly aligned with their requirements. This project reminds us of the importance of listening to our clients and being open to unconventional approaches. Ultimately, it’s about finding the right fit for the job, even if it means stepping outside the usual playbook.