Automating Database Backups: Simplifying Operations with Scheduled Lambda Functions

Automating database backups is a critical aspect of software development, ensuring data integrity and resilience against unforeseen circumstances. While conventional methods like AWS RDS backups or storing backups on S3 are widely utilized, they may not always meet specific client requirements. Recently, I encountered a client with unique needs, prompting me to explore alternative solutions tailored to their preferences.
The journey began with researching and experimenting to develop a bespoke solution that seamlessly integrates with the client’s infrastructure. The outcome? A cron-scheduled Lambda function designed to handle database backups and securely upload them to a server via SFTP (SSH File Transfer Protocol).
The project is a Serverless Framework project, housing a Lambda function triggered by an AWS EventBridge Rule. This rule governs the execution schedule defined in the serverless.yml configuration file. At its core, the Lambda function utilizes tools like mysqldump for MySQL database backups and SFTP for secure file transfer.
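Concretely, the schedule wiring in serverless.yml looks roughly like the sketch below; the handler path and interval are illustrative, not the repository’s exact values:

```yaml
functions:
  dbBackupHandler:
    handler: handler.dbBackupHandler  # illustrative handler path
    events:
      # The Serverless Framework creates an EventBridge rule from this entry;
      # adjust the interval to your backup cadence.
      - schedule: rate(1 day)
```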
Prerequisites
Before diving into setting up the project, ensure you have the following prerequisites:
- Node.js and npm installed on your machine.
- Familiarity with the Serverless Framework.
Setup
Setting up the project is a breeze. Simply follow these steps:
1. Clone the repository:
git clone https://github.com/vahiwe/serverless-aws-lambda-scheduled-cron.git
cd serverless-aws-lambda-scheduled-cron
2. Install dependencies:
npm install
Configuration
Configuration of the project involves a few key steps:
1. Create a Parameter Store parameter: Set up a parameter named `lambda-ssh-key` with the value being the private key used to connect to the server.
2. Set environment variables: Configure the Lambda function with the necessary environment variables:
- `DB_ENDPOINT`: The database endpoint to connect to.
- `DB_NAME`: The database name.
- `DB_USER`: The database user.
- `DB_PASS`: The database password.
- `DB_PORT`: The database port.
- `SFTP_HOST`: The SFTP host.
- `SFTP_USER`: The SFTP user.
- `NUM_BACKUPS`: The number of backups to keep on the server.
3. Remember not to expose these variables in `serverless.yml` for security reasons.
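Since the function depends on all of these variables, a small startup guard can fail fast with a clear message when one is missing. The helper below is a sketch, not code from the repository:

```javascript
// Hypothetical startup guard for the Lambda handler:
// list the variables the function expects, and report any that are unset.
const REQUIRED_VARS = [
  "DB_ENDPOINT", "DB_NAME", "DB_USER", "DB_PASS",
  "DB_PORT", "SFTP_HOST", "SFTP_USER", "NUM_BACKUPS",
];

// Return the names from `required` that are missing or empty in `env`.
function missingEnv(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name]);
}

// Inside the handler, one might then do:
// const missing = missingEnv(process.env);
// if (missing.length) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```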
Additional Considerations
Several additional points are worth noting:
- The `tools` folder contains the necessary executables (`mysqldump` and `gzip`) for Linux. You can replace these with different versions if needed.
- The executables are built for the `x64` architecture. Adjust as necessary for other architectures.
- Libraries required by `mysqldump` (`libcrypto.so.1.0.0` and `libssl.so.1.0.0`) are located in the `tools/lib` folder. Replace them as needed.
Schedule Event Types
The project offers flexibility in scheduling backups using either `rate` or `cron` expressions. Here's how they work:
- Rate expressions: Define schedules based on fixed intervals, using the syntax `rate(value unit)`. For example, `rate(1 minute)` triggers the function every minute.
- Cron expressions: Define schedules using cron-like syntax: `cron(Minutes Hours Day-of-month Month Day-of-week Year)`.
Detailed syntax is available in the AWS documentation.
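Either form attaches to the function's events in serverless.yml; the intervals below are illustrative:

```yaml
functions:
  dbBackupHandler:
    events:
      - schedule: rate(2 hours)        # every two hours
      # or, as an EventBridge cron expression (note the required ? field):
      # - schedule: cron(0 3 * * ? *)  # 03:00 UTC every day
```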
Usage
After deploying the function, you can configure backup retention according to your requirements. By limiting the number of backups kept on the server (via `NUM_BACKUPS`), the function removes redundant backups, optimizing storage utilization and streamlining backup management.
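The retention logic amounts to sorting the backups on the server and deleting all but the newest `NUM_BACKUPS`. A minimal sketch of that selection, assuming backup file names carry sortable (ISO-like) timestamps:

```javascript
// Hypothetical pruning helper: given the backup file names found on the server,
// return the ones to delete so that only the `numBackups` most recent remain.
function backupsToDelete(fileNames, numBackups) {
  // ISO-like timestamps sort correctly as plain strings, oldest first.
  const sorted = [...fileNames].sort();
  return sorted.slice(0, Math.max(0, sorted.length - numBackups));
}
```

With an SFTP client such as `ssh2-sftp-client`, the handler could list the remote backup directory, pass the names through this helper, and delete the returned entries.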
Deployment
To deploy the function, ensure you have Serverless Framework installed and AWS credentials configured. Then, execute:
serverless deploy
Local Invocation
Test your function locally using:
serverless invoke --function dbBackupHandler
Logs
View logs using:
serverless logs --function dbBackupHandler --tail
Cleanup
Remove resources using:
serverless remove
Exploring Alternative Solutions
While this approach provides a tailored solution for niche cases, it’s important to explore alternatives that might better suit your needs. The most commonly employed method is RDS automated backups. One can also push the backups from the Lambda function to S3 instead of to a server.
RDS Automated Backups
Amazon RDS offers a convenient automated backup feature that takes regular snapshots of your database instances. These snapshots are automatically retained for a specified period, allowing for easy point-in-time recovery and eliminating the need for manual backup management. If your database is hosted on RDS and your backup requirements align with its capabilities, leveraging RDS automated backups can be a straightforward and efficient solution.
Pushing Backups to Amazon S3
Another approach similar to the current solution is to use Lambda functions to create backups and push them directly to Amazon S3. By leveraging Lambda’s integration with S3 and AWS SDKs, you can automate the backup process and store backups securely in scalable S3 buckets. This method offers flexibility in terms of storage management and allows for seamless integration with other AWS services.
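As a sketch of that variant, the handler could stream the compressed dump to S3 with the AWS SDK. The key-naming helper and bucket layout below are assumptions for illustration, not part of the original project:

```javascript
// Hypothetical key layout: one prefix per database, timestamped objects.
function backupKey(dbName, date) {
  const stamp = date.toISOString().replace(/[:.]/g, "-");
  return `backups/${dbName}/${stamp}.sql.gz`;
}

// Upload a backup stream to S3 (sketch; assumes @aws-sdk/client-s3 is installed
// and AWS credentials are available to the Lambda's execution role).
async function uploadBackup(bucket, dbName, bodyStream) {
  // Lazy require so the sketch stays loadable without the SDK installed.
  const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
  const s3 = new S3Client({});
  await s3.send(new PutObjectCommand({
    Bucket: bucket,
    Key: backupKey(dbName, new Date()),
    Body: bodyStream,
  }));
}
```

An S3 lifecycle rule on the `backups/` prefix could then replace the `NUM_BACKUPS` pruning logic entirely.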
Conclusion
The project was an exciting journey rooted in understanding and meeting our client’s needs. While we initially explored more conventional options like AWS RDS backups or S3 storage, it quickly became apparent that our client had a unique vision. Embracing this vision led us to develop a tailored solution perfectly aligned with their requirements. This project reminds us of the importance of listening to our clients and being open to unconventional approaches. Ultimately, it’s about finding the right fit for the job, even if it means stepping outside the usual playbook.