Storage Made Easy Blog
Posted on February 16, 2020October 5, 2020 by Steven Sweeting

Using AWS Lambda to Automatically Sync Metadata With S3 Events

The Enterprise File Fabric indexes Amazon S3 metadata to provide a number of enhanced file services over S3 object storage, including reading/browsing and searching. The File Fabric has its own indexing engine to provide these services.

Applications may update/upload/delete objects through the File Fabric or in a bi-modal fashion directly through S3 APIs. When objects are updated directly using S3 APIs the File Fabric metadata must also be updated. There are a number of ways that the metadata can be updated:

    • On Demand – A user can select “Cloud Refresh” when browsing via the web, mobile or desktop apps. This user-driven event re-syncs the folder as needed. Alternatively, enabling “Real-time refresh” updates metadata automatically as folders are browsed. Finally, a third option, “Folders auto-refresh”, triggers asynchronous folder refreshes.
    • Periodically – To update the metadata for an entire S3 account (provider), a “Provider Re-sync” can be initiated. Running as a background task, the File Fabric re-syncs metadata for ALL files/folders within the S3 provider. Periodic re-syncs can be scheduled to run (daily, weekly, etc.) via the Dashboard.
    • On S3 Event – The S3 bucket generates an event on object creation or deletion which is sent to and processed by the File Fabric.

Users of the File Fabric are likely familiar with the first two approaches, so this article walks through the third: processing the “S3 Event” notifications that AWS generates when objects are created and deleted. (In a future blog we’ll look at monitoring the S3 API calls logged in AWS CloudTrail.)

For more information on indexing, and the reasons why you may choose the third approach, please see our earlier blog post – How to Optimize S3 and S3 Compatible Object Storage Solutions for End User Access with Hundreds of Millions of Files.

Lambda Architecture

A script runs as an AWS Lambda function, processing S3 event notifications. The flow is:

    1. S3 API – External applications call Amazon S3 APIs to create and delete objects.
    2. S3 Events – Amazon S3 buckets send S3 Event notifications to an Amazon SQS queue.
    3. Message – Our AWS Lambda function receives messages from the queue which contain one or more S3 Event notifications.
    4. Request Sync – The Lambda function asks the File Fabric to refresh only the updated objects via the File Fabric’s REST API. (SyncOnProviderEvent – available since v1906.00)
    5. Object Sync – The File Fabric verifies the object status with S3 and updates its metadata.
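The handler side of steps 3 and 4 can be sketched in Python. This is a minimal sketch, not the script from the download: the actual File Fabric REST call (the SyncOnProviderEvent API) is elided here, so the handler only shows how SQS message bodies are unpacked into individual S3 event notifications.

```python
import json
import urllib.parse


def parse_s3_events(sqs_event):
    """Extract (event_name, bucket, key) tuples from an SQS-delivered
    batch of S3 Event notifications. Each SQS record body is itself a
    JSON document that may contain several S3 event records."""
    results = []
    for record in sqs_event.get("Records", []):
        body = json.loads(record["body"])
        # Test events (s3:TestEvent) have no "Records" key; skip them.
        for s3_record in body.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # S3 URL-encodes object keys in event notifications.
            key = urllib.parse.unquote_plus(s3_record["s3"]["object"]["key"])
            results.append((s3_record["eventName"], bucket, key))
    return results


def lambda_handler(event, context):
    # For each notification, the real script asks the File Fabric to
    # re-sync just that object via its REST API (SyncOnProviderEvent).
    # That call is omitted here; we only log what would be synced.
    for event_name, bucket, key in parse_s3_events(event):
        print(f"would sync: {event_name} s3://{bucket}/{key}")
```

Note that a single Lambda invocation may receive a batch of SQS messages, each carrying one or more S3 events, which is why the parser flattens two nested lists.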

AWS Architecture for S3 Sync
Enterprise File Fabric S3 Sync with AWS Lambda

S3 Events can only be sent to queues in the same region. For buckets in other regions, send events to an Amazon Simple Notification Service (SNS) topic in the bucket’s region, which in turn forwards them to the queue in the target region.

Installation

The files for this example can be downloaded here: jibe-blog1.zip.

The script has been built with Python 3.8.1. It uses the third-party library “requests” which can be installed via pip:

pip install requests

The AWS Lambda function, SQS queue, buckets and, if needed, SNS topics can be configured through the AWS Management Console.

A Makefile is also provided if you’d like to automate the setup of the AWS Lambda function and SQS Queues. The Makefile scripts use the AWS CLI.

Installing the AWS CLI

Install the AWS CLI. The Python script and make recipes have been tested with AWS CLI Version 1.

The Makefile and recipes use an AWS profile ‘sme_jibe_sync’. Set the credentials for this profile using the AWS CLI:

$ aws --profile sme_jibe_sync configure
AWS Access Key ID [None]: AKASJTERCESKJHEBMNQ
AWS Secret Access Key [None]: AKShasd98723bjks963234
Default region name [None]: us-east-1
Default output format [None]:

Automated Installation


Create a configuration file with the command

make config.mak

Review the settings in the file:

# Configuration settings used in Makefile
export AWS_PROFILE=sme_jibe_sync
export AWS_REGION=us-east-1
TOPIC_REGIONS=us-west-1 eu-west-2
AWS_ACCOUNT_ID=0123456789
QUEUE_NAME=bucket-activity
LAMBDA_ROLE_NAME=sme_jibe_lambda_role
FUNCTION_NAME=sme_jibe_sync
TOPIC_NAME=bucket-activity
AWSCLIOPTS=--region $(AWS_REGION) --profile $(AWS_PROFILE)

To deploy the components via the CLI:

make aws-install

The connection between the queue and the function is created by the script but deliberately left disabled. You MUST enable it via the AWS Console when ready.

Manual Installation

These are the steps to configure the Lambda function manually. They can also be useful in troubleshooting any issues with the automated setup.

Set up a Queue

Create a “Standard” SQS Queue in the region where your S3 buckets are located.

    1. Navigate to the SQS Management Console.
    2. Change regions to where (most of) your S3 buckets are located.
    3. Create a Standard Queue with the Name bucket-activity.
    4. Select the Permissions tab and choose Add a Permission:
      • Effect – Allow
      • Principal – Everybody (*)
      • Actions – SendMessage
    5. After you hit Save you should see:
      Effect   Principals      Actions           Conditions
      Allow    Everybody (*)   SQS:SendMessage   None

Advanced: You could also restrict incoming messages to specific buckets using a Condition, such as Condition = ArnLike, Key = aws:SourceArn, Value = arn:aws:s3:::my-bucket1
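That bucket-restricted permission corresponds to an SQS access policy along these lines; the account ID, region, queue name and bucket name below are placeholders to substitute with your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "SQS:SendMessage",
      "Resource": "arn:aws:sqs:us-east-1:123456789012:bucket-activity",
      "Condition": {
        "ArnLike": { "aws:SourceArn": "arn:aws:s3:::my-bucket1" }
      }
    }
  ]
}
```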

Send S3 Events to SQS Queue

Select a bucket to configure from the S3 Management Console.

    1. Go to the Properties tab
    2. Scroll to Advanced settings and select Events
    3. Select Add notification
    4. Change Send to to SQS Queue
    5. Select your queue from the drop-down
If you get this error you’ll need to add permissions to your SQS Queue (see “Set up a Queue” above):

Unable to validate the following destination configurations.
Permissions on the destination queue do not allow S3 to
publish notifications from this bucket.
(arn:aws:sqs:us-east-1:1234567890:bucket-activity)

Send S3 Events to SNS Topic

You can’t send S3 events to a queue in another region. Instead, create an SNS topic in the bucket’s region and subscribe the target queue to it.

Select a bucket to configure from the S3 Management Console.

    1. Go to the Properties tab
    2. Scroll to Advanced settings and select Events
    3. Select Add notification
    4. Change Send to to SNS Topic
    5. Select your topic from the drop-down

If you get this error you’ll need to add permissions to your SNS Topic.

Unable to validate the following destination configurations.
Permissions on the destination topic do not allow S3 to
publish notifications from this bucket.
(arn:aws:sns:us-east-1:1234567890:bucket-activity)
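To resolve this, attach a topic policy that allows S3 to publish to the topic. A sketch, again with placeholder account ID, region, topic name and bucket name:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "s3.amazonaws.com" },
      "Action": "SNS:Publish",
      "Resource": "arn:aws:sns:us-east-1:123456789012:bucket-activity",
      "Condition": {
        "ArnLike": { "aws:SourceArn": "arn:aws:s3:::my-bucket1" }
      }
    }
  ]
}
```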

Creating a Lambda Function

Create a config.json file with an Enterprise File Fabric endpoint and credentials. For example,

{
    "apiendpoint" : "https://storagemadeeasy.com",
    "login" : "adminuser@example.com",
    "password" : "Password123"
}

Create a zip of the script and config file:

zip sme_jibe_sync.zip jibe_core.py config.json

Navigate to the AWS Lambda Management Console.

    1. Choose Create function.
    2. Choose Author from scratch.
    3. Basic information:
      • Function name: sme_jibe_sync
      • Runtime: python3.8
    4. Permissions:
      • Create a new role and attach these permission policies
        • AWSLambdaRole
        • AWSLambdaSQSQueueExecutionRole
    5. Submit (which creates function)
    6. Verify you are in the Designer view with the Lambda function selected.
      • Scroll down to Function code.
      • Change Handler to jibe_core.lambda_handler (matching the jibe_core.py module in the zip).
      • Change Code entry type to Upload a .zip file.
      • Select Upload and locate sme_jibe_sync.zip.
      • Change timeout from 3 to 60 seconds.
      • Set throttle to 10.
      • Select Save, you’ll stay on the same page.
    7. Select Layers, still within the Designer view.
    8. Select Add a layer to add the third-party “requests” library. This pre-built layer is courtesy of Keith’s Layers (Klayers):
      arn:aws:lambda:us-east-1:770693421928:layer:Klayers-python38-requests:1

Add Queue

From the Lambda Console select the function and view the Configuration.

Add a trigger from the Queue:

    1. Select Add trigger
    2. Select SQS Queue
    3. Choose an SQS queue to read messages from.
    4. Enable trigger
    5. Then Add

The SQS queue will now be visible as a trigger.
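Under the hood, the trigger added above is an SQS event source mapping. If you’d rather script it than click through the console (the provided Makefile does this via the AWS CLI), the boto3 equivalent looks roughly like the sketch below. The ARN and names are placeholders, and the mapping is built disabled by default, mirroring the behaviour of the automated install.

```python
def build_mapping_params(queue_arn, function_name, enabled=False):
    """Parameters for Lambda's create_event_source_mapping call.
    Disabled by default so it can be reviewed and enabled from
    the console when ready."""
    return {
        "EventSourceArn": queue_arn,
        "FunctionName": function_name,
        "BatchSize": 10,   # deliver up to 10 SQS messages per invocation
        "Enabled": enabled,
    }


def create_sqs_trigger(queue_arn, function_name):
    """Create the event source mapping via the AWS SDK for Python.
    Requires boto3 and valid AWS credentials in the environment."""
    import boto3
    client = boto3.client("lambda")
    return client.create_event_source_mapping(
        **build_mapping_params(queue_arn, function_name)
    )
```

Calling, for example, create_sqs_trigger("arn:aws:sqs:us-east-1:123456789012:bucket-activity", "sme_jibe_sync") creates the mapping; enable it (or use the console’s Enable trigger) when you’re ready for events to flow.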
