Those who use and know Amazon S3 will be familiar with multipart uploads. Multipart upload enables a (typically large) single object to be uploaded as a set of parts.
Using multipart uploads has the following advantages:
- Improved throughput – Parts are uploaded in parallel to improve throughput.
- Quick recovery from any network issues – Smaller part size minimizes the impact of restarting a failed upload due to a network error.
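To illustrate how an object gets split into parts, here is a minimal sketch in Python. The helper name and part size are our own for illustration; this is not part of any S3 SDK:

```python
def part_ranges(object_size, part_size):
    """Split an object of object_size bytes into (offset, length) pairs,
    one per part; every part is part_size bytes except possibly the last."""
    ranges = []
    offset = 0
    while offset < object_size:
        length = min(part_size, object_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges

# A 25 MB object with 10 MB parts yields three parts;
# each part can then be uploaded in parallel and retried independently.
MB = 1024 * 1024
print(part_ranges(25 * MB, 10 * MB))
# → [(0, 10485760), (10485760, 10485760), (20971520, 5242880)]
```

Because each part is independent, a network error only forces a retry of that one part rather than the whole object, which is where the quick-recovery advantage comes from.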
Multipart upload is also available with the File Fabric M-Stream file acceleration technology. M-Stream takes advantage of the multipart upload APIs of object storage (S3-compatible and OpenStack), enabling File Fabric web and desktop apps to upload large objects very rapidly.
In addition to upload, M-Stream also enables downloads to be accelerated in the same way, i.e. large file downloads are split into pieces, sent in parallel over multiple streams, and reassembled back into a contiguous file or object at the target, all done transparently to the end user. M-Stream maximizes network bandwidth, minimizes network latency, and increases resiliency, particularly over wide area networks.
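The download side can be sketched in the same way. The following is an illustrative Python example, not M-Stream's actual implementation: it simulates ranged reads against an in-memory "object" and reassembles pieces fetched in parallel (in a real system each read would be an HTTP GET with a `Range` header):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_range(obj, offset, length):
    # Stand-in for an HTTP GET with a "Range: bytes=offset-..." header.
    return obj[offset:offset + length]

def parallel_download(obj, part_size, workers=4):
    # Compute the (offset, length) pieces, fetch them concurrently,
    # then join them in order into a contiguous result.
    pieces = [(o, min(part_size, len(obj) - o))
              for o in range(0, len(obj), part_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda p: fetch_range(obj, *p), pieces))
    return b"".join(parts)

data = bytes(range(256)) * 1000  # a 256,000-byte "object"
assert parallel_download(data, part_size=30_000) == data
```

The key point is that `pool.map` preserves the order of the pieces even though they are fetched concurrently, so the reassembled file is byte-identical to the source.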
M-Stream is activated when moving and copying large files, folders and objects. It is supported both in the File Fabric desktop tools (uploading and downloading) and in the File Fabric appliance itself (copying data server-to-server).
If you use M-Stream through the SaaS service there are two areas of limitation: the first is your local connection speed, and the second is our data centre, which is restricted to 100MBps.
To see M-Stream in action you can either connect your own S3 account or use the default storage that comes with the File Fabric SaaS, as this is hosted on S3. In either case, once this is done, go to the Cloud File Manager in the web browser and drop a large object into a bucket or pseudo-folder, and you will see something like this:
The green triangles represent the number of multipart objects that are being used for the upload. If you hover over this with your mouse you will see a bandwidth graph:
We tested using our M-Stream technology with Amazon S3 from a bare metal instance in the EC2 regions of California (US-West-1) and Virginia (US-East-1) against S3 buckets in California (US-West-1), Virginia (US-East-1) and London (EU-West-2).
The results were:
As we would expect, the fastest result we achieved was sending from the filesystem to S3 in the same region. The speed was about 1850MBps.
In contrast, we achieved the exact same, much lower, number in each of the cross-region tests:
California -> Virginia S3
Virginia -> California S3
Virginia -> London S3
California -> London S3
We tried multiple tweaks of M-Stream, in terms of the number of parallel threads and so on, and we always capped out at 598MBps (which is kind of a funny number). This led us to believe that AWS is capping data transfers, or has a roughly 5Gbps bandwidth limitation, between instances in different regions: 598MBps x 8 bits/byte is about 4.8Gbps.
It’s that last test that really pointed us in the direction of AWS S3 uploads being capped: why would California -> London S3 match all the other numbers if latency were the issue?
The only references we could find to this limiting were in an Amazon blog post on how the “Floodgates” were finally open on bandwidth for EC2 instances. It appears that the floodgates are still closed when EC2 is sending data to S3 in a remote region.
That being said, ~600MBps from California to London is actually pretty fast, and in some cases it’s faster than we see in many on-premises environments with object storage vendors, StorNext and Isilon.
The next step is to repeat the tests using Amazon’s S3 Transfer Acceleration to understand what impact it has on the speeds.