Handling multiple S3 objects can be a royal pain when you're dealing with performance issues. I recently faced this exact problem while working on a project that required me to retrieve numerous media files from AWS S3, process them to create thumbnails, and display them on a webpage.
It sounds simple enough, right? Wrong.
The performance bottleneck hit me like a ton of bricks. When you're retrieving files one after another, the wait time becomes unbearable – especially when the number of files is completely dynamic. Your users will absolutely abandon your application if they have to stare at loading screens for ages.
The solution is actually quite brilliant – make multiple requests in parallel instead of sequentially. This approach dramatically reduces retrieval time to essentially the duration needed for the longest file. Instead of adding up all the wait times, you’re overlapping them!
We'll be using a PHP-based solution in this article (it's the stack I had to use on my project). However, even if you're trying to achieve this in a different language, chances are you'll be able to do so by following a similar approach to the one shown in this guide.
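To make the problem concrete, here's roughly what my first, sequential attempt looked like with version 2 of the AWS SDK for PHP: one getObject() call after another, each blocking until the previous download finishes. This is only a sketch; the bucket name, the $mediaKeys variable, and the local paths are placeholders:

// Sequential baseline: each download blocks until the previous one finishes.
// The bucket name, $mediaKeys, and local paths are placeholders for illustration.
$s3 = \Aws\S3\S3Client::factory(array(
    'key'    => 'YOUR_AWS_ACCESS_KEY',
    'secret' => 'YOUR_AWS_SECRET_KEY',
    'region' => 'us-east-1',
));

foreach ($mediaKeys as $i => $key) {
    // SaveAs writes the object straight to a local file
    $s3->getObject(array(
        'Bucket' => 'my-test-bucket',
        'Key'    => $key,
        'SaveAs' => "local/path/file-{$i}.jpg",
    ));
}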
Thankfully, the official Amazon PHP SDK uses the Guzzle library for HTTP requests, and since version 2.0, Guzzle has supported parallel requests. This made implementing my solution much easier than expected.
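In plain Guzzle terms, parallel transfer boils down to handing send() an array of requests instead of a single request. Here's a minimal sketch, assuming Guzzle 3's Guzzle\Http\Client is available (the example.com URLs are just placeholders):

// Minimal Guzzle 3 parallel-transfer sketch; the URLs are placeholders.
use Guzzle\Http\Client;
use Guzzle\Http\Exception\MultiTransferException;

$client = new Client();
$requests = array(
    $client->get('https://example.com/one.jpg'),
    $client->get('https://example.com/two.jpg'),
    $client->get('https://example.com/three.jpg'),
);

try {
    // Passing an array of requests makes Guzzle transfer them in parallel
    $responses = $client->send($requests);
} catch (MultiTransferException $e) {
    echo $e->getMessage();
}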
Let’s dive right into the code. I’ll show you exactly how I extended the S3Client class to add parallel retrieval functionality:
<?php
namespace S3;

use Aws\S3\S3Client;
use Aws\Common\Exception\TransferException;

/**
 * Extended S3Client class for retrieving multiple objects in parallel
 *
 * @author Your Name
 */
class MyS3Client extends S3Client {

    /**
     * Retrieves multiple S3 objects in parallel
     *
     * @param array    $configs Configuration array for each object (Bucket, Key, saveAs)
     * @param S3Client $client  S3Client instance used to send the requests
     *
     * @return void
     */
    public static function getObjects(array $configs, S3Client $client) {
        $requests  = array();
        $savePaths = array();

        // Create a request object for each file and remember where to save it
        foreach ($configs as $config) {
            $url = "https://" . $config["Bucket"] . ".s3.amazonaws.com/" . $config["Key"];
            $request = $client->get($url);
            $requests[] = $request;
            $savePaths[$url] = $config["saveAs"];
        }

        // Send all requests in parallel
        try {
            $responses = $client->send($requests);
        } catch (TransferException $e) {
            // Bail out on transfer failure; otherwise $responses would be undefined below
            echo $e->getMessage();
            return;
        }

        // Process all responses and save the files to disk
        foreach ($responses as $res) {
            $localPath = $savePaths[$res->getEffectiveUrl()];
            file_put_contents($localPath, $res->getBody(true));
        }
    }
}

Using this custom client is incredibly straightforward. Here's how you'd implement it in your project:
// Build a standard SDK v2 S3 client; getObjects() accepts any S3Client instance
$s3 = \Aws\S3\S3Client::factory(array(
    'key'    => 'YOUR_AWS_ACCESS_KEY',
    'secret' => 'YOUR_AWS_SECRET_KEY',
    'region' => 'us-east-1', // change to your region
));

// Create the configuration array for multiple objects
$configs = array();

// Add the first object
$configs[] = array(
    'Bucket' => "my-test-bucket",
    'Key'    => "path/to/first-object.jpg",
    'saveAs' => "local/path/first-image.jpg"
);

// Add the second object
$configs[] = array(
    'Bucket' => "my-test-bucket",
    'Key'    => "path/to/second-object.jpg",
    'saveAs' => "local/path/second-image.jpg"
);

// Add as many objects as needed following the same pattern

// Retrieve all objects in parallel
\S3\MyS3Client::getObjects($configs, $s3);

Let me break down exactly what's happening in our implementation:
Our MyS3Client class extends the S3Client class provided by the AWS PHP SDK, which means you can use it exactly like the original client with all its methods. On top of that, it adds a static method, getObjects(), that takes two parameters: the configuration array describing each object and an S3Client instance. For every entry in that array a Guzzle request is created, and all of them are transferred at once by calling Guzzle's send() method with our array of requests.

You might be wondering why I implemented getObjects() as a static method. The original S3Client class is structured so that most methods map directly to AWS SDK REST API commands, with additional utility methods being static. I followed this pattern for consistency.
That said, if you have a better approach to implement this as a non-static method, I’d absolutely love to hear about it! Leave a comment below with your suggestions.
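If it helps frame the discussion, one direction I've been toying with is skipping inheritance entirely and wrapping the client in a small dedicated class, so the parallel download becomes an instance method. This is only a sketch built on the same Guzzle calls as above; the ParallelDownloader name and its download() method are made up for illustration:

<?php
namespace S3;

use Aws\S3\S3Client;
use Aws\Common\Exception\TransferException;

/**
 * Non-static alternative: wrap the client instead of extending it.
 * Purely a sketch; the class and method names are illustrative only.
 */
class ParallelDownloader {

    /** @var S3Client */
    private $client;

    public function __construct(S3Client $client) {
        $this->client = $client;
    }

    /**
     * @param array $configs One entry per object: Bucket, Key, saveAs
     * @return void
     */
    public function download(array $configs) {
        $requests  = array();
        $savePaths = array();

        // Build one request per object, exactly as in getObjects()
        foreach ($configs as $config) {
            $url = "https://" . $config["Bucket"] . ".s3.amazonaws.com/" . $config["Key"];
            $requests[] = $this->client->get($url);
            $savePaths[$url] = $config["saveAs"];
        }

        try {
            $responses = $this->client->send($requests);
        } catch (TransferException $e) {
            echo $e->getMessage();
            return;
        }

        foreach ($responses as $res) {
            file_put_contents($savePaths[$res->getEffectiveUrl()], $res->getBody(true));
        }
    }
}

Usage then reads a little more naturally: $downloader = new \S3\ParallelDownloader($s3); $downloader->download($configs);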
The performance improvement from this approach is nothing short of impressive. To put it into perspective: suppose you need ten files that each take about a second to download. Fetched sequentially, that's roughly ten seconds of waiting; fetched in parallel, it's roughly the one second the slowest file needs.

That's a 90% reduction in wait time! Your users will definitely notice the difference, and your application will feel much more responsive.
Before implementing this solution, there are a few practical points to keep in mind.
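The one that mattered most for me is memory: as written, getObjects() pulls each full response body into memory before writing it to disk with file_put_contents(), which is fine for thumbnails but can add up with large files and many parallel downloads. If that's a concern, Guzzle 3 lets you stream a response straight to a file instead. A sketch of that variation is below; it assumes Guzzle 3's Request::setResponseBody() accepts a local file path, so verify it against the SDK version you're running:

// Variation on the request-building loop inside getObjects():
// stream each response directly to its destination file instead of
// buffering it in memory. Assumes Guzzle 3's setResponseBody().
foreach ($configs as $config) {
    $url = "https://" . $config["Bucket"] . ".s3.amazonaws.com/" . $config["Key"];
    $request = $client->get($url);
    $request->setResponseBody($config["saveAs"]); // written to disk as it downloads
    $requests[] = $request;
}

// After $client->send($requests) the files are already on disk,
// so the final file_put_contents() loop is no longer needed.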
The parallel request pattern isn't limited to S3, either. You can apply the same technique to any scenario that involves making multiple HTTP requests.
Retrieving multiple S3 objects in parallel through PHP is an extremely effective way to optimize your application’s performance. By extending the AWS SDK’s S3Client class and leveraging Guzzle’s parallel request capabilities, you can dramatically reduce wait times for your users.
The code provided here is straightforward to implement and can be easily integrated into existing projects. If performance is important for your S3 operations – and let’s be honest, when isn’t it? – this approach is definitely worth implementing.
Have you tried similar optimization techniques with AWS services? I’d love to hear about your experiences in the comments!