
How to upload large files to a Backblaze B2 bucket

Nowadays, the cloud is ruling the IT industry, and many companies are moving to cloud storage services such as Amazon S3 and Azure Blob Storage. Backblaze B2 is another cloud storage option: a simple, reliable, and affordable object store.

Price is also an important factor when moving to cloud storage, and Backblaze offers an affordable price range compared to other cloud storage providers.

Its S3-compatible API integrates easily with your existing data management tools and S3 gateways.

Uploading large files to a Backblaze B2 bucket

Uploading large files to the cloud is a challenging task for developers. Some cloud storage providers offer their own plugins or packages for streaming uploads.

In this blog, we will discuss how to upload large files reliably, part by part, with pause/resume support and content verification, using HTTP requests from the client side.

Problems occurring while uploading large files over HTTP

maxRequestLength

If the uploaded file size is greater than the maxRequestLength threshold, the framework automatically returns a “Page not found” error.

ExecutionTimeout

The default execution time for an HTTP request is 110 seconds. If the file size is in the gigabyte range, the execution time will be exceeded and an exception will be thrown.

For example, if a user's bandwidth is in the 100-KB/s range, uploading a file of around 100 MB will take a significant amount of time. Many web servers, especially those on hosting services such as Heroku, Azure App Service, and AWS Elastic Beanstalk, have timeouts as low as 30 seconds, which causes the upload to fail.

maxAllowedContentLength (IIS 7)

maxAllowedContentLength specifies the maximum length of content in a request that IIS will accept. By default, it is 30,000,000 bytes (about 28.6 MB).

Server Efficiency & Memory issues

Retrieving a large file and storing it in temporary memory decreases server performance. If you are processing the file or doing a read-then-write, as opposed to streaming it directly into storage, temporary memory issues will occur.

Browser Limitation

Browsers impose limits as well: even browsers as old as IE8 have trouble going above a few gigabytes, and uploads of around 100 MB per request are the most reliable, which is why smaller uploads are recommended.

Part File Upload

To overcome the above issues, we use part uploads. Large files are created by assembling parts, where each part can either be uploaded directly or copied from existing files in any bucket belonging to the same account as the large file. This means a large file can be assembled from a mix of uploaded and copied parts.

The steps to upload a large file are:

  1. Start Large File
  2. Upload Part
  3. Finish Large File

Start large file

To start the large file, we need to provide the file name, content type, and custom file info. The HTTP call returns a file ID for the large file, which you will need when uploading its parts.

You also need to decide how many bytes to transfer in each part of the file; based on this part size, the part fragments are created in the bucket.
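The request below is a minimal sketch of starting a large file through B2's native b2_start_large_file endpoint, not the full implementation. The apiUrl and authorizationToken are assumed to come from a prior b2_authorize_account call, and the bucketId, file name, and content type values are placeholders.

// Requires: using System.Net.Http; using System.Text;
string apiUrl = "<apiUrl from b2_authorize_account>";
string authorizationToken = "<authorizationToken from b2_authorize_account>";
string bucketId = "<bucketId>";

HttpClient client = new HttpClient();
client.DefaultRequestHeaders.TryAddWithoutValidation("Authorization", authorizationToken);

// Body of b2_start_large_file: target bucket, file name, and content type
string startBody = "{\"bucketId\":\"" + bucketId + "\"," +
                   "\"fileName\":\"<file name>\"," +
                   "\"contentType\":\"b2/x-auto\"}";

HttpResponseMessage startResponse = client.PostAsync(
    apiUrl + "/b2api/v2/b2_start_large_file",
    new StringContent(startBody, Encoding.UTF8, "application/json")).Result;

// The response JSON contains the fileId required when uploading each part
string startResponseJson = startResponse.Content.ReadAsStringAsync().Result;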

Upload file part

To upload a large file in parts, we follow these steps.

Step 1: Read the file and split it into parts

// Determine the local file size and choose a part size (100 MB per part here)
FileInfo fileInfo = new FileInfo("<Path to file>");
long localFileSize = fileInfo.Length;
long totalBytesSent = 0;
long bytesSentForPart = 100 * (1000 * 1000);
long minimumPartSize = bytesSentForPart;
byte[] data = new byte[100 * (1000 * 1000)];   // buffer holding one part
int partNo = 1;

For each part, 100 MB of data is transferred. For example, a 2 GB file needs (1024 * 2) / 100 ≈ 21 parts. After each part is transferred successfully, increment the part number.
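As a small illustration of that arithmetic (the variable names here are only for the example, not part of the B2 API):

long fileSizeMb = 2 * 1024;   // 2 GB expressed in MB
long partSizeMb = 100;        // ~100 MB per part
long partCount = (fileSizeMb + partSizeMb - 1) / partSizeMb;   // ceiling division = 21 parts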

Step 2: Resume the file

Get the last uploaded part number and read the file stream from the corresponding byte offset.

// Resume: partNumber is the last successfully uploaded part number
long startPart = partNumber * bytesSentForPart;
FileStream f = File.OpenRead("<Path to file>");
f.Seek(startPart, SeekOrigin.Begin);        // skip the parts already uploaded
f.Read(data, 0, (int)bytesSentForPart);     // read the next part into the buffer
totalBytesSent = startPart + bytesSentForPart;

Step 3: Generate SHA1

A SHA1 checksum is computed for each part of the file. It is used to check that the part was uploaded and to verify that the data arrived correctly.

// Generate SHA1 for the current part
FileStream f = File.OpenRead("<Path to file>");
f.Seek(totalBytesSent, SeekOrigin.Begin);
f.Read(data, 0, (int)bytesSentForPart);

SHA1 sha1 = SHA1.Create();
byte[] hashData = sha1.ComputeHash(data, 0, (int)bytesSentForPart);

// Hex-encode the hash
StringBuilder sb = new StringBuilder();
foreach (byte b in hashData)
{
    sb.Append(b.ToString("x2"));
}
f.Close();

// Collect each part's SHA1; the full list is needed when finishing the large file
partSha1Array.Add(sb.ToString());

Step 4: Make HTTP POST Request

After a successful file read and SHA1 generation, add the authorization key, SHA1, content type, and content length to the request headers and make an HTTP POST request.
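The snippet below is a rough sketch of that request, not the exact code used here. It assumes uploadUrl and uploadAuthToken were obtained from B2's b2_get_upload_part_url call for this fileId, and it reuses data, partNo, bytesSentForPart, and the SHA1 string sb from the previous steps.

// Requires: using System.Net.Http;
string uploadUrl = "<uploadUrl from b2_get_upload_part_url>";
string uploadAuthToken = "<authorizationToken from b2_get_upload_part_url>";

HttpClient client = new HttpClient();

// The part bytes form the request body; the checksum and part number go in headers
ByteArrayContent partContent = new ByteArrayContent(data, 0, (int)bytesSentForPart);
partContent.Headers.ContentLength = bytesSentForPart;

HttpRequestMessage partRequest = new HttpRequestMessage(HttpMethod.Post, uploadUrl);
partRequest.Content = partContent;
partRequest.Headers.TryAddWithoutValidation("Authorization", uploadAuthToken);
partRequest.Headers.TryAddWithoutValidation("X-Bz-Part-Number", partNo.ToString());
partRequest.Headers.TryAddWithoutValidation("X-Bz-Content-Sha1", sb.ToString());

HttpResponseMessage partResponse = client.SendAsync(partRequest).Result;
string partResponseJson = partResponse.Content.ReadAsStringAsync().Result;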

Assuming a ~208 MB file, the JSON for part 1 looks like:

{
  "contentLength": 100000000,
  "contentSha1": "062685a84ab248d2488f02f6b01b948de2514ad8",
  "fileId": "4_ze73ede9c9c8412db49f60715_f100b4e93fbae6252_d20150824_m224353_c900_v8881000_t0001",
  "partNumber": 1
}

and so on for each remaining part.

Finish large file

Check that all parts of the file have been uploaded:

while (totalBytesSent < localFileSize)
{
    // The last part may be smaller than the standard part size
    if ((localFileSize - totalBytesSent) < minimumPartSize)
    {
        bytesSentForPart = (localFileSize - totalBytesSent);
    }

    // read, hash, and upload the current part here (Steps 1-4 above)

    partNo++;
    totalBytesSent = totalBytesSent + bytesSentForPart;
}

To verify that all parts of the file were uploaded correctly, we need to pass a JSON array of the SHA1 checksums (ArrayList partSha1Array = {"<sha1_part1>", "<sha1_part2>", "<sha1_part3>"}) when finishing the large file. This double-checks that the right parts were uploaded in the right order, and that none were missed.
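A minimal sketch of that finishing call (b2_finish_large_file), assuming the apiUrl and authorizationToken from b2_authorize_account, the fileId from b2_start_large_file, and the partSha1Array collected while uploading:

// Requires: using System.Net.Http; using System.Text;
// Join the collected SHA1 strings into a JSON array
string sha1Json = "[\"" + string.Join("\",\"", partSha1Array.ToArray()) + "\"]";
string finishBody = "{\"fileId\":\"" + fileId + "\",\"partSha1Array\":" + sha1Json + "}";

HttpClient client = new HttpClient();
client.DefaultRequestHeaders.TryAddWithoutValidation("Authorization", authorizationToken);

HttpResponseMessage finishResponse = client.PostAsync(
    apiUrl + "/b2api/v2/b2_finish_large_file",
    new StringContent(finishBody, Encoding.UTF8, "application/json")).Result;

// On success B2 returns the metadata of the assembled large file
string finishResponseJson = finishResponse.Content.ReadAsStringAsync().Result;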

Conclusion

Instead of uploading a large file in a single HTTP request, we streamed it as parts across multiple requests. To reduce upload time, you can also split the file into parts and upload them with parallel requests.
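As a rough sketch of that parallel approach, assuming a hypothetical UploadPart helper that wraps the read/hash/POST steps shown above for a single part number (each part needs its own upload URL and token from b2_get_upload_part_url):

// Requires: using System.Collections.Generic; using System.Threading.Tasks;
// UploadPart is a hypothetical helper, not part of the B2 API
List<Task> uploadTasks = new List<Task>();
for (int part = 1; part <= partCount; part++)
{
    int partNumber = part;   // capture the loop variable for the closure
    uploadTasks.Add(Task.Run(() => UploadPart(fileId, partNumber)));
}
Task.WaitAll(uploadTasks.ToArray());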

B2 buckets can also be integrated into media workflows, and Backblaze initially provides developers with 10 GB of free storage to sign up and start rolling out.

Reference: https://www.backblaze.com/
