
How to upload large files to Google Colab and remote Jupyter notebooks

by Bharath Raj


Photo by Thomas Kelley on Unsplash

If you haven't heard about it, Google Colab is a platform that is widely used for testing out ML prototypes on its free K80 GPU. If you have heard about it, chances are that you gave it a shot. But you might have become exasperated because of the complexity involved in transferring large datasets.

This blog compiles some of the methods that I've found useful for uploading and downloading large files from your local system to Google Colab. I've also included additional methods that can be useful for transferring smaller files with less effort. Some of the methods can be extended to other remote Jupyter notebook services, like Paperspace Gradient.

Transferring Large Files

The most efficient method to transfer large files is to use a cloud storage service such as Dropbox or Google Drive.

1. Dropbox

Dropbox offers up to 2GB of free storage space per account. This sets an upper limit on the amount of data that you can transfer at any moment. Transferring via Dropbox is relatively easy. You can also follow the same steps for other notebook services, such as Paperspace Gradient.

Step 1: Archive and Upload

Uploading a large number of images (or files) individually will take a very long time, since Dropbox (or Google Drive) has to individually assign IDs and attributes to every image. Therefore, I recommend that you archive your dataset first.

One possible method of archiving is to convert the folder containing your dataset into a '.tar' file. The code snippet below shows how to convert a folder named "Dataset" in the home directory to a "dataset.tar" file, from your Linux terminal.

                tar -cvf dataset.tar ~/Dataset              

Alternatively, you could use WinRar or 7zip, whichever is more convenient for you. Upload the archived dataset to Dropbox.
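If you'd rather stay in Python (for instance, on a system without tar), the standard library's tarfile module can produce the same archive. A minimal sketch, assuming your dataset lives at ~/Dataset:

                import os
                import tarfile

                # Create dataset.tar containing the Dataset folder (uncompressed, like tar -cvf)
                dataset_dir = os.path.expanduser("~/Dataset")
                with tarfile.open("dataset.tar", "w") as tar:
                    tar.add(dataset_dir, arcname="Dataset")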

Step 2: Clone the Repository

Open Google Colab and start a new notebook.

Clone this GitHub repository. I've modified the original code so that it can add the Dropbox access token from the notebook. Execute the following commands one by one.

                !git clone https://github.com/thatbrguy/Dropbox-Uploader.git
                cd Dropbox-Uploader
                !chmod +x dropbox_uploader.sh

Step 3: Create an Access Token

Execute the following command to see the initial setup instructions.

                !bash dropbox_uploader.sh

It will display instructions on how to obtain the access token, and will ask you to execute the following command. Replace INPUT_YOUR_ACCESS_TOKEN_HERE with your access token, then execute:

                !echo "INPUT_YOUR_ACCESS_TOKEN_HERE" > token.txt

Execute !bash dropbox_uploader.sh again to link your Dropbox account to Google Colab. Now you can download and upload files from the notebook.

Step 4: Transfer Contents

Download to Colab from Dropbox:

Execute the following command. The argument is the name of the file on Dropbox.

                !bash dropbox_uploader.sh download YOUR_FILE.tar              
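Once the archive is on Colab, you still need to extract it. A minimal sketch using Python's standard tarfile module (assuming the file from the previous command, YOUR_FILE.tar):

                import tarfile

                # Extract the archive into the current working directory
                with tarfile.open("YOUR_FILE.tar", "r") as tar:
                    tar.extractall()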

Upload to Dropbox from Colab:

Execute the following command. The first argument (result_on_colab.txt) is the name of the file you want to upload. The second argument (dropbox.txt) is the name you want to save the file as on Dropbox.

                !bash dropbox_uploader.sh upload result_on_colab.txt dropbox.txt              
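If you have several result files, you can call the uploader script in a loop instead of typing the command each time. A sketch using Python's subprocess module (the results list is my own example, and it assumes your working directory is still Dropbox-Uploader):

                import subprocess

                # Hypothetical list of result files to push to Dropbox
                results = ["result_on_colab.txt", "metrics.csv"]

                for name in results:
                    # Equivalent to: !bash dropbox_uploader.sh upload <name> <name>
                    subprocess.run(["bash", "dropbox_uploader.sh", "upload", name, name], check=True)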

2. Google Drive

Google Drive offers up to 15GB of free storage for every Google account. This sets an upper limit on the amount of data that you can transfer at any moment. You can always expand this limit by purchasing more storage. Colab simplifies the authentication process for Google Drive.

That being said, I've also included the necessary modifications you can make so that you can access Google Drive from other Python notebook services as well.

Step 1: Archive and Upload

Just as with Dropbox, uploading a large number of images (or files) individually will take a very long time, since Google Drive has to individually assign IDs and attributes to every image. So I recommend that you archive your dataset first.

One possible method of archiving is to convert the folder containing your dataset into a '.tar' file. As before, the code snippet below shows how to convert a folder named "Dataset" in the home directory to a "dataset.tar" file, from your Linux terminal.

                tar -cvf dataset.tar ~/Dataset              

And again, you can use WinRar or 7zip if you prefer. Upload the archived dataset to Google Drive.

Step 2: Install dependencies

Open Google Colab and start a new notebook. Install PyDrive using the following command:

                !pip install PyDrive              

Import the necessary libraries and methods (the last two imports are only required for Google Colab; do not import them if you're not using Colab).

                import os
                from pydrive.auth import GoogleAuth
                from pydrive.drive import GoogleDrive
                from google.colab import auth
                from oauth2client.client import GoogleCredentials

Step 3: Authorize Google SDK

For Google Colab:

Now, you have to authorize the Google SDK to access Google Drive from Colab. First, execute the following commands:

                auth.authenticate_user()
                gauth = GoogleAuth()
                gauth.credentials = GoogleCredentials.get_application_default()
                drive = GoogleDrive(gauth)

You will get a prompt as shown below. Follow the link to obtain the key. Copy and paste it into the input box and press enter.

Prompt to authenticate the user
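To confirm that the authorization worked, you can list the files in the root of your Drive. A minimal sketch using PyDrive's ListFile (the query string follows the Drive API's search syntax):

                # List the non-trashed files in the root of your Google Drive
                file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
                for f in file_list:
                    print(f['title'], f['id'])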

For other Jupyter notebook services (e.g. Paperspace Gradient):

Some of the following steps are taken from PyDrive's quickstart guide.

Go to the APIs Console and create your own project. Then search for 'Google Drive API', select the entry, and click 'Enable'. Select 'Credentials' from the left menu, click 'Create Credentials', and select 'OAuth client ID'. You should see a menu such as the image shown below:

OAuth client ID creation menu

Set "Application Type" to "Other". Requite an appropriate name and click "Save".

Download the OAuth 2.0 client ID you just created. Rename it to client_secrets.json.

Upload this JSON file to your notebook. You can do this by clicking the "Upload" button on the homepage of the notebook (shown below). (Note: Do not use this button to upload your dataset, as it will be extremely time consuming.)

Upload button shown in red

Now, execute the following commands:

                gauth = GoogleAuth()
                gauth.CommandLineAuth()
                drive = GoogleDrive(gauth)

The rest of the procedure is similar to that of Google Colab.

Step 4: Obtain your File's ID

Enable link sharing for the file you want to transfer. Copy the link. You will get a link such as this:

                https://drive.google.com/open?id=YOUR_FILE_ID

Copy only the YOUR_FILE_ID part of the above link.
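If you would rather extract the ID programmatically, the link can be parsed with Python's standard urllib. A small sketch (get_file_id is my own hypothetical helper, not part of PyDrive):

                from urllib.parse import urlparse, parse_qs

                def get_file_id(share_link):
                    # Pull the 'id' query parameter out of a drive.google.com/open?id=... link
                    return parse_qs(urlparse(share_link).query)["id"][0]

                file_id = get_file_id("https://drive.google.com/open?id=YOUR_FILE_ID")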

Step 5: Transfer contents

Download to Colab from Google Drive:

Execute the following commands. Here, YOUR_FILE_ID was obtained in the previous step, and DOWNLOAD.tar is the name (or path) you want to save the file as.

                download = drive.CreateFile({'id': 'YOUR_FILE_ID'})
                download.GetContentFile('DOWNLOAD.tar')

Upload to Google Drive from Colab:

Execute the following commands. Here, FILE_ON_COLAB.txt is the name (or path) of the file on Colab, and DRIVE.txt is the name (or path) you want to save the file as (on Google Drive).

                upload = drive.CreateFile({'title': 'DRIVE.txt'})
                upload.SetContentFile('FILE_ON_COLAB.txt')
                upload.Upload()
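PyDrive uploads one file per CreateFile call, so transferring several files is just a loop. A sketch assuming a hypothetical results/ folder on Colab:

                import os

                # Upload every file in results/ to Google Drive, keeping the same names
                for name in os.listdir("results"):
                    upload = drive.CreateFile({'title': name})
                    upload.SetContentFile(os.path.join("results", name))
                    upload.Upload()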

Transferring Smaller Files

Occasionally, you may want to transfer just a single csv file without going through this entire hassle. No worries, there are much simpler methods for that.

1. Google Colab files module

Google Colab has a built-in files module, with which you can upload or download files. You can import it by executing the following:

                from google.colab import files              

To Upload:

Use the following command to upload files to Google Colab:

                files.upload()              

You will be presented with a GUI in which you can select the files you want to upload. This method is not recommended for large files; it is very slow.
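files.upload() also returns a dictionary that maps each uploaded filename to its contents as bytes, so you can inspect or parse the data right away. A minimal sketch:

                from google.colab import files

                uploaded = files.upload()  # opens the file-picker GUI
                for name, data in uploaded.items():
                    print(f"Received {name}: {len(data)} bytes")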

To Download:

Use the following command to download a file from Google Colab:

                files.download('example.txt')              

This feature works best in Google Chrome. In my experience, it only worked once on Firefox, out of about 10 tries.

2. GitHub

This is a "hack-ish" way to transfer files. You can create a GitHub repository with the small files that you want to transfer.

Once you create the repository, you can just clone it in Google Colab. You can then push your changes to the remote repository and pull the updates onto your local system.

But do note that GitHub has a hard limit of 25MB per file, and a soft limit of 1GB per repository.

Thank you for reading this article! Leave some claps if you found it interesting! If you have any questions, you can hit me up on social media or send me an email (bharathrajn98[at]gmail[dot]com).


Source: https://www.freecodecamp.org/news/how-to-transfer-large-files-to-google-colab-and-remote-jupyter-notebooks-26ca252892fa/
