H-Barrio

Locating the Notebook File in Google Colab

To continue our dreaming machine demonstration from this previous post, we need to handle many modules, folder structures, and individual files. Google Colab is a good tool for prototyping and early development when the goal is to run code quickly, and a horrible tool when the goal is to handle files quickly. The main problem is usually keeping all of our relative paths intact as we move demonstration notebooks from location to location, carrying a large number of associated folders with them. Unfortunately, there seems to be no surefire way of locating the directory of the notebook file we are working on, so we developed a simple function.


First, import all needed tools to handle files and the Google Drive file system:

import requests
import urllib.parse
import google.colab
import os

google.colab.drive.mount('/content/drive')

We use requests to get information on the current notebook session, urllib.parse to decode the notebook name returned by that request, and google.colab and os for general Colab and operating system work.
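
To make the decoding step concrete, here is a minimal sketch of what urllib.parse.unquote does to a percent-encoded notebook name. The payload is made up for illustration: its shape (a list of sessions, each with a 'name' entry) is assumed from how our function reads the response, and notebook_name is a hypothetical helper, not part of any library.

```python
import json
import urllib.parse

# Hypothetical payload, shaped like the session response our function expects:
# a list of sessions, each carrying a percent-encoded 'name' entry.
sample_response = json.loads('[{"name": "Dreaming%20Machine%20Demo.ipynb"}]')


def notebook_name(sessions):
    """Return the decoded notebook name from the first session entry."""
    return urllib.parse.unquote(sessions[0]['name'])


decoded = notebook_name(sample_response)
print(decoded)  # Dreaming Machine Demo.ipynb
```

Without the unquote step, the name would keep its %20 sequences and never match the file name stored on disk.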


Our function, locate_nb, obtains the name of the file we are currently working with from a request to the session address; the name is stored under the 'name' entry, and we have to unquote it from the URL-encoded response. Armed with this name, we walk across the complete file system, the COMPLETE file system, in a very inefficient and brutish manner, collecting every file whose name matches the notebook's. If we do not find the notebook's name anywhere (bad luck, and strange, as it should be there somewhere), we get the "not found" message. If we find a single location, we change the working directory to it; if we find multiple candidates, we return their directories as a list for the user to check later:

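def locate_nb(set_singular=True):
    """Return the candidate directories that contain the current notebook.

    If exactly one location is found and set_singular is True, the working
    directory is changed to that location.
    """
    found_files = []
    # Search roots: '/' means the whole file system (slow, but thorough).
    paths = ['/']
    # Session address used in this post; it may differ in other Colab setups.
    nb_address = 'http://172.28.0.2:9000/api/sessions'
    response = requests.get(nb_address).json()
    # The notebook name arrives URL-encoded, so decode it before comparing.
    name = urllib.parse.unquote(response[0]['name'])

    dir_candidates = []

    # Brute-force walk collecting every file whose name matches the notebook.
    for path in paths:
        for dirpath, subdirs, files in os.walk(path):
            for file in files:
                if file == name:
                    found_files.append(os.path.join(dirpath, file))

    # Remove duplicate paths.
    found_files = list(set(found_files))

    if len(found_files) == 1:
        nb_dir = os.path.dirname(found_files[0])
        dir_candidates.append(nb_dir)
        if set_singular:
            print('Singular location found, setting directory:', nb_dir)
            os.chdir(nb_dir)
    elif not found_files:
        print('Notebook file name not found.')
    else:
        print('Multiple matches found, returning list of possible locations.')
        dir_candidates = [os.path.dirname(f) for f in found_files]

    return dir_candidates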
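
If walking the whole file system turns out to be too slow, one possible variation is to limit the search roots and prune directories we know are irrelevant. The sketch below is not the function above, only an illustration of the idea: find_notebook_dirs, its parameters, and the default skip list are our own hypothetical choices, and the default root assumes the notebook lives somewhere under /content, where we mounted the drive.

```python
import os


def find_notebook_dirs(name, roots=('/content',), skip_dirs=('proc', 'sys', 'dev')):
    """Return the sorted, unique directories under roots holding a file called name.

    Any directory whose base name appears in skip_dirs is pruned, at any depth.
    """
    matches = set()
    for root in roots:
        for dirpath, subdirs, files in os.walk(root):
            # Edit the list in place so os.walk never descends into skipped folders.
            subdirs[:] = [d for d in subdirs if d not in skip_dirs]
            if name in files:
                matches.add(dirpath)
    return sorted(matches)
```

Calling find_notebook_dirs('my_notebook.ipynb') returns a list with zero, one, or several directories, which can be handled in the same three ways as in locate_nb.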

Now we can store the working directory in a variable and build our file paths from that absolute path, which should reduce the chaos in our prototype demonstration on Colab and make it more portable. Of course, it will not work for every possible Colab access mode, but at least the most common one is covered.
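
As a rough sketch of that last step, the list returned by locate_nb can be turned into absolute paths for the rest of the project. The helper resolve_from_notebook and the folder and file names below are hypothetical, chosen only for illustration; in Colab the list would come from locate_nb() instead of being written by hand.

```python
import os


def resolve_from_notebook(dir_candidates, *parts):
    """Join parts onto the notebook directory when exactly one candidate exists."""
    if len(dir_candidates) != 1:
        raise ValueError(
            f'Expected one notebook directory, got {len(dir_candidates)}')
    return os.path.join(dir_candidates[0], *parts)


# Made-up location standing in for the result of locate_nb().
nb_dirs = ['/content/drive/MyDrive/dreams']
model_path = resolve_from_notebook(nb_dirs, 'models', 'weights.pt')
print(model_path)
```

Raising an error when there are zero or several candidates keeps an ambiguous location from silently producing wrong paths.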


Try it in this notebook.


Do not hesitate to contact us if you require quantitative model development, deployment, verification, or validation. We will also be glad to help you with your machine learning or artificial intelligence challenges when applied to asset management, automation, or intelligence gathering from satellite, drone, or fixed-point imagery, and even dreams.


This is the dreaming image for "google file handling" text:


