It looks like FloydHub no longer provides 100 hours of free usage. Free-tier users now get credits that cover only 2 hours of GPU instance time (each instance has a maximum run time of 1 hour).
Dogs-vs-cats Redux on floydhub here.
Feel free to leave a comment. I’m looking forward to your feedback.
1. Sign up
2. Install floyd-cli (using anaconda)
$ pip install -U floyd-cli
3. Log in to floyd
$ floyd login
Copy and paste the token from the dashboard (shown in the orange box).
4. Clone code from fast.ai’s github to local machine
$ git clone https://github.com/fastai/courses.git
or simply download and unzip its zip file to local machine.
5. Start your first project
$ cd courses/deeplearning/nbs
$ floyd init YourFirstProjectName
YourFirstProjectName is the project name associated with the current directory and also the name displayed in the dashboard. You can change it at any time by running $ floyd init again.
Update: How to download data in floydhub
6. Run Jupyter notebook on floydhub
$ floyd run --mode jupyter --env theano:py2 --gpu
The command above creates a GPU instance (--gpu) that runs a Jupyter notebook (--mode jupyter) with theano (keras)/python2 pre-installed (--env theano:py2). Floyd provides almost every mainstream framework, from tensorflow, theano, and chainer (from Japan) to the surging PyTorch.
The Jupyter notebook URL will show up after all files in the project directory are uploaded (synchronized). The Jupyter home page then shows every file in your project folder.
Errors/Problems and Solutions
In vgg = Vgg16()
Exception: The shape of the input to “Flatten” is not fully defined (got (0, 7, 512). Make sure to pass a complete “input_shape” or “batch_input_shape” argument to the first layer in your model.
import keras.backend as K
# Dynamically switch keras to theano-style dimension ordering
# https://www.nodalpoint.com/switch-keras-backend/
K.set_image_dim_ordering('th')
# For keras 2: K.set_image_data_format('channels_first')
If the code above doesn’t solve the error, try to restart the notebook and run again.
BTW, I only encountered this when using vgg16 and vgg16bn. The exception never showed up when using resnet50.
Reminder: In keras 2.0, image_dim_ordering has been renamed to image_data_format, so use set_image_data_format('channels_first') instead to get theano-style ordering.
val_feat = model.predict_generator(val_batches, val_batches.nb_sample)
ValueError: Error when checking : expected input_2 to have shape (None, 3, 224, 224) but got array with shape (64, 224, 224, 3)
Missing argument gen=image.ImageDataGenerator() in get_batches() or flow_from_directory()
trn_feat = model.predict_generator(batches, batches.nb_sample*3)
ERROR (theano.gof.cmodule): [Errno 12] Cannot allocate memory
Sol. 1: Use batches instead of get_data(). Source: “You may want to avoid using get_data() and instead use batches, to avoid using up memory.”
Sol. 2: Fewer epochs of augmented data (batches.nb_sample*3 -> batches.nb_sample*2).
The kernel appears to have died. (restarting kernel)
It happened frequently when pre-computing features, i.e., running model.predict(). This is because GPU memory has been depleted. My solution is to save any pre-computed data with bcolz via save_array(). If the kernel does die and restart, I call load_array() to load the pre-computed data back into RAM.
If I remember correctly, for the dogs-vs-cats and fisheries competition data sets, pre-computing more than 3 epochs of training data (or 2 training + 1 test data) killed the kernel.
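The save-then-reload pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: pickle stands in for the bcolz-based save_array()/load_array(), and load_or_compute, cache_path and compute_fn are hypothetical names, not part of the course code.

```python
import os
import pickle


def load_or_compute(cache_path, compute_fn):
    """Return cached data if cache_path exists, else compute and cache it.

    Hypothetical helper: pickle is a stand-in for save_array()/load_array().
    """
    if os.path.exists(cache_path):
        # An earlier run already saved the result, so skip the expensive step.
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    data = compute_fn()  # e.g. the expensive model.predict(...) call
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

After a kernel restart, calling load_or_compute() with the same path returns the saved features without running the prediction again.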
In from utils import plots
ImportError: No module named bcolz
Sol 1: create a floyd_requirements.txt file in the project folder so that the instance launches with the required packages installed
Sol 2: use Jupyter magics to run a shell command that installs bcolz
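As a rough Python counterpart to Sol 2, the sketch below checks whether a module can be imported and, only if it cannot, calls pip through the current interpreter. ensure_package and is_importable are hypothetical helper names introduced here for illustration; in a notebook cell a shell command such as pip install would achieve the same thing.

```python
import subprocess
import sys


def is_importable(name):
    """Return True if `import name` succeeds in this interpreter."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


def ensure_package(name, install=True):
    """Return True if `name` is importable, optionally installing it first."""
    if is_importable(name):
        return True
    if not install:
        return False
    # Run pip with the same Python that runs the notebook kernel.
    subprocess.check_call([sys.executable, "-m", "pip", "install", name])
    return is_importable(name)
```

Calling ensure_package("bcolz") before from utils import plots would be one way to use it.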
Exception: URL fetch failure on http://www.platform.ai/models/vgg16_bn.h5: None -- [Errno -2] Name or service not known
Have a look at this thread on the fast.ai forum.
Also, YL Guo’s github repo gives instructions for starting the fast.ai lessons on floyd, although I suggest using the terminal in Jupyter instead of the floyd data upload command for data manipulation.
runnable with sample data. Results and code achieving top 10% public LB score will be updated in this blog post.
[Note on vgg16.py]
From keras’ source code, flow_from_directory() returns a DirectoryIterator().
In class DirectoryIterator(), we have
self.nb_class = len(classes)
self.class_indices = dict(zip(classes, range(len(classes))))
I’m not sure what exactly ‘classes’ is, but I suppose it contains the names of all sub-directories.
Anyway, this explains the code in vgg16.py, where
def finetune(self, batches):
    classes = list(iter(batches.class_indices))
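To make the mapping concrete, here is a small sketch using hypothetical sub-directory names ("cats" and "dogs" are assumptions for illustration). Iterating over a dict yields its keys, and since key order is not guaranteed in Python 2, sorting the names by their index is one safe way to get them back in index order. classes_in_index_order is a hypothetical helper, not a function from vgg16.py.

```python
# Hypothetical sub-directory names, e.g. train/cats and train/dogs.
classes = ["cats", "dogs"]

# Same expression as in DirectoryIterator: class name -> integer index.
class_indices = dict(zip(classes, range(len(classes))))


def classes_in_index_order(class_indices):
    """Return the class names as a list ordered by their integer index."""
    return sorted(class_indices, key=class_indices.get)


ordered = classes_in_index_order(class_indices)
```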
2. Example of addition in tuple
(3, ) + target_size # (3, 256, 256)
target_size + (3,) # (256, 256, 3)
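The two tuple additions above can be wrapped in a small function that picks the layout from the dimension ordering. This is a minimal sketch: image_shape is a hypothetical name, and the 'th'/'tf' strings follow the keras 1 convention used elsewhere in this post.

```python
def image_shape(target_size, dim_ordering, channels=3):
    """Build the per-image shape tuple for the given dimension ordering."""
    if dim_ordering == "th":
        # Channels first, as theano expects.
        return (channels,) + target_size
    if dim_ordering == "tf":
        # Channels last, as tensorflow expects.
        return target_size + (channels,)
    raise ValueError("dim_ordering must be 'th' or 'tf'")


th_shape = image_shape((256, 256), "th")
tf_shape = image_shape((256, 256), "tf")
```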
3. Dim. ordering
“in Tensorflow, the arg. of image shape should be ordering (width, height, channels)
on the contrary, in Theano it should be (channels, width, height)”
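When the data itself is in the wrong ordering, as in the ValueError earlier in this post, the axes have to be moved, not just the shape tuple. The sketch below does this on plain nested lists so it needs no extra packages; hwc_to_chw is a hypothetical helper name. With NumPy arrays the same reordering is usually done with a transpose over the axes.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from channels-last to channels-first.

    image[r][c][k] becomes result[k][r][c].
    """
    rows = len(image)
    cols = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[r][c][k] for c in range(cols)] for r in range(rows)]
        for k in range(channels)
    ]


# A tiny 2x2 image with 3 channels per pixel.
sample = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]]]
converted = hwc_to_chw(sample)
```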
Download catsdogs.zip in Jupyter
I tried to upload dogscats.zip, which I downloaded from fast.ai, by running the commands
floyd data init catsdogs.zipped
floyd data upload
but this resulted in a memory error.
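One way to fetch the archive from inside the notebook instead is sketched below. It is an assumption-laden illustration: download_and_unzip is a hypothetical helper, the URL is left as a parameter because it is not given here, and the destination directory is whatever writable folder you choose.

```python
import os
import zipfile

try:  # Python 3
    from urllib.request import urlretrieve
except ImportError:  # Python 2, as in the theano:py2 environment
    from urllib import urlretrieve


def download_and_unzip(url, dest_dir, zip_name="data.zip"):
    """Download a zip archive from url into dest_dir and extract it there."""
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    zip_path = os.path.join(dest_dir, zip_name)
    urlretrieve(url, zip_path)  # runs on the instance, not the local machine
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
    return zip_path
```

Because the download happens on the instance itself, nothing large has to be uploaded from the local machine.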
Setup directories for catsdogs redux in Jupyter
# copy the cats-dogs public dataset to your own output dir.
%cp cat.*.jpg /output/data/cats/
%cp dog.*.jpg /output/data/dogs/
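The same directory setup can be sketched in Python, which also creates the class folders if they are missing. This assumes Kaggle-style file names such as cat.0.jpg and dog.0.jpg; copy_by_prefix is a hypothetical helper name.

```python
import glob
import os
import shutil


def copy_by_prefix(src_dir, dst_dir, prefixes=("cat", "dog")):
    """Copy <prefix>.*.jpg from src_dir into dst_dir/<prefix>s/.

    Returns a dict mapping each prefix to the number of files copied.
    """
    copied = {}
    for prefix in prefixes:
        target = os.path.join(dst_dir, prefix + "s")  # e.g. .../cats
        if not os.path.isdir(target):
            os.makedirs(target)
        files = sorted(glob.glob(os.path.join(src_dir, prefix + ".*.jpg")))
        for path in files:
            shutil.copy(path, target)
        copied[prefix] = len(files)
    return copied
```

For example, copy_by_prefix(".", "/output/data") would mirror the two %cp commands above when run from the folder that holds the images.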