Tag Archives: ml

Building a 6 7 image classifier with fast.ai

I recently took Fast.ai’s Practical Deep Learning for Coders (part 1) course. As practice, I used transfer learning to build an image classifier. Given an image of a handwritten digit, the model classifies it as a 6, a 7, or neither.

With transfer learning, I was able to fine-tune resnet18 against the MNIST database of handwritten digits. Using fast.ai’s library on Paperspace, it was easy to create DataLoaders, fine-tune the pretrained model, and evaluate the model’s loss. With Gradio, I deployed the model to Hugging Face Spaces so that running inference is easy.
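MNIST has ten digit classes, so the labels need to be collapsed into three for this task. Here is a minimal sketch of one way to do that. It assumes the images are stored in one folder per digit; the function name and the example paths are made up for illustration and are not taken from my notebook.

```python
from pathlib import Path


def label_six_seven(image_path):
    """Return '6', '7', or 'neither' for an image stored in a per-digit folder."""
    # Assumption: the parent folder name is the digit, e.g. .../training/6/101.png
    digit = Path(image_path).parent.name
    return digit if digit in ("6", "7") else "neither"


# Hypothetical paths, for illustration only
for p in ["training/6/101.png", "training/7/202.png", "training/3/303.png"]:
    print(p, "->", label_six_seven(p))
```

In fast.ai, a function like this could be passed as the get_y argument of a DataBlock, so that the resulting DataLoaders see three categories instead of ten.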

Testing out the Gradio app, I found the predictions to be not very reliable. That is OK, since I wanted practice with the end-to-end process, and the dataset can be expanded with more images of 6s and 7s for better fine-tuning.

With fast.ai, Paperspace, and Hugging Face, ML classification is accessible.

Intro to Computer Vision

I’m new to computer vision, and I find a lot of the basic concepts very interesting. As an iOS developer, my interest comes from using Core ML and Apple’s Vision framework in apps to improve the user experience.

Two common tasks are classification and object detection. Classification identifies the dominant objects present in an image. For example, classification can tell you that a photo is probably of a car.

Object detection is much more difficult, since it not only recognizes which objects are present but also locates them in the image. This means that object detection can tell you that there is probably a car within these bounds of the image.
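The difference shows up in the shape of the result. Here is a rough sketch using made-up types and values. These are not the Core ML or Vision APIs, just an illustration of what each task returns.

```python
from dataclasses import dataclass


@dataclass
class Classification:
    label: str         # e.g. "car"
    confidence: float  # 0.0 to 1.0


@dataclass
class BoundingBox:
    x: float       # left edge, in pixels
    y: float       # top edge, in pixels
    width: float
    height: float


@dataclass
class Detection:
    label: str
    confidence: float
    box: BoundingBox  # where in the image the object was found


# Made-up values for a single photo
whole_image = Classification(label="car", confidence=0.92)
located = Detection(label="car", confidence=0.88,
                    box=BoundingBox(x=40, y=120, width=300, height=180))

print(f"Classification: probably a {whole_image.label}")
print(f"Detection: probably a {located.label} at "
      f"({located.box.x}, {located.box.y}), "
      f"{located.box.width}x{located.box.height}")
```

A classifier returns one label for the whole image, while a detector returns a label plus a bounding box for each object it finds.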

What’s important is that the machine learning model runs in an acceptable amount of time, either asynchronously in the background or in real time. Apple provides a listing of sample models for classification at https://developer.apple.com/machine-learning/.

For real-time object detection, TinyYOLO is an option, even if its frame rate is not yet near 60 fps. Heavier detection models such as the full YOLO or R-CNN are not going to provide a sufficient experience on mobile devices today.
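To put the 60 fps figure in perspective, here is a quick back-of-the-envelope calculation. The 100 ms latency below is a made-up number for illustration, not a measurement of TinyYOLO or any other model.

```python
def frame_budget_ms(target_fps):
    """Milliseconds available per frame at a given frame rate."""
    return 1000.0 / target_fps


def max_fps(inference_ms):
    """Upper bound on frame rate if each frame needs one inference pass."""
    return 1000.0 / inference_ms


budget = frame_budget_ms(60)
print(f"Budget at 60 fps: {budget:.1f} ms per frame")

# Hypothetical latency, not a measurement of any real model
latency_ms = 100.0
print(f"At {latency_ms:.0f} ms per inference: at most {max_fps(latency_ms):.0f} fps")
```

In other words, a model has to finish each inference pass in under about 17 ms to keep up with a 60 fps camera feed, which is why smaller models matter so much on mobile.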

One other interesting thing I came across is the PASCAL Visual Object Classes (VOC) challenge. Its object classes are commonly used for benchmarking object classification.

For 2012, the twenty object classes selected were:

  • Person: person
  • Animal: bird, cat, cow, dog, horse, sheep
  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

These are common objects used to train classification models.

Computer vision combined with machine learning has a tremendous amount of potential. Whether used with AR or in other use cases, it can provide a compelling user experience beyond Not Hotdog.

Free Online CS Courses Taught by Stanford Faculty Q1 2012

I’m currently taking the free Intro to Databases course taught by the excellent Professor Widom. The course consists of video lectures with homework exercises, quizzes and midterms. There is also a helpful Q&A forum for students when you get stuck.

In the beginning of 2012, there are more free online courses available, including:

The classes are high quality and perfect for you if you can spend more than a few hours a week per course. As they are free, the courses don’t offer any official Stanford certificates, grades, or credit. You do get a Statement of Accomplishment from the instructor.

I’d recommend these courses to anyone. They are free, and any investment of your time will be easily rewarded with an understanding of increasingly relevant topics.