An Intro to Machine Learning in iOS with Swift and Playgrounds







So you’ve heard about machine learning and Apple’s Core ML framework and want to give it a whirl. If your first thought is that it’s too complicated and you don’t know where to begin, don’t worry. It’s not, and I’ll walk you through exactly how to get started.

But wait, why do we even need machine learning, and how can it help us? Machine learning lets a computer learn patterns from large datasets by applying complex mathematical calculations over and over, instead of relying on hand-written rules.

Apple has made the entry point to machine learning quite accessible and wrapped it up in an all-in-one package. Swift, Xcode, and Playgrounds are all the tools you need to train a model and then integrate it into a project. I’m going to assume you’ve downloaded Xcode, so let’s jump right in. (If not, you can download it here.)

A note before we begin: this tutorial uses Playgrounds to help you understand the code behind training models. The Create ML app is a great alternative for training models with no machine learning experience, and it lets you follow the model creation workflow in real time.

One of the first things you’re going to need before opening Xcode is a dataset. What’s a dataset, you say? It’s simple: a dataset is a collection of data. Movie reviews, locations of dog parks, or images of flowers are all examples of datasets. For our purposes, we’re going to use a file named appStore_description.csv to train our model. There are a handful of places to find datasets, but we’re going to use one from kaggle.com that lists App Store app descriptions and app names. We’ll use this text to predict an app for the user based on their text input. Our model will be a text classifier (MLTextClassifier), which learns to associate labels with features of the input text. The input can be a sentence, a paragraph, or a whole document.

  • Download the dataset here.
  • If your dataset has a column named track_name, open the CSV file and rename that column to app_name so it’s consistent with this example.
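
If you would rather not edit the file by hand, the rename can also be scripted. The sketch below is one possible approach, assuming the header is the first line of the file and contains no quoted fields with embedded commas. `renameHeaderColumn` is a hypothetical helper name, not part of any Apple framework, and the sample CSV text is made up for illustration.

```swift
import Foundation

/// Hypothetical helper: renames one column in the header (first line) of CSV text.
/// Assumes the header row has no quoted fields containing commas or newlines.
func renameHeaderColumn(in csv: String, from oldName: String, to newName: String) -> String {
    // Find the end of the first line; text with no newline is treated as header-only.
    let headerEnd = csv.firstIndex(where: { $0.isNewline }) ?? csv.endIndex
    let header = csv[..<headerEnd]
    let rest = csv[headerEnd...]

    // Rename only exact column-name matches and leave every other column alone.
    let columns = header
        .split(separator: ",", omittingEmptySubsequences: false)
        .map { String($0) == oldName ? newName : String($0) }

    // Reattach the untouched data rows.
    return columns.joined(separator: ",") + String(rest)
}

// Illustrative sample data, not the real dataset.
let sampleCSV = "id,track_name,app_desc\n1,Maps,Find your way around\n"
let renamedCSV = renameHeaderColumn(in: sampleCSV, from: "track_name", to: "app_name")
print(renamedCSV)
```

To apply this to the real file, you could read it with `String(contentsOf:encoding:)` and save the result with `write(to:atomically:encoding:)`.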

Now that you have your dataset you can open Xcode 🙌.

  1. First, create a new Playground using the macOS template and choose Blank. We use macOS because the CreateML framework is not available on iOS.
  2. Delete all the code in the Playground and import Foundation and CreateML.
  3. Add the dataset to the Playground’s Resources folder.

Here’s what your Playground will look like and what’s going on inside it.

import Foundation
import CreateML

//: Create a URL path to your dataset and load it into a new MLDataTable
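//: Bundle.main.url returns an optional, so unwrap it before loading the table
guard let filePath = Bundle.main.url(forResource: "appStore_description", withExtension: "csv") else {
    fatalError("appStore_description.csv was not found in the Playground's Resources folder")
}
let data = try MLDataTable(contentsOf: filePath)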
//: Create two mutually exclusive, randomly divided subsets of the data table
//: The trainingData will hold the larger portion of rows
let (trainingData, testData) = data.randomSplit(by: 0.8)
//: Create your TextClassifier model using the trainingData
//: This is where the `training` happens and will take a few minutes
let model = try MLTextClassifier(trainingData: trainingData, textColumn: "app_desc", labelColumn: "app_name")
//: Test the performance of the model before saving it. See an example of the error report below
let metrics = model.evaluation(on: testData, textColumn: "app_desc", labelColumn: "app_name")
print(metrics.classificationError)

//: Save the trained model. Replace this path with a location on your own machine
let modelPath = URL(fileURLWithPath: "/Users/joshuawalsh/Desktop/AppReviewClassifier.mlmodel")
try model.write(to: modelPath)

Once you have this in your Playground, run it manually to train and save the model. This may take a few minutes. Note that our original CSV dataset is 11.5 MB, while our training and test models are both 1.3 MB. These are relatively small files, but you can see that training the model drastically reduces the file size 👍.
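
To make the 0.8 passed to randomSplit(by:) more concrete, here is a rough plain-Swift sketch of the idea: shuffle the rows, then cut the list at 80 percent. This is not how Create ML implements the split, and its exact split sizes are not guaranteed to match this sketch. `splitRows` and the sample rows are hypothetical.

```swift
/// Hypothetical helper illustrating an 80/20 style split; not Create ML's implementation.
func splitRows<Row>(_ rows: [Row], trainingFraction: Double) -> (training: [Row], test: [Row]) {
    precondition((0.0...1.0).contains(trainingFraction), "fraction must be between 0 and 1")
    // Shuffle first so the subsets are random rather than ordered by file position.
    let shuffled = rows.shuffled()
    // The first `cut` rows become training data; the remainder is held out for testing.
    let cut = Int((Double(shuffled.count) * trainingFraction).rounded())
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

// Ten stand-in rows; real rows would be (description, app name) pairs.
let sampleRows = Array(1...10)
let (trainingRows, testRows) = splitRows(sampleRows, trainingFraction: 0.8)
print(trainingRows.count, testRows.count) // prints "8 2"
```

Keeping the test rows out of training is what makes the evaluation step meaningful: the model is scored on descriptions it has never seen.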

Example Error Report

Printing the metrics is optional when creating your model, but it’s good practice to do this before saving. You’ll get output that looks something like this:

Columns:
    actual_count    integer
    class    string
    missed_predicting_this    integer
    precision    float
    predicted_correctly    integer
    predicted_this_incorrectly    integer
    recall    float
Rows: 1569
Data:
+----------------+----------------+------------------------+----------------+---------------------+
| actual_count   | class          | missed_predicting_this | precision      | predicted_correctly |
+----------------+----------------+------------------------+----------------+---------------------+
| 1              | "HOOK"         | 1                      | nan            | 0                   |
| 1              | ( OFFTIME ) ...| 0                      | nan            | 1                   |
| 1              | *Solitaire*    | 0                      | nan            | 1                   |
| 1              | 1+2=3          | 0                      | nan            | 1                   |
| 1              | 10 Pin Shuff...| 0                      | nan            | 1                   |
| 1              | 10 – �..       | 0                      | nan            | 1                   |
| 1              | 100 Balls      | 0                      | nan            | 1                   |
| 1              | 1010!          | 0                      | nan            | 1                   |
| 1              | 12 Minute At...| 0                      | nan            | 1                   |
| 1              | 20 Minutes.f...| 0                      | nan            | 1                   |
+----------------+----------------+------------------------+----------------+---------------------+
+----------------------------+----------------+
| predicted_this_incorrectly | recall         |
+----------------------------+----------------+
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
| 0                          | 0              |
+----------------------------+----------------+
[1569 rows x 7 columns]
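
For reference, classificationError is the fraction of test rows the model labeled incorrectly. The sketch below computes that value, along with per-class precision and recall under their standard definitions, from plain arrays of actual and predicted labels. The function names and sample labels are hypothetical, and Create ML may compute or report these columns differently from this sketch.

```swift
/// Fraction of predictions that do not match the actual label.
func classificationError(actual: [String], predicted: [String]) -> Double {
    precondition(actual.count == predicted.count && !actual.isEmpty)
    let wrong = zip(actual, predicted).filter { $0.0 != $0.1 }.count
    return Double(wrong) / Double(actual.count)
}

/// Standard precision and recall for one class.
/// Either value is NaN when its denominator is zero.
func precisionRecall(for label: String,
                     actual: [String],
                     predicted: [String]) -> (precision: Double, recall: Double) {
    let pairs = Array(zip(actual, predicted))
    // Rows where the model predicted this label and was right.
    let correct = pairs.filter { $0.0 == label && $0.1 == label }.count
    // Every row where the model predicted this label.
    let predictedCount = pairs.filter { $0.1 == label }.count
    // Every row whose true label is this label.
    let actualCount = pairs.filter { $0.0 == label }.count
    return (Double(correct) / Double(predictedCount),
            Double(correct) / Double(actualCount))
}

// Made-up labels for illustration only.
let actualLabels = ["Maps", "Maps", "Notes", "Clock"]
let predictedLabels = ["Maps", "Notes", "Notes", "Maps"]
print(classificationError(actual: actualLabels, predicted: predictedLabels)) // 0.5
let mapsScore = precisionRecall(for: "Maps", actual: actualLabels, predicted: predictedLabels)
print(mapsScore.precision, mapsScore.recall) // 0.5 0.5
```

With only one example per class, as in the report above, per-class numbers are very coarse, so the overall classification error is usually the more useful figure.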

Now that you have your model trained and saved somewhere on your computer, you can create a new iOS project.

  1. Make it a Single View App that uses Storyboards, and call it AppPredictor.
  2. Find where you saved your model and drag that file into your project in Xcode.
  3. In ViewController.swift, import UIKit, NaturalLanguage, and CoreML.
import UIKit
import NaturalLanguage
import CoreML

For simplicity’s sake, your UI will have three elements: a text field, a label, and a button. We’re going for functionality here, but feel free to update the design however you see fit. Add the text field, label, and button to your view controller in the storyboard. Connect the text field and label as IBOutlets, and connect the button to an IBAction.

@IBOutlet weak var textField: UITextField!
@IBOutlet weak var appNameLabel: UILabel!

@IBAction func predictApp(_ sender: Any) {

}
  4. Now add a reference to your classifier like so:
private lazy var reviewClassifier: NLModel? = {
    // Create a custom model trained to classify or tag natural language text.
    // NL stands for Natural Language
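    // The generated class's init(configuration:) can throw; try? turns any failure into nil
    let model = try? NLModel(mlModel: AppReviewClassifier(configuration: MLModelConfiguration()).model)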
    return model
}()
  5. Create a function that takes the user’s input string and returns the predicted app name as an optional string.
private func predict(_ text: String) -> String? {
    reviewClassifier?.predictedLabel(for: text)
}
  6. Back in predictApp, call the predict function, passing the text field’s text as the argument.
appNameLabel.text = predict(textField.text ?? "")
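
Because predict(_:) returns an optional, the label can end up blank when the model fails to load or the text field is empty. One way to handle that is sketched below. `labelText(for:using:)`, the fallback strings, and `fakeClassifier` are all hypothetical; the classifier is passed in as a closure so the logic can be tried without a trained model.

```swift
import Foundation

/// Hypothetical helper: turns raw text field input into something safe to show in a label.
func labelText(for input: String?, using classify: (String) -> String?) -> String {
    // Treat nil and whitespace-only input the same way.
    let text = (input ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
    guard !text.isEmpty else { return "Enter a description first" }
    // Fall back to a message when the classifier has no answer (for example, a model that failed to load).
    return classify(text) ?? "No prediction available"
}

// Stand-in classifier for illustration only; the real app would pass its predict function.
let fakeClassifier: (String) -> String? = { $0.contains("weather") ? "Weather" : nil }
print(labelText(for: "  weather radar  ", using: fakeClassifier)) // prints "Weather"
print(labelText(for: "   ", using: fakeClassifier))               // prints "Enter a description first"
```

Inside predictApp, the call would then look something like `appNameLabel.text = labelText(for: textField.text, using: predict)`.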

Build and run your app and let’s see what you get. Describe an app that you can’t quite remember the name of, but you know what it does. I described two similar types of apps but got different results 👍.
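
When similar descriptions give different answers, it can help to look at more than the single top label. On recent OS versions, NLModel offers predictedLabelHypotheses(for:maximumCount:), which returns a dictionary of labels and probabilities; check its availability for your deployment target. The sketch below only formats such a dictionary for display. `topCandidates` and the sample scores are hypothetical.

```swift
import Foundation

/// Hypothetical helper: formats the highest-scoring labels as "Name: NN%" strings.
func topCandidates(_ hypotheses: [String: Double], limit: Int) -> [String] {
    hypotheses
        // Highest probability first; ties are broken alphabetically for stable output.
        .sorted { $0.value != $1.value ? $0.value > $1.value : $0.key < $1.key }
        .prefix(max(0, limit))
        .map { "\($0.key): \(Int(($0.value * 100).rounded()))%" }
}

// Made-up scores standing in for a real model's output.
let sampleScores = ["Maps": 0.62, "Compass": 0.25, "Weather": 0.13]
print(topCandidates(sampleScores, limit: 2)) // ["Maps: 62%", "Compass: 25%"]
```

Showing two or three candidates also makes it easier to judge whether the model is confident or just guessing.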

Machine learning isn’t all that scary or complicated once you break it down into digestible chunks. Finding the right dataset can often be the biggest hurdle. Now that you have the basics down, you should explore some of the other classifier types and train different models.


