Machine Learning Tasks

Once you’ve received Pod access, you are ready to run model tasks. If you’d rather learn how to run SQL queries, see SQL Tasks.

Bitfount refers to training or executing models as performing tasks, while protocols, models, and algorithms are all task elements that need to be specified as part of a task. For more details on Bitfount’s definitions of these elements, see Bitfount Glossary. All task execution is tracked in a Pod’s activity history.

Before you run tasks, it’s always a good idea to determine:

  1. Whether any Pod policy restrictions limit the tasks you can perform against a dataset in a given Pod, based on the role you’ve been assigned.
  2. Whether the Pod is online. An online Pod shows a green icon in its card on the “My Pods” page. If the Pod is offline, the Pod owner will need to bring it back online for you.
  3. The structure of the dataset upon which you are acting.

Modelling with the Bitfount Python API

The standard approach to model training using Bitfount is the Python API. We recommend using a notebook tool to train models; the Training Models tutorials provide various examples using Jupyter. The sections below give detailed instructions for using Bitfount’s default protocols and algorithms and for using custom models.

Using Bitfount-Supported Task Elements

Tip: Example code for each of the steps below is included in the Querying and Training a Model Tutorial.

The simplest approach to running an ML task is to make use of pre-defined Bitfount Task Elements. To do this you will:

  1. Import relevant classes from bitfount for your modelling needs.

    • Relevant classes can be found in the Bitfount Task Elements guide and API Reference and will depend on the task you are planning to run.
    • For standard use cases, examples of relevant classes are covered in the tutorials.
  2. Set up the loggers. Loggers give you feedback on the progress of your task and details on its completion or failure.

  3. Define the model and data structure you will use to train. A list of currently supported options is given on the Models page.

  4. Train the model on the desired Pod(s): model.fit(pod_identifiers=[pod_identifier])

    • Note: If training on multiple Pods, ensure the data structures for the Pods are the same. Bitfount currently only supports horizontal federated learning.
    • model.fit automatically chooses the FederatedAveraging protocol. To use a different protocol, construct it explicitly and run it. The example below shows the explicit form using FederatedAveraging:
    protocol = FederatedAveraging(algorithm=FederatedModelTraining(model=model))
    protocol.run(pod_identifiers=[pod_identifier])
  5. (Optional) Serialise and save the model:

from pathlib import Path
model_out = Path("desired_model_path.pt")
model.serialize(model_out)
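
To make the horizontal federated learning note in step 4 more concrete, here is a minimal, library-free sketch of the general idea behind federated averaging: each Pod holds the same columns but different rows, trains locally, and the resulting parameters are combined into one global set. This illustrates the concept only and is not Bitfount's implementation. The function names, the Pod identifiers, the schema dictionaries, and the choice to weight by sample count are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def schemas_match(pod_schemas: Dict[str, Dict[str, str]]) -> bool:
    """Return True when every Pod exposes the same columns with the same types.

    Horizontal federated learning assumes identical data structures across Pods.
    """
    schemas = list(pod_schemas.values())
    return all(schema == schemas[0] for schema in schemas[1:])


def federated_average(
    pod_weights: Sequence[Sequence[float]],
    pod_sample_counts: Sequence[int],
) -> List[float]:
    """Average per-Pod parameter vectors, weighted by local sample count."""
    if len(pod_weights) != len(pod_sample_counts):
        raise ValueError("Need exactly one sample count per Pod.")
    total = sum(pod_sample_counts)
    if total <= 0:
        raise ValueError("Total sample count must be positive.")

    n_params = len(pod_weights[0])
    averaged = [0.0] * n_params
    for weights, count in zip(pod_weights, pod_sample_counts):
        if len(weights) != n_params:
            raise ValueError("All Pods must report the same number of parameters.")
        for i, w in enumerate(weights):
            averaged[i] += w * count / total
    return averaged


# Hypothetical Pod identifiers and schemas, for illustration only.
pod_schemas = {
    "user-a/census-pod": {"age": "int", "income": "float"},
    "user-b/census-pod": {"age": "int", "income": "float"},
}
if not schemas_match(pod_schemas):
    raise SystemExit("Pod data structures differ; align them before training.")

# Two Pods, two parameters each; the second Pod holds three times as many rows.
global_weights = federated_average(
    pod_weights=[[1.0, 2.0], [3.0, 4.0]],
    pod_sample_counts=[100, 300],
)
print(global_weights)
```

Weighting by sample count means a Pod with more rows pulls the global parameters further towards its local result. An unweighted mean is an equally valid variant when Pods should contribute equally.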

Using Custom Models

For cases when you wish to train a model that isn’t included natively with the Bitfount SDK, Bitfount supports custom models. For more details on how to use and manage custom models, please see the Custom Models guide.

Model Evaluation

Bitfount also supports remote evaluation of an existing pre-trained model through the evaluate method, without the need to return the final model output.

For detailed instructions, please see our Using Pre-Trained Models tutorial.

FAQs & Additional Relevant Tutorials

Ran into errors? Want to do something a bit more advanced?

You may wish to check out the Troubleshooting & FAQs page and explore more advanced model training capabilities via our additional tutorials.

Next Steps

You did it! For more detailed illustrations of the Bitfount product suite, feel free to peruse our tutorials.