No Code: Model Training using Kandula.ai


Today, let's walk through how rapid model experimentation can be achieved with kandula.ai workflows, in particular by building a car damage detection system.

Automated car damage detection is useful in many practical scenarios. For example, deployed at the entry and exit of car washing stations, it can monitor pre-existing dents, scratches, tears, glass damage, etc. Insurance companies can use such an AI system to enable faster, or even fully automated, insurance claims.

Now we'll show the process of building production-ready, deployable models from raw data without writing a single line of code. A simple end-to-end workflow pipeline in kandula includes the following steps.


Upload

Let's start by uploading raw images from a local machine to the Onestore app, a highly scalable data lake that also acts as centralized storage management.
Create a new folder and upload images directly, or upload a compressed zip of images, which can be extracted inside Onestore.
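Bundling images into a single zip before uploading saves many round trips. The helper below is a hypothetical local-side sketch using only Python's standard `zipfile` module; it is not part of kandula, just a convenient way to prepare the archive that Onestore will extract.

```python
import pathlib
import zipfile

def zip_images(src_dir: str, out_zip: str) -> int:
    """Bundle all images under src_dir into one zip for a single upload.

    Returns the number of image files added to the archive.
    """
    exts = {".jpg", ".jpeg", ".png"}
    count = 0
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in pathlib.Path(src_dir).rglob("*"):
            if p.suffix.lower() in exts:
                # store paths relative to src_dir so the archive
                # extracts cleanly into one folder
                zf.write(p, p.relative_to(src_dir))
                count += 1
    return count
```

Non-image files are silently skipped, so the archive stays limited to what the annotation step can actually consume.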


Creating New Folder in Onestore


Extracting uploaded zip in Onestore


Annotate

Once the upload is complete, let's proceed to the next step: annotating the images (we are skipping *organize*, which will be introduced in a separate blog). To get started with annotation, go to the Gallery app, where a dataset of the appropriate type has to be selected.

Kandula currently supports the following dataset types for image and video data.

Create a new dataset of the instance segmentation type to annotate damage on cars with a bounding box and a mask over the damaged area.


Next, import images from our Onestore data lake and create all the classes to be annotated for this dataset. For the sake of simplicity, let's combine all damage types into a single class, "damage".
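Collapsing fine-grained labels into one class is a common label-mapping step. A minimal sketch of that idea, with hypothetical damage-type names drawn from the scenarios mentioned earlier:

```python
# Fine-grained damage categories we choose not to distinguish
# (names are illustrative, not a fixed kandula taxonomy).
DAMAGE_TYPES = {"dent", "scratch", "tear", "glass_damage"}

def to_training_label(raw_label: str) -> str:
    """Collapse every fine-grained damage annotation into one 'damage' class."""
    return "damage" if raw_label in DAMAGE_TYPES else raw_label
```

Keeping the raw labels in the source annotations means the single-class decision can be reversed later without re-annotating.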


Importing Images from Onestore to Dataset


Creating all classes to be annotated

Now begin annotation by selecting an image. Several manual and AI-assisted tools are available to make the annotation process fluid. A comparison of manual vs. AI annotation over the same image is shown below.

Augment

Data augmentation is an important step in improving model performance: it adds more diverse instances, such as varied lighting, scale, and orientation, to the training data. The Swell tool in kandula offers 60+ image processing techniques, including advanced geometric transformations, for deriving new training data.

For example, to apply Horizontal Flip and affine (zoom, shear) transforms over the original dataset's master version, select a subset of images or the whole dataset, choose Swell, name the new dataset version, and pick the transformations to apply. The applied transformations and a resulting image are depicted below.
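Under the hood, these transforms are ordinary array operations. A pure-NumPy sketch of two of them, a horizontal flip and a crude center zoom, just to illustrate what Swell applies for you (this is not kandula's implementation):

```python
import numpy as np

def horizontal_flip(img: np.ndarray) -> np.ndarray:
    """Mirror the image along its width axis (works for H x W or H x W x C)."""
    return img[:, ::-1]

def zoom_center(img: np.ndarray, scale: float = 1.2) -> np.ndarray:
    """Crude center zoom: crop the central 1/scale region, then stretch it
    back to the original size with nearest-neighbour index mapping."""
    h, w = img.shape[:2]
    ch, cw = int(h / scale), int(w / scale)
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = img[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h   # map output rows back into the crop
    cols = np.arange(w) * cw // w   # map output cols back into the crop
    return crop[rows][:, cols]
```

Both functions preserve the input resolution, so the augmented images can be mixed into the same dataset version as the originals.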


Creating Swell dataset version

Original vs Augmented Image

Train

Now for the important part: model training. kandula provides an environment for rapid experimentation over multiple dataset versions with 50+ model architectures, ranging from standard CNNs to state-of-the-art transformer-based models, covering diverse trade-offs such as speed vs. accuracy and tiny vs. dense objects. Distributed training over large datasets and models is also supported, improving training speed and overcoming memory constraints.

In the ModelBuilder app, the workflow is similar to what we have seen in the Onestore and Gallery apps. First, a new experiment has to be created with a specific name. In the experiment playground, multiple versions of model training runs can then be created over different dataset versions, classes, model architectures, hyperparameters, etc.

Here, the annotated Car Damage dataset from above is attached and the augmented dataset version is chosen. All class labels are selected for training, and the dataset split ratio across train-val-test is set to 70%-20%-10%.
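The split itself is a simple shuffle-and-slice. A minimal stdlib sketch of a 70-20-10 split, with a fixed seed so runs are reproducible (this mirrors what the platform does for you, not its actual code):

```python
import random

def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=42):
    """Shuffle items deterministically and split into train/val/test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    # test gets the remainder, so no item is dropped by rounding
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Giving the test split the remainder (rather than `int(n * 0.1)`) guarantees the three parts always cover the whole dataset.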

For the model architecture, we pick Cascade Mask R-CNN with a ResNet-50 backbone and the default hyperparameters for batch size, epochs, learning rate, input size, etc., and finally begin the distributed training!
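Conceptually, a training run is defined by a small bundle of choices. The dictionary below is a hypothetical illustration of that bundle; the field names and the hyperparameter values are assumptions for this sketch, not kandula's actual configuration schema or defaults.

```python
# Hypothetical run configuration mirroring the choices made in the UI.
# All field names and numeric values here are illustrative assumptions.
run_config = {
    "experiment": "car-damage-v1",
    "dataset_version": "swell-augmented",
    "classes": ["damage"],
    "split": {"train": 0.7, "val": 0.2, "test": 0.1},
    "model": {"arch": "cascade_mask_rcnn", "backbone": "resnet50"},
    "hyperparams": {"batch_size": 8, "epochs": 50,
                    "lr": 0.02, "input_size": 1333},
}

# A quick sanity check any such config should pass: the splits cover
# the whole dataset exactly once.
assert abs(sum(run_config["split"].values()) - 1.0) < 1e-9
```

Keeping each run's configuration as one serializable object is what makes comparing runs in the experiment playground tractable.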

During training, all model metric logs and hyperparameters are monitored in TensorBoard at each iteration. Various plots such as the confusion matrix and P-R and ROC curves are also generated over the train-val-test splits, giving users insight into how training is progressing. Analyzing this data across multiple model runs empowers AI solution architects to develop and fine-tune model performance.
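To make the plotted metrics concrete, here is how a confusion matrix is tallied from true and predicted class indices; a minimal NumPy sketch of the standard computation, not the platform's code:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

Off-diagonal cells show exactly which classes the model confuses, which is why this plot is often the first thing to check when a run underperforms.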

Logs Monitoring over multiple runs

Evaluation Results over train-val-test splits

Infer and Publish

Once the model is trained, evaluation results over all dataset splits are displayed, and model predictions can be tested in the Infer section. After analyzing the P-R curves, users can select an optimal threshold score depending on the problem statement, or leave it at the default of 0.5 for model inference.
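One common way to pick that threshold from a P-R analysis is to sweep candidate cutoffs on the validation split and keep the one that maximizes F1. A self-contained sketch of that selection (an illustration of the technique, not kandula's selection logic):

```python
import numpy as np

def best_threshold(scores, labels, thresholds=None):
    """Sweep thresholds over validation scores and return the
    (threshold, F1) pair with the highest F1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)
    best_t, best_f1 = 0.5, -1.0
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & labels)
        fp = np.sum(pred & ~labels)
        fn = np.sum(~pred & labels)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

For recall-critical cases like insurance claims, the same sweep can optimize recall at a minimum precision instead of F1.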

Once the model is deployed, a test image can be provided from Onestore or uploaded from a local machine. The resulting predictions are displayed along with confidence scores, as shown below.

Infer Workflow

Inferred results of Model Predictions

These models can be exported and downloaded in ONNX format along with a getting-started inference script. To avoid the hassle of local setup, models can also be published as an endpoint API for inference.
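The skeleton of running an exported ONNX model locally looks roughly like this. This is a generic `onnxruntime` sketch, not the script kandula ships; the input size and NCHW layout are assumptions that would need to match the exported model's actual input signature.

```python
import numpy as np

def preprocess(img: np.ndarray, size: int = 640) -> np.ndarray:
    """Nearest-neighbour resize to (size, size), scale to [0, 1],
    and reorder HWC -> NCHW, a layout many exported detectors expect."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = img[rows][:, cols].astype(np.float32) / 255.0
    return resized.transpose(2, 0, 1)[None]  # (1, 3, size, size)

def run_inference(model_path: str, img: np.ndarray):
    """Run the exported ONNX model on one image with onnxruntime."""
    import onnxruntime as ort  # imported lazily; pip install onnxruntime
    sess = ort.InferenceSession(model_path)
    input_name = sess.get_inputs()[0].name
    return sess.run(None, {input_name: preprocess(img)})
```

The downloaded model's real input name, shape, and normalization should be read from the bundled getting-started script rather than assumed.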

Kandula also provides the Droid app for building and deploying at-scale business operations with trained models, which will be covered in a separate blog.

Summary
That brings us to the conclusion: we have seen how to take raw data all the way to a trained model on kandula. The platform offers many more features and tools, which will be covered in coming blogs. In the meantime, don't forget to check out kandula.ai.

Sai Nivedh


AI Research Engineer, Pavo & Tusker Innovations