Visual Brain

See the world through AI

Duration

March 2020 to Present

Team Size

Area

Ed Tech

My Role

UX Design

Problem

The Complexity of Fine-Tuning

Computer vision enables computers to interpret and analyze images and videos in a manner akin to human perception. However, one of the major hurdles in using this technology is fine-tuning models to achieve accurate results, especially for those without a deep technical background. The complexity of adjusting model parameters and the overwhelming amount of technical jargon make the fine-tuning process daunting, which discourages many interested learners from exploring computer vision.

Solution

Fine-Tuning the Complexity Out

We set out to simplify the fine-tuning of computer vision models and make it accessible to everyone. For newcomers, our fine-tuning wizard explains each step in straightforward terms, so you learn by interacting directly with model parameters. For those with a technical background, the same wizard offers the flexibility to experiment and optimize with advanced settings. With this approach, we aim to open up the creative and practical applications of computer vision to a wider audience.

Survey Insight

Non-Tech Professionals Eyeing Computer Vision

Through our survey, we've gained valuable insights into the demographics, knowledge levels, and motivations of potential users interested in exploring computer vision. Our findings reveal a diverse age range with a core group aged 25-44, predominantly non-technical professionals, indicating a strong interest across various professional fields. This diversity highlights the need for a platform that bridges technical concepts with practical application in an intuitive manner.

Simplifying the Steps to Computer Vision

To make computer vision more accessible, we recognized the importance of simplifying the model training process. Our aim was to design a platform that invites users from various backgrounds, especially those without a technical foundation, not only to engage with computer vision but also to excel in it. This led to a four-step model training process, designed to be both comprehensive and accessible.

To Create or Curate?

At the heart of any computer vision model is the dataset. We simplified the initial step by offering users a choice: to create a new dataset or to utilize a pre-existing one. This flexibility allows users to either start from scratch, offering a learning experience in dataset creation, or to leverage existing datasets, accelerating the path to model training.

A Structured Approach to Data Labeling

Accuracy in labeling is crucial for the success of a computer vision model. Our platform facilitates a structured approach to creating and applying labels, ensuring consistency and precision. By providing users with intuitive UI elements for the labeling process, we minimize errors and improve the quality of the training data, laying a solid foundation for a more effective model.

Addressing the Crafting-Curating Dilemma Again

Continuing the theme of flexibility and accessibility, the next step involves choosing between creating a new model from scratch or selecting a pre-trained model for fine-tuning. This decision is pivotal, as it allows users to tailor the complexity of their project to their skill level. The platform's design mirrors the dataset creation/selection process, providing a consistent user experience that reduces cognitive load and streamlines the workflow.

Text-Heavy Forms for Clarity

To make the complex task of fine-tuning models less daunting, we added clear descriptions for each parameter. These brief explanations help users understand what each adjustment does and how it affects their model. This method supports users, from beginners to those with some knowledge of computer vision, in making smarter choices about their models. By offering straightforward, easy-to-understand information, we help users connect more deeply with the technology and create a learning experience that feels more natural.

Expert-Backed Secondary Research

Grouped Parameters for Smoother Tuning

To improve how users learn and interact with fine-tuning computer vision models, we grouped related parameters together. This decision was based on secondary research and on conversations with technical experts about the best ways to group parameters. Grouping makes the interface easier to use and helps users understand how different settings affect their model. It also gives users a clear path through the form, so complicated adjustments are easier to manage. This setup encourages experimentation and learning, giving users more confidence and insight as they adjust their models.
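As a rough illustration of the grouping idea, the form could be driven by a small data structure that pairs each parameter with its group and its plain-language description. This is only a sketch: the group names, parameter names, and descriptions below are hypothetical placeholders, not the ones used in the actual wizard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    key: str          # identifier passed to the training code
    label: str        # name shown in the form
    description: str  # short explanation shown next to the control


# Hypothetical groups and parameters, chosen only for illustration.
PARAMETER_GROUPS = {
    "Training schedule": [
        Parameter("epochs", "Epochs",
                  "How many times the model sees the whole dataset."),
        Parameter("learning_rate", "Learning rate",
                  "How large each correction to the model is."),
    ],
    "Data handling": [
        Parameter("batch_size", "Batch size",
                  "How many images the model studies at once."),
        Parameter("validation_split", "Validation split",
                  "Share of images held back to check progress."),
    ],
}


def render_form(groups: dict) -> str:
    """Render the groups as plain text: a group heading, then its parameters."""
    lines = []
    for group_name, parameters in groups.items():
        lines.append(group_name)
        for parameter in parameters:
            lines.append(f"  {parameter.label}: {parameter.description}")
    return "\n".join(lines)


print(render_form(PARAMETER_GROUPS))
```

Keeping the descriptions in the same structure as the grouping means a parameter cannot appear in the form without its explanation.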

Sliding into Controlled Precision

In designing the interface for our fine-tuning wizard, we chose to implement sliders for adjusting model parameters. This choice was driven by the dual goals of enhancing user experience and enforcing practical limits on parameter values. Sliders visually represent the range of possible adjustments, making it immediately clear to users how far they can tweak each setting. This method not only simplifies the interaction but also naturally prevents users from entering values that could destabilize the model training process. By incorporating sliders, we provide a guided yet flexible environment for experimentation, allowing users to easily explore the impact of different settings on their models while ensuring a level of safety and reliability in the fine-tuning process.
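The way a slider enforces practical limits can be sketched in a few lines. The example below assumes each slider has a minimum, a maximum, and a step size; the "epochs" slider and its range are hypothetical values chosen for illustration, not the limits used in the actual wizard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliderSpec:
    name: str
    minimum: float
    maximum: float
    step: float

    def snap(self, raw: float) -> float:
        """Clamp a raw input to the slider range, then snap it to the nearest step."""
        clamped = min(max(raw, self.minimum), self.maximum)
        steps = round((clamped - self.minimum) / self.step)
        # Guard against stepping past the maximum when the range
        # is not an exact multiple of the step size.
        return min(self.minimum + steps * self.step, self.maximum)


# Hypothetical slider: whole-number epochs between 1 and 50.
EPOCHS = SliderSpec(name="epochs", minimum=1, maximum=50, step=1)

print(EPOCHS.snap(500))   # far above the range, held at the maximum
print(EPOCHS.snap(-3))    # below the range, held at the minimum
print(EPOCHS.snap(12.4))  # inside the range, snapped to a whole step
```

Because every value passes through the same clamp, an out-of-range setting never reaches the training step, which is the safety property described above.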

Retrospective

Lessons from a Multidisciplinary Journey

Looking back, we learned a lot about working together across different fields. Our project started as a software engineering challenge at a hackathon in March 2020, and we kept developing and improving it afterward. This experience showed us how much you can achieve by mixing different skills and viewpoints. With software engineers, computer vision experts, and designers working together, we were able to make our solutions robust, easy to use, and inventive. The project showed how combining different kinds of knowledge helps solve complicated problems.