Visual Brain

See the world through AI

  • Duration

    March 2020 to Present

  • Team Size

    4

  • Area

    Ed Tech

  • My Role

    UX Design

  • Visual Brain - Fine Tuning Screen

    Problem

    The Complexity of Fine-Tuning

    Computer vision is an exciting field that enables computers to interpret and analyze images and videos in a manner akin to human perception. However, one of the major hurdles in using this technology is fine-tuning models to achieve accurate results, especially for people without a deep technical background. The complexity of adjusting model parameters and the overwhelming amount of technical jargon make the fine-tuning process daunting, which discourages many interested learners from exploring computer vision.

    Solution

    Fine-Tuning the Complexity Out

    We've tried to simplify the fine-tuning process for computer vision models and make it accessible to everyone. For newcomers, our fine-tuning wizard explains each step in straightforward terms, so they can learn by interacting directly with model parameters. For those with a technical background, it offers the flexibility to experiment and optimize with advanced settings. With this approach, we aim to open up computer vision to a wide range of creative and practical applications.

    Survey Insight

    Non-Tech Professionals Eyeing Computer Vision

    Through our survey, we've gained valuable insights into the demographics, knowledge levels, and motivations of potential users interested in exploring computer vision. Our findings reveal a diverse age range with a core group aged 25-44, predominantly non-technical professionals, indicating a strong interest across various professional fields. This diversity highlights the need for a platform that bridges technical concepts with practical application in an intuitive manner.

    Survey Insights - User Age Distribution
    Survey Insights - User Knowledge Distribution
    Survey Insights - User Occupation Distribution
    Survey Insights - User Interest Distribution

    Simplifying the Steps to Computer Vision

    In our journey to make computer vision more accessible, we recognized the importance of simplifying the model training process. Our aim was to design a platform that invites users from various backgrounds, especially those without a technical foundation, to not only engage with but also excel in the field of computer vision. This led to the creation of a four-step model training process, designed to be both comprehensive and accessible.

    Fine Tuning Steps Breakdown
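
    As a rough illustration of the four-step flow, the wizard can be sketched as an ordered set of steps that the user moves through one at a time. The step names follow the screens shown in this case study; the `next_step` function and the rule that a step must be completed before moving on are assumptions made for this sketch, not a description of the shipped product.

```python
from enum import IntEnum
from typing import Optional, Set


class Step(IntEnum):
    """The four wizard steps, in the order the case study presents them."""
    DATASET_SELECTION = 1
    DATA_LABELING = 2
    MODEL_SELECTION = 3
    FINE_TUNING_PARAMETERS = 4


def next_step(current: Step, completed: Set[Step]) -> Optional[Step]:
    """Return the step to show next, or None once the wizard is finished.

    Assumption for this sketch: the user stays on the current step
    until it has been marked as completed.
    """
    if current not in completed:
        return current
    if current == Step.FINE_TUNING_PARAMETERS:
        return None
    return Step(current + 1)
```

    Modeling the steps as an ordered enum keeps the sequence explicit, which matches the linear, guided experience the wizard is meant to provide.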

    To Create or Curate?

    At the heart of any computer vision model is the dataset. We simplified the initial step by offering users a choice: to create a new dataset or to utilize a pre-existing one. This flexibility allows users to either start from scratch, offering a learning experience in dataset creation, or to leverage existing datasets, accelerating the path to model training.

    Fine Tuning Step 1 - Dataset Selection
    Fine Tuning Step 1 - Dataset Selection - LoFi
    Fine Tuning Step 1 - Dataset Selection
    Fine Tuning Step 1 - Dataset Selection - Scrolled Page

    A Structured Approach to Data Labeling

    Accuracy in labeling is crucial for the success of a computer vision model. Our platform facilitates a structured approach to creating and applying labels, ensuring consistency and precision. By providing users with intuitive UI elements for the labeling process, we minimize errors and improve the quality of the training data, laying a solid foundation for a more effective model.

    Fine Tuning Step 2 - Data Labeling

    Addressing the Crafting-Curating Dilemma Again

    Continuing the theme of flexibility and accessibility, the next step involves choosing between creating a new model from scratch or selecting a pre-trained model for fine-tuning. This decision is pivotal, as it allows users to tailor the complexity of their project to their skill level. The platform's design mirrors the dataset creation/selection process, providing a consistent user experience that reduces cognitive load and streamlines the workflow.

    Fine Tuning Step 3 - Model Selection
    Fine Tuning Step 3 - Model Selection - LoFi1
    Fine Tuning Step 3 - Model Selection - LoFi2
    Fine Tuning Step 3 - Model Selection
    Fine Tuning Step 3 - Model Selection - Scrolled Page

    Text-Heavy Forms for Clarity

    To make the complex task of fine-tuning models less daunting, we added clear descriptions for each parameter. These brief explanations help users understand what each adjustment does and how it affects their model. This method supports users, from beginners to those with some knowledge of computer vision, in making smarter choices about their models. By offering straightforward, easy-to-understand information, we help users connect more deeply with the technology and create a learning experience that feels more natural.

    Visual Brain - Fine Tuning Screen

    Expert-Backed Secondary Research

    Grouped Parameters for Smoother Tuning

    To improve how users learn and interact with fine-tuning computer vision models, we grouped related parameters together. This decision was based on secondary research and on conversations with technical experts about the best ways to group parameters. Grouping makes the interface easier to use and helps users understand how different settings affect their model. It also gives users a clear path through the form, so complicated adjustments are easier to manage. This setup encourages experimentation and learning, giving users more confidence and insight as they adjust their models.

    Fine Tuning Step 4.1 - Fine Tuning Parameters - LoFi
    Fine Tuning Step 4.1 - Fine Tuning Parameters - Grouping - LoFi
    Visual Brain - Fine Tuning Screen
    Visual Brain - Advanced Fine Tuning Screen
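
    The combination of grouped parameters and short per-parameter descriptions could be represented with a small data model like the one below. This is a minimal sketch: the group titles, the assignment of parameters to groups, and the description wording are hypothetical and only illustrate the idea. The three parameter names are the ones shown on the sliders in this case study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str
    description: str  # short plain-language explanation shown next to the control


@dataclass
class ParameterGroup:
    title: str
    parameters: List[Parameter] = field(default_factory=list)


# Hypothetical grouping and wording, for illustration only.
GROUPS = [
    ParameterGroup("Training", [
        Parameter("Learning Rate", "How big a correction the model makes after each mistake."),
        Parameter("Number of Epochs", "How many times the model goes through the full dataset."),
    ]),
    ParameterGroup("Architecture", [
        Parameter("Kernel Size", "How large a patch of the image each filter looks at."),
    ]),
]


def render_form(groups: List[ParameterGroup]) -> str:
    """Render the groups as indented text: a group title, then its parameters."""
    lines = []
    for group in groups:
        lines.append(group.title)
        for parameter in group.parameters:
            lines.append(f"  {parameter.name}: {parameter.description}")
    return "\n".join(lines)
```

    Keeping the description next to the parameter in the data model means the explanation and the control cannot drift apart when groups are reorganized.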

    Sliding into Controlled Precision

    In designing the interface for our fine-tuning wizard, we chose to implement sliders for adjusting model parameters. This choice was driven by the dual goals of enhancing user experience and enforcing practical limits on parameter values. Sliders visually represent the range of possible adjustments, making it immediately clear to users how far they can tweak each setting. This method not only simplifies the interaction but also naturally prevents users from entering values that could destabilize the model training process. By incorporating sliders, we provide a guided yet flexible environment for experimentation, allowing users to easily explore the impact of different settings on their models while ensuring a level of safety and reliability in the fine-tuning process.

    Slider Input for Learning Rate
    Slider Input for Number of Epochs
    Slider Input for Kernel Size
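
    The way a slider enforces practical limits can be sketched as a bounded range with a fixed step: any requested value is clamped into the range and then snapped to the nearest step. The `SliderSpec` class and all of the ranges below are assumptions made for this illustration; the case study does not state the actual limits used in the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliderSpec:
    """Bounds for one slider control (hypothetical helper for this sketch)."""
    name: str
    minimum: float
    maximum: float
    step: float

    def snap(self, value: float) -> float:
        """Clamp a requested value into range, then snap it to the nearest step."""
        clamped = min(max(value, self.minimum), self.maximum)
        steps = round((clamped - self.minimum) / self.step)
        snapped = self.minimum + steps * self.step
        # Never exceed the maximum, and trim floating-point noise.
        return round(min(snapped, self.maximum), 10)


# Illustrative ranges only; the real limits are not given in the case study.
LEARNING_RATE = SliderSpec("Learning Rate", minimum=0.0001, maximum=0.1, step=0.0001)
EPOCHS = SliderSpec("Number of Epochs", minimum=1, maximum=100, step=1)
KERNEL_SIZE = SliderSpec("Kernel Size", minimum=1, maximum=7, step=2)
```

    With these assumed ranges, `EPOCHS.snap(250)` yields the maximum of 100 rather than an out-of-range value, which is the same guarantee the slider gives visually: the user cannot drag past either end of the track.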

    Retrospective

    Lessons from a Multidisciplinary Journey

    Looking back, we learned a lot about working together across different fields. Our project started as a software engineering challenge at a hackathon in March 2020, and we have kept working on it and improving it since. This experience showed us how much you can achieve by mixing different skills and viewpoints. By having software engineers, computer vision experts, and designers work together, we were able to make our solutions robust, easy to use, and full of new ideas. The journey showed how powerful it is to combine different kinds of knowledge when solving complicated problems.