Bilingual Learning Camera

About the Project

AI-Powered Image Recognition Educational App (Gemini Vision EDU)

A mobile learning application that lets users photograph real-world objects and uses Google Gemini Pro Vision to analyze each photo instantly. The system automatically identifies the object in the photo and provides:

Multilingual Object Naming: Displays the object's name in both Chinese and English.
Contextual Example Sentences: Generates relevant English sentences that show the word in context.
Interactive Spelling Game: Breaks the object's name into individual letters so children can practice spelling through drag-and-drop or tap interactions, which reinforces vocabulary retention.

The application aims to give children an engaging and effective multilingual vocabulary learning experience through visual and interactive methods.
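The letter-by-letter breakdown behind the spelling game could be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the `LetterTile` shape and the function names `toTiles`, `shuffleTiles`, and `isCorrect` are hypothetical, and the sketch assumes English names made of the letters A to Z.

```typescript
// One draggable or tappable tile in the spelling game (hypothetical shape).
interface LetterTile {
  id: number; // position of the letter in the correct spelling
  letter: string; // single uppercase letter shown on the tile
}

// Split an object name into letter tiles, skipping spaces and punctuation.
function toTiles(word: string): LetterTile[] {
  return word
    .toUpperCase()
    .split("")
    .filter((ch) => /[A-Z]/.test(ch))
    .map((letter, id) => ({ id, letter }));
}

// Fisher-Yates shuffle that returns a new array.
// `random` can be replaced with a fixed function to make tests repeatable.
function shuffleTiles(
  tiles: LetterTile[],
  random: () => number = Math.random
): LetterTile[] {
  const result = tiles.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = result[i];
    result[i] = result[j];
    result[j] = tmp;
  }
  return result;
}

// Compare letters rather than ids, so repeated letters
// (the two Ps in "APPLE") are interchangeable.
function isCorrect(placed: LetterTile[], word: string): boolean {
  const target = toTiles(word).map((t) => t.letter).join("");
  return placed.map((t) => t.letter).join("") === target;
}
```

A game screen would call `shuffleTiles(toTiles(name))` to lay out the tiles and `isCorrect` each time the child places a letter. Comparing letters instead of tile ids avoids rejecting a correct spelling just because two identical letters were swapped.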

Challenges & Solutions

Optimizing AI Recognition Accuracy via Multimodal Analysis

Initially, basic Vision APIs were used for label recognition. However, accuracy suffered in complex environments or at specific angles. Furthermore, the system could not generate educational-grade translations or spelling suggestions tailored for children.


Migrated to Google Gemini and applied prompt engineering. By providing specific context instructions, I guided the model to perform deeper image analysis and generate not just object names but also bilingual vocabulary, contextual sentences, and spelling breakdowns, which significantly enhanced both educational value and accuracy.
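As an illustration of this kind of context instruction, the sketch below builds a prompt that asks for a fixed JSON structure and then validates the reply before the app uses it. The project's real prompt is not shown in this write-up, so the prompt wording, the `VocabularyCard` shape, and the names `buildPrompt` and `parseCard` are all assumptions. The call to the Gemini API itself is deliberately left out.

```typescript
// Hypothetical shape of the structured answer requested from the model.
interface VocabularyCard {
  english: string; // object name in English
  chinese: string; // object name in Chinese
  sentence: string; // short example sentence in English
  letters: string[]; // English name split into single letters
}

// Build the instruction sent alongside the photo (illustrative wording only).
function buildPrompt(): string {
  return [
    "You are a vocabulary tutor for young children.",
    "Identify the single main object in the photo.",
    "Reply with JSON only, using exactly these keys:",
    '{"english": string, "chinese": string, "sentence": string, "letters": string[]}',
    "The sentence must be short, simple English that uses the object name.",
    "The letters array must spell the English name one letter at a time.",
  ].join("\n");
}

// Parse and validate the model's reply; returns null when it is malformed.
function parseCard(reply: string): VocabularyCard | null {
  // The reply may contain extra text around the JSON, so keep only
  // the span from the first "{" to the last "}".
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { english, chinese, sentence, letters } = data as Record<string, unknown>;
  if (typeof english !== "string") return null;
  if (typeof chinese !== "string") return null;
  if (typeof sentence !== "string") return null;
  if (!Array.isArray(letters)) return null;
  if (!letters.every((l) => typeof l === "string")) return null;

  // Reject replies whose letters do not actually spell the English name.
  const spelled = (letters as string[]).join("").toLowerCase();
  const expected = english.replace(/[^a-z]/gi, "").toLowerCase();
  if (spelled !== expected) return null;

  return { english, chinese, sentence, letters: letters as string[] };
}
```

Asking for a fixed set of keys and checking them on the client means a malformed or off-topic reply can be retried or discarded instead of reaching the child-facing screen.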

Implementing Automated CI/CD Workflows for Enhanced Development Efficiency

In the early stages of development, manual deployment was time-consuming and prone to human error. It was also challenging to ensure consistency between development and production environments, which limited the pace of iteration.


Integrated GitHub with Vercel to establish an automated deployment pipeline. Every push to the main branch automatically triggers a build and deployment, so code goes from commit to live without manual steps. This makes deployments more reliable and lets me focus on feature development rather than environment configuration.

