DevLog 6-3: Final Project Research and Planning
Initial Research Links
Potential Topics
- Education x AI: Build a productivity tool that reimagines how students interact with AI while taking notes. The tool will be a canvas like Excalidraw where students use AI-native features to build up their sketches.
- Entertainment x AI: Users interact with a web app through audio and video. The app either narrates what it sees or transforms the video in some other way in real-time.
- Office x AI: Creating web agents that replicate certain actions users take on the web.
- Sports x AI: Reimagining how athletes train with AI. Either a vision and language model that monitors games and makes key decisions, or one that monitors training sessions and coaches athletes.
Related Articles
Inspiring / Related Projects
- Excalidraw - Open-source collaborative whiteboard with a hand-drawn feel. Inspires the canvas-based AI note-taking idea.
- Mixboard - AI-powered concepting board for exploring and refining ideas visually. Shows how AI can augment creative brainstorming in a canvas UI.
- Pixellot Coaching - AI-powered sports video analysis platform that auto-tracks players and generates coaching insights. Directly related to the Sports x AI topic.
New Skills and Technologies
- GPT, Claude, and Gemini for AI tooling
- AWS for hosting microservices and database
- Nova-Act for web-based agents
- Fal.ai for image and video generation
Summary
I’m exploring AI-powered interactive web experiences using multimodal LLMs (Gemini, Claude, GPT), Fal.ai for media generation, and AWS infrastructure. My focus is building an experimental project that explores AI and its impact on a separate discipline. The main challenge is integrating multiple modalities (text, vision, audio) into a smooth, low-latency experience.
Random Ideas from an Idea Generator
- A browser extension that watches your screen and generates a live comic strip of your workday.
- An AI referee that watches pickup basketball games through a phone camera and calls fouls in real time.
- A collaborative whiteboard where every stroke you draw gets auto-completed by AI into a different art style.
Concerns About the Final Project
My primary concern is connecting my frontend stack to all the AI tools I will be calling through their APIs. In particular, I am unsure how to make multiple modalities (image, text, and audio) compatible with each other so that they combine into a smooth user experience.
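One possible way to approach the compatibility concern is to normalize every input into a single tagged "part" shape on the frontend, then route each request to a backend that supports all the modalities it contains. The sketch below is only an illustration under that assumption: `Part`, `Adapter`, `pickAdapter`, and the two stub adapters are hypothetical names invented here, and no real provider SDK or API is called.

```typescript
// Hypothetical shared envelope: every modality becomes a tagged "part".
type Part =
  | { kind: "text"; text: string }
  | { kind: "image"; mimeType: string; base64: string }
  | { kind: "audio"; mimeType: string; base64: string };

type Modality = Part["kind"];

// Hypothetical adapter description; a real adapter would also wrap
// the actual API call for one AI service.
interface Adapter {
  name: string;
  supports: Modality[];
}

// Return the first adapter that can handle every part in the request,
// or undefined when no adapter covers all of the modalities.
function pickAdapter(parts: Part[], adapters: Adapter[]): Adapter | undefined {
  for (const adapter of adapters) {
    const ok = parts.every((p) => adapter.supports.indexOf(p.kind) !== -1);
    if (ok) return adapter;
  }
  return undefined;
}

// Stub adapters standing in for real services (placeholders only).
const adapters: Adapter[] = [
  { name: "text-only-stub", supports: ["text"] },
  { name: "multimodal-stub", supports: ["text", "image", "audio"] },
];

// A mixed request: a prompt plus a (here empty) image payload.
const request: Part[] = [
  { kind: "text", text: "Describe this sketch" },
  { kind: "image", mimeType: "image/png", base64: "" },
];

const chosen = pickAdapter(request, adapters);
console.log(chosen ? chosen.name : "no adapter available");
```

Keeping the envelope as a discriminated union means the UI code only ever builds `Part` values, so adding a new modality or swapping a service should mostly touch the adapter list rather than the frontend components.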