Concurrent AI running locally, more or less according to the model of self-driving cars

This might not be the proper place to ask, and if so, please feel free to ruthlessly delete this post.

I am interested in ‘concurrent/participatory/collaborative AI’ running locally (on one’s own machine), similar to the model of self-driving cars. More specifically, I would like to implement an ‘angel-on-your-shoulder’ AI that takes three live inputs: a whiteboard (with text and lines drawn on it using a stylus and stylus pad), audio from a microphone, and typed text. The AI would check these inputs for changes at least once per second and, if it finds it appropriate to respond (according to whatever realm of material it is designed to assist with), would respond with a combination of audible output (or a live avatar) and additions to, or deletions from, the whiteboard as appropriate. Here is a rough mock-up of what I’m trying to achieve, in this case as a ‘virtual teacher’ of geometry/mathematics, ruthlessly hacked from the lovely math problems presented by “The Math Queen” on YouTube and using Pi.AI’s lovely British lady voice as voice actor/narrator:

If self-driving cars can take multiple channels of video, multiple channels of 2D LIDAR data, and data from multiple other sensors and process all that through an AI engine so rapidly as to drive a car safely, then what is proposed herein should be well within the realm of the feasible. Any advice or direction anyone can provide would be welcome.

And before you ask, oh math teachers of the world: The Khan Academy, Study.com, ck12.org, Mometrix.com, ExamEdge.com and many others are ALREADY trying to steal your job too, so… put yourselves in the shoes of Ukrainian children trapped in bomb shelters to avoid Russian missiles, who may not be able to get to math class that day, or of poor children in impoverished nations who would like to learn math too and whose nations cannot afford qualified math teachers and textbooks for them. Thanks.

Thank you for sharing your interesting idea!

As far as I’m aware, there isn’t currently a solution exactly like the one you’re describing. Such a real-time, collaborative, and locally run AI system handling live whiteboard drawings, audio, and text input would likely be quite resource-intensive at the moment.
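For what it's worth, the "check for changes about once per second, respond only when something changed" part of the idea is simple on its own; the heavy part is whatever model decides what to say. Here is a minimal sketch of just that polling loop in Python. The names `capture`, `respond`, `monitor` and `fingerprint` are all hypothetical placeholders I made up for illustration: `capture` stands in for whatever returns the current whiteboard/text state as bytes, and `respond` stands in for the AI call.

```python
import hashlib
import time
from typing import Callable, Optional


def fingerprint(snapshot: bytes) -> str:
    """Return a digest so two snapshots can be compared cheaply."""
    return hashlib.sha256(snapshot).hexdigest()


def monitor(
    capture: Callable[[], bytes],      # hypothetical: returns current input state
    respond: Callable[[bytes], None],  # hypothetical: hands the state to the AI
    interval_s: float = 1.0,           # "at least once per second"
    max_polls: Optional[int] = None,   # None means run until interrupted
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `capture` every `interval_s` seconds and call `respond` on change.

    The very first snapshot counts as a change. Returns how many times
    `respond` was called.
    """
    last_digest = None
    responses = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        snapshot = capture()
        digest = fingerprint(snapshot)
        if digest != last_digest:
            # Something changed since the last poll: let the AI look at it.
            last_digest = digest
            respond(snapshot)
            responses += 1
        polls += 1
        sleep(interval_s)
    return responses
```

A quick dry run with fake inputs (no real whiteboard or model involved) could look like `monitor(lambda: next(frames), seen.append, max_polls=3, sleep=lambda _: None)` where `frames = iter([b"a", b"a", b"ab"])` and `seen = []`; the unchanged second frame is skipped. Whether the AI should actually speak up on a given change would be a separate decision inside `respond`.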

The closest existing feature might be ChatGPT’s “Work with Apps”, which, combined with screen sharing and advanced voice mode, lets ChatGPT see applications on your screen in real time.

https://help.openai.com/en/articles/10119604-work-with-apps-on-macos#h_5ef4f27f8d