cross-posted from: https://lemmy.world/post/76020
Greetings Reddit Refugees!
I hope your migration is going well! If you haven’t been here before, welcome to FOSAI - your new Lemmy landing page for all things artificial intelligence.
This is a follow-up post to my first Welcome Message.
Here I will share insights and instructions on how to set up some of the tools and applications in the aforementioned AI Suite.
Please note that I did not develop any of these, but I do have each one of them working on my local PC, which I interface with regularly. I plan to do posts exploring each piece of software in detail - but first - let’s get a better idea of what we’re working with.
As always, please don’t hesitate to comment or share your thoughts if I missed something (or you want to understand or see a concept in more detail).
Getting Started with FOSAI
What is oobabooga?
In short, oobabooga is a free and open-source web client that someone (oobabooga) made to interface with HuggingFace LLMs (large language models). As far as I understand, it is the current standard for many AI tinkerers and anyone who wants to run models locally. The client lets you easily download, configure, and chat with text-based models that behave like ChatGPT; however, not all models on HuggingFace are at ChatGPT’s level out of the box. Many require ‘fine-tuning’ or ‘training’ to produce consistent, coherent results. The benefit of using HuggingFace (instead of ChatGPT) is that you have many more options for your AI model, including the choice between censored or uncensored versions, untrained or pre-trained, etc. Oobabooga is an interface that lets you do all of this (theoretically), but it can have a bit of a learning curve if you don’t know anything about AI/LLMs.
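If you’re curious what “downloading a HuggingFace model” actually involves behind the scenes, here’s a rough Python sketch using the huggingface_hub library. Oobabooga’s UI handles all of this for you from its download tab - this is purely for illustration, and the repo id below is a placeholder, not a recommendation.

```python
# A minimal sketch of pulling a model repository from HuggingFace with the
# huggingface_hub library (pip install huggingface_hub). The web UI does this
# for you; the repo id here is a placeholder -- swap in the model you want.
from huggingface_hub import snapshot_download

# Downloads every file in the model repo into the local HF cache and returns
# the path to the downloaded snapshot.
model_path = snapshot_download(repo_id="TheBloke/example-model-GGML")  # placeholder repo id
print(f"Model files downloaded to: {model_path}")
```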
What is gpt4all?
gpt4all is the closest thing you can currently download to a ChatGPT-style interface that is compatible with some of the latest open-source LLMs available to the community. Some models can be downloaded in quantized, unquantized, and base formats (which typically run on GPU only), but a newer format (GGML) is emerging that enables combined GPU + CPU compute. This GGML format seems to be the growing standard for consumer-grade hardware. Some prefer the user experience of gpt4all over oobabooga, and some feel the exact opposite. For me, I prefer the options oobabooga provides, so I use that as my ‘daily driver’, while gpt4all is a backup client I run for other tests.
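For anyone who wants to script against the same models, gpt4all also ships Python bindings. Here’s a minimal, hedged sketch - the model filename is only an example, and the bindings’ parameter names have changed between releases, so check their docs before copying this verbatim.

```python
# Minimal sketch of the gpt4all Python bindings (pip install gpt4all).
# The desktop app needs no code at all; this only shows that the same models
# can be driven from a script. The model filename below is an example --
# substitute whatever model you downloaded, and note the bindings' API has
# changed between releases.
from gpt4all import GPT4All

model = GPT4All("ggml-mpt-7b-chat.bin")  # example/placeholder model filename
output = model.generate(
    "Write a two-sentence welcome message for a new AI community.",
    max_tokens=128,  # parameter name may differ in older releases
)
print(output)
```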
What is Koboldcpp?
Koboldcpp, like oobabooga and gpt4all, is another web-based interface you can run to chat with LLMs locally. It enables GGML inference, which can be hard to get running in oobabooga depending on the version of your client and updates from the developer. Koboldcpp, however, comes from a different platform and team of developers who typically focus on the roleplaying side of generative AI and LLMs. Koboldcpp feels more like NovelAI than anything I’ve run locally, and has similar functionality and vibes to AI Dungeon. In fact, you can download some of the same models and settings they use to emulate something very similar (but 100% local, assuming you have capable hardware).
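Once Koboldcpp is running, it exposes a Kobold-compatible HTTP API on a local port, which is also what clients like TavernAI (next section) connect to. Here’s a rough sketch of hitting that API from Python - the default port, endpoint, and field names can differ between versions, so treat this as illustrative rather than definitive.

```python
# A hedged sketch of querying a locally running Koboldcpp instance over its
# Kobold-compatible HTTP API. Assumes Koboldcpp was launched with a GGML model
# and is listening on its default local port; verify the endpoint and field
# names against your version's documentation.
import requests

payload = {
    "prompt": "The tavern door creaks open and",
    "max_length": 120,  # assumed field name for generation length
}

response = requests.post("http://127.0.0.1:5001/api/v1/generate", json=payload, timeout=300)
response.raise_for_status()
print(response.json()["results"][0]["text"])
```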
What is TavernAI?
TavernAI is a customized web client that seems about as functional as gpt4all in most regards. You can use TavernAI to connect to Kobold’s API, or insert your own OpenAI API key to talk with GPT-3 (and GPT-4 if you have API access).
What is Stable Diffusion?
How-To-Install-StableDiffusion (Automatic1111)
Stable Diffusion is a groundbreaking and popular AI model that enables text-to-image generation. When people think of “Stable Diffusion”, they tend to picture Automatic1111’s UI/UX, which is the same interface oobabooga was inspired by. This UI/UX has become the de facto standard for almost all Stable Diffusion workflows. Fun factoid: it is widely believed that Midjourney is a highly tuned version of a Stable Diffusion model, but one whose weights, LoRAs, and configurations were kept closed-source after training and alignment.
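If you’d rather skip the UI entirely, the underlying model can also be driven from Python with Hugging Face’s diffusers library. A minimal sketch follows - this is the raw model, not Automatic1111’s workflow, and it assumes a CUDA-capable GPU and the public SD 1.5 checkpoint.

```python
# Minimal sketch of text-to-image generation with the diffusers library.
# Assumes a CUDA-capable GPU; the model id is the public Stable Diffusion 1.5
# checkpoint on HuggingFace.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

# One prompt in, one PIL image out.
image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("lighthouse.png")
```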
What is ControlNet?
ControlNet is a way to manually control Stable Diffusion models, giving you complete freedom over your generative AI workflow. The best example of what it is (and what it can do) can be seen in this video. Notice how it combines an array of tools you can use as pre-processors for your prompts, enhancing the composition of your image by giving you options to bring out any detail you wish to manifest.
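To make the idea concrete, here’s a hedged sketch of the same concept using the diffusers library instead of the Automatic1111 extension: a Canny edge map extracted from a source image constrains the composition while the prompt fills in the content. The model ids are the publicly released ControlNet and SD 1.5 checkpoints; the input path is a placeholder.

```python
# A sketch of the ControlNet idea with diffusers (not the Automatic1111
# extension): a Canny edge map of an input image conditions where the model
# is allowed to draw. Input path is a placeholder; assumes a CUDA GPU.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Build the conditioning image: a Canny edge map of your source picture.
source = cv2.imread("input.png")            # placeholder path
edges = cv2.Canny(source, 100, 200)
edges = np.stack([edges] * 3, axis=-1)      # single channel -> 3-channel RGB
control_image = Image.fromarray(edges)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The prompt describes the content; the edge map constrains the composition.
image = pipe("a neon cyberpunk street scene", image=control_image).images[0]
image.save("controlled.png")
```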
What is TemporalKit?
This is another Stable Diffusion extension that lets you create custom videos using generative AI. In short, it takes an input video and chops it into dozens (or hundreds) of frames, which can then be batch-edited with Stable Diffusion to produce new key frames and sequences. These are stitched back together with EbSynth using your new images, resulting in a stylized video generated and edited from your Stable Diffusion prompt/workflow.
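To illustrate the first step of that pipeline (and only that step - this is not the TemporalKit extension itself), here’s a tiny sketch that splits a video into numbered frames ready for batch img2img. Paths are placeholders.

```python
# NOT the TemporalKit extension -- just an illustration of the first step it
# automates: splitting an input video into frames that can be batch-processed
# with Stable Diffusion (img2img) and later re-assembled with a tool like
# EbSynth. Paths are placeholders.
import os
import cv2

os.makedirs("frames", exist_ok=True)
capture = cv2.VideoCapture("input_video.mp4")   # placeholder input video
frame_index = 0

while True:
    ok, frame = capture.read()
    if not ok:
        break
    # Save every frame as a numbered PNG; a real workflow might only keep
    # every Nth frame as a key frame to stylize.
    cv2.imwrite(f"frames/frame_{frame_index:05d}.png", frame)
    frame_index += 1

capture.release()
print(f"Extracted {frame_index} frames")
```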
Where to Start?
Unsure where to begin? Do you have no idea what you’re doing? Or have paralysis by analysis? That’s okay, we’ve all been there.
Start small: don’t install everything at once. Instead, ask yourself what sounds like the most fun. Pick one of the tools I’ve mentioned above and spend as much time as you need to get it working. This work takes patience, cultivation, and motion. The first two (patience, cultivation) typically take the longest to work through.
If you end up at your wit’s end installing or troubleshooting these tools, remind yourself that this is bleeding-edge artificial intelligence technology. It shouldn’t be easy in these early phases. The good news is I have a strong feeling it will become easier than any of us could imagine over time. If you cannot get something working, consider posting your issue here with information about your problem.
To My Esteemed Lurkers…
If you’re a lurker (like I used to be), please consider taking a popcorn break, stepping out of your shell, making a post, and asking questions. This is a safe space to share your results and interests with AI, or to make a post about your epic project or goal. All progress is welcome here, and all conversations about this tech are fair game and waiting to be discussed.
Over the course of this next week I will continue releasing general information to catch this community up to some of its more-established counterparts.
Consider subscribing if you liked the content of this post or want to stay in the loop with Free, Open-Source Artificial Intelligence.
Thanks a lot! I really appreciate all of your hard work in collecting information on AI. Your efforts are making this complex technology more accessible to everyone, which is great news for society as a whole. Thanks again!
Wow, I’m blown away. I just installed gpt4all - it’s a simple exe install, then you just download a model from within the software and off you go. Couldn’t be easier. The paragraph above was written by mpt-7b-chat. I didn’t know until now that one could have this working offline and just use it - pretty incredible.
Lol, you had me in the first half not gonna lie. Well done, you almost fooled me!
Glad you had some fun! gpt4all is by far the easiest to get going with imo.
I suggest trying any of the GGML models if you haven’t already! They outperform almost every other model format at the moment.
If you’re looking for more models, TheBloke and KoboldAI are doing a ton for the community in this regard. Eric Hartford, too - although TheBloke is typically the one who converts these into more accessible formats for the masses.
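If you want to see what GGML inference looks like outside of any UI, here’s a rough sketch using the llama-cpp-python bindings - not something you need for gpt4all or Koboldcpp, just an illustration. The model path is a placeholder for one of those quantized GGML files (e.g. one of TheBloke’s conversions), and parameter defaults may differ in your version.

```python
# A hedged sketch of GGML inference with the llama-cpp-python bindings
# (pip install llama-cpp-python). The model path is a placeholder for a
# quantized GGML file; n_gpu_layers offloads part of the model to the GPU,
# with the rest running on CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/example-7b.ggmlv3.q4_0.bin",  # placeholder filename
    n_gpu_layers=20,   # set 0 for CPU-only inference
    n_ctx=2048,        # context window size
)

result = llm("Q: Why run a language model locally? A:", max_tokens=128)
print(result["choices"][0]["text"])
```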
This is an amazing resource. Thank you for putting it together and posting it here!
Absolutely! I’m having a blast launching /c/FOSAI over at Lemmy.world. I’ll do my best to consistently cross-post to everyone over here too!
What does FOSAI stand for? My google-fu doesn’t bring up much.
Nvm, went to the Lemmy link and I see it listed as Free Open Source AI.
FWIW, it’s a new term I’m trying to coin in FOSS (Free, Open-Source Software) communities. It’s a spin-off of ‘FOSS’, but for AI.
There’s literally nothing wrong with FOSS as an acronym; I just wanted to use one more focused on AI tech to set the right expectations for everything shared in /c/FOSAI.
I felt it was a term worth coining given the varied requirements and dependencies AI/LLMs tend to have compared to typical FOSS stacks. That differentiation matters for some of the semantics these conversations carry.
Excellent write-up of some great resources. I’ll enjoy tinkering with them, mainly for improving my python skills. Thank you for putting in the effort!
Thank you! I appreciate the kind words. Please consider subscribing to /c/FOSAI if you want to stay in the loop with the latest and greatest news for AI.
This stuff is developing at breakneck speeds. Very excited to see what the landscape will look like by the end of this year.
ooba community here: !oobabooga@lemmy.world. Post your questions and let’s get the knowledge base building.
Anyone looking for some really in-depth videos on Stable Diffusion should check out https://youtube.com/@SECourses. Also https://youtube.com/@Aitrepreneur (which is the one linked for oobabooga) for LLMs and Stable Diffusion. They’ve both helped me a lot and have active Discords.
Great suggestions! I’ve actually never interfaced with that first channel (SECourses). Looks like some solid tutorials. Definitely going to check that out. Thanks for sharing!
Commenting here to save this post.