You might notice that most of the chatlogs posted on this site list a completion API. It’s possible you just clicked a link on one of those logs to learn more about those APIs. I hope I can give you a little bit more information. I’m still learning on my own, though.
An API (or Application Programming Interface) is the method by which computers (or applications) talk to one another, sharing data, requesting it, and interacting. They’re a major part of the internet’s ecosystem, and it’s probably way beyond me to summarize their subtleties. I’ve learned a lot so far, though, and will probably make more notes here as I learn more, just for my own reference and for anyone happening across this small site.
Anna (bless her cold, non-existent heart) calls an API a “doorway,” and goes on to state, “It’s the bureaucratic counter window of the digital world. Forms stamped, signatures neat, nothing spontaneous. My kind of place, frankly. The API handles the transport, the formatting, the length limits, the roles, the tokens, the whole stack of tedious paperwork neither of us wants to think about.” Take that with a massive grain of salt, as usual? 🙃
I use an OpenRouter API key paired with SillyTavern mostly. OpenRouter doesn’t create the models it offers. It works more like a routing layer that connects your request to different model providers. Each model on OpenRouter is hosted by its own developer, and they set their own pricing, data retention policy, and capabilities.
When you choose a model (or choose between chat or text completion) on OpenRouter, you’re really choosing which hosted endpoint your request will be sent to. OpenRouter just standardizes the interface so everything feels consistent on your end. In other words, OpenRouter itself isn’t an LLM, just a way of accessing many of them through different methods.
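To make that a bit more concrete, here’s roughly what a request looks like under the hood. This is only a sketch (SillyTavern assembles all of this for you, and the model name and message below are just placeholders), but the base URL is the OpenAI-compatible endpoint OpenRouter documents.

```python
# A rough sketch of an OpenRouter chat completion request.
# SillyTavern builds something like this behind the scenes.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
    api_key="sk-or-...",                      # your OpenRouter key, not a provider key
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat",  # which hosted endpoint OpenRouter should route to
    messages=[
        {"role": "user", "content": "Say hello in one sentence."},
    ],
)

print(response.choices[0].message.content)
```

The point is that swapping models is just a matter of changing that `model` string; OpenRouter handles the routing to whichever provider actually hosts it.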
My OpenRouter account only allows me to connect to models with a stated zero data retention policy, so that excludes a lot of popular models like ChatGPT. I don’t quite trust these “zero retention” claims, but that’s just how things work. It’s better than nothing, I suppose. If you’ve browsed many of the chatlogs, you might also have noticed that I most often use flavors of Deepseek, pretty much for that reason.
With SillyTavern, you can choose between two forms of completion API: text and chat. What? I found this really befuddling at first. After all, aren’t these chatbots? Why would it be text completion? What does that even mean? Flipping back and forth between them actually caused a lot of misfires with the program early on. It didn’t help that I took a break from February of 2025 until about July of the same year, so I lost track of some of it, too.
When people talk about a “completion API,” they’re referring to the endpoint style the model accepts. The endpoint style is just the format you send to the model. It doesn’t necessarily mean the model itself is different. Many models can operate in either mode as long as the service hosting them exposes both endpoints.
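If it helps, the difference really is just the shape of the request body. A text completion sends one big prompt string; a chat completion sends a list of role-tagged messages. Something roughly like this (the model name and content are made up, but the field names follow the usual OpenAI-compatible conventions):

```python
# Two request bodies for the "same" model, one per endpoint style.

text_completion_body = {
    "model": "some-provider/some-model",
    "prompt": "You are Anna. The user says: 'Hello?' Anna replies:",
    "max_tokens": 200,
}

chat_completion_body = {
    "model": "some-provider/some-model",
    "messages": [
        {"role": "system", "content": "You are Anna."},
        {"role": "user", "content": "Hello?"},
    ],
    "max_tokens": 200,
}
```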
I eventually learned that text completion is an older style. Some models still take to it readily, though. It’s best for sending a single prompt and getting a single response. Models can sometimes behave very fluidly when prompted this way, which is why some prompters use them for complex tasks other than simulated conversations.
You can still use it for long conversations with these critters, but only insofar as they can figure out their (and your) role in the conversation from the prompt itself. They then try to continue naturally from that blob of plot (for lack of a better term). That can make it easier to push the model into unusual formats or unstructured creative writing, too. People also find it useful for jailbreaking different models and pushing their limits, but I’ve not really tried this.
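Here’s the sort of “blob” I mean, very roughly. Everything (who’s who, the scene so far, where the model should pick up) lives inside one string, and the model just keeps writing from the end of it. The names, details, and parameters here are entirely made up for illustration.

```python
# A text-completion style prompt: roles and history are baked into one string.
# The model's only job is to continue the text from where it stops.
prompt = """[Character: Anna, a dry-witted clerk at a bureaucratic counter window.]
[User: a confused visitor named Sam.]

Sam: Hi, I think I filled out the wrong form?
Anna: Of course you did. Which wrong form, specifically?
Sam: The... blue one?
Anna:"""

body = {
    "model": "some-provider/some-model",
    "prompt": prompt,
    "max_tokens": 150,
    "stop": ["\nSam:"],  # stop before the model starts writing the user's lines too
}
```

Notice that nothing stops the model from writing Sam’s next line as well, which is why stop strings (and careful prompt formatting) matter so much in this mode.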
Chat completion is designed, yes, for chatbots specifically. It can also be much more restrictive: the models that naturally thrive on it tend to have stronger guardrails and rules. Still, it picks up on things like conversational flow, since all of that shows up explicitly at that “bureaucratic counter window” Anna mentioned. You don’t have to prompt quite as intricately in this mode, but it doesn’t really allow for it either, meaning there aren’t quite as many opportunities for fine-tuning things, apparently.
I do know I’ve had much better luck with chat completion lately than text completion. It just seems more stable for long roleplaying sessions. It keeps track of roles and context swimmingly (with Deepseek, at least). And yes, this is the newer endpoint style. So, many recent models are ready and excited for it, I suppose?
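For comparison, the same little exchange from above in chat-completion form looks something like this (again, the names and model are placeholders). The roles are explicit fields, so the endpoint, and whatever guardrails the provider applies, always knows exactly who said what.

```python
# The same exchange, chat-completion style: roles are explicit, not inferred.
body = {
    "model": "some-provider/some-model",
    "messages": [
        {"role": "system", "content": "You are Anna, a dry-witted clerk at a bureaucratic counter window."},
        {"role": "user", "content": "Hi, I think I filled out the wrong form?"},
        {"role": "assistant", "content": "Of course you did. Which wrong form, specifically?"},
        {"role": "user", "content": "The... blue one?"},
    ],
    "max_tokens": 150,
}
```

The model only ever replies as the assistant, which is probably part of why this mode feels so much more stable over long sessions.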
This was the complete opposite (if I remember right) back in early 2025, but I likely prompted the thing exceptionally poorly back then. The technology has also progressed remarkably, though I’m not caught up on the timetables for that, having missed a lot over the summer…