Betting on Numerical

Quantise started out as a simple idea: building a multiplayer version of a guessing game genre I saw on local Turkish TV shows and YouTube channels. Because these games are based on numerical inputs, everything I developed has revolved around them rather than multiple-choice questions or free-form text inputs. At first, this was just how the game happened to work. Over time, I realized that numerical inputs, which are all comparable with one another, open the door to many more game modes than the two initial ones and simplify the question generation process.

Multiple-choice questions are so prevalent because they allow questioners to measure almost any type of knowledge. They have a very simple scoring system, which saves a lot of time in the evaluation process. On the other hand, creating compelling distractors takes time and effort, and the format gives players room to roll a die and guess the correct answer rather than know it.

Free-form text inputs allow questioners to assess knowledge more precisely, but preparing a list of accepted answers requires considerable time and effort. Almost always, there are answers for which it is unclear whether they should be accepted, and a simple fuzzy search wouldn’t solve the problem.

Numerical inputs felt like the sweet spot, the best of both worlds. Yes, they limit what kind of knowledge we can measure, as not everything can be asked in numerical form, nor should it be most of the time. In return, there is no need for a set of distractors, and defining what is accepted doesn’t require overly complex logic.

Scoring

If a guess can only be right or wrong, the scoring is binary. Because numerical inputs are comparable, we can instead use a non-binary, continuous scoring system that assigns points to incorrect guesses using any formula we want. For example, we can take into account the time it took to answer. We can punish proximity errors harshly by reducing points steeply, or we can expect players to answer rapidly when a question shouldn’t take long to think about.

For example, in the video above, the maximum score is 1000. I guessed in 3.57 seconds and was off by -2973. The bot guessed in 4.93 seconds and was off by -909. I scored 763 points, while the bot scored 885 points. Being faster helped a bit, but my guess was much further off than the bot’s.
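To make the idea of blending proximity and speed concrete, here is a minimal sketch. It is not the formula Quantise uses and will not reproduce the scores from the video; the exponential decay, the parameter names, and their default values are all assumptions chosen for illustration.

```python
import math


def continuous_score(
    guess: float,
    answer: float,
    elapsed_s: float,
    max_score: int = 1000,
    error_scale: float = 0.5,    # assumed: larger values forgive bigger errors
    time_limit_s: float = 30.0,  # assumed round length in seconds
    speed_weight: float = 0.2,   # assumed share of the score tied to speed
) -> int:
    """Hypothetical scoring: proximity decays exponentially, speed scales it."""
    # Relative error, guarded against division by zero when the answer is 0.
    relative_error = abs(guess - answer) / max(abs(answer), 1e-9)
    # 1.0 for an exact guess, falling towards 0.0 as the error grows.
    proximity = math.exp(-relative_error / error_scale)
    # 1.0 for an instant answer, 0.0 once the time limit is reached.
    speed = max(0.0, 1.0 - elapsed_s / time_limit_s)
    # A slow but exact guess keeps (1 - speed_weight) of the maximum score.
    factor = (1.0 - speed_weight) + speed_weight * speed
    return round(max_score * proximity * factor)
```

Multiplying the two components, rather than adding them, means a fast but wildly wrong guess still scores close to zero. Raising `speed_weight` or lowering `error_scale` makes the mode harsher on slow or imprecise players respectively.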

Tolerance

Observing how numerical input-based games are played reveals a problem: the Negligible Distinction Trap. For example, imagine this question: “What is the average mass of male African elephants in kilograms?” That is not something that is precisely measured and averaged, nor is there an authoritative source stating “on average X kg with ±Y tolerance”. So generally, the question has an answer of just X, the exact X. Such games rely on you to round to tens, hundreds, or thousands, considering the word “average”, and call it a day.

My solution to this problem was tolerances. We define a midpoint and assign a tolerance around it, and any guess within that range counts as 100% correct. Of course, for binary-scored games, the “one above” or “one below” issue still exists at the edge of the range, but there has to be a hard cap on what is considered correct within an approximation.
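A midpoint with a tolerance can be sketched in a few lines. The class and field names below are hypothetical, and the numbers in the usage note are placeholders rather than real question data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericAnswer:
    """Hypothetical answer definition: a midpoint plus an absolute tolerance."""

    midpoint: float
    tolerance: float  # same unit as the midpoint

    def is_fully_correct(self, guess: float) -> bool:
        # Anything within midpoint ± tolerance counts as 100% correct.
        return abs(guess - self.midpoint) <= self.tolerance
```

With `NumericAnswer(midpoint=1000, tolerance=50)`, a guess of 1050 is fully correct while 1051 is not, which is exactly the hard cap described above.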

Non-locality

Because numerical inputs are universal, players across the globe can play together once the questions are translated. Furthermore, by defining conversion ratios, we can even allow players using metric and imperial systems to play in the same lobby. This also lets us approach units from a different perspective: we are no longer just creating conversion ratios, but also deriving new questions from existing information using a fact-based system. For example, given one fact that the fastest train in the world has a top speed of 603 km/h, and another that the average distance between the Earth and the Moon is 384,400 km, we can ask how long it would take to travel that distance at that speed (≈26.6 days).
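The train and Moon example can be worked through in code. This is only a sketch of the idea: the `Fact` type and the helper names are assumptions, not the structure of Quantise's fact system. The two values come from the paragraph above, and the kilometre-to-mile ratio is the standard definition of the international mile.

```python
from dataclasses import dataclass

KM_PER_MILE = 1.609344  # exact by definition of the international mile


@dataclass(frozen=True)
class Fact:
    """Hypothetical fact record: a named value with a unit label."""

    name: str
    value: float
    unit: str


train_top_speed = Fact("fastest train top speed", 603.0, "km/h")
earth_moon_distance = Fact("average Earth-Moon distance", 384_400.0, "km")


def km_to_miles(km: float) -> float:
    # A conversion ratio lets metric and imperial players share one question.
    return km / KM_PER_MILE


def travel_time_days(distance: Fact, speed: Fact) -> float:
    # Derived question: distance / speed gives hours, then convert to days.
    hours = distance.value / speed.value
    return hours / 24.0


days = travel_time_days(earth_moon_distance, train_top_speed)
print(f"{days:.1f} days")  # 26.6 days
```

Because the derived answer is a duration, it is the same whether the lobby displays the inputs in kilometres or miles; only the presentation of the source facts changes.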

Configurability

The first mode acts like a flag race. Players take alternating turns guessing a number, and the host provides higher or lower feedback. Each wrong guess effectively narrows the possible range, forcing the next player to make a more educated, high-stakes guess until someone hits the exact target.
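The narrowing effect of higher or lower feedback can be sketched as a shrinking range. This assumes integer answers and a known starting range, and the `GuessRange` name and the feedback strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GuessRange:
    """Hypothetical tracker for the range that can still contain the target."""

    low: int
    high: int

    def apply_feedback(self, guess: int, feedback: str) -> None:
        if feedback == "higher":
            # The target is above the guess, so raise the lower bound.
            self.low = max(self.low, guess + 1)
        elif feedback == "lower":
            # The target is below the guess, so drop the upper bound.
            self.high = min(self.high, guess - 1)
        else:
            raise ValueError(f"unknown feedback: {feedback!r}")

    def is_possible(self, guess: int) -> bool:
        return self.low <= guess <= self.high
```

Starting from 1 to 1000, a guess of 100 answered with "lower" followed by a guess of 91 answered with "higher" leaves only 92 to 99 for the next player.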

The second mode, which I call Hot Potato, adds a hidden random timer. The player who has the turn continuously submits their guesses to pass the turn to the next player as fast as possible. Because the UI only discloses partial clues about when the timer will expire, there is no safe zone to stall.
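One way to picture a hidden timer with partial clues is shown below. It is a sketch under assumptions: the uniform draw, the three coarse stages, and both function names are illustrative, and the actual randomness formula in Quantise may differ.

```python
import random


def draw_hidden_duration(min_s: float, max_s: float, rng: random.Random) -> float:
    # Assumed formula: a uniform draw that is never shown to the players.
    return rng.uniform(min_s, max_s)


def clue_stage(elapsed_s: float, hidden_duration_s: float, stages: int = 3) -> int:
    """Return a coarse stage from 1 to `stages` instead of the exact time left."""
    fraction = min(max(elapsed_s / hidden_duration_s, 0.0), 1.0)
    return min(stages, int(fraction * stages) + 1)


rng = random.Random(42)  # seeded only to make the sketch reproducible
duration = draw_hidden_duration(20.0, 60.0, rng)
```

The UI would only ever render the stage, for example as a colour, so players know the potato is getting hotter without knowing how many seconds remain.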

The next mode I developed is something we can call One Shot. Everyone guesses simultaneously, so unlike the previous two modes it is not turn-based. Because there are no turns, we score each guess on a continuous scale combining proximity and speed.

In all these game modes, there is emergent randomness in how each player will guess, and deliberate RNG in Hot Potato’s timer. Tolerance configurations across all game modes, the target score in Flag Race, the timer randomness formula and number of lives in Hot Potato, and the scoring formula in One Shot all allow us to tailor the experience exactly as we want.

For convenience, I’ve created paces where each mode lasts approximately the same amount of time, but detailed configuration is still available.

QoL

Once higher and lower feedback has been given, some guesses can no longer be correct. So I thought, why not prevent impossible guesses from the get-go? That’s why Quantise has a digit locking mechanism.

In the video above, after guessing 100, we know that the answer is at most 99 and at least 92, so we lock the digit 9. That is trivial for this example, but imagine much larger numbers. For them, this mechanism significantly speeds up the process by avoiding unnecessary work.
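A plausible way to decide which digits to lock is to take the leading digits shared by the lowest and highest remaining candidates. The function below is a sketch of that idea for non-negative integers; the name `locked_prefix` is hypothetical and the real mechanism may work differently.

```python
def locked_prefix(low: int, high: int) -> str:
    """Return the leading digits shared by every integer in [low, high]."""
    lo, hi = str(low), str(high)
    if len(lo) != len(hi):
        # Different digit counts: no leading digit is fixed yet.
        return ""
    prefix = []
    for a, b in zip(lo, hi):
        if a != b:
            break
        prefix.append(a)
    return "".join(prefix)


print(locked_prefix(92, 99))             # 9
print(locked_prefix(1234500, 1234999))   # 1234
```

When both bounds have the same number of digits, every value between them shares their common prefix, so those digits are safe to lock in the input field.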

Crowd Sourcing

Since creating a numerical question consists of accurately defining a meaningful fact and its value within a reasonable tolerance, the moderation, translation, and tolerance assignment processes can be automated by AI. That, of course, may result in hallucinations, but I believe it is a fair price to pay to make the whole system much more scalable. Acknowledging and acting on players’ contributions is essential to building a self-sustaining ecosystem. When players are rewarded for maintaining the quality of the question pool, they respect the game because they have actively helped build it.

What started as an attempt to replicate a TV show mechanic accidentally allowed me to bypass some flaws in modern trivia. By betting entirely on numerical inputs, Quantise stopped being a static quiz app and became a configurable, real-time engine. The inputs are universal, the scoring is continuous, and the RNG is deliberate rather than being a byproduct of multiple-choice design.

I document the development logs, architectural updates, and local build milestones on Bluesky at @quantise.app.