
Judging Difficulty in osu!taiko

osu!taiko is my favorite rhythm game. At some point I was about to enter the top 100, but had to slow down due to health problems across the entire body. Nonetheless, I have a fairly deep understanding of the game. In this article, I'd like to introduce the challenges of judging difficulty in this game to readers who have no experience with it.

The basics of the game are simple - there are two kinds of notes, blue (also called kat, or k for short) and red (also called don, or d for short). Additionally, there are four keys - two for each color. Levels consist of sequences of notes that you must hit in time, and your score depends on how many of them you hit and how accurately. There are some additional rules, but they aren't core to the game and can be ignored for the purpose of this article.

You can see an example play here.


So, where do we begin? As you could imagine, there are multiple playstyles! First, playstyles are separated into two groups - kkdd (one hand plays kats, the other hand plays dons) and kddk (each hand can play both kats and dons). Second, there are sub-groups within those top-level groups. Obviously, there are different key arrangements (you can use ddkk keybindings, or you can use kkdd, you can use kddk, or you can use dkdk), but they don't really change your perception of the game. What does change your perception of the game is the order in which you use your fingers.

For example, for the kddk playstyle, you can start each same-color chunk with the same hand (known as "rolling", along with other playstyles with non-strict alternating rules), you can play 1/2 notes (1/2 is osu! notation, music notation is 1/8) with the same hand, but alternate hands for 1/4 notes (known as "semi-alt"), or you can always alternate hands (known as "full-alt", which is how I personally play). Take, for example, this note sequence: d d kdk d, where kdk indicates kat, don and one more kat with 1/4 spacing. A rolling kddk player might play it as r r rrr r (r indicating the right hand, l the left), a semi-alt player might play it as r r rlr r, and a full-alt player might play it as r l rlr l.
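A minimal Python sketch of these three rules may help. It is only an illustration: the function names are made up, a space in the input stands for a 1/2 gap, every count is assumed to start on the right hand, "rolling" and "semi-alt" are assumed to restart on the right hand after each gap, and "full-alt" is taken to mean strict alternation with no resets at all.

```python
HANDS = "rl"  # assumption: every count starts on the right hand


def rolling(seq: str) -> str:
    """Restart on the right hand at every color change and after every gap."""
    chunks = []
    for chunk in seq.split():
        hands, count, prev = [], 0, None
        for note in chunk:
            if note != prev:
                count = 0  # a new same-color run starts on the right hand again
            hands.append(HANDS[count % 2])
            count += 1
            prev = note
        chunks.append("".join(hands))
    return " ".join(chunks)


def semi_alt(seq: str) -> str:
    """Alternate inside a 1/4 chunk, restart on the right hand after a gap."""
    return " ".join(
        "".join(HANDS[i % 2] for i in range(len(chunk))) for chunk in seq.split()
    )


def full_alt(seq: str) -> str:
    """Alternate on every note, no matter the color or the spacing."""
    hands, count = [], 0
    for ch in seq:
        if ch in "kd":
            hands.append(HANDS[count % 2])
            count += 1
        else:
            hands.append(ch)  # keep spaces so the output lines up with the input
    return "".join(hands)


for style in (rolling, semi_alt, full_alt):
    print(f"{style.__name__:>8}: {style('d d kdk d')}")
```

Only the full-alt rule makes the hand of a note depend on everything that came before it, which is where most of the reading problems discussed in the rest of the article come from.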

Obviously, each playstyle has different problems - which is why you can't exactly create a uniform difficulty scale. However, since full-alt is the playstyle I know best, I'll use that.


Let's start with classic 1/4 notes (1/16 in music notation).

In general, patterns start on a 1/2 beat. Longer patterns can be separated into smaller patterns - each of which also starts on a 1/2 beat. One might read kdkkdkddkkddk as kdkk dkdd kkdd k. As one's proficiency increases, one learns longer patterns, so a pro player might read the above as kdkkdkdd kkddk, etc. Intuition would suggest patterns like ddddd would be the easiest, as they are the simplest. However, that isn't actually true! While they might be simple to read in isolation, they are hard to hit - a kkddk pattern has a much more uniform per-finger hit distribution - even despite the fact I personally play with 2 fingers, kkddk allows me to slide the fingers across the keyboard, while ddddd requires me to hit the same key in succession. Furthermore, consider pattern sequences like ddd d ddd ddd d d ddd d and ddk d ddk ddk d d ddk d. Humans need "landmarks" in order to recognize a pattern. Try counting the characters as fast as possible here:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

and here:

AaAAaAaAAAaAaAaAAaAaAAaAAAaAaA

It will likely be much easier to count the number of characters in the second sequence, since you can use the lowercase letters as "landmarks". In a similar way, in the case of large monocolor chunks, you need to concentrate hard to spot the boundaries between different patterns. In the second note sequence (the one built from ddk), however, there are occasional k notes, which allow you to easily differentiate between different note chunks. Even in the case of d ddk vs k ddk, k ddk is much easier to read and play at high speed for the same reasons!
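One rough way to put a number on "landmarks" is to count color changes and to measure the longest single-color run. The sketch below does that for the two note sequences above; both metrics and their names are hypothetical illustrations, not something the game itself computes.

```python
def notes_only(seq: str) -> list:
    """Drop spaces and keep only the note colors."""
    return [ch for ch in seq if ch in "kd"]


def color_changes(seq: str) -> int:
    """Number of places where the color differs from the previous note."""
    notes = notes_only(seq)
    return sum(a != b for a, b in zip(notes, notes[1:]))


def longest_mono_run(seq: str) -> int:
    """Length of the longest run of same-colored notes."""
    longest = run = 0
    prev = None
    for note in notes_only(seq):
        run = run + 1 if note == prev else 1
        longest = max(longest, run)
        prev = note
    return longest


mono = "ddd d ddd ddd d d ddd d"
mixed = "ddk d ddk ddk d d ddk d"
for name, seq in (("mono", mono), ("mixed", mixed)):
    print(name, color_changes(seq), longest_mono_run(seq))
```

The monocolor sequence has no color changes at all and is one long run, while the ddk version breaks the same rhythm into short runs with a landmark every few notes.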

This suggests ddkkddkkddkkd is easy. However, that might not always be true either! Long repetitive patterns, once again, require more concentration, while something like kdkkdkddkkddkdkkd is in general easier to read because, while still having common patterns, it has more "landmarks" to differentiate between patterns.

What about patterns like kddkddkddkddk? Let's see how I would play them:

k d d k d d k d d k d d k

r l r l r l r l r l r l r

As you can see, odd kdd's are played as rlr, while even ones are played as lrl. This was really hard to learn - but, in practice, at this point it doesn't cause me much trouble, except when it repeats a lot and makes it hard to figure out the hand to start the next pattern with, but that's a problem of repetitive patterns rather than a problem of the pattern itself. There are similar patterns, such as dddkkkdddkkk and dkkdkkdkkdkk.

You might think that dkkdkkdkkdkk has the same difficulty as kddkddkddkdd. Wrong! In osu!taiko, k is generally used for accentuation, which means k is used much less off-beat. That means kddkdd is much more frequent than dkkdkk, so I get to play dkkdkk less often, making it harder to hit. Once again, my experience makes it a non-issue in most cases - but when I face further reading challenges I'll cover next, it can cause problems.

There are also advanced off-beat patterns that are hard to separate. For example, kdddkddd is easy to separate into two kddd, each starting on a 1/2 beat. However, kddddkdddd, which is generally separated by me into two kdddd, has two sub-patterns starting with different hands - which is harder to hit than kddkddkddkdd, partly due to its rarity, partly due to lack of "landmarks" (it can be easily confused with kddd or kddddd). Longer off-beat patterns such as kddddddkdddddd currently require memorization for me - essentially, I can't read them at all.
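The reason some of these repetitions flip hands and others don't comes down to parity: for a full-alt player, a sub-pattern with an odd number of notes hands the next sub-pattern to the other hand. A small sketch, assuming strict alternation that starts on the right hand (the function name is made up for illustration):

```python
def starting_hands(sub_patterns, first="r"):
    """Hand that starts each sub-pattern when every note alternates hands."""
    hands = "rl" if first == "r" else "lr"
    starts = []
    notes_so_far = 0
    for pattern in sub_patterns:
        starts.append(hands[notes_so_far % 2])
        notes_so_far += len(pattern)
    return starts


print(starting_hands(["kdd"] * 4))    # 3 notes each: the starting hand flips
print(starting_hands(["kddd"] * 2))   # 4 notes each: the starting hand stays
print(starting_hands(["kdddd"] * 2))  # 5 notes each: the starting hand flips
```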

As an example, Shinsekai is fairly slow and not rhythmically complex, but is really challenging in terms of colors (at the time I uploaded the video I had to memorize some parts; I need to memorize less now, but some memorization would still be required).


So far, I only covered color, while only briefly touching upon rhythm. First, let's talk about "odd" and "even" patterns.

There are patterns with an odd number of notes, like kkd, and there are patterns with an even number of notes, like kd. The most important difference is that if an "even" pattern is followed by a 1/2 spacing (not by a 3/4 spacing), the next pattern starts off-beat. Even patterns are, naturally, less common, since they either don't start or don't end on a 1/2 beat. Furthermore, if even and odd patterns are interleaved, like d kk ddd kkdk d kk dkk, they can form a complex and hard to read rhythm. Just look at this: r lr lrl rlrl r lr lrl. Since I usually read patterns "alongside" their starting hands, the fact that the starting hands are all over the place (r, l, l, r, r, l, l) is really confusing for me. Of course, just like with other 1/4 patterns, I'm used to it at this point, but it still poses a challenge at higher speed. There's one more challenge with even patterns, specifically with those of length 2. Generally, I read two successive note chunks as two separate patterns - that means I'd read kd dd kk dk dd as 5 separate patterns. Usually I'm able to read quite a lot of notes as a single pattern - but not in this case! This makes doubles hard to read, unless they form an easily recognizable larger pattern (for example, kd dk kd dk kd, or when every chunk consists of a single color, like kk dd kk kk dd kk).
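To see where each chunk of such a sequence lands relative to the beat, here is a small sketch. It assumes that notes inside a chunk are 1/4 apart, that a space is a 1/2 gap (two 1/4 units) unless told otherwise, and that the first chunk starts on a 1/2 beat; positions are counted in 1/4 units, so an even position is on a 1/2 beat. The function name is hypothetical.

```python
def chunk_starts(seq: str, gap_quarters: int = 2):
    """Return (chunk, start position in 1/4 units, starts on a 1/2 beat?)."""
    result = []
    start = 0
    for chunk in seq.split():
        result.append((chunk, start, start % 2 == 0))
        # the last note of the chunk sits at start + len - 1, then comes the gap
        start += len(chunk) - 1 + gap_quarters
    return result


for chunk, start, on_beat in chunk_starts("d kk ddd kkdk d kk dkk"):
    print(f"{chunk:<5} starts at {start:>2} ({'on' if on_beat else 'off'}-beat)")
```

Every even-length chunk flips the alignment of the chunk that follows it, while an odd-length chunk preserves it. Passing `gap_quarters=3` models the 3/4 spacing, after which an even chunk no longer pushes the next one off-beat.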

By the way, this is precisely what makes some parts of the first beatmap I linked hard to read.


If only 1/2 and 1/4 rhythms were used, the game would be much more boring than it is right now! Now introducing: 1/6 and 1/8 notes!

As an example of 1/6 notes: k__d__d__d__k_k_k_d__d__k_d_d_k__d__k

(k__d notes have 1/4 spacing, k_d have 1/6 spacing, my markdown formatter forces me to use _).

As an example of 1/8 notes: kk__dd__k_d_k_k_d (k_d notes have 1/4 spacing).

To condense the notation, I'll write the above as kddd(kkkd)d(kddk)dk and (kk)(dd)kdkkd from now on. This notation is widely used in the osu!taiko community (alongside other, less popular notations).

For starters, let's consider the following pattern: kdkk(dddk)dkkd. It can be split into two kdkkd patterns, each starting on a 1/2 beat, separated with (.dd.). However, the two kdkkd's start with different hands:

kdkk(dddk)dkkd -> kdkkd dd kdkkd

rlrl(rlrl)rlrl -> rlrlr lr lrlrl

This, once again, imposes a significant challenge on newcomers - especially on semi-alt players, who aren't used to starting patterns from a different hand! As you can see, (....) essentially flips the "main" hand used to hit the notes along a 1/2 beat.
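The flip can be checked with a small timing sketch. It assumes the parenthesized notes are 1/6 apart and every other gap is 1/4, which reproduces the 1/6 example given earlier; the spacing rules for 1/8 groups are not modeled, and the function names are made up for illustration. Times are in beats, and hands alternate strictly, starting on the right.

```python
from fractions import Fraction

QUARTER = Fraction(1, 4)


def note_times(notation, fast=Fraction(1, 6)):
    """Return (time_in_beats, color) for each note in condensed notation."""
    notes = []
    time = Fraction(0)
    in_group = False       # are we currently between "(" and ")"?
    prev_in_group = False  # was the previous note inside the still-open group?
    for ch in notation:
        if ch == "(":
            in_group, prev_in_group = True, False
        elif ch == ")":
            in_group, prev_in_group = False, False
        elif ch in "kd":
            if notes:
                time += fast if prev_in_group else QUARTER
            notes.append((time, ch))
            prev_in_group = in_group
    return notes


def hands_on_half_beats(notation, fast=Fraction(1, 6)):
    """Full-alt hands of the notes that land exactly on a 1/2 beat."""
    hands = []
    for index, (time, _color) in enumerate(note_times(notation, fast)):
        if (time * 2).denominator == 1:
            hands.append("rl"[index % 2])
    return hands


print(hands_on_half_beats("kdkkdkkd"))        # plain 1/4 stream
print(hands_on_half_beats("kdkk(dddk)dkkd"))  # with a four-note 1/6 group
```

In a plain 1/4 stream every 1/2 beat is hit by the same hand. The four-note group squeezes one extra note between two 1/2 beats, so the hand on the beat changes from that point on.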

What if we combine something hard in terms of color and something hard in terms of rhythm? How about (dddd)dk?

Yep, it's hard. But common. Since it's so common, I learned it at this point. But I do have much more trouble with something less common, like (ddkd)dk, despite it being conceptually similar - simply because it isn't common. Similarly, while (dddd)k is fairly easy, if you speed up ddd d k by two times, it becomes a pattern that's nearly the same physically, but way harder to read, because ddd d are separated mentally. That becomes even more apparent with sequences like ddd d ddd d d k - when sped up by two times, it's easy enough to hit the first part by just spamming d, but it's hard to figure out the hand to play k with, because you essentially need to count the parity of the sequence ddd d ddd d d!

There are also other rare snappings, such as 1/5 or 1/7. They have similar characteristics, but are slightly harder by virtue of being less common.

As an example, Primula is made hard by rhythm and colors.


One more factor is the so-called "SV". It doesn't matter what it stands for - all you need to know is that it specifies the scroll speed of a note (notes come from the right side of the screen).

In general, the higher the SV, the less time you have to react. Conversely, the lower the SV, the more notes you see at a time, and the smaller the spacing between them.

You might think lower SV is easier, but that's not necessarily true! osu!standard players might know that low AR is hard - it's the same here. If there are too many notes on the screen, inexperienced players can find it very confusing. Of course, when the SV becomes too high it's once again hard to read, because you simply can't read the notes quickly enough.

In fact, in the "old" osu! client known as osu!stable (the new client is open-source bar the audio library, but is still in development, and you can't use it to submit scores online), the number of notes you see at a time depends on your resolution. Some people got used to playing at a 4:3 resolution - and can't switch to 16:9 because of how many notes you see at a time. Essentially, by switching to a wider resolution, you can reduce the base SV.
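The relationship between scroll speed, playfield width and the number of visible notes can be sketched with simple arithmetic. This is a toy model, not the game's actual formula: it assumes a note travels the full visible width at a constant speed in pixels per millisecond, and the widths (1440 and 1920 pixels for 4:3 and 16:9 at the same height), the speed and the BPM are made-up example values.

```python
def quarter_note_interval_ms(bpm: float) -> float:
    """Time between two consecutive 1/4 notes (osu! notation) at a given BPM."""
    return 60_000 / bpm / 4


def reaction_window_ms(visible_px: float, speed_px_per_ms: float) -> float:
    """How long a note is visible before it reaches the target."""
    return visible_px / speed_px_per_ms


def notes_on_screen(visible_px: float, speed_px_per_ms: float, bpm: float) -> float:
    """Approximate number of 1/4 notes visible at once in a continuous stream."""
    window = reaction_window_ms(visible_px, speed_px_per_ms)
    return window / quarter_note_interval_ms(bpm)


def spacing_px(speed_px_per_ms: float, bpm: float) -> float:
    """Distance on screen between two consecutive 1/4 notes."""
    return speed_px_per_ms * quarter_note_interval_ms(bpm)


for label, width in (("4:3", 1440), ("16:9", 1920)):
    count = notes_on_screen(width, speed_px_per_ms=1.5, bpm=200)
    print(f"{label}: about {count:.1f} notes visible")
```

In this model a wider playfield at the same speed shows more notes and gives more time to react, which is the same effect as lowering the SV, except that the spacing between notes stays the same.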

Having more space between notes can also help you be more accurate visually - it's much easier to hit the notes accurately when you can clearly see when the note reaches the "target area". If the "target area" is scaled down because of lower note speed, it becomes harder to judge the right time to hit visually, requiring you to rely on audio more (players usually rely on both video and audio). In fact, you could get better accuracy simply because seeing fewer notes at a time is easier, leaving more brain power to spend on hitting accurately.

Conversely, the easier it is to score an accurate hit, the more you can concentrate on hitting the notes at all.


Finally, I want to talk about the HD modifier, which hides the notes before you need to hit them, forcing you to keep them in your short-term memory. It hides the notes at a certain point of the screen, not at a certain time before you need to hit them. What that means is the slower the note, the longer you need to remember it.
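Since the hiding point is a fixed position rather than a fixed time, the time a note has to be kept in memory is simply the hidden distance divided by the scroll speed. A tiny sketch, with a made-up hidden distance and made-up speeds in pixels per millisecond:

```python
def memory_time_ms(hidden_px: float, speed_px_per_ms: float) -> float:
    """Time between a note disappearing and the moment it must be hit."""
    return hidden_px / speed_px_per_ms


# Assumed values: the note disappears 300 px before the target.
for speed in (1.5, 0.75):
    print(f"speed {speed} px/ms -> remember for {memory_time_ms(300, speed):.0f} ms")
```

Halving the speed doubles the time each note has to be remembered.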

One more aspect of HD in osu!stable is that it restricts your playfield to an effective 4:3, causing fewer notes to be visible at a time. Essentially, it restricts the notes you see to a particular spot on the screen for both 4:3 and 16:9 players. Indeed, I think that's important for game balance, though it would've been better not to have the playfield depend on resolution at all.

It causes two difficulties. First, you have less time to recognize a pattern. Second, you have less time to read a pattern. These might sound similar, but there's a difference. The first is the amount of time you need to differentiate between similar patterns - for example, kddd and kdd. The second is the amount of time you need to figure out what keys to press depending on what colors you saw. The first part is generally a non-issue without HD, but once HD reduces the visible area, you notice that "recognizing" is a thing.

There's a third aspect to HD! When combined with EZ, a modifier that reduces SV and OD (OD is accuracy difficulty), the visible area increases! When combined with HR, a modifier that increases SV and OD, the visible area reduces to the point most players can't recognize anything past 1/2 without increasing their aspect ratio.

Due to 4:3 being considerably hard at high BPM, especially over 300, HD alone is pretty hard on fast maps. But when you add EZ, the "effective" SV is roughly similar to NM (no-modifier) 16:9. This caused EZHD to become one of the most efficient ways to rank up when playing high BPM maps.


Accuracy difficulty is its own can of worms. As I already said, it partly depends on how much brain power you can "spare" on accurate hits, but it also depends on whether you can read a rhythm well! For example, even patterns are harder to hit accurately. Accuracy can also be affected by how hard a pattern is for you to read in general, by the physical difficulty of a pattern (e.g. speed), by how exhausted you are, by how repetitive a pattern is (English doesn't exactly have a word for this; Russian has "легко сбиться с ритма" and Japanese has "ズレやすいです", both meaning roughly that it's easy to get thrown off the rhythm, here due to the lack of landmarks) and so on.

But if OD is low enough, all of what I described above becomes easier - it's easier to hit "in rhythm", as far as the game is concerned, anyway.


So, if one were to make an algorithm to calculate difficulty for osu!taiko, how would one do it?

First, the fact that certain patterns are challenging to beginners but easy for pro players even at high speed needs to be acknowledged. At a high level, patterns themselves mostly pose no challenge - it's pattern transitions that are hard. Rather than "how do I hit this pattern", it becomes a game of "which hand do I start this pattern with". A notable exception is double notes - as I mentioned, I read each of them as a separate pattern - they are an example of a pattern that's hard "by itself", ignoring the transitions.

But low-level players are important too - they form most of the userbase! You need to incentivize them to improve, and the difficulty system needs to know what patterns are challenging for newcomers to do that - such factors need to have a high weight on low levels, but a low weight on high levels.
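As a purely hypothetical illustration of skill-dependent weights (none of the names or numbers below come from the game or from any real difficulty system), one could blend a "beginner" weight and a "pro" weight for each factor according to the player's level:

```python
# Hypothetical weights per factor: (weight for a newcomer, weight for a pro).
WEIGHTS = {
    "pattern": (1.0, 0.25),     # hitting the pattern itself
    "transition": (0.25, 1.0),  # choosing the hand to start the next pattern
}


def blend(beginner_weight: float, pro_weight: float, skill: float) -> float:
    """Linear interpolation; skill runs from 0.0 (newcomer) to 1.0 (pro)."""
    skill = min(max(skill, 0.0), 1.0)
    return beginner_weight + (pro_weight - beginner_weight) * skill


def weighted_difficulty(factors: dict, skill: float) -> float:
    """Sum of factor values, each scaled by its skill-dependent weight."""
    return sum(
        value * blend(*WEIGHTS[name], skill) for name, value in factors.items()
    )


beatmap = {"pattern": 4.0, "transition": 2.0}  # made-up factor values
for skill in (0.0, 0.5, 1.0):
    print(skill, weighted_difficulty(beatmap, skill))
```

A linear blend is only the simplest possible choice; the point is that the same beatmap gets a different rating depending on which factors still matter at a given level.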

The fact that "general" SV, difficulty and accuracy are interlinked needs to be acknowledged too. HR increases OD - but since it increases SV too, it can actually make getting high accuracy easier! EZHD is easier than HD on high SV, and due to the decreased OD it can sometimes be easier than NM - but on low SV it will only make everything harder, certainly harder than NM.

At some point I wanted to update the game's poor difficulty system. But after noticing the number of factors that need to be accounted for, I decided I don't mind the current system enough to create a new one that takes everything I thought about into consideration. Indeed, a community member has since created a new difficulty system - it fixed some glaring issues with the former system, but, in my subjective opinion, made things less "fun", partly because mappers (beatmap creators; beatmap is an osu! term for chart/level) could no longer exploit these issues to manually mark a beatmap as "hard", partly because complex rhythms started being rewarded way less, creating a meta of simple patterns with high accuracy. It's hard to differentiate between "a map for pro players" and "a map for slightly less pro players" for an algorithm that doesn't consider a multitude of factors, but said difference is crucial for players whose skill lies somewhere in between "pro" and "less pro"! The new system can better estimate the difficulty, but it can't pinpoint it - which is why I was much more satisfied with my top plays in the old system.

In conclusion, for such a simple game, estimating difficulty can be surprisingly complex, and it can give us some insight into how humans learn and recognize patterns in general. I hope this article was interesting, in spite of how useless this knowledge is.

