I’d assumed that emoji were all organized into a contiguous Unicode codepoint range, but this is very much not the case! There are more than a thousand different ranges containing emoji. The Unicode consortium makes the complete list available as a file, emoji-data.txt.

Here are a few lines:

25FB..25FE    ; Emoji                # E0.6   [4] (◻️..◾)    white medium square..black medium-small square
2600..2601    ; Emoji                # E0.6   [2] (☀️..☁️)    sun..cloud
2602..2603    ; Emoji                # E0.7   [2] (☂️..☃️)    umbrella..snowman
2604          ; Emoji                # E1.0   [1] (☄️)       comet

I wanted to convert this list into the form U+25FB-25FE,U+2600-2601,... for use with the kitty terminal’s symbol_map configuration option.

I wrote some shell to convert it into that format:

curl https://www.unicode.org/Public/UCD/latest/ucd/emoji/emoji-data.txt \
  | grep -v '^#' \
  | sed -e 's/;.*//' -e 's/[[:space:]]//g' -e '/^$/d' -e 's/\.\./-/' -e 's/^/U+/' \
  | sort \
  | uniq \
  | tr '\n' ',' \
  | sed 's/,$//'

Some pretty big caveats: