First download the existing tokenizer to ./my_tokenizer/tokenizer.json.
You only need tokenizer.json; for example, GPT-2’s can be obtained here.
Then open up a Python REPL and run some commands. I’m adding task & sentinel tokens for UL2R:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("./my_tokenizer/")
tokenizer.add_tokens(["<|r|>", "<|s|>", "<|x|>"])
# 3  
tokenizer.add_tokens([f"<|mask_{i}|>" for i in range(100)])
# 100
tokenizer.save_pretrained("./my_tokenizer")
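To see why a bank of numbered mask tokens is useful, here is a rough sketch of UL2-style span corruption in plain Python (an illustration of the general sentinel scheme, not the exact UL2R recipe; `corrupt_spans` and its arguments are hypothetical names):

```python
def corrupt_spans(tokens, spans):
    """tokens: list of str; spans: sorted, non-overlapping (start, end)
    index pairs. Each span is replaced in the input by a sentinel
    <|mask_i|>; the target lists each sentinel followed by the tokens
    it hides, which is what the model learns to reconstruct."""
    input_toks, target_toks = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<|mask_{i}|>"
        input_toks.extend(tokens[cursor:start])  # keep text up to the span
        input_toks.append(sentinel)              # hide the span itself
        target_toks.append(sentinel)
        target_toks.extend(tokens[start:end])    # reveal it in the target
        cursor = end
    input_toks.extend(tokens[cursor:])           # trailing unmasked text
    return input_toks, target_toks

toks = ["The", "quick", "brown", "fox", "jumps"]
inp, tgt = corrupt_spans(toks, [(1, 3)])
# inp == ["The", "<|mask_0|>", "fox", "jumps"]
# tgt == ["<|mask_0|>", "quick", "brown"]
```

Because each `<|mask_i|>` was added to the tokenizer above, it encodes to a single ID rather than being split into pieces.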