URL encoding and input normalization
How to work with unicode and special characters in the URL?
April 09, 2023
Most of the web-work that I have done in front-end has been in English. Sure, there were some cases where there were translations (which were done using i18next), but even then we didn't test it to the full extent.
Recently while working on a new project, we had to work with 'NFC-normalization' & encoding the URL when it contains special characters. This opened up a new rabbit hole for me, which I successfully navigated by sticking to 'what needs to be done', and making up my own test cases for it.
One issue I faced is with react-router v5:
The objective: I wanted to encode special characters in the URL. At first, I thought I was doing something wrong, because even the characters that encodeURIComponent
can understand, I was not able to encode. Eventually I realized it was a bug in react-router v5 + history modules (<v5
)
Rules on encoding/normalization that followed:
- NFC-normalize all user inputs.
- URL encode anything that can be in the URL (duh).
- Don't forget to decode when displaying encoded values back.