URLs, or Uniform Resource Locators, are addresses used to find resources like HTML webpages on the internet. URLs are designed to be human-readable. A common example is “Google.com”, which almost everyone knows by heart.
URLs act a lot like file paths on a computer. They use the “/” character to separate one folder from the next. They also use other special characters, which we will discuss later.
Anyway, URLs can only contain a limited set of ASCII characters, so they need to be encoded before they can be sent over the internet. The encoding also helps to differentiate between the ordinary use of a special character and its specified role in a URL.
So, let’s see how that works, along with some examples.
As we learned before, URL encoding is the process of converting parts of a URL into ASCII format. Standard text characters like “a,b,c,d…” are already in ASCII. However, URLs have some special characters called reserved characters that need to be encoded before sending a URL.
These reserved characters include the following:
:, /, ?, #, &
All of these characters have special meanings in a URL. For example, ‘#’ is called the fragment identifier and is used to navigate to a specific section of the webpage. If ‘#’ is present in a URL but is not supposed to act as the fragment identifier, then it needs to be encoded into its ASCII code in hexadecimal format.
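To see why this matters, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The two example.com URLs are made up for illustration. The first contains a raw ‘#’ inside the query, the second contains the same character encoded as `%23`.

```python
from urllib.parse import urlsplit

# Raw '#': everything after it is treated as the fragment,
# so the query is cut short and "lang=en" is lost from it.
raw = urlsplit("https://example.com/search?query=C#&lang=en")
print(raw.query)     # query=C
print(raw.fragment)  # &lang=en

# Encoded '#' (%23): the character stays inside the query string.
encoded = urlsplit("https://example.com/search?query=C%23&lang=en")
print(encoded.query)     # query=C%23&lang=en
print(encoded.fragment)  # (empty string)
```

In the first case the parser has no way of knowing that the ‘#’ was meant as plain text, which is exactly the ambiguity that encoding removes.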
URL encoding is done via a method known as percent-encoding. In percent-encoding, a character in a URL is replaced with a “%” sign followed by its two-digit hexadecimal ASCII code.
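The rule can be sketched in a few lines of Python. This is only an illustration, not how any particular library does it: `percent_encode` and `UNRESERVED` are hypothetical names, and the sketch assumes that every character other than letters, digits, and `- . _ ~` should be encoded, with UTF-8 used to turn each character into bytes.

```python
# Characters left untouched in this sketch: ASCII letters, digits, and - . _ ~
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def percent_encode(text: str) -> str:
    """Hypothetical helper: percent-encode every character outside UNRESERVED."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # Turn the character into bytes (UTF-8 assumed) and write each byte as %XX.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


print(percent_encode("#"))    # %23
print(percent_encode("A&B"))  # A%26B
```

For plain ASCII characters, the UTF-8 byte is the same as the ASCII code, so ‘#’ (hexadecimal 23) becomes `%23` and ‘&’ (hexadecimal 26) becomes `%26`.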
A common example is the space character. In a normal URL, the space character is not used. Instead, dashes (-) or the plus sign (+) are used. However, sometimes the URL may contain a real sentence passed as a parameter, e.g., “example.com/search?query=hello world&lang=en”.
The space in “hello world” needs to be encoded into its percent-encoded form, which is “%20”. So the URL becomes:
https://example.com/search?query=hello%20world&lang=en
This is the process of URL encoding.
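In practice you would rarely do this by hand. As a minimal sketch, Python's standard `urllib.parse.quote` produces the same result for the search example. The variable names `base` and `query_text` are chosen here for illustration only.

```python
from urllib.parse import quote

base = "https://example.com/search"
query_text = "hello world"

# quote() percent-encodes the space as %20 and leaves the letters as they are.
url = f"{base}?query={quote(query_text)}&lang=en"
print(url)  # https://example.com/search?query=hello%20world&lang=en
```

Only the parameter value is passed to `quote()`. The “?”, “=”, and “&” around it are left alone because they are doing their normal job as URL separators.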
URL decoding is the process of converting the encoded URL back into its original form. Every “%” sign followed by two hexadecimal digits is replaced with the character it stands for. Decoding is required once a URL reaches its destination, e.g., the web server or your browser. This is why you rarely see encoded URLs in everyday use.
Here’s an example of an encoded URL.
https://example.com/search?query=C%2B%2B%20Programming%3A%20A%20Beginner%27s%20Guide%20%26%20Examples%21&lang=en
You can see that there are plenty of “%” signs in there, which means a lot of characters have been encoded.
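If we decode these signs, we will get the original URL. Here’s what all of the codes in this example stand for:
%2B = + (plus sign)
%20 = space
%3A = : (colon)
%27 = ' (apostrophe)
%26 = & (ampersand)
%21 = ! (exclamation mark)
Going by these codes, the decoded URL is as follows: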
https://example.com/search?query=C++ Programming: A Beginner's Guide & Examples!&lang=en
And now, the decoding is complete.
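The same decoding can be sketched with Python's standard library. One detail worth noting: the fully decoded URL above is handy for reading, but it contains a plain “&” inside the query text, so a program should split the URL into parameters first and decode each value afterwards. `urllib.parse.parse_qs` works in that order.

```python
from urllib.parse import urlsplit, parse_qs

encoded_url = (
    "https://example.com/search?query=C%2B%2B%20Programming%3A%20A%20"
    "Beginner%27s%20Guide%20%26%20Examples%21&lang=en"
)

# Split on the real '&' separators first, then decode each value.
params = parse_qs(urlsplit(encoded_url).query)
print(params["query"][0])  # C++ Programming: A Beginner's Guide & Examples!
print(params["lang"][0])   # en
```

Because the “&” inside the title arrived as `%26`, it is never mistaken for the separator between `query` and `lang`.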
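In real life, many programming languages and web development tools provide built-in functions for encoding and decoding URLs. Some common examples include Python, JavaScript, and PHP. The specific function names are given below.
Python: urllib.parse.quote() and urllib.parse.unquote()
JavaScript: encodeURIComponent() and decodeURIComponent()
PHP: urlencode() and urldecode()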
It all depends on what technology stack you are working with. Every stack has its own specific methods. You can easily look them up in the official documentation or on tech websites like Stack Overflow.
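As one example of such built-in functions, here is a rough round trip in Python using `quote`, `unquote`, `quote_plus`, and `unquote_plus` from the standard `urllib.parse` module. The sample text is the one from the decoding example, and `safe=""` is passed so that no reserved character is left unencoded.

```python
from urllib.parse import quote, unquote, quote_plus, unquote_plus

text = "C++ Programming: A Beginner's Guide & Examples!"

# Encode, then decode, and check that nothing was lost along the way.
encoded = quote(text, safe="")
print(encoded)
# C%2B%2B%20Programming%3A%20A%20Beginner%27s%20Guide%20%26%20Examples%21
print(unquote(encoded) == text)  # True

# quote_plus() is the form-style variant: spaces become '+' instead of %20.
print(quote_plus("hello world"))    # hello+world
print(unquote_plus("hello+world"))  # hello world
```

Other languages offer equivalent pairs of functions, so the same round trip should carry over with different names.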
So, there you have it: URL encoding and decoding. We discussed what they are and why they are necessary. We also covered how they are done, both manually and in practice. Now, you should have a better understanding of URL encoding and decoding.