I was looking for a short regular expression to filter and validate the URLs that users enter in a form, and I finally came up with one that works well as a first-pass filter. It has been working fine for me.
It’s not perfect, but it catches most of the common mistakes users make when filling out forms, especially on mobile devices:
http[s]?[:][\/][\/]([^\.]\w[^\s]*\.)(.*?\.)?[a-zA-Z]{2,4}$
First it looks for the http prefix, written as literal text:
http
Then an optional s, to allow the https prefix of secure servers:
[s]?
Next the colon:
[:]
It also requires both forward slashes, to prevent typing “http:/”:
[\/][\/]
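One detail worth checking when writing the prefix: square brackets define a character class, which matches a single character from the set, so the word http and the two slashes each have to be spelled out. Here is a minimal Python sketch of the difference (the variable names are only for illustration):

```python
import re

# A character class matches ONE character taken from the set h, t, p.
one_char = re.compile(r"[http]")

# Literal prefix: the text "http", an optional "s", a colon, two slashes.
prefix = re.compile(r"http[s]?[:][\/][\/]")

print(one_char.fullmatch("p") is not None)     # a single letter is enough
print(one_char.fullmatch("http") is not None)  # four letters are too many
print(prefix.match("http://") is not None)
print(prefix.match("https://") is not None)
print(prefix.match("http:/designersgate.com") is not None)  # one slash only
```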
Next it looks for the first part of the host name: one character that is not a dot, then a letter, digit, or underscore, then any run of non-whitespace characters ending in a dot:
([^\.]\w[^\s]*\.)
This prevents the user from entering something like:
http://.designersgate.com
Next it looks for an optional further part of the name, ending in a dot:
(.*?\.)?
Finally the expression looks for the last letters after the final dot, which must be 2 to 4 letters long (e.g. co, com, mobi):
[a-zA-Z]{2,4}$
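Putting the pieces together, here is a minimal Python sketch of the filter. It assumes the prefix is meant as the literal text http followed by two separate slashes, as in the step-by-step breakdown. The names URL_FILTER and looks_like_url are made up for this example.

```python
import re

# Pattern assembled from the pieces described above.
URL_FILTER = re.compile(
    r"http[s]?[:][\/][\/]"   # http or https, colon, two slashes
    r"([^\.]\w[^\s]*\.)"     # first part: no leading dot, no whitespace
    r"(.*?\.)?"              # optional further part ending in a dot
    r"[a-zA-Z]{2,4}$"        # 2 to 4 letters at the very end
)

def looks_like_url(text: str) -> bool:
    """First-pass check only. search() is used because the pattern
    has no ^ anchor at the start."""
    return URL_FILTER.search(text) is not None

for candidate in (
    "http://designersgate.com",
    "https://designersgate.com",
    "http://designersgate.mobi",
    "http://.designersgate.com",
    "http:/designersgate.com",
    "http://designersgate.c",
):
    print(candidate, looks_like_url(candidate))
```

One consequence of the [^\.]\w pair is that the part before the first dot needs at least two characters, so a host such as http://a.com is rejected by this first pass.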
Now when you test this regular expression pattern with RegexBuilder, these are the results:
As I said, it is not a perfect filter, because some mistakes still get through, like:
http://ww.com
http://ww.designersgate.com
But this is intentional, in case the user’s URL starts with a short subdomain, like:
http://en.designersgate.com
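This trade-off can be checked with a short Python sketch. It again assumes the literal http prefix and two separate slashes, and URL_FILTER is just an illustrative name. The two mistyped addresses and the legitimate short subdomain all have the same shape, so the pattern cannot tell them apart:

```python
import re

# Same first-pass pattern as described above.
URL_FILTER = re.compile(
    r"http[s]?[:][\/][\/]([^\.]\w[^\s]*\.)(.*?\.)?[a-zA-Z]{2,4}$"
)

samples = [
    "http://ww.com",                # typo that still passes
    "http://ww.designersgate.com",  # typo that still passes
    "http://en.designersgate.com",  # legitimate short subdomain
]

# Every sample is accepted, which is the price of allowing "en.".
results = {url: URL_FILTER.search(url) is not None for url in samples}
for url, accepted in results.items():
    print(url, accepted)
```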
After this first round of filtering, you can check for any other errors that matter for your application, but I think this is a good start.
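As one possible example of such a later check, here is a hedged Python sketch of a second pass using the standard library's urllib.parse. The rules in it are an assumption about what an application might want, not part of the filter above: each host name label must be 1 to 63 letters, digits, or hyphens, with no hyphen at either end. The name second_pass is made up for this example.

```python
import re
from urllib.parse import urlsplit

# Assumed rule: one host name label, 1-63 chars, no leading/trailing hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def second_pass(url: str) -> bool:
    """Stricter follow-up check, meant to run after the first-pass regex."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    # An empty label (two dots in a row) fails the 1-63 length rule.
    return all(LABEL.fullmatch(label) for label in parts.hostname.split("."))

for candidate in (
    "http://en.designersgate.com",
    "http://designersgate..com",
    "http://-bad-.com",
):
    print(candidate, second_pass(candidate))
```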
If you need the pattern as a C# verbatim string literal, here it is:
@"http[s]?[:][\/][\/]([^\.]\w[^\s]*\.)(.*?\.)?[a-zA-Z]{2,4}$"
Happy Coding!