April 21, 2015

Extract Top Level Domain from Domain name

Joao Alves’s Question:

I have an array of top level domains like:

['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn']

Given a domain name, how can I extract its top level domain based on the array above? Basically I don't care how many levels the domain has; I only need to extract the top level domain.

For example:

test1.ag -> should return ag

test2.com.ag -> should return com.ag

test.test2.com.ag -> should return com.ag

test3.org -> should return false

Thanks

Updated to incorporate Traxo’s point about the . wildcard; I think my answer is a little fuller so I’ll leave it up but we’ve both essentially come to the same solution.

//set up test variables
$aTLDList = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];
$sDomain = "badgers.co.uk"; // for example

//build the match
$reMatch = '/^.*?\.(' . str_replace('.', '\.', implode('|', $aTLDList)) . ')$/';
$sMatchedTLD = preg_match($reMatch, $sDomain) ? 
        preg_replace($reMatch, "$1", $sDomain) : 
        "";

Resorting to Regular Expressions may be overkill but it makes for a concise example. This will give you either the TLD matched or an empty string in the $sMatchedTLD variable.

The trick is to make the first .* match ungreedy (.*?); otherwise it consumes as much of the name as it can, and badgers.com.ag will match ag rather than com.ag.
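As a rough sketch of how the same pattern could be wrapped up to return false on a miss, as the question asks, something like the following might work. The function name extractTld() is made up for illustration, and using preg_quote() in place of str_replace() is my own substitution, not part of the answer above.

```php
<?php
// Hypothetical helper (not from the original answer): returns the matched
// TLD from $tldList, or false when the domain ends in none of them.
function extractTld(string $domain, array $tldList)
{
    // Escape each TLD so that "." is matched literally rather than as a wildcard.
    $alternatives = array_map(function ($tld) {
        return preg_quote($tld, '/');
    }, $tldList);

    // Lazy prefix, a literal dot, then one of the known TLDs at the very end.
    $pattern = '/^.*?\.(' . implode('|', $alternatives) . ')$/';

    return preg_match($pattern, $domain, $matches) ? $matches[1] : false;
}

$tlds = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];

var_dump(extractTld('test1.ag', $tlds));          // string(2) "ag"
var_dump(extractTld('test2.com.ag', $tlds));      // string(6) "com.ag"
var_dump(extractTld('test.test2.com.ag', $tlds)); // string(6) "com.ag"
var_dump(extractTld('test3.org', $tlds));         // bool(false)
```

Reading the capture group from the $matches argument of preg_match() avoids running the pattern a second time through preg_replace().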

The parse_url() function gives you access to the host name of the URL. You can use that to process the host name and find out the TLD.

$url = 'http://your.url.com.np';
var_dump(parse_url($url, PHP_URL_HOST));

Next steps could be using explode() to split the host name and checking the last item in the exploded list. But I am going to leave that to you.
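One way the explode() step might look, purely as a sketch: split the host into labels and test each trailing suffix against the list, longest first, so that com.ag wins over ag. The function name findTldByLabels(), the lowercasing of the host, and the example URL are assumptions added for illustration.

```php
<?php
// Hypothetical helper (not from the original answer): returns the longest
// suffix of $host that appears in $tldList, or false if there is none.
function findTldByLabels(string $host, array $tldList)
{
    // Assumption: host names are compared case-insensitively.
    $labels = explode('.', strtolower($host));

    // Start at index 1 so the first label is never treated as part of the TLD,
    // then try progressively shorter suffixes.
    for ($i = 1; $i < count($labels); $i++) {
        $candidate = implode('.', array_slice($labels, $i));
        if (in_array($candidate, $tldList, true)) {
            return $candidate;
        }
    }

    return false;
}

$tlds = ['ag', 'asia', 'asia_sunrise', 'com', 'com.ag', 'org.hn'];

$host = parse_url('http://test.test2.com.ag/index.php', PHP_URL_HOST);
var_dump(findTldByLabels($host, $tlds));        // string(6) "com.ag"
var_dump(findTldByLabels('test3.org', $tlds));  // bool(false)
```

Checking only the last item of the exploded list would miss multi-label entries such as com.ag, which is why the sketch compares whole suffixes instead.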
