var extractRootDomain = function(url){
  return url.match(/^https?\:\/\/([^\/?#]+)(?:[\/?#]|$)/i)[1].split('.').slice(-2).join('.');
};
extractRootDomain("localhost")
Uncaught TypeError: Cannot read property '1' of null at extractRootDomain (<anonymous>:2:61) at <anonymous>:1:1
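The error above arises because `match` returns `null` when the input has no `http://` or `https://` prefix, so indexing `[1]` throws. A minimal sketch of a guarded variant follows, assuming it is acceptable to return `null` for such input. The name `extractRootDomainSafe` is hypothetical and not part of the gist.

```javascript
// Hypothetical guarded variant of the gist's function (not the author's code).
const extractRootDomainSafe = (url) => {
  // Same pattern as the gist; exec() returns null when there is no http(s):// prefix.
  const match = /^https?:\/\/([^\/?#]+)(?:[\/?#]|$)/i.exec(url);
  if (match === null) return null;
  // Still keeps only the last two labels, so multi-part suffixes remain unhandled.
  return match[1].split('.').slice(-2).join('.');
};

console.log(extractRootDomainSafe("localhost")); // null
console.log(extractRootDomainSafe("http://localhost/")); // "localhost"
```

This only removes the crash. It does not address hosts such as `google.co.uk`, which the later comments discuss.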
Probably better to change this to parse the given url first:
const parsed = new URL(window.location.href)
Then just do string and array manipulation based on what kinds of domains you're working with (-2 for .com domains, -3 for .co.uk domains, etc.):
parsed.hostname.split('.').slice(-2).join('.')
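The suggestion above can be sketched as a small helper that takes the number of labels to keep as a parameter. This is only an illustration under stated assumptions: the name `lastLabels` and its `labels` parameter are hypothetical, and the caller is assumed to know how many labels the domain's suffix needs.

```javascript
// Hypothetical helper: keep the last `labels` dot-separated labels of the hostname.
function lastLabels(url, labels = 2) {
  // The URL constructor throws a TypeError for input without a scheme, e.g. "localhost".
  const { hostname } = new URL(url);
  return hostname.split('.').slice(-labels).join('.');
}

console.log(lastLabels("https://www.example.com/a")); // "example.com"
console.log(lastLabels("http://www.google.co.uk/blah", 3)); // "google.co.uk"
```

In a browser, `window.location.href` can be passed as the `url` argument.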
Not working on domains like "http://google.co.jp".
Does not work with https://www.mesdroitssociaux.gouv.fr/accueil/
You can try this one: https://github.com/scrapingapi/get-root-domain
Thanks, but it's the same thing... It would be simpler and more reliable to use an SLD list as a base and match the URL against the entries of that list.
We just need a repo with a list that is always kept up to date.
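A list-based lookup along these lines might look like the sketch below. It uses a `Set` lookup rather than a regex, and the `KNOWN_SUFFIXES` entries are a tiny illustrative subset chosen to cover the hosts mentioned in this thread. A maintained list such as the Public Suffix List would be the realistic data source. The names `KNOWN_SUFFIXES` and `rootDomainFromList` are hypothetical.

```javascript
// Illustrative subset only (an assumption); a real suffix list is far longer.
const KNOWN_SUFFIXES = new Set(["co.uk", "co.jp", "gouv.fr", "ac.at", "com.au"]);

// Hypothetical helper: return the registrable domain using the suffix set above.
function rootDomainFromList(url) {
  const labels = new URL(url).hostname.split('.');
  // Scan from the left so the longest listed suffix is found first.
  for (let i = 1; i < labels.length; i++) {
    const suffix = labels.slice(i).join('.');
    if (KNOWN_SUFFIXES.has(suffix)) {
      // Keep the one label in front of the matched suffix.
      return labels.slice(i - 1).join('.');
    }
  }
  // No listed suffix matched: fall back to the last two labels.
  return labels.slice(-2).join('.');
}

console.log(rootDomainFromList("https://www.mesdroitssociaux.gouv.fr/accueil/")); // "mesdroitssociaux.gouv.fr"
console.log(rootDomainFromList("http://google.co.jp")); // "google.co.jp"
console.log(rootDomainFromList("https://www.youtube.com/")); // "youtube.com"
```

The result is only as good as the list, which is why an up-to-date source matters.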
A one-liner:
const getRoot = (url = "") => (new URL(url)).hostname.split('.').slice(-2).join('.')
@innocentamadi that also does not work, depending on how many ending segments the domain has. For www.fhstp.ac.at (my school domain), it would only give you ac.at.
This works:
function extractDomain(url) {
  // Remove the protocol if it exists
  let domain = url.replace(/^https?:\/\//i, '');
  // Remove a leading www. if it exists
  domain = domain.replace(/^www\./i, '');
  // Get the hostname from the URL
  try {
    domain = new URL('http://' + domain).hostname;
  } catch (error) {
    // If URL parsing fails, return the stripped input unchanged
    return domain;
  }
  // Split the hostname into labels
  const parts = domain.split('.');
  if (parts.length > 2) {
    // Heuristic: a last label of 3 characters or fewer keeps three labels
    if (parts[parts.length - 1].length <= 3) {
      // Handles cases like co.uk and com.au, but also keeps one subdomain for .com
      domain = parts.slice(-3).join('.');
    } else {
      domain = parts.slice(-2).join('.');
    }
  }
  // Add the www. prefix back if the original URL contains it
  if (url.includes('www.')) {
    domain = 'www.' + domain;
  }
  return domain;
}
// Test cases
console.log(extractDomain("https://studio.youtube.com/channel/UCntj-iDUfMBvc8_peZWbQ4g/editing/sections")); // Output: studio.youtube.com
console.log(extractDomain("https://www.youtube.com/")); // Output: www.youtube.com
console.log(extractDomain("https://www.youtube.com/channel/UCntj-iDUfMBvc8_peZWbQ4g")); // Output: www.youtube.com
thanks
extractRootDomain('http://www.google.co.uk/blah')
"co.uk"