I'm looking for a way to extract the domain name from a URL in a column of my Power BI report.
I tried this formula:
DOMAIN = LEFT([URL],FIND("/",[URL],9)-1)
But it returns this error:
The search Text provided to function 'FIND' could not be found in the given text.
Thanks for your help.
In DAX you can use SEARCH, which has the same syntax as the Excel FIND function:
SEARCH(<find_text>, <within_text>[, [<start_num>][, <NotFoundValue>]])
FIND(find_text, within_text, [start_num])
So the formula becomes:
DOMAIN = LEFT([URL],SEARCH("/",[URL],9)-1)
Update: there is a FIND function in DAX as well. I didn't realise that because I have always used SEARCH. SEARCH supports wildcards; FIND doesn't.
Your formula depends on the URL containing a "/" after the http section. I think it's failing because there is no "/" after the first few characters (the search starts at position 9). So you may have to improvise, based on the URLs you see. For example, if the domain ends with .com, you could use:
DOMAIN = LEFT([URL],FIND(".com",[URL],9)+3)
The type of URL you have becomes important here. Hope this helps.
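The failure case described above (no "/" after the scheme) can be handled by falling back to the whole URL. Below is a minimal Python sketch of that logic, not DAX; the function name `extract_domain` and the sample URL in the usage note are made up for illustration. In DAX, the optional NotFoundValue argument shown in the SEARCH signature could serve as the fallback.

```python
def extract_domain(url: str) -> str:
    """Return the scheme plus host, cutting the URL at the first path slash."""
    # Mirror FIND("/", [URL], 9): search from the 9th character onward
    # (index 8 in zero-based Python), which skips "http://" or "https://".
    slash = url.find("/", 8)
    # str.find returns -1 when nothing is found; keep the whole URL then
    # instead of raising an error.
    return url if slash == -1 else url[:slash]
```

For example, `extract_domain("https://example.com")` returns the input unchanged because there is no slash after the scheme.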
Related
I am new to Google Data Studio and I am trying to cleanse some Google Analytics data.
For example, I have a field called Page which shows the page name. Some pages appear twice, e.g. contact/product/car and contact/product/car/ (ending with / in the second case).
I want to create a field that always removes the last character of the page name if it is '/'.
I have tried this function: REPLACE(ENDS_WITH(Page,"/"), '/','')
But it does not work; it returns true or false instead.
Can someone help me with this?
Use a regular expression for this:
REGEXP_REPLACE(Page,"/$","")
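To show what the "/$" pattern does, here is a minimal Python sketch of the same substitution outside Data Studio. The function name `strip_trailing_slash` is hypothetical; the sample values in the usage note come from the question.

```python
import re


def strip_trailing_slash(page: str) -> str:
    # "/$" matches a slash only at the very end of the string, so slashes
    # inside the path are left alone.
    return re.sub(r"/$", "", page)
```

Both `contact/product/car` and `contact/product/car/` then map to the same value, which removes the duplicates.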
I have a list with URLs and IPs for Office365 in XML format. Now I'd like to either write a script or use a text editor's search and replace function (regex) to automatically change some of these URLs.
Example:
These URLs
<address>scus-odc.officeapps.live.com</address>
<address>scus-roaming.officeapps.live.com</address>
<address>sea-odc.officeapps.live.com</address>
Should be changed to
<address>*.officeapps.live.com</address>
<address>*.officeapps.live.com</address>
<address>*.officeapps.live.com</address>
I would appreciate any input on this issue. Thanks in advance.
Here is what I have tried so far:
1) Search for ..(?=[^.].[^.]*$) and replace with an empty string. This does a good job, but unfortunately it removes the preceding as well...
2) As pointed out by Tim, the list consists of FQDNs with different domains. The list is available from https://go.microsoft.com/fwlink/?LinkId=533185 (this list includes all FQDNs; the IPs will get deleted).
3) Solved with the help of Sergio's input. The solution was to
search for (>)[^.\n\s]+ and substitute with \1\*
I will have to write another script to delete the duplicate domains, but that was not part of the question, so I consider this issue closed. Thank you for your input.
You can use the regex:
(>)[^.\n\s]+
and substitute with \1\*
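As an illustration of how this pattern behaves, here is a minimal Python sketch applying it to the sample lines from the question. The function name `wildcard_first_label` is made up, and the sketch assumes one `<address>` element per line, as in the example.

```python
import re


def wildcard_first_label(xml_text: str) -> str:
    # After a ">", consume everything up to the first dot (or whitespace)
    # and replace it with "*". The ">" itself is kept via group 1.
    return re.sub(r"(>)[^.\n\s]+", r"\1*", xml_text)
```

The closing `>` of `</address>` is followed by a line break or the end of the text, so it does not match and the closing tag stays intact.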
Hi, I am pretty new to regex. I can do some basic things, but I am having trouble with this. I need to change the link in an RSS feed.
I have a URL like this:
http://mysite.test/Search/PropDetail.aspx?id=38464&id=38464&listingid=129-2-6430678&searchID=250554873&ResultsType=SearchResult
and want to change it to the updated site:
http://mysite.test/PropertyDetail/?id=38464&id=38464&listingid=129-2-6430678&searchID=250554873&ResultsType=SearchResult
The only thing that changes is from /Search/PropDetail.aspx
to /PropertyDetail/
I don't have access to the original RSS feed, or I would change it there, so I have to use pipes. Please help, thanks!
Use the regex control.
In its "In" field, specify the DOM address of the node containing your link (prefixed by "item."). In the "replace" field, type:
(.*)\/Search\/PropDetail\.aspx
and in the "with" field use:
$1/PropertyDetail/
I've escaped the '/' and '.' characters in the "replace" field with backslashes. I'm not sure the '/' needs escaping at all, and the "with" field probably does not need any escaping. Some trial and error may be needed.
Hopefully this will achieve the result you want.
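To check the substitution outside the feed tool, here is a minimal Python sketch of the same replacement. It is not the regex control itself; the names `OLD_PATH`, `NEW_PATH` and `update_link` are made up for this example.

```python
import re

# Only the path segment is matched; "." is escaped so it matches a literal dot.
OLD_PATH = r"/Search/PropDetail\.aspx"
NEW_PATH = "/PropertyDetail/"


def update_link(url: str) -> str:
    # The query string is not part of the match, so it is left untouched.
    return re.sub(OLD_PATH, NEW_PATH, url, count=1)
```

Because nothing after `.aspx` is matched, the `?id=...` parameters carry over unchanged.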
I have a list of domain names with paths and parameters:
www.frontdir.com/index.php?adds1205
centurydirectory.com/submit/
www.directoryhigher.com/index.php?filec-linkapproval&x_response_code1
I need to find everything other than the domain and remove it.
My final result should look as follows.
Expected result:
www.frontdir.com
centurydirectory.com
www.directoryhigher.com
I tried the following regex
/([^/\?]+)\?
but it does not select what comes after the "?".
How can I get this result?
How about replacing
\/.*$
with an empty string?
I'm assuming here that you have one URL per line (your example suggests as much) and that you want to keep just the domains (again, as per your example).
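Under the same assumptions (one entry per line, no "http://" prefix), the replacement can be tried in Python as a minimal sketch. The function name `keep_domains` is hypothetical.

```python
import re


def keep_domains(text: str) -> str:
    # With re.MULTILINE, "$" matches at the end of every line, so each line
    # is cut from its first "/" to the line end.
    return re.sub(r"/.*$", "", text, flags=re.MULTILINE)
```

If the entries did start with a scheme such as `http://`, this pattern would cut at the scheme's slashes, so it would need adjusting.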
As the title says, is there a quick way to do that? I don't need a robust solution, just anything that can differentiate. For example:
http://asdasd/
is not a valid domain name, whereas
http://asd.asdasd.asd
is a valid domain name.
I searched for a solution, and the closest (simple) one I found is in Python.
But that's for Python, and I need to do it in C++. Any help?
Can it be done using string manipulation only, such as substrings?
I believe this can be done with libcurl.
Barring the fact that http://... is not a domain name but a URL, and that asdasd is a valid domain name if set up as a search domain (such as on a local network), purely checking the string syntax can be done with a simple set of strncmp, strchr and strstr calls:
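#include <cstring>  // strncmp, strchr
const char *str = "http://abd.xxx";
// strncmp returns 0 on a match, so the result must be compared against 0
bool valid = strncmp(str, "http://", 7) == 0 && str[7] != '\0' && strchr(str + 7, '.') != nullptr;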
This checks that the string starts with http://, that there is more after the http://, and that the remainder contains a dot. If you also want to handle URLs that contain an actual path, like http://example.com/mypath.txt, the check becomes more complex, but you didn't specify whether that was needed.
Alternatively, you can use a regex with the pattern from the Python answer you point to yourself.