I'm getting a string that looks like this from a database: ~\Uploads\Tree.jpg
And I would like to change it in ActionScript 3 to Uploads/Tree.jpg
Any idea how I can do this in a neat way?
Assuming path is the string from the database, you can use this:
var newPath:String = path.replace(new RegExp("^~\\\\", "g"), "").replace(new RegExp("\\\\", "g"), "/")
If the string always starts with "~\", you can optimize this by using String.substring() to drop the prefix instead. And if you are going to convert many strings at once, create the regex once and reuse that reference, so you do not build a new regex for each string.
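The advice about reusing the regex can be sketched as follows. This is an illustration in Python rather than ActionScript, and the names normalize_path, LEADING_TILDE and BACKSLASH are made up for the example:

```python
import re

# Compile both patterns once so they are reused for every path.
LEADING_TILDE = re.compile(r"^~\\")   # "~\" at the very start of the string
BACKSLASH = re.compile(r"\\")         # any single backslash

def normalize_path(path: str) -> str:
    """Drop a leading "~\\" and turn the remaining backslashes into slashes."""
    without_prefix = LEADING_TILDE.sub("", path)
    return BACKSLASH.sub("/", without_prefix)

# Converting many strings reuses the two compiled patterns above.
paths = [r"~\Uploads\Tree.jpg", r"~\Uploads\Photos\Leaf.png"]
converted = [normalize_path(p) for p in paths]
print(converted)
```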
I need to use Python to match a URL in my text file.
However, there is a special case:
i like 🤣pic.twitter.com/Sex8JaP5w5/a7htvq🤣
In this case I would like to keep the emoji next to the url and just match the url in the middle.
Ideally, I would like to have result like this:
i like 🤣<url>🤣
Since I am new to this, this is what I have so far.
pattern = re.compile("([:///a-zA-Z////\.])+(.com)+([:///a-zA-Z////\.])")
but the result is unsatisfactory, like this:
i like 🤣<url>Sex8JaP5w5/a7htvq🤣
Would you please help me with this? Thank you so much
A solution using existing packages:
from urlextract import URLExtract
import emoji
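def remove_emoji(text):
    # Note: get_emoji_regexp() was removed in emoji 2.0; on newer versions
    # use emoji.replace_emoji(text, replace='') instead.
    return emoji.get_emoji_regexp().sub(r'', text)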
extractor = URLExtract()
source = "i like 🤣pic.twitter.com/Sex8JaP5w5/a7htvq🤣 "
urlsWithEmojis = extractor.find_urls(source)
urls = list(map(remove_emoji, urlsWithEmojis))
print(urls)
Output:
['pic.twitter.com/Sex8JaP5w5/a7htvq']
Inspired by How do you extract a url from a string using python? and removing emojis from a string in Python
It looks like you are missing a * or + after the last matching group, so it only matches one character. So you want "([:///a-zA-Z////\.])+(.com)+([:///a-zA-Z////\.])*" or "([:///a-zA-Z////\.])+(.com)+([:///a-zA-Z////\.])+".
Now I don't know if this regex is simplified for your case, but it does not match all urls. For an example of that check out https://www.regextester.com/20
If you are attempting to match any url I would recommend rethinking your problem and trying to simplify down to more specific types of urls, like the example you provided.
EDIT: Also, why (.com)+? Is there really a case where multiple ".com"s appear, like .com.com.com?
I also think you have a small typo and it is supposed to be (\.com). Since the preceding ([:///a-zA-Z////\.])+ already accepts dots, it could be reduced to (com), but I think the explicit (\.com) makes the expression easier to read.
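Putting these suggestions together, here is a minimal sketch of a narrower pattern for the ".com" example in the question. It assumes the URL contains only ASCII letters, digits and a few punctuation characters, so the emojis on either side are never part of the match. The names URL_PATTERN and replace_urls are hypothetical, and this is not a general URL matcher:

```python
import re

# Host ending in ".com", followed by zero or more "/segment" parts.
# Emojis are not in either character class, so the match stops at them.
URL_PATTERN = re.compile(r"[A-Za-z0-9.-]+\.com(?:/[A-Za-z0-9._~%-]+)*")

def replace_urls(text: str, placeholder: str = "<url>") -> str:
    """Replace every match of URL_PATTERN with the placeholder."""
    return URL_PATTERN.sub(placeholder, text)

print(replace_urls("i like 🤣pic.twitter.com/Sex8JaP5w5/a7htvq🤣"))
```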
I have a string like "httpx://__URL__/__STUFF__?param=value"
This sample is a url by convention...it could be anything with zero or more __X__ tokens in it.
I want to use a regex to extract a list of all the tokens, so output here would be List("__URL__","__STUFF__"). Remember, I don't know beforehand how many (if any) tokens may be in the input string.
I've been struggling but unable to come up with a regex expression that will do the trick.
Something like this did not work:
(?:.?(__[a-zA-Z0-9]+__).?)+
Scala Regex, which is just a wrapper around Java Regex, will never return multiple subgroups for repetitions.
The only way about it is to have a regex for the token, and then find it multiple times. You pretty much already have everything you want:
"__[a-zA-Z0-9]+__".r findAllIn "httpx://__URL__/__STUFF__?param=value"
That returns an Iterator. Use .toSeq or similar to convert into a collection.
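The same idea (one regex for a single token, applied repeatedly) can be sketched in Python, where re.findall plays the role of findAllIn. The name extract_tokens is made up for this example:

```python
import re

# One pattern for a single token; findall returns every non-overlapping match.
TOKEN = re.compile(r"__[a-zA-Z0-9]+__")

def extract_tokens(text: str) -> list:
    """Return all __X__ tokens in order of appearance (possibly an empty list)."""
    return TOKEN.findall(text)

print(extract_tokens("httpx://__URL__/__STUFF__?param=value"))
```

An input without tokens simply yields an empty list, so the number of tokens does not need to be known in advance.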
Greg, have you tried a simple
_+[^_]+_+
This will match all the __TOKENS__
It doesn't do any check for any __TOKENLIKE__ string after the ?params, but you have mentioned you are not only using that for urls. If you need some refinement, please let us know.
Combine a regex with split:
def urlPathComponents(s: String): Option[Array[String]] =
"""(?<=http(s?)://)[^?]+""".r findFirstIn s map (_.split("/"))
In my application, I am trying to get the name of a file from a string retrieved from a 'content-header' tag from a server. The filename looks like \"uploads/2014/03/filename.zip\" (quotation marks included in the value).
I have tried using Path.GetFileName(string); to get just the file name, but it throws an exception stating that there are illegal characters in the path.
What should I use to get just filename.zip returned? Is a regex the best way to trim this string, or is there a better one?
The \"uploads/2014/03/ part will always be the same length, although the numbers may vary. The filename.zip can be any filename and extension; I'm just using that as an example. It sounds like a job for a regex to me, but I have no idea how to use regular expressions.
You can try something like this:
var inputString = @"\""uploads/2014/03/filename.zip\""";
var result = inputString.Trim('\\', '"').Split('/')[3];
This should work if the format is always like \"uploads/someNumber/someOtherNumber/filename\". To make it safer, you might want to use the Enumerable.Last method (from System.Linq) after Split:
var result = inputString.Trim('\\', '"').Split('/').Last();
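For comparison, the same trim-then-take-the-last-segment approach can be sketched in Python. The function name file_name_from_header is hypothetical, and the input is assumed to have the shape shown in the question:

```python
def file_name_from_header(raw: str) -> str:
    """Strip surrounding backslashes and quotes, then keep the last path segment."""
    trimmed = raw.strip('\\"')          # removes any leading/trailing \ and "
    return trimmed.rsplit("/", 1)[-1]   # text after the final "/"

# The literal below is the string \"uploads/2014/03/filename.zip\"
raw_value = '\\"uploads/2014/03/filename.zip\\"'
print(file_name_from_header(raw_value))
```

Taking the last segment rather than a fixed index keeps working if the number of directory levels changes.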
I made an article spinner that used regex to find words in this syntax:
{word1|word2}
And then split them up at the "|", but I need a way to make it support tier 2 brackets, such as:
{{word1|word2}|{word3|word4}}
What my code does when presented with such a line, is take "{{word1|word2}" and "{word3|word4}", and this is not as intended.
What I want is when presented with such a line, my code breaks it up as "{word1|word2}|{word3|word4}", so that I can use this with the original function and break it into the actual words.
I am using c#.
Here is pseudocode for how it might look:
Check string for regex match to "{{word1|word2}|{word3|word4}}" pattern
If found, store each one as "{word1|word2}|{word3|word4}" in MatchCollection (mc1)
Split the word at the "|" but not the one inside the brackets, and select a random one (aka, "{word1|word2}" or "{word3|word4}")
Store the new results aka "{word1|word2}" and "{word3|word4}" in a new MatchCollection (mc2)
Now search the string again, this time looking for "{word1|word2}" only and ignore the double "{{" "}}"
Store these in mc2.
I can not split these up normally
Here is the regex I use to search for "{word1|word2}":
Regex regexObj = new Regex(@"\{.*?\}", RegexOptions.Singleline);
MatchCollection m = regexObj.Matches(originalText); //How I store them
Hopefully someone can help, thanks!
Edit: I solved this using a recursive method. I was building an article spinner btw.
That is not parsable using a regular expression; you have to use a recursive descent parser instead. Map it to JSON by replacing:
{ with [
} with ]
| with ,
wordX with "wordX" (regex \w+)
Then your input
{{word1|word2}|{word3|word4}}
becomes valid JSON
[["word1","word2"],["word3","word4"]]
and will map directly to PHP arrays when you call json_decode.
In C#, the same should be possible with JavaScriptSerializer.
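The JSON mapping described above can be sketched in Python with the standard json module. This assumes every option is a single \w+ word (no spaces or punctuation inside the braces). The names parse_spintax and pick are made up for the example:

```python
import json
import random
import re

# { -> [, } -> ], | -> ,
BRACES_TO_JSON = str.maketrans("{}|", "[],")

def parse_spintax(text: str):
    """Quote each word, swap the delimiters, and let the JSON parser build nested lists."""
    quoted = re.sub(r"\w+", lambda m: '"' + m.group(0) + '"', text)
    return json.loads(quoted.translate(BRACES_TO_JSON))

def pick(node, rng=random):
    """Walk down the nested lists, choosing one random option per level."""
    while isinstance(node, list):
        node = rng.choice(node)
    return node

tree = parse_spintax("{{word1|word2}|{word3|word4}}")
print(tree)
print(pick(tree))
```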
I'm really not completely sure WHAT you're asking for, but I'll give it a go:
If you want to get {word1|word2}|{word3|word4} out of any occurrence of {{word1|word2}|{word3|word4}} but not {word1|word2} or {word3|word4}, then use this:
@"\{(\{[^}]*\}\|\{[^}]*\})\}"
...which will match {{word1|word2}|{word3|word4}}, but with {word1|word2}|{word3|word4} in the first matching group.
I'm not sure if this will be helpful or even if it's along the right track, but I'll try to check back every once in a while for more questions or clarifications.
s = "{Spinning|Re-writing|Rotating|Content spinning|Rewriting|SEO Content Machine} is {fun|enjoyable|entertaining|exciting|enjoyment}! try it {for yourself|on your own|yourself|by yourself|for you} and {see how|observe how|observe} it {works|functions|operates|performs|is effective}."
print spin(s)
If you want to use the [square|brackets|syntax] use this line in the process function:
'/\[(((?>[^\[\]]+)|(?R))*)\]/x',
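The spin function called above is not shown here. As a hypothetical stand-in, the following Python sketch resolves nested {a|b} groups without a recursive regex, by repeatedly replacing the innermost group until none is left. Python's re module has no (?R), so this is a different technique from the recursive pattern above:

```python
import random
import re

# A group with no braces inside it, i.e. an innermost {a|b|c}.
INNERMOST = re.compile(r"\{([^{}]*)\}")

def spin(text: str, rng=random) -> str:
    """Replace innermost groups with a random option until no groups remain."""
    while True:
        text, count = INNERMOST.subn(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if count == 0:
            return text

print(spin("{{word1|word2}|{word3|word4}}"))
print(spin("{Spinning|Rewriting} is {fun|enjoyable}!"))
```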
This one may seem basic but I don't know how to do it - anybody else?
I have a string that looks like this:
private var url:String = "http://subdomain";
What regex do I need so I can do this:
url.replace(regex,"");
and wind up with this?
trace(url); // subdomain
Or is there an even better way to do it?
Try this:
url.replace("http:\/\/","");
Like bedwyr said. :)
This will match only at the beginning of the string and will catch https as well:
url.replace("^https?:\/\/","");
ActionScript does indeed support a much richer regex repertoire than bedwyr concluded. You just need to pass an actual RegExp, not a string, as the pattern parameter. :-)
var url:String;
url = "https://foo.bar.bz/asd/asdasd?asdasd.fd";
url = url.replace(/^https?:\/\//, "");
To make this perhaps even clearer
var url:String;
var pattern:RegExp = /^https?:\/\//;
url = "https://foo.bar.bz/asd/asdasd?asdasd.fd";
url = url.replace(pattern, "");
RegExp is a first class ActionScript type.
Note that you can also use the $ char for end-of-line and use ( ) to capture substrings for later reuse. Plenty of power there!
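To illustrate the anchors and capture groups mentioned above, here is a small sketch in Python rather than ActionScript; the regex syntax for ^, $ and ( ) is the same for these simple cases. The function names are made up for the example:

```python
import re

SCHEME = re.compile(r"^https?://")                   # ^ anchors at the start
TRAILING_SLASH = re.compile(r"/$")                   # $ anchors at the end
SCHEME_AND_HOST = re.compile(r"^(https?)://([^/]+)") # ( ) capture substrings

def strip_scheme(url: str) -> str:
    """Remove a leading http:// or https://."""
    return SCHEME.sub("", url)

def strip_trailing_slash(url: str) -> str:
    """Remove one slash at the very end, if present."""
    return TRAILING_SLASH.sub("", url)

def scheme_and_host(url: str):
    """Return (scheme, host) from the two capture groups, or None if no match."""
    m = SCHEME_AND_HOST.match(url)
    return (m.group(1), m.group(2)) if m else None

print(strip_scheme("http://subdomain"))
print(scheme_and_host("https://foo.bar.bz/asd/asdasd?asdasd.fd"))
```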