Need help creating a regex pattern for getting an image

I made an RSS reader and I'm trying to get it to display a preview image too.
Here's what I'm using to get the image. The only thing that's not working is the pattern:
if item?.content != nil {
print("works until here")
let htmlContent = item!.content as NSString
var imageSource = ""
let rangeOfString = NSMakeRange(0, htmlContent.length)
let regex = try! NSRegularExpression(pattern: "(http[^\\s]+(jpg|jpeg|png|tiff)\\b)", options: .caseInsensitive)
if htmlContent.length > 0 {
let match = regex.firstMatch(in: htmlContent as String, options: [], range: rangeOfString)
if match != nil {
let imageURL = htmlContent.substring(with: (match!.rangeAt(2))) as NSString
print(imageURL)
if NSString(string: imageURL.lowercased).range(of: "feedburner").location == NSNotFound {
imageSource = imageURL as String
}
}
}
if imageSource != "" {
cell.itemImageView.setImageWith(NSURL(string: imageSource) as URL!, placeholderImage: UIImage(named: "thumbnail"))
}else {
cell.itemImageView.image = UIImage(named: "thumbnail")
}
}
I need help creating a good pattern for getting the image from the "st-gallery" class on the travelator.ro website.
Many thanks in advance. :)

Regular expressions can't parse HTML in general. They recognize the regular languages, whereas HTML's nested structure needs at least a context-free grammar, which sits higher in the Chomsky hierarchy. Regular expressions can't recognize context-free languages.
You would need a more capable parser. HTML parsing libraries already do this, so I suggest you look at using one of those.
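If a regex is kept anyway as a quick heuristic for a preview image (with the caveats above), it is worth checking which capture group the question's code reads. In that pattern, group 1 wraps the whole URL and group 2 only the file extension. A minimal sketch, assuming Swift 4 or later; the sample HTML string is made up for illustration:

```swift
import Foundation

// Hypothetical sample; the real feed content will differ.
let html = "<p><img src=\"http://example.com/photos/cover.jpg\" width=\"140\"> text</p>"
let pattern = "(http[^\\s]+(jpg|jpeg|png|tiff)\\b)" // pattern from the question

let regex = try! NSRegularExpression(pattern: pattern, options: .caseInsensitive)
let nsHTML = html as NSString
let fullRange = NSRange(location: 0, length: nsHTML.length)

var url = ""
var ext = ""
if let match = regex.firstMatch(in: html, options: [], range: fullRange) {
    url = nsHTML.substring(with: match.range(at: 1)) // group 1: the whole URL
    ext = nsHTML.substring(with: match.range(at: 2)) // group 2: the extension only
}
print(url, ext)
```

Reading group 2, as the question does, would hand only the extension to the image loader.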

Related

Swift2 to Swift3 error - Cannot assign value of type 'NSDictionary?' to type 'AddressModel'

I'm moving an app from Swift 2 to Swift 3 and I've hit an error that I haven't been able to fix after trying several different suggestions.
lazy var address: AddressModel? = {
[unowned self] in
var dict = self.getpayloadDict()
var model: AddressModel
model = dict
return model
}()
The line model = dict fails to compile with: Cannot assign value of type 'NSDictionary?' to type 'AddressModel'
The AddressModel . . .
class AddressModel: Deserializable {
var City: String?
var State: String?
var PostalCode: String?
required init(data: [String: AnyObject]) {
City = data["City"] as! String?
State = data["State"] as! String?
PostalCode = data["PostalCode"] as! String?
}
}
Any help appreciated.
The error should occur in Swift 2 as well. It's pretty clear: getpayloadDict() returns a dictionary, which is not an AddressModel.
You might create an AddressModel instance from the dictionary:
lazy var address: AddressModel? = { // this closure does not cause a retain cycle
// getpayloadDict() returns NSDictionary? (per the error message), so unwrap and cast it first
guard let dict = self.getpayloadDict() as? [String: AnyObject] else { return nil }
return AddressModel(data: dict)
}()
Side note:
as! String? (a forced cast to an optional) is horrible syntax. Use the regular conditional downcast as? String instead. And please follow the naming convention that variable names start with a lowercase letter.
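To illustrate the side note, here is a minimal sketch of the model with conditional downcasts and lowercase property names. It is a simplified stand-in for the question's class (the Deserializable conformance is omitted, and the payload values are invented):

```swift
import Foundation

// Simplified, hypothetical version of the question's AddressModel.
final class AddressModel {
    let city: String?
    let state: String?
    let postalCode: String?

    init(data: [String: Any]) {
        // `as? String` yields nil when the key is missing or the value
        // is not a String, instead of crashing like a forced cast.
        city = data["City"] as? String
        state = data["State"] as? String
        postalCode = data["PostalCode"] as? String
    }
}

// Invented payload: PostalCode is deliberately a number, not a String.
let payload: [String: Any] = ["City": "Austin", "State": "TX", "PostalCode": 78701]
let address = AddressModel(data: payload)
print(address.city ?? "nil", address.postalCode ?? "nil")
```

With the conditional downcast, a value of the wrong type simply leaves the property nil.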

Cannot call value of non function type '[String:AnyObject]'

I'm facing this issue in Swift 3.
I have the following piece of code:
do{
let json = try JSONSerialization.jsonObject(with: data!, options: .mutableContainers) as! [String : AnyObject]
if let datasFromJson = json["blog"] as? [[String:AnyObject]] {
for dataFromJson in datasFromJson{
if let title = dataFromJson("title")! as? String {
article.author = author
}
self.articles?.append(article)
}
}
I get this error when I try to cast title as a String.
Typo (brackets, not parentheses):
dataFromJson["title"] as? String // no exclamation mark after the closing bracket
Notes:
.mutableContainers is useless in Swift.
In Swift 3 a JSON dictionary is [String: Any].
Is title used at all? Or is it another typo title vs. author?
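Putting these notes together, a minimal self-contained sketch, assuming Swift 4 or later; the JSON payload and the titles array are invented for illustration:

```swift
import Foundation

// Invented sample payload with the same shape as in the question.
let jsonText = """
{"blog": [{"title": "First post", "author": "Ana"}, {"author": "Ben"}]}
"""
let data = Data(jsonText.utf8)

var titles: [String] = []
do {
    // No .mutableContainers, and the dictionaries are typed [String: Any].
    if let json = try JSONSerialization.jsonObject(with: data) as? [String: Any],
       let posts = json["blog"] as? [[String: Any]] {
        for post in posts {
            // Subscript with brackets, then a conditional downcast.
            if let title = post["title"] as? String {
                titles.append(title)
            }
        }
    }
} catch {
    print("invalid JSON: \(error.localizedDescription)")
}
print(titles)
```

Entries without a "title" key are skipped rather than crashing.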

Finding text between parentheses in Swift

How do you get an array of string values for the text between parentheses in Swift?
For example from: MyFileName(2015)(Type)(createdBy).zip
I would like: [2015,Type,createdBy]
Just updating the chosen answer to Swift 3:
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
let results = regex.matches(in: text,
options: [], range: NSMakeRange(0, nsString.length))
return results.map { nsString.substring(with: $0.range)}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}}
The usage remains the same.
Here is a complete example in Swift 4.2
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
let results = regex.matches(in: text,
options: [], range: NSMakeRange(0, nsString.length))
return results.map { nsString.substring(with: $0.range)}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}}
and usage :
let regex = "\\((.*?)\\)"
matchesForRegexInText(regex: regex, text: " example (android) of (qwe123) text (heart) between parentheses")
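Note that the helper above returns each whole match, parentheses included, for example "(android)". To get only the text inside, as the question asks, one option is to read capture group 1 instead. A minimal sketch, assuming Swift 4 or later; the function name textBetweenParentheses is made up here:

```swift
import Foundation

// Hypothetical helper: returns the text inside each pair of parentheses.
func textBetweenParentheses(in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: "\\((.*?)\\)") else {
        return []
    }
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    return regex.matches(in: text, range: fullRange).map {
        nsText.substring(with: $0.range(at: 1)) // group 1 excludes the parentheses
    }
}

let parts = textBetweenParentheses(in: "MyFileName(2015)(Type)(createdBy).zip")
print(parts)
```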
You can use a regex for this.
Thomas had a good example: \((.*?)\)
For how to use regexes in Swift, see: http://www.raywenderlich.com/86205/nsregularexpression-swift-tutorial
Here is my regex, which gets the words between parentheses, e.g. (smile):
NSRegularExpression(pattern: "\\(\\w+\\)", options: NSRegularExpressionOptions.CaseInsensitive)
it works for me!

Swift-Rss project-Can't get img src link inside CDATA blocks

I have a very annoying problem. I am developing an RSS reader in Swift (with Xcode 7.1). I want each cell of my table view to show the image for its news item. Here is my code:
cell.itemImageView.image = UIImage(named: "placeholder")
let news = items[indexPath.row] as MWFeedItem?
if news?.content != nil {
let htmlContent = news!.content as NSString
var imageSource = ""
let rangeOfString = NSMakeRange(0, htmlContent.length)
let regex = try? NSRegularExpression(pattern: "(<img.*?src=\")(.*?)(\".*?>)", options: [])
if htmlContent.length > 0 {
let match = regex?.firstMatchInString(htmlContent as String, options: [], range: rangeOfString)
if match != nil {
let imageURL = htmlContent.substringWithRange(match!.rangeAtIndex(2)) as NSString
print(imageURL)
if NSString(string: imageURL.lowercaseString).rangeOfString("feedburner").location == NSNotFound {
imageSource = imageURL as String
}
}
}
if imageSource != "" {
cell.itemImageView.setImageWithURL(NSURL(string: imageSource)!, placeholderImage: UIImage(named: "placeholder"))
}
else{
cell.itemImageView.image = UIImage(named: "placeholder")
}
}
The problem is this: when the RSS feed's XML file doesn't have CDATA blocks, my code works perfectly; in most other cases it doesn't work, because the XML file contains a structure like this:
<![CDATA[<p> <img src="http://www.repstatic.it/content/nazionale/img/2015/11/12/115530091-51ce67c2-7b38-41c1-8aa5-21d51b157335.jpg" width="140" align="left" hspace="10">I genitori contro la scelta del consiglio interclasse delle terze elementari dell'istituto Matteotti di fermare la gita all'esposizione "Divina Bellezza" sul...</p>]]></description><guid isPermaLink="true"><!
It seems that the CDATA block prevents me from reading the img src link. What can I do?
Thanks in advance for your help!
I ran the following code in a playground using your regex and successfully got all the img src URLs from the XML.
import Foundation
let url = NSURL(string: "http://www.repubblica.it/rss/homepage/rss2.0.xml")!
let xml = try String(contentsOfURL: url)
let regex = try NSRegularExpression(pattern: "(<img.*?src=\")(.*?)(\".*?>)", options: [])
// NSRange counts UTF-16 code units, not Characters, so measure and slice via NSString
let nsXML = xml as NSString
let range = NSMakeRange(0, nsXML.length)
regex.enumerateMatchesInString(xml, options: [], range: range) { (result, _, _) -> Void in
guard let result = result else { return }
print(nsXML.substringWithRange(result.rangeAtIndex(2)))
}
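For readers on newer toolchains, the same idea can be sketched in Swift 4 or later without a network request. The function name imageSources, the slightly tighter pattern, and the sample string are assumptions for illustration, not part of the original answer:

```swift
import Foundation

// Hypothetical helper: returns the src value of every <img> tag in `markup`.
func imageSources(in markup: String) -> [String] {
    let pattern = "<img[^>]*?src=\"([^\"]*)\""
    guard let regex = try? NSRegularExpression(pattern: pattern, options: [.caseInsensitive]) else {
        return []
    }
    // NSRange is measured in UTF-16 units, so slice through NSString.
    let nsMarkup = markup as NSString
    let fullRange = NSRange(location: 0, length: nsMarkup.length)
    return regex.matches(in: markup, options: [], range: fullRange).map {
        nsMarkup.substring(with: $0.range(at: 1))
    }
}

// Invented sample with a CDATA block and non-ASCII text before the tag.
let sample = "<description><![CDATA[<p>Caffè 😀 <img src=\"http://example.com/img/a.jpg\" width=\"140\" align=\"left\">Testo…</p>]]></description>"
let sources = imageSources(in: sample)
print(sources)
```

The CDATA wrapper is just text as far as the regex is concerned, so it does not prevent the match.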

Convert a JavaScript Regex to a Swift Regex

I'm learning Swift, and I'm trying to convert a small bit of JavaScript code to Swift. The JavaScript code uses a Regex to split up a string, as shown below:
var text = "blah.clah##something_else";
var parts = text.match(/(^.*?)\#\#(.+$)/);
After execution, the parts array contains the following:
["blah.clah##something_else", "blah.clah", "something_else"]
I would like to replicate the same behavior in Swift. Below is the Swift code I've written to split up a String into a String array using a regex:
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: NSRegularExpressionOptions.CaseInsensitive)
let nsString = text as NSString
let results = regex.matchesInString(text,
options: NSMatchingOptions.ReportCompletion , range: NSMakeRange(0, nsString.length))
as [NSTextCheckingResult]
return results.map({
nsString.substringWithRange($0.range)
})
} catch {
print("exception")
return [""]
}
}
When I call the above function with the following:
matchesForRegexInText("(^.*?)\\#\\#(.+$)", text: "blah.clah##something_else")
I get the following:
["blah.clah##something_else"]
I've tried a number of different regexes without success. Is the regex (^.*?)\#\#(.+$) correct, or is there a problem with the matchesForRegexInText() function? I appreciate any insight.
I'm using Swift 2, and Xcode Version 7.0 beta (7A120f)
As already mentioned in a comment, your pattern matches the entire
string, so regex.matchesInString() returns a single
NSTextCheckingResult whose range describes the entire string.
What you are looking for are the substrings matching the capture groups
in your pattern. These are available as rangeAtIndex(i) with i >= 1:
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
do {
let regex = try NSRegularExpression(pattern: regex, options: [])
let nsString = text as NSString
guard let result = regex.firstMatchInString(text, options: [], range: NSMakeRange(0, nsString.length)) else {
return [] // pattern does not match the string
}
return (1 ..< result.numberOfRanges).map {
nsString.substringWithRange(result.rangeAtIndex($0))
}
} catch let error as NSError {
print("invalid regex: \(error.localizedDescription)")
return []
}
}
Example:
let matches = matchesForRegexInText("(^.*?)##(.+$)", text: "blah.clah##something_else")
print(matches)
// [blah.clah, something_else]
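To mirror the JavaScript result exactly (the full match at index 0, followed by the capture groups), the range can start at 0 instead of 1. A minimal sketch in Swift 4.1 or later syntax; the function name captureGroups is made up for illustration:

```swift
import Foundation

// Hypothetical helper: full match first, then each capture group,
// similar to what String.prototype.match returns in JavaScript.
func captureGroups(pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern),
          let match = regex.firstMatch(in: text, range: NSRange(text.startIndex..., in: text))
    else {
        return [] // invalid pattern or no match
    }
    return (0 ..< match.numberOfRanges).compactMap { index in
        // Groups that did not participate in the match are skipped.
        Range(match.range(at: index), in: text).map { String(text[$0]) }
    }
}

let jsStyleParts = captureGroups(pattern: "(^.*?)##(.+$)", in: "blah.clah##something_else")
print(jsStyleParts)
```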