GetFullUrl doesn't return the complete url - sitecore

I am trying to get a fully qualified URL. Here is the code:
string path = string.Format("/sitecore/shell/Applications/Content%20Manager/default.aspx?id={0}&la={1}&fo={0}", contentItem.ID, contentItem.Language);
string fullPath = Sitecore.Web.WebUtil.GetFullUrl(path);
text = text.Replace("$itemUrl$", fullPath);
This returns something like this: http://cp.localsite/sitecore/shell/Applications/Content%20Manager/default.aspx?id={DC6B4AE0-929D-4F19-97F4-825796A30781}&la=en&fo={DC6B4AE0-929D-4F19-97F4-825796A30781}
It is rendered as a link only up to ?id=; from the ID onward it looks like normal text. How can I resolve this? I want a clickable URL for the content. I'd really appreciate any help.
Thanks.

Could you please give a bit more context about what you are trying to achieve? The described behavior is correct, but it is obviously not what you were hoping for.
UPDATED:
Looks like you are on the right track. The only thing I noticed is that you could get away with using just the fo query string parameter to get to the right item. You could also use the other parameters mentioned in "Custom email notification with link - sitecore" to be more specific.
The URL is probably cut off at id= because the text parser does not like the curly brackets. Try encoding the URL and see if that works.
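// Encode only the ID values; encoding the whole text would also escape ":", "/" and "?" and break the link.
string path = string.Format("/sitecore/shell/Applications/Content%20Manager/default.aspx?id={0}&la={1}&fo={0}", HttpUtility.UrlEncode(contentItem.ID.ToString()), contentItem.Language);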
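To illustrate what encoding does to the curly brackets, here is a minimal sketch in Python rather than C#. The helper name build_item_url is made up for this example; the base URL and the ID are the ones from the question. It only shows the percent-encoding step, not anything Sitecore-specific.

```python
from urllib.parse import quote


def build_item_url(base, item_id, language):
    """Hypothetical helper: build the Content Manager link with an encoded ID."""
    # safe="" makes quote() escape every reserved character,
    # so "{" becomes %7B and "}" becomes %7D.
    encoded_id = quote(item_id, safe="")
    return "{0}?id={1}&la={2}&fo={1}".format(base, encoded_id, language)


url = build_item_url(
    "http://cp.localsite/sitecore/shell/Applications/Content%20Manager/default.aspx",
    "{DC6B4AE0-929D-4F19-97F4-825796A30781}",
    "en",
)
print(url)
```

With the brackets escaped, a text parser that stops linkifying at "{" should have no reason to cut the URL short.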

Related

Regex specific question and search function on my website dealing with broken links

I've been trying to figure out my regex pattern, but it doesn't seem to be working for me.
Here's what I'm trying to do:
I have broken links on my website if someone accidentally gets to a page like so:
https://example.com/catalogsearch/result/?q=
or
https://example.com/catalogsearch/result/
So I'm redirecting them back to my homepage. The problem is that the search now sends everything back to the homepage. I assume that if there is something after the equals sign, the search needs to continue, for example:
https://example.com/catalogsearch/result/?q=person
but currently I can't figure this out.
Here is the regex that I've been messing with for quite some time now. It still seems to be wrong, or something else is wrong with my search.
"^/catalogsearch/result((/)|(/\\?)|(/\\?[a-z])|(/\\?[a-z]=))?$"
Please forgive me, I'm horrible with regex.
After a lot of discussion, the conclusion is that routes.yaml treats the URL path as a valid route but does not look at the query string. Hence, of the two examples in the post, you can handle the one without a query string with
"/catalogsearch/result": { to: "https://example.com/", prefix: false }
For the other one, please change the nginx config to redirect to the homepage or, if that is not possible, check with Magento support on how to incorporate the query string in the routes.yaml file.
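The path-versus-query-string distinction can be sketched in Python. This is not routes.yaml or nginx syntax, just the decision logic under the assumption that a request should go to the homepage only when the path is /catalogsearch/result and q is missing or empty. The function name should_redirect_home is made up for the example.

```python
from urllib.parse import urlsplit, parse_qs


def should_redirect_home(url):
    """Return True when the search URL carries no usable search term."""
    parts = urlsplit(url)
    # The path is matched separately from the query string.
    if parts.path.rstrip("/") != "/catalogsearch/result":
        return False
    # parse_qs drops parameters with blank values by default, so "?q=" gives {}.
    terms = parse_qs(parts.query).get("q", [])
    return not any(term.strip() for term in terms)


for candidate in (
    "https://example.com/catalogsearch/result/?q=",
    "https://example.com/catalogsearch/result/",
    "https://example.com/catalogsearch/result/?q=person",
):
    print(candidate, should_redirect_home(candidate))
```

Splitting the URL first avoids having to squeeze the "empty or missing q" condition into a single pattern.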

Use RegEx to redirect using data from files

Recently, we restructured a large site for one of our customers. This moved all the news articles on that site to a different place. The problem is that the Google cache still shows them at the old location, leading to a lot of 404 Not Found errors (it's about 1400 news entries).
Normally, a redirect using a fairly simple regex would be fine, but not only did the path to the news change, some parameters changed as well. Example:
Old Url:
http://www.customers-url.com/old/path/to/the/news/details/?tx_ttnews%5Btt_news%5D=67&cHash=a782f3027c4d00face6df122a76c38ed
How the new URL should look:
http://www.customers-url.com/new/path/to/news/?tx_news_pi1%5Bnews%5D=65
As you can see, the ID value of the news parameter changed from 67 to 65, and the part of the URL before the ? also changed. In addition, tx_ttnews changed to tx_news_pi1, tt_news changed to news, and the &cHash part was dropped completely.
I have the changed ids in a csv in the following format:
old_id, new_id
1,2
2,3
...etc...
The same goes for the changed URL part before the ?. Since I'm not exactly an expert in regex, my question is:
Can this be done inside .htaccess using regex (I'm not sure whether it can even use a file as input)? How complicated is that? And what would such a regular expression look like?
Rather than trying to use .htaccess, it would be easier to manage and easier to code if you simply make a new page that responds on the old url (/old/path/to/the/news/details), then make that page build the new url and return a 301 to the browser with that new url.
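A rough sketch of what that page would have to compute, written in Python for illustration. The parameter names and paths come from the example URLs in the question; the function names and the 67 -> 65 CSV row are assumptions for the example, and sending the actual 301 response is left to whatever the site runs on.

```python
import csv
import io
from urllib.parse import urlsplit, parse_qs, urlencode

OLD_PATH = "/old/path/to/the/news/details/"
NEW_PATH = "/new/path/to/news/"


def load_id_map(csv_file):
    """Read 'old_id, new_id' rows into a dict, skipping the header line."""
    reader = csv.reader(csv_file, skipinitialspace=True)
    next(reader)  # header: old_id, new_id
    return {row[0].strip(): row[1].strip() for row in reader if len(row) == 2}


def new_url_for(old_url, id_map):
    """Return the new URL for an old news URL, or None if it cannot be mapped."""
    parts = urlsplit(old_url)
    if parts.path != OLD_PATH:
        return None
    # parse_qs decodes %5B and %5D, so the key is tx_ttnews[tt_news].
    old_ids = parse_qs(parts.query).get("tx_ttnews[tt_news]")
    if not old_ids or old_ids[0] not in id_map:
        return None
    # urlencode re-escapes the brackets; cHash is simply not carried over.
    query = urlencode({"tx_news_pi1[news]": id_map[old_ids[0]]})
    return "{0}://{1}{2}?{3}".format(parts.scheme, parts.netloc, NEW_PATH, query)


id_map = load_id_map(io.StringIO("old_id, new_id\n67,65\n"))
print(new_url_for(
    "http://www.customers-url.com/old/path/to/the/news/details/"
    "?tx_ttnews%5Btt_news%5D=67&cHash=a782f3027c4d00face6df122a76c38ed",
    id_map,
))
```

URLs that cannot be mapped return None here, so the page can fall back to a normal 404 or to the news overview.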

content empty when using scrapy

Thanks to everyone in advance.
I encountered a problem when using Scrapy on Python 2.7.
The webpage I tried to crawl is a discussion board for the Chinese stock market.
I tried to get the first number, "42177", just under the banner of this page, but I always get empty content. (The number you see on that webpage may not match the number in the picture shown here, because it is the number of times the article has been read and is updated in real time.) I am aware that this might be a dynamic content issue, but I don't have a clue how to crawl it properly.
The code I used is:
item["read"] = info.xpath("div[@id='zwmbti']/div[@id='zwmbtilr']/span[@class='tc1']/text()").extract()
I think the XPath is set correctly. I have checked the return value of this response, and it indeed told me that there is nothing under this node. Results shown here: 'read': [u'<div id="zwmbtilr"></div>']
If it has something, there should be something between <div id="zwmbtilr"> and </div>.
I'd really appreciate it if you could share any thoughts on this!
I just opened your link in Firefox with NoScript enabled. There is nothing inside the <div id='zwmbtilr'></div>. If I enable JavaScript, I can see the content you want. So, as you already knew, it is a dynamic content issue.
Your first option is to try to identify the request generated by the JavaScript. If you can do that, you can send the same request from Scrapy. If you can't, the next option is usually to use a package with JavaScript/browser emulation, such as ScrapyJS or Scrapy + Selenium.
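To see why a correct XPath still comes back empty, here is a small sketch that uses the standard library's ElementTree instead of Scrapy. The two markup snippets are made up to mimic the page before and after JavaScript runs; only the element ids, the class name and the number 42177 are taken from the question.

```python
import xml.etree.ElementTree as ET

# Same element path as in the question, without the trailing /text().
XPATH = "div[@id='zwmbti']/div[@id='zwmbtilr']/span[@class='tc1']"

# Assumed shape of the page as the crawler receives it (no JavaScript executed).
STATIC_HTML = "<body><div id='zwmbti'><div id='zwmbtilr'></div></div></body>"

# Assumed shape of the page after the browser has run the scripts.
RENDERED_HTML = (
    "<body><div id='zwmbti'><div id='zwmbtilr'>"
    "<span class='tc1'>42177</span></div></div></body>"
)


def read_counts(markup):
    """Return the text of every span matched by XPATH."""
    root = ET.fromstring(markup)
    return [span.text for span in root.findall(XPATH)]


print(read_counts(STATIC_HTML))    # nothing to match in the static markup
print(read_counts(RENDERED_HTML))  # the span exists only after rendering
```

The selector is the same in both calls; only the markup differs, which is why fixing the XPath alone cannot help.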

How can I use a regex to validate slideshare slideshow URLs?

I am using www.slideshare.net to allow my users to display embedded slideshows on their profiles.
I'm using SlideShare's API to get the slideshow's ID, given the slideshow link that users have to get by clicking 'share' on the slideshow and copying the URL.
What I need is to thoroughly validate the latter URL.
Just to further explain my process, when I have the slideshow's id, I compute the embedded code like so :
"<iframe src='https://www.slideshare.net/slideshow/embed_code/" + json.slideshow_id + "' frameborder='0' allowfullscreen webkitallowfullscreen mozillaallowfullscreen></iframe>"
where json is the object returned by slideshare's api.
A basic regex to answer my question would be:
^http\://www\.slideshare\.net/[a-zA-Z0-9\-]+/[a-zA-Z0-9\-]+$
But it feels a little weak to me:
I don't want my users to just copy/paste the URL from the browser address bar.
I'm not sure this regex works for all SlideShare slideshows, as I'm not a SlideShare specialist (does that even exist?).
Ideally I would like to exclude all other regular URLs from www.slideshare.net that don't point to a slideshow.
EDIT 7/12/2014: rewrite
You can use something like this:
(http|ftp|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
More examples are available on this website.
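As an alternative to one big pattern, here is a sketch in Python that splits the URL first and then checks each part. It assumes, like the regex in the question, that a slideshow URL is exactly www.slideshare.net/<user>/<slideshow> with no query string; it additionally accepts https, which is my own assumption. The function name is made up.

```python
import re
from urllib.parse import urlsplit

# Same character class as the path segments in the question's regex.
SEGMENT = re.compile(r"[a-zA-Z0-9\-]+")


def looks_like_slideshow_url(url):
    """Structural check only; it cannot prove the page is a slideshow."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    if parts.netloc.lower() != "www.slideshare.net":
        return False
    if parts.query or parts.fragment:
        return False
    segments = [s for s in parts.path.split("/") if s]
    return len(segments) == 2 and all(SEGMENT.fullmatch(s) for s in segments)


print(looks_like_slideshow_url("http://www.slideshare.net/some-user/some-deck"))
print(looks_like_slideshow_url("http://www.slideshare.net/some-user"))
```

A check like this only filters out obviously wrong input. Whether the link really points to a slideshow is still decided by the API call that returns the slideshow ID.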

Doesn't XTK support query parameters (especially with periods) in file names?

I'm using the edge version of XTK by directly including http://get.goXTK.com/xtk_edge.js in my HTML.
The following code snippet shows how I'm referring to files on my server in XTK.
var skull = new X.mesh();
skull.file = 'http://myserver.com/stls/skull.stl?accessingUserId=dave@ibm.com&accessCode=8999';
As you can see, my file URIs have query parameters, which have periods in them. In such cases, XTK fails with the error message:
com&accessCode=8999 file format is not supported.
It looks like XTK does not consider that file URIs can have query parameters with periods.
If it is a bug, would you please consider fixing it before release 8?
If I'm doing something wrong, can you please point me in the right direction?
Thank you.
Haha, you're right, it doesn't work, but you could use the following:
var skull = new X.mesh();
skull.file = 'http://myserver.com/stls/skull.stl?accessingUserId=dave@ibm.com&accessCode=8999&skull.stl';
Basically, just append another '&.stl' to the query.
We just split the URL at the last dot to get the extension. Any proposal for a better solution is welcome.
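One possible approach, sketched in Python rather than in XTK's JavaScript: take the extension from the path part of the URI only, so dots in the query string are never looked at. Both function names are made up; naive_extension mimics the "split at the last dot" behaviour described above purely for comparison and is not XTK's actual code.

```python
import posixpath
from urllib.parse import urlsplit


def naive_extension(uri):
    """Everything after the last dot, query string included."""
    return uri.rsplit(".", 1)[-1]


def file_extension(uri):
    """Extension of the path component only, lower-cased, without the dot."""
    path = urlsplit(uri).path  # drops the query string and fragment
    return posixpath.splitext(path)[1].lstrip(".").lower()


uri = "http://myserver.com/stls/skull.stl?accessingUserId=dave@ibm.com&accessCode=8999"
print(naive_extension(uri))
print(file_extension(uri))
```

The same idea should carry over to JavaScript by cutting the string at the first "?" or "#" before looking for the last dot.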