Why can og:image / og:url not be followed? - amazon-web-services

Got this error in Facebook Graph API Explorer, for scrape:
{
"error": {
"message": "Invalid parameter",
"type": "OAuthException",
"code": 100,
"error_subcode": 1611071,
"is_transient": false,
"error_user_title": "URL Follow Failed",
"error_user_msg": "There was an error in fetching the object at URL 'https://tikex-dev.com/kubl/fl50/j1vd/r36s', or one of the URLs specified via a redirect or the 'og:url' property including one of https://t44-post-cover.s3.eu-central-1.amazonaws.com/fr1n.",
"fbtrace_id": "AMZdGCazFLYGP6MfT-YZ-WF"
}
}
service used: ?scrape=true&id=https://tikex-dev.com/kubl/fl50/j1vd/r36s
The sharing page points to a GIF file with og:image and og:url. What is wrong? If I share the GIF file itself on Facebook, rather than the sharing page, the GIF is loaded, rendered, and animated.
Does AWS need to provide something more in the response headers?
What does OAuthException mean?

The URL given in og:url is supposed to be readable by the crawler and to provide OG tags. Your GIF is not readable in that sense: it is served with a binary MIME type (image/gif), not as HTML.
https://developers.facebook.com/docs/sharing/webmasters/getting-started/versioned-link/?locale=en_US
The path specified for og:url does not need to be a page that renders in the browser. However, it must respond to the Facebook crawler and return og:* meta tags.
And
When the path referred to by og:url returns an og:url link that is different, the new link is followed. The sharing details that Facebook uses are the ones at the final link in the redirect chain. The final link in the chain should also include the og:url meta tag. If og:url isn't specified, then the URL of the page is assumed to be the canonical URL.
The canonical tag is more important than its og:url equivalent, though, so use it too:
<link rel="canonical" href="https://tikex-dev.com/kubl/fl50/j1vd/r36s">
In your setup, you have a webpage with text/html that redirects to a gif with type image/gif.
The gif has no CORS headers at https://t44-post-cover.s3.eu-central-1.amazonaws.com/i61t
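The quoted rule (each og:url target must itself return og:* meta tags, and the chain is followed to its final link) can be sketched as a small simulation. This is not Facebook's crawler, only a simplified model: `resolve_canonical`, `OgUrlParser`, the `pages` dictionary, and the example.com URLs are all hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class OgUrlParser(HTMLParser):
    """Collects the content of the first <meta property="og:url"> tag."""

    def __init__(self):
        super().__init__()
        self.og_url = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("property") == "og:url" and self.og_url is None:
            self.og_url = attr.get("content")


def resolve_canonical(start_url, pages, max_hops=5):
    """Follow og:url links through `pages` (url -> (content_type, body)).

    Returns the final URL, or raises ValueError when a hop is not HTML.
    """
    url = start_url
    for _ in range(max_hops):
        if url not in pages:
            raise ValueError(f"cannot fetch {url}")
        content_type, body = pages[url]
        if not content_type.startswith("text/html"):
            raise ValueError(f"{url} is {content_type}, not HTML with og:* tags")
        parser = OgUrlParser()
        parser.feed(body)
        if parser.og_url is None or parser.og_url == url:
            return url  # no og:url, or it points at itself: treat as canonical
        url = parser.og_url
    raise ValueError("too many og:url hops")


# Hypothetical in-memory "site": the sharing page points og:url at a GIF.
pages = {
    "https://example.com/share": (
        "text/html",
        '<meta property="og:url" content="https://example.com/cover.gif">',
    ),
    "https://example.com/cover.gif": ("image/gif", ""),
    "https://example.com/ok": (
        "text/html",
        '<meta property="og:url" content="https://example.com/ok">',
    ),
}
```

In this model, resolving "https://example.com/ok" succeeds, while resolving "https://example.com/share" fails at the GIF hop, which mirrors the "URL Follow Failed" situation described in the question.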

Related

Facebook fetching incorrect OG tags for NextJS project deployed on Vercel

I have built my project using NextJS and deployed it on Vercel. My project's Vercel URL is https://my-project.vercel.app. My domain (added to project settings in Vercel dashboard) is www.example.com
When I use the Facebook Sharing Debugger to inspect the meta tags on a particular URL, the meta tags are picked up correctly when using the my-project.vercel.app domain but not when using my actual domain, www.example.com. The project loads correctly in the browser, including meta tags, for both domains.
For example, for a url /foo on my website, the og tags are picked up correctly for https://my-project.vercel.app/foo but not for https://www.example.com/foo.
Have a look at these screenshots. Note that the domain shown in the screenshots (esourcing.in) is added to the project esourcing-frontend.vercel.app, in the Vercel dashboard.
Here is the screenshot from my browser:
Answering my own question: there was a bug in my code.
The bug
I am using a Wrapper component to wrap all my pages. I was setting the og_url after componentDidMount. I am using a functional component so the useEffect hook executes after component mount. Until that happens I set the og_url to https://example.com.
However this component mounting never happens when the page is crawled by Facebook. So you can imagine that the og:url property was being set to https://example.com for all pages when the Facebook crawler scraped these pages.
import Head from "next/head";
import { useEffect, useState } from "react";
const Wrapper = (props) => {
  // Starts empty; useEffect only runs in the browser, never for the crawler.
  const [og_url, setog_url] = useState("");
  useEffect(() => {
    setog_url(window.location.href);
  }, []);
  return (
    <>
      <Head>
        ... other meta tags ...
        <meta property="og:url" content={og_url ? og_url : "https://example.com"} />
      </Head>
      {props.children}
    </>
  );
};
Fix
og:url essentially tells Facebook to ignore the OG tags on this page and use the OG tags from the page at the URL given in og:url. I set og:url to "" and now the Sharing Debugger picks up the right OG tags.

Accessing URL in 'get' method of view

I want a web page (with the url page3) to be displayed differently depending on whether a user on my website is redirected to it from the pages with urls page1 or page2.
How can I access the full URL (not just its query parameters) from which the user was redirected, in the get method of the view associated with the URL page3?
After reading the docs more thoroughly (thanks for the tip Brandon!), I found request.META['HTTP_REFERER'] did the trick.
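A minimal sketch of how that lookup might be used, assuming a Django-style request whose META attribute is a plain dict. The helper names `referring_page` and `choose_variant` and the variant labels are hypothetical. Browsers may omit the Referer header, so the sketch uses .get() instead of indexing, which would raise KeyError when the header is missing.

```python
from urllib.parse import urlparse


def referring_page(meta, default=""):
    """Return the full referring URL from a Django-style META mapping."""
    return meta.get("HTTP_REFERER", default)


def choose_variant(meta):
    """Pick a layout for page3 based on the path of the referring URL."""
    path = urlparse(referring_page(meta)).path.rstrip("/")
    if path.endswith("/page1"):
        return "from-page1"
    if path.endswith("/page2"):
        return "from-page2"
    return "default"
```

Inside the view's get method this would be called as choose_variant(request.META). Because the header is client-controlled, it is best treated as a hint for presentation rather than as something to rely on for access decisions.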

Custom 404 page in sitecore from content tree

I have a multilingual website and I want a custom 404 page in all languages, depending on the context of the site. What is the correct approach to define the 404 page in the content tree, or to access a 404 page from the content tree? I am able to get the 404 page if I define it in my site root, but not from the content tree.
I want the 404 page from the content tree to be used as my custom 404 redirect.
My web.config setting:
<setting name="ItemNotFoundUrl" value="/404Page.aspx" />
IIS 404 entry.
The error I got when the page does not exist in Sitecore:
After changing IIS to default and setting httpErrors like
<httpErrors>
<error statusCode="404" path="/404PAGE.aspx" responseMode="ExecuteURL" />
</httpErrors>
I use a custom implementation of the ExecuteRequest processor in the httpBeginRequest pipeline.
That is where the 404 requests are eventually handled.
You can override the RedirectOnItemNotFound method in there and add some logic to load a different 404 page per site.
Take a look at this blog post that explains how to implement it.
EDIT: I have added an example of how to implement it so you can return a site specific 404 page.
If you make this modification, you can return a different 404 page per site:
Add this to the configuration:
<setting name="NotFoundPage.SiteName1" value="/not-found.aspx" />
<setting name="NotFoundPage.SiteName2" value="/not-found.aspx" />
Then in the custom RedirectOnItemNotFound code, do this to return site specific 404 content:
using System;
public class ExecuteRequest : Sitecore.Pipelines.HttpRequest.ExecuteRequest
{
    protected override void RedirectOnItemNotFound(string url)
    {
        var context = System.Web.HttpContext.Current;
        try
        {
            // Get the domain of the current request.
            string domain = context.Request.Url.GetComponents(UriComponents.Scheme | UriComponents.Host, UriFormat.Unescaped);
            // Get 'not found page' setting for current site.
            string notFoundUrl = Sitecore.Configuration.Settings.GetSetting(string.Concat("NotFoundPage.", Sitecore.Context.Site.Name));
            // Request the contents of the 'not found' page using a web request.
            string content = Sitecore.Web.WebUtil.ExecuteWebPage(string.Concat(domain, notFoundUrl));
            // Send the content to the client with a 404 status code
            context.Response.TrySkipIisCustomErrors = true;
            context.Response.StatusCode = 404;
            context.Response.Write(content);
        }
        catch (Exception)
        {
            // If our plan fails for any reason, fall back to the base method
            base.RedirectOnItemNotFound(url);
        }
        // Must be outside the try/catch, because Response.End() throws an exception
        context.Response.End();
    }
}
The idea is that you use the site name as setting key so you can resolve the configuration value per site.
The code needs some work, of course, but you get the idea.
EDIT 2: Added example configuration to replace the original pipeline processor with the new one:
<pipelines>
<httpRequestBegin>
<processor type="Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel">
<patch:attribute name="type">ParTech.Pipelines.ExecuteRequest, ParTech</patch:attribute>
</processor>
</httpRequestBegin>
</pipelines>
Sitecore does some of its own "item not found" handling, and also doesn't do a good job of handling 404s in an SEO-friendly manner.
The best solution I've found for handling 404s and other errors in Sitecore is the Sitecore Error Manager.
http://marketplace.sitecore.net/en/Modules/Sitecore_Error_Manager.aspx
https://github.com/unic/SitecoreErrorManager/wiki
You can set ItemNotFoundUrl in Sitecore config to any item in the content tree, it does not need to be in the site root, and as long as you don't specify a language it will use whatever the user has been browsing the site in (or default for first visit). The item has to be in the same tree structure for all sites though:
<setting name="ItemNotFoundUrl" value="/errors/404.aspx" />
<setting name="LayoutNotFoundUrl" value="/errors/500.aspx"/>
<setting name="NoAccessUrl" value="/errors/403.aspx"/>
Techphoria414 is right that Sitecore out of the box does not do a good job of handling 404s in an SEO-friendly manner: it does a 302 redirect to the error page. However, if you set the following to true, a Server.Transfer will be used instead:
<!-- USE SERVER-SIDE REDIRECT FOR REQUEST ERRORS
If true, Sitecore will use Server.Transfer instead of Response.Redirect to redirect request to service pages when an error occurs (item not found, access denied etc).
Default value: false
-->
<setting name="RequestErrors.UseServerSideRedirect" value="true"/>
Which triggers the following code in the ExecuteRequest pipeline.
protected virtual void PerformRedirect(string url)
{
    if (Settings.RequestErrors.UseServerSideRedirect)
        HttpContext.Current.Server.Transfer(url);
    else
        WebUtil.Redirect(url, false);
}
You can read more about it in this blog post. Be sure to set Status property of the current System.Web.HttpResponse to 404 in your code that handles it. There is more info in the Handling HTTP 404 document.
Error Manager is a great module though and is much more flexible if you need to set different url locations for different sites.
I have written a blog post on this, might be useful for someone in future.
http://sitecoreblog.tools4geeks.com/Blog/35/Better-way-of-handling-sitecore-404-pages
The approach I have taken is:
Add a processor after HttpRequest.ItemResolver in the httpRequestBegin pipeline.
Check whether the context item is null; if it is, set the 404 content page as the context item.
Add another processor to the httpRequestEnd pipeline, after the EndDiagnostics processor, to set the 404 status code for the not-found item.
With this approach, we don't need to set the 404 status code in renderings, because it is set at the end of the httpRequestEnd pipeline. Requesting the /404 page directly returns a 200 status code, as that page is supposed to.
This works in multi-site and multi-language environments.

Regex for parsing Facebook Open Graph meta tag

I'm trying to pull the og:title attribute from a Bing Local page for a Windows Store app.
There is no HTML parser for WinRT and C++/CX, so I've resorted to using a regex to grab the tag, then an XML parser to pull out relevant attributes.
This is what the tag looks like.
<meta property="og:title" content="Some Location Name"/>
I'm using the following regex to pull out the tag from the HTML, but whenever the content attribute has a space in it, it fails to find a match.
<meta property="og:title" content="[\s\S]*"/>
So, my regex will work for McDonald's, but not for Jack In The Box.
What do I need to do to get the entire title?
This is one of my Open Graph regex queries. It matches most tags but has problems with some specific content; those cases are rare, and I'd rather have a more readable regex:
<meta [^>]*property=[\"']og:title[\"'] [^>]*content=[\"']([^'^\"]+?)[\"'][^>]*>
But I do come across some times where the content comes before property so I also run this
<meta [^>]*content=[\"']([^'^\"]+?)[\"'] [^>]*property=[\"']og:image[\"'][^>]*>
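One way to avoid maintaining a separate pattern for each attribute order is to match the whole meta tag first and then read its attributes into a dictionary. The following is only a sketch of that idea in Python, not the approach of either poster, and the names `META_TAG`, `ATTRIBUTE`, and `og_property` are made up for illustration. Like any regex-based approach it will miss unusual markup, such as unquoted attribute values or a ">" inside an attribute.

```python
import re
from html import unescape

# Any <meta ...> tag, up to its closing ">".
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
# name="value" or name='value'; the back-reference \2 requires the same
# closing quote, so a value may contain the other kind of quote.
ATTRIBUTE = re.compile(r"""([\w:-]+)\s*=\s*(["'])(.*?)\2""", re.DOTALL)


def og_property(html, name):
    """Return the content of the first <meta property=name> tag, or None."""
    for tag in META_TAG.findall(html):
        attrs = {key.lower(): value for key, _quote, value in ATTRIBUTE.findall(tag)}
        if attrs.get("property") == name and "content" in attrs:
            return unescape(attrs["content"])
    return None
```

Because attributes are collected per tag, it does not matter whether content comes before or after property, and spaces or apostrophes inside the content value are preserved.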
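You can just add a space to the regex: [ \s\S]* (note, though, that \s already matches a space, so this does not change what the pattern can match).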
DISCLAIMER: OpenGraph.io is a commercial product I work on and support.
Unfortunately, any regex you come up with is going to be hit or miss. If you end up needing to do this you can use the API available at http://www.opengraph.io/
One of its major benefits is that it will infer information like the title or description (if you end up needing it) from the content on the page if OpenGraph tags don't exist.
To get information about a site use:
GET https://opengraph.io/api/1.0/site/<URL encoded site URL>
Which will return something like:
{
"hybridGraph": {
"title": "Google",
"description": "Search the world's information...",
"image": "http://google.com/images/srpr/logo9w.png",
"url": "http://google.com",
"type": "site",
"site_name": "Google"
},
"openGraph": {..}
"htmlInferred": {..}
}

Facebook: Change the description/title that appears in a facebook post when I post a URL

I am using a PHP script to post a URL to a fan page that I am an admin of, but the contents of the post always appear as defined by the page's <title> tag and <meta name="description"> tag.
Can't the contents of a Facebook post be changed by using Facebook's Open Graph description (og:description) and title (og:title) tags in the page being posted?
The title can be changed using og:title as long as the page has fewer than 50 likes (otherwise it is locked in by Facebook). The og:description can be changed at any time.
See: https://developers.facebook.com/docs/opengraph/
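As a minimal sketch of the markup those two tags involve, the snippet below renders them from arbitrary text. The question uses PHP; this is Python purely for illustration, and `og_meta_tags` is a hypothetical helper, not part of any Facebook SDK. The values are HTML-escaped so that quotes or ampersands in a title cannot break the attribute.

```python
from html import escape


def og_meta_tags(title, description):
    """Render og:title and og:description meta tags with escaped values."""
    fields = {"og:title": title, "og:description": description}
    return "\n".join(
        f'<meta property="{prop}" content="{escape(value, quote=True)}" />'
        for prop, value in fields.items()
    )
```

The resulting lines belong in the head of the page being posted, alongside the existing title and description tags.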