Wikipedia-to-Facebook linked data - facebook-graph-api

I'm trying to include Facebook pages in a linked-data application. For example, the page:
facebook.com/pages/Friendship/105625816137032
connects, without ambiguity, to:
en.wikipedia.org/wiki/Friendship
Is there a "105625816137032" => "Friendship" dataset somewhere, or a method that isn't screen-scraping Facebook?
It seems crazy that Facebook would intentionally avoid this sort of link, if that's what's happening - or maybe I'm just missing it.
Thanks!

I'm not aware of any pre-compiled resource mapping Facebook IDs to DBpedia IDs - it would certainly be a very nice project. There are a few options:
1.) Asking Facebook nicely - you never know... Ultimately, it should benefit everybody.
2.) Scraping Facebook - the ethics are probably sketchy, and lots of interesting pages aren't linked to Wikipedia, e.g. http://www.facebook.com/PulpFiction
3.) Guessing, i.e. "Wikipedizing" the name - e.g. putting in underscores for spaces (see the sketch below).
The resulting mapping set could be tested by comparing the types of the matches (I think the mappings would have to be done manually, and probably aren't too much use).
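A minimal sketch of option 3, assuming the public Graph API endpoint graph.facebook.com/<page-id> returns a JSON object with a "name" field (recent API versions require an access token) and that the Wikipedia title is simply the page name with spaces swapped for underscores:

```python
import requests

def guess_wikipedia_url(page_id, access_token=None):
    """Guess a Wikipedia URL for a Facebook page by 'Wikipedizing' its name.

    Assumes the Graph API returns {"id": ..., "name": ...} for the page;
    newer API versions require an access token.
    """
    params = {"access_token": access_token} if access_token else {}
    page = requests.get(f"https://graph.facebook.com/{page_id}", params=params).json()
    title = page["name"].replace(" ", "_")
    candidate = f"https://en.wikipedia.org/wiki/{title}"

    # Cheap sanity check: does the Wikipedia article actually exist?
    exists = requests.head(candidate, allow_redirects=True).status_code == 200
    return candidate if exists else None

# Example: guess_wikipedia_url("105625816137032") -> ".../wiki/Friendship" (hopefully)
```

The resulting guesses would still need the manual type-comparison check described above before being trusted.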
Facebook don't seem to have put too much thought into the graph - the data is all pretty unstructured and difficult to work with as far as I can see.

Related

Ember Way to Add RSS Feed without third-party widget, front-end only

I am using Ember 3.0 at the moment. I wrote my first lines of code in ANY language about a year ago (I switched careers from something totally unrelated to development), but I quickly took to Ember. So, not a ton of experience, but not none. I am writing a multi-tenant site which will include about 20 different sites, all with one Ember frontend and a Ruby on Rails backend. I am about 75% done with the front end, and am now just loading content into it. I haven’t started on the backend yet: one, because I don’t have MUCH experience with backend stuff, and two, because I haven’t needed it yet. My sites will be informational to begin with and I’ll build them up from there.
So, I am trying to implement a news feed on my site. I need it to pull in multiple RSS feeds, perhaps dozens, filter them by keyword, and display them on my site. I’ve been scouring the web for days just trying to figure out where to get started. I was thinking of writing a service that parses the incoming XML. I tried using a third-party widget (which I DON’T really want to do - everything on my site so far has been built from scratch and I’d like to keep it that way), but with these third-party systems I get some random cross-domain errors and node-child errors which only SOMETIMES pop up. Anyway, I’d like to write this myself, if possible, since I’m trying to learn (and my brain is wired to do the code myself - it's the only way it sticks with me).
Ultimately, every Google result I read says RSS feeds are easy to implement. I don’t know where I’m going wrong, but I’m simply looking for:
1: An “Ember way” starting point.
2: Is this possible without a backend?
3: Do I have to use a third-party widget/aggregator?
4: Whatever else you think might help on the subject.
Any help would be appreciated. Here in New Hampshire, there are basically no resources, no meetings, nothing. Thanks for any help.
Based on the results I get back when searching on this topic, it looks like you’ll get a few snags if you try to do this in the browser:
CORS header issues (sounds like you’ve already hit this)
The joy of working with XML in JavaScript (that just might be sarcasm 😉, it’s actually unlikely to be fun)
If your goal is to do this as a learning exercise, then doing it in JavaScript/Ember will definitely help you learn lots of new things. You might start with this article as a jumping-off point: https://www.raymondcamden.com/2015/12/08/parsing-rss-feeds-in-javascript-options/
However, if you want to have this be maintainable for the long run and want things to go quickly and smoothly, I would highly recommend moving the RSS parsing system into your backend and feeding simple data out to Ember. There are enough gotchas and complexities to RSS feeds over time that using a battle-tested library is going to be your best way to stay sane. And loading that type of library up in Ember (while quite doable) will end up increasing your application size. You will avoid all those snags (and more I’m probably not thinking of) if you move your parsing back to the server ...
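As a rough illustration of the "parse on the server, hand Ember plain JSON" idea, here is a minimal Python sketch using the feedparser library (the question's backend will be Rails, so treat this purely as a sketch of the shape of the endpoint; the feed URLs and keyword are made up):

```python
import feedparser

FEEDS = [
    "https://example.com/news.rss",      # hypothetical feed URLs
    "https://example.org/updates.xml",
]

def fetch_items(keyword):
    """Pull every feed, keep entries whose title or summary mentions the keyword,
    and return plain dicts that a front end (Ember or otherwise) can consume as JSON."""
    items = []
    for url in FEEDS:
        feed = feedparser.parse(url)          # the library absorbs the XML quirks
        for entry in feed.entries:
            text = (entry.get("title", "") + " " + entry.get("summary", "")).lower()
            if keyword.lower() in text:
                items.append({
                    "title": entry.get("title"),
                    "link": entry.get("link"),
                    "published": entry.get("published"),
                })
    return items

# e.g. serve fetch_items("ember") from a /api/news endpoint and fetch that from Ember
```

With the parsing server-side, the Ember app only ever deals with a small JSON payload, which sidesteps both the CORS and the XML-in-JavaScript snags.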

Facebook Share showing incorrect info when the debugger is correct

First post on StackOverflow, so please forgive any protocol lapses!
I've found similar problems to the one below elsewhere on SO, but none that's an exact match, nor a solution that hits the spot.
I have a client site with FB Share and Like buttons, all of which work perfectly on straightforward named pages. In the case of the shop and blog pages I need to use a querystring, which works perfectly on other sites, but not this one! I've run the FB Debugger on the affected pages and all looks hunky dory.
Here are two example pages with the problem:
http://www.fabniki.com/productdetail?pid=251 and http://www.fabniki.com/blogdetail?id=327&p=1.
In the case of the shop item, the text Facebook is showing isn't even on the page. I've tried clearing cache, forcing an FB cache refresh etc.
My own site uses a similar querystring system for my blog, and this works absolutely fine with Facebook shares and likes.
I'd be very grateful for any suggestions!
OK, the problem was - unsurprisingly - entirely of my own making. I'd allowed myself to get bogged down in the debugger, which looks at the exact URL you paste into its input field. If you're sufficiently idiotic, it may not occur to you to check if this is the actual value being passed to FB by your Like and Share buttons. Doh!
Thanks to some excellent support from FB developers, the problem was spotted and corrected with only minor dents to my ego.
I wanted to answer my own question, a) to avoid wasting people's time and b) to give Facebook Support a bit of appreciation for a change!
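For anyone hitting the same thing, a rough sketch of the sanity check that would have caught this: fetch the page at the URL your Share/Like buttons actually emit and look at its Open Graph tags, then compare with what you pasted into the debugger. This uses Python with a crude regex (the fabniki URL is just the example page from the question; use a real HTML parser for anything serious):

```python
import re
import requests

def og_tags(url):
    """Fetch a page and pull out its Open Graph meta tags with a crude regex
    (assumes property="..." appears before content="..." in each tag)."""
    html = requests.get(url).text
    return dict(re.findall(
        r'<meta[^>]+property="og:([^"]+)"[^>]+content="([^"]*)"', html))

share_url = "http://www.fabniki.com/productdetail?pid=251"   # URL from the question
tags = og_tags(share_url)

# If og:url (or the URL the button passes to Facebook) differs from the URL you
# tested, the debugger and the live Share dialog are looking at different pages.
print(tags.get("url"), tags.get("title"), tags.get("description"))
```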

Commenting / Forum type software for ColdFusion

We operate a ColdFusion site with a custom CMS acting as a directory of various companies. Depending on the type of company, we have a set of subpages containing specific information about the company pulled from the CMS, such as "location/directions". We're looking to add functionality enabling users to add comments to the existing content. I'm looking for suggestions on open source or other available ColdFusion software out there that could work for this. While we could write something custom, commenting tools have been done a thousand times, and probably better than we can do it.
While what we're looking for sounds like a blog or forum, it's more of a hybrid. We'd like to be able to add functionality enabling commenting on the content we post, in the context we post it in. It seems like there must be something out there that can be easily modified and integrated with our CMS.
Does anyone know of anything out there we should look into?
I'm going to vote to close this too, as per the others, but here's an answer anyway.
If you just want to add commenting to existing content, perhaps use Disqus. It's not locally installable (and is not CFML-based; it's all JS), but it does handle most things one would need if just wanting to add comments to a site.
If you want a native, self-managed solution, unfortunately Stack Overflow has deemed that sort of question "unworthy", so you'll need to ask elsewhere. That's despite it being an entirely reasonable question, for which the answers would be helpful to other people later on (which is - in theory - the raison d'être of Stack Overflow, although that's hard to tell sometimes).

Geocoding things from the database?

I am kind of new to geocoding. What I want to do is pull a bunch of names of places from the DB and display them as markers on the page, and then allow people to choose different options, which would trigger another DB query and place a number of new markers on the page.
Is that possible? It seems like relatively simple functionality, but since I am not good at JSON, it is giving me a hard time.
Thanks,
Alex
There are lots of ways to geocode and really you need to give more information!
For example, in an offline environment, MapPoint is a pretty good solution (it costs about $200-300 per license). It can be made to work on a web server but isn't usually worth the effort.
For a web server, I would look at a web service. These are usually limited for free use, with payment required for heavier (or commercial) use. Your question is too broad to give specifics, but look at the web services provided by Bing Maps, Google Maps, Yahoo (yes, they're still around), and the OpenStreetMap-based offerings. Bing Maps and Google Maps look like they'll be around for a long time - but might cost money, depending on your application. OpenStreetMap promises the widest coverage (including countries outside North America and Europe), but probably doesn't match the others' coverage yet.
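As a concrete illustration of the web-service route, here is a minimal Python sketch against OpenStreetMap's public Nominatim endpoint (Bing and Google work similarly but need API keys; mind Nominatim's usage policy and rate limits):

```python
import requests

def geocode(place_name):
    """Look up a place name and return (lat, lng), or None if nothing matched.
    Nominatim asks for a descriptive User-Agent and limits request rates."""
    resp = requests.get(
        "https://nominatim.openstreetmap.org/search",
        params={"q": place_name, "format": "json", "limit": 1},
        headers={"User-Agent": "my-geocoding-demo"},
    )
    results = resp.json()
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])

# e.g. loop over the place names pulled from your DB and hand the coordinates
# to whatever map library you use to drop the markers on the page
```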
After figuring this out, I made a tool that takes an address and converts it to lat/lng. It also has a code tutorial: http://comehike.com/utils/address_to_geolocation.php
I hope it helps people. Would be fun to get feedback too :)

Retrieve a list of the most popular GET param variations for a given URL?

I'm working on building intelligence around link propagation, and because I need to deal with many short URL services where a reverse-lookup from an exact URL address is required, I need to be able to resolve multiple approximate versions of the same URL.
An example would be a URL like http://www.example.com?ref=affil&hl=en&ct=0
Of course, changing GET params in certain circumstances can refer to a completely different page, especially if the GET params in question refer to a profile or content ID.
But a quick parse of the pages would determine how similar they are to each other. Using a bit of machine learning, it could quickly become clear which GET params don't affect the content of the pages returned for a given site.
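A quick sketch of the mechanical part in Python: split a URL into its GET parameters and generate stripped-down variants to fetch and compare (which parameters are actually ignorable is exactly what the learning step would have to discover):

```python
from itertools import combinations
from urllib.parse import urlencode, urlparse, parse_qsl, urlunparse

def url_variants(url):
    """Yield the URL with every possible subset of its query parameters kept,
    so each variant can be fetched and compared against the original page."""
    parts = urlparse(url)
    params = parse_qsl(parts.query)
    for n in range(len(params) + 1):
        for keep in combinations(params, n):
            yield urlunparse(parts._replace(query=urlencode(list(keep))))

for variant in url_variants("http://www.example.com?ref=affil&hl=en&ct=0"):
    print(variant)   # e.g. http://www.example.com?hl=en ... 8 variants in total
```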
I'm assuming a service to send a URL and get a list of very similar URLs could only be offered by the likes of Google or Yahoo (or Twitter), but they don't seem to offer this feature, and I haven't found any other services that do.
If you know of any services that do cluster together groups of almost identical URLs in the aforementioned way, please let me know.
My bounty is a hug.
Every URL is akin to an "address" for a location of data on the internet. The "host" part of the URL (in your example, "www.example.com") is a web server, or a set of web servers, somewhere in the world. If we think of a URL as an "address", then the host could be a "country".
The country itself might keep track of every piece of mail that enters it. Some do, some don't. (I'm talking about web servers here - of course real countries don't make note of every piece of mail you get! :-))
But even if that "country" keeps track of every piece of mail, I really doubt it has any mechanism in place to send that list to you.
As for organizations that might do that harvesting themselves, I think the best bet would be Google, but even there the situation is rather grim. You see, because Google isn't the owner of every web server ("country") in the world, they cannot know of every URL that accesses that web server.
But they can do the reverse. Since they can index every page they encounter, they can get a pretty good idea of every URL that appears in public HTML pages on the web. Of course, this won't include URLs people send to each other in chats, SMSs, or e-mails. But still, they can get a pretty good idea of what URLs exist.
I guess what I'm trying to say is that what you're looking for doesn't exist, really. The only way you can get all the URLs used to access a single website, is to be owner of that website.
Sorry, mate.
It sounds like you need to create some sort of discrete similarity rank between pages. This could be done by finding the number of similar words between two pages and normalizing the value to a bounded range, then mapping certain portions of the range to different similarity ranks.
You would also need to know, for each pair that you compare, which GET parameters they had in common or how close they were. This information would become the attributes that define each of your instances (stored alongside the rank mentioned above). After you have amassed a few hundred pairs of comparisons, you could perhaps do some feature subset selection to identify the GET parameters that most determine how similar two pages are.
Of course, this could end up not finding anything useful at all as this dataset is likely to contain a great deal of noise.
If you are interested in this approach, you should look into information gain and feature subset selection in general. This is a link to my professor's lecture notes, which may come in handy: http://stuff.ttoy.net/cs591o/FSS.html
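A rough sketch of the comparison step described above, using plain word-overlap (Jaccard) as the similarity score and recording which parameters differed between the two URLs; bucketing the score into ranks and the feature-selection step are left out, and fetch() below is just a placeholder for however you retrieve page text:

```python
import re
from urllib.parse import urlparse, parse_qs

def jaccard(text_a, text_b):
    """Crude page similarity: word-set overlap, normalised to [0, 1]."""
    a, b = (set(re.findall(r"\w+", t.lower())) for t in (text_a, text_b))
    return len(a & b) / len(a | b) if a | b else 1.0

def differing_params(url_a, url_b):
    """Which GET parameters are present or valued differently between two URLs."""
    qa, qb = (parse_qs(urlparse(u).query) for u in (url_a, url_b))
    return {k for k in qa.keys() | qb.keys() if qa.get(k) != qb.get(k)}

# Each comparison becomes one training instance:
#   attributes = the differing params, label = a rank bucketed from the score
# e.g. (differing_params(u1, u2), jaccard(fetch(u1), fetch(u2)))
```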