What is the point of using robots.txt in GitHub Pages? - github-pages

I know that a robots.txt file is used to keep third-party web crawlers from indexing a site's content.
However, if the goal of this file is to delimit or protect a private area of a site, what is the point of trying to hide content with robots.txt when everything can be seen in the GitHub repository anyway?
My question also extends to sites that use a custom domain.
Is there any reason to use a robots.txt file in GitHub Pages? Yes or no, and why?
Alternative 1
For the content to stay effectively hidden, you would need to pay for a private repository for the website.

The intention of robots.txt is not to delimit private areas, because robots don't even have access to them. Instead, it is for cases where you have some garbage or miscellaneous files that you don't want indexed by search engines.
For example, I write Flash games for entertainment, and I use GitHub Pages to let the games check for updates. I have this file hosted on my GitHub Pages site, and its entire content is
10579
2.2.3
https://github.com/iBug/SpaceRider/tree/master/SpaceRider%202
It contains three pieces of information: the internal number of the new version, the display name of the new version, and the download link. It is useless when indexed by crawlers, so it is exactly the kind of file I would use robots.txt to keep out of search indexes.
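As a rough illustration of the idea, here is a small Python sketch that checks how a well-behaved crawler would interpret such a rule. The file name /update.txt and the example.github.io host are assumptions made up for this example (the answer does not name the file); only the standard library's urllib.robotparser is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the path /update.txt is an assumed name for
# the update-check file; it is not given in the answer.
ROBOTS_TXT = """\
User-agent: *
Disallow: /update.txt
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a crawler honoring robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.github.io is a placeholder host.
    print(is_allowed(ROBOTS_TXT, "https://example.github.io/update.txt"))
    print(is_allowed(ROBOTS_TXT, "https://example.github.io/index.html"))
```

Note that this only asks crawlers not to index the file. Anyone can still open it directly or read it in the repository.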

Related

Embedding a functional website inside a Squarespace webpage

First of all, thank you for everything that you do. Without this community, I would hate web design and be reliant on my teacher's outdated, static methods. Much love <3
So, this is a tricky one (maybe).
I want to have, essentially, an iframe on a webpage that contains a website I coded previously. It was a project for school that never went live, but I'd like to include it as part of my portfolio. Problem is, an iframe needs a URL for a source, but I just have the folder with more folders full of code, fonts, and images. How can I tell the browser to populate this box with everything from "name" folder? And then how will it know to run the code instead of just showing a file tree or something?
In the end, I want a page describing a previous web project and let the client experience that project within the one page. And I don't want to get a domain for every project I do.
Maybe there's an easier way I'm not thinking of?
To make it interesting, my new portfolio site is being made in Squarespace...maybe. I bought a domain from them because I had a promo code and wanted to try the platform, but I kind of hate it. I can't change any of the code and it won't maintain a connection to Typekit. So all I can do is change the basic appearance of preexisting elements. It's like WordPress all over again....LAME! Sadly, I already bought the domain.
Can Squarespace just be a host? Is there a way to download the raw code of these templates, edit it, and upload it again?
Thanks for all your help!
"I want to have, essentially, an iframe on a webpage that contains a website I coded previously."
Squarespace's file upload mechanism is very limited. Without using the Developers Platform, there is no effective way to upload many files at once. Furthermore, there is no way to create folders. Therefore, even if you were willing to upload each .html file and each asset one-by-one, there'd be no way to organize the files into folders (assuming that the "tree" you mentioned includes additional sub-folders).
Initially, in order to get the files to be accessible by Squarespace, you'd have to do one of the following:
1. Use Squarespace Developers Platform (A.K.A. "Developer Mode") and upload your to-be-iframed (TBI) website files to the "assets" folder using SFTP or Git.
2. Host your TBI website files somewhere else (a different host environment, for example) which will maintain your file/folder structure.
"How can I tell the browser to populate this box with everything from "name" folder? And then how will it know to run the code instead of just showing a file tree or something?"
Assuming that the TBI website has an index.html file or home.html file or similar, and assuming you were to use the Squarespace Developer Platform, you'd insert the iframe either in a Code Block or within a template/.region file directly using something like
<iframe src="/assets/tbiwebsitefolder/index.html"></iframe>
while setting your other iframe attributes (such as height and width) as needed.
"Is there a way to download the raw code of these templates, edit it, and upload it again?"
Yes. You select a template and then enable Developer Mode on that template. From there, you use SFTP or Git to download the template files, edit, and reupload.
You may benefit by reviewing some considerations of enabling Developer Mode on a Squarespace Template.
One other idea, to avoid the iframe and Developer Mode entirely, would be to capture images of the TBI website rendered in a browser, and then simply add those images to a gallery block or gallery page. This could allow you to convey the general idea of the project but would of course not capture the full "experience" of it.

preventing XSS/JS attacks on hosted CMS

I am working on a hosted CMS, and am thinking about allowing site editors to add custom JavaScript and HTML (a much requested feature).
I am concerned that this will open up an attack vector: nasty JS could make calls to the functions that our hosted CMS exposes (see the Samy worm for an example of what user scripts did to MySpace). But I really want to give users control over their site (what's the point of a CMS you can't add your own clever stuff to?)
What is a good approach to fixing this issue? I can think of several which I would like commentary on, but am not going to list them for fear of the 'no list questions mods'!
I suspect that Caja is on your list, so I'll mention that this is squarely in Caja's use cases; for example, Google Sites is very like a CMS and uses Caja to embed arbitrary JS and HTML.
Caja host pages can provide arbitrary additional interfaces for use by the sandboxed content, which can include, for example, embedding widgets provided by your CMS inside the user-supplied HTML while maintaining encapsulation.
(Disclosure: I work for Google on the Caja team.)

How to set up groups (sub-sites) in Django

I'm new to Django and I come from the Drupal world. There we have Organic Groups, with which we can create groups of content and subsites; how do I do something like that with Django?
Say I'm making this site for my company using Django, and every department in my company needs a private section on the site. For example, the design people have their own part of the website that the back-end developers cannot enter. And the back-end developers will have the same thing too.
I want to build the site in such a way that I just log in to the Django admin and add a new category or subsite or group (whatever the Django term is) with the same settings as other groups, or with similar settings.
It depends on what you mean by "private section". You should probably try looking at it from a different angle:
Django splits a site's functionality by means of "apps". Each app does its specific thing, and gets a set of tables in the database. Apps can access each others' tables. For example, it's common for other apps to access the Auth app's user, group, and permissions tables. Is this what you mean by "sub sites"?
As for access control, users can be assigned to groups and they can have various administrative permissions assigned to them. Add, change, and delete permissions are automatically generated for each model (i.e. database table). You can also add your own permissions.
I don't think you'll be able to separate the designers from the back-end developers at the Django level. You'll need to do something else, such as maintain separate source repositories for each and merge them to create the usable site (each group would have read-only access to the other). It really depends on your teams' discipline, because these elements can get intertwined.
Django recommends that static files be served by something else, say directly from your web server, or from another machine with a simple HTTP server (no CGI/WSGI/whatever). This is because Django can only slow down static files compared to direct service. However, for testing, there is a static page server you can enable.
Given all that, static files usually amount to CSS, images, media, and JavaScript. Of these, the back-end people might want to mess with the JS, but that's it, so this could be in the designers' repo.
The Django tree itself has the code for the site and the apps. It's almost all back end stuff. The exception is the HTML template files, located in the "templates" directory in each app. These are the files that are filled in with the context data supplied by the back-end view code. I have no idea if this is front or back end for you guys; it could be mostly back end if there's a lot of CSS discipline, but I think that's unlikely.
There are a lot of things that you can do in Django that make life easier for one side or the other. For example, template tags allow custom Python code to generate HTML to insert into the page. I use these to generate tab bars and panes, for example.
I really can't help much more without getting a better picture of what your needs are. The question is still vague. You're probably best off taking a day or two going through the tutorial, seeing what the Django perspective is, and then working out how (or if!) it fits into your needs.

Sitecore Multisite using querystring instead of domain/subdomain?

Is there a way to set up multiple sites to run using querystrings rather than domains/subdomains?
I am developing a site that has a Global site and multiple country specific sites (exact list of countries to be confirmed later). For development I have a Global and a Local site created and running on a temporary subdomain. If this works correctly we may run the entire application this way rather than on separate domains (similar to how apple.com appears to work)
I have successfully got the sites running locally as:
global.domain.com
a.domain.com
b.domain.com
but would like them to be able to run as:
www.domain.com/global
www.domain.com/a
www.domain.com/b
We will be implementing multiple languages on certain country sites as well, so locale will need to remain independent.
Could this be done using some sort of URL mapping rather than multiple sites or something? Where can I find information about URL mapping?
There are settings for using virtual folders (see web.config under the sites node):
virtualFolder: The prefix to match for incoming URLs.
This value will be removed from the URL and the remainder will be treated as the item path.
How that works in practice I'm not sure - it's on a domain by domain basis, and all your sites will be operating from the same domain.
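To make the prefix behaviour described for virtualFolder a bit more concrete, here is a minimal Python sketch of the general idea: match the longest prefix, strip it, and treat the remainder as the item path. This is only an illustration of the technique, not Sitecore's actual resolution code, and the SITES table and the resolve function are hypothetical names.

```python
from typing import Optional, Tuple

# Hypothetical mapping of virtual folder prefix -> site name,
# mirroring the /global, /a, /b layout from the question.
SITES = {
    "/global": "global",
    "/a": "site_a",
    "/b": "site_b",
}


def resolve(path: str) -> Optional[Tuple[str, str]]:
    """Return (site name, item path) for the longest matching prefix.

    Returns None when no virtual folder matches the path.
    """
    for prefix in sorted(SITES, key=len, reverse=True):
        # Match whole path segments only, so "/about" does not match "/a".
        if path == prefix or path.startswith(prefix + "/"):
            remainder = path[len(prefix):] or "/"
            return SITES[prefix], remainder
    return None


if __name__ == "__main__":
    print(resolve("/a/products/list"))
    print(resolve("/global"))
    print(resolve("/about"))
```

Matching on whole path segments is a design choice of this sketch; whether Sitecore behaves the same way for overlapping prefixes would need to be checked against its documentation.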
But I think you might want to reconsider your approach. Sub domains have several advantages. They're simple to configure in the web.config (just add a domain and point it at the right bit of the content tree).
They simplify search engine optimisation, e.g. telling Google to target a specific subdomain to a geographical area in Google Webmaster Tools.
They're simple for visitors to understand.
Bear in mind that if you're going to use multiple languages per site then you will probably want to keep the language parameter in the URL as part of the (virtual) filepath (e.g. www.mysite.com/en-GB/products)
If you use both language and locale in the URL in that way you end up with something like www.mysite.com/UK/en-GB/products

How to configure server for small hosting company for django-powered flash sites?

I'm looking at setting up a small company that hosts Flash-based websites for artist portfolios. The customer control panel would be Django-powered, and would provide the interface for uploading their images, managing galleries, selling prints, etc.
Since the majority of traffic to the hosted sites would end up at their top-level domain, resulting in only static media hits (the HTML page with the embedded Flash movie), I could set up lighttpd or nginx to handle those requests and pass the Django requests back to Apache/mod_whatever.
It seems as if I could set this all up on one box, with the Django sites framework keeping each site's admin separate.
I'm not much of a server admin. Are there any gotchas I'm not seeing?
Maybe. I don't think the built-in admin interface is really designed to corral admins into their own sites. The sites framework is more suited to publish the same content on multiple sites, not to constrain users to one site or another. You'd be better off writing your own admin interface that enforces those separations.
As far as serving content goes, it seems like you could serve up a common (static) Flash file that uses a dynamic XML file to fill in content. If you use Django to generate the XML, that would give you the dynamic content you need.
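A minimal sketch of what generating such an XML file could look like, using only Python's standard library. The element and attribute names (gallery, image, title, src) and the gallery_xml function are made up for illustration; the real schema would be whatever the Flash movie expects.

```python
import xml.etree.ElementTree as ET
from typing import Iterable


def gallery_xml(title: str, image_urls: Iterable[str]) -> str:
    """Build a small XML document describing one gallery.

    The <gallery>/<image> structure is a hypothetical schema.
    """
    root = ET.Element("gallery", {"title": title})
    for url in image_urls:
        ET.SubElement(root, "image", {"src": url})
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(gallery_xml("Portfolio", ["/media/one.jpg", "/media/two.jpg"]))
```

In a Django view, the returned string could be sent back in an HttpResponse with an XML content type, so the static Flash file fetches per-site content from one dynamic URL.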
This Django snippet might be what you need to keep them separate:
http://www.djangosnippets.org/snippets/1054/
"A very simple multiple user blog model with an admin interface configured to only allow people to edit or delete entries that they have created themselves, unless they are a super user."
Depending on the number of sites you're going to host, it might be easier to write a single Django app once, with admin, and to create a separate Django project for each new site. This is simple, it works for sure, and as an added bonus you can add features to newer sites without running the risk of causing problems in older sites.
Then again, it might be handier to customize the admin so that you limit the objects users can see to those on the given site itself. This is fairly easy to do, although you might want to use RequestSite instead of the usual Site from the sites framework, since the latter requires separate settings for each site.
There exists this one method in the ModelAdmin which you can override to have manual control over the objects being edited.