Superset loading examples don't seem to disable - apache-superset

I am running Superset in Docker. At first the example datasets, charts and so on were loading. After some time I decided to disable the examples.
I changed the configuration to SUPERSET_LOAD_EXAMPLES=no in the .env file. I also tried deleting this key from .env entirely. However, the examples don't disappear. How can they be deleted completely?

If you've run Superset once with SUPERSET_LOAD_EXAMPLES=yes, that will have loaded the example data and charts/dashboards into your metadata database alongside any of your actual charts and data. My understanding from searching the codebase is that there's no way to undo this.
If you really want those examples gone, you can start fresh from a new metadata db, but that would mean discarding any content you've created. Or you can try deleting the examples manually, either in the UI or in the database backend.
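If you do go the fresh-start route and you're on the standard docker-compose setup, dropping the stack's volumes discards the metadata database, and the examples with it (this is a sketch; volume handling can vary by setup, and it deletes everything you've created too):

    # WARNING: -v removes the named volumes, i.e. the whole metadata DB
    docker-compose down -v
    # set SUPERSET_LOAD_EXAMPLES=no in .env before bringing it back up
    docker-compose up -d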
If you can't start fresh, my personal advice is to just ignore them. Eventually they'll get pushed to the bottom by your new content, and it's nice to have them around in case you ever need a reproducible example for a bug report. I tried deleting some of them from the database backend, and all I managed to do was corrupt them; now I get 500 errors when I try to load those examples.

Related

How do I clean xDB in Sitecore?

I recently tried working with xDB in Sitecore 8 and am now looking for a way to clear out the current stats from xDB without re-installing Sitecore. I deleted the Mongo data files (as was suggested) but still see figures in Analytics in Sitecore; I also did an iisreset, but that did not help either. What am I doing wrong? (I am new to Sitecore, so I might be missing something.)
Have you tried cleaning up only the MongoDB files, without touching the Reporting database?
If so, I think that is the source of your confusion. The way xDB works is that all tracking analytics data is written into Mongo and then, on SessionEnd, processed and saved into the Reporting database, which is a SQL database, just as it was previously in DMS. In that case you need to clean that database as well.
If you have access to SQL, you can use the __DeleteAllReportingData stored procedure as the quickest option:
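    EXEC [__DeleteAllReportingData]

(Run it against the Reporting database, and as with any destructive procedure, take a backup first.)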
A more correct approach, which also works for instances where there is no direct database access, is to use the admin tool located at /sitecore/admin/RebuildReportingDB.aspx. There was previously also a module called Analytics Database Manager; however, I do not know its current state.
Reference: Walkthrough: Rebuilding the reporting database (from the official documentation)

About Sitecore Backup

I am trying to back up a whole Sitecore website.
I know that the package designer can do part of the job, but not all of it.
Having a backup is always good in case the site gets broken accidentally.
Is there a way, or a tool, to back up the whole Sitecore website?
I am new to Sitecore, so any advice is welcome.
Thank you!
We've got a SQL job running to back up the databases nightly.
Apart from that, when I deploy code and it's a small change, I usually back up only the parts I'm going to replace. If it's a big code deploy, I just back up the whole website (code-wise, anyway) before deploying the code package.
Apart from that, we also run scheduled backups of the code (although I don't know the intervals), and of course we've got source control if everything else fails.
If you've got an automated deployment tool you could also automate the above of course.
Before a major deploy of content or code, I typically back up the master database and zip everything in the website directory minus the App_Data and temp directories. That way, if the deploy goes wrong, I can restore the code and database fairly quickly and be back to the previous state.
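Scripted, the zip step might look something like this sketch (all paths are placeholders for your own instance):

    import os
    import zipfile

    SITE_ROOT = r'C:\inetpub\wwwroot\MySite\Website'
    EXCLUDE = {'App_Data', 'temp'}

    with zipfile.ZipFile(r'C:\Backups\site-backup.zip', 'w',
                         zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(SITE_ROOT):
            # Prune excluded directories so os.walk never descends into them
            dirs[:] = [d for d in dirs if d not in EXCLUDE]
            for name in files:
                path = os.path.join(root, name)
                zf.write(path, os.path.relpath(path, SITE_ROOT))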
I don't know of a tool that can do this for you, but there are a few ways you can handle it fairly easily:
1) You can create a backup of the master database, but this contains only content, not files such as media stored on disk or your complete, built solution. It is always a good idea to schedule a database backup every night and keep the backups for at least a week or more.
2) With the package designer, you can create dynamic packages that contain all of your content, media files and solution files on disk. This is an easy way to deploy the site onto a new Sitecore installation all at once, but it requires a manual backup every time.
3) Another option is to serialize your entire content tree to an XML format on disk from the Developer tab. Once serialized, the items can be reverted back into the content tree.
I'd suggest thinking of this in two parts. The first part is backing up the application, which is as simple as making sure your application is in some SCM system.
For that you can use Team Development for Sitecore. One of its features allows you to connect a Visual Studio project to your Sitecore instance.
You can select Sitecore items that you want to be stored in your solution and it will serialize them and place them into your solution.
You can then check them into your SCM system and sleep easier.
The thing to note is deciding which items to place in source control. Generally, you can think of Sitecore items as either developer-owned or Content-Editor-owned. The items you will place in your solution are the developer-owned ones; templates, sublayouts, layouts, and content items that you need for the site to function are good examples.
This way if something goes bad a base restoration is quick and easy.
The second part is the backup of the content in Sitecore that has been added since your deployment. For that, as Trayek said above, use a SQL job to do the backups at whatever interval you are comfortable with.
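A nightly job along these lines is all it takes (database and path names are placeholders; schedule it via SQL Server Agent):

    -- Full backup of the content database
    BACKUP DATABASE [Sitecore_Master]
        TO DISK = N'D:\Backups\Sitecore_Master.bak'
        WITH INIT;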
If you're bored, I have a post on using TDS (Team Development for Sitecore) you can check out at Working with Sitecore, Part Nine: TDS.
Expanding a bit more on what Trayek said, my suggestion would be to set up Continuous Integration (CI) and automated deploys using TeamCity.
A good answer is also given here on Stack Overflow.
Basically, in your case TeamCity would automatically:
1. Take a backup of the current website (i.e. the code) and deploy the new code on top of it.
2. Run scripts to take a differential backup of the SQL databases, if need be.
Hope this helps.
Take a look at the Sitecore Instance Manager module. It works really well for packaging an entire Sitecore instance.

Django action after file upload

We have an extensive existing codebase, and we've now added load-balanced servers with a single master server to the equation. There are various apps that contain models with uploaded files and images, which all work fine... However, this raises the obvious problem of the rsync delay. Rsync is in the crontab and set to run every minute, but this still means there's a potential 59-second wait between content being created and it actually existing on the web servers.
What I would like, is to be able to register some kind of 'post file changed' handler that triggers rsync whenever a new file is uploaded. I can't find anything of the sort though! Django has file upload handlers, but these appear to only deal with the actual upload stream, not the file as it is saved to the filesystem thereafter.
The best approach I can see is to create simple extensions to FileField, FieldFile, ImageField and ImageFieldFile as part of my project and hook into the save and delete methods in the FileField. Essentially, to create custom File and Image fields with this behaviour added. This isn't massively complicated to do, but it doesn't seem like the most elegant solution to me. I'll need to teach South about my new fields, update every model that is affected, and then create hordes of South migrations (which I'm pretty sure will clash with some code we have pending).
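For illustration, roughly what I have in mind (the rsync host and destination paths are placeholders):

    import subprocess

    from django.db.models import FileField
    from django.db.models.fields.files import FieldFile


    class SyncedFieldFile(FieldFile):
        def save(self, name, content, save=True):
            super(SyncedFieldFile, self).save(name, content, save)
            # Push the newly saved file straight out to the web servers
            subprocess.call(['rsync', '-az', self.path, 'web1:/srv/media/'])

        def delete(self, save=True):
            super(SyncedFieldFile, self).delete(save)
            # Mirror deletions by re-syncing the media root with --delete
            subprocess.call(['rsync', '-az', '--delete',
                             '/srv/app/media/', 'web1:/srv/media/'])


    class SyncedFileField(FileField):
        attr_class = SyncedFieldFile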
I'm also looking into creating a custom Storage class for the project, but I'm nervous about this having far-reaching effects on other pieces of code.
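For completeness, the Storage flavour of the same idea would be something like this (again, host and paths are placeholders):

    import subprocess

    from django.core.files.storage import FileSystemStorage


    class RsyncStorage(FileSystemStorage):
        """FileSystemStorage that pushes each saved file out immediately."""

        def _save(self, name, content):
            name = super(RsyncStorage, self)._save(name, content)
            subprocess.call(['rsync', '-az', self.path(name),
                             'web1:/srv/media/'])
            return name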
I can't believe no one has come across this issue before; is there a canonical approach?
Thanks very much!
If you want to tackle this problem from the server side (i.e. with a solution similar to rsync) and you're running Linux, you might want to check out lsyncd:
http://code.google.com/p/lsyncd/
lsyncd uses inotify in the Linux kernel to watch directories and invokes rsync as soon as files are modified. It's fairly simple to drop in.
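As a rough example of the rsync-over-ssh flavour (host and paths are placeholders; check the lsyncd docs for your version's exact syntax):

    lsyncd -rsyncssh /srv/app/media web1 /srv/media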

Coldfusion Report Builder - How can you set different datasources externally between prod/staging/dev?

Coldfusion Report Builder is great.
One small issue. We use ANT+CFANT to deploy.
We create the report against, say, a datasource called MyApp_dev on a dev box.
Our other server is the production server. It also contains a staging build to ensure everything is going smoothly before we publish to live. (thanks to Al Everett for bringing this clarification to my attention.)
Everything works great when the report is created.
We deploy the report to our staging server, which has a datasource of MyApp_Staging. That server may or may not also have the live app running under MyApp_Live. Ant pushes the update to staging just fine.
Run the report, and it crashes and burns. Why?
It seems the report is looking for the MyApp_Dev datasource, even though the application is using the MyApp_Staging datasource.
In digging around I found a few approaches. I would like to do this one final, ideal way from the beginning, instead of having to go back and redo dozens of reports when I have a new Aha! moment.
1) Obvious: pass the datasource into the cfreport tag. Doesn't work for ColdFusion Report Builder reports as of v8, or v9 as tested on Linux.
2) Most realistic option (but painful) so far: pass the query as an object into the ColdFusion Report Builder report. Let's think about this:
Create the Report with the report builder to my heart's content using the RDS, etc on my local box.
When I'm done, copy the query into a snippet of code, or into a database column, to be dynamically injected at runtime with the correct datasource.
Modify my "run report" event to find the query in the database column, insert it into another dynamic cfquery and potentially... evaluate (!?!) it? The fun side is that I can set the cfquery datasource to whatever I need for each environment.
When I modify the report's columns in CF Report Builder, I always have to update the query in the database. Is there a snippet of code that can extract this for me? Hmm.
3) Less than ideal: suck it up and let all the reports in staging run off the live server. Maybe copy the live data into staging (sans structural changes) so it seems similar.
Are there any elegant ways to accomplish the above?
Thanks in Advance!
If you have different dev/staging/production boxes, why not just use the same datasource name on each? That'll save you from having the code figure out where it is.
Because security concerns at my current assignment preclude me from using RDS, I use option 2 as a matter of course. I also like it, as it makes debugging easier.
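In rough strokes, option 2 looks like this (the datasource variable, report path and query are placeholders from my own setup):

    <!--- Build the query in code so the datasource is environment-specific --->
    <cfquery name="reportQuery" datasource="#application.dsn#">
        SELECT id, name, total FROM orders
    </cfquery>
    <cfreport template="reports/orders.cfr" query="#reportQuery#" format="PDF" />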

How to ensure database changes can be easily moved over DVCS using django

Overview
I'm building a website in Django. I need to allow people to begin adding flatpages and setting some settings in the admin. These changes should be definitive, since that information comes from the client. However, I'm also developing the backend, and as such am creating and migrating tables. I push these changes to the hub.
Tools
django
git
south
postgres
Problem
How can I ensure that I get the database changes from the online site down to me on my lappy, and how can I push my database changes up to the live site, so that a minimum of coordination is needed? I am familiar with git hooks, so that option is in play.
Addendum:
I guess I know which tables can be modified via the admin. There should not be much overlap really. As I consider further, the danger really is me pushing data that would overwrite something they have done.
Thanks.
For getting your schema changes up to the server, just use South carefully. If you modify any table they might have data in, make sure you write both a schema migration and, as necessary, a data migration to preserve the sense of their data.
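The workflow is just the standard South one (the app name 'pages' is a placeholder):

    # generate the schema migration from your model changes
    python manage.py schemamigration pages --auto
    # add a data migration alongside it to preserve their existing rows
    python manage.py datamigration pages preserve_client_data
    # apply both on the server
    python manage.py migrate pages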
For getting their updated data back down to you (which doesn't seem critical, but it might be nice to work with up-to-date test data as you're developing), I generally just use Django fixtures and the dumpdata and loaddata commands. It's easy enough to dump a fixture and commit it to your repo, then run loaddata on your end.
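Concretely, something like this (the app labels are whatever the client actually edits):

    # on the server
    python manage.py dumpdata flatpages --indent 2 > fixtures/content.json
    # locally, after pulling the fixture down through the repo
    python manage.py loaddata fixtures/content.json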
You could try using git hooks to automate some of this, but if you want automation I do recommend trying something like Fabric instead. Much of this stuff doesn't need to be run every single time you push/pull (in particular, I usually wouldn't want to dump a new data fixture that frequently).
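A fabfile for the pull could be as small as this sketch (Fabric 1.x; host and paths are placeholders):

    from fabric.api import env, get, run

    env.hosts = ['live.example.com']

    def pull_content():
        """Dump the client-edited apps on the server, then fetch the fixture."""
        run('cd /srv/app && python manage.py dumpdata flatpages --indent 2'
            ' > /tmp/content.json')
        get('/tmp/content.json', 'fixtures/content.json')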
You should probably take a look at South:
http://south.aeracode.org/
It seems to me that you could probably create a git hook that triggers South if you are running some sort of continuous integration system.
Otherwise, every time you do a push you will have to manually execute the migration steps yourself. Don't forget to put up the "site is under maintenance" message. ;)
I recommend that you use mk-table-sync to pull changes from the live server to your laptop.
mk-table-sync takes a lot of parameters, so you can automate this process using Fabric. You would basically create a Fabric function that executes mk-table-sync on each table that you want to pull from the server.
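For example (a sketch; the DSNs and table names are placeholders, and note that mk-table-sync is part of Maatkit and speaks MySQL DSNs):

    from fabric.api import local

    # the tables the client edits through the admin
    CONTENT_TABLES = ['django_flatpage', 'django_site']

    def pull_tables():
        for table in CONTENT_TABLES:
            local('mk-table-sync --execute'
                  ' h=live.example.com,D=mysite,t=%s'
                  ' h=localhost,D=mysite,t=%s' % (table, table))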
This means that you cannot make database changes yourself, because they will be overwritten by the pull.
The only changes that you would be making to the live database are using South. You would push the code to the server and then run migrate to update the database schema.