Shader source code compactor - GLSL

Back when the Internet was expensive and slow, website authors used all sorts of HTML/JavaScript compression tools that would remove whitespace and shorten variable names.
Is there such a tool for GLSL shaders? I was going to write one myself, but then I realized there should already be such a tool out there, yet I was unable to find one.

There is GLSL-unit, which does this.
There is also a browser version at this link.
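For reference, the mechanical part of such a minifier is small. Here is a minimal sketch in Python of the comment and whitespace stripping (identifier renaming, which a full minifier would also do, is left out; the function and regexes are illustrative, not taken from GLSL-unit):

import re

def minify_glsl(source):
    # Strip /* ... */ block comments and // line comments.
    source = re.sub(r'/\*.*?\*/', '', source, flags=re.DOTALL)
    source = re.sub(r'//[^\n]*', '', source)
    # Collapse runs of whitespace to a single space.
    # (Naive: a real tool must keep #preprocessor directives on their own lines.)
    source = re.sub(r'\s+', ' ', source)
    # Drop spaces around punctuation where no token boundary is needed.
    source = re.sub(r'\s*([{}()\[\];,=*/<>+-])\s*', r'\1', source)
    return source.strip()

shader = '''
// simple fragment shader
void main() {
    gl_FragColor = vec4(1.0, 0.0, 0.0, 1.0);
}
'''
print(minify_glsl(shader))  # void main(){gl_FragColor=vec4(1.0,0.0,0.0,1.0);}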

Related

How can I compile my ColdFusion code for sourceless distribution, and have it be unreadable?

I've been tasked with creating a deployable version of a ColdFusion web app to be installed on a client's server. I'm trying to find a way to give them a compiled version of our code, and my first inclination was to use the CFCompile utility that I found here. However, after running CFCompile, most of the code in the CFM files is still readable. The only thing that appears to be obfuscated at all is the actual ColdFusion code; all of the SQL queries are still perfectly readable. (Example in the screenshot below.)
The HTML and JavaScript are also still readable in the compiled code, but that doesn't matter, as those can be seen in a web browser anyway.
Is there another way to distribute my source code in a format that is completely unreadable to the user? I'm guessing that for whatever method I choose, there will be some way of decompiling the code. That's not an issue; I just need to find a way to make it more difficult than opening the file and seeing the queries.
Hostek has a pretty good write-up on the subject over on their site: How to Encrypt or Compile ColdFusion Files.
Basically, from that article:
Using cfcompile.bat
The cfcompile.bat utility will compile all .cfm and .cfc files within a given directory into Java bytecode. This has the effect of making your source code unreadable, and it also prevents ColdFusion from having to compile your ColdFusion files on first use, which provides a small performance enhancement.
More details about using cfcompile.bat can be found in ColdFusion's Documentation
Using cfencode.exe
The cfencode.exe utility will apply basic encryption to a specific file or directory. If used to encrypt a directory, it will apply encryption to ALL files in the directory which can break any JS, CSS, images, or other non-ColdFusion files.
They do also include this note at the bottom:
Note: Encrypting your site files with cfencode does not guarantee absolute security of your source code, but it does add a layer of obfuscation to help prevent unauthorized individuals from viewing the source.
The article goes on to give basic instructions on how to use each.
Adobe has this note on their site regarding cfencode:
Note: You can also use the cfencode utility, located in the cf_root/bin directory, to obscure ColdFusion pages that you distribute. Although this technique cannot prevent persistent hackers from determining the contents of your pages, it does prevent inspection of the pages. The cfencode utility is not available on OS X.
I would also add that it will be trivial for anyone familiar with ColdFusion to decode anything encoded with this utility because they also provide the decoder.

Handling really large multi language projects

I am working on a really large multi-language project (1000+ classes, plus configs and scripts), with files distributed over network drives. I am having trouble fighting through the code, since the available tools are not helping. The main problem is finding things. For the C++ part: VS with VAX can only find files and symbols which are in the solution. A lot of them are not. Same problem with ReSharper. Right now I am stuck doing unindexed string and file searches, which is highly inefficient on a network drive. I heard that Source Insight would be an option, since it allows you to just specify the folders that are part of the project and then indexes them, but my company won't spend money on it.
So my question is: what tools are available for fighting through an incredibly large amount of code? If possible, they should be low-cost or even free/open source.
Check out:
ctags
cscope
idutils
snavigator
In every one of these tools, you would have to invest(*) some time in reading the documentation, and then building your index. Consider switching to an editor that will work with these tools.
(*): I do mean invest, because it will reap dividends once you do.
Hope this helps!
If you need to maintain a large amount of code, you really should have a source code management system; many of them will help you find text by indexing all the files.
And most of them will work with various languages.
Otherwise you can install some indexer like Apache Lucene and index all your files...
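To make the indexing idea concrete, here is a toy inverted index in Python (an illustration of the concept only, not the Lucene API; the root path and extension list are placeholders):

import os, re, collections

def build_index(root, extensions=('.cpp', '.h', '.hpp', '.cs', '.py')):
    # Map each identifier-like token to the set of files containing it,
    # so later lookups are instant instead of re-scanning a network drive.
    index = collections.defaultdict(set)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding='utf-8', errors='ignore') as f:
                        for token in set(re.findall(r'[A-Za-z_]\w+', f.read())):
                            index[token].add(path)
                except OSError:
                    pass  # skip unreadable files on the share
    return index

index = build_index('/path/to/project')  # build once...
for path in sorted(index.get('MyClassName', ())):  # ...query many times
    print(path)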
You should take a look at LXR. It is used by many online Linux kernel source listings.
Try ndexer (http://code.google.com/p/ndexer/), which promises to handle extremely large codebases!
The Perl program ack is also worth a look -- think of it as multi-file grep on steroids. The new version (in what I would call late beta) even lets you specify regexes for the files to process as well as regexes to search for -- a feature I've used extensively since it came out (I've got a subproject with 30k lines in 300+ classes, where this feature has been very helpful). You can even chain the new ack with itself so you can subselect the files to process.
VS with VAX can only find files and symbols which are in the solution. A lot of them are not.
You can add all the files that are not in your solution and set them to not build in the settings. Your VS build will not be affected by this, but now VS knows about those files and you can search them along with your VS native files.

Google Bot information?

Does anyone know any more details about Google's web crawler (aka Googlebot)? I was curious about what it was written in (I've made a few crawlers myself and am about to make another) and whether it parses images and such. I'm assuming it does somewhere along the line, because the images on images.google.com are all resized. It also wouldn't surprise me if it was all written in Python and if they used their own libraries for almost everything, including HTML/image/PDF parsing. Maybe they don't, though. Maybe it's all written in C/C++. Thanks in advance.
You can find a bit about how Googlebot works here:
http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=158587
For example, the "Fetch as Googlebot" tool lets you see a page as Googlebot sees it.
The crawler is very likely written in C or C++; at least BackRub's crawler was written in one of those.
Be aware that the crawler only takes a snapshot of the page, then stores it in a temporary database for later processing. The indexing and other attached algorithms will extract the data, for example the image references.
Officially allowed languages at Google, I think, are Python/C++/Java.
The bot likely uses all 3 for different tasks.

Publishing toolchain

I have a book project which I'd like to start sooner rather than later. It would follow an agile-like publishing workflow, i.e., publish early and often. It is meant to be self-published by me, and I'm not really looking to paper-publish it, even though you never know.
If I weren't a geek, I'd probably have already started writing in Word or some other WYSIWYG tool and would just export to PDF. However, we know that is not the best solution, and Emacs rules my text-editing life, so the output format should be as simple as possible and text-based.
I've thought about the following options:
Just use org-mode and export to PDF (org-mode has this feature natively);
Use markdown-mode and export to PDF (Markdown -> LaTeX -> PDF should not be hard to set up);
Use something similar to what the guys at Pragmatic Programmers do: XML + XSLT + LaTeX.
More complex, but with much more control over the style.
EDIT: Someone just told me that he uses a combo of Textile + Adobe InDesign and the XTags plugin. Not sure how they are glued together, though; gotta do some research.
Any other ideas / references ?
I want to start writing as soon as possible. In fact, I already have a draft in an org-formatted file. However, I do want to have and use the full power of LaTeX later on to format it the way I want and make it look fabulous :)
Thanks in advance,
Marcelo.
I have done a TON of research on this lately, since I'm planning on starting my own small press soon.
It really depends on what you want your final output to be (PDF, HTML, other?), and what the book is about.
Org mode is great, as I'm sure you know, because it expands as you do. I often write my outlines in org mode, then just fill in the body text when I'm really ready to start writing.
If it's prose, and you just need some simple divisions (chapters and sections and not much else), org-mode -> LaTeX should do you just fine. Then you also have the possibility of org-mode -> HTML.
If you need math in it, you can just write the math right in the org-mode file.
If it's really, really technical information, DocBook might be nice (Emacs + nXML), then DocBook 4.5 -> Jade -> JadeTeX -> PDF.
I'd stay away from DocBook 5, because it uses FOP to generate PDFs, and the typesetting is really inferior to LaTeX.
BOTTOM LINE: If you want a PDF, use org -> LaTeX, the path of least resistance ;) -- whatever you do, concentrate on the content of the book first, and worry about what it looks like until after.
And why not paper publish? Have you looked at lulu.com? I recently formatted a book with latex, uploaded the pdf to lulu, and had them print it. The quality is pretty good, and definitely worth a look. I have a ton of bookmarks at home about publishing in general, if you're interested.
Typography is hard.
TeX/LaTeX are tools that can get you the best possible results; however, they require knowledge of typography to be used correctly, especially with a big document like a book. And I haven't seen any other cheap (= not for professional use) software that does things correctly automatically. (I haven't seen any professional software, so it is possible it doesn't do that either.)
However, assuming that you'll write your book in some machine-readable format, putting it into TeX/LaTeX should not be very hard: I once had a set of documents in a custom XML format, and proper usage of XSLT, TeXML, and LaTeX gave me something I could tweak manually (and this tweaking was necessary!) to get the best possible result.
My advice: prepare the content in something that is easy to parse and easy to write in. I'd dismiss XML; Markdown seems to be a good choice. This will also let you show your work quickly. Then, if you decide to make the result better, write some simple script to translate it to TeX (it is not that hard to get basic functionality; see the sketch below) and fix things by hand. This might actually be a good exercise for learning TeX.
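For instance, here is a deliberately minimal Markdown-to-LaTeX translator in Python that handles only headings and inline emphasis; everything else passes through untouched, and a real script would grow from here:

import re

def md_to_tex(markdown):
    tex_lines = []
    for line in markdown.splitlines():
        # Headings: '# ' -> \chapter, '## ' -> \section, '### ' -> \subsection.
        m = re.match(r'(#{1,3}) (.*)', line)
        if m:
            cmd = {1: 'chapter', 2: 'section', 3: 'subsection'}[len(m.group(1))]
            line = '\\%s{%s}' % (cmd, m.group(2))
        else:
            # Inline emphasis: **bold** first, so it is not eaten by *italic*.
            line = re.sub(r'\*\*(.+?)\*\*', r'\\textbf{\1}', line)
            line = re.sub(r'\*(.+?)\*', r'\\textit{\1}', line)
        tex_lines.append(line)
    return '\n'.join(tex_lines)

print(md_to_tex('# My Chapter\nSome *emphasized* and **bold** text.'))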
Don't try to get everything right from the beginning. First get the content; then play with formatting.
If you really want to publish online only, I would suggest you use org-mode and just stay in HTML. Then you can use CSS to style it however you'd like.
That being said, if you really want to output to PDF for technical material, I would strongly suggest using DocBook (www.docbook.org). It's made for that, and it works great with Emacs.
You have already answered your own question, not to mention that you have already started writing in org-mode. Org-mode is really extremely powerful and will let you publish to PDF and HTML with no effort.
For PDF you can take advantage of LaTeX and of how org-mode handles exports: you can include any LaTeX code in your org file. Also, IMHO it's way better to write the book/article in org-mode, since some things become even easier than in plain .tex files; take tables, for example.
Regarding publishing, it's the same story: with one single function you can trigger exporting to HTML/PDF and uploading to your server. And notice that you are still using just a plain text file, which is human-readable and very clean.
Org-mode really follows the Emacs philosophy: just start using it and it will grow with you.
If you are writing a book, it would certainly be worth the overhead of learning tex.
Even something like,
\documentclass[a4paper,10pt]{book}
\title{SERPA'S BOOK}
\author{SERPA}
\date{\today}
\begin{document}
\maketitle
\tableofcontents
\include{chapterA}
\include{chapterB}
\include{chapterC}
\end{document}
Then, in the same directory, have files chapterA.tex, chapterB.tex, and chapterC.tex that look like:
\chapter{My chapter title}
Lorem ipsum dolor sit amet, consectetur adipiscing elit....
That alone will produce an extremely nice-looking document. You can edit each chapter separately and then just compile the main .tex file. I think if you try to learn intermediate tools that abstract away from TeX, you'll only make it more difficult later to do what you actually want, because you will be fighting both TeX and an abstraction of TeX at the same time.
Best of luck on such an undertaking.
Also, no matter what you do, make sure to use some kind of version control system, such as SVN, to manage your files. It will be worth it.
I would write it in Latex and have an online repository that does nightly compiles to PDF of the 'publish-ready' branch, available to readers.
I would not start with LaTeX these days. TeX input is unstructured, and the only thing you can get out of TeX input is PDF. If you need HTML or anything else, you are screwed.
Use something structured, such as XML (DocBook is a good suggestion), or define your own XML subset as you need it. Use XSLT to transform it into something usable (HTML, etc.). That way you are set for the future.
Depending on your typographical needs, you can then use TeX as a backend processor, or XSLT or whatever.
Also, have a look at ConTeXt, it can read XML directly and has great typography!

library for doing diffs

I've been tasked with creating a tool that can diff and merge the configuration files for my company's product. The configurations are stored as either XML or URL-encoded strings. I'm looking for a library, preferably open source with a license compatible with commercial software, that can do these diffs. Our app is written in C++, so C++ libraries would be best, but I'm willing to look at libraries that are C#-specific since I can write a wrapper that exposes it to C++ via COM. Three-way diffs would be ideal, but two-way is acceptable. If it has an understanding of XML, that would also be a plus (since XML nodes can be reordered without changing the document, etc). Any library suggestions? Should I even consider writing my own diff tools in the hopes of giving it semantic knowledge of our formats?
Thanks to this similar question, I've already discovered this google library, which seems really great, but I'm still looking for other options. It also seems to be able to output the diffs in HTML format (using the <ins> and <del> tags that I didn't know existed before I discovered it), which could be really handy, but it seems to be a unified diff only. I'm going to need to display the results in a web browser, and probably have to build an interface for doing the merges in the browser as well. I don't expect a library to be able to help with these tasks, but it must produce output in a format that is amenable to me building this on top of it. I'm currently envisioning something along the lines of TortoiseMerge (side-by-side diffs, not unified), except browser-based. Any tips/tricks/design ideas on how to present this would be appreciated too.
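Assuming the Google library in question is diff-match-patch, a minimal sketch of its Python port shows how to get browser-ready output; diff_prettyHtml is what emits the <ins>/<del> markup (the config strings here are made up):

from diff_match_patch import diff_match_patch  # pip install diff-match-patch

dmp = diff_match_patch()
old_cfg = 'timeout=30&retries=3&host=alpha'
new_cfg = 'timeout=60&retries=3&host=beta'

diffs = dmp.diff_main(old_cfg, new_cfg)
dmp.diff_cleanupSemantic(diffs)    # merge trivial edits into readable chunks
print(dmp.diff_prettyHtml(diffs))  # HTML with <ins>/<del> spans for a browser

Note that the library only gives you the edit list and this inline rendering; laying out a TortoiseMerge-style two-column view would be up to your own HTML/CSS on top of it.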
Subversion comes with libsvn_diff and libsvn_delta, both licensed under the Apache Software License.
Here is a C++ library that can diff what the author calls semistructured data. It deals nicely with HTML and XML. Since your data is XML it would make a lot of sense to use this instead of plain text diff. This is especially the case when the files are machine generated.
I am currently trying to use this library to build a tool that diffs Visual Studio project files. These are basically XML files, and using a plain diff tool like WinMerge is too painful because Visual Studio pretty much mucks up the whole file with crazy reordering. The idea is to do some kind of structured diff to address the problem.
For diffing the XML, I would propose that you normalize it first: sort all the elements in alphabetical order, then generate a stream of tokens/XML that represents the original document but is independent of the original formatting. After running the diff, parse the result to get a tree containing what was added/removed.
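A minimal sketch of that normalization step in Python (the sort key and the one-element-per-line serialization are choices to adapt to your real schema; Python is used for illustration even though the app is C++):

import difflib
import xml.etree.ElementTree as ET

def normalize(elem):
    # Recursively sort children so semantically equal documents serialize identically.
    for child in elem:
        normalize(child)
    elem[:] = sorted(elem, key=lambda e: (e.tag, sorted(e.attrib.items()), (e.text or '').strip()))

def canonical_lines(xml_text):
    root = ET.fromstring(xml_text)
    normalize(root)
    # One element per line, so a plain line diff maps cleanly back to nodes.
    return ET.tostring(root).decode().replace('><', '>\n<').splitlines()

a = canonical_lines('<cfg><b x="1"/><a>v</a></cfg>')
b = canonical_lines('<cfg><a>v</a><b x="2"/></cfg>')
print('\n'.join(difflib.unified_diff(a, b, lineterm='')))

For a quick browser-based side-by-side rendering of the same line lists, difflib.HtmlDiff().make_file(a, b) emits a two-column HTML table you could build the merge interface around.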